Poster Hadoop summit 2011: pig embedding in scripting languages

1

PIG Scripting Making Pig Turing-complete through embedding in a scripting language Julien Le Dem - Yahoo Solution 2: using Pig scripting Embedded Pig calls in Python Python functions are automatically available as UDFs Python variables can be used in the pig scripts $n, $i, ... UDFs use standard Python constructs (tuple/list/dictionary ) automatically converted to Pig. UDFs take the elements of the tuple as parameters, not a tuple. The output schema is specified using a decorator (it can be a function if you need to manipulate the input schema) Unfortunately this solution does not fit here. It requires 7 artifacts: - 3 Java UDFs: 3 classes must be compiled and packaged in a jar. The average UDF size is 50 to 100 lines of code. - 3 Pig scripts in 3 separate files: init, main loop content, finalization. The average Pig script size is 5 to 10 lines. - Main program: executes and coordinates the Pig scripts. It contains about 50 lines of code. 1 file required: Modification flow: Modification flow: Overview Example: Transitive closure - Iterative process: requires a loop and a termination condition - Requires multiple join/group by: typical Pig usage - Requires User Defined Functions - Map/Reduce: a solution to process big data. - Pig: makes it easier to manipulate data on the grid. - Pig scripting: makes Pig easier with iterative algorithms and User Defined Functions. PIG-928 (in 0.8) Solution 1: plain Pig Pig execution model 7 files required: This solution does fit on the poster. It requires 1 artifact, containing: - 3 UDFs provided as functions: The average UDF size is 8 to 12 lines of code. - 3 Pig queries: init, main loop content, finalization. Each Pig query adds an average 5 to 10 lines. - Main function: executes and coordinates the Pig scripts. It adds about 10 lines to Access to the output of pig scripts

Upload
julien-le-dem
Category

Technology
view
1.026
download
0

TAGS:

Embed Size (px):

description

Poster presented at Hadoop summit 2011

Transcript of Poster Hadoop summit 2011: pig embedding in scripting languages

Page 1: Poster Hadoop summit 2011: pig embedding in scripting languages

PIG ScriptingMaking Pig Turing-complete through embedding in a scripting language

Julien Le Dem - Yahoo

Solution 2: using Pig scripting

Embedded Pig calls in Python

Python functions are automatically available as UDFs

Python variables can be used in the pig scripts $n, $i, ...

UDFs use standard Python constructs (tuple/list/dictionary) automatically converted to Pig.

UDFs take the elements of the tuple as parameters, not a tuple.

The output schema is specified using a decorator (it can be a function if you need to manipulate the input schema)

Unfortunately this solution does not fit here.

It requires 7 artifacts:

- 3 Java UDFs: 3 classes must be compiled and packaged in a jar. The average UDF size is 50 to 100 lines of code.

- 3 Pig scripts in 3 separate files: init, main loop content, finalization. The average Pig script size is 5 to 10 lines.

- Main program: executes and coordinates the Pig scripts. It contains about 50 lines of code.

1 file required:

Modification flow: Modification flow:

OverviewExample: Transitive closure

- Iterative process: requires a loop and a termination condition

- Requires multiple join/group by: typical Pig usage

- Requires User Defined Functions

- Map/Reduce: a solution to process big data.

- Pig: makes it easier to manipulate data on the grid.

- Pig scripting: makes Pig easier with iterative algorithms and User Defined Functions.

Jiras: PIG-1479, PIG-1794 (in 0.9), PIG-928 (in 0.8)

Solution 1: plain Pig

Pig execution model

7 files required:

This solution does fit on the poster.

It requires 1 artifact, containing:

- 3 UDFs provided as functions: The average UDF size is 8 to 12 lines of code.

- 3 Pig queries: init, main loop content, finalization. Each Pig query adds an average 5 to 10 lines.

- Main function: executes and coordinates the Pig scripts. It adds about 10 lines to the script.

Access to the output of pig scripts

1 Topics: Scripting What is game scripting Scripting Features Torque Script.

1 Topics: Scripting What is game scripting Scripting Features Torque Script.

APIs for Scripting - 2014 - Perforce · APIs for Scripting APIs for Scripting v Instance Methods ..... 32

APIs for Scripting - 2014 - Perforce · APIs for Scripting APIs for Scripting v Instance Methods ..... 32

Apache Big Data Europe 2016 Power Pig with Spark...Apache Pig Procedural scripting language Pig Latin: similar to sql Heavily used for ETL Schema / No schema data, Pig eats everything

Apache Big Data Europe 2016 Power Pig with Spark...Apache Pig Procedural scripting language Pig Latin: similar to sql Heavily used for ETL Schema / No schema data, Pig eats everything

MapReduce vs Pig | MapReduce Pig Integration

MapReduce vs Pig | MapReduce Pig Integration

Db2 Scripting With Windows Scripting Host

Db2 Scripting With Windows Scripting Host

Embedding SQL Server Express into Custom Applications - Web viewFilename: #1266 Embedding Express Cust_Apps3. Embedding SQL Server Express into Custom Applications3. Embedding SQL

Embedding SQL Server Express into Custom Applications - Web viewFilename: #1266 Embedding Express Cust_Apps3. Embedding SQL Server Express into Custom Applications3. Embedding SQL

Developing Pig on Tez - events.static.linuxfound.org...Developing Pig on Tez Mark Wagner Committer, Apache Pig LinkedIn Cheolsoo Park VP, Apache Pig Netflix. What is Pig Apache project

Developing Pig on Tez - events.static.linuxfound.org...Developing Pig on Tez Mark Wagner Committer, Apache Pig LinkedIn Cheolsoo Park VP, Apache Pig Netflix. What is Pig Apache project

Scripting - Home | IHO · Part 50 - Scripting 7 Figure 2 - Scripting Catalogue / Host interaction within a Scripting Domain 50-6.1 Distribution The distribution mechanism of a scripting

Scripting - Home | IHO · Part 50 - Scripting 7 Figure 2 - Scripting Catalogue / Host interaction within a Scripting Domain 50-6.1 Distribution The distribution mechanism of a scripting

Scripting versus Programming Scripting 1 bash · Scripting Computer Organization I 1 CS@VT ©2005-2016 McQuain Scripting versus Programming bash supports a scripting language. Programming

Scripting versus Programming Scripting 1 bash · Scripting Computer Organization I 1 CS@VT ©2005-2016 McQuain Scripting versus Programming bash supports a scripting language. Programming

Introduction to Pig | Pig Architecture | Pig Fundamentals

Introduction to Pig | Pig Architecture | Pig Fundamentals

Introduction to pig & pig latin

Introduction to pig & pig latin

Scripting Windows A. Habert C. Bravo Scripting Antoine ...thomashenrywarner.free.fr/eBooks/Scripting/Scripting Windows... · De Windows NT4 à Windows XP et 2003, ... constantes,

Scripting Windows A. Habert C. Bravo Scripting Antoine ...thomashenrywarner.free.fr/eBooks/Scripting/Scripting Windows... · De Windows NT4 à Windows XP et 2003, ... constantes,

AE6382 Introduction to Scripting AE 6382. What is a scripting l Scripting is the process of programming using a scripting language l A scripting language,

AE6382 Introduction to Scripting AE 6382. What is a scripting l Scripting is the process of programming using a scripting language l A scripting language,

Scripting - Client-side vs. Server-side Scripting

Scripting - Client-side vs. Server-side Scripting

Embedding High-Level Formal Speci cations into Applications · a scripting language that allows easy embedding of domain speci c languages (DSLs). For this, we picked and embraced

Embedding High-Level Formal Speci cations into Applications · a scripting language that allows easy embedding of domain speci c languages (DSLs). For this, we picked and embraced

Pig and Hive Hadoop with Programming in · 2019-11-14 · for Hadoop-- Pig and Hive •Pig is a "scripting language" that excels in specifying a processing pipeline that is automatically

Pig and Hive Hadoop with Programming in · 2019-11-14 · for Hadoop-- Pig and Hive •Pig is a "scripting language" that excels in specifying a processing pipeline that is automatically

Client side scripting and server side scripting

Client side scripting and server side scripting

Best Practice Web Client Scripting - Kofax Web Client scripting can be divided in client-side scripting and server-side scripting. 1.2.1 Client-side Scripting With client-side scripting

Best Practice Web Client Scripting - Kofax Web Client scripting can be divided in client-side scripting and server-side scripting. 1.2.1 Client-side Scripting With client-side scripting

Pig Veterinary Society THE CASUALTY PIG - … Pig - April 2013-1... · 1 Pig Veterinary Society THE CASUALTY PIG Interim Update April 2013

Pig Veterinary Society THE CASUALTY PIG - … Pig - April 2013-1... · 1 Pig Veterinary Society THE CASUALTY PIG Interim Update April 2013

¡Peppa Pig en tu fiesta! · 03/09/2018 ¡Peppa Pig en tu fiesta! SHOW DESCRIPCION COSTO Cuento 8 PERSONAJES: Peppa Pig, George, Mamá Pig, Papá Pig…

¡Peppa Pig en tu fiesta! · 03/09/2018 ¡Peppa Pig en tu fiesta! SHOW DESCRIPCION COSTO Cuento 8 PERSONAJES: Peppa Pig, George, Mamá Pig, Papá Pig…

Languages

Pages

Legal

Copyright © 2022 FDOCUMENTS