Apache Pig Complex Types – Tuple & Bag

In this post, we’ll talk about Apache Pig’s complex types, “Tuples & Bags”. Maps will be discussed in another post.

Pig Latin statements work with relations. A relation is a bag, a bag is a collection of tuples, a tuple is an ordered set of fields, and a field is a piece of data.

A Pig relation is similar to a table in a relational database, with the tuples in the bag representing the rows in the table. Pig relations, unlike relational tables, do not require that every tuple include the same number of fields or that the fields in the same position (column) be of the same type.

A tuple, in other words, is nothing more than a record containing a set of columns. The following example shows how to work with a collection of tuples.

Example:
Sample Data.
File Name: emp_tuple.csv
(7839,KING,CHAIRMAN) (5000,300)
(7566,JONES,ANALYST) (3400,200)

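Each line of the above data holds two tuples delimited by a space. Once loaded, every line becomes one row (itself a tuple) of the relation, and, as stated above, a collection of tuples is called a bag. The individual values inside the tuples (7839, KING, 5000 and so on) are called “atoms”.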

Loading the data into a relation.
emp = load 'Desktop/Docs/emp_tuple.csv' USING PigStorage(' ') as (empdetails:(empid:int, ename:chararray, job:chararray), income:(salary:int, commission:int));

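The above command loads the file into a relation named “emp”. Every row of “emp” has two fields, empdetails and income, and each of them is a tuple. Note that the quotes around the path and the delimiter must be plain single quotes; curly quotes copied from a web page will cause a syntax error.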
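To check that the schema was applied as intended, the relation can be inspected right after loading. The snippet below is a minimal sketch that assumes the “emp” relation from the load statement above already exists in the same Grunt session.

```pig
-- Print the schema of the relation: the two tuple fields and their nested columns.
describe emp;

-- Run the load and print every row. With the sample file, each row is a
-- tuple that contains the two tuples empdetails and income:
-- ((7839,KING,CHAIRMAN),(5000,300))
-- ((7566,JONES,ANALYST),(3400,200))
dump emp;
```

If a value cannot be converted to the declared type, Pig loads that field as null rather than failing the load, so a quick dump is a cheap way to spot a wrong delimiter or schema.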

Review the below screenshots.


Now, let’s retrieve the data from the relation.

grunt> EmpDetailsRec = foreach emp generate empdetails;

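The above returns all the columns of the specified tuple “empdetails”. Each output row is wrapped in an extra pair of parentheses because the row is itself a tuple whose only field is the empdetails tuple. It returns the following output.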

((7839,KING,CHAIRMAN))
((7566,JONES,ANALYST))

grunt> EmpSalaryDetails = foreach emp generate empdetails.ename,empdetails.job,income.salary,income.commission;

In the above example we fetched specific fields using <tuplename>.<columnname>. It returns the following result.
(KING,CHAIRMAN,5000,300)
(JONES,ANALYST,3400,200)
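The same dot notation works in a few other places as well. The statements below are a short sketch, again assuming the “emp” relation loaded earlier; the relation names on the left-hand side (names, flat, high_paid, by_job) are made up for illustration.

```pig
-- Fields of a tuple can also be addressed by position, starting at $0.
-- empdetails.$1 is the second field, ename.
names = foreach emp generate empdetails.$1;
-- (KING)
-- (JONES)

-- FLATTEN removes one level of nesting, turning the fields of each
-- tuple into top-level columns of the row.
flat = foreach emp generate flatten(empdetails), flatten(income);
-- (7839,KING,CHAIRMAN,5000,300)
-- (7566,JONES,ANALYST,3400,200)

-- A nested field can be used in a filter condition.
high_paid = filter emp by income.salary > 4000;
-- ((7839,KING,CHAIRMAN),(5000,300))

-- GROUP produces one row per job; each row holds the group key and a
-- bag (named emp) of all the original rows that share that key.
by_job = group emp by empdetails.job;
dump by_job;
```

The last statement is where bags become visible in the output: Pig prints a bag in curly braces, with the tuples it contains listed inside.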

Refer to the below screenshots.


Hope you find this article helpful.

Please do follow this blog for more interesting updates.
