Table Generting Functions in Hive – Explode

Normal user-defined functions take in a single input row and output a single output row. In contrast, table-generating functions transform a single input row to multiple output rows.

Explode is one of the UDTF (User Defined Table Generation Function) that allows the array to explode into several rows. This returns a row-set with a single column, one row from the array for each element.

Explode function is also helpful in exploding the map-key values to multiple rows. It returns a row-set with a two columns (key,value) , one row for each key-value pair from the input map.

Examples are given below:

Exploding An Array
Dataset:
1   John Smith  1234567679   1126543210   Ottawa Toronto
2   Scott Tiger   1234567891   1125432190   Regina Yellowknife

-Create table statement
CREATE TABLE ExplodeTest2(
            Id int,
            Name string,
            Mobile array<string>,
            Places_lived array<string>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ‘\t’
COLLECTION ITEMS TERMINATED BY ‘,’
STORED AS TEXTFILE;

–Query
SELECT Id, Name, Ph as Mobile, CT as City
FROM ExplodeTest2 t
lateral VIEW explode(mobile) P AS Ph
lateral VIEW explode(places_lived) C AS CT;

Exploding A Map
Dataset:
1   Toronto   2019:43210,2020:44100
2   Ottawa    2019:43210,2020:44100

–Create table statement
CREATE TABLE ExplodeTest3(
         Id int,
        City string,
        Stats MAP<int,int>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ‘\t’
COLLECTION ITEMS TERMINATED BY ‘,’
MAP KEYS TERMINATED BY ‘:’
STORED AS TEXTFILE;

–Query
SELECT t.id, t.city, t.stats, s.st1, s.st2 FROM ExplodeTest3 t
LATERAL VIEW EXPLODE(Stats) S AS st1, st2;

Hope you find this article helpful.

Please click on follow button to get notifications on latest updates.

2 comments

Leave a Reply