Normal user-defined functions take in a single input row and output a single output row. In contrast, table-generating functions transform a single input row to multiple output rows.
Explode is one of the UDTF (User Defined Table Generation Function) that allows the array to explode into several rows. This returns a row-set with a single column, one row from the array for each element.
Explode function is also helpful in exploding the map-key values to multiple rows. It returns a row-set with a two columns (key,value) , one row for each key-value pair from the input map.
Examples are given below:
Exploding An Array
Dataset:
1 John Smith 1234567679 1126543210 Ottawa Toronto
2 Scott Tiger 1234567891 1125432190 Regina Yellowknife
–-Create table statement
CREATE TABLE ExplodeTest2(
Id int,
Name string,
Mobile array<string>,
Places_lived array<string>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ‘\t’
COLLECTION ITEMS TERMINATED BY ‘,’
STORED AS TEXTFILE;
–Query
SELECT Id, Name, Ph as Mobile, CT as City
FROM ExplodeTest2 t
lateral VIEW explode(mobile) P AS Ph
lateral VIEW explode(places_lived) C AS CT;
Exploding A Map
Dataset:
1 Toronto 2019:43210,2020:44100
2 Ottawa 2019:43210,2020:44100
–Create table statement
CREATE TABLE ExplodeTest3(
Id int,
City string,
Stats MAP<int,int>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ‘\t’
COLLECTION ITEMS TERMINATED BY ‘,’
MAP KEYS TERMINATED BY ‘:’
STORED AS TEXTFILE;
–Query
SELECT t.id, t.city, t.stats, s.st1, s.st2 FROM ExplodeTest3 t
LATERAL VIEW EXPLODE(Stats) S AS st1, st2;
Hope you find this article helpful.
Please click on follow button to get notifications on latest updates.
2 comments