In the previous article, we looked at how to work with the complex datatype “Array.” Now, we’ll look at “MAP,” which is another complex datatype.
Before we start the exercise, let us look at what the MAP data type is.
Map is a complex data type in Apache Hive that can store Key-Value pairs. Values from a map can be accessed using the keys. It is an unordered collection of key-value pairs. Keys must be of primitive types. Values can be of any type.
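As a quick illustration of the definition above, here is a minimal sketch that builds a map with Hive's built-in map() constructor and reads it back. The keys and numbers are made-up illustration values, not taken from the data set, and the subquery form assumes a Hive version that allows SELECT without a FROM clause (0.13 or later).

```sql
-- Build a small map inline and inspect it.
-- 'Female'/'Male' and the numbers 46/54 are illustration values only.
SELECT
  m['Male']     AS male_value,   -- look up a value by its key
  m['Other']    AS missing_key,  -- a key that is not in the map yields NULL
  map_keys(m)   AS all_keys,     -- array of the keys
  map_values(m) AS all_values,   -- array of the values
  size(m)       AS pair_count    -- number of key-value pairs
FROM (SELECT map('Female', 46, 'Male', 54) AS m) t;
```

Because a map is unordered, the order of the entries returned by map_keys() and map_values() should not be relied on.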
We are going to use the data set below to learn about complex data types in Apache Hive. The figure shows the female-male ratio in top-ranked universities in the United States, the United Kingdom, and Australia.
In the “Female:Male Ratio” column above, the values come in pairs. Male and Female are the keys; each key is separated from its value by a colon (:), and the pairs are separated from each other by commas (,). Let’s implement it in Apache Hive.
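-- Ratio is a MAP column: String keys (Male, Female) with Int values.
CREATE TABLE tbRatio(
Rnk Int, University String, Ratio Map<String, Int>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'            -- columns are separated by tabs
COLLECTION ITEMS TERMINATED BY ','   -- key-value pairs are separated by commas
MAP KEYS TERMINATED BY ':';          -- a colon separates each key from its value
-- Load the data file from the local file system into the table.
LOAD DATA LOCAL INPATH 'Desktop/MAP_dataset.csv' INTO TABLE tbRatio;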
SELECT * FROM tbRatio;
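To see how the two delimiters in the table definition split a field into a map, the same parsing can be tried on a single string with the built-in str_to_map() function. This is only a sketch: the input string is a made-up example in the key:value,key:value shape described above, not a row from the data set. Note that str_to_map() returns string values, whereas the Ratio column is declared with Int values.

```sql
-- Second argument: delimiter between pairs (like COLLECTION ITEMS TERMINATED BY).
-- Third argument: delimiter between key and value (like MAP KEYS TERMINATED BY).
-- The input string is a hypothetical example value.
SELECT str_to_map('Female:46,Male:54', ',', ':') AS parsed_ratio;

-- Confirm the column types Hive recorded for the table.
DESCRIBE tbRatio;
```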
Now, let’s query the table using MAP keys.
SELECT ratio['Male'] FROM tbRatio WHERE Rnk=1;
SELECT ratio['Male'], ratio['Female'] FROM tbRatio WHERE Rnk=1;
SELECT ratio['Male'], ratio['Female'] FROM tbRatio WHERE Rnk BETWEEN 1 AND 5;
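Beyond looking up single keys, map values can also be used in a filter, and a map can be expanded into one row per key-value pair. The sketch below assumes the tbRatio table defined earlier; the aliases r, gender, share, female, and male are names chosen here for illustration.

```sql
-- Filter on map values: universities where the Female value exceeds the Male value.
SELECT University, ratio['Female'] AS female, ratio['Male'] AS male
FROM tbRatio
WHERE ratio['Female'] > ratio['Male'];

-- Expand the map: explode() on a map emits one row per pair,
-- with the key and the value as two separate columns.
SELECT t.University, r.gender, r.share
FROM tbRatio t
LATERAL VIEW explode(t.ratio) r AS gender, share
WHERE t.Rnk BETWEEN 1 AND 5;
```

Map keys are matched exactly, so ratio['male'] would return NULL if the key was loaded as 'Male'.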
Hope you liked this post.