Read: Array – Complex Data Type In Apache Hive
Read: Map – Complex Data Type in Apache Hive
In this post, we’ll look at another complex data type in Apache Hive called “STRUCT.”
STRUCT in Hive is similar to a struct in the C programming language. It is a record type that groups a collection of named fields, each of which can have its own data type. We access the elements of a STRUCT using dot (.) notation, for example info.city.
In simple words, a STRUCT, in contrast to an “Array,” is a collection of elements of different data types.
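To make the idea concrete, here is a minimal sketch that builds a STRUCT on the fly and reads its fields back with dot notation. It uses Hive's built-in named_struct() function; the field names and values are made up for illustration, and it assumes a Hive version that allows a SELECT without a FROM clause in the inner query.

```sql
-- Hypothetical values, for illustration only.
-- named_struct() takes alternating field names and values and returns a STRUCT.
SELECT s.city,      -- a string field
       s.students,  -- an integer field
       s.ratio      -- a decimal field
FROM (
  SELECT named_struct('city', 'Cambridge',
                      'students', 1000,
                      'ratio', 11.5) AS s
) t;
```

Each field keeps its own type, which is what distinguishes a STRUCT from an Array, where every element shares one type.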
Let’s consider the dataset below.
If you look at the above image, the “info” column has several sub-columns that contain integers, decimals, and strings.
Let’s do the exercise now.
CREATE TABLE tbUniversity(
  rnk INT,
  Name STRING,
  Info STRUCT<FTEStdCount:INT,
              StdPerStaff:FLOAT,
              Area:STRING,
              City:STRING,
              State:STRING,
              Zip:STRING,
              Country:STRING>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ',';
LOAD DATA LOCAL INPATH 'Desktop/STRUCT_dataset.csv' INTO TABLE tbUniversity;
SELECT * FROM tbUniversity;
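Before querying individual fields, it can help to confirm how Hive registered the STRUCT column. A minimal sketch, using the table created above:

```sql
-- Lists each column with its type; the Info column should appear
-- as a struct<...> type listing all seven fields.
DESCRIBE tbUniversity;
```

If the Info fields come back as NULL in the SELECT * output, a likely cause is that the delimiters in the data file do not match the ones declared in the table definition (tab between columns, comma between STRUCT fields).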
Now, let’s query the table to get some results. This way, we can see how to retrieve STRUCT data.
SELECT info.City FROM tbUniversity;
SELECT info.StdPerStaff FROM tbUniversity;
SELECT rnk, Name, info.StdPerStaff, info.City, info.Country FROM tbUniversity WHERE rnk = 1;
SELECT rnk, Name, info.StdPerStaff, info.Area, info.City, info.Country
FROM tbUniversity
WHERE info.Country = 'UK';
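STRUCT fields are not limited to the SELECT list and WHERE clause. As a rough sketch, they can also be used for grouping and sorting; the column aliases below (universities, std_per_staff) are names chosen here for illustration.

```sql
-- Count universities per country, grouping on a STRUCT field.
SELECT info.Country, COUNT(*) AS universities
FROM tbUniversity
GROUP BY info.Country;

-- Five universities with the fewest students per staff member.
SELECT Name, info.StdPerStaff AS std_per_staff
FROM tbUniversity
ORDER BY std_per_staff
LIMIT 5;
```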
I hope you find this article helpful.