Complex Data Types – Part-4

Read: Array – Complex Data Type In Apache Hive
Read: Map – Complex Data Type in Apache Hive

In this post, we’ll look at another complex data type in Apache Hive called “STRUCT.”

STRUCT in Hive is similar to a struct in the C programming language. It’s a record type that groups a collection of named fields, each of which can have a different data type. We access the fields of a STRUCT using dot (.) notation.

In simple words, a STRUCT, in contrast to an “Array,” is a collection of elements of different types.
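
To see the dot notation on its own, here is a minimal sketch that builds a STRUCT on the fly with Hive’s built-in named_struct() function and reads two fields back. The aliases (s, t) and the literal values are made up purely for illustration, and the query assumes a Hive version that allows SELECT without a FROM clause (0.13 or later).

```sql
-- Illustrative only: the alias names and field values below are made up.
SELECT t.s.city, t.s.zip
FROM (
  -- named_struct takes alternating field names and values.
  SELECT named_struct('city', 'SomeCity', 'zip', '00000') AS s
) t;
```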

Let’s consider the below dataset.

STRUCT_dataset

If you look at the above image, the “info” column has several sub-columns that contain integers, decimals, and strings.

Let’s do the exercise now.

CREATE TABLE tbUniversity(
rnk INT,
Name STRING,
Info STRUCT<FTEStdCount:Int,
                         StdPerStaff:Float,
                         Area:String,
                         City:String,
                         State:String,
                         Zip:String,
                         Country:String>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ',';
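
If we want to confirm that Hive registered the nested fields as intended, a quick check is to describe the table. This is only a sketch; the exact output layout depends on the Hive version and client in use.

```sql
-- Show column names and types, including the full STRUCT<...> definition of Info.
DESCRIBE tbUniversity;
```

Hive treats column and field names as case-insensitive, which is why info.city and Info.City can be used interchangeably in the queries in this post.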

LOAD DATA LOCAL INPATH 'Desktop/STRUCT_dataset.csv' INTO TABLE tbUniversity;

SELECT * FROM tbUniversity;

Struct_Implementation

Now, let’s query the table to get some results. This way, we can see how to retrieve STRUCT data.

SELECT info.city FROM tbUniversity;

SELECT info.StdPerStaff FROM tbUniversity;

SELECT Rnk,Name,info.StdPerStaff,info.City,Info.Country FROM tbUniversity WHERE rnk=1;

SELECT Rnk,Name,info.StdPerStaff,info.area,info.City,Info.Country
FROM tbUniversity
WHERE info.country='UK';

Struct_Implementation2

Struct_Implementation3
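
STRUCT fields are not limited to SELECT and WHERE; they should also work in GROUP BY and ORDER BY like ordinary columns. Here is a small sketch against the same table. The output aliases (country, universities, std_per_staff) are my own choice and not part of the dataset.

```sql
-- Count universities per country, using a STRUCT field as the grouping key.
SELECT info.Country AS country, COUNT(*) AS universities
FROM tbUniversity
GROUP BY info.Country;

-- List universities ordered by their students-per-staff ratio.
SELECT Name, info.StdPerStaff AS std_per_staff
FROM tbUniversity
ORDER BY std_per_staff;
```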

Hope you find this article helpful.

Please subscribe to get updates on the latest posts.
