Related to the previous post, the following is another example of complex data types being used within another complex data type. We’ll deal with STRUCT inside an ARRAY again in this case. In addition to extracting the required data, we’ll see how to use Apache Hive’s built-in functions to the nested fields.
Consider the below dataset.
The data in the CSV file will look the below.
In the above example, the “date” is in “bigint” format which needs to be converted once it is loaded in the table. Let’s see how to do it.
— Creating the table.
CREATE TABLE CustOrders2(
CustID BIGINT,
CustName STRING,
SalesInfo Array<Struct<ProductID: BigInt,StatusModifiedDate:BigInt,Status:String>>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ‘\t’
COLLECTION ITEMS TERMINATED BY ‘,’
MAP KEYS TERMINATED BY ‘:’
LINES TERMINATED BY ‘\n’;
— Loading the data
LOAD DATA LOCAL INPATH ‘Desktop/StructWithinArrayPart-2.csv’ INTO TABLE CustOrders2;
–Querying to retrieve the required information from the table.
SELECT Custid,SalesInfo[0].ProductID,
FROM_UNIXTIME(SalesInfo[0].StatusModifiedDate)
FROM CustOrders2
SELECT Custid,SalesInfo[0].ProductID,
FROM_UNIXTIME(SalesInfo[0].StatusModifiedDate)
FROM CustOrders2
WHERE SalesInfo[0].ProductID=110909316918;
— Using built-in functions to convert datetime format which is in BIGINT.
SELECT Custid,SalesInfo[0].ProductID,
FROM_UNIXTIME(SalesInfo[0].StatusModifiedDate)
FROM CustOrders2
WHERE SalesInfo[0].Status=’Cancelled’;
Hope you liked this post.
Please subscribe to receive latest updates information.
2 comments