As we’ve seen in previous blogs, Hive complex data types such as ARRAY, MAP, and STRUCT are composites of primitive or other complex data types. An ARRAY, for example, can hold elements of a primitive type such as INT, STRING, or DECIMAL, or elements of another complex type. Similarly, MAP and STRUCT can contain both primitive and complex data types; note that a MAP key must be a primitive type, while its value can be of any type.
In previous topics, we learned how to embed a STRUCT inside an ARRAY, an ARRAY within a STRUCT, and an ARRAY within a MAP. Let’s now explore how to handle a STRUCT nested within a MAP.
The practice dataset is shown below. It comprises a Department ID and the Department Location (building, area, city, and zip code):
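To make the idea of nesting concrete, here is a minimal sketch of a table declaration that mixes the three complex types. The table name ComplexTypesDemo and its columns are hypothetical and are used only for illustration; they are not part of the dataset in this post.

```sql
-- Hypothetical table, for illustration only
CREATE TABLE ComplexTypesDemo (
  Scores  ARRAY<INT>,                                  -- array of a primitive type
  Tags    MAP<STRING, ARRAY<STRING>>,                  -- map whose value is a complex type
  Contact STRUCT<Email:STRING, Phones:ARRAY<STRING>>   -- struct containing an array
);
```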
DeptID, Building, Area, City, ZipCode
10, Data Centre, Internet City, Dubai, 83000
20, Corporate Plaza, Internet City, Dubai, 83001
30, Data Centre, Media City, Dubai, 83002
Structure of the table in which the above data should be placed:
+--------------+------------------------------------+
| name         | type                               |
+--------------+------------------------------------+
| Dept         | MAP<INT, STRUCT<                   |
|              | Building:String,                   |
|              | Area:String,                       |
|              | City:String,                       |
|              | ZipCode:BigINT>>                   |
+--------------+------------------------------------+
Because the data is in CSV (comma-separated values) format, let’s first create a staging table in Hive to hold it temporarily.
-- Staging table: one column per CSV field
CREATE TABLE StructInMapTestData(
  DeptID INT,
  Building STRING,
  Area STRING,
  City STRING,
  ZipCode BIGINT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
-- Load the CSV file from the local filesystem into the staging table
LOAD DATA LOCAL INPATH 'Desktop/Docs/StructInMap' INTO TABLE StructInMapTestData;
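Before moving on, it can be worth checking that the rows were parsed as expected. The queries below are a minimal sketch and assume the file matches the sample shown earlier. If the file really does contain a space after each comma, the STRING columns keep that leading space, which the second query strips with trim().

```sql
-- Inspect the raw rows as loaded
SELECT * FROM StructInMapTestData;

-- Show the string fields with surrounding whitespace removed
SELECT DeptID,
       trim(Building) AS Building,
       trim(Area)     AS Area,
       trim(City)     AS City,
       ZipCode
FROM StructInMapTestData;
```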
After these statements run, the raw data sits in the staging table. Next, we create the target table and convert the flat rows into nested values that match the structure shown above.
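-- Target table: a single MAP column keyed by DeptID, with a STRUCT as the value
CREATE TABLE StructInMap(
  Dept MAP<INT, STRUCT<Building:STRING, Area:STRING, City:STRING, ZipCode:BIGINT>>);
-- Build one map entry per staging row; named_struct takes alternating field names and values
INSERT INTO StructInMap
SELECT MAP(DeptID,
  named_struct('Building', Building, 'Area', Area, 'City', City, 'ZipCode', ZipCode))
FROM StructInMapTestData;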
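Once the INSERT completes, the nested values can be read back. The queries below are a minimal sketch rather than part of the original walkthrough; the aliases d, DeptID, and Details are names chosen here for illustration. Because MAP() is called once per source row, each row of StructInMap holds a map with a single entry.

```sql
-- Look up one department by its map key, then read struct fields with dot notation.
-- Only rows whose map actually contains key 10 are kept.
SELECT Dept[10].Building AS Building,
       Dept[10].City     AS City
FROM StructInMap
WHERE array_contains(map_keys(Dept), 10);

-- Flatten every map entry into a (key, value) pair and unpack the struct
SELECT d.DeptID,
       d.Details.Building,
       d.Details.Area,
       d.Details.City,
       d.Details.ZipCode
FROM StructInMap
LATERAL VIEW explode(Dept) d AS DeptID, Details;
```

The first form suits lookups where the key is known in advance; the second is handy when every department needs to be listed as ordinary rows.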
Hope you find this article helpful.