Complex Data Types – STRUCT within ARRAY -1

We’ve learned how to deal with each complex datatype separately so far, and we’ve even looked into the possibilities of including all complex data types in a single table. Let’s move on to more complicated data now.

A complex data type, as previously stated, is a representation of multiple fields of a single item. MAP is a key value combination where the key part is a scalar type and the value part may be a scalar or another complex type. Array is a list of similar data with all fields of the same data type. STRUCT is a list of different types of data with different primitive data types.

These complex types, on the other hand, are often used in combination, such as an ARRAY of STRUCT elements, an ARRAY of MAP elements, and a MAP within a STRUCT within an ARRAY. A STRUCT can be the top-level type for a column, or can itself be an item within an ARRAY or the value part of the key-value pair in a MAP. STRUCT can be directly inside an ARRAY or MAP.

Look at the below data to understand how the complex data types can be within each other.

StructWithinArray

In CSV format, the data looks like below.

StructWithinArray-1

If you observe, “Orders” is a column whereas “OrderID” to “PricePerItem” are the sub-columns or “fields” within the “Orders” column. However, it is not a single entry for each customer. It’s an array of struct values.

This is an example of STRUCT within ARRAY.

Let’s see how it can be implemented.

CREATE TABLE CustOrders(
CustID BIGINT,
CustName STRING,
Orders Array<Struct<OrderID: Int,OrderItem: string,Qty: Int,PricePerProduct: Decimal(9,2)>>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ‘\t’
COLLECTION ITEMS TERMINATED BY ‘,’
MAP KEYS TERMINATED BY ‘-‘
LINES TERMINATED BY ‘\n’;

— Load the data.
LOAD DATA LOCAL INPATH ‘Desktop/StructInArrayCustOrders.csv’ INTO TABLE CustOrders;

Let’s see how to retrieve the data from the table.

If you want to see all the orders
SELECT orders[0] FROM CustOrders;

If you want to get all the orderIDs
SELECT orders[0].orderid FROM CustOrders;

If you want get how much a customer is paid for the specific order.
SELECT custid,orders[0].orderid,
SUM(orders[0].qty*orders[0].PricePerProduct)
FROM CustOrders
WHERE orders[0].orderid=180
GROUP BY custid, orders[0].orderid;

If you want to get specific customer’s information along with his orders.
SELECT custid,custname,
orders[0].orderid,
orders[0].qty,
orders[0].PricePerProduct
FROM CustOrders WHERE custid=1001112;

Screenshots:

This slideshow requires JavaScript.

Hope you liked this post.

Please subscribe to receive latest updates.

5 comments

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s