Convert Delimited Data Into Columns In Hive

The conversion of delimited data into columns in Apache Hive is discussed in this article. If the data is already loaded as a single column in a table, there is a “SPLIT” function that helps us solve this problem. If the table has not been generated yet you can simply create a table with the FIELDS TERMINATED BY clause in the Apache Hive and load the data directly into the table; the delimited data will be converted to columns automatically. 

Let’s assume that the table is already created and the data is loaded as a single column. Let’s look into it with an example. Consider the below data set for practice.

1456, ConvertedData1, SomeOtherData1
1466, ConvertedData2, SomeOtherData2
1476, ConvertedData3, SomeOtherData4

Let’s create the table and load the data using Apache Impala.

— Create table statement

CREATE TABLE splitStringData(dataCol1 STRING);

— Data insertion

(‘1456, ConvertedData1, SomeOtherData1’),
(‘1466, ConvertedData2, SomeOtherData2’),
(‘1476, ConvertedData3, SomeOtherData4’)

Connect to Hive now to convert the delimited data using SPLIT function.


The split function splits the string around the pattern pat and returns an array of strings. You can also specify regular expressions as patterns.

split(string A, string pattern)

SELECT SPLIT(dataCol1,’\\,’)[0] Col1,
SPLIT(dataCol1,’\\,’)[1] Col2,
SPLIT(dataCol1,’\\,’)[2] Col3
FROM splitStringData;

Hive Split


Similar implementation in other RDBMS platforms –


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s