Skipping First and Footer Row – Hive Internal & External Tables

Most of the data-sets (CSV files, Text files, and so forth.) do have header row inside the data, and loading them in the Hive tables will reason null values. This article will assist you in how to deal with the header rows while creating Internal or external tables. 

If you are creating an internal table –

CREATE TABLE IF NOT EXISTS Emp(
empID int,
Ename String,
Sal Decimal,
Dno Int)
row format delimited
fields terminated BY ‘,’
tblproperties(“skip.header.line.count”=”1”);

If you are creating an external table –

CREATE EXTERNAL TABLE IF NOT EXISTS Emp(
empID int,
Ename String,
Sal Decimal,
Dno Int)
row format delimited
fields terminated BY ‘,’
LOCATION ‘/user/cloudera/empdirectory
tblproperties(“skip.header.line.count”=”1”);

If in case you already created a table without specifying “skip.header.line.count” then you still can alter the table properties using the following command. 

ALTER TABLE Emp SET TBLPROPERTIES (“skip.header.line.count”=”1”);

Similarly, if you want to skip the footer line you can use the below in the table properties.

“skip.footer.line.count”=”1”

Even after this implementation if you are getting the header while querying your table, that is probably because vectorization is enabled. Disabling the feature will help in solving the issue.

set hive.vectorized.execution.enabled=false; 

However, vectorization in Hive is a feature that streamlines the query execution operations by processing a block of 1024 rows at a time rather than reading one row at a time. If this feature is enabled it greatly reduces the CPU usage for typical query operations like scans, filters, aggregates, and joins. But enabling or disabling this feature is depends on your data volume and type of data you are dealing with.

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s