Inserting Data Into Hive Tables From Queries

In this tutorial, we’ll look at how to use SELECT statements to INSERT data. This implementation is identical to that of SQL. Although I couldn’t conceive of this article’s content as a distinct topic, and it is also an integral part of many of my previous articles, I am considering it now for the purpose of novices who are studying “Apache Hive Course Contents” step by step. 

Inserting data into Hive tables can be done in three ways.
(1) Using the LOAD command,
(2) Query-based inserting, and
(3) Straight insert statements.

LOAD command will help loading data from Local file system and HDFS into Hive. DML operations were not supported in earlier versions of Hive, however this was fixed in later versions. In this article, we will see how the data can be inserted using queries (SELECT statements).

Standard syntax:
INSERT OVERWRITE TABLE tablename1
[PARTITION (partcol1=val1, partcol2=val2 …) [IF NOT EXISTS]]
select_statement1 FROM from_statement;

Examples:
INSERT INTO TempTable2 SELECT * FROM TempTable1

INSERT INTO TempTable2
SELECT Col1, Col2, Col3 FROM TempTable1

INSERT INTO TempTable2
SELECT Col1, Col2, Col3 FROM TempTable1 WHERE Col1 = 1000;

INSERT INTO TempTable2
SELECT DISTINCT Col1, Col2, Col3 FROM TempTable1 WHERE Col1 = 1000;

INSERT INTO TempTable2
SELECT Col1, Col2, Col3 FROM TempTable1 WHERE Col1 = 1000;

— If the table is partitioned
INSERT INTO TownsList_Dynamic PARTITION(Country)
SELECT * FROM TownsList;

Hope you find this article helpful.

Please subscribe for more interesting updates.

One comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s