Loading Data From HDFS into Hive

In a typical RDBMS, importing data is handled by a separate feature or utility, whereas in Hive it is one of the DML commands. Data can be loaded into a Hive table from HDFS or from the local file system. This post covers loading data from HDFS into Hive.

Syntax:
LOAD DATA INPATH '<HDFS-Location>' [OVERWRITE] INTO TABLE <Table-Name> [PARTITION (<column>=<value>, ...)];

The OVERWRITE keyword is optional. When it is present, the existing data in the table (or in the target partition) is replaced; when it is omitted, the new data files are added alongside the existing data. The load command does not validate the data against the table schema. If the source file is already in HDFS, it is moved (not copied) into the Hive-controlled file system namespace, so it no longer exists at its original location afterwards.
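To make the append and overwrite behaviour concrete, here is a minimal HiveQL sketch. The table name demo_events, its columns, and the /tmp/demo/ paths are hypothetical and only for illustration; it assumes comma-delimited text files already sitting in HDFS.

```sql
-- Hypothetical table for illustration; adjust columns and delimiter to your data.
CREATE TABLE IF NOT EXISTS demo_events (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Without OVERWRITE: the file is moved into the table's directory
-- and sits next to whatever files are already there.
LOAD DATA INPATH '/tmp/demo/events_1.csv' INTO TABLE demo_events;

-- With OVERWRITE: the table's existing files are removed first,
-- so only the rows from events_2.csv remain afterwards.
LOAD DATA INPATH '/tmp/demo/events_2.csv' OVERWRITE INTO TABLE demo_events;

-- Nothing was validated at load time. Fields that cannot be parsed
-- as the declared column type appear as NULL when the table is queried.
SELECT * FROM demo_events LIMIT 10;
```

Because the load only moves files, schema problems usually show up at query time rather than at load time, so a quick SELECT after loading is a cheap sanity check.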

Examples:
LOAD DATA INPATH '/user/cloudera/testfolder/test.txt'
INTO TABLE TestTable;
The command above will import data from ‘test.txt’ into an existing Hive table named “TestTable.”
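One way to confirm that the load behaved as expected is sketched below. It assumes the statements are run from the Hive CLI (or another client that allows the dfs command) and that TestTable already exists, as in the example above.

```sql
-- List the source folder: after the load, test.txt should no longer be here,
-- because LOAD DATA INPATH moves the file rather than copying it.
dfs -ls /user/cloudera/testfolder/;

-- The "Location" row in the output is the HDFS directory that backs the table;
-- the loaded file should now be found there.
DESCRIBE FORMATTED TestTable;

-- Quick check that rows are readable through the table.
SELECT COUNT(*) FROM TestTable;
```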

LOAD DATA INPATH '/user/cloudera/testfolder/JulySales.csv'
OVERWRITE INTO TABLE tblSales PARTITION (month='July');
The command above will import data from “JulySales.csv” into the month='July' partition of a partitioned Hive table called “tblSales.” Because OVERWRITE is used, any existing data in that partition will be replaced; other partitions are left untouched.
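For the partitioned example to work, tblSales has to be defined with month as a partition column. The post does not show the table definition, so the sketch below is an assumption: the data columns sale_id and amount are hypothetical, and only the partition column month comes from the example above.

```sql
-- Hypothetical definition: the partition column is declared in PARTITIONED BY,
-- not in the regular column list.
CREATE TABLE IF NOT EXISTS tblSales (
  sale_id INT,
  amount  DOUBLE
)
PARTITIONED BY (month STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- After running the LOAD DATA ... PARTITION (month='July') statement,
-- the partition should be listed here.
SHOW PARTITIONS tblSales;

-- Filtering on the partition column reads only that partition's files.
SELECT * FROM tblSales WHERE month = 'July' LIMIT 10;
```

Note that the value of month comes from the PARTITION clause of the load statement, not from the contents of the CSV file.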

Hope you find this article helpful.
