Exporting Hive Table Data To HDFS

We frequently import and export data to and from Hive in a variety of scenarios. This article will show you how to save data to HDFS or a local file system as a text file or in any other file type.

The following is the syntax to EXPORT the data from Hive.

INSERT OVERWRITE [LOCAL] DIRECTORY <directory-location>
[ROW FORMAT <format-type>] [ STORED AS <file-format>]
{{ VALUES ({value | NULL} [,…] ) [ , (…) ] } | query }

The “Directory-Location” provides the storage location of the data, which can be either HDFS or a local file system. If the target directory exists, the INSERT OVERWRITE statement will overwrite it; otherwise, it will create the specified directory in the provided path. Within a map-reduce task, Hive can write to HDFS directories in parallel. DELIMITED or a desired SERDE should be used as the basis for the ROW FORMAT. STORED AS should be used to specify the file format in which the data should be saved. The data to be stored should be supplied in the “VALUES” section, or it might come from a specific table.

Overall, INSERT OVERWRITE is the most efficient and straightforward method for extracting huge amounts of data from Hive.

Example:
> INSERT OVERWRITE DIRECTORY ‘/user/cloudera/testdatedir’
> ROW FORMAT DELIMITED
> STORED AS TEXTFILE
> SELECT * FROM testdate;

This slideshow requires JavaScript.

Hope you find this article helpful.

Please subscribe for latest updates.