SQOOP Complete Tutorial Part-9

Previous articles covered how to move data from a relational database to HDFS and Hive. We’ll now look at how to get the data out of HDFS and Hive. Before exporting, the target table must exist in the target database. As previously stated, everything in HDFS is a file; there are no tables. When these files are exported into an RDBMS, the rows will be read and parsed into records. In this and subsequent posts, the following topics will be discussed.

  • Exporting CSV/Text data from HDFS to MySQL
  • Exporting into a staging table
  • Sqoop Export – Update the records in the destination table.
  • Sqoop Export – Append the new and modify the existing the values in the destination table.
  • Export data from Hive Warehouse/Catalog to MySQL

23) Exporting CSV/Text data from HDFS to MySQL
Employee records are stored in a file called emp in /user/cloudera/empdir.

EmpDataset_in_HDFS

The information is provided below.

101, ‘Smith’,20000,’IT’
102, ‘Sean’,20000,’IT’
103, ‘Debby’,15000,’Accounts’
104, ‘Juan’,14000,’Accounts’
105, ‘Mona’,15000,’HR’

Now, let’s create a table in MySQL, because, as previously said, the destination table must exist before export.

In MySQL, use the below commands to create the database and the target table.

–Create the database
CREATE DATABASE NewTestDB;

–Change into the newly created database to create the target table.
USE NewTestDB;

–Creating the Target table
CREATE TABLE Emp(
EmpID INT NOT NULL PRIMARY KEY,
Ename VARCHAR(20),
Salary INT,
Dept VARCHAR(10));

Let’s now use Sqoop to export the data from the HDFS console.

sqoop export \
–connect jdbc:mysql://localhost/NewTestDB \
–username root \
–table Emp\
–export-dir /user/cloudera/empdir

This slideshow requires JavaScript.

Let’s return to MySQL and query the table “Emp” to verify if the data was successfully exported.

MySQL_SqoopExport

Hope you find this article helpful.

Please proceed to next article to learn more about Sqoop Export.

Leave a Reply