Sqoop Complete Tutorial Part-5

This is a continuation of the “Sqoop Complete Tutorial” series.

14) Read the contents of the database without importing it. 

sqoop list-tables \
  --connect jdbc:mysql://localhost/empdept \
  --username root \
  --password cloudera

The above command lists the tables in the given database.

15) Get the list of databases using Sqoop

sqoop list-databases \
  --connect jdbc:mysql://localhost/ \
  --username root \
  --password cloudera
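
The two listing commands are handy as a pre-flight check before an import. The sketch below is one possible way to do that, assuming Sqoop prints one name per line on standard output and sends its log messages elsewhere. The helper `has_name` is hypothetical and not part of Sqoop; the connection details are the ones used in this tutorial.

```shell
# has_name NAME: succeed if NAME appears as a whole line on standard input.
# (Hypothetical helper, not part of Sqoop.)
has_name() {
  grep -qxF -- "$1"
}

# Run the real check only where the sqoop client is installed.
if command -v sqoop >/dev/null 2>&1; then
  if sqoop list-tables \
       --connect jdbc:mysql://localhost/empdept \
       --username root \
       --password cloudera | has_name Emp; then
    echo "Table Emp found in empdept"
  else
    echo "Table Emp not found in empdept"
  fi
fi
```

The same helper works with the output of sqoop list-databases, for example to confirm that empdept exists before listing its tables.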


16) Import specific columns from a MySQL table to HDFS

sqoop import \
  --connect jdbc:mysql://localhost/empdept \
  --username root \
  --password cloudera \
  --table Emp \
  --columns "EmpNo,EName,DeptID"

The above command imports only the columns that are specified. This is important because, most of the time, we do not need to import all of the data from a table.
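
When the column list is long, it can be assembled in the shell so that it contains no stray spaces or smart quotes, which are easy to pick up when copying commands from a web page. This is only a sketch: the variable name `col_list` is an assumption, and the connection details are the ones used earlier in this tutorial.

```shell
# Join the column names with commas and no spaces.
col_list=$(printf '%s,' EmpNo EName DeptID)
col_list=${col_list%,}   # drop the trailing comma
echo "$col_list"

# Run the import only where the sqoop client is installed.
if command -v sqoop >/dev/null 2>&1; then
  sqoop import \
    --connect jdbc:mysql://localhost/empdept \
    --username root \
    --password cloudera \
    --table Emp \
    --columns "$col_list"
fi
```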

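17) Controlling parallelism during import

In the example below, we try to import the whole table as quickly as possible by running several import processes in parallel. This is done by assigning more mappers with the -m option (the short form of --num-mappers). Sqoop divides the rows among the mappers using the table's primary key, or the column named with --split-by if the table has none.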

sqoop import \
  --connect jdbc:mysql://localhost/empdept \
  --username root \
  --password cloudera \
  --table Emp \
  -m 8
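
To make the effect of the mapper count more concrete, the sketch below imitates how a numeric key range might be divided among 8 mappers. The minimum and maximum values are made up for illustration, and the arithmetic only approximates what Sqoop does internally. Using EmpNo as the split column is also an assumption; the tutorial does not say which column is the primary key of Emp.

```shell
# Hypothetical values: in a real run Sqoop looks up the smallest and
# largest value of the split column itself.
min_id=1
max_id=800
mappers=8

# Approximate width of each mapper's key range.
range=$(( (max_id - min_id + 1) / mappers ))

i=0
while [ "$i" -lt "$mappers" ]; do
  lo=$(( min_id + i * range ))
  hi=$(( lo + range - 1 ))
  # Let the last mapper absorb any remainder.
  if [ "$i" -eq $(( mappers - 1 )) ]; then
    hi=$max_id
  fi
  echo "mapper $i: EmpNo $lo..$hi"
  i=$(( i + 1 ))
done

# The matching import, with the split column named explicitly.
# Runs only where the sqoop client is installed.
if command -v sqoop >/dev/null 2>&1; then
  sqoop import \
    --connect jdbc:mysql://localhost/empdept \
    --username root \
    --password cloudera \
    --table Emp \
    --split-by EmpNo \
    --num-mappers "$mappers"
fi
```

If the values of the split column are unevenly distributed, some mappers will receive far more rows than others, so adding mappers does not always shorten the import.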

With the above command, we have finished learning how to import data from MySQL to HDFS using Sqoop. The upcoming articles cover importing data from MySQL into Hive.

Please click here for the next part
