"Target-Dir" vs "Warehouse-Dir" in Sqoop

This article explains the difference between the “target-dir” and “warehouse-dir” options of a Sqoop import.

Consider the two commands below.

Code-1: Usage of “Target-Directory”

sqoop import \
  --connect jdbc:mysql://localhost/empinfo \
  --username root \
  --password cloudera \
  --table emp \
  --target-dir /user/hive/warehouse/empinfo

Code-2: Usage of “Warehouse-Directory”

sqoop import \
  --connect jdbc:mysql://localhost/empinfo \
  --username root \
  --password cloudera \
  --table emp \
  --warehouse-dir /user/hive/warehouse/empinfo

Both commands import the emp table, and in both cases Sqoop creates the “empinfo” folder in the /user/hive/warehouse location.

The difference is where the data lands. With “target-dir”, the emp data (part files) is stored directly in “empinfo”.

The path of the data will be /user/hive/warehouse/empinfo/part-m-00000.

With “warehouse-dir”, Sqoop creates a folder named “emp” (after the table) under “empinfo” and places the data in it.

The path of the data will be /user/hive/warehouse/empinfo/emp/part-m-00000.
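
To make the rule easy to remember, here is a small shell sketch that works out the output directory for a single-table import. The function import_output_dir is a hypothetical helper written for this article, not something Sqoop provides; it only models the behaviour described above.

```shell
# Hypothetical helper (not part of Sqoop): prints the HDFS directory
# that will hold the part files for a single-table import.
#   $1 = option used: "target-dir" or "warehouse-dir"
#   $2 = directory passed to that option
#   $3 = table name
import_output_dir() {
  case "$1" in
    # target-dir: part files go straight into the given directory
    target-dir)    printf '%s\n' "$2" ;;
    # warehouse-dir: a sub-folder named after the table is added
    warehouse-dir) printf '%s\n' "${2%/}/$3" ;;
    *) echo "unknown option: $1" >&2; return 1 ;;
  esac
}

import_output_dir target-dir /user/hive/warehouse/empinfo emp
import_output_dir warehouse-dir /user/hive/warehouse/empinfo emp
```

The first call prints /user/hive/warehouse/empinfo and the second prints /user/hive/warehouse/empinfo/emp, matching the two paths shown above.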

 

Note the following points:

  • “target-dir” works only when you import a single table, so it cannot be used with “sqoop import-all-tables”.
  • “warehouse-dir” names a parent directory; each imported table is stored beneath it in a folder named after the table.
  • If you import table by table with “target-dir”, you must give each import its own directory, because by default Sqoop will not import into a directory that already exists.
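
The last point can be sketched as a dry run that prints one import command per table, each with its own target directory. The function print_table_imports and the table name dept are assumptions made for illustration; only emp appears in the examples above. Nothing is executed against the database, the commands are only printed.

```shell
# Dry-run sketch: prints one sqoop command per table instead of running it.
#   $1      = base HDFS directory
#   $2...   = table names ("dept" below is a hypothetical second table)
print_table_imports() {
  base_dir=$1
  shift
  for table in "$@"; do
    # Each table gets a distinct --target-dir: <base_dir>/<table>
    printf 'sqoop import --connect jdbc:mysql://localhost/empinfo --username root --password cloudera --table %s --target-dir %s/%s\n' \
      "$table" "${base_dir%/}" "$table"
  done
}

print_table_imports /user/hive/warehouse/empinfo emp dept

# To import every table in one go, the equivalent would be:
#   sqoop import-all-tables --connect jdbc:mysql://localhost/empinfo \
#     --username root --password cloudera \
#     --warehouse-dir /user/hive/warehouse/empinfo
```

Building the directory from the table name keeps the locations distinct, which is the same layout “warehouse-dir” produces on its own.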

Hope you like this article.

 
