Internal Tables in Hive – Part-2

This is a continuation of “Internal Tables in Hive” from the previous article. When you create an internal table in Hive, you can choose where its data is stored. Instead of Hive’s default location (/user/hive/warehouse/database/table/), the data can be placed in another HDFS directory or on the local file system.

Let’s see how this can be accomplished.

Consider the same data-set specified in the previous article.

CREATE TABLE Books2(
BookID INT,
BookName STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/cloudera/books';

The location specified above is an HDFS path.

Now, load the data into the table.
LOAD DATA LOCAL INPATH 'Desktop/book.csv' INTO TABLE Books2;
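To confirm the load from within Hive itself, something like the following could be run. This is a minimal sketch that assumes the Books2 table created above. In the DESCRIBE FORMATTED output, the Location field should point to the directory given in the CREATE TABLE statement, and the Table Type field should read MANAGED_TABLE.

```sql
-- Peek at a few rows to check that the fields were split on the comma.
SELECT * FROM Books2 LIMIT 5;

-- Show the table metadata, including Location and Table Type.
DESCRIBE FORMATTED Books2;
```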


Now let’s verify the location of the data using Hue file browser.


As the file browser shows, the data is placed in the specified HDFS location, not the usual Hive warehouse directory. Because this is still an internal table, dropping it in Hive deletes the data at that location too, so you’ll lose this data as well.
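The effect of a drop can be checked from the Hive CLI, which passes dfs commands through to HDFS. The following is a sketch only, assuming the table and path used above. On clusters where HDFS trash is enabled, the deleted files may be moved to the user's trash directory rather than removed immediately.

```sql
-- Before the drop: the loaded file should be listed here.
dfs -ls /user/cloudera/books;

-- Dropping an internal (managed) table removes its metadata and its data.
DROP TABLE Books2;

-- After the drop: the listing is expected to fail or come back empty.
dfs -ls /user/cloudera/books;
```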

Click here to see how to specify the local file location for the data.
