Adding a file in External Table’s Location in Hive

When we create an external table in Hive, we typically specify its location rather than the actual file. Hive reads data from the specified location, so there is no need for an insert statement. That is, if we issue a SELECT statement on the table, Hive will read all of the files from the location specified during creation and return information. There will be no data returned if there is no file in the location.

Now the question is, if we copy another file to the same location, will the data from that file be displayed in the Hive table?

Yes, it reads the location whenever we issue the SELECT statement and searches for files and retrieves information from them.

Let’s put it to the test with an exercise.

I’m creating three text files with the following data.
file1.txt                 file2.txt              file3.txt
1,Test1_1               5,Test2_1            9,Test3_1
2,Test1_2               6,Test2_2            10,Test3_2
3,Test1_3               7,Test2_3            11,Test3_3
4,Test1_4               8,Test2_4            12,Test3_4

As shown below, file1.txt and file2.txt are placed in the ‘exttable’ folder, and file3.txt is copied to the Desktop.

exttest_folder

Now, using the HDFS terminal, I am copying the exttest folder from Desktop to /user/cloudera, which will be the file location for the external table.
[cloudera@quickstart ~]$ hadoop fs -put Desktop/exttest /user/cloudera/
The directory/folder has been copied to the user directory. Let’s verify.

[cloudera@quickstart ~]$ hadoop fs -ls /user/cloudera/exttest;
Found 2 items
-rw-r–r– 1 cloudera cloudera 40 2021-12-24 01:03 /user/cloudera/exttest/file1.txt
-rw-r–r– 1 cloudera cloudera 40 2021-12-24 01:03 /user/cloudera/exttest/file2.txt

Now, in Hive, let’s create the external table.
hive> CREATE EXTERNAL TABLE exttest(someid INT, sometxt STRING)
ROW FORMAT DELIMITED
FILEDS TERMINATED BY ‘,’
LOCATION ‘/user/cloudera/exttest’;

Now, let’s verify the data.
SELECT * FROM exttest;

We can now see the data from the two text files.
Let’s now copy file3.txt to the same location.

[cloudera@quickstart ~]$ hadoop fs -put Desktop/file3.txt /user/cloudera/exttest
[cloudera@quickstart ~]$ hadoop fs -ls /user/cloudera/exttest;
Found 3 items
-rw-r–r– 1 cloudera cloudera 40 2021-12-24 01:03 /user/cloudera/exttest/file1.txt
-rw-r–r– 1 cloudera cloudera 40 2021-12-24 01:03 /user/cloudera/exttest/file2.txt
-rw-r–r– 1 cloudera cloudera 43 2021-12-24 01:06 /user/cloudera/exttest/file3.txt

Now, let’s go back and see what Hive table displays.

SELECT * FROM exttest;
Data from the three files are being displayed.

Placing File in External Location

I hope you found this post to be informative.

Please enter your email address to receive notifications of new postings.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s