Tables in Apache Hive can be partitioned, much like tables in a relational SQL database. Partitioning divides a table into chunks based on the values of one or more columns. Creating and managing partitions in Hive is comparatively easy. Partitions are typically specified when a table is created and loaded; in this post, however, we explain how to add a partition to an existing table.
Consider the example below.
CREATE TABLE CitiesList(Id INT, Name STRING)
PARTITIONED BY (Country STRING);
LOAD DATA LOCAL INPATH 'Desktop/Docs/UK_Cities.txt'
OVERWRITE INTO TABLE CitiesList
PARTITION (Country='UK');
LOAD DATA LOCAL INPATH 'Desktop/Docs/US_Cities.txt'
OVERWRITE INTO TABLE CitiesList
PARTITION (Country='US');
Using the approach above, we created a table, defined the partition column, and loaded the data. Now, let's add a partition to it.
ALTER TABLE CitiesList
ADD PARTITION (Country='UAE');
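Note that the statement above raises an error if the partition already exists. A slightly more defensive variant, shown here only as a sketch, uses the optional IF NOT EXISTS clause and adds more than one partition in a single statement. The second country value ('IN') is a hypothetical example and is not part of the sample data.

```sql
-- Add the UAE partition only if it is not already registered,
-- and add a second (hypothetical) partition in the same statement.
ALTER TABLE CitiesList
ADD IF NOT EXISTS
PARTITION (Country='UAE')
PARTITION (Country='IN');
```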
A partition has been added to the table. Now we can load the data into the partition.
LOAD DATA LOCAL INPATH 'Desktop/Docs/UAE_Emirates.txt'
OVERWRITE INTO TABLE CitiesList
PARTITION (Country='UAE');
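To confirm that the new partition exists and holds the loaded rows, a quick check along the following lines may help. This is a minimal sketch that assumes the CitiesList table and the UAE partition created in the steps above; the actual output depends on the contents of your data files.

```sql
-- List every partition currently registered for the table
SHOW PARTITIONS CitiesList;

-- Read only the rows stored in the UAE partition
SELECT Id, Name, Country
FROM CitiesList
WHERE Country = 'UAE';
```

Because the filter is on the partition column, Hive can generally restrict the read to the files of the matching partition instead of scanning the whole table.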
Hope you find this article helpful.