OVERWRITE is used to replace any existing data in a table or partition with new rows before inserting or loading them. This indicates that OVERWRITE will erase the data from the target table and reload the files. Otherwise, data will be appended to the current data, perhaps resulting in redundancy. The keyword OVERWRITE can be used in both the INSERT and LOAD commands.
Let’s do some exercise.
CREATE TABLE Emp(
EmpNo INT,
EName STRING,
Job STRING,
Mgr INT,
HireDate String,
Sal INT,
Comm INT,
DeptNo INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ‘,’;
LOAD DATA LOCAL INPATH ‘Desktop/Docs/empdata.txt’ INTO TABLE emp;
We now have the table and the data. Let’s edit “empdata.txt” and add one more row at the end.
If we execute the LOAD command above again, the data from the target table (emp) will not be removed, and the whole data from the file, together with the new row, will be loaded to the table. In order to avoid this, we must utilize the OVERWRITE keyword in the load statement.
LOAD DATA LOCAL INPATH ‘Desktop/Docs/empdata.txt’ OVERWRITE INTO TABLE emp;
Now you can query the table and see the results. The newly added row in the file will also be loaded without any duplications.
Hope you find this article helpful.
Please do subscribe to receive notifications on latest posts.