Apache Hive Internal Tables

The following is a list of posts related to “Apache Hive Internal Tables.”

Internal Tables in Hive – Part-1

As mentioned in the previous post, create an internal table when the data is temporary or when you want Hive to control the life cycle of both the table and its data. For internal tables, Hive manages both the data and the metadata, and the data is kept within the Hive warehouse by default. Before dropping an internal table, one must be careful…

Internal Tables in Hive – Part-2

This is a continuation of “Internal Tables in Hive” from the previous article. When creating an internal table in Hive, you can decide where its data is placed. Apart from Hive’s default location (/user/hive/warehouse/database/table/), the data can sit elsewhere in HDFS or on the local file system. Let’s see how this can be accomplished. Consider…
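As a rough illustration of placing an internal table's data outside the default warehouse directory, here is a minimal HiveQL sketch. The table name `emp_managed`, its columns, and the HDFS path `/data/emp_managed` are hypothetical and do not come from the original post. Some Hive versions and configurations restrict custom locations for managed tables, so treat this as a sketch under those assumptions rather than the article's own example.

```sql
-- Hypothetical table, columns, and path; for illustration only.
-- No EXTERNAL keyword, so Hive treats this as an internal (managed) table.
CREATE TABLE emp_managed (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
-- LOCATION overrides the default warehouse directory for this table's data.
LOCATION '/data/emp_managed';
```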

Internal Tables in Hive – Part-3

This is a continuation of “Internal Tables in Hive” from the previous article. When creating an internal table in Hive, you can decide where its data is placed. Apart from Hive’s default location (/user/hive/warehouse/database/table/), the data can sit elsewhere in HDFS or on the local file system. Let’s see how this can be accomplished. Consider…

Hive Internal Tables Using CTAS

Besides the earlier ways of creating internal tables (a.k.a. managed tables), another approach is the CTAS (create-table-as-select) statement. Be aware that because CTAS is atomic, the table is not visible to other users until the query results are populated. In other words, other users…
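A CTAS statement might look like the following minimal sketch. The source table `dept`, the target table `dept_named`, and the choice of ORC as the storage format are assumptions made for illustration; they are not taken from the original post.

```sql
-- Hypothetical source table `dept` and target table `dept_named`.
-- The new table takes its column names and types from the SELECT list.
CREATE TABLE dept_named
STORED AS ORC
AS
SELECT dno, dname
FROM dept
WHERE dname IS NOT NULL;
```

The target table is created and populated in one statement, which is why other sessions do not see a half-filled table.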

Hive Internal vs External Tables

This article offers a summary of the situations in which you would need to create internal (managed) tables and external tables in Apache Hive. Create “External” tables when: the data is also used outside Hive (for instance, the data files are read and interpreted by an existing program that does not lock the files); or the data needs to stay in its underlying location even after a DROP TABLE. In other words, the data…

Skipping First and Footer Row – Hive Internal & External Tables

Most data sets (CSV files, text files, and so forth) have a header row inside the data, and loading them into Hive tables as they are will cause null values. This article will help you deal with header rows while creating internal or external tables. If you are creating an internal table…
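One common way to handle this is through table properties, sketched below. The table name `sales_raw` and its columns are hypothetical, and the sketch assumes a comma-delimited text file with exactly one header line and one footer line.

```sql
-- Hypothetical table and columns; assumes one header and one footer line per file.
CREATE TABLE sales_raw (
  id     INT,
  amount DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
TBLPROPERTIES (
  'skip.header.line.count' = '1',  -- ignore the first line of each file
  'skip.footer.line.count' = '1'   -- ignore the last line of each file
);
```

Without these properties, a header such as `id,amount` would be read as a data row, and text that cannot be parsed into the typed columns would show up as NULL.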

Hive Internal Table – With External Data

Have you ever wondered what will happen if you miss the “external” keyword while creating an external table? It will end up as an internal table. Let’s check it out. Here is my sample data. It has three columns, namely dno, dname, and location.

11, marketing, hyd
12, hr, delhi
13, finance, bang
14, retail, madras
20, db, hyd
21,…
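The situation described here can be sketched as follows. The column names come from the excerpt; the table name `dept_no_external` and the HDFS path `/data/dept` are hypothetical, and the column types are assumed.

```sql
-- Hypothetical table name and path; column types are assumed.
-- The EXTERNAL keyword is left out on purpose.
CREATE TABLE dept_no_external (
  dno        INT,
  dname      STRING,
  `location` STRING   -- backticks keep the column name distinct from the LOCATION clause
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/data/dept';

-- The "Table Type" field in the output reports MANAGED_TABLE
-- instead of EXTERNAL_TABLE.
DESCRIBE FORMATTED dept_no_external;
```

Because Hive treats such a table as managed, dropping it would normally remove the files under that path as well.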
