Skewed Tables – All Articles

This is a collection of articles regarding skewed tables in Hive that have been published on this website.

Skewed Values on Several Columns – Hive

Skewed tables are those in which some column values occur much more frequently than others, so the distribution of values is skewed. Hive automatically separates the skewed values into different files and takes this into account during queries, so that it can skip or include whole files where possible, which improves performance. In this post,…

Partitioned, Bucketed and Skewed Tables in Hive

When working with large amounts of data on the Hadoop Distributed File System (HDFS), both partitioning and bucketing in Hive are used to avoid full table scans and improve efficiency. Tables are divided into smaller, more manageable pieces by the defined partitions and/or buckets, which can improve query performance. The way data is segregated is…

Check if table is skewed – Apache Hive

As stated in the earlier article, skewed tables are those in which some column values occur much more frequently than others, so the distribution of values is skewed. Hive automatically separates the skewed values into different files and takes this into account during queries, so that it can skip or include whole files where possible, thus enhancing…

Altering Skewed Tables in Hive

As discussed in the earlier posts, skewed tables are those in which some column values occur much more frequently than others, so the distribution of values is skewed. Hive automatically separates the skewed values into different files and takes this into account during queries, so that it can skip or include whole files where possible;…

Apache Hive Skewed Tables Examples

As stated in the earlier article, skewed tables are those in which some column values occur much more frequently than others, so the distribution of values is skewed. Hive automatically separates the skewed values into different files and takes this into account during queries, so that it can skip or include whole files where possible, thus…

Apache Hive – Skewed Tables

Skewed tables are those in which some column values occur much more frequently than others, so the distribution of values is skewed. Hive automatically separates the skewed values into different files and takes this into account during queries, so that it can skip or include whole files where possible, which improves performance. Look at the…
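
The idea that the excerpts on this page keep returning to (skewed values stored in their own files, so a query can skip whole files) can be illustrated with a small simulation. This is only a conceptual sketch in Python, not Hive's implementation: the sample rows, the `"other"` bucket name, and the `write_buckets` and `lookup` helpers are all hypothetical and chosen for illustration. In Hive itself, the skewed values are declared in the table definition with a `SKEWED BY (...) ON (...)` clause.

```python
from collections import defaultdict

# Hypothetical sample data: (customer_id, amount). Ids 1 and 5 dominate.
rows = [(1, 10), (1, 12), (5, 7), (1, 3), (9, 4), (5, 8), (2, 6), (1, 1)]

# Assumption: the skewed values are known and declared up front.
skewed_values = {1, 5}


def write_buckets(rows, skewed_values):
    """Give each skewed key its own bucket; all other keys share one bucket."""
    buckets = defaultdict(list)
    for key, amount in rows:
        bucket = key if key in skewed_values else "other"
        buckets[bucket].append((key, amount))
    return buckets


def lookup(buckets, skewed_values, key):
    """Read only the bucket that can contain `key`, then filter its rows."""
    bucket = key if key in skewed_values else "other"
    scanned = buckets.get(bucket, [])
    matches = [row for row in scanned if row[0] == key]
    return matches, len(scanned)


buckets = write_buckets(rows, skewed_values)
matches, rows_scanned = lookup(buckets, skewed_values, 5)
print(matches, rows_scanned)
```

Looking up a skewed key touches only that key's bucket, and looking up any other key touches only the shared bucket, so neither lookup has to scan all eight rows.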
