Partitioning in Hive – Big Data & SQL

Apache Hive Questions and Answers-1

26th Dec 2022 SHAFI SHAIK

The collection of interview questions for Apache Hive is available here. I don’t claim full ownership of the questions and answers because the majority of

Partitioning vs Bucketing in Hive

20th Jan 2022 SHAFI SHAIK

Partitioning in Hive divides large tables into smaller logical partitions based on the values of the partition columns; one partition is created for each distinct value. By defining

Apache Hive Data Model

20th Jan 2022 SHAFI SHAIK

Apache Hive is a distributed, fault-tolerant, open source data warehouse platform built on top of Apache Hadoop for reading, writing, and handling

Partitioned, Bucketed and Skewed Tables in Hive

14th Jan 2022 SHAFI SHAIK

When working with large amounts of data on a Hadoop file system, both partitioning and bucketing in Hive are used to avoid full table scans

Bucketing in Apache Hive Part-2

12th Nov 2021 SHAFI SHAIK

Please see my previous post on bucketing and bucketed tables for more information. Bucketed Sorted Tables will be explored in this post. As discussed in

Performance Tuning in Hive

15th Aug 2021 SHAFI SHAIK

Performance tuning is the process of ensuring that an application’s SQL queries execute as quickly as possible. The procedures may or may not differ from