This article gives an overview of the data types available in both Apache Hive and Impala.
TINYINT – 1 byte
Range: -128 to 127
SMALLINT – 2 bytes
Range: -32,768 to 32,767
INT – 4 bytes
Range: -2,147,483,648 to 2,147,483,647
BIGINT – 8 bytes
Range: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
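The ranges above follow from the storage width: a signed two's-complement integer stored in n bytes spans -2^(8n-1) to 2^(8n-1) - 1. A minimal Python sketch reproduces them; the helper name `signed_range` is illustrative and is not part of Hive or Impala.

```python
# Storage widths in bytes, as listed above.
INTEGER_WIDTHS = {"TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8}


def signed_range(num_bytes):
    """Return (min, max) for a signed two's-complement integer of num_bytes bytes."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for name, width in INTEGER_WIDTHS.items():
    low, high = signed_range(width)
    print(f"{name}: {low:,} to {high:,}")
```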
FLOAT – 4 bytes
single precision floating point number
DOUBLE – 8 bytes
double precision floating point number
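To get a feel for the difference between the two widths, the sketch below round-trips a value through 4-byte and 8-byte encodings using Python's `struct` module. It assumes the usual IEEE 754 representation and only illustrates precision loss; it does not call Hive or Impala.

```python
import struct


def round_trip(value, fmt):
    """Pack value into the given binary format and unpack it again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


value = 0.1
as_float = round_trip(value, "<f")   # 4 bytes, single precision
as_double = round_trip(value, "<d")  # 8 bytes, double precision

print(struct.calcsize("<f"), struct.calcsize("<d"))  # sizes in bytes
print(as_float == value)   # single precision cannot hold 0.1 as exactly
print(as_double == value)  # double precision round-trips unchanged
```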
DECIMAL
Hive 0.13.0 introduced user-definable precision and scale, written as DECIMAL(precision, scale).
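Precision is the total number of digits and scale is the number of digits after the decimal point, so at most precision minus scale digits remain for the integer part. The sketch below checks whether a value can be stored exactly under that rule. The function `fits_decimal` is a hypothetical helper; how an engine treats a value that does not fit (rounding, NULL, or an error) is outside its scope.

```python
from decimal import Decimal


def fits_decimal(value, precision, scale):
    """Return True if value fits DECIMAL(precision, scale) without rounding."""
    number = Decimal(value)
    if not number.is_finite():
        return False
    if number == 0:
        return True
    # normalize() drops trailing zeros so "1.50" counts as one fractional digit.
    _, digits, exponent = number.normalize().as_tuple()
    fraction_digits = max(0, -exponent)
    integer_digits = max(0, len(digits) + exponent)
    return fraction_digits <= scale and integer_digits <= precision - scale


print(fits_decimal("12345.67", 7, 2))   # five integer digits, two fractional
print(fits_decimal("12345.678", 7, 2))  # three fractional digits
print(fits_decimal("123456.7", 7, 2))   # six integer digits
```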
STRING
In Impala, the hard limit on the size of a STRING, and on the total size of a row, is 2 GB.
When writing to Parquet files, the limit on a STRING is 1 GB.
TIMESTAMP
Timestamps were introduced in Hive 0.8.0. They support the traditional UNIX timestamp with optional nanosecond precision.
The supported timestamp format is yyyy-mm-dd hh:mm:ss[.f…].
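As an illustration of the format, the sketch below splits a timestamp string into its whole-second part and an optional fractional part of up to nine digits (nanoseconds). Python's `datetime` only stores microseconds, so the nanoseconds are returned separately. The name `parse_timestamp` is an assumption made for this example, not a Hive or Impala function.

```python
from datetime import datetime


def parse_timestamp(text):
    """Split 'yyyy-mm-dd hh:mm:ss[.f...]' into (datetime, nanoseconds)."""
    base, _, fraction = text.partition(".")
    moment = datetime.strptime(base, "%Y-%m-%d %H:%M:%S")
    if len(fraction) > 9 or (fraction and not fraction.isdigit()):
        raise ValueError(f"invalid fractional seconds: {fraction!r}")
    # Right-pad so that ".123" means 123,000,000 nanoseconds.
    nanos = int(fraction.ljust(9, "0")) if fraction else 0
    return moment, nanos


print(parse_timestamp("2014-03-01 12:30:45"))
print(parse_timestamp("2014-03-01 12:30:45.123"))
```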
Complex types:
Complex types (also referred to as nested types) in Hive let you represent multiple data values within a single row/column position. Impala supports the complex types ARRAY, MAP, and STRUCT in Impala 2.3 and higher.
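Arrays: ARRAY<data_type>
An ordered collection of elements of the same type
Maps: MAP<primitive_type, data_type>
A collection of key-value pairs
Structs: STRUCT<col_name : data_type, ...>
A collection of named fields that can have different types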
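As a rough analogy only, the three complex types map onto familiar Python structures: a list for an array, a dict for a map, and a dataclass for a struct. The names and values below are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Address:
    """Struct analogy: named fields that may have different types."""
    street: str
    zip_code: int


phone_numbers = ["555-0100", "555-0101"]    # array analogy: elements of one type
scores = {"math": 90, "physics": 85}        # map analogy: key-value pairs
home = Address(street="1 Main St", zip_code=12345)

# All three values could sit in a single row, one per column.
print(phone_numbers[0], scores["math"], home.zip_code)
```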