Avro is a serialization system developed by Apache. It is a row-based storage format.
Avro contains the data definition as well as the data in the same message or file. The data definition is stored in JSON format, making it easy to read and analyze; the data itself is stored in binary format, making it compact and efficient.
Avro supports rich data structures, a compact binary encoding, and a container file format for sequences of Avro data (often referred to as Avro data files). Avro is language-independent, with bindings available for Java, C, C++, Python, Ruby, and other languages.
The description above follows the Avro documentation. In practice, Avro generally performs better than Parquet on write operations, because it is row-based while Parquet is columnar. Avro is also a more compact data format than JSON, since the records themselves are binary-encoded. Like ORC and Parquet, however, Avro data is stored in a machine-readable binary format, so it cannot be inspected as plain text.
In this post, we will see how to create a Hive table with an Avro schema and load Avro data into it.
An Avro data file and schema based on the Oracle SQL*Plus EMP table have already been generated and are available for download.
[Figure: sample of the Avro data file]
Let’s begin the exercise:
-- Create the table with an Avro schema
CREATE TABLE empavro
ROW FORMAT
  SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES ('avro.schema.literal'='{
  "type" : "record",
  "namespace" : "BigDatanSQL",
  "name" : "Employees",
  "fields" : [
    { "name" : "empno" , "type" : "string" },
    { "name" : "ename" , "type" : "string" },
    { "name" : "job" , "type" : "string" },
    { "name" : "mgr" , "type" : "string" },
    { "name" : "hiredate" , "type" : "string" },
    { "name" : "sal" , "type" : "string" },
    { "name" : "comm" , "type" : "string" },
    { "name" : "deptno" , "type" : "string" }
  ]
}');
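On Hive 0.14 and later, the same kind of table can be declared more compactly with STORED AS AVRO, and the schema can be kept in a separate file referenced through the avro.schema.url property instead of being inlined. The following is a minimal sketch, assuming the schema above has been saved as emp.avsc; the table name empavro_url and the HDFS path are hypothetical placeholders and not part of the original exercise.

```sql
-- Sketch only: the table name and schema location are hypothetical.
CREATE TABLE empavro_url
STORED AS AVRO
TBLPROPERTIES ('avro.schema.url'='hdfs:///user/hive/schemas/emp.avsc');

-- Confirm that Hive derived the column list from the Avro schema.
DESCRIBE empavro_url;
```

Keeping the schema in a file makes it easier to share between tables and to update without touching the DDL. The rest of this post continues with the empavro table defined with the inline schema.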
Now, load the data into the table.
LOAD DATA LOCAL INPATH 'Desktop/Docs/empavro' INTO TABLE empavro;
Let’s query and see if the data is inserted correctly.
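A quick check might look like the sketch below. Because every field in the schema is declared as a string, numeric sorting or filtering needs an explicit cast; the LIMIT values and the choice of DOUBLE for the salary are illustrative assumptions.

```sql
-- Peek at a few rows to confirm the load worked.
SELECT * FROM empavro LIMIT 5;

-- Row count per department.
SELECT deptno, COUNT(*) AS employees
FROM empavro
GROUP BY deptno;

-- All columns are strings, so cast before sorting numerically.
SELECT ename, CAST(sal AS DOUBLE) AS salary
FROM empavro
ORDER BY salary DESC
LIMIT 3;
```

If the first query returns readable rows rather than an error or NULLs, the SerDe is decoding the Avro file against the declared schema as expected.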
Hope you find this article helpful.