Metadata is a type of data that describes and provides information about other types of data, such as database objects. And Hive’s metastore maintains metadata about each table, such as its structure and location. It also contains partition metadata, which assists the driver in tracking the progress of various data sets distributed across the cluster. The data is saved in the standard RDBMS format. The metadata assists the driver in keeping track of the data, which is critical.
We frequently use the following queries to acquire metadata about database objects.
mysql> SHOW DATABASES;
+———————————+
| Database |
+———————————+
| information_schema |
| cm |
| firehose |
| hue |
| metastore |
| mysql |
| nav |
| navms |
| oozie |
| retail_db |
| rman |
| sentry |
+———————————+
12 rows in set (0.00 sec)
The database “metastore” contains all the information about objects that are created using Hive. Let’s check it out.
USE metastore;
mysql> select db_id, name,db_location_uri from DBS;
+———+——————–+——————————————————————————————————+
| db_id | name | db_location_uri |
+———+——————–+——————————————————————————————————+
| 1 | default | hdfs://quickstart.cloudera:8020/user/hive/warehouse |
| 2 | dbtest | hdfs://quickstart.cloudera:8020/user/hive/warehouse/dbtest.db |
| 16 | retail_db | hdfs://quickstart.cloudera:8020/user/hive/warehouse/retail_db.db |
| 21 | dbsales | hdfs://quickstart.cloudera:8020/user/hive/warehouse/dbsales.db |
| 22 | dbsales2019 | hdfs://quickstart.cloudera:8020/user/hive/warehouse/dbsales2019.db |
| 23 | dbsales2020 | hdfs://quickstart.cloudera:8020/user/hive/warehouse/dbsales2020.db |
| 24 | dbsales2021 | hdfs://quickstart.cloudera:8020/user/hive/warehouse/dbsales2021.db |
+——-+————-+—————————————————————————————————————-+
7 rows in set (0.00 sec)
To obtain all database information, utilize the “DBS” table, which allows you to identify the database name, location, description, and owner information. If you need table information, use “TBLS”, which will give you the table name, created date, database it was created in, last accessed time, user who created it, if the table is internal or external, and so on. TABLE_PARAMS will let you know all other parameters including the comments. If you want to get a list of all the tables, as well as the database name in which the table was created, you may do so by joining them with DB_ID, as seen below.
SELECT DBS.Name, TBLS.Tbl_Name, TBLS.Tbl_Id, TBLS.Tbl_Type
FROM DBS
JOIN TBLS ON DBS.DB_ID = TBLS.DB_ID
JOIN SDS ON TBLS.SD_ID = SDS.SD_ID;
If you want to use Hive CLI then you’ll still use its commands which are listed below to fetch information about databases and tables.
SHOW DATABASES;
SHOW TABLES;
DESC <TableName>;
DESC EXTENDED <TableName>;
DESC FORMATTED <TableName>;
Hope you find this article helpful.
Please do subscribe for more interesting updates.
Wonderful goods from you, mɑn. I’ve սnderstand your stᥙff previous to and you’re just
extremely excellent. I really like what уօu’ve аcquired here, certainly like ѡһаt
ʏou are stating and the way in which үou say it. You make it
enjoyablе and you stilⅼ cаre for to keep it wise. I can not wait to read much more from you.
This is actuaⅼly a great site.
LikeLike
I spend a lot of time writing blog posts and frequently forget to express gratitude to my readers and followers. Your feedback is really valuable to me. Thanks a lot.
LikeLike