Exploiting Hive Metastore of Databricks for lineage

Has anyone tried to use the Databricks Hive Metastore to derive lineage?

For example, I loaded the metadata of two Databricks databases using the Databricks driver provided on the Collibra Marketplace. Here is the scenario:

Database 1 > Table_A
Database 2 > View_A (based on Table_A)

As the relationship between the table and the view is implicit, I expected the driver to show lineage (links) between these two objects across databases within Collibra, but it did not.

So I plan to fetch the relationship information from the Hive Metastore and feed it into Collibra.
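
As a rough illustration of that plan, the sketch below derives (source table, target view) pairs from the text of a view definition. It is only a minimal sketch under stated assumptions: the view SQL is already available as a string (for instance from a `SHOW CREATE TABLE` statement, assuming that returns the view text in your environment), the names `database_1.table_a` and `database_2.view_a` are hypothetical stand-ins for the scenario above, and the regular expression is a naive heuristic that only handles plain `FROM` and `JOIN` references, not subqueries, CTEs, or expressions such as `EXTRACT(YEAR FROM col)`.

```python
import re
from typing import List, Tuple

# Naive pattern: an identifier that follows FROM or JOIN, optionally qualified (db.table).
# Assumption: simple view definitions; this is not a real SQL parser.
_TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+(`?\w+`?(?:\.`?\w+`?)*)", re.IGNORECASE)


def referenced_tables(view_sql: str) -> List[str]:
    """Return the distinct names found after FROM/JOIN, in order of appearance."""
    seen: List[str] = []
    for match in _TABLE_REF.finditer(view_sql):
        # Strip backticks and normalise case so duplicates collapse.
        name = match.group(1).replace("`", "").lower()
        if name not in seen:
            seen.append(name)
    return seen


def lineage_edges(view_name: str, view_sql: str) -> List[Tuple[str, str]]:
    """Build (source_table, target_view) pairs that could be loaded into a catalog."""
    return [(source, view_name.lower()) for source in referenced_tables(view_sql)]


# Hypothetical view definition matching the scenario described above.
example_sql = "CREATE VIEW database_2.view_a AS SELECT * FROM database_1.table_a"
edges = lineage_edges("database_2.view_a", example_sql)
print(edges)
```

The resulting pairs would still need to be mapped to whatever import format Collibra expects, which is not covered here.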

A couple of questions:

  1. Where can I see the data model of the Hive Metastore? Is there a documentation link from Databricks so that I can quickly understand the metastore schema?
  2. Is it advisable to query the metastore tables directly, or are there any side effects?
  3. How easy would it be to fetch the relationships between tables and views? Is there an out-of-the-box query?

I would appreciate your insights.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
