External Catalogs
This page discusses importing and using metadata from external catalog systems.
Overview
The Data Catalog can import metadata from external catalog systems to enable a single unified semantic layer over multiple catalogs.
Databricks Unity Catalog
The Data Catalog can be configured to import Unity Catalog metadata using a Databricks account. You can configure the import to run on a customizable schedule. Databricks Unity metadata is written to the Stardog Catalog, where it can be queried in conjunction with your Stardog databases.
Configuration
To import Databricks Unity Catalog metadata, insert a DatabricksProvider configuration into the Data Catalog's stardog:catalog:providers named graph. The configuration describes how Databricks can be accessed and how often the Data Catalog should refresh the metadata.
insert data {
  graph stardog:catalog:providers
  {
    <urn:myDBricksProvider> a <tag:stardog:api:catalog:DatabricksProvider> ;
      <tag:stardog:api:catalog:unity:dataSource> "DATA_SOURCE_HERE" ;
      <tag:stardog:api:catalog:unity:schedule> "SCHEDULE_HERE" .
  }
}
This table lists the properties that must be set to configure a Databricks metadata provider.
Property | Description | Values |
---|---|---|
rdf:type | Databricks metadata provider class | tag:stardog:api:catalog:DatabricksProvider |
tag:stardog:api:catalog:unity:dataSource | Datasource to use for connecting to a Databricks account | The IRI of an existing Data Source |
tag:stardog:api:catalog:unity:schedule | Frequency of metadata imports | Quartz cron expression (e.g. `0 0 22 * * ?` runs every day at 10 PM) |
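As an illustration, a filled-in configuration might look like the sketch below. The provider IRI is carried over from the template above, and the schedule is the example cron expression from the table. The data source value `myDatabricksSource` is a hypothetical identifier, written as a string literal as in the template, and must be replaced with one that refers to a Data Source that exists in your environment.

```sparql
# Sketch only: replace the hypothetical values with ones from your environment.
insert data {
  graph stardog:catalog:providers {
    <urn:myDBricksProvider> a <tag:stardog:api:catalog:DatabricksProvider> ;
      # Hypothetical identifier; it must refer to an existing Data Source.
      <tag:stardog:api:catalog:unity:dataSource> "myDatabricksSource" ;
      # Quartz cron fields: second minute hour day-of-month month day-of-week.
      # "0 0 22 * * ?" fires at 22:00:00 every day.
      <tag:stardog:api:catalog:unity:schedule> "0 0 22 * * ?" .
  }
}
```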
After the configuration is inserted, a job is automatically created to run on the specified schedule. The job imports Databricks Unity metadata and loads a general data model for viewing the metadata in Explorer.
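To check what has been registered, the providers graph can be read back with an ordinary query. The sketch below assumes it is run against the same database that received the configuration, and that the `stardog:` prefix resolves the same way it does in the insert statement.

```sparql
# List every Databricks provider stored in the providers named graph,
# together with its data source and import schedule.
select ?provider ?dataSource ?schedule {
  graph stardog:catalog:providers {
    ?provider a <tag:stardog:api:catalog:DatabricksProvider> ;
      <tag:stardog:api:catalog:unity:dataSource> ?dataSource ;
      <tag:stardog:api:catalog:unity:schedule> ?schedule .
  }
}
```

An empty result would suggest that the configuration was written to a different database or graph.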
Data Model
This table lists the classes and properties used to model the Databricks metadata. The prefix bricks denotes the namespace tag:stardog:api:catalog:databricks:.
Class | Property | Description |
---|---|---|
bricks:Databricks | | The metadata from an external Databricks platform |
bricks:DatabricksCatalog | | A Databricks catalog |
| bricks:owner | The owner account |
| bricks:catalogType | The catalog type |
bricks:DatabricksSchema | | A Databricks schema |
| bricks:owner | The owner account |
| bricks:fullName | The full name of a schema |
bricks:DatabricksTable | | A Databricks table |
| bricks:tableType | The table type |
| bricks:fullName | The full name of the table |
| bricks:dataSourceFormat | The data source format |
| bricks:owner | The owner account |
bricks:DatabricksColumn | | A Databricks column |
| bricks:position | The column position |
| bricks:precision | The column precision |
| bricks:nullable | If the column is nullable |
| bricks:dataType | The column data type |
| bricks:scale | The column scale |
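Once an import has run, the classes and properties above can be queried like any other data. The following is a minimal sketch that lists imported tables. It assumes the imported metadata is visible in the graph the query runs against, and that each property is attached directly to instances of its class, as the table suggests. The source does not say which properties are always present, so each one is wrapped in `optional`.

```sparql
prefix bricks: <tag:stardog:api:catalog:databricks:>

# List imported Databricks tables with their descriptive properties.
select ?table ?fullName ?tableType ?format ?owner {
  ?table a bricks:DatabricksTable .
  # Each property is optional so that tables missing a value still appear.
  optional { ?table bricks:fullName ?fullName }
  optional { ?table bricks:tableType ?tableType }
  optional { ?table bricks:dataSourceFormat ?format }
  optional { ?table bricks:owner ?owner }
}
order by ?fullName
```

The same pattern should apply to the other classes, for example by replacing `bricks:DatabricksTable` with `bricks:DatabricksColumn` and selecting `bricks:dataType` and `bricks:position`.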