External Catalogs

This page discusses importing and using metadata from external catalog systems.

Page Contents
  1. Overview
  2. Databricks Unity Catalog
    1. Configuration
    2. Data Model

Overview

The Data Catalog can import metadata from external catalog systems to enable a single unified semantic layer over multiple catalogs.

Databricks Unity Catalog

The Data Catalog can be configured to import Unity Catalog metadata using a Databricks account. You can configure the import to run on a customizable schedule. The imported Unity Catalog metadata is written to the Stardog Catalog, where it can be queried in conjunction with your Stardog databases.

Configuration

To import Databricks Unity Catalog metadata, insert a DatabricksProvider configuration into the Data Catalog's stardog:catalog:providers named graph. The configuration describes how Databricks can be accessed and how often the Data Catalog should refresh the metadata.

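# Register a Databricks metadata provider in the Data Catalog's providers graph.
insert data {
    graph stardog:catalog:providers
    {
        <urn:myDBricksProvider> a <tag:stardog:api:catalog:DatabricksProvider> ;
            # Data Source used to connect to the Databricks account
            <tag:stardog:api:catalog:unity:dataSource> "DATA_SOURCE_HERE" ;
            # Quartz cron expression that controls how often metadata is imported
            <tag:stardog:api:catalog:unity:schedule> "SCHEDULE_HERE" .
    }
}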

The following table lists the properties that must be set to configure a Databricks metadata provider.

| Property | Description | Values |
|---|---|---|
| rdf:type | Databricks metadata provider class | tag:stardog:api:catalog:DatabricksProvider |
| tag:stardog:api:catalog:unity:dataSource | Data Source to use for connecting to a Databricks account | The IRI of an existing Data Source |
| tag:stardog:api:catalog:unity:schedule | Frequency of metadata imports | Quartz cron expression (e.g. 0 0 22 * * ? for every day at 10pm) |

After the configuration is inserted, a job is automatically created to run on the specified schedule. The job imports the Databricks Unity Catalog metadata and loads a general data model for viewing the metadata in Explorer.
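
Because the schedule is a Quartz cron expression, it has a leading seconds field that standard Unix cron lacks, which is easy to get wrong. The sketch below labels the fields of an expression so it can be checked before it is inserted. The function name `describe_quartz` is hypothetical, and the check covers only the number of fields (six, plus an optional year), not the validity of each value.

```python
# Quartz field order; the trailing "year" field is optional.
QUARTZ_FIELDS = [
    "seconds", "minutes", "hours",
    "day_of_month", "month", "day_of_week", "year",
]


def describe_quartz(expression: str) -> dict:
    """Map each field of a Quartz cron expression to its name (hypothetical helper)."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError(
            f"expected 6 or 7 fields, got {len(parts)}: {expression!r}"
        )
    # zip stops at the shorter sequence, so "year" is omitted when absent.
    return dict(zip(QUARTZ_FIELDS, parts))


# The example schedule from the table: every day at 10pm.
print(describe_quartz("0 0 22 * * ?"))
```

A five-field Unix-style expression such as `0 22 * * *` is rejected by this check, which is the most common mistake when adapting an existing cron schedule.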

Data Model

The following table lists the classes and properties used to model the Databricks metadata. The prefix bricks denotes the namespace tag:stardog:api:catalog:databricks:.

| Class | Property | Description |
|---|---|---|
| bricks:Databricks | | The metadata from an external Databricks platform |
| bricks:DatabricksCatalog | | A Databricks catalog |
| | bricks:owner | The owner account |
| | bricks:catalogType | The catalog type |
| bricks:DatabricksSchema | | A Databricks schema |
| | bricks:owner | The owner account |
| | bricks:fullName | The full name of the schema |
| bricks:DatabricksTable | | A Databricks table |
| | bricks:tableType | The table type |
| | bricks:fullName | The full name of the table |
| | bricks:dataSourceFormat | The data source format |
| | bricks:owner | The owner account |
| bricks:DatabricksColumn | | A Databricks column |
| | bricks:position | The column position |
| | bricks:precision | The column precision |
| | bricks:nullable | Whether the column is nullable |
| | bricks:dataType | The column data type |
| | bricks:scale | The column scale |
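
To illustrate how the data model might be queried once metadata has been imported, the sketch below generates a SPARQL select query for any class in the table. It rests on assumptions this page does not confirm: that each listed property is attached directly to instances of its class, and that the imported metadata is visible to the query as issued (no named graph is specified here). The names `DATA_MODEL` and `select_query` are hypothetical.

```python
BRICKS = "tag:stardog:api:catalog:databricks:"

# Classes and their properties, copied from the data model table.
DATA_MODEL = {
    "Databricks": [],
    "DatabricksCatalog": ["owner", "catalogType"],
    "DatabricksSchema": ["owner", "fullName"],
    "DatabricksTable": ["tableType", "fullName", "dataSourceFormat", "owner"],
    "DatabricksColumn": ["position", "precision", "nullable", "dataType", "scale"],
}


def select_query(class_name: str) -> str:
    """Build a query listing instances of a class with their properties (hypothetical helper)."""
    props = DATA_MODEL[class_name]  # raises KeyError for an unknown class
    variables = " ".join(f"?{p}" for p in props)
    lines = [
        f"prefix bricks: <{BRICKS}>",
        f"select ?item {variables}".rstrip() + " {",
        f"    ?item a bricks:{class_name} .",
    ]
    # Each property is optional so that items missing a value are still returned.
    lines += [f"    optional {{ ?item bricks:{p} ?{p} }}" for p in props]
    lines.append("}")
    return "\n".join(lines)


print(select_query("DatabricksTable"))
```

The generated text can be run with any SPARQL client. If the query returns nothing, the assumption about where the metadata is stored is the first thing to check.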