Pass-through authentication uses the credentials of the logged-in user when creating a data source connection for the purpose of querying a virtual graph. Pass-through authentication is currently supported using OAuth between Azure AD (Microsoft Identity) and Databricks.
This page covers the configuration options and examples for pass-through authentication.
The configuration of a data source requires the credentials for a Stardog system account that has access to the backing database. By default, these credentials are used for all interactions with the database: creating virtual graphs, loading table metadata, and, in particular, submitting virtual graph queries. The virtual graph queries for all Stardog users are executed using these common, shared credentials. Access to sensitive databases (entire virtual graphs) can be secured through the use of Named Graph Security, and access to specific mapped relationships can be secured by configuring fine-grained security (sensitive properties).
Pass-through authentication can provide an elegant alternative to either named-graph or fine-grained security when access to an external database has already been configured to mask sensitive columns based on user roles, as it relies on that existing security. This eliminates the need to duplicate the role-based access rules in Stardog.
The requirements for pass-through authentication are:
- Stardog configured for OAuth authentication with Azure AD as the IdP
- Databricks configured for Azure AD authentication
- Role-based data masking implemented at the storage level
Note that data masking is not strictly required but is highly recommended. With masking, all users see the same database schema and see different content based on their authorization level, which ensures that SQL queries parse and execute without error.
The following is a sample data source options file for a Databricks database:
# This is Databricks' client id, plus .default scope
Note that the top section of the properties file is unchanged, with the exception that the AuthMech query parameter was moved from the URL to an external ext. property.
The remaining built-in properties are:
- The authentication type. Set to OAUTH to enable pass-through authentication.
- The scope of the OAuth resource required. Use 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default for Databricks.
The Databricks JDBC driver requires connection parameters such as Auth_AccessToken when connecting using OAuth. To configure these parameters, include them in the properties file with a passthrough. prefix, as indicated in the example.
The ACCESS_TOKEN variable can be included in curly braces as a substitute for the Databricks access token.
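The prefix and variable substitution described above can be illustrated with a short Python sketch. This is a simplified illustration, not Stardog's implementation: the function name `resolve_passthrough_params`, the sample property values, and the assumption that every `passthrough.`-prefixed property maps directly to a driver connection parameter are all hypothetical.

```python
def resolve_passthrough_params(props: dict[str, str], access_token: str) -> dict[str, str]:
    """Collect 'passthrough.'-prefixed properties and fill in the access token.

    Hypothetical helper for illustration only; Stardog's actual handling
    of these properties may differ.
    """
    prefix = "passthrough."
    resolved = {}
    for key, value in props.items():
        if key.startswith(prefix):
            # Strip the prefix and substitute the {ACCESS_TOKEN} variable.
            resolved[key[len(prefix):]] = value.replace("{ACCESS_TOKEN}", access_token)
    return resolved


# Placeholder properties; only Auth_AccessToken and {ACCESS_TOKEN} come from the text above.
example_props = {
    "jdbc.driver": "example.Driver",  # hypothetical, ignored by the helper
    "passthrough.Auth_AccessToken": "{ACCESS_TOKEN}",
}
params = resolve_passthrough_params(example_props, "token-123")
print(params)
```

In this sketch, properties without the prefix are left alone, so only the pass-through parameters change per user.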
Using pass-through authentication requires configuring OAuth with client credentials for the Stardog server. See Client Credentials For Stardog Server for configuration instructions.
Pass-through authentication is based on OAuth, which is, at its core, a standard for users to grant client applications access to resources that are secured by other applications. In a Databricks use case, the user must consent to Stardog accessing the user’s resources on Databricks. This consent can be configured by the Azure AD administrator.
If this consent is not granted, Azure AD will respond with an error message similar to:
AADSTS65001: The user or administrator has not consented to use the application with ID 5439b38f-21f3-459c-b417-c7f6ec3ef55a named Stardog. Send an interactive authorization request for this user and resource.
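A client that wants to detect this condition programmatically could inspect the error text. The following is a minimal sketch, assuming the message has the shape shown above; the function name `parse_aad_error` and the returned dictionary layout are hypothetical and not part of any Stardog or Azure API.

```python
import re

CONSENT_REQUIRED_CODE = "AADSTS65001"


def parse_aad_error(message: str) -> dict:
    """Extract the AADSTS code and application ID from an Azure AD error message."""
    code_match = re.match(r"\s*(AADSTS\d+):", message)
    id_match = re.search(r"application with ID ([0-9a-fA-F-]{36})", message)
    code = code_match.group(1) if code_match else None
    return {
        "code": code,
        "app_id": id_match.group(1) if id_match else None,
        # True only when the code matches the missing-consent error above.
        "consent_required": code == CONSENT_REQUIRED_CODE,
    }


sample = (
    "AADSTS65001: The user or administrator has not consented to use the "
    "application with ID 5439b38f-21f3-459c-b417-c7f6ec3ef55a named Stardog. "
    "Send an interactive authorization request for this user and resource."
)
info = parse_aad_error(sample)
print(info)
```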
This permission can be granted interactively by the user by pointing their browser to:
If all query parameters are correct, the user should be asked to choose the Microsoft account to use and then to consent to Stardog having access to Databricks. After granting consent, the user is redirected to the redirect URL. That page might not be found, which is fine. From then on, the user should be able to use pass-through authentication with Databricks.
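As an illustration of the interactive consent request described above, the following Python sketch assembles an authorization URL. It assumes the Microsoft identity platform v2.0 authorize endpoint and its standard query parameters. The tenant ID and redirect URI are placeholders, the client ID is the example application ID from the error message above, and `build_consent_url` is a hypothetical helper name. Substitute the values from your own Azure AD application registration.

```python
from urllib.parse import urlencode


def build_consent_url(tenant_id: str, client_id: str, redirect_uri: str, scope: str) -> str:
    """Build an interactive authorization URL (assumed v2.0 authorize endpoint)."""
    base = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/authorize"
    query = urlencode({
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": redirect_uri,
        "scope": scope,
    })
    return f"{base}?{query}"


url = build_consent_url(
    tenant_id="00000000-0000-0000-0000-000000000000",    # placeholder tenant
    client_id="5439b38f-21f3-459c-b417-c7f6ec3ef55a",     # example Stardog app ID from the error above
    redirect_uri="https://stardog.example.com/callback",  # placeholder redirect URI
    scope="2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default",  # Databricks scope
)
print(url)
```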