Virtual Transparency

This page discusses Virtual Transparency, a database option that adds the set of accessible virtual graphs to the query dataset.

Page Contents

Overview
Configuration
Effect on the Query Dataset
Relationship to query.all.graphs
Options and Limitations
Legacy Support for Virtual Graphs Not in the Dataset
Differences in versions prior to 7.7.0

Overview

Virtual graphs provide a facility for accessing external data sources by mapping them to individual named graphs. The example queries shown previously all specify the source of the data using the virtual graph name. This fine-grained declaration can be useful in some circumstances, but it’s also desirable to query over the set of all graphs without enumerating them individually. Virtual Transparency is a feature that, when enabled, will include results from virtual graphs in queries over the set of all graphs.

Configuration

To enable Virtual Transparency, set the virtual.transparency database option to true.

When the option is enabled, queries are evaluated over accessible virtual graphs in addition to local graphs. The set of accessible virtual graphs is determined by the virtual graph access rules and may differ by database and user.

Effect on the Query Dataset

The example queries shown previously use explicit graph blocks to name the source graph of the data, e.g.:

SELECT * {
   GRAPH <virtual://dept> {
      ?person a emp:Employee ;
         emp:name "SMITH"
   }
}

In contrast, a query with a graph block with a variable for the graph name would only return data from the local named graphs:

SELECT * {
   GRAPH ?g {
      ?person a emp:Employee ;
         emp:name "SMITH"
   }
}

That query uses the default dataset (no FROM or FROM NAMED clause follows the SELECT), whose named scope is the set of local named graphs. It is equivalent to:

SELECT *
FROM <tag:stardog:api:context:default>
FROM NAMED <tag:stardog:api:context:named> {
   GRAPH ?g {
      ?person a emp:Employee ;
         emp:name "SMITH"
   }
}

See the SPARQL spec for an explanation of datasets, as well as the Special Named Graph section of this doc for an explanation of the tag:stardog:api:context:* special named graphs, along with some examples.

If Virtual Transparency is enabled, the named scope of the dataset will include all accessible virtual graphs, referred to by the tag:stardog:api:context:virtual special named graph. The effective query becomes:

SELECT *
FROM <tag:stardog:api:context:default>
FROM NAMED <tag:stardog:api:context:named>
FROM NAMED <tag:stardog:api:context:virtual> {
   GRAPH ?g {
      ?person a emp:Employee ;
         emp:name "SMITH"
   }
}

Relationship to query.all.graphs

The query.all.graphs server configuration or database option adds the named graphs portion of the dataset to the default scope. When Virtual Transparency is also enabled, the default scope consists not only of the default graph (tag:stardog:api:context:default) and the local named graphs (tag:stardog:api:context:named), as it would with Virtual Transparency off, but also of the accessible virtual graphs (tag:stardog:api:context:virtual).

With both Virtual Transparency and query.all.graphs set to true, our sample query becomes:

SELECT *
FROM <tag:stardog:api:context:default>
FROM <tag:stardog:api:context:named>
FROM <tag:stardog:api:context:virtual>
FROM NAMED <tag:stardog:api:context:named>
FROM NAMED <tag:stardog:api:context:virtual> {
   GRAPH ?g {
      ?person a emp:Employee ;
         emp:name "SMITH"
   }
}

For completeness, with query.all.graphs on and Virtual Transparency off, the query becomes:

SELECT *
FROM <tag:stardog:api:context:default>
FROM <tag:stardog:api:context:named>
FROM NAMED <tag:stardog:api:context:named> {
   GRAPH ?g {
      ?person a emp:Employee ;
         emp:name "SMITH"
   }
}

Note that there are no virtual graphs in either the default or the named scope of the dataset.

With Virtual Transparency, the key difference between including and omitting a graph block comes from how triple patterns are joined together. Just as a graph block over a set of local named graphs limits BGP (basic graph pattern) matches to a single named graph, a graph block with Virtual Transparency limits BGP matches to a single local or virtual graph. To illustrate, consider the query with a graph block:

SELECT * {
   GRAPH ?g {
      ?person a emp:Employee ;
         emp:name "SMITH"
   }
}

If the set of employees is stored in a different virtual graph than the employee names, this query will return an empty result because the entire BGP will not match any set of triples in any individual graph. However, if we remove the graph block, each individual triple pattern will match triples from different graphs, and these results will be joined together. The result is similar to what we would obtain by specifying the sources manually:

SELECT * {
   GRAPH <virtual://employees> {
      ?person a emp:Employee
   }
   GRAPH <virtual://names> {
      ?person emp:name "SMITH"
   }
}
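
For comparison, the version with the graph block removed might look like the sketch below. It reuses the emp: terms from the earlier examples and, like them, omits the PREFIX declaration. It assumes that both virtual graphs are part of the default scope of the dataset, for example with query.all.graphs enabled alongside Virtual Transparency as described above.

```sparql
# Same pattern as the earlier example, without a GRAPH block.
# Assumption: the virtual graphs are in the default scope of the dataset
# (e.g. Virtual Transparency plus query.all.graphs).
SELECT * {
   ?person a emp:Employee ;
      emp:name "SMITH"
}
```

Here each triple pattern is matched independently, so ?person can be joined across graphs rather than being restricted to a single graph.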

Options and Limitations

Virtual Transparency is compatible with all SPARQL operators with the exception of “zero or more” and “one or more” property paths. These constructs are supported on some DBMS platforms when placed inside the graph block specifying the virtual graph source.
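
As a rough illustration of the second sentence, a “one or more” path placed inside an explicit graph block might look like the following. The property emp:manager is hypothetical and used only for illustration, and whether such a path is accepted depends on the DBMS platform behind the virtual graph.

```sparql
SELECT * {
   GRAPH <virtual://dept> {
      # emp:manager is a hypothetical property; "+" is the "one or more" path operator
      ?person emp:manager+ ?boss
   }
}
```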

A query hint is provided to disable Virtual Transparency for all or part of a query. Placing the hint #pragma virtual.transparency off in a SPARQL block will disable consideration of virtual graphs for that block.
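
A sketch of how the hint might be used to exclude virtual graphs from part of a query follows. It reuses the emp: terms from the earlier examples; splitting the pattern into a nested group and placing the hint on its first line are assumptions made for illustration.

```sparql
SELECT * {
   # Virtual graphs in the dataset are still considered for this pattern.
   ?person a emp:Employee .
   {
      #pragma virtual.transparency off
      # Virtual graphs are not considered inside this block.
      ?person emp:name "SMITH"
   }
}
```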

Queries with edge properties are not supported when using Virtual Transparency. Specifying the virtual graph in a graph block will bypass this limitation.

When using Virtual Transparency to query data that resides in both local databases and virtual graphs, setting the database configuration options local.iri.template.includes and local.iri.template.excludes can help improve query performance.

Legacy Support for Virtual Graphs Not in the Dataset

Ever since Stardog started supporting virtual graphs, it has supported GRAPH blocks with explicit virtual graph names, even if the virtual graph was not included in the query dataset. For example, this query works even if both Virtual Transparency and query.all.graphs are disabled:

SELECT * {
   GRAPH <virtual://dept> {
      ?person a emp:Employee ;
         emp:name "SMITH"
   }
}

This query uses the default dataset, which has tag:stardog:api:context:named for its named scope (which does not include virtual graphs). However, the virtual graph is still accessible. Stardog supports this for backward compatibility.

Differences in versions prior to 7.7.0

As explained above, virtual.transparency adds all accessible virtual graphs to the named scope of the default query dataset¹. It also adds those virtual graphs to the default scope of the default query dataset if the query.all.graphs option is enabled.

Before version 7.7, virtual graphs included in the named scope of the dataset (via FROM NAMED or the API) would be bound to the graph variable of a GRAPH ?g block only if virtual.transparency was enabled. As of version 7.7, virtual graphs can be added to the named scope of the dataset (again, via FROM NAMED or the API) with or without Virtual Transparency.
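
For instance, a query along these lines names a virtual graph in the named scope explicitly. The graph name virtual://dept is reused from the earlier examples. Based on the description above, in 7.7 and later ?g should be able to bind to it whether or not virtual.transparency is enabled.

```sparql
SELECT *
FROM NAMED <tag:stardog:api:context:named>
FROM NAMED <virtual://dept> {
   # ?g ranges over the local named graphs and the one listed virtual graph
   GRAPH ?g {
      ?person a emp:Employee ;
         emp:name "SMITH"
   }
}
```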

Similarly, prior to version 7.7, if virtual graphs were included in the default scope of the query dataset (via FROM or API), they would not be included in the default graph unless virtual.transparency was enabled. With 7.7, they will be included with or without Virtual Transparency.
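
The default-scope case can be sketched the same way. The query below lists the local default graph and a virtual graph (name reused from the earlier examples) with FROM. Based on the description above, in 7.7 and later both should contribute to the default graph regardless of the virtual.transparency setting.

```sparql
SELECT *
FROM <tag:stardog:api:context:default>
FROM <virtual://dept> {
   # No GRAPH block: patterns are matched against the default scope,
   # which here combines the local default graph and the virtual graph
   ?person a emp:Employee ;
      emp:name "SMITH"
}
```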

The one exception to the pre-7.7 behavior was the case of a single virtual graph, with no other graphs in the default scope of the dataset and no graphs in the named scope: pre-7.7 versions would execute the query over that one virtual graph. With 7.7, there is no special handling for the single-virtual-graph case.

In summary, the virtual.transparency option has an effect when the dataset is not specified (either in the query with FROM or FROM NAMED or through the API). When virtual.transparency is off, the dataset will default to tag:stardog:api:context:default for the default scope and tag:stardog:api:context:named for the named scope. When virtual.transparency is on, the dataset will default to tag:stardog:api:context:default for the default scope and the union of tag:stardog:api:context:named and tag:stardog:api:context:virtual for the named scope.

¹ The default dataset is the dataset used when no FROM or FROM NAMED clause is included in the query and the dataset is not specified via the query API.
