Managing Databases
This page discusses managing Stardog databases.
Creating a Database
Stardog databases may be created locally or remotely. All data files, indexes, and server metadata for the new database will be stored in $STARDOG_HOME.
- Stardog will not let you create a database with the same name as an existing database. Stardog database names must start with an alpha character followed by zero or more alphanumeric, hyphen, or underscore characters, as given by the regular expression [A-Za-z]{1}[A-Za-z0-9_-]*. There are four reserved words that may not be used as names of Stardog databases: system, admin, docs, and watchdog.
- At a minimum, the only thing you need to create a Stardog database is a database name; alternatively, you may customize other database parameters and options depending on anticipated workloads, data modeling, and other factors.
- Bulk loading data can be done at database creation time.
- Bulk loading performance is better if data files don’t have to be transferred over a network during creation and initial loading.
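The naming rule above can be checked on the client before a create request is sent. The following is a minimal sketch in plain Java and is not part of the Stardog API: the class and method names are hypothetical, and it assumes the reserved words are compared case-sensitively, which the rule above does not specify.

```java
import java.util.Set;
import java.util.regex.Pattern;

final class DatabaseNameCheck {
    // Regular expression quoted from the naming rule above.
    private static final Pattern NAME = Pattern.compile("[A-Za-z]{1}[A-Za-z0-9_-]*");

    // The four reserved words listed above (assumption: exact, case-sensitive match).
    private static final Set<String> RESERVED = Set.of("system", "admin", "docs", "watchdog");

    // Returns true if the whole name matches the pattern and is not a reserved word.
    static boolean isValidName(String name) {
        return name != null && NAME.matcher(name).matches() && !RESERVED.contains(name);
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"myDatabase", "my-db_2", "2fast", "system"}) {
            System.out.println(candidate + " -> " + isValidName(candidate));
        }
    }
}
```

The server remains the authority on what is accepted; a check like this only avoids an obviously doomed request.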
To create an empty database:
At a minimum, a database name needs to be passed to the db create CLI command using the -n/--name option:
stardog-admin db create -n myDatabase
In Stardog Studio:
- Navigate to the “Databases” tab
- Click on the “Create Database” button
- Enter a name for the new database
- Click the “Create” button
See the HTTP API documentation for more information.
curl -u username:password -X POST \
  -F root='{"dbname":"myDatabase"}' \
  http://localhost:5820/admin/databases

import com.complexible.stardog.api.admin.AdminConnection;

import static com.complexible.stardog.api.admin.AdminConnectionConfiguration.toServer;

public class CreateDatabaseBasic {
    public static void main(String[] args) {
        String serverUrl = "http://localhost:5820";
        String username = "username";
        String password = "password";

        // The admin connection is closed automatically at the end of the try block
        try (AdminConnection adminConnection = toServer(serverUrl).credentials(username, password).connect()) {
            adminConnection.newDatabase("myDatabase").create();
        }
    }
}
See com.complexible.stardog.api.admin.AdminConnection#newDatabase for more information.
Configuring a Database at Creation Time
It’s possible to provide configuration options to a database at creation time. While you can provide any configuration option at creation time regardless of its mutability, you must declare immutable database configuration options at creation time (e.g. edge.properties).
Database configuration options can be passed to the db create CLI command using the -o/--options option. Each option is a key=value pair; multiple options are separated by whitespace, e.g., -o option1=value1 option2=value2. When used as the last option, values should be followed by --.
# enables search and edge properties for the database
stardog-admin db create -n myDatabase -o search.enabled=true edge.properties=true --
In Stardog Studio:
- Navigate to the “Databases” tab
- Click on the “Create Database” button
- Enter a name for the new database
- Select the “Manually Set Immutable Properties” radio button and select which properties to set. The only way to set mutable database options at database creation time in Studio is by providing a Java properties file containing configuration options.
- Click the “Create” button
See the HTTP API documentation for more information.
curl -u username:password -X POST \
  -F root='{"dbname":"myDatabase","options":{"search.enabled":true,"edge.properties":true}}' \
  http://localhost:5820/admin/databases

import com.complexible.stardog.api.admin.AdminConnection;
import com.complexible.stardog.db.DatabaseOptions;
import com.complexible.stardog.metadata.Metadata;
import com.complexible.stardog.search.SearchOptions;

import static com.complexible.stardog.api.admin.AdminConnectionConfiguration.toServer;

public class CreateDatabaseWithConfigOptions {
    public static void main(String[] args) {
        String serverUrl = "http://localhost:5820";
        String username = "username";
        String password = "password";

        // Enable search and edge properties for the new database
        Metadata metadata = Metadata.create()
                                    .set(SearchOptions.SEARCHABLE, true)
                                    .set(DatabaseOptions.EDGE_PROPERTIES, true);

        try (AdminConnection adminConnection = toServer(serverUrl).credentials(username, password).connect()) {
            adminConnection.newDatabase("myDatabase").setAll(metadata).create();
        }
    }
}
Database Creation Templates
The Stardog CLI and Studio allow you to pass a Java properties file containing database configuration options at database creation time. If the configuration option database.name is provided in the properties file, it will override the name passed in at creation time.
A properties file can be provided to the db create command using the --config/-c option.
$ cat database.properties
database.name=myDatabase
search.enabled=true
edge.properties=true
# -n/--name option can be omitted because 'database.name' is contained in database.properties
$ stardog-admin db create -c database.properties
Successfully created database 'myDatabase'.
In Stardog Studio:
- Navigate to the “Databases” section
- Click on the “Create Database” button
- Enter a name for the database if database.name is not defined in the properties file being used to configure the database
- Select the “Use File” radio button
- Select the properties file on your filesystem
- Click the “Create” button
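A properties file like the one described in this section can also be generated programmatically. The sketch below uses only java.util.Properties from the Java standard library; the class and method names are hypothetical, and the three option keys are simply the ones used in this section's example.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

final class DatabaseTemplate {
    // Builds the same options as the database.properties example in this section.
    static Properties template(String databaseName) {
        Properties options = new Properties();
        options.setProperty("database.name", databaseName);
        options.setProperty("search.enabled", "true");
        options.setProperty("edge.properties", "true");
        return options;
    }

    // Renders the options in Java properties syntax (key=value, one per line).
    static String render(Properties options) {
        StringWriter out = new StringWriter();
        try {
            options.store(out, null);
        } catch (IOException e) {
            // A StringWriter does not fail on write; rethrow unchecked just in case.
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(template("myDatabase")));
    }
}
```

Note that Properties.store also writes a timestamp comment line and does not preserve insertion order; neither affects how a properties file is read. Saving the rendered text to a file produces something that could be passed to the -c option.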
Bulk Loading Data at Creation Time
Stardog tries hard to do bulk loading at database creation time in the most efficient and scalable way possible.
To load data at creation time:
Files to be added to the database may be passed as final arguments to the db create command.
- If a directory is passed as one of the final arguments, all the files in that directory and its child directories will be recursively loaded to the database.
- Zip files will be uncompressed and the RDF files they contain will be loaded.
- Files with unrecognized extensions, or that produce parse errors, will be (silently) ignored.
- Named graphs can be specified with an @ sign preceding the graph IRI. All files after that graph will be loaded into that graph until another @graph is encountered. A single @ can be used to switch back to the default graph.
By default, files are not copied to the remote server; only the paths are sent. If the files do not exist on the remote server, the --copy-server-side flag should be specified in order to copy them before creating the database and bulk loading the data.
# load input0.ttl to the default graph, input1.ttl and input2.ttl to urn:stardog:graph:1, then switch back to the default graph and load input3.ttl to it
stardog-admin db create -n myDatabase input0.ttl @urn:stardog:graph:1 input1.ttl input2.ttl @ input3.ttl
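To make the @ graph markers concrete, the sketch below mimics how the trailing file arguments are grouped by target graph. It only illustrates the rule described above and is not Stardog's actual argument parser; the class name, method name, and the default-graph label are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GraphArguments {
    // Hypothetical label used here for the default graph.
    static final String DEFAULT_GRAPH = "(default graph)";

    // Groups file arguments by the graph selected by the most recent @ marker.
    static Map<String, List<String>> assign(String... args) {
        Map<String, List<String>> filesByGraph = new LinkedHashMap<>();
        String currentGraph = DEFAULT_GRAPH;
        for (String arg : args) {
            if (arg.startsWith("@")) {
                // "@<iri>" switches to a named graph; a single "@" switches back to the default graph.
                currentGraph = arg.length() == 1 ? DEFAULT_GRAPH : arg.substring(1);
            } else {
                filesByGraph.computeIfAbsent(currentGraph, g -> new ArrayList<>()).add(arg);
            }
        }
        return filesByGraph;
    }

    public static void main(String[] args) {
        // Same arguments as the db create example above.
        System.out.println(assign("input0.ttl", "@urn:stardog:graph:1", "input1.ttl", "input2.ttl", "@", "input3.ttl"));
        // prints {(default graph)=[input0.ttl, input3.ttl], urn:stardog:graph:1=[input1.ttl, input2.ttl]}
    }
}
```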
Create a geospatial enabled database, and bulk load labels_en.nq.bz2 to the named graph some:graph and geo_coordinates_en.nq.bz2 to the default graph. Both of these files exist on the client machine and will be shipped to the server.
curl -u username:password -X POST \
  -F root='{
    "dbname":"spatialDB",
    "options":{"spatial.enabled":true},
    "files":[{"filename":"labels_en.nq.bz2", "context":"some:graph"},{"filename":"geo_coordinates_en.nq.bz2"}]
  }' \
  -F "geo_coordinates_en.nq.bz2"=@/path/to/geo_coordinates_en.nq.bz2 \
  -F "labels_en.nq.bz2"=@/path/to/labels_en.nq.bz2 \
  http://remote-server:5820/admin/databases
Create a search enabled database, and bulk load /path/to/data1.ttl to the named graph some:graph and /path/to/data2.ttl to the default graph. Both of these files exist on the same machine Stardog is running on.
curl -u admin:admin -X POST \
  -F root='{
    "dbname":"myDatabase",
    "options":{"search.enabled":true},
    "files":[{"filename":"/path/to/data1.ttl", "context":"some:graph"},{"filename":"/path/to/data2.ttl"}]
  }' \
  http://localhost:5820/admin/databases
See the HTTP API for more information.
import com.complexible.stardog.api.admin.AdminConnection;
import com.google.common.collect.ImmutableMap;
import com.stardog.stark.Resource;
import com.stardog.stark.Values;

import java.nio.file.Path;
import java.nio.file.Paths;

import static com.complexible.stardog.api.admin.AdminConnectionConfiguration.toServer;

public class CreateDatabaseBulkLoad {
    public static void main(String[] args) {
        String serverUrl = "http://localhost:5820";
        String username = "username";
        String password = "password";

        try (AdminConnection adminConnection = toServer(serverUrl).credentials(username, password).connect()) {
            Resource g1 = Values.iri("urn:g1");
            Resource g2 = Values.iri("urn:g2");

            Path f1 = Paths.get("/path/to/data1.ttl");
            Path f2 = Paths.get("/path/to/data2.ttl");
            Path f3 = Paths.get("/path/to/data3.ttl");

            ImmutableMap<Path, Resource> contexts = ImmutableMap.of(f1, g1, f2, g2);

            // f1 is loaded into g1
            // f2 is loaded into g2
            // f3 is loaded into the default graph
            adminConnection.newDatabase("myDatabase").create(contexts::get, f1, f2, f3);
        }
    }
}
See com.complexible.stardog.api.admin.DatabaseBuilder#create for more information.
If the files to be bulk loaded do not exist on the same machine as Stardog, use com.complexible.stardog.api.admin.DatabaseBuilder#copyServerSide to specify that the files should be first copied to the server.
It’s not currently possible to bulk load data at creation time via Stardog Studio.
Tuning Bulk Loading Performance
Data loading time can vary widely, depending on factors in the data to be loaded, including the number of unique resources, etc. Below are some tips to help you achieve the best bulk loading times:
- Copy or move the files to be loaded onto the same machine as Stardog. Copying the files from a client over a network will introduce overhead.
- In your stardog.properties file, set the memory.mode configuration option to a value of bulk_load:
  memory.mode=bulk_load
  Be sure to disable this option after bulk loading is complete. See Memory Configuration for more information.
- Load compressed data (GZIP, BZ2, ZIP) since compression minimizes disk access.
- Use a multicore machine since bulk loading is highly parallelized and database indexes are built concurrently.
- Load many files together at creation time since different files will be parsed and processed concurrently, improving the load speed. The file split CLI utility can be used to split an RDF file into smaller files.
- With caution, turn off the database configuration option strict.parsing.
Archetypes
A database archetype is a simple templating mechanism for bundling a set of namespaces, schemas and constraints to populate a newly created database. Archetypes are an easy way to register the namespaces, reasoning schemas and constraints for standardized vocabularies and ontologies with a database. Archetypes are composable so multiple archetypes can be specified at database creation time to load all the defined namespaces, schemas and constraints into the database. Archetypes are intended to be used alongside your domain data, which may include as many other schemas and constraints as are required.
As of Stardog 7.2.0, the preferred way of using archetypes is via the Stardog Archetype Repository which comes with archetypes for FOAF, SKOS, PROV, and CIM. Follow the instructions on the GitHub repository for setting up and using archetypes.
Once the archetypes have been set up, you can use the following command to create a new database that will load the namespaces, schemas, and constraints associated with an archetype:
stardog-admin db create -o database.archetypes="cim" -n myDatabase
Inline Archetypes
Archetypes can be used as a predefined way of loading a schema and a set of constraints to the database just like any RDF data can be loaded to a database. These kinds of archetypes are called “inline” as their contents will appear in the database under predefined named graphs as explained next. These named graphs that are automatically created by archetypes can be queried and modified by the user as any other named graph.
Each archetype has a unique IRI identifying it, and the schema contents of inline archetypes will be loaded into a named graph with that IRI. To see an example, follow the setup instructions to download the archetypes to ${STARDOG_HOME}/.archetypes and create a new database with the FOAF archetype:
stardog-admin db create -o database.archetypes="foaf" -n myDatabase
If you query the database you will see a named graph automatically created:
$ stardog query myDatabase "select distinct ?g { graph ?g { } }"
+----------------------------+
| g |
+----------------------------+
| http://xmlns.com/foaf/0.1/ |
+----------------------------+
Protected Archetypes
Archetypes can also be defined in a “protected” mode where the schema and the constraints will be available for reasoning and validation services but they will not be stored in the database. In this mode, archetypes prevent unintended modifications to the schema and the constraints without losing their reasoning and validation functionality. An ontology like PROV is standardized by W3C and is not meant to change over time so the protected mode can be used with it.
User-defined archetypes are inline by default, but the archetype definition can be configured to make the schema and/or the constraints protected, as explained in the GitHub repository.
The following example shows how using a protected archetype would look:
$ stardog-admin db create -o database.archetypes="prov" -n provDB
Successfully created database 'provDB'.
$ stardog query execute provDB "select distinct ?g { graph ?g { } }"
+-------+
| g |
+-------+
+-------+
$ stardog reasoning schema provDB
prov:wasDerivedFrom a owl:ObjectProperty
prov:wasGeneratedBy owl:propertyChainAxiom (prov:qualifiedGeneration prov:activity)
prov:SoftwareAgent a owl:Class
prov:wasInfluencedBy rdfs:domain (prov:Activity or prov:Agent or prov:Entity)
...
$ stardog query execute --reasoning provDB "select * { ?cls rdfs:subClassOf prov:Agent }"
+--------------------+
| cls |
+--------------------+
| prov:Agent |
| prov:SoftwareAgent |
| owl:Nothing |
| prov:Person |
| prov:Organization |
+--------------------+
$ stardog icv export provDB
AxiomConstraint{prov:EmptyCollection rdfs:subClassOf (prov:hadMember max 0 owl:Thing)}
AxiomConstraint{prov:Entity owl:disjointWith prov:Derivation}
SPARQLConstraint{
...
This example demonstrates that the database looks empty to regular SPARQL queries but reasoning queries see the PROV ontology. Similarly, PROV constraints are visible for validation purposes but they cannot be removed by the icv drop command.
Built-in Archetypes
Before Stardog 7.2.0, the only way to define archetypes was by creating and registering a new Java class that contained the archetype definition. This method is deprecated as of Stardog 7.2.0, but it will continue to work until Stardog 8, at which point support for Java-based archetypes will be removed. Until that time, the Java-based PROV and SKOS archetypes that were bundled in the Stardog distribution as built-in archetypes will be available and can be used without setting up the archetype location as described above.
Listing Databases
To list all of the databases in the Stardog server:
stardog-admin db list
Output:
+-------------+
| Databases   |
+-------------+
| db1         |
| db2         |
+-------------+
See the db list command for more information.
In Stardog Studio:
- Navigate to the “Databases” section. The list of all databases the user has access to will appear in the left pane.
curl -u username:password http://localhost:5820/admin/databases
See the HTTP API for more information.
import com.complexible.stardog.api.admin.AdminConnection;

import static com.complexible.stardog.api.admin.AdminConnectionConfiguration.toServer;

public class ListDatabases {
    public static void main(String[] args) {
        String serverUrl = "http://localhost:5820";
        String username = "admin";
        String password = "admin";

        try (AdminConnection adminConnection = toServer(serverUrl).credentials(username, password).connect()) {
            // Prints the names of all databases on the server
            System.out.println(adminConnection.list());
        }
    }
}
See com.complexible.stardog.api.admin.AdminConnection#list for more information.
Database Status
One can obtain a status report for any database in the server. The status report contains the following information:
- Database: name of the database
- Status: whether the database is online or offline
- Approx. Size: the approximate number of triples in the database
- Queries: number of queries currently running
- Open Connections: number of open connections to the database
- Open Transactions: number of open transactions to the database
- Query Avg. Time: average query execution time
- Plans Cached: number of query plans cached for the database
- Plan Cache Hit Ratio: ratio to monitor hits/misses on the plan cache for the database
To obtain a status report for a database:
stardog-admin db status db1
Output:
Database             : db1
Status               : Online
Approx. size         : 0 triples
Queries              : None running
Open Connections     : 0
Open Transactions    : 0
Query Avg. Time      : 0.00 s
Query Rate           : 0.00 queries/sec
Plans Cached         : 3
Plan Cache Hit Ratio : 57.14%
See the db status command for more information.
In Stardog Studio:
- Navigate to the “Databases” section
- Not all information obtainable via the db status CLI command is available in Studio. The listing of databases in the left pane shows the approximate number of triples in each database. If you click on a specific database in the listing and open the “Admin” tab, the number of running queries and the database’s status (offline/online) will be displayed.
Offline/Online a Database
Databases are either online or offline; this allows database maintenance to be decoupled from server maintenance.
- Databases are put online or offline synchronously: these operations block until other database activity is completed or terminated.
- If the Stardog server is shut down while a database is offline, the database will be offline when the server restarts.
- Some database configuration options (e.g. search.enabled) require the database to be offline when the configuration option is set. See Getting and Setting Database Options for more information.
To offline a database:
stardog-admin db offline myDatabase
See the db offline command for more information.
In Stardog Studio:
- Navigate to the “Databases” section.
- Select the database you wish to offline.
- Toggle the switch from the online position to the offline position. The green dot that was previously displayed to the right of the database name should now be orange.
curl -u username:password -X PUT \
  http://localhost:5820/admin/databases/myDatabase/offline
See the HTTP API for more information.

import com.complexible.stardog.api.admin.AdminConnection;

import static com.complexible.stardog.api.admin.AdminConnectionConfiguration.toServer;

public class OfflineDatabase {
    public static void main(String[] args) {
        String serverUrl = "http://localhost:5820";
        String username = "username";
        String password = "password";

        try (AdminConnection adminConnection = toServer(serverUrl).credentials(username, password).connect()) {
            adminConnection.offline("myDatabase");
        }
    }
}
See com.complexible.stardog.api.admin.AdminConnection#offline for more information.
To online a database:
stardog-admin db online myDatabase
See the db online command for more information.
In Stardog Studio:
- Navigate to the “Databases” section.
- Select the database you wish to online.
- Toggle the switch from the offline position to the online position. The orange dot that was previously displayed to the right of the database name should now be green.
curl -u username:password -X PUT \
  http://localhost:5820/admin/databases/myDatabase/online
See the HTTP API for more information.

import com.complexible.stardog.api.admin.AdminConnection;

import static com.complexible.stardog.api.admin.AdminConnectionConfiguration.toServer;

public class OnlineDatabase {
    public static void main(String[] args) {
        String serverUrl = "http://localhost:5820";
        String username = "username";
        String password = "password";

        try (AdminConnection adminConnection = toServer(serverUrl).credentials(username, password).connect()) {
            adminConnection.online("myDatabase");
        }
    }
}
See com.complexible.stardog.api.admin.AdminConnection#online for more information.
Namespaces
Stardog allows database administrators to persist and manage custom namespace prefix bindings.
At database creation time, if data is loaded to the database that has namespace prefixes, then those are persisted for the life of the database. This includes setting the default namespace to the default that appears in the file. Any subsequent queries to the database may simply omit the PREFIX declarations.
If no files are used during database creation, or if the files do not define any prefixes (e.g. N-Triples), then the following prefixes are stored:
Prefix | IRI |
---|---|
(default prefix) | http://api.stardog.com/ |
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs | http://www.w3.org/2000/01/rdf-schema# |
xsd | http://www.w3.org/2001/XMLSchema# |
owl | http://www.w3.org/2002/07/owl# |
stardog | tag:stardog:api: |
When executing queries in the CLI, the default table format for SPARQL SELECT results will use the bindings as qnames. SPARQL CONSTRUCT query output (including export) will also use the stored prefixes. To reiterate, namespace prefix bindings are per database, not global.
Suppose you had a database movies and stored the namespace consisting of the prefix n corresponding to the IRI http://www.imdb.com/name/:
$ stardog query execute movies "select * { ?s rdf:type :Person } limit 5"
+-------------+
| s           |
+-------------+
| n:nm0000001 |
| n:nm0000002 |
| n:nm0000003 |
| n:nm0000004 |
| n:nm0000005 |
+-------------+
The result set above uses the binding n as the qname for each result.
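The effect of a stored binding on result rendering can be sketched in a few lines. This is an illustration only, not how Stardog implements it: the class and method names are hypothetical, the longest-matching-namespace rule is an assumption, and real qname rules also restrict which characters may appear in the local part.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class QNames {
    // Abbreviates an IRI using the longest namespace that is a prefix of it (assumed rule).
    static String abbreviate(String iri, Map<String, String> namespaces) {
        String bestPrefix = null;
        String bestNamespace = "";
        for (Map.Entry<String, String> binding : namespaces.entrySet()) {
            String namespace = binding.getValue();
            if (iri.startsWith(namespace) && namespace.length() > bestNamespace.length()) {
                bestPrefix = binding.getKey();
                bestNamespace = namespace;
            }
        }
        // Without a matching binding, fall back to the full IRI in angle brackets.
        return bestPrefix == null ? "<" + iri + ">" : bestPrefix + ":" + iri.substring(bestNamespace.length());
    }

    public static void main(String[] args) {
        Map<String, String> namespaces = new LinkedHashMap<>();
        namespaces.put("n", "http://www.imdb.com/name/");
        System.out.println(abbreviate("http://www.imdb.com/name/nm0000001", namespaces)); // prints n:nm0000001
    }
}
```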
To add new bindings, use the namespace add command:
stardog namespace add movies --prefix n --uri 'http://www.imdb.com/name/'
To change the default binding, use an empty quoted prefix ("") when adding a new one:
stardog namespace add movies --prefix "" --uri 'http://new.default'
To change an existing binding, remove the existing one using the namespace remove command and then add a new one:
stardog namespace remove movies --prefix ex && stardog namespace add movies --prefix "ex" --uri 'http://another.iri'
To list all namespace prefix bindings, use the namespace list command:
$ stardog namespace list movies
+---------+---------------------------------------------+
| Prefix  | Namespace                                   |
+---------+---------------------------------------------+
|         | http://schema.org/                          |
| ex      | http://some.iri                             |
| foaf    | http://xmlns.com/foaf/0.1/                  |
| geo     | http://www.w3.org/2003/01/geo/wgs84_pos#    |
| n       | http://www.imdb.com/name/                   |
| owl     | http://www.w3.org/2002/07/owl#              |
| rdf     | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
| rdfs    | http://www.w3.org/2000/01/rdf-schema#       |
| sfn     | tag:stardog:api:functions:                  |
| spf     | tag:stardog:api:property:                   |
| stardog | tag:stardog:api:                            |
| t       | http://www.imdb.com/title/                  |
| xml     | http://www.w3.org/XML/1998/namespace        |
| xsd     | http://www.w3.org/2001/XMLSchema#           |
+---------+---------------------------------------------+
To export all namespace prefix bindings in the database, use the namespace export command:
# saving the namespaces (exported in Turtle by default) to a file prefixes.ttl
stardog namespace export movies > prefixes.ttl
To import namespace prefixes from an RDF file that contains prefix declarations into the database, use the namespace import command:
stardog namespace import -- newDatabase /path/to/prefixes.ttl
Any prefix imported will override any previous mappings for the prefix sharing the same name.
In Stardog Studio:
- Navigate to the “Databases” section.
- Select the database you wish to see all namespace prefix bindings for
- Select the “Namespaces” tab
- From this view, you can edit, add, or remove any prefix binding. Be sure to click the “Save” button in the top right corner after making any changes you wish to persist.
- There are two buttons to import new prefix bindings and export the existing ones.
To retrieve the namespaces stored in the database:
curl -u username:password http://localhost:5820/movies/namespaces
See the HTTP API for more information.
To import namespaces stored in an RDF file:
curl -u username:password -X POST \
  -F name=@path/to/prefixes.ttl \
  http://localhost:5820/movies/namespaces
See the HTTP API for more information.
import com.complexible.stardog.api.Connection;
import com.complexible.stardog.api.ConnectionConfiguration;
import com.stardog.stark.Namespace;
import com.stardog.stark.io.turtle.TurtleUtil;

import java.io.PrintStream;
import java.util.Optional;

public class ManagingNamespaces {
    public static void main(String[] args) {
        String serverUrl = "http://localhost:5820";
        String username = "username";
        String password = "password";

        try (Connection connection = ConnectionConfiguration.to("movies").server(serverUrl).credentials(username, password).connect()) {
            // Add a namespace
            connection.namespaces().add("somePrefix", "http://some.iri");

            // Given an IRI, get the corresponding prefix
            Optional<String> thePrefix = connection.namespaces().prefix("http://some.iri");
            System.out.println(thePrefix.get());

            // Given a prefix, get the corresponding IRI
            Optional<String> theIRI = connection.namespaces().iri("somePrefix");
            System.out.println(theIRI.get());

            // Remove a namespace
            connection.namespaces().remove("somePrefix");

            // List/export namespaces in Turtle
            try (PrintStream out = System.out) {
                for (Namespace ns : connection.namespaces()) {
                    out.print(ns.prefix());
                    out.print(": <");
                    out.print(TurtleUtil.encodeURIString(ns.iri()));
                    out.print("> .");
                    out.println();
                }
            }
        }
    }
}
See com.complexible.stardog.api.Connection#namespaces for more information.
Dropping a Database
Dropping a database deletes the database, all associated files, and metadata. All files on disk related to the database will be deleted, so use this operation with caution.
Provide the database name as the only argument to the db drop command:
stardog-admin db drop myDatabase
In Stardog Studio:
- Navigate to the “Databases” section
- Select the database to be dropped
- Click on “Drop Database”. Confirm you do indeed want to drop this database.
curl -u username:password -X DELETE \
  http://localhost:5820/admin/databases/myDatabase
See the HTTP API for more information.

import com.complexible.stardog.api.admin.AdminConnection;

import static com.complexible.stardog.api.admin.AdminConnectionConfiguration.toServer;

public class DropDatabase {
    public static void main(String[] args) {
        String serverUrl = "http://localhost:5820";
        String username = "username";
        String password = "password";
        String db = "myDatabase";

        try (AdminConnection adminConnection = toServer(serverUrl).credentials(username, password).connect()) {
            // Only drop the database if it exists
            if (adminConnection.list().contains(db)) {
                adminConnection.drop(db);
            }
        }
    }
}
See com.complexible.stardog.api.admin.AdminConnection#drop for more information.