Backup and Restore

This page discusses backing up and restoring individual Stardog databases. For more information on backing up (and restoring) the entire Stardog server, please see the Server Backup section.

Page Contents
  1. Overview
  2. Backing up the Database
    1. Backup to S3
      1. IAM roles
    2. Backup to Google Cloud Platform (GCP)
    3. Backup to Azure Blob Storage
  3. Restoring a Database
    1. Restore from S3
    2. Restore from Google Cloud Platform (GCP)
    3. Restore from Azure Blob Storage
    4. Automatic Restore
    5. Restoring Permissions
  4. Logical Backup

Overview

Stardog provides two different kinds of backup operations: database backups and server backups. In this section, we’ll just be discussing database backups. These commands perform physical backups, including database metadata, rather than logical backups via some RDF serialization. They are native Stardog backups and can only be restored with Stardog tools as explained below.

Backups may be accomplished while a database is online. The backup is performed in a read transaction; reads and writes may continue, but writes performed during the backup are not reflected in the backup.

Backing up the Database

A database backup saves the contents of a single database along with database metadata, including user and role permissions associated with the database. Database backups can be written to the file system, AWS S3, Google Cloud Platform, or Azure Blob Storage.

The stardog-admin db backup command assumes a default location for its output, namely, $STARDOG_HOME/.backup. That default may be overridden by setting the backup.dir server property in your stardog.properties file. Backups are stored in directories by database name and then in date-versioned subdirectories for each backup volume.

Your typical backup directory would have a layout similar to this:

.backup/myDb/2020-10-02
.backup/myDb/2020-10-11
.backup/myOtherDb/2020-06-21
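Because the subdirectories are named by date, the most recent backup of a database can be located with a plain text sort. The following is a minimal sketch, not part of Stardog: latest_backup is a hypothetical helper, and it assumes every backup volume is named YYYY-MM-DD exactly as in the layout above.

```shell
# latest_backup BACKUP_ROOT DB_NAME
# Prints the path of the newest date-named backup directory for DB_NAME.
# Hypothetical helper; assumes volumes are named YYYY-MM-DD as in the layout above.
latest_backup() {
  root=$1
  db=$2
  # ISO dates sort chronologically when sorted as text; ignore other entries.
  newest=$(ls -1 "$root/$db" 2>/dev/null \
    | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}$' \
    | sort \
    | tail -n 1)
  # Fail if the database has no date-named backup directory.
  [ -n "$newest" ] || return 1
  printf '%s\n' "$root/$db/$newest"
}
```

For example, latest_backup "$STARDOG_HOME/.backup" myDb would print the path of the newest myDb volume, which can then be handed to the restore command.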

If you need to specify a location outside of $STARDOG_HOME (e.g. a network mount) you can set the backup.location server property in your stardog.properties or pass it to the --to argument in the stardog-admin db backup command.

EXAMPLE

To back up a Stardog database called foobar:

$ stardog-admin db backup foobar

EXAMPLE

To perform a remote backup, pass in a directory that is mounted in the current OS namespace via a network protocol:

$ stardog-admin db backup --to /my/network/share/stardog-backups foobar

In the progress monitor, backups to remote systems (e.g., S3, GCP, and Azure) may show progress at 100% for an extended period of time due to file transfers.
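When several databases need to be backed up to the same location, the command shown above can be wrapped in a loop. This is a sketch under assumptions: backup_all and the STARDOG_ADMIN variable are hypothetical names introduced here (the variable only makes a dry run possible), and only the db backup --to form documented above is used.

```shell
# backup_all DEST DB...
# Backs up each named database to DEST, stopping at the first failure.
# STARDOG_ADMIN is a hypothetical override; set it to "echo" for a dry run.
backup_all() {
  dest=$1
  shift
  for db in "$@"; do
    ${STARDOG_ADMIN:-stardog-admin} db backup --to "$dest" "$db" || return 1
  done
}
```

Calling backup_all /my/network/share/stardog-backups foobar myDb would run one backup per database. With STARDOG_ADMIN=echo, the commands are printed instead of executed.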

Backup to S3

For S3 backups use a URL in the following format:

s3://[<endpoint hostname>:<endpoint port>]/<bucket name>/<path prefix>?region=<AWS Region>&AWS_ACCESS_KEY_ID=<access key>&AWS_SECRET_ACCESS_KEY=<verySecretKey1>

The endpoint hostname and endpoint port values are only used for on-premises S3 clones. To use Amazon S3 those values can be left blank and the URL will have three / before the bucket as in:

s3:///mybucket/backup/prefix?region=us-east-1&AWS_ACCESS_KEY_ID=accessKey&AWS_SECRET_ACCESS_KEY=secret
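The URL format above can be assembled from its parts with a small helper. This is only an illustration: s3_backup_url is a hypothetical function, and it does not percent-encode its arguments, so whether special characters in a secret key need encoding is left open here.

```shell
# s3_backup_url BUCKET PREFIX REGION ACCESS_KEY SECRET_KEY [ENDPOINT]
# Prints an S3 backup URL in the format described above.
# Hypothetical helper; performs no percent-encoding of its arguments.
s3_backup_url() {
  bucket=$1 prefix=$2 region=$3 access=$4 secret=$5
  # ENDPOINT is host:port for an on-premises S3 clone; leave empty for Amazon S3,
  # which yields the three slashes before the bucket name.
  endpoint=${6:-}
  printf 's3://%s/%s/%s?region=%s&AWS_ACCESS_KEY_ID=%s&AWS_SECRET_ACCESS_KEY=%s\n' \
    "$endpoint" "$bucket" "$prefix" "$region" "$access" "$secret"
}
```

The result can then be passed to the --to argument, for example: stardog-admin db backup --to "$(s3_backup_url mybucket backup/prefix us-east-1 accessKey secret)" foobar.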

A default S3 location can also be specified in the stardog.properties file with the key backup.location.

IAM roles

In Stardog 9.1, database backup to and restore from S3 support IAM roles attached to the instance as a means of providing credentials. If an access key and secret key are not specified in the S3 URL, Stardog will attempt to use an IAM role attached to the instance. You can read more about how to use IAM roles in the AWS documentation.

Requests using IAM roles attached to the instance are essentially the same except you do not provide the AWS access key or secret:

s3:///mybucket/backup/prefix?region=us-east-1

If an IAM role is attached to the instance and AWS keys are also provided in the S3 URL, Stardog will only attempt the request with the access and secret keys from the URL. If those credentials are incorrect and the request fails, Stardog will not fall back to the IAM role. You will need to either specify the correct credentials in the S3 URL or remove the credentials from the URL so that the IAM role can be used.
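The precedence rule described above can be checked before a backup is started. The sketch below is a hypothetical pre-flight helper, not a Stardog command. It only inspects the URL text, and it reports a URL that carries just one of the two keys as "incomplete", since that case is not covered here.

```shell
# s3_credential_source URL
# Prints which credentials the rule above implies for URL:
#   url-keys   both keys are present in the URL, so only they are tried
#   iam-role   no keys are present, so the instance's IAM role is tried
#   incomplete only one key is present (behavior not described here)
s3_credential_source() {
  has_id=no
  has_secret=no
  case $1 in *AWS_ACCESS_KEY_ID=*) has_id=yes ;; esac
  case $1 in *AWS_SECRET_ACCESS_KEY=*) has_secret=yes ;; esac
  case $has_id$has_secret in
    yesyes) echo url-keys ;;
    nono)   echo iam-role ;;
    *)      echo incomplete ;;
  esac
}
```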

Backup to Google Cloud Platform (GCP)

For GCP backups use a URL in the following format:

gs://<bucket name>/<path prefix>?GOOGLE_APPLICATION_CREDENTIALS=<path to Google Credentials JSON file>

See the GCP documentation for creating a Google credentials JSON file.

A default GCP backup location can also be specified in the stardog.properties file with the key backup.location.
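A GCP backup URL can be built the same way, with a check that the credentials file is readable. This is a sketch: gcs_backup_url is a hypothetical helper, and the readability check is only meaningful if the script runs on the host where the credentials path will be resolved, which is an assumption.

```shell
# gcs_backup_url BUCKET PREFIX CREDENTIALS_JSON
# Prints a GCP backup URL in the format described above.
# Hypothetical helper; fails if the credentials file is not readable locally.
gcs_backup_url() {
  bucket=$1 prefix=$2 creds=$3
  if [ ! -r "$creds" ]; then
    echo "credentials file not readable: $creds" >&2
    return 1
  fi
  printf 'gs://%s/%s?GOOGLE_APPLICATION_CREDENTIALS=%s\n' "$bucket" "$prefix" "$creds"
}
```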

Backup to Azure Blob Storage

For Azure backups use a URL in the following format:

https://<storage account>.blob.core.windows.net/<container>/<prefix>?<token>

The database will be stored in your Azure storage account under the specified container and directory identified by prefix.

If another scheme or host is required for your Azure account, they can be configured in stardog.properties with backup.azure.scheme, which defaults to https, and backup.azure.host, which defaults to blob.core.windows.net.

See the Azure Blob Storage documentation for configuring and securing a storage container in your storage account.

Similar to S3 and GCP backups, a default Azure Blob Storage backup location can also be specified in the stardog.properties file with the key backup.location.
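As an illustration of the Azure URL format, the helper below joins the storage account, container, prefix, and token. It is a hypothetical function that uses the default scheme and host named above. It also strips a leading ? from the token, on the assumption that some tools emit the token with one, which would otherwise produce ?? in the URL.

```shell
# azure_backup_url ACCOUNT CONTAINER PREFIX TOKEN
# Prints an Azure Blob Storage backup URL using the default scheme and host.
# Hypothetical helper; a single leading "?" on TOKEN is removed.
azure_backup_url() {
  account=$1 container=$2 prefix=$3
  token=${4#\?}
  printf 'https://%s.blob.core.windows.net/%s/%s?%s\n' \
    "$account" "$container" "$prefix" "$token"
}
```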

Restoring a Database

To restore a Stardog database from a Stardog backup volume, simply pass a fully-qualified path to the volume in question. The location of the backup should be the full path to the backup, not the location of the backup directory as specified in your Stardog configuration. There is no need to specify the name of the database to restore.

To restore a database from its backup:

$ stardog-admin db restore $STARDOG_HOME/.backup/myDb/2012-06-21
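A common mistake is to pass the backup directory itself rather than a backup volume inside it. The check below is a rough sketch: looks_like_backup_volume is a hypothetical helper, and it assumes volumes are named YYYY-MM-DD as in the layout shown earlier.

```shell
# looks_like_backup_volume PATH
# Succeeds if the last path component is a YYYY-MM-DD directory name.
# Hypothetical helper; checks the name only, not the directory contents.
looks_like_backup_volume() {
  case $(basename "$1") in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}
```

A wrapper script could call this before invoking stardog-admin db restore and refuse to continue when the path does not point at a dated volume.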

Restore from S3

Backups can also be restored directly from S3 by using an S3 URL in the following format:

s3://[<endpoint hostname>:<endpoint port>]/<bucket name>/<path prefix>/<database name>?region=<AWS Region>&AWS_ACCESS_KEY_ID=<access key>&AWS_SECRET_ACCESS_KEY=<verySecretKey1>

Unlike the backup URL, the restore URL must include the database name as the last entry of its path.
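Since the restore URL differs from the backup URL only by the database name at the end of the path, it can be derived mechanically. The sketch below uses a hypothetical helper, restore_url, which inserts the name before the query string. The same rule is stated for Azure Blob Storage URLs further down.

```shell
# restore_url BACKUP_URL DB_NAME
# Prints BACKUP_URL with DB_NAME appended as the last path entry,
# keeping any query string (region, credentials, token) unchanged.
# Hypothetical helper.
restore_url() {
  url=$1 db=$2
  case $url in
    *\?*)
      base=${url%%\?*}    # everything before the first "?"
      query=${url#*\?}    # everything after the first "?"
      printf '%s/%s?%s\n' "${base%/}" "$db" "$query"
      ;;
    *)
      printf '%s/%s\n' "${url%/}" "$db"
      ;;
  esac
}
```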

Restore from Google Cloud Platform (GCP)

Backups can also be restored directly from GCP by using a GCP URL in the following format:

gs://<bucket name>/<path prefix>?GOOGLE_APPLICATION_CREDENTIALS=<path to Google Credentials JSON file>

Restore from Azure Blob Storage

Backups can also be restored directly from Azure Blob Storage by using a URL in the following format:

https://<storage account>.blob.core.windows.net/<container>/<prefix>/<database name>?<token>

Unlike the backup URL, the restore URL must include the database name as the last entry of its path.

Automatic Restore

Stardog can be configured to automatically restore databases from a backup location on startup. For example, when a Stardog cluster node first starts it could pull all of the database data down from an S3 backup before joining the cluster.

Automatic restore is not supported for GCP or Azure Blob Storage backups.

There are two server properties that control this behavior.

Property                      Description
backup.autorestore.dbnames    A regular expression that matches the names of the databases to automatically restore on startup, e.g. .* for every database.
backup.autorestore.onfailure  A boolean value that determines whether all databases that failed to load should be automatically restored from a backup location.

As with any server property, they should be set in your stardog.properties file.
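Before setting backup.autorestore.dbnames, it can help to preview which database names a candidate pattern would select. The following is only an approximation. preview_autorestore is a hypothetical helper. It uses grep's POSIX extended regular expressions, which may differ from the regular expression dialect Stardog uses. It also assumes the pattern must match the whole name.

```shell
# preview_autorestore PATTERN DB...
# Prints each database name that PATTERN matches in full.
# Hypothetical helper; grep -E is a stand-in for Stardog's own regex matching.
preview_autorestore() {
  pattern=$1
  shift
  for name in "$@"; do
    # -x requires the whole name to match; -q suppresses grep's output.
    if printf '%s\n' "$name" | grep -Eqx -- "$pattern"; then
      printf '%s\n' "$name"
    fi
  done
}
```

For instance, preview_autorestore 'my.*' myDb myOtherDb foobar lists myDb and myOtherDb but not foobar.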

Restoring Permissions

Backups created by version 7.7.1 or newer include the permissions related to the database and grant these permissions when the database is restored. The permissions included in the backup cover the database, database metadata, database admin, named graphs, data quality constraints, and sensitive properties. See the security model section for details on these security resources.

There are some caveats with restoring permissions and some permissions might not be restored. For example, at the time the backup was created a certain user might have had permissions over the database. But if that user was deleted and does not exist at the time the database is being restored, the corresponding permissions will not be restored.

Furthermore, for the permissions in the backup to be restored, the user performing the restore operation should have privileges to grant the permissions specified in the backup. If the user performing the restore operation does not have such privileges, then permissions will not be restored. As a best practice, it is recommended that a superuser or a user with grant:*:* privileges perform the restore operation.

The database will be restored even if errors are encountered while restoring the permissions. Such errors will be included in the restore operation output.

Backups created by version 7.7.0 or earlier do not contain any permission information. When such backups are restored, the database owner is granted the default permissions, but no additional permissions are granted. Any other required permissions need to be granted manually after the restore is complete.

Logical Backup

In addition to physical backups, one can perform a logical backup using the stardog data export command that will save the contents of a database into a standard RDF file.

EXAMPLE

Export the database myDb as NTRIPLES:

$ stardog data export --format NTRIPLES myDb

EXAMPLE

Export the database myDb to a gzipped file in TURTLE:

$ stardog data export myDb export.ttl.gz

Logical backups do not contain database metadata or configuration options.
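For recurring logical backups, the export command can be wrapped so that each run writes to a dated file. This is a sketch under assumptions: logical_backup and the STARDOG variable are hypothetical names introduced here, and the file name follows the gzipped Turtle example above.

```shell
# logical_backup DB [DATE]
# Exports DB to a gzipped Turtle file named DB-DATE.ttl.gz.
# DATE defaults to today; STARDOG is a hypothetical override
# (set it to "echo" to print the command instead of running it).
logical_backup() {
  db=$1
  stamp=${2:-$(date +%Y-%m-%d)}
  ${STARDOG:-stardog} data export "$db" "${db}-${stamp}.ttl.gz"
}
```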