Backup and Restore
This page discusses backing up and restoring individual Stardog databases. For more information on backing up (and restoring) the entire Stardog server, please see the Server Backup section.
Overview
Stardog provides two different kinds of backup operations: database backups and server backups. In this section we’ll just be discussing database backups. These commands perform physical backups, including database metadata, rather than logical backups via some RDF serialization. They are native Stardog backups and can only be restored with Stardog tools as explained below.
Backups can be taken while a database is online. A backup runs in a read transaction, so reads and writes may continue, but writes performed during the backup are not reflected in it.
Backing up the Database
A database backup saves the contents of a single database along with database metadata including user and role permissions associated with the database.
The stardog-admin db backup command assumes a default location for its output, namely $STARDOG_HOME/.backup; that default may be overridden by setting the server property backup.dir in your stardog.properties file. Backups are stored in directories by database name and then in date-versioned subdirectories for each backup volume.
Your typical backup directory would have a layout similar to this:
.backup/myDb/2020-10-02
.backup/myDb/2020-10-11
.backup/myOtherDb/2020-06-21
If you need to specify a location outside of $STARDOG_HOME (e.g., a network mount), you can set the backup.location server property in your stardog.properties file or pass the location to the --to argument of the stardog-admin db backup command.
EXAMPLE
To back up a Stardog database called foobar:
$ stardog-admin db backup foobar
EXAMPLE
To perform a remote backup, pass a directory that is mounted in the current OS namespace via some network protocol:
$ stardog-admin db backup --to /my/network/share/stardog-backups foobar
Backup to S3
Database backups can also be performed directly to S3 or Google Cloud Platform. For S3 backups use a URL in the following format:
s3://[<endpoint hostname>:<endpoint port>]/<bucket name>/<path prefix>?region=<AWS Region>&AWS_ACCESS_KEY_ID=<access key>&AWS_SECRET_ACCESS_KEY=<verySecretKey1>
The endpoint hostname and endpoint port values are only used for on-premises S3 clones. To use Amazon S3, those values can be left blank, and the URL will have three slashes (///) before the bucket, as in:
s3:///mybucket/backup/prefix?region=us-east-1&AWS_ACCESS_KEY_ID=accessKey&AWS_SECRET_ACCESS_KEY=secret
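The URL format above can be assembled from its parts with a small helper. This is a minimal sketch: the function name s3_backup_url and its argument order are made up for illustration, and it assumes none of the values need URL-encoding.

```shell
#!/bin/sh
# Hypothetical helper (not part of Stardog): print an S3 backup URL.
# Usage: s3_backup_url BUCKET PREFIX REGION ACCESS_KEY SECRET_KEY [HOST:PORT]
# HOST:PORT is only needed for on-premises S3 clones. When it is omitted,
# the URL has three slashes before the bucket name, as Amazon S3 expects.
s3_backup_url() {
    bucket=$1
    prefix=$2
    region=$3
    access_key=$4
    secret_key=$5
    endpoint=${6:-}
    printf 's3://%s/%s/%s?region=%s&AWS_ACCESS_KEY_ID=%s&AWS_SECRET_ACCESS_KEY=%s\n' \
        "$endpoint" "$bucket" "$prefix" "$region" "$access_key" "$secret_key"
}

# Possible use, assuming the URL is accepted by --to like any other location:
#   stardog-admin db backup --to "$(s3_backup_url mybucket backup/prefix us-east-1 "$KEY" "$SECRET")" foobar
```

Keeping the credentials in shell variables, as in the usage comment, avoids typing them literally on the command line.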
A default S3 location can also be specified in the stardog.properties file with the key backup.location.
Backup to Google Cloud Platform (GCP)
For GCP backups use a URL in the following format:
gs://<bucket name>/<path prefix>?GOOGLE_APPLICATION_CREDENTIALS=<path to Google Credentials JSON file>
See the GCP documentation for creating a Google credentials JSON file.
A default GCP backup location can also be specified in the stardog.properties file with the key backup.location.
Restoring a Database
To restore a Stardog database from a Stardog backup volume, simply pass a fully-qualified path to the volume in question. The location of the backup should be the full path to the backup, not the location of the backup directory as specified in your Stardog configuration. There is no need to specify the name of the database to restore.
To restore a database from its backup:
$ stardog-admin db restore $STARDOG_HOME/.backup/myDb/2012-06-21
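Because restore takes the full path to one backup volume, a script usually has to pick a volume first. The sketch below selects the most recent one. It assumes volume directories are named by date (YYYY-MM-DD), as in the layout shown earlier, so that plain text order is also chronological order. The function name latest_backup is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the newest backup volume directory for a database.
# Usage: latest_backup BACKUP_ROOT DB_NAME
# Returns non-zero if the database has no backup volumes under BACKUP_ROOT.
latest_backup() {
    backup_root=$1
    db=$2
    latest=
    # Globs expand in sorted order, so the last directory seen is the newest.
    for dir in "$backup_root/$db"/*/; do
        [ -d "$dir" ] || continue   # the glob matched nothing
        latest=${dir%/}             # drop the trailing slash
    done
    [ -n "$latest" ] || return 1
    printf '%s\n' "$latest"
}

# Possible use:
#   stardog-admin db restore "$(latest_backup "$STARDOG_HOME/.backup" myDb)"
```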
Restore from S3
Backups can also be restored directly from S3 by using an S3 URL in the following format:
s3://[<endpoint hostname>:<endpoint port>]/<bucket name>/<path prefix>/<database name>?region=<AWS Region>&AWS_ACCESS_KEY_ID=<access key>&AWS_SECRET_ACCESS_KEY=<verySecretKey1>
Unlike the backup URL, the database name must be specified as the last entry of the path field in the URL.
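Since the restore URL differs from the backup URL only by the database name at the end of the path, one can be derived from the other. The following is a sketch under that assumption; the function name s3_restore_url is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turn an S3 backup URL into a restore URL by appending
# the database name to the path, before the query string.
# Usage: s3_restore_url BACKUP_URL DB_NAME
s3_restore_url() {
    backup_url=$1
    db=$2
    path=${backup_url%%\?*}   # everything before the first '?'
    path=${path%/}            # avoid a doubled slash before the database name
    case $backup_url in
        *\?*) printf '%s/%s?%s\n' "$path" "$db" "${backup_url#*\?}" ;;
        *)    printf '%s/%s\n' "$path" "$db" ;;
    esac
}

# Possible use:
#   stardog-admin db restore "$(s3_restore_url "$BACKUP_URL" myDb)"
```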
Restore from Google Cloud Platform (GCP)
Backups can also be restored directly from GCP by using a GCP URL in the following format:
gs://<bucket name>/<path prefix>?GOOGLE_APPLICATION_CREDENTIALS=<path to Google Credentials JSON file>
Automatic Restore
Stardog can be configured to automatically restore databases from a backup location on startup. For example, when a Stardog cluster node first starts it could pull all of the database data down from an S3 backup before joining the cluster.
There are two server properties that control this behavior.
Properties | Description |
---|---|
backup.autorestore.dbnames | A regular expression that matches the names of the databases to automatically restore on startup, e.g., .* for every database. |
backup.autorestore.onfailure | A boolean value that determines if all databases which failed to load should be automatically restored from a backup location. |
As with any server property, they should be set in your stardog.properties file.
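As an illustration, the two properties could be appended to a properties file from a provisioning script. This is only a sketch: the function name enable_autorestore is hypothetical, it does not check whether the keys are already present, and it assumes a backup location is configured separately (for example with backup.location).

```shell
#!/bin/sh
# Hypothetical helper: append the automatic-restore properties to a file.
# Usage: enable_autorestore PROPERTIES_FILE DB_NAME_REGEX [ON_FAILURE]
# ON_FAILURE is the boolean for backup.autorestore.onfailure (default: true).
enable_autorestore() {
    properties_file=$1
    db_regex=$2
    on_failure=${3:-true}
    {
        printf 'backup.autorestore.dbnames=%s\n' "$db_regex"
        printf 'backup.autorestore.onfailure=%s\n' "$on_failure"
    } >> "$properties_file"
}

# Possible use: restore every database on startup.
#   enable_autorestore "$STARDOG_HOME/stardog.properties" '.*'
```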
Restoring Permissions
Backups created by version 7.7.1 or newer include permissions related to the database in the backup and grant these permissions when the database is restored. The permissions included in the backup cover the database, database metadata, database admin, named graphs, data quality constraints, and sensitive properties. See the security model section for details on these security resources.
There are some caveats with restoring permissions and some permissions might not be restored. For example, at the time the backup was created a certain user might have had permissions over the database. But if that user was deleted and does not exist at the time the database is being restored the corresponding permissions will not be restored.
Furthermore, for the permissions in the backup to be restored, the user performing the restore operation should have privileges to grant the permissions specified in the backup. If the user performing the restore operation does not have such privileges, then permissions will not be restored. As a best practice, it is recommended that a superuser or a user with grant:*:* privileges perform the restore operation.
The database will be restored even if errors are encountered while restoring the permissions. Such errors will be included in the restore operation output.
Backups created by version 7.7.0 or earlier do not contain any permission information. When such backups are restored the database owner is granted the default permission but no additional permissions will be granted. Any required permissions need to be manually granted after the restore is complete.
Logical Backup
In addition to physical backups, one can perform a logical backup using the stardog data export command, which saves the contents of a database into a standard RDF file.
EXAMPLE
Export the database myDb as NTRIPLES:
$ stardog data export --format NTRIPLES myDb
EXAMPLE
Export the database myDb to a gzipped file in TURTLE:
$ stardog data export myDb export.ttl.gz
Logical backups do not contain database metadata or configuration options.
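For recurring logical backups, it can help to put the date in the export file name, mirroring the date-versioned layout of physical backups. The sketch below only prints the command it would run. The function name export_command and the file naming scheme are assumptions for illustration, not a Stardog convention.

```shell
#!/bin/sh
# Hypothetical helper: print a 'stardog data export' command that writes a
# gzipped Turtle file named after the database and a date stamp.
# Usage: export_command DB_NAME [YYYY-MM-DD]   (the date defaults to today)
export_command() {
    db=$1
    stamp=${2:-$(date +%Y-%m-%d)}
    printf 'stardog data export %s %s-%s.ttl.gz\n' "$db" "$db" "$stamp"
}

# Possible use: review the command, then run it.
#   export_command myDb
```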