Point-in-Time Recovery
In this tutorial you will build a continuous point-in-time recovery (PITR) pipeline for a Stardog database. The pipeline takes one full backup and then captures incremental transaction-log slices on demand, so the database can be restored to any moment from the backup forward — up to and including a specific transaction.
Overview
This tutorial assumes you have already read Transaction Logs for the underlying concepts. Briefly:
- A full backup captures the database at a point in time. It also records, in its metadata, the UUID of the last committed transaction at that moment. This is the anchor for replay.
- A transaction-log slice is a contiguous portion of the database’s transaction log, identified by its starting and ending transaction UUIDs. Replayed back-to-back, slices reconstruct everything that happened between two anchors.
- Point-in-time recovery means: restore the backup → replay every slice in order until the desired point.
The pipeline you’ll build has two parts:
- `pitr-backup.sh` — takes one full backup, extracts the anchor UUID from the backup’s metadata, then polls the live database for its current `index.last.tx`. Each time the UUID advances, it exports a new tx-log slice covering exactly the new range.
- `pitr-restore.sh` — restores the backup, then replays every captured slice in order. Optionally stops at a specific UUID.
The “poll, then slice only on change” approach avoids the obvious trap of writing empty slices when nothing has changed, and it makes every slice exactly cover one chunk of history with no overlap and no gap.
Prerequisites
- Stardog 12.0 or later.
- `jq` installed on the backup host — used to parse the JSON metadata output.
- A user with DBMS execute permission `[EXECUTE, "admin:<db>"]` on the target database (the same permission required for `db backup` and `db restore`).
- Transaction logging enabled on the database (always enabled for cluster databases). If it’s not already enabled, turn it on:
$ stardog-admin db offline myDatabase
$ stardog-admin metadata set -o transaction.logging=true -- myDatabase
$ stardog-admin db online myDatabase
Without `transaction.logging=true`, no records are written and `tx log` returns an empty log. Cluster databases have transaction logging enabled by default, and it cannot be disabled there.
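To confirm the option is set before you start a capture, you can read it back the same way the scripts read `index.last.tx` (this assumes `metadata get` exposes `transaction.logging` like any other database option):
$ stardog-admin metadata get -o transaction.logging myDatabase --output-format json \
    | jq -r '.["transaction.logging"]'
true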
How it works
Anchoring on the backup
When a backup is created, the database’s metadata file (metadata.bk, inside the dated backup directory) records the UUID of the last committed transaction. That UUID is the exact point at which the backup’s data is consistent. Any tx-log slice you want to replay onto the restored backup must start from this UUID.
Use metadata convert to read the UUID directly from the file:
$ stardog-admin metadata convert \
--input-format BINARY --output-format json \
/var/stardog/pitr/backup/myDatabase/2026-05-12/metadata.bk \
| jq -r '.["index.last.tx"]'
54a1f110-fa6c-4d71-8b11-820bd9ea01be
Reading from the backup file (rather than querying the live database) guarantees the anchor UUID and the backup data are in sync — even if more transactions committed in the small window between the snapshot completing and the script reading the metadata.
Detecting new transactions
The currently-committed UUID on the live database is exposed under the metadata key index.last.tx:
$ stardog-admin metadata get -o index.last.tx myDatabase --output-format json \
| jq -r '.["index.last.tx"]'
9f823c11-77be-4d28-94a1-1d04e8a3aabe
If this UUID is different from the one at the end of the last slice, new transactions have been committed and need to be captured. If it’s identical, there’s nothing to do — skip the export.
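Condensed, the per-tick decision the backup script makes is just this (an excerpt of the loop shown in full below; `last_uuid` holds the terminal UUID of the previous slice):
current_uuid="$(stardog-admin metadata get -o index.last.tx myDatabase \
    --output-format json | jq -r '.["index.last.tx"]')"
if [ "$current_uuid" = "$last_uuid" ]; then
    : # nothing committed since the last slice; skip this tick
else
    : # new history exists; export the range ($last_uuid, $current_uuid]
fi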
Slicing the log
stardog-admin tx log --format raw exports a binary slice that tx replay can consume directly. Bounding it with --from-uuid and --to-uuid, using the UUID values captured earlier, makes the slice precisely cover the new range:
$ stardog-admin tx log myDatabase \
--from-uuid 54a1f110-fa6c-4d71-8b11-820bd9ea01be \
--to-uuid 9f823c11-77be-4d28-94a1-1d04e8a3aabe \
--format raw --output /var/stardog/pitr/txlog/tx-0001.log
Transaction log exported to: /var/stardog/pitr/txlog/tx-0001.log
Last transaction UUID: 9f823c11-77be-4d28-94a1-1d04e8a3aabe
The backup script
pitr-backup.sh writes everything under a single capture directory:
$BACKUP_DIR/
├── backup/myDatabase/2026-05-12/ # the full backup
└── txlog/
├── tx-0001-<uuid>.log
├── tx-0002-<uuid>.log
└── ...
Each slice’s filename embeds its terminal UUID for traceability. The numeric prefix is a monotonic counter, so lexicographic sort = chronological order at replay time.
#!/usr/bin/env bash
set -euo pipefail
# Continuous point-in-time backup. Takes one full database backup, then
# polls index.last.tx every $INTERVAL seconds. Whenever the UUID
# advances, exports the new range as a tx-log slice.
#
# Args / env:
# $1 database name
# BACKUP_DIR capture root (default: /var/stardog/pitr)
# INTERVAL seconds between polls (default: 60)
# STARDOG_ADMIN path to stardog-admin (default: stardog-admin)
DB="${1:?usage: $0 <database>}"
BACKUP_DIR="${BACKUP_DIR:-/var/stardog/pitr}"
INTERVAL="${INTERVAL:-60}"
STARDOG_ADMIN="${STARDOG_ADMIN:-stardog-admin}"
mkdir -p "$BACKUP_DIR/backup" "$BACKUP_DIR/txlog"
# 1. Take a full backup. db backup prints a line of the form
# "Database <db> backed up <n> triples to <path> in <time>".
# Extract the path so we can read the metadata file out of it.
echo "Taking full backup of $DB..."
backup_path="$("$STARDOG_ADMIN" db backup --to "$BACKUP_DIR/backup" "$DB" \
| sed -n 's/.* triples to \(.*\) in .*/\1/p' \
| tail -n 1)"
if [ -z "$backup_path" ] || [ ! -d "$backup_path" ]; then
echo "Could not determine the backup directory from db backup output." >&2
exit 1
fi
echo "Backup written to $backup_path"
# 2. Read the anchor UUID directly from the backup metadata. This is the
# UUID of the last committed transaction at the exact moment of the
# snapshot — reading it from the file guarantees consistency with the
# backup's data, regardless of what commits afterwards on the live db.
last_uuid="$("$STARDOG_ADMIN" metadata convert \
--input-format BINARY --output-format json \
"$backup_path/metadata.bk" \
| jq -r '.["index.last.tx"]')"
if [ -z "$last_uuid" ] || [ "$last_uuid" = "null" ]; then
echo "Could not read index.last.tx from $backup_path/metadata.bk;" \
"is transaction.logging enabled for $DB?" >&2
exit 1
fi
echo "Backup anchored at $last_uuid"
# 3. Poll the live db. Whenever index.last.tx advances, slice the log.
seq=0
while true; do
  sleep "$INTERVAL"
  # "|| true" keeps a transient failure (network blip, server restart) from
  # killing the loop under set -e; the empty-UUID check below handles retry.
  current_uuid="$("$STARDOG_ADMIN" metadata get -o index.last.tx "$DB" \
    --output-format json | jq -r '.["index.last.tx"]' || true)"
  if [ -z "$current_uuid" ] || [ "$current_uuid" = "null" ]; then
    echo "[$(date -u +%FT%TZ)] failed to read current UUID; will retry" >&2
    continue
  fi
  if [ "$current_uuid" = "$last_uuid" ]; then
    # No new transactions since the previous slice.
    continue
  fi
  next=$((seq + 1))
  slice="$BACKUP_DIR/txlog/$(printf 'tx-%04d-%s.log' "$next" "$current_uuid")"
  # Only advance the counter and last_uuid after a successful export, so a
  # failed export is retried over the same range on the next tick.
  if ! "$STARDOG_ADMIN" tx log "$DB" \
      --from-uuid "$last_uuid" --to-uuid "$current_uuid" \
      --format raw --output "$slice" >/dev/null; then
    echo "[$(date -u +%FT%TZ)] slice export failed; will retry" >&2
    continue
  fi
  seq=$next
  echo "[$(date -u +%FT%TZ)] slice $seq: $last_uuid -> $current_uuid"
  last_uuid="$current_uuid"
done
A few details worth pointing out:
- `$backup_path` is parsed out of `db backup`’s stdout, which prints `Database <db> backed up <n> triples to <path> in <time>`. The `sed` extracts the path between ` triples to ` and ` in `. This avoids guessing the date-versioned directory name and works whether `backup.dir` defaults are used or `--to` overrides them (see the sketch after this list).
- Polling reuses the same UUID the previous slice ended on as `--from-uuid` for the next slice. Slices form a continuous chain with no overlap and no gap.
- If a single iteration fails (network blip, server restart), the loop retries from the same `last_uuid` on the next tick. The interval’s worth of transactions simply rolls into the next successful slice.
- The slice filename `tx-NNNN-<uuid>.log` uses the numeric prefix for ordering and embeds the terminal UUID for human traceability.
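To see the extraction in isolation, feed a sample line through the same `sed` expression (the path and time here are illustrative):
$ echo "Database myDatabase backed up 12345 triples to /var/stardog/pitr/backup/myDatabase/2026-05-12 in 00:00:03.120" \
    | sed -n 's/.* triples to \(.*\) in .*/\1/p'
/var/stardog/pitr/backup/myDatabase/2026-05-12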
The replay script
pitr-restore.sh restores the captured backup and then replays each tx-log slice in chronological (lexical) order. tx replay validates log continuity by default, so any gap will surface as a validation error rather than silently producing an inconsistent state.
#!/usr/bin/env bash
set -euo pipefail
# Restore a database from a pitr-backup.sh capture and replay the
# accumulated tx-log slices.
#
# Args / env:
# $1 capture root (the BACKUP_DIR used by pitr-backup.sh)
# $2 source database name (must match the name used at
# backup time; this is the directory under $1/backup/)
# $3 optional target database name to restore into
# (default: same as source). Must not already exist on
# the target server.
# STOP_UUID optional: replay only up to this UUID (inclusive);
# passed to every tx replay call until the restored
# database reaches it
# STARDOG_ADMIN path to stardog-admin (default: stardog-admin)
CAPTURE="${1:?usage: $0 <capture-dir> <source-db> [target-db]}"
SOURCE_DB="${2:?usage: $0 <capture-dir> <source-db> [target-db]}"
TARGET_DB="${3:-$SOURCE_DB}"
STARDOG_ADMIN="${STARDOG_ADMIN:-stardog-admin}"
# 1. Locate and restore the most recent full backup under the capture
# directory. The backup is looked up by the source name; the restored
# database is named with the target name.
latest_backup="$(ls -1d "$CAPTURE/backup/$SOURCE_DB"/* | LC_ALL=C sort | tail -n 1)"
echo "Restoring $latest_backup -> $TARGET_DB"
"$STARDOG_ADMIN" db restore --name "$TARGET_DB" "$latest_backup"
# 2. Replay tx-log slices in order. tx-NNNN-<uuid>.log makes lex sort
# = chronological. The numeric prefix carries the ordering; the
# trailing UUID is for human traceability.
shopt -s nullglob
slices=("$CAPTURE/txlog/"tx-*.log)
shopt -u nullglob
if [ "${#slices[@]}" -eq 0 ]; then
echo "No tx-log slices found; restored backup as-is."
exit 0
fi
IFS=$'\n' sorted=($(printf '%s\n' "${slices[@]}" | LC_ALL=C sort))
unset IFS
for i in "${!sorted[@]}"; do
  slice="${sorted[$i]}"
  if [ -n "${STOP_UUID:-}" ]; then
    echo "Replaying $slice up to $STOP_UUID"
    "$STARDOG_ADMIN" tx replay --to-uuid "$STOP_UUID" "$TARGET_DB" "$slice"
    current_uuid="$("$STARDOG_ADMIN" metadata get -o index.last.tx "$TARGET_DB" \
      --output-format json | jq -r '.["index.last.tx"]')"
    if [ "$current_uuid" = "$STOP_UUID" ]; then
      echo "Reached stop UUID $STOP_UUID"
      break
    fi
  else
    echo "Replaying $slice"
    "$STARDOG_ADMIN" tx replay "$TARGET_DB" "$slice"
  fi
done
if [ -n "${STOP_UUID:-}" ] && [ "$current_uuid" != "$STOP_UUID" ]; then
echo "STOP_UUID $STOP_UUID was not reached in the available slices." >&2
exit 1
fi
echo "Recovery complete."
db restore rejects restoring over an existing database of the same name unless --overwrite is used, and --overwrite cannot be combined with --name. The script above always passes --name, so the target database name must be free on the target server. If you need to overwrite the existing same-named database, either drop it first (stardog-admin db drop <db>) or invoke db restore --overwrite directly without --name.
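If the original name is taken by a database you intend to replace, the drop-first route looks like this (destructive: the existing database is gone, together with anything committed after your last captured slice):
$ stardog-admin db drop myDatabase
$ ./pitr-restore.sh /var/stardog/pitr myDatabase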
Walking through a full cycle
Put the two scripts on the backup host and start the capture in the background:
$ chmod +x pitr-backup.sh pitr-restore.sh
$ BACKUP_DIR=/var/stardog/pitr INTERVAL=60 ./pitr-backup.sh myDatabase &
Taking full backup of myDatabase...
Backup written to /var/stardog/pitr/backup/myDatabase/2026-05-12
Backup anchored at 54a1f110-fa6c-4d71-8b11-820bd9ea01be
[2026-05-12T14:32:11Z] slice 1: 54a1f110-... -> 7c9e1d44-...
[2026-05-12T14:34:11Z] slice 2: 7c9e1d44-... -> 9f823c11-...
...
The capture directory now looks like this:
$ tree /var/stardog/pitr
/var/stardog/pitr
├── backup
│ └── myDatabase
│ └── 2026-05-12
│ ├── data.bk
│ └── metadata.bk
└── txlog
├── tx-0001-7c9e1d44-7c1a-4b3f-9e88-6c2b73f9aa11.log
├── tx-0002-9f823c11-77be-4d28-94a1-1d04e8a3aabe.log
└── tx-0003-8aa3f114-f2b7-44e0-9c5e-0a3a8a59e4cd.log
To recover the captured backup to its original name (`myDatabase` must not already exist on the target server; drop it first if needed):
$ ./pitr-restore.sh /var/stardog/pitr myDatabase
Restoring /var/stardog/pitr/backup/myDatabase/2026-05-12 -> myDatabase
Replaying /var/stardog/pitr/txlog/tx-0001-7c9e1d44-....log
Replaying /var/stardog/pitr/txlog/tx-0002-9f823c11-....log
Replaying /var/stardog/pitr/txlog/tx-0003-8aa3f114-....log
Recovery complete.
To restore the same capture under a different name — for example into a side-by-side copy you can inspect before promoting — pass the target name as the third argument:
$ ./pitr-restore.sh /var/stardog/pitr myDatabase myDatabase_recovered
Restoring /var/stardog/pitr/backup/myDatabase/2026-05-12 -> myDatabase_recovered
...
Recovering to a specific moment
To stop replay at a precise point — for example, just before a bad transaction — pass that transaction’s UUID as STOP_UUID:
$ STOP_UUID=8aa3f114-f2b7-44e0-9c5e-0a3a8a59e4cd \
./pitr-restore.sh /var/stardog/pitr myDatabase myDatabase_recovered
When STOP_UUID is set, the script passes --to-uuid on every replay call and checks index.last.tx after each slice. As soon as the restored database reaches STOP_UUID, the script stops. If none of the captured slices reaches that UUID, the script exits with an error instead of replaying past the available history.
UUID-bounded replay is exact. Time-bounded replay (--from-time / --to-time) is also supported but requires --skip-validate because the filter doesn’t necessarily align with the transaction boundaries (see How Validation Works). Prefer UUIDs when you have them.
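If all you have is a wall-clock time, a time-bounded replay of a single slice might look like the following. The flags come from the reference above, but the timestamp syntax shown here is an assumption; check `tx replay`’s help output for the exact format your version accepts:
$ stardog-admin tx replay --skip-validate \
    --to-time "2026-05-12T14:33:00Z" \
    myDatabase_recovered \
    /var/stardog/pitr/txlog/tx-0002-9f823c11-77be-4d28-94a1-1d04e8a3aabe.log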
Finding the right UUID to stop at
If you don’t know the UUID off-hand, list the slices and inspect the relevant one in text format. The terminal UUID in each filename tells you which range it covers:
$ ls /var/stardog/pitr/txlog/
tx-0001-7c9e1d44-7c1a-4b3f-9e88-6c2b73f9aa11.log
tx-0002-9f823c11-77be-4d28-94a1-1d04e8a3aabe.log
tx-0003-8aa3f114-f2b7-44e0-9c5e-0a3a8a59e4cd.log
$ stardog-admin tx log --file \
/var/stardog/pitr/txlog/tx-0003-8aa3f114-f2b7-44e0-9c5e-0a3a8a59e4cd.log \
--format text
tx log --file takes a single file path — pass the exact slice you want to inspect, not a glob. Each transaction shows its Started / Commit / Done records with their UUIDs and timestamps; pick the UUID of the last good transaction.
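For a large slice, filtering the text output can narrow the search. Assuming the Started / Commit / Done labels appear verbatim in the output, something like this surfaces just the commit records:
$ stardog-admin tx log --file \
    /var/stardog/pitr/txlog/tx-0003-8aa3f114-f2b7-44e0-9c5e-0a3a8a59e4cd.log \
    --format text | grep -i commit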
Tuning and operational notes
- Polling interval (`INTERVAL`). Shorter intervals make slices smaller and tighten your achievable recovery point objective (RPO). Most deployments do well between 30 seconds and 5 minutes. The cost of a poll is one cheap metadata read.
- Slice retention. Slices accumulate forever as written. Pair the script with a retention policy that keeps slices at least as long as your oldest usable backup. When you take a new full backup, you can safely discard slices older than that backup’s anchor UUID (see the sketch after this list).
- Disk space. Each slice contains the raw bytes of all transactions in its range — roughly the same size you’d expect from the source database’s own tx-log file for that window. Plan for peak write rates.
- No concurrent writers during recovery. While replaying, no other application should be writing to the target database. Concurrent writes can interleave their own transaction UUIDs and break validation on subsequent slices.
- Don’t reuse a recovered database for further capture. Once you’ve replayed to a stop point, the recovered database’s `index.last.tx` is that stop UUID — restarting `pitr-backup.sh` against it would diverge from the original timeline. Take a fresh full backup of the recovered database first if you want to continue PITR going forward.
- Cluster deployments. Run the backup script against any single cluster node; the tx log is replicated, so any node will return the same UUIDs. During recovery, `db restore` from a file-based backup can only be run when the cluster has a single node — scale the cluster down to one node, restore, then scale back up (see Restoring the Cluster). `db restore` from a cloud backup (S3, GCP) replicates to all nodes and can run against a multi-node cluster directly.
- Securing credentials. The scripts assume `stardog-admin` is authenticated (e.g., via a stored token or `~/.stardog/credentials`). Don’t pass credentials on the command line, since they will appear in the process listing and shell history.
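The retention sketch referenced above: run it after each new full backup to prune every slice the fresh backup already covers. `new_anchor` is the anchor UUID read from the new backup’s `metadata.bk`, exactly as in step 2 of `pitr-backup.sh`; the script assumes every slice follows the `tx-NNNN-<uuid>.log` naming convention.
#!/usr/bin/env bash
set -euo pipefail
# Prune tx-log slices fully covered by a newer full backup.
#   $1  txlog directory (the $BACKUP_DIR/txlog used by pitr-backup.sh)
#   $2  anchor UUID read from the new backup's metadata.bk
TXLOG_DIR="${1:?usage: $0 <txlog-dir> <anchor-uuid>}"
new_anchor="${2:?usage: $0 <txlog-dir> <anchor-uuid>}"
# Only prune if some slice ends exactly at the new anchor. If the anchor
# fell between slice boundaries, nothing can be dropped safely yet.
if ! ls "$TXLOG_DIR"/tx-*-"$new_anchor".log >/dev/null 2>&1; then
  echo "No slice ends at $new_anchor; keeping everything." >&2
  exit 0
fi
# Glob expansion sorts lexically = chronologically. Delete up to and
# including the slice that ends at the anchor; keep everything after it.
for slice in "$TXLOG_DIR"/tx-*.log; do
  rm -v -- "$slice"
  case "$slice" in
    *"-$new_anchor.log") break ;;
  esac
done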
See also
- Transaction Logs — full reference for `tx log` and `tx replay`, including validation rules and configuration options.
- Backup and Restore — full reference for `db backup` and `db restore`.