Transaction Logs
This page discusses how to inspect and replay transaction logs in Stardog, enabling point-in-time recovery by combining database backups with transaction log replay.
Overview
Starting with version 12.0, Stardog provides commands to inspect and replay transaction logs. Transaction logs record all database modifications, enabling:
- Point-in-time recovery: Restore a database to any point between backups by replaying transactions
- Cluster synchronization: Transaction logs are used internally to keep cluster nodes in sync
With transaction logging enabled, Stardog writes a sequential list of records of all transactions to disk. This log can be exported and replayed onto a database backup to recover data up to a specific point in time.
Enabling Transaction Logging
Transaction logging is controlled by the `transaction.logging` database configuration option. By default, it is:
- Disabled (`false`) for standalone Stardog servers
- Enabled (`true`) for Stardog Cluster nodes (required for replication)
To enable transaction logging when creating a database:
$ stardog-admin db create -o transaction.logging=true -n myDatabase
To enable transaction logging for an existing database, set the option using `stardog-admin metadata set`. The database must be offline to change this setting:
$ stardog-admin db offline myDatabase
$ stardog-admin metadata set -o transaction.logging=true -- myDatabase
$ stardog-admin db online myDatabase
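To confirm the option took effect, you can read it back with `stardog-admin metadata get` (a quick check; the exact output format may vary by version):
$ stardog-admin metadata get -o transaction.logging myDatabase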
Related Configuration Options
| Option | Default | Description |
|---|---|---|
| `transaction.logging` | `false` | Enable/disable transaction logging |
| `transaction.logging.rotation.size` | `524288000` (500 MB) | Size in bytes at which the log file rotates |
| `transaction.logging.rotation.remove` | `true` | Whether to delete old log files after rotation |
| `transaction.logging.use.rotated` | `true` | Whether to use rotated transaction logs |
The option `transaction.logging.rotation.remove=false` only controls whether transaction log entries are kept beyond one round of rotation; it is not necessary to set it to `false` for point-in-time restore. By default, once the configured log size is reached, old transactions are pruned when the log rotates for the second time.
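For example, to raise the rotation threshold to 1 GB (1073741824 bytes; the value is illustrative) on an existing database, you can use the same offline/online procedure shown above:
$ stardog-admin db offline myDatabase
$ stardog-admin metadata set -o transaction.logging.rotation.size=1073741824 -- myDatabase
$ stardog-admin db online myDatabase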
Transaction Log Structure
Transaction logs contain records for each phase of a transaction’s lifecycle:
| Record Type | Description |
|---|---|
| `Started` | A new transaction has begun, identified by a UUID |
| `Update` | A data modification within a transaction (additions/removals) |
| `Commit` | The transaction has been committed |
| `Rollback` | The transaction has been rolled back |
| `Done` | Transaction processing is complete |
Each transaction is assigned a UUID when it starts. The log maintains a sequential record that preserves the order and relationship between transactions, which is critical for replay validation.
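As a schematic illustration (not actual `tx log` output; the record layout and UUID shown are placeholders), a single committed transaction appears in the log as an ordered sequence of records:

Started  a1b2c3d4-...  (transaction begins; UUID assigned)
Update   a1b2c3d4-...  (additions/removals recorded)
Commit   a1b2c3d4-...  (transaction committed)
Done     a1b2c3d4-...  (transaction processing complete)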
Inspecting Transaction Logs
The `stardog-admin tx log` command allows you to inspect transaction logs from a running database or from a local log file.
View Transactions from a Database
$ stardog-admin tx log myDatabase
This displays a human-readable summary of all transactions in the database’s log.
View Individual Updates
By default, the output shows summarized counts of additions and removals per transaction. Use `--updates` to see individual update records:
$ stardog-admin tx log myDatabase --updates
Export to a File
Export the raw transaction log for later replay:
$ stardog-admin tx log myDatabase --format raw --output backup-txlog.log
Filter by UUID Range
View transactions within a specific UUID range:
$ stardog-admin tx log myDatabase --from-uuid a1b2c3d4-... --to-uuid f9e8d7c6-...
Filter by Time Range
View transactions within a specific time range:
$ stardog-admin tx log myDatabase --from-time 2024-01-15T10:30:00Z --to-time 2024-01-16T10:30:00Z
Read a Local Log File
Read a previously exported log file:
$ stardog-admin tx log --file /path/to/txlog.log
Point-in-Time Recovery
Point-in-time recovery combines database backups with transaction log replay to restore a database to a specific moment in time.
Prerequisites
- Transaction logging must be enabled (`transaction.logging=true`)
- A regular backup should be available, taken within the desired recovery window (the backup includes the last committed transaction UUID)
Recovery Workflow
1. Restore from backup: Start by restoring the most recent backup before your target recovery point.

$ stardog-admin db restore /path/to/backup/myDatabase/2024-01-15

2. Export the transaction log: If the original database still has the transaction log, export it. Otherwise, use a previously exported log file.

$ stardog-admin tx log myDatabase --format raw --output txlog-for-replay.log

Note: The exported log must cover all transactions from the backup point onward. This requires that the log file contains the last committed transaction UUID recorded in the database backup; otherwise, replay validation will fail because some updates may be missing from the log. This validation can be skipped with `--skip-validate`.

3. Preview the replay (recommended): Use `--dry-run` to verify the replay will succeed before applying changes.

$ stardog-admin tx replay --dry-run myDatabase txlog-for-replay.log

4. Replay transactions: Apply the transactions to bring the database forward in time. (The full sequence is combined into a single sketch after this list.)

$ stardog-admin tx replay myDatabase txlog-for-replay.log
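Putting the steps together, a minimal recovery sketch looks like the following (the backup path and log file name are illustrative; adjust them to your environment):

# 1. Restore the most recent backup taken before the target recovery point
$ stardog-admin db restore /path/to/backup/myDatabase/2024-01-15
# 2. Export the transaction log covering the backup point onward
$ stardog-admin tx log myDatabase --format raw --output txlog-for-replay.log
# 3. Verify the replay will succeed without applying changes
$ stardog-admin tx replay --dry-run myDatabase txlog-for-replay.log
# 4. Apply the transactions
$ stardog-admin tx replay myDatabase txlog-for-replay.log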
Replay with UUID Filtering
To recover to a specific point, use UUID filtering. Both `--from-uuid` and `--to-uuid` are required:
$ stardog-admin tx replay --from-uuid a1b2c3d4-... --to-uuid f9e8d7c6-... myDatabase txlog.log
Replay with Time Filtering
You can also filter by time, but this requires `--skip-validate`, since a time-filtered range may not begin with the transaction matching the database's last committed transaction (see How Validation Works):
$ stardog-admin tx replay --skip-validate --from-time 2024-01-15T10:30:00Z --to-time 2024-01-16T10:30:00Z myDatabase txlog.log
Time-based filtering with `--skip-validate` should be used with care. Ensure you understand the transaction sequence to avoid applying an incomplete or inconsistent set of changes.
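A safer alternative, sketched below, is to use the time filter only for inspection: list the transactions in the window with `tx log`, note the boundary UUIDs from the output, and then replay with UUID filtering so that replay validation can still check continuity (the UUIDs shown are placeholders you would read off the inspection output):

# Inspect the window to identify the first and last transaction UUIDs
$ stardog-admin tx log myDatabase --from-time 2024-01-15T10:30:00Z --to-time 2024-01-16T10:30:00Z
# Replay with UUID filtering, keeping validation enabled
$ stardog-admin tx replay --from-uuid a1b2c3d4-... --to-uuid f9e8d7c6-... myDatabase txlog.log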
Validation and Consistency
The replay command validates transaction logs by default to ensure database consistency.
How Validation Works
When a database commits a transaction, it records the transaction’s UUID as the “last committed transaction” in its metadata. During replay, Stardog validates that:
- The log contains a `Started` record for the database's last committed transaction UUID
- The log contains a corresponding `Done` record for that transaction
- All subsequent transactions in the log are complete (have both `Started` and `Done` records)
This ensures the log maintains continuity with the database state and contains only complete transactions.
Invalid Log Scenarios
Replay validation will fail when:
- Missing start record: The first transaction in the replay range doesn’t match (or follow) the database’s last committed transaction
- Incomplete transactions: A `Commit` or `Done` record exists without a corresponding `Started` record after the last committed transaction point
- Data corruption: Log records carry hash sums to ensure integrity; validation fails if a transaction record is corrupted
Skipping Validation
You can skip validation with `--skip-validate`, but this is not recommended:
$ stardog-admin tx replay --skip-validate myDatabase txlog.log
Skipping validation may result in an inconsistent database state. Only use this option if you understand the implications and have verified the log contents manually.
Permissions
Transaction log operations require the DBMS `execute` permission on the database (e.g., `execute:admin:myDatabase`), exactly the same permission required for backing up and restoring a database.
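Granting this permission to a user might look like the following (a sketch assuming the standard `stardog-admin user grant` syntax with `-a` for the action and `-o` for the object, and a hypothetical user name; verify against your version's CLI help):
$ stardog-admin user grant -a execute -o admin:myDatabase someUser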
Cluster Considerations
In Stardog Cluster deployments:
- Transaction logging is automatically enabled and cannot be disabled
- Transaction logs are used internally for node synchronization
- The `tx log` and `tx replay` commands work the same way as in standalone mode
- Point-in-time recovery should be performed while the database is not receiving concurrent writes from other applications; otherwise, the replay might fail or lead to an unexpected state
Best Practices
- Enable logging for production: Set `transaction.logging=true` for any database where point-in-time recovery is important
- Coordinate with backups: Schedule regular backups and note the backup timestamps. The backup implicitly records the last committed transaction, which serves as your recovery starting point
- Test recovery procedures: Periodically test your point-in-time recovery workflow on a non-production system
- Monitor log size: Transaction logs can grow large. Plan for adequate disk space and consider the rotation size setting
- Use dry-run: Preview replay operations with `--dry-run` before applying changes to production databases
- Maintain log retention: Ensure that transaction logs are retained long enough to cover your desired recovery window (see the sketch after this list)
- No concurrent use of the database: Ensure that no application is using the database while performing point-in-time recovery
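As a sketch of the backup-coordination and retention practices, a scheduled job (the schedule, output path, and file naming are illustrative) can pair each backup with a transaction log export so the recovery window stays covered:

# Nightly job: back up the database, then export its transaction log alongside it
$ stardog-admin db backup myDatabase
$ stardog-admin tx log myDatabase --format raw --output /backups/txlogs/myDatabase-$(date +%F).log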