Transaction Logs
This page discusses how to inspect and replay transaction logs in Stardog, enabling point-in-time recovery by combining database backups with transaction log replay.
Overview
Starting with version 12.0, Stardog provides commands to inspect and replay transaction logs. Transaction logs record all database modifications, enabling:
- Point-in-time recovery: Restore a database to any point between backups by replaying transactions
- Cluster synchronization: Transaction logs are used internally to keep cluster nodes in sync
With transaction logging enabled, Stardog writes a sequential list of records of all transactions to disk. This log can be exported and replayed onto a database backup to recover data up to a specific point in time.
Enabling Transaction Logging
Transaction logging is controlled by the `transaction.logging` database configuration option. By default, it is:
- Disabled (`false`) for standalone Stardog servers
- Enabled (`true`) for Stardog Cluster nodes (required for replication)
To enable transaction logging when creating a database:
$ stardog-admin db create -o transaction.logging=true -n myDatabase
To enable transaction logging for an existing database, set the option using `stardog-admin metadata set`. The database must be offline to change this setting:
$ stardog-admin db offline myDatabase
$ stardog-admin metadata set -o transaction.logging=true -- myDatabase
$ stardog-admin db online myDatabase
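To confirm the option took effect, you can read it back with `stardog-admin metadata get` (a quick check; the exact output format may vary by version):
$ stardog-admin metadata get -o transaction.logging myDatabase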
Related Configuration Options
| Option | Default | Description |
|---|---|---|
| `transaction.logging` | `false` | Enable/disable transaction logging |
| `transaction.logging.rotation.size` | `524288000` (500 MB) | Size in bytes at which the log file rotates |
| `transaction.logging.rotation.remove` | `true` | Whether to delete old log files after rotation |
| `transaction.logging.use.rotated` | `true` | Whether to use rotated transaction logs |
The option `transaction.logging.rotation.remove=false` only controls whether transaction log entries are kept beyond one round of rotation; it is not necessary to set it to `false` for point-in-time restore. By default, once the configured log size is reached, old transactions are pruned when the log rotates for the second time.
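For example, to raise the rotation threshold to 1 GB (1073741824 bytes; the value is illustrative) on an existing database, you can use the same offline/online procedure shown above:
$ stardog-admin db offline myDatabase
$ stardog-admin metadata set -o transaction.logging.rotation.size=1073741824 -- myDatabase
$ stardog-admin db online myDatabase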
Transaction Log Structure
Transaction logs contain records for each phase of a transaction’s lifecycle:
| Record Type | Description |
|---|---|
| `Started` | A new transaction has begun, identified by a UUID |
| `Update` | A data modification within a transaction (additions/removals) |
| `Commit` | The transaction has been committed |
| `Rollback` | The transaction has been rolled back |
| `Done` | Transaction processing is complete |
Each transaction is assigned a UUID when it starts. The log maintains a sequential record that preserves the order and relationship between transactions, which is critical for replay validation.
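As a schematic illustration (not actual `tx log` output; the record layout and UUID shown are placeholders), a single committed transaction appears in the log as an ordered sequence of records:

Started  a1b2c3d4-...  (transaction begins; UUID assigned)
Update   a1b2c3d4-...  (additions/removals recorded)
Commit   a1b2c3d4-...  (transaction committed)
Done     a1b2c3d4-...  (transaction processing complete)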
Inspecting Transaction Logs
The `stardog-admin tx log` command allows you to inspect transaction logs from a running database or from a local log file.
View Transactions from a Database
$ stardog-admin tx log myDatabase
This displays a human-readable summary of all transactions in the database’s log.
View Individual Updates
By default, the output shows summarized counts of additions and removals per transaction. Use `--updates` to see individual update records:
$ stardog-admin tx log myDatabase --updates
Export to a File
Export the raw transaction log for later replay:
$ stardog-admin tx log myDatabase --format raw --output backup-txlog.log
Filter by UUID Range
View transactions within a specific UUID range:
$ stardog-admin tx log myDatabase --from-uuid a1b2c3d4-... --to-uuid f9e8d7c6-...
Filter by Time Range
View transactions within a specific time range:
$ stardog-admin tx log myDatabase --from-time 2024-01-15T10:30:00Z --to-time 2024-01-16T10:30:00Z
Read a Local Log File
Read a previously exported log file:
$ stardog-admin tx log --file /path/to/txlog.log
Point-in-Time Recovery
Point-in-time recovery combines database backups with transaction log replay to restore a database to a specific moment in time.
Prerequisites
- Transaction logging must be enabled (`transaction.logging=true`)
- A regular backup should be available, taken within the desired recovery window (the backup includes the last committed transaction UUID)
Recovery Workflow
1. Restore from backup: Start by restoring the most recent backup before your target recovery point.

$ stardog-admin db restore /path/to/backup/myDatabase/2024-01-15

2. Export the transaction log: If the original database still has the transaction log, export it. Otherwise, use a previously exported log file.

$ stardog-admin tx log myDatabase --format raw --output txlog-for-replay.log

Note: The exported log must cover all transactions from the backup point onward. This requires that the log file contains the last committed transaction UUID recorded in the database backup; otherwise, replay validation will fail because some updates may be missing from the log. This validation can be skipped with `--skip-validate`.

3. Preview the replay (recommended): Use `--dry-run` to verify the replay will succeed before applying changes.

$ stardog-admin tx replay --dry-run myDatabase txlog-for-replay.log

4. Replay transactions: Apply the transactions to bring the database forward in time. (The full sequence is combined into a single sketch after this list.)

$ stardog-admin tx replay myDatabase txlog-for-replay.log
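Putting the steps together, a minimal recovery sketch looks like the following (the backup path and log file name are illustrative; adjust them to your environment):

# 1. Restore the most recent backup taken before the target recovery point
$ stardog-admin db restore /path/to/backup/myDatabase/2024-01-15
# 2. Export the transaction log covering the backup point onward
$ stardog-admin tx log myDatabase --format raw --output txlog-for-replay.log
# 3. Verify the replay will succeed without applying changes
$ stardog-admin tx replay --dry-run myDatabase txlog-for-replay.log
# 4. Apply the transactions
$ stardog-admin tx replay myDatabase txlog-for-replay.log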
Replay with UUID Filtering
To recover to a specific point, use UUID filtering. Both `--from-uuid` and `--to-uuid` are required:
$ stardog-admin tx replay --from-uuid a1b2c3d4-... --to-uuid f9e8d7c6-... myDatabase txlog.log
Replay with Time Filtering
You can also filter by time, but this requires `--skip-validate`, since a time-filtered range may not begin with the transaction matching the database's last committed transaction (see How Validation Works):
$ stardog-admin tx replay --skip-validate --from-time 2024-01-15T10:30:00Z --to-time 2024-01-16T10:30:00Z myDatabase txlog.log
Time-based filtering with `--skip-validate` should be used with care. Ensure you understand the transaction sequence to avoid applying an incomplete or inconsistent set of changes.
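A safer alternative, sketched below, is to use the time filter only for inspection: list the transactions in the window with `tx log`, note the boundary UUIDs from the output, and then replay with UUID filtering so that replay validation can still check continuity (the UUIDs shown are placeholders you would read off the inspection output):

# Inspect the window to identify the first and last transaction UUIDs
$ stardog-admin tx log myDatabase --from-time 2024-01-15T10:30:00Z --to-time 2024-01-16T10:30:00Z
# Replay with UUID filtering, keeping validation enabled
$ stardog-admin tx replay --from-uuid a1b2c3d4-... --to-uuid f9e8d7c6-... myDatabase txlog.log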
Validation and Consistency
The replay command validates transaction logs by default to ensure database consistency.
How Validation Works
When a database commits a transaction, it records the transaction’s UUID as the “last committed transaction” in its metadata. During replay, Stardog validates that:
- The log contains a `Started` record for the database's last committed transaction UUID
- The log contains a corresponding `Done` record for that transaction
- All subsequent transactions in the log are complete (have both `Started` and `Done` records)
This ensures the log maintains continuity with the database state and contains only complete transactions.
Invalid Log Scenarios
Replay validation will fail when:
- Missing start record: The first transaction in the replay range doesn’t match (or follow) the database’s last committed transaction
- Incomplete transactions: A `Commit` or `Done` record exists without a corresponding `Started` record after the last committed transaction point
- Data corruption: Log records carry hash sums to ensure integrity; validation fails if a transaction record is corrupted
Skipping Validation
You can skip validation with `--skip-validate`, but this is not recommended:
$ stardog-admin tx replay --skip-validate myDatabase txlog.log
Skipping validation may result in an inconsistent database state. Only use this option if you understand the implications and have verified the log contents manually.
Permissions
Transaction log operations require the DBMS `execute` permission on the database (e.g., `execute:admin:myDatabase`), exactly the same permission required for backing up and restoring a database.
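Granting this permission to a user might look like the following (a sketch assuming the standard `stardog-admin user grant` syntax with `-a` for the action and `-o` for the object, and a hypothetical user name; verify against your version's CLI help):
$ stardog-admin user grant -a execute -o admin:myDatabase someUser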
Cluster Considerations
In Stardog Cluster deployments:
- Transaction logging is automatically enabled and cannot be disabled
- Transaction logs are used internally for node synchronization
- The `tx log` and `tx replay` commands work the same way as in standalone mode
- Point-in-time recovery should be performed while the database is not receiving concurrent writes from other applications; otherwise, the replay might fail or lead to an unexpected state
Best Practices
- Enable logging for production: Set `transaction.logging=true` for any database where point-in-time recovery is important
- Coordinate with backups: Schedule regular backups and note the backup timestamps. The backup implicitly records the last committed transaction, which serves as your recovery starting point
- Test recovery procedures: Periodically test your point-in-time recovery workflow on a non-production system
- Monitor log size: Transaction logs can grow large. Plan for adequate disk space and consider the rotation size setting
- Use dry-run: Preview replay operations with `--dry-run` before applying changes to production databases
- Maintain log retention: Ensure that transaction logs are retained long enough to cover your desired recovery window (see the sketch after this list)
- No concurrent use of the database: Ensure that no application is using the database while performing point-in-time recovery
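As a sketch of the backup-coordination and retention practices, a scheduled job (the schedule, output path, and file naming are illustrative) can pair each backup with a transaction log export so the recovery window stays covered:

# Nightly job: back up the database, then export its transaction log alongside it
$ stardog-admin db backup myDatabase
$ stardog-admin tx log myDatabase --format raw --output /backups/txlogs/myDatabase-$(date +%F).log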