Standby Nodes
This page describes Standby Nodes in Stardog - a useful feature for maintaining HA clusters.
Overview
The notion of a standby node was introduced in Stardog 6.2.3. A standby node runs next to the Stardog cluster and periodically requests updates. The standby does not service any user requests, neither reads nor writes. Its purpose is to stay closely synchronized with the cluster without disturbing the cluster with the more difficult join event. Because it drifts from full synchronization only by limited time windows, it allows for two important features:
- The standby node can safely run database and server backups while taking minimal CPU cycles from cluster nodes servicing user requests.
- The standby node can be upgraded to a full node and thereby quickly join the cluster because it is already closely in sync.
This latter point is important for maintaining HA clusters. If one node goes down, a standby node can be promoted to a real, functional node, restoring the cluster to full strength.
Managing a Standby Node
To start a cluster node as a standby node, add the following lines to stardog.properties:
pack.standby=true
pack.standby.node.sync.interval=5m
This will configure the node to be in standby mode and to wait 5 minutes between synchronization attempts. The interval begins when the synchronization completes. In other words, if a synchronization takes 3 minutes, it will be 8 minutes before the next synchronization attempt.
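The timing rule above can be checked with a little shell arithmetic. This is only an illustrative sketch: cycle_minutes is a hypothetical helper, not part of Stardog, and it assumes both values are given in whole minutes.

```shell
# Hypothetical helper: minutes from the start of one synchronization to the
# start of the next, given that the interval only begins once a sync completes.
cycle_minutes() {
  sync_duration_min=$1   # how long the synchronization itself took
  interval_min=$2        # value of pack.standby.node.sync.interval, in minutes
  echo $(( sync_duration_min + interval_min ))
}

cycle_minutes 3 5   # prints 8, matching the example in the text
```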
Starting with Stardog 9.0, converting a standby node to a full node requires shutting the node down, removing the standby configuration properties, and restarting it. Once upgraded, the node may take some time to fully join the cluster. Its progress can be monitored with stardog-admin cluster status.
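The property-removal step of a promotion could be scripted roughly as follows. This is a minimal sketch, not an official procedure: remove_standby_properties is a hypothetical helper, and it assumes that every line beginning with pack.standby in stardog.properties is a standby setting that should be dropped. Run it only after the standby node has been shut down.

```shell
# Hypothetical helper: strip the standby settings from a stardog.properties
# file, keeping a .bak copy of the original next to it.
remove_standby_properties() {
  props_file=$1
  # Keep a backup so the standby configuration can be restored if needed.
  cp "$props_file" "$props_file.bak" || return 1
  # Assumption: all lines starting with "pack.standby" are standby settings
  # (this covers pack.standby and pack.standby.node.sync.interval).
  sed '/^[[:space:]]*pack\.standby/d' "$props_file.bak" > "$props_file"
}
```

After the file has been edited, restart the node and follow its progress with stardog-admin cluster status.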
To check the status of a standby node, run the cluster standby-status command:
$ stardog-admin --server http://<standby node IP>:5820 cluster standby-status
Another feature of a standby node is the ability to pause synchronization. To request a pause of synchronization, run the cluster standby-pause command:
$ stardog-admin --server http://<standby node IP>:5820 cluster standby-pause
This tells the standby node that you want to pause it; it does not mean the node is already paused. Pausing can take some time if the node is in the middle of a large synchronization event. The progress of the pause can be monitored with the cluster standby-status command:
$ stardog-admin --server http://<standby node IP>:5820 cluster standby-status
A node is not safely paused until the state PAUSED is returned. To resume synchronization, run the cluster standby-resume command:
$ stardog-admin --server http://<standby node IP>:5820 cluster standby-resume
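Because a pause request is not immediate, a script that depends on a paused standby has to wait for the PAUSED state before proceeding. The following is a rough sketch of such a wait loop. wait_for_paused is a hypothetical helper, and the check assumes only that the status output contains the word PAUSED once the node is paused; the exact output format of cluster standby-status is not shown here, so adjust the match as needed.

```shell
# Hypothetical helper: poll a status command until its output contains the
# word PAUSED, or give up after a fixed number of attempts.
#   $1      maximum number of attempts
#   $2      seconds to sleep between attempts
#   $3...   the status command to run, with its arguments
wait_for_paused() {
  max_attempts=$1
  delay_seconds=$2
  shift 2
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    # Assumption: the status output includes PAUSED as a whole word.
    if "$@" | grep -qw 'PAUSED'; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay_seconds"
  done
  return 1   # never reached PAUSED within the allowed attempts
}
```

For example, passing 60 and 5 followed by the cluster standby-status command shown above would check every 5 seconds, up to 60 times, and return a non-zero exit code if the node never reports PAUSED.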
Finally, you can attempt a sync outside of the configured standby synchronization schedule with the cluster standby-attempt-sync command:
$ stardog-admin --server http://<standby node IP>:5820 cluster standby-attempt-sync
These standby commands cannot be sent to the IP address of a full cluster node, nor to a load balancer directing requests to full cluster nodes. They must point directly to the standby node address.
Because standby nodes are not full cluster members, many cluster commands do not work with standby nodes, such as:
- cluster info
- cluster status
- cluster readonly-start
- cluster readonly-stop
- cluster shutdown
- cluster diagnostics-report
To shut down a standby node, you must issue the server stop command directly to the standby node address or send the process a SIGTERM.
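The two shutdown options can be combined in a small wrapper. This is a sketch under stated assumptions: stop_standby and the STARDOG_ADMIN variable are hypothetical names, the SIGTERM fallback is a design choice of this example rather than documented behavior, and the fallback only works when the script runs on the standby host and knows the process ID.

```shell
# Hypothetical helper: stop a standby node with `server stop`, falling back
# to SIGTERM if that command fails and a process ID was supplied.
#   $1  standby node address, e.g. http://<standby node IP>:5820
#       (never a full cluster node or a load balancer)
#   $2  optional PID of the standby's Stardog process on this host
stop_standby() {
  standby_url=$1
  standby_pid=${2:-}

  # STARDOG_ADMIN is an assumed override for the path to stardog-admin.
  if "${STARDOG_ADMIN:-stardog-admin}" --server "$standby_url" server stop; then
    return 0
  fi

  if [ -n "$standby_pid" ]; then
    kill -TERM "$standby_pid"
    return $?
  fi
  return 1   # server stop failed and no PID was given for the fallback
}
```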