
Standby Nodes

This page describes Standby Nodes in Stardog - a useful feature for maintaining HA clusters.

Page Contents
  1. Overview
  2. Managing a Standby Node
  3. Standby Node Limits

Overview

The notion of a standby node was introduced in Stardog 6.2.3. A standby node runs alongside the Stardog cluster and periodically requests updates from it. The standby does not service any user requests, neither reads nor writes. Its purpose is to stay very closely synchronized with the cluster without subjecting the cluster to the more expensive join event. Because it drifts from full synchronization only within limited time windows, it enables two important features:

  1. The standby node can safely run database and server backups while taking minimal CPU cycles from cluster nodes servicing user requests.

  2. The standby node can be upgraded to a full node and thereby quickly join the cluster because it is already closely in sync.

This latter point is important for maintaining HA clusters. If one node goes down, a standby node can be promoted to a real, functional node, restoring the cluster to full strength.

Managing a Standby Node

To start a cluster node as a standby node, simply add the following line to stardog.properties:

pack.standby=true
pack.standby.node.sync.interval=5m

This will configure the node to be in standby mode and to wait 5 minutes between synchronization attempts. The interval begins when the synchronization completes. In other words, if a synchronization takes 3 minutes, it will be 8 minutes before the next synchronization attempt.

Starting with Stardog 9.0, to convert a standby node to a full node, you must shut it down, remove the standby configuration properties, and restart it. Once upgraded, it may take a bit of time for the node to fully join the cluster. Its progress can be monitored with stardog-admin cluster status.
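Assuming the standby configuration lives in stardog.properties as shown above, the conversion might look like the following sketch (the address placeholder follows the conventions of the examples below):

```shell
# Stop the standby node by addressing it directly
stardog-admin --server http://<standby node IP>:5820 server stop

# Before restarting, remove the standby settings from stardog.properties,
# i.e. delete these lines:
#   pack.standby=true
#   pack.standby.node.sync.interval=5m

# Restart the node; it now starts as a full cluster member
stardog-admin server start

# Monitor its progress joining the cluster
stardog-admin cluster status
```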

To check the status of a standby node, run the cluster standby-status command:

$ stardog-admin --server http://<standby node IP>:5820 cluster standby-status

Another feature of a standby node is the ability to pause synchronization. To request a pause of synchronization, run the cluster standby-pause command:

$ stardog-admin --server http://<standby node IP>:5820 cluster standby-pause

This tells the standby node that you want to pause it; however, it does not mean it is paused. Pausing can take some time if the node is in the middle of a large synchronization event. The status of pausing can be monitored with the cluster standby-status command:

$ stardog-admin --server http://<standby node IP>:5820 cluster standby-status

A node is not safely paused until the state PAUSED is returned. To resume synchronization, run the cluster standby-resume command:

$ stardog-admin --server http://<standby node IP>:5820 cluster standby-resume

Finally, you can attempt a sync outside of the configured standby synchronization schedule with the cluster standby-attempt-sync command:

$ stardog-admin --server http://<standby node IP>:5820 cluster standby-attempt-sync

Note that these commands cannot be issued to the IP address of a full cluster node, nor to a load balancer directing requests to full cluster nodes. You must point directly to the standby node address.

Because standby nodes are not full cluster members, many cluster commands do not work with standby nodes, such as:

  • cluster info
  • cluster status
  • cluster readonly-start
  • cluster readonly-stop
  • cluster shutdown
  • cluster diagnostics-report

To shut down a standby node, you must issue the server stop command directly to the standby node address or send the process a SIGTERM.
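Following the pattern of the commands above, a direct shutdown might look like:

```shell
$ stardog-admin --server http://<standby node IP>:5820 server stop
```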

Standby Node Limits

The number of standby nodes that can be run at a time is controlled by the pack.standby.node value in your license. Each standby node has an auto-generated unique ID associated with it and will register itself with the cluster using this ID when it first starts. When the number of standby nodes registered for a cluster reaches the limit allowed by the license, additional standby nodes will refuse to start.

Read replicas and geo replicas are categorized as standby nodes too. For this reason, the limit in the license applies to the total number of standby nodes, read replicas, and geo replicas registered for the cluster.

When a standby node shuts down, it does not deregister itself from the cluster. This avoids situations where a standby node cannot restart after a shutdown because another standby node registered itself in the meantime. However, it also means that if you delete a standby node permanently, new standby nodes may fail to start because the license limit has been reached. For this reason, you can manually revoke standby access for deleted standby nodes using the standbyRevokeAccess API call.
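The text above names the standbyRevokeAccess call but not its HTTP shape. As an illustrative sketch only, where the endpoint path, HTTP method, and node-id parameter are all assumptions rather than the documented API, revoking access for a deleted standby's ID might look like:

```shell
# HYPOTHETICAL endpoint path and parameter name -- consult the Stardog
# HTTP API reference for the actual standbyRevokeAccess call.
curl -u admin:admin -X POST \
  "http://<cluster node IP>:5820/admin/cluster/standby/revoke?node-id=<standby node ID>"
```

The unique ID referenced here is the auto-generated identifier each standby node registers with the cluster when it first starts.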