Eager Replication v4
To prevent conflicts after a commit, set the bdr.commit_scope
parameter to global
. The default setting of local
disables Eager
Replication, so BDR applies changes and resolves potential conflicts
post-commit, as described in the Conflicts.
In this mode, BDR uses two-phase commit (2PC) internally to detect and
resolve conflicts prior to the local commit. It turns a normal
COMMIT
of the application into an implicit two-phase commit,
requiring all peer nodes to prepare the transaction before the origin
node continues to commit it. If at least one node is down or
unreachable during the prepare phase, the commit times out after
bdr.global_commit_timeout
, leading to an abort of the
transaction. There's no restriction on the use of
temporary tables as exists in explicit 2PC in PostgreSQL.
Once prepared, Eager All-Node Replication employs Raft to reach a commit decision. In case of failures, this allows a remaining majority of nodes to reach a congruent commit or abort decision so they can finish the transaction. This unblocks the objects and resources locked by the transaction and allows the cluster to proceed.
In case all nodes remain operational, the origin confirms the commit to the client only after all nodes have committed to ensure that the transaction is immediately visible on all nodes after the commit.
Requirements
Eager All-Node Replication uses prepared transactions internally.
Therefore all replica nodes need to have a max_prepared_transactions
configured high enough to handle all incoming transactions,
possibly in addition to local two-phase commit and CAMO transactions.
(See Configuration: Max-prepared transactions).
We recommend configuring it the same on all nodes and high enough to
cover the maximum number of concurrent transactions across the cluster
for which CAMO or Eager All-Node Replication is used. Other than
that, no special configuration is required, and every EDB Postgres Distributed cluster can
run Eager All-Node transactions.
Usage
To enable Eager All-Node Replication, the client needs to switch to global commit scope at session level or for individual transactions as shown here:
The client can continue to issue a COMMIT
at the end of the
transaction and let BDR manage the two phases:
Error handling
Given that BDR manages the transaction, the client needs to check only the
result of the COMMIT
. (This is advisable in any case, including single-node
Postgres.)
In case of an origin node failure, the remaining nodes eventually
(after at least bdr.global_commit_timeout
) decide to roll back the
globally prepared transaction. Raft prevents inconsistent commit versus
rollback decisions. However, this requires a majority of connected
nodes. Disconnected nodes keep the transactions prepared
to eventually commit them (or roll back) as needed to reconcile with the
majority of nodes that might have decided and made further progress.
Eager All-Node Replication with CAMO
Eager All-Node Replication goes beyond CAMO and implies it. There's no
need to also enable bdr.enable_camo
if bdr.commit_scope
is set to global
. Nor does a CAMO pair need to be
configured with bdr.add_camo_pair()
.
You can use any other active BDR node in the role of a CAMO partner to
query a transaction's status. However, you must indicate this non-CAMO usage
to the bdr.logical_transaction_status
function with a
third argument of require_camo_partner = false
. Otherwise, it might
complain about a missing CAMO configuration (which isn't required for
Eager transactions).
Other than this difference in configuration and invocation of that function, the client needs to adhere to the protocol described for CAMO. Reference client implementations are provided to customers on request .
Limitations
Transactions using Eager Replication can't yet execute DDL,
nor do they support explicit two-phase commit.
These might be allowed in later releases.
The TRUNCATE
command is allowed.
Replacing a crashed and unrecoverable BDR node with its physical standby isn't currently supported in combination with Eager All-Node transactions.
BDR currently offers a global commit scope only. Later releases will support Eager Replication with fewer nodes for increased availability.
You can't combine Eager All-Node Replication with
synchronous_replication_availability = 'async'
. Trying to configure
both causes an error.
Eager All-Node transactions are not currently supported in combination
with the Decoding Worker feature nor with transaction streaming.
Installations using Eager must keep enable_wal_decoder
and streaming_mode
disabled for the BDR node group.
Synchronous replication uses a mechanism for transaction confirmation
different from Eager. The two aren't compatible, and you must not use them
together. Therefore, whenever using Eager All-Node transactions,
make sure none of the BDR nodes are configured in
synchronous_standby_names
. Using synchronous replication to a
non-BDR node acting as a physical standby is possible.
Effects of Eager Replication in general
Increased commit latency
Adding a synchronization step means additional communication between the nodes, resulting in additional latency at commit time. Eager All-Node Replication adds roughly two network roundtrips (to the furthest peer node in the worst case). Logical standby nodes and nodes still in the process of joining or catching up aren't included but eventually receive changes.
Before a peer node can confirm its local preparation of the
transaction, it also needs to apply it locally. This further adds to
the commit latency, depending on the size of the transaction.
This is independent of the synchronous_commit
setting and
applies whenever bdr.commit_scope
is set to global
.
Increased abort rate
Note
The performance of Eager Replication is currently known to be unexpectedly slow (around 10 TPS only). This is expected to be improved in the next release.
With single-node Postgres, or even with BDR in its default asynchronous
replication mode, errors at COMMIT
time are rare. The additional
synchronization step adds a source of errors, so applications need to
be prepared to properly handle such errors (usually by applying a
retry loop).
The rate of aborts depends solely on the workload. Large transactions changing many rows are much more likely to conflict with other concurrent transactions.