Parallel Apply v5
What is Parallel Apply?
Parallel Apply is a feature of PGD that allows a PGD node to use multiple writers per subscription. This behavior generally increases the throughput of a subscription and improves replication performance.
Configuring Parallel Apply
Two variables control Parallel Apply in PGD 5: bdr.max_writers_per_subscription (defaults to 8) and bdr.writers_per_subscription (defaults to 2).
This configuration gives each subscription two writers. However, in some circumstances, the system might allocate up to eight writers for a subscription.
Changing bdr.max_writers_per_subscription requires a server restart to take effect.
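As a quick check, both settings can be read from a SQL session. This is a minimal sketch that assumes the BDR extension is loaded on the node, so that PostgreSQL recognizes the two setting names given above:

```sql
-- Read the current values of the two Parallel Apply settings.
-- Assumes the BDR extension is loaded so these names are recognized.
SHOW bdr.max_writers_per_subscription;
SHOW bdr.writers_per_subscription;
```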
You can change bdr.writers_per_subscription for a specific subscription without a restart by:
- Halting the subscription using bdr.alter_subscription_disable.
- Setting the new value.
- Resuming the subscription using bdr.alter_subscription_enable.
First though, establish the name of the subscription using select * from bdr.subscription. For this example, the subscription name is bdr_bdrdb_bdrgroup_node2_node1.
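The three steps above might look like the following sketch. The column names num_writers and sub_name in bdr.subscription, and the value 4, are assumptions for illustration. Check them against the catalog on your PGD version before running anything like this:

```sql
-- 1. Halt the subscription.
SELECT bdr.alter_subscription_disable('bdr_bdrdb_bdrgroup_node2_node1');

-- 2. Set the new number of writers.
--    Assumption: the per-subscription value is held in a num_writers column
--    and the subscription name in a sub_name column.
UPDATE bdr.subscription
   SET num_writers = 4
 WHERE sub_name = 'bdr_bdrdb_bdrgroup_node2_node1';

-- 3. Resume the subscription.
SELECT bdr.alter_subscription_enable('bdr_bdrdb_bdrgroup_node2_node1');
```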
When to use Parallel Apply
Parallel Apply is on by default and, for most operations, we recommend leaving it on.
Monitoring Parallel Apply
To support Parallel Apply's deadlock mitigation, PGD 5.2 adds columns to bdr.stat_subscription.
The new columns are nprovisional_waits, ntuple_waits, and ncommit_waits.
These are metrics that indicate how well Parallel Apply is managing what
previously would have been deadlocks. They don't reflect overall system
performance.
The nprovisional_waits value reflects the number of operations on the same
tuples being performed by concurrent apply transactions. These are provisional
waits that aren't actually waiting yet but could start waiting.
If a tuple's write needs to wait until it can be safely applied, it's counted in
ntuple_waits. Fully applied transactions that waited before being committed
are counted in ncommit_waits.
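One way to read these counters is a query like the following sketch. It assumes PGD 5.2 or later and that bdr.stat_subscription exposes the subscription name as sub_name. Adjust the column list if your version differs:

```sql
-- Show the deadlock-mitigation counters for each subscription.
-- Assumption: sub_name is the subscription-name column in this view.
SELECT sub_name, nprovisional_waits, ntuple_waits, ncommit_waits
  FROM bdr.stat_subscription
 ORDER BY sub_name;
```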
Disabling Parallel Apply
To disable Parallel Apply, set bdr.writers_per_subscription to 1.
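A minimal sketch of one way to do this from SQL uses PostgreSQL's standard ALTER SYSTEM command. Whether the new value takes effect on a configuration reload or only after a restart isn't stated here, so treat that part as something to verify:

```sql
-- Persist the setting in postgresql.auto.conf (standard PostgreSQL behavior).
ALTER SYSTEM SET bdr.writers_per_subscription = 1;

-- After the server has picked up the change, confirm the value in use.
SHOW bdr.writers_per_subscription;
```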
Deadlock mitigation
When Parallel Apply is operating, the transactional changes from the subscription are written by multiple writers. However, each writer ensures that the final commit of its transaction doesn't violate the commit order as executed on the origin node. If there's a violation, an error is generated and the transaction can be rolled back.
This mechanism previously meant that when the following are all true, the resulting error could manifest as a deadlock:
- A transaction is pending commit and modifies a row that another transaction needs to change.
- That other transaction executed on the origin node before the pending transaction did.
- The pending transaction took out a lock request.
Additionally, handling the error could increase replication lag due to a combination of the time taken:
- To detect the deadlock
- For the client to roll back its transaction
- For indirect garbage collection of the changes that were already applied
- To redo the work
This is where Parallel Apply’s deadlock mitigation, introduced in PGD 5.2, can help. For any transaction, Parallel Apply looks at transactions already scheduled for any row (tuple) that the current transaction wants to write. If it finds one, the current transaction's change to that row is marked as needing to wait until the other transaction commits. This approach ensures that rows are written in the correct order.
Parallel Apply support
Up to and including PGD 5.1, don't use Parallel Apply with Group Commit, CAMO, or Eager Replication. Disable Parallel Apply in these scenarios. If you're using PGD 5.1 or earlier and experiencing a large number of deadlocks, you might also want to disable Parallel Apply or consider upgrading.
From PGD 5.2, Parallel Apply works with CAMO. It isn't compatible with Group Commit or Eager Replication, so disable it if either is in use.