Using CRDTs v5

Using CRDTs in tables

Permissions required

PGD CRDTs requires usage access to CRDT types. Therefore, any user needing to access CRDT types must have at least the bdr_application role assigned to them.

To use CRDTs, you need to use a particular data type in CREATE/ALTER TABLE rather than standard built-in data types such as integer. For example, consider the following table with one regular integer counter and a single row:

Non-CRDT example

CREATE TABLE non_crdt_example (
    id      integer				PRIMARY KEY,
    counter integer				NOT NULL DEFAULT 0
);

INSERT INTO non_crdt_example (id) VALUES (1);

Suppose you issue the following SQL on two different nodes at same time:

UPDATE non_crdt_example
   SET counter = counter + 1    -- "reflexive" update
 WHERE id = 1;

After both updates are applied, you can see the resulting values using this query:

SELECT * FROM non_crdt_example WHERE id = 1;
   id |   counter
 -----+-----------
    1 |         1
(1 row)

This code shows that you lost one of the increments due to the update_if_newer conflict resolver.

CRDT example

To use a CRDT counter data type instead, you would follow these steps:

Create the table but with a CRDT (bdr.crdt_gcounter) as the counters data type.

CREATE TABLE crdt_example (
    id      integer				PRIMARY KEY,
    counter bdr.crdt_gcounter	NOT NULL DEFAULT 0
);

Configure the table for column-level conflict resolution:

ALTER TABLE crdt_example REPLICA IDENTITY FULL;

SELECT bdr.alter_table_conflict_detection('crdt_example',
			'column_modify_timestamp', 'cts');

And then insert a row with a value for this example.

INSERT INTO crdt_example (id) VALUES (1);

If you now issue, as before, the same SQL on two nodes at same time.

UPDATE crdt_example
   SET counter = counter + 1    -- "reflexive" update
 WHERE id = 1;

Once the changes are applied, you find that the counter has managed to concurrenct updates.

SELECT id, counter FROM crdt_example WHERE id = 1;
   id |   counter
 -----+-----------
    1 |         2
(1 row)

This example shows that the CRDT correctly allows the accumulator columns to work, even in the face of asynchronous concurrent updates that otherwise conflict.

Configuration options

The bdr.crdt_raw_value configuration option determines whether queries return the current value or the full internal state of the CRDT type. By default, only the current numeric value is returned. When set to true, queries return representation of the full state. You can use the special hash operator (#) to request only the current numeric value without using the special operator (the default behavior). If the full state is dumped using bdr.crdt_raw_value = on, then the value can reload only with bdr.crdt_raw_value = on.

Note

The bdr.crdt_raw_value applies formatting only of data returned to clients, that is, simple column references in the select list. Any column references in other parts of the query (such as WHERE clause or even expressions in the select list) might still require use of the # operator.

Different types of CRDTs

The crdt_gcounter type is an example of state-based CRDT types that work only with reflexive UPDATE SQL, such as x = x + 1, as the example shows.

Another class of CRDTs are delta CRDT types. These are a special subclass of operation-based CRDT.

With delta CRDTs, any update to a value is compared to the previous value on the same node. Then a change is applied as a delta on all other nodes.

CREATE TABLE crdt_delta_example (
    id       integer            PRIMARY KEY,
    counter  bdr.crdt_delta_counter NOT NULL DEFAULT 0
);

ALTER TABLE crdt_delta_example REPLICA IDENTITY FULL;

SELECT bdr.alter_table_conflict_detection('crdt_delta_example',
			'column_modify_timestamp', 'cts');

INSERT INTO crdt_delta_example (id) VALUES (1);

Suppose you issue the following SQL on two nodes at same time:

UPDATE crdt_delta_example
   SET counter = 2          -- notice NOT counter = counter + 2
 WHERE id = 1;

After both updates are applied, you can see the resulting values using this query:

SELECT id, counter FROM crdt_delta_example WHERE id = 1;
   id | counter
 -----+---------
    1 |       4
(1 row)

With a regular integer column, the result is 2. But when you update the row with a delta CRDT counter, you start with the OLD row version, make a NEW row version, and send both to the remote node. There, compare them with the version found there (e.g., the LOCAL version). Standard CRDTs merge the NEW and the LOCAL version, while delta CRDTs compare the OLD and NEW versions and apply the delta to the LOCAL version.

Query planning and optimization

An important question is how query planning and optimization works with these new data types. CRDT types are handled transparently. Both ANALYZE and the optimizer work, so estimation and query planning works fine without having to do anything else.