Sequin delivers industry-leading performance for change data capture (CDC), sustaining 10k operations per second with 33.9ms 99th percentile latency.

This consistently beats Debezium, Fivetran, and Airbyte. In a head-to-head comparison, Sequin sustains 67% more throughput than Debezium at 14.7x lower 99th percentile latency.

We initially aimed to bench Sequin and Debezium at the same throughput. However, Debezium was unable to sustain 10k operations per second so we instead tested Sequin at 10k ops/sec and Debezium at 6k ops/sec. Sequin latency at 6k ops is less than or equal to Sequin latency at 10k ops.

Benchmark methodology

All of our benchmarks are open source and available on GitHub.

Our benchmarks are conducted in a production-like environment. Sequin and Debezium are compared head-to-head capturing changes from AWS RDS and delivering to AWS MSK Kafka.

Load is applied to a single Postgres table using workload_generator.py deployed to a dedicated EC2 instance.

Throughput and end-to-end latency are measured with a Kafka consumer deployed to a separate EC2 instance. The stats are calculated as:

  • Throughput: the number of records delivered to Kafka per second.
  • Latency: the time between a change occuring in Postgres (updated_at timestamp) and it’s delivery to AWS MSK Kafka (Kafka creation timestamp).

Workload

workload_generator.py applies a mixed workload of INSERT, UPDATE, and DELETE operations to the benchmark_records table.

The benchmark_records Postgres table has the following schema:

                                           Table "public.benchmark_records"
     Column      |            Type             | Collation | Nullable |                    Default
-----------------+-----------------------------+-----------+----------+-----------------------------------------------
 id              | integer                     |           | not null | nextval('benchmark_records_id_seq'::regclass)
 string_field    | text                        |           |          |
 numeric_field   | numeric                     |           |          |
 timestamp_field | timestamp with time zone    |           |          |
 json_field      | jsonb                       |           |          |
 inserted_at     | timestamp without time zone |           |          | now()
 updated_at      | timestamp without time zone |           |          | now()

Stats collection

Similarly, the cdc_stats.py script is deployed to a separate EC2 instance and reads from AWS MSK Kafka. Stats are bucketed and saved to a CSV file for analysis.

Infrastructure

Sequin, Debezium, and the rest of the infrastructure are deployed to AWS in the following configuration:

  • AWS RDS Postgres db.r6g.xlarge instance (4 vCPUs, 16GB RAM)
  • AWS MSK Kafka provisioned with 3 brokers
  • Sequin running via ECS on an m8g.xlarge instance (4 vCPUs, 16GB RAM)
  • Debezium deployed on MSK Connect with 8 MCUs (equivalent to 16 vCPUs and 32 GB of memory).

Results

Sequin

The primary benchmark test for Sequin is 10,000 operations per second for 60 minutes. This was the throughput value for which Sequin had stable latency figures, though we observed Sequin burst above 28k ops/sec during our testing.

At this throughput, Sequin achieved 22.3 ms average latency and 33.9 ms 99th percentile latency.

Debezium

We then tested Debezium at the same throughput but unfortunately observed latency grow unbounded as Debezium fell behind:

With 10k ops/sec applied to the benchmark_records table, Debezium is unable to keep up and latency grows as Debezium allows messages to accumulate in the Postgres replication slot.

Debezium was deployed with 8 MCUs which is equivalent to 16 vCPUs and 32 GB of memory. Sequin was provisioned with half of each of these resources. Additionally, Debezium was deployed with all configurations recommended in the AWS Debezium deployment guide. However, additional tuning was not performed by the Sequin team, so let us know if you have suggestions for improving Debezium’s performance.

Therefore we retested Debezium at 6k ops/sec. At this throughput, Debezium had incredibly stable latency figures of 258.4 ms average latency and 498.7 ms 99th percentile latency.

Comparison

Head-to-head, Sequin is 14.7x faster than Debezium while maintaing 67% more throughput.

The table below summarizes these results and also includes figures for Airbyte and Fivetran. Those two services offer batch syncs only, so their latency figures are only roughly comparable.

ToolAvg Latency99%ile Latency
Sequin22.3ms33.9ms
Debezium258.4ms (11.6x slower)498.7ms (14.7x slower)
Fivetran5+ minutes-
Airbyte1+ hours-

Next Steps

Ready to see Sequin’s performance for yourself?