Consume Debezium change events from a Kafka topic and apply them to a Spice-accelerated dataset.
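Use Debezium when the source database has no dedicated CDC connector (for example, the MySQL, SQL Server, and Oracle sources in the diagram below).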
For sources with a native CDC path, prefer the dedicated connector — PostgreSQL Logical Replication, DynamoDB Streams, or MongoDB Change Streams — to avoid the extra Kafka + Debezium hop.
    ┌────────────────┐  Debezium connector  ┌───────────┐    Spice consumes    ┌───────────────────┐    ChangeBatch    ┌───────────────┐
    │ Source DB      │ ────────────────────▶│ Kafka     │ ────────────────────▶│ Spice runtime     │──────────────────▶│ Accelerator   │
    │ (MySQL,        │  WAL → JSON events   │ topic     │  one consumer group  │ (debezium         │     (INSERT/      │ DuckDB /      │
    │ SQL Server,    │                      │           │  per Spice replica   │ connector)        │     UPDATE /      │ SQLite /      │
    │ Oracle, …)     │                      │           │                      │                   │     DELETE)       │ Postgres      │
    └────────────────┘                      └───────────┘                      └───────────────────┘                   └───────────────┘
On startup, Spice subscribes to the configured Debezium-managed Kafka topic using either a uniquely generated consumer group or one specified via `kafka_consumer_group_id`. With a persistent acceleration engine (`mode: file`), data is fetched starting from the last committed offset, so restarts resume without reprocessing historical events.
Spice needs the Kafka broker addresses (`bootstrap.servers`) and an acceleration engine of `duckdb`, `sqlite`, or `postgres`.

The `from` field takes the form `debezium:<kafka_topic>`. The topic must contain Debezium-formatted change events for a single source table.
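To make "Debezium-formatted change events" concrete, the sketch below applies a few JSON change events to an in-memory table. It assumes the standard Debezium envelope (`op`, `before`, `after`, optionally nested under `payload`) and a primary key column named `id`; the function name and sample rows are illustrative, and this is not how Spice itself is implemented.

```python
import json


def apply_change_event(table, raw_event, key="id"):
    """Apply one Debezium JSON change event to `table`, a dict keyed by primary key."""
    event = json.loads(raw_event)
    # When schemas are enabled, Debezium nests the change under "payload".
    payload = event.get("payload", event)
    op = payload["op"]
    if op in ("c", "r", "u"):
        # create, snapshot read, update: upsert the "after" row image
        row = payload["after"]
        table[row[key]] = row
    elif op == "d":
        # delete: remove the row identified by the "before" image
        table.pop(payload["before"][key], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")
    return table


# Illustrative events for a single source table (sample data, not from a real topic).
events = [
    json.dumps({"payload": {"op": "c", "before": None,
                            "after": {"id": 1, "city": "Oslo"}}}),
    json.dumps({"payload": {"op": "u", "before": {"id": 1, "city": "Oslo"},
                            "after": {"id": 1, "city": "Bergen"}}}),
    json.dumps({"payload": {"op": "c", "before": None,
                            "after": {"id": 2, "city": "Turku"}}}),
    json.dumps({"payload": {"op": "d", "before": {"id": 2, "city": "Turku"},
                            "after": None}}),
]

table = {}
for raw in events:
    apply_change_event(table, raw)
print(table)  # {1: {'id': 1, 'city': 'Bergen'}}
```

Each event maps to an upsert or a delete keyed by the primary key, which is why the topic should carry events for only one source table.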
For Kafka clusters with SASL/SSL enabled, set the matching `kafka_*` security parameters. The full set of `kafka_*` parameters is documented in the Debezium connector reference.
The connector manages Kafka consumer groups so offsets persist across restarts:
- By default, Spice generates a unique consumer group ID.
- Set `kafka_consumer_group_id` to use your own group ID. The same ID must be used on every restart; if Spice detects a mismatch against the stored ID, it returns an error to prevent data inconsistency.

To recover from a deliberate consumer-group change, reset the acceleration data so Spice starts fresh.
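The consumer-group rule can be read as a small decision procedure. The sketch below is a hypothetical model of that rule, not Spice's code: `resolve_consumer_group`, `ConsumerGroupMismatch`, and the `spice-` prefix on generated IDs are all names made up for illustration.

```python
import uuid


class ConsumerGroupMismatch(Exception):
    """Raised when the configured group ID differs from the stored one."""


def resolve_consumer_group(stored_id, configured_id=None):
    """Pick the consumer group ID to use on startup.

    stored_id: ID persisted alongside the acceleration data, or None on first start.
    configured_id: value of kafka_consumer_group_id, or None if unset.
    """
    if stored_id is None:
        # First start: use the configured ID, or generate a unique one
        # (the "spice-" prefix is an assumption for this sketch).
        return configured_id or f"spice-{uuid.uuid4()}"
    if configured_id is not None and configured_id != stored_id:
        # Offsets belong to the stored group; switching silently could
        # skip or replay events, so fail instead.
        raise ConsumerGroupMismatch(
            f"configured {configured_id!r} does not match stored {stored_id!r}"
        )
    return stored_id


print(resolve_consumer_group(None, "orders-cdc"))          # orders-cdc
print(resolve_consumer_group("orders-cdc", "orders-cdc"))  # orders-cdc
```

In this model, resetting the acceleration data corresponds to clearing `stored_id`, after which a new group ID is accepted.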
See the full description in the Debezium connector reference.
Debezium emits change events whose schema may evolve as the upstream table is altered. Set `schema_evolution: true` to have Spice peek at the latest Kafka message on reload and detect schema changes.
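As a rough illustration of what detecting a schema change from the latest message can involve, the sketch below reads the column list from the `schema` block that Debezium's JSON converter includes when schemas are enabled, then diffs it against the previously known columns. The helper names and the sample message are assumptions; how Spice performs the comparison internally is not described here.

```python
def after_columns(event):
    """Map column name -> type from the 'after' struct of a Debezium JSON schema block."""
    for field in event["schema"]["fields"]:
        if field.get("field") == "after":
            return {col["field"]: col["type"] for col in field["fields"]}
    return {}


def diff_schema(known, latest):
    """Return (added, removed, retyped) between two column -> type mappings."""
    added = {c: t for c, t in latest.items() if c not in known}
    removed = sorted(c for c in known if c not in latest)
    retyped = {c: (known[c], latest[c])
               for c in known if c in latest and known[c] != latest[c]}
    return added, removed, retyped


# Illustrative "latest message" after a column named zip was added upstream.
latest_event = {
    "schema": {"type": "struct", "fields": [
        {"field": "after", "type": "struct", "optional": True, "fields": [
            {"field": "id", "type": "int32", "optional": False},
            {"field": "city", "type": "string", "optional": True},
            {"field": "zip", "type": "string", "optional": True},
        ]},
        {"field": "op", "type": "string", "optional": False},
    ]},
    "payload": {"op": "c", "before": None,
                "after": {"id": 1, "city": "Oslo", "zip": "0150"}},
}

known = {"id": "int32", "city": "string"}
added, removed, retyped = diff_schema(known, after_columns(latest_event))
print(added, removed, retyped)  # {'zip': 'string'} [] {}
```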
Two parameters control how many events Spice groups into a single CDC batch before applying it to the accelerator:
| Parameter | Default | Description |
|---|---|---|
| `batch_max_size` | `10000` | Max number of change events to batch together before processing. |
| `batch_max_duration` | `1s` | Max time to wait for a batch to fill before processing. |
Larger batches improve throughput at the cost of higher per-batch latency.
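The interaction between the two limits can be sketched as a buffer that flushes on whichever limit is hit first. This is a minimal model under stated assumptions, not the connector's implementation: `ChangeBatcher` and its methods are hypothetical names, and the clock is injectable so the example runs deterministically.

```python
import time


class ChangeBatcher:
    """Group events into batches bounded by a size limit and a time limit."""

    def __init__(self, max_size=10000, max_duration=1.0, clock=time.monotonic):
        self.max_size = max_size          # analogous to batch_max_size
        self.max_duration = max_duration  # analogous to batch_max_duration, in seconds
        self.clock = clock
        self.events = []
        self.started_at = None

    def add(self, event):
        """Buffer an event; return a batch if the size limit is reached, else None."""
        if not self.events:
            self.started_at = self.clock()  # timer starts with the first event
        self.events.append(event)
        if len(self.events) >= self.max_size:
            return self.flush()
        return None

    def poll(self):
        """Return the pending batch if it has waited max_duration, else None."""
        if self.events and self.clock() - self.started_at >= self.max_duration:
            return self.flush()
        return None

    def flush(self):
        batch, self.events = self.events, []
        self.started_at = None
        return batch


now = [0.0]  # fake clock so the example does not sleep
batcher = ChangeBatcher(max_size=3, max_duration=1.0, clock=lambda: now[0])
size_batch = [batcher.add(e) for e in ("a", "b", "c")][-1]
batcher.add("d")
early = batcher.poll()        # too soon: nothing is flushed
now[0] = 1.5
timed_batch = batcher.poll()  # duration elapsed: partial batch is flushed
print(size_batch, early, timed_batch)  # ['a', 'b', 'c'] None ['d']
```

A busy topic fills batches by size, while a quiet topic is bounded by the duration limit, so an event waits at most roughly `batch_max_duration` before being applied.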
The connector exposes the following component metrics:
| Metric Name | Type | Description |
|---|---|---|
| `bytes_consumed_total` | Counter | Total number of bytes consumed from the Kafka topic |
| `records_consumed_total` | Counter | Total number of records (messages) consumed from Kafka topics |
| `records_lag` | Gauge | Total consumer lag across all topic partitions (number of messages not yet consumed) |
These metrics are opt-in; see the Debezium connector reference for an example `metrics:` block.
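For readers interpreting `records_lag`, the usual definition of consumer lag is the per-partition gap between the newest offset and the consumer's position, summed over partitions. The sketch below shows that arithmetic with made-up offsets; the function name is hypothetical, and treating a partition with no recorded position as starting from offset 0 is an assumption of this example.

```python
def total_records_lag(high_watermarks, positions):
    """Sum per-partition lag: newest offset minus the consumer's next offset to read."""
    return sum(
        max(high_watermarks[p] - positions.get(p, 0), 0)  # never report negative lag
        for p in high_watermarks
    )


high = {0: 120, 1: 80, 2: 45}  # partition -> high watermark (illustrative)
pos = {0: 100, 1: 80, 2: 40}   # partition -> consumer position (illustrative)
total_lag = total_records_lag(high, pos)
print(total_lag)  # 25  (20 + 0 + 5)
```

A lag that keeps growing suggests events arrive faster than batches are applied, which is a cue to revisit the batch settings.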
- `kafka` is supported as the Debezium transport.
- `json` is supported as the message format.

`refresh_mode: changes`: see the refresh-mode reference.