Change Data Capture (CDC) captures insert, update, and delete events from a database's transaction log and delivers them to consumers with low latency. This technique enables Spice to keep locally accelerated datasets synchronized with the source data in near real-time. CDC is efficient because it transfers only changed rows instead of re-fetching the entire dataset.
Combining local acceleration with CDC lets Spice serve high-performance accelerated queries while applying updates efficiently in real time.
Consider a fraud detection application that must decide whether a pending transaction is likely fraudulent. The application queries a Spice-accelerated table of recent transactions, updated in real time, to check whether the pending transaction resembles known fraudulent ones. Because CDC keeps the table up to date, the application can identify potential fraud quickly.
When configuring datasets to be accelerated with CDC, ensure that the data connector supports CDC and can return a stream of row-level changes. See the Supported Data Connectors section for more information.
The startup time for CDC-accelerated datasets may be longer than for non-CDC-accelerated datasets due to the initial synchronization.
:::tip
Use CDC-accelerated datasets with a persistent data accelerator configuration (i.e., file mode for DuckDB/SQLite, or PostgreSQL). When Spice restarts, it can then resume from the last known state of the dataset instead of re-fetching the entire dataset.
:::
To enable CDC, set `refresh_mode: changes` in the dataset's acceleration settings. This requires a data connector that can provide a stream of row-level changes.
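As a minimal sketch, the acceleration settings of a CDC-enabled dataset might look like the fragment below. It combines `refresh_mode: changes` with the file-mode persistence recommended in the tip above. The choice of DuckDB is only an assumption for illustration; the fragment belongs under a dataset entry whose connector supports CDC.

```yaml
# Fragment of a dataset definition in a spicepod (illustrative sketch).
acceleration:
  enabled: true
  engine: duckdb          # assumed engine; SQLite in file mode or PostgreSQL are the other persistent options named above
  mode: file              # persist accelerated data on disk so a restart can resume from the last known state
  refresh_mode: changes   # apply row-level change events instead of re-fetching the whole dataset
```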
Spice currently supports streaming ingestion via:
- PostgreSQL logical replication (`wal_level=logical` + `pgoutput`): streams `INSERT`/`UPDATE`/`DELETE` events into the accelerator. No Kafka, no Debezium, no external services.
- … `INSERT`/`UPDATE`/`DELETE` events to the accelerator.
- … `refresh_mode: append` for real-time, append-only acceleration (no separate CDC connector required).

See an example of configuring a dataset to use CDC with Debezium by following the recipe at Streaming changes in real-time with Debezium CDC.
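For orientation, a Debezium-backed dataset definition might look roughly like the sketch below. The topic name, dataset name, and broker address are hypothetical placeholders. The parameter names are given as recalled from the Spice Debezium connector and should be checked against the connector reference and the recipe before use.

```yaml
# Illustrative sketch only; every concrete value here is a placeholder.
datasets:
  - from: debezium:cdc.public.customer_addresses   # hypothetical Kafka topic carrying Debezium change events
    name: customer_addresses_cdc                   # hypothetical dataset name
    params:
      debezium_transport: kafka                    # assumed parameter name: transport used to receive events
      debezium_message_format: json                # assumed parameter name: encoding of the change events
      kafka_bootstrap_servers: localhost:19092     # hypothetical broker address
    acceleration:
      enabled: true
      engine: duckdb                               # assumed engine for illustration
      mode: file                                   # persistent mode, so restarts resume from the last known state
      refresh_mode: changes                        # consume the stream of row-level changes
```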