Stream every INSERT, UPDATE, and DELETE from a PostgreSQL table directly into a Spice-accelerated dataset over Postgres' native logical replication protocol.
This is the recommended way to keep a Spice accelerator (DuckDB, SQLite, PostgreSQL, Cayenne, Arrow) continuously in sync with a PostgreSQL source.
┌────────────────┐ WAL (pgoutput) ┌───────────────────┐ ChangeBatch ┌───────────────┐
│ PostgreSQL │──────────────────▶│ Spice runtime │────────────────▶│ Accelerator │
│ wal_level= │ replication │ (postgres │ (INSERT/ │ DuckDB / │
│ logical │ slot │ connector) │ UPDATE / │ SQLite / │
│ │ │ │ DELETE) │ Postgres / │
└────────────────┘ └───────────────────┘ │ Cayenne │
└───────────────┘
On first start the connector:
spice_<dataset>_<hash>_pub) containing the source table.spice_<dataset>_<dataset-hash>_<instance-hash>). The <instance-hash> gives each Spice replica its own slot.ChangeBatch (grouped INSERT/UPDATE/DELETE) and applied to the accelerator.On subsequent restarts the connector detects the existing slot and resumes from Postgres' stored confirmed_flush_lsn.
This requires a server restart.
Verify:
On managed Postgres services:
| Service | How to enable |
|---|---|
| AWS RDS | Set rds.logical_replication = 1 in the parameter group and restart. |
| Aurora PostgreSQL | Set rds.logical_replication = 1; wait for DB reboot. |
| GCP Cloud SQL | Flag: cloudsql.logical_decoding = on. |
| Azure Database | Under Replication, set Replication support to LOGICAL. |
| Supabase / Neon | Logical replication is enabled by default. |
Spice needs the primary key columns in every UPDATE/DELETE event, so one of the following must be true:
The table has a primary key (default — nothing to do).
Or the table has REPLICA IDENTITY FULL:
Tables with REPLICA IDENTITY NOTHING are rejected at startup.
Start the runtime. Spice will:
spice_users_<dataset-hash>_pub.spice_users_<dataset-hash>_<instance-hash>.public.users into the DuckDB accelerator.:::tip Use a persistent accelerator
Pair with mode: file on DuckDB/SQLite (or the PostgreSQL accelerator) so restarts resume from the last acknowledged LSN instead of re-snapshotting.
:::
All replication-specific parameters live under params: on the dataset and start with pg_:
| Parameter | Default | Description |
|---|---|---|
pg_replication_slot | spice_<dataset>_<dataset-hash>_<instance-hash> | Name of the replication slot. Must be unique per replica. |
pg_publication | spice_<dataset>_<dataset-hash>_pub | Publication name. Shared across replicas. Auto-created if missing. |
pg_replication_initial_snapshot | true | If true, take an initial snapshot of the table's existing rows before streaming. Set to false if you are pre-seeding the accelerator yourself. |
pg_replication_temporary_slot | false | If true, the slot is dropped when Spice disconnects. Every restart re-bootstraps. |
pg_replication_status_interval | 10s | How often StandbyStatusUpdate (LSN acknowledgement) is sent back to Postgres. Lower values free WAL faster; higher values reduce network chatter. Accepts any duration string (, , ). |
All existing pg_host, pg_port, pg_user, pg_pass, pg_db, pg_sslmode, pg_connection_string parameters continue to apply — see the PostgreSQL Data Connector reference.
pg_sslmode for WAL streamingverify-full is the recommended production default.
pg_sslmode | Replication transport | Cert chain verified | Hostname verified |
|---|---|---|---|
disable | plaintext | — | — |
prefer (default) | plaintext | — | — |
require | TLS | ❌ | ❌ |
verify-ca | TLS | ✅ | ❌ |
verify-full | TLS | ✅ | ✅ |
:::info
prefer behaves as plaintext on the replication transport because the replication client does not expose a safe "try TLS, fall back to plaintext" path. Set require, verify-ca, or verify-full to force TLS on the WAL stream.
:::
| Engine | INSERT | UPDATE | DELETE | Notes |
|---|---|---|---|---|
duckdb | ✅ | ✅ (upsert) | ✅ | Recommended for most workloads. |
sqlite | ✅ | ✅ (upsert) | ✅ | Great for small/medium datasets. |
postgres | ✅ | ✅ (upsert) | ✅ | Use when the accelerator is another Postgres. |
cayenne | ✅ | ✅ (upsert) | ✅ | S3-backed Vortex format, good for read-heavy analytics. |
arrow | ✅ | ✅ (upsert with primary key) | ✅ | Arrow's in-memory engine uses a hash index for primary-key upserts. Without a primary key, UPDATEs are appended as new rows. DELETE and TRUNCATE are applied via Arrow's DeletionTableProvider. |
For Arrow workloads that need true upsert semantics (so UPDATEs replace existing rows instead of duplicating them), configure a primary_key. DuckDB, SQLite, PostgreSQL, and Cayenne also support upsert behavior.
Every Spice replica must have its own replication slot. Spice hashes the replica's identity into the default slot name:
| Source | Used for |
|---|---|
SPICE_INSTANCE_ID env | Preferred — set it explicitly per replica. |
HOSTNAME / COMPUTERNAME env | Fallback — works on Kubernetes where each pod has a distinct hostname. |
Each Spice replica can use a different pg_replication_slot while sharing a publication (pg_publication).
:::danger Drop unused slots A permanent replication slot holds on to WAL until dropped. If you retire a Spice replica without cleaning up its slot, Postgres will keep accumulating WAL indefinitely and can run out of disk. :::
After removing a Spice replica, drop its slot:
Delete the accelerator's local storage (DuckDB file, SQLite file, etc.) and drop the replication slot. On next start, Spice will create a fresh slot, snapshot the table, and resume streaming.
dataset_postgres_replication_reconnects_total to detect flaky networks.Spice emits OpenTelemetry metrics for every replicated Postgres dataset. Metric names follow the pattern dataset_postgres_replication_<metric> with a name=<dataset> attribute.
Core freshness signals (auto-registered):
| Metric | Type | Description |
|---|---|---|
dataset_postgres_replication_lag_ms | Gauge | now() − commit_time(latest ingested txn). Primary CDC freshness signal. |
dataset_postgres_replication_lag_bytes | Gauge | server_wal_end_lsn − confirmed_flush_lsn. Unacknowledged WAL held by Spice's slot. |
dataset_postgres_replication_transactions_total | Counter | Committed transactions applied. |
dataset_postgres_replication_inserts_total / updates_total / deletes_total | Counter | Row-level events from WAL. |
dataset_postgres_replication_reconnects_total | Counter | Number of times the stream reconnected after a transient failure. |
| Symptom | Cause and fix |
|---|---|
Error: Table public.X has REPLICA IDENTITY NOTHING | Run ALTER TABLE public.X REPLICA IDENTITY FULL; (or add a primary key). |
Error: replication slot "..." already exists on startup | Another Spice replica is using the same slot name. Set pg_replication_slot uniquely, or ensure SPICE_INSTANCE_ID differs. |
| Error mentioning permission denied for database during setup | The role needs CREATE on the database, or pre-create the publication/slot yourself. |
pg_replication_slots.active is true but the accelerator isn't updating | Check Spice logs for schema-mismatch errors. The replication task holds the slot even after failure — restart after fixing. |
wal on the source disk growing forever | An abandoned slot. Drop it with pg_drop_replication_slot. |
UPDATEs on Arrow-engine dataset don't replace rows | Configure a primary_key so Arrow can use its hash index for upserts, or switch to duckdb, , , or . |
on_conflict upserts when a primary_key is configured. Without a primary key, UPDATEs appear as additional inserts rather than replacing existing rows. DELETE and TRUNCATE are applied either way.| Aspect | Debezium + Kafka | Native WAL streaming (this feature) |
|---|---|---|
| External services | Kafka + Schema Registry + Debezium + Connect | None — Spice connects to Postgres directly |
| Deployment footprint | JVM stack + ZooKeeper/KRaft | Zero extra pods |
| Setup complexity | Multiple topics, connector configs, ACLs | One connector config |
| Operational model | Consumer groups, topic retention | One replication slot per replica |
| Schema registry required | Yes (Avro/Protobuf) | No — schema derived from Postgres catalog |
| Latency | Kafka-bound (~100 ms+) | Commit-driven, typically <100 ms |
For greenfield Postgres → Spice CDC, prefer native WAL streaming. If Kafka is already deployed for other reasons, the Debezium path continues to work.
500ms30s2mpg_replication_bootstrap_batch_size | 8192 | Rows per emitted batch during the initial-snapshot bootstrap. Larger batches reduce per-batch overhead at the cost of more memory per batch. Maximum: 1048576. |
sqlitepostgrescayenneHuge TEXT/JSONB columns show as NULL after UPDATE | Unchanged TOASTed columns are omitted by pgoutput. Run ALTER TABLE ... REPLICA IDENTITY FULL; if you need them in every event. |