---
title: 'Read/Write Separation'
sidebar_label: 'Read/Write Separation'
description: 'Separate write/ingest workloads (cluster) from read workloads (application sidecars, agents) using shared snapshots and live query delegation.'
sidebar_position: 8
pagination_next: null
keywords:
  [
    spice.ai,
    deployment,
    architecture,
    read write separation,
    cluster sidecar,
    snapshots,
    bootstrap,
    ingest,
    cqrs,
    sidecar,
    agent,
  ]
tags:
---
Production data and AI applications typically have two very different workloads on the same data:
Running both on the same Spice instance forces a single hardware shape, refresh schedule, and failure domain on workloads that have nothing in common. Read/write separation splits them into two tiers: a centralized write/ingest cluster that owns refresh and acceleration, and one or more lightweight read instances (typically sidecars next to the application) that serve queries from a local materialized copy.
The two tiers communicate through two channels:
Most production deployments use both: snapshots for the steady-state working set, and live delegation for the long tail.
Use this pattern when:
It is overkill when one Spice instance is sufficient (start with Sidecar) or when the workload is purely batch/analytical with relaxed latency (use Microservice).
The cluster owns every refresh, acceleration, and search index for the datasets in scope. It runs as a standalone Spice deployment — typically a Kubernetes Deployment or StatefulSet, or a managed Spice Cloud app — and holds the only credentials to the source systems.
Cluster Spicepod responsibilities:
Snapshots are partitioned by date and dataset (`month=YYYY-MM/day=YYYY-MM-DD/dataset=<name>/...`), so retention can be handled with an ordinary object-store lifecycle rule. See Snapshots for the full configuration reference.
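As an illustration of the partition layout described above, the following minimal sketch builds and parses a prefix of that shape. The function names are hypothetical helpers for this example and are not part of Spice; only the `month=/day=/dataset=` layout comes from the text.

```python
from datetime import date


def snapshot_partition_prefix(day: date, dataset: str) -> str:
    """Build a Hive-style prefix: month=YYYY-MM/day=YYYY-MM-DD/dataset=<name>/."""
    month_part = day.strftime("%Y-%m")
    day_part = day.strftime("%Y-%m-%d")
    return f"month={month_part}/day={day_part}/dataset={dataset}/"


def parse_partition_prefix(prefix: str) -> dict[str, str]:
    """Split a prefix back into its key=value partition components."""
    parts = [p for p in prefix.strip("/").split("/") if "=" in p]
    return dict(p.split("=", 1) for p in parts)


if __name__ == "__main__":
    prefix = snapshot_partition_prefix(date(2025, 3, 7), "orders")
    print(prefix)
    print(parse_partition_prefix(prefix))
```

Because the date components lead the path, everything written on a given day shares one prefix, which is what makes prefix-based expiry rules practical.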
Read instances run alongside applications, typically as Kubernetes pod sidecars, but the same configuration works in Cloud Run, on bare metal, or on a developer laptop. They never connect to source systems; their only external dependencies are the snapshot bucket and (optionally) the cluster's Arrow Flight endpoint.
Read Spicepod responsibilities:
bootstrap_only mode polls for new snapshots without writing them).

`snapshots: bootstrap_only` is the key setting: read instances read snapshots but never write them, so multiple replicas don't race to upload. Combine with a periodic refresh trigger to pick up new snapshots without re-querying the source.
Snapshots cover the working set. For queries that span beyond it — historical analytics, cross-dataset joins, distributed search — read instances delegate to the cluster using a spiceai connector entry pointing at the cluster's Arrow Flight endpoint.
The application sees a single SQL surface — accelerated tables and delegated tables compose normally in joins and CTEs. See Cluster-Sidecar Architecture for the conceptual model.
When a read instance starts:
duckdb_file / sqlite_file).

`bootstrap_on_failure_behavior`:

- `warn` (default): boot empty and refresh from the source. Avoid in read-tier instances that should not have source access.
- `fallback`: try older snapshots until one loads.
- `retry`: keep retrying the newest snapshot.

For zero-source-credentials read instances, set `bootstrap_on_failure_behavior: fallback` or `retry` and ensure the dataset is never configured with usable source credentials.
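The three behaviors can be pictured with a small sketch. This illustrates the semantics described above and is not Spice's implementation: the `load` callback, the `max_retries` bound, and the choice to return `None` when nothing loads are all assumptions made for the example.

```python
from typing import Callable, Optional, Sequence


def bootstrap(
    snapshots_newest_first: Sequence[str],
    load: Callable[[str], bool],  # hypothetical: True when a snapshot loads cleanly
    behavior: str = "warn",
    max_retries: int = 3,         # illustrative bound so the sketch terminates
) -> Optional[str]:
    """Return the snapshot that loaded, or None if the instance boots empty."""
    if behavior not in ("warn", "fallback", "retry"):
        raise ValueError(f"unknown behavior: {behavior}")
    if not snapshots_newest_first:
        return None

    newest = snapshots_newest_first[0]
    if load(newest):
        return newest

    if behavior == "fallback":
        # Walk back through older snapshots until one loads.
        for older in snapshots_newest_first[1:]:
            if load(older):
                return older
    elif behavior == "retry":
        # Keep trying the newest snapshot only.
        for _ in range(max_retries):
            if load(newest):
                return newest

    # "warn", or every attempt failed: boot empty. With "warn" the dataset
    # would then refresh from the source, which a read tier cannot do.
    return None
```

For example, with snapshots `["s3", "s2", "s1"]` where only `s2` is loadable, `fallback` returns `"s2"` while `warn` returns `None`, which is why `warn` is a poor fit for instances without source access.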
Steady-state refresh on read instances is configured per dataset:
When a newer snapshot is available, the dataset hot-swaps without restarting the pod.
Snapshots are written to Hive-partitioned paths so retention is straightforward:
Apply an object-store lifecycle rule (S3 Lifecycle, GCS Object Lifecycle Management, Azure Blob Storage lifecycle management) to expire old partitions. Most deployments keep 24–72 hours of refresh-triggered snapshots and a daily archive beyond that.
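As one possible way to express such a rule on S3, the sketch below builds a lifecycle configuration that expires objects under a snapshot prefix. The prefix and rule ID are hypothetical; the dictionary follows the shape S3 expects for lifecycle configurations. GCS and Azure use different rule formats that are not shown here.

```python
def snapshot_lifecycle_config(prefix: str, retain_days: int) -> dict:
    """Build an S3 lifecycle configuration expiring objects under `prefix`."""
    if retain_days < 1:
        # S3 expresses expiration in whole days, so 72 hours becomes 3 days.
        raise ValueError("retain_days must be a positive whole number of days")
    return {
        "Rules": [
            {
                "ID": f"expire-snapshots-after-{retain_days}d",  # hypothetical ID
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": retain_days},
            }
        ]
    }


if __name__ == "__main__":
    # "snapshots/" is a placeholder prefix for this example.
    print(snapshot_lifecycle_config("snapshots/", retain_days=3))
```

The resulting dictionary can be passed as `LifecycleConfiguration` to boto3's `put_bucket_lifecycle_configuration`. A daily archive kept beyond the window would need a separate prefix or rule, which this sketch leaves out.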
The snapshot bucket is the only shared dependency between the tiers, so keep it in the same region as the read instances and apply VPC endpoints / Private Google Access to keep traffic on the private network.
The cluster and the read instances share dataset names but not full Spicepods. Two patterns work well:
- Keep `datasets:` definitions in a shared file and merge the cluster-only and read-only fields at deploy time (Helm value overlays, Kustomize, Jsonnet).
- `snapshots: enabled` (cluster) vs `snapshots: bootstrap_only` (reads) per role.

Whichever approach is chosen, keep schema changes backward-compatible by default: read instances may be running snapshots from a previous cluster version during a rollout.
The reference topology runs the cluster as a StatefulSet (or SpicepodSet on Spice.ai Enterprise) and the read instances as sidecars in application pods. Both use the same Spice Helm chart.
Read instances are deployed as a sidecar container in application pods, configured via a ConfigMap that holds the read-tier Spicepod. The application points at 127.0.0.1:8090 (HTTP) or 127.0.0.1:50051 (Arrow Flight) — no service discovery needed.
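From the application's side, talking to the sidecar is a plain loopback call. The sketch below sends SQL over HTTP using only the standard library. The `/v1/sql` path and the plain-text request body are assumptions about the sidecar's HTTP API and should be checked against the API reference; the `orders` table is a placeholder.

```python
import urllib.request

SIDECAR_HTTP = "http://127.0.0.1:8090"  # loopback address from the text above


def build_sql_request(sql: str, base_url: str = SIDECAR_HTTP) -> urllib.request.Request:
    """Build a POST request carrying the SQL text as the request body."""
    return urllib.request.Request(
        url=f"{base_url}/v1/sql",  # assumed endpoint path
        data=sql.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def run_sql(sql: str, timeout: float = 5.0) -> bytes:
    """Send the query to the local sidecar and return the raw response body."""
    with urllib.request.urlopen(build_sql_request(sql), timeout=timeout) as resp:
        return resp.read()


if __name__ == "__main__":
    # Requires a sidecar listening on 127.0.0.1:8090.
    print(run_sql("SELECT count(*) FROM orders"))
```

Because the address is fixed to loopback, the same code runs unchanged in every pod, which is the practical benefit of skipping service discovery.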
The read sidecar's ServiceAccount only needs read access to the snapshot bucket. It should not be granted source-system credentials — that's what makes the read tier safe to scale to many replicas.
For production, the Spice.ai Enterprise Kubernetes Operator manages both tiers as custom resources:
- `SpicepodSet`: per-replica StatefulSets for the cluster, with automatic PVC resizing, configurable update strategies, and crashloop protection.
- `SpicepodCluster`: distributed scheduler/executor tiers when the cluster itself is large enough to need its own internal split.

Rough first-pass sizing rules:
| Tier | Typical shape |
|---|---|
| Cluster (writer) | 3+ replicas. Memory sized for the largest accelerated dataset. Network bandwidth for source ingest. |
| Read instance | 1 replica per application pod. 0.5–2 vCPU, 512Mi–4Gi memory, 10–50Gi local SSD. |
| Snapshot bucket | Standard tier, same region. Lifecycle rule sized to refresh frequency × number of datasets × 24–72h. |
Read-tier memory is dominated by the working set of the file-mode acceleration engine. DuckDB compaction (snapshots_compaction: enabled) typically reduces snapshot size by 30–60%.
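The bucket sizing rule from the table can be turned into a rough back-of-the-envelope calculation. This sketch assumes every refresh writes one full snapshot per dataset and that all snapshots have a similar size; both are simplifications, and the input figures are made up for illustration.

```python
def snapshot_bucket_estimate(
    refreshes_per_hour: float,
    datasets: int,
    retention_hours: float,
    avg_snapshot_gib: float,
    compaction_reduction: float = 0.0,  # e.g. 0.3 to 0.6 with compaction enabled
) -> tuple[float, float]:
    """Return (retained snapshot count, approximate retained GiB)."""
    if not 0.0 <= compaction_reduction < 1.0:
        raise ValueError("compaction_reduction must be in [0, 1)")
    count = refreshes_per_hour * datasets * retention_hours
    size_gib = count * avg_snapshot_gib * (1.0 - compaction_reduction)
    return count, size_gib


if __name__ == "__main__":
    # Illustrative inputs: a refresh every 15 minutes, 5 datasets, 48 h retention,
    # 2 GiB per uncompacted snapshot, 50% reduction from compaction.
    count, size_gib = snapshot_bucket_estimate(4, 5, 48, 2.0, 0.5)
    print(f"{count} snapshots, about {size_gib:.0f} GiB retained")
```

With these inputs the estimate is 960 retained snapshots and roughly 960 GiB, which shows how quickly a short refresh interval multiplies storage even with compaction.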
The split simplifies the credential surface area:
Compromising a read instance gives an attacker read access to the snapshot bucket and the delegated query surface, but never the source systems.
Both tiers expose the same metrics and tracing endpoints. Practical splits:
A high delegation rate is a signal to expand the materialized working set. A growing snapshot age is a signal that the cluster is falling behind on refresh.
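These two signals can be computed from counters that most monitoring setups already hold. The sketch below is illustrative only: the inputs are hypothetical aggregates rather than real Spice metric names, and the thresholds are placeholders to tune per deployment. It also checks snapshot age against a fixed limit, whereas an alert on a growing trend would compare successive readings.

```python
from datetime import datetime, timedelta, timezone


def delegation_rate(local_queries: int, delegated_queries: int) -> float:
    """Fraction of queries that were delegated to the cluster."""
    total = local_queries + delegated_queries
    return delegated_queries / total if total else 0.0


def tier_signals(
    local_queries: int,
    delegated_queries: int,
    latest_snapshot_at: datetime,
    now: datetime,
    max_delegation_rate: float = 0.2,                  # hypothetical threshold
    max_snapshot_age: timedelta = timedelta(hours=1),  # hypothetical threshold
) -> list[str]:
    """Return human-readable signals for the two conditions described above."""
    signals = []
    if delegation_rate(local_queries, delegated_queries) > max_delegation_rate:
        signals.append("expand materialized working set")
    if now - latest_snapshot_at > max_snapshot_age:
        signals.append("cluster falling behind on refresh")
    return signals


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(tier_signals(500, 500, now - timedelta(hours=3), now))
```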