Data sourced by Data Connectors can be locally materialized and accelerated using a Data Accelerator.
A Data Accelerator queries data from a connected data source, then stores and updates it locally in an embedded acceleration engine, such as DuckDB or SQLite. To configure data refresh behavior, such as refreshing data on an interval, see Data Refresh.
Dataset acceleration is enabled by setting the acceleration configuration on a dataset.
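As a minimal sketch, acceleration might be enabled for a single dataset as shown here. The `from` source and dataset `name` are hypothetical placeholders, and the fragment assumes the `acceleration` block sits under a dataset entry in the Spicepod; check the datasets reference for the authoritative field list.

```yaml
datasets:
  # Hypothetical source and name, used only for illustration
  - from: s3://example-bucket/orders/
    name: orders
    acceleration:
      # Turns on local materialization with the default settings
      enabled: true
```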
For the complete reference specification, see datasets.
By default, datasets will be locally materialized using in-memory Arrow records.
Alternatively, the DuckDB, SQLite, or PostgreSQL engines can be used to materialize data in memory, on disk, or in an attached database.
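Selecting a non-default engine could look like the following sketch. It assumes the engine is chosen with an `engine` field whose values match the accelerator names listed on this page; the source and dataset name are hypothetical.

```yaml
datasets:
  # Hypothetical source and name, used only for illustration
  - from: s3://example-bucket/events/
    name: events
    acceleration:
      enabled: true
      # Assumed field: selects the engine instead of the default Arrow records
      engine: duckdb
```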
Supported Data Accelerators include:
| Name | Description | Status | Engine Modes |
|---|---|---|---|
| arrow | In-Memory Arrow Records | Stable | memory |
| cayenne | Cayenne | Alpha (v1.9.0-rc.1+) | file, file_create, file_update |
| duckdb | Embedded DuckDB | Stable | memory, file |
| postgres | Attached PostgreSQL | Release Candidate | N/A |
| sqlite | Embedded SQLite | Release Candidate | memory, file |
Select the appropriate accelerator based on dataset size, query patterns, and resource constraints:
| Use Case | Recommended Accelerator | Rationale |
|---|---|---|
| Small datasets (under 1 GB), maximum speed | arrow | In-memory storage provides lowest latency |
| Medium datasets (1-100 GB), complex SQL | duckdb | Mature SQL support with memory management |
| Large datasets (100 GB - 1+ TB), scalable analytics | cayenne | Vortex columnar format scales beyond single-file limits |
| Point lookups on large datasets | cayenne | Vortex provides 100x faster random access vs Parquet |
| Simple queries, low resource usage | sqlite | Lightweight, minimal overhead |
| External database integration | postgres | Leverage existing PostgreSQL infrastructure |
Both Spice Cayenne and DuckDB support file-based acceleration, but differ in architecture and performance characteristics:
Choose Spice Cayenne when:
Choose DuckDB when:
Data Accelerators may not support all Apache Arrow data types. For complete compatibility details, see specifications.
:::warning[Memory Considerations]
When accelerating a dataset using mode: memory (the default), some or all of the dataset is loaded into memory. Ensure sufficient memory is available, including overhead for queries and the runtime, especially with concurrent queries.
In-memory limitations can be mitigated by storing acceleration data on disk. The duckdb and sqlite accelerators support this when mode: file is specified.
:::
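To illustrate the on-disk option described in the warning, the sketch below configures file-mode acceleration. The `mode: file` setting comes from this page; the `engine` field name, source, and dataset name are assumptions for illustration.

```yaml
datasets:
  # Hypothetical source and name, used only for illustration
  - from: s3://example-bucket/transactions/
    name: transactions
    acceleration:
      enabled: true
      # Assumed field: an engine that lists "file" among its engine modes
      engine: sqlite
      # Store accelerated data on disk rather than in memory
      mode: file
```

File mode trades some query latency for a smaller memory footprint, which can help when several large datasets are accelerated at once.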
import DocCardList from '@theme/DocCardList';