---
title: 'Performance Tuning'
sidebar_label: 'Performance Tuning'
sidebar_position: 33
description: 'Comprehensive guide to optimizing query performance, acceleration, and resource utilization in Spice deployments.'
keywords:
---
This guide provides recommendations for optimizing Spice performance across data acceleration, query execution, caching, and resource allocation.
Choose the appropriate Data Accelerator based on dataset characteristics and query patterns.
| Scenario | Recommended Accelerator | Key Configuration |
|---|---|---|
| Small datasets (under 1 GB), low latency | arrow | Default in-memory |
| Medium datasets (1-100 GB), complex SQL | duckdb with mode: file | Set duckdb_memory_limit |
| Large datasets (100 GB - 1+ TB) | cayenne | Tune cache parameters |
| Write-heavy workloads | cayenne with zstd | Set cayenne_compression_strategy: zstd |
| Point lookups, large datasets | cayenne | Vortex provides 100x faster random access |
| Point lookups, small-medium datasets | arrow with hash index | Set hash_index: enabled (experimental, v1.11.0-rc.2+) |
| Point lookups with explicit indexes | duckdb or sqlite | Configure indexes |
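As a concrete example of the table above, a medium-sized dataset accelerated with DuckDB in file mode might look like the following Spicepod snippet (source path, dataset name, and memory limit are illustrative, not defaults):

```yaml
# Illustrative Spicepod dataset configuration
datasets:
  - from: s3://my-bucket/orders/   # hypothetical source path
    name: orders
    acceleration:
      enabled: true
      engine: duckdb
      mode: file                   # persist acceleration to disk
      params:
        duckdb_memory_limit: 4GB   # cap memory used by this DuckDB instance
```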
Spice Cayenne uses the Vortex columnar format for high-performance analytics on large datasets.
Vortex provides 100x faster random access compared to Apache Parquet through:
- Statistics (is_sorted, is_constant, min, max) for query optimization

For point lookups on large datasets, Spice Cayenne often matches or exceeds the performance of traditional B-tree indexes while consuming no additional memory for index structures.
Spice Cayenne maintains two in-memory caches that significantly impact query performance:
Footer Cache Sizing:
The footer cache stores file metadata. Size based on file count:
Segment Cache Sizing:
The segment cache stores decompressed data. Size based on working set:
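A sketch of how these caches might be sized in a Spicepod. The parameter names `cayenne_footer_cache_size` and `cayenne_segment_cache_size` are hypothetical placeholders — verify the exact keys against the Cayenne accelerator reference:

```yaml
datasets:
  - from: s3://my-bucket/events/   # hypothetical source path
    name: events
    acceleration:
      enabled: true
      engine: cayenne
      params:
        # Hypothetical parameter names -- check the Cayenne docs for exact keys
        cayenne_footer_cache_size: 256MiB  # sized for the number of data files
        cayenne_segment_cache_size: 8GiB   # sized to hold the hot working set
```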
| Strategy | Read Performance | Write Performance | Compression Ratio |
|---|---|---|---|
| btrblocks (default) | Fastest | Moderate | Higher |
| zstd | Moderate | Faster | High |
Choose btrblocks for read-heavy analytics workloads. Use zstd only when size on disk is the primary concern—setting zstd trades query performance for reduced storage size.
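For a write-heavy workload, the zstd strategy can be set via the acceleration params (source path and dataset name are illustrative):

```yaml
datasets:
  - from: s3://my-bucket/ingest/   # hypothetical write-heavy source
    name: ingest
    acceleration:
      enabled: true
      engine: cayenne
      params:
        cayenne_compression_strategy: zstd  # trades read speed for faster writes and smaller files
```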
DuckDB provides mature SQL support with sophisticated query optimization.
Guidelines:
- Set duckdb_memory_limit to control memory per instance
- Datasets sharing the same duckdb_file share a connection pool
- Increase connection_pool_size for high-concurrency workloads; each connection consumes memory
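These guidelines can be combined in a single dataset definition. A sketch (source, file path, and values are illustrative):

```yaml
datasets:
  - from: postgres:analytics.orders   # hypothetical source
    name: orders
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      params:
        duckdb_file: /data/analytics.db  # datasets sharing this file share one connection pool
        duckdb_memory_limit: 8GB         # per-instance memory cap
        connection_pool_size: 16         # raise for high concurrency; each connection uses memory
```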
DuckDB supports ART (Adaptive Radix Tree) indexes for faster point lookups:
Indexes consume memory and do not spill to disk. Creating an index requires the entire dataset to be loaded into memory. Monitor memory usage when adding indexes. For more details on ART index performance, see the ART paper.
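Indexes are declared on the acceleration in the Spicepod; a single-column index uses `enabled`, and a composite or unique index is expressed with a parenthesized column list (dataset and column names are illustrative):

```yaml
datasets:
  - from: s3://my-bucket/users/   # hypothetical source
    name: users
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      indexes:
        user_id: enabled                 # single-column ART index
        '(account_id, pool_id)': unique  # composite unique index
```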
DuckDB automatically creates zone-maps (min/max statistics) for each row group, enabling efficient predicate pushdown. In practice, zone-maps on sorted data often outperform ART indexes for range and equality queries while consuming no additional memory.
Why Zone-Maps Outperform Indexes:
Optimization Pattern: Sorted Views
Accelerate a view with ORDER BY to create sorted physical data, then set duckdb_preserve_insertion_order: true to maintain sort order:
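A sketch of this pattern — the view, dataset wiring, and file paths are illustrative, and the column names follow the zone-map example below:

```yaml
views:
  - name: events_sorted
    sql: |
      SELECT * FROM events
      ORDER BY account_id, pool_id   -- defines the physical sort order

datasets:
  - from: events_sorted              # accelerate the sorted view (illustrative wiring)
    name: events_by_account
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      params:
        duckdb_file: /data/sorted.db            # separate file isolates sorted data
        duckdb_preserve_insertion_order: 'true' # keep the ORDER BY layout on disk
```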
Key Configuration:
| Parameter | Value | Purpose |
|---|---|---|
| duckdb_preserve_insertion_order | true on sorted view | Maintains physical sort order from ORDER BY |
| duckdb_preserve_insertion_order | false on source table | Faster writes without order guarantees |
| Separate duckdb_file | Per view | Isolates sorted data from source tables |
Queries filtering on account_id or (account_id, pool_id) benefit from zone-map pruning, skipping entire row groups that don't match the filter predicates.
Enable aggregate pushdown for improved performance on supported aggregate queries:
Requires query_federation to be disabled. Supports count, sum, avg, min, and max functions.
Spice uses Apache DataFusion as its query execution engine for Arrow and Spice Cayenne accelerators. DataFusion provides vectorized, multi-threaded query execution with automatic memory management and spilling.
DataFusion automatically parallelizes queries across available CPU cores. By default, the number of partitions equals the number of CPU cores, providing maximum parallelism.
DataFusion's FairSpillPool divides the configured memory_limit evenly among partitions. With more CPU cores, each partition receives less memory, which may increase spilling for memory-intensive queries.
DataFusion supports multiple join algorithms and automatically selects the best one based on query statistics:
| Algorithm | Memory Usage | Best For |
|---|---|---|
| Hash Join | Higher | Fast execution with sufficient memory (default) |
| Sort-Merge Join | Lower | Memory-constrained environments, pre-sorted data |
| Nested Loop Join | Variable | Cross joins, non-equi joins |
DataFusion prefers hash joins by default for equi-joins. Hash joins do not currently support spilling, so memory-constrained environments may benefit from sort-merge joins for large datasets.
DataFusion pushes filters from operators (TopK, Join, Aggregate) into file scans to prune data early. This optimization is enabled by default and can skip entire row groups or files based on statistics. For example, a SELECT * FROM t ORDER BY timestamp DESC LIMIT 10 query pushes timestamp filters down to file scans, pruning files that cannot contain top-10 candidates.
Spice uses DataFusion's Parquet reader, which applies several optimizations automatically when reading Parquet files from S3, file, Iceberg, and Delta Lake connectors:
These optimizations are applied automatically and require no configuration. The effectiveness depends on data layout—sorting data by frequently filtered columns maximizes row group pruning.
Spice connects to various file formats (Parquet, Iceberg, Delta Lake) and uses DataFusion's query execution engine to push down predicates and prune data at the file, row group, and page level. Understanding these optimizations helps when designing data layouts for optimal query performance.
Apache Parquet stores data in row groups with per-column statistics. DataFusion uses these statistics to skip row groups that cannot contain matching rows.
Row Group Pruning:
Each Parquet row group contains min/max statistics for each column. When a query includes a WHERE clause, DataFusion evaluates whether each row group could contain matching rows based on these statistics. Row groups that cannot match are skipped entirely.
For example, with a predicate WHERE timestamp > '2024-01-01', DataFusion skips row groups where the maximum timestamp is before 2024-01-01.
Page Index:
Parquet's Page Index provides finer-grained statistics at the page level within row groups. DataFusion uses the Page Index to skip individual pages, reducing I/O for selective queries.
Bloom Filters:
Parquet files can include bloom filters for membership testing. For equality predicates like WHERE user_id = 'abc123', DataFusion checks the bloom filter before reading column data. If the bloom filter indicates the value is not present, the row group is skipped.
Late Materialization (Filter Pushdown):
DataFusion supports applying filters during Parquet decoding rather than after. This optimization, called late materialization, filters rows before materializing all columns, reducing memory usage for selective queries.
Apache Iceberg provides hidden partitioning and multi-level metadata filtering that simplifies query optimization.
Hidden Partitioning:
Iceberg automatically derives partition values from source columns using transforms like day(timestamp), month(timestamp), or bucket(user_id, 16). Queries filter on the source column directly (e.g., WHERE timestamp > '2024-01-01'), and Iceberg automatically prunes partitions without requiring users to specify partition columns in predicates.
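For example, in Spark SQL an Iceberg table can declare transform-based partitions; queries then filter only on the source columns (catalog, table, and column names are illustrative):

```sql
-- Spark SQL: Iceberg table with hidden partitioning transforms
CREATE TABLE catalog.db.events (
    user_id STRING,
    ts      TIMESTAMP,
    payload STRING
) USING iceberg
PARTITIONED BY (days(ts), bucket(16, user_id));

-- Filter on the source column; Iceberg prunes day partitions automatically
SELECT * FROM catalog.db.events
WHERE ts > TIMESTAMP '2024-01-01 00:00:00';
```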
Two-Level Metadata Filtering:
Iceberg uses a hierarchical metadata structure that enables filtering at multiple levels:
This two-level approach can eliminate entire groups of files before reading any data, providing significant performance benefits for large tables.
Column-Level Statistics:
Iceberg manifests store column-level statistics including:
- lower_bound and upper_bound for min/max filtering
- null_count for null handling optimization
- value_count for cardinality estimation

Partition Evolution:
Iceberg supports changing partition schemes without rewriting existing data. Historical data retains its original partitioning while new data uses the updated scheme. Queries automatically account for both partition layouts.
Delta Lake provides data skipping and Z-ordering for query optimization.
Data Skipping:
Delta Lake collects column statistics (min, max, null counts) during writes. The delta.dataSkippingNumIndexedCols table property controls how many columns have statistics collected (counted from the first column in the schema). Queries filter using these statistics to skip files that cannot contain matching rows.
Generated Columns:
Delta Lake supports generated columns that derive values from other columns. When partitioned by a generated column (e.g., eventDate generated from CAST(eventTime AS DATE)), queries filtering on the source column automatically benefit from partition pruning.
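The eventDate/eventTime pairing described above can be declared in Spark SQL as follows (table and column names are illustrative):

```sql
-- Spark SQL: Delta table partitioned by a generated date column
CREATE TABLE events (
    eventId   STRING,
    eventTime TIMESTAMP,
    eventDate DATE GENERATED ALWAYS AS (CAST(eventTime AS DATE))
) USING DELTA
PARTITIONED BY (eventDate);

-- A filter on the source column eventTime also prunes eventDate partitions
SELECT * FROM events
WHERE eventTime >= '2024-01-01 00:00:00';
```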
Z-Ordering:
Z-ordering colocates related data in the same files by clustering on specified columns. After running OPTIMIZE ... ZORDER BY (column), queries filtering on the Z-ordered columns benefit from improved data skipping.
Compaction:
Delta Lake's OPTIMIZE command compacts small files into larger ones, reducing the number of files to scan and improving query performance through better I/O patterns.
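Both operations use the OPTIMIZE command (table and column names are illustrative):

```sql
-- Compact small files first, then cluster on a frequently filtered column
OPTIMIZE events;
OPTIMIZE events ZORDER BY (user_id);
```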
Spice Cayenne uses Vortex, which provides segment-level statistics and compute push-down on compressed data.
Segment Statistics:
Vortex's ChunkedLayout maintains per-segment statistics including min, max, null_count, is_sorted, and is_constant for each column. These statistics function similarly to DuckDB's zone-maps, enabling segment pruning during query execution.
Compute Push-Down:
Vortex supports executing filter operations directly on compressed data. For encodings like FSST (strings), FastLanes (integers), and ALP (floats), predicates can be evaluated without full decompression, reducing CPU and memory usage.
Encoding-Aware Optimization:
Vortex tracks encoding metadata that enables additional optimizations:
- is_sorted: Enables binary search for point lookups
- is_constant: Returns values immediately without scanning

See Spice Cayenne Performance Optimization for cache tuning and other Cayenne-specific settings.
| Optimization | Parquet | Iceberg | Delta Lake | Vortex |
|---|---|---|---|---|
| Row group/file pruning | ✅ | ✅ | ✅ | ✅ |
| Page-level filtering | ✅ | ✅ (via Parquet) | ✅ (via Parquet) | ✅ (segment-level) |
| Bloom filters | ✅ | ✅ (via Parquet) | ❌ | ❌ |
| Hidden partitioning | ❌ | ✅ | ❌ | ❌ |
| Manifest-level filtering | ❌ | ✅ | ❌ | ❌ |
| Compute on compressed data | ❌ | ❌ | ❌ | ✅ |
| Z-ordering | ❌ | ✅ | ✅ | ❌ |
Optimization Recommendations:
- For Delta Lake, collect statistics on frequently filtered columns (delta.dataSkippingNumIndexedCols)

Configure DataFusion query memory limits to prevent out-of-memory errors:
Set memory_limit based on available container/machine memory minus accelerator requirements:
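A sketch of the runtime configuration — the nested key layout follows the dotted `runtime.query.memory_limit` convention used in this guide, but verify the exact key names against the runtime configuration reference:

```yaml
# Sketch: key layout assumed from runtime.query.memory_limit; verify against the docs
runtime:
  query:
    memory_limit: 12GiB  # machine RAM minus accelerator requirements
```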
| Compression | Disk Usage | CPU Overhead |
|---|---|---|
| zstd (default) | Lowest | Moderate |
| lz4_frame | Medium | Lowest |
| uncompressed | Highest | None |
Choose lz4_frame when CPU is limited and disk is abundant. Choose uncompressed for debugging or when spill files are rare. DataFusion uses Arrow IPC Stream format for spill files.
DataFusion processes data in batches of 8192 rows by default. This batch size balances memory usage with vectorized execution efficiency. Larger batches improve CPU cache utilization and SIMD operations but consume more memory per partition.
Place spill files on fast storage (SSD/NVMe) separate from data files:
The max_temp_directory_size setting limits the total size of temporary files (default: 100 GB).
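A sketch combining the spill settings above — the key names are hypothetical placeholders (only the values come from this guide), so verify them against the runtime configuration reference:

```yaml
# Hypothetical key names; values taken from the tables above
runtime:
  query:
    spill_compression: lz4_frame    # zstd (default) | lz4_frame | uncompressed
    spill_directory: /nvme/spill    # SSD/NVMe, separate from data files
    max_temp_directory_size: 100GB  # cap on total spill file size (default 100 GB)
```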
Spice supports multiple caching layers for query acceleration.
Cache query results for repeated queries:
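A sketch of a results cache configuration — the sizes and TTL are illustrative, and the `cache_key_type` key is assumed from the key-type discussion below:

```yaml
runtime:
  results_cache:
    enabled: true
    cache_max_size: 128MiB  # total memory budget for cached results
    item_ttl: 5s            # how long a cached result stays fresh
    eviction_policy: lru
    cache_key_type: plan    # assumed key name; plan matches queries semantically
```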
Cache Key Types:
| Type | Behavior | Use Case |
|---|---|---|
| plan (default) | Uses query logical plan | Varied query formatting |
| sql | Uses exact SQL string | Identical repeated queries |
Use sql for lowest latency with identical queries. Use plan for semantic query matching.
Serve stale cached results while refreshing in the background:
This pattern reduces query latency spikes during cache refresh.
| Mode | Memory Impact | Use Case |
|---|---|---|
| full | 2.5x dataset | Small-medium datasets, complete updates |
| append | Minimal | Time-series, logs, immutable data |
| changes | Minimal | CDC-enabled sources |
| caching | Minimal | Dynamic content, API data |
Use time_column for efficient incremental updates:
For data that has a granular time column and a separate time-based partitioning column (e.g., day buckets), set time_partition_column to that partitioning column:
In this scenario, event_date is the day bucket used for physical partitioning (e.g., s3://bucket/events/event_date=2025-10-01/), while event_time provides the granular timestamp for precise filtering.
| event_id | event_time (time_column) | event_date (time_partition_column) | event_type | user_id |
|---|---|---|---|---|
| 8f2a-1 | 2025-10-01 08:14:22.123 | 2025-10-01 | page_view | u_442 |
| 8f2a-2 | 2025-10-01 22:01:05.884 | 2025-10-01 | click | u_901 |
| 9c11-a | 2025-10-02 01:12:44.001 | 2025-10-02 | purchase | u_442 |
| 9c11-b | 2025-10-02 14:30:12.550 | 2025-10-02 | page_view | u_118 |
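The table above maps to a dataset definition like the following sketch (bucket path and refresh interval are illustrative):

```yaml
datasets:
  - from: s3://bucket/events/              # laid out as event_date=YYYY-MM-DD/
    name: events
    time_column: event_time                # granular timestamp for precise filtering
    time_partition_column: event_date      # day bucket used for physical partitioning
    acceleration:
      enabled: true
      refresh_mode: append                 # incremental updates driven by event_time
      refresh_check_interval: 10m          # illustrative value
```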
For object storage with append-only files, use last_modified to skip unchanged files:
Configure resource requests and limits based on workload:
:::tip[CPU Limits]
Avoid setting CPU limits. CPU limits can cause throttling even when CPU is available, degrading query performance. Set CPU requests to guarantee scheduling.
:::
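In Kubernetes terms, this means requesting CPU and memory but limiting only memory — a sketch of the container resources section (values are illustrative):

```yaml
# Container resources for a Spice pod: CPU request, no CPU limit
resources:
  requests:
    cpu: "4"       # guarantees scheduling without risking throttling
    memory: 16Gi
  limits:
    memory: 16Gi   # memory limit only; omitting the cpu limit avoids throttling
```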
| Storage Type | Use Case |
|---|---|
| SSD/NVMe | Acceleration data files, query spill files |
| HDD | Cold data archival, infrequent access |
| RAM (tmpfs) | Temporary files for high-performance workloads |
Enable query plan capture for slow query analysis:
Monitor key performance metrics:
query_duration_ms - Query execution timequery_executions - Query throughputdataset_load_state - Acceleration statusSee Observability for metric configuration.
Use this checklist when optimizing Spice deployments:
- Set runtime.query.memory_limit with a spill directory on fast storage
- Use refresh_mode: append for time-series data

Spice Documentation:
External References: