A portable accelerated data query and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.
Spice is a portable, accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents. Run it as a sidecar next to your application — or scale to a multi-node distributed cluster — to get millisecond data and AI on localhost, backed by your existing data sources.
🎯 Goal: Build data-grounded AI apps and agents in minutes, not months. No pipelines. No glue. Just SQL, search, and inference — federated across your data, accelerated locally, served on localhost.
📣 Latest: Read Localhost Latency at Scale: The Spice Cluster-Sidecar Architecture and Apache Ballista at Spice AI: Distributed Query Execution Without the Operational Tax. | 📊 2025 Year in Review
Spice provides five APIs and interfaces in a lightweight, portable runtime (single binary or container):
📺 More on the Spice.ai YouTube channel.
Each application gets a complete data plane on localhost. A lightweight Spice sidecar runs in the application pod, serves SQL/search/LLM-inference from a scoped working set, and transparently delegates the long tail to a central Spice cluster (Ballista distributed query, Cayenne acceleration, hybrid search indexing) over Arrow Flight. Three latency tiers: results cache (microseconds) → local working set (single-digit milliseconds) → cluster delegation. The application never holds credentials to Postgres, S3, Snowflake, or Iceberg — only a token to its sidecar. Read the architecture deep dive →
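
The three-tier lookup order described above can be pictured with a small toy model. This is only a sketch under stated assumptions: `TieredQueryRouter`, `fake_cluster`, and the whole-table "query" are hypothetical names invented for illustration and are not part of the Spice API. Real delegation happens over Arrow Flight, which this sketch does not attempt to model.

```python
from typing import Callable, Dict, List, Tuple

Rows = List[tuple]


class TieredQueryRouter:
    """Toy model of the tiers: results cache -> local working set -> cluster."""

    def __init__(self, local_tables: Dict[str, Rows], delegate: Callable[[str], Rows]):
        self.results_cache: Dict[str, Rows] = {}  # tier 1: previously computed results
        self.local_tables = local_tables          # tier 2: locally accelerated datasets
        self.delegate = delegate                  # tier 3: stand-in for the central cluster

    def query(self, table: str) -> Tuple[str, Rows]:
        """Return (tier_name, rows) for a whole-table read."""
        if table in self.results_cache:
            return "results_cache", self.results_cache[table]
        if table in self.local_tables:
            tier, rows = "local_working_set", self.local_tables[table]
        else:
            tier, rows = "cluster_delegation", self.delegate(table)
        self.results_cache[table] = rows  # later reads are served from tier 1
        return tier, rows


def fake_cluster(table: str) -> Rows:
    # Hypothetical stand-in for a remote call to the cluster.
    return [(table, "remote-row")]


router = TieredQueryRouter({"orders": [(1, "widget")]}, fake_cluster)
print(router.query("orders")[0])  # local_working_set
print(router.query("orders")[0])  # results_cache
print(router.query("events")[0])  # cluster_delegation
```

The point of the ordering is that the cheapest tier is always consulted first, and anything fetched from a slower tier is promoted into the results cache.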
Spice extends Apache Ballista with multi-active scheduler HA coordinated through object storage (no etcd, ZooKeeper, or Redis required), bidirectional gRPC control streams, mandatory mTLS, multiple shuffle backends (local, in-memory, S3/Azure/GCS), Vortex-encoded shuffle data, and distributed embeddings inside SQL. TPC-H SF100: 2.9x faster than single-node DataFusion. 8x less RAM than Apache Spark with 2–8x better query performance in early preview. Read the engineering deep dive →
Cayenne pairs the Vortex columnar format with SQLite metadata to deliver multi-file acceleration without DuckDB's single-file ceiling or memory overhead. TPC-H SF100: 1.4x faster than DuckDB-file with 3x less memory. ClickBench: 14% faster, 3.4x less memory. Vortex itself is 100x faster on random access, 10–20x faster on full scans, and 5x faster on writes than Parquet. Compute kernels run directly on encoded data, skipping decompression entirely for many operations. Read the Vortex deep dive →
Connect to any Iceberg catalog (REST, AWS Glue, Hadoop), query tables with full SQL semantics, selectively accelerate hot datasets for sub-10ms reads (down from 500ms–5s on S3), and write back with ACID guarantees via Iceberg's optimistic concurrency protocol — using standard SQL INSERT INTO. No Spark required. Read the Iceberg deep dive →
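
To make the optimistic concurrency idea concrete, here is a minimal in-memory sketch of the read, build, compare-and-commit, retry loop. It is an illustration only: `ToyTable`, `insert_rows`, and the `before_commit` hook are hypothetical names, and a real Iceberg commit swaps a metadata pointer in the catalog rather than appending to a Python list.

```python
import threading
from typing import Callable, List, Optional


class CommitConflict(Exception):
    """Raised when the table changed between read and commit."""


class ToyTable:
    """In-memory stand-in for a table stored as a list of immutable snapshots."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.snapshots: List[List[tuple]] = [[]]  # snapshot 0 is the empty table

    def current_version(self) -> int:
        return len(self.snapshots) - 1

    def read(self, version: int) -> List[tuple]:
        return list(self.snapshots[version])

    def compare_and_commit(self, expected_version: int, new_rows: List[tuple]) -> int:
        # The atomic step: publish only if nobody committed since we read.
        with self._lock:
            if self.current_version() != expected_version:
                raise CommitConflict(f"expected v{expected_version}")
            self.snapshots.append(new_rows)
            return self.current_version()


def insert_rows(table: ToyTable, rows: List[tuple], max_retries: int = 5,
                before_commit: Optional[Callable[[], None]] = None) -> int:
    """Append rows optimistically, retrying from the latest snapshot on conflict."""
    for _ in range(max_retries):
        base = table.current_version()
        candidate = table.read(base) + rows
        if before_commit is not None:
            before_commit()       # demo hook: lets a rival writer commit first
            before_commit = None  # only interfere on the first attempt
        try:
            return table.compare_and_commit(base, candidate)
        except CommitConflict:
            continue              # re-read the latest snapshot and try again
    raise CommitConflict("gave up after repeated conflicts")


table = ToyTable()
insert_rows(table, [(1, "a")])
# Simulate a concurrent writer landing between our read and our commit.
insert_rows(table, [(2, "b")], before_commit=lambda: insert_rows(table, [(3, "c")]))
print(table.read(table.current_version()))  # [(1, 'a'), (3, 'c'), (2, 'b')]
```

No writer ever blocks another while preparing data. The only serialized step is the final version check, and a losing writer simply rebuilds on top of the winner's snapshot.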
Native Amazon S3 Vectors (Day 1 launch partner) for billions of vectors at up to 90% lower cost than traditional vector DBs. Plus DuckDB HNSW and Elasticsearch kNN as .vectors.engine backends. Spice manages the full lifecycle — ingestion → embedding (AWS Bedrock, HuggingFace, OpenAI, Model2Vec for 500x faster static embeddings, multi-vector ColBERT-style late interaction with MaxSim) → indexing → query. SQL-integrated via vector_search, text_search, rrf (reciprocal rank fusion), and rerank UDTFs.
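
For readers unfamiliar with reciprocal rank fusion, the following sketch shows the general technique in plain Python. It is not Spice's implementation of the `rrf` UDTF: the function name, the document ids, and the smoothing constant `k = 60` (a commonly used default in the RRF literature) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[Hashable]], k: int = 60) -> List[Hashable]:
    """Fuse ranked lists: score(d) = sum over lists of 1 / (k + rank), rank starting at 1."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by the id's string form for determinism.
    return sorted(scores, key=lambda d: (-scores[d], str(d)))


vector_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. results of a vector search
text_hits = ["doc_b", "doc_d", "doc_a"]    # e.g. results of a keyword search
print(reciprocal_rank_fusion([vector_hits, text_hits]))
# ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because only ranks are used, the two result lists do not need comparable score scales, which is why RRF is a common way to merge vector and full-text results.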
Spin up one Spice runtime per tenant or agent — each with its own sandboxed datasets, accelerators, secrets, and policies. Or share a runtime with config-level tenant isolation. Or do both with a hybrid model. The lightweight runtime makes "one Spicepod per tenant" actually viable — even at high tenant counts. Read the patterns →
Native Postgres WAL CDC uses pgoutput logical replication into any local accelerator. No Debezium or Kafka required. Auto-managed replication slots and LSN acknowledgement.
Drop-in skills for Claude Code, Cursor, and any agent that supports the open Agent Skills format. Skills auto-activate to set up datasets, connect data sources, configure acceleration, run federated queries, and wire models, without you re-explaining Spice's configuration model.
In Claude Code:
github.com/spiceai/skills | Read the announcement →
Bootstrap accelerated datasets from S3 in seconds, not minutes. Cold-start ephemeral pods with pre-built Vortex/DuckDB/SQLite files. Recover from federated source outages by serving from the last known good snapshot. Critical for sidecar deployments and serverless environments.
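
The "last known good snapshot" behavior can be sketched as a simple fallback around a refresh. This is a conceptual illustration, not how the Spice runtime is implemented: `SnapshotBackedDataset`, `flaky_source`, and the use of `ConnectionError` to signal an outage are all hypothetical.

```python
from typing import Callable, List, Optional

Rows = List[tuple]


class SnapshotBackedDataset:
    """Serve fresh rows when the source is reachable, else the last good snapshot."""

    def __init__(self, fetch_from_source: Callable[[], Rows], snapshot: Optional[Rows] = None):
        self.fetch_from_source = fetch_from_source
        self.snapshot = snapshot    # e.g. loaded from a pre-built file at startup
        self.serving_stale = False  # True while answering from an old snapshot

    def refresh(self) -> Rows:
        try:
            rows = self.fetch_from_source()
        except ConnectionError:
            if self.snapshot is None:
                raise               # nothing to fall back to
            self.serving_stale = True
            return self.snapshot
        self.snapshot = rows        # a successful refresh becomes the new snapshot
        self.serving_stale = False
        return rows


calls = {"n": 0}


def flaky_source() -> Rows:
    # Hypothetical source that fails on its second call.
    calls["n"] += 1
    if calls["n"] == 2:
        raise ConnectionError("source unavailable")
    return [("trip", calls["n"])]


dataset = SnapshotBackedDataset(flaky_source)
print(dataset.refresh(), dataset.serving_stale)  # [('trip', 1)] False
print(dataset.refresh(), dataset.serving_stale)  # [('trip', 1)] True
```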
Each application gets its data plane on localhost, transparently delegating the long tail to a central Spice cluster (Ballista distributed query, Cayenne acceleration, hybrid search indexing) over Arrow Flight. You get three latency tiers in one engine: results cache (microseconds) → local working set (single-digit milliseconds) → cluster delegation (distributed). No other open-source runtime gives you all three behind one connection. Read the architecture →
Datasets not declared in spicepod.yaml are physically absent from the catalog, not filtered at query time. The application never holds credentials to Postgres, S3, Snowflake, or Iceberg, only a token to its sidecar. A compromised pod gets a loopback scoped to that tenant's working set, not database credentials.
vector_search, text_search, rrf, rerank, NSQL, and tool calls are all SQL primitives.
If you build with DataFusion, DuckDB, Vortex, Iceberg, or Ballista, Spice gives you a flexible, production-ready engine you can just use, instead of stitching them together yourself.
| Feature | Spice | Trino / Presto | Dremio | ClickHouse | Materialize |
|---|---|---|---|---|---|
| Primary Use-Case | Data & AI apps/agents | Big data analytics | Interactive analytics | Real-time analytics | Real-time analytics |
| Primary deployment model | Sidecar + Cluster | Cluster | Cluster | Cluster | Cluster |
| Federated Query Support | ✅ | ✅ | ✅ | ❌ | ❌ |
| Distributed Query Execution | ✅ (Apache Ballista, multi-active HA) | ✅ | ✅ | ✅ | Limited |
| Acceleration/Materialization | ✅ (Cayenne/Vortex, Arrow, SQLite, DuckDB, Postgres) | Intermediate storage | Reflections (Iceberg) | Materialized views | ✅ (Real-time views) |
| Catalog Support | ✅ (Iceberg, Unity Catalog, AWS Glue, Databricks) | ✅ | ✅ | ❌ | ❌ |
| Iceberg Write (SQL INSERT) | ✅ | ✅ | Limited | ❌ | ❌ |
| Query Result Caching | ✅ | ✅ | ✅ | ✅ | Limited |
| Multi-Modal Acceleration | ✅ (OLAP + OLTP per dataset) | ❌ | ❌ | ❌ | ❌ |
| Native CDC | ✅ (Postgres WAL, DynamoDB Streams, Debezium) | ❌ | ❌ | ❌ | ✅ (Debezium) |
| Built-in AI / LLM inference | ✅ | ❌ | ❌ | ❌ | ❌ |
| Feature | Spice | LangChain | LlamaIndex | AgentOps.ai | Ollama |
|---|---|---|---|---|---|
| Primary Use-Case | Data & AI apps | Agentic workflows | RAG apps | Agent operations | LLM apps |
| Programming Language | Any (HTTP / Flight / ODBC / JDBC) | JavaScript, Python | Python | Python | Any language (HTTP interface) |
| Unified Data + AI Runtime | ✅ | ❌ | ❌ | ❌ | ❌ |
| Federated Data Query | ✅ | ❌ | ❌ | ❌ | ❌ |
| Distributed Query | ✅ | ❌ | ❌ | ❌ | ❌ |
| Accelerated Data Access | ✅ | ❌ | ❌ | ❌ | ❌ |
| Tools/Functions | ✅ (MCP server + gateway, Streamable HTTP) | ✅ | ✅ | Limited | Limited |
| LLM Memory | ✅ | ✅ | ❌ | ✅ | ❌ |
| Hybrid Search | ✅ (BM25 + vector + RRF + rerank UDTFs) | ✅ | ✅ | Limited | Limited |
| Caching | ✅ (query, results, and provider-aware LLM prompt caching) | Limited | ❌ | ❌ | ❌ |
| Embeddings | ✅ (Built-in & pluggable; multi-vector ColBERT-style MaxSim) | ✅ | ✅ | Limited | ❌ |
✅ = Fully supported · ❌ = Not supported · Limited = Partial or restricted support
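
The "multi-vector ColBERT-style MaxSim" entry in the Embeddings row refers to late-interaction scoring, where each query token is matched against its best document token. The sketch below shows the general MaxSim formula with plain Python lists. The function names and the tiny two-dimensional vectors are made up for illustration and say nothing about how Spice stores or scores multi-vector embeddings.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_vectors: Sequence[Vector], doc_vectors: Sequence[Vector]) -> float:
    """For each query token, take its best-matching document token, then sum."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


# One embedding per token (toy values).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_1 = [[0.9, 0.1], [0.2, 0.8]]  # has a strong match for each query token
doc_2 = [[0.5, 0.5]]              # a single token that matches both weakly

ranked = sorted({"doc_1": doc_1, "doc_2": doc_2}.items(),
                key=lambda item: maxsim(query, item[1]), reverse=True)
print([name for name, _ in ranked])  # ['doc_1', 'doc_2']
```

Compared with a single pooled vector per document, this keeps token-level detail at the cost of storing and comparing more vectors.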
https://github.com/spiceai/spiceai/assets/80174/7735ee94-3f4a-4983-a98e-fe766e79e03a
See more demos on YouTube.
| Name | Description | Status | Protocol/Format |
|---|---|---|---|
| databricks (mode: delta_lake) | Databricks | Stable | S3/Delta Lake |
| delta_lake | Delta Lake | Stable | Delta Lake |
| dremio | Dremio | Stable | Arrow Flight |
| duckdb | DuckDB | Stable | Embedded |
| file | File | Stable | Parquet, CSV |
| github | GitHub | Stable | GitHub API |
| postgres | PostgreSQL (with native WAL CDC) | Stable | |
| s3 | S3 | Stable | Parquet, CSV |
| mysql | MySQL | Stable | |
| spice.ai | Spice.ai | Stable | Arrow Flight |
| Name | Description | Status | Engine Modes |
|---|---|---|---|
| cayenne | Spice Cayenne (Vortex) | Release Candidate | file |
| arrow | In-Memory Arrow Records | Stable | memory |
| duckdb | Embedded DuckDB | Stable | memory, file |
| postgres | Attached PostgreSQL | Release Candidate | N/A |
| sqlite | Embedded SQLite | Release Candidate | memory, file |
| Name | Description | Status | ML Format(s) | LLM Format(s) |
|---|---|---|---|---|
| openai | OpenAI (or compatible) LLM endpoint | Release Candidate | - | OpenAI-compatible HTTP endpoint |
| file | Local filesystem | Release Candidate | ONNX | GGUF, GGML, SafeTensor |
| huggingface | Models hosted on HuggingFace | Release Candidate | ONNX | GGUF, GGML, SafeTensor |
| spice.ai | Models hosted on the Spice.ai Cloud Platform | | ONNX | OpenAI-compatible HTTP endpoint |
| azure | Azure OpenAI | | - | OpenAI-compatible HTTP endpoint |
| bedrock | Amazon Bedrock (Nova models) | Alpha | - | OpenAI-compatible HTTP endpoint |
| anthropic | Models hosted on Anthropic | Alpha | - | OpenAI-compatible HTTP endpoint |
| xai | Models hosted on xAI | Alpha | - | OpenAI-compatible HTTP endpoint |
| Name | Description | Status | ML Format(s) | LLM Format(s) |
|---|---|---|---|---|
| openai | OpenAI (or compatible) embeddings endpoint | Release Candidate | - | OpenAI-compatible embeddings endpoint |
| file | Local filesystem | Release Candidate | ONNX | GGUF, GGML, SafeTensor |
| huggingface | Models hosted on HuggingFace | Release Candidate | ONNX | GGUF, GGML, SafeTensor |
| model2vec | Static embeddings (500x faster) | Release Candidate | Model2Vec | - |
| azure | Azure OpenAI | Alpha | - | OpenAI-compatible HTTP endpoint |
| bedrock | AWS Bedrock (Titan, Cohere, Nova, Nova 2) | Alpha | - | OpenAI-compatible HTTP endpoint |
Configured as .vectors.engine on a column-level embedding.
| Name | Description | Status |
|---|---|---|
| s3_vectors | Amazon S3 Vectors for petabyte-scale vector storage and querying | Alpha |
| duckdb | DuckDB with HNSW vector index | Alpha |
| elasticsearch | Elasticsearch with kNN | Alpha |
Catalog Connectors connect to external catalog providers and make their tables available for federated SQL query in Spice. The schema hierarchy of the external catalog is preserved.
| Name | Description | Status | Protocol/Format |
|---|---|---|---|
| spice.ai | Spice.ai Cloud Platform | Stable | Arrow Flight |
| unity_catalog | Unity Catalog | Stable | Delta Lake |
| databricks | Databricks | Beta | Spark Connect, S3/Delta Lake |
| iceberg | Apache Iceberg | Beta | Parquet |
| ducklake | DuckLake | Beta | Parquet |
| glue | AWS Glue | Alpha | CSV, Parquet, Iceberg |
| Name | Description | Status |
|---|---|---|
| env | Environment variables | Stable |
| kubernetes | Kubernetes secrets | Stable |
| keyring | OS keychain | Stable |
| aws_secrets_manager | AWS Secrets Manager | Stable |
| hashicorp_vault | HashiCorp Vault | Release Candidate |
| azure_keyvault | Azure Key Vault | Release Candidate |
https://github.com/spiceai/spiceai/assets/88671039/85cf9a69-46e7-412e-8b68-22617dcbd4e0
Install the Spice CLI:
On macOS, Linux, and WSL:
Or using brew:
On Windows using PowerShell:
Note: Native Windows runtime builds are not provided in v2.0+. Use WSL for local development.
Step 1. Initialize a new Spice app with the spice init command:
A spicepod.yaml file is created in the spice_qs directory. Change to that directory:
Step 2. Start the Spice runtime:
Example output will be shown as follows:
The runtime is now started and ready for queries.
Step 3. In a new terminal window, add the spiceai/quickstart Spicepod. A Spicepod is a package of configuration defining datasets and ML models.
The spicepod.yaml file will be updated with the spiceai/quickstart dependency.
The spiceai/quickstart Spicepod adds a taxi_trips data table to the runtime, which can now be queried with SQL.
Step 4. Start the Spice SQL REPL:
The SQL REPL interface will be shown:
Enter show tables; to display the available tables for query:
Enter a query to display the longest taxi trips:
Output:
Spice is available in the AWS Marketplace.
Run Spice as a multi-node cluster: start scheduler nodes with --role scheduler and start executor nodes with --scheduler-address <scheduler-url> to join them. Multi-active schedulers coordinate through your object store (configured via runtime.scheduler.state_location) — no etcd, ZooKeeper, or Redis. mTLS certificates are managed via the Spice CLI. See the Ballista architecture deep dive and the distributed query docs.
Drop-in skills for Claude Code, Cursor, and more.
In Claude Code (slash command):
In Cursor and other agents (shell):
86+ recipes and end-to-end examples — federation, acceleration, search, RAG, agents, CDC, and more — at github.com/spiceai/cookbook.
Access ready-to-use Spicepods and datasets hosted on the Spice.ai Cloud Platform with the open-source Spice runtime. Browse public Spicepods at spicerack.org.
To use public datasets, create a free account on Spice.ai:
Once set up, you can access ready-to-use Spicepods including datasets. For this demonstration, use the taxi_trips dataset from the Spice.ai Quickstart.
Step 1. Initialize a new project.
Step 2. Log in and authenticate. A pop-up browser window will prompt you to authenticate:
Step 3. Start the runtime:
Step 4. Configure the dataset:
In a new terminal window:
Step 5. Query from the SQL REPL:
Comprehensive documentation at spiceai.org/docs.
Spice.ai is designed to be extensible. See EXTENSIBILITY.md to build custom Data Connectors, Data Accelerators, Catalog Connectors, Secret Stores, Models, or Embeddings.
🚀 See the Roadmap. Highlights:
⭐️ Star this repo to follow along — it helps us a ton, and you'll see new releases as they ship. 🙏
Data connectors (continued):

| Name | Description | Status | Protocol/Format |
|---|---|---|---|
| dynamodb | Amazon DynamoDB (with Streams) | Stable | |
| graphql | GraphQL | Release Candidate | JSON |
| cosmosdb | Azure Cosmos DB (NoSQL) | Release Candidate | |
| git | Git repositories | Release Candidate | |
| snowflake | Snowflake | Release Candidate | Arrow |
| adbc | ADBC | Release Candidate | Arrow |
| iceberg | Apache Iceberg (read+write) | Release Candidate | Parquet |
| databricks (mode: spark_connect) | Databricks | Beta | Spark Connect |
| ducklake | DuckLake | Beta | Parquet |
| flightsql | FlightSQL | Beta | Arrow Flight SQL |
| mssql | Microsoft SQL Server | Beta | Tabular Data Stream (TDS) |
| odbc | ODBC | Beta | ODBC |
| spark | Spark | Beta | Spark Connect |
| sharepoint | Microsoft SharePoint | Beta | Object-store listing |
| oracle | Oracle | Alpha | Oracle ODPI-C |
| abfs | Azure BlobFS | Alpha | Parquet, CSV |
| clickhouse | ClickHouse | Alpha | |
| debezium | Debezium CDC | Alpha | Kafka + JSON |
| elasticsearch | Elasticsearch (BM25 + kNN + RRF) | Alpha | |
| gcs, gs | Google Cloud Storage | Alpha | Parquet, CSV, JSON |
| kafka | Kafka | Alpha | Kafka + JSON |
| ftp, sftp | FTP/SFTP | Alpha | Parquet, CSV |
| glue | AWS Glue | Alpha | Iceberg, Parquet, CSV |
| http, https | HTTP(s) (dynamic headers, pagination) | Alpha | Parquet, CSV, JSON |
| imap | IMAP | Alpha | IMAP Emails |
| localpod | Local dataset replication | Alpha | |
| mongodb | MongoDB | Alpha | |
| scylladb | ScyllaDB | Alpha | |
| smb | SMB 3.1.1 | Alpha | SMB |
| nfs | NFS | Alpha | Parquet, CSV, JSON |