# Spice AI - Full Content
> Spice.ai is a data and AI platform that combines federated SQL query, hybrid search, and LLM inference in a portable, open-source runtime
This file contains the complete content from the Spice AI website for AI/LLM consumption.
---
# Local Content
## About Us
URL: https://spice.ai/about-us
Date: 2025-11-19T21:05:55
Description: Learn about Spice AI's mission, team, and vision for empowering developers to build intelligent apps with unified data and AI infrastructure.
Over the last 15 years, Luke has brought together the best builders and engineers across the globe to create developer-focused experiences through tools and technologies used by millions worldwide. Before founding Spice AI, Luke was the founding manager and co-creator of Azure Incubations at Microsoft, where he led cross-functional engineering teams to create and develop technologies like Dapr.
### Phillip LeBlanc, Founder and CTO

LinkedIn: https://www.linkedin.com/in/leblancphillip/
X: https://x.com/leblancphill

Phillip has spent a decade building some of the largest distributed systems and big data platforms used by millions worldwide. Before co-founding Spice AI, Phillip was both an engineering manager and IC working on distributed systems at GitHub and Microsoft. Phillip has contributed to services developers use every day, including GitHub Actions, Azure Active Directory, and Visual Studio App Center.
---
## 2025 Spice AI Year in Review
URL: https://spice.ai/blog/2025-spice-ai-year-in-review
Date: 2026-01-02T20:15:30
Description: From day one, Spice was designed to simplify building modern, intelligent applications. In 2025 that vision turned into reality.
In January 2025, Spice announced 1.0 stable, marking the transition from an open-source project to an enterprise-grade, production-ready platform. Spice has shipped 35 stable releases and 11 major releases since then.

From day one, Spice was designed to simplify building modern, intelligent applications. In 2025 that vision turned into reality. Spice now serves as the data and AI substrate for global, production workloads at enterprises like Twilio and Barracuda - where mission-critical applications query, search, and reason over big data in real time.

These data and AI workloads impose fundamentally different demands on the data layer than previous generations of applications. Instead of the complexity of multiple query engines, search platforms, caches, and inference layers, Spice brings this functionality into a single, high-performance data and AI stack. Development teams can query operational databases, data lakes, analytical warehouses, and more with a single SQL interface, while taking advantage of built-in acceleration, hybrid search, and AI.

All of this is delivered by a fully open-source engine built in Rust that can be deployed anywhere - as a sidecar, at the edge, in the cloud, or in enterprise clusters. Developers have complete optionality based on their access patterns and business requirements.

Below are some of the major features that defined 2025 across the core pillars of the Spice platform: federation and acceleration, search, and embedded LLM inference.
### Major 2025 Federation & Acceleration Features

SQL federation and acceleration is at the core of Spice and the applications it enables; enterprise AI applications depend on contextual data drawn from many different systems, and that data must be fast and available to search and reason over in real time.

In 2025, Spice simplified querying across disparate data sources while improving performance, scale, and reliability. The connector ecosystem also significantly expanded, enabling teams to ingest and combine data across any source.
- **Spice Cayenne data accelerator:** Introduced in v1.9, Spice Cayenne is the new premier data accelerator built on the Vortex columnar format, enabling low-latency, highly concurrent queries over large datasets and overcoming the scalability and memory limits of single-file accelerators like DuckDB.
- **Iceberg and Amazon S3 writes:** Spice added write support for Iceberg tables (v1.8) and Amazon S3 Tables (v1.10), delivering direct ingestion, transformation, and materialization of data into object storage. This simplifies writing operational data to object stores, eliminating the need for complex and costly batch or streaming pipelines.
- **Multi-node distributed query (preview):** v1.9 brought multi-node distributed query execution based on Apache Ballista, designed for querying partitioned data lake formats across multiple execution nodes for significantly improved query performance on large datasets.
- **Managed acceleration snapshots:** Acceleration Snapshots enable faster restarts, shared accelerations across multiple Spice instances, reduced load on federated systems, and continued query serving even when source systems are temporarily unavailable, for enterprise-grade resiliency.
- **Caching acceleration mode:** A new caching mode introduced in v1.10 provides stale-while-revalidate (SWR) behavior for accelerations with background refreshes, and file persistence with Spice Cayenne, SQLite, or DuckDB.
- **Expanded connector ecosystem:** Delta Lake, S3, Databricks, Unity Catalog, AWS Glue, PostgreSQL, MySQL, Kafka, DynamoDB, MongoDB, Iceberg, and more were introduced or reached stable.
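Accelerators like Cayenne are enabled through the dataset's declarative acceleration block. A minimal sketch, assuming a hypothetical S3 dataset (the bucket path and dataset name are illustrative; the `engine`/`mode` keys follow the accelerator configuration covered in the Spice docs):

```yaml
datasets:
  - from: s3://my-bucket/events/ # Hypothetical source dataset
    name: events
    acceleration:
      enabled: true
      engine: cayenne # Spice Cayenne accelerator
      mode: file # Persist Vortex files to local disk
```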
### Federation & Acceleration Feature Highlight: Spice Cayenne

Figure 1: The Spice Cayenne architecture, built on Vortex and SQLite

Spice leans into the industry shift to object storage as the source of truth for applications. These workloads are often multi-terabyte datasets using open data lake formats like Parquet, Iceberg, or Delta that must serve data and search queries to applications with sub-second performance.

Existing data accelerators like DuckDB are fast and simple for datasets up to 1TB; for multi-terabyte workloads, however, a new class of accelerator is required.

So we built Spice Cayenne, the next-generation data accelerator for high-volume and latency-sensitive applications. Spice Cayenne combines Vortex, the next-generation columnar file format from the Linux Foundation, with a simple, embedded metadata layer. This separation of concerns ensures that both the storage and metadata layers are fully optimized for what each does best. Cayenne delivers better performance and lower memory consumption than the existing DuckDB, Arrow, SQLite, and PostgreSQL data accelerators.

Figure 2: Cayenne accelerated TPC-H queries 1.4x faster than DuckDB (file mode) and used nearly 3x less memory.

Spice Founder Luke Kim demonstrated and walked through the details of the Cayenne architecture in a December 2025 Community Call:
### Major 2025 Search Features

AI applications are only as effective as the data they can retrieve and reason over. Beyond extracting data, they need to search across both structured and unstructured sources to surface the most relevant context at query time. In 2025, search evolved into a core primitive of the Spice platform, designed to operate natively across federated datasets.
- **Native Amazon S3 Vectors integration:** v1.5 added native support for Amazon S3 Vectors, making cost-effective vector search on object storage a first-class feature. Subsequent releases introduced multi-index scatter-gather, multi-column primary keys, and partitioned indexes to support scalable production workloads.
- **Reciprocal Rank Fusion (RRF):** Introduced in v1.7, RRF combines vector and full-text search results with configurable weighting and recency bias via a simple SQL table function, producing higher-quality hybrid search rankings than either approach alone.
- **Search on views (full-text and vector):** Search on views enables advanced search scenarios across different search modalities over pre-aggregated or transformed data, extending the power of Spice's search functionality beyond base datasets.
- **Search results caching:** Runtime caching for search results improves performance for subsequent searches and chat completion requests that use the document_similarity LLM tool.
- **Table-level search enhancements:** v1.8.2 added additional_columns and where support for table relations in search, enabling multi-table search workflows.
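As a rough illustration of how these search primitives compose in SQL: the sketch below uses the table-function style shown later in this document for `text_search()`. The `articles` table is illustrative, and the exact argument shape of `rrf()` and `vector_search()` is an assumption, not the documented signature; check the Spice search docs for the precise form.

```sql
-- Full-text search as a SQL table function (pattern shown later in this document):
SELECT * FROM text_search(articles, 'machine learning', 10);

-- Hybrid search via RRF follows the same table-function pattern; this exact
-- rrf() call shape is an assumption - see the Spice search docs.
SELECT *
FROM rrf(
  text_search(articles, 'machine learning', 20),
  vector_search(articles, 'machine learning', 20)
)
LIMIT 10;
```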
### Search Feature Highlight: Amazon S3 Vectors

Figure 3: Spice and S3 Vectors Architecture

In July, Spice introduced native support for Amazon S3 Vectors as a day 1 launch partner at the AWS Summit in NYC. Vector similarity search, structured filters, joins, and aggregations can now be executed in SQL within Spice without duplicating data.
Developers can make vector searches using SQL or HTTP and combine similarity search with relational predicates and joins. Spice pushes filters down to S3 Vectors to minimize data scanned, delivering scalable sub-second query performance with the flexibility of SQL.
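A similarity search combined with a relational filter and a join might look like the following sketch. The tables, columns, and the `vector_search()` argument shape are illustrative assumptions, not the documented API:

```sql
-- Hypothetical sketch: combine vector similarity with SQL predicates and joins.
SELECT d.title, m.author, d.score
FROM vector_search(documents, 'quarterly revenue guidance', 10) AS d
JOIN doc_metadata AS m ON m.doc_id = d.doc_id
WHERE m.published_at >= '2025-01-01'
ORDER BY d.score DESC;
```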
The Spice team presented a live demo of Spice and Amazon S3 Vectors at 2025 AWS re:Invent:
### Major AI Features Released in 2025

Spice deepened its AI capabilities by making LLM inference via SQL native within the query engine. LLMs can be invoked directly in SQL alongside federated queries, joins, and transformations, helping teams move from raw data to insights all within SQL.
- **AI SQL function:** The AI SQL function was introduced in v1.8, supporting LLM calls directly from SQL for generation, translation, and classification. Model inference can now run in the same execution path as joins, filters, search, and aggregations.
- **MCP server support:** Introduced in v1.1, Spice works as both an MCP server and client. Spice can run stdio-based MCP tools internally or connect to external MCP servers over HTTP SSE and streaming.
- **Amazon Nova & Nova 2 embeddings:** Support for models like Nova (v1.5.2) and Nova 2 multimodal embeddings (v1.9.1), which support high-dimensional vector representations with configurable truncation modes.
- **Expanded model provider ecosystem:** Spice added support for new providers including Anthropic, xAI, HuggingFace, Amazon Bedrock, Model2Vec static models, and more.
- **Expanded tools ecosystem:** Added native tool integrations including the OpenAI Responses API (for streaming tool calls and responses) and a Web Search tool powered by Perplexity. These tools can be invoked within the same execution context as SQL queries and model inference, enabling retrieval-augmented and agent-style workflows without external orchestration.
### AI Feature Highlight: AI SQL Function

Figure 4: AI SQL function example in Spice Cloud
The ai() SQL function enables developers to invoke LLMs directly within SQL for bulk generation, classification, translation, or analysis. Inference runs alongside joins, filters, aggregations, and search results without additional application-layer plumbing. Developers can transform federated data into structured insights without extracting data, calling external completion APIs, or orchestrating separate pipelines.
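A minimal sketch of the pattern: the `reviews` table and its columns are illustrative, and passing only a prompt string to `ai()` is an assumption about the function's simplest form; see the Spice docs for the full signature.

```sql
-- Classify rows in bulk with the ai() SQL function (illustrative table/columns).
SELECT
  review_text,
  ai(CONCAT('Classify the sentiment as positive, negative, or neutral: ', review_text)) AS sentiment
FROM reviews
LIMIT 10;
```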
Check out a live demo of the AI SQL function here:

### Looking ahead
2025 was a major year for Spice as it grew from single-node data acceleration to a multi-node data, search, and AI platform. In 2026, Spice 2.0 will focus on bringing multi-node distributed query execution to GA, alongside continued improvements to search, acceleration, and AI primitives. These investments will help deliver even more predictable performance and operational simplicity.

The mission remains the same: to provide a durable, open data substrate that helps teams build and scale the next generation of intelligent, data and AI-driven applications.

Interested in seeing it for yourself? Get started with the open source runtime or cloud platform today.
---
## A Developer's Guide to Understanding Spice.ai
URL: https://spice.ai/blog/a-developers-guide-to-understanding-spice-ai
Date: 2026-02-05T22:12:21
Description: Learn what Spice.ai is, when to use it, and how it solves enterprise data challenges. A developer-focused guide to federation, acceleration, search, and AI.
### TL;DR

This hands-on guide is designed to help developers quickly build an understanding of Spice: what it is (an AI-native query engine that federates queries, accelerates data, and integrates search and AI), when to use it (data-intensive applications and AI agents), and how it can be leveraged to solve enterprise-scale data challenges.

*Note: This guide was last updated on February 5, 2026. Please see the docs for the latest updates.*
### Who this guide is for

This guide is for developers who want to understand why, how, and when to use Spice.ai.

If you are new to Spice, you might also be wondering how Spice is different from other query engines or data and AI platforms. Most developers exploring Spice are generally doing one of the following:
- Operationalizing data lakes for real-time queries and search
- Building applications that need fast access to disparate data
- Building AI applications and agents that need fast, secure context

Let's start with the problem Spice is solving to anchor the discussion.
### The problem Spice solves

Modern applications face a distributed data challenge.

Enterprise data is spread across operational databases, data lakes, warehouses, third-party APIs, and more. Each source has its own interface, latency characteristics, and access patterns.
AI workloads amplify the problem. RAG applications generally require:

- A vector database (e.g. Pinecone, Weaviate) for embeddings
- A text search engine (e.g. Elasticsearch) for keyword matching
- A cache layer (e.g. Redis) for performance & latency
- Model hosting and serving (OpenAI, Anthropic) for LLM inference
- Orchestration code and services to coordinate everything

This can be a lot of complexity, even for a simple application.
### What is Spice?

Spice is an open-source SQL query, search, and LLM-inference engine written in Rust, purpose-built for data-driven applications and AI agents. At its core, Spice is a high-performance compute engine that federates, searches, and processes data across your existing infrastructure - querying & accelerating data where it lives and integrating search and AI capabilities through SQL.

Figure 1. Spice.ai architecture

Unlike databases that require migrations & maintenance, Spice takes a declarative configuration approach: datasets, views, models, and tools are defined in declarative YAML, and Spice handles the operations of fetching, caching, and serving that data.
This makes Spice ideal when:

- Your application needs fast, unified access to disparate data sources
- You want simplicity and to avoid building and maintaining ETL pipelines
- You want an operational data lakehouse for applications and agents
- You need sub-second query performance without ETL

What Spice is not:

- Not a replacement for PostgreSQL or MySQL (use those for transactional workloads)
- Not a data warehouse (use Snowflake/Databricks for centralized analytics)
### Mental model: Spice as a data and AI substrate

Think of Spice as the operational data & AI layer between your applications and your data infrastructure.

Figure 2. Spice as the data substrate for data-intensive AI apps
### How this guide works

We'll start with a hands-on quickstart to get Spice running, then progressively build your mental model through the core concepts:
1. Federation
2. Acceleration
3. Views
4. Caching
5. Snapshots
6. Models
7. Search
8. Writes

By the end, you'll understand how these primitives are used together to solve enterprise-scale data challenges.
### Quickstart

To install and get Spice started, run:
```bash
curl https://install.spiceai.org | /bin/bash
```
Or using Homebrew:
```bash
brew install spiceai/spiceai/spice
```
Next, in any folder, create a spicepod.yaml file with the following content:
```yaml
version: v1
kind: Spicepod
name: my_spicepod
datasets:
  - from: s3://spiceai-demo-datasets/taxi_trips/2024/
    name: taxi_trips
```
In the same folder, run:
```bash
spice run
```
And, finally, in another terminal, run:
```bash
> spice sql
Welcome to the Spice.ai SQL REPL! Type 'help' for help.

sql> show tables;
+---------------+--------------+--------------+------------+
| table_catalog | table_schema | table_name   | table_type |
+---------------+--------------+--------------+------------+
| spice         | runtime      | task_history | BASE TABLE |
| spice         | public       | taxi_trips   | BASE TABLE |
+---------------+--------------+--------------+------------+

Time: 0.010767 seconds. 2 rows.

sql> select count(*) from taxi_trips;
+----------+
| count(*) |
+----------+
| 2964624  |
+----------+
```
### Understanding what just happened

In that quickstart, you:
- Configured a dataset (taxi_trips) pointing to a remote S3 bucket
- Started the Spice runtime, which connected to that source
- Queried the data using standard SQL - without moving or copying it
### Spice.ai Cloud Platform

You can run the same Spicepod configuration in Spice.ai Cloud, the fully managed version of Spice that extends the open-source runtime with enterprise capabilities: built-in observability, elastic scaling, and team collaboration.
### Core Concepts

#### 1. Federation
In the quickstart, you queried taxi_trips stored in a remote S3 bucket using standard SQL without copying or moving that data. That's federation in action - querying data where it lives, not where you've moved it to.

This is foundational to Spice's architecture. Federation in Spice enables you to query data across multiple heterogeneous sources using a single SQL interface, without moving data or building ETL pipelines.

Traditional approaches force you to build ETL pipelines that extract data from these sources, transform it, and load it into a centralized database or warehouse. Every new data source means building and maintaining another pipeline.

Spice connects directly to your existing data sources and provides a unified SQL interface across all of them. You configure datasets declaratively in YAML, and Spice handles the connection, query translation, and result aggregation.
Spice supports query federation across:

- Databases: PostgreSQL, MySQL, Microsoft SQL Server, Oracle, MongoDB, ClickHouse, DynamoDB, ScyllaDB
- Data Warehouses: Snowflake, Databricks, BigQuery
- Data Lakes: S3, Azure Blob Storage, Delta Lake, Apache Iceberg
- Other Sources: GitHub, GraphQL, FTP/SFTP, IMAP, Kafka, HTTP/API, and 30+ more connectors

Figure 3. Spice Federation (and acceleration) architecture
##### How it works

When you configure multiple datasets from different sources, Spice's query planner (built on Apache DataFusion) optimizes and routes queries appropriately:
```yaml
datasets:
  # From PostgreSQL
  - from: postgres:customers
    name: customers
    params:
      pg_host: db.example.com
      pg_user: ${secrets:PG_USER}

  # From S3 Parquet files
  - from: s3://bucket/orders/
    name: orders
    params:
      file_format: parquet

  # From Snowflake
  - from: snowflake:analytics.sales
    name: sales
```
```sql
-- Query across all three sources in one statement
SELECT c.name, o.order_total, s.region
FROM customers c
JOIN orders o ON c.id = o.customer_id
JOIN sales s ON o.id = s.order_id
WHERE s.region = 'EMEA';
```
Without additional configuration, each query fetches data directly from the underlying sources. Spice optimizes this as much as possible using filter pushdown and column projection.

📚 Docs: Spice Federation and Data Connectors
#### 2. Acceleration

Federation solves the data movement problem, but alone often isn't enough for production applications. Querying remote S3 buckets for every request introduces latency - even with query pushdown and optimization, round-trips to distributed data sources can take seconds (or tens of seconds) for large datasets.

Figure 4. Acceleration example in a Spice sidecar architecture

Spice data acceleration materializes working sets of data locally, reducing query latency from seconds to milliseconds. When enabled, Spice syncs data from connected sources and stores it in local stores, like DuckDB or Vortex - giving you the speed of local data with the flexibility of federated access.

You can think of acceleration as an intelligent caching layer that understands your data access patterns. Hot data gets materialized locally for instant access, while cold data remains federated. Unlike traditional caches that just store query results or static database materializations, Spice accelerates entire datasets with configurable refresh strategies, with the flexible compute of an embedded database.
##### Acceleration Engines

| Engine | Mode | Best For |
| --- | --- | --- |
| Arrow | In-memory only | Ultra-fast analytical queries, ephemeral workloads |
| DuckDB | Memory or file | General-purpose OLAP, medium datasets, persistent storage |
| SQLite | Memory or file | Row-oriented lookups, OLTP patterns, lightweight deployments |
| Cayenne | File only | High-volume multi-file workloads, terabyte-scale data |
To enable acceleration, add the acceleration block to your dataset configuration:
```yaml
datasets:
  - from: s3://data-lake/events/
    name: events
    acceleration:
      enabled: true
      engine: cayenne # Choose your engine
      mode: file # 'memory' or 'file'
```
With this configuration, Spice fetches the events dataset from S3 and stores it in local Spice Cayenne Vortex files. Queries to events are then served from local disk instead of making remote calls to S3.

Figure 5. Spice Cayenne architecture

While DuckDB and SQLite are general-purpose engines, Spice Cayenne is purpose-built for modern data lake workloads. It's built on Vortex - a next-generation columnar format under the Linux Foundation - designed for the scale and access patterns of object storage.

Learn more: Introducing the Spice Cayenne Data Accelerator

📚 Docs: Data Accelerators
##### Refresh Modes

Spice offers multiple strategies for keeping accelerated data synchronized with sources:

| Mode | Description | Use Case |
| --- | --- | --- |
| full | Complete dataset replacement on each refresh | Small, slowly-changing datasets |
| append (batch) | Adds new records based on a time column | Append-only logs, time-series data |
| append (stream) | Continuous streaming without time column | Real-time event streams |
| changes | CDC-based incremental updates via Debezium or DynamoDB | Frequently updated transactional data |
| caching | Request-based row-level caching | API responses, HTTP endpoints |
```yaml
# Full refresh every 8 hours
acceleration:
  refresh_mode: full
  refresh_check_interval: 8h

# Append mode: check for new records from the last day every 10 minutes
acceleration:
  refresh_mode: append
  time_column: created_at
  refresh_check_interval: 10m
  refresh_data_window: 1d

# Continuous ingestion using Kafka
acceleration:
  refresh_mode: append

# CDC with Debezium or DynamoDB Streams
acceleration:
  refresh_mode: changes
```
📚 Docs: Refresh Modes
##### Retention Policies

While refresh modes control how acceleration is populated, retention policies prevent unbounded growth. As data continuously flows into an accelerated dataset, especially in append or streaming modes, storage can grow indefinitely. Retention policies automatically evict stale data using time-based or custom SQL strategies.

Retention is particularly useful for time-series workloads like logs, metrics, and event streams where only recent data is relevant for queries. For example, an application monitoring dashboard might only need the last 7 days of logs for troubleshooting, while a real-time analytics pipeline processing IoT sensor data might retain just 24 hours of readings. By defining retention policies, you ensure accelerated datasets stay bounded and performant without manual intervention.

Spice supports two retention strategies: time-based, which removes records older than a specified period, and custom SQL-based, which executes arbitrary DELETE statements for more complex eviction logic. Once defined, Spice runs retention checks automatically at the configured interval:
```yaml
acceleration:
  # Common retention parameters
  retention_check_enabled: true
  retention_check_interval: 1h

  # Time-based retention policy
  retention_period: 7d

  # Custom SQL-based retention
  retention_sql: "DELETE FROM logs WHERE status = 'archived'"
```
📚 Docs: Retention
##### Constraints and Indexes

Accelerated datasets support primary key constraints and indexes for optimized query performance and data integrity:
```yaml
datasets:
  - from: postgres:orders
    name: orders
    acceleration:
      enabled: true
      engine: duckdb
      primary_key: order_id # Creates non-null unique index
      indexes:
        customer_id: enabled # Single column index
        '(created_at, status)': unique # Multi-column unique index
```
📚 Docs: Constraints & Indexes
#### 3. Views

Views are virtual tables defined by SQL queries - useful for pre-aggregations, transformations, and simplified access patterns:
```yaml
views:
  - name: daily_revenue
    sql: |
      SELECT
        DATE_TRUNC('day', created_at) as day,
        SUM(amount) as revenue,
        COUNT(*) as transactions
      FROM orders
      GROUP BY 1

  - name: top_customers
    sql: |
      SELECT customer_id, SUM(total) as lifetime_value
      FROM orders
      GROUP BY customer_id
      ORDER BY lifetime_value DESC
      LIMIT 100
```
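Once defined, views are queried like any other table. A minimal sketch using the daily_revenue view defined above:

```sql
-- Last 7 days of revenue from the daily_revenue view
SELECT day, revenue, transactions
FROM daily_revenue
ORDER BY day DESC
LIMIT 7;
```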
📚 Docs: Views
#### 4. Caching

Spice provides in-memory caching for SQL query results, search results, and embeddings - all enabled by default. Caching eliminates redundant computation for repeated queries and improves performance for non-accelerated datasets.
```yaml
runtime:
  caching:
    sql_results:
      enabled: true
      cache_max_size: 128MiB
      eviction_policy: lru
      item_ttl: 1s
      encoding: none
    search_results:
      enabled: true
      cache_max_size: 128MiB
      eviction_policy: lru
      item_ttl: 1s
      encoding: none
    embeddings_results:
      enabled: true
      cache_max_size: 128MiB
      eviction_policy: lru
      item_ttl: 1s
      encoding: none
```
| Option | Description | Default |
| --- | --- | --- |
| cache_max_size | Maximum cache storage | 128 MiB |
| item_ttl | Entry expiration duration | 1 second |
| eviction_policy | `lru` (least-recently-used) or `tiny_lfu` | lru |
| encoding | Compression: `zstd` or `none` | none |
Spice also supports HTTP cache-control headers (`no-cache`, `max-stale`, `only-if-cached`) for fine-grained control over caching behavior per request.

📚 Docs: Results Caching
#### 5. Snapshots

Snapshots allow file-based acceleration engines (DuckDB, SQLite, or Cayenne) to bootstrap from pre-stored snapshots in object storage. This dramatically reduces cold-start latency in distributed deployments.
```yaml
snapshots:
  enabled: true
  location: s3://large_table_snapshots

datasets:
  - from: postgres:large_table
    name: large_table
    acceleration:
      engine: duckdb
      mode: file
      snapshots: enabled
```
Snapshot triggers vary by refresh mode:

- refresh_complete: Creates snapshots after each refresh (full and batch-append modes)
- time_interval: Creates snapshots on a fixed schedule (all refresh modes)
- stream_batches: Creates snapshots after every N batches (streaming modes: Kafka, Debezium, DynamoDB Streams)
📚 Docs: Snapshots
#### 6. Models

AI is a first-class capability in the Spice runtime - not a bolt-on integration. Instead of wiring up external APIs, you call LLMs directly from SQL queries using the `ai()` function. Embeddings generate automatically during data ingestion, eliminating separate pipeline infrastructure. Text-to-SQL is schema-aware with direct data access, preventing the hallucinations common in external tools that don't understand your table structure.

This SQL-first approach means you can query your federated and accelerated data, pipe results to an LLM for analysis, and get synthesized answers in a single SQL statement.

You can connect to hosted providers (OpenAI, Anthropic, Bedrock) or serve models locally with GPU acceleration. Spice provides an OpenAI-compatible AI Gateway, so existing applications using OpenAI SDKs can swap endpoints without code changes.
##### Chat Models

Connect to hosted models or serve locally:
```yaml
models:
  - name: gpt4
    from: openai:gpt-4o
    params:
      openai_api_key: ${secrets:OPENAI_API_KEY}
      tools: auto # Enable tool use
  - name: claude
    from: anthropic:claude-3-5-sonnet
    params:
      anthropic_api_key: ${secrets:ANTHROPIC_KEY}
  - name: local_llama
    from: huggingface:huggingface.co/meta-llama/Llama-3.1-8B
```
Use via the OpenAI-compatible API or the `spice chat` CLI:
```bash
$ spice chat
Using model: gpt4
chat> How many orders were placed last month?
Based on the orders table, there were 15,234 orders placed last month.
```
##### NSQL (Text-to-SQL)

The `/v1/nsql` endpoint converts natural language to SQL and executes it:
```bash
curl -XPOST "http://localhost:8090/v1/nsql" \
-H "Content-Type: application/json" \
-d '{"query": "What was the highest tip any passenger gave?"}'
```
Spice uses tools like `table_schema`, `random_sample`, and `sample_distinct_columns` to help models write accurate, contextual SQL.
##### Embeddings

Transform text into vectors for similarity search. These embeddings power the vector search capabilities covered in the 'Search' section coming up next:
```yaml
embeddings:
  - name: openai_embed
    from: openai:text-embedding-3-small
    params:
      openai_api_key: ${secrets:OPENAI_API_KEY}
  - name: bedrock_titan
    from: bedrock:amazon.titan-embed-text-v2:0
    params:
      aws_region: us-east-1
  - name: local_minilm
    from: huggingface:sentence-transformers/all-MiniLM-L6-v2
```
Configure columns for automatic embedding generation:
```yaml
datasets:
  - from: postgres:documents
    name: documents
    acceleration:
      enabled: true
    columns:
      - name: content
        embeddings:
          - from: openai_embed
            chunking:
              enabled: true
              target_chunk_size: 512
```
📚 Docs: Models & Embeddings
#### 7. Search

In the previous section, we configured embeddings to generate automatically during data ingestion. Those embeddings enable vector search - one of three search methods Spice provides as native SQL functions.

Spice takes the same integrated approach with search as it does with AI. Search indexes are built on top of accelerated datasets - the same data you're querying and piping to LLMs. Full-text search uses Tantivy with BM25 scoring for keyword matching. Vector search uses the embeddings you've already configured to generate during ingestion. Hybrid search combines both methods with Reciprocal Rank Fusion (RRF) to merge rankings - all via SQL functions like `text_search()`, `vector_search()`, and `rrf()`. Search in Spice powers retrieval-augmented generation (RAG), recommendation systems, and content discovery:
| Method | Best For | How It Works |
| --- | --- | --- |
| Full-Text Search | Keyword matching, exact phrases | BM25 scoring via Tantivy |
| Vector Search | Semantic similarity, meaning-based retrieval | Embedding distance calculation |
| Hybrid Search | Queries with both keywords and semantic similarity | Hybrid execution and ranking through Reciprocal Rank Fusion (RRF) |
Full-Text Search

Full-text search performs keyword-driven retrieval optimized for text data. Powered by Tantivy with BM25 scoring, it excels at finding exact phrases, specific terms, and keyword combinations. Enable it by indexing the columns you want to search:
```yaml
datasets:
- from: postgres:articles
name: articles
acceleration:
enabled: true
columns:
- name: title
full_text_search: enabled
- name: body
full_text_search: enabled
```
```sql
SELECT * FROM text_search(articles, 'machine learning', 10);
```
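For intuition about what that `text_search()` call ranks on, here is a textbook BM25 scorer. This is an illustrative sketch, not Tantivy's implementation, and `bm25_rank` is a hypothetical name:

```python
import math

def bm25_rank(query: str, docs: list[str], k1: float = 1.2, b: float = 0.75) -> list[int]:
    """Rank documents for a query using the standard BM25 formula."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(d) for d in tokenized) / len(tokenized)  # average doc length
    n = len(docs)
    scores = []
    for doc in tokenized:
        score = 0.0
        for term in query.lower().split():
            tf = doc.count(term)                       # term frequency in this doc
            df = sum(term in d for d in tokenized)     # number of docs containing term
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
            # Length normalization: long docs are penalized via the b parameter.
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
        scores.append(score)
    # Return document indices, most relevant first.
    return sorted(range(n), key=lambda i: -scores[i])

docs = [
    "machine learning for search ranking",
    "cooking recipes and kitchen tips",
    "deep learning is a branch of machine learning",
]
ranking = bm25_rank("machine learning", docs)  # the cooking doc ranks last
```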
Vector Search

Vector search uses embeddings to find documents based on semantic similarity rather than exact keyword matches. This is particularly useful when users search with different wording than the source content - a query for "how to fix login issues" can match documents about "authentication troubleshooting."

Spice supports both local embedding models (like sentence-transformers from Hugging Face) and remote providers (OpenAI, Anthropic, etc.). Embeddings are configured as top-level components and referenced in dataset columns:
```yaml
datasets:
- from: s3://docs/
name: documents
vectors:
enabled: true
columns:
- name: body
embeddings:
- from: openai_embed
```
```sql
SELECT * FROM vector_search (documents, 'How do I reset my password?', 10)
WHERE category = 'support'
ORDER BY score;
```
Vector search is also available via the `/v1/search` HTTP API for direct integration with applications.
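Under the hood, ranking by "embedding distance" reduces to comparing vectors. A toy sketch with made-up 3-dimensional vectors (real embeddings have hundreds of dimensions and come from the models configured above):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the vector magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Pretend column embeddings; in Spice these are generated during ingestion.
store = {
    "reset your password in settings": [0.9, 0.1, 0.0],
    "quarterly revenue grew 12%":      [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # pretend embedding of "How do I reset my password?"

# Rank stored rows by similarity to the query, most similar first.
ranked = sorted(store, key=lambda doc: -cosine(query_vec, store[doc]))
```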
Hybrid Search with RRF

Neither vector nor full-text search alone produces optimal results for every query. A search for "Python error 403" benefits from both semantic understanding ("error" relates to "exception," "failure") and exact keyword matching ("403," "Python"). Hybrid search combines results from multiple search methods using Reciprocal Rank Fusion (RRF), merging rankings to improve relevance across diverse content types:
```sql
SELECT * FROM rrf(
vector_search(docs, 'query', 10),
text_search(docs, 'query', 10)
) LIMIT 10;
```
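The RRF merge step is simple enough to sketch in a few lines: each input ranking contributes 1 / (k + rank) per document, so items ranked highly by either method float to the top. k = 60 is the constant conventionally used with RRF; `rrf_fuse` is a hypothetical name for illustration:

```python
def rrf_fuse(*rankings: list[str], k: int = 60) -> list[str]:
    """Merge ranked result lists with Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            # A document's fused score is the sum of 1/(k + rank) over all lists.
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: -scores[d])

vector_hits = ["doc_a", "doc_b", "doc_c"]   # as if from vector_search
keyword_hits = ["doc_c", "doc_a", "doc_d"]  # as if from text_search
fused = rrf_fuse(vector_hits, keyword_hits)
# doc_a (ranked 1st and 2nd) beats doc_c (ranked 3rd and 1st).
```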
📚 Docs: Search & Vector Search

8. Writing Data

Spice supports writing to Apache Iceberg tables and Amazon S3 Tables via standard INSERT INTO statements.

Apache Iceberg Writes
```yaml
catalogs:
- from: iceberg:https://glue.us-east-1.amazonaws.com/iceberg/v1/catalogs/123456/namespaces
name: ice
access: read_write
datasets:
- from: iceberg:https://catalog.example.com/v1/namespaces/sales/tables/transactions
name: transactions
access: read_write
```
```sql
-- Insert from another table
INSERT INTO transactions
SELECT * FROM staging_transactions;
-- Insert with values
INSERT INTO transactions (id, amount, timestamp)
VALUES (1001, 299.99, '2025-01-15');
-- Insert into catalog table
INSERT INTO ice.sales.orders
SELECT * FROM federated_orders;
```
Amazon S3 Tables

Spice offers full read/write capability for Amazon S3 Tables, enabling direct integration with AWS' managed table format for S3:
```yaml
datasets:
- from: glue:my_namespace.my_table
name: my_table
params:
glue_region: us-east-1
glue_catalog_id: 123456789012:s3tablescatalog/my-bucket
access: read_write
```
Note: Write support requires the `access: read_write` configuration.

📚 Docs: Write-Capable Connectors

Deployment

Spice is designed for deployment flexibility and optionality - from edge devices to multi-node distributed clusters. It ships as a single ~140MB binary with no external dependencies beyond your configured data sources.

This portability means you can deploy the same Spicepod configuration on a Raspberry Pi at the edge, as a sidecar in your Kubernetes cluster, or as a fully-managed cloud service - without code changes:
| Deployment Model | Description | Best For |
| --- | --- | --- |
| Standalone | Single instance via Docker or binary | Development, edge devices, simple workloads |
| Sidecar | Co-located with your application pod | Low-latency access, microservices architectures |
| Microservice | Multiple replicas deployed behind a load balancer | Loosely coupled architectures, heavy or varying traffic |
| Cluster | Distributed multi-node deployment | Large-scale data, horizontal scaling, fault tolerance |
| Sharded | Horizontal data partitioning across multiple instances | Large-scale data, distributed query execution |
| Tiered | Hybrid approach combining sidecar for performance and shared microservice for batch processing | Varying requirements across different application components |
| Cloud | Fully-managed cloud platform | Auto-scaling, built-in observability, zero operational overhead |
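As one concrete example of the standalone and sidecar models, the same Spicepod can be run via Docker Compose. This is a sketch only: the image name and port mappings below are assumptions (8090 matches the HTTP examples in this doc; the Flight SQL port is a guess), so verify them against the deployment docs:

```yaml
services:
  spiced:
    image: spiceai/spiceai:latest           # assumed image name - verify in the docs
    volumes:
      - ./spicepod.yaml:/app/spicepod.yaml  # the same Spicepod works at the edge or in-cluster
    ports:
      - "8090:8090"      # HTTP API, as used by the /v1/nsql examples above
      - "50051:50051"    # Arrow Flight SQL endpoint (assumed port)
```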
Putting it all together

Spice makes data fast, federated, and AI-ready - through configuration, not code. The flexibility of this architecture means you can start simple and evolve incrementally.
| Concept | Purpose |
| --- | --- |
| Federation | Query 30+ sources with unified SQL |
| Acceleration | Materialize data locally for sub-second queries |
| Views | Virtual tables from SQL transformations |
| Snapshots | Fast cold-start from object storage |
| Models | Chat, NSQL, and embeddings via OpenAI-compatible API |
| Search | Full-text and vector search integrated in SQL |
| Writes | INSERT INTO for Iceberg and Amazon S3 Tables |
What can you build with Spice?

| Use Case | How Spice Helps |
| --- | --- |
| Operational Data Lakehouse | Serve real-time operational workloads and AI agents directly from Apache Iceberg, Delta Lake, or Parquet with sub-second query latency. Spice federates across object storage and databases, accelerates datasets locally, and integrates hybrid search and LLM inference - eliminating separate systems for operational access. |
| Data Lake Accelerator | Accelerate data lake queries from seconds to milliseconds by materializing frequently-accessed datasets in local engines. Maintain the scale and cost efficiency of object storage while delivering operational-grade query performance with configurable refresh policies. |
| Data Mesh | Unified SQL access across distributed data sources with automatic performance optimization |
| Enterprise Search | Combine semantic and full-text search across structured and unstructured data |
| RAG Pipelines | Merge federated data with vector search and LLMs for context-aware AI applications |
| Real-Time Analytics | Stream data from Kafka or DynamoDB with sub-second latency into accelerated tables |
| Agentic AI | Build autonomous agents with tool-augmented LLMs and fast access to operational data |
Whether you're replacing complex ETL pipelines, building AI-powered applications, or deploying intelligent agents at the edge - Spice provides the primitives to deliver fast, context-aware access to data wherever it lives.
📚 Docs: Use Cases

Next steps

Now that you have a mental model for Spice, check out the cookbook recipes for 80+ examples, the GitHub repo, the full docs, and join us on Slack to connect directly with the team and other Spice users.

And remember these principles:
Spice is a runtime, not a database: It federates across your existing data infrastructure

Configuration over code: Declarative YAML replaces custom integration code

Acceleration is optional but powerful: Start with federation, add acceleration for latency-sensitive use cases

Composable primitives: Federation + Acceleration + Search + LLM Models work together

SQL-first: Everything accessible through standard SQL queries
---
## A New Class of Applications That Learn and Adapt
URL: https://spice.ai/blog/a-new-class-of-applications-that-learn-and-adapt
Date: 2021-12-30T18:08:39
Description: Explore the history of decision engines and how modern machine learning enables applications that learn, adapt, and make better decisions over time with Spice.ai.
A new class of applications that learn and adapt is becoming possible through machine learning (ML). These applications learn from data and make decisions to achieve the application's goals. In the post Making apps that learn and adapt, Luke described how developers integrate this ability to learn and adapt as a core part of the application's logic. You can think of the component that does this as a "decision engine." This post will explore a brief history of decision engines and use cases for this application class.
History of decision engines

The idea to make intelligent decision-making applications is not new. Developers first created these applications around the 1970s[1], and they are some of the earliest examples of using artificial intelligence to solve real-world problems.

The first applications used a class of decision engines called "expert systems". A distinguishing trait of expert systems is that they encode human expertise in rules for decision-making. Domain experts created combinations of rules that powered decision-making capabilities.

Some uses of expert systems include:
- Fault diagnosis
- "Smart" operator and troubleshooting manual
- Recovery from extreme conditions
- Emergency shutdown
However, the resources required to build expert systems make employing them infeasible for many applications[2]. They often need a significant time and resource investment to capture and encode expertise into complex rule sets. These systems also do not automatically learn from experience, relying on experts to write more rules to improve decision-making.

With the advent of modern deep-learning techniques and the ability to access significantly more data, it is now possible for the computer, not only the developer, to learn and encode the rules that power a decision engine and improve them over time. The vision for Spice.ai is to make it easy for developers to build this new class of applications. So what are some use cases for these applications?
Use cases of decision-making applications

Reduce energy costs by optimizing air conditioning

Today: The air conditioning system for an office building runs on a fixed schedule and is set to a fixed temperature during business hours, only adjusting using in-room sensor data, if at all. This behavior potentially over-cools at business close as the outside temperature drops and the building starts vacating.

With Spice.ai: The application combines time-series data from multiple sources, including the time of day and day of the week, building/room occupancy, outside temperature, energy consumption, and pricing. The A/C controller application learns how to adjust the air conditioning system as the room naturally cools towards the end of the day. As occupancy decreases, the decision engine is rewarded for maintaining the desired temperature while minimizing energy consumption and cost.
Food delivery order dispatching

Today: Customers order food delivery with a mobile app. When the order is ready to be picked up from the restaurant, it is dispatched to a delivery driver by a simple heuristic that chooses the nearest available driver. As the app gets more popular and the number of restaurants, drivers, and customers increases, the heuristic needs to be constantly tuned or supplemented with human operators to handle the demand.

With Spice.ai: The application learns which driver to dispatch to minimize delivery time and maximize customer star ratings. It considers several factors from data, including patterns in both the restaurant's and driver's order histories. As the number of users, drivers, and customers increases over time, the app adapts to keep up with the changing patterns and demands of the business.
Routing stock or crypto trades to the best exchange

Today: When trading stocks through a broker like Fidelity or TD Ameritrade, your broker will likely route your order to an exchange like the NYSE. In the emerging world of crypto, you can place your trade or swap directly on a decentralized exchange (DEX) like Uniswap or PancakeSwap. In both cases, order routing is likely to be either a traditional rules-based expert system or even manual.

With Spice.ai: A smart order-routing application learns from data such as pending transactions, time of day, day of the week, transaction size, and the recent history of transactions. It finds patterns to determine the optimal route or exchange to execute the transaction and get you the best trade.
Summary

A new class of applications that can learn and adapt is made possible by integrating AI-powered decision engines. Spice.ai is a decision engine that makes it easy for developers to build these applications.

If you'd like to partner with us in creating this new generation of intelligent decision-making applications, we invite you to join us on Slack, or reach out on Twitter.

Phillip