🧑‍🍳 Spice.ai Cookbook

78 guides and samples to help you build data-grounded AI apps and agents with Spice.ai Open-Source. Find ready-to-use examples for data acceleration, AI agents, LLM memory, and more.

Featured Recipes

Most popular cookbook recipes for SQL federation, local models, acceleration, and LLM memory.

Federated SQL Query

Join S3, PostgreSQL, and Dremio data in one SQL query.

Run Llama3 Locally

Use Llama models from HuggingFace with Spice. Includes video walkthrough.

Data Acceleration with DuckDB

Speed up queries using DuckDB. Includes video walkthrough.

Core Features

Discover core capabilities like data federation, acceleration, search, and LLM inference to enhance your applications.

Federated SQL Query

Query data from S3, PostgreSQL, and Dremio in a single query.

OpenAI SDK

Use the OpenAI SDK to connect to models hosted on Spice.

AI SQL Function

Invoke LLMs directly within SQL queries using the AI SQL function.

DuckDB Data Accelerator

Accelerate data locally using DuckDB. Includes video walkthrough.

Amazon S3 Vectors Search

Use Amazon S3 Vectors to store embeddings and run efficient vector search. Includes video walkthrough.

Spice Cayenne Data Accelerator

Accelerate data locally using the Spice Cayenne Data Accelerator.

Models, AI, and Agents

Integrate with popular AI models, LLMs, and build intelligent agents using Spice.ai.

Azure OpenAI Models

Connect and use Azure OpenAI models with Spice.

Running Llama3 Locally

Use the Llama family of models locally from HuggingFace using Spice. Includes video walkthrough.

OpenAI SDK

Use the OpenAI SDK to connect to models hosted on Spice.

OpenAI Responses API

Use the OpenAI Responses API with Spice.

OpenAI Models

Use OpenAI LLM and embedding models in Spice.

LLM Memory

Persistent memory for language models.

Text to SQL (NSQL)

Ask natural-language questions of your datasets using the built-in text-to-SQL tool.

AI SQL Function

Invoke LLMs directly within SQL queries using the AI SQL function.

Generative Visualizations

Generate SQL queries and Chart.js visualizations from natural language using AI.

NVIDIA NIM on Kubernetes

Deploy NVIDIA NIM infrastructure on Kubernetes with GPUs connected to Spice.

NVIDIA NIM on AWS EC2

Deploy NVIDIA NIM on AWS GPU-optimized EC2 instances connected to Spice.

Searching GitHub Files

Search GitHub files with embeddings and vector similarity search. Includes video walkthrough.

Hybrid-Search with RRF

Combine multiple search methods using Reciprocal Rank Fusion (RRF) for improved search results.

xAI Models

Use xAI models such as Grok. Includes video walkthrough.

DeepSeek Model

Use the DeepSeek model through Spice.

Filesystem Hosted Model

Use models hosted directly on filesystems.

Web Search Tools using Perplexity

Provide LLMs with web search access for more informed answers.

Language Model Evaluations

Use Spice to evaluate language models.

LLM as a Judge

Define LLM judge models to evaluate the performance of other language models.

Model Context Protocol (MCP)

Use Spice to connect to or host MCP servers.

Amazon S3 Vectors

Use Amazon S3 Vectors to store embeddings and perform efficient vector search. Includes video walkthrough.

Data Acceleration, Materialization, and Federation

Optimize query performance with local acceleration, data materialization, and federation techniques.

DuckDB Data Accelerator

Accelerate data locally using DuckDB. Includes video walkthrough.

PostgreSQL Data Accelerator

Accelerate data locally using PostgreSQL.

SQLite Data Accelerator

Accelerate data locally using SQLite.

Apache Arrow Data Accelerator

Accelerate data using Apache Arrow.

Hashed Partitioning with DuckDB

Use hashed partitioning to improve query performance with DuckDB.

Dataset Partitioning

Partition accelerated datasets to improve query performance.

Database Snapshots

Bootstrap DuckDB accelerations from object storage to skip cold starts.

Accelerated Views

Use view materialization for improved performance.

Indexes on Accelerated Data

Create and manage indexes on accelerated data.

Search & Embeddings

Implement advanced search capabilities and leverage embeddings for vector similarity search.

Searching GitHub Files

Search GitHub files with embeddings and vector similarity search. Includes video walkthrough.

Hybrid-Search with RRF

Combine multiple search methods using Reciprocal Rank Fusion (RRF) for improved search results.
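
As a rough illustration of the idea behind Reciprocal Rank Fusion, the sketch below merges several ranked result lists by summing 1 / (k + rank) for each document across the lists. It is a standalone Python example, not Spice's implementation or API: the function name `reciprocal_rank_fusion`, the document IDs, and the default `k=60` (a commonly used constant) are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs into a single ranking.

    rankings: list of lists, each ordered best-first (hypothetical input shape).
    k: smoothing constant; 60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top result contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword search and a vector search.
keyword_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear near the top of more than one list accumulate the largest scores, which is why "b" and "c" outrank "a" in this example even though "a" was the top keyword hit.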

Amazon S3 Vectors

Use Amazon S3 Vectors to store embeddings and perform efficient vector search. Includes video walkthrough.

Data Connectors

Connect to various data sources and systems to query, analyze, and manage your data efficiently.

PostgreSQL Connector

Connect to and query PostgreSQL databases.

AWS RDS PostgreSQL

Connect to AWS RDS PostgreSQL instances.

Supabase PostgreSQL

Connect to Supabase PostgreSQL databases.

MySQL Connector

Connect to and query MySQL databases.

AWS RDS Aurora MySQL

Connect to AWS RDS Aurora with MySQL compatibility.

PlanetScale MySQL

Connect to PlanetScale MySQL databases.

ClickHouse Connector

Connect to and query ClickHouse databases.

Databricks Connector

Connect to and query Databricks instances using Delta Lake or Spark Connect.

Delta Lake Connector

Query data from Delta Lake tables.

Debezium CDC from Postgres

Stream changes from PostgreSQL using Debezium CDC.

Debezium CDC with SASL/SCRAM

Stream MySQL changes using Debezium with SASL/SCRAM authentication.

Dremio Connector

Connect to and query Dremio.

DuckDB Connector

Query DuckDB databases with sample TPC-H data.

File Connector

Query data from local files.

FTP Connector

Query data from FTP servers.

GitHub Connector

Connect to and query GitHub data. Includes video walkthrough.

GraphQL Connector

Query data from GraphQL endpoints.

HTTP Connector

Query data from HTTP(S) endpoints such as REST APIs.

MSSQL Connector

Connect to Microsoft SQL Server databases.

ODBC Connector

Connect to databases using ODBC.

Redshift Connector

Read and write TPC-H data with Amazon Redshift.

Oracle Connector

Connect to and query Oracle databases.

Glue Connector

Connect to AWS Glue.

S3 Connector

Query data from S3-compatible storage.

ScyllaDB Connector

Query data from ScyllaDB clusters using federated SQL.

SharePoint Connector

Connect to SharePoint and OneDrive for Business.

SMB Connector

Query data files from SMB/CIFS network shares.

Snowflake Connector

Connect to and query Snowflake databases.

Spice.ai Cloud Connector

Connect to the Spice.ai Cloud Platform.

Apache Spark Connector

Connect to and query Apache Spark.

IMAP Emails

Run federated SQL queries over mail across IMAP email servers.

IMAP Outlook Mailbox

Connect Spice to an Outlook mailbox via IMAP.

MongoDB Connector

Connect to and query MongoDB databases.

Live Orders Analytics with Apache Kafka Data Connector

Combine real-time data streaming from Kafka with other datasets using Spice.

Catalog Connectors

Connect to data catalogs to discover, manage, and utilize your data assets effectively.

Spice.ai Cloud Platform Catalog

Connect to the Spice.ai Cloud Platform catalog.

Databricks Unity Catalog

Connect to Databricks Unity Catalog.

Unity Catalog

Connect to Unity Catalog.

Iceberg Catalog Connector

Connect to an Iceberg catalog with support for reading and writing Iceberg tables.

Glue Catalog Connector

Connect to AWS Glue Catalog.

Visualization

Visualize data with BI and analytics tools.

Sales BI with Apache Superset

Visualize data in Spice with Apache Superset.

Grafana Datasource

Add Spice as a Grafana datasource.

API Clients

Use API clients for data access and integration.

Python ADBC Client

Query Spice using ADBC and parameterized queries with Python.

Java JDBC Client

Query Spice.ai using the Java JDBC client.

Scala JDBC Client

Query Spice.ai using the Scala JDBC client.

Deployment

Deploy Spice.ai in different environments.

Deploying to Kubernetes

Deploy Spice.ai on Kubernetes.

Running in Docker

Run Spice.ai in Docker containers.

Sidecar Deployment Architecture

Deploy Spice as a sidecar alongside your application.

Microservice Deployment Architecture

Deploy Spice as a standalone microservice.

Advanced Topics

Explore advanced deployment and data architecture patterns for production workloads.

Local Dataset Replication

Link datasets in a parent/child relationship within the current Spicepod.

Distributed Query

Run queries distributed across multiple nodes for large datasets.

Performance and Benchmarking

Measure and optimize performance with benchmarks and best practices for your Spice.ai deployment.

TPC-H Benchmarking

Benchmark performance using TPC-H.

Results Caching

Cache query results for improved performance.

Caching Accelerator

Use intelligent HTTP response caching with stale-while-revalidate (SWR).
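
To make the stale-while-revalidate idea concrete, here is a minimal, self-contained Python sketch of the general pattern: fresh entries are served directly, entries that are stale but still inside a grace window are served immediately while a refresh is triggered, and anything older is refetched. This is an illustrative toy, not how Spice implements its caching accelerator. The class name `SWRCache`, its parameters (`max_age`, `stale_window`, `clock`), and the inline refresh are all assumptions; a real cache would normally revalidate in the background.

```python
import time


class SWRCache:
    """Toy stale-while-revalidate cache (hypothetical, for illustration only)."""

    def __init__(self, fetch, max_age, stale_window, clock=time.monotonic):
        self._fetch = fetch                # callable: key -> value (the origin)
        self._max_age = max_age            # seconds an entry counts as fresh
        self._stale_window = stale_window  # extra seconds a stale entry may be served
        self._clock = clock                # injectable clock, handy for testing
        self._entries = {}                 # key -> (value, stored_at)

    def get(self, key):
        """Return (value, status), where status is 'fresh', 'stale', or 'miss'."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None:
            value, stored_at = entry
            age = now - stored_at
            if age <= self._max_age:
                return value, "fresh"
            if age <= self._max_age + self._stale_window:
                # Serve the stale value, then revalidate. The refresh runs
                # inline here to keep the sketch deterministic; production
                # caches typically do this asynchronously.
                self._entries[key] = (self._fetch(key), now)
                return value, "stale"
        # No entry, or the entry is too old to serve: fetch from the origin.
        value = self._fetch(key)
        self._entries[key] = (value, now)
        return value, "miss"


# Usage with a fake clock and a fake origin that counts its calls.
now = [0.0]
calls = []


def fetch(key):
    calls.append(key)
    return f"{key}-v{len(calls)}"


cache = SWRCache(fetch, max_age=10, stale_window=30, clock=lambda: now[0])
first = cache.get("a")    # nothing cached yet, so the origin is called
now[0] = 20
second = cache.get("a")   # past max_age but inside the stale window
third = cache.get("a")    # the refreshed value is now fresh
print(first, second, third)
```

The trade-off in this pattern is that callers occasionally receive slightly outdated data in exchange for never waiting on the origin while an entry is inside the stale window.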

Indexes on Accelerated Data

Create and manage indexes on accelerated data.

Configuration

Fine-tune your Spice.ai deployment with advanced configuration options for optimal performance.

Data Retention Policy

Configure data retention policies.

Refresh Data Window

Configure data refresh windows.

Advanced Data Refresh

Advanced configuration for data refresh.

Data Quality with Constraints

Add data quality constraints.

Cron Dataset Schedules

Schedule dataset refreshes using cron syntax.

SDKs

Use SDKs for different programming languages.

OpenAI SDK

Use the OpenAI SDK to connect to models hosted on Spice.

Rust SDK

Query Spice.ai using the Rust SDK.

Python SDK

Query Spice.ai using the Python SDK.

Go SDK

Query Spice.ai using the Go SDK.

Spice.js JavaScript (Node.js) SDK

Query Spice.ai using the JavaScript (Node.js) SDK with examples.

Java SDK

Query Spice.ai using the Java SDK.

Security

Secure your Spice.ai deployment and data access with robust security practices and configurations.

Intelligent Security Copilot

Analyze real-time data access patterns with Spice.ai.

TLS Encryption

Enable encryption in transit using TLS.

API Key Authentication

Secure access with API key authentication.

FAQs

Common questions about using the Spice.ai OSS Cookbook and choosing the right recipe to start with.

What is the Spice.ai OSS Cookbook?

The cookbook is a curated set of practical, ready-to-run recipes for Spice.ai Open Source. Recipes cover connectors, federation, acceleration, hybrid search, model integration, and deployment patterns.

Which recipe should I start with?

If you are new to Spice, start with a recipe that matches your immediate goal: federated SQL for multi-source queries, DuckDB acceleration for faster performance, or OpenAI SDK and MCP recipes for AI agent use cases.

Do cookbook recipes work with Spice Cloud?

Yes. Cookbook patterns are based on Spice OSS capabilities and can be adapted to Spice Cloud workflows. You can start locally with OSS and move to managed deployments as workloads grow. See Spice Cloud pricing.

