/docs/website/versioned_docs/version-1.11.x/features/query-federation/index.md


---
title: 'Query Federation'
sidebar_label: 'Query Federation'
description: 'Learn how to use federated SQL queries in Spice.ai Open Source'
sidebar_position: 1
pagination_prev: null
pagination_next: null
tags:
  - query
  - sql
  - features
---

import DocCardList from '@theme/DocCardList';

Spice provides a high-performance SQL query engine built on Apache DataFusion, supporting query federation across multiple data sources including databases (PostgreSQL, MySQL), data warehouses (Databricks, Snowflake, BigQuery), and data lakes (S3, MinIO).

Spice.ai Open Source Query Federation

For a full list of supported sources, see Data Connectors.

When to Use Query Federation

Query federation is useful when:

Data lives in multiple systems (e.g., PostgreSQL + S3 + Snowflake) and needs to be joined without ETL pipelines.
Applications need a single SQL interface to query across databases, data lakes, and warehouses.
SQL queries should be pushed down to source databases to minimize data transfer.

Minimal Example

Query data from PostgreSQL and S3 through a single SQL interface:
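As a minimal sketch of what such a query could look like, the Python snippet below composes a single SQL statement that joins two datasets. The dataset names `pg_orders` (assumed to be backed by PostgreSQL) and `s3_customers` (assumed to be backed by S3), the column names, and the helper `federated_join_sql` are all hypothetical and only illustrate the shape of a federated join.

```python
def federated_join_sql(orders_table: str, customers_table: str, limit: int = 10) -> str:
    """Compose one SQL statement that joins two datasets registered in Spice.

    Table and column names are hypothetical. Pass only trusted, hard-coded
    identifiers here, because they are interpolated into the SQL text.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT c.name, SUM(o.total) AS total_spent\n"
        f"FROM {orders_table} AS o\n"
        f"JOIN {customers_table} AS c ON o.customer_id = c.id\n"
        "GROUP BY c.name\n"
        "ORDER BY total_spent DESC\n"
        f"LIMIT {limit}"
    )


# Hypothetical dataset names: one PostgreSQL-backed, one S3-backed.
query = federated_join_sql("pg_orders", "s3_customers", limit=5)
print(query)
```

From the query's point of view both datasets are ordinary tables, so the statement does not need to say where each one physically lives.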

Query Methods

Spice supports multiple ways to execute queries:

SQL Queries: Execute standard SQL queries against datasets using the HTTP API, Arrow Flight SQL, JDBC, ODBC, or ADBC.
Parameterized Queries: Execute prepared statements with parameter binding for improved security and performance.
Federated Queries: Join and query data across multiple sources in a single SQL statement.
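To illustrate why parameter binding helps with security, here is a small sketch that uses Python's built-in `sqlite3` module as a stand-in. It does not use Spice or any Spice client; the `orders` table, its columns, and the `find_orders` helper are hypothetical. The point is only that a bound value is treated as data and never as SQL text.

```python
import sqlite3


def find_orders(conn: sqlite3.Connection, customer: str):
    """Look up orders with a bound parameter instead of string concatenation."""
    # The "?" placeholder is filled in by the driver, so the value
    # cannot change the structure of the statement.
    cursor = conn.execute(
        "SELECT id, customer, total FROM orders WHERE customer = ? ORDER BY id",
        (customer,),
    )
    return cursor.fetchall()


# In-memory database used purely for illustration (not a Spice dataset).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (id, customer, total) VALUES (?, ?, ?)",
    [(1, "alice", 10.0), (2, "bob", 25.5), (3, "alice", 7.25)],
)

print(find_orders(conn, "alice"))
# An injection attempt is matched as a literal string and finds no rows.
print(find_orders(conn, "alice' OR '1'='1"))
```

The same statement text can be reused with different values, which is where the performance benefit of prepared statements typically comes from.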

API Endpoints

| Protocol | Endpoint | Description |
| --- | --- | --- |
| HTTP | `/v1/sql` | Execute SQL queries over HTTP |
| Arrow Flight SQL | `grpc://localhost:50051` | High-performance Arrow-native queries |
| JDBC/ODBC | Flight SQL compatible | Connect from BI tools and applications |
| ADBC | Flight SQL driver | Arrow Database Connectivity |

HTTP API

Execute a query using the HTTP API:
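A minimal sketch of calling the `/v1/sql` endpoint from Python with only the standard library follows. It assumes the runtime's HTTP listener is reachable at `http://localhost:8090`, that the SQL text is sent as a plain-text POST body, and that the response body is JSON; verify each of these against your deployment. The helper names `build_sql_request` and `run_sql` and the dataset name `my_dataset` are hypothetical.

```python
import json
import urllib.request

# Assumption: the Spice HTTP listener is at localhost:8090. Adjust as needed.
SPICE_HTTP_BASE = "http://localhost:8090"


def build_sql_request(sql: str, base_url: str = SPICE_HTTP_BASE) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the SQL text as the body."""
    return urllib.request.Request(
        url=f"{base_url}/v1/sql",
        data=sql.encode("utf-8"),
        headers={"Content-Type": "text/plain"},  # assumed content type
        method="POST",
    )


def run_sql(sql: str, base_url: str = SPICE_HTTP_BASE, timeout: float = 30.0):
    """Send the query and decode the response, assuming a JSON body."""
    request = build_sql_request(sql, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires a running Spice runtime; 'my_dataset' is hypothetical):
#   rows = run_sql("SELECT * FROM my_dataset LIMIT 5")
#   print(rows)
```

Separating request construction from sending keeps the sketch easy to inspect without a running runtime.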

Arrow Flight SQL

Connect using Arrow Flight SQL for high-performance data transfer:

SQL REPL

Use the Spice CLI for interactive queries:

Query Features

Federated Query Example

To start using federated queries in Spice, follow these steps:

Step 1. Install Spice by following the installation instructions.

Step 2. Clone the Spice Cookbook repository and navigate to the federation directory.

Step 3. Log in to the demo Dremio instance.

Step 4. Create a new Spice app called demo.

Step 5. Add the spiceai/fed-demo Spicepod.

Note that the Spice runtime output shows several datasets being loaded.

Step 6. Start the Spice runtime.

Step 7. Show available tables and query them, regardless of source.

Show the available tables:

Execute the queries:

Step 8. Join tables across remote sources and a locally accelerated source.

Step 9. Join and query tables across locally accelerated sources.

Acceleration

The query in Step 8 returns results from federated remote data sources, so its performance is affected by network latency and data transfer overhead.

Step 9 demonstrates the same query executed against locally materialized datasets using Data Accelerators. By storing data locally, queries avoid network round-trips and achieve significantly faster response times.

:::warning[Limitations]

Query Performance: Without acceleration, federated queries will be slower than local queries due to network latency and data transfer.
Query Capabilities: Not all SQL features and data types are supported across all data sources. Queries involving more complex data types may not work as expected.

:::