The caching refresh mode provides intelligent caching for HTTP-based datasets where multiple result rows can share the same request metadata. It is designed for scenarios such as API responses, where the same request parameters may return different content over time or return multiple rows of data.
Unlike traditional refresh modes that treat datasets as single sources of truth, the caching mode treats HTTP request metadata (path, query parameters, and body) as cache keys. This approach is particularly useful for request-driven sources, such as search or episode APIs, where each distinct request maps to its own cache entry.
:::info[Future Enhancement]
While currently designed for HTTP-based datasets, future versions of Spice will extend the caching mode to support arbitrary queries from any data source, enabling flexible caching strategies across all connector types.
:::
The caching mode uses HTTP request filter values as cache keys rather than enforcing primary key constraints. When a refresh occurs:
- The combination of `request_path`, `request_query`, and `request_body` acts as the cache key. If a `primary_key` is explicitly specified in the acceleration configuration, it is used instead of the metadata fields.
- Each row records a `fetched_at` timestamp indicating when the data was retrieved.

Datasets using caching mode include the following metadata fields in addition to the content data:
| Field Name | Type | Description |
|---|---|---|
| `request_path` | String | The URL path used for the request |
| `request_query` | String | The query parameters used for the request |
| `request_body` | String | The request body (for POST requests) |
| `content` | String | The response content |
| `response_status` | UInt16 | The HTTP status code of the response (e.g., 200, 404, 500) |
| `response_headers` | Map(String, String) | The HTTP response headers as key-value pairs |
| `fetched_at` | Timestamp | The timestamp when the data was fetched (based on the HTTP `Date` header) |
The `fetched_at` timestamp uses the HTTP `Date` response header when available, falling back to the current system time when it is not present.
To use caching mode, configure an HTTP/HTTPS dataset with `refresh_mode: caching`:
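A minimal configuration could look like the following sketch; the endpoint URL and dataset name are illustrative, and the exact field layout may differ in your Spicepod:

```yaml
# Illustrative sketch: endpoint and dataset name are placeholders.
datasets:
  - from: https://api.example.com/search
    name: api_search
    acceleration:
      enabled: true
      refresh_mode: caching
```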
Cache TV show search API results where the same query may return different results over time:
This configuration caches each unique search request as its own entry, keyed by the request path, query parameters, and body.
The same pattern applies to caching responses from a TV show episodes API, using a different endpoint in the dataset's `from` field.
Cache responses from multiple TVMaze API endpoints:
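One way to sketch this is a separate cached dataset per endpoint; the TVMaze URLs and dataset names below are illustrative:

```yaml
# Illustrative sketch: one cached dataset per API endpoint.
datasets:
  - from: https://api.tvmaze.com/search/shows?q=spice
    name: tv_search
    acceleration:
      enabled: true
      refresh_mode: caching
  - from: https://api.tvmaze.com/shows/1/episodes
    name: tv_episodes
    acceleration:
      enabled: true
      refresh_mode: caching
```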
Query cached data using standard SQL, filtering by request metadata or content:
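For example, assuming a cached dataset named `tv_search` with the metadata columns described above:

```sql
-- Illustrative query: find successful cached responses for a given
-- query string, newest fetches first.
SELECT request_path, request_query, fetched_at, content
FROM tv_search
WHERE request_query LIKE '%q=spice%'
  AND response_status = 200
ORDER BY fetched_at DESC;
```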
The caching mode supports the Stale-While-Revalidate (SWR) pattern for acceleration, providing optimal performance by serving cached data immediately while refreshing in the background.
When configured with background refresh, the caching mode serves cached data to incoming queries immediately and refreshes stale cache entries asynchronously.
This pattern reduces query latency by eliminating wait times for data fetches while keeping the cache reasonably fresh.
Configure background refresh using refresh_check_interval to specify how frequently the cache should be updated:
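For instance, refreshing cached entries every 30 seconds might look like this sketch (endpoint and dataset name are illustrative):

```yaml
# Illustrative sketch: background refresh every 30 seconds.
datasets:
  - from: https://api.example.com/search
    name: api_search
    acceleration:
      enabled: true
      refresh_mode: caching
      refresh_check_interval: 30s
```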
The SWR pattern is particularly valuable for caching API responses, since queries are answered from the cache without waiting on upstream fetches.
Combine background refresh with on-demand refresh for maximum flexibility:
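A combined setup could pair a scheduled `refresh_check_interval` with the on-demand refresh API; the endpoint below is illustrative:

```yaml
# Illustrative sketch: scheduled refresh plus on-demand refresh via API.
datasets:
  - from: https://api.tvmaze.com/search/shows?q=spice
    name: tv_search_swr
    acceleration:
      enabled: true
      refresh_mode: caching
      refresh_check_interval: 1m
```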
With this configuration, scheduled background refreshes keep the cache warm, and an on-demand refresh can be triggered at any time via the `/v1/datasets/tv_search_swr/acceleration/refresh` API.

The caching mode supports persisting cached data to disk using file-based acceleration engines, enabling the cache to survive application restarts and reducing cold start times.
Three acceleration engines support file persistence for caching mode: DuckDB, SQLite, and Cayenne.
Enable file persistence by setting `acceleration.mode: file` and specifying an acceleration engine:
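For example, persisting the cache with DuckDB might look like this sketch (endpoint and dataset name are illustrative):

```yaml
# Illustrative sketch: file-backed cache using the DuckDB engine.
datasets:
  - from: https://api.example.com/search
    name: api_search
    acceleration:
      enabled: true
      refresh_mode: caching
      engine: duckdb
      mode: file
```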
- **DuckDB** provides excellent performance for analytical queries on cached data.
- **SQLite** is ideal for lightweight caching scenarios.
- **Cayenne** provides optimized performance for Spice workloads.
Persisting the cache to disk provides several advantages: cached data survives restarts, cold starts load from disk instead of refetching everything, and the cache can grow beyond available memory.
Choose between in-memory and file-based caching based on your requirements:
| Aspect | Memory Mode (`mode: memory`) | File Mode (`mode: file`) |
|---|---|---|
| Performance | Fastest - all data in RAM | Fast with disk I/O |
| Persistence | Lost on restart | Survives restarts |
| Capacity | Limited by available memory | Limited by disk space |
| Cold start | Slow - must refetch all data | Fast - loads from disk |
| Best for | Small, frequently changing caches | Large, stable caches |
| Engines | `arrow` (default) | `duckdb`, `sqlite`, `cayenne` |
For optimal performance, combine the SWR pattern with file persistence:
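One way to combine the two could be the following sketch (endpoint and dataset name are illustrative):

```yaml
# Illustrative sketch: SWR background refresh plus file persistence.
datasets:
  - from: https://api.example.com/search
    name: api_search
    acceleration:
      enabled: true
      refresh_mode: caching
      refresh_check_interval: 30s
      engine: duckdb
      mode: file
```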
This configuration provides low-latency responses served from the persisted cache, background refreshes that keep data fresh, and a cache that survives restarts.
The caching mode uses `InsertOp::Replace` to handle data updates: when new data is fetched for a given cache key (request metadata), existing rows for that key are replaced while rows for other keys are preserved.
This behavior differs from other modes:
- `full` mode: replaces the entire dataset
- `append` mode: adds new rows without removing existing ones
- `changes` mode: applies CDC events
- `caching` mode: replaces only rows matching the specific cache key

The caching mode determines cache keys based on the acceleration configuration:
- **Default (no primary key specified):** the combination of `request_path`, `request_query`, and `request_body` is the cache key.
- **With primary key specified:** the configured `primary_key` columns are the cache key.

Example with custom primary key:
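A sketch, assuming a response whose rows carry a unique identifier column (the endpoint, dataset name, and `show_id` column are illustrative):

```yaml
# Illustrative sketch: override the cache key with an explicit primary key.
datasets:
  - from: https://api.example.com/shows
    name: shows
    acceleration:
      enabled: true
      refresh_mode: caching
      primary_key: show_id  # hypothetical column in the response data
```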
The `fetched_at` timestamp respects the HTTP `Date` response header when present, so the recorded fetch time reflects the server's view of when the response was generated. If the `Date` header is not present, the system falls back to the current system time.
The caching mode supports standard refresh configuration options. See Stale-While-Revalidate Pattern for background refresh details and Cache Persistence for file-based caching configuration.
| Parameter | Description | Default |
|---|---|---|
| `refresh_check_interval` | How often to refresh cached data in the background | None |
| `refresh_sql` | SQL query defining what data to cache | None |
| `refresh_on_startup` | Whether to refresh on startup (`auto` or `always`) | `auto` |
| `on_zero_results` | Behavior when the cache returns no results (`return_empty`, `use_source`) | `return_empty` |
| `engine` | Acceleration engine (`arrow`, `duckdb`, `sqlite`, `cayenne`) | `arrow` |
| `mode` | Persistence mode (`memory` or `file`) | `memory` |
The caching mode provides parameters to control cache freshness and staleness behavior:
| Parameter | Description | Default |
|---|---|---|
| `caching_ttl` | Duration that cached data is considered fresh. After this period, data becomes stale and triggers a background refresh. | `30s` |
| `caching_stale_while_revalidate_ttl` | Duration after `caching_ttl` expires during which stale data is served while refreshing in the background. After this period, queries wait for fresh data. | None |
| `caching_stale_if_error` | When set to `enabled`, serves expired cached data if the upstream source returns an error. Valid values: `enabled`, `disabled`. | `disabled` |
The caching_ttl parameter defines how long cached data is considered fresh before it becomes stale. Once cached data exceeds this age, the SWR pattern triggers background refresh to update the cache while continuing to serve the stale data during the caching_stale_while_revalidate_ttl window.
If `caching_stale_while_revalidate_ttl` is omitted, cached data expires immediately after `caching_ttl` elapses, and queries wait for fresh data rather than returning stale results. When a value is specified, stale data is served during that window after `caching_ttl` expires while a background refresh occurs. Once the combined `caching_ttl` + `caching_stale_while_revalidate_ttl` period has passed, queries wait for fresh data instead of returning stale results.
Configuring Cache TTL:
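For example, a sketch with both TTL parameters set (the endpoint and dataset name are illustrative, and the exact placement under `acceleration.params` is assumed from the parameter names used later in this document):

```yaml
# Illustrative sketch: 30s freshness window, 5m stale-while-revalidate window.
datasets:
  - from: https://api.example.com/search
    name: api_search
    acceleration:
      enabled: true
      refresh_mode: caching
      params:
        caching_ttl: 30s
        caching_stale_while_revalidate_ttl: 5m
```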
How Cache TTL Works:
- When data is fetched, its `fetched_at` timestamp is recorded
- Data becomes stale when `now - fetched_at > caching_ttl`
- Stale data is served during the `caching_stale_while_revalidate_ttl` window while a background refresh runs

**TTL Format:** Duration strings support common units:

- Seconds: `30s`, `90s`
- Minutes: `5m`, `15m`
- Hours: `1h`, `24h`
- Combined: `1h30m`, `2h15m30s`

**Default Behavior:** When `caching_ttl` is not specified, it defaults to `30s` (30 seconds), a reasonable balance between freshness and cache efficiency for most use cases. When `caching_stale_while_revalidate_ttl` is not specified, stale data is not served after the TTL expires, and queries wait for fresh data.
The caching_stale_if_error parameter controls whether expired cached data is served when the upstream data source returns an error during a refresh attempt. This provides fault tolerance by returning stale data instead of failing the query when the upstream source is temporarily unavailable.
When `caching_stale_if_error: enabled`, a failed refresh causes the expired cached data to be served instead of surfacing the error.

When `caching_stale_if_error: disabled` (the default), a failed refresh is surfaced to the query rather than masked by stale data.
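Enabling fault-tolerant serving might look like this sketch (endpoint and dataset name are illustrative):

```yaml
# Illustrative sketch: serve expired cache entries when the upstream errors.
datasets:
  - from: https://api.example.com/search
    name: api_search
    acceleration:
      enabled: true
      refresh_mode: caching
      params:
        caching_ttl: 30s
        caching_stale_if_error: enabled
```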
:::warning[Stale-While-Revalidate Configuration Conflict]
When using `refresh_mode: caching`, you cannot configure both the caching accelerator's `caching_stale_while_revalidate_ttl` and the results cache's `stale_while_revalidate_ttl` for the same dataset. These parameters control similar behavior at different layers, and enabling both creates a conflict.
Choose one approach:
- Use `acceleration.params.caching_stale_while_revalidate_ttl` for HTTP-based dataset caching
- Use `runtime.caching.sql_results.stale_while_revalidate_ttl` for SQL query results caching

:::
TTL Considerations:
- Shorter TTLs (e.g., `10s`, `30s`): more frequent refreshes, higher data freshness, more API requests
- Longer TTLs (e.g., `10m`, `1h`): fewer refreshes, lower API costs, potentially stale data
- Set `caching_ttl` shorter than `refresh_check_interval` to define the staleness window

Additional notes:

- Caching mode requires `acceleration.enabled: true`
- When no `primary_key` is specified, cache keys default to the request metadata fields (`request_path`, `request_query`, `request_body`)
- The `/v1/datasets/:name/acceleration/refresh` API triggers a new refresh for all cache keys defined in `refresh_sql`