---
title: 'Models'
sidebar_label: 'Models'
description: 'Models YAML reference'
pagination_next: null
---

The models section of a Spicepod defines machine learning (ML) models and large language models (LLMs) for use with Spice. Models can be loaded from Hugging Face, OpenAI, local files, or other supported providers. The model type is automatically determined based on the source and file format.

| Field         | Description                                                              |
| ------------- | ------------------------------------------------------------------------ |
| `name`        | Unique, readable name for the model within the Spicepod.                 |
| `from`        | Source-specific address to uniquely identify a model.                    |
| `description` | Additional details about the model, useful for displaying to users.      |
| `datasets`    | Datasets that the model depends on for inference.                        |
| `files`       | Specify additional files, or override default files needed by the model. |
| `params`      | Additional parameters to be passed to the model.                         |

`models`

The models section in your configuration specifies one or more models to be used with your datasets.

Example:
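A minimal sketch of a `models` section, using only the fields described in this reference. The model names, the local file path, and the dataset name are hypothetical placeholders, and provider authentication parameters are omitted.

```yaml
models:
  # A language model served by OpenAI; `assistant` is a hypothetical name.
  - name: assistant
    from: openai:gpt-4o
    description: General-purpose assistant for questions about the data.
    params:
      tools: auto

  # An ML model loaded from a local ONNX file (hypothetical path and dataset).
  - name: drive_stats
    from: file:models/drive_stats.onnx
    datasets:
      - drive_stats_inferencing
```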

`from`

The `from` field specifies both the source of the model (e.g., Hugging Face or a local file) and the unique identifier of the model relative to that source. The `from` value expects the format `<model_source>:<model_id>`.

Model Source

The `<model_source>` prefix of the `from` field indicates where the model is sourced from:

- `huggingface:huggingface.co` - Models from Hugging Face
- `file:` - Local file paths
- `openai` - OpenAI (or compatible) models
- `spiceai` - Spice AI models

Model ID

The `<model_id>` suffix of the `from` field is a unique (per source) identifier for the model:

- For Spice AI: Supports only ML models. Represents the full path to the model in the Spice AI repository. Supports a version suffix (defaults to `latest`).
  - Example: `lukekim/smart/models/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf`
- For Hugging Face: A `repo_id` and, optionally, a revision hash or tag.
  - `Qwen/Qwen1.5-0.5B` (no revision)
  - `meta-llama/Meta-Llama-3-8B:cd892e8f4da1043d4b01d5ea182a2e8412bf658f` (with revision hash)
- For local files: Represents the absolute or relative path to the model weights file on the local file system. See below for the accepted model weight types and formats.
- For OpenAI: Only supports language models. Valid IDs for OpenAI models can be found in the OpenAI model documentation. For OpenAI-compatible providers, specify the model value required in their `v1/chat/completions` payload.
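The source prefixes and model IDs above combine into `from` values roughly as sketched below. The `name` values and the local path are hypothetical, and the Hugging Face entry assumes the model ID is joined to the `huggingface:huggingface.co` prefix with a `/`; verify the exact form against a working Spicepod.

```yaml
models:
  # Spice AI: full model path plus an explicit version suffix.
  - name: drive_stats
    from: spiceai:lukekim/smart/models/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf

  # Hugging Face: repo_id with no revision (assumed `/` separator after the prefix).
  - name: qwen
    from: huggingface:huggingface.co/Qwen/Qwen1.5-0.5B

  # Local file: relative path to the weights file (hypothetical path).
  - name: local_model
    from: file:models/model.onnx

  # OpenAI: the model ID as listed in the provider's documentation.
  - name: assistant
    from: openai:gpt-4o
```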

`name`

A unique identifier for this model component.

`description`

Additional details about the model, useful for displaying to users.

`files`

Optional. A list of files associated with this model. Each file has:

- `path`: The path to the file.
- `name`: Optional. A name for the file.
- `type`: Optional. The type of the file (automatically determined if not specified).

File types include:

- `weights`: Model weights
  - For ML models: typically `.onnx` files
  - For LLMs: `.gguf`, `.ggml`, `.safetensors`, or `pytorch_model.bin` files
  - These files contain the trained parameters of the model
- `config`: Model configuration
  - Usually a `config.json` file
  - Contains model architecture and hyperparameters
- `tokenizer`: Tokenizer file
  - Usually a `tokenizer.json` file
  - Defines how input text is converted into tokens for the model
- `tokenizer_config`: Tokenizer configuration
  - Usually a `tokenizer_config.json` file
  - Contains additional configuration for the tokenizer

The system attempts to automatically determine the file type based on the file name and extension. If the type cannot be determined automatically, you can explicitly specify it in the configuration.
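As a sketch of how `files` might be used with a local LLM, the entry below relies on automatic type detection for the conventionally named files and sets `type` explicitly only where the file name does not reveal it. All paths and names are hypothetical.

```yaml
models:
  - name: local_llm
    from: file:models/llama/model.safetensors
    files:
      # Conventional file names: the type is expected to be detected automatically.
      - path: models/llama/config.json
      - path: models/llama/tokenizer.json
      # Non-standard file name: state the type explicitly.
      - path: models/llama/tok-settings.json
        name: tokenizer-settings
        type: tokenizer_config
```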

`params`

Optional. A map of key-value pairs for additional parameters specific to the model.

Example uses include:

- Setting default OpenAI request parameters for language models (see parameter overrides).
- Enabling language models to perform actions against Spice (e.g., making SQL queries) via language model tool use (see runtime tools).
- Invoking language models directly from SQL queries using the `ai()` function.

`params.tools`

Specifies which tools are made available to the model. Supported values: `auto`, `all`, `search_registry`, or a comma-separated list of specific tool names. See Tool Modes.
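For illustration, the two entries below contrast a tool mode with an explicit tool list. The model names are hypothetical, and the tool names `sql` and `list_datasets` are assumptions used only to show the comma-separated form; consult the runtime tools reference for the names actually available.

```yaml
models:
  # Let Spice decide which tools to expose.
  - name: assistant_auto
    from: openai:gpt-4o
    params:
      tools: auto

  # Expose only specific tools (assumed tool names, comma-separated).
  - name: assistant_sql
    from: openai:gpt-4o
    params:
      tools: sql, list_datasets
```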

`params.tool_embedding_model`

The name of an embedding model (defined in the `embeddings` section) to use for searchable tool discovery. Required when `tools: search_registry` is set. When `tools: auto` is used, this model enables registry-based discovery if the tool count exceeds the auto-search threshold (20 tools). If only one embedding model is configured, it is used automatically.
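A sketch of pairing `tools: search_registry` with an embedding model follows. It assumes entries in the `embeddings` section take `name` and `from` fields in the same way as models; the names `tool_embedder` and `assistant` and the embedding model ID are illustrative.

```yaml
embeddings:
  # Hypothetical embedding model used for searchable tool discovery.
  - name: tool_embedder
    from: openai:text-embedding-3-small

models:
  - name: assistant
    from: openai:gpt-4o
    params:
      tools: search_registry
      # Must match the `name` of an entry in the embeddings section.
      tool_embedding_model: tool_embedder
```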

`params.prompt_cache_key`

Optional. A stable key forwarded to the LLM provider to enable prompt/prefix caching. When set, Spice maps this key into the provider-native caching mechanism:

| Provider                   | Behavior                                                                                   |
| -------------------------- | ------------------------------------------------------------------------------------------ |
| OpenAI / Azure OpenAI      | Passed through on Chat and Responses API requests                                          |
| Anthropic                  | Adds `cache_control: { type: "ephemeral" }` to the request                                 |
| Google Gemini              | Maps to `cached_content.name` (must be a valid cached-content resource name)               |
| xAI (Grok)                 | Sent as the `x-grok-conv-id` HTTP header                                                   |
| AWS Bedrock (Converse)     | Appends a native `CachePoint` block                                                        |
| Databricks (hosted Claude) | Adds Claude-style `cache_control` to the last content part                                 |
| Local (mistral-rs)         | Paged-attention scheduling is enabled automatically on supported backends (CUDA + Unix)    |

`params.prompt_cache_retention`

Optional. Retention hint for prompt caching, applicable to the OpenAI Responses API only. For example, "24h" requests that the cached content be retained for 24 hours.
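The two prompt caching parameters might be combined as in the sketch below, assuming an OpenAI model. The model name and the key value `support-bot-v1` are hypothetical; any stable string could serve as the key.

```yaml
models:
  - name: assistant
    from: openai:gpt-4o
    params:
      # Stable key so requests sharing a prompt prefix can reuse the provider cache.
      prompt_cache_key: support-bot-v1
      # Retention hint; applies to the OpenAI Responses API only.
      prompt_cache_retention: "24h"
```

Keeping the key constant across requests is what makes it useful: a key that changes per request gives the provider nothing to match against.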

`datasets`

Optional. A list of dataset names that this model should be applied to. For ML models, this preselects the dataset to use for inference.

`dependsOn`

Optional. A list of dependencies that must be loaded and available before this model.