Start an interactive or one-shot chat with a model registered in the Spice runtime.
A model must be defined in `spicepod.yaml` and be ready before chatting.

- `--cloud`: Send requests to a Spice Cloud instance instead of the local instance. Default: `false`.
- `--http-endpoint <string>`: Runtime HTTP endpoint. Default: `http://localhost:8090`.
- `--model <string>`: Target model for the chat request. When omitted, the CLI uses the single ready model, or prompts for a choice if several models are ready.
- `--temperature <float32>`: Model temperature used for the chat request. Default: `1`.
- `--user-agent <string>`: Custom User-Agent header sent with every request.
- `--responses`: Direct all chats to the `/v1/responses` endpoint, which exposes configured models that support OpenAI's Responses API and enables access to OpenAI-hosted tools. To learn more about Spice's support for the Responses API, see the OpenAI model provider documentation or the Azure OpenAI model provider documentation.

When exactly one model is ready, `spice chat` opens a REPL that uses that model automatically:
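For example, a minimal sketch (the REPL banner and model name are illustrative, not taken from actual output):

```shell
# One model is ready, so the REPL selects it automatically.
spice chat
```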
When multiple models are ready, the command prompts for a selection before starting the REPL:
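A sketch of that flow (the model names and the exact prompt UI are invented for illustration):

```shell
# Several models are ready; the CLI lists them (e.g. llama3, gpt-4o)
# and waits for a selection before the REPL starts.
spice chat
```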
Passing --model skips the prompt and directs the request to the specified model. The flag works both in REPL mode and in one‑shot mode:
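For instance, pinning the REPL to one model (`llama3` is a placeholder name, not from the source):

```shell
# Skips the selection prompt and chats with the named model.
spice chat --model llama3
```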
Single prompt:
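The source does not show the exact one-shot syntax. One plausible form, assuming the prompt is passed as a positional argument (an assumption, not confirmed here; the model name is also a placeholder):

```shell
# Hypothetical one-shot invocation: send a single prompt and exit,
# with a lower temperature for a more deterministic reply.
spice chat --model llama3 --temperature 0.2 "Summarize the dataset"
```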