Spice.ai runs on Google Cloud Platform (GCP) on Kubernetes, serverless containers, or virtual machines. The container image and Helm chart are the same artefacts used in every other environment, so the choice of GCP service is a matter of operational fit rather than packaging.
For a complete list of GCP-compatible data connectors, AI models, and supported services, see GCP Integrations.
Run Spice on GKE when the workload benefits from Kubernetes orchestration, multi-replica scale, or shared cluster tenancy. GKE pairs with the Spice Helm chart and the Argo CD or Flux GitOps workflows.
The fastest path is gcloud. The example below creates a regional Standard cluster with Workload Identity enabled — required for federated credentials to GCP services.
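A minimal sketch of that cluster creation, wrapped in a shell function so the inputs are explicit. The function name and all argument values are placeholders rather than values from this guide. `--workload-pool` is the gcloud flag that enables Workload Identity on a Standard cluster, and passing `--region` instead of `--zone` makes the cluster regional.

```bash
# Hypothetical helper: create a regional GKE Standard cluster with
# Workload Identity enabled, then fetch kubectl credentials for it.
create_spice_cluster() {
  local project_id="$1"   # GCP project ID (placeholder)
  local region="$2"       # a region, not a zone, so the cluster is regional
  local cluster="$3"      # cluster name (placeholder)

  # --num-nodes is per zone, so a regional cluster starts with one node per zone.
  gcloud container clusters create "$cluster" \
    --project "$project_id" \
    --region "$region" \
    --num-nodes 1 \
    --workload-pool "${project_id}.svc.id.goog"

  # Write a kubeconfig entry so kubectl and helm can reach the new cluster.
  gcloud container clusters get-credentials "$cluster" \
    --project "$project_id" \
    --region "$region"
}

# Example: create_spice_cluster my-project us-central1 spice
```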
For burst or low-utilization workloads, use GKE Autopilot — Google manages the nodes, billing is per-pod, and Workload Identity is enabled by default. For production, prefer Terraform for repeatable provisioning. The terraform-google-modules/kubernetes-engine module is a common starting point.
Most Spice connectors (Cloud Storage via the S3 connector with HMAC, BigQuery via ADBC, Cloud SQL via PostgreSQL or MySQL) accept GCP credentials from Application Default Credentials. Use Workload Identity so pods receive scoped, short-lived tokens without static keys:
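The binding can be sketched as follows, assuming a Google service account (GSA) that Spice acts as and a Kubernetes ServiceAccount (KSA) used by the Spice pods. The function name, account names, and the example role are illustrative. Grant only the roles the Spicepod's connectors actually need.

```bash
# Hypothetical helper: let a Kubernetes ServiceAccount impersonate a
# Google service account through Workload Identity.
bind_workload_identity() {
  local project_id="$1" namespace="$2" ksa="$3" gsa="$4"
  local gsa_email="${gsa}@${project_id}.iam.gserviceaccount.com"

  # 1. Create the Google service account.
  gcloud iam service-accounts create "$gsa" --project "$project_id"

  # 2. Grant it a role the Spicepod needs (example role; adjust as required).
  gcloud projects add-iam-policy-binding "$project_id" \
    --member "serviceAccount:${gsa_email}" \
    --role "roles/bigquery.dataViewer"

  # 3. Allow the KSA in the given namespace to act as the GSA.
  gcloud iam service-accounts add-iam-policy-binding "$gsa_email" \
    --project "$project_id" \
    --role "roles/iam.workloadIdentityUser" \
    --member "serviceAccount:${project_id}.svc.id.goog[${namespace}/${ksa}]"
}

# Example: bind_workload_identity my-project spice spiceai spice-runtime
```

The KSA must also carry the `iam.gke.io/gcp-service-account` annotation pointing at the GSA email before pods receive tokens for it.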
Reference the service account from the Helm release so pods inherit federated tokens via the standard ADC chain:
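A sketch of such a Helm release. The chart repository URL, release name, and the `serviceAccount.*` value keys are assumptions here; check them against the chart's own values.yaml before use. Only the `iam.gke.io/gcp-service-account` annotation key is fixed by GKE.

```bash
# Hypothetical helper: install or upgrade the Spice chart with a
# ServiceAccount annotated for Workload Identity.
install_spice_chart() {
  local project_id="$1" namespace="$2" ksa="$3" gsa="$4"
  local gsa_email="${gsa}@${project_id}.iam.gserviceaccount.com"

  # Assumed chart repository; confirm against the Spice Helm documentation.
  helm repo add spiceai https://helm.spiceai.org
  helm repo update

  # The serviceAccount.* keys are assumed value names. Dots inside the
  # annotation key are escaped so Helm treats it as a single key.
  helm upgrade --install spiceai spiceai/spiceai \
    --namespace "$namespace" --create-namespace \
    --set serviceAccount.create=true \
    --set serviceAccount.name="$ksa" \
    --set "serviceAccount.annotations.iam\.gke\.io/gcp-service-account=${gsa_email}"
}

# Example: install_spice_chart my-project spice spiceai spice-runtime
```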
For declarative GitOps, swap this command for an Argo CD Application or a Flux HelmRelease pointing at the same chart. See the Argo CD or Flux guides for full manifests.
For stateful acceleration (DuckDB, SQLite, Cayenne):
- Local SSD (n2-standard-*-lssd, c3-standard-*-lssd, z3 series). Expose Local SSDs through GKE's Local SSD raw block / ephemeral storage provisioner. Local SSDs do not survive node replacement, so pair with a refresh strategy or a re-hydration source.
- Hyperdisk (type: hyperdisk-balanced or hyperdisk-extreme).
- Persistent Disk SSD (pd-ssd, premium-rwo): use the built-in premium-rwo storage class only when Hyperdisk is unavailable in a region.
- Filestore (filestore-csi): not recommended for acceleration. Use it only for stateless shared artefacts that need ReadWriteMany, because NFS latency negates the benefit of using a local accelerator.

:::tip[Spice.ai Enterprise]
For production stateful workloads, the Spice.ai Enterprise Operator's SpicepodSet provides per-replica StatefulSets with automatic PVC resizing, Workload-Identity-aware ServiceAccount annotations, and configurable update strategies. For distributed query execution across scheduler/executor tiers backed by Cloud Storage, see SpicepodCluster.
:::
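For the Hyperdisk option, a StorageClass along these lines could back the accelerator volumes. This is a sketch: the class name is a placeholder, and the binding mode and expansion settings are common choices rather than requirements stated in this guide. `pd.csi.storage.gke.io` is the Compute Engine persistent disk CSI driver that GKE uses for Hyperdisk volumes.

```bash
# Hypothetical helper: create a Hyperdisk Balanced StorageClass.
apply_hyperdisk_storage_class() {
  local name="$1"   # StorageClass name (placeholder)

  kubectl apply -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ${name}
provisioner: pd.csi.storage.gke.io
parameters:
  type: hyperdisk-balanced
# Delay provisioning until a pod is scheduled so the disk lands in its zone.
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
EOF
}

# Example: apply_hyperdisk_storage_class spice-hyperdisk
```

The resulting class name is what `stateful.storageClass` in values.yaml would reference.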
To expose Spice externally, install the GKE Gateway controller or use a Cloud Load Balancer Service:
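A minimal sketch of the LoadBalancer Service route. The Service name and the pod selector label are assumptions and must match the labels the chart actually applies. Port 8090 is the Spice HTTP port used elsewhere in this guide. Passing `internal` as the second argument adds GKE's `networking.gke.io/load-balancer-type: "Internal"` annotation.

```bash
# Hypothetical helper: expose Spice through a LoadBalancer Service.
# Usage: expose_spice <namespace> [external|internal]
expose_spice() {
  local namespace="$1" scheme="${2:-external}"
  local annotations=""

  if [ "$scheme" = "internal" ]; then
    # Ask GKE for an internal passthrough load balancer on the cluster's VPC.
    annotations='  annotations:
    networking.gke.io/load-balancer-type: "Internal"'
  fi

  kubectl apply -n "$namespace" -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: spiceai-lb
${annotations}
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: spiceai   # assumption: match your pod labels
  ports:
    - name: http
      port: 8090
      targetPort: 8090
EOF
}

# Example: expose_spice spice internal
```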
For internal-only deployments, set the Service annotation networking.gke.io/load-balancer-type to Internal so the load balancer binds to the cluster's VPC rather than a public IP.
The Spice Helm chart ships a PodMonitor resource for the Prometheus Operator. On GKE, Google Cloud Managed Service for Prometheus is the common target — it ingests PodMonitor resources directly when managed collection is enabled. Set monitoring.podMonitor.enabled: true and import the Spice Grafana dashboard into Cloud Monitoring or self-managed Grafana.
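Enabling the PodMonitor on an existing release might look like the sketch below. The release and chart names are assumptions carried over from a typical install. The `monitoring.podMonitor.enabled` key is the one named above.

```bash
# Hypothetical helper: turn on the chart's PodMonitor for an existing release.
enable_spice_pod_monitor() {
  local namespace="$1"

  # --reuse-values keeps the release's current settings and changes only this key.
  helm upgrade spiceai spiceai/spiceai \
    --namespace "$namespace" \
    --reuse-values \
    --set monitoring.podMonitor.enabled=true
}

# Example: enable_spice_pod_monitor spice
```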
For comprehensive guidance, refer to the GKE documentation, GKE security best practices, and the Spice.ai Kubernetes Deployment Guide.
Cloud Run is a serverless container platform suitable for HTTP-driven Spice.ai workloads that benefit from scale-to-zero and request-based autoscaling. Use it when a single managed container is sufficient and operating Kubernetes is not desired.
Create a service account with the IAM roles the Spicepod requires. Cloud Run attaches it to the service so the runtime authenticates via Application Default Credentials without static keys:
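A sketch of that identity setup. The function name, account name, and bucket are placeholders, and the two roles are examples only: read access to one bucket, plus Secret Manager access so Cloud Run can resolve mounted secrets.

```bash
# Hypothetical helper: create the runtime identity for a Cloud Run service.
create_cloud_run_identity() {
  local project_id="$1" gsa="$2" bucket="$3"
  local gsa_email="${gsa}@${project_id}.iam.gserviceaccount.com"

  gcloud iam service-accounts create "$gsa" \
    --project "$project_id" \
    --display-name "Spice runtime"

  # Example data role: read-only access to a single bucket.
  gcloud storage buckets add-iam-policy-binding "gs://${bucket}" \
    --member "serviceAccount:${gsa_email}" \
    --role "roles/storage.objectViewer"

  # Needed for secrets referenced from Secret Manager.
  gcloud projects add-iam-policy-binding "$project_id" \
    --member "serviceAccount:${gsa_email}" \
    --role "roles/secretmanager.secretAccessor"
}

# Example: create_cloud_run_identity my-project spice-run my-data-bucket
```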
Cloud Run pulls the Spice.ai container image directly. Mount secrets from Secret Manager and configure HTTP ingress on port 8090:
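A deployment along these lines is one way to do it. The service name, environment variable name, and secret name are hypothetical, and the `latest` tag is used only for brevity; pin a released version in practice. How the Spicepod itself reaches the container is outside this sketch.

```bash
# Hypothetical helper: deploy the Spice image to Cloud Run.
deploy_spice_cloud_run() {
  local project_id="$1" region="$2" service="$3" gsa_email="$4"

  # --set-secrets maps a Secret Manager secret version to an env var.
  # EXAMPLE_SECRET and example-secret are placeholder names.
  gcloud run deploy "$service" \
    --project "$project_id" \
    --region "$region" \
    --image "docker.io/spiceai/spiceai:latest" \
    --port 8090 \
    --service-account "$gsa_email" \
    --set-secrets "EXAMPLE_SECRET=example-secret:latest" \
    --no-allow-unauthenticated
}

# Example:
#   deploy_spice_cloud_run my-project us-central1 spice \
#     spice-run@my-project.iam.gserviceaccount.com
```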
To run multiple replicas with shared file-based acceleration, mount Cloud Storage with FUSE and point file accelerators at the mount path (for example, duckdb_file: /data/taxi_trips.db). Cloud Storage volume latency is significantly higher than local SSD, so prefer GKE for latency-sensitive accelerated workloads.
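The mount can be sketched as an update to an existing service. The volume name and bucket are placeholders. The `/data` mount path matches the `duckdb_file` example above, and Cloud Storage volume mounts require the second-generation execution environment.

```bash
# Hypothetical helper: mount a Cloud Storage bucket at /data on a service.
mount_gcs_volume() {
  local project_id="$1" region="$2" service="$3" bucket="$4"

  gcloud run services update "$service" \
    --project "$project_id" \
    --region "$region" \
    --execution-environment gen2 \
    --add-volume "name=accel,type=cloud-storage,bucket=${bucket}" \
    --add-volume-mount "volume=accel,mount-path=/data"
}

# Example: mount_gcs_volume my-project us-central1 spice my-accel-bucket
```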
Cloud Run scales by concurrent requests per instance (default 80). For background workloads (refresh schedules, ingestion) that should not scale to zero, set --min-instances 1. For workloads with long-running connections (Arrow Flight, streaming refresh), set --no-cpu-throttling and tune --concurrency to match the runtime's request profile.
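Those three settings can be applied together on an existing service, as in this sketch. The function name and the concurrency value passed in are illustrative; the right number depends on the workload.

```bash
# Hypothetical helper: keep one instance warm, keep CPU allocated outside
# request handling, and set per-instance request concurrency.
tune_spice_cloud_run() {
  local project_id="$1" region="$2" service="$3" concurrency="$4"

  gcloud run services update "$service" \
    --project "$project_id" \
    --region "$region" \
    --min-instances 1 \
    --no-cpu-throttling \
    --concurrency "$concurrency"
}

# Example: tune_spice_cloud_run my-project us-central1 spice 40
```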
Cloud Run uses startup and liveness probes — point them at /health and /v1/ready. Each gcloud run deploy creates a new revision; use traffic splitting for canary upgrades:
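A canary rollout might look like the following sketch: deploy the new revision with no traffic under a tag, shift a percentage to that tag, then promote. The function names, the `canary` tag, and the argument values are placeholders.

```bash
# Hypothetical helper: deploy a new revision and send it a share of traffic.
canary_spice_revision() {
  local project_id="$1" region="$2" service="$3" image="$4" percent="$5"

  # Create the revision without routing any traffic to it yet.
  gcloud run deploy "$service" \
    --project "$project_id" \
    --region "$region" \
    --image "$image" \
    --no-traffic \
    --tag canary

  # Route the requested percentage to the tagged revision.
  gcloud run services update-traffic "$service" \
    --project "$project_id" \
    --region "$region" \
    --to-tags "canary=${percent}"
}

# Hypothetical helper: send all traffic to the latest revision.
promote_spice_revision() {
  local project_id="$1" region="$2" service="$3"

  gcloud run services update-traffic "$service" \
    --project "$project_id" \
    --region "$region" \
    --to-latest
}

# Example:
#   canary_spice_revision my-project us-central1 spice docker.io/spiceai/spiceai:latest 10
#   promote_spice_revision my-project us-central1 spice
```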
For more details, see the Cloud Run documentation and the Spice.ai Docker Deployment Guide.
Deploy Spice directly on Compute Engine for maximum control over the environment, GPU access, or large-memory machine types.
Manual VM deployment:
- Install the spice binary directly. See the installation guide.

Automated deployment with Terraform or Deployment Manager:

- Use cloud-init to install Docker, pull the Spice.ai image, retrieve secrets from Secret Manager, and start the runtime.

For detailed guidance, refer to the Compute Engine documentation, the Container-Optimized OS guide, and the Google provider for Terraform.
Most GCP services that Spice connects to accept explicit credentials through component parameters (for example, iceberg_gcs_credentials on the Iceberg connector). When explicit credentials are not provided, Spice follows the standard Application Default Credentials chain:
1. GOOGLE_APPLICATION_CREDENTIALS: path to a service account JSON key file. Common in local development; not recommended for production.
2. gcloud CLI credentials: cached credentials from gcloud auth application-default login. Common during development.

For services with explicit parameters (Cloud Storage HMAC, BigQuery service account JSON), prefer named credentials or Workload Identity over GOOGLE_APPLICATION_CREDENTIALS files in production.
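To check which source would win on a given machine, the lookup order can be mirrored in a few lines of shell. This is an illustration of the documented Application Default Credentials search order, not how Spice implements it. The function name is hypothetical, the file path is the gcloud default on Linux and macOS, and the final fallback stands for the attached service account served by the metadata server (as on GKE with Workload Identity or on Cloud Run).

```bash
# Hypothetical helper: report which ADC source would be tried first.
adc_source() {
  # Default location written by `gcloud auth application-default login`
  # on Linux and macOS (assumes CLOUDSDK_CONFIG is not overridden).
  local well_known="${HOME}/.config/gcloud/application_default_credentials.json"

  if [ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    echo "env:${GOOGLE_APPLICATION_CREDENTIALS}"
  elif [ -f "$well_known" ]; then
    echo "gcloud:${well_known}"
  else
    # No key file or cached login: ADC falls back to the attached service account.
    echo "metadata-server"
  fi
}

# Example: adc_source
```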
:::note[IAM role bindings]
Regardless of the credential source, the principal must have the appropriate IAM role bindings (for example, roles/storage.objectViewer on a bucket, or roles/bigquery.dataViewer on a BigQuery dataset). When a Spicepod connects to multiple GCP services, the principal must have permissions across all of them.
:::
Spice.ai is not yet published to Google Cloud Marketplace (coming soon). In the meantime, deploy using the spiceai/spiceai container image or the Spice Helm chart.
For stateful acceleration, set stateful.enabled: true and stateful.storageClass: <chosen-class> in values.yaml.