Spice.ai provides multiple deployment options on Amazon Web Services (AWS), enabling data and AI applications to run on AWS's elastic infrastructure. Whether the target is virtual machines, container orchestration, or managed services, Spice can be deployed to meet requirements for performance, scalability, and cost efficiency.
For a complete list of AWS-compatible data connectors, AI models, vector stores, and secret management, see AWS Integrations.
Run Spice.ai on Amazon EKS when the workload benefits from Kubernetes orchestration, multi-replica scale, declarative configuration, or shared cluster tenancy. EKS pairs well with the Spice Helm chart and with Argo CD or Flux GitOps workflows.
The fastest path is eksctl, which provisions the VPC, IAM roles, and node groups in a single command:
--with-oidc enables the OIDC provider required for IAM Roles for Service Accounts (IRSA). For production, prefer Terraform or CloudFormation for repeatable provisioning. The community terraform-aws-modules/eks module is a common starting point.
For burst or low-utilization workloads, attach an EKS Fargate profile so Spice pods run on serverless capacity instead of managed nodes.
Most Spice connectors (S3, DynamoDB, Bedrock, Glue) accept AWS credentials from the environment. Use IRSA so pods receive scoped, short-lived credentials without static keys:
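IRSA works through an IAM trust policy that allows exactly one Kubernetes service account to assume the role through the cluster's OIDC provider. The following is a minimal sketch of that trust policy, built with the Python standard library only. The account ID, OIDC issuer, namespace, and service account name are placeholders, and the helper name irsa_trust_policy is hypothetical rather than part of Spice or any AWS tool.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Build an IAM trust policy scoped to a single Kubernetes service account.

    oidc_provider is the cluster's OIDC issuer without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Only tokens issued for STS are accepted.
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                        # Only this namespace/service account pair may assume the role.
                        f"{oidc_provider}:sub": (
                            f"system:serviceaccount:{namespace}:{service_account}"
                        ),
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    policy = irsa_trust_policy(
        account_id="111122223333",
        oidc_provider="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",
        namespace="spice",
        service_account="spiceai",
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be used as the role's trust relationship. The service account itself is then annotated with eks.amazonaws.com/role-arn so that pods using it receive the role's credentials.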
Reference the service account from the Helm release so Spice pods inherit the role:
For EKS Pod Identity (the newer alternative to IRSA), associate the role with aws eks create-pod-identity-association and skip the OIDC setup step.
For declarative GitOps, swap this command for an Argo CD Application or a Flux HelmRelease pointing at the same chart. See the Argo CD or Flux guides for full manifests.
For stateful acceleration (DuckDB, SQLite, Cayenne):
- Local NVMe instance store: use NVMe-backed instance types (i4i, i7ie, m6id, m7gd, c7gd, r7gd and other d-suffixed instances). Provision the NVMe local volume with the Local Volume Static Provisioner or use Bottlerocket's local-volume-provisioner to expose it as a local-storage StorageClass. Local volumes do not survive node replacement, so pair them with a refresh strategy or a re-hydration source.
- Amazon EBS io2: io2 delivers up to 256K IOPS and sub-millisecond latency. Use the Amazon EBS CSI driver and a custom StorageClass with type: io2 and a provisioned iops value.
:::tip
Amazon EFS works for sharing data across replicas but is not recommended for accelerations: NFS-style latency negates the benefit of using an accelerator. Reserve EFS for stateless artifacts that need to survive pod replacement.
:::
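A custom io2 StorageClass for the Amazon EBS CSI driver can be sketched as follows. This is an illustrative sketch that assumes the driver is already installed in the cluster. The class name spice-io2 and the IOPS figure are placeholder values, and the helper io2_storage_class is hypothetical.

```python
import json


def io2_storage_class(name: str, iops: int) -> dict:
    """Build a StorageClass manifest for io2 volumes served by the EBS CSI driver."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "ebs.csi.aws.com",
        # StorageClass parameter values must be strings.
        "parameters": {"type": "io2", "iops": str(iops)},
        # Delay volume creation until the pod is scheduled, so the volume
        # lands in the same Availability Zone as the node.
        "volumeBindingMode": "WaitForFirstConsumer",
        "allowVolumeExpansion": True,
        "reclaimPolicy": "Delete",
    }


if __name__ == "__main__":
    # Placeholder name and IOPS value; size them for the actual workload.
    print(json.dumps(io2_storage_class("spice-io2", 16000), indent=2))
```

Because kubectl accepts JSON manifests, the output can be saved to a file and applied with kubectl apply -f. The resulting class name is what the Helm chart's storage class setting would reference.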
:::tip[Spice.ai Enterprise]
For production stateful workloads, the Spice.ai Enterprise Operator's SpicepodSet provides per-replica StatefulSets with automatic PVC resizing, IRSA-aware ServiceAccount annotations, and configurable update strategies. For distributed query execution across scheduler/executor tiers backed by S3, see SpicepodCluster.
:::
To expose Spice externally, install the AWS Load Balancer Controller and front the Spice Service with a Network Load Balancer:
The Spice Helm chart ships a PodMonitor resource for the Prometheus Operator. For EKS, the kube-prometheus-stack chart and Amazon Managed Service for Prometheus are common targets. Set monitoring.podMonitor.enabled: true and import the Spice Grafana dashboard.
For comprehensive guidance, refer to the Amazon EKS User Guide, the EKS Best Practices Guide, and the Spice.ai Kubernetes Deployment Guide.
Deploy Spice.ai directly on Amazon EC2 instances for maximum control over the environment.
Manual EC2 Deployment:
Automated EC2 Deployment with CloudFormation:
- Use UserData to automate Docker installation, pull the Spice.ai Docker image, retrieve configuration or secrets from AWS Parameter Store or Secrets Manager, and run the container with the required environment variables.

For detailed guidance and best practices, refer to the AWS CloudFormation User Guide, EC2 User Guide for Linux Instances, and AWS Systems Manager Parameter Store Documentation.
Run Spice.ai on Amazon ECS when a single managed container is sufficient and operating Kubernetes is not desired. ECS Fargate provides serverless capacity; ECS on EC2 provides full control over the host. Both consume the same task definition.
Create a task definition for the spiceai/spiceai image, exposing port 8090 (HTTP) and, optionally, 50051 (Arrow Flight) and 9090 (Prometheus). Inject secrets from AWS Secrets Manager or SSM Parameter Store instead of baking them into the image.
spiceai-task.json:
Register the task:
The executionRoleArn (typically ecsTaskExecutionRole) needs secretsmanager:GetSecretValue and ssm:GetParameters permissions to inject secrets. The taskRoleArn is the role Spice itself assumes at runtime — grant it the AWS permissions the Spicepod needs (for example, s3:GetObject on referenced buckets, bedrock:InvokeModel for Bedrock models).
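The shape of such a task definition can be sketched as below. This is a hedged illustration rather than the exact file the guide uses. The CPU and memory sizes, the role and secret ARNs, and the environment variable name SPICE_SECRET_EXAMPLE are placeholders, and the helper spice_task_definition is hypothetical.

```python
import json


def spice_task_definition(execution_role_arn: str, task_role_arn: str,
                          secret_arn: str) -> dict:
    """Build a Fargate-compatible ECS task definition for the spiceai/spiceai image."""
    return {
        "family": "spiceai",
        "networkMode": "awsvpc",
        "requiresCompatibilities": ["FARGATE"],
        "cpu": "1024",     # 1 vCPU (placeholder sizing)
        "memory": "2048",  # 2 GiB (placeholder sizing)
        # Used by ECS to pull the image and inject secrets.
        "executionRoleArn": execution_role_arn,
        # Assumed by Spice at runtime to reach S3, Bedrock, and so on.
        "taskRoleArn": task_role_arn,
        "containerDefinitions": [
            {
                "name": "spiceai",
                "image": "spiceai/spiceai",
                "essential": True,
                "portMappings": [
                    {"containerPort": 8090, "protocol": "tcp"},   # HTTP
                    {"containerPort": 50051, "protocol": "tcp"},  # Arrow Flight
                    {"containerPort": 9090, "protocol": "tcp"},   # Prometheus
                ],
                # The secret value is resolved by ECS at task start,
                # so it never appears in the image or the task definition.
                "secrets": [
                    {"name": "SPICE_SECRET_EXAMPLE", "valueFrom": secret_arn}
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARNs for illustration only.
    task = spice_task_definition(
        "arn:aws:iam::111122223333:role/ecsTaskExecutionRole",
        "arn:aws:iam::111122223333:role/spiceai-task-role",
        "arn:aws:secretsmanager:us-east-1:111122223333:secret:example",
    )
    print(json.dumps(task, indent=2))
```

Keeping the two roles as separate parameters mirrors the split described above: the execution role needs only secret-read permissions, while the task role carries the data-access permissions the Spicepod requires.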
Front the service with a Network Load Balancer (low-latency TCP) or an Application Load Balancer (HTTP routing, TLS termination). For internal-only deployments, place the service in private subnets and set assignPublicIp=DISABLED.
Spice accelerations are latency- and IOPS-sensitive. Choose the storage type based on launch type and sharing requirements:
- Local NVMe instance store (EC2 launch type): use NVMe-backed instance types (i4i, i7ie, m6id, m7gd, c7gd, r7gd, etc.) and bind-mount the NVMe device into the task. This delivers the lowest latency and highest IOPS available on AWS but does not survive instance replacement, so pair it with a refresh strategy.
- Amazon EBS volume attached to an ECS service (EC2 launch type): use the EBS volume task configuration with volumeType: io2 for high-IOPS, low-latency block storage that survives task restarts. Fall back to gp3 (with provisioned IOPS) when io2 is unavailable in the region.
- Amazon S3 Express One Zone (Cayenne only): for Cayenne acceleration that needs to be shared across tasks or persisted independently of task lifecycle, S3 Express One Zone provides single-digit-millisecond latency. Configure Cayenne against an S3 Express directory bucket; see the Cayenne acceleration documentation.
- Amazon EFS (Fargate-only fallback): EFS is the only persistent storage option supported by Fargate, but its NFS-style latency is not recommended for accelerations. Use it only for stateless artifacts that must survive task replacement, or switch to the EC2 launch type when low-latency local storage is required.

In the Spicepod, point file accelerators at /data, for example duckdb_file: /data/taxi_trips.db.
Configure service auto-scaling on average CPU or on custom CloudWatch metrics derived from the Spice /v1/metrics endpoint:
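For the CPU-based case, a target-tracking setup can be sketched as two request payloads for Application Auto Scaling. The sketch below only builds the payloads and makes no AWS calls. The cluster and service names, capacity bounds, cooldowns, and target value are illustrative assumptions, and the helper cpu_target_tracking is hypothetical. Scaling on custom metrics from /v1/metrics is not covered here.

```python
import json


def cpu_target_tracking(cluster: str, service: str,
                        target_cpu_percent: float = 60.0,
                        min_tasks: int = 1, max_tasks: int = 4):
    """Build Application Auto Scaling payloads for an ECS service.

    Returns (scalable_target, scaling_policy) as plain dictionaries.
    """
    resource_id = f"service/{cluster}/{service}"
    scalable_target = {
        "ServiceNamespace": "ecs",
        "ResourceId": resource_id,
        "ScalableDimension": "ecs:service:DesiredCount",
        "MinCapacity": min_tasks,
        "MaxCapacity": max_tasks,
    }
    scaling_policy = {
        "PolicyName": f"{service}-cpu-target-tracking",
        "ServiceNamespace": "ecs",
        "ResourceId": resource_id,
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_cpu_percent,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
            # Scale out quickly, scale in slowly (placeholder cooldowns, in seconds).
            "ScaleOutCooldown": 60,
            "ScaleInCooldown": 300,
        },
    }
    return scalable_target, scaling_policy


if __name__ == "__main__":
    target, policy = cpu_target_tracking("example-cluster", "spiceai")
    print(json.dumps(target, indent=2))
    print(json.dumps(policy, indent=2))
```

If boto3 is available, the dictionaries map onto the keyword arguments of the application-autoscaling client's register_scalable_target and put_scaling_policy calls. A slower scale-in cooldown is a common choice for accelerated datasets, since a new task may need time to load data before it serves queries efficiently.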
For comprehensive details, see the Amazon ECS Developer Guide and the Spice.ai Docker Deployment Guide.
Most AWS services that Spice connects to have explicit parameters for configuring authentication (usually by setting an access_key_id and secret_access_key). If explicit credentials are not provided, Spice follows the standard AWS SDK behavior for loading credentials from the environment based on the following sources in order:
Environment Variables:
AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN (if using temporary credentials)
Shared AWS Config/Credentials Files:
Config file: ~/.aws/config (Linux/Mac) or %UserProfile%\.aws\config (Windows)
Credentials file: ~/.aws/credentials (Linux/Mac) or %UserProfile%\.aws\credentials (Windows)
The AWS_PROFILE environment variable can be used to specify a named profile, otherwise the [default] profile is used.
Supports both static credentials and SSO sessions
Example credentials file:
:::tip
To set up SSO authentication:
1. Run aws configure sso to configure a new SSO profile.
2. Set AWS_PROFILE=sso-profile so the new profile is used.
3. Run aws sso login --profile sso-profile to start a new SSO session.
:::
AWS STS Web Identity Token Credentials:
The connector will try each source in order until valid credentials are found. If no valid credentials are found, an authentication error will be returned.
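When debugging an authentication error, it can help to check which source is likely to be picked up first. The sketch below is a rough diagnostic approximation, not the AWS SDK's actual resolution logic. It only inspects environment variables and file presence, it does not validate credentials, and it does not probe the instance metadata service. The function name first_credential_source is hypothetical. The web identity variables AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN are the ones set for pods using IRSA.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def first_credential_source(env: Mapping[str, str], home: Path) -> Optional[str]:
    """Return the first credential source that appears to be configured.

    Checks sources in the order the chain tries them. Returns None when
    nothing local is configured, in which case only IMDS could still apply.
    """
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment variables"

    aws_dir = home / ".aws"
    if (aws_dir / "credentials").is_file() or (aws_dir / "config").is_file():
        # Simplification: the real chain also requires the selected
        # profile to contain usable credentials.
        return "shared config/credentials files"

    if env.get("AWS_WEB_IDENTITY_TOKEN_FILE") and env.get("AWS_ROLE_ARN"):
        return "STS web identity token"

    if (env.get("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI")
            or env.get("AWS_CONTAINER_CREDENTIALS_FULL_URI")):
        return "ECS container credentials"

    return None


if __name__ == "__main__":
    source = first_credential_source(os.environ, Path.home())
    print(source or "nothing configured locally; IMDS is the only remaining source")
```

Passing the environment and home directory as arguments keeps the check easy to run against a captured container environment instead of the local shell.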
:::note[IAM Permissions]
Regardless of the credential source, the IAM role or user must have appropriate permissions (e.g., s3:ListBucket, s3:GetObject) to access the service. If the Spicepod connects to multiple different AWS services, the permissions should cover all of them.
:::
When choosing an EBS-backed StorageClass for stateful acceleration on EKS, use gp3 (with provisioned IOPS raised above the 3,000 baseline) only when io2 is unavailable in a region or when cost outweighs the latency improvement. Then set stateful.enabled: true and stateful.storageClass: <chosen-class> in values.yaml.

The credential chain ends with two further sources:
- ECS Container Credentials: read from AWS_CONTAINER_CREDENTIALS_RELATIVE_URI or AWS_CONTAINER_CREDENTIALS_FULL_URI, which are automatically injected by ECS.
- AWS EC2 Instance Metadata Service (IMDSv2)