Spice.ai Now Supports Amazon S3 Vectors For Vector Search at Petabyte Scale!

Roger Frey

President and COO

Today, we’re announcing native support for Amazon S3 Vectors in the Spice.ai Open Source data and AI compute engine.

As an AWS Startup Partner and AWS Marketplace Seller, Spice AI partners with AWS across technology integration, joint go-to-market, and co-selling to deliver solutions for enterprise customers that address real-world data challenges, accelerating the delivery of AI-native applications on AWS.

The Spice.ai S3 Vectors integration arrives alongside AWS’s announcement of the public preview of Amazon S3 Vectors, a new S3 bucket type designed for vector embeddings, complete with a query endpoint and metadata service. Developers can now configure Spice.ai to use S3 Vectors as a vector database backend, for simple, efficient storage, indexing, and querying of embeddings directly from S3.

Figure 1. Spice.ai S3 Vectors integration.

What is Vector Similarity Search?

Vector similarity search retrieves data by comparing similarities in multi-dimensional representations, instead of relying on exact keyword or value matches. This method powers semantic search, recommendation systems, and retrieval-augmented generation (RAG) in AI applications.

The process works as follows:

Convert data to vectors: Turn items like text, images, or audio into vectors, which are arrays of numbers that capture the data's core meaning or features. Machine learning models handle this conversion, known as embedding. Examples include Amazon Titan Embeddings or Cohere Embeddings via Amazon Bedrock, or MiniLM L6 available on Hugging Face.
Store the vectors: Store the embeddings in a specialized vector database or index designed for fast similarity queries.
Query with a vector: Convert the user's query (e.g., a phrase or image) into a vector. The system then identifies the closest matches using distance measures such as cosine similarity, Euclidean distance, or dot product.

This approach provides precise, context-aware data retrieval from vast unstructured datasets. It supports AI applications that prioritize understanding over simple matching.
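The three steps above can be sketched in a few lines of Python. This is an illustrative toy, not how Spice.ai or S3 Vectors implement search: the three-dimensional vectors are hand-written stand-ins for real model embeddings, and the document keys and the `top_k` helper are hypothetical names chosen for the example.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    """Straight-line distance between a and b (0.0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Steps 1 and 2: "embedded" items stored in a toy in-memory index.
# These vectors are made up for illustration; a real system would get
# them from an embedding model.
index = {
    "doc-cats": [0.9, 0.1, 0.0],
    "doc-dogs": [0.8, 0.3, 0.1],
    "doc-tax": [0.0, 0.2, 0.9],
}

def top_k(query_vector, index, k=2):
    """Step 3: rank stored vectors by cosine similarity to the query."""
    scored = [(key, cosine_similarity(query_vector, vec)) for key, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# A made-up query vector that points in roughly the same direction as the pet documents.
results = top_k([1.0, 0.2, 0.0], index)
for key, score in results:
    print(f"{key}: {score:.3f}")
```

A production vector index replaces the linear scan in `top_k` with an approximate nearest-neighbor structure, but the ranking idea is the same.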

With the S3 Vectors integration, the Spice.ai runtime fully manages this process and the lifecycle of vectors, and provides an intuitive SQL interface for querying.

Amazon S3 Vectors

Amazon S3 Vectors, launched in public preview on July 15, 2025, makes S3 the first cloud object store with native vector storage and querying, extending AWS object storage to semantic search and retrieval. It provides vector indexes within buckets for organizing embeddings, PUT APIs for uploads, and query APIs for similarity searches using metrics like cosine distance.
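For readers who want to see what the upload and query APIs look like when called directly, here is a hedged sketch. It assumes the AWS SDK for Python (boto3) exposes an `s3vectors` client with `put_vectors` and `query_vectors` operations and the parameter names shown; check the current SDK reference before relying on them, since the service is in preview. The helper names and the bucket and index arguments are hypothetical. With the Spice.ai integration, none of this code is needed, because the runtime makes these calls on your behalf.

```python
def build_put_request(bucket, index, items):
    """Build keyword arguments for an assumed s3vectors put_vectors call.

    items is an iterable of (key, embedding, metadata) tuples.
    """
    return {
        "vectorBucketName": bucket,
        "indexName": index,
        "vectors": [
            {
                "key": key,
                "data": {"float32": [float(x) for x in embedding]},
                "metadata": dict(metadata),
            }
            for key, embedding, metadata in items
        ],
    }

def build_query_request(bucket, index, query_embedding, top_k=3, metadata_filter=None):
    """Build keyword arguments for an assumed s3vectors query_vectors call."""
    request = {
        "vectorBucketName": bucket,
        "indexName": index,
        "queryVector": {"float32": [float(x) for x in query_embedding]},
        "topK": top_k,
        "returnDistance": True,
        "returnMetadata": True,
    }
    if metadata_filter:
        request["filter"] = metadata_filter
    return request

def upload_and_search(client, bucket, index, items, query_embedding, top_k=3):
    """Upload embeddings, then return the nearest matches for a query vector.

    client is assumed to be created with boto3.client("s3vectors"); it is
    passed in so the function can also be exercised with a stub.
    """
    client.put_vectors(**build_put_request(bucket, index, items))
    response = client.query_vectors(
        **build_query_request(bucket, index, query_embedding, top_k=top_k)
    )
    # Assumption: matches are returned under the "vectors" key.
    return response["vectors"]
```

Separating the request builders from the client call keeps the payload shape easy to inspect and test without AWS credentials.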

S3 Vectors reduces the cost of uploading, storing, and querying vectors by up to 90% versus alternatives, and supports AI agents, inference, and semantic search on S3 content with sub-second query performance at petabyte scale. It retains S3's elasticity, durability, and separation of compute and storage: vectors stay in durable storage while queries run on transient resources, which avoids monolithic databases and the cost of idle capacity. It suits tasks like matching scenes in video archives, clustering business documents, or detecting patterns in medical images. It uses a new bucket type with dedicated APIs, requires no provisioning, and scales to 10,000 indexes per bucket.

Spice.ai’s Integration with Amazon S3 Vectors

Spice.ai's integration with Amazon S3 Vectors simplifies and accelerates application development.

With native support for S3 Vectors, Spice developers can configure datasets via YAML to use S3 Vectors as the vector storage engine, annotating columns with hosted embedding models including Amazon Titan Embeddings or Cohere Embeddings via Amazon Bedrock, or self-hosted models like MiniLM L6 from Hugging Face.

Figure 2. Simple YAML configuration of S3 Vectors in Spice.ai.


The Spice.ai runtime manages the full vector lifecycle: it ingests data from disparate enterprise sources like files, databases, and data lakes, embeds it using the specified models, and pushes the embeddings into S3 vector buckets. Applications query via SQL (e.g., SELECT * FROM vector_search(table, 'search query') WHERE condition ORDER BY score), with push-down optimization for efficiency, or via HTTP APIs, while the runtime handles indexing.

Figure 3. Using the vector_search SQL function in Spice Cloud for semantic search.
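As a rough sketch of what an application-side call might look like, the Python below builds a vector_search query and posts it to the runtime over HTTP. Several details are assumptions rather than facts from this post: the `/v1/sql` path, the `http://localhost:8090` address, the plain-text request body, and the JSON response. The `DESC` ordering also assumes that a higher score means a closer match. The `reviews` table name and both helper functions are hypothetical.

```python
import json
import urllib.request

def build_vector_search_sql(table, query_text, limit=5):
    """Return a vector_search SQL statement for the given table and phrase."""
    # Only allow simple identifiers so the table name cannot inject SQL.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    # Escape single quotes in the search phrase using standard SQL doubling.
    escaped = query_text.replace("'", "''")
    return (
        f"SELECT * FROM vector_search({table}, '{escaped}') "
        f"ORDER BY score DESC LIMIT {int(limit)}"  # DESC is an assumption
    )

def run_sql(sql, base_url="http://localhost:8090"):
    """POST the SQL text to an assumed /v1/sql endpoint and decode JSON rows."""
    request = urllib.request.Request(
        f"{base_url}/v1/sql",
        data=sql.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))

# Building the statement needs no running server; run_sql does.
sql = build_vector_search_sql("reviews", "customer's shipping issue", limit=3)
print(sql)
```

Keeping the statement builder separate from the HTTP call means the query text can be logged or reused with any other SQL client the runtime supports.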

The Spice S3 Vectors integration lets applications use S3's vector capabilities with minimal application code and no operational overhead. It can also be combined with Spice.ai's existing keyword and full-text (BM25) search capabilities.

Demo of Amazon S3 Vectors in Spice.ai Open Source

Availability

S3 Vectors support is available today in the v1.5.0 release of Spice.ai Open Source and Spice Cloud!

To learn more about S3 Vectors in Spice, visit spiceai.org/docs/components/vectors/s3_vectors.

About Spice AI

Spice AI helps enterprises build fast, accurate, and scalable AI applications and agents with its portable, open-source data and AI compute engine. It connects data from disparate sources, simplifies application development, and supports workloads across cloud, edge, and on-premises systems. Based in Seattle, Spice AI focuses on making AI application development simple.

Get started with Spice.ai Open Source in just 30 seconds at: https://spiceai.org/docs/getting-started

Work with Spice AI

Interested in working with Spice AI or looking to learn a little more about the work we do? We are always looking for our next big challenge. Book an introductory call via our Calendly. Take a deeper look at our enterprise offerings by visiting Spice.ai.
