TL;DR
- Object storage and open table formats deliver nearly limitless scalability and cost efficiency, making them important pieces of modern data architectures.
- Despite their advantages, object storage can’t function as a standalone solution for workloads that require millisecond latency, sophisticated queries, or AI-driven retrieval, because its throughput-optimized design and limited query expressiveness introduce bottlenecks.
- Spice expands the utility of these systems by pushing object storage closer to the application layer and layering on more advanced compute capabilities.
- Spice’s federation and acceleration eliminate ETL and transform object storage into a functional data layer for operational applications and AI agents.
Introduction
Although legacy systems and workflows remain common, many enterprises are re-evaluating their architectures to meet new demands - driven in part, but not exclusively, by AI - that require support for more data-intensive and real-time applications.
The underlying storage needs for these novel workloads generally fall outside the bounds of a traditional operational database for a handful of reasons: scalability, flexibility (the need to support heterogeneous data types), availability, or some combination thereof.
Object storage systems have experienced a renaissance in this environment, often being re-purposed or augmented for more transactional use cases than they’ve historically supported. Platforms such as Amazon S3 and MinIO provide the scalability to handle petabytes of data, the cost efficiency of commodity hardware and open-source software, and the simplicity of a flat architecture that reduces management overhead. Although object storage systems don’t offer some of the guardrails of operational databases like strong consistency, many operational use cases tolerate eventual consistency. Common scenarios like rate-limiting or feature lookups, for example, don’t mandate strong consistency, and object storage systems help development teams avoid the performance tax strong consistency can impose.
These attributes have made object storage a source of truth for operational workloads; development teams get the dual benefits of reduced system complexity and high reliability.
Challenges for Object Storage in Demanding Operational Workloads
Unfortunately, there’s no free lunch in technology.
Object storage systems also come with significant tradeoffs for more performance-sensitive workloads.
- Object storage systems are optimized for throughput rather than responsiveness, so they introduce higher latency that limits their use in real-time scenarios.
- The object storage key-value model makes complex SQL queries difficult to express, limiting analytical flexibility.
- Managing governance, consistency, and security at scale becomes a challenge in environments limited to eventual consistency.
- Object storage is not natively optimized for AI and ML workloads, which rely on random access patterns and low-latency retrieval.
- Finally, for enterprises migrating from legacy databases, re-engineering data formats and pipelines to fit object stores can introduce complexity, cost, and downtime.
While object stores are now ubiquitous in enterprise environments, they can’t serve as an independent solution for the operational and AI-driven workloads now shaping many application access patterns.
Open Table Formats: Structuring Data for Performance and Governance
Open table formats like Apache Iceberg and Delta Lake - typically layered over columnar file formats such as Apache Parquet - represent a step-function improvement for these more demanding operational data workloads by introducing database-like capabilities to object storage. These formats address the shortcomings of raw object storage, such as the lack of transactional support and poor query performance, making them well suited to managing structured operational data:
- Consistency Optionality: ACID transactions ensure reliable updates, while eventual consistency aligns with use cases where brief sync delays are tolerable.
- Query Performance: Optimizations like data skipping and indexing make complex queries fast.
- Governance and Security: Features like schema enforcement and audit trails support compliance.
- Migration Support: Structured formats ease transitions from legacy systems by mimicking database functionality.
However, open table formats are still not a panacea for all operational workloads. They improve governance and query planning, but they don’t solve the performance challenges of running federated queries across multiple operational and analytical systems or powering AI applications that embed both structured and unstructured data. Different tools for different jobs, as they say.
What if you could maintain all of the great attributes of object storage and open table formats, but add the orchestration necessary to actually power your application without a bunch of ETL pipelines?
Well, you now can with Spice.ai.
Transforming Object Storage into a High-Performance Data Layer with Spice
Spice was purpose-built to solve this problem. By unifying SQL query federation and acceleration, search and retrieval, and LLM inference into a single, deploy-anywhere runtime, Spice makes it possible to serve data and AI-powered experiences directly from your existing object storage - securely, at low latency, and without sacrificing the simplicity and economics of object storage. Built in Rust on top of modern open-source technologies like Apache DataFusion (query optimization), Apache Arrow (in-memory processing), DuckDB (fast analytics), Apache Iceberg (open table format), and OpenTelemetry (observability), Spice transforms object storage into a high-performance data layer equipped to serve the most demanding operational workloads.
It’s a lightweight (~150MB) and portable runtime that:
- Federates, Materializes, and Accelerates Data: Run SQL queries across databases, data lakes, and APIs without moving data. Store hot data in-memory or locally using Apache Arrow, DuckDB, or SQLite for sub-second queries.
- Delivers Hybrid Search Across Unstructured and Structured Data: Execute keyword, vector, and full-text search from a single SQL query.
- Serves AI Models: Support local or hosted AI models, tying real-time data to AI outputs.
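
Concretely, federation and acceleration are configured declaratively. The sketch below follows the general shape of a Spicepod configuration; the bucket path and dataset name are hypothetical, and field names should be verified against the Spice docs before use:

```yaml
version: v1beta1
kind: Spicepod
name: orders_app

datasets:
  # Hypothetical S3 path - replace with your own bucket and prefix.
  - from: s3://my-bucket/orders/
    name: orders
    params:
      file_format: parquet
    acceleration:
      enabled: true    # materialize hot data locally for sub-second queries
      engine: duckdb   # or arrow / sqlite, per the runtime's acceleration options
```

With a definition like this in place, the application queries `orders` through Spice while the runtime keeps the local copy refreshed from S3.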

One Runtime for All Your Data
Where others solve one piece of the problem (search, query, or inference), Spice brings these capabilities together in one platform. The result is faster delivery of high-performance applications, with fewer moving parts to operate and maintain.
As you can imagine, Spice’s value goes beyond operationalizing object stores. With Spice you can federate SQL across transactional and analytical systems, join the results with Parquet files in S3 or Iceberg tables, and avoid the latency and cost of shuttling data back and forth.
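
As a hedged sketch of what such a federated query looks like (the table names and columns here are hypothetical, not from the source), a single SQL statement against the Spice runtime can join a transactional table with history stored in Iceberg on S3:

```sql
-- Hypothetical dataset names defined in the Spicepod:
-- `orders` federated from a transactional database,
-- `customer_history` from Iceberg tables on S3.
SELECT o.order_id,
       o.total,
       h.lifetime_value
FROM orders AS o
JOIN customer_history AS h
  ON o.customer_id = h.customer_id
WHERE o.created_at > now() - INTERVAL '1 day';
```

Spice plans the query across both sources, so the application never orchestrates the cross-system join or copies data between stores itself.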
You can run Spice wherever your application lives: as a sidecar for edge workloads, a microservice in the cloud, or a managed deployment. The benefit of this deployment optionality is that it gives applications and AI a controlled execution layer rather than direct database access.
For more on the Spice architecture, visit the OSS overview here.

Real-World Impact: Twilio, Barracuda, and NRC Health
Let’s take this out of the abstract and into some real-world applications built on Spice.
Twilio: Database CDN for Messaging Pipelines
For Twilio, consistently fast data access is mission-critical. In their messaging pipelines, even a brief database outage could cascade into service interruptions. With Spice, Twilio stages critical control-plane datasets in object storage, then accelerates them locally for sub-second queries. This not only improved P99 query times to under 5ms but also introduced automated multi-tenancy controls that propagate updates in minutes instead of hours. By reducing reliance on direct database queries and adding a resilient S3 failover path, Twilio doubled data redundancy and improved overall reliability - all with a lightweight container drop-in.
Barracuda: Datalake Accelerator for Email Archives
By deploying Spice as a datalake accelerator, Barracuda reduced P99 query times to under 200 milliseconds and moved audit logs into cost-efficient Parquet files on S3, which Spice queries directly. The shift not only eliminated costly data lakehouse queries but also reduced load on Cassandra, improving stability across the infrastructure. The result was a faster, more reliable customer experience at a fraction of the cost.
NRC Health: Data-Grounded AI for Healthcare Insights
NRC Health needed a way to build secure, data-grounded AI features that could integrate multiple internal platforms - from MySQL and SharePoint to Salesforce - without lengthy development cycles. Spice provided a unified, AI-ready data layer accessible through a single interface, making it easier for developers to experiment with embeddings, search, and inference directly in Spice instead of stitching together bespoke pipelines. The result is faster innovation and AI features grounded in real, relevant healthcare data.
Conclusion
Object storage and open table formats have become critical parts of modern enterprise data infrastructure, but they were not designed to serve real-time operational or AI-driven workloads on their own. Spice fills that gap by pairing federation with acceleration, search, and inference, turning data lakes into low-latency, AI-ready data layers. For enterprises hoping to get the most leverage possible out of their operational data, Spice is the catalyst.
Getting Started with Spice
Spice is open source (Apache 2.0), installs in under a minute on macOS, Linux, or Windows, and is also available as an enterprise-grade Cloud deployment.
- Explore the open source docs and blog
- Visit the getting started guide
- Explore the 70+ cookbooks
- Try Spice.ai Cloud for a fully managed deployment and get started for free.
Interested in working with Spice AI or looking to learn a little more about the work we do? We are always looking for our next big challenge. Book an introductory call via our Calendly. Take a deeper look at our enterprise offerings by visiting Spice.ai.