What is Delta Lake?
Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark and big data workloads. Built on top of Apache Parquet, Delta Lake adds a transaction log that enables reliable reads and writes, schema enforcement, time travel, and incremental processing on data stored in cloud object stores like S3, Azure Blob Storage, and GCS.
Data lakes promised a single, low-cost repository for all of an organization's data. In practice, they introduced a new class of problems. Without transactions, concurrent reads and writes produce corrupted or inconsistent results. Without schema enforcement, tables drift as upstream producers change their output format. Without audit history, there is no way to reproduce a query result from last week or roll back a bad write.
Delta Lake solves these problems by adding a transaction log -- the Delta log -- on top of standard Parquet files stored in a data lake. Every operation (write, delete, merge, schema change) is recorded as an atomic, ordered entry in the log. This turns an unstructured collection of files into a reliable, versioned table with database-grade guarantees.
Originally developed at Databricks, Delta Lake was open-sourced under the Apache 2.0 license in 2019. It is now a Linux Foundation project with implementations in Scala/Spark (the original) and Rust (delta-rs), making it accessible outside the Spark ecosystem.
Core Architecture
A Delta Lake table is a directory in a file system or object store that contains two types of content: data files in Apache Parquet format and a transaction log directory called _delta_log.
Data Files
The actual data is stored as standard Parquet files. Parquet is a columnar storage format that supports efficient compression and encoding schemes. Because the data files are plain Parquet, they can be read by any tool that supports the Parquet format -- though without the transaction log, readers would see all files (including those that have been logically deleted) rather than the current table state.
The Transaction Log (Delta Log)
The _delta_log directory is the core innovation of Delta Lake. It contains a sequence of JSON files, each representing an atomic commit to the table. These commit files are named with zero-padded sequential version numbers: 00000000000000000000.json, 00000000000000000001.json, and so on.
Each commit file contains one or more actions:
- Add file: Records that a new Parquet data file is part of the table
- Remove file: Records that a Parquet data file is no longer part of the current table version (the physical file is retained for time travel)
- Metadata: Records schema changes, partition column changes, or configuration updates
- Protocol: Records the minimum reader and writer protocol versions required to interact with the table
- Transaction identifiers: Records application-level transaction IDs for idempotent writes
To read the current state of a Delta table, a reader replays the log from the beginning (or from the latest checkpoint) and computes the set of active files by applying all add and remove actions.
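The replay logic above can be sketched in a few lines. This is a simplified illustration, not the real Delta protocol implementation: each commit is modeled as a string of JSON-lines actions, and the action records carry only a path field.

```python
import json

# Minimal sketch of Delta log replay: apply "add" and "remove" actions
# from ordered commits to compute the set of active data files.
# The action records are simplified relative to the real protocol.
commits = [
    '{"add": {"path": "part-0000.parquet"}}\n{"add": {"path": "part-0001.parquet"}}',
    '{"remove": {"path": "part-0000.parquet"}}\n{"add": {"path": "part-0002.parquet"}}',
]

def replay(commits):
    active = set()
    for commit in commits:
        for line in commit.splitlines():
            action = json.loads(line)
            if "add" in action:
                active.add(action["add"]["path"])
            elif "remove" in action:
                active.discard(action["remove"]["path"])
    return active

print(sorted(replay(commits)))  # ['part-0001.parquet', 'part-0002.parquet']
```

Note how part-0000.parquet is absent from the result: it was added in the first commit and removed in the second, so it is no longer part of the current table version (though the physical file remains for time travel).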
Checkpoint Files
Replaying the entire log for every read would be expensive for tables with long histories. Delta Lake solves this with periodic checkpoint files -- Parquet files in the _delta_log directory that contain a snapshot of the cumulative state at a given version. Readers start from the most recent checkpoint and only replay subsequent commits.
Checkpoint files are created automatically at configurable intervals (by default, every 10 commits). A special _last_checkpoint file in the _delta_log directory records the version of the most recent checkpoint, so readers can locate it without scanning the entire log directory.
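Starting from a checkpoint rather than version zero can be sketched as follows. The checkpoint and commit structures here are simplified stand-ins for the real Parquet checkpoint format:

```python
# Sketch of checkpoint-based reads: take the snapshot from the most
# recent checkpoint, then replay only the commits after its version.
checkpoint = {"version": 10, "active_files": {"a.parquet", "b.parquet"}}
later_commits = {
    11: [("remove", "a.parquet"), ("add", "c.parquet")],
    12: [("add", "d.parquet")],
}

def current_state(checkpoint, commits):
    active = set(checkpoint["active_files"])
    # Replay commits strictly after the checkpoint version, in order.
    for version in sorted(v for v in commits if v > checkpoint["version"]):
        for op, path in commits[version]:
            if op == "add":
                active.add(path)
            else:
                active.discard(path)
    return active

print(sorted(current_state(checkpoint, later_commits)))
# ['b.parquet', 'c.parquet', 'd.parquet']
```

For a table with thousands of commits, this reduces the read path from replaying the whole history to reading one checkpoint plus at most nine commit files.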
Key Features
ACID Transactions
Delta Lake provides serializable isolation for writes and snapshot isolation for reads. This means:
- Atomicity: Each write operation either fully succeeds or fully fails. There are no partial writes visible to readers.
- Consistency: Schema enforcement ensures that every row conforms to the table's defined schema.
- Isolation: Concurrent readers and writers do not interfere with each other. Readers see a consistent snapshot; writers use optimistic concurrency control to detect conflicts.
- Durability: Once a commit is written to the log, it is durable in the underlying storage system.
Optimistic concurrency control works by having each writer read the current table version, compute its changes, and attempt to write its commit file at the next sequential version number. If another writer has already committed at that version, the write fails and the writer must retry -- rereading the current state and recomputing its changes.
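The commit-claiming step can be sketched on a local filesystem, where atomically creating the next commit file with O_EXCL plays the role of the conflict check. This is an illustration only: on object stores, real implementations rely on put-if-absent semantics or an external coordination service rather than O_EXCL.

```python
import os
import tempfile

# Sketch of optimistic concurrency control: a writer claims the next
# version by atomically creating <version>.json; if another writer got
# there first, the create fails and the writer retries.
log_dir = tempfile.mkdtemp()

def latest_version(log_dir):
    versions = [int(name.split(".")[0])
                for name in os.listdir(log_dir) if name.endswith(".json")]
    return max(versions, default=-1)

def commit(log_dir, payload, max_retries=5):
    for _ in range(max_retries):
        version = latest_version(log_dir) + 1
        path = os.path.join(log_dir, f"{version:020d}.json")
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue  # lost the race: reread state and retry
        with os.fdopen(fd, "w") as f:
            f.write(payload)
        return version
    raise RuntimeError("gave up after repeated write conflicts")

print(commit(log_dir, '{"add": {"path": "p0.parquet"}}'))  # 0
print(commit(log_dir, '{"add": {"path": "p1.parquet"}}'))  # 1
```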
Schema Enforcement and Evolution
Delta Lake validates every write against the table's schema. If a write includes a column that does not exist in the schema, or if a column's data type does not match, the write is rejected before any data is written. This prevents the silent data corruption that is common in unmanaged data lakes.
Schema evolution is supported through explicit operations. New columns can be added, existing columns can be widened (e.g., INT to LONG), and columns can be renamed or reordered. Each schema change is recorded as a metadata action in the transaction log, preserving a complete history of how the schema has changed over time.
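The enforcement rule can be sketched as a pre-write validation pass. The type names and the single widening rule below are illustrative, not the full Delta type-promotion matrix:

```python
# Sketch of write-time schema enforcement: reject a batch that has
# unknown columns or incompatible types before any data is written.
WIDENINGS = {("int", "long")}  # illustrative allowed promotion

def validate(batch_schema, table_schema):
    for col, dtype in batch_schema.items():
        if col not in table_schema:
            raise ValueError(f"unknown column: {col}")
        expected = table_schema[col]
        if dtype != expected and (dtype, expected) not in WIDENINGS:
            raise ValueError(f"type mismatch for {col}: {dtype} vs {expected}")

table = {"user_id": "long", "amount": "double"}
validate({"user_id": "int", "amount": "double"}, table)  # ok: int widens to long
try:
    validate({"user_id": "long", "note": "string"}, table)
except ValueError as e:
    print(e)  # unknown column: note
```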
Time Travel
Because the transaction log records every version of the table, Delta Lake supports querying any historical version by specifying a version number or a timestamp:
-- Query the table as of version 42
SELECT * FROM events VERSION AS OF 42
-- Query the table as of a specific timestamp
SELECT * FROM events TIMESTAMP AS OF '2026-03-01T00:00:00'
Time travel is useful for auditing (reproducing a previous query result), debugging (comparing current data against a known-good historical state), and rollback (restoring a table to a previous version after a bad write).
Historical data is retained as long as the underlying Parquet files have not been physically deleted. The VACUUM command removes data files that are no longer referenced by any version within the retention period (default 7 days), reclaiming storage at the cost of losing time travel access to older versions.
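The retention check that VACUUM performs can be sketched as: among files that some version has removed, delete only those whose removal is older than the retention period. The data structures here are hypothetical simplifications.

```python
import time

# Sketch of VACUUM candidate selection: a removed file is eligible for
# physical deletion only after the retention period has elapsed.
RETENTION_SECONDS = 7 * 24 * 3600  # default: 7 days

def vacuum_candidates(removed_files, now):
    """removed_files: {path: removal timestamp in seconds}."""
    return {path for path, ts in removed_files.items()
            if now - ts > RETENTION_SECONDS}

now = time.time()
removed = {
    "old.parquet": now - 8 * 24 * 3600,     # removed 8 days ago -> eligible
    "recent.parquet": now - 2 * 24 * 3600,  # removed 2 days ago -> kept
}
print(vacuum_candidates(removed, now))  # {'old.parquet'}
```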
Z-Ordering and Data Skipping
Delta Lake supports Z-ordering, a technique that co-locates related data in the same set of files to improve query performance. When data is Z-ordered on a column (e.g., event_date), rows with similar values for that column are stored in the same Parquet files. Combined with per-file min/max statistics stored in the transaction log, the query engine can skip entire files that do not contain relevant data.
OPTIMIZE events ZORDER BY (event_date, user_id)
Data skipping uses the per-file column statistics (min, max, null count) recorded in the Delta log. When a query includes a filter like WHERE event_date = '2026-03-01', the engine reads the statistics, identifies which files could possibly contain matching rows, and reads only those files. For well-organized tables, this can reduce I/O by orders of magnitude.
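The file-pruning step can be sketched with min/max statistics: a file can be skipped whenever the filter value falls outside its recorded range. The statistics table below is a made-up example.

```python
# Sketch of data skipping: use per-file min/max statistics to decide
# which files could possibly contain rows matching an equality filter.
file_stats = {
    "part-0000.parquet": {"event_date": ("2026-02-01", "2026-02-28")},
    "part-0001.parquet": {"event_date": ("2026-03-01", "2026-03-15")},
    "part-0002.parquet": {"event_date": ("2026-03-10", "2026-03-31")},
}

def files_to_scan(stats, column, value):
    # Keep a file only if value lies within its [min, max] range.
    return [path for path, cols in stats.items()
            if cols[column][0] <= value <= cols[column][1]]

print(files_to_scan(file_stats, "event_date", "2026-03-01"))
# ['part-0001.parquet'] -- the other two files are skipped entirely
```

Z-ordering makes this pruning effective: because related values are co-located, per-file ranges are narrow and most files fall entirely outside the filter.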
Merge (Upsert) Operations
Delta Lake supports MERGE INTO for complex upsert logic -- matching source rows against target rows and applying inserts, updates, or deletes based on match conditions:
MERGE INTO customers AS target
USING updates AS source
ON target.customer_id = source.customer_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
Unlike appending new files (which is the only write pattern available in raw Parquet data lakes), MERGE operations read existing data, compute the changes, and write new files that reflect the merged result. The transaction log records the old files as removed and the new files as added, maintaining a complete audit trail.
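The row-level semantics of the MERGE above can be sketched as a keyed upsert. This illustrates only the matching logic; a real MERGE rewrites Parquet files and records add/remove actions in the log.

```python
# Sketch of MERGE semantics: match source rows to target rows by key,
# update matches, insert non-matches.
def merge(target, source, key):
    index = {row[key]: dict(row) for row in target}
    for row in source:
        if row[key] in index:
            index[row[key]].update(row)   # WHEN MATCHED THEN UPDATE
        else:
            index[row[key]] = dict(row)   # WHEN NOT MATCHED THEN INSERT
    return list(index.values())

customers = [{"customer_id": 1, "name": "Ada"}]
updates = [{"customer_id": 1, "name": "Ada L."},
           {"customer_id": 2, "name": "Grace"}]
print(merge(customers, updates, "customer_id"))
# [{'customer_id': 1, 'name': 'Ada L.'}, {'customer_id': 2, 'name': 'Grace'}]
```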
Delta Lake vs. Other Table Formats
Delta Lake vs. Apache Iceberg
Apache Iceberg is another open-source table format that brings transactional guarantees to data lakes. Both provide ACID transactions, schema evolution, and time travel. The key differences are in design philosophy and ecosystem.
Delta Lake originated in the Databricks ecosystem and has the deepest integration with Apache Spark. Its transaction log is a sequence of JSON files (with Parquet checkpoints), and its concurrency model uses optimistic concurrency control based on file-level conflict detection.
Apache Iceberg was designed from the start to be engine-agnostic. Its metadata layer uses a tree of manifest files that enable fine-grained tracking of individual data files. Iceberg's partition evolution allows changing partition schemes without rewriting data -- a limitation of static partitioning that Delta Lake later addressed with liquid clustering.
In practice, teams using Databricks tend to use Delta Lake; teams using multi-engine environments (Spark, Trino, Flink, Dremio) often choose Iceberg for its broader engine compatibility.
Delta Lake vs. Apache Hudi
Apache Hudi (Hadoop Upserts Deletes and Incrementals) focuses on incremental data processing and near-real-time ingestion. Hudi's distinguishing feature is its record-level indexing, which enables efficient upserts without full-table scans.
Hudi supports two table types: Copy-on-Write (CoW), which rewrites entire files on update, and Merge-on-Read (MoR), which writes changes to a separate log and merges them at read time. Delta Lake has historically used a Copy-on-Write approach for all operations, though deletion vectors (discussed below) relax this for deletes.
Delta Lake's transaction log model is simpler than Hudi's dual-table architecture, which can make it easier to operate and debug. Hudi's record-level index is more efficient for workloads dominated by single-record upserts, while Delta Lake's file-level approach is more efficient for batch upserts.
Delta Lake vs. Raw Parquet in Data Lakes
Storing raw Parquet files in a data lake provides no transactional guarantees. Without a transaction log:
- No atomic writes: A failed write can leave partial files that corrupt subsequent reads
- No schema enforcement: Any file with any schema can be written to the same directory
- No time travel: There is no way to query historical versions of the data
- No concurrent safety: Multiple writers can produce conflicting files that readers cannot reconcile
- No efficient deletes or updates: Removing or updating specific rows requires rewriting entire files manually
Delta Lake adds all of these capabilities while keeping the underlying data in standard Parquet format. The overhead is the transaction log directory, which is typically a negligible fraction of total storage.
How Spice Uses Delta Lake
Spice connects to Delta Lake tables as a federated data source through the delta-rs Rust library. This enables teams to query Delta tables using standard SQL without requiring a Spark cluster or Databricks runtime.
Federated Query Access
Spice registers Delta Lake tables stored in S3, Azure Blob Storage, or GCS as federated data sources. Users query these tables through Spice's unified SQL interface alongside data from PostgreSQL, MySQL, Snowflake, and 30+ other connectors. The query engine handles credential management, object store access, and Parquet decoding transparently.
datasets:
  - from: delta_lake:s3://data-lake/events/
    name: events
    params:
      delta_lake_aws_access_key_id: ${AWS_ACCESS_KEY_ID}
      delta_lake_aws_secret_access_key: ${AWS_SECRET_ACCESS_KEY}
Because Spice uses Apache DataFusion as its query engine, it applies predicate pushdown to Delta Lake scans. Filters, projections, and partition pruning are pushed down through DataFusion's optimizer into the Delta table provider, minimizing the amount of data read from object storage.
Local Acceleration
For workloads that require sub-second query performance, Spice supports accelerating Delta Lake tables locally. The acceleration layer materializes the Delta table data into a local store, enabling fast queries without round-trips to cloud object storage.
datasets:
  - from: delta_lake:s3://data-lake/events/
    name: events
    acceleration:
      enabled: true
      refresh_mode: changes
      refresh_check_interval: 10s
When refresh_mode is set to changes, Spice uses change data capture techniques to detect new commits in the Delta log and apply only the incremental changes to the local cache. This keeps the local acceleration layer fresh without performing full reloads, which is critical for large tables where a full refresh would be prohibitively slow.
Cross-Format Federation
Because Spice is format-agnostic, teams can join Delta Lake tables with data from other sources in a single query:
SELECT c.customer_name, SUM(e.amount)
FROM delta_events e
JOIN postgres_customers c ON e.customer_id = c.id
WHERE e.event_date > '2026-03-01'
GROUP BY c.customer_name
This enables SQL federation across Delta Lake, relational databases, and other table formats like Apache Iceberg without moving data into a centralized warehouse.
Advanced Topics
The Delta Log Protocol
The Delta Lake protocol defines two protocol versions: a reader protocol version and a writer protocol version. These versions control which features a reader or writer must support to interact with the table. When a new feature is introduced (e.g., column mapping, deletion vectors), the protocol version is incremented. Readers or writers that do not support the required version must refuse to operate on the table rather than producing incorrect results.
This protocol versioning enables forward compatibility: new features can be added to the format without breaking existing readers, as long as those readers check the protocol version before proceeding.
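The version gate can be sketched as a simple check that a client performs before reading. The supported version constant below is an arbitrary example:

```python
# Sketch of protocol version gating: a client must refuse tables that
# require a higher reader protocol version than it implements, rather
# than risk producing incorrect results.
SUPPORTED_READER_VERSION = 2  # illustrative value for this client

def check_protocol(table_min_reader_version):
    if table_min_reader_version > SUPPORTED_READER_VERSION:
        raise RuntimeError(
            f"table requires reader version {table_min_reader_version}, "
            f"this client supports up to {SUPPORTED_READER_VERSION}")

check_protocol(2)  # ok: within supported range
try:
    check_protocol(3)
except RuntimeError as e:
    print("refused:", e)
```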
Deletion Vectors
Traditional Delta Lake deletes work by rewriting entire Parquet files with the deleted rows removed. For tables with large files, deleting a small number of rows requires rewriting gigabytes of data. Deletion vectors solve this by recording which rows in a file have been logically deleted, using a compact bitmap stored alongside the file metadata in the transaction log.
Readers check the deletion vector for each file and skip the marked rows during scanning. The physical data remains in the original Parquet files until a subsequent OPTIMIZE or VACUUM operation rewrites the files without the deleted rows. This trades a small read-time cost (checking the bitmap) for a large write-time saving (avoiding file rewrites).
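The read path can be sketched as filtering rows by position against the file's deletion vector. A Python set stands in here for the compact bitmap (real implementations use a roaring-bitmap-style structure):

```python
# Sketch of deletion-vector reads: a per-file structure marks logically
# deleted row positions; the scanner skips those rows.
def scan_with_dv(rows, deletion_vector):
    """deletion_vector: set of deleted row indices for this file."""
    return [row for i, row in enumerate(rows) if i not in deletion_vector]

rows = ["r0", "r1", "r2", "r3"]
dv = {1, 3}                    # rows 1 and 3 were logically deleted
print(scan_with_dv(rows, dv))  # ['r0', 'r2']
```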
Change Data Feed
Delta Lake's Change Data Feed (CDF) exposes a stream of row-level changes (inserts, updates, deletes) between table versions. This enables downstream consumers to process only the changes since their last read, rather than re-scanning the entire table.
-- Read changes between versions 10 and 20
SELECT * FROM table_changes('events', 10, 20)
CDF records are stored in separate _change_data files alongside the regular data files. Each change record includes the row data, the operation type (insert, update_preimage, update_postimage, or delete), and the commit version. This is the mechanism that enables efficient incremental processing pipelines and is the foundation for how tools like Spice detect and apply changes when refreshing locally accelerated Delta tables.
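A consumer of the change feed can be sketched as applying only the records newer than its last processed version. The change records below are simplified illustrations of the CDF operation types:

```python
# Sketch of consuming a change data feed: each record carries the row,
# an operation type, and its commit version; a consumer applies only
# changes after the last version it has already processed.
changes = [
    {"version": 11, "op": "insert",           "row": {"id": 1, "amount": 10}},
    {"version": 12, "op": "update_preimage",  "row": {"id": 1, "amount": 10}},
    {"version": 12, "op": "update_postimage", "row": {"id": 1, "amount": 25}},
    {"version": 13, "op": "delete",           "row": {"id": 1, "amount": 25}},
]

def apply_changes(state, changes, last_seen):
    for c in (c for c in changes if c["version"] > last_seen):
        if c["op"] in ("insert", "update_postimage"):
            state[c["row"]["id"]] = c["row"]
        elif c["op"] == "delete":
            state.pop(c["row"]["id"], None)
        # update_preimage carries the old row; nothing to apply
    return state

print(apply_changes({}, changes, last_seen=10))  # {} after the final delete
```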
Liquid Clustering
Liquid clustering is Delta Lake's replacement for traditional Hive-style partitioning. Instead of writing data into static partition directories (e.g., event_date=2026-03-01/), liquid clustering uses a space-filling curve to organize data dynamically. The clustering key can be changed at any time without rewriting existing data -- new data is clustered according to the new key, and background optimization gradually reorganizes historical data.
This addresses a fundamental limitation of static partitioning: once a partition scheme is chosen, changing it requires a full table rewrite. Liquid clustering makes the organization scheme a tunable parameter rather than a permanent architectural decision.
Delta Lake FAQ
What is the difference between Delta Lake and Apache Iceberg?
Both are open-source table formats that add ACID transactions, schema evolution, and time travel to data lakes. Delta Lake originated in the Databricks ecosystem and has the deepest Spark integration. Apache Iceberg was designed to be engine-agnostic from the start and has broader multi-engine support (Spark, Trino, Flink, Dremio). Both store data in Parquet format. The choice often depends on the primary compute engine and ecosystem preferences.
Does Delta Lake require Apache Spark?
No. While Delta Lake was originally built for Spark, the delta-rs project provides a standalone Rust implementation with Python bindings that works without Spark. Tools like Spice, Trino, Flink, and others can read and write Delta tables directly. The core Delta Lake protocol is an open specification that any engine can implement.
How does time travel work in Delta Lake?
Every write to a Delta table creates a new version in the transaction log. Time travel allows querying any historical version by specifying a version number or timestamp. The underlying Parquet files for previous versions are retained until a VACUUM operation removes files older than the retention period (default 7 days). This enables auditing, debugging, and rollback without maintaining separate backups.
What is the Delta Lake transaction log?
The transaction log (Delta log) is a directory of ordered JSON files that records every atomic change to a Delta table. Each commit file contains actions like adding or removing data files, schema changes, and protocol updates. Readers replay the log from the latest checkpoint to determine the current set of active files. The log provides the foundation for ACID transactions, time travel, and audit history.
Can Spice query Delta Lake tables without Spark?
Yes. Spice connects to Delta Lake tables through the delta-rs Rust library, which implements the Delta Lake protocol natively without any JVM or Spark dependency. Spice can read Delta tables from S3, Azure Blob Storage, and GCS, apply predicate pushdown for efficient scans, and optionally accelerate tables locally for sub-second query performance.
Learn more about Delta Lake and Spice
Documentation and technical deep dives on querying Delta Lake tables with federated SQL and local acceleration.
Spice.ai OSS Documentation
Learn how Spice connects to Delta Lake tables for federated SQL queries with local acceleration and change data capture.

How we use Apache DataFusion at Spice AI
A technical overview of how Spice extends Apache DataFusion with custom table providers, optimizer rules, and UDFs.

See Spice in action
Walk through your use case with an engineer and see how Spice handles federation, acceleration, and AI integration for production workloads.
Talk to an engineer