/docs/website/versioned_docs/version-1.10.x/components/data-accelerators/arrow.md

---
title: 'In-Memory Arrow Data Accelerator'
sidebar_label: 'In-Memory Arrow Data Accelerator'
description: 'In-Memory Arrow Data Accelerator Documentation'
sidebar_position: 2
---

The In-Memory Arrow Data Accelerator is the default data accelerator in Spice. It uses Apache Arrow to store data in-memory for fast access and query performance.

Configuration

To use the In-Memory Arrow Data Accelerator, no additional configuration is required beyond enabling acceleration.

Example:
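A minimal sketch of a dataset entry in `spicepod.yaml` that enables acceleration without naming an engine, so the default In-Memory Arrow accelerator is used. The source `spice.ai:path.to.my_dataset` and the name `my_dataset` are placeholders, not real datasets.

```yaml
datasets:
  # Placeholder source and name; substitute your own dataset.
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      # No engine is given, so the default (Arrow, in-memory) applies.
      enabled: true
```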

Arrow can also be specified explicitly by using arrow as the engine for acceleration.
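A sketch of the explicit form, assuming the same placeholder dataset as before. The only difference is the `engine` field, set here to `arrow`.

```yaml
datasets:
  # Placeholder source and name; substitute your own dataset.
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      enabled: true
      # Explicitly select the In-Memory Arrow accelerator.
      engine: arrow
```

Naming the engine does not change behavior relative to the default, but it makes the choice visible to anyone reading the configuration.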

Hash Index

:::warning[Experimental]

Hash index is an experimental feature available in Spice v1.11.0-rc.2 and later.

:::

The In-Memory Arrow Data Accelerator supports an optional hash index for O(1) point lookups on primary key columns. To enable, set hash_index: enabled in the dataset params:
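A hedged sketch of what this might look like. The text above only states that `hash_index: enabled` goes in the dataset params; placing it under `acceleration.params` and declaring the key with `acceleration.primary_key` are assumptions here, and `my_dataset` and `id` are placeholder names. Check the reference for your Spice version before relying on this layout.

```yaml
datasets:
  # Placeholder source and name; substitute your own dataset.
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      enabled: true
      engine: arrow
      # Assumed: the primary key column the hash index is built on.
      primary_key: id
      params:
        # Assumed location of the experimental hash index setting.
        hash_index: enabled
```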

See the configuration example above for usage details.

Limitations

The In-Memory Arrow Data Accelerator does not support persistent storage. Data is stored in-memory and will be lost when the Spice runtime is stopped.
The In-Memory Arrow Data Accelerator does not support Decimal256 (76 digits), as it exceeds Arrow's maximum Decimal width of 38 digits.
The In-Memory Arrow Data Accelerator does not support traditional indexes, but does support hash indexes (experimental) for point lookups.
The In-Memory Arrow Data Accelerator only supports primary-key constraints, not unique constraints.
With Arrow acceleration, mathematical operations like value1 / value2 are treated as integer division if the values are integers. For example, 1 / 2 will result in 0 instead of the expected 0.5. Use casting to FLOAT to ensure conversion to a floating-point value: CAST(1 AS FLOAT) / CAST(2 AS FLOAT) (or CAST(1 AS FLOAT) / 2).

:::warning[Memory Considerations]

When accelerating a dataset using the In-Memory Arrow Data Accelerator, some or all of the dataset is loaded into memory. Ensure sufficient memory is available, including overhead for queries and the runtime, especially with concurrent queries.

In-memory limitations can be mitigated by storing acceleration data on disk, which is supported by duckdb and sqlite accelerators by specifying mode: file.

:::
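As a sketch of the mitigation described in the note above, the following dataset entry switches to the duckdb accelerator with `mode: file` so that acceleration data is stored on disk rather than held entirely in memory. The source and dataset name are placeholders.

```yaml
datasets:
  # Placeholder source and name; substitute your own dataset.
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      enabled: true
      # duckdb (or sqlite) supports file-backed acceleration.
      engine: duckdb
      mode: file
```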

Cookbook

A cookbook recipe to configure In-Memory Arrow as a data accelerator in Spice: In-Memory Arrow Data Accelerator.