The Kafka Data Connector enables direct acceleration of data from Apache Kafka topics using `refresh_mode: append`. This allows seamless integration with existing Kafka-based event streaming infrastructure for real-time data acceleration and analytics.
Upon startup, Spice fetches all messages for the specified topic using a uniquely generated consumer group. If a persistent acceleration engine is used (with `mode: file`), data is fetched starting from the last processed record, allowing Spice to resume without reprocessing all historical data.
Schema is automatically inferred from the first available topic message in JSON format. The connector creates the appropriate table schema for acceleration based on the detected data structure.
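For illustration, suppose the first message on the topic is the following (hypothetical) JSON payload:

```json
{
  "order_id": "A-1001",
  "amount": 42.5,
  "created_at": "2024-01-15T09:30:00Z"
}
```

Spice would create an accelerated table with columns such as `order_id`, `amount`, and `created_at`, with column types inferred from the JSON values in this first message.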
## `from`

The `from` field takes the form `kafka:kafka_topic`, where `kafka_topic` is the name of the Kafka topic to consume from.
## `name`

The dataset name. This will be used as the table name within Spice. The dataset name cannot be a reserved keyword.
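For example, a dataset consuming from a hypothetical topic named `orders_events` and exposed as the table `orders`:

```yaml
datasets:
  - from: kafka:orders_events  # topic name is hypothetical
    name: orders
```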
## `params`

| Parameter Name | Description |
|---|---|
| `kafka_bootstrap_servers` | Required. A list of host/port pairs for establishing the initial Kafka cluster connection, formatted as `host1:port1,host2:port2,...`. This list only affects the initial hosts used to discover the full server set; once connected, the client will use all servers in the cluster, regardless of which servers are specified here. |
| `kafka_security_protocol` | Security protocol for Kafka connections. Default: `SASL_SSL`. Options: `PLAINTEXT`, `SSL`, `SASL_PLAINTEXT`, `SASL_SSL`. |
| `kafka_sasl_mechanism` | SASL (Simple Authentication and Security Layer) authentication mechanism. Default: `SCRAM-SHA-512`. Options: `PLAIN`, `SCRAM-SHA-256`, `SCRAM-SHA-512`. |
| `kafka_sasl_username` | SASL username. Required if `kafka_security_protocol` is `SASL_PLAINTEXT` or `SASL_SSL`. |
:::warning
Using the Kafka connector requires acceleration with `refresh_mode: append` enabled.
:::
The following settings are required:
| Parameter Name | Description |
|---|---|
| `enabled` | Required. Must be set to `true` to enable acceleration. |
| `engine` | Required. The acceleration engine to use. Valid values: `duckdb` (DuckDB), `sqlite` (SQLite), `postgres` (PostgreSQL). |
| `refresh_mode` | Required. The refresh mode to use. Must be set to `append` for the Kafka connector. |
| `mode` | Optional. The persistence mode to use. When using the `duckdb` or `sqlite` engines, it is recommended to set this to `file` to persist the data across restarts. Spice persists metadata about the dataset, allowing it to resume from the last known state instead of reprocessing all messages. |
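Putting the fields above together, a complete dataset configuration might look like the following sketch; the topic name, broker addresses, and secret key names are hypothetical:

```yaml
datasets:
  - from: kafka:orders_events  # hypothetical topic
    name: orders
    params:
      kafka_bootstrap_servers: broker1:9092,broker2:9092
      kafka_security_protocol: SASL_SSL
      kafka_sasl_mechanism: SCRAM-SHA-512
      kafka_sasl_username: ${secrets:kafka_username}  # resolved from a secret store
      kafka_sasl_password: ${secrets:kafka_password}
    acceleration:
      enabled: true
      engine: duckdb
      mode: file            # persist data so Spice can resume after a restart
      refresh_mode: append  # required for the Kafka connector
```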
The Kafka connector currently supports JSON-formatted messages. Schema is automatically inferred from the first available message in the topic, and all subsequent messages are expected to follow a compatible structure.
Spice integrates with multiple secret stores to help manage sensitive data securely. For detailed information on supported secret stores, refer to the secret stores documentation. Additionally, learn how to use referenced secrets in component parameters by visiting the using referenced secrets guide.
The following additional `params` configure authentication and TLS/SSL:

| Parameter Name | Description |
|---|---|
| `kafka_sasl_password` | SASL password. Required if `kafka_security_protocol` is `SASL_PLAINTEXT` or `SASL_SSL`. |
| `kafka_ssl_ca_location` | Path to the SSL/TLS CA certificate file for server verification. |
| `kafka_enable_ssl_certificate_verification` | Enable SSL/TLS certificate verification. Default: `true`. |
| `kafka_ssl_endpoint_identification_algorithm` | SSL/TLS endpoint identification algorithm. Default: `https`. Options: `none`, `https`. |
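As a sketch, connecting over SSL to a cluster that uses a custom certificate authority might combine these parameters as follows; the broker address and CA path are hypothetical:

```yaml
params:
  kafka_bootstrap_servers: broker1:9093
  kafka_security_protocol: SSL
  kafka_ssl_ca_location: /etc/ssl/certs/kafka-ca.pem  # hypothetical path
```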