
Cognite Kafka extractor

The Cognite Kafka extractor is a generic extractor that extracts data from Apache Kafka. It connects to a Kafka broker and subscribes to a single topic. Cognite hosts the Kafka extractor, so you don't have to download or install anything.

Before you start

  • Assign access capabilities to create a hosted Kafka extractor and for the extractor to write data points, time series, events, and RAW rows, and to write to data models, in the target CDF project.

    Tip

    You can use OpenID Connect and your existing identity provider (IdP) framework to manage access to CDF data securely. Read more.

Deploy the extractor

  1. Navigate to Data management > Integrate > Extractors.

  2. Locate the Cognite Kafka extractor and select Set up extractor.


Kafka is highly redundant, so a client typically does not connect to one fixed broker. Instead, it connects to one or more brokers from a list of bootstrap brokers, which tell the client which specific broker to connect to. You provide this list of bootstrap brokers to the extractor.
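A bootstrap list is commonly written as comma-separated host:port pairs. The following is a minimal sketch of checking such a list before you enter it in the extractor setup. The helper name `parse_bootstrap_brokers` and the `example.com` host names are hypothetical, and the fallback to port 9092 (Kafka's conventional default) is an assumption that may not match your cluster.

```python
def parse_bootstrap_brokers(brokers: str, default_port: int = 9092) -> list[tuple[str, int]]:
    """Split a comma-separated bootstrap list into (host, port) pairs."""
    parsed = []
    for entry in brokers.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas and surrounding whitespace
        host, sep, port = entry.rpartition(":")
        if not sep:
            # No port given: assume the conventional Kafka port.
            host, port = entry, str(default_port)
        if not host or not port.isdigit():
            raise ValueError(f"invalid bootstrap broker entry: {entry!r}")
        parsed.append((host, int(port)))
    if not parsed:
        raise ValueError("at least one bootstrap broker is required")
    return parsed


# Hypothetical broker addresses, for illustration only.
brokers = parse_bootstrap_brokers(
    "broker-1.example.com:9092, broker-2.example.com:9093,broker-3.example.com"
)
for host, port in brokers:
    print(host, port)
```

Listing more than one bootstrap broker means the initial connection can still succeed when one of them is unavailable.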

Message formats

Kafka is an event streaming platform from Apache that supports payloads in any format. The Cognite Kafka extractor supports several pre-defined message formats.

If you want to define your own custom mapping of Kafka messages, see custom data formats for hosted extractors. Custom formats used with Kafka jobs receive an input argument containing the message as JSON, and a context argument containing the topic and other message metadata. For example, the context argument can look like this:

{
    "key": [1, 2, 3, 4],
    "topic": "stream_a",
    "headers": {
        "headerA": "valueA",
        "headerB": "valueB"
    },
    "timestamp": 1710312447698,
    "current_offset": 5912,
    "partition_id": 1
}

Messages are published on topics. A Kafka broker has a fixed set of topics, each writing data to one or more partitions. When you create a Kafka extractor instance, you must specify the number of partitions and the name of the topic.

The metadata in the context argument (key, topic, headers, timestamp, offset, and partition) can be used to help construct data points or other supported output types.
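As an illustration of how the context argument might feed into a data point, here is a minimal Python sketch. It is not the extractor's mapping syntax: real custom formats are written as described in the custom data formats documentation. The function name `to_datapoint`, the message payload, the external ID naming scheme, and the shape of the output object are all assumptions made for this example.

```python
import json


def to_datapoint(message: dict, context: dict) -> dict:
    """Combine a decoded message with its Kafka context into one data point."""
    # Hypothetical naming scheme: one time series per topic and partition.
    external_id = f"{context['topic']}:{context['partition_id']}"
    return {
        "type": "datapoint",  # assumed output shape, for illustration only
        "externalId": external_id,
        # Use the payload's own timestamp if present, otherwise fall back to
        # the Kafka message timestamp (milliseconds since epoch).
        "timestamp": message.get("timestamp", context["timestamp"]),
        "value": message["value"],
    }


# Context metadata as shown in the example above.
context = json.loads("""
{
    "key": [1, 2, 3, 4],
    "topic": "stream_a",
    "headers": {"headerA": "valueA", "headerB": "valueB"},
    "timestamp": 1710312447698,
    "current_offset": 5912,
    "partition_id": 1
}
""")

# Hypothetical message payload without its own timestamp.
message = {"value": 21.5}

datapoint = to_datapoint(message, context)
print(json.dumps(datapoint, indent=4))
```

Falling back to the Kafka timestamp is useful when the payload carries only a value, because every message still yields a data point with a well-defined time.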