KAFKA - DATA SOURCE
Description
The KAFKA data source retrieves data from Apache Kafka, a distributed streaming platform for publishing and subscribing to streams of records in real time. Kafka is widely used to build real-time data pipelines and streaming applications, and it provides a high-throughput, fault-tolerant, and scalable platform for handling data streams. The configuration options of the KAFKA data source specify the client ID, the Kafka broker addresses, the topic to subscribe to, and any additional properties for the Kafka client.
Config
REQUIRED
Config Parameters
Name | Description |
---|---|
clientId | A user-specified string sent with each request to help trace calls; it also appears in logs. This is a required parameter. |
brokers | A comma-separated list of host and port pairs that are the addresses of the Kafka brokers. This is a required parameter. |
topic | The name of the Kafka topic to subscribe to. This is a required parameter. |
properties | A set of additional properties for the Kafka client. This is an optional parameter. |
Config Example
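As a rough illustration of how the four parameters fit together, the sketch below writes a configuration as a Python dictionary and checks that the required keys are present. Only the key names (clientId, brokers, topic, properties) come from the parameter table; every value, the use of a nested mapping for properties, and the helper `missing_required` are hypothetical. The file format the data source actually expects may differ.

```python
import json

# Hypothetical values throughout; only the four key names come from the
# parameter table above.
kafka_source_config = {
    "clientId": "orders-reader-1",
    "brokers": "broker1.example.com:9092,broker2.example.com:9092",
    "topic": "orders",
    # Optional. Assumed here to be a mapping of Kafka client settings;
    # auto.offset.reset is a standard Kafka consumer property.
    "properties": {
        "auto.offset.reset": "earliest",
    },
}

REQUIRED_KEYS = ("clientId", "brokers", "topic")


def missing_required(config):
    """Return the required keys that are absent or empty in config."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


if __name__ == "__main__":
    print(json.dumps(kafka_source_config, indent=2))
    print("missing required keys:", missing_required(kafka_source_config))
```

Running the required-key check before handing a configuration to the data source surfaces an omitted clientId, brokers, or topic early, before any connection attempt.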
Common Mistakes
- Incorrect Broker Addresses: Ensure that the brokers parameter contains the correct addresses of the Kafka brokers.
- Topic Name Errors: Verify that the topic parameter matches the name of an existing Kafka topic.
- Client ID Issues: If you encounter issues related to client identification, ensure that the clientId is unique and correctly identifies the client in the Kafka cluster.
- Property Configuration: If additional properties are specified in the properties parameter, ensure they are correctly configured according to the Kafka client documentation.
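Some of the mistakes above can be caught offline, before anything connects to a cluster. The following is a minimal sketch, assuming the configuration is available as a Python dictionary with the parameter names from the table and that brokers uses the host:port form. The helpers `parse_brokers` and `check_config` are hypothetical names, not part of the data source. The sketch checks formatting only: it cannot tell whether a broker is reachable or whether the topic exists.

```python
import re

# Kafka topic names may contain ASCII letters, digits, '.', '_' and '-',
# are at most 249 characters long, and may not be "." or "..".
_TOPIC_RE = re.compile(r"[A-Za-z0-9._-]{1,249}")


def parse_brokers(brokers):
    """Split 'host1:9092,host2:9092' into [(host, port), ...].

    Raises ValueError when an entry is not of the form host:port.
    """
    pairs = []
    for entry in brokers.split(","):
        host, sep, port = entry.strip().rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry.strip()!r}")
        port_number = int(port)
        if not 1 <= port_number <= 65535:
            raise ValueError(f"port out of range in {entry.strip()!r}")
        pairs.append((host, port_number))
    return pairs


def check_config(config):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    brokers = config.get("brokers")
    if not isinstance(brokers, str):
        problems.append("brokers: expected a comma-separated string")
    else:
        try:
            parse_brokers(brokers)
        except ValueError as exc:
            problems.append(f"brokers: {exc}")

    topic = config.get("topic")
    if (not isinstance(topic, str) or topic in (".", "..")
            or not _TOPIC_RE.fullmatch(topic)):
        problems.append(f"topic: {topic!r} is not a legal Kafka topic name")

    client_id = config.get("clientId")
    if not isinstance(client_id, str) or not client_id.strip():
        problems.append("clientId: expected a non-empty string")

    # properties is optional; when present it is assumed to be a mapping.
    if not isinstance(config.get("properties", {}), dict):
        problems.append("properties: expected a mapping of key/value pairs")

    return problems
```

For example, `check_config({"clientId": "reader-1", "brokers": "localhost:9092", "topic": "orders"})` returns an empty list, while a brokers value with no port or a topic name containing a space yields one message per problem.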