Apache Kafka
Continuously stream data from Apache Kafka into Onehouse tables.
Click Sources > Add New Source > Apache Kafka. Then, follow the setup guide within the Onehouse console to configure your source.
Cloud Provider Support
- AWS: ✅ Supported
- GCP: ✅ Supported
Reading Kafka Messages
Onehouse supports the following serialization types for Kafka message values:
| Serialization Type (for message value) | Schema Registry | Description |
|---|---|---|
| Avro | Required | Deserializes the message value in the Avro format. Send messages using Kafka Avro-specific libraries; vanilla Avro libraries will not work. |
| JSON | Optional | Deserializes message value in the JSON format. |
| JSON_SR (JSON Schema) | Required | Deserializes message value in the Confluent JSON Schema format. |
| Protobuf | Required | Deserializes message value in the Protocol Buffer format. |
| Byte Array | N/A | Passes the raw message value as a Byte Array without performing deserialization. Also adds the message key as a string field. |
Onehouse currently does not support reading Kafka message keys for messages serialized as Avro, JSON, JSON_SR, or Protobuf.
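For Avro specifically, the Schema Registry requirement means the message value must be written in the Confluent wire format (a schema ID prefix followed by the Avro payload), which is what Confluent's `KafkaAvroSerializer` produces. As a minimal sketch (the broker address, Schema Registry URL, topic name, and record fields below are placeholders, not values from this guide), a compatible producer might look like:

```java
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroProducerExample {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");           // placeholder broker
    props.put("key.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");
    // KafkaAvroSerializer registers the schema and writes the Confluent wire format
    // (schema ID + Avro binary), which Schema Registry-backed consumers expect.
    props.put("value.serializer",
        "io.confluent.kafka.serializers.KafkaAvroSerializer");
    props.put("schema.registry.url", "http://localhost:8081");  // placeholder registry

    // Placeholder record schema for illustration only.
    Schema schema = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Order\",\"fields\":["
            + "{\"name\":\"order_id\",\"type\":\"string\"},"
            + "{\"name\":\"amount\",\"type\":\"double\"}]}");

    GenericRecord record = new GenericData.Record(schema);
    record.put("order_id", "o-1001");
    record.put("amount", 42.5);

    try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
      producer.send(new ProducerRecord<>("orders", "o-1001", record));
    }
  }
}
```

A value produced with a plain Avro encoder lacks this framing and cannot be matched to a registered schema, which is why vanilla Avro libraries do not work here.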
Usage Notes
- If a message has been removed from the Apache Kafka topic by compaction or deletion, it can no longer be ingested, since its payload is a tombstone (null) value.
Guide: Create a Kafka source with Protobuf messages
If you're using Protobuf as the message value serialization type, you need to provide the Protobuf schema as a .jar file built from the .proto file.
Onehouse uses this .jar to deserialize the message value.
Prerequisites
- Kafka cluster with required topics
- Java 8
- Maven (compatible with the Java version)
- Protobuf (version 3.XX.X)
Create and upload the schema JAR
- Create a schema, e.g. `sample.proto`, in the `src/main/resources` folder (see the sample schema after this list).
- Compile the schema by running `protoc --java_out=./src/main/java ./src/main/resources/sample.proto`. This generates the schema class in the `src/main/java` folder.
- Generate the JAR file by running `mvn clean package`. This places the JAR in the `target` folder.
- Upload the schema JAR to an object storage bucket that Onehouse can access.
- Create a new source with `Apache Kafka` as the source type and provide the URI of the uploaded JAR (e.g. its S3 URI) in the `Schema Registry` section.
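For illustration, a minimal `sample.proto` of the kind referenced in the first step might look like the following; the package, options, and message fields are placeholders to adapt to your own topic's payload:

```protobuf
// Hypothetical schema placed at src/main/resources/sample.proto.
// The java_package / java_outer_classname options control where the
// generated class ends up inside the JAR that Onehouse loads.
syntax = "proto3";

package sample;

option java_package = "com.example.schemas";
option java_outer_classname = "SampleProto";

message Order {
  string order_id = 1;   // unique identifier for the order
  double amount = 2;     // order total
  int64 created_at = 3;  // creation time in epoch milliseconds
}
```

After `mvn clean package`, confirm that the generated class (here `com.example.schemas.SampleProto`) is present in the JAR under `target/` before uploading it.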