Apache Kafka
Continuously stream data from Apache Kafka into Onehouse tables.
Click Sources > Add New Source > Apache Kafka. Then, follow the setup guide within the Onehouse console to configure your source.
Cloud Provider Support
- AWS: ✅ Supported
- GCP: ✅ Supported
Reading Kafka Messages
Onehouse supports the following serialization types for Kafka message values:
| Serialization Type (for message value) | Schema Registry | Description |
|---|---|---|
| Avro | Required | Deserializes the message value in the Avro format. Send messages using Kafka-specific Avro serializer libraries; vanilla Avro libraries will not work. |
| JSON | Optional | Deserializes message value in the JSON format. |
| JSON_SR (JSON Schema) | Required | Deserializes message value in the Confluent JSON Schema format. |
| Protobuf | Required | Deserializes message value in the Protocol Buffer format. |
| Byte Array | N/A | Passes the raw message value as a Byte Array without performing deserialization. Also adds the message key as a string field. |
Onehouse currently does not support reading Kafka message keys for Avro, JSON, JSON_SR, and Protobuf serialized messages.
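To illustrate the JSON case from the table above: a JSON-serialized message value needs no schema registry because its structure comes from the payload itself. A minimal sketch of that deserialization (the helper name and sample payload are illustrative, not part of the Onehouse API):

```python
import json

def deserialize_json_value(raw_value: bytes) -> dict:
    """Decode a JSON-serialized Kafka message value into a row-like dict.

    Conceptually mirrors the JSON serialization type: no schema registry
    is needed because the structure is inferred from the payload.
    """
    return json.loads(raw_value.decode("utf-8"))

# A Kafka message value as raw bytes, e.g. as returned by a consumer poll
raw = b'{"order_id": 42, "status": "shipped"}'
row = deserialize_json_value(raw)
```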
Usage Notes
- If a message is compacted or deleted within the Apache Kafka topic, it can no longer be ingested, since the payload becomes a tombstone (null value).
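To see why tombstones cannot be ingested: a compacted or deleted record arrives with a null value, leaving nothing to deserialize. A hedged sketch of the skip an ingestion loop would have to perform (names are illustrative, not Onehouse internals):

```python
from typing import Optional

def ingestible(message_value: Optional[bytes]) -> bool:
    """A tombstone produced by compaction or deletion has a null value;
    there is no payload to deserialize, so it must be skipped."""
    return message_value is not None

# None represents a tombstone record in the topic
messages = [b'{"id": 1}', None, b'{"id": 2}']
payloads = [m for m in messages if ingestible(m)]
# payloads keeps only the two non-null values
```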
Guide: Create a Kafka source with Protobuf messages
If you're using Protobuf as the message value serialization type, you must provide the Protobuf schema as a .jar file built from the .proto file. Onehouse uses this JAR to deserialize the message value.
Prerequisites
- Kafka cluster with required topics
- Java 8
- Maven (compatible with the Java version)
- Protobuf 3.XX.X
Create and upload the schema JAR
- Create a schema file (e.g. `sample.proto`) in the `src/main/resources` folder.
- Run the following command to compile the schema:
  ```
  protoc --java_out=./src/main/java ./src/main/resources/sample.proto
  ```
  This will generate the schema class in the `src/main/java` folder.
- Generate the JAR file using the following command:
  ```
  mvn clean package
  ```
  This will generate the JAR file in the `target` folder.
- Upload the schema JAR to an object storage bucket that Onehouse can access.
- Create a new source with `Apache Kafka` as the source type and provide the JAR's S3 URI in the `Schema Registry` section.
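For reference, a minimal hypothetical `sample.proto` that the `protoc` and `mvn` steps above would compile into the schema JAR (the package, message, and field names are placeholders, not required by Onehouse):

```protobuf
syntax = "proto3";

package com.example.kafka;

// Generate one .java file per message type.
option java_multiple_files = true;

message OrderEvent {
  int64 order_id = 1;
  string status = 2;
  int64 updated_at_ms = 3;
}
```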
Secrets Management (BYOS)
If you are using Bring Your Own Secrets (BYOS), store your credentials in AWS Secrets Manager or Google Cloud Secret Manager using the JSON formats below. See the Secrets Management documentation for setup instructions and tag requirements.
SASL Protocol
Depending on your Kafka distribution, supply either an API key and secret (for example, on Confluent Cloud) or a username and password as the `username`/`password` values.
{
"username": "<value>",
"password": "<value>"
}
TLS Protocol
{
"keystore_password": "<value>",
"key_password": "<value>"
}
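Before storing either payload, it can help to check locally that the JSON parses and contains exactly the expected keys. A small sketch (the key sets come from the formats above; the helper name is illustrative):

```python
import json

# Required keys per protocol, per the secret formats above.
REQUIRED_KEYS = {
    "SASL": {"username", "password"},
    "TLS": {"keystore_password", "key_password"},
}

def validate_secret(payload: str, protocol: str) -> bool:
    """Return True if payload is valid JSON containing exactly the keys
    expected for the given protocol ("SASL" or "TLS")."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and set(data) == REQUIRED_KEYS[protocol]

sasl_secret = '{"username": "svc-kafka", "password": "s3cret"}'
assert validate_secret(sasl_secret, "SASL")
```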