Defines a data source for ingestion. A source represents the upstream system that a onehouse_flow reads from, such as an S3 bucket, a Kafka topic, or a Postgres database.
This page documents Terraform-specific behavior (HCL syntax, types, mutability, drift, import). For full parameter semantics, valid values, and defaults, see CREATE SOURCE and DELETE SOURCE. The provider does not support source updates — changes force destroy + recreate.
Example Usage
S3 source
resource "onehouse_source" "events" {
  name        = "raw-events-s3"
  source_type = "S3"
  s3 {
    object_storage_bucket_name = "my-raw-events-bucket"
  }
}
object_storage_bucket_name is the bucket name only (not a URI). The bucket must be accessible to the Onehouse control plane (IAM/service-account role configured during cloud-provider connection).
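As a sketch of the bucket-name-only rule, assume the bucket is managed in the same configuration with the AWS provider's aws_s3_bucket resource. The resource labels and the source name below are hypothetical. Referencing the bucket attribute yields the plain name, whereas the ARN or an s3:// URI would not match the expected format.

```hcl
# Assumption: the bucket is created by the AWS provider in this configuration.
resource "aws_s3_bucket" "raw_events" {
  bucket = "my-raw-events-bucket"
}

resource "onehouse_source" "events_managed" {
  name        = "raw-events-s3-managed"
  source_type = "S3"

  s3 {
    # .bucket is the bare bucket name, not an ARN or URI.
    object_storage_bucket_name = aws_s3_bucket.raw_events.bucket
  }
}
```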
GCS source
resource "onehouse_source" "events_gcs" {
  name        = "raw-events-gcs"
  source_type = "GCS"
  gcs {
    object_storage_bucket_name = "my-gcs-events-bucket"
  }
}
Confluent Kafka with SASL and schema registry
resource "onehouse_source" "kafka" {
  name            = "events-kafka"
  source_type     = "CONFLUENT_KAFKA"
  credential_type = "ONEHOUSE"
  kafka {
    bootstrap_servers     = "pkc-xxxxx.us-west-2.aws.confluent.cloud:9092"
    connection_protocol   = "SASL"
    security_protocol     = "SASL_SSL"
    payload_serialization = "avro"
    sasl {
      mechanism = "scram_sha_256"
      key       = "SASL_KEY"
      secret    = "SASL_SECRET"
    }
    schema_registry {
      type = "confluent"
      confluent {
        servers      = "https://psrc-xxxxx.us-west-2.aws.confluent.cloud"
        subject_name = "events-value"
        key          = "SR_KEY"
        secret       = "SR_SECRET"
      }
    }
  }
}
For production, switch to credential_type = "SECRET_MANAGER" and replace key/secret with key_secret_reference (cloud secret ARN/ID).
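A minimal sketch of that production variant follows, using only the fields documented on this page. The resource label, source name, and both secret ARNs are placeholders, and the payload format the referenced secrets must hold is not shown here.

```hcl
resource "onehouse_source" "kafka_prod" {
  name            = "events-kafka-prod"
  source_type     = "CONFLUENT_KAFKA"
  credential_type = "SECRET_MANAGER"

  kafka {
    bootstrap_servers     = "pkc-xxxxx.us-west-2.aws.confluent.cloud:9092"
    connection_protocol   = "SASL"
    security_protocol     = "SASL_SSL"
    payload_serialization = "avro"

    sasl {
      mechanism = "scram_sha_256"
      # Placeholder ARN: one secret holding both the SASL key and secret.
      key_secret_reference = "arn:aws:secretsmanager:us-west-2:123456789012:secret:kafka-sasl"
    }

    schema_registry {
      type = "confluent"
      confluent {
        servers      = "https://psrc-xxxxx.us-west-2.aws.confluent.cloud"
        subject_name = "events-value"
        # Placeholder ARN: one secret holding both the SR key and secret.
        key_secret_reference = "arn:aws:secretsmanager:us-west-2:123456789012:secret:confluent-sr"
      }
    }
  }
}
```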
Postgres CDC source
resource "onehouse_source" "pg" {
  name            = "orders-postgres"
  source_type     = "POSTGRES"
  credential_type = "ONEHOUSE"
  rdbms {
    log_message_bus = "lakelog"
    db_config {
      host          = "pg.internal.example.com"
      port          = "5432"
      database_name = "orders"
      user          = "onehouse_cdc_user"
      password      = "secret"
    }
    lake_log_config {
      intermediate_storage_path = "s3://my-bucket/lakelog/orders-pg/"
    }
  }
}
lake_log_config {} is required when log_message_bus = "lakelog" and not allowed otherwise.
MySQL CDC source via MSK
resource "onehouse_source" "mysql" {
  name            = "users-mysql"
  source_type     = "MY_SQL"
  credential_type = "ONEHOUSE"
  rdbms {
    log_message_bus = "msk"
    db_config {
      host          = "mysql.internal.example.com"
      port          = "3306"
      database_name = "users"
      user          = "onehouse_cdc_user"
      password      = "secret"
    }
  }
}
Kinesis source
resource "onehouse_source" "kinesis" {
  name        = "events-kinesis"
  source_type = "KINESIS"
  kinesis {
    region                = "us-west-2"
    payload_serialization = "json"
  }
}
Onehouse-table source
resource "onehouse_source" "ot" {
  name        = "curated-source"
  source_type = "ONEHOUSE_TABLE"
  onehouse_table {
    lake     = "warehouse"
    database = "events"
    name     = "curated_events"
  }
}
Argument Reference
Top-level
| Argument | Type | Required | Mutability | Description |
|---|---|---|---|---|
| name | string | ✅ | Immutable | Source name. Unique within the project. |
| source_type | string | ✅ | Immutable | Source type. → details below |
| credential_type | string | for credential-bearing types | Immutable | ONEHOUSE (credentials in state) or SECRET_MANAGER (cloud-secret reference). → details |
Exactly one type-specific sub-block must be set, matching source_type.
source_type — families and sub-blocks
The provider supports nine source types in five families. Pick the sub-block that matches your source_type.
| Family | source_type values | Sub-block | SQL ref |
|---|---|---|---|
| Object storage | S3, GCS | s3 {} or gcs {} | S3 · GCS |
| Event streams | APACHE_KAFKA, MSK_KAFKA, CONFLUENT_KAFKA | kafka {} | Kafka types |
| AWS Kinesis | KINESIS | kinesis {} | Kinesis type |
| Onehouse tables | ONEHOUSE_TABLE | onehouse_table {} | Onehouse table type |
| Databases (CDC) | POSTGRES, MY_SQL | rdbms {} | Postgres · MySQL |
For credential-bearing types (Kafka, RDBMS, Onehouse-table), both ONEHOUSE and SECRET_MANAGER credential modes are supported via the top-level credential_type attribute. See Secrets Management for the trade-offs.
s3 {} / gcs {} block
| Argument | Type | Required | Description |
|---|---|---|---|
| object_storage_bucket_name | string | ✅ | Bucket name (not a URI). → S3 · GCS |
kafka {} block
| Argument | Type | Required | Description |
|---|---|---|---|
| bootstrap_servers | string | ✅ | Comma-separated host:port list. → details |
| cloud_resource_identifier | string | when MSK_KAFKA | MSK cluster ARN. Required for source_type = "MSK_KAFKA". |
| connection_protocol | string | ✅ | PLAINTEXT, TLS, or SASL. → details |
| security_protocol | string | ✅ | SASL_SSL, SASL_PLAINTEXT, SSL, PLAINTEXT. → details |
| payload_serialization | string | ✅ | avro, json, proto, confluent_proto, confluent_json_sr. → details |
| tls {} | block | when TLS | TLS trust/key stores and passwords. → fields below |
| sasl {} | block | when SASL | SASL credentials and (optional) Confluent keystore/truststore extras. → fields below |
| schema_registry {} | block | optional | Schema registry config. → fields below |
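The MSK-specific argument can be sketched as follows. The cluster ARN, secret ARN, broker address, and names are placeholders, and the SASL/SCRAM pairing shown is an assumption about a valid combination rather than a documented requirement.

```hcl
resource "onehouse_source" "msk" {
  name            = "events-msk"
  source_type     = "MSK_KAFKA"
  credential_type = "SECRET_MANAGER"

  kafka {
    bootstrap_servers = "b-1.example.kafka.us-west-2.amazonaws.com:9096"
    # Required only for MSK_KAFKA: the ARN of the MSK cluster (placeholder value).
    cloud_resource_identifier = "arn:aws:kafka:us-west-2:123456789012:cluster/example/abcd1234"
    connection_protocol       = "SASL"
    security_protocol         = "SASL_SSL"
    payload_serialization     = "json"

    sasl {
      mechanism = "SCRAM_SHA_512"
      # Placeholder ARN for the secret holding the SASL key and secret.
      key_secret_reference = "arn:aws:secretsmanager:us-west-2:123456789012:secret:AmazonMSK_events"
    }
  }
}
```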
tls {} block
Set when connection_protocol = "TLS". Use ONEHOUSE mode (key_store_password + key_password) or SECRET_MANAGER mode (key_store_password_key_password_secret_reference) — not both. All fields are optional in the schema and write-only (sent on CREATE, never stored in state).
| Argument | Type | Required | Description |
|---|---|---|---|
| trust_store_path | string | | TLS trust store path on the data-plane host. Write-only. |
| key_store_path | string | | TLS key store path on the data-plane host. Write-only. |
| key_store_password | string | ONEHOUSE mode | TLS key store password. Write-only. Mutex with key_store_password_key_password_secret_reference. |
| key_password | string | ONEHOUSE mode | TLS key password. Write-only. Mutex with key_store_password_key_password_secret_reference. |
| key_store_password_key_password_secret_reference | string | SECRET_MANAGER mode | Cloud-secret reference (e.g. AWS Secrets Manager ARN) holding both TLS passwords. Write-only. Mutex with the literal *_password fields. |
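A hedged sketch of a TLS-secured Kafka source in ONEHOUSE mode is shown below. The variable names, store paths, and broker addresses are hypothetical, and pairing connection_protocol = "TLS" with security_protocol = "SSL" is an assumption. Passing the passwords through sensitive variables keeps the literals out of version control.

```hcl
variable "kafka_key_store_password" {
  type      = string
  sensitive = true
}

variable "kafka_key_password" {
  type      = string
  sensitive = true
}

resource "onehouse_source" "kafka_tls" {
  name            = "events-kafka-tls"
  source_type     = "APACHE_KAFKA"
  credential_type = "ONEHOUSE"

  kafka {
    bootstrap_servers     = "broker-1.internal.example.com:9093,broker-2.internal.example.com:9093"
    connection_protocol   = "TLS"
    security_protocol     = "SSL" # assumed pairing for TLS
    payload_serialization = "json"

    tls {
      # Hypothetical paths on the data-plane host.
      trust_store_path   = "/etc/kafka/secrets/truststore.jks"
      key_store_path     = "/etc/kafka/secrets/keystore.jks"
      key_store_password = var.kafka_key_store_password
      key_password       = var.kafka_key_password
    }
  }
}
```

In SECRET_MANAGER mode, the two password arguments would be replaced by key_store_password_key_password_secret_reference, since the literal and reference forms are mutually exclusive.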
sasl {} block
Set when connection_protocol = "SASL". Use ONEHOUSE mode (key + secret) or SECRET_MANAGER mode (key_secret_reference). The keystore_* / trust_store_* / key_password fields are optional Confluent extras and belong to ONEHOUSE mode. All credential fields are write-only.
| Argument | Type | Required | Description |
|---|---|---|---|
| mechanism | string | ✅ | SASL mechanism. One of PLAIN, SCRAM_SHA_256, SCRAM_SHA_512 (case-insensitive). |
| key | string | ONEHOUSE mode | SASL key. Write-only. Mutex with key_secret_reference. |
| secret | string | ONEHOUSE mode | SASL secret. Write-only. Mutex with key_secret_reference. |
| key_secret_reference | string | SECRET_MANAGER mode | Cloud-secret reference holding both SASL key and secret. Write-only. Mutex with key + secret. |
| keystore_path | string | | Optional Confluent extra: SASL keystore path. Write-only. |
| keystore_password | string | | Optional Confluent extra: SASL keystore password. Write-only. |
| keystore_type | string | | Optional Confluent extra: SASL keystore type, e.g. jks or pkcs12. Write-only. |
| trust_store_path | string | | Optional Confluent extra: SASL trust store path. Write-only. |
| trust_store_password | string | | Optional Confluent extra: SASL trust store password. Write-only. |
| trust_store_type | string | | Optional Confluent extra: SASL trust store type, e.g. jks or pkcs12. Write-only. |
| key_password | string | | Optional Confluent extra: SASL key password. Write-only. |
kinesis {} block
| Argument | Type | Required | Description |
|---|---|---|---|
| region | string | ✅ | AWS region of the Kinesis stream (e.g., us-west-2). → details |
| payload_serialization | string | | Serialization format. Currently only json is supported. → details |
| schema_registry {} | block | optional | Schema registry config. → fields below |
onehouse_table {} block
| Argument | Type | Required | Description |
|---|---|---|---|
| lake | string | ✅ | Source lake name. → details |
| database | string | ✅ | Source database name. → details |
| name | string | ✅ | Source table name. → details |
| schema_registry {} | block | optional | Schema registry config. → fields below |
rdbms {} block
| Argument | Type | Required | Description |
|---|---|---|---|
| log_message_bus | string | ✅ | lakelog, msk, or (Postgres only) google_managed_kafka. → Postgres · MySQL |
| db_config {} | block | ✅ | host, port, database_name, user/password (or user_password_reference for secret-manager mode). → Postgres · MySQL |
| lake_log_config {} | block | when log_message_bus = "lakelog" | intermediate_storage_path for the lakelog buffer. → Postgres · MySQL |
| schema_registry {} | block | optional | Schema registry config. → fields below |
db_config {} block
| Argument | Type | Required | Description |
|---|---|---|---|
| host | string | ✅ | Database server hostname (no port). |
| port | string | ✅ | Database server port. |
| database_name | string | ✅ | Database name to capture from. |
| user | string | ONEHOUSE mode | Database user. Write-only. Mutex with user_password_reference. |
| password | string | ONEHOUSE mode | Database password. Write-only. Mutex with user_password_reference. |
| user_password_reference | string | SECRET_MANAGER mode | Cloud-secret reference holding both DB user and password. Write-only. |
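For illustration, a Postgres CDC source in SECRET_MANAGER mode might look like the sketch below. The resource label, source name, and secret ARN are placeholders. Note that user and password are omitted because they are mutually exclusive with user_password_reference.

```hcl
resource "onehouse_source" "pg_prod" {
  name            = "orders-postgres-prod"
  source_type     = "POSTGRES"
  credential_type = "SECRET_MANAGER"

  rdbms {
    log_message_bus = "lakelog"

    db_config {
      host          = "pg.internal.example.com"
      port          = "5432" # string, per the table above
      database_name = "orders"
      # Placeholder ARN: one secret holding both the DB user and password.
      user_password_reference = "arn:aws:secretsmanager:us-west-2:123456789012:secret:orders-pg-cdc"
    }

    # Required because log_message_bus = "lakelog".
    lake_log_config {
      intermediate_storage_path = "s3://my-bucket/lakelog/orders-pg/"
    }
  }
}
```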
lake_log_config {} block
Required when log_message_bus = "lakelog"; not allowed otherwise.
| Argument | Type | Required | Description |
|---|---|---|---|
| intermediate_storage_path | string | ✅ (when lakelog) | Object-storage path used to stage the CDC log buffer. |
schema_registry {} block
Shared by the kafka {}, kinesis {}, onehouse_table {}, and rdbms {} blocks. type selects exactly one child sub-block.
| Argument | Type | Required | Description |
|---|---|---|---|
| type | string | ✅ | One of glue, confluent, google, jar, file. → details |
| glue {} / confluent {} / google {} / jar {} / file {} | block | ✅ | Type-specific config; exactly one matching type. → fields below |
glue {} (when type = "glue")
| Argument | Type | Required | Description |
|---|---|---|---|
| name | string | | Glue schema registry name. |
confluent {} (when type = "confluent")
Accepts either key/secret literals (credential_type = "ONEHOUSE") or key_secret_reference (credential_type = "SECRET_MANAGER").
| Argument | Type | Required | Description |
|---|---|---|---|
| servers | string | ✅ | Confluent Schema Registry server URL(s). |
| subject_name | string | | Optional. The specific Confluent SR subject to fetch the schema from. If omitted, the Kafka topic name is used as the subject (Confluent's <topic>-value convention). Use this to pin a single schema. |
| subject_prefix | string | | Optional. A subject-name prefix used to discover/list all matching SR subjects rather than pinning one. Useful for multi-topic discovery. |
| key | string | ONEHOUSE mode | Confluent SR key. Write-only. |
| secret | string | ONEHOUSE mode | Confluent SR secret. Write-only. |
| key_secret_reference | string | SECRET_MANAGER mode | Cloud-secret reference holding both the Confluent SR key and secret. Write-only. |
subject_name and subject_prefix are both optional and serve as alternative ways to locate schemas — provide subject_name to pin one subject, or subject_prefix for prefix-based discovery. Self-hosted registries without auth may omit key/secret/key_secret_reference entirely.
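The prefix-based, unauthenticated case can be sketched as follows. The hostnames, names, and the "events-" prefix are hypothetical, and the PLAINTEXT protocol pairing is an assumption about a self-hosted cluster without transport security.

```hcl
resource "onehouse_source" "kafka_internal" {
  name            = "events-kafka-internal"
  source_type     = "APACHE_KAFKA"
  credential_type = "ONEHOUSE"

  kafka {
    bootstrap_servers     = "broker-1.internal.example.com:9092"
    connection_protocol   = "PLAINTEXT"
    security_protocol     = "PLAINTEXT"
    payload_serialization = "avro"

    schema_registry {
      type = "confluent"
      confluent {
        servers = "http://schema-registry.internal.example.com:8081"
        # Discover every subject starting with this prefix instead of pinning one.
        subject_prefix = "events-"
        # No key/secret/key_secret_reference: the registry is assumed to have no auth.
      }
    }
  }
}
```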
google {} (when type = "google")
| Argument | Type | Required | Description |
|---|---|---|---|
| url | string | | Google Managed Schema Registry URL. Write-only. |
jar {} (when type = "jar")
| Argument | Type | Required | Description |
|---|---|---|---|
| location | string | | Jar location for proto schemas. |
file {} (when type = "file")
| Argument | Type | Required | Description |
|---|---|---|---|
| base_path | string | | File-based schema registry base path. |
| full_path | string | | File-based schema registry full path. |
Attribute Reference
| Attribute | Type | Description |
|---|---|---|
| id | string | Onehouse-assigned source UUID. |
| created_at | string | Creation time in RFC3339. |
| created_by | string | Identity that created the source. |
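These computed attributes can be read like any other resource attribute. The sketch below exposes two of them as outputs for the S3 source from Example Usage. The output names are hypothetical.

```hcl
output "events_source_id" {
  description = "Onehouse-assigned UUID of the S3 source."
  value       = onehouse_source.events.id
}

output "events_source_created_at" {
  description = "Creation time of the S3 source, in RFC3339."
  value       = onehouse_source.events.created_at
}
```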
Import
terraform import onehouse_source.kafka events-kafka
After import, sensitive fields (passwords, tokens, secrets) inside the type-specific block cannot be recovered from SHOW SOURCES. Re-supply them in your .tf to avoid a forced replacement on the first terraform plan.
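On Terraform 1.5 or later, the same import can be expressed declaratively with an import block. This sketch assumes the import ID is the source name, as in the CLI form, and that a matching resource "onehouse_source" "kafka" block, including its re-supplied secrets, already exists in the configuration.

```hcl
# Requires Terraform 1.5+. Applied on the next `terraform apply`.
import {
  to = onehouse_source.kafka
  id = "events-kafka" # source name, mirroring the CLI import
}
```

Running terraform plan after adding the block shows whether the re-supplied arguments match the imported source or would still force a replacement.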
Data Source
data "onehouse_source" "lookup" {
  name = "events-kafka"
}
output "source_type" {
  value = data.onehouse_source.lookup.source_type
}
Limitations
- No update. Any argument change forces destroy + recreate.
- Write-only credentials. Sensitive fields (passwords, tokens, SASL secrets) are write-only: they are sent to the server on CREATE but not returned by DESCRIBE. After import, re-supply them in your .tf file.
- Multi-table on ONEHOUSE_TABLE. Not yet supported (tracked in ENG-41456).
- Oracle CDC. source_type = "ORACLE" is defined in the proto but not yet supported by the provider.
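Because any argument change forces destroy + recreate, one possible guard is Terraform's standard lifecycle meta-argument prevent_destroy. The sketch below is a general Terraform technique rather than provider-specific behavior, and the resource label and source name are hypothetical.

```hcl
resource "onehouse_source" "guarded" {
  name        = "raw-events-s3-guarded"
  source_type = "S3"

  s3 {
    object_storage_bucket_name = "my-raw-events-bucket"
  }

  lifecycle {
    # Terraform rejects any plan that would destroy this resource,
    # including the destroy step of a forced replacement.
    prevent_destroy = true
  }
}
```

With this in place, an accidental edit to an immutable argument fails at plan time instead of silently recreating the source. Intentional replacement requires removing the guard first.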