# onehouse_source

Defines a data source for ingestion. A source represents the upstream system that a onehouse_flow reads from — an S3 bucket, a Kafka topic, a Postgres database, and so on.
This page documents Terraform-specific behavior (HCL syntax, types, mutability, drift, import). For full parameter semantics, valid values, and defaults, see CREATE SOURCE and DELETE SOURCE. The provider does not support source updates — changes force destroy + recreate.
## Example Usage

### S3 source

```hcl
resource "onehouse_source" "events" {
  name        = "raw-events-s3"
  source_type = "S3"

  s3 {
    object_storage_bucket_name = "my-raw-events-bucket"
  }
}
```
object_storage_bucket_name is the bucket name only (not a URI). The bucket must be accessible to the Onehouse control plane (IAM/service-account role configured during cloud-provider connection).
### GCS source

```hcl
resource "onehouse_source" "events_gcs" {
  name        = "raw-events-gcs"
  source_type = "GCS"

  gcs {
    object_storage_bucket_name = "my-gcs-events-bucket"
  }
}
```
### Confluent Kafka with SASL and schema registry

```hcl
resource "onehouse_source" "kafka" {
  name            = "events-kafka"
  source_type     = "CONFLUENT_KAFKA"
  credential_type = "ONEHOUSE"

  kafka {
    bootstrap_servers     = "pkc-xxxxx.us-west-2.aws.confluent.cloud:9092"
    connection_protocol   = "SASL"
    security_protocol     = "SASL_SSL"
    payload_serialization = "avro"

    sasl {
      mechanism = "scram_sha_256"
      key       = "SASL_KEY"
      secret    = "SASL_SECRET"
    }

    schema_registry {
      type = "confluent"

      confluent {
        servers      = "https://psrc-xxxxx.us-west-2.aws.confluent.cloud"
        subject_name = "events-value"
        key          = "SR_KEY"
        secret       = "SR_SECRET"
      }
    }
  }
}
```
For production, switch to credential_type = "SECRET_MANAGER" and replace key/secret with key_secret_reference (cloud secret ARN/ID).
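A sketch of that production variant, assuming the schema_registry confluent block accepts the same key/secret → key_secret_reference substitution as sasl (the secret ARNs here are placeholders; see Secrets Management for the exact reference format):

```hcl
resource "onehouse_source" "kafka_prod" {
  name            = "events-kafka"
  source_type     = "CONFLUENT_KAFKA"
  credential_type = "SECRET_MANAGER"

  kafka {
    bootstrap_servers     = "pkc-xxxxx.us-west-2.aws.confluent.cloud:9092"
    connection_protocol   = "SASL"
    security_protocol     = "SASL_SSL"
    payload_serialization = "avro"

    sasl {
      mechanism            = "scram_sha_256"
      # Placeholder ARN — point this at your real cloud secret.
      key_secret_reference = "arn:aws:secretsmanager:us-west-2:123456789012:secret:kafka-sasl-xxxxxx"
    }

    schema_registry {
      type = "confluent"

      confluent {
        servers              = "https://psrc-xxxxx.us-west-2.aws.confluent.cloud"
        subject_name         = "events-value"
        # Placeholder ARN — assumed to replace key/secret here as well.
        key_secret_reference = "arn:aws:secretsmanager:us-west-2:123456789012:secret:sr-creds-xxxxxx"
      }
    }
  }
}
```

With SECRET_MANAGER, the credential values never enter Terraform state; only the secret reference does.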
### Postgres CDC source

```hcl
resource "onehouse_source" "pg" {
  name            = "orders-postgres"
  source_type     = "POSTGRES"
  credential_type = "ONEHOUSE"

  rdbms {
    log_message_bus = "lakelog"

    db_config {
      host          = "pg.internal.example.com"
      port          = "5432"
      database_name = "orders"
      user          = "onehouse_cdc_user"
      password      = "secret"
    }

    lake_log_config {
      intermediate_storage_path = "s3://my-bucket/lakelog/orders-pg/"
    }
  }
}
```
lake_log_config {} is required when log_message_bus = "lakelog" and must be omitted otherwise.
### MySQL CDC source via MSK

```hcl
resource "onehouse_source" "mysql" {
  name            = "users-mysql"
  source_type     = "MY_SQL"
  credential_type = "ONEHOUSE"

  rdbms {
    log_message_bus = "msk"

    db_config {
      host          = "mysql.internal.example.com"
      port          = "3306"
      database_name = "users"
      user          = "onehouse_cdc_user"
      password      = "secret"
    }
  }
}
```
### Onehouse-table source

```hcl
resource "onehouse_source" "ot" {
  name        = "curated-source"
  source_type = "ONEHOUSE_TABLE"

  onehouse_table {
    lake     = "warehouse"
    database = "events"
    name     = "curated_events"
  }
}
```
## Argument Reference

### Top-level

| Argument | Type | Required | Mutability | Description |
|---|---|---|---|---|
| name | string | ✅ | Immutable | Source name. Unique within the project. |
| source_type | string | ✅ | Immutable | Source type. → details below |
| credential_type | string | for credential-bearing types | Immutable | ONEHOUSE (credentials in state) or SECRET_MANAGER (cloud-secret reference). → details |
Exactly one type-specific sub-block must be set, matching source_type.
### source_type — families and sub-blocks
The provider supports eight source types in four families. Pick the sub-block that matches your source_type.
| Family | source_type values | Sub-block | SQL ref |
|---|---|---|---|
| Object storage | S3, GCS | s3 {} or gcs {} | S3 · GCS |
| Event streams | APACHE_KAFKA, MSK_KAFKA, CONFLUENT_KAFKA | kafka {} | Kafka types |
| Onehouse tables | ONEHOUSE_TABLE | onehouse_table {} | Onehouse table type |
| Databases (CDC) | POSTGRES, MY_SQL | rdbms {} | Postgres · MySQL |
For credential-bearing types (Kafka, RDBMS, Onehouse-table), both ONEHOUSE and SECRET_MANAGER credential modes are supported via the top-level credential_type attribute. See Secrets Management for the trade-offs.
### s3 {} / gcs {} block

| Argument | Type | Required | Description |
|---|---|---|---|
| object_storage_bucket_name | string | ✅ | Bucket name (not a URI). → S3 · GCS |
### kafka {} block

| Argument | Type | Required | Description |
|---|---|---|---|
| bootstrap_servers | string | ✅ | Comma-separated host:port list. → details |
| cloud_resource_identifier | string | when MSK_KAFKA | MSK cluster ARN. Required for source_type = "MSK_KAFKA". |
| connection_protocol | string | ✅ | SASL, SSL, or PLAINTEXT. → details |
| security_protocol | string | ✅ | SASL_SSL, SASL_PLAINTEXT, SSL, PLAINTEXT. → details |
| payload_serialization | string | ✅ | avro, json, proto, confluent_proto, confluent_json_sr. → details |
| tls {} | block | optional | TLS certs and keys (4 fields). → details |
| sasl {} | block | when SASL | SASL credentials (mechanism, key/secret or key_secret_reference). → details |
| schema_registry {} | block | optional | Schema registry config (see below). → details |
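To illustrate cloud_resource_identifier, a minimal MSK sketch under stated assumptions: the ARN, bootstrap servers, and SASL credentials are placeholders, and the SCRAM mechanism/protocol pairing is borrowed from the Confluent example above — check CREATE SOURCE for the combinations MSK actually accepts:

```hcl
resource "onehouse_source" "msk" {
  name            = "events-msk"
  source_type     = "MSK_KAFKA"
  credential_type = "ONEHOUSE"

  kafka {
    # Placeholder broker endpoint and cluster ARN.
    bootstrap_servers         = "b-1.mycluster.xxxxx.kafka.us-east-1.amazonaws.com:9096"
    cloud_resource_identifier = "arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/xxxxxxxx"
    connection_protocol       = "SASL"
    security_protocol         = "SASL_SSL"
    payload_serialization     = "json"

    sasl {
      mechanism = "scram_sha_256"
      key       = "MSK_SCRAM_USER"
      secret    = "MSK_SCRAM_PASSWORD"
    }
  }
}
```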
### onehouse_table {} block

| Argument | Type | Required | Description |
|---|---|---|---|
| lake | string | ✅ | Source lake name. → details |
| database | string | ✅ | Source database name. → details |
| name | string | ✅ | Source table name. → details |
| schema_registry {} | block | optional | Schema registry config. → details |
### rdbms {} block

| Argument | Type | Required | Description |
|---|---|---|---|
| log_message_bus | string | ✅ | lakelog, msk, or (Postgres only) google_managed_kafka. → Postgres · MySQL |
| db_config {} | block |