onehouse_lake
Registers a Onehouse data lake. A lake is the top-level grouping for databases and tables. It points at a root path in cloud storage and is associated with a managed compute cluster that runs table services.
This page documents Terraform-specific behavior (HCL syntax, types, mutability, drift, import). For full parameter semantics, valid values, and defaults, see CREATE LAKE, ALTER LAKE, and DELETE LAKE.
Deleting a lake is irreversible. By default, terraform destroy fails if the lake contains databases or tables. Set force_destroy = true to cascade-delete all dependents.
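As a minimal sketch of the opt-in described above, the following lake sets force_destroy so that terraform destroy removes its databases and tables along with it. The resource label scratch, the lake name, and the bucket path are illustrative placeholders, not values from a real project.

```hcl
# Hypothetical throwaway lake; name and path are placeholders.
resource "onehouse_lake" "scratch" {
  name                     = "scratch"
  lake_type                = "MANAGED"
  bucket_path              = "s3://example-bucket/lakes/scratch/"
  default_services_cluster = "Default Managed Cluster - 2"

  # Opt in to cascade deletion: destroying this lake also deletes
  # every database and table it contains. This cannot be undone.
  force_destroy = true
}
```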
Example Usage
Observed lake on an existing S3 bucket
```hcl
resource "onehouse_lake" "raw" {
  name                     = "raw_events"
  lake_type                = "OBSERVED"
  bucket_path              = "s3://my-data-lake/raw/"
  default_services_cluster = "Default Managed Cluster - 2"
}
```
Managed lake on GCS
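```hcl
resource "onehouse_lake" "warehouse" {
  name        = "warehouse"
  lake_type   = "MANAGED"
  bucket_path = "gs://onehouse-warehouse/lakes/warehouse/"

  # Assumes an onehouse_cluster.services resource is declared elsewhere
  # in the same configuration.
  default_services_cluster = onehouse_cluster.services.name
}
```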
Reference to a separately managed cluster
```hcl
resource "onehouse_cluster" "services" {
  name        = "table-services"
  type        = "Managed"
  min_ocu     = 2
  max_ocu     = 8
  worker_type = "oh-general-4"
}

resource "onehouse_lake" "warehouse" {
  name                     = "warehouse"
  lake_type                = "MANAGED"
  bucket_path              = "s3://onehouse-warehouse/lakes/warehouse/"
  default_services_cluster = onehouse_cluster.services.name
}
```
Argument Reference
| Argument | Type | Required | Mutability | Description |
|---|---|---|---|---|
| name | string | ✅ | Immutable | Lake name. Must be unique in the project. |
| lake_type | string | ✅ | Immutable | One of MANAGED or OBSERVED. See the lake_type section below. |
| bucket_path | string | ✅ | Immutable | Cloud-storage root path. Must end with a trailing / (server-validated). E.g., s3://bucket/dir/, gs://bucket/dir/. |
| default_services_cluster | string | ✅ | Mutable | Name of the compute cluster that runs table services. See the default_services_cluster section below. |
| force_destroy | boolean | | Immutable | When true, terraform destroy cascade-deletes all databases and tables in the lake. Default false. |
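
Because the trailing slash on bucket_path is validated by the server, a missing slash would likely surface only at apply time. One way to catch it earlier is a variable validation block. This is a sketch: the variable name lake_bucket_path and the resource label validated are hypothetical, and the check covers only the trailing slash, not whatever other rules the server enforces.

```hcl
variable "lake_bucket_path" {
  type        = string
  description = "Cloud-storage root path for the lake, e.g. s3://bucket/dir/."

  validation {
    # regex() raises an error when nothing matches; can() turns that into false.
    condition     = can(regex("/$", var.lake_bucket_path))
    error_message = "The bucket path must end with a trailing slash (/)."
  }
}

resource "onehouse_lake" "validated" {
  name                     = "validated"
  lake_type                = "OBSERVED"
  bucket_path              = var.lake_bucket_path
  default_services_cluster = "Default Managed Cluster - 2"
}
```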
lake_type — MANAGED vs OBSERVED
| Value | Behavior |
|---|---|
| MANAGED | Onehouse fully manages the lake: it writes Hudi tables, runs compaction and cleaning, and owns the root directory. |
| OBSERVED | Onehouse reads existing tables under bucket_path but does not write. Use this for an existing data lake you want to expose through Onehouse without migrating data. |
lake_type is immutable. To switch a lake between modes, you must destroy and recreate it. Set force_destroy = true if the lake contains dependents.
default_services_cluster
The managed cluster that runs table services (compaction, cleaning, clustering) for tables in this lake. This is the only mutable field on a lake — changing it issues ALTER LAKE <name> SET DEFAULT_SERVICES_CLUSTER = <new>. The new cluster must exist before the change is applied.
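The requirement that the new cluster exist first can be met by referencing the cluster resource rather than hard-coding its name. The sketch below assumes a second cluster, onehouse_cluster.services_v2 (a hypothetical name with placeholder sizing), declared with the same arguments as the cluster in Example Usage. The reference gives Terraform an implicit dependency, so the cluster is created before the lake is updated.

```hcl
# Hypothetical replacement cluster; sizing values are placeholders.
resource "onehouse_cluster" "services_v2" {
  name        = "table-services-v2"
  type        = "Managed"
  min_ocu     = 2
  max_ocu     = 16
  worker_type = "oh-general-4"
}

resource "onehouse_lake" "warehouse" {
  name        = "warehouse"
  lake_type   = "MANAGED"
  bucket_path = "s3://onehouse-warehouse/lakes/warehouse/"

  # Previously onehouse_cluster.services.name. Only this argument changes,
  # so the plan should show an in-place update rather than a replacement.
  default_services_cluster = onehouse_cluster.services_v2.name
}
```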
Attribute Reference
| Attribute | Type | Description |
|---|---|---|
| id | string | Equal to name (lakes have no separate UUID in the public API). |
| created_at | string | Creation time in RFC3339. |
| created_by | string | Identity that created the lake. Empty when created by a service principal. |
| num_databases | number | Number of databases in the lake. |
| databases | list(string) | Database names in the lake (in DESCRIBE LAKE order). |
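
These computed attributes can be read like any other resource attribute. A small sketch, assuming a lake declared as onehouse_lake.warehouse as in Example Usage; the output names are illustrative.

```hcl
output "warehouse_created_at" {
  description = "Creation time of the lake."
  value       = onehouse_lake.warehouse.created_at
}

output "warehouse_databases" {
  description = "Names of the databases currently in the lake."
  value       = onehouse_lake.warehouse.databases
}
```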
Import
```shell
terraform import onehouse_lake.warehouse warehouse
```
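
On Terraform 1.5 or later, the same import can be declared in configuration with an import block instead of running the CLI command. This sketch assumes the import ID is the lake name, consistent with the id attribute described above, and that the resource block's arguments mirror the existing lake; the cluster name shown is a placeholder.

```hcl
import {
  to = onehouse_lake.warehouse
  id = "warehouse" # import ID: the lake name
}

resource "onehouse_lake" "warehouse" {
  name                     = "warehouse"
  lake_type                = "MANAGED"
  bucket_path              = "s3://onehouse-warehouse/lakes/warehouse/"
  default_services_cluster = "table-services"
}
```

Review the plan before applying: if an immutable argument in the resource block differs from the existing lake, Terraform will propose a replacement rather than a clean import.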
Data Source
```hcl
data "onehouse_lake" "lookup" {
  name = "warehouse"
}

output "lake_default_cluster" {
  value = data.onehouse_lake.lookup.default_services_cluster
}
```
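
The data source can also feed values from a lake managed elsewhere into new resources. The sketch below reuses the looked-up lake's services cluster for a second lake. It relies on the data.onehouse_lake.lookup block shown above; the resource label staging and its bucket path are hypothetical.

```hcl
resource "onehouse_lake" "staging" {
  name        = "staging"
  lake_type   = "MANAGED"
  bucket_path = "s3://example-bucket/lakes/staging/"

  # Run table services on the same cluster as the existing warehouse lake.
  default_services_cluster = data.onehouse_lake.lookup.default_services_cluster
}
```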
Limitations
- Immutable fields force replacement. Changing name, lake_type, or bucket_path forces destroy + recreate. If force_destroy is not set, deletion fails when dependents exist.
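
Since a change to an immutable field plans a destroy and recreate, one guard against accidental replacement is Terraform's prevent_destroy lifecycle setting, which makes any plan that would destroy the resource fail with an error. A sketch with placeholder values:

```hcl
resource "onehouse_lake" "warehouse" {
  name                     = "warehouse"
  lake_type                = "MANAGED"
  bucket_path              = "s3://onehouse-warehouse/lakes/warehouse/"
  default_services_cluster = "table-services"

  lifecycle {
    # Reject any plan that would destroy this lake, including the
    # destroy step of a replacement triggered by an immutable field.
    prevent_destroy = true
  }
}
```

The guard applies only while the resource block is present in the configuration; removing the block entirely also removes the protection.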