📄️ onehouse_cluster
Provisions and manages a Onehouse compute cluster. Compute clusters are the execution engines for jobs, SQL workloads, and ingestion flows.
📄️ onehouse_lake
Registers a Onehouse data lake. A lake is the top-level grouping for databases and tables. It points at a root path in cloud storage and is associated with a managed compute cluster that runs table services.
📄️ onehouse_database
Creates a database — a lake-scoped namespace for tables.
📄️ onehouse_catalog
Configures an external catalog. Onehouse syncs table metadata to the catalog so external query engines (Spark, Trino, Athena, etc.) can discover and read Onehouse tables.
📄️ onehouse_source
Defines a data source for ingestion. A source represents the upstream system that a onehouse_flow reads from, such as an S3 bucket, a Kafka topic, or a Postgres database.
📄️ onehouse_flow
Configures a flow: an ingestion pipeline that reads from a onehouse_source and writes to a destination Onehouse table identified by (lake, database, table_name).
📄️ onehouse_table_service
Manages a table service — an automated maintenance operation that runs on a specific table. Table services keep tables healthy by clustering data for faster reads, compacting small files, cleaning up old versions, syncing metadata to external catalogs, creating automatic savepoints, and restoring to previous savepoints.
📄️ onehouse_transformation
Defines a reusable, named transformation that flows can apply to data as it is ingested into Onehouse tables. A transformation is created independently and then referenced by name from one or more flows.
📄️ onehouse_transformer_jar
Manages a custom transformer JAR registered from an object-storage location (S3 or GCS). Onehouse copies the uploaded JAR to its own managed storage location and references it by name for use in custom transformations.