onehouse_catalog

Configures an external catalog. Onehouse syncs table metadata to the catalog so external query engines (Spark, Trino, Athena, etc.) can discover and read Onehouse tables.

Canonical reference

This page documents Terraform-specific behavior (HCL syntax, types, mutability, drift, import). For full parameter semantics, valid values, and defaults, see CREATE CATALOG and DELETE CATALOG.

Example Usage

AWS Glue catalog

resource "onehouse_catalog" "glue" {
  name = "prod-glue"
  type = "GLUE"
  glue {
    region = "us-west-2"
  }
}

Cross-account Glue with table-format suffixes. Setting arn and aws_account_id enables cross-account access (aws_account_id is also required for non-AWS orgs). table_format_suffix additionally syncs each Hudi table to Glue in each named format, under a suffixed table name:

resource "onehouse_catalog" "glue_cross_account" {
  name = "prod-glue-cross-account"
  type = "GLUE"
  glue {
    region         = "us-west-2"
    arn            = "arn:aws:iam::123456789012:role/onehouse-glue"
    aws_account_id = "123456789012"
    table_format_suffix = {
      iceberg = "_ice"
      delta   = "_dlt"
    }
  }
}

AWS Glue Iceberg REST Catalog

resource "onehouse_catalog" "glue_irc" {
  name = "prod-glue-irc"
  type = "GLUE_ICEBERG_REST_CATALOG"
  glue_iceberg_rest_catalog {
    catalog_name   = "my_glue_catalog"
    aws_account_id = "123456789012"
    aws_region     = "us-west-2"
  }
}

Hive Metastore catalog

resource "onehouse_catalog" "hive" {
  name = "internal-hive"
  type = "HIVE"
  hive {
    metastore_servers = [
      "thrift://hms.internal.example.com:9083",
    ]
  }
}

Unity Catalog (with personal access token)

resource "onehouse_catalog" "unity" {
  name = "prod-unity"
  type = "UNITY"
  unity {
    databricks_host = "https://dbc-91697e55-175d.cloud.databricks.com"
    http_path       = "sql/protocolv1/o/4044103294663248/0923-070957-cgil4gk"
    catalog_name    = "production_data"
    auth_token      = var.databricks_pat
  }
}

Unity Catalog (with OAuth service principal)

resource "onehouse_catalog" "unity_oauth" {
  name = "prod-unity-oauth"
  type = "UNITY"
  unity {
    databricks_host = "https://dbc-91697e55-175d.cloud.databricks.com"
    http_path       = "sql/protocolv1/o/4044103294663248/0923-070957-cgil4gk"
    catalog_name    = "production_data"
    oauth_client_id = var.databricks_oauth_client_id
    oauth_secret    = var.databricks_oauth_secret
  }
}

Warning

auth_token, oauth_client_id, and oauth_secret are write-only — they are sent to the server on CREATE but not stored in Terraform state or returned by DESCRIBE. Use either a PAT (auth_token) or OAuth (oauth_client_id + oauth_secret), not both.
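The Unity examples above reference var.databricks_pat, var.databricks_oauth_client_id, and var.databricks_oauth_secret without declaring them. A minimal sketch of matching declarations follows, assuming they live in the same module as the resource. The sensitive flag is standard Terraform behavior (it redacts the value in plan and apply output) and is not specific to this provider.

```hcl
# Hypothetical declarations for the variables used in the Unity examples.
# Declare only the set that matches the auth method your configuration uses.

# PAT authentication (unity.auth_token)
variable "databricks_pat" {
  type      = string
  sensitive = true
}

# OAuth service principal authentication (unity.oauth_client_id + unity.oauth_secret)
variable "databricks_oauth_client_id" {
  type      = string
  sensitive = true
}

variable "databricks_oauth_secret" {
  type      = string
  sensitive = true
}
```

Values can then be supplied through environment variables such as TF_VAR_databricks_pat, or through a .tfvars file that is kept out of version control.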

OneTable catalog

resource "onehouse_catalog" "onetable" {
  name = "onetable-prod"
  type = "ONETABLE"
  onetable {
    target_formats = ["iceberg", "delta"]
  }
}

DataHub catalog

resource "onehouse_catalog" "datahub" {
  name = "datahub-prod"
  type = "DATAHUB"
  datahub {
    server_url           = "https://datahub.internal.example.com"
    credential_type      = "SECRET_MANAGER"
    auth_token_reference = "arn:aws:secretsmanager:us-west-2:111122223333:secret:datahub-AbCdEf"
  }
}

Snowflake catalog

resource "onehouse_catalog" "snowflake" {
  name = "prod-snowflake"
  type = "SNOWFLAKE"
  snowflake {
    account_identifier   = "myorg-myaccount"
    warehouse            = "COMPUTE_WH"
    database             = "MY_DATABASE"
    external_volume      = "my_external_volume"
    catalog_integration  = "my_catalog_integration"
    auth_token_reference = "arn:aws:secretsmanager:us-west-2:111122223333:secret:snowflake-creds-AbCdEf"
    role                 = "DATA_ENGINEER"
  }
}

OneLake catalog

resource "onehouse_catalog" "onelake" {
  name = "prod-onelake"
  type = "ONELAKE"
  onelake {
    workspace_id  = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    item_id       = "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy"
    tenant_id     = "zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzzz"
    client_id     = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
    client_secret = var.azure_client_secret
    item_type     = "LAKEHOUSE"
  }
}

Argument Reference

Top-level

| Argument | Type | Required | Mutability | Description |
| --- | --- | --- | --- | --- |
| name | string | | Immutable | Catalog name. |
| type | string | | Immutable | One of GLUE, GLUE_ICEBERG_REST_CATALOG, HIVE, UNITY, ONETABLE, DATAHUB, DATAPROC, BIGQUERY, SNOWFLAKE, ONELAKE. See the section on type below. |

Exactly one type-specific sub-block must be set, matching the type value.

type — when to pick each value

| Value | Use when |
| --- | --- |
| GLUE | You want AWS Glue Data Catalog discovery for Athena, EMR, Redshift Spectrum. |
| GLUE_ICEBERG_REST_CATALOG | You want to access Iceberg tables via the AWS Glue Iceberg REST Catalog. AWS only. |
| HIVE | You have a Hive Metastore (e.g., internal Spark/Hive deployments). |
| UNITY | You use Databricks Unity Catalog. |
| ONETABLE | You want cross-format table interop via OneTable (Hudi / Iceberg / Delta). |
| DATAHUB | You use DataHub for data discovery and lineage. |
| DATAPROC | You use a Google Dataproc Metastore (same Hive Thrift interface). |
| BIGQUERY | You want BigQuery external tables over your Onehouse data. |
| SNOWFLAKE | You use Snowflake and want to sync table metadata for external table access. |
| ONELAKE | You use Microsoft OneLake (Fabric) and want to sync table metadata. |

glue {} block

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| region | string | | AWS region of the Glue Data Catalog. Defaults to the project region when omitted. |
| arn | string | | Glue catalog ARN. Required for cross-account access; omit for same-account. |
| aws_account_id | string | | AWS account ID hosting the Glue catalog. Required for cross-account access and for non-AWS (GCP/Azure) orgs. |
| table_format_suffix | map(string) | | Additionally sync each Hudi table to Glue as the given table formats, appending the suffix to the table name (e.g. orders → orders_ice). Keys are table formats (iceberg or delta); values are suffixes (lowercase alphanumeric/underscore, 1–128 characters). |

glue_iceberg_rest_catalog {} block

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| catalog_name | string | | Name of the Glue Iceberg REST catalog. Alphanumeric and underscores only, 1–255 characters. |
| aws_account_id | string | | AWS account ID hosting the Glue catalog. |
| aws_region | string | | AWS region of the Glue catalog (e.g. us-west-2). |

hive {} block

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| metastore_servers | list(string) | | One or more Thrift URIs (thrift://host:port). The server attempts to connect on CREATE, so the URIs must be reachable from the Onehouse control plane. |

unity {} block

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| databricks_host | string | | Databricks workspace URL (e.g. https://dbc-xxx.cloud.databricks.com). |
| http_path | string | | Databricks SQL warehouse HTTP path. |
| catalog_name | string | | Name of the catalog inside Unity. |
| auth_token | string | | Databricks personal access token. Sensitive, write-only. Mutually exclusive with oauth_client_id/oauth_secret. |
| oauth_client_id | string | | OAuth client ID for Databricks service principal authentication. Sensitive, write-only. Use with oauth_secret. |
| oauth_secret | string | | OAuth client secret for Databricks service principal authentication. Sensitive, write-only. Use with oauth_client_id. |

Use either auth_token (PAT auth) or oauth_client_id + oauth_secret (OAuth service principal auth), not both.

onetable {} block

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| target_formats | set(string) | | Target table formats. Subset of delta, iceberg. |

datahub {} block

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| server_url | string | | DataHub server URL. |
| data_platform_name | string | | Identifier for the Hudi platform in DataHub. |
| dataset_environment | string | | DataHub environment (e.g. prod, dev). |
| credential_type | string | | ONEHOUSE (default) or SECRET_MANAGER. |
| auth_token | string | when credential_type = "ONEHOUSE" | DataHub auth token. Sensitive. |
| auth_token_reference | string | when credential_type = "SECRET_MANAGER" | Cloud secret ARN/ID. |
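The DataHub example under Example Usage covers credential_type = "SECRET_MANAGER". For comparison, here is a sketch of the ONEHOUSE credential type, where the token is passed inline through auth_token. The resource label, the catalog name, the variable datahub_token, and the data_platform_name and dataset_environment values are illustrative placeholders, not values taken from the Onehouse documentation.

```hcl
# Hypothetical variable holding the DataHub token.
variable "datahub_token" {
  type      = string
  sensitive = true
}

resource "onehouse_catalog" "datahub_inline" {
  name = "datahub-dev"
  type = "DATAHUB"
  datahub {
    server_url          = "https://datahub.internal.example.com"
    data_platform_name  = "hudi" # placeholder identifier
    dataset_environment = "dev"
    credential_type     = "ONEHOUSE" # the default; shown here for clarity
    auth_token          = var.datahub_token
  }
}
```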

dataproc {} block

Set when type = "DATAPROC". Uses the same Hive Thrift interface as hive {}.

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| metastore_servers | list(string) | | Dataproc metastore Thrift URIs (thrift://host:port). |
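Example Usage has no DATAPROC entry, so the following sketch mirrors the Hive example using only the argument listed in the table above. The resource label, catalog name, and Thrift host are hypothetical placeholders.

```hcl
resource "onehouse_catalog" "dataproc" {
  name = "prod-dataproc"
  type = "DATAPROC"
  dataproc {
    # Placeholder URI; use the Thrift endpoint of your Dataproc Metastore.
    metastore_servers = [
      "thrift://dataproc-metastore.internal.example.com:9083",
    ]
  }
}
```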

bigquery {} block

Set when type = "BIGQUERY".

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | | GCP project ID. Required when the bigquery {} block is set. |
| big_lake_connection_id | string | | BigLake connection ID for external table access. |
| require_partition_filter | bool | | Require a partition filter on queries. Defaults to false. |
| use_iceberg | bool | | Use Iceberg format for the BigQuery external tables. Defaults to false. |
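Example Usage has no BIGQUERY entry either. The sketch below sets every argument from the table above. The project ID and connection ID are made-up placeholders, and the exact format expected for big_lake_connection_id is not stated on this page, so check the canonical CREATE CATALOG reference before relying on it.

```hcl
resource "onehouse_catalog" "bigquery" {
  name = "prod-bigquery"
  type = "BIGQUERY"
  bigquery {
    project_id               = "my-gcp-project"        # placeholder
    big_lake_connection_id   = "my-biglake-connection" # placeholder; format is an assumption
    require_partition_filter = true                    # defaults to false when omitted
    use_iceberg              = false                   # the documented default, shown explicitly
  }
}
```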

snowflake {} block

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| account_identifier | string | | Snowflake account identifier (e.g. myorg-myaccount). |
| warehouse | string | | Snowflake warehouse name. |
| database | string | | Snowflake database name. |
| external_volume | string | | Snowflake external volume for Iceberg tables. |
| catalog_integration | string | | Snowflake catalog integration name. |
| auth_token_reference | string | | Cloud secret manager identifier for Snowflake credentials. Sensitive, write-only. |
| role | string | | Snowflake role to assume. |

onelake {} block

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| workspace_id | string | | Microsoft Fabric workspace ID. |
| item_id | string | | OneLake item ID. |
| tenant_id | string | | Azure AD tenant ID. |
| client_id | string | | Azure AD application (client) ID. |
| client_secret | string | | Azure AD client secret. Sensitive. |
| item_type | string | | OneLake item type. One of LAKEHOUSE or MIRRORED_AZURE_DATABRICKS_CATALOG. |

Attribute Reference

| Attribute | Type | Description |
| --- | --- | --- |
| id | string | Catalog UUID assigned by Onehouse. |
| created_at | string | Creation time in RFC3339. |
| created_by | string | Identity that created the catalog. |
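These computed attributes can be read like any other resource attribute. As a small illustration, the outputs below expose them for the onehouse_catalog.glue resource from the first example; the output names are arbitrary.

```hcl
output "glue_catalog_id" {
  value = onehouse_catalog.glue.id
}

output "glue_catalog_created_at" {
  value = onehouse_catalog.glue.created_at
}
```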

Import

terraform import onehouse_catalog.glue prod-glue

After import, the server does not return sensitive fields (unity.auth_token, datahub.auth_token, etc.). Re-supply them in your .tf file before the next terraform apply to avoid a forced replacement.
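On Terraform 1.5 or later, the same import can also be expressed declaratively with an import block. This is a sketch that assumes the provider accepts the same ID as the CLI command above (the catalog name); the resource body must match the existing catalog.

```hcl
# Declarative equivalent of: terraform import onehouse_catalog.glue prod-glue
import {
  to = onehouse_catalog.glue
  id = "prod-glue" # assumed to be the catalog name, as in the CLI form
}

resource "onehouse_catalog" "glue" {
  name = "prod-glue"
  type = "GLUE"
  glue {
    region = "us-west-2"
  }
}
```

Running terraform plan then shows the pending import alongside any differences between the configuration and the imported catalog.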

Data Source

data "onehouse_catalog" "lookup" {
  name = "prod-glue"
}

output "catalog_type" {
  value = data.onehouse_catalog.lookup.type
}

Limitations

  • No Update. The API has no ALTER CATALOG — any field change forces destroy + recreate.
  • Write-only credentials. Sensitive fields (auth_token, oauth_client_id, oauth_secret, auth_token_reference, client_secret) are write-only — they are sent on CREATE but not returned by DESCRIBE. After import, re-supply them in your .tf file.
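Because every field change forces a destroy and recreate, one optional safeguard is Terraform's standard prevent_destroy lifecycle setting. It is a general Terraform feature rather than anything specific to this provider, and it is shown here on the Glue example as a sketch.

```hcl
resource "onehouse_catalog" "glue" {
  name = "prod-glue"
  type = "GLUE"
  glue {
    region = "us-west-2"
  }

  lifecycle {
    # Terraform rejects any plan that would destroy this resource,
    # including the destroy step of a forced replacement.
    prevent_destroy = true
  }
}
```

With this in place, an edit that would replace the catalog fails at plan time instead of silently recreating it. Remove the setting when a replacement is intended.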