Skip to main content

CREATE CATALOG

Description

Creates a new catalog.

Note that the SQL statement does not end with ;

Syntax

CREATE CATALOG `<name>`
TYPE = { 'GLUE' | 'HIVE' | 'BIGQUERY' | 'DATAHUB' | 'ONETABLE' | 'DATAPROC' | 'UNITY' }
[ CREDENTIAL_TYPE = { 'CREDENTIAL_TYPE_ONEHOUSE' | 'CREDENTIAL_TYPE_SECRET_MANAGER' } ]
WITH 'key1' = 'value1', 'key2' = 'value2' ....

Required parameters

  • <name>: Identifier for the catalog
    • Must be unique for your project
    • Must start with an alphabetic character and cannot contain spaces or special characters unless the entire identifier string is enclosed in double quotes (e.g. "Staging Catalog")
    • Case-sensitive
  • TYPE: Specifies the type of the catalog to be created.

Optional parameters

  • CREDENTIAL_TYPE: Credential Management Type. Refer Credential Management to read more.
    • CREDENTIAL_TYPE_ONEHOUSE: Credentials get stored in onehouse control-plane.
    • CREDENTIAL_TYPE_SECRET_MANAGER: Credentials are stored in customer dataplane, users need to create the secret and share the cloud specific secret identifier.
    • Applicable only in case of datahub catalog.
    • Default: CREDENTIAL_TYPE_ONEHOUSE.

Special parameters

Include special parameters after WITH as type String.

Hive Metastore

  • hive.catalog.servers - Server information of hive catalogs. Specify values like 'hive.catalog.servers' = 'server1' or 'hive.catalog.servers' = 'server1, server2' depending upon single or multiple servers.

AWS Glue Metastore

  • glue.region - Optionally, specify the AWS region of the Glue catalog. If not specified, this will be set to the same region as your Onehouse project.
  • glue.arn - Optionally, specify the ARN of the IAM role granting Onehouse access to the Glue catalog. This is only necessary when the Glue catalog is in a different AWS account from your Onehouse project. Follow these steps to set up the IAM role.

DataProc Metastore

  • hive.catalog.servers - Server information of hive catalogs. Specify values like 'hive.catalog.servers' = 'server1' or 'hive.catalog.servers' = 'server1, server2' depending upon single or multiple servers.

BigQuery

  • bq.project.id - Google Cloud Project Name where BigQuery is to be Synced.
  • bq.require.partition.filter - Require partition filtering when users query the table from BigQuery.
  • bq.big.lake.connection.enabled - Set to true to set up a BigLake connection for the table

DataHub

  • datahub.server.url - Server URL of the DataHub.
  • datahub.data.platform.name - Identifier to represent Hudi when creating its corresponding DataPlatform entity within Datahub.
  • datahub.dataset.environment - Environment to use when pushing entities to Datahub.

If CREDENTIAL_TYPE is CREDENTIAL_TYPE_ONEHOUSE (Default):

  • datahub.auth.token - Auth token to connect to the DataHub.

If CREDENTIAL_TYPE is CREDENTIAL_TYPE_SECRET_MANAGER:

  • datahub.auth.token.reference - Cloud specific identifier (e.g., ARN for AWS) of a secret containing the authtoken in specified format.

Onetable

  • onetable.target.formats - Target formats to sync Hudi table to. Valid values are 'delta', 'iceberg' and 'delta, iceberg'.

Unity Catalog

  • unity.databricks.host - Host URL of the Databricks compute resource
  • unity.http.path - HTTP Path of the Databricks compute resource
  • unity.auth.token - personal-access-token of the databricks workspace
  • unity.catalog.name - Catalog name in the databricks to be synced

Example

Example syntax for creating a GLUE CATALOG.

CREATE CATALOG my_glue_catalog TYPE = 'glue'

Example syntax for creating a HIVE CATALOG.

CREATE CATALOG my_hive_catalog TYPE = 'hive' WITH 'hive.catalog.servers' = 'server1,server2'

Example syntax for creating a BIGQUERY CATALOG.

CREATE CATALOG `my_bigquery_catalog` TYPE = 'bigquery' WITH 'bq.require.partition.filter' = 'true', 
'bq.project.id' = 'my-project', 'bq.big.lake.connection.enabled' = 'false'

Example syntax for creating a DATAHUB CATALOG.

CREATE CATALOG `my_datahub_catalog` TYPE = 'datahub' WITH 'datahub.server.url' = 'http://my.datahub.com', 
'datahub.dataset.environment' = 'dev', 'datahub.auth.token' = 'example-auth-token',
'datahub.data.platform.name' = 'example-platform'

Example syntax for creating a ONETABLE CATALOG.

CREATE CATALOG `my_onetable_catalog` TYPE = 'onetable' WITH 'onetable.target.formats' = 'iceberg, delta'

Example syntax for creating a DATAPROC CATALOG.

CREATE CATALOG `my_dataproc_catalog` TYPE = 'dataproc' WITH 'hive.catalog.servers' = 'server1,server2'

Example syntax for creating a UNITY CATALOG.

CREATE CATALOG `my_unity_catalog` TYPE = 'unity' WITH 'unity.databricks.host' = 'https://my.cloud.databricks.com',
'unity.http.path' = 'sql/1.0/warehouse/xyz', 'unity.auth.token' = 'example-auth-token', 'unity.catalog.name' = 'example-catalog-name'

Sample response