
CREATE CLUSTER

Description

Create a new Cluster.

Note: do not terminate the statement with a semicolon (;).

Syntax

CREATE CLUSTER <cluster_name>
TYPE = { 'Managed' | 'SQL' | 'Spark' | 'Open_Engines' }
MAX_OCU = <int>
MIN_OCU = <int>
[ WITH 'key1' = 'value1', 'key2' = 'value2', ... ]

Sample response

Examples

Create a Managed Cluster:

CREATE CLUSTER managed_cluster_retail
TYPE = 'Managed'
MAX_OCU = 10
MIN_OCU = 1

Create a Spark Cluster with larger workers running on spot instances:

CREATE CLUSTER jobs_prod
TYPE = 'Spark'
MAX_OCU = 10
WITH 'worker.type' = 'oh-general-8', 'worker.spot' = 'TRUE'

Create a Trino Open Engines Cluster:

CREATE CLUSTER trino_cluster
TYPE = 'Open_Engines'
MAX_OCU = 10
MIN_OCU = 1
WITH 'open_engines.engine' = 'Trino', 'open_engines.catalog' = 'glue_catalog_name'

Create a Ray Open Engines Cluster:

CREATE CLUSTER ray_cluster
TYPE = 'Open_Engines'
MAX_OCU = 10
MIN_OCU = 1
WITH 'open_engines.engine' = 'Ray', 'open_engines.ray.max_cpu_units' = '7', 'open_engines.ray.min_cpu_units' = '1', 'open_engines.ray.max_gpu_units' = '3'
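
Create a Flink Open Engines Cluster (a minimal sketch, not taken from the original examples). It relies on the Open Engines parameters listed further down, which allow 'Flink' as an engine and make the catalog optional for Flink. The Cluster name flink_cluster and the OCU values are hypothetical. The `--` lines are annotations for the reader; remove them if your endpoint does not accept SQL comments.

```sql
-- Hypothetical name and OCU values, chosen to mirror the Trino example.
-- 'open_engines.catalog' is omitted because it is optional for Flink.
CREATE CLUSTER flink_cluster
TYPE = 'Open_Engines'
MAX_OCU = 10
MIN_OCU = 1
WITH 'open_engines.engine' = 'Flink'
```

To attach a catalog, append 'open_engines.catalog' = '<catalog_name>' to the WITH list, as in the Trino example.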

Required parameters

  • <cluster_name>: Specify a unique name for the Cluster.
  • TYPE: Specify the type of Cluster. This determines which workloads the Cluster can run.
  • MAX_OCU: Specify the maximum number of OCUs the Cluster can scale up to.
  • MIN_OCU: Specify the minimum number of OCUs the Cluster keeps running at all times.

Special parameters

Pass special parameters and advanced configs after WITH as comma-separated 'key' = 'value' pairs. Both keys and values are Strings, so wrap them in single quotes.

Instance type parameters

  • worker.type: Specify the worker instance type as a String. Must be a standard Onehouse instance type (e.g. 'oh-general-4') or a custom instance type that has been enabled for the project (e.g. 'm8g.xlarge'). See the instance types documentation for details.
  • worker.spot: Specify 'TRUE' to enable spot instances for workers, or 'FALSE' to disable them.
  • driver.type: Specify the driver instance type as a String. Default is 'Auto', which uses the same instance type as the workers.
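
The three instance type parameters can be combined in one WITH clause. The following is a sketch under stated assumptions rather than a verified configuration: the Cluster name is hypothetical, 'm8g.xlarge' is usable only if that custom instance type has been enabled for the project, and it assumes driver.type accepts the same instance type names as worker.type. The `--` lines are annotations for the reader; remove them if your endpoint does not accept SQL comments.

```sql
-- Hypothetical name and instance choices (assumptions, not from the reference).
-- On-demand custom workers with an explicitly sized driver instead of 'Auto'.
CREATE CLUSTER jobs_custom_workers
TYPE = 'Spark'
MAX_OCU = 10
MIN_OCU = 1
WITH 'worker.type' = 'm8g.xlarge', 'worker.spot' = 'FALSE', 'driver.type' = 'oh-general-4'
```

Leaving out 'driver.type' keeps the default 'Auto', in which case the driver uses the same instance type as the workers.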

Open Engines parameters

  • open_engines.engine: Specify the compute engine as 'Trino', 'Flink', or 'Ray'.
  • open_engines.catalog: For Flink or Trino Clusters, specify the name of the catalog to use. Required for Trino, optional for Flink.
  • open_engines.ray.max_cpu_units: Specify the maximum number of CPU units the Cluster can scale up to. The maximum CPU units and maximum GPU units must sum to the MAX_OCU value of the statement.
  • open_engines.ray.min_cpu_units: Specify the minimum number of CPU units the Cluster keeps running at all times. This must equal the MIN_OCU value of the statement.
  • open_engines.ray.max_gpu_units: Specify the maximum number of GPU units the Cluster can scale up to. The maximum CPU units and maximum GPU units must sum to the MAX_OCU value of the statement.
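
The two constraints on the Ray unit parameters can be checked with simple arithmetic before submitting the statement. As an illustration with hypothetical values: for MAX_OCU = 16 and MIN_OCU = 2, a split of 12 CPU units and 4 GPU units is valid because 12 + 4 = 16, and min_cpu_units must be 2 to match MIN_OCU. The Cluster name is also hypothetical, and the `--` lines are annotations for the reader; remove them if your endpoint does not accept SQL comments.

```sql
-- Hypothetical name and sizes, chosen only to illustrate the constraints.
-- max_cpu_units + max_gpu_units = 12 + 4 = 16 = MAX_OCU
-- min_cpu_units = 2 = MIN_OCU
CREATE CLUSTER ray_training
TYPE = 'Open_Engines'
MAX_OCU = 16
MIN_OCU = 2
WITH 'open_engines.engine' = 'Ray', 'open_engines.ray.max_cpu_units' = '12', 'open_engines.ray.min_cpu_units' = '2', 'open_engines.ray.max_gpu_units' = '4'
```

The existing Ray example follows the same pattern: 7 + 3 = 10 matches its MAX_OCU, and its min_cpu_units of 1 matches its MIN_OCU.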