Create a Cluster
Open the Clusters page in the Onehouse console to create a new Cluster. The sections below cover the configurations needed to set up the Cluster.
You can create multiple Clusters of the same type to isolate different workloads.
Basic configurations
- Name: The name by which to identify the Cluster.
- Cluster Type: Select from the following Cluster types. This cannot be changed after creation.
  - Managed: Run Flows to ingest data and Table Services to optimize your tables.
  - SQL: Run SQL workloads on the Quanton engine. Submit queries through a JDBC Endpoint or the Onehouse SQL Editor.
  - Spark: Create and run Jobs to execute Apache Spark code in Python, Java, or Scala on the Quanton engine.
  - Open Engines: Deploy open-source compute engines on Onehouse infrastructure with Open Engines. Supported engines: Trino, Apache Flink, and Ray.
  - Notebook (beta feature): Deploy a Jupyter notebook on Onehouse infrastructure to run interactive PySpark workloads on the Onehouse Quanton engine.
  - LakeBase (beta feature): Run fast, interactive SQL queries on your data lakehouse with automatic scaling and low-latency query execution.
OCU configurations
Specify OCU limits to constrain the minimum and maximum Onehouse Compute Units (OCU) the Cluster will use per hour. These limits determine how many instances the Cluster can use, based on the hourly OCU cost of your selected instance type(s).
- Max OCU / Hour: Maximum OCU the Cluster will use per hour. Set this to manage your costs.
- Min OCU / Hour: Minimum OCU the Cluster will use per hour. Set this higher if you need to keep the Cluster warm.
Setting a Max OCU for your Clusters helps you keep costs within budget. If your data volume or workload complexity grows and the Cluster reaches its Max OCU, the Cluster will stop scaling up. This may delay data processing, so consider how your workloads grow or fluctuate when choosing the limit.
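To make the relationship between OCU limits and instance counts concrete, here is a minimal Python sketch of the arithmetic. It is illustrative only and not part of the Onehouse API: the names `OcuLimits`, `instance_bounds`, and `would_be_capped` are hypothetical, the per-instance OCU cost is a made-up value, and the rounding (up for the minimum, down for the maximum) is an assumption about how limits translate into whole instances.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class OcuLimits:
    """Hypothetical container for a Cluster's Min/Max OCU per hour."""
    min_ocu_per_hour: float
    max_ocu_per_hour: float

    def __post_init__(self):
        if self.min_ocu_per_hour < 0:
            raise ValueError("Min OCU / Hour must not be negative")
        if self.max_ocu_per_hour < self.min_ocu_per_hour:
            raise ValueError("Max OCU / Hour must be at least Min OCU / Hour")


def instance_bounds(limits, ocu_per_instance_hour):
    """Return (min_instances, max_instances) for one instance type.

    Assumption: the minimum rounds up to whole instances and the
    maximum rounds down, so the Cluster never exceeds Max OCU.
    """
    if ocu_per_instance_hour <= 0:
        raise ValueError("OCU cost per instance must be positive")
    min_instances = math.ceil(limits.min_ocu_per_hour / ocu_per_instance_hour)
    max_instances = math.floor(limits.max_ocu_per_hour / ocu_per_instance_hour)
    return min_instances, max_instances


def would_be_capped(required_ocu_per_hour, limits):
    """True if the workload needs more OCU per hour than Max OCU allows."""
    return required_ocu_per_hour > limits.max_ocu_per_hour


limits = OcuLimits(min_ocu_per_hour=2, max_ocu_per_hour=8)
# Assumed cost of 2 OCU per instance-hour (hypothetical value).
print(instance_bounds(limits, 2))   # (1, 4)
print(would_be_capped(10, limits))  # True: the workload would queue behind the cap
```

Running a check like `would_be_capped` against projected peak usage is one way to judge whether a chosen Max OCU leaves enough headroom before processing delays appear.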