SQL
Onehouse SQL Clusters enable you to run Apache Spark SQL on managed, autoscaling compute. You can connect tools such as dbt, Apache Airflow, and DbVisualizer to a Onehouse SQL Cluster to run queries on the Onehouse Quanton engine.
info
SQL Clusters allow you to query, create, and modify Onehouse tables and external tables.
Architecture
Onehouse deploys an Apache Spark Thrift server on managed compute within your VPC. Instances are pre-configured to run Spark SQL.

Pre-Installed Libraries
The following libraries are pre-installed on every SQL Cluster:
- Apache Spark 3.5.2
- Apache Hudi 0.14.1
Billing
SQL Clusters are billed at the same OCU rate as other nodes in the product, as described in usage and billing.
Scaling & Max OCU
A SQL Cluster will scale up to the SQL Max OCU you set, and will scale down (to a 1 instance minumum) based on utilization.
Tips
- Queries are automatically canceled after 24 hours of duration. Please contact Onehouse support if you require longer-running queries.
- You may set Spark properties with
SET <property>=<value>. These properties are persisted at the session-level, not globally for the Cluster. - For Onehouse tables and external Apache Hudi tables, you may set Hudi table properties with
ALTER TABLE tableIdentifier SET|UNSET TBLPROPERTIES (table_property = 'property_value');. These properties are persisted for the table, and will apply to any queries you run on the SQL Cluster. - If another writer evolves the schema while your SQL client (eg. beeline) is active, you might hit an error such as
UNRESOLVED_COLUMN.WITH_SUGGESTION. To get the latest schema, runREFRESH TABLE <db>.<table>or re-connect your SQL client.
Limitations
- Java and Scala UDFs are supported, but Python UDFs are not yet supported.
- Creating or dropping a database via SQL is not supported. You can do this via the Onehouse console.
- Tables created via SQL do not support default values. If you add new columns with
ALTER TABLE, any default value specified will be overridden tonull. We plan to resolve this limitation soon. - When there are concurrent DDL requests (ALTER TABLE) to a SQL Cluster, one request may be dropped. For example, if you run
ALTER TABLE <db>.<table> SET TBLPROPERTEIS ('<key>' = '<value>'), double check that the TBLPROPERTIES key is set properly before executing subsequent commands for safety. - Changing a primary key value for an existing record in the table is not supported. Instead, delete the record and insert a new record.
- Changing a partition key value for an existing record is not possible with the
UPDATEcommand. Instead, useINSERT(in upsert mode) orMERGE INTO.