Open Engines

Open Engines Clusters allow you to deploy open source compute engines on Onehouse infrastructure. This allows you to easily spin up engines for different use cases, such as analytics queries, stream processing, and machine learning.

The following engines are currently supported:

| Engine | Best for |
|---|---|
| Trino | Fast, read-only SQL queries for analytics |
| Apache Flink | Stream processing |
| Ray | AI, machine learning, and data science |

Open Engines integrate with the full Onehouse platform, with some initial limitations (see Limitations below). You can read from existing Onehouse tables with Open Engines and deploy Onehouse-managed Table Services on tables created with Open Engines.

Pricing

Onehouse is offering Open Engines for free for a limited time. You will not be billed OCU for Open Engines usage, but will still pay your cloud provider for any cloud resource consumption.

Customer Support

While other Onehouse product offerings include options for enterprise-grade customer support, Open Engines customer support is currently limited to infrastructure-level issues. To debug engine-specific issues, use the engine's open source community channels.

If you require a solution with full customer support at the engine level, we are happy to connect you with one of our specialized compute engine partners.

Create a Cluster

To use Open Engines, you must first create an Open Engines Cluster in Onehouse. When creating the Cluster, you will select one of the supported engines.

Access the Cluster

Open Engines queries and workloads must be submitted directly to the Cluster (i.e., not through the Onehouse control plane). When you create an Open Engines Cluster, you will get an endpoint that can only be accessed from within the VPC. We suggest connecting through a bastion host or VPN.
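Because the endpoint is reachable only from inside the VPC, it can help to confirm network connectivity from your bastion host or VPN client before submitting a workload. The following is a minimal sketch using only the Python standard library. The function name `endpoint_reachable` is hypothetical, and the host and port you pass in are whatever Onehouse shows for your Cluster; none of this is an Onehouse API.

```python
import socket


def endpoint_reachable(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s.

    This only checks network reachability. It does not authenticate or
    submit any query to the engine.
    """
    try:
        # create_connection resolves the host and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling `endpoint_reachable(host, port)` with your Cluster's endpoint should return `True` from a machine inside the VPC (or connected over the VPN) and `False` from outside it. A `False` result from inside the VPC usually points to a security group, routing, or DNS issue rather than an engine problem.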

Limitations

  • Trino and Ray are read-only for Onehouse tables due to their open source implementations.
  • Tables created by Open Engines can only be viewed and managed by Onehouse in the Apache Hudi format, and must be created as External Tables under an Observed Lake. We plan to add support for tables created in Managed Lakes soon.
  • Open Engines do not yet integrate with lock providers in Onehouse. When an Open Engines writer writes to a table concurrently with other Onehouse writers, such as Stream Captures or Table Services, you must add your lock provider configurations to the Open Engines writer manually. We plan to integrate with Onehouse lock providers soon.
  • Trino and Flink can currently connect to only one external catalog each. Trino supports a Glue or DataProc catalog, and Flink supports a Hive Metastore or DataProc catalog.
  • Access control (e.g., CREATE ROLE in Trino) is not yet supported.
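To illustrate the lock provider limitation above, here is a rough sketch of collecting Apache Hudi multi-writer settings for a Flink writer. The configuration keys are standard Hudi options for optimistic concurrency control with the ZooKeeper-based lock provider. The choice of ZooKeeper, the helper names `hudi_lock_options` and `to_flink_with_clause`, and all example values are assumptions; the lock provider you configure must match the one your other Onehouse writers use on the same table, which this page does not specify.

```python
from typing import Dict


def hudi_lock_options(zk_url: str, zk_port: int, lock_key: str, base_path: str) -> Dict[str, str]:
    """Build Hudi multi-writer options (assumes a ZooKeeper-based lock provider)."""
    return {
        # Enable optimistic concurrency control between writers.
        "hoodie.write.concurrency.mode": "optimistic_concurrency_control",
        # Clean up failed writes lazily so one writer does not roll back
        # another writer's in-flight commit.
        "hoodie.cleaner.policy.failed.writes": "LAZY",
        "hoodie.write.lock.provider": (
            "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider"
        ),
        "hoodie.write.lock.zookeeper.url": zk_url,
        "hoodie.write.lock.zookeeper.port": str(zk_port),
        "hoodie.write.lock.zookeeper.lock_key": lock_key,
        "hoodie.write.lock.zookeeper.base_path": base_path,
    }


def to_flink_with_clause(options: Dict[str, str]) -> str:
    """Render options as 'key' = 'value' lines for a Flink SQL WITH (...) clause."""

    def quote(text: str) -> str:
        # Flink SQL string literals escape a single quote by doubling it.
        return "'" + text.replace("'", "''") + "'"

    return ",\n".join(f"  {quote(k)} = {quote(v)}" for k, v in options.items())
```

For example, `to_flink_with_clause(hudi_lock_options("zk.example.internal", 2181, "orders", "/hudi/locks"))` produces lines you could paste alongside the other options in the `WITH (...)` clause of the Flink table definition. Every writer on the table needs to point at the same lock provider and lock key for the locking to be effective.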