Azure

Azure uses a Bring Your Own Kubernetes (BYOK) approach. Unlike AWS and GCP, where Onehouse automatically provisions the Kubernetes cluster, on Azure you provision your own AKS cluster and register it with Onehouse. The UI wizard will guide you through this setup process.

The Onehouse infrastructure deployment consists of two steps:

  1. Onehouse Customer Stack — Deploys the managed identities and role assignments that allow Onehouse to operate the platform. Deployed using Terraform.
  2. AKS Cluster Registration — Register your AKS cluster with Onehouse to complete the linking step.

Step 1: Deploy the Customer Stack

Navigate to Connections > Cloud Accounts in the Onehouse UI and launch the setup wizard. The wizard will walk you through the configuration steps and provide the values needed for the Terraform stack below.

Set Up the Terraform Stack

Create a directory with the following structure:

my-azure-stack/
├── main.tf
├── backend.tf
└── terraform.tfvars

backend.tf

terraform {
  required_version = ">= 1.11.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
  }

  backend "azurerm" {
    resource_group_name  = "${RESOURCE_GROUP}"
    storage_account_name = "storage1house${REQUEST_ID_PREFIX}"
    container_name       = "onehouse-customer-bucket-${REQUEST_ID_PREFIX}"
    key                  = "onboarding/terraform/preboarding/onehouse.tfstate"
  }
}

provider "azurerm" {
  features {}
}

main.tf

module "onehouse_stack" {
  source = "path/to/terraform-azure-customer-stack"

  requestId                 = var.requestId
  environment               = var.environment
  region                    = var.region
  onehouseStorageAccountId  = var.onehouseStorageAccountId
  customerStorageAccountIds = var.customerStorageAccountIds
}

output "core_role_client_id" {
  value = module.onehouse_stack.core_role_client_id
}

output "node_role_client_id" {
  value = module.onehouse_stack.node_role_client_id
}

output "node_role_principal_id" {
  value = module.onehouse_stack.node_role_principal_id
}

output "node_role_identity_id" {
  value = module.onehouse_stack.node_role_identity_id
}

output "resource_group_name" {
  value = module.onehouse_stack.resource_group_name
}

Terraform Variables

| Variable | Description | Default |
| --- | --- | --- |
| requestId | [Required] Onehouse request ID | |
| environment | [Required] Onehouse environment. Value: production | |
| region | Azure region for resource deployment | eastus |
| onehouseStorageAccountId | [Required] Onehouse-managed storage account resource ID (provided by Onehouse team) | |
| resourceGroupName | Existing resource group name. If empty, a new one is created. | "" |
| customerStorageAccountIds | List of customer storage account resource IDs for blob access | [] |
| tags | Custom tags applied to all resources | {} |
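A terraform.tfvars matching the directory layout above might look like the following sketch. The request ID and resource IDs are illustrative placeholders; substitute the values provided by the Onehouse setup wizard:

```hcl
# terraform.tfvars -- all values below are illustrative placeholders
requestId   = "00c9b97d-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
environment = "production"
region      = "eastus"

# Provided by the Onehouse team during onboarding
onehouseStorageAccountId = "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<onehouse-account>"

# Optional: storage accounts holding your data
customerStorageAccountIds = [
  "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<customer-account>",
]
```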

IAM Roles Created

The Terraform module creates two user-assigned managed identities:

| Identity | Purpose |
| --- | --- |
| onehouse-core-role-&lt;prefix&gt; | Used by the Onehouse control plane to manage your AKS cluster and access storage |
| onehouse-node-role-&lt;prefix&gt; | Assigned to AKS nodes as the kubelet identity |

The following role assignments are created automatically — no manual action required:

| Assignment | Scope | Purpose |
| --- | --- | --- |
| Managed Identity Operator on node-role | Resource group | Allows AKS to manage the kubelet identity |
| Network Contributor | Resource group | Allows AKS to manage load balancers |
| Owner + Storage Blob Data Owner | Onehouse storage account | Control plane data access |
| Storage Blob Data Reader + Reader | Customer storage accounts | Read access to your data |

Run Terraform Stack

Authenticate to Azure and run:

az login
az account set --subscription "<subscription-id>"

terraform init
terraform plan
terraform apply

Note the outputs — you will need them in the next step:

core_role_client_id    = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
node_role_client_id    = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
node_role_principal_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
node_role_identity_id  = "/subscriptions/.../resourceGroups/.../providers/Microsoft.ManagedIdentity/userAssignedIdentities/onehouse-node-role-<prefix>"
resource_group_name    = "onehouse-rg-<prefix>"

Step 2: Create AKS Cluster

Create an AKS cluster using the outputs from Step 1. The cluster must meet the following requirements:

  • Naming convention — The cluster name must be onehouse-customer-cluster-<requestIdPrefix>, where <requestIdPrefix> is the first 8 characters of your request ID with hyphens removed. For example, if your request ID is 00c9b97d-xxxx-xxxx-xxxx-xxxxxxxxxxxx, the cluster name is onehouse-customer-cluster-00c9b97d.
  • Identity — The cluster must use the node role managed identity (onehouse-node-role-<prefix>) as both the cluster identity and the kubelet identity. Use the node_role_identity_id, node_role_client_id, and node_role_principal_id outputs from the Terraform stack.
warning

The cluster name must exactly match the naming convention above. Onehouse uses this name to locate and manage your cluster. An incorrect name will prevent the platform from functioning.
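The naming convention can be applied mechanically in shell. The request ID below is a made-up placeholder; substitute your own:

```shell
# Derive the required AKS cluster name from the Onehouse request ID:
# strip hyphens, then keep the first 8 characters.
REQUEST_ID="00c9b97d-1111-2222-3333-444444444444"   # placeholder request ID

REQUEST_ID_PREFIX=$(echo "$REQUEST_ID" | tr -d '-' | cut -c1-8)
CLUSTER_NAME="onehouse-customer-cluster-${REQUEST_ID_PREFIX}"

echo "$CLUSTER_NAME"   # onehouse-customer-cluster-00c9b97d
```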

BYOK vs Quick Start

Azure dataplane setup uses a Bring Your Own Kubernetes (BYOK) model — you provision and manage the AKS cluster yourself using your organization's standard process. The az aks create command below is provided as a quick-start reference for evaluation or POC purposes.

Quick-Start AKS Creation (Optional)

az aks create \
  --resource-group <resource_group_name> \
  --name onehouse-customer-cluster-<requestIdPrefix> \
  --location <region> \
  --kubernetes-version 1.33 \
  --node-count 1 \
  --nodepool-name default \
  --node-vm-size Standard_D2ps_v5 \
  --node-osdisk-size 50 \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --network-policy azure \
  --load-balancer-sku standard \
  --service-cidr 10.1.0.0/16 \
  --dns-service-ip 10.1.0.10 \
  --pod-cidr 10.244.0.0/16 \
  --vnet-subnet-id <node_subnet_id> \
  --assign-identity <control_plane_identity_resource_id> \
  --assign-kubelet-identity <kubelet_identity_resource_id> \
  --enable-aad \
  --aad-admin-group-object-ids <aad_admin_group_id> \
  --nodepool-labels workload-type=system node-role=default \
  --enable-image-cleaner \
  --image-cleaner-interval-hours 24 \
  --generate-ssh-keys

Replace the placeholders with values from your environment and Terraform outputs:

| Placeholder | Value |
| --- | --- |
| &lt;resource_group_name&gt; | resource_group_name output from Terraform |
| &lt;requestIdPrefix&gt; | First 8 characters of your request ID, hyphens removed |
| &lt;region&gt; | Azure region (e.g. eastus) |
| &lt;node_subnet_id&gt; | Resource ID of your node subnet |
| &lt;control_plane_identity_resource_id&gt; | node_role_identity_id output from Terraform (the node role serves as the cluster identity) |
| &lt;kubelet_identity_resource_id&gt; | node_role_identity_id output from Terraform (the node role is also the kubelet identity) |
| &lt;aad_admin_group_id&gt; | Object ID of the Azure AD group to grant cluster admin access |

Step 3: Register Cluster with Onehouse

Once your AKS cluster is running, register it in the Onehouse UI. The UI wizard will guide you through this process.

note

When creating your AKS cluster, ensure it is configured with the node role managed identity (onehouse-node-role-<prefix>) as both the cluster identity and the kubelet identity. This is required for Onehouse to schedule workloads and access storage from the nodes. The node_role_identity_id, node_role_client_id, and node_role_principal_id outputs from the Terraform stack are the values to use.

Add Node Pools

After the cluster is created, add node pools for the workloads you intend to run. Each node pool must have specific labels so that the Onehouse platform can schedule pods to the correct nodes.

note

Not all node pools are required immediately. The workerpool is required for the platform to function. The remaining pools (sqlworkers, jobworkers, openengines) should be added based on the features you intend to use — consult your Onehouse team for guidance on which pools are needed for your production setup.

The required labels per workload type are listed below.

workerpool — Onehouse agent, monitoring, and essential platform pods

| Label | Value | Description |
| --- | --- | --- |
| oh-essential-pods | true | Schedules core Onehouse platform pods |
| oh-essential-pods-vpc-cni | true | Schedules platform pods that require VPC-level networking |
| node-role | worker | Identifies the node as a general worker |
| monitoring-pods | true | Schedules monitoring and observability workloads |

sqlworkers — SQL query workloads

| Label | Value | Description |
| --- | --- | --- |
| sql-workers | true | Schedules SQL query engine pods |
| capacity_type | ON_DEMAND | Indicates on-demand instance capacity |
| instance_type | (your VM size) | The VM size used for this pool (e.g. Standard_D4s_v5) |

jobworkers — Spark ingestion and processing jobs

| Label | Value | Description |
| --- | --- | --- |
| job-workers | true | Schedules Spark executor pods |
| driver-pods | true | Schedules Spark driver pods |
| capacity_type | ON_DEMAND | Indicates on-demand instance capacity |
| instance_type | (your VM size) | The VM size used for this pool (e.g. Standard_D4s_v5) |

openengines — Trino, Flink, and Ray engine workloads

| Label | Value | Description |
| --- | --- | --- |
| openengine-trino | true | Schedules Trino query engine pods |
| openengine-flink | true | Schedules Flink streaming engine pods |
| openengine-ray-cpu | true | Schedules Ray CPU-based engine pods |
| capacity_type | ON_DEMAND | Indicates on-demand instance capacity |
| instance_type | (your VM size) | The VM size used for this pool (e.g. Standard_D4s_v5) |
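The label sets above map directly to the --labels flag of az aks nodepool add. A sketch for the workerpool follows; the pool name, node count, and VM size are illustrative assumptions, so adjust them to your environment:

```shell
# Required workerpool labels from the table above, expressed in the
# key=value form that `az aks nodepool add --labels` expects.
WORKERPOOL_LABELS="oh-essential-pods=true oh-essential-pods-vpc-cni=true node-role=worker monitoring-pods=true"

echo "$WORKERPOOL_LABELS"

# Example invocation (not run here; requires the az CLI and the cluster
# from Step 2; pool name, node count, and VM size are illustrative):
#   az aks nodepool add \
#     --resource-group <resource_group_name> \
#     --cluster-name onehouse-customer-cluster-<requestIdPrefix> \
#     --name workerpool \
#     --node-count 1 \
#     --node-vm-size Standard_D4s_v5 \
#     --labels $WORKERPOOL_LABELS
```

The other pools follow the same pattern with their own label sets (e.g. sql-workers=true capacity_type=ON_DEMAND instance_type=&lt;vm-size&gt; for sqlworkers).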

Grant Control Plane Access to the Cluster

After the cluster is created, assign the following roles to the core_role managed identity on the AKS cluster. These allow the Onehouse control plane to manage and operate your cluster.

CLUSTER_ID=$(az aks show \
  --resource-group <resource_group_name> \
  --name onehouse-customer-cluster-<requestIdPrefix> \
  --query id -o tsv)

az role assignment create \
  --assignee <core_role_client_id> \
  --role "Azure Kubernetes Service Cluster Admin Role" \
  --scope $CLUSTER_ID

az role assignment create \
  --assignee <core_role_client_id> \
  --role "Azure Kubernetes Service Cluster User Role" \
  --scope $CLUSTER_ID

az role assignment create \
  --assignee <core_role_client_id> \
  --role "Reader" \
  --scope $CLUSTER_ID

Use core_role_client_id from the Terraform outputs.

Once the wizard completes the linking, you will see an entry under Connections > Cloud Accounts.

note

If provisioning fails, contact your Onehouse team.