Error handling

Exception hierarchy

OnehouseSdkError
└── ResourcesError                     # base for the resources/ subpackage
    ├── AuthError                      # missing/invalid credentials, HTTP 401/403
    ├── SqlParseError                  # HTTP 400 — server rejected the SQL
    ├── OperationFailedError           # terminal status FAILED or INVALID
    └── OperationTimeoutError          # polling exceeded the configured timeout

Catch ResourcesError if you want a single net for everything the resources client can throw. Catch the specific subclasses when you want to handle a category differently (e.g. retry on timeout, surface SQL errors to a user).
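A minimal sketch of that ordering. The exception classes here are local stand-ins that mirror the hierarchy above so the snippet runs without the SDK installed; in real code you would use the SDK's own classes, which the examples in this section import from `onehouse_python_sdk.resources`. The `handle` and `raiser` helpers and the returned action strings are hypothetical, for illustration only.

```python
# Stand-ins mirroring the documented hierarchy; not the SDK's own classes.
class OnehouseSdkError(Exception): ...
class ResourcesError(OnehouseSdkError): ...
class AuthError(ResourcesError): ...
class SqlParseError(ResourcesError): ...
class OperationFailedError(ResourcesError): ...
class OperationTimeoutError(ResourcesError): ...


def handle(call):
    """Run `call` and map each error category to a (hypothetical) action."""
    try:
        call()
        return "ok"
    except OperationTimeoutError:
        return "resume"        # still running server-side: poll, don't resubmit
    except SqlParseError:
        return "show-to-user"  # the SQL itself needs fixing
    except ResourcesError:
        return "abort"         # single net for everything else from the client


def raiser(exc):
    """Build a zero-argument callable that raises `exc` (test helper)."""
    def _call():
        raise exc
    return _call
```

Specific subclasses must come before the `ResourcesError` clause; Python picks the first matching `except`, so a base-class clause listed first would swallow every subclass.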

What each error means

`AuthError`

The credential set is incomplete, malformed, or rejected by the server.

Sources:

Missing required fields when constructing the client (you'll see exactly which fields and where to set them).
World-readable credentials file (warning, not error — but worth fixing).
HTTP 401 or 403 from the server (key/secret rotated, link revoked, wrong region).

Handle by: halting and surfacing the message — it tells you which field is missing and which env var / profile to set. Don't retry; auth errors are not transient.

# Assumes OnehouseResources is exported from the same module as AuthError.
from onehouse_python_sdk.resources import AuthError, OnehouseResources

try:
    client = OnehouseResources()
    client.show_clusters()
except AuthError as e:
    print(f"Auth failed: {e}")
    raise SystemExit(1)
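For the world-readable credentials file warning listed under Sources, a small check along these lines may help on POSIX systems. It is a sketch using only the standard library; the SDK's own check is not shown in this document, and the file path you pass in is whatever your setup uses (no particular location is assumed here).

```python
import os
import stat


def is_world_readable(path):
    """Return True if 'other' users have read permission on the file."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)


def restrict_to_owner(path):
    """Set the file to owner read/write only (mode 0o600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Calling `restrict_to_owner(path)` is equivalent to `chmod 600` on the file.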

`SqlParseError`

The server rejected the SQL at parse time (HTTP 400 with the error text in the grpc-message header). Common causes: a misspelled keyword, missing required clause, or a feature the server doesn't support on your project.

Handle by: fixing the SQL. If you got here from a typed helper, file a bug — typed helpers should produce parseable SQL. If you got here from client.execute(...) with hand-written SQL, the message tells you what the parser disliked.

from onehouse_python_sdk.resources import SqlParseError

try:
    client.execute("CREAT CLUSTER `prod` TYPE = 'Managed'")  # typo
except SqlParseError as e:
    print(f"SQL invalid: {e}")
    # → "syntax error at or near \"CREAT\""

`OperationFailedError`

The operation reached a terminal failure status — FAILED (the server tried and could not complete the work) or INVALID (the server marked the request invalid post-submission).

Attributes: request_id, api_status, api_response — the server's failure payload usually has a human-readable reason.

Handle by: logging api_response and surfacing the reason. Retrying may help for transient failures (cluster busy, network hiccup downstream) but won't help for permanent ones (name already exists, quota exceeded).

from onehouse_python_sdk.resources import OperationFailedError

try:
    client.create_cluster("prod", type="Managed", max_ocu=10, min_ocu=1)
except OperationFailedError as e:
    print(f"Operation {e.request_id} failed with {e.api_status}")
    print(f"Server said: {e.api_response}")
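If you do want to retry transient failures, one possible shape is sketched below. `OperationFailedError` here is a local stand-in carrying the documented attributes so the snippet runs on its own. `retry_transient` and the `is_transient` predicate are hypothetical: the structure of `api_response` is not specified in this document, so how you tell a transient failure from a permanent one is an assumption you must fill in.

```python
import time


class OperationFailedError(Exception):
    """Stand-in with the documented attributes; not the SDK's class."""

    def __init__(self, request_id, api_status, api_response):
        super().__init__(f"{request_id}: {api_status}")
        self.request_id = request_id
        self.api_status = api_status
        self.api_response = api_response


def retry_transient(submit, is_transient, attempts=3, base_delay=1.0,
                    sleep=time.sleep):
    """Call submit(); retry only failures that is_transient() accepts."""
    for attempt in range(1, attempts + 1):
        try:
            return submit()
        except OperationFailedError as e:
            # Permanent failures and the final attempt propagate to the caller.
            if attempt == attempts or not is_transient(e):
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
```

With the real client, `submit` would be something like `lambda: client.create_cluster("prod", type="Managed", max_ocu=10, min_ocu=1)`. Only failures are retried here; a timeout should be resumed, not re-submitted.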

`OperationTimeoutError`

The client gave up polling before the operation reached a terminal status. The operation may still be running on the server, so re-submitting (rather than resuming) can create a duplicate resource. Resume with the request ID instead, as shown under "Resuming a timed-out operation" below.

Attributes: request_id, timeout, last_status (the most recent non-terminal status the poller saw).

Handle by: resuming with get_status(request_id). If you consistently hit this, raise the default timeout on the client.

Resuming a timed-out operation

OperationTimeoutError.request_id is still valid server-side. Resume with get_status instead of re-submitting.

import time
from onehouse_python_sdk.resources import OperationTimeoutError

try:
    result = client.create_cluster(
        "prod", type="Managed", max_ocu=10, min_ocu=1, timeout=30,
    )
except OperationTimeoutError as e:
    # Operation still running on the server — keep polling.
    status = client.get_status(e.request_id)
    while not status.terminal:
        time.sleep(5)
        status = client.get_status(e.request_id)
    print(status.api_status, status.api_response)
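The loop above polls until the operation finishes, however long that takes. A bounded variant might look like the sketch below. `resume_until_terminal` is a hypothetical helper, not part of the SDK; it assumes only what the example above uses, namely a `get_status(request_id)` callable whose result has a `terminal` attribute.

```python
import time


def resume_until_terminal(get_status, request_id, max_wait=600.0, interval=5.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(request_id) until status.terminal or max_wait elapses.

    Returns the last status seen; the caller should check status.terminal.
    """
    deadline = clock() + max_wait
    status = get_status(request_id)
    while not status.terminal and clock() < deadline:
        sleep(interval)
        status = get_status(request_id)
    return status
```

Inside the `except OperationTimeoutError as e:` handler you would call `resume_until_terminal(client.get_status, e.request_id)` and then inspect the returned status. The `sleep` and `clock` parameters exist so the helper can be tested without real waiting.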

Tuning timeout and poll interval

Three knobs control how long blocking calls wait and how aggressively they poll.

Knob	Default	Set on the client (applies to every call)	Override per call
Execute timeout	`60.0` s	`OnehouseResources(default_execute_timeout=300.0)`	`client.create_cluster(..., timeout=300)`
Poll interval	`1.0` s, grows to a cap of `10.0` s (1.5× backoff)	`OnehouseResources(default_poll_interval=2.0)`	`client.create_cluster(..., poll_interval=2)`
HTTP request timeout	`30.0` s	`OnehouseResources(timeout=60.0)`	not exposed per call

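The HTTP request timeout applies to each network round-trip (the submit call or a single poll). The execute timeout is the total time allowed to reach a terminal status, including every poll. Note the naming: the constructor's `timeout` argument sets the HTTP request timeout, while the per-call `timeout` argument overrides the execute timeout.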

# Long-running operations (large catalogs, big cluster spin-ups)
client = OnehouseResources(
    default_execute_timeout=600.0,   # 10 minutes total
    default_poll_interval=2.0,       # poll every 2s instead of 1s
)

# One-off override for a specific call
client.create_cluster("prod", type="Managed", max_ocu=20, min_ocu=2, timeout=900)
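To get a feel for the poll-interval defaults in the table, the sketch below generates the sleep schedule they imply: start at 1.0 s, multiply by 1.5 after each poll, cap at 10.0 s. It models only the numbers stated in the table; the function names are hypothetical, request time is ignored, and treating `default_poll_interval` as the starting value of the schedule is an assumption.

```python
def poll_intervals(initial=1.0, factor=1.5, cap=10.0):
    """Yield successive sleep intervals: initial, then x factor, capped at cap."""
    interval = min(initial, cap)
    while True:
        yield interval
        interval = min(interval * factor, cap)


def sleeps_within(budget, **kwargs):
    """Count how many sleeps fit in `budget` seconds, ignoring request time."""
    elapsed, count = 0.0, 0
    for interval in poll_intervals(**kwargs):
        if elapsed + interval > budget:
            return count
        elapsed += interval
        count += 1
```

Under these assumptions the schedule runs 1.0, 1.5, 2.25, 3.375, 5.0625, 7.59375, then 10.0 s repeatedly, and nine sleeps fit inside the default 60 s execute timeout.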

HTTP errors not surfaced as named exceptions

429 Too Many Requests — the transport retries with backoff (up to max_retries=3 by default, honoring Retry-After). You only see a ResourcesError if every retry is exhausted.
5xx / other 4xx — raised as the base ResourcesError with the status code and response snippet in the message.
Network errors (DNS, connection reset, TLS) — retried max_retries times, then raised as ResourcesError.
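If you add your own retry layer on top (for example, after the transport's retries are exhausted), the general technique of honoring `Retry-After` with a backoff fallback can be sketched as follows. This is not the SDK's transport code; `retry_delay` and its `base` and `cap` defaults are hypothetical. The `Retry-After` header may hold either a number of seconds or an HTTP date, and the sketch handles both.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay(retry_after, attempt, base=1.0, cap=30.0, now=None):
    """Seconds to wait before retry number `attempt` (1-based).

    Prefers the Retry-After header value (delta-seconds or HTTP-date);
    otherwise falls back to capped exponential backoff.
    """
    if retry_after:
        value = retry_after.strip()
        if value.isdigit():
            return float(value)
        try:
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            when = None  # unparseable header: use the backoff fallback
        if when is not None:
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            now = now or datetime.now(timezone.utc)
            return max(0.0, (when - now).total_seconds())
    return min(cap, base * 2 ** (attempt - 1))
```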
