Discover Data in Storage

Overview

Onehouse stores your data in your own cloud storage bucket (S3 or GCS). Because you own the storage, you can find all your Onehouse data within your cloud account.

Under the hood, Onehouse stores the data in the Apache Hudi table format. You can browse and query your Onehouse data in cloud storage the same way you would any other Hudi table.

Open the cloud storage bucket

Open the cloud storage bucket you connected to Onehouse during the onboarding steps. Onehouse creates all your lakes within this bucket.

If you're not sure which bucket holds the lake, open the Data page in Onehouse and click on your lake. You will see a DFS path that links to the lake's bucket and directory in cloud storage.
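If you copy the DFS path out of the Data page, a small helper can split it into the bucket and the lake directory. This is a minimal sketch that assumes the path is a standard bucket URI such as `s3://...` or `gs://...`; the function name `split_dfs_path` and the example path are hypothetical, not part of Onehouse.

```python
from urllib.parse import urlparse


def split_dfs_path(dfs_path: str) -> tuple[str, str, str]:
    """Split a bucket URI into (scheme, bucket, directory)."""
    parsed = urlparse(dfs_path)
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"not a bucket URI: {dfs_path!r}")
    # The URI host is the bucket; the remaining path is the directory.
    return parsed.scheme, parsed.netloc, parsed.path.strip("/")


# Hypothetical example value, not a real Onehouse path.
scheme, bucket, directory = split_dfs_path("s3://example-bucket/my_lake")
print(scheme, bucket, directory)
```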

The directory within cloud storage will have the same name as your lake in Onehouse.

You should now be in the directory that maps to your Onehouse lake. Within this directory, you will see a sub-directory for each database in the lake. Open a database directory to see its tables.
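The lake, database, and table nesting described above can also be read off a plain listing of object keys. The sketch below assumes keys follow the pattern `<lake>/<database>/<table>/...`; the key names and the helper `tables_by_database` are made up for illustration.

```python
from collections import defaultdict


def tables_by_database(keys, lake_dir):
    """Group object keys under a lake directory into {database: [tables]}."""
    prefix = lake_dir.strip("/") + "/"
    found = defaultdict(set)
    for key in keys:
        if not key.startswith(prefix):
            continue  # key belongs to some other directory in the bucket
        parts = key[len(prefix):].split("/")
        # Require database/table/<object> so stray files are not counted as tables.
        if len(parts) >= 3 and parts[0] and parts[1]:
            found[parts[0]].add(parts[1])
    return {db: sorted(tables) for db, tables in found.items()}


# Hypothetical keys, as a bucket listing might return them.
example_keys = [
    "my_lake/sales/orders/.hoodie/hoodie.properties",
    "my_lake/sales/orders/part-0001.parquet",
    "my_lake/sales/customers/.hoodie/hoodie.properties",
    "my_lake/web/clicks/.hoodie/hoodie.properties",
    "other_dir/readme.txt",
]
layout = tables_by_database(example_keys, "my_lake")
print(layout)
```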

Each table is a directory that holds the table's data and metadata. To learn more about this table structure, see the Apache Hudi docs.
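Hudi keeps a table's metadata in a `.hoodie` directory at the root of the table directory, so the presence of that directory is a quick way to recognize a table. The following is a minimal sketch that assumes you have a local copy (or mounted view) of the lake directory; the function name and the directory names in the demo are hypothetical.

```python
import tempfile
from pathlib import Path


def find_hudi_tables(root: Path) -> list[Path]:
    """Return directories under root that contain a top-level .hoodie directory."""
    tables = []
    for marker in root.rglob(".hoodie"):
        if not marker.is_dir():
            continue
        # Skip .hoodie directories nested inside another .hoodie directory
        # (Hudi's internal metadata table lives there).
        if ".hoodie" in marker.parent.relative_to(root).parts:
            continue
        tables.append(marker.parent)
    return sorted(tables)


# Demo on a throwaway directory tree with made-up names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sales" / "orders" / ".hoodie" / "metadata" / ".hoodie").mkdir(parents=True)
    (root / "sales" / "scratch").mkdir(parents=True)
    found = [p.relative_to(root).as_posix() for p in find_hudi_tables(root)]
print(found)
```

Only `sales/orders` is reported in the demo, because `sales/scratch` has no `.hoodie` directory and the nested metadata directory is filtered out.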