DataProc Metastore

Description

DataProc Metastore is a fully managed, highly available metadata service for Apache Hive and other compatible data processing engines on Google Cloud. It provides a central repository for metadata about tables, partitions, and schemas, so that engines such as Apache Spark and Presto running on Google Cloud Dataproc can work from the same table definitions.

Setup guide

  1. Enter a Name to identify the data catalog in Onehouse
  2. Select DataProc Metastore as the Type
  3. Enter the Servers
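
The guide does not specify the format of the Servers value. The sketch below assumes it is a comma-separated list of Hive Metastore Thrift URIs (for example thrift://10.0.0.5:9083), as in Hive's hive.metastore.uris setting; check this against your own configuration before relying on it. The function name parse_metastore_servers and the example addresses are hypothetical and are not part of Onehouse or Google Cloud. It can be used to check a value before pasting it into the form.

```python
from urllib.parse import urlsplit

# Conventional Hive Metastore Thrift port; used only when a URI omits the port.
DEFAULT_THRIFT_PORT = 9083


def parse_metastore_servers(servers: str) -> list[tuple[str, int]]:
    """Split a comma-separated list of thrift:// URIs into (host, port) pairs.

    Assumption: the Servers field uses the same format as hive.metastore.uris.
    """
    endpoints = []
    for raw in servers.split(","):
        uri = raw.strip()
        if not uri:
            continue  # tolerate trailing commas and blank entries
        parts = urlsplit(uri)
        if parts.scheme != "thrift":
            raise ValueError(f"expected a thrift:// URI, got {uri!r}")
        if not parts.hostname:
            raise ValueError(f"missing host in {uri!r}")
        endpoints.append((parts.hostname, parts.port or DEFAULT_THRIFT_PORT))
    if not endpoints:
        raise ValueError("no metastore servers given")
    return endpoints


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    for host, port in parse_metastore_servers(
        "thrift://10.0.0.5:9083, thrift://metastore.example.internal"
    ):
        print(host, port)
```

A value without the thrift:// scheme, or with no host, raises ValueError, so typos are caught before the catalog is saved.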

Note that DataProc Metastore services can only be created in Google Cloud (GCP) projects.