Migrate from Amazon EMR
This document outlines the process of migrating workloads from Amazon EMR to Onehouse, detailing foundational differences, migration patterns for data and code, best practices, and tooling options.
Code Migration
Onehouse Jobs are fully compatible with Apache Spark, so you can reuse any EMR code that does not depend on AWS-specific features.
Refer to the Onehouse Jobs documentation for details on how to create Onehouse Jobs.
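Before moving a job, it can help to check whether its code touches anything AWS-specific. The following is a minimal sketch of such a check, not an official Onehouse tool. The function name `find_aws_specific_lines` and the pattern list are illustrative assumptions. The patterns cover only a few common cases (AWS SDK imports, the EMRFS file system class, and the EMR DynamoDB connector package) and should be extended for your own codebase.

```python
import re

# Illustrative, non-exhaustive patterns that often indicate AWS- or EMR-specific
# dependencies. This list is an assumption for the sketch, not a list published
# by Onehouse.
AWS_SPECIFIC_PATTERNS = {
    "boto3 SDK import": re.compile(r"^\s*(import|from)\s+boto(3|core)\b"),
    "EMRFS class reference": re.compile(r"com\.amazon\.ws\.emr"),
    "EMR DynamoDB connector": re.compile(r"org\.apache\.hadoop\.dynamodb"),
}


def find_aws_specific_lines(source: str):
    """Return (line_number, label, line_text) for each line matching a pattern."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        for label, pattern in AWS_SPECIFIC_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label, line.strip()))
    return findings


if __name__ == "__main__":
    # Hypothetical job source used only to demonstrate the scan.
    sample = (
        "import boto3\n"
        "from pyspark.sql import SparkSession\n"
        "spark = SparkSession.builder.getOrCreate()\n"
    )
    for number, label, text in find_aws_specific_lines(sample):
        print(f"line {number}: {label}: {text}")
```

Lines that the scan flags are candidates for review rather than confirmed incompatibilities. Code with no findings may still rely on AWS-specific behavior that these simple patterns do not detect.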
Glue Catalog & Lake Formation
Our integration with AWS Lake Formation supports named Data Catalog resources and tag-based access control through the AWS Glue Data Catalog.
Onehouse Jobs can sync data to the Glue catalog asynchronously or use Glue as the primary catalog. To use Glue as your primary catalog, contact Onehouse support for configuration.