Schema Evolution
Onehouse supports the following forms of backward-compatible schema evolution for Stream Captures writing to a destination table.
Supported Operations
Adding Columns
Behavior
New columns will be added in your destination table.
Usage notes
- New columns must be nullable.
- New columns must be added at the end of the schema (not in the middle).
- If a column with the same name and position was previously removed from the table, new records will automatically populate that existing column.
- New columns cannot start with "_" (e.g. "_row_key"). These are supported during initial creation of the table, but not in schema evolution.
Deleting Columns
Behavior
Deleted columns will remain in your destination table and their existing rows will not be deleted. The deleted columns will have nulls for all new records.
Usage notes
- Only nullable columns can be deleted.
Widening Column Types
Behavior
Column types can be evolved to a “wider” type that preserves the existing data in that column. Onehouse will attempt to write incoming records to the table using the super-type (i.e. the wider type) between the column in the incoming record and the corresponding column in the table.
For example, SuperType(Int, Long) = Long
because Int can be converted to Long without losing any data, but not the other way around.
The matrix below shows the expected behavior for each type combination. Matrix rows are the type in the incoming record and matrix columns are the type in the Onehouse table. The value in the cell is the new type in the Onehouse table (the super-type). Cells marked "X" indicate that the incoming record is incompatible with the Onehouse table.
Incoming Type ↓ / Table Type → | Int | Long | Float | Double | String | Bytes |
---|---|---|---|---|---|---|
Int | Int | Long | Float | Double | String | X |
Long | Long | Long | Float | Double | String | X |
Float | Float | Float | Float | Double | String | X |
Double | Double | Double | Double | Double | String | X |
String | String | String | String | String | String | Bytes |
Bytes | X | X | X | X | String | Bytes |
Usage notes
- Booleans, Complex Types (Array, Map, etc.), and Logical Types (Decimal, Date, Timestamp, etc.) as defined by the Avro spec are not supported for Schema Evolution in either direction. However, they can be changed from a non-nullable to a nullable field of the same type.
- Schema Evolution is supported for changing the type within a Map or Array. For example, you can evolve an Array column from Int[] to Long[]. However, you cannot perform operations that modify the overall structure such as converting an Array to a Map or changing the structure of nested Complex Types.
- In the case of Bytes and String, both types can be converted to each other, so the table will accept either type without evolving (as shown in the matrix above).
- Any type can be evolved from non-nullable to nullable. Types cannot be evolved from nullable to non-nullable.
Examples
- Example 1: Incoming records of multiple compatible types
- Onehouse table has col1 of type Int
- Incoming record r1 has col1 of type Long → Onehouse table col1 type is widened to Long
- Incoming record r2 has col1 of type Int → Onehouse writes r2 as Long in the table
- Example 2: Incompatible types
- Onehouse table has col1 of type Int
- Incoming record r1 has col1 of type String → Onehouse fails the Stream Capture or quarantines r1 (based on the Stream Capture configs)
Additional Usage Notes
- The operations described in this doc are also supported for nested fields.
We recommend that you do not directly modify the schema of your destination table through DDL operations while a Stream Capture is writing data to the table, as it may cause errors. Please reach out to Onehouse support if you need to do this.