Schema Evolution
This page explains how schema changes are handled in debezium-postgres2lake.
How schema evolution is detected
The pipeline compares consecutive Avro value schemas as events are processed.
When a schema change is detected, the pipeline commits the current batch, applies schema change handling, and starts a new appender lifecycle for the updated schema.
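The detection flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `BatchingWriter`, `schema_fingerprint`, and the in-memory `committed` list are hypothetical names, and schemas are modeled as plain dictionaries rather than real Avro schema objects.

```python
import json


def schema_fingerprint(schema: dict) -> str:
    """Serialize a schema dict with sorted keys so equal schemas compare equal."""
    return json.dumps(schema, sort_keys=True)


class BatchingWriter:
    """Hypothetical writer: commits the open batch whenever the schema changes."""

    def __init__(self):
        self.current_fingerprint = None
        self.batch = []
        self.committed = []  # list of (fingerprint, records) per committed batch

    def process(self, value_schema: dict, record: dict) -> None:
        fingerprint = schema_fingerprint(value_schema)
        changed = (
            self.current_fingerprint is not None
            and fingerprint != self.current_fingerprint
        )
        if changed:
            # Flush everything written under the previous schema first.
            self.commit()
            # A real pipeline would apply schema change handling here
            # before starting a new appender for the updated schema.
        self.current_fingerprint = fingerprint
        self.batch.append(record)

    def commit(self) -> None:
        if self.batch:
            self.committed.append((self.current_fingerprint, self.batch))
            self.batch = []


v1 = {"type": "record", "name": "Value",
      "fields": [{"name": "id", "type": "int"}]}
v2 = {"type": "record", "name": "Value",
      "fields": [{"name": "id", "type": "int"},
                 {"name": "email", "type": ["null", "string"]}]}

writer = BatchingWriter()
writer.process(v1, {"id": 1})
writer.process(v1, {"id": 2})
writer.process(v2, {"id": 3, "email": None})  # schema changed: first batch is committed
writer.commit()
```

In this sketch the two records written under `v1` land in one committed batch and the record written under `v2` lands in a separate one, so no batch mixes schemas.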
Supported schema changes
The schema diff resolver supports the following evolution patterns:
- Add column
- Delete column
- Make field optional (required -> optional)
- Safe primitive widening (for example: int->long, float->double)
- Decimal widening with increased precision/scale where compatible
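To make the supported patterns concrete, here is a rough sketch of how a field-level diff could classify them. The function names and change labels are hypothetical, schemas are plain dictionaries in Avro's JSON form, and the decimal rule (scale and integer digits may grow but not shrink) is an assumption about what "compatible" means, not a statement of the resolver's actual rule.

```python
PRIMITIVE_WIDENING = {("int", "long"), ("float", "double")}


def unwrap_optional(avro_type):
    """Return (is_optional, inner_type); optional fields are unions with "null"."""
    if isinstance(avro_type, list) and "null" in avro_type:
        non_null = [t for t in avro_type if t != "null"]
        return True, (non_null[0] if len(non_null) == 1 else non_null)
    return False, avro_type


def is_decimal(t):
    return isinstance(t, dict) and t.get("logicalType") == "decimal"


def decimal_widens(old, new):
    # Assumed rule: neither the scale nor the integer digits may shrink.
    old_p, old_s = old["precision"], old.get("scale", 0)
    new_p, new_s = new["precision"], new.get("scale", 0)
    return new_s >= old_s and (new_p - new_s) >= (old_p - old_s)


def classify(old_fields, new_fields):
    """Label each field-level difference between two record schemas."""
    old = {f["name"]: f["type"] for f in old_fields}
    new = {f["name"]: f["type"] for f in new_fields}
    changes = [("delete_column", n) for n in old if n not in new]
    for name, new_type in new.items():
        if name not in old:
            changes.append(("add_column", name))
            continue
        old_opt, old_t = unwrap_optional(old[name])
        new_opt, new_t = unwrap_optional(new_type)
        if old_opt and not new_opt:
            changes.append(("unsupported", name))  # optional -> required
            continue
        if new_opt and not old_opt:
            changes.append(("make_optional", name))
        if old_t == new_t:
            continue
        both_named = isinstance(old_t, str) and isinstance(new_t, str)
        if both_named and (old_t, new_t) in PRIMITIVE_WIDENING:
            changes.append(("widen_primitive", name))
        elif is_decimal(old_t) and is_decimal(new_t) and decimal_widens(old_t, new_t):
            changes.append(("widen_decimal", name))
        else:
            changes.append(("unsupported", name))
    return changes


def dec(precision, scale):
    return {"type": "bytes", "logicalType": "decimal",
            "precision": precision, "scale": scale}


old_fields = [{"name": "id", "type": "int"},
              {"name": "amount", "type": dec(10, 2)},
              {"name": "note", "type": "string"},
              {"name": "legacy", "type": "string"}]
new_fields = [{"name": "id", "type": "long"},
              {"name": "amount", "type": dec(12, 2)},
              {"name": "note", "type": ["null", "string"]},
              {"name": "created_at", "type": "long"}]

changes = classify(old_fields, new_fields)
```

For the example schemas, `changes` contains one entry for each supported pattern: a deleted column, a primitive widening, a decimal widening, a field made optional, and an added column.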
Unsupported or incompatible changes
The pipeline fails fast for incompatible schema transitions, including:
- Making an optional field required
- Primitive-to-non-primitive (or incompatible structural) type changes
- Other unsupported type promotions
Fail-fast behavior prevents silent data corruption and makes incompatible changes explicit during processing.
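As an illustration of the fail-fast idea, the sketch below raises an error as soon as it sees one of the incompatible transitions listed above, instead of continuing to write. `IncompatibleSchemaChange` and `check_transition` are hypothetical names chosen for this example, and field types are given in Avro's JSON form.

```python
COMPLEX_TYPES = {"record", "array", "map"}


class IncompatibleSchemaChange(Exception):
    """Hypothetical error used to stop processing on an unsupported change."""


def is_optional(avro_type):
    # In Avro, an optional field is a union that includes "null".
    return isinstance(avro_type, list) and "null" in avro_type


def is_complex(avro_type):
    return isinstance(avro_type, dict) and avro_type.get("type") in COMPLEX_TYPES


def check_transition(name, old_type, new_type):
    """Raise immediately if the field transition is incompatible."""
    if is_optional(old_type) and not is_optional(new_type):
        raise IncompatibleSchemaChange(
            f"field '{name}': optional -> required is not supported")
    if isinstance(old_type, str) and is_complex(new_type):
        raise IncompatibleSchemaChange(
            f"field '{name}': primitive -> non-primitive is not supported")


# A compatible change passes silently.
check_transition("note", "string", ["null", "string"])

# An incompatible change stops processing with an explicit message.
try:
    check_transition("email", ["null", "string"], "string")
    error_message = None
except IncompatibleSchemaChange as exc:
    error_message = str(exc)
```

Raising at the point of detection means nothing is written under a schema the sink cannot represent, and the failing field is named in the error.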
Behavior by sink type
Table formats:
- Iceberg and Paimon apply schema evolution to table metadata/DDL using the resolved schema diff
- After evolution is applied, writing continues with the updated schema
Object/file formats:
- Avro, Parquet, and ORC do not perform table DDL evolution in this codebase
- A new writer/appender is created for the new schema, producing new files with updated schema metadata
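The split between the two sink families can be summarized in a small dispatch sketch. It is illustrative only: `plan_schema_change` and the step descriptions are hypothetical and do not correspond to functions in the project.

```python
TABLE_FORMATS = {"iceberg", "paimon"}
FILE_FORMATS = {"avro", "parquet", "orc"}


def plan_schema_change(sink_type, changes):
    """Return the ordered steps a sink would take after a schema change."""
    sink = sink_type.lower()
    if sink in TABLE_FORMATS:
        # Table formats evolve table metadata, then keep writing.
        steps = [f"apply to table metadata: {change}" for change in changes]
        steps.append("continue writing with updated schema")
        return steps
    if sink in FILE_FORMATS:
        # File formats have no table to alter; only the writer is replaced.
        return ["open new writer for updated schema"]
    raise ValueError(f"unknown sink type: {sink_type}")


iceberg_steps = plan_schema_change("iceberg", ["add column email"])
parquet_steps = plan_schema_change("parquet", ["add column email"])
```

Here `iceberg_steps` includes a metadata change before writing resumes, while `parquet_steps` contains only the writer replacement.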
Operational guidance
- Prefer additive schema changes in source tables
- Roll out incompatible DDL carefully and validate in a staging environment first
- Keep downstream consumers aware of schema version changes and compatibility constraints