How CloudQuery handles changes to existing tables
CloudQuery handles schema changes by avoiding breaking changes and adding columns instead, because users commonly build views on top of synced tables. Migration strategies are implemented in the destination integrations, as source integrations are database agnostic and send back JSON objects.
CloudQuery has two modes of migrating to a new schema: safe, which is supported by all destinations, and forced, which is currently supported only by ClickHouse, MySQL, PostgreSQL, MSSQL, and SQLite.
Safe mode is the default; it will not run migrations that would result in data loss, and prints an error instead. Forced mode runs migrations even when they may result in data loss, so the migration should always succeed without errors.
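In a destination spec, the mode is selected with the `migrate_mode` option. A minimal sketch (the version string and connection string are placeholders):

```yaml
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  version: "vX.Y.Z"       # placeholder; pin a real release
  migrate_mode: forced    # defaults to safe when omitted
  spec:
    connection_string: ${POSTGRESQL_CONNECTION_STRING}
```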
Schema changes that require data loss succeed only in forced mode:
| Schema change | Reasoning |
|---|---|
| Adding a new column that is a primary key or a not null column | New syncs can’t succeed without back-filling the data, or dropping and re-adding the table |
| Removing a column that is a primary key or a not null column | New syncs can’t succeed as the column will not be populated with data, so dropping and re-adding the table is required |
| Changing a column type | New syncs can’t succeed without casting existing data into the new type, which is not always possible and can have performance implications in production environments |
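The first row of the table above can be illustrated with SQLite (used here only because it ships with Python; CloudQuery's destination integrations implement this logic internally). Adding a NOT NULL column to an existing table fails because existing rows cannot be back-filled, which is why forced mode drops and recreates the table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'a')")

# Safe mode would have to run this ALTER, but it fails: existing rows
# have no value for the new NOT NULL column.
try:
    conn.execute("ALTER TABLE t ADD COLUMN req TEXT NOT NULL")
except sqlite3.OperationalError as e:
    print("safe migration rejected:", e)

# Forced mode instead drops and recreates the table, losing existing rows.
conn.execute("DROP TABLE t")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT, req TEXT NOT NULL)")
print(conn.execute("SELECT count(*) FROM t").fetchone()[0])  # 0
```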
Schema changes that don’t require data loss succeed in both modes:
| Schema change | Reasoning |
|---|---|
| Adding a new column that is neither a primary key nor a not null column | New syncs can succeed by adding the new column to the existing table |
| Removing a column that is neither a primary key nor a not null column | New syncs can succeed by ignoring the column removal |
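By contrast, the safe changes above map to a plain ADD COLUMN, or to no DDL at all. A minimal SQLite sketch (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'a')")

# Adding a nullable, non-primary-key column keeps existing rows;
# they simply get NULL for the new column.
conn.execute("ALTER TABLE t ADD COLUMN note TEXT")
print(conn.execute("SELECT * FROM t").fetchall())  # [(1, 'a', None)]

# Removing a column from the source schema needs no DDL at all:
# the destination simply stops writing to it.
```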
Next Steps
- Managing Versions - Understand integration version management
- Destination Integrations - Configure migration and write modes
- Source Integrations - Configure source schema options