fix(replica_cluster): resolve race condition in designated primary transition by mnencia · Pull Request #9601 · cloudnative-pg/cloudnative-pg

dosubot bot added the size:M label (this PR changes 30-99 lines, ignoring generated files) on Dec 30, 2025

dosubot bot added the lgtm label (this PR has been approved by a maintainer) on Jan 8, 2026

gbartolini

@mnencia @gbartolini

…ansition

When a replica cluster switch is initiated, a race condition could occur
where the instance manager fails to set the designated primary transition
completion condition after an optimistic lock conflict, causing the operator
to wait indefinitely.

The root cause was in the RequiresDesignatedPrimaryTransition sentinel
calculation, which used IsPrimary() to check for the absence of standby.signal.
After RefreshReplicaConfiguration() creates standby.signal during the first
reconciliation loop, IsPrimary() returns false, making the sentinel false and
causing subsequent loops to return early without retrying the status update.

This fix changes the sentinel to use CurrentPrimary status instead of
IsPrimary(), keeping it true throughout the transition and allowing retries
when status updates fail. Additionally, the RetryOnConflict wrapper is
removed since the reconciliation loop itself provides the retry mechanism,
simplifying the code and making all conflicts follow the same clear path.
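
The effect of the sentinel change can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the operator's actual code: the `instance` struct, `oldSentinel`, `newSentinel`, `reconcile`, and `run` are hypothetical names, the pod name `cluster-1` is made up, and the optimistic lock conflict is modelled as a counter of failing status updates.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for a Kubernetes optimistic lock conflict.
var errConflict = errors.New("conflict: the object has been modified")

// instance is a hypothetical, simplified model of what the instance
// manager looks at during a designated primary transition.
type instance struct {
	podName          string
	hasStandbySignal bool   // true once standby.signal exists
	currentPrimary   string // mirrors the CurrentPrimary status field
	transitionDone   bool   // the completion condition the operator waits for
	failuresLeft     int    // status updates that will still hit a conflict
}

// oldSentinel mimics the IsPrimary()-based check: true only while
// standby.signal is absent.
func oldSentinel(i *instance) bool { return !i.hasStandbySignal }

// newSentinel mimics the CurrentPrimary-based check: it stays true
// until the status itself changes.
func newSentinel(i *instance) bool { return i.currentPrimary == i.podName }

// updateStatus fails with a conflict while failuresLeft > 0.
func (i *instance) updateStatus() error {
	if i.failuresLeft > 0 {
		i.failuresLeft--
		return errConflict
	}
	i.transitionDone = true
	return nil
}

// reconcile is one pass of the (simplified) reconciliation loop.
func reconcile(i *instance, sentinel func(*instance) bool) error {
	if !sentinel(i) {
		return nil // early return: no transition required
	}
	// Stand-in for RefreshReplicaConfiguration(), which creates standby.signal.
	i.hasStandbySignal = true
	return i.updateStatus()
}

// run executes up to `loops` reconciliations with one injected conflict
// and reports whether the completion condition was ever set.
func run(sentinel func(*instance) bool, loops int) bool {
	i := &instance{podName: "cluster-1", currentPrimary: "cluster-1", failuresLeft: 1}
	for n := 0; n < loops && !i.transitionDone; n++ {
		if err := reconcile(i, sentinel); err != nil {
			continue // the next loop iteration acts as the retry
		}
	}
	return i.transitionDone
}

func main() {
	fmt.Println("old sentinel completed:", run(oldSentinel, 5)) // false
	fmt.Println("new sentinel completed:", run(newSentinel, 5)) // true
}
```

With the old sentinel, the first loop creates `standby.signal` and then hits the conflict; every later loop returns early, so the condition is never set. With the new sentinel, the second loop retries the status update and succeeds, which is why no separate retry wrapper is needed in this model.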

Closes #9591

Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>

cnpg-bot pushed a commit that referenced this pull request

Jan 19, 2026
…ansition (#9601)

(cherry picked from commit a73b322)

cnpg-bot pushed a commit that referenced this pull request

Jan 19, 2026
…ansition (#9601)

(cherry picked from commit a73b322)

cnpg-bot pushed a commit that referenced this pull request

Jan 19, 2026
…ansition (#9601)

(cherry picked from commit a73b322)

mnencia added a commit that referenced this pull request

Jan 20, 2026
…ansition (#9601)

(cherry picked from commit a73b322)

renovate bot added a commit to sdwilsh/ansible-playbooks that referenced this pull request

Mar 26, 2026
##### [`v1.28.1`](https://github.com/cloudnative-pg/cloudnative-pg/releases/tag/v1.28.1)

**Release date:** Feb 5, 2026

##### Enhancements

- Added support for Azure's `DefaultAzureCredential` authentication mechanism for backup and recovery operations. This can be enabled by setting `azureCredentials.useDefaultAzureCredentials: true` in the backup configuration, simplifying authentication in Azure environments without requiring explicit storage account keys or SAS tokens. ([#9468](cloudnative-pg/cloudnative-pg#9468)) <!-- 1.27 1.25 -->

##### Fixes

- Fixed validation of PostgreSQL extension names containing underscores (e.g., `pg_partman`, `pg_ivm`). Extension names with underscores are automatically sanitized to use hyphens for Kubernetes volume names while preserving the original name in mount paths. Webhook validation prevents naming conflicts after sanitization. Contributed by [@shusaan](https://github.com/shusaan). ([#9386](cloudnative-pg/cloudnative-pg#9386)) <!-- 1.27 -->

- Fixed a critical issue where the `TimelineID` in the cluster status was not reset to 1 after a major version upgrade. Because `pg_upgrade` initializes a new timeline, keeping the old ID (e.g., timeline 2) caused replicas to attempt to restore incompatible history files from object storage, leading to fatal "requested timeline is not a child of this server's history" errors. ([#9830](cloudnative-pg/cloudnative-pg#9830)) <!-- 1.27 -->

- Fixed an issue where stale TLS status fields in the `Pooler` were not cleared after being removed from the specification. This was particularly critical when upgrading to v1.28.0, where the `ServerTLS` field was repurposed, causing PgBouncer to use incorrect certificates and resulting in "unsupported certificate" errors that blocked all application connectivity. The operator now explicitly clears `ServerCA`, `ClientCA`, `ClientTLS`, and `ServerTLS` status fields when they are no longer configured. ([#9397](cloudnative-pg/cloudnative-pg#9397))

- Fixed a bug where replicas could enter a crash-loop by attempting to download timeline history files from future timelines. This occurred when stale files remained in the WAL archive from a previous cluster life, and replicas would incorrectly try to fetch them during recovery. ([#9650](cloudnative-pg/cloudnative-pg#9650)) <!-- 1.27 1.25 -->

- Fixed a race condition in `replica_cluster` setups during designated primary transitions, preventing transient "no primary" states in the replica cluster. ([#9601](cloudnative-pg/cloudnative-pg#9601)) <!-- 1.27 1.25 -->

- The backup controller now uses the unique instance session ID to detect instance manager restarts. This prevents the operator from incorrectly assuming a backup is still progressing if the underlying container has crashed and restarted, which previously led to orphaned backup objects. ([#9370](cloudnative-pg/cloudnative-pg#9370)) <!-- 1.27 -->

- Fixed a validation gap in Azure object store configurations where the `storageAccount` was not required when using explicit credentials (such as a storage key or SAS token). The operator now enforces that a storage account name is provided in these cases and that `connectionString` is mutually exclusive with other authentication parameters. ([#9604](cloudnative-pg/cloudnative-pg#9604)) <!-- 1.27 1.25 -->

- Optimized the deletion path so the operator begins cleaning up resources immediately when a cluster is marked for deletion. This significantly reduces the time a cluster remains in `Terminating` status while waiting for internal reconciliation loops. ([#9555](cloudnative-pg/cloudnative-pg#9555)) <!-- 1.27 1.25 -->

- Fixed an issue where replication slots were not properly dropped from replicas when the feature was disabled or the cluster was reconfigured. This ensures that unused slots do not cause WAL build-up on the primary. ([#9381](cloudnative-pg/cloudnative-pg#9381)) <!-- 1.27 1.25 -->

- Fixed an issue where `imagePullSecrets` were not added to the `ServiceAccount` created for the `Pooler`. Previously, these secrets were applied to the Deployment but not the SA, which caused image pull failures in restricted environments using certain security policies. ([#9427](cloudnative-pg/cloudnative-pg#9427)) <!-- 1.27 1.25 -->

- Added a check to verify ownership before the operator deletes a `PodMonitor`. This prevents the operator from accidentally deleting manually managed monitoring resources that happen to share a name with expected CNPG resources. Contributed by [@juliamertz](https://github.com/juliamertz). ([#9340](cloudnative-pg/cloudnative-pg#9340)) <!-- 1.27 1.25 -->

- Fixed a bug where `pg_stat_archiver` metrics would continue to report stale data on standby instances after a switchover. The exporter now skips these metrics on standbys, as PostgreSQL only provides valid archiver stats on the primary. ([#9411](cloudnative-pg/cloudnative-pg#9411)) <!-- 1.27 1.25 -->

- Clarified the interpretation of timestamp formats for recovery `targetTime`. Timestamps provided without an explicit timezone are now consistently interpreted as UTC. Contributed by [@pchovelon](https://github.com/pchovelon). ([#8937](cloudnative-pg/cloudnative-pg#8937)) <!-- 1.27 1.25 -->

- Fixed backup status updates to prevent "resource has been modified" errors during concurrent updates. ([#9551](cloudnative-pg/cloudnative-pg#9551)) <!-- 1.27 1.25 -->

- Fixed event reporting to use the correct pod name when a backup pod is not found. ([#9552](cloudnative-pg/cloudnative-pg#9552)) <!-- 1.27 1.25 -->

- Improved performance of scheduled backup operations for clusters with a very high number of historical backups. ([#9489](cloudnative-pg/cloudnative-pg#9489)) <!-- 1.27 1.25 -->

- Fixed error handling when removing finalizers on `Database` objects. ([#9431](cloudnative-pg/cloudnative-pg#9431)) <!-- 1.27 1.25 -->

- `cnpg` plugin:

  - Updated the `status` command to display "Disabled" when the `skipWalArchiving` annotation is present on a cluster. This replaces confusing "starting up" or "unknown" states when WAL archiving is intentionally bypassed. ([#9709](cloudnative-pg/cloudnative-pg#9709)) <!-- 1.27 1.25 -->

  - Fixed the `logs --follow` command to continue polling for new pods instead of exiting prematurely when all current log streams complete. ([#9599](cloudnative-pg/cloudnative-pg#9599)) <!-- 1.27 1.25 -->
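
The extension-name fix in the list above (#9386) describes two steps: underscores are replaced with hyphens for the Kubernetes volume name, and names that collide after sanitization are rejected. A minimal sketch of that idea follows. It assumes sanitization is a plain underscore-to-hyphen replacement; `sanitizeVolumeName` and `findConflicts` are hypothetical helpers for illustration, not the operator's API, and the mount path (which keeps the original name) is not modelled.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeVolumeName is a hypothetical helper: it replaces underscores
// with hyphens so the result can be used in a Kubernetes volume name.
func sanitizeVolumeName(ext string) string {
	return strings.ReplaceAll(ext, "_", "-")
}

// findConflicts returns the extension names whose sanitized form collides
// with a different extension seen earlier in the list.
func findConflicts(exts []string) []string {
	seen := map[string]string{} // sanitized name -> original name
	var conflicts []string
	for _, e := range exts {
		v := sanitizeVolumeName(e)
		if prev, ok := seen[v]; ok && prev != e {
			conflicts = append(conflicts, e)
			continue
		}
		seen[v] = e
	}
	return conflicts
}

func main() {
	fmt.Println(sanitizeVolumeName("pg_partman"))                          // pg-partman
	fmt.Println(findConflicts([]string{"pg_ivm", "pg-ivm", "pg_partman"})) // [pg-ivm]
}
```

In this sketch `pg_ivm` and `pg-ivm` map to the same volume name, so the second one is reported as a conflict, which is the kind of case the webhook validation is described as preventing.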

sdwilsh pushed a commit to sdwilsh/ansible-playbooks that referenced this pull request

Mar 26, 2026