FreeIPA Service disruption in EU Control Plane

Incident Report for Cloudera

Postmortem

On March 12, 2025, intermittent service disruptions were experienced due to an expired certificate served by an endpoint. Despite an automated successful certificate renewal, staggered pod restarts resulted in inconsistencies, with older ingress controller pods serving the expired certificate alongside a recently updated pod. This discrepancy led to intermittent user errors.

Resolution was achieved through a systematic restart of all affected components, ensuring uniform application of the renewed certificate and restoring service consistency. We have also implemented measures to prevent similar occurrences.

We sincerely apologize for any inconvenience this service disruption may have caused. We are committed to maintaining the highest standards of service reliability and appreciate your understanding.

Posted Apr 09, 2025 - 13:51 UTC

Resolved

Current Status: Our teams have identified the source of the intermittent issue impacting the FreeIPA service in the EU Control Plane. We have successfully deployed a fix for the issue and confirmed that the issue has been resolved. If you are still experiencing issues or have any questions please raise a support case with us.

We sincerely apologize for any inconvenience this may have caused and we will have the root cause analysis (RCA) published within seven business days.

Customer Experience: During this outage customers on the EU Control Plane may have experienced intermittent issues with the FreeIPA service, impacting its health status and autoscaling operations.
Posted Mar 12, 2025 - 04:00 UTC