Clusters on the US Control Plane in an unreachable state
Incident Report for Cloudera
Postmortem

The root cause of the issue was the expiration of certificates on our Cluster Connectivity Manager (CCM) instances. The certificates were renewed earlier this year however did not get fully installed across all services which caused the breakdown of communication between the services.

To address this, we have enhanced our monitoring and devised a comprehensive remediation plan to detect and address similar incidents ahead of time.

We regret the inconvenience this may have caused and strive to ensure every measure is taken to avoid similar issues in the future.

Posted Nov 17, 2023 - 21:42 UTC

Resolved
This incident has been resolved.
Posted Nov 10, 2023 - 22:06 UTC
Update
Current Status: Our teams have successfully deployed a fix for the issue and confirmed that the issue has been resolved. If you are still experiencing issues or have any questions please raise a support case with us. A root cause analysis (RCA) will be published within 7 Business days.

Customer Experience: Customers on the US control plan may observe DataLake & DataHub cluster displaying an unreachable state

Incident Start time: ~20:00 UTC November 10th, 2023
Incident End time: ~21:40 UTC November 10th, 2023
Posted Nov 10, 2023 - 22:04 UTC
Investigating
Current Status: We are currently investigating an issue with our US Control Plane.
We will have another update within 60 mins.
Customer Experience: Customers on the US control plan may observe DataLake & DataHub cluster displaying an unreachable state
Incident Start time: ~20:00 UTC November 10th, 2023
Posted Nov 10, 2023 - 21:13 UTC
This incident affected: Cloudera Data Platform (US) (CDP Management Console).