CML/CDE/CDF cluster upgrade
Incident Report for Cloudera
Postmortem

On February 20, 2024, customers reported issues while performing a CML workspace upgrade. After investigating, we identified a bug in a newly promoted build that impacted CML/CDE/CDF upgrades. As a result, we temporarily disabled CML upgrades.

It is important to note that only the upgrade functionality was impacted; existing workload operations were not affected.

A hotfix was deployed to production to address the bug, after which CML upgrades were re-enabled.

We have implemented additional corrective measures within our automated test suite to proactively detect similar issues in the future.

We sincerely apologize for any inconvenience this incident may have caused our customers.

Posted Apr 04, 2024 - 18:06 UTC

Resolved
Current Status: Our teams have successfully deployed a fix and confirmed that the issue has been resolved. If you are still experiencing issues or have any questions, please raise a support case with us. A root cause analysis (RCA) will be published within seven business days.

Customer Experience: During this time cluster upgrades for CML, CDE and CDF are impacted.
Posted Mar 01, 2024 - 19:57 UTC
Monitoring
Current Status: Our teams have identified the source of the issue and have implemented a solution, which is now being monitored. Please expect further updates tomorrow.
Customer Experience: During this time cluster upgrades for CML, CDE and CDF are impacted.
Posted Feb 29, 2024 - 20:51 UTC
Update
Current Status: Our teams have identified the source of the issue. We are developing and implementing a solution to restore the services. We will provide another update towards the end of business today.

Customer Experience: During this time cluster upgrades for CML, CDE and CDF are impacted.
Posted Feb 29, 2024 - 15:07 UTC
Identified
The fix is currently being validated.
Posted Feb 29, 2024 - 08:44 UTC
Investigating
We are currently working on a fix for cluster upgrade failures that have been observed in the Control Plane regions.

Please hold off on upgrading clusters for CML, CDE and CDF in any of the Control Plane regions until a further update is made available.
Posted Feb 20, 2024 - 15:32 UTC
This incident affected: Cloudera Data Platform (AP) (DataFlow, Data Engineering, Machine Learning), Cloudera Data Platform (EU) (DataFlow, Data Engineering, Machine Learning), and Cloudera Data Platform (US) (DataFlow, Data Engineering, Machine Learning).