CDP Control plane operations are impacted
Incident Report for Cloudera
Postmortem

On Monday, February 13, 2023, at 04:18 UTC, Cloudera SRE detected a spike in errors in the CDP Control plane management console. The investigation determined that a recent production change had caused the outage. Although Cloudera follows strict software development lifecycle standards, the change contained an unforeseen bug arising from complex dependencies. The issue was resolved by rolling back to the previous version.

To reduce the risk of similar issues in the future, we are improving our test suites, monitoring, and dependency tracking to detect such scenarios as early in the development process as possible. We are also reviewing our rollback process to reduce mean time to recovery.

Posted Feb 16, 2023 - 09:34 UTC

Resolved
This incident has been resolved and the CDP Control plane is working as expected.
We will publish the RCA as soon as possible.
Posted Feb 13, 2023 - 06:20 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Feb 13, 2023 - 05:50 UTC
Update
We are continuing to work on a fix for this issue.
Posted Feb 13, 2023 - 05:43 UTC
Identified
The issue has been identified and a fix is being implemented.
Posted Feb 13, 2023 - 05:42 UTC
Update
We are continuing to investigate the issue. Please note this impacts the management operations of the workload clusters.
Posted Feb 13, 2023 - 05:08 UTC
Investigating
We're experiencing an elevated level of errors on the CDP Control plane and are currently looking into the issue.
Posted Feb 13, 2023 - 04:18 UTC
This incident affected: Cloudera Data Platform (US) (CDP Management Console, Data Hub), Cloudera Data Platform (AP) (CDP Management Console, Data Hub), and Cloudera Data Platform (EU) (CDP Management Console, Data Hub).