Intermittent 504 errors on the Data Catalog

Incident Report for Cloudera

Postmortem

On June 3rd, 2025, our internal monitoring systems detected intermittent 504 errors on the Data Catalog service UI.

The interruption occurred because Data Catalog service components reported that they were ready to handle requests immediately after a scheduled system restart, while they were in fact still performing essential internal data updates. These updates took approximately 24 minutes. During this period, the service could not fully process user requests, which led to intermittent access issues and 504 errors.

To prevent similar occurrences, we have enhanced our system's readiness checks to ensure the Data Catalog service is fully prepared to serve traffic.
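
As an illustration only, the kind of readiness check described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the actual Data Catalog implementation: the class name ReadinessGate, the task name "internal_data_update", and the choice of a 503 response while startup work is pending are all hypothetical.

```python
import threading


class ReadinessGate:
    """Hypothetical helper: tracks startup tasks that must finish before serving."""

    def __init__(self, required_tasks):
        # Names of startup tasks that have not completed yet.
        self._pending = set(required_tasks)
        self._lock = threading.Lock()

    def mark_done(self, task):
        """Record that a startup task has finished."""
        with self._lock:
            self._pending.discard(task)

    def is_ready(self):
        """Ready only when no required startup task is still pending."""
        with self._lock:
            return not self._pending

    def readiness_status(self):
        """Status code a readiness probe endpoint could return (assumed 200/503)."""
        return 200 if self.is_ready() else 503


# Simulate a restart: the component is running, but its data update is pending.
gate = ReadinessGate({"internal_data_update"})
status_during_update = gate.readiness_status()

# Once the update completes, the component may report ready and receive traffic.
gate.mark_done("internal_data_update")
status_after_update = gate.readiness_status()
```

The design point is that being started and being ready are separate signals: a component that has restarted but is still updating its internal data reports not ready, so traffic is withheld until the work is done.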

We apologize for any inconvenience caused by the service disruption. We are committed to providing a reliable and robust platform and truly appreciate your understanding.

Posted Jun 16, 2025 - 13:41 UTC

Resolved

Our internal monitoring systems detected a temporary interruption in access to the Data Catalog service. The interruption was transient: it was not a prolonged or persistent outage, though it did affect access to the service for a brief period.

Our technical teams are actively investigating the root cause of this disruption. We will post an update as soon as the investigation concludes and the cause is established.

Posted Jun 04, 2025 - 00:30 UTC