Orchestrator outage for customers in EU region
Incident Report for UiPath
Postmortem

Customer impact

From 2024-08-14 08:02 UTC to 2024-08-14 11:00 UTC, our customers experienced errors when accessing some of the services located in the EU region of Automation Cloud. Impacted products include Automation Cloud, Orchestrator, Automation Hub, Automation Ops, Document Understanding, Serverless Robots, Cloud Robots - VM, Solutions Management, and Insights.

Root cause

A critical identity component that manages the JWT signing keys failed to initialize on startup because of a missing configuration. This startup failure was ignored, which led to an incomplete OpenID Connect (OIDC) discovery document: the document was missing the "jwks_uri" link. As a result, UiPath services that depend on the OIDC discovery document to validate JWTs were unable to load the signing keys and therefore could not perform authentication.
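To illustrate the failure mode, here is a minimal Python sketch of a check that rejects a discovery document lacking "jwks_uri" instead of serving or consuming it. This is not UiPath's code: the function names and the choice to check only "issuer" and "jwks_uri" are assumptions made for illustration.

```python
import json
from typing import List

# Fields this sketch insists on. "jwks_uri" is the one the report says was
# missing; limiting the check to these two fields is an assumption.
REQUIRED_FIELDS = ("issuer", "jwks_uri")


def missing_discovery_fields(document: dict) -> List[str]:
    """Return the required fields that are absent or empty in the document."""
    return [name for name in REQUIRED_FIELDS if not document.get(name)]


def check_discovery_document(raw_json: str) -> dict:
    """Parse a discovery document and raise rather than return an incomplete one."""
    document = json.loads(raw_json)
    missing = missing_discovery_fields(document)
    if missing:
        raise ValueError("discovery document is missing: " + ", ".join(missing))
    return document
```

A check like this could run on the publishing side before the document is served, or on the consuming side before a service tries to load signing keys, so that an incomplete document surfaces as an explicit error.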

Detection

Automated alerts detected the issue immediately and notified UiPath on-call engineers, who confirmed the scope of the outage and updated UiPath Status.

Response

After a brief investigation, we determined that the issue was caused by a subset of the Identity pods. We restarted the faulty pods, which restored UiPath service.

Follow up

  • Fixed the startup code to handle missing configuration correctly
  • Added a new Kubernetes startup probe to ensure that Identity pods are fully initialized before they are brought into the rotation
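
The two follow-up items can be sketched together in Python. This is an illustrative sketch only, not the actual Identity service code: the class, the exception, and the setting name SIGNING_KEY_SOURCE are all hypothetical. The idea is that a missing setting raises an error instead of being ignored, and the status a startup probe endpoint would return stays at 503 until initialization has completed.

```python
class StartupError(RuntimeError):
    """Raised when a required setting is absent, so startup fails loudly."""


def load_required_setting(env: dict, name: str) -> str:
    """Return a non-empty setting from env, or raise instead of continuing."""
    value = env.get(name, "").strip()
    if not value:
        raise StartupError("required setting %r is not configured" % name)
    return value


class SigningKeyService:
    """Hypothetical stand-in for the component that manages signing keys."""

    def __init__(self) -> None:
        self.initialized = False
        self.key_source = None

    def start(self, env: dict) -> None:
        # SIGNING_KEY_SOURCE is an invented setting name for illustration.
        self.key_source = load_required_setting(env, "SIGNING_KEY_SOURCE")
        self.initialized = True

    def startup_probe_status(self) -> int:
        # HTTP status a probe endpoint would return: 503 until initialized.
        return 200 if self.initialized else 503
```

In a real service, `start` would typically be called with `os.environ`, and `startup_probe_status` would back an HTTP endpoint targeted by the pod's startup probe. Kubernetes treats an HTTP probe response of 200 or above and below 400 as success, so a 503 keeps the pod out of rotation until initialization finishes.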
Posted Aug 20, 2024 - 19:51 UTC

Resolved
All impacted services are now functioning normally. We apologize for any inconvenience this may have caused and thank you for your patience.
Posted Aug 14, 2024 - 11:23 UTC
Monitoring
Our Engineering teams have applied a fix and the impact is steadily decreasing. We are actively monitoring for any signs of the issue resurfacing.
Posted Aug 14, 2024 - 11:09 UTC
Identified
Our Engineering teams have identified the root cause of the issue and are currently implementing a potential fix.
Posted Aug 14, 2024 - 10:57 UTC
Update
We are continuing to investigate this issue.
Posted Aug 14, 2024 - 09:16 UTC
Update
We are continuing to investigate this issue.
Posted Aug 14, 2024 - 09:12 UTC
Update
The impact has extended to multiple UiPath services. Our Engineering teams are troubleshooting the issue and we'll keep providing timely updates until resolution.
Posted Aug 14, 2024 - 09:10 UTC
Investigating
Orchestrator is experiencing issues in the EU region. Our Engineering teams are working on it and we'll keep providing timely updates until resolution.
Posted Aug 14, 2024 - 08:45 UTC
This incident affected: Orchestrator, Action Center, and Apps.