Orchestrator US - Increased latency

Incident Report for UiPath

Postmortem

Customer Impact
Between Oct 29 16:00 UTC and Oct 31 22:00 UTC, the compute hosting the Orchestrator service experienced degraded performance due to resource contention. All customer-facing services remained available, but some customers may have experienced slower response times and processing delays.

Root Cause
During deployment, part of the compute was updated to a new Orchestrator version while the rest remained on the previous version. This created a misconfiguration in the affinity rule responsible for distributing Orchestrator replicas across compute nodes. As a result, two replicas were scheduled on the same compute host, causing CPU and memory contention and reduced performance. Pre-deployment testing did not account for partial updates to the service, allowing this scenario to occur.
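
The failure mode described above can be illustrated with a small simulation. This is a hypothetical sketch, not the actual Orchestrator scheduler or rule: it assumes the spreading rule only treated replicas as conflicting when they shared the same version, which is one way a partial rollout could let two replicas land on one host. The names `Replica`, `schedule`, and `match_keys` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    name: str
    app: str
    version: str


def schedule(replicas, hosts, match_keys):
    """Place each replica on the first host that holds no conflicting replica.

    Two replicas conflict when they agree on every attribute in match_keys
    (an assumed stand-in for the affinity rule's selector).
    """
    placement = {host: [] for host in hosts}
    for replica in replicas:
        for host in hosts:
            conflict = any(
                all(getattr(other, key) == getattr(replica, key) for key in match_keys)
                for other in placement[host]
            )
            if not conflict:
                placement[host].append(replica)
                break
        else:
            raise RuntimeError(f"no host satisfies the rule for {replica.name}")
    return placement


# Mid-rollout state: one replica on the old version, one on the new version.
replicas = [
    Replica("orchestrator-0", "orchestrator", "old"),
    Replica("orchestrator-1", "orchestrator", "new"),
]
hosts = ["host-a", "host-b"]

# Assumed faulty rule: replicas conflict only if app AND version match,
# so replicas on different versions are allowed to share a host.
version_scoped = schedule(replicas, hosts, match_keys=("app", "version"))

# Version-agnostic rule: any two replicas of the same app must be spread.
app_scoped = schedule(replicas, hosts, match_keys=("app",))

print({host: [r.name for r in rs] for host, rs in version_scoped.items()})
print({host: [r.name for r in rs] for host, rs in app_scoped.items()})
```

Under these assumptions, the version-scoped rule co-locates both replicas on the first host, while the version-agnostic rule spreads them across hosts even when versions differ.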

Detection
Elevated CPU and memory usage was detected by internal monitoring systems, and customer reports of slower response times helped confirm the impact.

Response
The engineering team identified the affinity rule misconfiguration and resolved the issue by deploying the new Orchestrator version across all compute nodes. This alignment restored correct rule behavior, eliminating contention and returning performance and resource usage to normal levels.

Follow-Up Actions
To prevent recurrence, the team is updating the affinity rule to account for multiple Orchestrator versions and adjusting compute configurations to ensure efficient resource utilization and scaling. Additionally, enhanced monitoring and alerting mechanisms are being implemented to proactively detect and address similar issues.
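
One form such a check could take is a periodic scan for hosts running more than one replica. The sketch below is an assumption about what that might look like, not the monitoring the team is implementing; `colocated_replicas` and the replica-to-host mapping it consumes are hypothetical.

```python
from collections import defaultdict


def colocated_replicas(replica_hosts):
    """Return hosts that run more than one replica.

    replica_hosts maps a replica name to the host it is scheduled on
    (a hypothetical input; how this is collected is outside the sketch).
    """
    by_host = defaultdict(list)
    for replica, host in replica_hosts.items():
        by_host[host].append(replica)
    return {host: sorted(names) for host, names in by_host.items() if len(names) > 1}


# Example: two replicas share host-a, which the check reports.
violations = colocated_replicas({
    "orchestrator-0": "host-a",
    "orchestrator-1": "host-a",
    "orchestrator-2": "host-b",
})
print(violations)
```

A non-empty result would be the signal to alert on, ideally before CPU and memory contention becomes visible to customers.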

Posted Nov 07, 2025 - 08:46 UTC

Resolved

The issue has been resolved
Posted Oct 31, 2025 - 22:16 UTC

Identified

We have identified the cause of the issue and are deploying a fix
Posted Oct 31, 2025 - 21:59 UTC

Update

We are continuing to search for the cause of the issue
Posted Oct 31, 2025 - 21:07 UTC

Investigating

We are investigating the latency issues which are persisting after scaling up resources.
Posted Oct 31, 2025 - 20:21 UTC

Identified

We have identified the affected resources, and are scaling them up while we investigate further
Posted Oct 31, 2025 - 19:32 UTC

Investigating

Orchestrator is experiencing increased latency in the US region. We are investigating the cause.
Posted Oct 31, 2025 - 19:19 UTC
This incident affected: Orchestrator.