From January 26, 2026, 2:00 AM to 4:48 PM UTC (approximately 15 hours), some customers in the US region experienced intermittent errors and degraded performance when using UiPath services.
The issue originated in the Identity platform, a backend service that supports sign-in and access control for multiple UiPath services. This caused degraded performance across multiple dependent services, including Orchestrator, Cloud Robots (VM and Serverless), Maestro, Studio Web, IXP, and Communications Mining.
A faulty node image version from Microsoft prevented cluster autoscaling. Combined with this, a configuration setting that forces even distribution of nodes across availability zones in the Identity service (widely considered a best practice) made the service more susceptible to this failure.
As traffic grew, the service was unable to add enough capacity to keep up, which led to delays and failures when verifying user identity. This, in turn, affected other services that rely on identity verification.
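For illustration only, the kind of zone-balanced, autoscaled node pool setup described above typically looks like the following on AKS. This is a sketch with placeholder resource group, cluster, and node pool names, not the actual Identity service configuration, and the balance-similar-node-groups profile setting shown is one common way to force even distribution across zone-based node groups rather than the confirmed setting involved here.

    # Sketch only: placeholder names, not the actual UiPath configuration.
    # A node pool spread across all three availability zones, with autoscaling enabled.
    az aks nodepool add \
      --resource-group my-rg \
      --cluster-name my-aks-cluster \
      --name identitypool \
      --zones 1 2 3 \
      --enable-cluster-autoscaler \
      --min-count 3 \
      --max-count 12

    # One common way to require even distribution across similar (zone-based)
    # node groups via the cluster autoscaler profile.
    az aks update \
      --resource-group my-rg \
      --name my-aks-cluster \
      --cluster-autoscaler-profile balance-similar-node-groups=true

When the autoscaler cannot bring up nodes in one zone (for example, because of a bad node image), a strict balancing requirement of this kind can also hold back scale-out in the healthy zones, which is one way such a setting can amplify a node image failure.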
The issue was identified through automated monitoring and customer reports indicating service disruption.
While the customer impact was detected quickly, the underlying scaling limitation took longer to identify because of its intermittent nature and the lack of a specific alert indicating that the service was unable to scale as expected. To the DRI (directly responsible individual) observing the incident, the symptoms looked like transient failures caused by traffic spikes.
To help prevent similar incidents in the future, we are taking the following actions:
These improvements are underway as part of our ongoing commitment to reliability.
Microsoft RCA:
Your support case 2601260050004104 is related to an ongoing outage in your region.
STATUS:
Mitigated 1/26/2026 9:00:36 PM UTC
What happened?
Between 08:00 UTC on 22 January 2026 and 00:00 UTC on 23 January 2026, a platform issue impacted Azure Kubernetes Service (AKS). Impacted customers experienced failures when attempting to start, stop, scale, or update their AKS clusters when using AKS VHD images 202510.19.1, 202510.29.0, 202511.07.0, 202511.12.0, 202511.20.0, 202512.06.0, 202512.18.0, 202601.07.0, and 202601.13.0. These failures stemmed from issues within the underlying system image used by certain AKS node pools, which could prevent normal cluster operations from completing successfully.
What do we know so far?
We determined that the issue occurred because some AKS clusters were using a specific node image that stopped working correctly after a period of time. When customers tried to perform actions such as scaling, starting, stopping, or updating their clusters, those operations failed because new or updated nodes could not complete their setup. Symptoms typically appear as repeated VMSS extension failures (for example, exit status 178). Existing workloads were often unaffected until such an operation was attempted, which is why the issue was not immediately visible. Customer action is required to apply the fix to the node pools.
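As a sketch (with placeholder resource group and cluster names), the node image version currently in use by each node pool can be listed with the Azure CLI and compared against the affected versions above:

    # List node pools with their current node image versions.
    az aks nodepool list \
      --resource-group my-rg \
      --cluster-name my-aks-cluster \
      --query "[].{name:name, nodeImageVersion:nodeImageVersion}" \
      --output table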
Actions Required:
To ensure mitigation, we strongly recommend that customers run a Node Image Upgrade on each affected node pool. If the upgrade succeeds, the node pool and cluster will recover normally. If the upgrade fails, replace the node pool entirely by adding a new node pool, migrating the workload to it, and then deleting the old node pool.
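A sketch of both mitigation paths with the Azure CLI, assuming placeholder resource group, cluster, and node pool names:

    # Path 1: upgrade the node image in place on the affected node pool.
    az aks nodepool upgrade \
      --resource-group my-rg \
      --cluster-name my-aks-cluster \
      --name affectedpool \
      --node-image-only

    # Path 2 (if the upgrade fails): add a replacement pool, drain workloads
    # off the old pool, then delete it. The "agentpool" node label is set by AKS.
    az aks nodepool add \
      --resource-group my-rg \
      --cluster-name my-aks-cluster \
      --name replacementpool \
      --node-count 3
    kubectl drain -l agentpool=affectedpool --ignore-daemonsets --delete-emptydir-data
    az aks nodepool delete \
      --resource-group my-rg \
      --cluster-name my-aks-cluster \
      --name affectedpool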
If your snapshot references a node image >= 202510.19.1 and < 202512.18.0, you will need to create a new snapshot.
If your snapshot references a node image >= 202601.07.0 and was created before 00:00 UTC 23 January 2026, you will need to create a new snapshot.
Recreate each affected node pool using the newly created snapshot (delete + recreate).
This replacement path is the recommended and supported mitigation for snapshot-based node pools.
While a Node Image Upgrade may succeed when specifying the same snapshot ID, this path is not reliable for snapshot scenarios and should not be used as the primary mitigation.
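For snapshot-based node pools, a sketch of the delete-and-recreate path is shown below. All names and resource IDs are placeholders, and the source node pool used for the new snapshot is assumed to already be running a fixed node image.

    # Create a fresh snapshot from a node pool that already runs a fixed image.
    az aks nodepool snapshot create \
      --resource-group my-rg \
      --name fixed-snapshot \
      --nodepool-id /subscriptions/<sub>/resourceGroups/my-rg/providers/Microsoft.ContainerService/managedClusters/my-aks-cluster/agentPools/healthypool

    # Delete the affected pool, then recreate it from the new snapshot.
    az aks nodepool delete \
      --resource-group my-rg \
      --cluster-name my-aks-cluster \
      --name affectedpool
    az aks nodepool add \
      --resource-group my-rg \
      --cluster-name my-aks-cluster \
      --name affectedpool \
      --snapshot-id /subscriptions/<sub>/resourceGroups/my-rg/providers/Microsoft.ContainerService/snapshots/fixed-snapshot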
What happens next?