As part of our ongoing efforts to improve system performance and reliability, we recently upgraded the Istio service mesh across our Kubernetes clusters. This upgrade included changes to the default retry behaviour for intra-cluster communication.
Following the upgrade, we identified and resolved sporadic 503 errors by updating retry configurations for affected applications. All systems are now stable, and we continue to monitor performance to ensure reliability.
Thank you for your patience.
Posted Mar 19, 2025 - 14:48 UTC
Monitoring
The system has been stable since we applied the mitigation. The investigation into the root cause is still in progress. We will provide further updates soon.
Posted Mar 19, 2025 - 13:37 UTC
Update
We have applied a change at the compute layer to mitigate the issue. We will keep monitoring the system and continue investigating the root cause.
Posted Mar 19, 2025 - 11:20 UTC
Investigating
Customers in the Europe region may experience intermittent 5xx errors in the Apps service. We are currently investigating this issue.