Hosted build delays - 11/16 - Mitigated

We’ve confirmed that all systems are back to normal as of 22:50 UTC. Our logs show that the incident started at 19:40 UTC and that, during the 3 hours and 10 minutes it took to resolve the issue, an unknown number of customers experienced build delays of more than 20 minutes. We apologize for any inconvenience this may have caused.

  • Root Cause: The failure was due to a loss of build capacity following Azure provisioning failures for virtual machines.
  • Chance of Re-occurrence: Low. Additional capacity has already been provisioned.
  • Lessons Learned: We are working to improve capacity monitoring and to alert earlier on capacity issues, so that we can maintain availability during temporary Azure resource issues.
  • Incident Timeline: 3 hours & 10 minutes, from 19:40 UTC through 22:50 UTC.

  • We have identified an issue with virtual machine management operations failing for some users in West US (Azure is working on this).
  • This reduced our overall capacity for allocating hosted build agents, which increased latency and delays for all users in the US who were using hosted builds.
  • We have since added capacity in our East US region to mitigate the impact, and wait times have started to drop.

  • We're investigating delays in allocating hosted build agents in Central US.
  • We are currently on a bridge call with the MMS DRI for further investigation.