Hi Eren Chin,
Welcome to the Microsoft Q&A Platform. Thank you for posting your query here.
Based on your query: extensions sometimes fail to provision on individual VMSS instances, which causes those instances to be marked as failed even when the overall deployment status is successful. Temporary resource constraints during provisioning can have the same effect.
To identify which instances have failed, you can go to the Azure Portal, navigate to your VMSS, and check under "Instances" for any that have a "Failed" provisioning state. If you need to check the provisioning state of extensions, you can run a CLI command that will show you the provisioning state for each instance and its extensions.
Use the command below:
az vmss list-instances --resource-group MyResourceGroup --name MyVmss --query "[].{instanceId:instanceId, extension:resources[].id, extProvisioningState:resources[].provisioningState}"
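If you only want the IDs of the instances that are in a failed state, the same command can be narrowed with a JMESPath filter. The following is a minimal sketch, not an official script: the function name list_failed_instances is hypothetical, and it assumes the instance-level provisioningState field in the list-instances output reads "Failed" for the affected instances.

```shell
# Hypothetical helper: prints the instance IDs whose provisioning state is "Failed".
# Usage: list_failed_instances <resource-group> <vmss-name>
list_failed_instances() {
  # Assumption: each item returned by list-instances exposes provisioningState.
  az vmss list-instances \
    --resource-group "$1" \
    --name "$2" \
    --query "[?provisioningState=='Failed'].instanceId" \
    --output tsv
}
```

For example, list_failed_instances MyResourceGroup MyVmss prints one instance ID per line, or nothing if no instance is marked as failed.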
Please refer to the document below for more information and troubleshooting steps:
VM extension provisioning errors in Virtual Machine Scale Sets
If some instances are marked as failed, you can try redeploying them from the Portal by selecting the failed instances and clicking the "Redeploy" button. Alternatively, you can restart the instances to see if that resolves the issue. Reviewing the activity logs and diagnostics can also provide more insight into why the failure occurred, and Azure Monitor or Application Insights can give you more detailed logs if needed.
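The redeploy step can also be done from the Azure CLI instead of the Portal. This is a rough sketch under the same assumptions as the query above: the function name redeploy_failed_instances is hypothetical, and it assumes failed instances report a provisioningState of "Failed". Please test it on a non-production scale set first.

```shell
# Hypothetical helper: redeploys every instance currently reported as "Failed".
# Usage: redeploy_failed_instances <resource-group> <vmss-name>
redeploy_failed_instances() {
  rfi_rg="$1"
  rfi_vmss="$2"
  # Collect the IDs of failed instances (one per line).
  rfi_ids="$(az vmss list-instances \
    --resource-group "$rfi_rg" \
    --name "$rfi_vmss" \
    --query "[?provisioningState=='Failed'].instanceId" \
    --output tsv)"
  if [ -z "$rfi_ids" ]; then
    echo "No failed instances found."
  else
    # $rfi_ids is left unquoted on purpose so each ID is passed as a separate argument.
    az vmss redeploy \
      --resource-group "$rfi_rg" \
      --name "$rfi_vmss" \
      --instance-ids $rfi_ids
  fi
}
```

To restart the instances rather than redeploy them, az vmss restart accepts the same --resource-group, --name, and --instance-ids arguments.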
If you need any other information, please feel free to ask. We will be happy to assist you.
If you found this information helpful, please click "Accept Answer" and "Upvote" on my post so that other community members can find it.