Hi Ruka Tsoi!
Went through the logs again.
The logs attached says VM SKU, identity and base image were null.
Here is updated remedials based on your pointers.
- Are you sure that you are passing VM SKU in with select compute types as "compute instances" and selecting base image as "mcr.microsoft.com/azureml/promptflow/promptflow-runtime:latest" or the default one showing up. (UI side)
- Check for other SKU if you are trying with same SKU. (There are SKU constrains on region wise sometime.
- Please enable either system assigned identity or use managed identity on your computes and add them as storage contributor, key Vault contributor, Azure Container Registry (to your respective container registry) and ML workspace contributor. Reference. (Permission)
- Please check with prerequisites on storage location and permission (Data scientist role needed, Storage File Data Privileged Contributor, Storage blob contributor) - Prerequisites.
- Switch to Azure Managed virtual network for better experience (Test in a dev workspace first). Known limitation.(Virtual network side)
- You have enabled outbound access to required service tags mentioned here.
Please check network isolation for prompt flow for more details
Hope above pointers fix your issue now.
Thank you.