Tutorial: Generate images using serverless GPUs in Azure Container Apps (preview)
In this article, you learn how to create a container app that uses serverless GPUs to power an AI application.
With serverless GPUs, you get direct access to GPU compute resources without manual infrastructure configuration, such as installing drivers. All you need to do is deploy your AI model's image.
In this tutorial, you:
- Create a new container app and environment
- Configure the environment to use serverless GPUs
- Deploy your app to Azure Container Apps
- Use the new serverless GPU-enabled application
- Enable artifact streaming to reduce GPU cold start
Prerequisites
| Resource | Description |
|---|---|
| Azure account | You need an Azure account with an active subscription. If you don't have one, you can create one for free. |
| Azure Container Registry instance | You need an existing Azure Container Registry instance or the permissions to create one. |
| Access to serverless GPUs | Access to GPUs is only available after you request GPU quotas. You can submit your GPU quota request via a customer support case. |
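If you prefer working from a terminal, you can also prepare your subscription with the Azure CLI. This is a sketch that assumes you have the Azure CLI installed and are signed in; it adds the Container Apps extension and registers the resource providers that Azure Container Apps uses.

```azurecli
# Install or upgrade the Container Apps CLI extension.
az extension add --name containerapp --upgrade

# Register the resource providers used by Azure Container Apps.
az provider register --namespace Microsoft.App
az provider register --namespace Microsoft.OperationalInsights
```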
Create your container app
Go to the Azure portal and search for and select Container Apps.
Select Create and then select Container App.
In the Basics window, enter the following values into each section.
Under Project details enter the following values:
| Setting | Value |
|---|---|
| Subscription | Select your Azure subscription. |
| Resource group | Select Create new and enter my-gpu-demo-group. |
| Container app name | Enter my-gpu-demo-app. |
| Deployment source | Select Container image. |

Under Container Apps environment, enter the following values:
| Setting | Value |
|---|---|
| Region | Select West US 3. For more supported regions, refer to Using serverless GPUs in Azure. |
| Container Apps environment | Select Create new. |

In the Create Container Apps environment window, enter the following values:

| Setting | Value |
|---|---|
| Environment name | Enter my-gpu-demo-env. |

Select Create.
Select Next: Container >.
In the Container window, enter the following values:
| Setting | Value |
|---|---|
| Name | Enter my-gpu-demo-container. |
| Image source | Select Docker Hub or other registries. |
| Image type | Select Public. |
| Registry login server | Enter mcr.microsoft.com. |
| Image and tag | Enter k8se/gpu-quickstart:latest. |
| Workload profile | Select the option that begins with Consumption - Up to 4... |
| GPU | Select the checkbox. |
| GPU Type | Select the T4 option and select the link to add the profile to your environment. |

Select Next: Ingress >.
In the Ingress window, enter the following values:
| Setting | Value |
|---|---|
| Ingress | Select the Enabled checkbox. |
| Ingress traffic | Select the Accepting traffic from anywhere radio button. |
| Target port | Enter 80. |

Select Review + create.
Select Create.
Wait a few moments for the deployment to complete and then select Go to resource.
This process can take up to five minutes to complete.
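As an alternative to the portal steps above, the same environment and app can be sketched with the Azure CLI. The workload profile name gpu-t4 is a hypothetical label, and the profile type Consumption-GPU-NC8as-T4 is an assumption based on the serverless T4 GPU offering; verify the exact type name available in your region before running.

```azurecli
# Create the resource group and a Container Apps environment in a supported region.
az group create --name my-gpu-demo-group --location westus3
az containerapp env create \
  --name my-gpu-demo-env \
  --resource-group my-gpu-demo-group \
  --location westus3

# Add the serverless T4 GPU workload profile (type name is an assumption).
az containerapp env workload-profile add \
  --name my-gpu-demo-env \
  --resource-group my-gpu-demo-group \
  --workload-profile-name gpu-t4 \
  --workload-profile-type Consumption-GPU-NC8as-T4

# Deploy the quickstart image with external ingress on port 80.
az containerapp create \
  --name my-gpu-demo-app \
  --resource-group my-gpu-demo-group \
  --environment my-gpu-demo-env \
  --image mcr.microsoft.com/k8se/gpu-quickstart:latest \
  --target-port 80 \
  --ingress external \
  --workload-profile-name gpu-t4
```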
Use your GPU app
From the Overview window, select the Application Url link to open the web app front end in your browser and use the GPU application.
Note
To achieve the best performance of your GPU apps, follow the steps to improve cold start for your serverless GPUs.
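If you'd rather find the front end from a terminal than from the Overview window, you can query the app's fully qualified domain name with the Azure CLI. This sketch assumes the app and resource group names used in this tutorial:

```azurecli
# Print the app's public FQDN; open it in a browser to use the GPU application.
az containerapp show \
  --name my-gpu-demo-app \
  --resource-group my-gpu-demo-group \
  --query properties.configuration.ingress.fqdn \
  --output tsv
```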
Monitor your GPU
Once you generate an image, use the following steps to view results of the GPU processing:
Open your container app in the Azure portal.
From the Monitoring section, select Console.
Select your replica.
Select your container.
Select Reconnect.
In the Choose start up command window, select /bin/bash, and select Connect.
Once the shell is set up, enter the command nvidia-smi to review the status and output of your GPU.
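The portal console steps above have a CLI equivalent. This is a sketch assuming a single running replica: az containerapp exec attaches an interactive shell to the container, and nvidia-smi then reports GPU utilization and memory use.

```azurecli
# Open an interactive shell in the running container.
az containerapp exec \
  --name my-gpu-demo-app \
  --resource-group my-gpu-demo-group \
  --command /bin/bash

# Then, inside the container shell, inspect the GPU:
nvidia-smi
```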
Clean up resources
The resources created in this tutorial affect your Azure bill.
If you aren't going to use these services long-term, use the following steps to remove everything created in this tutorial.
In the Azure portal, search for and select Resource Groups.
Select my-gpu-demo-group.
Select Delete resource group.
In the confirmation box, enter my-gpu-demo-group.
Select Delete.
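The portal cleanup steps above can also be condensed into a single Azure CLI command, which deletes the resource group and every resource it contains:

```azurecli
# Delete the tutorial resource group and all resources in it.
az group delete --name my-gpu-demo-group --yes --no-wait
```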