Deploy generative AI models
Before you can chat with a model or make API calls to receive responses to prompts, you must first deploy one. When you create a new deployment, you specify which base model to deploy. You can create any number of deployments across one or more Azure OpenAI resources, as long as their combined Tokens Per Minute (TPM) allocation stays within your deployment quota.
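To see how much quota a new deployment would count against, you can inspect current usage from the command line. This is a sketch, assuming the `az cognitiveservices usage list` subcommand is available in your Azure CLI version; the region name is a placeholder:

```shell
# Region to check — replace "eastus" with the region that hosts your resource
location="eastus"

# Show current usage and limits (including TPM-based quotas) for that region
az cognitiveservices usage list -l "$location" --output table
```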
Deploy using Azure AI Studio
On the Deployments page in Azure AI Studio, you can create a new deployment by selecting a model name from the menu. The available base models come from the list on the Models page.
From the same Deployments page, you can also view information about all your deployments, including deployment name, model name, model version, status, creation date, and more.
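The same deployment details can also be read from the command line. A minimal sketch, reusing the placeholder resource names from the CLI example in this article:

```shell
# Placeholder names — replace with your resource group and resource name
resourceGroup="OAIResourceGroup"
accountName="MyOpenAIResource"

# Tabulate every deployment in the resource (name, model, version, status)
az cognitiveservices account deployment list \
    -g "$resourceGroup" \
    -n "$accountName" \
    --output table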
Deploy using Azure CLI
You can also deploy a model using the Azure CLI. In the following example, replace these placeholder values with your own:
- OAIResourceGroup: replace with your resource group name
- MyOpenAIResource: replace with your resource name
- MyModel: replace with a unique name for your deployment
- gpt-35-turbo: replace with the base model you wish to deploy
```shell
az cognitiveservices account deployment create \
-g OAIResourceGroup \
-n MyOpenAIResource \
--deployment-name MyModel \
--model-name gpt-35-turbo \
--model-version "0301" \
--model-format OpenAI \
--sku-name "Standard" \
--sku-capacity 1
```
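Once the command completes, you can confirm the deployment with the matching `show` subcommand. A sketch using the same placeholder names as above:

```shell
# Inspect a single deployment by name (placeholder names as above)
az cognitiveservices account deployment show \
    -g OAIResourceGroup \
    -n MyOpenAIResource \
    --deployment-name MyModel
```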
Deploy using the REST API
You can also deploy a model using the REST API. In the request body, you specify the base model you wish to deploy. For a full example, see the Azure OpenAI documentation.
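As a sketch of what such a request can look like, a deployment is created with an HTTP PUT against the Azure Resource Manager endpoint. The subscription ID and the api-version value `2023-05-01` below are illustrative placeholders, not values from this article; check the documentation for the current api-version:

```shell
# Placeholder identifiers — substitute your own values
subscriptionId="00000000-0000-0000-0000-000000000000"
resourceGroup="OAIResourceGroup"
accountName="MyOpenAIResource"
deploymentName="MyModel"

# Management-plane URL for the deployment resource
url="https://management.azure.com/subscriptions/${subscriptionId}/resourceGroups/${resourceGroup}/providers/Microsoft.CognitiveServices/accounts/${accountName}/deployments/${deploymentName}?api-version=2023-05-01"

# PUT the deployment; the body names the base model, mirroring the CLI example
curl -X PUT "$url" \
    -H "Authorization: Bearer $(az account get-access-token --query accessToken -o tsv)" \
    -H "Content-Type: application/json" \
    -d '{
          "sku": { "name": "Standard", "capacity": 1 },
          "properties": {
            "model": {
              "format": "OpenAI",
              "name": "gpt-35-turbo",
              "version": "0301"
            }
          }
        }'
```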