Region availability for models in serverless API endpoints
In this article, you learn which regions are available for each model that supports serverless API endpoint deployment.
Important
Models that are in preview are marked as preview on their model cards in the model catalog.
Certain models in the model catalog can be deployed as a serverless API with pay-as-you-go billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
Region availability
Pay-as-you-go billing is available only to users whose Azure subscription belongs to a billing account in a country where the model provider has made the offer available (see the "Offer Availability Region" column in the tables in the following sections). If the offer is available in that country, the user must also have a hub or project in an Azure region where the model is available for deployment or fine-tuning, as applicable (see the "Hub/Project Region" columns in the same tables).
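The two-step check described above (offer availability in the billing country, then hub or project region) can be sketched as a small lookup. This is only an illustration, not an official API: the `TABLE` dictionary, the `offer_available` and `can_use` functions, and the `managed_countries` argument are hypothetical names. Only three rows from the tables in this article are transcribed. The article does not list which countries belong to "Microsoft Managed Countries", so the caller has to supply that set, and "Not applicable" is assumed here to mean that no country restriction applies.

```python
# Hypothetical sketch of the eligibility rule; none of these names are an Azure API.
MANAGED = "Microsoft Managed Countries"

SEVEN_REGIONS = frozenset({
    "East US", "East US 2", "North Central US", "South Central US",
    "Sweden Central", "West US", "West US 3",
})
SIX_US_REGIONS = SEVEN_REGIONS - {"Sweden Central"}

# Three sample rows transcribed from the tables in this article.
TABLE = {
    "Cohere Command R+": {
        "offer": frozenset({MANAGED, "Japan", "Qatar"}),
        "deploy": SEVEN_REGIONS,
        "finetune": frozenset(),  # "Not available"
    },
    "Llama 3.1 8B Instruct": {
        "offer": frozenset({MANAGED}),
        "deploy": SIX_US_REGIONS,
        "finetune": frozenset({"West US 3"}),
    },
    "Phi-3.5-Mini-Instruct": {
        "offer": None,  # "Not applicable"
        "deploy": frozenset({"East US 2", "Sweden Central"}),
        "finetune": frozenset({"East US 2"}),
    },
}


def offer_available(model, billing_country, managed_countries):
    """Step 1: is the offer available in the billing account's country?"""
    offer = TABLE[model]["offer"]
    if offer is None:
        # Assumption: "Not applicable" means there is no country restriction.
        return True
    if billing_country in offer:
        return True
    # The managed-country list is not given in this article; the caller supplies it.
    return MANAGED in offer and billing_country in managed_countries


def can_use(model, billing_country, hub_region, managed_countries, task="deploy"):
    """Step 2: is the hub or project in a region that supports the task?

    task is "deploy" or "finetune".
    """
    if not offer_available(model, billing_country, managed_countries):
        return False
    return hub_region in TABLE[model][task]
```

For example, `can_use("Cohere Command R+", "Japan", "Sweden Central", set())` passes both checks for deployment, while the same call with `task="finetune"` fails because the table lists fine-tuning as not available for that model.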
Cohere models
Model | Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine-tuning |
---|---|---|---|
Cohere Command R+ 08-2024 | Microsoft Managed Countries | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
Cohere Command R 08-2024 | Microsoft Managed Countries | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
Cohere Command R+ | Microsoft Managed Countries, Japan, Qatar | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
Cohere Command R | Microsoft Managed Countries, Japan, Qatar | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
Cohere Rerank v3 - English | Microsoft Managed Countries, Japan, Qatar | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
Cohere Rerank v3 - Multilingual | Microsoft Managed Countries, Japan, Qatar | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
Cohere Embed v3 - English | Microsoft Managed Countries, Japan, Qatar | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
Cohere Embed v3 - Multilingual | Microsoft Managed Countries, Japan, Qatar | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
JAIS models
Model | Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine-tuning |
---|---|---|---|
JAIS 30B Chat | Microsoft Managed Countries, Egypt | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
Meta Llama models
Model | Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine-tuning |
---|---|---|---|
Llama 2 7B, Llama 2 13B, Llama 2 70B | Microsoft Managed Countries | East US, East US 2, North Central US, South Central US, West US, West US 3 | West US 3 |
Llama 2 7B Chat, Llama 2 70B Chat | Microsoft Managed Countries | East US, East US 2, North Central US, South Central US, West US, West US 3 | West US 3 |
Llama 3 8B Instruct, Llama 3 70B Instruct | Microsoft Managed Countries | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
Llama 3.1 8B Instruct, Llama 3.1 70B Instruct | Microsoft Managed Countries | East US, East US 2, North Central US, South Central US, West US, West US 3 | West US 3 |
Llama 3.1 405B Instruct | Microsoft Managed Countries | East US, East US 2, North Central US, South Central US, West US, West US 3 | Not available |
Microsoft Phi-3 family models
Model | Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine-tuning |
---|---|---|---|
Phi-3.5-vision-Instruct | Not applicable | East US 2, Sweden Central | Not available |
Phi-3.5-MoE-Instruct | Not applicable | East US 2, Sweden Central | East US 2 |
Phi-3.5-Mini-Instruct | Not applicable | East US 2, Sweden Central | East US 2 |
Phi-3-Mini-4k-Instruct, Phi-3-Mini-128K-Instruct | Not applicable | East US 2, Sweden Central | East US 2 |
Phi-3-Small-8K-Instruct, Phi-3-Small-128K-Instruct | Not applicable | East US 2, Sweden Central | Not available |
Phi-3-Medium-4K-Instruct, Phi-3-Medium-128K-Instruct | Not applicable | East US 2, Sweden Central | East US 2 |
Mistral models
Model | Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine-tuning |
---|---|---|---|
Mistral Nemo | Microsoft Managed Countries, Brazil, Hong Kong, Israel | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
Ministral-3B | Microsoft Managed Countries, Brazil, Hong Kong, Israel | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
Mistral Small | Microsoft Managed Countries, Brazil, Hong Kong, Israel | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
Mistral Large, Mistral-Large (2407), Mistral-Large (2411) | Microsoft Managed Countries, Brazil, Hong Kong, Israel | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
Nixtla models
Model | Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine-tuning |
---|---|---|---|
TimeGEN-1 | Microsoft Managed Countries, Mexico, Israel | East US, East US 2, North Central US, South Central US, Sweden Central, West US, West US 3 | Not available |
Alternatives to region availability
If most of your infrastructure is in a particular region and you want to use models that are available only as serverless API endpoints, you can create a hub or project in a supported region and then consume the endpoint from another region.
To learn how to configure an existing serverless API endpoint in a hub or project other than the one where it was deployed, see "Consume serverless API endpoints from a different hub or project."