Determining sizing requirements for a GPU-enabled Azure VM

ITManager8815 0 Reputation points
2025-01-23T22:53:14.72+00:00

Greetings,

We are trying to determine the correct VM size for our AI workload, which performs NLP tasks. This workload does not require any training and will be used only for inference.

We have the following software configuration:

  1. a C# application that is heavily multithreaded using a lot of socket I/O. The application has concentrated bursts where 10-20 threads are fired concurrently to perform tasks (mostly socket I/O).

This app communicates via dedicated sockets to:

  1. a Python application which performs various NLP tasks. This app is also multithreaded to handle multiple incoming requests from the .NET app. This app sends queries to a local LLM (model size will vary based on query type). We estimate we will need to support sub-second performance (at the very least) on a 7B parameter model. Ultimately, we may need to go to larger model sizes if accuracy is insufficient. The amount of text passed to the LLM will range from 300-3000 tokens.
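One rough way to frame the sub-second target is to split per-request latency into prompt processing (prefill) and token generation (decode). The sketch below is only illustrative: the function name and both throughput figures are hypothetical placeholders, not measurements from any Azure GPU, and it ignores contention between concurrent requests. Real numbers would have to come from benchmarking the chosen model on a candidate VM.

```python
def estimate_latency_s(prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Rough single-request latency: prompt processing plus token generation."""
    prefill_s = prompt_tokens / prefill_tps   # time to read the prompt
    decode_s = output_tokens / decode_tps     # time to generate the answer
    return prefill_s + decode_s


# Hypothetical throughput values (tokens per second); replace with benchmarks.
PREFILL_TPS = 6000.0
DECODE_TPS = 100.0

if __name__ == "__main__":
    # The prompt sizes match the 300-3000 token range mentioned above;
    # the 20-token output length is an assumption.
    for prompt_tokens in (300, 3000):
        latency = estimate_latency_s(prompt_tokens, 20, PREFILL_TPS, DECODE_TPS)
        print(f"{prompt_tokens} prompt tokens: ~{latency:.2f} s")
```

With a model like this, the longest prompts and the expected output length are what decide whether a given GPU can stay under one second.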

In short, we need:

a) a CPU with sufficient cores to handle multiple concurrent threads on the .NET side. The app will have 5 or 6 background threads running continuously, and sudden bursts of activity which will require a minimum of 10-20 threads to run shorter-lived tasks.

b) a GPU with sufficient VRAM to handle, at the very least, a 7B-parameter model. Ultimately, we may need to support larger models for the same task if accuracy is insufficient.

We need the ideal configuration of GPU/VRAM and CPU/RAM to handle these tasks, and also, potentially, larger LLM sizes of up to 14B or 70B parameters.
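As a back-of-the-envelope check on VRAM for the model sizes above, the weights alone need roughly (parameter count) x (bytes per parameter). The sketch below assumes common precisions (2 bytes for FP16, 1 for INT8, 0.5 for 4-bit quantization) and a hypothetical 20% overhead factor for KV cache and runtime buffers; that overhead is a guess and grows with context length and concurrency.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def estimate_vram_gb(params_billion, precision="fp16", overhead=1.2):
    """Approximate VRAM in GB: weight size times an assumed overhead factor."""
    # 1 billion parameters at 1 byte each is about 1 GB.
    weights_gb = params_billion * BYTES_PER_PARAM[precision]
    return weights_gb * overhead


if __name__ == "__main__":
    for size in (7, 14, 70):
        row = ", ".join(
            f"{precision}: ~{estimate_vram_gb(size, precision):.1f} GB"
            for precision in BYTES_PER_PARAM
        )
        print(f"{size}B -> {row}")
```

Under these assumptions, a 7B model in FP16 needs about 14 GB for weights alone, while a 70B model needs about 140 GB in FP16, which is more than a single GPU typically offers without quantization or multiple GPUs.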

We are looking at the NC-series VMs, with a budget of about $1,000/month (see https://azure.microsoft.com/en-us/pricing/details/virtual-machines/windows/#pricing). Any feedback on the optimal configuration in terms of CPU/GPU would be greatly appreciated.
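To make the budget comparable with per-hour VM prices, it can be converted to a maximum hourly rate. The sketch below assumes an always-on VM and 730 hours per month (a common convention for monthly estimates); the function names are hypothetical and no actual VM prices are included.

```python
HOURS_PER_MONTH = 730  # assumed average number of hours in a month


def max_hourly_rate(monthly_budget_usd, hours_per_month=HOURS_PER_MONTH):
    """Highest hourly price that fits the budget if the VM runs continuously."""
    return monthly_budget_usd / hours_per_month


def affordable_hours(monthly_budget_usd, hourly_rate_usd):
    """Hours per month a VM at the given hourly price can run within budget."""
    return monthly_budget_usd / hourly_rate_usd


if __name__ == "__main__":
    rate = max_hourly_rate(1000)
    print(f"Always-on budget ceiling: ~${rate:.2f}/hour")
```

For $1,000/month this works out to roughly $1.37/hour for a VM that runs around the clock; a more expensive size only fits if the VM is deallocated for part of the month.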

Thank you in advance.

Azure Virtual Machines

1 answer

  1. Mahesh Goud Juvvadi 1,840 Reputation points Microsoft Vendor
    2025-01-24T08:33:56.2166667+00:00

    Hi ITManager8815,

    Thank you for reaching out to the Microsoft Q&A platform.

    Based on your question, my understanding is that AI workloads require specialized virtual machines (VMs) to handle high computational demands and large-scale data processing. Choosing the right VMs optimizes resource use and accelerates AI model development and deployment.

    Choose a suitable virtual machine image, such as the Data Science Virtual Machines, to access preconfigured tools for AI workloads quickly.

    https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/ai/infrastructure/compute

    Please check virtual machine pricing and, for a detailed estimate, use the Azure Pricing Calculator.

    Please find the articles below for your reference:
    https://techcommunity.microsoft.com/blog/educatordeveloperblog/how-to-choose-the-best-gpu-optimized-vm-sizes-for-your-project-on-azure/3583356

    https://techcommunity.microsoft.com/blog/azurecompute/using-microsoft-copilot-in-azure-to-find-the-best-vm-size-for-you/4356049

    I hope this information is helpful. Please feel free to reach out if you have any further questions.

    If the answer is helpful, please click "Accept Answer" and kindly upvote it. If you have extra questions about this answer, please click "Comment".  

    Thank You.

