gpt-4o | requested additional PTUs not showing up
We use the Azure OpenAI Service with the gpt-4o model, integrated into a custom application that processes documents and extracts information.
The extracted information must be processed and made available to users as soon as possible.
These are the features and needs of the solution:
- the number of pages per document is variable: 15 on average, with peaks of 100 pages
- each page averages about 1k tokens
- at least 4 documents must be processed in parallel
- processing times must be minimal
At present, the 30k TPM limit is too low and creates queues in the processing pipeline; we therefore need a higher token rate limit (ideally between 100k and 200k TPM) to allow simultaneous, low-latency processing.
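To make the sizing concrete, here is a rough back-of-the-envelope calculation based on the figures above. The page counts, tokens per page, and parallelism come from the stated requirements; the one-minute processing target is an assumption added for illustration.

```python
# Rough TPM sizing from the stated requirements (sketch, not an exact model).
AVG_PAGES = 15          # average document length (from requirements)
PEAK_PAGES = 100        # peak document length (from requirements)
TOKENS_PER_PAGE = 1_000 # average tokens per page (from requirements)
PARALLEL_DOCS = 4       # minimum concurrent documents (from requirements)
TARGET_MINUTES = 1      # assumed: a batch should clear within one minute

def required_tpm(pages: int) -> int:
    """Tokens per minute needed to push PARALLEL_DOCS documents of
    `pages` pages through the model within TARGET_MINUTES."""
    return pages * TOKENS_PER_PAGE * PARALLEL_DOCS // TARGET_MINUTES

print(required_tpm(AVG_PAGES))   # 60000 -> average case already exceeds 30k TPM
print(required_tpm(PEAK_PAGES))  # 400000 -> peak case; short queues are expected
```

Under these assumptions, the average case alone needs roughly 60k TPM, which is why the current 30k limit queues; the 100k-200k range we are requesting covers the average load with headroom, while 100-page peaks would still queue briefly.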
We requested PTUs under the Provisioned-Managed deployment type, but we are unable to use the additional PTUs with the gpt-4o model; they are only available for gpt-4.
Are there other ways to increase the TPM? Or how can we make the PTUs available for gpt-4o as well?