Azure OpenAI quota increase request
Hi, my company plans to use the GPT-4 model to build an application, and I'm responsible for selecting the right services to meet our needs. However, the default quota and limits (tokens per minute) won't cover our expected usage. I've read that…
Documentation for the Llama 3.2 11B Vision Instruct model says it has a 128K context window, but it is not able to process more than 8K tokens
I am writing to inquire about the context window of the Llama 3.2 11B Vision Instruct model. The documentation states that the context window is 128K tokens. However, when using the model, I am unable to provide input exceeding 8192 tokens. I would…
Getting data through an API to use in a prompt in my application
Hi, I want to pull data from an API into my prompt, so that when I ask questions the model responds based on the data returned by that API. How can I do this? Please give me a step-by-step approach. I already know how to use the OpenAI model in Azure; I've used it manually…
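A minimal sketch of the usual pattern, assuming a Python client: call your own data API first, then pass its response to the model as grounding context. The endpoint URL, deployment name, and environment variable names below are placeholders rather than the asker's actual setup.

```python
import os
import requests                 # pip install requests openai
from openai import AzureOpenAI

# Placeholder resource details; replace with your own.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

# 1) Fetch the data from your own API (hypothetical URL).
api_data = requests.get("https://example.com/api/orders/123", timeout=10).json()

# 2) Inject that data into the prompt so the model answers from it.
response = client.chat.completions.create(
    model="gpt-4o",  # your deployment name
    messages=[
        {"role": "system", "content": f"Answer only using this data:\n{api_data}"},
        {"role": "user", "content": "What is the status of order 123?"},
    ],
)
print(response.choices[0].message.content)
```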
Can we use the Azure OpenAI API to manage conversations via chat in all kinds of languages (English, German, Dutch, ...)? Does the number of training hours matter....
Token consumption
Hi, I wanted to check how many tokens are being used by the different models in Azure OpenAI. In the Metrics section I am not able to see the exact token usage. With the Count aggregation it shows 4; what is that supposed to indicate? I have the option for…
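As a side note, each completion response also reports exact token counts for that call, which can be cross-checked against the portal metrics. A small sketch, assuming the Python openai SDK and a placeholder deployment name:

```python
import os
from openai import AzureOpenAI  # pip install openai

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

# The usage object carries the exact token counts for this single call.
response = client.chat.completions.create(
    model="gpt-4o",  # your deployment name
    messages=[{"role": "user", "content": "Hello"}],
)
print("prompt tokens:    ", response.usage.prompt_tokens)
print("completion tokens:", response.usage.completion_tokens)
print("total tokens:     ", response.usage.total_tokens)
```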
Issue with Token Rate Limit when Uploading Files to OpenAI Playground
I am encountering an issue when using the Azure OpenAI API. When I send a prompt without attaching any file, the model responds as expected with information. However, when I attach a file to my prompt, I receive the following error…
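Attaching a file adds its contents to the prompt, so a single request can exceed the deployment's tokens-per-minute quota. A rough pre-flight check, assuming a plain-text attachment and the cl100k_base encoding used by GPT-4-class models; the file name and quota value are placeholders:

```python
import tiktoken  # pip install tiktoken

encoding = tiktoken.get_encoding("cl100k_base")

# Estimate how many tokens the attachment will add before sending it.
with open("attachment.txt", "r", encoding="utf-8") as f:
    file_tokens = len(encoding.encode(f.read()))
print(f"File adds roughly {file_tokens} tokens to the prompt")

TPM_LIMIT = 30_000  # example tokens-per-minute quota on the deployment
if file_tokens > TPM_LIMIT:
    print("This single request already exceeds the per-minute quota; "
          "chunk the file or request a quota increase.")
```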
Availability of fine-tuning for Azure OpenAI in an India datacenter
Could you please provide information on when fine-tuning will be available for Azure OpenAI in an India datacenter?
RAG application: document retrieval based on the user's Azure AD details
Project Overview We have developed a chatbot as an AI assistant for the company document repository. This chatbot is created using Azure Services, including Azure OpenAI and an Azure web application for the chat interface. The data source for the chatbot…
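One common approach here is security trimming: store the Entra ID (Azure AD) group IDs allowed to see each document in a filterable field on the search index, then filter every query by the signed-in user's groups. A sketch assuming a hypothetical group_ids field and index name:

```python
from azure.identity import DefaultAzureCredential   # pip install azure-identity azure-search-documents
from azure.search.documents import SearchClient

# Hypothetical index with a filterable collection field "group_ids"
# listing which Entra ID (Azure AD) groups may read each document.
search_client = SearchClient(
    endpoint="https://<your-search-service>.search.windows.net",
    index_name="company-docs",
    credential=DefaultAzureCredential(),
)

def search_for_user(query: str, user_group_ids: list[str]):
    # Only return documents whose group_ids overlap the user's groups.
    group_filter = " or ".join(
        f"group_ids/any(g: g eq '{gid}')" for gid in user_group_ids
    )
    return search_client.search(search_text=query, filter=group_filter)
```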
How to switch to Azure Enterprise
Hi, my company is using a regular pay-as-you-go subscription focused on Azure OpenAI services. However, we are exceeding the maximum quota limits and essentially need an Enterprise Agreement. Ultimately, we need up to 50 million tokens per minute…
Can I automatically deploy and undeploy a fine-tuned OpenAI model to reduce costs?
I wish to use a fine-tuned OpenAI model, but it will not be used very frequently. Therefore the hourly hosting cost is a deal breaker, and I really wish Microsoft would mirror the OpenAI approach where tokens are simply billed at a higher rate.... But is it…
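The hourly hosting charge only accrues while the deployment exists, so one option is to create and delete the deployment programmatically around the periods it is needed. A sketch using the Cognitive Services management SDK; all resource names and the model version below are placeholders:

```python
from azure.identity import DefaultAzureCredential  # pip install azure-identity azure-mgmt-cognitiveservices
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient
from azure.mgmt.cognitiveservices.models import (
    Deployment, DeploymentModel, DeploymentProperties, Sku,
)

# Placeholder names; substitute your subscription, resource group,
# Azure OpenAI account, and fine-tuned model id.
SUB, RG, ACCOUNT, DEPLOYMENT = "<sub-id>", "<rg>", "<aoai-account>", "my-ft-deployment"

mgmt = CognitiveServicesManagementClient(DefaultAzureCredential(), SUB)

def deploy_fine_tuned(model_name: str) -> None:
    # Create (or re-create) the deployment just before it is needed.
    mgmt.deployments.begin_create_or_update(
        RG, ACCOUNT, DEPLOYMENT,
        Deployment(
            properties=DeploymentProperties(
                model=DeploymentModel(format="OpenAI", name=model_name, version="1"),
            ),
            sku=Sku(name="Standard", capacity=1),
        ),
    ).result()

def undeploy() -> None:
    # Delete the deployment when idle so the hourly hosting charge stops.
    mgmt.deployments.begin_delete(RG, ACCOUNT, DEPLOYMENT).result()
```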
Attempting to use Azure OpenAI Chat playground and upload data
I am attempting to use the Chat playground and upload my own PDFs to test out Azure's RAG functionality; however, it isn't working. I have added a PDF to the 'interrim' container that is within the storage container 'terranexdocs'; however, this is not…
Is there a service/feature to read text content from a URL?
I have a current RAG solution that takes data from a blob container, indexes it, and then, when the index is queried, passes the results to an OpenAI model to help generate grounded responses. Now, I want a low-code/no-code solution to pass a webpage…
Azure AI Foundry - Rate Limit Exceeded Try again in 86400 seconds
Hi, I have deployed a gpt-4o model in Azure AI Foundry and am trying to test it in the Chat playground. I've used the 'Add your data' section to create a vector index that points to my Azure Blob Storage location where the files for the GPT model to…
How to Ensure GPT-4 (Azure OpenAI) Includes Original Terms in Parentheses During Translation
I'm using the Azure OpenAI Service with the GPT-4 model (GPT-4o mini) to create a bot for translating psychoanalysis articles. The translations need to preserve specific technical terms from the source text by including them in parentheses alongside the…
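One way to enforce this is a system prompt that states the rule explicitly and includes a short example, combined with a low temperature so the model sticks to it. The deployment name and sample sentence below are placeholders:

```python
import os
from openai import AzureOpenAI  # pip install openai

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

SYSTEM = (
    "You translate psychoanalysis articles into English. "
    "Whenever you translate a technical term, keep the original term in "
    "parentheses immediately after the translation, e.g. "
    "'transference (Übertragung)'. Never omit the parenthetical."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # your deployment name
    temperature=0,        # a low temperature helps the rule stick
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "Translate: 'Die Übertragung ist zentral.'"},
    ],
)
print(response.choices[0].message.content)
```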
Why am I getting a 500 Internal Server Error when calling Azure OpenAI?
When I call my Azure OpenAI resource, I get a 500 error, even though I previously had no issues calling the exact same resource with the same method.
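Since 500s are usually transient on the service side, a common mitigation while investigating is to retry with backoff and surface the error details. A sketch assuming the Python openai SDK; endpoint, key, and deployment name are placeholders:

```python
import time
from openai import APIStatusError, AzureOpenAI, InternalServerError

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<api-key>",
    api_version="2024-06-01",
)

def call_with_retry(messages, attempts: int = 3):
    # Retry transient 500s with exponential backoff and log the details
    # so they can be shared with support if the failures persist.
    for attempt in range(attempts):
        try:
            return client.chat.completions.create(model="gpt-4o", messages=messages)
        except InternalServerError as err:
            print(f"500 from the service (attempt {attempt + 1}): {err}")
            time.sleep(2 ** attempt)
        except APIStatusError:
            # Other status codes (401, 404, 429, ...) point at config or quota.
            raise
    raise RuntimeError("Azure OpenAI kept returning 500s")
```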
Azure OpenAI Assistant API: Support for Image Input
I'm currently facing an issue with the Azure OpenAI Assistant API when trying to use image input with the GPT-4 Vision model. While this feature has been available through OpenAI since May, it seems that Azure hasn't implemented it yet in their Assistant…
Can you provide guidance on how to use images as input in the Azure AI assistant? Which AI models support image input, and in which regions are these features available?
Hi, I am working on building a doubt-solving AI chatbot for students, and I want to create an assistant capable of handling both text and image inputs. The goal is to design an AI-powered assistant that can answer questions from users based on text…
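While Assistants image input is pending, a commonly used alternative is the Chat Completions API with a vision-capable deployment such as gpt-4o, passing the image as an inline data URL. The file name and deployment name below are assumptions:

```python
import base64
import os
from openai import AzureOpenAI  # pip install openai

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

# Encode a local image as a data URL so it can be sent inline.
with open("question.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",  # a vision-capable deployment
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Explain the problem shown in this image."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```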
Problem with File Search in Assistant API
I am currently using the Azure OpenAI Assistant API (2024-05-01-preview). I'm trying to use the File Search feature, but the ranking options parameter does not seem to be working in the tools. Has it not been updated yet? Specifically, I want to set a…
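For reference, this is the shape the (non-Azure) OpenAI Assistants v2 API expects for ranking options; whether the 2024-05-01-preview Azure API honors it is exactly what is being asked, so treat this as an unverified sketch with placeholder names:

```python
from openai import AzureOpenAI  # pip install openai

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<api-key>",
    api_version="2024-05-01-preview",
)

# file_search ranking options as defined by the OpenAI Assistants v2 API;
# the Azure preview API version may ignore or reject this block.
assistant = client.beta.assistants.create(
    model="gpt-4o",  # your deployment name
    instructions="Answer only from the attached files.",
    tools=[{
        "type": "file_search",
        "file_search": {"ranking_options": {"ranker": "auto", "score_threshold": 0.5}},
    }],
)
```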
GPT-3.5 Turbo Deployment Issue: Cross-Region Integration Challenges with AI Search
I have deployed the GPT-3.5 Turbo model, and my AI search (index and data source) is hosted in Sweden Central. The OpenAI service was created in Sweden, but the GPT-3.5 Turbo model is available only in France Central. Whenever I interact with the model,…
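The Azure OpenAI resource and the AI Search service are each addressed by their own endpoint and key in the request, so a sketch of the "on your data" style call may help when checking the cross-region wiring; every name below is a placeholder:

```python
import os
from openai import AzureOpenAI  # pip install openai

# The Azure OpenAI resource (e.g. France Central) and the AI Search
# service (e.g. Sweden Central) are addressed independently.
client = AzureOpenAI(
    azure_endpoint="https://<aoai-france>.openai.azure.com",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

response = client.chat.completions.create(
    model="gpt-35-turbo",  # your deployment name
    messages=[{"role": "user", "content": "What does the handbook say about leave?"}],
    extra_body={
        "data_sources": [{
            "type": "azure_search",
            "parameters": {
                "endpoint": "https://<search-sweden>.search.windows.net",
                "index_name": "<index-name>",
                "authentication": {"type": "api_key",
                                   "key": os.environ["AZURE_SEARCH_KEY"]},
            },
        }]
    },
)
print(response.choices[0].message.content)
```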
How do I change the request limit for the Whisper model in Azure?
Hi everyone! 👋 I’m currently using the Whisper model in Azure OpenAI Service and running into issues with the default request limits. I need to increase the request limit to better handle my application’s workload. Here’s what I’ve tried/checked so…
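Until a quota increase is granted, a typical stopgap is to back off and retry when the Whisper deployment returns 429. A sketch assuming the Python openai SDK, with endpoint, key, file path, and deployment name as placeholders:

```python
import time
from openai import AzureOpenAI, RateLimitError

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<api-key>",
    api_version="2024-06-01",
)

def transcribe(path: str, attempts: int = 5) -> str:
    # Whisper deployments often have a low requests-per-minute quota;
    # back off and retry on 429 while a quota increase is pending.
    for attempt in range(attempts):
        try:
            with open(path, "rb") as audio:
                result = client.audio.transcriptions.create(
                    model="whisper",  # your Whisper deployment name
                    file=audio,
                )
            return result.text
        except RateLimitError:
            wait = 2 ** attempt
            print(f"429 received; retrying in {wait} seconds")
            time.sleep(wait)
    raise RuntimeError("Still rate limited after retries")
```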