Hi Bojana Taseva!
Welcome to the Azure OpenAI Q&A forum, and thank you for posting your query here.
Sorry for the inconvenience this has caused.
I see you are facing high token usage with GPT-3.5 Turbo 16k for a 19 KB SharePoint file.
Here are the answers to your queries.
1. On high token usage:
From the Azure OpenAI side:
Token consumption rises with the size and complexity of your query. GPT-3.5 Turbo 16k can handle a maximum of 16k tokens (its context window), shared between the prompt and the completion.
From the AI Search indexing side:
You can optimize your index schema to reduce the size of the index, so that less retrieved content is appended to each prompt. You can also estimate how many tokens a document will consume before sending it, as shown in the sketch below.
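Here is a minimal sketch, assuming the tiktoken package is installed; the file name is a placeholder for your exported SharePoint document text.

```python
# A minimal sketch, assuming tiktoken is installed (pip install tiktoken).
# cl100k_base is the encoding used by the gpt-3.5-turbo model family.
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")

# "sharepoint_export.txt" is a hypothetical placeholder for your document text.
with open("sharepoint_export.txt", encoding="utf-8") as f:
    document_text = f.read()

# Count how many tokens the document will occupy in the prompt.
token_count = len(encoding.encode(document_text))
print(f"Document tokens: {token_count} (16k context window = 16384 tokens)")
```

This lets you check up front whether a retrieved document alone is already consuming most of the context window.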
2. On optimizing token usage:
- You can reduce the max_tokens parameter on your requests (or in the deployment playground) to cap the size of the completion.
- You can ask the model to answer within a specific word limit, e.g. "Please summarize the scenario and keep the word count under 50 words."
- Keep your queries simple and precise instead of long, and adopt few-shot prompting to steer the model toward the desired answer; see the sketch after this list.
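Below is a minimal sketch of these suggestions using the openai Python SDK (v1.x) against Azure OpenAI; the endpoint, API key, deployment name, and example messages are placeholders for your own values.

```python
# A minimal sketch, assuming the openai v1.x SDK (pip install openai).
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",  # placeholder
    api_key="YOUR-API-KEY",                                   # placeholder
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="gpt-35-turbo-16k",  # your deployment name
    max_tokens=100,            # caps completion size, lowering completion_tokens
    messages=[
        # Instruct the model to stay within a word limit.
        {"role": "system", "content": "Answer in at most 50 words."},
        # A short few-shot example steering the model toward concise answers.
        {"role": "user", "content": "Summarize: the deployment failed twice."},
        {"role": "assistant", "content": "The deployment failed on two attempts."},
        # The actual query, kept simple and precise.
        {"role": "user", "content": "Summarize the scenario in the attached notes."},
    ],
)
print(response.choices[0].message.content)
```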
3. On finding the tokens used per request:
You can find your token usage (prompt tokens, completion tokens, and total tokens) under the usage section of the API response, for example:
"usage":
{ "completion_tokens": 39,
"prompt_tokens": 58,
"total_tokens": 97 }
Please upvote the answer and mark "Yes" if it was helpful to you.
Thank you.