Concerns about the change to tables in Markdown output in 2024-07-31-preview
Hello Azure AI Document Intelligence Team,
We have some concerns about how tables are represented in Markdown output as of the 2024-07-31-preview release:
Starting with 2024-07-31-preview, tables in Markdown output are represented as HTML tables to enable rendering of merged cells, multi-row headers, etc.
Our main concern is that, based on our testing, HTML tables require nearly twice as many tokens as Markdown tables. Since we're developing GenAI apps on top of Azure AI Document Intelligence, this increase in token usage reduces how much content fits in each chunk's context and might negatively affect the accuracy of our applications.
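To illustrate the kind of comparison we mean, here is a minimal sketch that counts tokens for the same small table in both representations. The `rough_token_count` helper and the sample tables are made up for illustration, and the regex split is only a crude stand-in for a real LLM tokenizer, so the exact ratio will differ from what an actual model tokenizer reports.

```python
import re


def rough_token_count(text: str) -> int:
    """Crude proxy for a token count: each word and each punctuation
    character counts as one token. NOT a real LLM tokenizer."""
    return len(re.findall(r"\w+|[^\w\s]", text))


# The same illustrative table in both representations (sample data is invented).
MARKDOWN_TABLE = """| Region | Quarter | Sales |
| --- | --- | --- |
| East | Q1 | 10 |
| East | Q2 | 12 |"""

HTML_TABLE = (
    "<table>"
    "<tr><th>Region</th><th>Quarter</th><th>Sales</th></tr>"
    "<tr><td>East</td><td>Q1</td><td>10</td></tr>"
    "<tr><td>East</td><td>Q2</td><td>12</td></tr>"
    "</table>"
)

md_tokens = rough_token_count(MARKDOWN_TABLE)
html_tokens = rough_token_count(HTML_TABLE)
print(f"markdown: {md_tokens}, html: {html_tokens}, ratio: {html_tokens / md_tokens:.2f}")
```

Swapping the regex for the tokenizer of the target model would give a more representative number for a given deployment.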
Could we explore alternative solutions for handling merged cells and multi-row headers? For example, could cells be duplicated across row spans instead of converting every table to HTML? It would also help to offer an option to choose between Markdown and HTML tables, so that users can handle the merged-cell issue in Markdown themselves if Azure can't address it.
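As a rough sketch of the cell-duplication idea, the snippet below converts an HTML table to a Markdown table by repeating the text of each merged cell in every grid position it spans. It uses only Python's standard `html.parser`; the names `html_table_to_markdown`, `_TableParser`, and `SAMPLE_HTML` are hypothetical, it assumes a single non-nested table, and it is not how the service itself produces its output.

```python
from html.parser import HTMLParser


class _TableParser(HTMLParser):
    """Collects rows of (text, rowspan, colspan) from one non-nested HTML table."""

    def __init__(self):
        super().__init__()
        self.rows = []     # each row is a list of (text, rowspan, colspan)
        self._cell = None  # [text_parts, rowspan, colspan] while inside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            a = dict(attrs)
            self._cell = [[], int(a.get("rowspan") or 1), int(a.get("colspan") or 1)]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[0].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            parts, rowspan, colspan = self._cell
            if not self.rows:  # tolerate a cell that appears before any <tr>
                self.rows.append([])
            text = " ".join("".join(parts).split())  # collapse whitespace
            self.rows[-1].append((text, rowspan, colspan))
            self._cell = None


def html_table_to_markdown(html: str) -> str:
    """Expand rowspan/colspan by duplicating cell text, then emit Markdown."""
    parser = _TableParser()
    parser.feed(html)
    grid = {}  # (row, col) -> cell text
    for r, row in enumerate(parser.rows):
        c = 0
        for text, rowspan, colspan in row:
            while (r, c) in grid:  # skip slots filled by a span from a row above
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    if not grid:
        return ""
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    lines = []
    for r in range(n_rows):
        cells = [grid.get((r, c), "").replace("|", "\\|") for c in range(n_cols)]
        lines.append("| " + " | ".join(cells) + " |")
        if r == 0:  # Markdown needs a separator after the first (header) row
            lines.append("|" + " --- |" * n_cols)
    return "\n".join(lines)


# Invented sample: "East" spans two rows and is duplicated in the output.
SAMPLE_HTML = (
    "<table><tr><th>Region</th><th>Quarter</th><th>Sales</th></tr>"
    '<tr><td rowspan="2">East</td><td>Q1</td><td>10</td></tr>'
    "<tr><td>Q2</td><td>12</td></tr></table>"
)
print(html_table_to_markdown(SAMPLE_HTML))
```

The trade-off is that duplication loses the information that the cells were merged, and multi-row headers are flattened into a single header row followed by ordinary rows, which is why an option to choose the output format would still be useful.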
Thank you for considering these suggestions.
Best regards, Jonathan