How to Enhance Azure OCR Detection in Arabic Data

Question

Hello everyone, I'm using the Azure Computer Vision service to detect Arabic text in images that contain tables. In some cases, the results are not accurate.

As you can see in the attached image, the Arabic dates (right column) are not detected, and the header is treated as a single cell.

How can I solve this? Note that I'm using the Img2Table library in Python to read the text in each cell as one piece of text.

User's image

Thank you!

Accepted Answer

Hi @Mayar Alzerki

Thank you for using the Microsoft Q&A forum.

The results of the OCR API depend mostly on the quality of the image, so the image should conform to the service's input requirements.

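To improve Azure OCR detection of Arabic text, try setting the OCR API's language and detectOrientation request parameters, then check the textAngle and regions (OcrRegion) values in the response to confirm that the orientation and layout were detected correctly. The languageDetectionMode setting may also be worth trying where it is available. Additionally, review the Img2Table library documentation and the Azure OCR API documentation to make sure both are being used correctly.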
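As a rough illustration of the parameters mentioned above, the sketch below builds a request for the v3.2 OCR operation with language and detectOrientation set, and flattens the returned regions into lines of text. The API version, the choice of "ar" as the language code, and the helper names (build_ocr_url, run_ocr, lines_from_result) are assumptions made for this example, not something the original poster confirmed; check them against the documentation for the API version you actually call.

```python
import json
import urllib.parse
import urllib.request


def build_ocr_url(endpoint, language="ar", detect_orientation=True):
    """Build the request URL for the OCR operation (v3.2 assumed)."""
    query = urllib.parse.urlencode({
        "language": language,
        "detectOrientation": str(detect_orientation).lower(),
    })
    return f"{endpoint.rstrip('/')}/vision/v3.2/ocr?{query}"


def run_ocr(endpoint, key, image_bytes, language="ar"):
    """Send raw image bytes to the service and return the parsed JSON."""
    request = urllib.request.Request(
        build_ocr_url(endpoint, language),
        data=image_bytes,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def lines_from_result(result):
    """Join the words of every line in every region of an OCR response."""
    lines = []
    for region in result.get("regions", []):
        for line in region.get("lines", []):
            words = [word["text"] for word in line.get("words", [])]
            lines.append(" ".join(words))
    return lines
```

Calling run_ocr with your own endpoint, key, and image bytes returns a dictionary whose textAngle and orientation fields show how the service interpreted the page, which can help explain a missing column or a merged header.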

If the issue persists, you can try preprocessing the image before sending it to the service, or use Custom Vision to train a model on your specific objects or text, which may eventually improve the accuracy of the OCR results.
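The answer does not say which preprocessing to apply. One common option for scanned tables is binarization, so here is a minimal sketch of Otsu thresholding written in plain Python. It assumes the image has already been loaded as a 2-D list of grayscale values from 0 to 255 (for instance with an image library of your choice); the function names are hypothetical, and whether binarization helps with this particular document is untested.

```python
def otsu_threshold(gray):
    """Return the Otsu threshold (0-255) for a 2-D list of grayscale values."""
    histogram = [0] * 256
    for row in gray:
        for value in row:
            histogram[value] += 1
    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for level in range(256):
        weight_bg += histogram[level]
        if weight_bg == 0:
            continue  # no pixels at or below this level yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        sum_bg += level * histogram[level]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance; the threshold that maximizes it wins.
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(gray):
    """Map pixels above the Otsu threshold to white (255), the rest to black (0)."""
    threshold = otsu_threshold(gray)
    return [[255 if value > threshold else 0 for value in row] for row in gray]
```

It is worth comparing OCR output on the original and the binarized image, since thresholding can also erase faint table borders that the table detection relies on.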

Hope this helps.


Answer

In the attached screenshot, the left side is the actual statement and the right side is the data extracted by Azure. As the right side shows, the extracted data is not in the same order as the original on the left.

Statements are reversed when they contain a mixture of English and Arabic words.

Is there any solution for this?

User's image