How to improve line breaks in Document Intelligence's OCR output?
Yathharthha Kaushal
Hello everyone,
I'm currently working with an image of code that I need to convert into text using OCR. The issue is that the code uses line breaks in a way that, if modified, could lead to compile/runtime errors.
Here's an example of the OCR output I am getting:
Paragraph:
public static int foo(int bar) {
==========
Paragraph:
bar++; if (bar < 10) bar = foo(bar);
==========
Paragraph:
int i = 0; int j = 0; while (i > foo(j - bar)) { j++; bar += j;
==========
Paragraph:
3
==========
Paragraph:
return bar;
==========
Paragraph:
}
==========
However, the code should look like this:
public static int foo(int bar) {
    bar++;
    if (bar < 10)
        bar = foo(bar);
    int i = 0;
    int j = 0;
    while (i > foo(j - bar)) {
        j++;
        bar += j;
    }
    return bar;
}
Here's the actual image used for the OCR:
Is there any way to get Document Intelligence's OCR to preserve the original line breaks, so that the extracted code stays correctly formatted?
Best regards,
YK
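One possible direction, offered as a rough sketch rather than a confirmed fix: the output above is grouped by paragraph, and paragraph text tends to merge several physical lines into one string. If the analysis result also exposes per-line entries for each page (as far as I know, each with its text content and a bounding polygon; please verify this against the SDK version in use), the code could be rebuilt from those lines instead. The helper below is hypothetical and works on plain `(text, polygon)` tuples, assuming the polygon is a flat `[x1, y1, x2, y2, ...]` list; the name `rebuild_code` and the `indent_unit` parameter are illustrative and not part of any Azure API.

```python
from typing import List, Sequence, Tuple

# Hypothetical input record: (text, polygon), where polygon is assumed to be
# a flat [x1, y1, x2, y2, ...] list of corner coordinates for one OCR line.
Line = Tuple[str, Sequence[float]]


def rebuild_code(lines: List[Line], indent_unit: float, spaces: int = 4) -> str:
    """Join OCR lines with newlines and infer indentation from each left edge.

    indent_unit is the horizontal distance (in the polygon's units) that
    corresponds to one indentation level; it must be chosen per image.
    """
    if not lines:
        return ""

    def left(poly: Sequence[float]) -> float:
        return min(poly[0::2])  # smallest x coordinate

    def top(poly: Sequence[float]) -> float:
        return min(poly[1::2])  # smallest y coordinate

    # Order lines top to bottom, then measure indentation from the leftmost line.
    ordered = sorted(lines, key=lambda item: top(item[1]))
    margin = min(left(poly) for _, poly in ordered)

    out = []
    for text, poly in ordered:
        level = round((left(poly) - margin) / indent_unit)
        out.append(" " * (spaces * level) + text)
    return "\n".join(out)


if __name__ == "__main__":
    # Made-up coordinates for illustration only, not taken from the real image.
    sample = [
        ("public static int foo(int bar) {", [1.0, 1.0, 5.0, 1.0, 5.0, 1.2, 1.0, 1.2]),
        ("bar++;", [1.5, 1.4, 2.2, 1.4, 2.2, 1.6, 1.5, 1.6]),
        ("if (bar < 10)", [1.5, 1.8, 3.0, 1.8, 3.0, 2.0, 1.5, 2.0]),
        ("bar = foo(bar);", [2.0, 2.2, 3.5, 2.2, 3.5, 2.4, 2.0, 2.4]),
        ("}", [1.0, 2.6, 1.1, 2.6, 1.1, 2.8, 1.0, 2.8]),
    ]
    print(rebuild_code(sample, indent_unit=0.5))
```

This only addresses where lines break and how far they are indented. Character misreads, such as the closing brace that came back as `3` in the output above, would need separate handling.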