We are looking for developers with solid expertise in OCR, Tesseract, or other open-source software.
We are the creators of [login to view URL], a platform for independent comic artists who want to show their work to the world.
We have created a unique feature: a collaborative, crowdsourced interface that lets authors have their comics translated by fans! Thanks to this feature, some comics have been translated into more than 15 languages by fans!
We now want to improve this interface by automating a step that is currently done manually in 3 phases:
1/ selecting text areas by creating rectangles on top of them
2/ emptying the text areas
3/ then writing the translation in the text area.
Milestone 1, text zone recognition :
The software starts by receiving one parameter: the path to a jpg/png file containing a full comic page.
The process analyses the image and searches for zones of text inside balloons (speech bubbles). Only words of at least 4 letters count; all other text is ignored.
The software returns the coordinates of the zones it has found.
The zones’ coordinates are given in a text file, one line per zone.
Each line has the form X,Y,width,height. All values are numbers in pixels; X,Y is the top-left corner of the rectangle, with 0,0 being the top-left corner of the image.
Example: 56,48,350,220 (a 350x220 pixel rectangle positioned at X=56, Y=48)
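The zone file format described above can be sketched as follows. This is only an illustration in Python: the names `Zone`, `format_zone`, `parse_zone` and `write_zones` are hypothetical and not part of the specification, and integer pixel values are assumed.

```python
from typing import Iterable, NamedTuple


class Zone(NamedTuple):
    """One detected text zone, in pixels (hypothetical helper type)."""
    x: int       # left edge, 0 is the left edge of the image
    y: int       # top edge, 0 is the top edge of the image
    width: int
    height: int


def format_zone(zone: Zone) -> str:
    """Serialise a zone as one 'X,Y,width,height' line (no line ending)."""
    return f"{zone.x},{zone.y},{zone.width},{zone.height}"


def parse_zone(line: str) -> Zone:
    """Parse one 'X,Y,width,height' line back into a Zone."""
    x, y, width, height = (int(part) for part in line.strip().split(","))
    return Zone(x, y, width, height)


def write_zones(zones: Iterable[Zone], path: str) -> None:
    """Write one line per zone to a text file."""
    with open(path, "w", encoding="utf-8") as handle:
        for zone in zones:
            handle.write(format_zone(zone) + "\n")
```

With this sketch, `format_zone(Zone(56, 48, 350, 220))` produces the example line `56,48,350,220`.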
The target performance is 1 second per image.
The target success rate is at least 95% of text zones detected correctly; it must never fall below 80%. During the test phase, the results will be checked by humans.
There won’t be any user interface; this program will communicate with the [login to view URL] API, and we will handle the rest of the process.
Milestone 2, OCR :
Input: an image containing only text, selected and extracted by the first part.
The image is passed as a parameter so that the milestone can be checked, but it won’t be passed this way once the project is complete.
The process analyses the image part and recognizes the text.
Outputs the text in a text file.
Encoding in UTF-8
Text must be on one line.
If line breaks are found, write \n (a literal backslash followed by an “n”).
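The output rules above (UTF-8, a single line, line breaks written as a literal backslash and “n”) could look like this in Python. The function names are hypothetical, and stripping trailing line breaks from the recognised text is an assumption, not something the specification states.

```python
def to_single_line(text: str) -> str:
    """Replace real line breaks with the two characters backslash + n."""
    # Normalise Windows (\r\n) and old Mac (\r) line endings first.
    normalised = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: trailing line breaks from the OCR engine are dropped.
    normalised = normalised.rstrip("\n")
    return normalised.replace("\n", "\\n")


def write_recognised_text(text: str, path: str) -> None:
    """Write the recognised text as one UTF-8 encoded line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(to_single_line(text) + "\n")
```

The specification does not say how a literal backslash already present in the recognised text should be handled; this sketch leaves it unchanged.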
The target performance is 0.3 seconds per image.
Milestone 3, completion :
Using the full image and the rectangle information, the software will run the following process for each part of the image independently:
Analyses the image part, and recognizes the text (like Milestone 2)
Write the text in a text file, on one line.
Begin the line with the size of the letters and a comma.
Example : “45,Hello!\nHow are you?”
Thus, each rectangle will have its own line in the text file.
We can use the same text file for both: each line gives the rectangle area first, then the font size and text.
Example: 56,48,350,220,45,Hello!\nHow are you? (X, Y, Width, Height, FontSize, Text)
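Since the text itself may contain commas, a reader of this combined format should split on the first five commas only. A minimal Python sketch, assuming integer values for the five numeric fields (the function names are hypothetical):

```python
def format_result_line(x, y, width, height, font_size, text):
    """Build one 'X,Y,Width,Height,FontSize,Text' line (text already escaped)."""
    return f"{x},{y},{width},{height},{font_size},{text}"


def parse_result_line(line: str):
    """Split a result line into five integers and the (still escaped) text."""
    # maxsplit=5 keeps any commas inside the text field intact.
    *numbers, text = line.rstrip("\n").split(",", 5)
    x, y, width, height, font_size = (int(n) for n in numbers)
    return x, y, width, height, font_size, text
```

The returned text still contains the two-character sequence backslash + n; converting it back to real line breaks is left to the caller.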
The target success rate is at least 95% of text zones detected correctly; it must never fall below 80%. During the test phase, the results will be checked by humans.
There won’t be any user interface; this program will communicate with the [login to view URL] API, and we will handle the rest of the process.
Performance and APIs
If launching the software is slow, it can offer a batch mode that processes an entire directory of images in one run.
The Amilova code will run the program, telling it which images or directory to process, then read the text files it creates.
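One possible shape for such a command-line entry point is sketched below in Python. Everything here is an assumption made for illustration: the `target` argument, the convention of writing `page.txt` next to `page.jpg`, and the `process_image` placeholder, which stands in for the real detection and OCR steps.

```python
import argparse
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_images(target: Path) -> list:
    """Return the image files to process for a file or a directory path."""
    if target.is_dir():
        return sorted(
            p for p in target.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
    return [target]


def output_path_for(image: Path) -> Path:
    """Assumed convention: page.jpg -> page.txt next to the image."""
    return image.with_suffix(".txt")


def process_image(image: Path) -> list:
    """Placeholder for detection/OCR; should return the result lines."""
    return []


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(
        description="Process one comic page or a directory of pages."
    )
    parser.add_argument("target", type=Path, help="image file or directory")
    args = parser.parse_args(argv)
    for image in collect_images(args.target):
        lines = process_image(image)
        output_path_for(image).write_text(
            "".join(line + "\n" for line in lines), encoding="utf-8"
        )
    return 0
```

Loading the OCR engine once and looping over the directory spreads the startup cost across all pages. Note that under the assumed naming convention, `page.jpg` and `page.png` in the same directory would write to the same text file.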
Technical requirements :
The software must run from the Linux command line.
You can use the language of your choice, subject to our prior approval.
We’ll want the source code.
****
Please bid only if you have solid experience in OCR software. We believe that the 3 parts can be done independently, so if your skills match only one part please state it explicitly.
If we could see successful OCR projects in your freelancer history, that would be a huge plus.
Prepare a list of the similar projects you’ve been working on so you can present your work with practical cases.
Hello, thank you for the invitation to the project; I am ready to take it on. I am a C++/Python developer working in the image processing team at neurosoft.pl. Could you tell me whether you have any expectations regarding the programming language? I can do all 3 parts, and I would prefer to use Python, mainly because it is multiplatform and has great support for image processing techniques.
Regards,
Marek
I attach the text region detection samples: [login to view URL]. The algorithm can be improved; this is the very first version. As for payment, we can negotiate. Please also send me some more images so I can test my algorithm.
$2,000 USD in 22 days
4.6 (7 reviews)
11 freelancers are bidding an average of $2,641 USD for this job
Hi, I have checked the 3 attached samples. If all the text is similar to the samples, OCRing it is not a problem. But I am wondering how complex the fonts might get. I have experience with the Tesseract OCR engine from several projects.
One of them involved extracting document types from single PDF files.
Since I am sure I can handle all three milestones, I would like to talk with you about further details.
Thanks.
Hello
I am an expert in image processing.
I have previously built programs for invoice PDF OCR and de-CAPTCHA.
I have also built an ALPR program.
If you would like a demo, I can provide one.
I can do this job perfectly.
If you want to work with me, please leave a message.
Sincerely