We have a scanned document to which a label has been attached. The label has a border designed to make it easy to determine the label's orientation and area.
The label portion of the scanned image needs to be extracted and deskewed so that the resulting image is horizontal.
Ideally, any image processing should be done in Python (2.6) with the PIL package and probably OpenCV, though other languages (C# or C++) would be acceptable with libraries such as AForge. Our preference is for .NET / Mono solutions.
It may not be feasible to do this project without an image processing engine such as OpenCV or AForge. OpenCV has a routine called cvMinAreaRect2() that may do the job: it returns the minimum-area bounding rectangle of a set of points, and that rectangle may be inclined. A Python interface to OpenCV is available.
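To make the idea of an inclined bounding rectangle concrete, here is a minimal pure-Python sketch of the technique. It is not OpenCV's implementation, and the names `convex_hull` and `min_area_rect` are hypothetical. It assumes the border pixels have already been collected as a non-empty list of (x, y) coordinates, and it relies on the fact that one side of the smallest enclosing rectangle lies along an edge of the convex hull.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain: return the hull vertices of a point set."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_area_rect(points):
    """Return (center, (width, height), angle_degrees) of the smallest
    enclosing rectangle, trying each hull edge as the rectangle's direction."""
    hull = convex_hull(points)
    best = None
    for i in range(len(hull)):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % len(hull)]
        theta = math.atan2(y2 - y1, x2 - x1)
        c, s = math.cos(theta), math.sin(theta)
        # Rotate every hull point by -theta so this edge becomes horizontal.
        us = [x * c + y * s for x, y in hull]
        vs = [-x * s + y * c for x, y in hull]
        w, h = max(us) - min(us), max(vs) - min(vs)
        if best is None or w * h < best[0]:
            cu, cv = (max(us) + min(us)) / 2.0, (max(vs) + min(vs)) / 2.0
            # Rotate the box centre back into image coordinates.
            center = (cu * c - cv * s, cu * s + cv * c)
            best = (w * h, center, (w, h), math.degrees(theta))
    return best[1:]
```

The returned angle is the direction of whichever hull edge produced the smallest box, so it is only meaningful modulo 90 degrees, and width and height may come back swapped. In image coordinates the y axis points down, so the sign of the angle is relative to that convention.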
An example scanned document is included with this project. Ideally, the border should not be included in the returned image. However, this is not crucial.
Given the way the border is laid out, the approach discussed here: [login to view URL] might work without resorting to OpenCV. Referring to the attached image: loop through the scanned image searching for an "o" and find two of them, then search again for a filled-in "o" and find two of them. Those four markers give the boundary; extract the rectangle and write it to a new image.
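Once the corner markers have been located, the extraction step could look roughly like the sketch below. Locating the markers is not shown, and which marker sits on which corner is an assumption made for illustration; the helper names `skew_angle` and `extract_label` are hypothetical. The scan is represented as a plain list of pixel rows so the sketch has no library dependencies, and it uses nearest-neighbour sampling, which a real solution would probably replace with the image library's own resampling.

```python
import math


def skew_angle(top_left, top_right):
    """Angle in degrees of the label's top edge relative to the scan's x axis."""
    return math.degrees(math.atan2(top_right[1] - top_left[1],
                                   top_right[0] - top_left[0]))


def extract_label(pixels, top_left, top_right, bottom_left):
    """Resample the area spanned by three marker centres into an upright grid.

    pixels is a list of rows (pixels[y][x]); the corners are (x, y)
    positions in the scan. The corner assignment is an assumption.
    """
    ax, ay = top_right[0] - top_left[0], top_right[1] - top_left[1]
    bx, by = bottom_left[0] - top_left[0], bottom_left[1] - top_left[1]
    width = int(round(math.hypot(ax, ay))) + 1
    height = int(round(math.hypot(bx, by))) + 1
    rows, cols = len(pixels), len(pixels[0])
    out = []
    for j in range(height):
        # Fraction of the way down the label's left edge.
        t = j / float(height - 1) if height > 1 else 0.0
        row = []
        for i in range(width):
            # Fraction of the way along the label's top edge.
            s = i / float(width - 1) if width > 1 else 0.0
            x = int(round(top_left[0] + s * ax + t * bx))
            y = int(round(top_left[1] + s * ay + t * by))
            # Clamp so slightly-off marker positions cannot index outside the scan.
            x = min(max(x, 0), cols - 1)
            y = min(max(y, 0), rows - 1)
            row.append(pixels[y][x])
        out.append(row)
    return out
```

Because the output grid is built by walking along the label's own edges, the result is horizontal regardless of how the label was rotated in the scan. Moving the three corner positions slightly inward would leave the border out of the returned image.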
## Deliverables
The process should accept any standard image type. Resolutions may vary.
Processing must be reasonably quick: no more than 10 seconds per document.
## Platform
Windows