Who said you can't teach an old dog new tricks? We've taken dated OCR technology and brought it into modern times. Grooper's patented Synthetic OCR generates the most accurate text from images and electronic files, regardless of which OCR engine you use.
It all starts with image quality
Before any OCR action takes place, you'll want to make sure you're handing the OCR activity an image that is straight and free of artifacts. The key is to remove everything from the page that isn't text. Grooper lets you process images through a growing arsenal of exclusive tools and out-of-the-box profiles specifically designed for this task. The best part is these tools won't alter the original version of the image you want to permanently retain.
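To make "remove everything that isn't text" a little more concrete, here is a minimal sketch of one common cleanup step, speck removal. It is not Grooper's tooling: the `despeckle` function, the `min_size` threshold, and the list-of-rows image format are all assumptions made for illustration. It also works on a copy, in the same spirit as leaving your original image untouched.

```python
from collections import deque


def despeckle(image, min_size=3):
    """Return a cleaned copy of a binary image (hypothetical helper).

    image is a list of rows; 1 is a dark pixel, 0 is background.
    Any 4-connected group of dark pixels smaller than min_size is cleared.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]  # work on a copy, keep the original
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Flood-fill to collect one connected group of dark pixels.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Groups this small are treated as noise, not text.
            if len(component) < min_size:
                for cy, cx in component:
                    cleaned[cy][cx] = 0
    return cleaned
```

Real pages would need a threshold tuned to the scan resolution, so that periods and the dots on letters like "i" survive the cleanup.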
If at first you don't succeed...
Use Synthetic OCR
No matter how clean your images look, outdated OCR engines still struggle to read text accurately from pages with multiple columns, mixed font sizes, or image shear. Grooper's patented OCR synthesis engine intelligently performs multiple passes of OCR on different portions of the image and Groops the results together as a single unit, keeping only the most accurate text results.
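The general idea of combining several OCR passes can be sketched in a few lines. This is a simplified illustration, not the patented synthesis engine: the `PassResult` type, the `synthesize` function, and the use of an engine-reported confidence score to decide which result is "most accurate" are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class PassResult:
    region: str        # which portion of the page this pass covered
    text: str          # text returned by the OCR engine for that portion
    confidence: float  # assumed engine-reported score, 0.0 to 1.0


def synthesize(results, region_order):
    """Keep the highest-confidence text per region, then join in reading order."""
    best = {}
    for result in results:
        current = best.get(result.region)
        if current is None or result.confidence > current.confidence:
            best[result.region] = result
    # Regions with no result at all are skipped rather than guessed.
    return "\n".join(best[name].text for name in region_order if name in best)


passes = [
    PassResult("left column", "Invoice Nurnber", 0.71),
    PassResult("left column", "Invoice Number", 0.94),
    PassResult("right column", "Total: $105.00", 0.88),
    PassResult("right column", "Tota1: $1O5.OO", 0.52),
]
print(synthesize(passes, ["left column", "right column"]))
```

Choosing per region rather than per page is the point of the sketch: a pass that reads one column well can still lose on another column without dragging the whole page down.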
And when all else fails
We've got spell-correction
Powered by our Atomic RegEx engine, Grooper can perform OCR correction to fix some pretty ugly stuff.
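For a feel of what pattern-based OCR correction looks like, here is a small sketch using Python's standard `re` module. It does not use or represent the Atomic RegEx engine; the `RULES` list and the `correct` function are hypothetical, and the three rules cover only a few classic digit and letter mix-ups.

```python
import re

# Hypothetical correction rules as (pattern, replacement) pairs.
RULES = [
    # A zero sitting between lowercase letters is most likely the letter o.
    (re.compile(r"(?<=[a-z])0(?=[a-z])"), "o"),
    # A one sitting between lowercase letters is most likely the letter l.
    (re.compile(r"(?<=[a-z])1(?=[a-z])"), "l"),
    # A letter O sitting between digits is most likely a zero.
    (re.compile(r"(?<=\d)[Oo](?=\d)"), "0"),
]


def correct(text):
    """Apply each rule in order and return the corrected text."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


print(correct("Inv0ice t0ta1s: 1O5"))  # prints "Invoice totals: 105"
```

The lookbehind and lookahead keep each rule context-sensitive, so a legitimate number such as "101" is left alone.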
Our secret blend of
PDF Text Extraction
PDF has become the most widely used document standard in the world. With that adoption comes a variety of challenges you'll have to face in order to get the best text from every page. Some PDFs are purely text-based, others are just images repackaged as PDFs, and still others combine the two throughout their pages.
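One way to picture the three kinds of pages is as a routing decision made page by page. The sketch below is an assumption-heavy illustration, not how Grooper handles PDFs: `PageInfo`, `choose_strategy`, and the `min_chars` threshold are hypothetical, and the page facts are supplied by hand rather than read from a real file.

```python
from dataclasses import dataclass


@dataclass
class PageInfo:
    text_chars: int   # characters found in the page's embedded text layer
    image_count: int  # raster images placed on the page


def choose_strategy(page, min_chars=20):
    """Return "native", "ocr", or "mixed" for a single page."""
    has_text = page.text_chars >= min_chars
    has_images = page.image_count > 0
    if has_text and has_images:
        return "mixed"   # read the text layer and OCR the images as well
    if has_text:
        return "native"  # the embedded text layer is enough
    return "ocr"         # no usable text layer, so fall back to OCR


pages = [PageInfo(1500, 0), PageInfo(0, 1), PageInfo(800, 2)]
print([choose_strategy(page) for page in pages])
```

The small `min_chars` threshold is there because a scanned page can still carry a few stray characters of embedded text, such as a stamped page number, without being meaningfully text-based.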