Jeff Liebermann
Guest
On Mon, 29 May 2017 23:04:15 -0400, rickman <gnuarm@gmail.com> wrote:
It is so easy to be misunderstood. I'm talking about the text showing up in
the PDF document. I received a document that was clearly a scanned image
in a PDF file. But the text was selectable and copyable. The two options
are that the image was scanned and OCR'd when the PDF was made, or that the
PDF viewer has OCR built in. Since I couldn't select the text in another
scanned image PDF, it must be the former.
The example I provided was how to do the latter. I scanned the image
in one program, and added a searchable text layer with a PDF viewer.
There are scanning programs that will seem to do the process in one
step such as Nuance Omnipage, Paperport, Adobe Acrobat (NOT reader),
etc. To the casual user, it looks like the process is being done in
one step. In reality, it first scans to a bitmap. Next, the OCR
software reads the bitmap to produce the searchable text layer. It
then saves the result as a PDF file. To the best of my limited
knowledge, none of the available software does the OCR step *WHILE*
scanning, but I might be wrong about that.
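For anyone who wants to see the separate steps spelled out, here is a rough Python sketch of the "OCR the bitmap, then save as PDF" part. It is not what Omnipage, Paperport, or Acrobat do internally. It assumes the third-party pytesseract and Pillow packages and the Tesseract engine are installed, none of which I used in the example above. The function names are made up for illustration.

```python
from pathlib import Path
from typing import Callable, Optional


def tesseract_ocr_to_pdf(image_path: str) -> bytes:
    """Step 2: read the bitmap and return PDF bytes with a text layer.

    Assumption: pytesseract, Pillow, and the Tesseract engine are
    installed. None of these are mentioned in the original post.
    """
    import pytesseract  # imported lazily so the sketch loads without it
    from PIL import Image

    with Image.open(image_path) as bitmap:
        return pytesseract.image_to_pdf_or_hocr(bitmap, extension="pdf")


def bitmap_to_searchable_pdf(
    image_path: str,
    pdf_path: str,
    ocr: Optional[Callable[[str], bytes]] = None,
) -> int:
    """Run OCR on an already scanned bitmap and save the result as a PDF.

    Step 1 (scanning to a bitmap) is assumed to have happened already.
    Returns the number of bytes written to pdf_path.
    """
    ocr = ocr or tesseract_ocr_to_pdf
    pdf_bytes = ocr(image_path)                    # step 2: OCR the bitmap
    return Path(pdf_path).write_bytes(pdf_bytes)   # step 3: save as PDF
```

Something like bitmap_to_searchable_pdf("JPG.jpg", "searchable.pdf") would be the one-call version. It looks like one step to the caller, but the OCR still runs on a finished bitmap.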
Original document scanned to JPG using Irfanview 4.44:
http://802.11junk.com/jeffl/OCR%20Demo/JPG.jpg
This is not searchable.
Same document saved to PDF using Irfanview 4.44:
http://802.11junk.com/jeffl/OCR%20Demo/PDF-no-OCR.pdf
This is also NOT searchable.
Same document in PDF-Xchange 6.0 build 322.4 after OCR:
http://802.11junk.com/jeffl/OCR%20Demo/PDF-after-OCR.pdf
This one can be searched.
PDF-Xchange screen grab showing a typical search result:
http://802.11junk.com/jeffl/OCR%20Demo/PDF-Xchange-screen.jpg
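If you would rather test for a text layer than try selecting text by hand, a small script can do it. This is only a sketch. It assumes the third-party pypdf package is installed, and the function names are hypothetical. It treats a PDF as "searchable" if any page yields extractable text, which is a heuristic and not a guarantee that the OCR is any good.

```python
from typing import Iterable, List


def pdf_page_texts(pdf_path: str) -> List[str]:
    """Return the extractable text of each page.

    Assumption: the third-party pypdf package is installed.
    """
    from pypdf import PdfReader  # imported lazily so the sketch loads without it

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


def is_searchable(page_texts: Iterable[str], min_chars: int = 1) -> bool:
    """True if the pages hold at least min_chars non-whitespace characters."""
    total = sum(len("".join(text.split())) for text in page_texts)
    return total >= min_chars
```

With local copies of the demo files, is_searchable(pdf_page_texts("PDF-no-OCR.pdf")) and is_searchable(pdf_page_texts("PDF-after-OCR.pdf")) should tell the two apart, assuming pypdf can read the text layer that PDF-Xchange wrote.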
I never did figure out how to display and edit the OCR text in
PDF-Xchange Editor. Looking through their feature list of other
versions, it seems to be something that only the more advanced and
expensive versions will do. Bummer.
--
Jeff Liebermann jeffl@cruzio.com
150 Felker St #D http://www.LearnByDestroying.com
Santa Cruz CA 95060 http://802.11junk.com
Skype: JeffLiebermann AE6KS 831-336-2558