Practical Comparison of FineReader 6 vs 8
- From: "James Weiler" <xooqi@[redacted]>
- Subject: Practical Comparison of FineReader 6 vs 8
- Date: Mon, 20 Feb 2006 12:48:39 CST
Dear Bookpeople,
I found the time last week to measure the comparative accuracy of ABBYY
FineReader 6 vs FineReader 8 on a rather difficult text. Here's what I
found:
Source text: Modern Vampirism, page 11 of the Google Book Search scans
(1200 characters).
Both versions of the software gave a warning that I should have scanned the
page at higher resolution.
FineReader 6 results: 16 errors, including one line entirely omitted.
FineReader 8 results: 4 errors.
All FineReader settings were default.
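To put those two error counts on a common scale, here is a minimal sketch that turns them into character accuracy. It assumes each counted error is a single character out of the 1200 on the page, which understates the FineReader 6 result because the omitted line is counted as just one error. The function name char_accuracy is only illustrative.

```python
def char_accuracy(total_chars: int, errors: int) -> float:
    """Return the fraction of characters recognized correctly.

    Assumes every counted error corresponds to exactly one character.
    """
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    return (total_chars - errors) / total_chars


PAGE_CHARS = 1200  # character count reported for page 11

for name, errors in (("FineReader 6", 16), ("FineReader 8", 4)):
    print(f"{name}: {char_accuracy(PAGE_CHARS, errors):.2%} accurate")
```

Under that assumption the gap is roughly one percentage point, which sounds small but is a fourfold difference in the number of corrections.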
Suspects:
FineReader 6 showed so many suspect characters that using them as a guide
did not help in proofreading. Most of the suspects were in fact the right
character, and some of the characters that were misrecognized were not
marked as suspects.
FineReader 8 showed 10 suspects, of which 4 were actually wrong. No
characters besides those shown as suspect were misrecognized.
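One way to express how useful the suspect marks are is precision (what share of flagged characters were really wrong) and recall (what share of wrong characters were flagged). The sketch below uses only the FineReader 8 numbers given above; I did not count the FineReader 6 suspects, so no figures are shown for it. The function name suspect_stats is illustrative, not anything from FineReader.

```python
def suspect_stats(flagged: int, flagged_wrong: int, total_wrong: int):
    """Return (precision, recall) for suspect-character marking.

    flagged       -- characters the OCR marked as suspect
    flagged_wrong -- marked characters that were actually misrecognized
    total_wrong   -- all misrecognized characters, marked or not
    """
    precision = flagged_wrong / flagged if flagged else 0.0
    recall = flagged_wrong / total_wrong if total_wrong else 1.0
    return precision, recall


# FineReader 8: 10 suspects, 4 of them wrong, no unmarked errors.
precision, recall = suspect_stats(flagged=10, flagged_wrong=4, total_wrong=4)
print(f"precision {precision:.0%}, recall {recall:.0%}")
```

A recall of 100% is what makes it safe to check only the suspects; the precision just tells you how many of those checks turn out to be false alarms.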
Source Scan: a lower case 's' in 'beings' that was broken in the original
print was misrecognized by both versions of FineReader, as was an italicized
P in 'Plane' with a degraded loop.
Character-Error Breakdown:
Actual           Version 6        Version 8        Note
11               11               xx
in               tm               ^in
cases            coses            cases
fair             (air             fair
independently.   independently    independently.
operandi         ofcrandi         operandi         Italic
as               a*               as
Plane            J^aae            I'lane           Italic
in               ID               in
THE              "I an            THE
living           lmng             living
(                {                (
beings           bring'           being*
pupils           pupils           pupiis
others           Other*           others
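For anyone who wants to score a breakdown like this mechanically, here is a minimal sketch that computes the edit distance (insertions, deletions and substitutions) between the actual text and each OCR result. This is not how I counted the errors above, which I did by eye; the edit_distance helper and the handful of sample rows are only an illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed row by row."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# (actual, FineReader 6, FineReader 8) -- a few rows from the breakdown
rows = [
    ("cases", "coses", "cases"),
    ("operandi", "ofcrandi", "operandi"),
    ("living", "lmng", "living"),
    ("pupils", "pupils", "pupiis"),
]

for actual, v6, v8 in rows:
    print(f"{actual:10} v6: {edit_distance(actual, v6)}  "
          f"v8: {edit_distance(actual, v8)}")
```

Counting this way weights a badly mangled word more heavily than a single wrong letter, so the totals will not necessarily match a per-word error count.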
The FineReader 6 results needed to be proofread word by word for accuracy to
catch all the errors. No combination of just spell checking and looking at
the suspicious characters would have resulted in letter-perfect results.
All that was needed to produce letter-perfect results from the FineReader 8
OCR was to look at the 10 suspect characters and correct the four that were
actually misrecognized.
Bottom line:
If your standard is to actually read the text in the process of correcting
it, a careful reader and moderately quick typist would take about the same
amount of time to correct the 16 errors produced by FineReader 6 as the 4
errors produced by FineReader 8.
If productivity is more important than absolute accuracy, I could correct a
version of page 11 using FineReader 8 to a result with 0 or 1 final errors
in perhaps 20 seconds, reading and correcting only the suspicious
characters. To achieve the same results using FineReader 6, I would have to
read and correct the entire page, taking perhaps 2 to 3 minutes.
Caveat:
These results were from a page that both versions of FineReader reported as
being too low resolution. Using 300 DPI scans, I can't see any significant
difference in results between FineReader 6 and 8.
Jim Weiler
the xooqi guy