Book People Archive

Practical Comparison of FineReader 6 vs 8



Dear Bookpeople,

I found the time last week to measure the comparative accuracy of ABBYY 
FineReader 6 vs FineReader 8 on a rather difficult text. Here's what I 
found:

Source text: Modern Vampirism, from the Google Book Search scans, page 11. 
1200 characters.
Both versions of the software gave a warning that I should have scanned the 
page at higher resolution.
FineReader 6 results: 16 errors, including one line entirely omitted.
FineReader 8 results: 4 errors.

All FineReader settings were default.
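
For reference, the counts above can be turned into per-character rates. 
This is only a rough sketch in Python: it assumes each counted error is a 
single wrong character out of the 1200 on the page, which flatters 
FineReader 6, since one of its errors was an entire omitted line.

```python
# Rough per-character rates from the counts reported above.
# Assumption: each counted error is treated as one wrong character.
PAGE_CHARS = 1200
ERRORS = {"FineReader 6": 16, "FineReader 8": 4}


def error_rate(errors: int, total: int) -> float:
    """Return the fraction of characters that were wrong."""
    return errors / total


rates = {name: error_rate(n, PAGE_CHARS) for name, n in ERRORS.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.2%} errors, {1 - rate:.2%} accuracy")
# FineReader 6: 1.33% errors, 98.67% accuracy
# FineReader 8: 0.33% errors, 99.67% accuracy
```

On those assumptions, both versions are above 98% accurate on this page, so 
the raw rate alone does not show how different the proofreading work is.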

Suspects:
FineReader 6 showed so many suspect characters that using them as a guide 
did not help in proofreading. Most of the suspects were in fact the right 
character, and some of the characters that were missed were not marked as 
suspects.
FineReader 8 showed 10 suspects, of which 4 were actually wrong. No 
characters besides those shown as suspect were misrecognized.
Source Scan: a lower case 's' in 'beings' that was broken in the original 
print was misrecognized by both versions of FineReader, as was an italicized 
P in 'Plane' with a degraded loop.
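
The FineReader 8 suspect figures above can be summarized as how often a 
suspect mark was right and how many real errors the marks caught. A small 
sketch in Python; "precision" and "recall" are my labels rather than 
anything FineReader reports, and FineReader 6 is left out because I did not 
count its suspects.

```python
# FineReader 8 suspect marks on this page, from the figures above.
flagged = 10          # characters marked as suspect
flagged_wrong = 4     # suspects that really were misrecognized
unflagged_wrong = 0   # errors that were NOT marked as suspect

# Share of suspect marks that pointed at a real error.
precision = flagged_wrong / flagged
# Share of real errors that were marked as suspect.
recall = flagged_wrong / (flagged_wrong + unflagged_wrong)

print(f"precision {precision:.0%}, recall {recall:.0%}")
# precision 40%, recall 100%
```

The 100% recall is what makes it possible to proofread by checking only the 
suspects.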

Character-Error Breakdown:
Actual           Version 6        Version 8        Note
11               11               xx
in               tm               ^in
cases            coses            cases
fair             (air             fair
independently.   independently    independently.
operandi         ofcrandi         operandi         (Italic)
as               a*               as
Plane            J^aae            I'lane           (Italic)
in               ID               in
THE              "I an            THE
living           lmng             living
(                {                (
beings           bring'           being*
pupils           pupils           pupiis
others           Other*           others

The FineReader 6 results needed to be proofread word by word for accuracy to 
catch all the errors. No combination of just spell checking and looking at 
the suspicious characters would have resulted in letter-perfect results.

All that was needed to produce letter-perfect results from the FineReader 8 
OCR was to look at the 10 suspect characters and correct the four that were 
actually misrecognized.

Bottom line:

If your standard is to actually read the text in the process of correcting 
it, a careful reader and moderately quick typist would take about the same 
amount of time to correct the 16 errors produced by FineReader 6 as the 4 
errors in FineReader 8.

If productivity is more important than absolute accuracy, I could correct 
the FineReader 8 output for page 11 down to 0 or 1 remaining errors in 
perhaps 20 seconds, reading and correcting only the suspect characters. To 
achieve the same result with the FineReader 6 output, I would have to read 
and correct the entire page, taking perhaps 2 to 3 minutes.

Caveat:
These results were from a page that both versions of FineReader reported as 
being too low in resolution. Using 300 DPI scans, I can't see any 
significant difference in results between FineReader 6 and 8.

Jim Weiler
the xooqi guy