Book People Archive

Re: More About Google's Deal with U of Cal



JMO wrote:

> Yes, these could be quite useful: you can download a book in one go
> to read offline (or to use for proofreading) instead of having to click
> through the book one page at a time.
>
> The PDF files appear to be page images only; no searchable text included.
>
> There may still be a few bugs in the system.  In particular,
> their PDF generator may also need to be adjusted a bit to conform to
> the PDF standard, as my experiments so far indicate that at least some
> of the files don't work in many PDF viewers.  To see an example, look at
> the download for _The Slipper Point Mystery_, which I recently listed at
>


But on the site from which you can download these pdfs, you find that Google
has written:

START OF QUOTE FROM GOOGLE

What PDF viewers do you support?

Images in our PDFs are highly compressed. This saves on storage costs (both
for us and for you) and makes them faster to download. It can also mean that
some PDF readers are not able to read the files. We feel that the advantages
of more highly compressed files are worth the tradeoff.

We recommend that you use the free Acrobat Reader version 7 for viewing PDFs
downloaded from books.google.com. If for some reason you can't install
Acrobat Reader, we also recommend the following applications:

  a. xpdf. If you find that xpdf crashes with JBIG2 errors you should apply
the patch available here.
  b. GPL Ghostscript 8.54. Please note that previous versions will not work
with these PDF files.
  c. Foxit Reader.

If you are a Mac user, Preview on OS X 10.4 is not known to have any issues
with these PDF files, although users of earlier versions of Mac OS may
experience problems.

If you experience problems loading PDFs in your browser using the Adobe
plugin, we recommend downloading the file (in most cases by right-clicking
and selecting "save as" on the "Download" link) and viewing it in Adobe
Reader or one of the other supported applications.

END OF QUOTE FROM GOOGLE

My impression is that some of the downloaded pdfs are searchable, but not
all. In any case, the scans of a book that Google holds on its own site are
searchable there.

The pdfs are indeed compressed to a surprising extent. I downloaded one that
I thought ought to be about 12 megabytes, and found it scarcely more than
three. Even more surprising is the way in which you can zoom in on the text
without it becoming jagged. The text is still smooth at 800% magnification,
with only very slight jaggedness at 1600%. This is a bit like a DjVu file
saved in the unsearchable mode. It isn't a DjVu, by the way.
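
For anyone curious about which compression a downloaded file actually uses, here is a rough sketch in Python. It assumes, going only by the "JBIG2 errors" in Google's note above, that the page images are JBIG2 streams; I have not confirmed that for every file. It simply counts the standard PDF filter names in the raw bytes, so it is a quick heuristic and not a real PDF parser. The function names are my own invention.

```python
# Rough heuristic: count image/stream filter names in the raw bytes of a PDF.
# Assumption: stream dictionaries are stored uncompressed, so the filter
# names appear literally in the file. Names written with #xx escapes, or
# chance matches inside binary stream data, are not handled.

FILTERS = (
    b"/JBIG2Decode",     # bilevel images, very high compression
    b"/JPXDecode",       # JPEG 2000
    b"/DCTDecode",       # ordinary JPEG
    b"/CCITTFaxDecode",  # fax-style bilevel compression
    b"/FlateDecode",     # zlib/deflate
)


def count_filters(data: bytes) -> dict:
    """Return how often each known filter name occurs in the raw PDF bytes."""
    return {name.decode("ascii").lstrip("/"): data.count(name) for name in FILTERS}


def uses_jbig2(data: bytes) -> bool:
    """True if at least one stream appears to use the JBIG2 filter."""
    return count_filters(data)["JBIG2Decode"] > 0


def report(path: str) -> None:
    """Print the filter counts for the PDF at the given (hypothetical) path."""
    with open(path, "rb") as f:
        counts = count_filters(f.read())
    for name, n in counts.items():
        print(f"{name}: {n}")
```

Calling report("book.pdf") on a downloaded file (the file name is only an example) prints one line per filter. A large JBIG2Decode count alongside few or no DCTDecode entries would be consistent with the small file size and with the trouble older viewers have.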

But here is the good news! ABBYY FineReader 8 will read the pdf (at least it
read the one I downloaded as a sample), but extremely slowly, about thirty
seconds per page. Now, why would that be? Still, if you leave it working on
the pdf for three or four hours, you have the entire book read in. You can
save each page as a separate tiff, or in whatever format you like, or you
could save the entire book as a more conventional pdf.
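
To put the timing in perspective, here is the arithmetic as a small Python sketch. The thirty seconds per page is just what I observed on one sample file, so treat it as an assumed default; the function names are made up for illustration.

```python
# Back-of-envelope OCR timing, assuming a constant time per page.

def ocr_hours(pages: int, seconds_per_page: float = 30.0) -> float:
    """Hours needed to read in a book of the given length."""
    return pages * seconds_per_page / 3600.0


def pages_read(hours: float, seconds_per_page: float = 30.0) -> int:
    """Whole pages read in during the given number of hours."""
    return int(hours * 3600.0 / seconds_per_page)


print(ocr_hours(360))   # 3.0
print(pages_read(4))    # 480
```

So at that rate, three to four hours corresponds to a book of roughly 360 to 480 pages.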

I can confirm that Foxit Reader reads the Google pdf, and easily.

Nick Hodson, Athelstane, London, England, United Kingdom