Re: Google Books PDFs
- From: "Stewart C. Russell" <scruss@[redacted]>
- Subject: Re: Google Books PDFs
- Date: Fri, 01 Sep 2006 17:00:00 -0400
I've had reasonable success extracting page images using pdfimages (from
the xpdf suite: <http://www.foolabs.com/xpdf/>). It's rather slow, but
extracts the pages exactly as stored.
Each page seems to comprise the main page image (scanned at a high
resolution) plus a separate, much smaller "Digitized by Google" watermark
image. The watermark images can easily be discarded.
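
For anyone who wants to script this, here is a rough Python sketch of the
extract-then-discard approach described above. It is only an illustration
under some assumptions of mine: that pdfimages is on your PATH, that it is
called in its basic form (pdfimages PDF-file image-root), and that file
size is a good enough way to tell the watermark from the page scan. The
function names, the "page" image root and the 20000-byte cutoff are all
made up for the example, so check the cutoff against your own files.

```python
import subprocess
from pathlib import Path


def extract_images(pdf_path, out_dir, root="page"):
    """Run pdfimages so it writes <root>-NNN.<ext> files into out_dir.

    Assumes the pdfimages binary from xpdf is installed and on PATH.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Basic invocation: pdfimages PDF-file image-root
    subprocess.run(["pdfimages", str(pdf_path), str(out / root)], check=True)
    return sorted(p for p in out.iterdir() if p.is_file())


def split_by_size(paths, min_bytes=20_000):
    """Partition files into (keep, discard) by file size.

    min_bytes is a hypothetical cutoff, not a measured value.
    """
    keep, discard = [], []
    for p in map(Path, paths):
        if p.stat().st_size >= min_bytes:
            keep.append(p)
        else:
            discard.append(p)
    return keep, discard


def discard_small(paths, min_bytes=20_000):
    """Delete files below min_bytes and return the files that remain."""
    keep, discard = split_by_size(paths, min_bytes)
    for p in discard:
        p.unlink()
    return keep
```

Typical use would be something like
discard_small(extract_images("book.pdf", "pages")), where book.pdf and
pages are placeholder names. It may be worth calling split_by_size first
and looking at what lands in the discard list before deleting anything.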
cheers,
Stewart
--
http://scruss.com/blog/