Book People Archive

Re: Google Books PDFs

From: "Stewart C. Russell" <scruss@[redacted]>
Subject: Re: Google Books PDFs
Date: Fri, 01 Sep 2006 17:00:00 -0400

I've had reasonable success extracting page images using pdfimages (from 
the xpdf suite: <http://www.foolabs.com/xpdf/>). It's rather slow, but it 
extracts the page images exactly as they are stored in the PDF.
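
A minimal sketch of how such an extraction might be scripted, assuming 
pdfimages is installed and on the PATH. The function names, the "page" 
prefix and the output directory are hypothetical; only the 
`pdfimages [-f N] [-l N] PDF-file image-root` invocation comes from the 
xpdf tool itself.

```python
import subprocess
from pathlib import Path


def build_pdfimages_command(pdf_path, image_root, first_page=None, last_page=None):
    """Return the argument list for one pdfimages run.

    -f and -l restrict extraction to a page range, which helps when a
    whole book is too slow to process in one go.
    """
    cmd = ["pdfimages"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]
    if last_page is not None:
        cmd += ["-l", str(last_page)]
    cmd += [str(pdf_path), str(image_root)]
    return cmd


def extract_images(pdf_path, out_dir, prefix="page", **page_range):
    """Run pdfimages and return the numbered files it wrote into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    cmd = build_pdfimages_command(pdf_path, out_dir / prefix, **page_range)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return sorted(out_dir.glob(prefix + "-*"))
```

For example, `extract_images("book.pdf", "images", first_page=1, 
last_page=20)` would process only the first twenty pages of a 
hypothetical book.pdf.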

Each page seems to comprise the main page image (scanned at a high 
resolution) plus a separate "Digitized by Google" watermark image. The 
smaller watermark images can easily be discarded.
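
One way the discarding step might be automated is to filter the 
extracted files by size. This is only a sketch: it assumes the watermark 
files are reliably smaller on disk than the page scans, and the 
`min_bytes` threshold is a hypothetical value that would need to be 
chosen by inspecting the actual output. It defaults to a dry run so 
nothing is deleted until the list has been checked.

```python
from pathlib import Path


def find_small_images(image_dir, min_bytes, pattern="*"):
    """Return files in image_dir smaller than min_bytes, sorted by name."""
    return sorted(
        p for p in Path(image_dir).glob(pattern)
        if p.is_file() and p.stat().st_size < min_bytes
    )


def discard_small_images(image_dir, min_bytes, pattern="*", dry_run=True):
    """List the small files, deleting them only when dry_run is False."""
    small = find_small_images(image_dir, min_bytes, pattern)
    if not dry_run:
        for p in small:
            p.unlink()
    return small
```

Calling `discard_small_images("images", min_bytes=...)` first with the 
default `dry_run=True` shows which files would go; passing 
`dry_run=False` then removes them.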

cheers,
  Stewart

-- 
  http://scruss.com/blog/