Re: Bookscanning Launch and Vision of an Open Library
- From: Bruce Albrecht <bruce@[redacted]>
- Subject: Re: Bookscanning Launch and Vision of an Open Library
- Date: Thu, 27 Oct 2005 15:51:04 -0500
Lars Aronsson writes:
> OCA has many members. Adobe is a member because of their PDF
> technology and H-P for some similar reason. I don't think Yahoo
> and Microsoft will scan a single book. They're there to make the
> books searchable. The Internet Archive is already scanning (they
> used to call it the Million Book Project...) and the University of
> California libraries provide the books (as the University of
> Toronto already has).
The text of one of the news articles says Yahoo and Microsoft have
committed to paying for the scanning of 18K and 150K books,
respectively. I expect you're right, and this only means they will be
funding IA's ongoing work.
> If you go to Google Print, can you find a single book that isn't
> held and scanned by the University of Michigan?
Yes, I've seen one from Harvard. The Poison Tree,
If anyone is interested, my current list of books available from
Google Library is over 1000 titles, and can be found at
A Perl script to harvest the images for a Google Library book is
I've added documentation and a few options. For Windows users, it
should work straight out of the box with ActiveState Perl, or the Perl
available with the Distributed Proofreaders tool, Guiguts.