Re: Bookscanning Launch and Vision of an Open Library
- From: Bruce Albrecht <bruce@[redacted]>
- Date: Thu, 27 Oct 2005 15:51:04 -0500
Lars Aronsson writes:
> OCA has many members. Adobe is a member because of their PDF
> technology and H-P for some similar reason. I don't think Yahoo
> and Microsoft will scan a single book. They're there to make the
> books searchable. The Internet Archive is already scanning (they
> used to call it the Million Book Project...) and the University of
> California libraries provide the books (as the University of
> Toronto already has).
The text of one of the news articles says Yahoo and Microsoft have
committed to paying for the scanning of 18K and 150K books
respectively. I expect you're right, and this only means they will be
funding IA's ongoing work.
> If you go to Google Print, can you find a single book that isn't
> held and scanned by the University of Michigan?
Yes, I've seen one from Harvard. The Poison Tree,
http://print.google.com/print?id=ITnQL8F_3M0C
If anyone is interested, my current list of books available from
Google Library is over 1000 titles, and can be found at
http://www.zuhause.org/dp/gfound.html
A Perl script to harvest the images for a Google Library book is at
http://www.zuhause.org/dp/gharvest
I've added documentation and a few options. For Windows users, it
should work straight out of the box with ActiveState Perl, or the Perl
available with the Distributed Proofreaders tool, Guiguts.