Book People Archive

End of year roundup: Big Scanners, new directors, and more free content to come



The latest issue of Walt Crawford's _Cites and Insights_ just came out,
and included in this issue is an update on what's been happening with
the Google and Open Content Alliance book scanning projects, which some
folks here were recently asking about.  You can read the article at

    http://citesandinsights.info/v7i1b.htm

and there's a useful summary up front.  Basically, Walt's wondering what's
up with OCA (he reports that the site hasn't been updated as he'd expect),
and is reporting that Google's produced lots of books, though
he can't say exactly how many.

I can confirm that indeed Google seems to be putting out a lot of material,
though in somewhat random order.  A while back, for instance, someone asked
me about Alexander von Humboldt's 5-volume work _Cosmos_.  I was able to
determine that Google had several editions of books labeled volumes 2-5,
and Gutenberg had a volume 1, but that I couldn't assemble a set of 5 volumes
all from the same publisher.  Maybe someone will have it eventually, but
for now I've stitched together the volumes I know of on my "in progress" page.

I've played around a bit with Microsoft Live books as well, but not as much,
because it doesn't play as well with my usual computing setup.  In particular,
it doesn't seem to work at all on Mozilla on my Solaris box, where I usually
work, probably due to incompatible Javascript.  It does work on my Mac laptop,
but it can be slow at times.  I haven't yet determined how to search
specifically for a title (as opposed to putting in a title and having a bunch
of books pop up that mention it in the full text, but not the title itself);
nor have I determined a reliable way to point directly and persistently to
a book in the system.  (In some cases, I at least get a persistent link to a
PDF on the Internet Archive, though.)

It also looks like Google is planning to expand into scanning back runs of
serials as well as books.  I have mixed feelings about this.  On the one hand,
I'm excited to see the interest, particularly since I was one of the people
exhorting Google and others to look into historic serials in various
presentations I gave last year, and it sounds like the public might
get free access to a lot of content they couldn't view before.
On the other hand, it appears that there
are some definite concerns, voiced by people like Peter Brantley and Dorothea
Salo, that the deal Google is offering to serials publishers will lock up
the digitizations in ways that may well be to the disadvantage of the
publishers and the public over the long term.  You can read more from
these commentators at

    http://ono.cdlib.org/archives/shimenawa/2006/11.html

(scroll down to the Nov. 10 and Nov. 9 posts), and

    http://cavlec.yarinareth.net/archives/2006/12/17/control-your-bits/

Dorothea's post in particular brings up the problem of digitization quality
in Google's work, which has also been brought up here.

It'll be interesting to see how libraries respond to these and other
developments in the rapidly growing corpus of digitized works,
especially as it's just been announced that Peter Brantley, one of
the commentators above, has been named the next executive director
of the Digital Library Federation, to which many of the larger academic
research libraries in the US (and a few outside the US) belong.
I give Peter my congratulations and best wishes for his new role,
and look forward to seeing the directions the DLF takes in the years
to come.

Of course, there's much more to the world of online books than what
the Big Projects are doing.  We've seen recent posts by folks like
Nick Hodson and Bowerbird, as well as ongoing reports from places
like Project Gutenberg and Distributed Proofreaders, that show that
the "little guys" continue to make significant contributions and
continue to do things that the "big guys" aren't doing, or in some
cases can't do.

That'll be even more true next year, as another
year of books enters the public domain in many countries (though
not in the US, where I'm located and where most of the Big Scanners
are based).  Famous British authors like Rudyard Kipling and G. K.
Chesterton will return to the public domain in their home country;
and Winnie the Pooh will finally be free (at least the stories,
if not the drawings) in Canada (the original home of the bear that
provided the name to the character), as well as in other countries
that still have life+50 years copyright terms.

Till next year--

   John