Book People Archive

A report from Reading 2.0

Last week I went to the "Reading 2.0" meeting in California,
organized by Peter Brantley of the California Digital Library.
It was an interesting group: a couple dozen folks from libraries,
technology companies, publishers, and nonprofit or volunteer
organizations.  Some of the large-scale digitization projects
were represented there, including the Google project (two
Google people, plus folks from the Harvard and Michigan libraries),
and the Open Content Alliance (Brewster Kahle of the Internet
Archive, plus folks from the California Digital Library, Microsoft,
and Yahoo).  Some other people there that many Book People will
recognize include Juliet Sutherland of Distributed Proofreaders
and Jon Noring of OpenReader and Ebook-community.  (Joseph Esposito
was also scheduled to be there, but had to cancel at the last minute.)

The meeting turned out to be primarily about learning and getting
acquainted.  No group-wide declarations or agreements
came out of it, but the meeting provided a way for folks to learn
more about what users of online texts wanted and what producers
could provide.

Tim O'Reilly wrote up a good summary of the formal presentations at

and has links to some of the materials we presented or cited.  One
demo that folks here might be particularly interested in was FictionFinder,
a system that lets you zero in on works (and then particular editions
of works if desired) based on fairly detailed criteria, including
things like "detective novels set in Philadelphia".  It clustered results
so that you don't get overwhelmed by the hundreds of different
editions of Huckleberry Finn, but can choose among them if
desired, either by detailed criteria or just whatever copy happens
to be the easiest to obtain.

As at most good meetings, a lot of the interaction took place outside
the formal sessions.  It turned out, for instance, that there
were a number of people who had a fair bit of interest and investment
in improving copyright clearance, so after I talked about copyright
renewal information that had been assembled for serials,
I talked "offline" with folks from Michigan, the Internet Archive,
and OCLC who were concerned about similar issues.  We agreed that
it would be very useful to find better ways to collect and share
copyright clearance information, to make it easier for people to
put books and other materials online.  I'm hoping some sort
of collaboration will come out of that, though I'm not sure
exactly what form it will end up taking.

Various presenters talked about technologies that could be used to
make it easier to share information about collections and make
them easier to find, cite, and browse.  (OAI, OpenURL, and
persistent identifier technologies like ARK all had presentations
devoted to them.)  For my part, I'd love, for starters, to be able
to pull high-quality metadata from people's collections so that
folks can easily browse and search across them.  This doesn't
seem to be possible yet for either OCA or Google.  In the OCA's
case it seems to be because there isn't yet the infrastructure
to provide it.  In Google's case, I also got the sense that
the company hasn't yet decided whether or how it should share
this metadata in aggregate.  (They *are* aware that folks
are quickly finding and downloading their public domain content,
but I wasn't able to get a strong sense about what they wanted
to do related to that.)  Some not-quite-so-big projects *are* providing
good metadata via OAI, and once I complete my site's transition
to standard subject terms and associated searching and browsing,
I hope to put some serious effort into pulling that metadata in.
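
For anyone who hasn't seen OAI harvesting, here is a minimal sketch in
Python of what pulling metadata this way might look like.  It only builds
ListRecords request URLs and parses a response; the base URL and the
sample record are made up for illustration, and a real harvester would
also have to fetch each page over HTTP and handle protocol errors.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH 2.0 responses and unqualified Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a repository's OAI base URL."""
    if resumption_token:
        # A resumptionToken is sent on its own with the verb, without
        # repeating metadataPrefix, to ask for the next page of results.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urllib.parse.urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter("{%s}record" % OAI_NS):
        identifier = record.findtext(
            "{%s}header/{%s}identifier" % (OAI_NS, OAI_NS))
        titles = [e.text for e in record.iter("{%s}title" % DC_NS)]
        creators = [e.text for e in record.iter("{%s}creator" % DC_NS)]
        records.append(
            {"identifier": identifier, "titles": titles, "creators": creators})
    token = root.findtext(".//{%s}resumptionToken" % OAI_NS)
    # A missing or empty token means there are no more pages.
    token = token.strip() if token and token.strip() else None
    return records, token


# Hypothetical response from a made-up repository, for illustration only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Old Christmas</dc:title>
          <dc:creator>Irving, Washington</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    records, token = parse_list_records(SAMPLE)
    for rec in records:
        print(rec["identifier"], rec["titles"], rec["creators"])
    if token:
        print(list_records_url("http://example.org/oai", resumption_token=token))
```

A harvester would repeat the request with each returned token until none
comes back, then merge the collected records into its own catalog.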

Several speakers from Tim O'Reilly onward talked in various ways
about the importance of engaging a community (or "harnessing
collective intelligence"), and Juliet Sutherland gave a useful example
of the power of the grassroots in action in her talk about
Distributed Proofreaders.  I heard her "have you done your page
today?" question repeated later by a number of the attendees.

For such a diverse set of participants (including ones that have
been fiercely competitive or contentious elsewhere),
the meeting was remarkably friendly.  (Of course, it's easier
to be that way when there isn't something on the table you
have to agree on or negotiate over.)  Even the DRM discussion
at the end was fairly calm, though I got the sense that a lot
of the people in the room would rather not deal with it, and
that many of the others would rather not talk much about it.
(I commend Jon Noring for being one of the folks willing to engage
the issue at the meeting.)  I did find interesting the take of one of
the publishing people there, that they felt they had needed DRM to bring
their authors on board for their ebook initiatives, but that in
their opinion DRM had failed miserably.

Brewster Kahle invited me to lunch the next day at the Internet
Archive's facility in the Presidio.  It's an interesting place:
a historic two-story house now filled with scanners, server
racks, books, prints, and a vibe that reminded me a bit of the
big communal houses of computer science grad students.  I was
not the only guest; a lawyer was also there doing
some pro bono work for them.  One of the people there walked me through
their process for going from hardcopy book to online book and back to
hardcopy book.
I also got one of their hardcopy books as a gift: a paperback
edition of _Old Christmas_ (from an 1880s edition by Irving
and Caldecott).  I read it on the plane on the way home.  It
came out fairly well.  The images were reduced in size a bit
(and all were black-and-white where it looks like some were
originally in color) and the type on a few pages was a bit faint,
but it was all readable, and certainly in a form that was
easier to acquire and carry about than the original.  It was
a vivid demonstration of how cheap, easily distributed surrogates,
whether in print or digital form, could both disseminate the
message and pique interest in the original book editions.

So that's my quick report on the conference.  Anyone else who was there,
or who's found something interesting in the reports or online materials
from the conference, feel free to follow up!