Book People Archive

the university of michigan sets the public domain free



the university of michigan has set the public domain free!

their library is now making available the _digital_text_ from
the massive book-scanning project undertaken by google,
thus creating an avalanche of public-domain digitized books.
even at this early date, the count probably runs to 6 figures...

for people who have long grasped the idea of a global library,
either by virtue of their own vision or through the outstanding
examples of pioneers like michael hart and project gutenberg,
this development is a truly remarkable milestone on the journey.

thank you michigan!   we love you!   go wolverines!          :+)

i now call on the other universities collaborating with google to
follow the lead set by the university of michigan,
and release their books too.   since google has released the scans,
that will mean e-book researchers like myself will have available
all the raw materials that we need to create electronic-books that
strive to be their best -- powerful, beautiful, and useful to people.

the first step in this transformation is to clean up the o.c.r. results.
a cursory examination of some of the text released thus far shows
that no clean-up has been done on it yet -- which is unfortunate...

but on the other hand, that same cursory examination reveals that
the quality of the o.c.r. text is good, and the nature of the cleaning
that's required is such that much of it can be done _automatically_,
meaning most books can be cleaned in as little as 1 hour,
which is extremely good news indeed.   the scanning project should
be congratulated for achieving an acceptable level of quality so far,
and encouraged to keep improving it even more.
(as a sidebar, there might be some relatively simple steps that can
be taken -- such as adjusting o.c.r. settings -- to upgrade results,
and i will post some messages in the future suggesting the steps.)
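
as a rough illustration of what "automatic" cleaning can mean, here is
a minimal python sketch.   the fixes it applies (rejoining words split
by a hyphen at a line-end, removing stray spaces before punctuation,
collapsing runs of spaces, trimming line-ends) are assumptions chosen
for illustration, not a list taken from the michigan text or from any
particular tool, and the name clean_ocr_text is hypothetical.

```python
import re

def clean_ocr_text(text):
    """apply a few mechanical fixes to raw o.c.r. text (illustrative only)."""
    # rejoin a word hyphenated across a line break: "exam-\nple" -> "example"
    # (assumption: every line-end hyphen is a soft hyphen, which is not
    # always true -- "well-\nknown" would wrongly become "wellknown")
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # drop stray spaces before punctuation: "fox ," -> "fox,"
    text = re.sub(r" +([,.;:!?])", r"\1", text)
    # collapse runs of spaces or tabs into a single space
    text = re.sub(r"[ \t]{2,}", " ", text)
    # strip whitespace left dangling at the end of a line
    text = re.sub(r"[ \t]+\n", "\n", text)
    return text

if __name__ == "__main__":
    raw = "the quick  brown fox , exam-\nple text .  \nend"
    print(clean_ocr_text(raw))
```

rules like these only catch the mechanical errors; anything ambiguous,
such as the line-end hyphens noted in the comment, still needs a
person to look at it.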

all this just informs us that google and the university of michigan
need our help, and i am willing to step up and provide assistance,
in any humble way that i can.   i've written computer programs that
help a person clean up a book's o.c.r. results, and i'll be releasing
some beta versions of that software in the coming days, so i ask
that anyone who wants to join this effort contact me backchannel.

although there is much work still to be done, and i intend to help,
it's good to recognize when a new plateau has been reached and
to gaze out over the horizon, and perhaps "smell the digital books",
just as we would their musty paper cousins.   this is one of those times, and
i exhort all of the bookpeople here to _enjoy_ this beautiful view...

...and then get back to work.             ;+)

-bowerbird