new stuff up
- From: Bowerbird@[redacted]
- Subject: new stuff up
- Date: Wed, 22 Mar 2006 15:15:47 EST
first, issue #3 of "monday morning quarterback", the series
describing "best practices" for book-digitization, is now up:
> http://groups.yahoo.com/group/bpsuper/message/7
> http://snowy.arsc.alaska.edu/bowerbird/mmq/mmq03.txt
this segment is short and sweet, focusing on one point:
============================================================================
each scan you make should have, in its filename,
the _page-number_ of the page which it pictures.
============================================================================
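a minimal sketch of this rule in python. it is not the tooling used for these books; the function names, the "p" separator, and the three-digit zero-padding are all assumptions chosen for illustration.

```python
import re


def scan_filename(book_code: str, page_number: int, ext: str = "png") -> str:
    """build a scan filename that carries the page-number it pictures.

    the layout (book code + "p" + zero-padded number) is a hypothetical
    convention, not one prescribed by the original post.
    """
    if page_number < 0:
        raise ValueError("page number must be non-negative")
    # zero-padding keeps the files sorted in page order in a directory listing
    return f"{book_code}p{page_number:03d}.{ext}"


def page_number_from_filename(filename: str) -> int:
    """recover the page-number from a name built by scan_filename."""
    match = re.search(r"p(\d+)\.[A-Za-z0-9]+$", filename)
    if match is None:
        raise ValueError(f"no page number found in {filename!r}")
    return int(match.group(1))
```

with these definitions, scan_filename("mybook", 7) gives "mybookp007.png", and page_number_from_filename turns that name back into 7, so a viewer can map any scan to its page without a lookup table.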
***
two new demos up, for "the secret garden" and "a hacker manifesto":
> http://www.greatamericannovel.com/sgfhb/sgfhbc001.html
> http://www.greatamericannovel.com/ahmmw/ahmmwc001.html
i've also updated my earlier demos:
> http://www.greatamericannovel.com/myant/myantc001.html
> http://www.greatamericannovel.com/mabie/mabiep001.html
> http://www.greatamericannovel.com/tolbk/tolbkc001.html
these books demonstrate "continuous proofreading",
so you still see the error-reporting/commenting stuff.
but with these latest revisions i've also begun doing the
_formatting_ expected for the purpose of pure reading.
for example, the chapter-headers are now _displayed_
as headers (i.e., big and bold), and they are hotlinked
back to the "hot table of contents" for easy navigation.
"contents" pages in the books are also now hotlinked
to the items listed. (these hotlinks are in addition to
the ones on the _separate_ "table of contents" pages
which are auto-generated and were always hotlinked.)
i've also changed from internet-style block-paragraphs
(with a blank line between paragraphs) to "book-style"
indented-paragraphs (with no blank line between 'em)...
given this formatting, the auto-generated .html display
is starting to look _highly_similar_ to the original pages.
(the aim is _not_ to look "identical", since one intent is to
standardize .html across books; "similar" is good enough.)
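the switch from block-paragraphs to book-style indented paragraphs can be sketched as follows. this is an illustration only: it assumes paragraphs in the plain-text master are separated by blank lines, and the function name and the css values are hypothetical.

```python
import re
from html import escape


def paragraphs_to_html(text: str, book_style: bool = True) -> str:
    """turn blank-line-separated paragraphs into <p> elements.

    book_style=True  -> indented first line, no gap between paragraphs
    book_style=False -> internet-style blocks with a blank line between
    """
    # assumption: one or more blank lines separate paragraphs in the master
    chunks = re.split(r"\n\s*\n", text)
    paragraphs = [" ".join(chunk.split()) for chunk in chunks if chunk.strip()]
    if book_style:
        style = "margin: 0; text-indent: 1.5em;"
    else:
        style = "margin: 1em 0; text-indent: 0;"
    return "\n".join(f'<p style="{style}">{escape(p)}</p>' for p in paragraphs)
```

keeping the style decision in one flag means the same master text can be shown either way, which fits the aim of standardizing the .html across books.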
page-numbers are also colorized, to make 'em stand out.
i've also included "chapter-jump" links, so the reader can
jump from any chapter to the one before or the one after.
i find this immensely useful when it comes to navigating,
and navigation is a _most_important_aspect_ of e-books.
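a rough sketch of how chapter-jump links might be generated, in python. the filenames, the link labels, and the "contents.html" target are hypothetical; the demos may build their links quite differently.

```python
from typing import Sequence


def chapter_jump_links(chapter_files: Sequence[str], index: int,
                       toc_file: str = "contents.html") -> str:
    """return the navigation links for the chapter at position index."""
    if not 0 <= index < len(chapter_files):
        raise IndexError("chapter index out of range")
    links = []
    # the first chapter has no "previous" link
    if index > 0:
        links.append(f'<a href="{chapter_files[index - 1]}">previous chapter</a>')
    # every chapter links back to the table of contents
    links.append(f'<a href="{toc_file}">contents</a>')
    # the last chapter has no "next" link
    if index + 1 < len(chapter_files):
        links.append(f'<a href="{chapter_files[index + 1]}">next chapter</a>')
    return " | ".join(links)
```

because the links are computed from the chapter list, they stay correct whenever the pages are regenerated from the master.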
finally, i've put links on each page that allow the reader
to conveniently switch from the 1-page display to 2-up...
there is formatting work still to be done
-- e.g., front-matter, text-styling, block stuff, footnotes.
still, many pages from each book are completely done...
***
quick note: when viewing these e-books, i recommend
that you use the full-screen mode in your web-browser.
(or turn off all your toolbars if that's all that you can do.)
not only will this remove all of the interface distractions,
it means the text and scans can be made much bigger,
which helps make it more readable and less fatiguing...
also, in case you haven't discovered it yet on your own,
clicking the scan will "turn the page" forward (or back)...
***
once again -- since michael hart was unclear on this --
i firmly believe that in 5 years, _nobody_ will choose to
read ordinary .html e-books in a web-browser per se,
because better viewer-apps will be available to them...
for instance, nobody uses gopher any more either.
(and i smile when i say this, because i know 3 techies
will climb out of the woodwork and say "i use gopher!"
ok, that's about how many people will use a browser
to read an e-book in 5 years.)
(which is _not_ to say that some of the better viewers
won't be _browser-based_. i myself have been doing
a widdle bit of perl, and it's coming along quite nicely.
in this method, the .html page is generated on-the-fly,
which means the end-user can specify various _options_
for the display, bringing usability up to a decent level.
however, even though it happens _inside_the_browser_,
it's the perl script doing the work and not the browser.)
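to make the on-the-fly idea concrete, here is a small sketch in python rather than perl. it is not the script described above: the option names ("size", "layout"), the defaults, and the markup are assumptions made up for this example.

```python
from dataclasses import dataclass
from html import escape
from urllib.parse import parse_qs


@dataclass
class DisplayOptions:
    font_size_px: int = 18   # hypothetical default
    two_up: bool = False     # False = 1-page display, True = 2-up


def options_from_query(query: str) -> DisplayOptions:
    """read the reader's display options from a url query string."""
    params = parse_qs(query)
    options = DisplayOptions()
    try:
        options.font_size_px = int(params["size"][0])
    except (KeyError, ValueError):
        pass  # missing or malformed size: keep the default
    options.two_up = params.get("layout", ["1up"])[0] == "2up"
    return options


def render_page(page_text: str, options: DisplayOptions) -> str:
    """generate the .html for one page according to the chosen options."""
    layout_class = "two-up" if options.two_up else "one-up"
    return (
        f'<div class="{layout_class}" style="font-size: {options.font_size_px}px;">'
        f"{escape(page_text)}</div>"
    )
```

a request ending in "?size=24&layout=2up" would then yield a two-up page at 24 pixels, while a request with no options falls back to the defaults.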
***
it is worth reiterating that these .html files are generated
via a "master" that is plain-text, in zen markup language.
once it is demonstrated that z.m.l. can mimic the _look_
of the original pages satisfactorily, and we have corrected
an e-text to error-free perfection, we can then relegate its
scans back to the "museum" status they so richly deserve,
while we live in the resource-and-bandwidth-conserving,
searchable, easy-to-copy-and-remix luxury of digital text.
back in the days when people considered it a matter of
"debate" whether z.m.l. would work, they thought it
was sufficient to voice any flimsy response to it, and would
"challenge" me to "prove it works" with some "examples"...
i invite those people to start looking at these demo-books.
the proof is in the pudding, and the pudding is being served.
***
i welcome any criticism of my work. preferably constructive,
but i can deal with any kind...
-bowerbird
p.s. the scans from "the secret garden" are fairly crappy.
i felt it was time to show some scans that are low-quality.
these came from distributed proofreaders, who do scans
intended for o.c.r. only, and _not_ for any future reading.
some people have suggested that d.p. scans _could_be_
used for reading purposes, but as i think you will observe,
it would take _lots_ of cleanup, and even then they would
probably not be all that great. just a note of realism here.
i'm not suggesting that d.p. should change its orientation.
if they don't care to create a high-quality "reading copy" of
the books they scan, someone else will have to do it instead.
you might say "if they're gonna scan the book anyway, they
might as well spend a little more time to do it high-quality",
but they are volunteers, so _they_ decide what they will do;
it's not up to me or you.