Book People Archive

Re: Part 2 of PDF, DRM, and "open" formats



John Mark Ockerbloom wrote:

Excellent comments in both Parts, John!


> I'd like to talk some more about what makes a format open and desirable
> for use for online books and documents.  In particular, I'm going to be
> discussing other digital document formats, including OpenReader, which
> Jon Noring just announced a spec for.  (Congratulations, Jon!)

Thanks!


> It's interesting to note that PDF's main benefits and drawbacks are often
> the same features in different contexts.

Yes, that is a good way to put it. I've noted in the past that the
greatest weakness of PDF is also its strength.


> For example, PDF's detailed page
> layout and formatting capabilities give authors and publishers more control
> over the appearance of their work than most other formats do-- but they also
> make PDF documents more difficult to reformat for *other* views and uses than
> other formats.  The wide range of capabilities that PDF supports also make it
> difficult for a user to know in advance what they can do with a random PDF
> document.  They might not have tools that can effectively use all of PDF's
> capabilities, or the document itself might not support-- or might even
> block-- the capabilities the user is wanting to use, such as copying text.

This is also a good summary. PDF, at least when used in its non-structured,
fixed-page mode (which is probably 99.9% of all PDFs), gives publishers
the illusion of full control over digital presentation, but in actuality
publishers *lose almost all control* over digital presentation.

The only time when publishers retain full control over presentation
with PDF is when it is printed on paper.


> In the ebook world, "open" formats for commercial publishing have been
> discussed for years, and the first "Open Ebook Publishing Structure"
> (OEBPS) specification was released in September 1999.  In the press release
> accompanying the first specification, the promoters wrote "The
> specification is expected to accelerate the availability of electronic
> reading material, because the single universal format will work on all
> reading systems that are compliant with the specification.... The OEB
> standard means that publishers can format their content once and still
> make it available on all devices and software that support the Open
> eBook specification.  This is a huge win for consumers..."
>
> So, do you have any books in this single universal format that's now
> been around for nearly 7 years?  Me neither.  That's because virtually
> no one sold books in this native format.  To make a nice-looking book,
> you needed to include supplementary files for pictures, fonts, and the
> like.  The OEBPS format didn't specify a way to package these along
> with the main text file.  Instead, publishers came up with their own
> packages, and they were virtually all proprietary.  So, while the OEBPS
> format may have
> been "open" for publishers, it did nothing to help consumers
> use books the way they wanted on the platforms they chose.  Many of the
> proprietary books now being sold do in fact include encoded OEBPS
> documents wrapped up in some other format, but that makes no difference
> to the consumer, who can't use the OEBPS directly.  And we now have what
> I would consider a relatively tiny, unattractive consumer market for ebooks.

Ah, you found the kicker with OEBPS. One has to understand that an
"OEBPS reading system" can include conversion, at the publisher's end,
of OEBPS into some other format -- for example, Microsoft Reader LIT.
That is, the "System" can have very expansive boundaries.

One also has to understand that in the early days of OeBF, OeBF was
controlled by proprietary interests who intentionally placed the
moniker of "exchange" format on OEBPS, to downplay its potential use
as a native end-user format, even though the tech people who designed
OEBPS always considered direct native rendering as one use of OEBPS.

So whenever someone says "OEBPS was designed as an exchange format,
not as a native end-user format", I just cringe. The proprietary
interests taught that person well!


> Now it's 2006, and many people would like to see consumer publishing get
> it right this time.  That's where OpenReader claims to make a difference.
> The OpenReader front page talks about the importance of consumers not having
> to worry about "the software they use to read downloads of electronic
> publications", or having their ebooks depend on the vendor staying in
> business, or not surviving technology upgrades, or otherwise subject to
> proprietary shackles.  There's an implicit hope or promise expressed here
> that OpenReader will help with these problems, just as there was hope
> in 1999 that OEBPS would.
>
> Indeed, the OpenReader folks have just released their first spec.  And,
> as one of the people who had been prodding them for a long time to release
> specs,  I want to express my congratulations and thanks.  But the Binder spec
> that's just been released, like the original OEBPS, does not cover the
> complete ebook.  I should add that the OpenReader folks don't claim that it
> does-- they and I both acknowledge that it's just the first step.  The more
> crucial step, though, and the one that will determine whether OpenReader can
> do what's been promised for readers or just become another OEBPS, hasn't
> been taken yet.  That step is fixing the rules on what gets to be called
> an OpenReader book and what doesn't.

To provide some background:

There are actually two levels to what defines the OpenReader format.
First is the Framework specification, which defines the rules on how
to assemble resources together and call the result a Publication. The
Framework will call for one Binder document, one or more content
documents of approved types, and other optional stuff such as CSS 2.1
style sheets, images, and multimedia. The draft Framework is now
being written (note, however, that the Binder is the real core of
the spec.)

The next level is encapsulation into a single binary file of the
Framework files for distribution purposes.

Again, this will be an open standard, probably consistent with the
IDPF Container soon to be released, which in turn is heavily
influenced by the ODF Container.

The Container spec itself does not deal with DRM/TPM.

TPM is to be applied by publishers independent of these layers, by
encrypting resources before they are encapsulated. More on this
below.
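
To make the layering concrete, here is a minimal sketch in Python. It is
only an illustration, not anything taken from the OpenReader specs: it
assumes the Container turns out to be ZIP-based (as the ODF package is),
and the file names, role labels, and the apply_tpm/encapsulate helpers
are all hypothetical. The "protect" callable is a stand-in for whatever
encryption a publisher might choose; the packaging step knows nothing
about it.

```python
import io
import zipfile

# Hypothetical role labels; the real Framework spec is still a draft.
BINDER, CONTENT, OPTIONAL = "binder", "content", "optional"


def check_framework(resources):
    """Enforce the rule described above: one Binder, one or more content documents."""
    roles = [role for role, _ in resources.values()]
    if roles.count(BINDER) != 1:
        raise ValueError("a Publication needs exactly one Binder document")
    if roles.count(CONTENT) < 1:
        raise ValueError("a Publication needs at least one content document")


def apply_tpm(resources, protect, names):
    """Publisher-side step, separate from the container: transform the
    chosen resources BEFORE they are packaged. `protect` is a placeholder
    for a real encryption routine."""
    return {
        name: (role, protect(data) if name in names else data)
        for name, (role, data) in resources.items()
    }


def encapsulate(resources):
    """Bundle the Framework files into one binary blob (assumed here to be ZIP)."""
    check_framework(resources)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, (_, data) in resources.items():
            zf.writestr(name, data)
    return buf.getvalue()


# Illustrative Publication; every name and byte string here is made up.
publication = {
    "binder.xml": (BINDER, b"<binder/>"),
    "chapter1.html": (CONTENT, b"<html><body>Chapter 1</body></html>"),
    "style.css": (OPTIONAL, b"p { margin: 0 }"),
}
package = encapsulate(publication)
```

A publisher who wanted TPM would call apply_tpm() on the resources first
and then pass the result to encapsulate(); a publisher who did not would
skip that step entirely, and the package layout would be the same either
way.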


> This is particularly important because the main OpenReader promoters on the
> Net have indicated that their vision of OpenReader includes DRM.  But they
> haven't been clear to date on what *sorts* of DRM will be permitted and
> what won't.  I worry that if they choose a standard that's too open-ended
> (or allow one by default by *not* choosing a standard) that they'll end up
> with the same problems that encrypted PDFs have-- namely, that readers can
> end up just as hamstrung by proprietary restrictions as they were with
> other formats.  (OpenReader promoters, on the other hand, apparently worry
> that if they're too strict with DRM standards, or they don't permit DRM
> at all, that publishers won't use the format to begin with.  On the other
> hand, there are lots of people on this list who are perfectly willing to
> "publish" works online in suitable formats without any DRM.)  It's also
> clear from our discussion to this point that many, though not all, forms
> of DRM effectively make a format no longer "open", at least as actually used.

I fully understand what you are saying and asking, and agree that many
types of TPM applied to an open standard format effectively render
the format "closed."

Unfortunately, DRM/TPM, as it is wont to do, brings in a whole bunch
of black, oily crud -- it is a huge can of worms that is difficult to
untangle. In addition, there are the pressures that larger publishers
have in wanting some kind of DRM/TPM, even if it is only a
"psychological" crutch (or to make their squeamish authors and their
lawyers happy.)

We prefer that TPM NOT be used with OpenReader Publications, and TPM
will always be optional. In addition, OpenReader user agents will never
be required to enable any TPM system.

Now, I know exactly what you are asking, but OpenReader, at this time,
cannot release a set of DRM/TPM requirements as you are asking. Why?
We simply don't have the volunteer resources to try to tackle this
knotty issue head-on.

Setting up such DRM/TPM requirements is not a trivial matter where I
can spend a couple hours some afternoon and draw them up and everyone
will be happy. This will require several sharp people, knowledgeable
about TPM, and who come from different perspectives, to brainstorm the
issue together in a temporary working group setting, untangle the
various cans of worms (and there are several cans to untangle), and
hammer out a set of TPM recommendations that make sense.

The biggest problem faced is to find the proper balance so the TPM
recommendations allow for "sufficiently open", yet "powerful enough"
TPM. If we make the requirements too strict, then we will in effect
ban the use of most if not all TPM, and this may (or may not!) hurt
the chances of OpenReader in meeting its goals. If we make the
requirements too lax (so they disallow very little), then we won't
meet the "openness" that John Mark wishes (and I wish!) for OpenReader.

So, in essence, I am offering the solution: we need to set up an
independent group of sharp people to look into the issue, and do so
fast. I'll certainly join the group and help with organizing it, but
it will require someone else to lead the effort (I simply don't have
the time to take the leadership role.) In some ways, it is best if
the effort of this group is "generic" rather than OpenReader specific.
This makes its results useful in the larger context of general open
standard digital publication formats.

(Of course, I offer the leadership position of this group to John
Mark Ockerbloom, if he is willing to accept! I will help John as I
can, including setting up teleconferences, gotomeetings, etc.)

Now, to note, we are intrigued with OSoft's DRM/TPM since from our
perspective it seems to strike the needed balance I discuss above,
although I have no defensible basis for saying so other than my
"feelings." In addition, OSoft has said they are willing
to license, at very liberal terms, their DRM system to a third party
non-profit, jointly run by publishers, retailers, the accessibility
community, etc., which will administer the DRM/TPM system for the
digital publication ecosystem. This assures the cost of the DRM will
be kept to a minimum, and that it is not controlled by any one
proprietary interest. In addition, how OSoft's DRM/TPM works is pretty
open, quite gentle, and focuses on the person and not any particular
hardware, allowing cross-platform use of any TPM'd publication.

Now, I am not saying OpenReader will bless any particular DRM/TPM
system (at least without following a set of recommendations as noted
above), but we are intrigued by OSoft's system, and it does suggest
that indeed we should be able to come up with a balanced list of
requirements and recommendations for anyone trying to add a TPM layer
to the OpenReader format.


> OpenReader is of course not the only possible next-generation digital book
> format.  Some people on and off this list are doing interesting things
> with formats like ODF, DjVu, ZML, TEI, Wiki formats, or more direct followons
> to OEBPS, or they are finding clever new ways of using older formats in new
> and interesting ways.  Are we about to see breakthroughs with what we do with
> these old and new formats?  Or are we just going to party like it's 1999?
> I'm not sure myself, but I'll stop here, and would love to hear what other
> folks have to say.

And of course add if:book's Sophie to the mix, and Microsoft's Windows
Presentation Foundation (for Vista), which, if I heard correctly
somewhere, they plan to make an "open standard."

To summarize, what is needed is an independent group to study DRM/TPM
as applied to open standards digital publication formats, and come up
with a balanced set of recommendations to assure a reasonable amount
of "openness." I liken such recommendations to the Web Content
Accessibility Guidelines (WCAG) and similar types of recommendations
intended to assure that everyone wins.

Jon Noring

speaking this time for the OpenReader Consortium