The Reading World

Gathering knowledge: Esoteric e-book formatting thought problems apropos of something

Last week’s announcement that the IDPF (International Digital Publishing Forum) has opened its ePub maintenance process is tremendously important to the future of books and publishing, regardless of whether you believe books, the artifact made with ink and paper, or publishing, the process of assembling, producing and distributing books for a profit, have bright futures or are destined for the trash heap. Everyone concerned about books and e-books should be paying close attention to the evolution of ePub, because it represents the current best effort at an open standard for the display of text and other information across a variety of e-reader devices.

I’ve spent the past few days studying the existing ePub components to prepare some suggestions for the IDPF. ePub is made up of three components, the Open Publication Structure 2.0, Open Packaging Format 2.0, and Open Container Format 1.0, and is closely tied to other metadata and publishing standards initiatives such as the Dublin Core Metadata Element Set 1.1 and the DAISY (Digital Accessible Information System) Consortium standards. The result is a series of postings to follow, which will offer thought problems that explore the nature of thought, reading, authoring, references, citation and conversation.

Making books useful and accessible to all, including people with visual and hearing disabilities, is a complex technical undertaking. The ePub and related standards efforts are predicated on the existence of texts which must be delivered to readers, which is precisely the problem one would address if distribution were still the key challenge. Unfortunately, distribution is the easy part of publishing today. In the networked world, ideas arrive in bits and pieces instead of whole units between the covers of a book or in an article from the newspaper. Words are quoted or paraphrased, and the enterprising reader can explore the sources to discover what credit to give the fragments of knowledge they find assembled by writers, bloggers and news aggregators, and in short messages. Therefore, citable information and the ability to assess ideas in relation to events and previously expressed ideas (in short, whether a newly published work adds to or merely repeats what has already been said) are the new hallmarks of value.

In the print era, when moving books, magazines and newspapers around in a timely fashion created value, the reader couldn’t participate, unless

Book and Reading News

A “standard” assumes the features are already set

“Ultimately, the success or failure of the eBook and eBook reader market is going to depend on establishing a standard format,” writes Tony Bradley at PCWorld. He’s right to the degree that, once a format is ready to make reading on a digital device better, it must become a standard to ensure that readers can access the file on any device and that publishing involves managing as few formats as possible. But the article assumes that a viable format already exists on which everyone should agree. We are very far from agreeing what an e-book is, except that, as a subset of that definition, it will display words on a page.

A first-generation standard will only scratch the surface, addressing the problem of getting words on the digital page. The industry and, more importantly, readers need more:

  • An open annotation system, but one that respects personal privacy by keeping notes meant only for the book’s reader (and, by extension, their heirs or anyone with their password) separate from public notes and conversation embedded in or around a book title.
  • A privacy regime enforced at the document level, preventing tracking of personal reading.
  • A page-independent reflowing capability, so that ridiculous ideas, such as “books for the Kindle DX,” become the fossils they deserve to be. A book should never be dedicated to a device, though there are some bizarre collectibility plays that might go that way.
  • A page-independent citation system so that kids can use an e-book citation in their homework as easily as a scholar.
  • And more…. Such as the whole question of how to integrate networking into documents.
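To make the page-independent citation idea concrete, here is a minimal sketch under my own assumptions, not anything drawn from ePub or another existing standard. It assumes a book is simply a list of paragraph strings, and the names Citation, cite, resolve and page_of are hypothetical. The citation records a paragraph index and character offset, so it resolves to the same words however the text reflows, while the "page" it lands on changes with the screen.

```python
import textwrap
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """Hypothetical page-independent locator: a position in the text, not on a page."""
    paragraph: int  # zero-based paragraph index
    offset: int     # character offset within that paragraph
    length: int     # number of characters cited


def cite(paragraphs, phrase):
    """Return a Citation for the first occurrence of phrase, or None if absent."""
    for index, text in enumerate(paragraphs):
        position = text.find(phrase)
        if position != -1:
            return Citation(index, position, len(phrase))
    return None


def resolve(paragraphs, citation):
    """Return the cited words, regardless of how the book is displayed."""
    text = paragraphs[citation.paragraph]
    return text[citation.offset:citation.offset + citation.length]


def page_of(paragraphs, citation, width, lines_per_page):
    """Page (1-based) on which the cited paragraph starts for a given screen.

    Assumes a naive layout: each paragraph is wrapped to `width` characters
    and pages hold `lines_per_page` lines, with no blank lines in between.
    """
    lines_before = sum(
        len(textwrap.wrap(text, width))
        for text in paragraphs[:citation.paragraph]
    )
    return lines_before // lines_per_page + 1
```

Calling page_of with a wide screen and then a narrow one can return different page numbers for the same Citation, while resolve returns the same words both times. That is the property a student or a scholar would need from a citation.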

The challenge of establishing that first standard, which lets e-books be read on any device, including PCs and smartphones, will be choosing technology that doesn’t shut the door to these additional standard requirements of a book while preserving forward-compatibility.

UPDATE: As I was arguing the other day and in the previous posting, the conform-to-compete trend in e-books is indicative of a wave of destruction. Mike Cane argues an e-book bubble is already well underway and I would not disagree with him, except to point out it is a very small bubble, though one that could unfortunately hobble the market for another half decade if it pops just now. Having published an e-book in 1993, when these things were going to be big, big, big, I have no illusions about how small a market can be. Cane, however, uses his argument to conclude that components of current technology, such as E-Ink, will inevitably fail. He argues this for all the right reasons: e-books don’t do anything spectacularly different than books and often represent less-than-a-book. He’s right that it is a race to the bottom based on price. But the individual components could succeed or fail on their own, perhaps not even within the e-book industry.

The Reading World

Reconstructing Dialogue: Publishers’ new jobs

The 140 Characters Conference this week, hosted by Jeff Pulver and attended by many of my friends, spawned a lot of discussion about the nature of communication, even though it was often cast in economic terms, both monetary and in relation to intellectual brevity. Publishers Weekly observed a conflict between long- and short-form discussion as well as the potential poisoning of the economic well because of too much commercialization of Twitter.

We do love competition

This is simply another variant on the professional/amateur, journalist/blogger arguments of earlier years, but it has legs, because it frames a debate about which side should “win,” which is excellent fodder for conferences and columns, blogs and short statements on Twitter or Friendfeed. It misses the point that all media blend over time, rather than one medium appearing and replacing another. These conflicts are sideshows, albeit apparently enjoyable sideshows, to the larger, subtle changes that are altering our world.

It is not the case that all thought can be reduced to 140 characters, as it is fashionable to claim these days. The challenge, one that is going to be partially addressed through the evolution of books and social software, is to create a consilience of long- and short-form dialogue, so that ideas that are explained at length in one venue, such as a book or Web page, can be extended and discussed in shorter forms that gracefully integrate with the long form. Also, the short form needs to be artfully connected to long-form thinking, so that the two experiences are not separate, as they are today, which creates a false dichotomy between the parts of discussions.

Hashmarks and search don’t heal this rift; they simply organize the boundary between long- and short-form parts of human communication.

Many worlds in a view

Practices of the mind and social behaviors that “bridge” this gap are helpful, but remind me of the guard towers along the Berlin Wall—everyone on both sides spoke German (and a different second language, ideologically charged, which was the real communications problem). The fact that there is dialogue across these boundaries is the exception that proves the rule of opposition between long and short forms.

We need flow (see Jerry Michalski’s declaration of a desire to be accessible and useful from the early blogging days), we need consilience. Formats need to be completely permeable, semantically connected and, wherever you are, on a page, in a book, in Twitter and IM, to serve as channels out of one place and its ideas to others. That’s what the Semantic Web will look like, and we haven’t seen even a glimmer of how vastly different that will really be.

We’re adding channels, which is not a zero-sum game. It’s not necessary for one mode of communication to defeat another. The current e-book format wars are yet another example of a useless conflict, because none of the formats supports real dialogue. They are just replicating the closed experience of paper books, with barriers to sharing ideas, annotations and conversations (note: the plural is necessary) within the text.

When this argument about the “best” channel for all communication gives way to reasonable discussions about all channels, we’ll be making progress toward a semantic infrastructure that doesn’t trap people and their ideas in a single format.

Bonus Reading: Tom Foremski offers this assessment of the Internet: it “devalues everything it touches, anything that can be digitized.” Yet that doesn’t mean there’s no value in anything on the Internet, only that the traditional services and processes for gathering and distributing value in information need to change. Tom writes, “Is this a bad thing? No, it is just what it is, just as gravity just is—neither good or bad.” It also means, I might add, that the old ways weren’t necessarily bad or good, just what happened to evolve in response to technology and human culture. Highly recommended.