Gathering knowledge: Esoteric e-book formatting thought problems apropos of something

Last week’s announcement that the IDPF (International Digital Publishing Forum) has opened its ePub maintenance process is tremendously important to the future of books and publishing, regardless of whether you believe books, the artifact made with ink and paper, or publishing, the process of assembling, producing and distributing books for a profit, have bright futures or are destined for the trash heap. Everyone concerned about books and e-books should be paying close attention to the evolution of ePub, because it represents the current best effort at an open standard for the display of text and other information across a variety of e-reader devices.

I’ve spent the past few days studying the existing ePub components to prepare some suggestions for the IDPF. ePub is made up of three components, the Open Publication Structure 2.0, Open Packaging Format 2.0, and Open Container Format 1.0, and is closely tied to metadata and publishing standards initiatives such as the Dublin Core Metadata Element Set 1.1 and the DAISY (Digital Accessible Information System) Consortium standards. The result is a series of postings to follow, which will offer thought problems that explore the nature of thought, reading, authoring, references, citation and conversation.

Making books useful and accessible to all, including readers with visual and hearing disabilities, is a complex technical undertaking. The ePub and related standards efforts are predicated on the existence of texts which must be delivered to readers, which is precisely the problem one would address if distribution were still the key challenge. Unfortunately, distribution is the easy part of publishing today. In the networked world, ideas arrive in bits and pieces instead of whole units between the covers of a book or in an article from the newspaper. Words are quoted or paraphrased, and the enterprising reader can explore the sources to discover what credit to give the fragments of knowledge they find assembled by writers, bloggers, news aggregators and in short messages. Therefore, citable information and the ability to assess ideas in relation to events and previously expressed ideas (in short, whether a newly published work adds to or merely repeats what has already been said) are the new hallmarks of value.

In the print era, when moving books, magazines and newspapers around in a timely fashion created value, the reader couldn’t participate, unless

Streamlining book metadata, but not for readers

TeleRead points to a new white paper, “Streamlining Book Metadata Workflow,” from the National Information Standards Organization (NISO) and OCLC, which discusses how to make the collection, curation and dissemination of book metadata more efficient. It’s an interesting paper, but one that demonstrates a glaring problem with most of the technical discussions surrounding e-books: Readers are not described as “stakeholders” in the metadata process, even though “enabl[ing] readers to identify and acquire books online” is the focus of the paper.

Readers will be the creators of the most important metadata describing books. Period. There is no second-guessing that conclusion, which has been proved again and again in every hypertext environment in human history. Defining the problem of book metadata without treating the reader as the fulcrum of the process misses the point, a common failing in technical discussions of semantic and intellectual work. The problem of coding and building a system is daunting, but it is made much easier by assuming the final users of the technology will be passive consumers.

The paper is interesting as a discussion of the various existing bibliographical metadata systems used to move books around and inventory them within bookstores and libraries. It is even useful within the publishing and distribution value chain. However, it misses the mark in the most fundamental way possible, by defining the reader out of the metadata workflow.