Copyright © 2006 by the OpenReader™ Consortium. Some rights reserved. This specification, with the exception of the OpenReader logo, is licensed under a Creative Commons Attribution-ShareAlike 2.5 License. All rights reserved for the OpenReader logo. The OpenReader™ trademark is not covered by the Creative Commons License.
This specification details the structure, conformance requirements and recommendations of a Binder Document (“Binder”). It is a module specification within the suite of specifications used to define the OpenReader Publication Framework Specification.
The Binder is an XML document whose purpose is to organize a textual publication represented by multiple resources. The word “Binder” is used since the Binder Document is, in a loose sense, the digital equivalent of a book binder, or a three-ring binder, where paper pages (which can be thought of as discrete resource fragments) are ordered and “bound together” to create a single, coherent publication.
Although the Binder is primarily designed for the OpenReader Publication Framework (a framework centered on XML-conforming content documents and CSS), this specification has been authored in sufficiently generic fashion so it may be used for other similar publication frameworks.
The Binder Document is significantly influenced by the similar-purpose Package Document of the innovative OEBPS Specification, first formulated in 1999. Since 1999, much has been learned about the shortcomings of OEBPS (expected for any first-generation technology), and new requirements have been identified. The Binder incorporates the long-overdue improvements asked for by publishers, the accessibility community, and other digital publication stakeholders. It also enables new and powerful innovations which will greatly benefit both publishers and end-users.
The normative edition of this specification is the XHTML 1.1 document located at http://openreader.org/spec/bnd10.html .
Other formatted editions may be offered besides the normative edition, but they will not be considered normative.
The XHTML 1.1 normative edition of this specification is authored so that the markup in the document body (that contained in the body element) conforms with the Basic Content Document Specification, Version 1.0.
Several important words and terms used in this specification are defined in the Common Definitions Document, Version 1.0.
The following key words (“imperatives”) are used in this specification to denote requirement level consistent with RFC 2119:
To aid in readability and understandability, special text highlighting conventions are used in this specification (in addition to ordinary text emphasis) to emphasize important items.
The requirement level imperatives described in Section 1.3 are highlighted based on three basic imperative levels: required, recommended, and optional.
The normative XHTML 1.1 edition of this specification includes special markup for every mention of elements, attributes, attribute values, and other related code. (For details, refer to the comment in the source document header.) This allows these markup constructs to be specially highlighted, using CSS, during presentation (including their status and requirement level) so they may be more easily recognized.
Since the normative edition of this specification may be rendered with different CSS style sheets, converted into other formats, rendered on visually limited hardware, or presented with text-to-speech engines, some or all of this highlighting may be lost. Care has been taken to assure that, in the absence of highlighting, every mention of these markup constructs will be clear and unambiguous.
| Status (elements) | Required | Cond. Req. | Optional |
|---|---|---|---|
| Normal | pubid | title | dublincore |
| Deprecated | | | |
| Removed | | | |

| Status (attributes) | Required | Cond. Req. | Optional | Fixed |
|---|---|---|---|---|
| Normal | idns | event | comment | |
| Deprecated | | | | |
| Removed | | | | |
In the above tables, there are four requirement levels:
“Required” means the element/attribute must appear, in some capacity, in all Binder documents.
“Conditionally Required” means the element/attribute must appear under certain element usage situations, and is optional in other situations.
“Optional” means the element/attribute is optional under all situations.
“Fixed” (applicable only to attributes) means the attribute is fixed to a certain value in the DTD and there is no separate requirement the attribute must appear in the associated element.
Similarly, there are three status levels:
“Normal” means the element/attribute has normal status in this specification.
“Deprecated” means the element/attribute has been deprecated, and support for it may be removed in a future version of this specification.
“Removed” means the element/attribute is no longer supported in this specification, but is nevertheless mentioned.
An empty cell in the tables means there is no mention in this specification of an element/attribute having the associated status and requirement level.
Attribute values are highlighted as en-US.
Other types of “code” are highlighted as PCDATA.
This specification is built upon a wide and stable base of compatible open specifications and standards. Following are the various specifications and standards referenced in some manner by this specification.
W3C Specifications and Notes:
Internet Engineering Task Force (IETF):
International Organization for Standardization (ISO):
National Information Standards Organization (NISO)
Internet Assigned Numbers Authority (IANA)
Others:
The MIME Media Type of a conforming Binder Document is “application/x-orp-bnd1+xml”. This MIME media type is not IANA registered.
Other specifications and applications using or referencing Binder Documents by MIME media type should use “application/x-orp-bnd1+xml”, rather than one based on the “text/” media type name. The reason is that Binder Documents may be encoded in UTF-16 (see Section 3.1) and “application/” media type names are more appropriate when both UTF-8 and UTF-16 encodings are allowed (RFC 3023).
The Binder Document is an XML document valid to the Binder Document DTD, Version 1.0. The Binder, as an XML document, may use the various vocabulary-independent constructs that are specified in XML 1.0.
In addition to XML conformance, the Binder Document must meet a set of general requirements that go beyond what XML specifies. This section outlines these general requirements and provides an overview of the more important XML-provided vocabulary-independent constructs, useful to both Binder Document authors and user agent developers.
A conformant Binder Document must meet all of the following general and top-level requirements:
Fully conforms to XML 1.0 (e.g., it is well-formed)
Text encoding is UTF-8 or UTF-16 as specified in the latest Unicode standard.
Includes an XML declaration with a text encoding declaration:
<?xml version="1.0" encoding="UTF-8" ?>
or
<?xml version="1.0" encoding="UTF-16" ?>
Valid to the Binder Document DTD, Version 1.0, which is externally referenced by public identifier as follows:
<!DOCTYPE binder PUBLIC
"-//OpenReader//DTD Binder Document 1.0//EN"
"http://openreader.org/dtd/bnd10.dtd">
Does not include a DTD internal subset.
For the document root element binder, the default namespace is explicitly declared to be http://openreader.org/namespace/orp-binder/1.0/. Two prefixed namespace declarations are also required: Dublin Core, and XHTML. The required attribute xml:lang (see Section 4.2.1.1) specifies the default language of the Binder Document (but not of the Publication itself, also discussed in Section 4.2.1.1).
Example where the default language of the Binder Document is U.S. English:
<binder xmlns="http://openreader.org/namespace/orp-binder/1.0/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xml:lang="en-US">
Does not declare any other namespaces, whether default or prefixed.
Conforms with all the specific requirements and constraints described elsewhere in this specification.
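Based on the general conformance requirements in Section 3.1, the following Binder Document template is constructed. This template includes the required XML declaration, document type declaration, root element with its namespace declarations, and required head elements.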
Binder Document authors will find it a useful template (or “boilerplate”) to use as a starting point to build conforming Binder Documents.
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE binder PUBLIC
"-//OpenReader//DTD Binder Document 1.0//EN"
"http://openreader.org/dtd/bnd10.dtd">
<binder xmlns="http://openreader.org/namespace/orp-binder/1.0/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xml:lang="en-US">
<pubid>
<!-- pubid content goes here -->
</pubid>
<resources>
<!-- resources content goes here -->
</resources>
<userset usid="1" mode="oeb" lang="en-US">
<!-- userset content goes here. The <userset>
attribute values are example only. -->
</userset>
<!-- Note: There may be more than one <userset> in <binder>. For
multiple <userset>, the 'title' attribute is also required. -->
</binder>
Notes on the above example:
The text encoding declaration in the XML declaration may either be UTF-8 or UTF-16.
The value of the xml:lang attribute in binder will vary depending upon the default language of the Binder Document (see Section 4.2.1.1).
All the elements shown may include certain optional attributes.
The required element userset, which may appear more than once, must include the three required attributes as shown. However, the given attribute values are example only and may vary as allowed by this specification.
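A few of the general requirements from Section 3.1 can be spot-checked mechanically. The following Python sketch uses only the standard library and does not validate against the DTD; the helper name check_binder_basics and the list of checks are illustrative assumptions, not part of this specification.

```python
import xml.etree.ElementTree as ET

BINDER_NS = "http://openreader.org/namespace/orp-binder/1.0/"
XML_NS = "http://www.w3.org/XML/1998/namespace"

def check_binder_basics(xml_bytes):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    # fromstring raises ParseError when the document is not well-formed.
    root = ET.fromstring(xml_bytes)
    if root.tag != "{%s}binder" % BINDER_NS:
        problems.append("root element is not binder in the Binder namespace")
    if not root.get("{%s}lang" % XML_NS):
        problems.append("binder is missing the required xml:lang attribute")
    # The three head elements shown in the template.
    for name in ("pubid", "resources", "userset"):
        if root.find("{%s}%s" % (BINDER_NS, name)) is None:
            problems.append("missing required element: " + name)
    return problems

TEMPLATE = b"""<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE binder PUBLIC
  "-//OpenReader//DTD Binder Document 1.0//EN"
  "http://openreader.org/dtd/bnd10.dtd">
<binder xmlns="http://openreader.org/namespace/orp-binder/1.0/"
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:xhtml="http://www.w3.org/1999/xhtml"
        xml:lang="en-US">
  <pubid></pubid>
  <resources></resources>
  <userset usid="1" mode="oeb" lang="en-US"></userset>
</binder>"""

print(check_binder_basics(TEMPLATE))
```

The standard-library parser does not fetch the external DTD, so this only covers well-formedness and the presence of the root element, xml:lang, and the head elements.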
This specification is not intended to be a tutorial on how to author XML-conforming Binder Documents. Nevertheless, to aid in the authoring of Binder Documents, which must be well-formed XML, and which may include the useful vocabulary-independent constructs that XML allows, this section presents a sampling of the most important markup-related XML requirements, vocabulary-independent constructs, and related useful topics.
Note: A few Binder Document and user agent requirements (beyond what XML requires) are specified in this section.
As specified in Section 3.1, all Binder Documents must be conforming XML 1.0 documents, which means, for example, they are well-formed. Following is a list of several specific XML markup requirements, but this list is by no means exhaustive. These requirements are mentioned since, when not followed, they contribute to a large fraction of encountered XML well-formedness and validation errors.
Element and attribute names are case-sensitive. For example, <binder> and <BINDER> are different elements.
Attribute values must be enclosed in either single or double straight quotes. For example, lang="en-US" and lang='en-US' are conforming, but lang=en-US, lang="en-US' and lang='en-US" are not.
All non-empty elements must have properly formed starting and closing tags.
All declared empty elements must be properly formed (see Section 3.3.5).
All elements must properly nest.
Depending upon the circumstance, certain markup characters, when used literally (e.g. & and <), must be escaped. Refer to Section 3.3.2 for details.
For XML 1.0 documents (and this includes Binder Documents), each individual character within character data is represented in one of two ways:
directly in the document’s text encoding, and
indirectly using a numeric character reference, or by a predefined or declared character entity reference which points to a numeric character reference.
For example, it is convenient to use numeric character references and allowed character entity references when the tool used to create an XML document is limited to ASCII encoding (UTF-8 conformant but limited to the Basic Latin script), and some characters fall outside of the ASCII range.
In certain circumstances, five of the characters used to define XML markup constructs (specifically & < > " and '), when used literally, must be represented (or “escaped”) by their numeric character references, or by their declared character entity reference equivalents. For this purpose, XML predefines entity references for these five characters which all XML processors must recognize:
| Character | Predefined Entity | Numeric Reference (hex) | Numeric Reference (dec) |
|---|---|---|---|
| & | &amp; | &#x26; | &#38; |
| < | &lt; | &#x3C; | &#60; |
| > | &gt; | &#x3E; | &#62; |
| " | &quot; | &#x22; | &#34; |
| ' | &apos; | &#x27; | &#39; |
Listed below are the circumstances when the five markup characters, used literally and not as part of markup, must be escaped:
The & and < characters, except when used within CDATA sections and Comments.
The " and ' characters when they appear within an attribute value and match the attribute value delimiting quote mark. It is recommended that both always be escaped in attribute values.
The > character in the very rare instance it appears in the string “]]>” when that string is not marking the end of a CDATA section. It is considered good practice to escape the > character wherever it is used literally.
Example of Binder Document markup with both required and optional numeric and character entity references:
<title comment='Jane&apos;s AT&amp;T R&#xE9;sum&#xE9;'>Jane's AT&amp;T R&eacute;sum&eacute;</title>
A user agent will render the above content as:
Jane's AT&T Résumé
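Authoring tools can apply the escaping rules above automatically. The sketch below uses Python's standard xml.sax.saxutils.escape; the two wrapper function names are hypothetical, and always escaping both quote characters in attribute values follows the recommendation above rather than a strict XML requirement.

```python
from xml.sax.saxutils import escape

# Extra replacements for attribute values: escape both quote characters.
QUOTES = {'"': "&quot;", "'": "&apos;"}

def escape_character_data(text):
    """Escape &, < and > for use in element character data."""
    return escape(text)

def escape_attribute_value(text):
    """Escape &, <, > and both quote characters for use in an attribute value."""
    return escape(text, QUOTES)

print(escape_character_data("AT&T <tag>"))
print(escape_attribute_value("Jane's AT&T"))
```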
CDATA sections may be used in XML documents (which includes Binder Documents) to escape blocks of text containing markup characters (e.g. “<” and “&”) when used literally. This is an alternative to individually escaping each markup character (see Section 3.3.2).
A CDATA section starts with “<![CDATA[” and terminates with “]]>”.
CDATA sections may be used anywhere character data may occur, except that they must not appear within an attribute value. They must not nest; the text content within a CDATA section must not contain the literal character sequence “]]>”.
Example:
<xhtml:span>Insert the following: &lt;h1&gt;Greetings!&lt;/h1&gt;</xhtml:span>
is equivalent to
<xhtml:span>Insert the following: <![CDATA[<h1>Greetings!</h1>]]></xhtml:span>
A user agent will render both of the above as:
Insert the following: <h1>Greetings!</h1>
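The equivalence of the escaped form and the CDATA form can be confirmed with any XML parser. The following is a minimal sketch using Python's standard library; the namespace declaration is added to each fragment only so that it parses on its own.

```python
import xml.etree.ElementTree as ET

XHTML = 'xmlns:xhtml="http://www.w3.org/1999/xhtml"'

# The same content written with entity references and with a CDATA section.
escaped = ('<xhtml:span %s>Insert the following: '
           '&lt;h1&gt;Greetings!&lt;/h1&gt;</xhtml:span>' % XHTML)
cdata = ('<xhtml:span %s>Insert the following: '
         '<![CDATA[<h1>Greetings!</h1>]]></xhtml:span>' % XHTML)

escaped_text = ET.fromstring(escaped).text
cdata_text = ET.fromstring(cdata).text

print(escaped_text == cdata_text)
```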
Comments may appear anywhere in an XML document (including a Binder Document) except before the XML declaration and within other markup. Comments are not part of the character data; they are primarily intended for Binder Document authors to insert private commentary (notes) within the document.
A comment starts with “<!--” and terminates with “-->”; the comment text is between these two delimiters. The comment text must not contain the string “--” (two hyphens), but otherwise may include, without escaping, all the Unicode characters recognized in XML 1.0, including the XML markup characters. A comment must not terminate with the literal string “--->”.
Examples of valid comments:
<!-- This is a comment -->
<!-- & < > " ' -->
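The two constraints on comment text are simple to check before a comment is written out. A minimal Python sketch follows; the function name is hypothetical, and the text passed in is what sits between the “<!--” and “-->” delimiters.

```python
def is_valid_comment_text(text):
    """Check the text placed between the comment delimiters.

    It must not contain two consecutive hyphens, and it must not end with
    a hyphen (otherwise the comment would terminate with "--->").
    """
    return "--" not in text and not text.endswith("-")

print(is_valid_comment_text(" This is a comment "))
print(is_valid_comment_text(" not -- allowed "))
```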
To conform to this specification, user agents must not:
Some elements in a DTD may be declared EMPTY. When used in an XML document, these elements must not contain any content and must use the empty-element syntax (also known as “minimized form”) as specified in XML 1.0.
Example of correct usage of declared empty element syntax (the element item is declared EMPTY in this specification):
<item resid="css1" resource="style1.css" media-type="text/css"/>
Note: a sequence of one or more white space characters may appear before the closing “/” in empty-element syntax.
In this specification, the empty-element syntax must only be used for declared empty elements; it must not be used for declared non-empty elements when they contain no content.
Example of correct and incorrect usage when a declared non-empty element contains no content (the element dc:title is declared non-empty in this specification):
<dc:title/> <!-- not allowed! -->
<dc:title></dc:title> <!-- correct usage -->
When declared non-empty elements contain no content, or only a sequence of one or more white space characters, this occurrence is referred to as “empty content.”
White space characters and their handling by user agents are an important consideration for both Binder Document authors and user agent developers.
In XML, the white space characters are:
space (&#x20;)
tab (&#x9;)
carriage return (&#xD;)
line feed (&#xA;)
The rules for white space handling of both character data and attribute values by XML processors are addressed in Sections 2.10, 3.2.1 and 3.3.3 of the XML 1.0 Specification.
User agent requirements:
For character data, XML processors are required to pass to the user agent all characters in a document that are not markup. This includes white space characters.
Except where the XML attribute xml:space is specifically set to the value of preserve, or a similar override mechanism is applied (e.g., the CSS white-space property), in this specification user agents must normalize the character data of an element as follows:
Replace all sequences of two or more white space characters with a single space character (&#x20;), and
Remove all leading and trailing spaces.
Example:
<dc:title>
<xhtml:em> This
</xhtml:em> is a
Title
</dc:title>
and
<dc:title> <xhtml:em> This </xhtml:em> is a Title </dc:title>
are both equivalent to:
<dc:title><xhtml:em>This</xhtml:em> is a Title</dc:title>
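The normalization rule can be sketched for a flat string of character data as follows. This is an illustration under assumptions: the function name is hypothetical, it works on a plain string rather than on mixed content spanning child elements, and it also maps a single tab, carriage return, or line feed to a space, which is what the example above implies for the line break between “a” and “Title”.

```python
import re

# Runs of the four XML white space characters: space, tab, CR, LF.
_XML_WS_RUN = re.compile(r"[\x20\x09\x0D\x0A]+")

def normalize_character_data(text):
    """Collapse white space runs to one space and trim both ends."""
    return _XML_WS_RUN.sub(" ", text).strip(" ")

print(normalize_character_data("\n  This\n is a\n Title\n"))
```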
For white space in attribute values, XML requires that all XML processors normalize attribute values before sending the attribute value data to the user agent. Note that this normalization process treats attribute values not of type CDATA differently from those of type CDATA.
To conform to this specification, user agents must normalize CDATA attribute values as if they were not of type CDATA. That is, for attribute values of type CDATA, user agents must replace a sequence of space (&#x20;) characters with a single space (&#x20;) character, and remove any leading and trailing space (&#x20;) characters.
Example (the comment attribute is of datatype CDATA):
<style cssrefs="css1" comment=" This is a
style sheet "/>
is equivalent to:
<style cssrefs="css1" comment="This is a style sheet"/>
Binder Document authors are free to use all the Unicode characters in character data, except those disallowed by XML and this specification (refer to "Unicode in XML and other Markup Languages" for recommendations on the Unicode characters not suitable for use in XML, and related topics.) This flexibility allows for the richest content in Binder Documents, meeting nearly all international needs, but in certain situations will create a few complexities for Binder Document authors and user agent developers.
One of the more complex topics concerns the spacing characters used for inter-word separation. Because the concept of a “word” in most languages plays a fundamental role in various word-related operations, such as text searching, line breaking, etc., Binder Document authors and user agent developers need to understand how the Unicode space characters are used to enable inter-word separation, plus the related topics of line breaking (primarily for the purpose of visual presentation), and soft hyphens.
The Unicode Space Characters set (see Section 6.2 in the Unicode 4.1.0 specification) includes:
Ordinary space character (&#x20;, which is also an XML white space character)
No-break space (&#xA0;, or &nbsp; as defined in XHTML and in the OpenReader Character Entity References Common Set)
General Punctuation space characters in the range of &#x2000; to &#x200B;
Narrow no-break space (&#x202F;)
Medium mathematical space (&#x205F;)
Ideographic space (&#x3000;)
Zero width no-break space (&#xFEFF;)
(For more details on these spacing characters, and other space-like characters, refer to Section 6.2 in the Unicode 4.1.0 standard. This specification does not specify how user agents are to exactly render these different space characters.)
In this specification, user agents must treat any sequence of Unicode Space Characters and/or XML white space characters within character data as an inter-word separator.
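For word-related operations such as searching, the inter-word separator rule might be implemented roughly as below. The code points are the Unicode values of the space characters named in the list above plus the four XML white space characters; the function name split_words is a hypothetical helper, not something this specification defines.

```python
import re

# XML white space plus the Unicode space characters listed above.
_SEPARATORS = re.compile(
    "[\u0020\u0009\u000D\u000A"   # space, tab, carriage return, line feed
    "\u00A0"                      # no-break space
    "\u2000-\u200B"               # General Punctuation space characters
    "\u202F\u205F\u3000\uFEFF]+"  # narrow no-break, medium math, ideographic, ZWNBSP
)

def split_words(text):
    """Split character data into words at any run of separator characters."""
    return [word for word in _SEPARATORS.split(text) if word]

print(split_words("A\u00A0Tale\u3000of  Two\u2009Cities"))
```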
The related topic of line breaking is important for the purpose of visual rendering. This topic is covered in detail in the Unicode Standard Annex #14 Technical Report: Line Breaking Properties, which provides a comprehensive set of guidelines. User agents should follow, as closely as possible, the line break recommendations in this Unicode technical report.
In general, line breaking is allowed between words except where one or more no-break space characters are used between the words. The no-break space characters include:
No-break space (&#xA0;, or &nbsp;, as defined in XHTML and in the OpenReader Character Entity References Common Set; this is the preferred character to use for no line breaking between words)
Figure space (&#x2007;)
Narrow no-break space (&#x202F;)
Zero width no-break space (&#xFEFF;)
User agents should not line break between two words separated by a sequence of one or more no-break space characters.
Binder Document authors should not use a no-break space character for any purpose other than indicating that no line break should occur between words.
[Informative Commentary] In XML document authoring, particularly XHTML, some authors inappropriately use a no-break space (primarily &nbsp;) to “pad” spacing in order to force a desired visual presentation, thereby working against the reflowability and adaptability of the content to various hardware, applications and end-user presentation settings.
Regarding line breaking within a word, user agents may do so per the allowance and the conventions of the language as detailed in the above referenced Unicode technical document on line breaking.
Binder Document authors may insert within a word the “soft hyphen” character (&#xAD; or &shy; as defined in XHTML and the OpenReader Character Entity References Common Set) to signal that the user agent may line break the word at that point.
In this specification, user agents must not render the soft hyphen character but may add the appropriate end-of-line character(s) (and other necessary text adjustments, depending upon language and conventions) for a line break placed after a soft hyphen. For all other purposes, such as word searching, user agents must ignore the soft hyphen character since it is technically not part of the word.
Note: The soft hyphen is not the same character as the plain hyphen (-). The plain hyphen character is considered a part of the word, and user agents must process it like any other character in the word.
Example of the use of a soft hyphen:
<dc:title>How to Insert a Soft Hy&shy;phen Within a Word</dc:title>
Should a user agent, in presenting the contents to the end-user, line break before the word “Hyphen”, it will render the above example as follows:
How to Insert a Soft
Hyphen Within a Word
If the user agent line breaks at the soft hyphen, it will render the above example as follows (using the common English language convention for hyphenation):
How to Insert a Soft Hy-
phen Within a Word
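For searching, the requirement to ignore the soft hyphen amounts to removing U+00AD before comparing. A minimal Python sketch, with hypothetical function names:

```python
SOFT_HYPHEN = "\u00AD"

def strip_soft_hyphens(text):
    """Remove soft hyphens; plain hyphens are part of the word and are kept."""
    return text.replace(SOFT_HYPHEN, "")

def contains_word(text, word):
    """Search for a word while ignoring soft hyphens on both sides."""
    return strip_soft_hyphens(word) in strip_soft_hyphens(text)

title = "How to Insert a Soft Hy\u00ADphen Within a Word"
print(contains_word(title, "Hyphen"))
```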
This section describes the Binder Document vocabulary. As noted in Section 3.1, a Binder Document must be valid to the Binder Document DTD. The Binder DTD is the normative vocabulary reference with respect to:
allowed elements, attributes and attribute values, and
element content model.
The root element binder is described in Section 3.1, item 6.
Although the overarching framework which references this specification is the ultimate authority with respect to user agent processing of Binder Documents, this section (as well as elsewhere in this specification) includes a number of user agent requirements and recommendations deemed necessary to the integrity of the Binder Document paradigm. All these user agent requirements and recommendations are authoritative in the overarching framework except where explicitly overridden or amended by the framework.
The Binder Document is divided into functional parts, each of which performs some function in the organization and/or use of the associated Publication. Some functional parts are vital and thus are required; others are optional and provide for enhancements to the end-user experience.
The following table provides an overview of the various Binder parts, with links to full descriptions. Note that each functional part is represented in markup with a head element (shown in the second column). In the Binder Document, the order of the head elements is significant and follows the order listed in the table; except for the User Set, each functional part’s head element must not appear more than once.
| Functional Part Name | Head Element Name | Description | Requirement Level |
|---|---|---|---|
| Publication Identifier | pubid | Primary identifier of the Publication | required |
| Resource Manifest | resources | List of resources that make up the Publication | required |
| | | Dublin Core metadata for the Publication | optional but recommended |
| User Set | userset | Provides for multiple, end-user selectable “views” of the Publication | at least one required |
| User Set Functional Parts | | | |
| Publication Title | pubtitle | Title of the Publication | required |
| | | Unicode characters and character blocks appearing in the content documents associated with the User Set | optional but recommended |
| | | Primary reading order (similar to OEBPS Spine), or home page for “web” option | required |
| | | Linearization of the content documents in the User Set for printing and related purposes | optional |
| | | Merging of selected out-of-spine content documents into a “virtual” composite document | optional |
| | | Associating substitute non-text content media resources to specific content document elements | optional |
| | | Associating non-text content media resources with each other | optional |
| | | Designating non-text content media cover resources | optional but recommended |
| | | Designating thumbnail images | optional but recommended |
| | | Assigning cascading style sheets (such as CSS) to content document resources | optional |
| | | Assigning the primary navigation index (required), and optional alternative navigation indices | required |
| | | Providing textual descriptions for non-text content media resources | required for non-text content media resources |
This specification supports certain vocabulary-specific constructs that may be used in various functional parts of Binder Documents. (Refer to Section 3.3 for vocabulary-independent constructs.)
The Binder Document vocabulary defines three [Common] attributes that may be applied to most elements. They are xml:lang, xml:id, and comment.
xml:lang
The xml:lang attribute, specially defined in XML 1.0, may be used to specify the default language of the element’s content and attribute values. This attribute must not be used to specify Publication-related languages, which are instead specified using the lang attribute (see Section 4.2.2).
The value of xml:lang must comply with RFC 3066, or its successor on the IETF Standards Track. Thus, the value will also conform to the separate requirement that xml:lang be an XML Name. (Language Codes) (Country Codes)
While xml:lang is optional for most elements, it is required for the root element binder (see Section 3.1, item 6), specifying the default language of the Binder Document.
xml:id
The xml:id attribute, based on the W3C Recommendation xml:id Version 1.0, is used to give a unique identifier to an element. Its value must:
be unique across all elements in a Binder Document,
be an XML Name,
not contain the “:” character,
start with a Letter as defined in Appendix B of the XML 1.0 Specification (it cannot start with an underscore, “_”),
not start with the string “xml” (and all its case variants), since this is reserved in XML 1.0 for possible future standardization, and
not start with the string “orp” (and all its case variants), since this is reserved for possible use in future versions of this specification.
This specification assigns no special meaning or purpose to xml:id (although a future version of this specification may do so); Binder Document authors and authoring systems may freely use xml:id for special identification of elements. However, xml:id must not be used as a “back door” to actuate special features or functions in user agents.
Note that xml:id must not be applied to the item and userset elements, since these elements already require attributes with datatype ID (see Sections 4.4 and 4.6, respectively). XML forbids an element having more than one ID.
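An authoring tool could pre-check xml:id values along the following lines. This is only an approximation: Python's str.isalpha() and the \w character class stand in for the Letter and NameChar productions of XML 1.0 and do not match them exactly, and both function names are hypothetical.

```python
import re

# Approximate NameChar: word characters, "." and "-" (colon deliberately excluded).
_NAME_REST = re.compile(r"[\w.\-]*")

def is_acceptable_xml_id(value):
    """Approximate check of the xml:id value constraints listed above."""
    if not value or ":" in value:
        return False
    if not value[0].isalpha():  # rejects a leading underscore or digit
        return False
    lowered = value.lower()
    if lowered.startswith("xml") or lowered.startswith("orp"):
        return False
    return _NAME_REST.fullmatch(value[1:]) is not None

def all_unique(values):
    """True when no identifier value is used more than once."""
    values = list(values)
    return len(values) == len(set(values))

print(is_acceptable_xml_id("chapter1"), is_acceptable_xml_id("_note"))
```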
comment
The comment attribute may be used by Binder Document authors to provide commentary within an element. The attribute value of comment is datatype text (CDATA).
The allowed Unicode character range for the value of the comment attribute, as with all attribute values of datatype text (CDATA), is given in XML 1.0, with the following constraints:
The literal < character must not appear. If this character is to be used literally, it must be escaped.
The literal & character must not appear except as part of a predefined or declared character entity reference. If this character is to be used literally (not part of an entity reference), it must be escaped.
The literal " and ' characters must not appear when they match the attribute delimiter quote marks; they must be escaped. It is recommended that both of these characters always be escaped when they appear in attribute values.
Similar to XML comments (see Section 3.3.4), the comment attribute is intended for private use by Binder Document authors. User agents must not use the value of the comment attribute.
lang Attribute
The lang attribute may be applied to the required resources, item, and userset elements. Its purpose is to specify Publication-related languages.
The lang attribute differs from the xml:lang attribute (see Section 4.2.1.1) in that xml:lang is intended only to specify the language of element character data and attribute values within a Binder Document.
lang also differs from xml:lang in that lang may contain multiple language assignments which are separated by one or more white space characters. Each assigned language follows the rules given for xml:lang in Section 4.2.1.1.
Example:
lang="en-US de-DE"
It is possible for both xml:lang and lang to appear in the same element, and their assigned language values may differ.
Example:
<item resid="image1"
resource="image1.png"
media-type="image/png"
lang="fr-FR"
xml:lang="en-US"
comment="Photograph of Paris (caption in French)"/>
In the above example, xml:lang is assigned the value en-US since the value of the comment attribute is in English. The lang attribute is assigned the value fr-FR since the Publication resource, image1.png, is a photograph with a caption in French.
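A processor reading the item example above sees two independent language values. The sketch below, using Python's standard library, reads both; the helper name publication_languages is hypothetical, and str.split() is used on the assumption that the XML processor has already normalized white space in the attribute value to spaces.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

ITEM = ('<item resid="image1" resource="image1.png" media-type="image/png" '
        'lang="fr-FR" xml:lang="en-US" '
        'comment="Photograph of Paris (caption in French)"/>')

def publication_languages(element):
    """Return the white-space-separated language assignments of lang."""
    return element.get("lang", "").split()

item = ET.fromstring(ITEM)
print(publication_languages(item), item.get(XML_LANG))
```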
residrefs Attribute
The required attribute residrefs is applied to several elements to reference one or more resource identifiers assigned in the Resource Manifest (see Section 4.4.3). It is of datatype IDREFS, and must be a white space separated list of resource identifiers. (XML 1.0 discussion of datatype IDREFS.)
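Resolving a residrefs value against the identifiers declared in the Resource Manifest could look like the following. The dictionary of resources and the function name are illustrative assumptions; a validating XML processor would already reject an IDREFS value that names an undeclared ID.

```python
def resolve_residrefs(residrefs, resources):
    """Map each referenced resource identifier to its resource.

    `resources` is a hypothetical dict built from the Resource Manifest,
    keyed by the resid value of each item.
    """
    resolved = []
    for resid in residrefs.split():
        if resid not in resources:
            raise KeyError("unknown resource identifier: " + resid)
        resolved.append(resources[resid])
    return resolved

resources = {"css1": "style1.css", "image1": "image1.png"}
print(resolve_residrefs("css1 image1", resources))
```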
title Element
The required title element is used in the Publication Title, Styling, and Navigation functional parts of the User Set.
Each title element contains character data (#PCDATA) representing a title description, and may also contain the XHTML Namespace Inline Elements (see Section 4.2.5).
Example:
<title>Title of <xhtml:em>This</xhtml:em> Book</title>
All of the elements in the Binder Document vocabulary which may contain character data (#PCDATA), with the exception of the id element, may also contain inline elements drawn from the XHTML Namespace.
The nine supported inline elements allow Binder Document authors to add semantic richness to the character data. When these inline elements are used in character data, user agents should apply appropriate styling when presenting the character data.
As in XHTML, the inline elements may be nested. The Binder Document Common Attributes may be applied to these elements, as well as the optional class attribute which is drawn from XHTML. (Description)
This specification does not specify how user agents are to use the value of the class attribute when present. However, a future version of this specification may add support for author-supplied CSS for styling the XHTML Inline elements, and the class attribute will be useful for CSS styling purposes.
The following table lists the supported XHTML Namespace Inline elements. Each supported element is linked to the general description in the HTML 4.01 specification, providing a good overview of the purpose and use of the element.
| Element | Short Description |
|---|---|
| code | Computer Code Fragment |
| em | Emphasis |
| kbd | Text Entered by the User |
| samp | Program, Script, and Similar Output |
| span | Generic Inline Level Container |
| strong | Strong Emphasis |
| sub | Subscript |
| sup | Superscript |
| var | Instance of a Variable or Program Argument |
Example of the use of XHTML Namespace Inline elements:
<pubtitle>
<title>Title of <xhtml:em>This</xhtml:em> Book</title>
</pubtitle>
A user agent may visually present the above character data as:
Title of This Book
The Binder Document DTD declares the 253 character entity references specified in the Character Entity References Common Set Specification, Version 1.0. These character entity references are identical to those supported in XHTML 1.1 (which, in turn, are inherited from HTML 4.01.) They include the five XML predefined character entity references (see Section 3.3.2.)
Binder Document authors may use these “mnemonic” character entities instead of the equivalent numeric character references, as explained in Section 3.3.2. User agents must recognize these character entities.
Example using numeric character references:
<title>Jane&#8217;s AT&#38;T R&#233;sum&#233;</title>
The same example using “mnemonic” character entity references:
<title>Jane&rsquo;s AT&amp;T R&eacute;sum&eacute;</title>
Both the above examples will render as:
Jane’s AT&T Résumé
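The equivalence of the two forms can be checked with a short sketch. It assumes decimal numeric references and uses Python's `html.unescape`, which relies on the HTML5 entity table rather than the Binder Document DTD; for the entities shown here, the two tables agree.

```python
import html

# The same title written with numeric and with mnemonic references.
numeric = "Jane&#8217;s AT&#38;T R&#233;sum&#233;"
mnemonic = "Jane&rsquo;s AT&amp;T R&eacute;sum&eacute;"

decoded_numeric = html.unescape(numeric)
decoded_mnemonic = html.unescape(mnemonic)

# Both forms decode to the same character data.
print(decoded_numeric == decoded_mnemonic)
```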
Future versions of this and related Binder Document specifications may support an expanded common set of “mnemonic” character entity references derived from other document markup vocabularies such as TEI and DocBook.
The required Publication Identifier
functional part, headed by the element
pubid, is used to assign a unique,
primary identifier to the Publication. Since the primary identifier
may be used for public identification purposes,
such as for external linking into the Publication, it
should be globally unique.
The element pubid
must contain one and only one
id element. The
id element is declared non-empty
and must contain character data
(#PCDATA), which gives the actual value of
the primary identifier — id
must not be empty.
Three attributes are required for the
id element:
type,
idns and
ver.
Since the primary Publication Identifier may be used with
Internationalized
Resource Identifiers (RFC 3987) as part of an absolute path
segment, the characters used should be
restricted, whenever possible, to the
ipchar production of
RFC 3987 (page 7).
(This is not always possible with certain formalized identifier
namespaces. An example is the
Digital
Object Identifier namespace, DOI, which includes the
“/” character in its identifier
— the “/” character is
not included in the ipchar production.)
When Binder Document authors devise their own unique
primary identifier namespace, it is recommended
that the namespace restrict the allowed primary identifier
characters to some subset of the
iunreserved production of
RFC 3987 (page
7).
The primary identifier must not contain any white space characters. Also, the primary identifier must not contain any percent encodings except when the primary identifier namespace requires it. (Note that applications which extract the primary identifier from a Binder Document and embed it into an IRI or URI may convert certain characters, as necessary, to their percent encoded equivalents.)
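The character rules above might be checked along the following lines. This is a minimal sketch, not a conformance test: it treats white space and emptiness as errors, and only warns when characters fall outside the ASCII portion of the iunreserved production (the non-ASCII ucschar range and the percent-encoding rule are deliberately not handled). The function name is hypothetical.

```python
import re

# ASCII part of the RFC 3987 iunreserved production:
# ALPHA / DIGIT / "-" / "." / "_" / "~" (ucschar is left out of this sketch).
ASCII_UNRESERVED = re.compile(r"[A-Za-z0-9._~-]+")


def check_primary_identifier(value: str) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for a candidate primary identifier."""
    errors: list[str] = []
    warnings: list[str] = []
    if not value:
        errors.append("identifier is empty")
    if any(ch.isspace() for ch in value):
        errors.append("identifier contains white space")
    if value and not ASCII_UNRESERVED.fullmatch(value):
        # Only a recommendation: some namespaces (DOI, for example) need "/".
        warnings.append("identifier uses characters outside the ASCII unreserved set")
    return errors, warnings


print(check_primary_identifier("6a2014b0-87a2-11da-a72b-0800200c9a66"))
print(check_primary_identifier("10.1000/182"))
```

Reporting the character-set finding as a warning rather than an error mirrors the text, which only recommends the restricted set and acknowledges that some identifier namespaces cannot follow it.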
type Attribute
The required
type attribute for the
id element
must be given the value of
primary.
This requirement is for future compatibility when the Publication Identifier functional part is expanded to include alternate, antecedent, and other types of Publication identifiers.
idns Attribute
The required
idns attribute for the
id element specifies the scheme and
namespace associated with the primary identifier. It
must take one of the following forms:
Uniform Resource Name (URN) Scheme
The current list of registered URN Scheme Namespaces is maintained by IANA, and governed by RFC 3406.
The syntax for the value of idns,
following RFC 2141,
is given as
<IDNS> ::= "urn:" <NID>
Where “urn:” is the
scheme, and <NID> is one of the
formally
registered URN Namespaces. Note that the leading
“urn:” and
the Namespace Identifier <NID>
are case-insensitive per
RFC 2141 —
however, for consistency, Binder Document authors
must use all lower case.
Not all of the registered URN Scheme Namespaces are suitable for use as a primary identifier. At least two of them, ISBN and UUID, are suitable, and these two are the most likely to be used of those currently registered.
Example use of the URN Scheme with a UUID Namespace for the
idns attribute value:
idns="urn:uuid"
“info” Scheme
The “info” URI
Scheme, managed by OCLC,
currently supports a number of identifier namespaces that may be
usable for Publication Identifiers. Notably, these include
Digital
Object Identifiers
(“doi”),
Fedora
Digital Objects and Disseminations
(“fedora”), and
Serial
Item and Contribution Identifiers
(“sici”).
The syntax for the value of idns
is given as
<IDNS> ::= "info:" <NID>
Where “info:” is the
scheme, and <NID> is one of the
formally registered “info”
Namespaces. Note that the leading
“info:” and the Namespace
Identifier <NID> are
case-insensitive — however, for consistency, Binder Document
authors must use all lower case. The
trailing “/” character,
specified by the “info” URI Scheme,
must not be included in the
idns attribute value.
Example use of the “info” Scheme with a DOI Namespace
for the idns attribute value:
idns="info:doi"
“x-other:” Scheme
The “x-other:” Scheme is intended for unregistered, private identifier namespaces, such as one designed by the Binder Document author. The syntax essentially follows that of the formal schemes already described (see example below.)
It is strongly recommended that, instead of using the “x-other:” Scheme, Binder Document authors generate a globally unique Publication Identifier using the freely usable UUID Namespace with the URN Scheme. A number of free UUID generators, as well as UUID registration services, are available.
Example use of the “x-other:” Scheme for the
idns attribute:
idns="x-other:my-own-pubid-namespace"
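The three forms of the idns value share one shape: a scheme, a colon, and a namespace identifier, all in lower case and with no trailing “/”. A minimal sketch of a syntax check follows. The allowed namespace characters (letters, digits, hyphens) are an assumption modeled on the RFC 2141 NID syntax, and the sketch does not verify that a namespace is actually registered.

```python
import re

# scheme ":" namespace, lower case only; the namespace character set is an
# assumption based on the RFC 2141 NID production, not taken from this spec.
IDNS_PATTERN = re.compile(r"(urn|info|x-other):([a-z0-9][a-z0-9-]*)")


def parse_idns(value: str) -> tuple[str, str]:
    """Split an idns value into (scheme, namespace) or raise ValueError."""
    match = IDNS_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not a recognized idns value: {value!r}")
    return match.group(1), match.group(2)


for example in ("urn:uuid", "info:doi", "x-other:my-own-pubid-namespace"):
    print(parse_idns(example))
```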
ver Attribute
The required
ver attribute for the
id element provides for versioning
of a given Publication Identifier. Its value is an integer, and for
its first use with a particular Publication Identifier
must be assigned the value of
“1”.
When small or minor changes are made to any of the resources of a Publication, but not enough to warrant a change in Publication Identifier, the publisher should increment the version number by one.
Changes which qualify as small or minor include any edits that
break few, if any, already-established links into the Publication
based on resource and element IDs. However, if the Publication is
significantly edited, such as substantive changes to its markup
structure or major content revisions, then the publisher
should consider assigning a new Publication
Identifier (with a reset of ver back
to “1”.)
Example markup of the Publication Identifier:
<pubid>
<id type="primary"
idns="urn:uuid"
ver="1">6a2014b0-87a2-11da-a72b-0800200c9a66</id>
</pubid>
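A reading application might extract and check the Publication Identifier roughly as follows. The sketch assumes the Binder elements are in no namespace (the real namespace is defined elsewhere in this specification), and it infers that ver is never less than 1 from the rule that the first version is “1” and later versions increment by one. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

PUBID = """<pubid>
  <id type="primary" idns="urn:uuid" ver="1">6a2014b0-87a2-11da-a72b-0800200c9a66</id>
</pubid>"""


def read_pubid(xml_text: str) -> dict:
    """Return the primary identifier, its namespace, and its version."""
    pubid = ET.fromstring(xml_text)
    ids = pubid.findall("id")
    if len(ids) != 1:
        raise ValueError("pubid must contain one and only one id element")
    id_el = ids[0]
    for name in ("type", "idns", "ver"):
        if name not in id_el.attrib:
            raise ValueError(f"missing required attribute: {name}")
    if id_el.get("type") != "primary":
        raise ValueError('type must have the value "primary"')
    ver = int(id_el.get("ver"))  # raises ValueError if not an integer
    if ver < 1:
        raise ValueError("ver starts at 1 and only increases")
    value = id_el.text or ""
    if not value:
        raise ValueError("id must not be empty")
    return {"idns": id_el.get("idns"), "ver": ver, "value": value}


print(read_pubid(PUBID))
```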
The required Resource Manifest functional part of the Binder Document provides a list of all the resources (excluding the Binder Document itself) which make up the Publication. It assigns one or more unique resource identifiers to each resource.
The Resource Manifest may declare resources that remain unused (“orphaned”) by the Publication. However, all declared resources must exist in the resource pool associated with the Publication.
The overarching framework which references the Binder Document Specification, such as the OpenReader Publication Framework Specification, defines what resources are allowed in the Resource Manifest, their MIME media types, the resource locator scheme, user agent error handling in the event of missing resources, and related requirements. In this section (and elsewhere in this specification), all the markup examples are based on the OpenReader Publication Framework Specification requirements.
The Resource Manifest is headed by the
required element
resources, which
must contain one or more empty
item elements. The order of
multiple item elements is not
significant.
Three attributes are required for the
item element:
resource,
media-type, and
resid.
One optional attribute,
lang, may be
applied to both the resources and
item elements. It is used to assign
the language(s) associated with the resources.
resource Attribute
The required
resource attribute for the
item element gives the path and name
for a Publication resource. The overarching publication framework,
which references this Binder Document specification, defines the
resource path/filename scheme, and allowed characters.
Example:
resource="documents/chapter1.xml"
media-type Attribute
The required
media-type attribute for the
item element gives the
MIME media
type of the resource.
Example:
media-type="application/x-orp-bcd1+xml"
resid Attribute
The required
resid attribute for the
item element assigns a resource
identifier for the resource. As noted in Section
4.4.5, a resource may be assigned multiple resource identifiers by
applying multiple instances of the
item element to that resource.
The datatype of the resid
attribute value is ID, and the allowed
characters are the same as those specified for
xml:id (see
Section 4.2.1.2). Each
resid
must be unique across all
ID values in the Binder Document.
Example:
resid="chap1"
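The uniqueness requirement could be checked with a sketch like the one below. It assumes that resid and xml:id are the only ID-typed attributes in the document, which is a simplification, and it uses un-namespaced element names for brevity.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SAMPLE = """<resources>
  <item resid="chap1" resource="chapter1.xml" media-type="application/x-orp-bcd1+xml"/>
  <item resid="chap1" resource="chapter2.xml" media-type="application/x-orp-bcd1+xml"/>
  <item resid="note1" xml:id="n1" resource="note1.xml" media-type="application/x-orp-bcd1+xml"/>
</resources>"""


def duplicate_ids(xml_text: str) -> list[str]:
    """Return every ID value that is used more than once in the document."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for element in root.iter():
        for attribute in ("resid", XML_ID):
            if attribute in element.attrib:
                counts[element.attrib[attribute]] += 1
    return sorted(value for value, n in counts.items() if n > 1)


print(duplicate_ids(SAMPLE))
```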
Two important benefits of assigning resource identifiers to resources are:
Provides greater flexibility to the publishing work flow by allowing resource paths and names to change without disrupting the Publication organization, and
Preserves the integrity of both inter-document links and external links into Publications, provided the link addressing is enabled with resource identifiers rather than resource names.
lang Attribute
The optional but
recommended
lang attribute
may be applied to both the
resources and
item elements.
Section 4.2.2 provides an overview of the
general purpose and requirements of this attribute.
The specific purpose for the lang
attribute in the Resource Manifest is to assign the primary and/or
significant language(s) to resources. Since this attribute may contain
more than one language assignment,
lang may
assign multiple languages to a single resource. In this instance,
the order of the languages in lang
is significant, from highest to lowest priority or significance, but
otherwise this specification does not distinguish the relative
significance between the multiple languages.
For example, a content document may contain portions of text whose languages differ from the primary language. In this case, the primary language is listed first in the attribute value, followed by the other languages.
When lang is applied to the
resources element, it globally
applies to all resources in the Resource Manifest except where
overridden for particular resources by applying the
lang attribute to the
associated item elements. The
lang attribute may be applied to
an item element even when
lang is not applied to the
resources element.
Although assigning languages to resources in the Resource Manifest is optional, Binder Document authors should do so. A future version of this specification may elevate this recommendation to a requirement.
Example of assigning languages to resources:
<resources lang="en-US">
<item resid="chap1"
resource="chapter1.xml"
media-type="application/x-orp-bcd1+xml"
lang="en-US de-DE"/>
<item resid="chap2"
resource="chapter2.xml"
media-type="application/x-orp-bcd1+xml"
lang="fr-FR"/>
<item resid="chap3"
resource="chapter3.xml"
media-type="application/x-orp-bcd1+xml"/>
...
</resources>
In the above example, all the resources in the Resource Manifest
are globally assigned English (US). For the content
document resource with resource identifier
“chap1”, the global
language has been overridden, with a primary language of English, but
with some text in German. For the content document resource
“chap2”, the global
language has been overridden with the primary language of French.
For the content document resource
“chap3”, since the
lang attribute does not appear,
its primary language is the globally assigned language, English.
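The override behavior described above can be sketched as a small resolver: an item's own lang value wins, otherwise the value on resources applies. The sketch uses un-namespaced element names and a hypothetical function name; within each list, the first language is the primary one.

```python
import xml.etree.ElementTree as ET

MANIFEST = """<resources lang="en-US">
  <item resid="chap1" resource="chapter1.xml"
        media-type="application/x-orp-bcd1+xml" lang="en-US de-DE"/>
  <item resid="chap2" resource="chapter2.xml"
        media-type="application/x-orp-bcd1+xml" lang="fr-FR"/>
  <item resid="chap3" resource="chapter3.xml"
        media-type="application/x-orp-bcd1+xml"/>
</resources>"""


def effective_languages(xml_text: str) -> dict[str, list[str]]:
    """Map each resid to its ordered language list, applying the global default."""
    resources = ET.fromstring(xml_text)
    default = resources.get("lang", "").split()
    result = {}
    for item in resources.findall("item"):
        own = item.get("lang")
        # An item-level lang replaces the global value entirely.
        result[item.get("resid")] = own.split() if own is not None else list(default)
    return result


print(effective_languages(MANIFEST))
```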
As noted in Section 4.4.3, a resource
may be assigned multiple resource identifiers by using multiple
item elements.
This is a powerful feature which allows Publication authors to efficiently reuse resources when they appear in different contexts in their Publications.
For example, the same content document may appear in various portions of the Publication, and each appearance may use a different set of style sheets.
This mechanism may simplify some publishing work flows where having multiple copies of the same resource, each given a different resource (or file) name, adds an unnecessary complication, particularly in the document editing process.
This mechanism also makes it feasible to link unambiguously to the correct spot within a Publication when a resource is used multiple times, provided the link addresses the resource identifier and not the resource name.
Example of applying two resource identifiers to one resource:
<item resid="intr1"
resource="document/intro.xml"
media-type="application/x-orp-bcd1+xml"/>
<item resid="intr2"
resource="document/intro.xml"
media-type="application/x-orp-bcd1+xml"/>
In the example above, the content document resource
“intro.xml” is assigned two
resource identifiers: intr1 and
intr2.
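To see which resources carry more than one resource identifier, a tool could index the manifest by resource name. The sketch below works on plain (resid, resource) pairs; the function name and the input shape are illustrative assumptions.

```python
from collections import defaultdict


def resids_by_resource(items: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group resource identifiers by the resource they point to."""
    index = defaultdict(list)
    for resid, resource in items:
        index[resource].append(resid)
    return dict(index)


items = [
    ("intr1", "document/intro.xml"),
    ("intr2", "document/intro.xml"),
    ("chap1", "document/chapter1.xml"),
]
index = resids_by_resource(items)
# Resources that are reachable through several identifiers.
shared = {resource: ids for resource, ids in index.items() if len(ids) > 1}
print(shared)
```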
A fairly complex and lengthy markup example of the Resource Manifest:
<resources lang="en-US">
<item resid="intr1"
resource="intro.xml"
media-type="application/x-orp-bcd1+xml"
comment="Special introduction written by Jane Doe"/>
<item resid="intr2"
resource="intro.xml"
media-type="application/x-orp-bcd1+xml"/>
<item resid="chap1"
resource="chapter1.xml"
media-type="application/x-orp-bcd1+xml"/>
<item resid="chap2"
resource="chapter2.xml"
media-type="application/x-orp-bcd1+xml"/>
<item resid="note1"
resource="note1.xml"
media-type="application/x-orp-bcd1+xml"/>
<item resid="note2"
resource="note2.xml"
media-type="application/x-orp-bcd1+xml"/>
<item resid="css-a"
resource="cssdir/a.css"
media-type="text/css"/>
<item resid="css-b"
resource="cssdir/b.css"
media-type="text/css"/>
<item resid="css-c"
resource="cssdir/c.css"
media-type="text/css"/>
<item resid="css-d"
resource="cssdir/d.css"
media-type="text/css"/>
<item resid="css-e"
resource="cssdir/e.css"
media-type="text/css"/>
<item resid="imag1"
resource="images/image1.png"
media-type="image/png"
lang="fr-FR"
xml:lang="fr-FR"
comment="La Tour Eiffel"/>
<item resid="imag2-jpeg"
resource="images/image2.jpg"
media-type="image/jpeg"/>
<item resid="imag2-png"
resource="images/image2.png"
media-type="image/png"/>
<item resid="imag2-tiff"
resource="images/image2.tiff"
media-type="image/tiff"/>
<item resid="imag3"
resource="images/image2-thumb.png"
media-type="image/png"/>
<item resid="tabl1"
resource="images/table1.png"
media-type="image/png"/>
</resources>
The optional, but recommended, Publication Metadata functional part of the Binder Document assigns metadata for the Publication. In this specification, only the Dublin Core Metadata Element Set, Version 1.1 is supported.
The Publication Metadata functional part is headed by the
optional
metadata element, which
may contain one
dublincore element.
The dublincore element
may contain any of the fifteen Dublin Core
Metadata Elements (prefixed with the Dublin Core namespace, see
Section 3.1, requirement 6), in any number
(including zero), and in any order. It is
required that the attribute
xml:lang (see
Section 4.2.1.1) always be applied to the
dublincore element to designate the
default language of the content in the Dublin Core metadata. This
allows the Dublin Core metadata to be extracted from, and used
independent of, the Binder Document.
The overall structure of the Publication Metadata functional part, with some example Dublin Core metadata element markup, is shown as follows:
<metadata>
<dublincore xml:lang="en-US">
<!-- The 15 Dublin Core Elements in any number and order. For example: -->
<dc:identifier idns="urn:uuid" ver="1">6a2014b0-87a2-11da-a72b-0800200c9a66</dc:identifier>
<dc:title>The Excellent Adventures of the Markup Kid</dc:title>
<dc:creator role="aut ill" file-as="Doe, John">John “Markup Kid” Doe</dc:creator>
<dc:language usage="primary">en-US</dc:language>
</dublincore>
</metadata>
The following table lists the fifteen supported Dublin Core elements (in alphabetical order), along with their definitions taken from the Dublin Core Metadata Element Set specification, and attributes supported in this specification beyond the Common Attribute Set.
| Dublin Core Element | Dublin Core Definition | Attributes (Besides Common) |
|---|---|---|
| dc:contributor | Entity responsible for making contributions to the content of the resource | |
| dc:coverage | Extent or scope of the content of the resource | |
| dc:creator | Entity primarily responsible for making the content of the resource | role, file-as |
| dc:date | Date associated with an event in the life cycle of the resource | |
| dc:description | Account of the content of the resource | |
| dc:format | Physical or digital manifestation of the resource | |
| dc:identifier | Unambiguous reference to the resource within a given context | idns, ver |
| dc:language | Language of the intellectual content of the resource | usage |
| dc:publisher | Entity responsible for making the resource available | |
| dc:relation | Reference to a related resource | |
| dc:rights | Information about rights held in and over the resource | |
| dc:source | Reference to a resource from which the present resource is derived | |
| dc:subject | Topic of the content of the resource | |
| dc:title | Name given to the resource | |
| dc:type | Nature or genre of the content of the resource | |
The fifteen Dublin Core metadata elements share the same content
model. Each Dublin Core element may contain
character data (#PCDATA) representing the
metadata information, and may also contain the
XHTML Namespace Inline Elements (see Section
4.2.5) for enhancing user agent presentation of the metadata
information — especially useful for metadata of a
“prose” nature, such as that designated by
dc:description and
dc:title.
Several of the more important Dublin Core metadata elements, and those with specialized attributes, are described in greater detail in sibling sections; for information on the other elements, refer to the Dublin Core Metadata Element Set, Version 1.1 specification and related documents. The OEBPS 1.2 Specification also discusses the use of the Dublin Core metadata elements (this specification adopts several of the OEBPS innovations in the use of Dublin Core metadata). Binder Document authors should use the Dublin Core metadata elements consistent with Dublin Core recommendations, except where they conflict with the requirements of this specification.
The Binder Document already requires three critical metadata items which are designated elsewhere in the Binder: the Publication Identifier, the Publication Title (required for each User Set), and the Publication primary and secondary languages. Although redundant, Binder Document authors should replicate this information in the Dublin Core Metadata — the details are discussed in the relevant sibling sections.
Binder Document authors should specify the
language, using xml:lang (see
Section 4.2.1.1), for the
metadata elements containing “prose”, such as
dc:description and
dc:title, when their language
differs from the default language assigned to
dublincore.
User agents should provide a mechanism allowing users, on demand, to access and review the Dublin Core metadata information.
dc:identifier
The dc:identifier element
designates an identifier for the Publication. It
should be included in the Dublin Core metadata.
The first instance of use of this element in the Dublin Core metadata
must replicate the primary identifier
assigned in the Publication Identifier
functional part.
The dc:identifier element
supports two identifier-related attributes
required by the
id element in the
Publication Identifier functional part:
idns and
ver.
The value of the required
idns attribute, which specifies the
identifier namespace, must follow the
requirements in Section 4.3.3. Likewise,
the value of the sometimes
required (as noted below)
ver attribute, which specifies the
versioning of the identifier, must follow the requirements in
Section 4.3.4.
In the first dc:identifier
instance, which replicates the primary Publication
Identifier, the idns and
ver attributes
must assume the values identical to
their counterparts for the id
element.
For the Publication Identifier markup example in
Section 4.3.5, the
dc:identifier equivalent is:
<dc:identifier idns="urn:uuid" ver="1">6a2014b0-87a2-11da-a72b-0800200c9a66</dc:identifier>
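The replication rule lends itself to a simple consistency check: the first dc:identifier must carry the same value, idns, and ver as the id element. The sketch below assumes a hypothetical root element named binder, un-namespaced Binder elements, and the standard Dublin Core 1.1 namespace URI; the actual names and namespaces are defined elsewhere in this specification.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core 1.1 element namespace (assumed to be the one in use).
DC = "http://purl.org/dc/elements/1.1/"

BINDER = """<binder xmlns:dc="http://purl.org/dc/elements/1.1/">
  <pubid>
    <id type="primary" idns="urn:uuid" ver="1">6a2014b0-87a2-11da-a72b-0800200c9a66</id>
  </pubid>
  <metadata>
    <dublincore xml:lang="en-US">
      <dc:identifier idns="urn:uuid" ver="1">6a2014b0-87a2-11da-a72b-0800200c9a66</dc:identifier>
    </dublincore>
  </metadata>
</binder>"""


def first_identifier_matches(xml_text: str) -> bool:
    """True when the first dc:identifier replicates the primary identifier."""
    root = ET.fromstring(xml_text)
    primary = root.find("pubid/id")
    # find() returns the first match in document order.
    first_dc = root.find(f"metadata/dublincore/{{{DC}}}identifier")
    if primary is None or first_dc is None:
        return False
    return (
        primary.text == first_dc.text
        and primary.get("idns") == first_dc.get("idns")
        and primary.get("ver") == first_dc.get("ver")
    )


print(first_identifier_matches(BINDER))
```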
dc:title
The dc:title element designates
the Publication title. It should replicate,
whenever possible, the title given in the
Publication Title functional part of the User
Set.
A Publication title may comprise multiple
lines, as the markup example in Section 4.7
illustrates. Each line in the Publication title will be expressed
with its own dc:title element, and
the order of appearance of dc:title
elements is significant (they are not required
to be adjacent to each other, although that is
recommended for readability.) For the same
primary language (as explained below), the
dc:title elements form the complete
Publication title by their order of appearance in the Dublin Core
metadata.
A complication arises when there are multiple User Sets and the Publication title varies between them (which is allowed.) The title variation may be in the same primary language, or in two or more primary languages.
For multiple User Sets with variations of the Publication title,
the Binder Document author should include in the Dublin Core metadata
an appropriate Publication title (which may comprise one or more
lines) for each primary language represented by the User Set titles.
When there are multiple primary languages, the Binder Document author
must apply the
xml:lang attribute (designating the
primary language) to all instances of
dc:title in the Dublin Core
metadata.
For the Publication Title markup example given in Section 4.7, the Dublin Core equivalent is:
<dc:title>Dr. Strangelove</dc:title>
...
<dc:title>Or: How I Learned to Stop Worrying and <xhtml:em>Love the Bomb</xhtml:em></dc:title>
In the above example, since both the
dc:title elements are of the same
language (inherited from a containing element where
xml:lang is assigned US English),
they form the two line Publication title:
Dr. Strangelove
Or: How I Learned to Stop Worrying and Love the Bomb
Note in the markup example above that even though appearance order is
significant, the two dc:title
elements need not be adjacent.
Markup example of dc:title with
multiple lines and in multiple languages mixed together:
<dc:title xml:lang="en-US">A Trip To Paris:</dc:title>
...
<dc:title xml:lang="fr-FR">Un Voyage Vers Paris:</dc:title>
...
<dc:title xml:lang="en-US">Visiting The Eiffel Tower</dc:title>
...
<dc:title xml:lang="fr-FR">Visiter La Tour D'Eiffel</dc:title>
In the above markup example, the Dublin Core metadata designates two language-specific Publication titles (each comprising two lines.) The English version of the Publication title is:
A Trip To Paris:
Visiting The Eiffel Tower
The French version of the Publication title is:
Un Voyage Vers Paris:
Visiter La Tour D'Eiffel
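One way a user agent might assemble these language-specific titles is to group the dc:title elements by language while preserving their order of appearance. The sketch below is illustrative only: it assumes the standard Dublin Core 1.1 namespace URI, falls back to the xml:lang on dublincore when a title carries none, and flattens any inline elements to plain text. The dc:creator line stands in for the elided markup between titles.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = """<dublincore xmlns:dc="http://purl.org/dc/elements/1.1/" xml:lang="en-US">
  <dc:title xml:lang="en-US">A Trip To Paris:</dc:title>
  <dc:creator>Jane Doe</dc:creator>
  <dc:title xml:lang="fr-FR">Un Voyage Vers Paris:</dc:title>
  <dc:title xml:lang="en-US">Visiting The Eiffel Tower</dc:title>
  <dc:title xml:lang="fr-FR">Visiter La Tour D'Eiffel</dc:title>
</dublincore>"""


def titles_by_language(xml_text: str) -> dict[str, list[str]]:
    """Collect title lines per language, in order of appearance."""
    dublincore = ET.fromstring(xml_text)
    default_lang = dublincore.get(XML_LANG)
    titles: dict[str, list[str]] = {}
    for element in dublincore.findall(f"{{{DC}}}title"):
        lang = element.get(XML_LANG, default_lang)
        # itertext() also picks up text inside nested inline elements.
        titles.setdefault(lang, []).append("".join(element.itertext()))
    return titles


for lang, lines in titles_by_language(SAMPLE).items():
    print(lang, lines)
```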
dc:language
The dc:language element
designates a significant language used in the Publication. For
multiple Publication languages, use one instance of
dc:language for each language.
Binder Document authors should designate at least
one primary language (see below) for the Publication.
Similar to the requirements for
xml:lang (see
Section 4.2.1.1), the character content of
dc:language must comply with
RFC 3066, or its
successor on the IETF Standards Track.
(Language Codes)
(Country Codes)
The required attribute
usage
must be assigned either the value of
primary or
secondary, which indicates whether the
language is considered a primary language of the Publication, or a
secondary language. (Note that there may be more
than one primary language of the Publication.)
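A processor could sort the declared Publication languages by their usage value, rejecting anything other than primary or secondary. This is a sketch under the same assumptions as before (standard Dublin Core 1.1 namespace URI, hypothetical function name); it does not validate the language tags themselves against RFC 3066.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

SAMPLE = """<dublincore xmlns:dc="http://purl.org/dc/elements/1.1/" xml:lang="en-US">
  <dc:language usage="primary">en-US</dc:language>
  <dc:language usage="primary">fr-FR</dc:language>
  <dc:language usage="secondary">de-DE</dc:language>
</dublincore>"""


def languages_by_usage(xml_text: str) -> dict[str, list[str]]:
    """Group dc:language values under their required usage attribute."""
    dublincore = ET.fromstring(xml_text)
    result: dict[str, list[str]] = {"primary": [], "secondary": []}
    for element in dublincore.findall(f"{{{DC}}}language"):
        usage = element.get("usage")
        if usage not in result:
            raise ValueError(f"usage must be primary or secondary, got {usage!r}")
        result[usage].append((element.text or "").strip())
    return result


print(languages_by_usage(SAMPLE))
```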
The designated primary languages of the Publication
should be all the unique languages specified
in the required
lang attribute for all instances of
the userset element (see
Section 4.6.3.)
The designated secondary languages of the Publication
should be all the unique languages (other than
those designated as primary), specified in the
lang attribute for the
resources and all the
item eleme