XSchema Specification, Version 1.0

16 September 1998 Draft

This specification draft contains the complete text of sections 1-3 of the XSchema specification as well as appendixes A, B, and D. Section 4, XSchema Transformation to XML 1.0 DTD, Section 5, Connecting XSchemas to XML Documents, and Appendix C, XSchema in XSchema, are still under development.

The sections included here are presented for final review; no changes will be made to these sections after this review, which ends 23 Sept 1998.

All comments and suggestions are welcome. Please send public comments to the XML-Dev mailing list [XML-DEV] (you must join the list to post); private comments may be sent to Simon St.Laurent or Ronald Bourret. Historical information regarding the development of XSchema is available at http://purl.oclc.org/NET/xschema.

1.0 Introduction

1.1 Status

1.2 Origin and Goals

1.3 Relation to Standards

1.4 Terminology

1.5 Authors

2.0 XSchema Syntax

2.1 The XSchema Element

2.2 Element Declarations

2.3 Content Model Declarations

2.3.1 Empty Content Model

2.3.2 Any Content Model

2.3.3 PCData Content Model

2.3.4 Reference Content Model

2.3.5 Mixed Content Model

2.3.6 Choice Content Model

2.3.7 Sequence Content Model

2.4 Attribute Declarations

2.4.1 Attribute Types

2.4.2 Attribute Defaults

2.4.3 Combinations of Types, Defaults, and Default Values

2.5 Notation Declarations

2.6 Unparsed Entity Declarations

2.7 XSchema Extensions

2.7.1 Documentation Extensions

2.7.2 Other Extensions

2.8 id Attributes

3.0 XSchema and Namespaces

3.1 The XSchema Namespace

3.2 Namespaces of Elements and Attributes Being Defined

Appendix A: References

Appendix B: XSchema DTD

Appendix D: Contributors

1.0 Introduction

In order for document processing to be reliable, it is necessary to be able to describe classes of documents and to verify individual documents' membership in these classes -- in other words, to be able to express constraints on documents and thus define 'document types'. XML inherits a mechanism for doing this from SGML: the Document Type Definition. XML DTDs can perform a subset of the functions of SGML DTDs.

DTDs have limited expressiveness and it is necessary to experiment with new ideas in schema design. These ideas include a syntax that is more like that of XML document content, certain kinds of extensibility and a cleaner separation between parsing and verifying. XSchema is an experimental schema language designed to provide a starting point for these experiments.

So that XSchemas will be immediately useful with existing software, the XSchema specification will describe a conversion from XSchema documents to DTDs. This initial version of the XSchema specification is deliberately simple, providing an initial base for implementations while introducing as few complicating factors as possible. Authors accustomed to DTD creation will find their toolset constricted; it is hoped that supporting software and tools available from other standards will make up for this reduced toolset.

1.1 Status

The XSchema specification is the product of discussions on the xml-dev mailing list [XML-DEV]. This document has no official status. The editors have no affiliation with the World Wide Web Consortium (W3C), the organization developing and maintaining the XML standard, nor any affiliation with any W3C member organizations. While it is hoped that this document may eventually be submitted to the W3C as a Note, it is not an official specification and should be considered experimental.

1.2 Origin and Goals

Proposals for describing SGML document type definitions using document syntax rather than the separate declaration syntax have been under development for a number of years, and used by several tools for documentation. The current proposal arose from a number of concerns surrounding XML's usability and consistency. Originally conceived of as a mapping of DTD syntax to document syntax, the project has developed into an effort focused on creating schemas describing element and attribute structures rather than preserving every function provided by XML 1.0 DTDs.

The list of goals developed by the xml-dev discussion follows:

  1. XSchema documents shall use XML document syntax, using element nesting and attributes to describe all constraints that may be verified by a processor using XSchema.
  2. XSchema shall define a transformation from XSchema documents to DTDs.
  3. XSchema documents shall be capable of representing the normalized element and attribute structures defined in XML 1.0 DTDs, and provide namespace support.
  4. XSchema documents shall be parseable, manageable, and manipulable using the same tools used to parse, manage, and manipulate XML documents.
  5. XSchema documents shall be easy to create, read, and modify, and shall provide authoring support for XML documents.
  6. XSchema documents shall be easy to use in combination with a parser to provide structural validation of documents.
  7. XSchema shall include an XSchema document and an XML 1.0 DTD defining the structure of XSchema documents.
  8. XSchema shall suggest mechanisms for applying XSchema documents to documents.
  9. XSchema shall include mechanisms for extending the information included in XSchema documents to support metadata.
  10. The XSchema specification shall be readable, clear, and rigorous, using terminology and nomenclature as close to the XML 1.0 specification as possible.
  11. The XSchema specification will comply with and be consistent with W3C recommendations.
  12. XSchema documents shall provide constructs for human- and machine-readable documentation.

1.3 Relation to Standards

XSchemas use XML 1.0 document instance syntax and may be applied to XML 1.0 [XML] documents. This specification refers to several IETF standards, notably Multipurpose Internet Mail Extensions (MIME) ([RFC 2046] and [RFC 2048]) and XML Media Types [RFC 2376].

Namespace usage in XSchema is based on the 2 August, 1998 "Namespaces in XML" Working Draft [Namespaces]. Because this draft is still subject to change, all namespace attributes (the xmlns and xmlns:XSC attributes of the XSchema element and all ns and prefix attributes) and processing (Section 3, "XSchema and Namespaces") are subject to change, even after the rest of the XSchema specification is finalized.

It is hoped that future versions of XSchema will use [XLink] and [XPointer] to implement schema reuse.

XSchema has been influenced by the XML-Data proposal [XML-Data]. It is hoped that XSchemas and RDF Schemas may be mapped to each other.

1.4 Terminology

The requirement levels used throughout this document reflect the approach of [RFC 2119], though keywords (like may and must) are not capitalized. Other terms used are defined in the XML 1.0 Recommendation [XML].

1.5 Authors

The XSchema specification is the result of contributions from a large number of people on the XML-Dev list, coordinated by the following, smaller group of authors. For a list of contributors, see Appendix D, "Contributors".
Simon St.Laurent (simonstl@simonstl.com)
Ronald Bourret (rbourret@dvs1.informatik.tu-darmstadt.de)
John Cowan (cowan@locke.ccil.org)

2.0 XSchema Syntax

This section describes XSchema document syntax. In version 1.0, the XSchema document is an XML document containing a single XSchema element in which information describing the schema is nested. The XSchema element must be preceded by an XML declaration and may be preceded by other declarations, comments, and processing instructions. In future versions of XSchema, XSchema elements may be embedded in instance documents.

2.1 The XSchema Element

The XSchema element is the root element for all XSchema documents. The declaration for the XSchema element is:

<!ELEMENT XSchema (Doc?, More?, (ElementDecl | Model | AttDef | AttGroup | Notation | UnparsedEntity | XSchema)*)>

<!ATTLIST XSchema
    xmlns         CDATA   #FIXED   "http://www.purl.org/NET/XSchema/v1"
    xmlns:XSC     CDATA   #FIXED   "http://www.purl.org/NET/XSchema/v1"
    prefix        NMTOKEN #IMPLIED
    ns            CDATA   #IMPLIED
    Version       CDATA   #FIXED   "1.0"
    MimeType      CDATA            "application/xml"
    FileExtension CDATA            "xml"
    id            ID      #IMPLIED>

The XSchema element contains other elements describing the XSchema and building a schema. These elements are described in later sections of this specification. The XSchema element may also contain other XSchema elements nested inside of it. This nesting of XSchema elements improves reusability of XSchemas by allowing the combination of multiple XSchemas inside of a single XSchema framework. It also allows finer-grained control over documentation for subsections of an XSchema.

The XSchema element's attributes include information about the namespace used by XSchema, the version of the XSchema specification used, and information about the type of documents described by the XSchema.

The XSchema namespace is fixed with the xmlns attribute to correspond to [Namespaces]. The xmlns:XSC attribute, also fixed, allows XSchema declarations to be prefixed with XSC for situations where they need to redefine the default namespace (as is the case with XSC:Doc, and may be the case with XSC:More; see Section 2.7, "XSchema Extensions", for more details). The prefix attribute identifies the prefix that will be applied to all elements and attributes defined within this XSchema element during conversion to DTDs, unless overridden in the element or attribute declarations themselves. The ns attribute identifies the URI of the namespace containing the elements and attributes being defined, unless overridden in the element or attribute declarations themselves. Namespace processing is covered further in Section 3.0, "XSchema and Namespaces".

Information about the XSchema specification version used to create this XSchema, contained in the Version attribute, is critical to proper handling of documents should the specification be updated in the future. This specification is identified as version 1.0. Future major and minor versions of the XSchema specification should identify themselves differently. No provision is made at this time for nesting XSchemas using different versions of the specification under a parent XSchema element.

The MimeType and FileExtension attributes are used to provide a suggested MIME (Multipurpose Internet Mail Extensions) Content-type and file extension for documents created using a particular XSchema. Applications may use this information to identify XML document types. A document library that generates XML documents dynamically could assign file extensions and MIME types based on the XSchema used.

Applications using this information should use the values stored in the first XSchema encountered during processing. For instance, if an XSchema includes another nested XSchema, the values for the MimeType and FileExtension attributes of the root XSchema should be used.
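
The rule above can be illustrated with a short, non-normative sketch in Python. It assumes the XSchema document omits the fixed xmlns attributes (so that element names are seen unqualified) and that a non-validating parser is used, which means the defaults from the attribute list must be applied by hand. The function names and the sample MIME type are hypothetical and are not part of this specification.

```python
import xml.etree.ElementTree as ET

# Defaults copied from the ATTLIST for XSchema; a non-validating parser
# does not supply them, so this sketch falls back to them explicitly.
DEFAULT_MIME_TYPE = "application/xml"
DEFAULT_FILE_EXTENSION = "xml"


def local_name(tag):
    """Strip a '{namespace}' qualifier that ElementTree may prepend."""
    return tag.rsplit("}", 1)[-1]


def suggested_media_info(xschema_text):
    """Return (MimeType, FileExtension) taken from the root XSchema element."""
    root = ET.fromstring(xschema_text)
    if local_name(root.tag) != "XSchema":
        raise ValueError("root element is not XSchema")
    # Only the root is consulted; nested XSchema elements are ignored.
    return (
        root.get("MimeType", DEFAULT_MIME_TYPE),
        root.get("FileExtension", DEFAULT_FILE_EXTENSION),
    )


# Hypothetical sample: the nested XSchema carries values that must not win.
SAMPLE = """<XSchema MimeType="application/x-species" FileExtension="spc">
  <XSchema MimeType="application/x-other" FileExtension="oth"/>
</XSchema>"""

print(suggested_media_info(SAMPLE))
```

Reading only the root element is one way to honor "the first XSchema encountered"; an application that processes nested XSchemas separately would need its own policy.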

By default, most XML documents are assumed to have a MIME type of application/xml, as described in [RFC 2376]. Developers who need different MIME types for documents created using particular XSchemas may register other MIME types with the IETF, as described in [RFC 2048], or use the 'x-' prefix syntax for subtypes, as described in [RFC 2046].

For information about the id attribute, see Section 2.8, "id Attributes".

2.2 Element Declarations

Element declarations in XSchemas are made using the ElementDecl element and its contents:

<!ELEMENT ElementDecl (Doc?, More?, Model, AttGroup?)>
<!-- Name is the element name -->
<!ATTLIST ElementDecl
       Name   NMTOKEN #REQUIRED
       id     ID      #IMPLIED
       prefix NMTOKEN #IMPLIED
       ns     CDATA   #IMPLIED
       Root   (Recommended | Possible | Unlikely) "Possible">

The Name attribute identifies the name of the element, and is required. An element declaration would look like:

<ElementDecl Name="Species">
       ...additionalElementInformation...
</ElementDecl>

This declaration would declare an element named "Species", which would appear in an instance as:

<Species>...content...</Species>

The Name attribute must be unique within the set of elements in the defined namespace. It provides the name of the element as declared here and is also used by other elements to refer to this element in their content model declarations. The Name attribute must match the NCName production in [Namespaces]. (Effectively, this requires element names to begin with a letter or underscore and not include a colon.)

The prefix attribute identifies the prefix that will be applied to this element and its attributes during conversion to DTDs, unless overridden in the attribute declaration itself. The ns attribute identifies the URI which functions as the namespace name for this element and its attributes. Namespace processing is covered further in Section 3.0, "XSchema and Namespaces".

The Root attribute provides authoring tools with a guide for which elements are likely root elements for documents. This is intended to simplify the choices presented to authors during document composition. Composition tools could use this to build a menu of likely starting points for a document. The Root attribute is purely a suggestion and does not require any action on the part of the processor.

For information about the id attribute, see Section 2.8, "id Attributes".

Note that an element must declare a content model of some type, using the Model element, even if that content model is empty. Documentation (in the Doc element), non-XSchema extensions (in the More element) and attribute declarations (using the AttGroup element) are optional.

Documentation about the element, additional extensions, content-model information, and attribute information are stored as sub-elements of the ElementDecl element. Documentation is covered in Section 2.7.1, Documentation Extensions. Additional extensions are covered in Section 2.7.2, Other Extensions. Content Models are covered in Section 2.3, Content Model Declarations, and attributes are covered in Section 2.4, Attribute Declarations.

2.3 Content Model Declarations

Content model declarations are made within the Model sub-element of the declaration for the element to which they apply.

Model elements may appear inside XSchema elements for reusability, documentation, and reference, but will need to be linked to particular element declarations through mechanisms not yet defined (most likely XLink). All content model declarations have an optional id attribute; for more information, see Section 2.8, "id Attributes".

The Model element holds the content model for an element.

<!ELEMENT Model (Doc?, More?, (Ref | Choice | Seq | Empty | Any | PCData | Mixed | Model))>
<!ATTLIST Model
       id ID #IMPLIED>

Model elements are pure containers, and act much like parentheses in XML 1.0 DTD declarations. A Model element nested inside a Choice or Seq element can only contain Model, Doc, More, Ref, Choice, and Seq elements.

2.3.1 Empty Content Model

The simplest content model is empty, which indicates that the parent element has no sub-elements and no character data content. The Empty element indicates that an element is empty.

<!ELEMENT Empty EMPTY>
<!ATTLIST Empty
    id ID #IMPLIED>

For example, to declare the Species element shown in the previous section empty, use the following XSchema declaration:

<ElementDecl Name="Species">
  <Model>
    <Empty/>
  </Model>
</ElementDecl>

This would not allow the Species element to contain any text or sub-elements.

2.3.2 Any Content Model

The Any content model, which allows the element to contain parsed character data or any other elements as content, is equally simple:

<!ELEMENT Any EMPTY>
<!ATTLIST Any
    id ID #IMPLIED>

Using the Any content model is much like using the Empty content model. To declare that the Species element has a content model of Any, use the following declaration:

<ElementDecl Name="Species">
  <Model>
    <Any/>
  </Model>
</ElementDecl>

This allows the Species element to contain text and any sub-elements an author desires.

2.3.3 PCData Content Model

The PCData content model, which allows the element to contain only parsed character data, is also represented by a single empty element.

<!ELEMENT PCData EMPTY>
<!ATTLIST PCData
    id ID #IMPLIED>

Using the PCData content model is much like using the Empty and Any content models. For example, to assign the Species element a content model of PCData, use the following declaration:

<ElementDecl Name="Species">
  <Model>
    <PCData/>
  </Model>
</ElementDecl>

This allows the Species element to contain text, but no sub-elements.

2.3.4 Reference Content Model

The Reference content model allows an element to specify other elements which it may contain, as well as their quantity. Ref elements identify the element to be contained, as well as the frequency with which it must appear:

<!ELEMENT Ref EMPTY>
<!-- Element references the name in an ElementDecl element -->
<!ATTLIST Ref
    id        ID      #IMPLIED
    Element   NMTOKEN #REQUIRED
    Frequency (Required | Optional | ZeroOrMore | OneOrMore) 'Required'>

The value of the Element attribute must equal the value of the Name attribute of an ElementDecl element elsewhere in the XSchema document. A Model element may directly contain at most one Ref element. To define content models that permit or require the use of more elements, the Any, Mixed, Choice, or Sequence content models should be used as appropriate.

The Frequency attribute controls the number of referenced elements that may occur.

To declare that the Species element must contain exactly one CommonName element, and nothing else, use the following declaration:

<ElementDecl Name="Species">
  <Model>
    <Ref Element="CommonName" Frequency="Required"/>
  </Model>
</ElementDecl>

This requires the Species element to contain a single CommonName element. To make the CommonName element optional (though it may still appear only once), set the Frequency attribute to 'Optional':

<ElementDecl Name="Species">
  <Model>
    <Ref Element="CommonName" Frequency="Optional"/>
  </Model>
</ElementDecl>

Optional is the equivalent of the ? occurrence indicator in XML 1.0 DTDs.

To require the Species element to contain at least one but possibly multiple CommonName elements, set the Frequency attribute to 'OneOrMore':

<ElementDecl Name="Species">
  <Model>
    <Ref Element="CommonName" Frequency="OneOrMore"/>
  </Model>
</ElementDecl>

OneOrMore is the equivalent of the + occurrence indicator in XML 1.0 DTDs.

Finally, to allow the Species element to contain any number (including zero) of CommonName elements, set the Frequency attribute to 'ZeroOrMore':

<ElementDecl Name="Species">
  <Model>
    <Ref Element="CommonName" Frequency="ZeroOrMore"/>
  </Model>
</ElementDecl>

ZeroOrMore is the equivalent of the * occurrence indicator in XML 1.0 DTDs.
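
The four Frequency values can be read as lower and upper bounds on the number of occurrences. The following Python fragment is an illustrative sketch of that reading only; the table and function names are hypothetical and are not defined by this specification.

```python
# Bounds implied by each Frequency value: (minimum, maximum).
# A maximum of None means "no upper limit".
FREQUENCY_BOUNDS = {
    "Required": (1, 1),
    "Optional": (0, 1),
    "OneOrMore": (1, None),
    "ZeroOrMore": (0, None),
}


def frequency_allows(frequency, count):
    """Return True if `count` occurrences satisfy the given Frequency value."""
    minimum, maximum = FREQUENCY_BOUNDS[frequency]
    if count < minimum:
        return False
    return maximum is None or count <= maximum


# Show which of 0, 1, and 2 occurrences each Frequency value accepts.
for name in FREQUENCY_BOUNDS:
    print(name, [frequency_allows(name, n) for n in range(3)])
```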

2.3.5 Mixed Content Model

The Mixed content model allows the unordered use of different element types and character data. Content within an element that uses a mixed declaration must be PCData or one or more of the elements referenced by Ref elements nested within the Mixed declaration. Only Ref elements can be nested under a Mixed element; the PCData content is inherent in the Mixed content model.

<!ELEMENT Mixed (Ref+)>
<!ATTLIST Mixed
    id        ID           #IMPLIED
    Frequency (ZeroOrMore) #FIXED   "ZeroOrMore">

To declare that the Species element may contain a mix of PCData, CommonName elements, LatinName elements, and PreferredFood elements in any order, use the following declaration:

<ElementDecl Name="Species">
  <Model>
    <Mixed>
         <Ref Element="CommonName"/>
         <Ref Element="LatinName"/>
         <Ref Element="PreferredFood"/>
    </Mixed>
  </Model>
</ElementDecl>

The XSchema processor should ignore any Frequency attributes in Ref elements that appear as subelements of the Mixed element.

2.3.6 Choice Content Model

The Choice content model allows for either-or inclusions of elements and groups of elements. The Choice content model represents groups of element content possibilities and must contain at least two sub-elements. Situations where only one element is needed should use the Ref content model instead of Choice. The Choice element may indicate a frequency, allowing the content model defined by the Choice model to appear exactly once, zero or one time, one or more times, or zero or more times.

<!-- A Choice must have two or more children -->
<!ELEMENT Choice ((Seq | Ref | Model), (Seq | Ref | Model)+)>
<!ATTLIST Choice
    id ID #IMPLIED
    Frequency (Required | Optional | ZeroOrMore | OneOrMore) 'Required'>

The simplest Choice element will contain two Ref elements and a frequency attribute. By default, the Choice element's content model is required to appear once.

To declare that a Species element may contain either a common name or a Latin name, but not both, use the following declaration:

<ElementDecl Name="Species">
  <Model>
    <Choice Frequency="Required">
         <Ref Element="CommonName"/>
         <Ref Element="LatinName"/>
    </Choice>
  </Model>
</ElementDecl>

The Ref elements in a Choice element may also specify the frequency with which they appear, as may the Seq elements described in Section 2.3.7, "Sequence Content Model". The Choice element is the equivalent of the choice group (element | element) in XML 1.0 DTDs. The ordering of the sub-elements within a Choice element has no effect.

2.3.7 Sequence Content Model

The Sequence content model allows for the sequential appearance of sub-elements. Elements, if they are required to appear, must appear in the order of the Choice and Ref sub-elements in the Seq element. The Seq element may also indicate a frequency, allowing the content model defined by the Seq model to appear exactly once, zero or one time, one or more times, or zero or more times.

<!-- A Seq must have two or more children -->
<!ELEMENT Seq ((Choice | Ref | Model),(Choice | Ref | Model)+)>
<!ATTLIST Seq
    id        ID #IMPLIED
    Frequency (Required | Optional | ZeroOrMore | OneOrMore) 'Required'>

The simplest Seq element will contain two Ref elements in the order in which they should appear and a frequency attribute. By default, the Seq element's content model is required to appear once.

To declare that the Species element requires a common name and a Latin name, in that order, use the following declaration:

<ElementDecl Name="Species">
  <Model>
    <Seq Frequency="Required">
         <Ref Element="CommonName"/>
         <Ref Element="LatinName"/>
    </Seq>
  </Model>
</ElementDecl>

The Ref elements in a Seq element may also specify the frequency with which they appear, as may the Choice elements. The Seq element is the equivalent of the sequence group (element, element) in XML 1.0 DTDs.
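
Because the content models in Section 2.3 are each described by their XML 1.0 DTD equivalents, a Model element can be rendered as a DTD content specification mechanically. The Python sketch below shows one possible rendering. It is not the transformation of Section 4, which is not part of this draft; it ignores prefix and ns attributes, assumes the sample omits namespace declarations, and uses hypothetical function names throughout.

```python
import xml.etree.ElementTree as ET

# Occurrence indicators named in the text; Required maps to no indicator.
INDICATOR = {"Required": "", "Optional": "?", "OneOrMore": "+", "ZeroOrMore": "*"}


def content_child(model):
    """Return the single content-model child of a Model, skipping Doc and More."""
    children = [child for child in model if child.tag not in ("Doc", "More")]
    if len(children) != 1:
        raise ValueError("a Model must hold exactly one content model")
    return children[0]


def particle(node):
    """Render a Ref, Choice, Seq, or nested Model as a DTD content particle."""
    if node.tag == "Model":
        return particle(content_child(node))
    frequency = INDICATOR[node.get("Frequency", "Required")]
    if node.tag == "Ref":
        return node.get("Element") + frequency
    if node.tag in ("Choice", "Seq"):
        separator = " | " if node.tag == "Choice" else ", "
        return "(" + separator.join(particle(c) for c in node) + ")" + frequency
    raise ValueError("unexpected element in content model: " + node.tag)


def content_spec(model):
    """Render a top-level Model element as a DTD content specification."""
    child = content_child(model)
    if child.tag == "Model":
        return content_spec(child)
    if child.tag == "Empty":
        return "EMPTY"
    if child.tag == "Any":
        return "ANY"
    if child.tag == "PCData":
        return "(#PCDATA)"
    if child.tag == "Mixed":
        # Frequency attributes on Ref children of Mixed are ignored.
        names = [ref.get("Element") for ref in child.findall("Ref")]
        return "(#PCDATA | " + " | ".join(names) + ")*"
    rendered = particle(child)
    # DTD element content must be parenthesized, so wrap a lone Ref.
    return "(" + rendered + ")" if child.tag == "Ref" else rendered


SAMPLE = """<ElementDecl Name="Species">
  <Model>
    <Seq>
      <Ref Element="CommonName"/>
      <Choice Frequency="Optional">
        <Ref Element="LatinName"/>
        <Ref Element="PreferredFood" Frequency="OneOrMore"/>
      </Choice>
    </Seq>
  </Model>
</ElementDecl>"""

decl = ET.fromstring(SAMPLE)
print("<!ELEMENT %s %s>" % (decl.get("Name"), content_spec(decl.find("Model"))))
# prints: <!ELEMENT Species (CommonName, (LatinName | PreferredFood+)?)>
```

Treating a nested Model as transparent follows the description of Model elements as pure containers; a converter could equally emit an extra pair of parentheses for each one.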

2.4 Attribute Declarations

Attribute declarations are made with AttDef elements nested inside of AttGroup container elements. AttGroup elements may be nested inside of ElementDecl element declarations or XSchema elements. The type of an attribute is defined with an attribute, as is a declaration of whether or not it is required and a possible default value. Values for enumerated types are provided with subelements.

<!ELEMENT AttGroup (Doc?, More?, (AttDef | AttGroup)*)>
<!ATTLIST AttGroup
    Element NMTOKEN #IMPLIED
    id      ID      #IMPLIED
    prefix  NMTOKEN #IMPLIED
    ns      CDATA   #IMPLIED>

<!ELEMENT AttDef (Doc?, More?, EnumerationValue*)>
<!ATTLIST AttDef
    Element  NMTOKEN      #IMPLIED
    Name     NMTOKEN      #REQUIRED
    Type     (CData    |
               ID       |
               IDRef    |
               IDRefs   |
               Entity   |
               Entities |
               Nmtoken  |
               Nmtokens |
               Notation |
               Enumerated) "CData"
    Required (Yes | No)   "No"
    AttValue CDATA        #IMPLIED
    id       ID           #IMPLIED
    prefix   NMTOKEN      #IMPLIED
    ns       CDATA        #IMPLIED>

<!ELEMENT EnumerationValue (Doc?, More?)>
<!ATTLIST EnumerationValue
    Value CDATA #REQUIRED>

Attribute declarations for an element can be nested inside the declaration (ElementDecl element) of that element, nested directly inside an XSchema element, or both. When an AttGroup or AttDef element is nested inside an ElementDecl or another AttGroup element, the outermost element specification (the Name attribute in the ElementDecl element or the Element attribute in the AttGroup element) is dominant; all Element attributes inside this specification are ignored. When an AttGroup or AttDef element is nested directly inside an XSchema element, the Element attribute may contain a name token that matches the Name attribute of the element to which the attribute applies; if the Element attribute is missing, the AttGroup or AttDef element may only be used by reference.

AttGroup elements are container elements and may be nested inside one another. All of their attributes, except for id, apply to the child AttGroup and AttDef elements. Except for the Element attribute, these attributes may be overridden by the attributes of those child elements.

The Name attribute of the AttDef element provides the name by which the attribute will be identified. This attribute must match the NCName production in [Namespaces], which requires that the name begins with a letter or underscore and does not include a colon. Attribute names that use the same namespace as the element to which they apply must be unique within that element. Attribute names that use a different namespace ("global" attributes) must be unique within the Global Attribute Partition of that namespace.

A nested declaration is shown below.

<ElementDecl Name="Species">
  ...additionalElementInformation...
  <AttGroup>
    <AttDef Name="status" ...additionalAttributeInformation.../>
  </AttGroup>
</ElementDecl>

This declares an element with the name Species that has an attribute named status. If the status attribute were declared outside of the Species element declaration, the declarations would appear as shown below.

<ElementDecl Name="Species">
    ...additionalElementInformation...
</ElementDecl>
...
<AttDef Name="status" Element="Species" ...additionalAttributeInformation.../>
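
The rule that the outermost element specification is dominant can be sketched as a tree walk that carries the governing element name downward. The Python fragment below is illustrative only: the function name is hypothetical, the sample (including the Genus element and the "Ignored" values) is invented for the purpose, and namespace declarations are assumed to be omitted so that element names are seen unqualified.

```python
import xml.etree.ElementTree as ET


def collect_attributes(node, owner=None, found=None):
    """Pair each AttDef with the name of the element it applies to.

    `owner` is the outermost element specification seen so far. Once it is
    set, Element attributes further down the tree are ignored. An owner of
    None marks an AttDef that may only be used by reference.
    """
    if found is None:
        found = []
    for child in node:
        if child.tag == "ElementDecl":
            collect_attributes(child, child.get("Name"), found)
        elif child.tag == "AttGroup":
            collect_attributes(child, owner or child.get("Element"), found)
        elif child.tag == "AttDef":
            found.append((owner or child.get("Element"), child.get("Name")))
        elif child.tag == "XSchema":
            # A nested XSchema starts again with no governing element.
            collect_attributes(child, None, found)
    return found


SAMPLE = """<XSchema>
  <ElementDecl Name="Species">
    <Model><Empty/></Model>
    <AttGroup>
      <AttDef Name="status" Element="Ignored"/>
    </AttGroup>
  </ElementDecl>
  <ElementDecl Name="Genus">
    <Model><Empty/></Model>
  </ElementDecl>
  <AttGroup Element="Genus">
    <AttDef Name="authority"/>
    <AttGroup Element="Ignored">
      <AttDef Name="year"/>
    </AttGroup>
  </AttGroup>
  <AttDef Name="Latin" Element="Species"/>
  <AttDef Name="shared"/>
</XSchema>"""

for element, attribute in collect_attributes(ET.fromstring(SAMPLE)):
    print(element, attribute)
```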

Merely naming an attribute may be adequate. Attribute declarations may identify types and provide information about whether the attribute is required. By default, attributes will be assumed to contain character data (CData), not be required, and have no default value. This information is declared using additional attributes. The simplest attribute declaration possible identifies an attribute as containing character data (CData) and allows the attribute to be optional, as shown below.

<AttDef Name="sampleAttribute"/>

The prefix attribute identifies the prefix that will be applied to the attribute during conversion to DTDs. The ns attribute identifies the URI which functions as the namespace name for the attribute. Namespace processing is covered further in Section 3.0, "XSchema and Namespaces".

For information about the id attribute, see Section 2.8, "id Attributes".

2.4.1 Attribute Types

XSchema 1.0 provides equivalents for all of the XML 1.0 DTD attribute types. All of them are declared using attribute values within the AttDef element.

The CData attribute type is one of the most common, permitting an attribute to contain character data as defined by the XML 1.0 specification. If the Species element were to contain an attribute providing the Latin name of the species, the declaration could look like the following. (The Type attribute could actually be omitted in this case, as CData is the default type.)

<ElementDecl Name="Species">
...additionalElementInformation...
    <AttGroup>
        <AttDef Name="Latin" Type="CData"/>
    </AttGroup>
</ElementDecl>

This attribute would then be available for use in instances of the Species element:

<Species Latin="Passerina cyanea">...additionalContent...</Species>

The ID attribute type is used to uniquely identify elements in a document for application processing. IDRef and IDRefs attribute types are used to refer to a single ID value in the same document or multiple ID values in the same document, separated by whitespace, respectively. These attribute declarations must be used with the same constraints as apply to ID, IDREF, and IDREFS attribute types in XML 1.0.

The Entity and Entities attribute types identify the names of unparsed entities. The use of these attribute types must be made with the same constraints as apply to the ENTITY and ENTITIES attribute types in XML 1.0. The name of an unparsed entity identified by an Entity or Entities attribute must match the Name attribute of an UnparsedEntity element elsewhere in the XSchema document.

The Nmtoken and Nmtokens attribute types are used to declare attributes that must contain information conforming to the Nmtoken and Nmtokens productions in XML 1.0.

The Notation and Enumerated attribute types are more complex, requiring EnumerationValue subelements to identify their possible content. These two declarations use similar syntax, but the allowed values of Notation declarations must match the Notations declared elsewhere in the XSchema document.

If the status attribute of the Species element were to allow the values of extinct, endangered, protected, and non-threatened, an appropriate enumerated type declaration would look like:

<ElementDecl Name="Species">
...additionalElementInformation...
    <AttGroup>
        <AttDef Name="status" Type="Enumerated">
            <EnumerationValue Value="extinct"/>
            <EnumerationValue Value="endangered"/>
            <EnumerationValue Value="protected"/>
            <EnumerationValue Value="non-threatened"/>
        </AttDef>
    </AttGroup>
</ElementDecl>

A Species element created conforming to this declaration might look like:

<Species status="extinct">...additionalContentAboutDodos...</Species>
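
A processor checking an instance against an Enumerated (or Notation) declaration needs only the list of EnumerationValue subelements. The following Python fragment is a minimal sketch of such a check, with hypothetical function names; it does not verify that Notation values match declared Notation elements.

```python
import xml.etree.ElementTree as ET

ATT_DEF = ET.fromstring("""<AttDef Name="status" Type="Enumerated">
  <EnumerationValue Value="extinct"/>
  <EnumerationValue Value="endangered"/>
  <EnumerationValue Value="protected"/>
  <EnumerationValue Value="non-threatened"/>
</AttDef>""")


def allowed_values(att_def):
    """List the values named by the EnumerationValue subelements."""
    return [value.get("Value") for value in att_def.findall("EnumerationValue")]


def value_is_allowed(att_def, value):
    """Check an instance value against an Enumerated or Notation AttDef."""
    # CData is the default Type when the attribute is omitted.
    if att_def.get("Type", "CData") not in ("Enumerated", "Notation"):
        raise ValueError("AttDef does not declare an enumerated type")
    return value in allowed_values(att_def)


print(value_is_allowed(ATT_DEF, "extinct"))
print(value_is_allowed(ATT_DEF, "thriving"))
```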

2.4.2 Attribute Defaults

XSchema requires attribute declarations to provide information about the default value of a given attribute. XSchema provides for the four cases supported by XML 1.0: #REQUIRED, #IMPLIED, #FIXED AttValue, and AttValue, though they are expressed as choices between required and not required with an optional default value. There may be only one default value declaration per attribute.

Required attributes (identified in XML 1.0 by #REQUIRED) are identified by assigning the value "Yes" to the Required attribute of an AttDef element and not assigning a value to the AttValue attribute. For instance, if the Latin attribute described above was required by the Species element, the AttDef element would contain a Required attribute with a value of "Yes":

<ElementDecl Name="Species">
...additionalElementInformation...
    <AttGroup>
        <AttDef Name="Latin" Required="Yes"/>
    </AttGroup>
</ElementDecl>

Optional attributes (identified in XML 1.0 by #IMPLIED) are identified by assigning the value "No" to the Required attribute of an AttDef element and not assigning a value to the AttValue attribute. Implied indicates that there is no default value provided, and also that no value is required. If the Latin attribute is optional, the AttDef element would contain a "No" value for the Required attribute. (Note that this is the default status and the Required declaration does not need to be made explicitly.)

<ElementDecl Name="Species">
...additionalElementInformation...
    <AttGroup>
        <AttDef Name="Latin" Required="No"/>
    </AttGroup>
</ElementDecl>

Fixed attributes (identified in XML 1.0 by #FIXED AttValue) are identified through the use of the Required attribute in combination with the AttValue attribute, which must contain the fixed value for the attribute. Attributes declared as fixed can only contain the declared value for that attribute. Fixed effectively hard codes attribute values into particular elements. If the Required attribute has a value of "Yes", and the AttValue attribute is present, the attribute value should be treated as a #FIXED value in XML 1.0.

For example, to declare a fixed planet attribute for the Species element, the AttDef element would combine a Required attribute with the value "Yes" and an AttValue attribute that provides the fixed value:

<ElementDecl Name="Species">
...additionalElementInformation...
    <AttGroup>
        <AttDef Name="planet" Required="Yes" AttValue="Earth"/>
    </AttGroup>
</ElementDecl>

Attributes may also be provided with a default value that may be overridden by other declarations. These default values are identified through the use of the AttValue attribute. The status attribute of species elements described above would be an appropriate target for such a default value, especially if most species being described fell into a particular category:

<ElementDecl Name="Species">
...additionalElementInformation...
    <AttGroup>
        <AttDef Name="status" Type="Enumerated" AttValue="non-threatened">
            <EnumerationValue Value="extinct"/>
            <EnumerationValue Value="endangered"/>
            <EnumerationValue Value="protected"/>
            <EnumerationValue Value="non-threatened"/>
        </AttDef>
    </AttGroup>
</ElementDecl>

Any default (required, fixed, etc.) may be used with any attribute type, though default values must always correspond to acceptable values for the attribute type.
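
As an illustration of the rule that a default value must be an acceptable value for the attribute type, the following Python sketch checks an Enumerated AttDef against its EnumerationValue children. The function name enumerated_default_ok is hypothetical and not part of XSchema; the sketch assumes the AttDef is parsed without namespace prefixes and it covers only the Enumerated type.

```python
import xml.etree.ElementTree as ET

# AttDef modelled on the status example above.
SAMPLE = """
<AttDef Name="status" Type="Enumerated" AttValue="non-threatened">
    <EnumerationValue Value="extinct"/>
    <EnumerationValue Value="endangered"/>
    <EnumerationValue Value="protected"/>
    <EnumerationValue Value="non-threatened"/>
</AttDef>
"""


def enumerated_default_ok(att_def):
    """Return True if an Enumerated AttDef's AttValue is one of its listed values."""
    # "CData" is the default Type in the XSchema DTD.
    if att_def.get("Type", "CData") != "Enumerated":
        return True  # other attribute types are outside this sketch
    default = att_def.get("AttValue")
    if default is None:
        return True  # no default value declared, nothing to check
    allowed = [ev.get("Value") for ev in att_def.findall("EnumerationValue")]
    return default in allowed


status = ET.fromstring(SAMPLE)
print(enumerated_default_ok(status))
```

A fuller checker would presumably apply the corresponding XML 1.0 lexical rules to the other types (Nmtoken, IDRef, and so on); that is left out here.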

2.4.3 Combinations of Types, Defaults, and Default Values

This notation also permits the declaration of certain attributes (IDs with defaults, for instance) that are prohibited by the standard XML 1.0 DTD syntax. Developers who use these combinations should test that their documents behave as expected in DTD-only environments as well as XSchema environments. Document instances that rely on such defaults may need additional processing to produce documents normalized for DTD use. In XSchema-to-DTD conversion, the attribute type should always take precedence over its default value.

The table below summarizes the possible combinations of XSchema attribute defaults and their XML 1.0 DTD equivalents.

Required   AttValue   XML 1.0 Equivalent
--------   --------   ------------------
Yes        <value>    #FIXED <value>
Yes        --         #REQUIRED
No         <value>    AttValue
No         --         #IMPLIED

(-- indicates an undeclared value)
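
The table can be read as a small decision function. The Python sketch below is one possible rendering of it; the name dtd_default is hypothetical, None stands for an undeclared AttValue, and the quoting logic is a simplification that does not escape special characters.

```python
def dtd_default(required, att_value=None):
    """Map an AttDef's Required/AttValue pair to an XML 1.0 default declaration.

    required  -- value of the Required attribute, "Yes" or "No"
    att_value -- value of the AttValue attribute, or None if undeclared
    """
    if required not in ("Yes", "No"):
        raise ValueError("Required must be 'Yes' or 'No'")
    if att_value is None:
        return "#REQUIRED" if required == "Yes" else "#IMPLIED"
    # Pick a quote character the value does not contain. Values holding both
    # kinds of quote, '&', or '<' would need escaping, which is omitted here.
    quote = "'" if '"' in att_value else '"'
    literal = quote + att_value + quote
    return "#FIXED " + literal if required == "Yes" else literal


for row in (("Yes", "Earth"), ("Yes", None), ("No", "non-threatened"), ("No", None)):
    print(row, "->", dtd_default(*row))
```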

2.5 Notation Declarations

Notation declarations are made with Notation elements nested in the XSchema element.

<!ELEMENT Notation (Doc?, More?) >
<!ATTLIST Notation
    Name          NMTOKEN #REQUIRED
    id            ID      #IMPLIED
    PubidLiteral  CDATA   #IMPLIED
    SystemLiteral CDATA   #IMPLIED>

The Name attribute provides the name of the notation. It must match the Name production in the XML 1.0 specification.

Notations may include a public identifier or a system literal, or both. XSchema processors should ignore Notation elements that contain neither. Public identifiers and system literals should conform to the rules in Section 4.7 of the XML 1.0 Specification.

For information about the id attribute, see Section 2.8, "id Attributes".
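
Since the XSchema-to-DTD transformation (Section 4) is still under development, the following Python sketch is only an assumption about how a Notation element might map onto an XML 1.0 notation declaration. The function name notation_to_dtd is hypothetical. Notations with neither identifier yield None, reflecting the rule that processors should ignore them.

```python
import xml.etree.ElementTree as ET


def notation_to_dtd(notation):
    """Return an XML 1.0 <!NOTATION ...> declaration, or None if it is ignored."""
    name = notation.get("Name")
    pubid = notation.get("PubidLiteral")
    system = notation.get("SystemLiteral")
    if pubid is None and system is None:
        return None  # neither identifier present: the element is ignored
    if pubid is None:
        external = 'SYSTEM "%s"' % system
    elif system is None:
        external = 'PUBLIC "%s"' % pubid
    else:
        external = 'PUBLIC "%s" "%s"' % (pubid, system)
    return "<!NOTATION %s %s>" % (name, external)


# Hypothetical notation used only for illustration.
gif = ET.fromstring('<Notation Name="gif" SystemLiteral="image/gif"/>')
print(notation_to_dtd(gif))
```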

2.6 Unparsed Entity Declarations

Unparsed entities are declared with UnparsedEntity elements nested in the XSchema element.

<!ELEMENT UnparsedEntity EMPTY>
<!ATTLIST UnparsedEntity
    Name          NMTOKEN #REQUIRED
    id            ID      #IMPLIED
    SystemLiteral CDATA   #REQUIRED
    PubidLiteral  CDATA   #IMPLIED
    Notation      NMTOKEN #REQUIRED>

The Name attribute provides the name of the unparsed entity. It must match the Name production in the XML 1.0 specification. The Notation attribute provides the name of a notation that gives the format of the unparsed entity. It must match the Name production in the XML 1.0 specification and must also match the Name attribute of a Notation element elsewhere in the XSchema document.

UnparsedEntity elements must include a system literal and may include a public identifier. Public identifiers and system literals should conform to the rules in Section 4.7 of the XML 1.0 Specification.

For information about the id attribute, see Section 2.8, "id Attributes".
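
The two constraints above (a system literal is mandatory, and the Notation attribute must name a declared Notation element) can be checked mechanically. The Python sketch below does so and emits the XML 1.0 entity declaration that each UnparsedEntity would plausibly correspond to. The function name and the sample entity are hypothetical, and the sample omits namespace declarations for brevity.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<XSchema>
    <Notation Name="gif" SystemLiteral="image/gif"/>
    <UnparsedEntity Name="logo" SystemLiteral="logo.gif" Notation="gif"/>
</XSchema>
"""


def unparsed_entities_to_dtd(xschema):
    """Return XML 1.0 ENTITY declarations for every UnparsedEntity element."""
    # Names of all notations declared anywhere in the XSchema document.
    declared = {n.get("Name") for n in xschema.iter("Notation")}
    declarations = []
    for entity in xschema.iter("UnparsedEntity"):
        name = entity.get("Name")
        notation = entity.get("Notation")
        system = entity.get("SystemLiteral")
        pubid = entity.get("PubidLiteral")
        if system is None:
            raise ValueError("entity %s has no system literal" % name)
        if notation not in declared:
            raise ValueError("entity %s uses undeclared notation %s" % (name, notation))
        if pubid is None:
            external = 'SYSTEM "%s"' % system
        else:
            external = 'PUBLIC "%s" "%s"' % (pubid, system)
        declarations.append("<!ENTITY %s %s NDATA %s>" % (name, external, notation))
    return declarations


print(unparsed_entities_to_dtd(ET.fromstring(SAMPLE)))
```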

2.7 XSchema Extensions

XSchema provides areas in which XSchema developers can provide supplemental information and metadata regarding XSchema components in both human- and machine-readable formats. Human-readable information is provided through the use of a subset of HTML that conforms to XML syntax, while machine-readable information may be provided through the XSC:More element.

2.7.1 Documentation Extensions

Human-readable documentation for XSchemas should be provided using the Itsy Bitsy Teeny Weeny Simple Hypertext [IBTWSH]. This is an XML DTD which describes a subset of HTML 4.0 for embedded use within other XML DTDs.  It is equivalent (within its scope) to -//W3C//DTD HTML 4.0 Transitional//EN.  Documentation that uses portions of the IBTWSH format may be included in the XSC:Doc element, a subelement available to all declarations. The XSC:Doc element provides basic formatting options for XSchema documentation.

<!ENTITY % ibtwsh SYSTEM "http://www.ccil.org/~cowan/XML/ibtwsh.dtd">
%ibtwsh;
<!ELEMENT XSC:Doc %struct.model;>
<!ATTLIST XSC:Doc
    xmlns CDATA #FIXED "">

Note that because XSC:Doc redefines the default namespace to support IBTWSH, the XSC: prefix must be used for XSC:Doc. Any element allowed in the IBTWSH struct.model set of elements (A, ABBR, ACRONYM, ADDRESS, BIG, BLOCKQUOTE, BR, CITE, CODE, DFN, DIR, DIV, DL, EM, H1, H2, H3, HR, KBD, OL, P, PRE, SAMP, SMALL, SPAN, STRONG, UL, VAR, XML) may be used in the XSC:Doc element. To preserve compatibility with HTML, IBTWSH does not use namespaces.

XSchema applications should ignore all XSchema declarations (i.e., elements prefixed with XSC: or another appropriate XSchema prefix) within an XSC:Doc element. (The XML element of IBTWSH allows an ANY content model.)

2.7.2 Other Extensions

The XSC:More element provides an area which developers can use to create their own supplements to XSchema, defining content types more tightly than is possible through XSchema 1.0. The XSC:More element has a simple ANY content model, though XSchema processors should ignore the appearance of any elements from the XSchema namespace in this area.

<!ELEMENT XSC:More ANY>
<!ATTLIST XSC:More
    xmlns CDATA "">

Because XSC:More redefines the default namespace, the XSC: prefix must be used for XSC:More. Developers may override the blank value of the xmlns attribute to define their own default namespace for elements contained in the XSC:More element.
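
One way a processor might honor the rule that XSchema declarations inside XSC:Doc and XSC:More are ignored is to stop descending when it reaches either element. The Python sketch below illustrates this under a few assumptions: the namespace URI is the one given in Appendix B, the xmlns attributes are written out explicitly because ElementTree does not apply DTD attribute defaults, and the function name declarations and the Note element are hypothetical.

```python
import xml.etree.ElementTree as ET

XSC = "{http://www.purl.org/NET/XSchema/v1}"
SKIPPED = {XSC + "Doc", XSC + "More"}

SAMPLE = """
<XSchema xmlns="http://www.purl.org/NET/XSchema/v1"
         xmlns:XSC="http://www.purl.org/NET/XSchema/v1">
    <ElementDecl Name="Species">
        <XSC:More xmlns="">
            <Note>Developer-defined supplement</Note>
            <XSC:ElementDecl Name="Ignored"/>
        </XSC:More>
        <Model><PCData/></Model>
    </ElementDecl>
</XSchema>
"""


def declarations(elem):
    """Yield XSchema-namespace elements, never descending into Doc or More."""
    if elem.tag in SKIPPED:
        return  # the whole subtree is left to other applications
    if elem.tag.startswith(XSC):
        yield elem
    for child in elem:
        yield from declarations(child)


root = ET.fromstring(SAMPLE)
names = [e.get("Name") for e in declarations(root) if e.tag == XSC + "ElementDecl"]
print(names)
```

Only the Species declaration is reported; the ElementDecl nested inside XSC:More is never visited.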

2.8 id Attributes

All XSchema elements except EnumerationValue, More, and Doc have an optional id attribute. These attributes, if they appear, must have a unique value within the document. They have no defined use in XSchema 1.0, but are included so that future extensions (possibly involving XLink) can uniquely identify elements in an XSchema document.
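
The uniqueness requirement is ordinarily enforced by a validating parser, because the id attributes are declared with type ID. For non-validating environments, a check along the following lines could be used. This is a minimal Python sketch; the function name duplicate_ids and the sample id values are hypothetical.

```python
from collections import Counter
import xml.etree.ElementTree as ET

SAMPLE = """
<XSchema id="root">
    <ElementDecl Name="Species" id="decl1"><Model><PCData/></Model></ElementDecl>
    <ElementDecl Name="Genus" id="decl1"><Model><PCData/></Model></ElementDecl>
</XSchema>
"""


def duplicate_ids(root):
    """Return the sorted id values that occur more than once in the document."""
    counts = Counter(e.get("id") for e in root.iter() if e.get("id") is not None)
    return sorted(value for value, n in counts.items() if n > 1)


print(duplicate_ids(ET.fromstring(SAMPLE)))
```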

3.0 XSchema and Namespaces

XSchema uses namespaces for its own operations and also supports schemas that take advantage of namespace facilities. XSchema processors are responsible only for elements that use the XSchema namespace appropriate to the version of XSchema they are processing. Information in other namespaces may be used in the XSC:Doc and XSC:More elements and passed to other applications as the processor deems appropriate.

Note: This section is subject to change, even after the XSchema specification is otherwise finalized. For more information, see Section 1.3, "Relation to Standards."

3.1 The XSchema Namespace

The namespace for XSchema 1.0 is built into the XSchema DTD as the default value of the xmlns and xmlns:XSC attributes of the XSchema element. The URL of the XSchema namespace is a PURL (permanent URL) provided by the OCLC. PURLs use redirection to maintain a permanent address for sites that may change address. (For more information, see http://www.purl.org.) While XSchema specification information may be stored at the location to which the PURL server redirects visitors, XSchema applications should not rely on any of that information being there.

XSC:Doc and XSC:More must use the XSC: prefix because they declare other values for the default namespace. All other XSchema elements may use the XSC: prefix if desired, but are not required to do so.

3.2 Namespaces of Elements and Attributes Being Defined

XSchema supports document sets that have their own namespaces.  The URI of the namespace to which a defined element or attribute belongs is identified by the ns attribute of the applicable XSchema, ElementDecl, AttGroup, or AttDef element.  This URI must match the URI that applies to the element or attribute in the document containing the element.  For example, if the Species element is part of the http://www.taxonomy namespace, an XSchema document might contain the following declaration:

<ElementDecl Name="Species" ns="http://www.taxonomy">
   ...additionalElementInformation
</ElementDecl>

and the document that uses the Species element might contain:

<TAXON:Species xmlns:TAXON="http://www.taxonomy">
   ...additionalElementContent
</TAXON:Species>

If no ns attribute applies to the defined element or attribute, then that element or attribute is not considered to belong to any particular namespace. In particular, it does not belong to the default namespace of the document in which it is used, assuming a default namespace is defined.
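
The matching rule compares namespace URIs, not prefixes. The Python sketch below illustrates this with ElementTree, which exposes a namespaced element's tag as "{uri}local". The helper names are hypothetical, and the sketch reads the ns attribute only from the ElementDecl itself; looking up an ns attribute on an enclosing XSchema element is not handled.

```python
import xml.etree.ElementTree as ET


def declared_name(element_decl):
    """Return (namespace URI or None, local name) for an ElementDecl."""
    return (element_decl.get("ns"), element_decl.get("Name"))


def instance_name(elem):
    """Split ElementTree's '{uri}local' tag into (URI or None, local name)."""
    if elem.tag.startswith("{"):
        uri, local = elem.tag[1:].split("}", 1)
        return (uri, local)
    return (None, elem.tag)


decl = ET.fromstring('<ElementDecl Name="Species" ns="http://www.taxonomy"/>')
inst = ET.fromstring('<TAXON:Species xmlns:TAXON="http://www.taxonomy"/>')
print(declared_name(decl) == instance_name(inst))

# A declaration without ns does not match an element that picks up the
# document's default namespace.
plain_decl = ET.fromstring('<ElementDecl Name="Species"/>')
defaulted = ET.fromstring('<Species xmlns="http://www.taxonomy"/>')
print(declared_name(plain_decl) == instance_name(defaulted))
```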

To maintain interoperability with DTDs, XSchema provides a mechanism for declaring the namespace prefixes to be used in element and attribute declarations in a DTD. This allows documents and their associated XSchemas to track the same namespace using different prefixes if necessary. XSchema-to-DTD converters should use the prefix attribute of an XSchema, ElementDecl, AttGroup, or AttDef element when creating DTD element and attribute declarations. DTD-to-XSchema converters should use the prefixes assigned in the DTD and request further information about the 'real' namespace for use in the ns attribute. This may be accomplished by parsing a sample document instance, or by direct input from the person doing the conversion.
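
As a rough illustration of the XSchema-to-DTD direction, a converter might build the qualified name for a DTD declaration from the prefix and Name attributes. In the Python sketch below, the function name dtd_name is hypothetical, and the inherited_prefix parameter reflects an assumption that a prefix on an enclosing element applies when the declaration has none of its own.

```python
import xml.etree.ElementTree as ET


def dtd_name(decl, inherited_prefix=None):
    """Return the name a converter might write into a DTD declaration."""
    # A prefix on the declaration itself wins over an inherited one (assumption).
    prefix = decl.get("prefix", inherited_prefix)
    name = decl.get("Name")
    return "%s:%s" % (prefix, name) if prefix else name


decl = ET.fromstring(
    '<ElementDecl Name="Species" prefix="TAXON" ns="http://www.taxonomy"/>'
)
print("<!ELEMENT %s ANY>" % dtd_name(decl))
```

The ANY content model in the printed declaration is a placeholder; a real converter would derive it from the Model element.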

Appendix A: References

IBTWSH
John Cowan. Itsy Bitsy Teeny Weeny Simple Hypertext. See http://www.ccil.org/~cowan/XML/ibtwsh.dtd.
Namespaces
Tim Bray, Dave Hollander, and Andrew Layman. Namespaces in XML. 2 Aug 1998. See http://www.w3.org/TR/1998/WD-xml-names-19980802.html.
RFC 2046
IETF (Internet Engineering Task Force). RFC 2046: Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types, ed. N. Freed and N. Borenstein. November, 1996. See http://www.isi.edu/in-notes/rfc2046.txt.
RFC 2048
IETF (Internet Engineering Task Force). RFC 2048: Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures, ed. N. Freed, J. Klensin, and J. Postel. November, 1996. See http://www.isi.edu/in-notes/rfc2048.txt.
RFC 2119
IETF (Internet Engineering Task Force). RFC 2119: Key words for use in RFCs to Indicate Requirement Levels, ed. Scott Bradner. 1997. See http://www.isi.edu/in-notes/rfc2119.txt.
RFC 2376
IETF (Internet Engineering Task Force). RFC 2376: XML Media Types, ed. E.J. Whitehead and Murata Makoto. July, 1998. See http://www.isi.edu/in-notes/rfc2376.txt.
XML
Tim Bray, Jean Paoli, and C.M. Sperberg-McQueen. Extensible Markup Language (XML) 1.0. 1998. See http://www.w3.org/TR/REC-xml.
XML-Data
Andrew Layman, et al. XML-Data. 5 Jan 1998. See http://www.w3.org/TR/1998/NOTE-XML-data.
XML-DEV
XML-DEV Mailing List, archived at http://www.lists.ic.ac.uk/hypermail/xml-dev/.
XLink
Eve Maler and Steve DeRose. XML Linking Language (XLink). 1998. See http://www.w3.org/TR/WD-xlink.
XPointer
Eve Maler and Steve DeRose. XML Pointer Language (XPointer). 1998. See http://www.w3.org/TR/WD-xptr.

Appendix B: XSchema DTD

<!ELEMENT XSchema (Doc?, More?, (ElementDecl | Model | AttDef | AttGroup | Notation | UnparsedEntity | XSchema)*)>

<!ATTLIST XSchema
    xmlns         CDATA   #FIXED   "http://www.purl.org/NET/XSchema/v1"
    xmlns:XSC     CDATA   #FIXED   "http://www.purl.org/NET/XSchema/v1"
    prefix        NMTOKEN #IMPLIED
    ns            CDATA   #IMPLIED
    Version       CDATA   #FIXED   "1.0"
    MimeType      CDATA            "application/xml"
    FileExtension CDATA            "xml"
    id            ID      #IMPLIED>

<!ELEMENT ElementDecl (Doc?, More?, Model, AttGroup?)>
<!-- Name is the element name -->
<!ATTLIST ElementDecl
       Name   NMTOKEN #REQUIRED
       id     ID      #IMPLIED
       prefix NMTOKEN #IMPLIED
       ns     CDATA   #IMPLIED
       Root   (Recommended | Possible | Unlikely) "Possible">

<!ELEMENT Model (Doc?, More?, (Ref | Choice | Seq | Empty | Any | PCData | Mixed | Model))>
<!ATTLIST Model
       id ID #IMPLIED>

<!ELEMENT Empty EMPTY>
<!ATTLIST Empty
    id ID #IMPLIED>

<!ELEMENT Any EMPTY>
<!ATTLIST Any
    id ID #IMPLIED>

<!ELEMENT PCData EMPTY>
<!ATTLIST PCData
    id ID #IMPLIED>

<!ELEMENT Ref EMPTY>
<!-- Element references the name in an ElementDecl element -->
<!ATTLIST Ref
    id        ID      #IMPLIED
    Element   NMTOKEN #REQUIRED
    Frequency (Required | Optional | ZeroOrMore | OneOrMore) 'Required'>

<!ELEMENT Mixed (Ref+)>
<!ATTLIST Mixed
    id        ID           #IMPLIED
    Frequency (ZeroOrMore) #FIXED   "ZeroOrMore">

<!-- A Choice must have two or more children -->
<!ELEMENT Choice ((Seq | Ref | Model), (Seq | Ref | Model)+)>
<!ATTLIST Choice
    id ID #IMPLIED
    Frequency (Required | Optional | ZeroOrMore | OneOrMore) 'Required'>

<!-- A Seq must have two or more children -->
<!ELEMENT Seq ((Choice | Ref | Model),(Choice | Ref | Model)+)>
<!ATTLIST Seq
    id        ID #IMPLIED
    Frequency (Required | Optional | ZeroOrMore | OneOrMore) 'Required'>

<!ELEMENT AttGroup (Doc?, More?, (AttDef | AttGroup)*)>
<!ATTLIST AttGroup
    Element NMTOKEN #IMPLIED
    id      ID      #IMPLIED
    prefix  NMTOKEN #IMPLIED
    ns      CDATA   #IMPLIED>

<!ELEMENT AttDef (Doc?, More?, EnumerationValue*)>
<!ATTLIST AttDef
    Element  NMTOKEN      #IMPLIED
    Name     NMTOKEN      #REQUIRED
    Type     (CData    |
               ID       |
               IDRef    |
               IDRefs   |
               Entity   |
               Entities |
               Nmtoken  |
               Nmtokens |
               Notation |
               Enumerated) "CData"
    Required (Yes | No)   "No"
    AttValue CDATA        #IMPLIED
    id       ID           #IMPLIED
    prefix   NMTOKEN      #IMPLIED
    ns       CDATA        #IMPLIED>

<!ELEMENT EnumerationValue (Doc?, More?)>
<!ATTLIST EnumerationValue
    Value CDATA #REQUIRED>

<!ELEMENT Notation (Doc?, More?) >
<!ATTLIST Notation
    Name          NMTOKEN #REQUIRED
    id            ID      #IMPLIED
    PubidLiteral  CDATA   #IMPLIED
    SystemLiteral CDATA   #IMPLIED>

<!ELEMENT UnparsedEntity EMPTY>
<!ATTLIST UnparsedEntity
    Name          NMTOKEN #REQUIRED
    id            ID      #IMPLIED
    SystemLiteral CDATA   #REQUIRED
    PubidLiteral  CDATA   #IMPLIED
    Notation      NMTOKEN #REQUIRED>

<!ENTITY % ibtwsh SYSTEM "http://www.ccil.org/~cowan/XML/ibtwsh.dtd">
%ibtwsh;
<!ELEMENT XSC:Doc %struct.model;>
<!ATTLIST XSC:Doc
    xmlns CDATA #FIXED "">

<!ELEMENT XSC:More ANY>
<!ATTLIST XSC:More
    xmlns CDATA "">

Appendix D: Contributors

Paul Prescod          Peter Murray-Rust   Alain Deseine
Chris Maden           Rick Jelliffe       Toby Speight
Jeni Tennison         Marcus Carr         Michael Kay
James Anderson        David Megginson     Don Park
James K. Tauber       Tim Bray            John Simpson
Steven Champeon       Andrew Layman       Arjun Ray
Curt Arnold           Bill la Forge       Bryan Gilbert
Carl Hage             Dan Brickley        David Brownell
David G. Durand       David Ornstein      David Rosenborg
Eric Albright         Francis Norton      Frank Boumphrey
Gisli Olafsson        Dirk Gouders        Guy Huard
Jacek Ambroziak       Jack Bolles         Jarle Stabell
Jeremy H. Griffith    Jon Bosak           Lars Marius Garshol
Liam Quin             Lisa Rein           Mark D. Anderson
Matt Mower            Matthew Gertner     Mark Tucker
Kenneth J. Meltsner   Murata Makoto       Murray Maloney
Parameshwor Karki     Paul Haahr          Paul Rabin
Robin Cover           Scott Vanderbilt    Sean McGrath
Simon North           Stefan Wagner       Steve Withall
Steven R. Newcomb     Thuy-Lin Nguyen     Todd Ross
W.E. Perry            Will Hunt