Overview of DTD annotations

Klortho edited this page Dec 18, 2012 · 1 revision

The dtdanalyzer and its associated tools recognize annotations, in the form of specially formatted comments, within DTDs. The main use case is to let DTD designers embed documentation in the DTD itself, much like the conventions for Javadoc or the <annotation> element in XML Schema Documents (XSD). Annotations can also serve other purposes, such as applying tags that have specific meaning to downstream processes, or adding Schematron rules to supplement the DTD.

Example

The following example illustrates many of the features supported. In this example, the documentation text is in Markdown format.

<!--~~ <split>
Specifies the main ingredients of a banana split.
Remember the following:
* One banana,
* Two banana,
* Three banana,
* Four.

~~ model
Four bananas make a bunch and so do many more.

~~ tags root rock-group mess-of-fun

~~ examples
    <split>
      <banana instrument='guitar'>Fleegle</banana>
      <banana instrument='drums'>Bingo</banana>
      <banana instrument='bass'>Drooper</banana>
      <banana instrument='keyboard'>Snorky</banana>
    </split>
~~-->

<!ELEMENT split (banana)*>

These annotations document the <split> element that is defined by the DTD, and are processed by the dtddocumentor tool to produce HTML documentation.

Comment structure and sections

The annotations recognized by this tool are in comments that begin and end with special tags that use double tildes, like so:

<!--~~ identifier
  everything in here is significant annotation.
~~-->

Following the opening comment tag, and on the same line, the user must insert an identifier that specifies the DTD object to which the annotation applies. In the "split" example above, the annotation applies to the <split> element. More generally, an annotation can be applied to the DTD as a whole, a single module (usually a file), an element, an attribute, a general entity, or a parameter entity.

Each annotation block is divided into separate sections. A section is introduced by a newline followed by two tildes and then a keyword that "names" the section. By default, the first section is implicitly defined to be of type "notes".

In the example for the <split> element, the "notes" section includes all the text up to and including the fourth bullet item in the list. Following that, the line ~~ model introduces a new "model" section that contains a single line of text.

The current set of recognized annotation sections consists of:

  • notes: General notes about the element. (XHTML or Markdown format.)

  • model: Notes related to the content model of an element. In general, this is used to provide usage information. (XHTML or Markdown format.)

  • tags: A list of tag keywords that are assigned to that object. These tags can then be used for grouping or filtering. For example, the tag "root" is used by default by the document generator to identify elements that can act as the root element of a document instance. (A list of whitespace-delimited keywords.)

  • schematron: A list of assert and report Schematron rules. These are discussed in more detail below. (XML format.)

  • json: Used by the dtd2xml2json tool. See Auto-generating XML-to-JSON conversion XSLT. (XML format.)

  • examples: A section allowing DTD authors to give examples showing proper usage of the item. (XHTML or Markdown format.)

Annotation format and processing

The DtdAnalyzer interprets the contents of each annotation section according to its type, and then inserts those contents into an element in the output.

The notes, model, and examples sections, and any unrecognized section types, are considered to be documentation. By default, these are assumed to contain valid XHTML, and are inserted into the documentation as-is. However, if the --markdown (-m) switch is given, they are treated as Markdown and passed through the Markdown processor pandoc for conversion into XHTML. Markdown is preferred because it is much more readable than XHTML when looking at the source DTDs. To use Markdown, you must first install pandoc and put it in the PATH that is available to the dtdanalyzer process.
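As a rough illustration of the pandoc requirement, the following Python sketch checks whether pandoc is on the PATH and converts a Markdown string to HTML. The command-line arguments are an assumption chosen for the example; the exact arguments that the DtdAnalyzer passes to pandoc are not stated on this page, and the function names are hypothetical.

```python
import shutil
import subprocess

# Assumed invocation for illustration only; the DtdAnalyzer may call
# pandoc with different arguments.
PANDOC_COMMAND = ["pandoc", "--from=markdown", "--to=html"]


def find_pandoc():
    """Return the full path of pandoc if it is on the PATH, else None."""
    return shutil.which("pandoc")


def markdown_to_html(markdown_text):
    """Convert Markdown to HTML by piping it through pandoc."""
    if find_pandoc() is None:
        raise RuntimeError("pandoc was not found on the PATH")
    result = subprocess.run(
        PANDOC_COMMAND,
        input=markdown_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Running `find_pandoc()` before processing a DTD is a quick way to confirm that the PATH is set up as required.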

The tags section, in contrast, is interpreted merely as a list of keywords.
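Reading a tags section amounts to splitting on whitespace. A minimal Python sketch, assuming the section text has already been extracted (the function names are hypothetical and not part of the DtdAnalyzer):

```python
def parse_tags(section_text):
    """Return the whitespace-delimited keywords of a tags section."""
    # str.split() with no argument splits on any run of spaces,
    # tabs, or newlines and ignores leading and trailing whitespace.
    return section_text.split()


def is_root_candidate(section_text):
    """Check for the "root" tag, which marks possible root elements."""
    return "root" in parse_tags(section_text)
```

For the <split> example, the tags section yields the keywords root, rock-group, and mess-of-fun, so the element would be treated as a possible root element.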

The schematron and json sections should be well-formed XML. The schematron section must be in Schematron XML, and the json section must be in the format documented on the page Auto-generating XML-to-JSON conversion XSLT.

Autolinking from Markdown annotations

In those sections that are processed as Markdown, there is an added feature that lets you easily link between the various pages of documentation within the DTD.

Here is how you make links to other documentation pages:

To disable the auto-formatting of any of these, just precede them with a backslash. For example, \`<split>, \@Instrument, \%banana.ent;, or \&fleegle-pic;