The dir attribute

Darwin Information Typing Architecture (DITA) Version 1.2

The dir attribute tells processors how to render bidirectional text. Languages such as Arabic, Hebrew, Farsi, Urdu, and Yiddish are written from right to left. Numerals and embedded runs of Western-language text, however, are written from left to right. Some multilingual documents also mix text segments that run in both directions. This attribute specifies how such text should be rendered to a reader.

Bidirectional text processing is controlled by several factors:
  • The xml:lang attribute may be used to identify text that requires bidirectional rendering. The Unicode Bidirectional Algorithm provides the means to properly identify Western content in mixed text.
  • The dir attribute may be set on the root element, in combination with the xml:lang attribute. For example, to render Arabic text with embedded English content correctly in a web browser, set xml:lang="ar" and dir="rtl" on the root element. All text, including punctuation marks, is then rendered correctly.
  • The dir attribute may be set to either "ltr" or "rtl" on an element in the document to set the direction of its content.
  • The dir attribute may be set to either "lro" or "rlo" on an element in the document to override the Unicode Bidirectional Algorithm for its content.
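
As a minimal sketch of the root-element case described above, the following hypothetical topic sets xml:lang="ar" and dir="rtl" on the <topic> element. The id value and the text content are illustrative assumptions, not taken from the specification.

```xml
<!-- Hypothetical topic: the id and the content are illustrative only. -->
<topic id="bidi-sample" xml:lang="ar" dir="rtl">
  <title>مرحبا</title>
  <body>
    <!-- The embedded English run carries no extra markup; the Unicode
         Bidirectional Algorithm is expected to resolve its direction. -->
    <p>مرحبا DITA.</p>
  </body>
</topic>
```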

The Unicode Bidirectional Algorithm positions the punctuation correctly for a given language. The rendering system is responsible for displaying the text properly.

The use of the dir attribute and the Unicode Bidirectional Algorithm is explained in the article Specifying the direction of text and tables: the dir attribute (http://www.w3.org/TR/html4/struct/dirlang.html#adef-dir). This article contains several examples of how to use the dir attribute set to either left-to-right or right-to-left. It contains no example of setting the dir attribute to either "lro" or "rlo", although the usage can be inferred from the example that uses the <bdo> element, a now-deprecated W3C mechanism for overriding the entire Unicode Bidirectional Algorithm.

Note that properly written mixed text does not need any special markers; the Unicode Bidirectional Algorithm is sufficient. However, some rendering systems may need explicit direction markup to display bidirectional text, such as Arabic, properly. For example, the Apache FOP tool may not render Arabic properly unless the left-to-right and right-to-left indicators are used.

Recommended usage

The dir attribute, together with the xml:lang attribute, is essential for rendering table columns and definition lists (<dl>) in the proper order.
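
A hedged sketch of the definition-list case: assuming a processor that honors dir on <dl>, setting xml:lang and dir on the list is intended to make the term and definition columns run from right to left. The Hebrew content is an illustrative assumption.

```xml
<!-- Hypothetical definition list: the term and definition are illustrative. -->
<dl xml:lang="he" dir="rtl">
  <dlentry>
    <dt>שלום</dt>
    <dd>ברכה</dd>
  </dlentry>
</dl>
```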

In general text, the Unicode Bidirectional Algorithm, guided by the xml:lang attribute together with the dir attribute, provides for various levels of bidirectionality, as follows:
  • Directionality is either explicitly specified via the xml:lang attribute in combination with the dir attribute on the highest-level element (topic or derived peer for topics, map for ditamaps) or assumed by the processing application. If the dir attribute is used, it is recommended to specify it on the highest-level element of the topic or on the document element of the map.
  • When embedding a right-to-left text run inside a left-to-right text run (or vice versa), the default direction may produce incorrect results, depending on the rendering mechanism, especially if the embedded text run includes punctuation that is located at one end of the run. Unicode defines spaces and punctuation as having neutral directionality and resolves the direction of these neutral characters when they appear between characters that have strong directionality (most characters that are not spaces or punctuation). While the default direction is often sufficient to determine the correct directionality, it sometimes renders characters incorrectly (for example, a question mark at the end of a Hebrew question may appear at the beginning of the question instead of at the end, or a parenthesis may render incorrectly). To control this behavior, set the dir attribute to "ltr" or "rtl" as needed so that the desired direction is applied to the characters that have neutral directionality. The "ltr" and "rtl" values override only the neutral characters (for example, spaces and punctuation), not all Unicode characters.
    Note: Problems with Unicode rendering may be caused by the rendering mechanism. The problems are not due to the XML markup itself.
  • Sometimes you may want to override the default directionality of characters that have strong directionality. Such overrides use the "lro" and "rlo" values, which override the Unicode Bidirectional Algorithm and force a direction on the contents of the element. These override values give the author a brute-force way of setting the directionality independently of the Unicode Bidirectional Algorithm. The gentler "ltr" and "rtl" values have a less radical effect, affecting only punctuation and other so-called neutral characters.
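
The neutral-character case in the list above can be sketched as follows. This is an illustrative assumption, not an example from the specification: a left-to-right English paragraph embeds a Hebrew question, and the <ph> element carries dir="rtl" so that the trailing question mark is treated as part of the right-to-left run.

```xml
<!-- Hypothetical paragraph: without dir="rtl" on <ph>, the neutral "?" could
     take the paragraph's left-to-right direction and appear on the wrong
     side of the Hebrew word. -->
<p xml:lang="en" dir="ltr">The reply was
  <ph xml:lang="he" dir="rtl">שלום?</ph> and nothing more.</p>
```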

For most authoring needs, the "ltr" and "rtl" values are sufficient. Use the override values only when the desired effect cannot be achieved with these values.
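
For contrast, a minimal sketch of an override, under the assumption that the processor maps dir="rlo" to a right-to-left override in the output. The content is hypothetical and chosen only to make the effect visible.

```xml
<!-- Hypothetical paragraph: dir="rlo" forces right-to-left ordering even on
     the strongly left-to-right Latin letters inside the <ph> element. -->
<p>Logical order: ABC. With override: <ph dir="rlo">ABC</ph>.</p>
```

A renderer that applies the override would be expected to display the second run in reverse visual order. With dir="rtl" in its place, the Latin letters would keep their left-to-right order, because that value affects only neutral characters.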

Implementation precautions

Applications that process DITA documents, whether at the authoring, translation, publishing, or any other stage, should fully support the Unicode Bidirectional Algorithm to correctly implement the script and directionality for each language used in the document.

Applications should ensure that every highest-level topic element and the root map element explicitly set both the dir attribute and the xml:lang attribute.
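
A minimal sketch of a root map element that sets both attributes explicitly. The title and href values are hypothetical.

```xml
<!-- Hypothetical map: the title and the href value are illustrative only. -->
<map xml:lang="en-us" dir="ltr">
  <title>Sample publication</title>
  <topicref href="overview.dita"/>
</map>
```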