Rich-Text Format Specification V. 2: Introduction, RTF Syntax, Conventions of an RTF Reader


Conventions of an RTF Reader

The reader of an RTF stream is concerned with the following:

• Separating control information from plain text.

• Acting on control information.

• Collecting and properly inserting text into the document, as directed by the current group state.

Acting on control information is designed to be a relatively simple process. Some control information simply contributes special characters to the plain text stream. Other information serves to change the program state, which includes properties of the document as a whole, or to change any of a collection of group states, which apply to parts of the document.

As previously mentioned, a group state can specify the following:

• The destination, or part of the document that the plain text is constructing.

• Character-formatting properties, such as bold or italic.

• Paragraph-formatting properties, such as justified or centered.

• Section-formatting properties, such as the number of columns.

• Table-formatting properties, which define the number of cells and dimensions of a table row.

In practice, an RTF reader will evaluate each character it reads in sequence as follows:

• If the character is an opening brace ({), the reader stores its current state on the stack. If the character is a closing brace (}), the reader retrieves the current state from the stack.

• If the character is a backslash (\), the reader collects the control word or control symbol and its parameter, if any, and looks up the control word or control symbol in a table that maps control words to actions. It then carries out the action prescribed in the table. (The possible actions are discussed below.) The read pointer is left before or after a control-word delimiter, as appropriate.

• If the character is anything other than an opening brace ({), closing brace (}), or backslash (\), the reader assumes that the character is plain text and writes the character to the current destination using the current formatting properties.
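The three cases above can be sketched as a small loop. This is a minimal illustration, not a complete reader: the function name read_runs, the bold-only group state, and the two-entry "table" (\b and \par) are assumptions made for the example, and CR/LF handling, \bin data, and hexadecimal escapes are left out.

```python
import string

LETTERS = set(string.ascii_letters)
DIGITS = set(string.digits)

def read_runs(src):
    """Return (text, bold) runs from an RTF string (hypothetical helper)."""
    state = {"bold": False}   # current group state; only bold is tracked here
    stack = []                # saved group states
    runs = []                 # output as [text, bold] pairs

    def emit(text):
        # Append to the last run if the formatting is unchanged.
        if runs and runs[-1][1] == state["bold"]:
            runs[-1][0] += text
        else:
            runs.append([text, state["bold"]])

    i, n = 0, len(src)
    while i < n:
        ch = src[i]
        i += 1
        if ch == "{":
            stack.append(dict(state))      # store a copy of the current state
        elif ch == "}":
            state = stack.pop()            # retrieve the saved state
        elif ch == "\\":
            if i < n and src[i] in LETTERS:
                # Control word: letters, then an optional signed number.
                start = i
                while i < n and src[i] in LETTERS:
                    i += 1
                word = src[start:i]
                start = i
                if i + 1 < n and src[i] == "-" and src[i + 1] in DIGITS:
                    i += 1
                while i < n and src[i] in DIGITS:
                    i += 1
                param = int(src[start:i]) if i > start else None
                if i < n and src[i] == " ":
                    i += 1                 # a single space delimiter is consumed
                if word == "b":
                    state["bold"] = param != 0
                elif word == "par":
                    emit("\n")
                # Any other control word is not in the table and is ignored.
            elif i < n:
                # Control symbol: a backslash followed by one non-letter.
                sym = src[i]
                i += 1
                if sym in "{}\\":
                    emit(sym)              # escaped brace or backslash
        else:
            emit(ch)                       # plain text
    return [tuple(run) for run in runs]
```

For instance, read_runs(r"{\b Hello} world\par") yields one bold run for "Hello" and one non-bold run for " world" followed by a newline, because the closing brace restores the state saved at the opening brace.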

If the RTF reader cannot find a particular control word or control symbol in the look-up table described above, the control word or control symbol should be ignored. If a control word or control symbol is preceded by an opening brace ({), it is part of a group. The current state should be saved on the stack, but no state change should occur. When a closing brace (}) is encountered, the current state should be retrieved from the stack, thereby resetting the current state.

If the \* control symbol precedes a control word that the reader does not recognize, then it defines a destination group and was itself preceded by an opening brace ({). The RTF reader should discard all text up to and including the closing brace (}) that closes this group. All RTF readers must recognize all destinations defined in the March 1987 RTF specification. For those destinations, the reader may skip past the group, but it is not allowed to simply discard the control word. Destinations defined since March 1987 are marked with the \* control symbol.


All RTF readers must implement the \* control symbol to be able to read RTF files written by newer RTF writers.
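One way to honor the \* rule is to skip an unrecognized destination group as a unit. The sketch below is an illustration under assumptions: the names skip_group and drop_unknown_destinations are hypothetical, the set of known destinations is supplied by the caller, and \bin data (which may contain raw braces) is not handled.

```python
import re

# Matches "{\*\word" at a given position; the word is captured.
STAR_DEST = re.compile(r"\{\\\*\\([a-z]+)")

def skip_group(src, i):
    """Given an index just after an opening brace, return the index
    just after the closing brace that matches it."""
    depth = 1
    n = len(src)
    while i < n:
        ch = src[i]
        if ch == "\\":
            i += 2            # skip the backslash and the character it escapes
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("unbalanced group")

def drop_unknown_destinations(src, known):
    """Remove every {\\*\\word ...} group whose word is not in `known`."""
    out = []
    i, n = 0, len(src)
    while i < n:
        m = STAR_DEST.match(src, i)
        if m and m.group(1) not in known:
            i = skip_group(src, i + 1)   # discard through the closing brace
            continue
        if src[i] == "\\" and i + 1 < n:
            out.append(src[i:i + 2])     # copy escapes and control-word starts whole
            i += 2
            continue
        out.append(src[i])
        i += 1
    return "".join(out)
```

Nested groups and escaped braces inside the skipped destination are counted or passed over, so only the brace that actually closes the destination ends the skip.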

For control words or control symbols that the RTF reader can find in the look-up table, the possible actions are as follows.

Change Destination

The RTF reader changes the destination to the destination described in the table entry. Destination changes are legal only immediately after an opening brace ({). (Other restrictions may also apply; for example, footnotes cannot be nested.) Many destination changes imply that the current property settings will be reset to their default settings. Examples of control words that change destination are \footnote, \header, \footer, \pict, \info, \fonttbl, \stylesheet, and \colortbl. This chapter identifies all destination control words where they appear in control-word tables.

Change Formatting Property

The RTF reader changes the property as described in the table entry. The entry will specify whether a parameter is required. “Alphabetic List of RTF Keywords,” later in this chapter, also specifies which control words require parameters. If a parameter is needed and not specified, then a default will be used. The default value used depends on the control word. If the control word does not specify a default, then all RTF readers should assume a default of 0.
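The default rule can be sketched as a small lookup. The function name resolve_param and the contents of DEFAULTS are illustrative assumptions, not a list taken from this chapter; a real reader would fill the table from the control-word descriptions.

```python
# Illustrative per-control-word defaults (an assumption for this sketch).
DEFAULTS = {"b": 1, "i": 1}

def resolve_param(word, param, defaults=DEFAULTS):
    """Return the parameter to use for a control word.

    `param` is None when the RTF stream gave no number after the word.
    """
    if param is not None:
        return param                 # an explicit parameter always wins
    return defaults.get(word, 0)     # no listed default: assume 0
```

With this table, resolve_param("b", None) gives 1, while a control word with no entry, such as resolve_param("li", None), falls back to 0.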

Insert Special Character

The reader inserts into the document the character code or codes described in the table entry.

Insert Special Character and Perform Action

The reader inserts into the document the character code or codes described in the table entry and performs whatever other action the entry specifies. For example, when Microsoft Word interprets \par, a paragraph mark is inserted in the document and special code is run to record the paragraph properties belonging to that paragraph mark.
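The four kinds of action can be pictured as entries in a dispatch table. The following is a rough sketch under assumptions: the Reader class, the helper names, and the four-entry TABLE are hypothetical, and the check that a destination change follows an opening brace is omitted.

```python
from dataclasses import dataclass, field

@dataclass
class Reader:
    destination: str = "body"
    props: dict = field(default_factory=dict)       # current formatting properties
    text: dict = field(default_factory=dict)        # destination -> list of strings
    paragraphs: list = field(default_factory=list)  # properties recorded per paragraph

    def write(self, s):
        self.text.setdefault(self.destination, []).append(s)

def change_destination(name):
    def act(reader, param):
        reader.destination = name
        reader.props = {}            # this sketch resets properties on a change
    return act

def set_property(name):
    def act(reader, param):
        reader.props[name] = 0 if param is None else param
    return act

def insert_char(s):
    def act(reader, param):
        reader.write(s)
    return act

def end_paragraph(reader, param):
    reader.write("\n")                              # insert the paragraph mark
    reader.paragraphs.append(dict(reader.props))    # and record its properties

TABLE = {
    "footnote": change_destination("footnote"),  # change destination
    "li": set_property("left_indent"),           # change formatting property
    "tab": insert_char("\t"),                    # insert special character
    "par": end_paragraph,                        # insert character and perform action
}

def dispatch(reader, word, param=None):
    action = TABLE.get(word)
    if action is not None:           # words missing from the table are ignored
        action(reader, param)
```

Keeping the actions in a table means supporting a new control word is a matter of adding one entry, which matches the look-up model described above.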

Formal Syntax

This chapter describes RTF using the following syntax, based on Backus-Naur Form:

Syntax      Meaning
#PCDATA     Text (without control words).
#SDATA      Hexadecimal data.
#BDATA      Binary data.
'c'         A literal.
<text>      A non-terminal.
a           The (terminal) control word a, without a parameter.
a or aN     The (terminal) control word a, with a parameter.
a?          Item a is optional.
a+          One or more repetitions of item a.
a*          Zero or more repetitions of item a.
a b         Item a followed by item b.
a | b       Item a or item b.
a & b       Item a and/or item b, in any order.
