RichEdit Property Sets
RichEdit has many character-format properties, most of which are documented for ITextFont2 and CHARFORMAT2. Nevertheless, the OpenType specification defines many more character-format properties called OpenType features consisting of a 32-bit identifier (id) and a 32-bit value. For example, the Gabriola font has stylistic set 6, which displays “Gabriola is graceful” as
Variable fonts are the latest addition to the OpenType specification and the variable-font axis coordinates are also specified by an id-value pair. For example, the experimental HoloFont font has three axes, ‘wght’, ‘wdth’, and ‘opsz’, the first two of which are illustrated in
HoloFont was designed by John Hudson and Ross Mills of Tiro Typeworks Ltd.
You can try out variable fonts by checking out this site and you can find myriad variable-font articles and talks here. Variable fonts present a user-interface (UI) challenge. One technique is to use a slide bar to choose an axis coordinate. AI might provide good default values. If the traditional font drop downs are used, you can be confronted with a zillion choices. HoloFont has 9 weights × 5 widths × 6 optical sizes = 270 entries which all appear in the current Word drop-down font list! And that’s tiny compared to the continua of possible axis coordinate values. To illustrate this quandary, here are the first few entries in the HoloFont font drop-down list
Narrow Thin | SemiNarrow Thin | Thin | SemiWide Thin |
Narrow ExtraLight | SemiNarrow ExtraLight | ExtraLight | SemiWide ExtraLight |
Narrow Light | SemiNarrow Light | Light | SemiWide Light |
Narrow SemiLight | SemiNarrow SemiLight | SemiLight | SemiWide SemiLight |
Narrow | SemiNarrow | Regular | SemiWide |
Narrow SemiBold | SemiNarrow SemiBold | SemiBold | SemiWide SemiBold |
Narrow Bold | SemiNarrow Bold | Bold | SemiWide Bold |
Narrow ExtraBold | SemiNarrow ExtraBold | ExtraBold | SemiWide ExtraBold |
Narrow Black | SemiNarrow Black | Black | SemiWide Black |
Clearly such detailed font drop-down lists are impractical, so maybe we should use slide bars or drag selected text handles.
OpenType properties that are used in shaping complex scripts like Arabic are invoked automatically by DirectWrite and Uniscribe. But many other OpenType properties, including these examples, are discretionary and must be present in the backing store to work. In addition, it’s desirable to be able to add other kinds of properties. The CHARFORMAT2::dwCookie member allows a client to attach one 32-bit value to a text run, but there’s a need to attach multiple properties, such as spelling, grammar, and other proofing-error annotations, along with other client properties.
To handle all these properties, the latest Office 365 RichEdit implements property sets as described in the remainder of this post. The D2D/DirectWrite RichEdit mode (but not the GDI/Uniscribe mode) displays the OpenType properties as illustrated in the figures above. The following, admittedly technical, discussion describes the property-set object model, the RTF and binary file format additions for property sets, how to display variable-font and other OpenType features using DirectWrite, and the OpenType variable-font (fvar) table.
Kinds of Properties
The kinds of RichEdit character format properties are summarized in the table
ID Range | Usage |
0..0xFFFF | Properties not in property sets |
0x10000..0x1FFFF | RichEdit temporary properties such as proofing errors |
0x20000..0x2FFFF | Client temporary properties |
0x30000..0x3FFFF | RichEdit persisted properties |
0x40000..0x2020201F | Reserved; returns E_INVALIDARG if used |
0x20202020..0x7E7E7E7E | OpenType features/axis (if 0x80808080 mask = 0; else invalid) |
0x7E7E7E7F..0xFFFFFFFF | Reserved; returns E_INVALIDARG if used |
There are no persisted client properties since they are client-specific and could be misinterpreted if read by a different client.
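As a rough illustration of the table above, the ranges can be checked with a few comparisons. This is only a sketch: the enum and function names are hypothetical (they are not RichEdit identifiers), and the OpenType test applies just the range and the 0x80808080 mask given in the table.

```c
#include <stdint.h>

/* Hypothetical names for the rows of the ID-range table */
typedef enum {
    PROP_NOT_IN_SET,          /* 0..0xFFFF */
    PROP_RICHEDIT_TEMPORARY,  /* 0x10000..0x1FFFF */
    PROP_CLIENT_TEMPORARY,    /* 0x20000..0x2FFFF */
    PROP_RICHEDIT_PERSISTED,  /* 0x30000..0x3FFFF */
    PROP_OPENTYPE,            /* 0x20202020..0x7E7E7E7E with no high bits set */
    PROP_INVALID              /* reserved ranges (E_INVALIDARG in the table) */
} PropKind;

static PropKind ClassifyPropertyId(uint32_t id)
{
    if (id <= 0xFFFF)  return PROP_NOT_IN_SET;
    if (id <= 0x1FFFF) return PROP_RICHEDIT_TEMPORARY;
    if (id <= 0x2FFFF) return PROP_CLIENT_TEMPORARY;
    if (id <= 0x3FFFF) return PROP_RICHEDIT_PERSISTED;
    if (id >= 0x20202020u && id <= 0x7E7E7E7Eu && (id & 0x80808080u) == 0)
        return PROP_OPENTYPE;
    return PROP_INVALID;
}
```

For example, tomFontStretch (0x33E) falls in the first row, and the ‘wght’ tag 0x74686777 falls in the OpenType row.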
Property Set Object Model
The client APIs for setting and getting properties are ITextFont2::SetProperty (id, value) and ITextFont2::GetProperty (id, pvalue). The id’s for these methods are given by xxxx, where xxxx is an OpenType feature tag, an OpenType variable-font axis tag (see MakeTag() below) or an annotation id defined in the table at the end of the preceding section. Since OpenType x’s belong to a limited set of ASCII characters in the U+0020..U+007E range, there’s plenty of room in the 32-bit id space to define other properties. Common properties like font weight are already represented as CCharFormat::_wWeight and in principle don’t need to be members of a property set. Since by default there are no properties in a property set, calling ITextFont2::SetProperty(id, tomDefault) deletes the property id if it exists. Note that id values < 0x10000 are reserved for other purposes, such as tomFontStretch (0x33E) to define a font’s stretch value. These values are well below the first possible OpenType id 0x20202020 (4 spaces). The largest OpenType tag is 0x7E7E7E7E, which gives 94^4 = 78,074,896 tags, although most of them will never be used or are used for other purposes such as ‘MATH’ for the math table. This leaves 256^4 − 94^4 = 4,294,967,296 − 78,074,896 = 4,216,892,400 IDs for other purposes.
OpenType tags are constructed in the order given by the macro
#define MakeTag(a, b, c, d) (((d)<<24) | ((c)<<16) | ((b)<<8) | (a))
For example, the variable-font weight axis tag ‘wght’ has the value 0x74686777.
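To make the byte order concrete, here is a small sketch that builds a tag and unpacks it again. The macro below is a variant of MakeTag() with explicit uint32_t casts, and TagToString() is a hypothetical helper added only for illustration.

```c
#include <stdint.h>

/* Same packing as MakeTag() in the text: first character in the low byte */
#define MakeTag(a, b, c, d) \
    (((uint32_t)(d) << 24) | ((uint32_t)(c) << 16) | ((uint32_t)(b) << 8) | (uint32_t)(a))

/* Hypothetical helper: unpack a tag into a NUL-terminated 4-character string */
static void TagToString(uint32_t tag, char buf[5])
{
    buf[0] = (char)(tag & 0xFF);          /* first character  */
    buf[1] = (char)((tag >> 8) & 0xFF);
    buf[2] = (char)((tag >> 16) & 0xFF);
    buf[3] = (char)((tag >> 24) & 0xFF);  /* fourth character */
    buf[4] = '\0';
}
```

With this packing, MakeTag('w', 'g', 'h', 't') yields 0x74686777, and on a little-endian machine the four tag bytes sit in memory in reading order.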
Internally it’s useful to mark OpenType feature tags with a bit (tomOpenTypeFeature—0x00800000) to distinguish them from variable-font axis tags. This bit cannot be confused with annotation id’s which have values of 0x3FFFF or less. The feature tags are defined by the DWRITE_FONT_FEATURE_TAG enum defined in dwrite.h. The variable-font axis tags are defined by the font’s fvar table discussed below and in principle can be any combination of ASCII letters. So, if a tag isn’t a feature tag, we assume that it’s a variable-font axis tag and let DirectWrite accept or reject it.
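A minimal sketch of the marking scheme follows. Only the tomOpenTypeFeature value (0x00800000) comes from the text; the three helper names are hypothetical.

```c
#include <stdint.h>

#define tomOpenTypeFeature 0x00800000u  /* value given in the text */

/* Hypothetical helpers for setting, testing, and clearing the feature mark */
static uint32_t MarkFeatureTag(uint32_t tag)
{
    return tag | tomOpenTypeFeature;
}

static int IsMarkedFeatureTag(uint32_t id)
{
    return (id & tomOpenTypeFeature) != 0;
}

static uint32_t UnmarkFeatureTag(uint32_t id)
{
    return id & ~tomOpenTypeFeature;
}
```

Because every byte of an OpenType tag is at most 0x7E, bit 0x00800000 is never set in an unmarked tag, so the mark can be removed without losing information. For instance, ‘ss06’ is 0x36307373 unmarked and 0x36B07373 marked.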
Property Set RTF
In RTF, property sets are encoded similarly to the {\colortbl…} for colors and have the form
{\*\propsets id value…; …}
Here the id and value are 32-bit values that are encoded for all properties in a property set. Each property set is ended by a semicolon. This format is repeated for all property sets used in the text. If an id starts with an ASCII letter and consists of 4 ASCII letters, it is written as a character string. For example, the id ‘wdth’ is written as such for the 32-bit id value 0x68746477. If any byte in the id isn’t an ASCII letter, the id is written as a 32-bit integer. These choices make it easier to read property IDs. A value with no fractional part is written as an integer. A value with a fractional part is written as a decimal fixed-point number, e.g., 123.545. Any other combination is invalid and ends reading the RTF stream. The property set table {\*\propsets …} is stored in the RTF header following {\fonttbl …} and {\colortbl …} (if they are present).
An example with two property sets containing variable-font id’s is
{\*\propsets wght 800 wdth 104;wght 400;}
This syntax is a slightly simplified version of the variable-font CSS syntax used in web applications.
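The rule for writing an id (four ASCII letters as a string, anything else as an integer) might be sketched as follows. FormatPropertyId() and IsAsciiLetter() are hypothetical names, and writing the integer form in decimal is an assumption, since the text only says "a 32-bit integer".

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

static int IsAsciiLetter(unsigned c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Hypothetical helper: write a property id as RTF text into buf.
   Returns the snprintf() result (number of characters needed). */
static int FormatPropertyId(uint32_t id, char *buf, size_t cch)
{
    unsigned b0 = id & 0xFF;          /* first tag character (MakeTag order) */
    unsigned b1 = (id >> 8) & 0xFF;
    unsigned b2 = (id >> 16) & 0xFF;
    unsigned b3 = (id >> 24) & 0xFF;

    if (IsAsciiLetter(b0) && IsAsciiLetter(b1) && IsAsciiLetter(b2) && IsAsciiLetter(b3))
        return snprintf(buf, cch, "%c%c%c%c", (int)b0, (int)b1, (int)b2, (int)b3);

    /* Assumption: the integer form is decimal */
    return snprintf(buf, cch, "%lu", (unsigned long)id);
}
```

Under these assumptions the id 0x68746477 is written as wdth, while a tag that contains digits, such as ‘ss06’, is written as a number.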
In the RTF body, a reference to the Nth property set in the \propsets table is given by \psN (like \cfN for choosing the Nth color in the \colortbl). Here N is 0-based, that is, \ps0 refers to the property set immediately following \propsets.
Property Set Binary Format
The property id-value pair is written in the binary format as opyidProperty (0x8A), optProperty (opt8Bytes) followed by the 32-bit id and value. CPropertySet is written as opyidPropertySet (0x89), optPropSet (optArray) followed by the set’s opyidProperty’s. The array of property sets CPropertySets is written as opyidPropertySets (0x88), optPropertySets (optArray) followed by the opyidPropertySet’s. These constants are defined in rebinary.h.
Rendering Variable-Fonts and OpenType Features
In addition to backing-store enhancements, the display routines need to pass active variable-font axis coordinates and OpenType features to DirectWrite. See OpenType Variable Fonts for information about the DirectWrite APIs for this. To create a font specified in part by axis coordinates, RichEdit gets an IDWriteFontFace5 (see dwrite_3.h) with the desired axis coordinates in place of the usual IDWriteFontFace. It does this by calling IDWriteFontFace::QueryInterface() to get an IDWriteFontFace5 interface, calling IDWriteFontFace5::GetFontResource() to get an IDWriteFontResource interface, releasing the IDWriteFontFace5 and calling IDWriteFontResource::CreateFontFace() to get a new IDWriteFontFace5 with the desired axis coordinates. Then it uses this IDWriteFontFace5 instead of the original IDWriteFontFace.
To pass OpenType features to DirectWrite, copy them into a std::vector<DWRITE_TYPOGRAPHIC_FEATURES> and pass them to IDWriteTextAnalyzer1::GetGlyphs() and IDWriteTextAnalyzer1::GetGlyphPlacements(). Some font features, such as Gabriola’s stylistic set 6 ‘ss06’ introduce glyphs with ascents and/or descents that exceed the standard typo ascents and descents as discussed in High Fonts and Math Fonts. To display such large glyphs with no clipping, the rendering software needs to calculate the line ascent and descent from the glyph ink, rather than from the usual font values. This is the approach used with the LineServices math handler.
OpenType Variable Font Axes
The variable font axes are defined in the OpenType fvar table, which has the header
struct FvarHeader // Variable font fvar table header
{
    OTUint16 majorVersion;    // Major version of fvar table (1)
    OTUint16 minorVersion;    // Minor version of fvar table (0)
    OTUint16 axesArrayOffset; // Byte offset from table start to first VariationAxisRecord
    OTUint16 reserved;        // Permanently reserved (2)
    OTUint16 axisCount;       // Count of VariationAxisRecord's
    OTUint16 axisSize;        // BYTE count of VariationAxisRecord (20 for this version)
    OTUint16 instanceCount;   // Count of InstanceRecord's
    OTUint16 instanceSize;    // BYTE count of InstanceRecord
                              // (axisCount*sizeof(DWORD) + (4 or 6))
};
Types like OTUint16 that begin with OT describe big-endian quantities (2 bytes for OTUint16, 4 bytes for OTUint32 and OTFixed) that need byte reversal to work with our little-endian machine architecture. The header is followed by axisCount VariationAxisRecord’s defined by
struct VariationAxisRecord
{
    OTUint32 axisTag;      // Tag identifying axis design variation
    OTFixed  minValue;     // Minimum coordinate value (16.16 format)
    OTFixed  defaultValue; // Default coordinate value
    OTFixed  maxValue;     // Maximum coordinate value
    OTUint16 flags;        // Axis qualifiers (hidden if 1)
    OTUint16 axisNameID;   // ID for 'name' table entry that provides axis display name
};
The axisTag’s have the same MakeTag() form as the regular OpenType tags. Since they are accessed via the OpenType fvar table, they are in a different namespace from the regular OpenType tags. We don’t know of any tag conflicts between the two name spaces, so it’s probably okay not to mark the axis tags differently. But internally we mark OpenType feature tags by setting the high bit of byte 2 (OR in tomOpenTypeFeature), since the tags consist of ASCII symbols in the range 0x20..0x7E. This marking avoids sending OpenType tags to the wrong DirectWrite APIs.
The VariationAxisRecord’s are followed, in turn, by the InstanceRecord’s defined by
struct InstanceRecord
{
    OTUint16 subfamilyNameID;        // ID for 'name' table entry giving subfamily name
    OTUint16 flags;                  // Reserved for future use (0)
    OTFixed  coordinates[axisCount]; // axisCount coordinates, one per axis
    OTUint16 postScriptNameID;       // Optional. ID for 'name' table entry giving PostScript name
};
At some point, it might be worth dealing with the InstanceRecord’s, but it’s certainly easier to use axis coordinates than handle myriad localizable font names (see Holofont discussion in the introduction). RichEdit could export a facility for translating between the two, but probably such a facility should be delegated to the font picker. The localizable font names are designed to help end users recognize the nature of a variable font instance, but they aren’t efficient at the RichEdit level. They also aren’t usable for variable-font animations, since such animations vary axis coordinates continuously.
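As an illustration of reading the structures above from raw font data, here is a sketch that extracts one VariationAxisRecord from an fvar table held in a byte buffer. The names ReadBE16(), ReadBE32(), AxisInfo, and GetFvarAxis() are hypothetical; the field offsets follow the FvarHeader and VariationAxisRecord layouts given above. It reads byte by byte instead of overlaying structs, which sidesteps alignment and byte-order issues.

```c
#include <stddef.h>
#include <stdint.h>

/* Read big-endian values one byte at a time */
static uint16_t ReadBE16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t ReadBE32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical result type with coordinates converted from 16.16 */
typedef struct {
    uint32_t tag;          /* MakeTag() order: 'wght' is 0x74686777 */
    double   minValue;
    double   defaultValue;
    double   maxValue;
    uint16_t flags;
    uint16_t axisNameID;
} AxisInfo;

/* Returns 1 on success, 0 if the index or the buffer is out of range */
static int GetFvarAxis(const uint8_t *fvar, size_t cb, unsigned iAxis, AxisInfo *out)
{
    if (cb < 16)                         /* FvarHeader is 16 bytes */
        return 0;

    unsigned axesArrayOffset = ReadBE16(fvar + 4);
    unsigned axisCount       = ReadBE16(fvar + 8);
    unsigned axisSize        = ReadBE16(fvar + 10);

    if (iAxis >= axisCount || axisSize < 20)
        return 0;

    size_t pos = axesArrayOffset + (size_t)iAxis * axisSize;
    if (pos + 20 > cb)
        return 0;

    const uint8_t *p = fvar + pos;

    /* The tag's first character is the first byte in the file, and MakeTag()
       puts the first character in the low byte, so no byte reversal here */
    out->tag = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);

    out->minValue     = (int32_t)ReadBE32(p + 4)  / 65536.0;
    out->defaultValue = (int32_t)ReadBE32(p + 8)  / 65536.0;
    out->maxValue     = (int32_t)ReadBE32(p + 12) / 65536.0;
    out->flags        = ReadBE16(p + 16);
    out->axisNameID   = ReadBE16(p + 18);
    return 1;
}
```

The InstanceRecord’s are not parsed here, in keeping with the discussion above that axis coordinates are the more practical handle.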
4-28 Decimal Floating-Point Format
The OpenType “fvar” table described in the previous section defines the min, max, and default variable-font axis coordinate values using the OpenType 16.16 numeric format. The integer part of the value is given by shifting right 16 bits, i.e., dividing by 65536. If the fractional part is nonzero, store the value in a floating-point variable and divide by 65536. In applications, coordinates are easier to read when the fractional part is 0 if only the integer part is displayed. Since purely fractional coordinates (values < 1) are useless, if the absolute value is less than 65536, the value can be understood to be an integer without a fractional part.
The OpenType 16.16 format is a binary fixed-point format that may encounter roundoff when converted to decimal, e.g., 800.1 → 800.100006. This roundoff is ugly in RTF, CSS, and dialog boxes. So we need a decimal floating-point format that doesn’t have such roundoff. The IEEE 754-2008 decimal floating-point encoding defines decimal32 with 20 bits of precision, a sign bit and the large exponent range of 10^192. OpenType variable-font axis coordinates need at most four decimal places. The sign bit is used for the slant (slnt) standard axis and can be used for custom axes.
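The 800.1 roundoff mentioned above can be reproduced with a short sketch. Both function names are hypothetical; the rounding is done by hand so that no math library is needed.

```c
#include <stdint.h>

/* Convert a double to 16.16 fixed point, rounding to the nearest integer */
static int32_t DoubleToFixed16_16(double v)
{
    double scaled = v * 65536.0;
    return (int32_t)(scaled + (scaled >= 0 ? 0.5 : -0.5));
}

/* Convert 16.16 fixed point back to a double */
static double Fixed16_16ToDouble(int32_t f)
{
    return f / 65536.0;
}
```

Here 800.1 × 65536 = 52,435,353.6, which rounds to 52,435,354, and 52,435,354 / 65536 = 800.100006103515625. Values with no fractional part, such as 800, survive the round trip unchanged.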
If the value has no fractional part, we store it as a standard 2’s complement integer rather than in the high word of 16.16 for readability in RTF, CSS and dialog boxes. To convert it to the 16.16 format, multiply by 65536. But if the value has a fractional part, we use the following signed 4-28 decimal floating-point format
Bits | Field |
31 | s (sign) |
30..28 | n (decimal divide value) |
27..0 | significand |
If the number is negative, the sign bit 31 is 1. Bits 0..27 are the significand. The decimal divide value n is defined by
n | divide significand by |
000 | (not floating point) |
001 | 10 |
010 | 100 |
011 | 1000 |
100 | 10000 |
101 | 100000 |
110 | 1000000 |
111 | (not floating point) |
n must have at least one 0 bit to distinguish the format from a negative 2’s complement integer and at least one 1 bit to distinguish it from a positive integer.
This gives 28 bits of precision with a maximum value of (2^28 – 1)/10 = 26843545.5 with one decimal place and a minimum value of 0.000001 with six decimal places. These limits are beyond the values used for OpenType variable font-axis coordinates, which typically range between 1 and 1000. The 4-28 decimal floating-point format is easy to use and displays the original fixed-point values with no round-off error. To convert it to the 16.16 format, store the 28-bit significand field in a double variable, divide by the number corresponding to n, multiply by 65536 and round to the nearest integer. For the DWrite APIs, store the 28-bit significand field in a double, divide by the number corresponding to n and cast the result to a FLOAT.
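The two conversions just described might look like the following sketch. The function names and the divisor table are hypothetical, and the caller is assumed to have already checked that n is in the range 1..6 of the n table.

```c
#include <stdint.h>

/* Divisors indexed by n; entries 0 and 7 are placeholders because
   those n values do not denote a decimal float */
static const double kDivisor[8] = { 1, 10, 100, 1000, 10000, 100000, 1000000, 1 };

/* Decode a 4-28 decimal float: sign in bit 31, n in bits 28..30,
   significand in bits 0..27 */
static double DecimalFloatToDouble(uint32_t x)
{
    unsigned n = (x >> 28) & 7;
    double v = (double)(x & 0x0FFFFFFFu) / kDivisor[n];
    return (x & 0x80000000u) ? -v : v;
}

/* Convert to 16.16: multiply by 65536 and round to the nearest integer.
   Only meaningful when the magnitude fits in 16.16 (below 32768). */
static int32_t DecimalFloatToFixed16_16(uint32_t x)
{
    double scaled = DecimalFloatToDouble(x) * 65536.0;
    return (int32_t)(scaled + (scaled >= 0 ? 0.5 : -0.5));
}
```

For example, 800.1 is stored as significand 8001 with n = 1, that is 0x10001F41, and converts to the 16.16 value 52,435,354. For the DWrite APIs, the result of DecimalFloatToDouble() would be cast to a FLOAT.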
In C, the 4-28 decimal floating-point format of the value x is recognized by the function IsDecimalFloat(x) defined by
#define IsDecimalFloat(x) IN_RANGE(1, ((x) >> 28) & 7, 6)
where IN_RANGE() is defined by
#define IN_RANGE(n1, b, n2) ((unsigned)((b) - (n1)) <= (unsigned)((n2) - (n1)))
The divide factor in the n table is given by pow(10, (x >> 28) & 7), or (x >> 28) & 7 can be used as a table index.
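Going the other way, a value can be packed into the 4-28 format from its decimal digits and the number of decimal places. This is a sketch under the n table above (n = 1..6); MakeDecimalFloat() and IsDecimalFloatValue() are hypothetical names, the latter being a function form of the recognition test.

```c
#include <stdint.h>

/* Function form of the recognition test: n (bits 28..30) must be 1..6 */
static int IsDecimalFloatValue(uint32_t x)
{
    unsigned n = (x >> 28) & 7;
    return n >= 1 && n <= 6;
}

/* Pack sign, significand, and decimal-place count into the 4-28 format.
   For 123.545 pass significand 123545 and places 3.
   Returns 0 if the arguments cannot be represented. */
static uint32_t MakeDecimalFloat(int negative, uint32_t significand, unsigned places)
{
    if (places < 1 || places > 6 || significand > 0x0FFFFFFFu)
        return 0;
    return (negative ? 0x80000000u : 0u) | ((uint32_t)places << 28) | significand;
}
```

Plain integers are not packed this way: a positive integer such as 800 has n = 0 and a small negative 2’s complement integer such as −800 has n = 7, so neither is mistaken for a decimal float.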