Representation of Math Accents
The post Math Accents discusses how accent usage in math zones differs from that in ordinary text, notably in the occurrence of multicharacter bases. Even with single-character bases, math accents may vary in width, whereas in ordinary text the accent widths are the same for all letters. The present post continues the discussion by describing the large number of accents available for math in Unicode and in Microsoft Office math zones and how they are represented in MathML, RTF, OMML, LaTeX, and UnicodeMath.
Unicode math accents
As noted in Section 3.10 Accent Operators of the UnicodeMath specification, the most common math accents are (along with their TeX names)
These and more accents are described in Section 2.6 Accented Characters and 3.2.7 Combining Marks in Unicode Technical Report #25, Unicode Support For Mathematics. More generally, the Unicode ranges U+0300..U+036F and U+20D0..U+20EF have these and other accents that can be used for math.
The Windows Character Map program shows that the Cambria Math font has all combining marks in the range 0300..036F as well as 20D0..20DF, 20E1, 20E5, 20E6, 20E8..20EA. The range 0300..036F used as math accents in Word looks like
Except for the horizontal overstrikes and the double-character accents shown in red, all these work as math accents in Microsoft Office apps, although many aren’t used in math. In keeping with the Unicode Standard, UnicodeMath represents an accent by its Unicode character, placing the accent immediately after the base character. There’s no need for double-character accents in Microsoft Office math since the corresponding “single” character accents expand to fit their bases as in
In UnicodeMath, this is given by (a+b)~, where ~ can be entered using the TeX control word \tilde. This is simpler than TeX, which uses \widetilde{a+b} for automatically sized tildes rather than \tilde{a+b}.
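To make the "accent follows its base" convention concrete, here is a minimal Python sketch. The helper name `accent` and its parenthesization rule (wrap any base longer than one character) are assumptions made for illustration; they are not part of UnicodeMath or of any Office API.

```python
import unicodedata

COMBINING_CIRCUMFLEX = "\u0302"  # TeX \hat
COMBINING_TILDE = "\u0303"       # TeX \tilde


def accent(base: str, mark: str) -> str:
    """Return base followed by a combining mark, UnicodeMath style.

    Hypothetical helper: multicharacter bases are wrapped in parentheses
    so that the mark applies to the whole expression.
    """
    # Combining marks have a Unicode general category starting with "M"
    # (Mn = nonspacing, Me = enclosing, Mc = spacing combining).
    if len(mark) != 1 or not unicodedata.category(mark).startswith("M"):
        raise ValueError("mark must be a single combining character")
    if len(base) > 1:
        base = f"({base})"
    return base + mark


print(accent("a", COMBINING_CIRCUMFLEX))  # a followed by U+0302
print(accent("a+b", COMBINING_TILDE))     # (a+b) followed by U+0303
```

The same function covers both the single-character and the stretched case, which mirrors the point above: no separate "wide" accent is needed.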
The combining marks in the range 20D0..20EF that work as accent objects in Office math zones are
You can test accents that don’t have TeX control words by inserting a math zone (type alt+=), then typing a non-hex letter followed by the Unicode value, alt+x, and a space. For example, alt+=, z, 36F, alt+x, space gives
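Why a non-hex letter? The following Python sketch is a toy model of the alt+x conversion, not Word's actual algorithm: it assumes alt+x consumes the trailing run of up to four hex digits before the insertion point. The function name `alt_x` and the four-digit cap are assumptions for illustration only.

```python
import string

HEX_DIGITS = set(string.hexdigits)  # 0-9, a-f, A-F


def alt_x(text: str) -> str:
    """Toy model of alt+x: convert the trailing hex run to a character.

    Assumption: at most four trailing hex digits are consumed.
    """
    i = len(text)
    while i > 0 and text[i - 1] in HEX_DIGITS and len(text) - i < 4:
        i -= 1
    if i == len(text):
        return text  # no hex digits before the insertion point
    return text[:i] + chr(int(text[i:], 16))


# "z" is not a hex digit, so only 36F is converted and z keeps its accent.
assert alt_x("z36F") == "z\u036f"
# "a" is a hex digit, so the whole run a36F becomes the single character U+A36F.
assert alt_x("a36F") == "\ua36f"
```

With a base such as a, b, c, d, e, or f, the base letter would be absorbed into the hex value, which is why the example uses z.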
Accents in MathML
MathML 1 was released as a W3C recommendation in April 1998 as the first XML language to be recommended by the W3C. At that time, Unicode was just starting to take hold as Microsoft Word 97 and Excel 97 had switched to Unicode. [La]TeX was developed before Unicode 1.0, so it relied on control words. Accordingly, it was common practice in 1998 to use control words or common spacing accents to represent accents instead of the Unicode combining marks even though many accents didn’t have a unified standardized representation. Unicode standardized virtually all math accents by using combining marks.
One problem with using the combining marks in file formats is that they, well, combine! So, it may be difficult to see them as separate entities unless you insert a no-break space (U+00A0) or space (U+0020) in front of them. UnicodeMath allows a no-break space to appear between the base and accent since UnicodeMath is used as an input format as well as in files. Only programmers need to look at most file formats (HTML, MathML, OMML, RTF), so a reliable standard is more important for file formats than user-friendly presentation.
MathML 3’s operator dictionary defines most horizontal arrows with the “accent” property. In addition, it defines the following accents
02C6 ˆ modifier letter circumflex accent
02C7 ˇ caron
02C9 ˉ modifier letter macron
02CA ˊ modifier letter acute accent
02CB ˋ modifier letter grave accent
02CD ˍ modifier letter low macron
02D8 ˘ breve
02D9 ˙ dot above
02DA ˚ ring above
02DC ˜ small tilde
02DD ˝ double acute accent
02F7 ˷ modifier letter low tilde
0302 ̂ combining circumflex accent
0311 ̑ combining inverted breve
Presumably the operator dictionary should be extended to include more math combining marks and their equivalents, if they exist, with the spacing diacritics in the range U+02C6..U+02DD.
Here’s the MathML for the math object 𝑎̂.
<mml:mover accent="true">
  <mml:mi>a</mml:mi>
  <mml:mo>^</mml:mo>
</mml:mover>
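For readers who generate MathML programmatically, the element above can be built with Python's standard `xml.etree.ElementTree` module. This is only a sketch: the function name `mover_accent` is hypothetical, and the `mml` prefix is bound to the standard MathML namespace URI to match the listing.

```python
import xml.etree.ElementTree as ET

MML_NS = "http://www.w3.org/1998/Math/MathML"
ET.register_namespace("mml", MML_NS)  # serialize with the mml: prefix


def mover_accent(base: str, accent: str) -> str:
    """Return an <mml:mover accent="true"> element as a string.

    Hypothetical helper: the first child is the base (<mml:mi>),
    the second child is the accent (<mml:mo>).
    """
    mover = ET.Element(f"{{{MML_NS}}}mover", {"accent": "true"})
    ET.SubElement(mover, f"{{{MML_NS}}}mi").text = base
    ET.SubElement(mover, f"{{{MML_NS}}}mo").text = accent
    return ET.tostring(mover, encoding="unicode")


print(mover_accent("a", "^"))
```

The serialized output also carries an `xmlns:mml` declaration on the root element, which a fragment embedded in a larger document would normally inherit from an ancestor.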
Accents in OMML
“Office MathML” (OMML) is the XML used in Microsoft Office file formats to represent most math. It’s an XML version of the in-memory math object model, which differs from MathML. The math accent object 𝑎̂ has the following OMML
<m:acc>
  <m:accPr>
    <m:chr m:val=" ̂"/>
    <m:ctrlPr/>
  </m:accPr>
  <m:e>
    <m:r>
      <m:t>𝑎</m:t>
    </m:r>
  </m:e>
</m:acc>
The Rich Text Format (RTF) represents math zones essentially as OMML written in RTF syntax. Regular RTF uses the \uN notation for Unicode characters not in the current code page. The math accent object 𝑎̂ has the RTF
{\macc{\maccPr{\mctrlPr\i\f0\fs20 }{\mchr \u770? }}{\me\i\u-10187?\u-9138?}}
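The numbers in the \uN notation are signed 16-bit decimal values of UTF-16 code units, each followed by a fallback character (here ?) for readers that don't understand \u. Characters beyond the BMP, such as 𝑎 (U+1D44E), are written as a surrogate pair. Here is a minimal Python sketch of that encoding; the function name `rtf_escape` is hypothetical, and the sketch assumes the default of one fallback character per \uN.

```python
def rtf_escape(text: str) -> str:
    """Encode text using RTF \\uN escapes for non-ASCII characters.

    Sketch only: assumes one '?' fallback character per \\uN.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch if ch in "\\{}" else ch)
            continue
        if cp < 0x10000:
            units = [cp]
        else:
            # Split into a UTF-16 surrogate pair.
            v = cp - 0x10000
            units = [0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)]
        for unit in units:
            # RTF stores N as a signed 16-bit integer.
            n = unit - 0x10000 if unit >= 0x8000 else unit
            out.append(f"\\u{n}?")
    return "".join(out)


print(rtf_escape("\u0302"))      # \u770?
print(rtf_escape("\U0001D44E"))  # \u-10187?\u-9138?
```

The two printed values match the \mchr and \me arguments in the RTF above: 770 is 0x0302, and -10187, -9138 are the surrogates 0xD835, 0xDC4E read as signed 16-bit numbers.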
Unicode RTF is easier to read since characters are written in Unicode
{\macc{\maccPr{\mctrlPr\i\f0\fs20 }{\mchr ̂}}{\me\i 𝑎}}
But none of these is as simple as the UnicodeMath 𝑎 ̂ ☺.