Linear Format Notations for Mathematics

I have been having a great discussion with Christian Lerch about computer-oriented mathematical notations. He has a program that lets you input MathML using a pure ASCII syntax similar to ASCIIMathML. A lightly commented EBNF grammar of his MathEL language as currently implemented (still beta and evolving a bit) is given in https://km-works.eu/mathel-interactive/img/MathEL-ebnf.txt. Christian has a more elaborate reference manual in the works. One interesting feature is that he omits the leading backslash in the TeX-like names for symbols. This approach is definitely more readable and is the choice made by eqn/troff, which is distributed with Unix. TeX loomed so large in our thinking that we just automatically used the [La]TeX names by default. Office’s autocorrect facility allows users to define any names they want, so users can add names without leading backslashes. Perhaps I can add an option that does not require the backslashes in math zones.

Another approach has been discussed, along with many earlier mathematical notations, in Stephen Wolfram's article Mathematical Notation: Past and Future (2000). Wolfram’s discussion is both edifying and entertaining to read. I think he makes considerable progress with Mathematica’s notation, including introducing the five special characters we ended up adding to Unicode (U+2145..U+2149) for things like differential d (U+2146). However, his thinking is a little constrained by a very desirable goal: whatever math you write should be computable by Mathematica.

Something like Presentation MathML has sufficient flexibility to produce mathematics that cannot be computed without additional context and semantics. Once you allow a notation to have such ambiguity, you can go farther than Wolfram is willing to go. Wolfram calls the linear format “StandardForm” and the presentation format “TraditionalForm”.

I think that a linear format resembles traditional math notation far more closely if one uses Unicode symbols whenever possible. Having an autocorrection facility translate ASCII names for operators and Greek letters into Unicode symbols provides a convenient way to input the desired symbols, but the real format is defined in terms of Unicode symbols, not the ASCII names for these symbols. In addition to looking more mathematical, the notation is then automatically globalized, aside from some Arabic locales. The [La]TeX ASCII symbol names have a minor English-language prejudice. Hopefully this bias isn’t too objectionable to speakers of other languages. The Unicode symbols are international by their very nature. As such, an international linear format for math should use Unicode symbols rather than ASCII, although it is fine to use the latter for input.
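To make the input-versus-format distinction concrete, here is a minimal Python sketch of an autocorrect pass. It is not how Office implements autocorrect; the table `AUTOCORRECT`, the function `autocorrect`, and the `require_backslash` switch are all hypothetical names, and the table holds only a handful of illustrative entries.

```python
import re

# Hypothetical autocorrect table: a few TeX-style names mapped to Unicode.
# These entries are illustrative only; a real table would be far larger.
AUTOCORRECT = {
    "alpha": "\u03b1",  # Greek small letter alpha
    "beta": "\u03b2",   # Greek small letter beta
    "infty": "\u221e",  # infinity
    "int": "\u222b",    # integral
    "sum": "\u2211",    # n-ary summation
}

def autocorrect(text: str, require_backslash: bool = True) -> str:
    """Replace known symbol names in `text` with their Unicode characters."""
    names = "|".join(sorted(AUTOCORRECT, key=len, reverse=True))
    if require_backslash:
        prefix = r"\\"
    else:
        # Accept a backslash, or a bare name that does not continue a word,
        # so that the "int" inside "print" is left alone.
        prefix = r"(?:\\|(?<![A-Za-z]))"
    # The lookahead keeps "\int" from matching the start of "\infty".
    pattern = re.compile(prefix + "(" + names + ")(?![A-Za-z])")
    return pattern.sub(lambda m: AUTOCORRECT[m.group(1)], text)
```

With the default setting, `autocorrect(r"\int_-\infty^\infty")` yields the Unicode string ∫_-∞^∞; passing `require_backslash=False` lets the bare names `int_-infty^infty` produce the same result. Either way the stored text consists of Unicode symbols, and the ASCII names exist only at input time.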

In addition, it is nice to have little fix-ups that reduce the number of brackets/parens needed to overrule operator precedence. For example, int_-infty^infty should be okay, since it’s clear by context that the - in -infty must be a unary minus. Also it’s nice to have a^b^c mean something (treat it right associatively), unlike in TeX, where it’s an error. OTOH, a^b_c is different, since TeX’s meaning for it is quite handy. Expressions like 1/.2 should create a fraction with .2 in the denominator rather than two sentence fragments. So the ASCII period and comma end up being handled differently when part of a number than in ordinary text. You also need a way to bind in n-aryands, like integrands and summands. So I introduced the glue operator U+2592, which both glues the n-aryand to the n-ary operator and terminates a limit, if one appears. To keep things looking natural, I considered a simple operand to be an alphanumeric span, rather than TeX’s single letter. The expression a_123 is a sub 123 rather than a sub 1 multiplied by 23. A complex operand can include parenthesized expressions so that f(x)/g(x) builds up to the ratio of f(x) to g(x).
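A few of these operand rules can be sketched in Python. This is a toy recursive-descent parser written for illustration, not the Office implementation: the names `tokenize`, `parse`, and the tuple tags (`"frac"`, `"sup"`, `"sub"`, `"paren"`, `"juxt"`) are all made up here. It covers only alphanumeric-span operands, numbers with a leading or embedded period, right-associative scripts, and parenthesized complex operands. It does not handle unary minus, the glue operator, or the special TeX-style meaning of a^b_c.

```python
import re

# A number may contain a period (".2", "1.5"); otherwise a simple operand
# is a whole alphanumeric span, not a single letter as in TeX.
TOKEN = re.compile(r"\d*\.\d+|[A-Za-z0-9]+|[\^_/()]")
OPERATORS = ("^", "_", "/", ")")

def tokenize(text):
    compact = text.replace(" ", "")
    tokens = TOKEN.findall(compact)
    if "".join(tokens) != compact:
        raise ValueError("unsupported character in " + repr(text))
    return tokens

def parse(text):
    tokens = tokenize(text)
    tree = parse_fraction(tokens)
    if tokens:
        raise ValueError("unexpected token " + repr(tokens[0]))
    return tree

def parse_fraction(tokens):
    node = parse_script(tokens)
    while tokens and tokens[0] == "/":
        tokens.pop(0)
        node = ("frac", node, parse_script(tokens))
    return node

def parse_script(tokens):
    base = parse_operand(tokens)
    if tokens and tokens[0] in ("^", "_"):
        op = tokens.pop(0)
        # Recursing instead of looping groups a^b^c as a^(b^c).
        return ("sup" if op == "^" else "sub", base, parse_script(tokens))
    return base

def parse_group(tokens):
    # Called after "(" has been consumed; reads up to the matching ")".
    inner = parse_fraction(tokens)
    if not tokens or tokens.pop(0) != ")":
        raise ValueError("missing closing parenthesis")
    return ("paren", inner)

def parse_operand(tokens):
    if not tokens or tokens[0] in OPERATORS:
        raise ValueError("operand expected")
    tok = tokens.pop(0)
    node = parse_group(tokens) if tok == "(" else tok
    # A complex operand may continue with parenthesized groups, as in f(x).
    while tokens and tokens[0] == "(":
        tokens.pop(0)
        node = ("juxt", node, parse_group(tokens))
    return node
```

In this sketch `parse("a_123")` gives `("sub", "a", "123")`, `parse("1/.2")` gives `("frac", "1", ".2")`, and `parse("a^b^c")` nests the second superscript inside the first.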

Another requirement is to be able to round-trip the presentation (Professional/built-up) format through the linear format and back. This feature is very handy, since it lets us represent essentially arbitrary mathematical expressions/equations in “plain” text. One particular use is for user-definable autocorrect entries. For example, you can assign \binomial with the linear format (a+b)^n=∑_(k=0)^n ▒(n¦k)a^k b^(n-k) and presto! When you type \binomial followed by a space or carriage return, it builds up to the binomial formula. Well, not yet in Word 2010 (you need to select Professional Format explicitly), but it does in PowerPoint 2010 and OneNote 2010.
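For readers more used to [La]TeX, the built-up binomial formula corresponds roughly to the following, assuming that (n¦k) is read as n stacked over k inside parentheses, i.e. the binomial coefficient:

```latex
(a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}
```

The glue operator ▒ has no visible counterpart here; it only marks where the summand begins.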

For special edge cases, I needed a couple of somewhat arcane constructs like the lenticular brackets 〖 〗 for round-tripping compound arguments, since if parentheses were used, they would be displayed. The result is that the Unicode linear format is a very general mathematical notation, but its grammar is not context-free. This would certainly upset Wolfram, and it does make the format harder to implement and maintain in its full generality, although it is quite efficient in practice. It is the closest to a true mathematical notation that I could define. Let me hasten to add that many, many people, inside and outside Microsoft, have contributed ideas that shaped the version that Microsoft Office 2010 offers.

Bertrand Russell once wrote, “A good notation has a subtlety and suggestiveness which at times make it seem almost like a live teacher…and a perfect notation would be a substitute for thought.” The Unicode linear format certainly isn’t a substitute for thought, but it is a more mathematically natural notation than previously available on computers.

Comments

  • Anonymous
    September 20, 2010
    I use the linear format almost exclusively, and to make it easier, I've defined a large number of keyboard shortcuts. For example, "ctrl-G a" gives me a Greek alpha. I find I like that a lot better than typing the longer TeX-inspired names, although it does mean memorizing a number of control sequences. Has anyone else done this in a systematic way? Ideal, for me, would be a math keyboard with all the extra characters nicely labeled--something about as complex as the old APL keyboards. --Greg

  • Anonymous
    September 20, 2010
    The comment has been removed

  • Anonymous
    September 24, 2010
    I just discovered Office math capabilities (I use OneNote 2010 only though). It is definitely the best math editor I have seen and while it can be improved (like the placement of the cursor right after applying an AutoCorrection), I don't see how one could enter math any more efficiently than the way it's implemented in Office. For example, I created an AutoCorrect list that allows me to enter math in real time during a lecture, so it has to be extremely efficient. None of my AutoCorrect entries requires more than 2 characters, and each is really easy to remember (only 4 rules to remember). I also don't use any keyboard shortcuts so far except the one for math zones (ALT+=). E.g. I append double quotes to letters to turn them Greek (e.g.: a" for alpha) or I append a dot to turn letters into common double-struck letters or other common letter-like symbols. It's still a work in progress though, since I am still reading up on the unofficial documentation and there are still a few symbols I haven't added yet. I'd love to do a guest post on this when I am done, since I believe there is no faster and easier to remember way to enter math than the AutoCorrect list I am working on. Are there plans to release official documentation on all the math features of Office and what feature is available in what product and what version?

  • Anonymous
    September 26, 2010
    Your math autocorrect list seems very efficient and useful. It'd be cool to have a blog post on it. More documentation will become available as time goes on. Most of the functionality is in Word 2007/2010, OneNote 2010, and applications that incorporate the 2010 version of OfficeArt, such as PowerPoint 2010 and Excel 2010. There are some differences (mostly in the user interfaces) and documenting them would make for a good blog post as well.

  • Anonymous
    September 28, 2010
    Greg: I defined a Math keyboard for using with Word 2007 when I was an undergrad, although I didn't actually produce a physically-labeled one. It's fallen out of use for me since I switched to Mac, but I could get a very good speed entering equations with it (often such that Word couldn't keep up with my typing) and would rarely need to resort to the ribbon (only for esoteric symbols and matrices; even then I would use the keyboard to navigate it). It's available online (free) at http://ed.mvps.org/physics/

  • Anonymous
    November 05, 2010
    The comment has been removed

  • Anonymous
    November 05, 2010
    I agree that it would be great to have a LaTeX input option. While I personally prefer the current linear format since it's a more mathematical notation and somewhat more efficient, I'm also a strong supporter of standards and conventions. In particular, if you have LaTeX in your fingers, why should you have to learn a new notation? LaTeX works fine for you. Naturally we have to prioritize this feature request (and it's a common request) against other things we're doing.

  • Anonymous
    August 23, 2011
    First let me say how much I love the new math mode. I have a question for which I haven't been able to find the answer in the various documents and blog posts. How do I enter the equivalent of LaTeX's \hat{x}? If I type hat(, this creates a container with a hat over it, but the parenthesis is still there, so I need to delete the paren, then hit the back arrow key to position the cursor inside the container to type the "x". I'm sure there is a better way to do this.

  • Anonymous
    August 23, 2011
    Thanks for the compliment! Office math follows the Unicode custom of placing the combining mark after the expression that gets the combining mark. So typing zhat followed by a space, and then another space or an operator, puts the hat over the z. Similarly, (a+b)hat puts a wide hat over the a+b, as described in the linear format document (www.unicode.org/.../UTN28-PlainTextMath-v3.pdf) in Section 3.10 Accent Operators.