RichEdit 8 Feature Additions

The time has come to summarize the features added in RichEdit 8, which shipped with Windows 8 and Office 2013. Since so much was added, I wrote a number of blog posts over the last twelve months about the larger RichEdit 8 features. The present post lists those features and then describes some smaller features included in RichEdit 8. Two large features, the Text Object Model Version 2 (TOM 2) and the Windows RT TOM, don’t have separate posts since they’re described in detail in MSDN. In spite of these other posts, this post is bigger than usual. Features added in previous versions of RichEdit are described in RichEdit Versions 1.0 through 3.0, RichEdit versions, and RichEdit Versions Update to 7.0.

Contents

DWrite/D2D Support

Performance Improvements

RichEdit Spell Checking

Touch

Accessibility

Images

Flyweight Controls

Emoji

Variation Sequences

TOM Version 2

TOM Table Interface

Windows RT TOM

Immersive RichEdit

Windows 8 RichEdit

Windows Phone 8 RichEdit

Windows RT Font Binding

Character Flags Class

DWrite Font Fallback

Font Style, Weight, Stretch

Default Preferred Font Table

BCP-47 Language Tag Support

More Keyboards

More List Types

Math

RTL Math Prototype

Generic LineServices Math Callbacks

Hyperlink Schemes

Encrypted Passwords

Ellipsis

Battery life

 

DWrite/D2D Support

Office 2013 has undergone a substantial shift to a relatively new display facility, Direct2D, and a new text facility, DirectWrite. These are the facilities used on Windows Phone 8, the new Windows RT slates, and optionally on Windows 7 and Windows 8. For further info, see this post.

Performance Improvements

This post describes a couple of performance improvements: 1) a more efficient display tree, and 2) a faster rich-text formatting mechanism.

RichEdit Spell Checking

This post describes how RichEdit 8 was enhanced to access the Windows 8 spell-checking and autocorrection components directly.

Touch

This post describes how the RichEdit selection and grippers work on Windows 8 touch devices.

Accessibility

This post describes an implementation of Microsoft UI Automation (UIA) that exposes most objects in a RichEdit instance. This is done via the UIA Text Pattern and includes basic character and paragraph formatting, images, OLE objects, math zones, tables, hyperlinks, and the text inside or associated with these objects.

Images

This post describes how RichEdit 8 uses the Windows Imaging Component (see also) to provide image support for JPEG, PNG, and GIF files.

Flyweight Controls

Over the years, the basic edit control has grown in size to accommodate greatly increased functionality. So now, even a plain-text, single-line control is pretty large. Clients can benefit from “flyweight RichEdit controls”, which are stored in RichEdit stories (ITextStory, part of TOM 2) and share the properties of the parent ITextServices. RichEdit uses scratch stories internally for math build up/down, to convert MathML into the internal math representation, and to copy rich text when the text in the original copy selection gets changed. And there’s the main flyweight story that’s used by default. So all clients benefit from flyweight controls, even those that don’t use the controls explicitly via the ITextStory interface. For more info, see this post.

Emoji

Emoji characters posed special challenges due in part to the Unicode unification of 107 Emoji characters with existing characters in the BMP and in part to the 11 keycap Emoji for #, 0, …, 9, which use the U+20E3 keycap combining mark. The original plan was to use the Segoe UI Symbol font for all Emoji, but this font choice is ambiguous for the unified Emoji. RichEdit 8 uses Segoe UI Symbol for all Emoji except the double exclamation mark (U+203C: ‼), which uses the current font if it has this character.

If an ambiguous Emoji character is followed by one of the BMP “Emoji” variation selectors U+FE0E and U+FE0F, RichEdit treats it as an Emoji character. U+FE0E specifies that the character should be rendered using a standard Emoji-capable font, e.g., Segoe UI Symbol, whereas U+FE0F implies that special Emoji rendering should be used. This special rendering is specified, in principle, in a higher-order protocol, but RichEdit 8 doesn’t have such a protocol. For more info, see this post.
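
To make the sequence rules above concrete, here is a minimal C++ sketch. It is not RichEdit’s code: the names EmojiHint, IsKeycapBase, and ClassifyAt are invented for illustration, and the sketch only inspects a base character and the single code point that follows it.

```cpp
#include <cstddef>
#include <string>

constexpr char32_t kSelectorFE0E = 0xFE0E;  // variation selector U+FE0E
constexpr char32_t kSelectorFE0F = 0xFE0F;  // variation selector U+FE0F
constexpr char32_t kKeycapMark   = 0x20E3;  // keycap combining mark U+20E3

// Hypothetical result type: what the code point after the base tells us.
enum class EmojiHint { None, Keycap, SelectorFE0E, SelectorFE0F };

// The 11 keycap bases are '#' and the digits '0'..'9'.
constexpr bool IsKeycapBase(char32_t ch) {
    return ch == U'#' || (ch >= U'0' && ch <= U'9');
}

// Classifies the base character at text[i] by the code point that follows it.
EmojiHint ClassifyAt(const std::u32string& text, std::size_t i) {
    if (i + 1 >= text.size()) return EmojiHint::None;  // nothing follows the base
    const char32_t next = text[i + 1];
    if (next == kKeycapMark && IsKeycapBase(text[i])) return EmojiHint::Keycap;
    if (next == kSelectorFE0E) return EmojiHint::SelectorFE0E;
    if (next == kSelectorFE0F) return EmojiHint::SelectorFE0F;
    return EmojiHint::None;
}
```

For instance, ClassifyAt(U"#\u20E3", 0) yields EmojiHint::Keycap, while a lone U+203C yields EmojiHint::None, leaving the font choice to whatever default rule applies.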

Variation Sequences

Variation selector sequences posed challenges in both the user interface and font selection. Such sequences consist of a base character, either in the BMP or a surrogate pair, followed by a variation selector, which can also be in the BMP (U+FE00..U+FE0F) or a surrogate pair (U+E0100..U+E01EF). The keyboard arrow, Delete, and Backspace keys need to treat a VS sequence as a single character. Font binding is tricky, since currently only special fonts have support for VS sequences. Initially we tried font binding the U+E0100..U+E01EF variation selectors to a Japanese font, since the only usage at the time was in Japan. But this was changed to use the font of the base character, since it’s likely that China will define some VS sequences as well. It’s important that the variation selector is in the same character format run as its base character. See also the Emoji entry above, which mentions how VS sequences can help denote how to render emoji characters.
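
The “single character” behavior for the arrow, Delete, and Backspace keys can be sketched roughly as follows. This is an illustration over UTF-16 text, not RichEdit’s implementation; the helper names DecodeAt, NextCaretStop, and PrevCaretStop are hypothetical, and the sketch ignores other kinds of clusters such as combining marks.

```cpp
#include <cstddef>
#include <string>

// Variation selectors: U+FE00..U+FE0F (BMP) and U+E0100..U+E01EF (surrogate pairs).
constexpr bool IsVariationSelector(char32_t ch) {
    return (ch >= 0xFE00 && ch <= 0xFE0F) || (ch >= 0xE0100 && ch <= 0xE01EF);
}

// Decodes the code point starting at s[i]; len receives 1 or 2 (UTF-16 code units).
char32_t DecodeAt(const std::u16string& s, std::size_t i, std::size_t& len) {
    const char16_t hi = s[i];
    if (hi >= 0xD800 && hi <= 0xDBFF && i + 1 < s.size()) {
        const char16_t lo = s[i + 1];
        if (lo >= 0xDC00 && lo <= 0xDFFF) {
            len = 2;
            return 0x10000 + ((char32_t(hi) - 0xD800) << 10) + (char32_t(lo) - 0xDC00);
        }
    }
    len = 1;
    return hi;
}

// Index just past the character at pos, including a trailing variation selector.
std::size_t NextCaretStop(const std::u16string& s, std::size_t pos) {
    if (pos >= s.size()) return s.size();
    std::size_t len = 0;
    DecodeAt(s, pos, len);
    std::size_t next = pos + len;
    if (next < s.size()) {
        std::size_t vsLen = 0;
        if (IsVariationSelector(DecodeAt(s, next, vsLen))) next += vsLen;
    }
    return next;
}

// Start of the unit that ends at or spans pos (what Backspace would delete back to).
// Walks forward from the start for simplicity rather than scanning backward.
std::size_t PrevCaretStop(const std::u16string& s, std::size_t pos) {
    std::size_t prev = 0, cur = 0;
    while (cur < pos && cur < s.size()) {
        prev = cur;
        cur = NextCaretStop(s, cur);
    }
    return prev;
}
```

With the text U+845B U+E0100 followed by “A”, a right-arrow from position 0 lands at position 3 (one code unit for the base plus two for the selector), and Backspace at position 3 goes back to 0.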

TOM Version 2

The Text Object Model (TOM) Version 2 adds the interfaces ITextDocument2, ITextSelection2, ITextRange2, ITextFont2, ITextPara2, ITextStoryRanges2, ITextStrings, ITextStory, and ITextRow. The complete TOM model is defined by tom.idl, which includes tom1.idl. The interfaces are all documented in MSDN.

TOM Table Interface

This post describes the ITextRow table interface, which allows you to insert tables, examine tables, and perform table manipulations such as inserting, deleting, and resizing table columns. Along with the ITextRange Move methods, the ITextRow methods give complete control over RichEdit’s nested table facility.

Windows RT TOM

The Windows RT Text Object Model gives the Windows RT RichEditBox a TOM-like object model. The Windows RT TOM is a subset of the full TOM 2 interfaces. It has the following interfaces, all in the Windows.UI.Text namespace: ITextDocument, ITextSelection, ITextRange, ITextCharacterFormat, ITextParagraphFormat, and ITextConstantsStatics. The first five of these interfaces delegate to the TOM 2 ITextDocument2, ITextSelection2, ITextRange2, ITextFont2, and ITextPara2, respectively. The large TOM 2 enum of values is broken into a set of enums, each oriented towards a particular feature. The Windows.UI.Text.idl file defines the interfaces and enumerations.

Immersive RichEdit

For the new immersive environment on tablets and on Windows Phone 8, not only are GDI and Uniscribe absent, so are the functions handled by the venerable user32.dll. That library includes the Windows functions SendMessage, MessageBox, CreateWindow, etc. A version of RichEdit 8 has been created for the immersive environment. All instances are windowless and use D2D/DWrite for measuring and rendering. The client can still send RichEdit messages via the ITextServices::TxSendMessage() method. The advantage of dropping the traditional user32.dll is the relative simplicity of the model. But doing so omits significant functionality, at least in the initial version. At the same time, the touch functionality is dramatically improved on Windows 8 and in the immersive environment. The immersive version of RichEdit 8 is used by the Windows Store OneNote.

Windows 8 RichEdit

The Windows 8 RichEdit is mostly a subset of the Office RichEdit 8. The features that are included are documented in MSDN. Here’s a list of omitted features:

  • Math
  • Page/Table Services (PTS): multicolumns, math paragraph, tight wrap around objects
  • Text trackers
  • Blobs (blobs are used internally for png and jpeg images)
  • IRichTextProvider (callback interface used by OneNote and OfficeArt to insert rich text into a RichEdit control)
  • XML handlers (used by OneNote and OfficeArt to access RichEdit’s MathML converters)
  • Various messages
  • Quite a few bug fixes (Windows 8 shipped before Office 2013)

Windows Phone 8 RichEdit

The Windows Phone 8 RichEdit is based on the standard RichEdit 8 code base, rather than on the earlier WinCE version. A combination of makefile and conditional-compilation instructions controls the various differences.

The default preferred font table was modified to correspond to the fonts on the phone. Related to this is the need to have a table of fonts to use when a file specifies a font on the phone, but not on the desktop. There’s also a last-minute font fallback for East Asian scripts. Many changes from the phone teams have been back-ported into the main RichEdit code base.

Windows RT Font Binding

For Windows RT, a special font callback interface, IProvideFontInfo, is defined that is used to replace RichEdit’s built-in font binding with Windows RT’s font binding. A major reason for this replacement is to support Windows RT composite fonts. The IProvideFontInfo interface is obtained by calling ITextHost::QueryInterface for an IProvideFontInfo. It includes the GetRunFontFaceId() method, which returns a font ID given the current font, the font weight, stretch and style, the lcid, a pointer to input characters and character count to be used with the returned font, the current font ID, and an out parameter runCount that gives the character count that the returned font covers. Two known problems exist with this feature: 1) it doesn’t stamp the characters with a CharRep, and 2) the Windows RT font binder doesn’t understand mathematical text. The CharRep is important for BiDi and for font fallback. Hopefully these problems will be addressed in a future release. IProvideFontInfo should not be used in math zones, since font binding in math zones is quite tricky.
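
The runCount contract lends itself to a simple loop on the caller’s side. The following C++ sketch shows one way such a loop might look. It does not use the real IProvideFontInfo signature: RunFontCallback is a simplified stand-in that omits the weight, stretch, style, and lcid parameters, and FontId, FontRun, and SplitIntoFontRuns are hypothetical names.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using FontId = unsigned;  // hypothetical font identifier

struct FontRun {
    std::size_t start;   // index of the first character in the run
    std::size_t length;  // number of characters the font covers
    FontId font;         // font chosen for the run
};

// Simplified stand-in for a run-based font-binding callback: given the remaining
// text and the current font, it returns a font and reports in runCount how many
// leading characters that font covers.
using RunFontCallback = std::function<FontId(
    const char16_t* text, std::size_t count, FontId currentFont, std::size_t& runCount)>;

// Repeatedly asks the callback for a font until every character is covered.
std::vector<FontRun> SplitIntoFontRuns(const std::u16string& text,
                                       FontId currentFont,
                                       const RunFontCallback& getRunFont) {
    std::vector<FontRun> runs;
    std::size_t pos = 0;
    while (pos < text.size()) {
        const std::size_t remaining = text.size() - pos;
        std::size_t runCount = 0;
        const FontId font = getRunFont(text.data() + pos, remaining, currentFont, runCount);
        // Guard against a callback that reports zero or too many characters,
        // which would otherwise stall or overrun the loop.
        if (runCount == 0 || runCount > remaining) runCount = remaining;
        runs.push_back(FontRun{pos, runCount, font});
        pos += runCount;
    }
    return runs;
}
```

A caller would pass a callback that inspects the characters and picks a font per run; the loop then yields contiguous, non-overlapping runs that cover the whole string.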

Character Flags Class

RichEdit font binding uses a character repertoire (CharRep) facility. Character repertoires are often the same as Unicode scripts, but they include other sets such as symbols and emoji. In previous versions of RichEdit, the character-repertoire flags and indices, along with the functions that manipulated them, were scattered around in several files. Furthermore, the variables used had no more space for new character repertoires, such as emoji. Accordingly, we needed to generalize the facility.

To this end, we collected the character flags functionality and associated defines into the CCharFlags class, which hides many details from calling code. We used it to add support for 15 new character repertoires, bringing RichEdit up to date with the scripts that Windows 8 supports. The scripts added are: Symbol, Emoji, Glagolitic, Lisu, Vai, N’ko, Osmanya, PhagsPa, Gothic, Deseret, Tifinagh, Old Italic, Old Turkic, Bopomofo, and Cyrillic Ext B. More character repertoires can be added easily and, in fact, a number have been added in Windows 8.1.
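
CCharFlags itself is internal to RichEdit, so the following is only a sketch of the general idea: hide the flag storage behind a small class so that adding repertoires doesn’t require wider integer variables in calling code. The class name CharRepSet, the abbreviated CharRep enum, and the capacity of 128 are all assumptions made for illustration.

```cpp
#include <bitset>
#include <cstddef>

// Abbreviated, illustrative list; the real set of repertoires is much longer.
enum class CharRep : std::size_t { Latin, Greek, Cyrillic, Symbol, Emoji, Glagolitic, Count };

// A set of character repertoires. Callers never see the underlying storage,
// so the capacity can grow without touching calling code.
class CharRepSet {
public:
    void Add(CharRep rep) { bits_.set(Index(rep)); }
    bool Has(CharRep rep) const { return bits_.test(Index(rep)); }
    std::size_t Count() const { return bits_.count(); }

    // Union with another set, e.g. to accumulate the repertoires found in a run.
    CharRepSet& operator|=(const CharRepSet& other) {
        bits_ |= other.bits_;
        return *this;
    }

private:
    static constexpr std::size_t kCapacity = 128;  // assumed headroom for new repertoires
    static std::size_t Index(CharRep rep) { return static_cast<std::size_t>(rep); }
    std::bitset<kCapacity> bits_;
};
```

Adding a repertoire then amounts to adding an enum value before Count; code that only calls Add, Has, and operator|= is unaffected.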

DWrite Font Fallback

If you specify the charset in creating a font, GDI will ensure that you get a font that handles that charset. Admittedly charsets cover only a subset of the world’s languages (no Indic, Syriac, etc.), but they do cover many important languages, notably Chinese, Japanese, and Korean (CJK). It’s really desirable to choose a font for Chinese characters that suits the user: Simplified Chinese, Traditional Chinese, or Japanese. Another trick is if a character is an end-user-defined character (EUDC) in the Unicode Private Use Area, GDI will ensure that you see a glyph by searching through possible EUDC fonts. These characters are not defined in the Unicode Standard, so you can’t use them reliably for text interchange. But they are popular in CJK locales and a given machine may have fonts with the glyphs that the user wants.

DWrite doesn’t offer such automatic font fallback. Accordingly, to handle font fallback better on the DWrite code path, we pass down the current CharRep. This gives access to a default font that is likely to have the character glyphs when the current font does not. Code to handle EUDC for DWrite is included as well.

Font Style, Weight, Stretch

Windows 8 generalized its font attributes to have style, weight, and stretch. Actually, GDI’s LOGFONT has always had font weight, but it hasn’t always been consistent about grouping font files that differ only by weight into a font family. For example, Windows 8 considers Arial Black to be the heaviest-weight member of the Arial family rather than an independent font. The only change needed to handle font weight was to expose it in the RTF format with the \fweightN control word. Font style includes upright, italic, and oblique. GDI has always had upright and italic, and used oblique when italic is requested and no corresponding italic font is available. To handle explicit requests for oblique, we added the RTF control word \oblique and an attribute CEM_OBLIQUE. Font stretch didn’t have a representation in RichEdit’s character formatting or in RTF, so we added the RTF control word \fstretchN.

Default Preferred Font Table

In RichEdit 7 and earlier versions, the default preferred font table is created at run time using a set of calls. The table is indexed by the CharRep. In RichEdit 8, most entries are given in a convenient, explicit table. This change facilitated updating the entries to the Windows 8 preferences and creating a modified table for use on Windows Phone 8. There are two kinds of entry: user-interface (UI) and document. Plain-text instances use the UI entries and rich-text instances use the document entries unless the client has sent an EM_SETLANGOPTIONS message with the IMF_UIFONTS option.
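
The UI-versus-document selection rule can be sketched as a table lookup. The font names below are placeholders rather than the actual table entries, and CharRep, PreferredFonts, kDefaultFonts, and DefaultFontFor are hypothetical names; only the selection rule itself comes from the description above.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Abbreviated, illustrative repertoire list used as the table index.
enum class CharRep : std::size_t { Latin, Japanese, Emoji, Count };

struct PreferredFonts {
    const char* uiFont;        // used by plain-text instances (or when UI fonts are requested)
    const char* documentFont;  // used by rich-text instances
};

constexpr std::size_t kRepCount = static_cast<std::size_t>(CharRep::Count);

// Explicit table indexed by CharRep. The names are placeholders, not real entries.
const std::array<PreferredFonts, kRepCount> kDefaultFonts = {{
    {"ExampleUI Latin",    "ExampleDoc Latin"},
    {"ExampleUI Japanese", "ExampleDoc Japanese"},
    {"ExampleUI Emoji",    "ExampleDoc Emoji"},
}};

// richText:      true for rich-text instances, false for plain-text instances.
// uiFontsOption: true if the client requested UI fonts (the IMF_UIFONTS option).
std::string DefaultFontFor(CharRep rep, bool richText, bool uiFontsOption) {
    const PreferredFonts& entry = kDefaultFonts[static_cast<std::size_t>(rep)];
    const bool useUi = !richText || uiFontsOption;
    return useUi ? entry.uiFont : entry.documentFont;
}
```

Keeping the entries in one explicit table is what makes a platform-specific variant cheap: a phone build could swap in a different kDefaultFonts without changing the lookup.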

BCP-47 Language Tag Support

RichEdit’s character formatting includes an LCID, which is being deprecated in favor of BCP-47 language tags. In particular, Windows RT uses BCP-47 language tags, as does the Windows RT Windows.UI.Text.ITextCharacterFormat LanguageTag property. We didn’t want to add a new method to ITextFont2, so we implemented the functionality in classic TOM by adding the flag tomLanguageTag for the ITextRange2::GetText2() and SetText2() methods. The approach uses the OS LCIDToLocaleName and LocaleNameToLCID functions. We also implemented a facility for converting BCP-47 strings that LocaleNameToLCID doesn’t recognize into LCIDs for internal consumption. This facility doesn’t handle arbitrary BCP-47 tags, but it handles those used by Windows 8 that don’t have LCIDs (see following feature).

More Keyboards

Support was added for seven new Windows 8 keyboards that don’t have LCIDs. This involves decoding

Comments

  • Anonymous
    September 07, 2013
    Thanks for this post. We wish for better Math support, which will make Word / Power Point viable for technical papers / presentations.

  • Anonymous
    October 03, 2013
    Thanks for the post. Please, push the decision to include math support in Windows RichEdit of the next version of Windows. All of us are waiting for that.

  • Anonymous
    October 11, 2013
    Hi! Can you tell us the dll and class name of the module? We want to use this new engine inside a winforms RichTextBox. Thanks

  • Anonymous
    October 29, 2013
    Hi. I have to use a windowless rich edit control. There is a problem drawing vertical RTF text. I try to rotate it using SendMessage(EM_SETPAGEROTATE, EPR_90, 0), but after that I can't describe the rectangle of the RTF view. I tried to write a sample using MFC. Here is some of the source code:

        class CMyRich : public CRichEditView
        {
        public:
            CMyRich()
            {
                CoInitialize(NULL);
                LoadLibrary("Msftedit.dll");
                m_strClass = TEXT("RichEdit50W");
            }
        };

    In the dialog window's OnInitDialog:

        rich = new CMyRich();
        CRect client;
        rich->Create(NULL, NULL, WS_CHILD, client, this, 101);
        rich->SetWindowText("Testing_123456789abcdef ghijklmnopqrst");

    In the dialog window's paint message handler:

        CRect rcTwip;
        GetClientRect(rcTwip);
        rcTwip.bottom = rcTwip.bottom * 1440 / 96;
        rcTwip.right = rcTwip.right * 1440 / 96;
        rich->PrintInsideRect(&dc, rcTwip, 0, 100, TRUE);

    In this case I get the correct image, but if I use rich->SendMessage(EM_SETPAGEROTATE, EPR_90, 0) I get an indescribable image. Does anyone know a solution to my problem?

  • Anonymous
    January 24, 2014
    Hi, could you please help me? I currently have some windowless RichEdit code which uses the new Direct2D implementation. However, I cannot get horizontal scrolling to work. When I send the WM_SCROLL message(s) via TxSendMessage, nothing happens. Note that vertical scrolling works, and horizontal scrolling also works fine when I use a regular hDc surface. To wrap using a regular hDc surface, you would normally call EM_SETTARGETDEVICE, passing in the width for the text to wrap to. When using a Direct2D surface, this appears not to take effect, as the text simply wraps to the client coords. I've seen the TxGetHorzExtent member in the ITextHost2 interface, but this member does not appear to be getting called. Are there any additional steps to setting up a Direct2D surface, or any other calls or flags I need to set, in order to get wrapping and horizontal scrolling working together? Your help would be greatly appreciated. Many thanks.

  • Anonymous
    January 24, 2014
    Thanks for doing that. I have come to the conclusion that there is most likely a problem with the RichEdit Direct2D implementation, though I would love to be proved wrong. My conclusions, after another day of trial and error: Vertical scrolling works fine. Horizontal scrolling works only when word wrapping is turned off, either by omitting the TXT_WORDWRAP flag or by turning it off via EM_SETTARGETDEVICE. With wrapping turned on, the HScroll bar gets set up correctly via TxSetScrollRange as normal, so that's OK. When using a GDI surface, the EM_SETTARGETDEVICE value is honored, so that's also OK. However, with Direct2D the wrapping appears to be controlled by the lprcBounds passed into TxDrawD2D. It does not matter what I set EM_SETTARGETDEVICE to; it's simply being ignored. This to me is the problem. lprcBounds should refer to the client bounds, not the wrapping bounds. So when I set lprcBounds to the client bounds (as it should be), the wrapping is incorrectly drawn on the client bounds rather than on the EM_SETTARGETDEVICE bounds. However, scrolling works in this case, though it just scrolls into blank space since the wrapping is in the wrong place. Since it appears that lprcBounds is controlling the wrapping, I decided I must set lprcBounds to the same width as EM_SETTARGETDEVICE. This fixes the wrapping, but now the scrolling does not work. I'm stuck between a rock and a hard place :-( Thanks

  • Anonymous
    January 27, 2014
    You might try

        #define EM_SETTARGETDEVICEEX (WM_USER + 266)

    with wparam = 0 and lparam = TARGETDEVICEDESCR *

        typedef struct _targetdevicedescr
        {
            LONG dxWidth;   // Width of the target device in logical units
            LONG dyHeight;  // Height of the target device in logical units
            DWORD dwFlags;  // Set to 0
        } TARGETDEVICEDESCR;

    I haven't checked this out, but it looks as though it might work. If so, we ought to document it :-) In any event, it's worth having a "wrap to ruler" mode in D2D mode as in WordPad. (WordPad uses the GDI mode).

  • Anonymous
    January 27, 2014
    Brilliant, yes that's got it. Thank you so much.

  • Anonymous
    January 31, 2014
    My Direct2D implementation is coming along nicely now. What is required to get metafile images and OLE objects to render in Direct2D? I think I can overcome the pasting of metafiles by specifically requesting the CF_DIB format and supplying the missing header info (as per your other posts), but this would still leave RTF files that already contain static metafile images. In both cases, images and OLE objects, the space is allocated. It just renders a white area. Many thanks.

  • Anonymous
    January 31, 2014
    I was hoping that maybe static metafiles might have worked, i.e., the kind that MSPaint puts on the clipboard. I guess there is probably no way to tell the difference between one that contains a bitmap and one that contains GDI commands. I can live without metafiles. Would creating a Direct2D surface using the D3D11_RESOURCE_MISC_GDI_COMPATIBLE flag not work then? ID2D1GdiInteropRenderTarget::GetDC/ReleaseDC could then be used with TxGetDC in desktop space. My code is in the desktop space. What extra steps are needed for OLE objects (Excel) to render, please? I noticed that OneNote does OLE nicely ;). I think I only have a couple of user calls left (mainly caret and cursor calls) and no GDI calls (provided I can get Direct2D working). I could probably get rid of the caret calls. I've kept them as I think they may still be useful for accessibility. I've not had a proper look at the new RichEdit IAcc interfaces yet. Thanks

  • Anonymous
    January 31, 2014
    It's possible to use both GDI and D2D on the desktop. It's only in purely immersive platforms that GDI is missing. Similarly Ole can be used with D2D on the desktop. We didn't enable that combination in RichEdit though because RichEdit has to run on all platforms (Windows, Office, Windows Phone, iOS, and Android).

  • Anonymous
    April 01, 2014
    Dwayne Robinson, one of the DirectWrite gurus, points out that the ID2D1GdiMetafile and associated interfaces can be used to render GDI metafiles in the Direct2D world. It doesn't look trivial, but it's worth investigating. ID2D1GdiMetafile is documented in MSDN. In particular, ID2D1DeviceContext::DrawGdiMetafile exists in Windows 8 and Windows 7 update to draw metafiles from a file stream.

  • Anonymous
    February 27, 2015
    Is there anyone who can explain how to enable the usage of DirectWrite/Direct2D within a windowless RichEdit control? The documentation and examples are pretty scarce. I see in the comments above that Neal was working on such a thing, but he didn't go into the setup work. My attempts so far don't seem to be doing anything. I believe I should be able to tell whether DirectWrite/Direct2D is in use by seeing what modules load in my application, but so far their DLLs do not show up. Any help would be greatly appreciated.

  • Anonymous
    February 27, 2015
    Check out ITextServices::TxDrawD2D() (msdn.microsoft.com/.../hh768719(v=vs.85).aspx) and ITextHost::TxGetPropertyBits() with flags like TXTBIT_D2DDWRITE. When a control is instantiated, this callback is called to find out the kind of instance you want, e.g., D2D/DWrite vs GDI/Uniscribe, etc.
