Recognizing Improvements in Windows 7 Handwriting

Microsoft has been working on handwriting recognition for over 15 years going back to the Pen extensions for Windows 3.0. With the increased integration and broad availability of the handwriting components present in Windows Vista we continue to see increased use of handwriting with Windows PCs. We see many customers using handwriting across a wide variety of applications including schools, hospitals, banking, insurance, government, and more. It is exciting to see this natural form of interaction used in new scenarios. Of course one thing we need to continue to do is improve the quality of recognition as well as the availability of recognizers in more languages around the world. In this post, Yvonne, a Program Manager on our User Interface Platform team, provides a perspective on engineering new recognizers and recognition improvements in Windows 7. --Steven

Hi, my name is Yvonne and I’m a Program Manager on the Tablet PC and Handwriting Recognition team. This post is about the work we’ve done to improve handwriting recognition for Windows 7.

Microsoft has invested in pen-based computing since the early 1990s, and with the release of Windows Vista, handwriting recognizers are available for 12 languages, including US English, UK English, German, French, Spanish, Italian, Dutch, Brazilian Portuguese, Chinese (Simplified and Traditional), Japanese, and Korean. Customers frequently ask us when we plan to ship more languages and why a specific language is not yet supported. We are planning to ship new and improved languages for Windows 7, including Norwegian, Swedish, Finnish, Danish, Russian, and Polish, and the list continues to grow. Let’s explore what it takes to develop new handwriting recognizers.

Windows has true cursive handwriting recognition: you don’t need to learn to write in a special way. In fact, we’ve taught (or “trained” as we say) Windows the handwriting styles of thousands of people, and Windows learns more about your style as you use it. Over the last 16 years we’ve developed powerful engines for recognizing handwriting, and we continue to tune them to make them more accurate and faster and to add new capabilities, such as the ability to learn from you in Vista. Supporting a new language is much more than adding new dictionaries; each new language is a major investment. It starts with collecting native handwriting; next we analyze the data and go through iterations of training and tuning; and finally the system gets to you and continues to improve as you use it.

Data Collection

The development of a new handwriting recognizer starts with a huge data collection effort. We collect millions of words and characters of written text from tens of thousands of writers from all around the world.

Before I describe our collection efforts, I would like to answer a question we are frequently asked: “Why can’t you just use an existing recognizer with a new dictionary?” One reason is that some languages have special characters or accents. But the overriding reason is that people in different regions of the world learn to write in different ways, even in countries that share a language, like the UK and the US. Characters that may look visually very similar to you can actually be quite different to the computer. This is why we need to collect real-world data that captures exactly how characters, punctuation marks and other shapes are written.

Setting up a data collection effort is challenging and time-consuming because we want to ensure that we collect the “right kind of data”. We carefully choose our collection labs in the respective countries for which we develop recognizers.

Before we start our data collection in the labs, we configure our collection tools, prepare documentation, and compile language scripts that will guide our volunteers through the collection process. Our scripts are carefully prepared by native speakers in the respective language to ensure that we collect only orthographically correct data, data from different writing styles, and data that covers all characters, numbers, symbols and signs that are relevant to a specific language. All of our scripts are proofread and edited before they are blessed to be used at the collection labs.

Once our tools and scripts are ready, we open our labs and start to recruit volunteers to donate their handwriting samples. Our recruitment efforts ensure a balance across demographics such as gender, age, left-handedness, and educational background that represents the majority of the population of that country.

A supervisor at the lab instructs the volunteers to copy the text as it is displayed in the collection tool in their own writing style. What is important to note is that we want to collect writing samples that accurately represent the person’s natural way of writing. We therefore encourage volunteers to treat “pen and tablet” like “pen and paper”. If one of the volunteers tends to write in big, curvy strokes, then we want to collect his/her big, curvy strokes during the collection session. High quality data in this context refers to data that was naturally written.

Here is a snapshot of what our collection tool looks like:

Figure 1: Collection Tool

A collection session lasts between 60 and 90 minutes, by which point our volunteer has donated a significant amount of handwritten data without feeling fatigued. The donated data is then uploaded and stored in our database at Microsoft, ready for future use. The written samples contain important information like stroke order, start and end points, spacing, and other characteristics that are essential to train our new recognizer.

Let’s take a look at some of our samples in our database to illustrate the great variation among ink samples:

Figure 2: Ink samples illustrating different stroke orders.

The screenshot shows how three different volunteers inked the word “black”. The different colors are used to illustrate the exact stroke orders in which the word was written. Our first two volunteers used five strokes to write the word “black”; our third volunteer used four strokes. Please also note how our third volunteer used one stroke only to ink the letters “ck”, while our first volunteer used three strokes for the same combination of letters. All of this information is used to train our recognizers.
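To make the idea of stroke order concrete, here is a minimal sketch of how an ink sample could be represented. The class names, fields, and coordinates are hypothetical and chosen only for illustration; the post does not describe the actual storage format used in the database.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in tablet coordinates; units are assumed


@dataclass
class Stroke:
    """One pen-down to pen-up movement, with points in the order drawn."""
    points: List[Point]

    @property
    def start(self) -> Point:
        return self.points[0]

    @property
    def end(self) -> Point:
        return self.points[-1]


@dataclass
class InkSample:
    """A handwritten word: the prompted text plus strokes in writing order."""
    label: str
    strokes: List[Stroke]

    def stroke_count(self) -> int:
        return len(self.strokes)


# Two hypothetical writers inking "black" with different numbers of strokes.
writer_a = InkSample("black", [Stroke([(i, 0.0), (i + 0.5, 1.0)]) for i in range(5)])
writer_c = InkSample("black", [Stroke([(i, 0.0), (i + 0.5, 1.0)]) for i in range(4)])

print(writer_a.stroke_count(), writer_c.stroke_count())  # prints: 5 4
```

Keeping the strokes as an ordered list preserves the sequence in which the word was written, which is the information the colored strokes in the figure convey.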

Neural Network and Language Model

Once we have collected a sufficient amount of inked data, we split our data into a training set, used by our development team, and a “blind” set, used by our test team. The training set is then employed to train the Neural Network, which is largely responsible for the magic that is taking place during the recognition process. Good, naturally written data is essential in developing a high quality recognizer; the recognizer can’t be any better than its training set. The more high quality data we feed into our Neural Network, the more equipped we are to handle sloppy cursive handwriting.
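The split into a training set and a blind set can be sketched as follows. This is an illustration only: splitting by writer (so that no writer contributes to both sets), the 20% blind fraction, and the function name are assumptions, not details given in the post.

```python
import random
from typing import Dict, List, Tuple


def split_by_writer(
    samples_by_writer: Dict[str, List[str]],
    blind_fraction: float = 0.2,  # assumed value, for illustration
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Return (training, blind) sample lists with no writer in both sets."""
    writers = sorted(samples_by_writer)
    random.Random(seed).shuffle(writers)  # seeded, so the split is repeatable
    n_blind = max(1, round(len(writers) * blind_fraction))
    blind_writers = set(writers[:n_blind])

    training: List[str] = []
    blind: List[str] = []
    for writer, samples in samples_by_writer.items():
        (blind if writer in blind_writers else training).extend(samples)
    return training, blind


# Ten hypothetical writers with three samples each.
data = {f"writer{i}": [f"writer{i}-sample{j}" for j in range(3)] for i in range(10)}
training_set, blind_set = split_by_writer(data)
print(len(training_set), len(blind_set))  # prints: 24 6
```

Holding out whole writers is one way to make sure the blind set measures how well the recognizer handles handwriting it has never seen.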

Our Neural Network is a Time-Delay Neural Network (TDNN) that can handle the connected letters of cursive scripts. A TDNN takes the preceding and following stroke segments into consideration when computing the probabilities of letters, digits and characters for each segment of ink. The output of the TDNN is powerful but not good enough when handwriting is sloppy. In order to come within reach of human recognition accuracy, we have to employ information that goes beyond the shape of the letter: we call this the Language Model context. The majority of this Language Model context comes in the form of the lexicon, which is a wordlist of valid spellings for a given language. For many languages, this is the same lexicon that the spellchecker uses. The TDNN and the lexicon work closely together to compute word probabilities and output the top suggestions for the given input.
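A toy sketch can show why the lexicon helps. It assumes the ink has already been cut into one segment per letter and that some network has produced a probability for each candidate letter in each segment; the numbers, the tiny wordlist, and the function names are all made up for illustration, and the real recognizer is certainly more involved than this.

```python
import math
from typing import Dict, List

# Hypothetical per-segment letter probabilities, standing in for TDNN output.
segment_probs: List[Dict[str, float]] = [
    {"c": 0.45, "e": 0.50, "o": 0.05},
    {"a": 0.35, "o": 0.55, "u": 0.10},
    {"t": 0.70, "l": 0.20, "r": 0.10},
]

# Hypothetical lexicon: a wordlist of valid spellings.
lexicon = ["cat", "cot", "eat", "oar", "out"]


def score_word(word: str, probs: List[Dict[str, float]], floor: float = 1e-6) -> float:
    """Log-probability of a word, assuming one letter per ink segment."""
    if len(word) != len(probs):
        return float("-inf")
    return sum(math.log(p.get(ch, floor)) for ch, p in zip(word, probs))


def top_suggestions(probs: List[Dict[str, float]], words: List[str], n: int = 3) -> List[str]:
    """Rank lexicon words by score, best first."""
    scored = [(score_word(w, probs), w) for w in words]
    scored = [(s, w) for s, w in scored if s > float("-inf")]
    ranked = sorted(scored, key=lambda sw: (-sw[0], sw[1]))
    return [w for _, w in ranked[:n]]


# Picking the most likely letter per segment gives a string that is not a word.
shape_only = "".join(max(p, key=p.get) for p in segment_probs)
print(shape_only)                               # prints: eot
print(top_suggestions(segment_probs, lexicon))  # prints: ['cot', 'eat', 'cat']
```

The letter shapes alone produce "eot", which is not a valid spelling; restricting the candidates to the lexicon yields real words, ranked by how well they match the ink.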

Training the Neural Network is an involved process that takes time. We often experiment with borrowing data from other languages to increase the size of the training data, with the ultimate goal of boosting recognition accuracy. Borrowing characters from other languages does not always lead to success. As I mentioned above, stroke order, letter shape, writing styles and letter size can differ significantly from country to country and can have a negative impact on the performance of the TDNN. It often takes us several rounds of training, re-training and tuning before we find “the right formula” that will lead to high recognition accuracy.

How do we know if we are headed in the right direction when we build a new recognizer? This is an important question that the test team and native speakers answer for us. The test team is responsible for generating our recognition accuracy metrics that reflect how good our recognizer is. These accuracy metrics are based on our blind test set which is the collected data that development could not use for training. In addition to our accuracy metrics, we work with native speakers in house and at our world-wide subsidiaries to get feedback and further input.
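One common way to express recognition accuracy on a held-out set is top-N word accuracy. The sketch below is an assumption about what such a metric could look like, not a description of the metrics the test team actually uses; the sample words and recognizer output are invented.

```python
from typing import List, Sequence


def top_n_accuracy(truth: Sequence[str], suggestions: Sequence[List[str]], n: int = 1) -> float:
    """Fraction of samples whose correct word is among the first n suggestions."""
    if len(truth) != len(suggestions):
        raise ValueError("truth and suggestions must have the same length")
    if not truth:
        return 0.0
    hits = sum(1 for word, ranked in zip(truth, suggestions) if word in ranked[:n])
    return hits / len(truth)


# Hypothetical blind-set labels and the recognizer's ranked suggestions.
blind_truth = ["black", "cat", "true", "using"]
recognizer_output = [
    ["black", "block"],
    ["cot", "cat"],
    ["thrust", "trust"],
    ["using", "rising"],
]

print(top_n_accuracy(blind_truth, recognizer_output, n=1))  # prints: 0.5
print(top_n_accuracy(blind_truth, recognizer_output, n=2))  # prints: 0.75
```

Because the blind set was never used for training, a number computed this way reflects how the recognizer behaves on unseen writing rather than on data it has memorized.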

Improving the recognizers through personalization

In the previous paragraphs I have outlined how we develop high quality recognizers that can handle a wide variety of different writing styles. But there is more: each person can also train the recognizer on his/her unique writing style. The training that is done to teach the recognizer a personal writing style is the same training that happens before Microsoft ships the product. The only difference is that we are now collecting unique training data from a specific person (and not that of thousands of people). We call this process “Personalization”.

Figure 3: Personalization Wizard (Sentence module).

As the screenshot of our Personalization wizard illustrates, a person is asked to write the requested sentence to provide his/her ink samples. The more data a person donates during the personalization process, the better the recognizer will become. In addition to providing writing samples based on specified sentences, a person can target specific recognition errors, shapes, and characters that will all be used for training. Our Personalization feature is complex and offers a variety of different modules that enable a person to optimally tune the recognizer. We are proud to announce that Personalization will be available for all Vista languages and all new Windows 7 languages. We encourage you to use this feature to improve your recognition accuracy.

We continue to work on improving our recognizers, which also means that we are incorporating our customers’ feedback through online telemetry (anonymous, private, voluntary, and opt-in). In Windows Vista we released a new feature called “Report Handwriting Recognition Errors”, which gives people the opportunity to submit those ink samples that the recognizer did not recognize correctly. After the person has corrected a word in the Tablet Input Panel (TIP), we enable a menu that allows a person to send the misrecognized ink together with its corrected version to our team.

Here is a screenshot of what our error reporting tool looks like:

Figure 4: With “Report Handwriting Recognition Errors” people can choose which of the misrecognized ink samples they want to submit.

We receive approximately 2000 error reports per week. Each error report is stored in our database before we analyze it and use it to improve our next generation of recognizers. As you can imagine, real world data is extremely helpful because it is only this type of data that can reveal shortcomings of our recognizers.
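As a rough illustration of how a batch of error reports might be analyzed, the sketch below tallies which misrecognitions occur most often. The record layout (recognized text plus corrected text, without the ink itself), the example words, and the function name are hypothetical; the post does not describe the team's actual analysis tools.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ErrorReport:
    """One submitted report: what the recognizer output and what the user meant."""
    recognized: str
    corrected: str


def most_common_confusions(
    reports: List[ErrorReport], n: int = 3
) -> List[Tuple[Tuple[str, str], int]]:
    """Count (recognized, corrected) pairs and return the n most frequent."""
    counts = Counter((r.recognized, r.corrected) for r in reports)
    return counts.most_common(n)


# A handful of made-up reports.
reports = (
    [ErrorReport("clog", "dog")] * 3
    + [ErrorReport("bock", "back")] * 2
    + [ErrorReport("cot", "cat")]
)

for (recognized, corrected), count in most_common_confusions(reports, n=2):
    print(f"{recognized!r} should have been {corrected!r}: {count} reports")
```

A tally like this points to the confusions that affect the most people, which is one reasonable way to decide where additional training data would help.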

We value and appreciate every single error report. Keep sending us your feedback, so that we can use it to improve the magic of our present and future recognizers.

Thank you,

– Yvonne representing the handwriting recognition efforts

Comments

  • Anonymous
    February 10, 2009
    And what are the improvements in Windows 7?

  • Anonymous
    February 10, 2009
    Nice! Now I want a tablet PC :D For example, the new Dell Latitude XT2: http://www.engadget.com/2009/02/10/dell-latitude-xt2-multi-touch-tablet-with-11-hour-battery-now-of/

  • Anonymous
    February 10, 2009
    Handwriting works amazingly well on Tablet PCs, but a major problem I'm having with pen input (at least on my HP Tablet PC) is terrible calibration and no reliable way to improve it. The pointer will often be half a centimeter off from the point where the pen tip is touching the screen. In addition, at different positions on the screen, the calibration is correspondingly better or worse. Now, perhaps this is more of a hardware issue (which Microsoft has less control over), but it would be great if some kind of better calibration tool came with Windows 7 (there is one that I believe came from HP on my machine, but it only had 4 calibration points and more often than not it ruins calibration even more than it already is). The more calibration points on this tool the better. I wouldn't mind pecking away at the screen every once in a while if I could enjoy a realistic "pen and paper" experience.

  • Anonymous
    February 10, 2009
    After years of using the pen, I find it serves me better to focus on where the cursor is rather than where the pen tip is. It's never more than a mm off for me anyway, so it's not too much of a hardship. Although I should mention there is a better calibration tool with Windows 7 that uses 16 points, but I haven't noticed an appreciable difference due to it.

  • Anonymous
    February 10, 2009
    Thanks for the excellent write up. What about the new Math Input Panel new to Windows 7? Is there any overlap in how it was created? If so, is there a broader tool/API that could be created which Microsoft and third parties could use to create other recognition sets?

  • Anonymous
    February 10, 2009
    Too cool! AND FINALLY! NORWEGIAN IS GOING TO BE SUPPORTED! :D Therefore, I must get a tablet PC as soon as Windows 7 gets released ^^, Great improvements!

  • Anonymous
    February 10, 2009
    I think I have a lot of "bad habits" when it comes to pen input in Windows -- my first pen input device was an HP iPaq 3850 running PocketPC 2002. I learned to make my L's as curly-L's, even though I don't do that normally -- it increased recognition. After reading this post I just went and did about half of the handwriting exercises. Hopefully this will help my Win7 tablet with my URL entry in non-IE browsers. :D Wonderful post! I'm incredibly interested in this sort of thing, so I really appreciate your time writing this up Yvonne. Also thanks to Steven for posting it. :)

  • Anonymous
    February 10, 2009
    Pretty awesome, now if I only had a tablet/convertible PC myself. I'm wondering though, you said that the recognizer will use the lexicon that your spellchecker uses to improve the recognizer's accuracy. How does this work if you use 2 languages at the same time? Let's say I'm making lecture notes, which are given in Dutch in my case, but have to use quite a few English words while doing so. Wouldn't using 2 languages at the same time decrease accuracy in this case? Or is the recognizer intelligent enough to know that you are not consistently using the same language in that same piece of text? I have to admit though, I'm really interested in buying myself a nice tablet/convertible laptop in the future and this post made me even more eager to have one.

  • Anonymous
    February 10, 2009
    bdodson, I'm learning chinese too and have had the same idea about using Chinese (mandarin) character recognition: using OneNote to practice writing them down and using recognition to see if I've done so correctly :)

  • Anonymous
    February 10, 2009
    @bdodson & @lozmatic -- Hey, I did the same thing! During my first demo of the new recognizers I even got a chance to write a little "show off" by writing a bit of Russian (a very little bit). --Steven

  • Anonymous
    February 11, 2009
    Off topic http://blogs.zdnet.com/gadgetreviews/?p=1436 Smile :D

  • Anonymous
    February 12, 2009
    Very interesting and very impressive! What about Speech and Narrator?

  • Anonymous
    February 12, 2009
    The Great Mark Russinovich Springboard Windows 7 http://ms.istreamplanet.com/springboard/portal.asp Enjoy

  • Anonymous
    February 16, 2009
    Ultimate Extras come in windows seven ultimate ? From my point of view is an important feature for windows, that many bought the Ultimate version, not only by all the features but also by the ultimate extras. Many hope that this feature is not removed from the final version

  • Anonymous
    February 16, 2009
    I've upgraded my Thinkpad X60 Tablet running Vista to the Windows 7 beta. Nice! Except, now I don't have Dutch recognition anymore. Under Vista I could also set it to Dutch; in Win7 I only get the on-screen keyboard when set to the Dutch language. (Note: I ran English Vista, not Dutch, so that's not it...) Is the beta missing pen input languages other than English, or can I turn it on somewhere? Oh yeah, I do like the 'in place' recognition of Win7! Easier than the Vista way. And Win7 does indeed do a better job of recognizing, IMHO.

  • Anonymous
    February 17, 2009
    I was waiting for Arabic language recognition from the first days of tablets, years ago. I know that Arabic language recognition is very different from Latin languages, but if Microsoft did not do it, who will? When will you add support for Arabic language?

  • Anonymous
    February 20, 2009
    In this version we have significantly improved the accuracy for four East Asian languages (Simplified Chinese, Traditional Chinese, Japanese, and Korean), we have provided a better personalization scheme, and we also support a text prediction function for CHS and CHT. Actually, our accuracy has surprised many users, since initially they didn't expect that their cursive writing could be recognized!

  • Anonymous
    March 03, 2009
    This will be a big benefit for students and the learning process in general. The days of taking handwritten notes on paper may soon become a thing of the past. I would also imagine this could reduce the amount of paper used, in that all notes can be consolidated.

  • Anonymous
    March 08, 2009
    well, handwriting itself works spotless, BUT I used to use TRUST's graphic tablet which seems not to work with anything else than handwriting collector as cursor sticks to the edges of the screen and I can only move it up and down :( 7's handwriting works, however well

  • Anonymous
    March 01, 2010
    I am probably unusual in that I so prefer the comfort (form factor) of handwriting recognition that I use it almost 100% of the time, even for documents 50 pages long.  As one of the owners of a medical software company, I also use it in order to understand the advantages and limitations of this tool. Compared to earlier versions, I do detect some modest improvement in the accuracy of recognition in Windows 7 and I like some of the UI changes. There are two problems, however. The new version has a greater propensity toward revising already-recognized words based on new writing that is spatially well-separated from the already-recognized word.  For example, I am using handwriting now. When I wrote "am using" in the last sentence it insisted on revising "am" (which it had already recognized) to combine it with "using" to create "amusing."  That one is easy to fix with the split gesture, but "at times" became "attorney." This happens very frequently now (probably 8-10 times as I wrote the above). Sometimes it even goes two words back. I'd say it offsets the other improvements. I had become adept at fixing wrong words, but now I have to re-write larger segments that had already been recognized correctly, then were changed in such a way that it has to be re-written. The second issue is that with intensive usage (as it gets from me) it develops some sort of problem and stops working. This manifests as a failure to insert the text into the target field. You still see it in the handwriting dialog and when you hit "insert" you see it disappear, but it doesn't show up in the target. This was an issue when handwriting first came out, but it was fixed by a service pack. It's back now.  The problem is that once it starts, it will continue intermittently and get worse. You have to reboot to fix the problem. I hope there will be a fix on this soon.

  • Anonymous
    March 04, 2010
    On further experience with the problem inserting text into the target field, it appears that it can recover without rebooting if given enough time ( 20 seconds to several minutes). Sometimes, the problem manifests as a delay  in insertion. More often, the text to be inserted is lost. BTW, I am observing this on a new fully loaded  HP tablet PC.

  • Anonymous
    March 06, 2010
    Unfortunately, the tendency of the Windows 7  handwriting system to re-think already recognized words as you continue to write subsequent words is a serious step backwards.   The majority of the time, it turns a correctly recognized word into a wrong one. It sometimes rethinks it twice in quick succession, which usually makes the end result even worse. The rethought words are seldom easily correctable.  Besides  decreasing accuracy, the  feel of instability is quite unsettling. It was a mistake to go in this direction.

  • Anonymous
    March 25, 2010
    You guys really need to tone down the tendency of the 7.0 handwriting system to rethink itself. I'm constantly seeing behavior like I just encountered: I was trying to write "Isn't that true" and it had already recognized "Isn't that", but when I wrote "true" it decided I must have been trying to write "infinite thrust". It misses more than it gets right in this re-thinking it now does!