When capabilities don't meet requirements
So I’ve recently fixed two bugs that shared a root cause. I thought they were interesting enough to be worth sharing with you guys. To make them make sense, I first need to give a little architectural background.
As you probably know, I’m responsible for the core IntelliSense™ architecture. What does that mean? Well, I work on the language analysis component that watches what you type and keeps our internal understanding of your source code up to date so that we can proffer help in many forms (tooltips, parameter help, completion lists, etc.). This is a fairly complex task, and there are many ways one could architect a system like this. In my opinion the ideal way would be a system that watched what you did and kept itself up to date after every action you performed. Such a system would be very easy to reason about, and it would always be able to help you the most, since there would be no disparity between what you had written and what it knew about.
Unfortunately, we do not have such a system. Why not? Well, first let me describe the work involved. Consider an action like typing. After a user has typed a character we might have to do all of the following steps (a rough sketch in code follows the list):
- Update our token stream for that document. Certain characters might require the entire document to be re-tokenized (imagine adding a /*).
- Update our parse tree for that document. How much the token stream churned determines how much of the file needs to be reparsed and fixed up (imagine changing top-level scoping by adding or removing an open curly brace { ).
- Update our annotated expression graph. A single token change could have an impact on the expression graph for your entire solution (imagine changing a namespace name).
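To make that concrete, here’s a minimal sketch in C# of what such an update pipeline might look like. Every name here is hypothetical, made up for illustration; the real language service shares only the overall shape of this.

using System.Collections.Generic;

// Hypothetical sketch of the three update phases described above.
class AnalysisPipeline
{
    public void OnCharacterTyped(string documentText)
    {
        // 1. Re-tokenize. A character like the '*' in a new "/*" can
        //    invalidate every token from that point to the end of the file.
        List<string> tokens = Tokenize(documentText);

        // 2. Re-parse. How far the token churn reached (e.g. a moved '{')
        //    determines how much of the old tree can be reused.
        object parseTree = Parse(tokens);

        // 3. Re-annotate. A change as small as a namespace rename can
        //    ripple through the expression graph of the whole solution.
        UpdateExpressionGraph(parseTree);
    }

    List<string> Tokenize(string text) { /* ... */ return new List<string>(); }
    object Parse(List<string> tokens) { /* ... */ return new object(); }
    void UpdateExpressionGraph(object tree) { /* ... */ }
}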
This is quite a bit of work to do, and we would potentially have to do all of it in the scant milliseconds between the characters a user types. As it stands today, with our current architecture, it’s too much work to do in too little time.
So what did we decide on instead? Well, the C# Language Service is divided into two threads:
- The foreground thread, which responds to user-interaction events and presents the results of a request (like populating and bringing up a completion list after <dot> is hit).
- The background thread, which is responsible for keeping our internal expression representation up to date after a user has performed an action; i.e. after a character is typed it ensures that we now understand the user’s code, given the massive change in meaning that it could have caused.
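As a rough illustration of that split, here’s a tiny C# sketch. The names and types are all hypothetical; the point is only the shape: the background thread drains an edit queue and republishes a fresh model, while the foreground thread answers requests from whatever model is current.

using System.Collections.Concurrent;
using System.Threading;

// Hypothetical sketch of the foreground/background split.
class LanguageServiceSketch
{
    private readonly BlockingCollection<string> _pendingEdits =
        new BlockingCollection<string>();
    private volatile SemanticModel _current = new SemanticModel();

    public LanguageServiceSketch()
    {
        var background = new Thread(() =>
        {
            foreach (string edit in _pendingEdits.GetConsumingEnumerable())
            {
                // The expensive re-analysis happens here, off the UI thread.
                _current = _current.WithEdit(edit);
            }
        });
        background.IsBackground = true;
        background.Start();
    }

    // Foreground: called as the user types; never blocks.
    public void OnEdit(string edit) => _pendingEdits.Add(edit);

    // Foreground: answers IntelliSense requests immediately,
    // possibly from a stale model.
    public SemanticModel CurrentModel => _current;
}

class SemanticModel
{
    public SemanticModel WithEdit(string edit) { /* re-analyze */ return new SemanticModel(); }
}

Note that CurrentModel never waits on the edit queue; whatever it returns can lag behind what you’ve just typed.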
By choosing this sort of model, though, we have made a tacit decision that the information we present to you might not be completely up to date or accurate. I.e. when the foreground thread does something like present a tooltip as you hover over an identifier, it is requesting that information from an expression graph that could very well be out of date. Now, you might be saying to yourself “well, why don’t you just wait until the information is up to date and then present it to the user?” Interestingly enough, this is exactly what some language services choose to do. When you’re in VB and you move away from the current bit of code you’re typing, you might notice a pause. What’s going on at that point? The VB Language Analysis System is taking what you’ve just written and bringing its entire model up to date. In C# land, however, we made a conscious decision that that was not the model we wanted. We feel that pausing while a user types is something our users would absolutely hate, and it’s imperative that we not block the user while typing. It should be noted that our architecture does exceedingly well on multi-proc and hyper-threaded machines. In those cases the system runs both threads simultaneously, and in effect we often appear to be up to date in between your keystrokes.
Now, is this such a big deal? In practice the answer is almost always “no”. While not fast enough to finish all that work in between your keystrokes, our system is still extremely fast and uses many snazzy techniques to do its work quickly. So if you were to have the following code:
public void Foo()
{
this.
}
and you then added a new method after “Foo” like so:
public void Foo()
{
this.
}
public void Bar()
{
}
and you went back up to “Foo” and requested a completion list after “this”, you would almost certainly see “Bar” in the list. We’re not necessarily fast enough to work within your keystrokes, but we are fast enough to deal with keystrokes plus a bit of navigation.
What are the cases where being out of date can cause an issue? One is WinForms. In order for the designer to accurately display what your form will look like, it needs us to accurately examine, decompose, and report back the meaning of your InitializeComponent method. In the past we would just depend on our potentially out-of-date symbol graph, but that ended up being the source of many bugs and major headaches for the user (ever had all your controls disappear? There’s a good chance it was due to this). So, in VS2005, we changed our model: if WinForms asks us for data, we block the foreground thread until the background thread has finished working. In effect, for this circumstance we’ve moved to the VB model. Now, in order not to give you a horrible experience where you’re asking yourself “why the heck isn’t the system responding?”, we pop up a progress dialog to tell you what’s happening, why, and how long is left. We considered it important enough in this case because if we sent bad information to WinForms, we could end up corrupting your form.
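Continuing the hypothetical sketch from earlier, the WinForms path amounts to adding one blocking entry point next to the never-blocking one. Again, this is only the shape of the idea, not our actual code:

// Hypothetical addition to LanguageServiceSketch: a high-fidelity entry
// point that waits until the background thread has drained every
// pending edit. (A real implementation would also synchronize with the
// edit currently being processed; this is just the idea.)
public SemanticModel GetUpToDateModel()
{
    while (_pendingEdits.Count > 0)
    {
        // This is where the progress dialog would live, telling you
        // what's happening, why, and how long is left.
        ShowProgress(remaining: _pendingEdits.Count);
        Thread.Sleep(50);
    }
    return _current;
}

void ShowProgress(int remaining) { /* update the dialog */ }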
What we had here was a case where the requirements of one system and the capabilities of another did not line up. WinForms requires that the Language Service in question provide a high-fidelity analysis of the code, whereas the C# Language Service was designed to be fast, but not necessarily to provide high-fidelity information. Unfortunately, this realization and formalization wasn’t clear to either the WinForms team or our team in the VS2003 timeframe, which is why many of these bugs existed.
Ok, so that was the background bit; now on to the bugs I was fixing recently. I’ve known for a while that there are other areas where requirements and capabilities are out of sync, but it wasn’t clear to me how best to fix the problem.
The first is “Bring Up Completion List On Identifier” (BUCLOI). If you haven’t played with VS2005 yet, here’s how it works: when you start typing any C# identifier (or keyword), we automatically bring up the completion list, populated with all the identifiers relevant at that location (FYI: if you don’t like this feature, it’s simple to disable in the Tools|Options dialog). I.e. if you’re starting to write a method and you type “pu”, we’ll show you the completion list with “public” selected, and you can pick it without wasting time hitting <ctrl><space> to bring the list up yourself. Now, for this feature to be useful, we have to accurately provide you with the identifiers that would be valid in that location. This was a great way for us to find bugs in the language service when we introduced the feature more than a year ago. If an identifier wasn’t in the list, people would get frustrated and send us nasty bug reports telling us that the feature was getting in the way. But it let us know that we weren’t doing a good enough job analyzing your code, or weren’t getting all our internal information up to date fast enough. By pushing this right up in people’s faces we were able to improve performance and accuracy by probably an order of magnitude. However, even with all the work we did, there was one place where we were still running into problems and frustrating people. Specifically, in this case:
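Here’s a hypothetical reconstruction of the kind of code involved (the method is invented for illustration): you have a generic method and you type its return type, which is the type parameter itself.

class Example
{
    // The edit in question: typing the leading "T" (the return type,
    // which is also the method's type parameter).
    static T First<T>(T[] items)
    {
        return items[0];
    }
}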
As it turns out, this is one of those areas where a single character change ends up having a massive effect on our internal symbol graph. By adding the return type for this generic method we are changing the parse tree extensively, forcing a lot of re-analysis. On top of that, generic methods are special in how overloads and constraints are determined, and so there’s a fair bit of work that needs to be done to stitch all that information together. Because of this, it’s possible that on some occasions when you bring up the completion list for the return type it might not contain the generic type parameter. And what’s worse is that, because the type parameter is highly likely to be one character long, there’s a very high chance it will be the prefix of some other item in the list. If you then hit space, it will complete to that longer identifier and completely throw you off. Ugh. What an awful thing to do to you. You’re trying to type completely legal code and we totally screw you over. Definitely a bad thing, and something I think would leave some users cursing at us. Once again, this is a case where the high-fidelity requirement of BUCLOI contrasts with the lackadaisical nature of the analysis engine.
Normally we don’t run into these situations, but here was one case where it did become important. In fact, of all the grammatical areas in the C# language, this is one of the rare ones where, even though we’re so close to the definition of the item, adding or removing the reference to it ends up having very costly effects on us. In pretty much all other places it’s not a problem. Now, unlike the WinForms case, we couldn’t just block in this situation (since we’re dead set against that, and how would you feel about a dialog coming up saying “please wait” while you were trying to type the return type for a method!). So what did we end up doing? Well, something I really dislike, but which I think was an acceptable choice for VS2005: we special-cased. Instead of having fairly generic and consistent rules for how we process changes in your code, we now special-case what we do if you’re changing the return type of a generic method. In that case we make darn sure that our understanding of the generic method’s type parameters is brought up to date very early in the process. I would have preferred not to have to do this, but hopefully I can keep this hack abstracted away while keeping the rest of the architecture fairly clean.
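In sketch form (all names hypothetical, invented for illustration), the hack looks something like this:

// Hypothetical sketch of the special case: edits that touch the
// return-type position of a generic method get the method's type
// parameters re-bound eagerly, ahead of the normal background pass.
class EditProcessor
{
    public void ProcessEdit(Edit edit)
    {
        if (IsGenericMethodReturnTypeEdit(edit))
        {
            // The hack: bind the type parameters right away so the
            // completion list can include them.
            BindTypeParametersImmediately(edit);
        }

        // Everything else still follows the generic, consistent rules.
        QueueForBackgroundAnalysis(edit);
    }

    bool IsGenericMethodReturnTypeEdit(Edit edit) { /* ... */ return false; }
    void BindTypeParametersImmediately(Edit edit) { /* ... */ }
    void QueueForBackgroundAnalysis(Edit edit) { /* ... */ }
}

class Edit { }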
The second place this came up was in a great bug found by QA a while back. The basic gist of it was that after you performed an “extract method” refactoring, the “generate method stub” smart tag would immediately appear, offering to generate the method you’d just created! After reading everything so far, it’s probably pretty clear to you what had happened. The smart tag checked to see if the method call it was on could bind to an existing method. Of course, because that method had *just* been created (probably <5 milliseconds prior), the IntelliSense™ background thread hadn’t yet incorporated it into the symbol graph. And so the smart tag saw that the method wasn’t there and said: “hey, perfect place to display myself!”
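To picture the bug, here’s a hypothetical snapshot of the code immediately after the refactoring (the method names are invented for illustration):

class Form1
{
    void Process()
    {
        // Moments after "extract method" created ComputeTotal, the
        // "generate method stub" smart tag appeared on this call,
        // offering to create a method that already existed, because
        // the background thread hadn't seen it yet.
        int total = ComputeTotal();
    }

    int ComputeTotal()
    {
        int total = 0;
        for (int i = 0; i < 10; i++)
            total += i;
        return total;
    }
}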
So how did this get missed by us devs? I mean, you’d think we’d have seen it when we created these features. Well, as it turns out, I develop on a dual-proc machine, and on such machines this never repro’ed because the internal state was up to date by the time the smart tag queried it (similarly, the first example with method return types never repro’ed on those machines either). So once again this was a case where a feature wanted up-to-date information but was depending on a system that didn’t guarantee any such thing. Any guess as to how we eventually ended up solving this one?
As we move forward I think it’s very important to consider and document these design decisions very carefully. They have far-reaching consequences, and they need to be understood and planned for when designing features that will be affected by them.
Comments
- Anonymous
April 10, 2005
Cyrus,
that's interesting. however, i assume not all devs are working on dual proc machines, are they? in which case: shouldn't someone else have spotted this before?
WM_QUERY
thomas woelfer
- Anonymous
April 10, 2005
The comment has been removed
- Anonymous
April 10, 2005
Very interesting! I've found the VS2005 Intellisense quite useful indeed, so it's certainly nice to hear about its architecture. More posts like this are welcome :-)
- Anonymous
April 11, 2005
Very interesting article. It helps to understand what goes on at MS.
- Anonymous
April 11, 2005
I don't really buy that argument. The context switch for the syntax etc of the language is almost certainly greater than that required for the changes in IntelliSense behaviour. It is for me. I actually find it useful that behaviour is different on the occasions I have to switch to VB.Net. The altered behaviour reminds me to put the ' before the {}s :-)
My understanding of internal competition at Microsoft - someone correct me if I'm wrong - is that the language teams do in fact compete to implement the best features for their language, and get kudos for the best features. But the features which make the cut will (sometimes a version later) be available in all languages.
- Anonymous
April 11, 2005
"I don't really buy that argument. The context switch for the syntax etc of the language is almost certainly greater than that required for the changes in IntelliSense behaviour. It is for me."
shrug
The differences to me between VB.NET and C# are all syntactic; I write the same thing in the same way in the two languages, so this to me makes the differences between the IDEs all the more apparent.
Between C# and (M)C++ the differences are much greater, but I still find it incredibly jarring that VC++ for example doesn't perform any wiggle underlining when C# does.
"As I understand internal competition at Microsoft - someone correct my understanding if I'm wrong - is that the language teams do in fact compete to implement the best features for their language. And get kudos for the best features. But the features which make the cut will (sometimes a version later) be available on all languages. "
That may be true, but I don't believe it makes for a good end-user experience.
- Anonymous
April 11, 2005
Thomas: "Cyrus,
that's interesting. however, i assume not all devs are working on dual proc machines, are they? in which case: shouldn't someone else have spotted this before?
WM_QUERY
thomas woelfer "
The devs who were working on these features and had the best chance to see these bugs were both using dual proc machines. :)
Other devs were unlikely to be using Generate-Method-Stub and Extract-Method that much.
- Anonymous
April 11, 2005
The comment has been removed
- Anonymous
April 11, 2005
Jouni: "Very interesting! I've found the VS2005 Intellisense quite useful indeed, so it's certainly nice to hear about its architecture. More posts like this are welcome :-) "
Cool! At some point maybe I'll write an article about it for MSDN!
- Anonymous
April 11, 2005
Sam: "
My understanding of internal competition at Microsoft - someone correct me if I'm wrong - is that the language teams do in fact compete to implement the best features for their language, and get kudos for the best features. But the features which make the cut will (sometimes a version later) be available in all languages. "
Yes, and no.
The language teams are mostly independent. They have their own goals and their own customers that they try to please. And we try to make the best choices for our customers.
DrPizza is absolutely right in one way: we do a pretty bad job in considering the many developers who are multi-language workers.
And, definitely, it happens that customers see what a different language has and say "we want that for ours". That's why VB picked up some refactorings from us and we picked up E&C from them. So with every release there are going to be some differences, but over time things stabilize and get better :)
- Anonymous
April 11, 2005
"Interstingly enough we dont' hear a lot of this. Most users actually feel very differently than you on this point. "
The problem as I see it is that having a feature in one language but not another leads one to develop habits in one that one is then forced to break in the other. I'd sooner just not be able to develop the habit in the first place.
"You're misrepresenting me. It's not difficult to parse if all you're doign is writing a simple parser."
I don't think I'm misrepresenting you. You said it was "easy" as long as you had a grammar to feed into a parser generator. That grammar doesn't exist. The C++ spec is written in English, not EBNF, and none of the parser generators I've used read English. Even if the grammar did exist, it's context sensitive, so your regular parser generators can't handle it anyway (well, some such as ANTLR may have a way of adding semantic input to the parser so they may be able to do it... if you're willing to do the semantic analysis required... which means reading the spec again). In practice, you're better off just conflating syntactic and semantic analysis (or at least, large chunks thereof) and abandoning all pretence of separation. The language doesn't meet the criteria you listed to make writing a compiler trivial.
"Absolutely. However, once again, it's a matter of cost. I wrote a blog post a while back about how difficult refactoring is in the presense of a preprocessor. In C# this is not so bad because the vast majority of users don't use the preprocessor. However, in C++ this is not the case and trying to support them in that language would be enormously costly. In the end it was decided that it would just have to wait for the future. "
I would think it's acceptable for refactorings to operate on the preprocessed source and warn you whenever they would like to make a change to code generated by a macro (it'd still require you to record that pieces of code were generated by macros but that seems relatively simple). Refactorings should work across #includes, but I don't think they need to work across #defines.
"I think you misunderstood me. You will not get this dialog during typing scenarios. You will only get it occasionally when switching to a WinForms designer. "
Oh, I see. That's OK then.
- Anonymous
April 11, 2005
DrPizza: Once again, thank you very much for the excellent feedback. As you can guess, there isn't much we can do in the way of this for Whidbey, but we'll be taking it very seriously for the post-Whidbey release.
That said, let me go over your points:
"The problem as I see it is that having a feature in one language but not another leads one to develop habits in one that one is then forced to break in the other. I'd sooner just not be able to develop the habit in the first place."
You're absolutely right. And I'd really like us to be better in this regard in the future. Maybe keep the language behaviors distinct, but offer ways to unify them. So you could be in C# but have VB-like behavior (i.e. blocking until everything was ready). That way you could determine which style was best for you, and once you set that style it would carry over no matter what development language you were in.
"I would think it's acceptable for refactorings to operate on the preprocessed source and warn you whenever they would like to make a change to code generated by a macro (it'd still require you to record that pieces of code were generated by macros but that seems relatively simple). Refactorings should work across #includes, but I don't think they need to work across #defines."
That's a definite possibility. However, with all the work the C++ team had to do just to come up with C++/CLI it just wasn't possible to cover this. I think in the future you'll be pleased with where we're going, since we absolutely recognize the pains we're causing for a whole class of developers out there.
"Oh, I see. That's OK then."
Yup. I'd rather quit than be forced to implement a feature that blocks the user while they're typing :)
- Anonymous
April 11, 2005
Cyrus on Intellisense
- Anonymous
April 12, 2005
So the problem in WinForms where all your controls get zapped is fixed in VS2005?! Hooray! Nice one!
- Anonymous
April 12, 2005
The comment has been removed