They grow up so fast

So we were taking a look at the C# language service in VS 2002 and 2003 and we were comparing it to VS 2005 and we got quite a shock.  So in 2002 the language service weighed in at exactly 700k (well, not exactly, but close enough for the purposes of this discussion).  Then in 2003 it came in at 772k.  Ok, that seems somewhat reasonable.  Bug fixes would have churned up the source some, and maybe some new features would have accounted for the 10% increase in size.  So where are we at in 2005?

2807k

o_O

That's 4 times the size of 2002 and 3.6 times the size of 2003.  To give you a feeling for that, here's some bad ascii art:

----------------
| 2002         |
------------------
| 2003           |
-----------------------------------------------------------
| 2005                                                    |
-----------------------------------------------------------
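
If you want to check the ratios yourself, here's a quick sketch. It only uses the rounded sizes quoted in this post (700k, 772k, 2807k); the names `sizes_kb` and `ratio` are made up for the example.

```python
# Sizes in KB as quoted in the post (rounded figures, not exact measurements).
sizes_kb = {"2002": 700, "2003": 772, "2005": 2807}


def ratio(newer: str, older: str) -> float:
    """How many times larger the newer release is than the older one."""
    return sizes_kb[newer] / sizes_kb[older]


# Percent increase from 2002 to 2003.
growth_2002_to_2003 = (ratio("2003", "2002") - 1) * 100

print(f"2003 vs 2002: +{growth_2002_to_2003:.1f}%")      # about 10%
print(f"2005 vs 2002: {ratio('2005', '2002'):.1f}x")     # about 4x
print(f"2005 vs 2003: {ratio('2005', '2003'):.1f}x")     # about 3.6x
```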

So why am i so surprised by that?  Well, because a lot of the work done in 2005 was (IMO) to actually reduce the language service.  During all the work done on 2005 we took a long and hard look at all the problems we were having with 2003.  Architectural decisions that were holding us back, customer DTS and QFE problems, scalability issues, etc. etc.  And, we made a team decision that in order to really advance the product we actually needed to simplify a whole bunch of the codebase. 

The code was written IMO with a micro-optimization flair.  And while this did mean that it was quite fast in certain cases, the extra complexity added in by that optimization kept on making certain things difficult.  For example, an optimization to save memory when parsing types with no nested types ended up causing a bug whereby we might lose information about all nested types in your project in certain circumstances (yikes!). 

Another perf optimization was the "ProjData" file.  For those who don't know about it, this file is basically a dump of all our internal data structures (i.e. information about all the types and members in your project).  It existed in 2002 and 2003 so that when loading a project we wouldn't need to recalculate all the information that we'd computed in the last run.  However, the design of that file was very much "in your face".  So rather than being abstracted away so that most parts of the language service wouldn't need to know about it, basically everything needed to know about it and handle it correctly.  If the file format changed you probably had to go fix up about 100 places, and there were serious multi-threaded access woes that went along with it.  At a certain point in time we looked at that and said "this is an enormous source of problems for us, and we really need to see what we can do about it".  So we ended up removing that file entirely and we simplified the language service by an order of magnitude.
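
To make "abstracted away" a bit more concrete, here's a minimal sketch of a persisted cache hidden behind a single class, so that callers never see the file format. Every name in it (`CachedSymbolStore`, `analyze_project`, the JSON layout) is hypothetical and invented for illustration. It is not how the language service was written, and it ignores the multi-threaded access problems entirely.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List

# Hypothetical shape for cached project data: type name -> member names.
SymbolTable = Dict[str, List[str]]


class CachedSymbolStore:
    """Only this class knows the on-disk format; callers just call load()."""

    def __init__(self, cache_path: str, compute: Callable[[], SymbolTable]):
        self._cache_path = cache_path
        self._compute = compute  # the expensive recalculation from source

    def load(self) -> SymbolTable:
        # Reuse the last run's data if the cache file exists and parses.
        try:
            with open(self._cache_path, "r", encoding="utf-8") as f:
                return json.load(f)
        except (OSError, ValueError):
            # Missing or unreadable cache: recompute, then write it back.
            table = self._compute()
            with open(self._cache_path, "w", encoding="utf-8") as f:
                json.dump(table, f)
            return table


calls = []


def analyze_project() -> SymbolTable:
    """Stand-in for the real analysis; records how often it runs."""
    calls.append(1)
    return {"Widget": ["Draw", "Resize"]}


path = os.path.join(tempfile.mkdtemp(), "projdata.json")
store = CachedSymbolStore(path, analyze_project)
first = store.load()   # computes the table and writes the cache file
second = store.load()  # served from the cache file, no recomputation
```

With a shape like this, a format change touches one class rather than a hundred call sites, which is the property the ProjData design was missing.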

By doing these simplifications we found that while our perf might have degraded slightly over 2003 in small cases, it vastly outperformed it in large cases.  And, frankly, on the small cases we're talking about performance losses of microseconds, whereas on large projects we're talking about gains of minutes (if not more).

So many complex chunks of code were vastly simplified or ripped out entirely.  When we ran into other performance problems (like having a WinForms control with tens of thousands of elements on it), we attacked the problem from the ground up to make sure we could handle that well.  We also talked to some of our huge enterprise customers to get help with this.  We took projects with 25 MB of source (yes, that's right), and used them to ensure that the language service could really handle what would be thrown at it.

Do we fail at some things?  Yes.  If you have a project with 2 billion files in it, we'll probably not be able to scale to that level.  But, for the most part, we should be able to meet most people's needs.

So why did we grow so much?  It's still somewhat of a mystery to me.  We do have the refactoring code, and the EnC code, the new formatting engine, as well as the new smart tags and whatnot, and all the extra work to support things like generics.  But those really didn't seem like *that* much code.  But i guess when i break down everything the language service does (like):

1) Code Generation – Handles any time we spit out code (like for generate-method-stub and implement-interface)

2) CodeModel – We implement the code model that is exposed to teams like Team System for introspection of your source code

3) Intellisense support for completion lists, parameter help, quick info

4) Debugger Interaction – We figure out what should go in many of the debugger displays, and we help bind your breakpoints

5) Snippets support

6) Formatting

7) Code navigation support – Things like the navigation bars, goto definition, and metadata-as-source

8) The new smart tags and little productivity enhancements (like “add-using”)

9) Refactorings

10) Metadata reading

11) WinForms support – For analyzing your code and figuring out what it means

12) Web Development support – Using C# inside HTML code

13) All the massive infrastructure to just support understanding your code as you type it

Then i can see how things might have grown.  There wasn't a single area here where we didn't do quite a lot of work (or all the work in the case of new features), in order to make things better over 2003.  So maybe even though we cleaned up a lot of stuff, made things more scalable and stable, we still ended up increasing our size by a ton.

Hopefully if you use 2005 you'll see it like so:

     ------------------
2003 |                |
     -----------------------------------------------------------
2005 |2003 features|New 2005 yumminess that you absolutely love|
     -----------------------------------------------------------

Comments

  • Anonymous
    March 19, 2005
    The comment has been removed
  • Anonymous
    March 20, 2005
    Cyrus,

    What is Web Development support – Using C# inside HTML code?

    Thanks,
    Vaibhav
  • Anonymous
    March 20, 2005
    Thomas: Yes, that's exactly what i'm saying. We do have an issue where a lot of projects can make opening/closing a solution slow. Do you have more than 1000 projects in your solution?

    The DLL locking problem was something we got an enormous amount of feedback on and there was a ton of cross-group collaboration done to try and eliminate it.

    BTW, if you want, you can send me your solution so that i can do some perf tests on it.

    I would totally understand if you didn't want to do this considering it's either your code or your employer's, and probably neither of you want others to look at it.

    However, if you are interested we could talk and see if some sort of agreement could be drafted that would allow us to use that project for testing purposes, and nothing else.
  • Anonymous
    March 20, 2005
    The comment has been removed
  • Anonymous
    March 20, 2005
    you almost got it, why-Bav :)

    Thanks
  • Anonymous
    March 20, 2005
    The comment has been removed
  • Anonymous
    March 20, 2005
    Whoa...

    135 MB of source?

    Whoa...

    Whoa...

    How many projects?
  • Anonymous
    March 20, 2005
    > Whoa...
    > 135 MB of source?
    > Whoa...
    > Whoa...
    > How many projects?

    I thought you guys were getting to know the real world. This kind of thing shouldn't be that much of a shock you know...

    Marc
  • Anonymous
    March 21, 2005
    Marc: We were. I've never seen a report from a customer with that much C# source.
  • Anonymous
    March 21, 2005
    Marc: I even asked on my blog a while ago the sizes of projects people had, and nothing approached that.
  • Anonymous
    March 21, 2005
    I am glad to see more elements can be handled by the winforms designer -- this means the VG.net designer will also be able to handle more elements. People often create hundreds of elements in VG.net.
  • Anonymous
    March 21, 2005
    Frank: What's VG.Net?

    The performance enhancements i was referring to were specifically about being able to understand and pass information about the C# code to the winforms designer much more quickly than we had before.
  • Anonymous
    March 21, 2005
    280 projects from last count. We've got 60 developers working on this product full time.
  • Anonymous
    March 21, 2005
    The comment has been removed
  • Anonymous
    March 21, 2005
    Matt: Can you contact me through the "contact" link. Thanks!
  • Anonymous
    March 21, 2005
    Cyrus,

    i tried to contact you through the contact link but got no response. you can contact me at tw@die.de

    WM_THX
    thomas woelfer
  • Anonymous
    March 27, 2005
    Everyone,

    fyi: quite some time ago, cyrus wrote (in the comment section of this entry):

    ' BTW, if you want, you can send me your solution so that i can do some perf tests on it. '

    which sounded like a nice offer. i contacted him, but never heard back. i contacted him again (by using the blog posting above) - and he apologized and asked back about the size of the project. i answered again - and never heard back.

    never heard anything since.

    so i guess cyrus is trying to look interested and friendly pretty hard. but it's either just a lie or this guy is so disorganized that his attitude doesn't matter anyway.

    WM_FYI
    thomas woelfer
  • Anonymous
    March 28, 2005
    Thomas: This is something that I'm working on. But these things take time. I'm sorry that i haven't been able to keep up with you to your satisfaction and i'll try harder! :(
  • Anonymous
    March 28, 2005
    ok, understood.

    WM_L8R
    thomas woelfer
  • Anonymous
    March 31, 2005
    I think VG.net is the only commercial root designer outside MS? At any rate, all the perf enhancements should help it load Pictures faster as well.
  • Anonymous
    March 31, 2005
    http://www.vgdotnet.com
    It is vector graphics for .net developers.