Formatting intro

Anson and I were discussing formatting last night (at around 1 am).  He'd received some feedback from customers about the new formatting engine Kevin has written for Whidbey.  The issue is that the feature (like many others added in Whidbey) tends to be very aggressive.  Unlike VS2003, which only really affected indentation and curly-brace placement, the new formatter goes after all whitespace (except inside comments and strings) and tries to figure out the right amount of space each run should actually take up.  So one can think of the formatter as simply taking a list of tokens and a function from whitespace to whitespace and producing the updated token list.  In other words, you could write it as:

# type token = Whitespace of string | Identifier (* plus other tokens *);;
type token = Whitespace of string | Identifier
# let rec format token_list f =
    match token_list with
        [] -> []
      | h::t ->
          (match h with
              Whitespace(_) -> (f h)
            | _ -> h)::(format t f);;
val format : token list -> (token -> token) -> token list = <fun>
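
To see the function in action, here is a minimal sketch that runs it over a tiny token list and prints the result. The payload on Identifier, the Punct constructor, and the render and one_space helpers are assumptions made up for this example; they are not part of the real formatter.

```ocaml
(* Token type from the post, extended for illustration only:
   the string payload on Identifier and the Punct constructor are hypothetical. *)
type token =
  | Whitespace of string
  | Identifier of string
  | Punct of string

(* Same behaviour as the post's format: apply f to whitespace tokens only. *)
let rec format token_list f =
  match token_list with
  | [] -> []
  | (Whitespace _ as h) :: t -> f h :: format t f
  | h :: t -> h :: format t f

(* Hypothetical helper: turn a token list back into source text. *)
let render tokens =
  String.concat ""
    (List.map (function Whitespace s | Identifier s | Punct s -> s) tokens)

(* Hypothetical whitespace function: every run becomes a single space. *)
let one_space = function
  | Whitespace _ -> Whitespace " "
  | t -> t

let input =
  [Identifier "x"; Whitespace "   "; Punct "="; Whitespace "\t"; Identifier "y"]

let output = render (format input one_space)

let () = print_endline output  (* prints "x = y" *)
```

Only the two Whitespace tokens are passed to one_space; the identifiers and the "=" come through untouched.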

That's really basically it.  We can extend this slightly further to deal with the grammatical (AST) structure of code, but that's pretty trivial to do with functors.  However, the difficulty really comes in defining the function f.  This opens up a big can of worms.  In Whidbey we've taken the route of supplying some basic functions for you.  For example:

# let clear w = match w with Whitespace (_) -> Whitespace ("") | _ -> w;;

Which removes whitespace (which you might see when formatting

“if (” into “if(”)

or

# let trim w = match w with Whitespace (_) -> Whitespace (" ") | _ -> w;;

Which reduces a sequence of whitespace into one space.
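
As a rough sketch of how these two plug into the formatter, the snippet below runs clear and trim over the tokens for "if   (". The Word constructor and the render helper are stand-ins invented for the example, and format is written with List.map here, which should behave the same as the recursive version above.

```ocaml
(* Word is a hypothetical stand-in for every non-whitespace token. *)
type token = Whitespace of string | Word of string

(* The two built-in style functions from the post. *)
let clear w = match w with Whitespace _ -> Whitespace "" | _ -> w
let trim w = match w with Whitespace _ -> Whitespace " " | _ -> w

(* Apply f to whitespace tokens only, leaving everything else alone. *)
let format tokens f =
  List.map (function Whitespace _ as w -> f w | t -> t) tokens

(* Hypothetical helper: turn a token list back into source text. *)
let render tokens =
  String.concat "" (List.map (function Whitespace s | Word s -> s) tokens)

let tokens = [Word "if"; Whitespace "   "; Word "("]

let cleared = render (format tokens clear)  (* "if(" *)
let trimmed = render (format tokens trim)   (* "if (" *)

let () = Printf.printf "%s\n%s\n" cleared trimmed
```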

There are also functions for dealing with newlines and indentation, but for the most part that's all we've provided.  The issue is that this isn't a very rich system.  Because we've defined all the modifications ourselves, people can't define their own way of formatting whitespace.  For example, you cannot say “I want 2 spaces between ‘class’ and the name of the class”.  The next post will deal with our thoughts on how to make this better.
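
To make the limitation concrete: the f above only ever sees the whitespace token itself, so it has no idea what sits on either side of it. Purely as an illustration (this is not how Whidbey works, and not necessarily where the next post is headed), one could imagine a variant where the rule also receives the neighbouring tokens. Everything named here (Keyword, the rule type, format_ctx, two_after_class, render) is hypothetical.

```ocaml
(* Hypothetical token type: Keyword and the Identifier payload are
   assumptions added for this sketch. *)
type token =
  | Whitespace of string
  | Keyword of string
  | Identifier of string

(* A rule gets the previous and next non-whitespace tokens plus the
   current whitespace text, and returns the replacement text. *)
type rule = token option -> token option -> string -> string

let format_ctx (rule : rule) tokens =
  let rec go prev = function
    | [] -> []
    | Whitespace s :: rest ->
        let next = match rest with [] -> None | t :: _ -> Some t in
        Whitespace (rule prev next s) :: go prev rest
    | t :: rest -> t :: go (Some t) rest
  in
  go None tokens

(* "I want 2 spaces between class and the name of the class";
   all other whitespace is left exactly as it was. *)
let two_after_class prev next s =
  match prev, next with
  | Some (Keyword "class"), Some (Identifier _) -> "  "
  | _ -> s

(* Hypothetical helper: turn a token list back into source text. *)
let render tokens =
  String.concat ""
    (List.map (function Whitespace s | Keyword s | Identifier s -> s) tokens)

let result =
  render (format_ctx two_after_class
            [Keyword "class"; Whitespace " "; Identifier "Foo"])

let () = print_endline result  (* prints "class  Foo" *)
```

The point is only that a user-supplied rule needs context to express this kind of preference, which the whitespace-to-whitespace function does not give it.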

Comments

  • Anonymous
    May 17, 2004
    Some developers make entire languages out of whitespace...

    I think developers can be very picky about how code is formatted. Some of my peers write excellent code, but it looks ugly to me because of the formatting, i.e. in this case the developer does not put whitespace between operations
    (int x=24;) rather than (int x = 24)

  • Anonymous
    May 17, 2004
    If there's ONE feature I'd like to have, it's the ability to COMPLETELY TURN OFF the automatic formatting in Visual Studio 2003. Every time I turn around Visual Studio has mangled my HTML beyond recognition. My favorite is when VS2003 removes the whitespace between the end of a server control and the following literal text, therefore making the two run together on the rendered page. I put the space back, but VS2003 pulls it back out later at some random time when I'm not looking. I finally gave up and just let my website look goofy with missing spaces...

    How about a Service Pack for VS2003 that allows me to disable automatic formatting altogether??
  • Anonymous
    May 17, 2004
    Is the actual engine written in OCaml?
    Just curious. :-)
  • Anonymous
    May 18, 2004
    Michael: In Whidbey you can disable all automatic formatting of code.
  • Anonymous
    May 18, 2004
    Brian: I'll send your thoughts to our PM here who will know how to get that to the right ears. We may even be able to find a blog where you can talk directly to someone responsible for this area.
  • Anonymous
    May 18, 2004
    Toshiyuki: Unfortunately no. However, there's always hope for the future. F# is a good way to bridge the gap between OCaml and the managed world.
  • Anonymous
    June 06, 2004
    Brian: Scott Guthrie blogs about ASP.NET and I'm sure he'd want to hear this feedback. I also know from Tech-Ed that the non-formatting of html code was something they made sure to do for VS2k5.

    His blog is at: http://weblogs.asp.net/scottgu/