Lightweight syntax option in F# 1.1.12.3
We're glad to announce that F# 1.1.12.3 supports an optional lightweight syntax in which whitespace makes indentation significant. At the time of this release this is an experimental feature, though we expect its use to become widespread.
The F# indentation-aware syntax option is a conservative extension of the explicit language syntax, in the sense that it simply lets you leave out certain tokens such as `in` and `;;` by having the parser take indentation into account. This can make a surprising difference to the readability of code.
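As a minimal sketch of what "leaving out tokens" means, the two functions below are intended to be equivalent. The function names are hypothetical and chosen only for illustration. The `#light` directive is what the release described here requires; later compilers are assumed to enable the option by default, in which case the directive can be dropped.

```fsharp
#light

// Explicit style: every local binding is closed by an `in` token.
let sumOfSquaresExplicit x y =
    let a = x * x in
    let b = y * y in
    a + b

// Light style: indentation closes each binding, so `in` is omitted.
let sumOfSquaresLight x y =
    let a = x * x
    let b = y * y
    a + b

printfn "%d %d" (sumOfSquaresExplicit 3 4) (sumOfSquaresLight 3 4)
```

Because the light option is an extension rather than a replacement, the explicit form remains legal in the same file.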
[ Note: This feature is similar in spirit to the use of indentation by Python and Haskell, and we thank Simon Marlow (of Haskell fame) for his help in designing this feature and sketching the implementation technique. We also thank all the F# users at MSR Cambridge who've been helping us iron out the details of this feature. ]
Compiling your code with the indentation-aware syntax option is useful even if you continue to use explicit tokens, as it reports many indentation problems with your code and ensures a regular, clear formatting style. The F# library is written in this way.
In this article we call the indentation-aware syntax option the "light" syntax option. It is also occasionally called the "hardwhite" or "white" option (because whitespace is "hard", i.e. significant as far as the lexer and the parser are concerned).
The light syntax option is enabled using the `#light` directive in a source file. This directive scopes over all of the subsequent text of a file.
When the light syntax option is enabled, comments are considered pure whitespace. This means the indentation position of comments is irrelevant and ignored. Comments act entirely as if they were replaced by whitespace characters.
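A small sketch of the comment rule, using a hypothetical `area` function: neither comment below lines up with the surrounding code, yet the function body is expected to parse as if the comments were blank space.

```fsharp
#light

let area width height =
    let raw = width * height
// A comment in column 0 does not close the function body.
            (* Nor does a comment indented further than the code. *)
    raw

printfn "%d" (area 3 4)
```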
TAB characters may not be used when the light syntax option is enabled. You should ensure your editor is configured to replace TAB characters with spaces, e.g. in Visual Studio 2005 go to "Tools\Options\Text Editor\F#\Tabs" and select "Insert spaces".
Using the light syntax option makes code clearer by doing three things:
Fewer tokens. Nearly all end-of-line separator tokens become optional in well-indented code. In particular, `;;`, `in` and `;` tokens can generally be omitted.
Clearer disambiguation. It uses indentation to disambiguate the parsing of certain constructs, e.g. nested `if`/`then`/`else` blocks and nested `match` blocks. This greatly reduces the number of parentheses in code with nested branching constructs.
Sanity checks. It applies additional sanity checks on formatting, reporting places where "undentation" has been used. Undentation is where a language construct has been used at a column position that is "undented" from an enclosing construct, which breaks the important principle that nested constructs appear at increasing column positions. Some manifestations of undentation are permitted in certain positions in the language syntax.
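To illustrate the disambiguation point, here is a hedged sketch of a nested `match` written without parentheses. The `describe` function is hypothetical. Under the light option the final `| _` case is expected to attach to the outer `match`, because it sits at the outer column rather than the inner one.

```fsharp
#light

let describe a b =
    match a with
    | 0 ->
        // The inner match ends where the indentation returns to the outer column.
        match b with
        | 0 -> "both zero"
        | _ -> "only a is zero"
    | _ -> "a is non-zero"

printfn "%s" (describe 1 0)
```

Without indentation awareness, the last case would normally need `(` ... `)` or `begin` ... `end` around the inner `match` to be read this way.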
The basic rules applied when the light syntax option is activated are shown below, illustrated by example.
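One illustration of the separator rules, as a sketch rather than a reproduction of the original examples: the hypothetical `greet` function sequences two expressions without `;`, and the top-level definitions are not separated by `;;`.

```fsharp
#light

let greet name =
    let message = "hello, " + name
    printfn "%s" message     // no `;` before the next expression
    message.Length           // the last expression is the result

let length = greet "world"   // no `;;` after the previous definition
```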
Undentation. In general, nested expressions must occur at increasing column positions in indentation-aware code; this is called the "incremental indentation" rule. Warnings or syntax errors will be given where this is not the case. However, for certain constructs "undentation" is permitted. In particular, undentation is permitted in the following situations:
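One undentation that F# compilers are known to accept is the body of an anonymous function passed as an argument. The sketch below (with a hypothetical `doubled` value) is offered as an illustration of the idea, not as a complete list of the permitted cases: the body `x * 2` starts to the left of the `(` and `fun` tokens that enclose it.

```fsharp
#light

let doubled =
    List.map (fun x ->
        x * 2) [1; 2; 3]   // body is undented relative to `(fun`

printfn "%A" doubled
```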
More details: offside lines and contexts. Indentation-aware syntax is sometimes called the "offside rule". This pleasant terminology comes from a 1965 paper where Peter Landin introduced the idea, and derives from football (soccer), where the last defending player causes an imaginary line to be drawn across the pitch, and if an attacker is beyond this line the referee will blow the whistle and call "offside!". In F# code offside lines occur at column positions. For example, a `=` token associated with `let` introduces an offside line at the column of the first token after the `=` token.
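A minimal sketch of that column rule, using a hypothetical `result` binding: the first token after `=` is the inner `let`, so its column becomes the offside line, and the following lines are aligned on exactly that column.

```fsharp
#light

let result = let a = 1
             let b = 2
             a + b

printfn "%d" result
```

In everyday code the body usually starts on the next line with a fixed indent, which places the offside line at that indent instead.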
When a token occurs prior to an offside line, one of three things happens:
(1) enclosing constructs are terminated. This may result in a syntax error, e.g. when there are unclosed parentheses.
(2) extra delimiting tokens are inserted. In particular, when the offside line associated with the token after a `do` in a `while` ... `do` construct is violated, a `done` token is inserted.
(3) an "undentation" warning or error is given, indicating that the construct is badly formatted. This is usually simple to remove by adding extra indentation and applying standard structured formatting to your code.
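Case (2) can be sketched with a loop that has no closing `done`. The `sumBelow` function is hypothetical. The final `sum` sits to the left of the loop body's offside line, which is the point at which the loop is expected to be closed.

```fsharp
#light

let sumBelow n =
    let mutable i = 0
    let mutable sum = 0
    while i < n do
        sum <- sum + i     // first token after `do` sets the loop's offside line
        i <- i + 1
    sum                    // left of that line: the loop is closed before this token

printfn "%d" (sumBelow 5)
```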
When a token occurs directly on an offside line, an extra delimiting token may be inserted. For example, when a token occurs directly on the offside line of a context introduced by a `let`, an appropriate delimiting separator token is inserted, i.e. an `in` token.
Offside lines are also introduced by other structured constructs, in particular at the column of the first token after the `then` in an `if`/`then`/`else` construct, and likewise after `try`, `else`, `->` and `with` (in a `match`/`with` or `try`/`with`) and `with` (in a type augmentation). "Opening" bracketing tokens `(`, `{` and `begin` also introduce an offside line. In all these cases the offside line introduced is determined by the column number of the first token following the significant token. Offside lines are also introduced by `let`, `if` and `module`. In these cases the offside line occurs at the start of the identifier.
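The sketch below exercises a few of these constructs. All three names are hypothetical. In each case the body is laid out relative to the first token after `then`, `else`, `try`, `with` or `(`.

```fsharp
#light

// Nested if/then/else: each branch body starts its own offside line.
let classify n =
    if n < 0 then
        "negative"
    else
        if n = 0 then
            "zero"
        else
            "positive"

// try/with: the handler cases line up under `with`.
let safeDivide x y =
    try
        Some (x / y)
    with
    | :? System.DivideByZeroException -> None

// An opening `(` starts an offside line at the token that follows it.
let scaled =
    (let a = 1
     let b = 2
     a + b) * 10
```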
The "light" syntax option is implemented as a pre-parse of the token stream coming from a lexical analysis of the input text (according to the lexical rules above), and uses a stack of contexts. When a column position becomes an offside line a "context" is pushed. "Closing" bracketing tokens (`)`, `}` and `end`) automatically terminate offside contexts up to and including the context introduced by the corresponding "opening" token.
Here are some examples of the offside rule being applied to F# code:
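As one hedged illustration, the pre-parse can be modelled with a toy filter that turns light code into its explicit form. This is a deliberately simplified sketch, not the compiler's implementation: `Token`, `tokenize` and `insertIn` are hypothetical names, tokens are split on spaces only, every `=` is assumed to belong to a `let`, and the only token ever inserted is `in`.

```fsharp
#light

// Toy model: a token is its text plus the line and column where it starts.
type Token = { Text: string; Line: int; Col: int }

// Split source text on spaces, recording where each token starts.
let tokenize (source: string) : Token list =
    let scanLine lineNo (line: string) =
        let rec scan i acc =
            if i >= line.Length then List.rev acc
            elif line.[i] = ' ' then scan (i + 1) acc
            else
                let stop = line.IndexOf(' ', i)
                let j = if stop < 0 then line.Length else stop
                let tok = { Text = line.Substring(i, j - i); Line = lineNo; Col = i }
                scan j (tok :: acc)
        scan 0 []
    source.Split('\n')
    |> Array.toList
    |> List.mapi scanLine
    |> List.concat

// Walk the tokens with a stack of offside columns, inserting `in` where a
// line starts exactly on the offside line at the top of the stack.
let insertIn (tokens: Token list) : string list =
    let rec loop stack pending prevLine (toks: Token list) acc =
        match toks with
        | [] -> List.rev acc
        | t :: rest ->
            let isEq = (t.Text = "=")
            if pending then
                // First token after `=`: its column becomes a new offside line.
                loop (t.Col :: stack) isEq t.Line rest (t.Text :: acc)
            elif t.Line <> prevLine then
                // First token on a new line: pop every context it is offside of.
                let stack' = stack |> List.skipWhile (fun col -> col > t.Col)
                match stack' with
                | top :: _ when top = t.Col ->
                    loop stack' isEq t.Line rest (t.Text :: "in" :: acc)
                | _ ->
                    loop stack' isEq t.Line rest (t.Text :: acc)
            else
                loop stack isEq t.Line rest (t.Text :: acc)
    loop [] false -1 tokens []

let source = "let f x =\n    let a = x * x\n    let b = a + 1\n    a + b"
let explicitForm = source |> tokenize |> insertIn |> String.concat " "
printfn "%s" explicitForm
// prints: let f x = let a = x * x in let b = a + 1 in a + b
```

The stack mirrors the description above: a column is pushed when it becomes an offside line, popped when a later line starts to its left, and a delimiter is inserted when a line starts exactly on it.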
Enjoy!
Don and James for the F# team
Comments
Anonymous
August 23, 2006
We're very pleased to announce that F# 1.1.12.3 is available for download.
This release incorporates...
Anonymous
August 25, 2006
Wonderful! It is rather easy to create buggy programs in Caml using nested match, and you will not get any warnings.
Maybe an option is to still use "(, ), begin, end, in", and get a warning if the indentation isn't consistent with the "(, ), ..."
Anonymous
May 27, 2007
We kindly invite all of you hurting yourself and many others with this just another sharp language... Ouch! Stop hurting! Stop #!
Anonymous
May 29, 2007
thanks for the implementation notes at the end. ive been a boo fan for a while, an indent-aware python-inspired dotnet language. although the parser has always supported indentation, theres a number of added features we're trying to coax our tokenizer into doing. we still havent added the parsers routines to dump inline ndoc, and the interpreter still cannot process the backspace key (since characters are fed in via an S.I.Stream). just two silly things, but both seem much more complex when you try actually effecting change. i greatly enjoy hearing tales of other people modifying their compilers and systems to the benefit of coders and wrists everywhere, and its wonderful hearing the technical successes behind this growing and altogether-wonderful project.
Anonymous
June 25, 2007
Mattias, OCaml automatically indents code for you.
Anonymous
September 05, 2009
Coming from Ocaml but having also programmed in Python, Boo and Haskell i have to say that i don't like that feature much for a caml like language. It makes sense for python and for haskell too but for staying close to ocaml it should stay optional and not be the default setting. Just saving a few in's here and there is not a good argument for introducing such a "feature". It also hides the fact that things declared with "let ... in" are defined for the actual scope (just using indenting for this is not as clear). So i think it does not make the source code clearer but is more confusing, even more so for beginners. Ocaml folks have their own indentation rules mostly coming from the tuareg mode of emacs and if they switch to F# they will be a bit disappointed. A great strength of ocaml is the ability to indent the code as the programmer wishes. I tried the Visual Studio 2008 F# prerelease and was dismayed by all the "errors" i did get when writing code until i discovered this "feature". So please let it at most be optional and please not the default setting for writing F# code. A lot of people switching from ocaml or other languages will thank you a lot.