แก้ไข

แชร์ผ่าน


F# coding conventions

The following conventions are formulated from experience working with large F# codebases. The Five principles of good F# code are the foundation of each recommendation. They are related to the F# component design guidelines, but are applicable for any F# code, not just components such as libraries.

Organizing code

F# features two primary ways to organize code: modules and namespaces. These are similar, but do have the following differences:

  • Namespaces are compiled as .NET namespaces. Modules are compiled as static classes.
  • Namespaces are always top level. Modules can be top-level and nested within other modules.
  • Namespaces can span multiple files. Modules cannot.
  • Modules can be decorated with [<RequireQualifiedAccess>] and [<AutoOpen>].

The following guidelines will help you use these to organize your code.

Prefer namespaces at the top level

For any publicly consumable code, namespaces are preferential to modules at the top level. Because they are compiled as .NET namespaces, they are consumable from C# without resorting to using static.

// Recommended.
namespace MyCode

type MyClass() =
    ...

Using a top-level module may not appear different when called only from F#, but for C# consumers, callers may be surprised by having to qualify MyClass with the MyCode module when not aware of the specific using static C# construct.

// Will be seen as a static class outside F#
module MyCode

type MyClass() =
    ...

Carefully apply [<AutoOpen>]

The [<AutoOpen>] construct can pollute the scope of what is available to callers, and the answer to where something comes from is "magic". This is not a good thing. An exception to this rule is the F# Core Library itself (though this fact is also a bit controversial).

However, it is a convenience if you have helper functionality for a public API that you wish to organize separately from that public API.

module MyAPI =
    [<AutoOpen>]
    module private Helpers =
        let helper1 x y z =
            ...

    let myFunction1 x =
        let y = ...
        let z = ...

        helper1 x y z

This lets you cleanly separate implementation details from the public API of a function without having to fully qualify a helper each time you call it.

Additionally, exposing extension methods and expression builders at the namespace level can be neatly expressed with [<AutoOpen>].

Use [<RequireQualifiedAccess>] whenever names could conflict or you feel it helps with readability

Adding the [<RequireQualifiedAccess>] attribute to a module indicates that the module may not be opened and that references to the elements of the module require explicit qualified access. For example, the Microsoft.FSharp.Collections.List module has this attribute.

This is useful when functions and values in the module have names that are likely to conflict with names in other modules. Requiring qualified access can greatly increase long-term maintainability and the ability of a library to evolve.

[<RequireQualifiedAccess>]
module StringTokenization =
    let parse s = ...

...

let s = getAString()
let parsed = StringTokenization.parse s // Must qualify to use 'parse'

Sort open statements topologically

In F#, the order of declarations matters, including with open statements (and open type, just referred as open farther down). This is unlike C#, where the effect of using and using static is independent of the ordering of those statements in a file.

In F#, elements opened into a scope can shadow others already present. This means that reordering open statements could alter the meaning of code. As a result, any arbitrary sorting of all open statements (for example, alphanumerically) is not recommended, lest you generate different behavior that you might expect.

Instead, we recommend that you sort them topologically; that is, order your open statements in the order in which layers of your system are defined. Doing alphanumeric sorting within different topological layers may also be considered.

As an example, here is the topological sorting for the F# compiler service public API file:

namespace Microsoft.FSharp.Compiler.SourceCodeServices

open System
open System.Collections.Generic
open System.Collections.Concurrent
open System.Diagnostics
open System.IO
open System.Reflection
open System.Text

open FSharp.Compiler
open FSharp.Compiler.AbstractIL
open FSharp.Compiler.AbstractIL.Diagnostics
open FSharp.Compiler.AbstractIL.IL
open FSharp.Compiler.AbstractIL.ILBinaryReader
open FSharp.Compiler.AbstractIL.Internal
open FSharp.Compiler.AbstractIL.Internal.Library

open FSharp.Compiler.AccessibilityLogic
open FSharp.Compiler.Ast
open FSharp.Compiler.CompileOps
open FSharp.Compiler.CompileOptions
open FSharp.Compiler.Driver

open Internal.Utilities
open Internal.Utilities.Collections

A line break separates topological layers, with each layer being sorted alphanumerically afterwards. This cleanly organizes code without accidentally shadowing values.

Use classes to contain values that have side effects

There are many times when initializing a value can have side effects, such as instantiating a context to a database or other remote resource. It is tempting to initialize such things in a module and use it in subsequent functions:

// Not recommended, side-effect at static initialization
module MyApi =
    let dep1 = File.ReadAllText "/Users/<name>/config-options.txt"
    let dep2 = Environment.GetEnvironmentVariable "DEP_2"

    let private r = Random()
    let dep3() = r.Next() // Problematic if multiple threads use this

    let function1 arg = doStuffWith dep1 dep2 dep3 arg
    let function2 arg = doStuffWith dep1 dep2 dep3 arg

This is frequently problematic for a few reasons:

First, application configuration is pushed into the codebase with dep1 and dep2. This is difficult to maintain in larger codebases.

Second, statically initialized data should not include values that are not thread safe if your component will itself use multiple threads. This is clearly violated by dep3.

Finally, module initialization compiles into a static constructor for the entire compilation unit. If any error occurs in let-bound value initialization in that module, it manifests as a TypeInitializationException that is then cached for the entire lifetime of the application. This can be difficult to diagnose. There is usually an inner exception that you can attempt to reason about, but if there is not, then there is no telling what the root cause is.

Instead, just use a simple class to hold dependencies:

type MyParametricApi(dep1, dep2, dep3) =
    member _.Function1 arg1 = doStuffWith dep1 dep2 dep3 arg1
    member _.Function2 arg2 = doStuffWith dep1 dep2 dep3 arg2

This enables the following:

  1. Pushing any dependent state outside of the API itself.
  2. Configuration can now be done outside of the API.
  3. Errors in initialization for dependent values are not likely to manifest as a TypeInitializationException.
  4. The API is now easier to test.

Error management

Error management in large systems is a complex and nuanced endeavor, and there are no silver bullets in ensuring your systems are fault-tolerant and behave well. The following guidelines should offer guidance in navigating this difficult space.

Represent error cases and illegal state in types intrinsic to your domain

With Discriminated Unions, F# gives you the ability to represent faulty program state in your type system. For example:

type MoneyWithdrawalResult =
    | Success of amount:decimal
    | InsufficientFunds of balance:decimal
    | CardExpired of DateTime
    | UndisclosedFailure

In this case, there are three known ways that withdrawing money from a bank account can fail. Each error case is represented in the type, and can thus be dealt with safely throughout the program.

let handleWithdrawal amount =
    let w = withdrawMoney amount
    match w with
    | Success am -> printfn $"Successfully withdrew %f{am}"
    | InsufficientFunds balance -> printfn $"Failed: balance is %f{balance}"
    | CardExpired expiredDate -> printfn $"Failed: card expired on {expiredDate}"
    | UndisclosedFailure -> printfn "Failed: unknown"

In general, if you can model the different ways that something can fail in your domain, then error handling code is no longer treated as something you must deal with in addition to regular program flow. It is simply a part of normal program flow, and not considered exceptional. There are two primary benefits to this:

  1. It is easier to maintain as your domain changes over time.
  2. Error cases are easier to unit test.

Use exceptions when errors cannot be represented with types

Not all errors can be represented in a problem domain. These kinds of faults are exceptional in nature, hence the ability to raise and catch exceptions in F#.

First, it is recommended that you read the Exception Design Guidelines. These are also applicable to F#.

The main constructs available in F# for the purposes of raising exceptions should be considered in the following order of preference:

Function Syntax Purpose
nullArg nullArg "argumentName" Raises a System.ArgumentNullException with the specified argument name.
invalidArg invalidArg "argumentName" "message" Raises a System.ArgumentException with a specified argument name and message.
invalidOp invalidOp "message" Raises a System.InvalidOperationException with the specified message.
raise raise (ExceptionType("message")) General-purpose mechanism for throwing exceptions.
failwith failwith "message" Raises a System.Exception with the specified message.
failwithf failwithf "format string" argForFormatString Raises a System.Exception with a message determined by the format string and its inputs.

Use nullArg, invalidArg, and invalidOp as the mechanism to throw ArgumentNullException, ArgumentException, and InvalidOperationException when appropriate.

The failwith and failwithf functions should generally be avoided because they raise the base Exception type, not a specific exception. As per the Exception Design Guidelines, you want to raise more specific exceptions when you can.

Use exception-handling syntax

F# supports exception patterns via the try...with syntax:

try
    tryGetFileContents()
with
| :? System.IO.FileNotFoundException as e -> // Do something with it here
| :? System.Security.SecurityException as e -> // Do something with it here

Reconciling functionality to perform in the face of an exception with pattern matching can be a bit tricky if you wish to keep the code clean. One such way to handle this is to use active patterns as a means to group functionality surrounding an error case with an exception itself. For example, you may be consuming an API that, when it throws an exception, encloses valuable information in the exception metadata. Unwrapping a useful value in the body of the captured exception inside the Active Pattern and returning that value can be helpful in some situations.

Do not use monadic error handling to replace exceptions

Exceptions are often seen as taboo in the pure functional paradigm. Indeed, exceptions violate purity, so it's safe to consider them not-quite functionally pure. However, this ignores the reality of where code must run, and that runtime errors can occur. In general, write code on the assumption that most things aren't pure or total, to minimize unpleasant surprises (akin to empty catch in C# or mismanaging the stack trace, discarding information).

It is important to consider the following core strengths/aspects of Exceptions with respect to their relevance and appropriateness in the .NET runtime and cross-language ecosystem as a whole:

  • They contain detailed diagnostic information, which is helpful when debugging an issue.
  • They are well understood by the runtime and other .NET languages.
  • They can reduce significant boilerplate when compared with code that goes out of its way to avoid exceptions by implementing some subset of their semantics on an ad-hoc basis.

This third point is critical. For nontrivial complex operations, failing to use exceptions can result in dealing with structures like this:

Result<Result<MyType, string>, string list>

Which can easily lead to fragile code like pattern matching on "stringly typed" errors:

let result = doStuff()
match result with
| Ok r -> ...
| Error e ->
    if e.Contains "Error string 1" then ...
    elif e.Contains "Error string 2" then ...
    else ... // Who knows?

Additionally, it can be tempting to swallow any exception in the desire for a "simple" function that returns a "nicer" type:

// Can be problematic due to discarding the cause of error.
let tryReadAllText (path : string) =
    try System.IO.File.ReadAllText path |> Some
    with _ -> None

Unfortunately, tryReadAllText can throw numerous exceptions based on the myriad of things that can happen on a file system, and this code discards away any information about what might actually be going wrong in your environment. If you replace this code with a result type, then you're back to "stringly typed" error message parsing:

// Problematic, callers only have a string to figure the cause of error.
let tryReadAllText (path : string) =
    try System.IO.File.ReadAllText path |> Ok
    with e -> Error e.Message

let r = tryReadAllText "path-to-file"
match r with
| Ok text -> ...
| Error e ->
    if e.Contains "uh oh, here we go again..." then ...
    else ...

And placing the exception object itself in the Error constructor just forces you to properly deal with the exception type at the call site rather than in the function. Doing this effectively creates checked exceptions, which are notoriously unfun to deal with as a caller of an API.

A good alternative to the above examples is to catch specific exceptions and return a meaningful value in the context of that exception. If you modify the tryReadAllText function as follows, None has more meaning:

let tryReadAllTextIfPresent (path : string) =
    try System.IO.File.ReadAllText path |> Some
    with :? FileNotFoundException -> None

Instead of functioning as a catch-all, this function will now properly handle the case when a file was not found and assign that meaning to a return. This return value can map to that error case, while not discarding any contextual information or forcing callers to deal with a case that may not be relevant at that point in the code.

Types such as Result<'Success, 'Error> are appropriate for basic operations where they aren't nested, and F# optional types are perfect for representing when something could either return something or nothing. They are not a replacement for exceptions, though, and should not be used in an attempt to replace exceptions. Rather, they should be applied judiciously to address specific aspects of exception and error management policy in targeted ways.

Partial application and point-free programming

F# supports partial application, and thus, various ways to program in a point-free style. This can be beneficial for code reuse within a module or the implementation of something, but it is not something to expose publicly. In general, point-free programming is not a virtue in and of itself, and can add a significant cognitive barrier for people who are not immersed in the style.

Do not use partial application and currying in public APIs

With little exception, the use of partial application in public APIs can be confusing for consumers. Usually, let-bound values in F# code are values, not function values. Mixing together values and function values can result in saving a few lines of code in exchange for quite a bit of cognitive overhead, especially if combined with operators such as >> to compose functions.

Consider the tooling implications for point-free programming

Curried functions do not label their arguments. This has tooling implications. Consider the following two functions:

let func name age =
    printfn $"My name is {name} and I am %d{age} years old!"

let funcWithApplication =
    printfn "My name is %s and I am %d years old!"

Both are valid functions, but funcWithApplication is a curried function. When you hover over their types in an editor, you see this:

val func : name:string -> age:int -> unit

val funcWithApplication : (string -> int -> unit)

At the call site, tooltips in tooling such as Visual Studio will give you the type signature, but since there are no names defined, it won't display names. Names are critical to good API design because they help callers better understanding the meaning behind the API. Using point-free code in the public API can make it harder for callers to understand.

If you encounter point-free code like funcWithApplication that is publicly consumable, it is recommended to do a full η-expansion so that tooling can pick up on meaningful names for arguments.

Furthermore, debugging point-free code can be challenging, if not impossible. Debugging tools rely on values bound to names (for example, let bindings) so that you can inspect intermediate values midway through execution. When your code has no values to inspect, there is nothing to debug. In the future, debugging tools may evolve to synthesize these values based on previously executed paths, but it's not a good idea to hedge your bets on potential debugging functionality.

Consider partial application as a technique to reduce internal boilerplate

In contrast to the previous point, partial application is a wonderful tool for reducing boilerplate inside of an application or the deeper internals of an API. It can be helpful for unit testing the implementation of more complicated APIs, where boilerplate is often a pain to deal with. For example, the following code shows how you can accomplish what most mocking frameworks give you without taking an external dependency on such a framework and having to learn a related bespoke API.

For example, consider the following solution topography:

MySolution.sln
|_/ImplementationLogic.fsproj
|_/ImplementationLogic.Tests.fsproj
|_/API.fsproj

ImplementationLogic.fsproj might expose code such as:

module Transactions =
    let doTransaction txnContext txnType balance =
        ...

type Transactor(ctx, currentBalance) =
    member _.ExecuteTransaction(txnType) =
        Transactions.doTransaction ctx txnType currentBalance
        ...

Unit testing Transactions.doTransaction in ImplementationLogic.Tests.fsproj is easy:

namespace TransactionsTestingUtil

open Transactions

module TransactionsTestable =
    let getTestableTransactionRoutine mockContext = Transactions.doTransaction mockContext

Partially applying doTransaction with a mocking context object lets you call the function in all of your unit tests without needing to construct a mocked context each time:

module TransactionTests

open Xunit
open TransactionTypes
open TransactionsTestingUtil
open TransactionsTestingUtil.TransactionsTestable

let testableContext =
    { new ITransactionContext with
        member _.TheFirstMember() = ...
        member _.TheSecondMember() = ... }

let transactionRoutine = getTestableTransactionRoutine testableContext

[<Fact>]
let ``Test withdrawal transaction with 0.0 for balance``() =
    let expected = ...
    let actual = transactionRoutine TransactionType.Withdraw 0.0
    Assert.Equal(expected, actual)

Don't apply this technique universally to your entire codebase, but it is a good way to reduce boilerplate for complicated internals and unit testing those internals.

Access control

F# has multiple options for Access control, inherited from what is available in the .NET runtime. These are not just usable for types - you can use them for functions, too.

Good practices in context of libraries that are widely consumed:

  • Prefer non-public types and members until you need them to be publicly consumable. This also minimizes what consumers couple to.
  • Strive to keep all helper functionality private.
  • Consider the use of [<AutoOpen>] on a private module of helper functions if they become numerous.

Type inference and generics

Type inference can save you from typing a lot of boilerplate. And automatic generalization in the F# compiler can help you write more generic code with almost no extra effort on your part. However, these features are not universally good.

  • Consider labeling argument names with explicit types in public APIs and do not rely on type inference for this.

    The reason for this is that you should be in control of the shape of your API, not the compiler. Although the compiler can do a fine job at inferring types for you, it is possible to have the shape of your API change if the internals it relies on have changed types. This may be what you want, but it will almost certainly result in a breaking API change that downstream consumers will then have to deal with. Instead, if you explicitly control the shape of your public API, then you can control these breaking changes. In DDD terms, this can be thought of as an Anti-corruption layer.

  • Consider giving a meaningful name to your generic arguments.

    Unless you are writing truly generic code that is not specific to a particular domain, a meaningful name can help other programmers understanding the domain they're working in. For example, a type parameter named 'Document in the context of interacting with a document database makes it clearer that generic document types can be accepted by the function or member you are working with.

  • Consider naming generic type parameters with PascalCase.

    This is the general way to do things in .NET, so it's recommended that you use PascalCase rather than snake_case or camelCase.

Finally, automatic generalization is not always a boon for people who are new to F# or a large codebase. There is cognitive overhead in using components that are generic. Furthermore, if automatically generalized functions are not used with different input types (let alone if they are intended to be used as such), then there is no real benefit to them being generic then. Always consider if the code you are writing will actually benefit from being generic.

Performance

Consider structs for small types with high allocation rates

Using structs (also called Value Types) can often result in higher performance for some code because it typically avoids allocating objects. However, structs are not always a "go faster" button: if the size of the data in a struct exceeds 16 bytes, copying the data can often result in more CPU time spend than using a reference type.

To determine if you should use a struct, consider the following conditions:

  • If the size of your data is 16 bytes or smaller.
  • If you're likely to have many instances of these types resident in memory in a running program.

If the first condition applies, you should generally use a struct. If both apply, you should almost always use a struct. There may be some cases where the previous conditions apply, but using a struct is no better or worse than using a reference type, but they are likely to be rare. It's important to always measure when making changes like this, though, and not operate on assumption or intuition.

Consider struct tuples when grouping small value types with high allocation rates

Consider the following two functions:

let rec runWithTuple t offset times =
    let offsetValues x y z offset =
        (x + offset, y + offset, z + offset)

    if times <= 0 then
        t
    else
        let (x, y, z) = t
        let r = offsetValues x y z offset
        runWithTuple r offset (times - 1)

let rec runWithStructTuple t offset times =
    let offsetValues x y z offset =
        struct(x + offset, y + offset, z + offset)

    if times <= 0 then
        t
    else
        let struct(x, y, z) = t
        let r = offsetValues x y z offset
        runWithStructTuple r offset (times - 1)

When you benchmark these functions with a statistical benchmarking tool like BenchmarkDotNet, you'll find that the runWithStructTuple function that uses struct tuples runs 40% faster and allocates no memory.

However, these results won't always be the case in your own code. If you mark a function as inline, code that uses reference tuples may get some additional optimizations, or code that would allocate could simply be optimized away. You should always measure results whenever performance is concerned, and never operate based on assumption or intuition.

Consider struct records when the type is small and has high allocation rates

The rule of thumb described earlier also holds for F# record types. Consider the following data types and functions that process them:

type Point = { X: float; Y: float; Z: float }

[<Struct>]
type SPoint = { X: float; Y: float; Z: float }

let rec processPoint (p: Point) offset times =
    let inline offsetValues (p: Point) offset =
        { p with X = p.X + offset; Y = p.Y + offset; Z = p.Z + offset }

    if times <= 0 then
        p
    else
        let r = offsetValues p offset
        processPoint r offset (times - 1)

let rec processStructPoint (p: SPoint) offset times =
    let inline offsetValues (p: SPoint) offset =
        { p with X = p.X + offset; Y = p.Y + offset; Z = p.Z + offset }

    if times <= 0 then
        p
    else
        let r = offsetValues p offset
        processStructPoint r offset (times - 1)

This is similar to the previous tuple code, but this time the example uses records and an inlined inner function.

When you benchmark these functions with a statistical benchmarking tool like BenchmarkDotNet, you'll find that processStructPoint runs nearly 60% faster and allocates nothing on the managed heap.

Consider struct discriminated unions when the data type is small with high allocation rates

The previous observations about performance with struct tuples and records also holds for F# Discriminated Unions. Consider the following code:

    type Name = Name of string

    [<Struct>]
    type SName = SName of string

    let reverseName (Name s) =
        s.ToCharArray()
        |> Array.rev
        |> System.String
        |> Name

    let structReverseName (SName s) =
        s.ToCharArray()
        |> Array.rev
        |> System.String
        |> SName

It's common to define single-case Discriminated Unions like this for domain modeling. When you benchmark these functions with a statistical benchmarking tool like BenchmarkDotNet, you'll find that structReverseName runs about 25% faster than reverseName for small strings. For large strings, both perform about the same. So, in this case, it's always preferable to use a struct. As previously mentioned, always measure and do not operate on assumptions or intuition.

Although the previous example showed that a struct Discriminated Union yielded better performance, it is common to have larger Discriminated Unions when modeling a domain. Larger data types like that may not perform as well if they are structs depending on the operations on them, since more copying could be involved.

Immutability and mutation

F# values are immutable by default, which allows you to avoid certain classes of bugs (especially those involving concurrency and parallelism). However, in certain cases, in order to achieve optimal (or even reasonable) efficiency of execution time or memory allocations, a span of work may best be implemented by using in-place mutation of state. This is possible in an opt-in basis with F# with the mutable keyword.

Use of mutable in F# may feel at odds with functional purity. This is understandable, but functional purity everywhere can be at odds with performance goals. A compromise is to encapsulate mutation such that callers need not care about what happens when they call a function. This allows you to write a functional interface over a mutation-based implementation for performance-critical code.

Also, F# let binding constructs allow you to nest bindings into another one, this can be leveraged to keep the scope of mutable variable close or at its theoretical smallest.

let data =
    [
        let mutable completed = false
        while not completed do
            logic ()
            // ...
            if someCondition then
                completed <- true   
    ]

No code can access the mutable completed that was used only to initialize data let bound value.

Wrap mutable code in immutable interfaces

With referential transparency as a goal, it is critical to write code that does not expose the mutable underbelly of performance-critical functions. For example, the following code implements the Array.contains function in the F# core library:

[<CompiledName("Contains")>]
let inline contains value (array:'T[]) =
    checkNonNull "array" array
    let mutable state = false
    let mutable i = 0
    while not state && i < array.Length do
        state <- value = array[i]
        i <- i + 1
    state

Calling this function multiple times does not change the underlying array, nor does it require you to maintain any mutable state in consuming it. It is referentially transparent, even though almost every line of code within it uses mutation.

Consider encapsulating mutable data in classes

The previous example used a single function to encapsulate operations using mutable data. This is not always sufficient for more complex sets of data. Consider the following sets of functions:

open System.Collections.Generic

let addToClosureTable (key, value) (t: Dictionary<_,_>) =
    if t.ContainsKey(key) then
        t[key] <- value
    else
        t.Add(key, value)

let closureTableCount (t: Dictionary<_,_>) = t.Count

let closureTableContains (key, value) (t: Dictionary<_, HashSet<_>>) =
    match t.TryGetValue(key) with
    | (true, v) -> v.Equals(value)
    | (false, _) -> false

This code is performant, but it exposes the mutation-based data structure that callers are responsible for maintaining. This can be wrapped inside of a class with no underlying members that can change:

open System.Collections.Generic

/// The results of computing the LALR(1) closure of an LR(0) kernel
type Closure1Table() =
    let t = Dictionary<Item0, HashSet<TerminalIndex>>()

    member _.Add(key, value) =
        if t.ContainsKey(key) then
            t[key] <- value
        else
            t.Add(key, value)

    member _.Count = t.Count

    member _.Contains(key, value) =
        match t.TryGetValue(key) with
        | (true, v) -> v.Equals(value)
        | (false, _) -> false

Closure1Table encapsulates the underlying mutation-based data structure, thereby not forcing callers to maintain the underlying data structure. Classes are a powerful way to encapsulate data and routines that are mutation-based without exposing the details to callers.

Prefer let mutable to ref

Reference cells are a way to represent the reference to a value rather than the value itself. Although they can be used for performance-critical code, they are not recommended. Consider the following example:

let kernels =
    let acc = ref Set.empty

    processWorkList startKernels (fun kernel ->
        if not ((!acc).Contains(kernel)) then
            acc := (!acc).Add(kernel)
        ...)

    !acc |> Seq.toList

The use of a reference cell now "pollutes" all subsequent code with having to dereference and re-reference the underlying data. Instead, consider let mutable:

let kernels =
    let mutable acc = Set.empty

    processWorkList startKernels (fun kernel ->
        if not (acc.Contains(kernel)) then
            acc <- acc.Add(kernel)
        ...)

    acc |> Seq.toList

Aside from the single point of mutation in the middle of the lambda expression, all other code that touches acc can do so in a manner that is no different to the usage of a normal let-bound immutable value. This will make it easier to change over time.

Nulls and default values

Nulls should generally be avoided in F#. By default F#-declared types do not support the use of the null literal, and all values and objects are initialized. However, some common .NET APIs return or accept nulls, and some common .NET-declared types such as arrays and strings allow nulls. However, the occurrence of null values is very rare in F# programming and one of the benefits of using F# is to avoid null reference errors in most cases.

Avoid the use of the AllowNullLiteral attribute

By default F#-declared types do not support the use of the null literal. You can manually annotate F# types with AllowNullLiteral to allow this. However, it is almost always better to avoid doing this.

Avoid the use of the Unchecked.defaultof<_> attribute

It is possible to generate a null or zero-initialized value for an F# type by using Unchecked.defaultof<_>. This can be useful when initializing storage for some data structures, or in some high-performance coding pattern, or in interoperability. However the use of this construct should be avoided.

Avoid the use of the DefaultValue attribute

By default F# records and objects must be properly initialized on construction. The DefaultValue attribute can be used to populate some fields of objects with a null or zero-initialized value. This construct is rarely needed and its use should be avoided.

If you check for null inputs, raise exceptions at first opportunity

When writing new F# code, in practice there's no need to check for null inputs, unless you expect that code to be used from C# or other .NET languages.

If you do decide to add checks for null inputs, perform the checks at first opportunity and raise an exception. For example:

let inline checkNonNull argName arg =
    if isNull arg then
        nullArg argName

module Array =
    let contains value (array:'T[]) =
        checkNonNull "array" array
        let mutable result = false
        let mutable i = 0
        while not state && i < array.Length do
            result <- value = array[i]
            i <- i + 1
        result

For legacy reasons some string functions in FSharp.Core still treat nulls as empty strings and do not fail on null arguments. However do not take this as guidance, and do not adopt coding patterns that attribute any semantic meaning to "null".

Object programming

F# has full support for objects and object-oriented (OO) concepts. Although many OO concepts are powerful and useful, not all of them are ideal to use. The following lists offer guidance on categories of OO features at a high level.

Consider using these features in many situations:

  • Dot notation (x.Length)
  • Instance members
  • Implicit constructors
  • Static members
  • Indexer notation (arr[x]), by defining an Item property
  • Slicing notation (arr[x..y], arr[x..], arr[..y]), by defining GetSlice members
  • Named and Optional arguments
  • Interfaces and interface implementations

Don't reach for these features first, but do judiciously apply them when they are convenient to solve a problem:

  • Method overloading
  • Encapsulated mutable data
  • Operators on types
  • Auto properties
  • Implementing IDisposable and IEnumerable
  • Type extensions
  • Events
  • Structs
  • Delegates
  • Enums

Generally avoid these features unless you must use them:

  • Inheritance-based type hierarchies and implementation inheritance
  • Nulls and Unchecked.defaultof<_>

Prefer composition over inheritance

Composition over inheritance is a long-standing idiom that good F# code can adhere to. The fundamental principle is that you should not expose a base class and force callers to inherit from that base class to get functionality.

Use object expressions to implement interfaces if you don't need a class

Object Expressions allow you to implement interfaces on the fly, binding the implemented interface to a value without needing to do so inside of a class. This is convenient, especially if you only need to implement the interface and have no need for a full class.

For example, here is the code that is run in Ionide to provide a code fix action if you've added a symbol that you don't have an open statement for:

    let private createProvider () =
        { new CodeActionProvider with
            member this.provideCodeActions(doc, range, context, ct) =
                let diagnostics = context.diagnostics
                let diagnostic = diagnostics |> Seq.tryFind (fun d -> d.message.Contains "Unused open statement")
                let res =
                    match diagnostic with
                    | None -> [||]
                    | Some d ->
                        let line = doc.lineAt d.range.start.line
                        let cmd = createEmpty<Command>
                        cmd.title <- "Remove unused open"
                        cmd.command <- "fsharp.unusedOpenFix"
                        cmd.arguments <- Some ([| doc |> unbox; line.range |> unbox; |] |> ResizeArray)
                        [|cmd |]
                res
                |> ResizeArray
                |> U2.Case1
        }

Because there is no need for a class when interacting with the Visual Studio Code API, Object Expressions are an ideal tool for this. They are also valuable for unit testing, when you want to stub out an interface with test routines in an improvised manner.

Consider Type Abbreviations to shorten signatures

Type Abbreviations are a convenient way to assign a label to another type, such as a function signature or a more complex type. For example, the following alias assigns a label to what's needed to define a computation with CNTK, a deep learning library:

open CNTK

// DeviceDescriptor, Variable, and Function all come from CNTK
type Computation = DeviceDescriptor -> Variable -> Function

The Computation name is a convenient way to denote any function that matches the signature it is aliasing. Using Type Abbreviations like this is convenient and allows for more succinct code.

Avoid using Type Abbreviations to represent your domain

Although Type Abbreviations are convenient for giving a name to function signatures, they can be confusing when abbreviating other types. Consider this abbreviation:

// Does not actually abstract integers.
type BufferSize = int

This can be confusing in multiple ways:

  • BufferSize is not an abstraction; it is just another name for an integer.
  • If BufferSize is exposed in a public API, it can easily be misinterpreted to mean more than just int. Generally, domain types have multiple attributes to them and are not primitive types like int. This abbreviation violates that assumption.
  • The casing of BufferSize (PascalCase) implies that this type holds more data.
  • This alias does not offer increased clarity compared with providing a named argument to a function.
  • The abbreviation will not manifest in compiled IL; it is just an integer and this alias is a compile-time construct.
module Networking =
    ...
    let send data (bufferSize: int) = ...

In summary, the pitfall with Type Abbreviations is that they are not abstractions over the types they are abbreviating. In the previous example, BufferSize is just an int under the covers, with no extra data, nor any benefits from the type system besides what int already has.

An alternative approach to using type abbreviations to represent a domain is to use single-case discriminated unions. The previous sample can be modeled as follows:

type BufferSize = BufferSize of int

If you write code that operates in terms of BufferSize and its underlying value, you need to construct one rather than pass in any arbitrary integer:

module Networking =
    ...
    let send data (BufferSize size) =
    ...

This reduces the likelihood of mistakenly passing an arbitrary integer into the send function, because the caller must construct a BufferSize type to wrap a value before calling the function.