Mastering F# Lists

Minor update: spelling

Lists represent the backbone of functional programming and in order to be an effective F# programmer you must truly master list processing. Fortunately lists are simple and straight forward, so let’s begin.

Mastering F# Lists

There are several things which distinguish the F# list type from the .NET Arrays and generic List<T> type in the System.Collections.Generic namespace.

  F# List Arrays Generic List
Modify Elements No Yes Yes
Add New Elements No No Yes
Element Lookup O(n) slow O(1) fast O(1) fast

After looking at that chart, you might be wondering why even use F# lists at all? Element lookup is slow and you can’t even modify the thing once it has been created, so why bother. The reason why is that F# lists are immutable. Unlike arrays and generic lists, F# lists are guaranteed never to change once they have been created. If you have a type that returns an array or List<T> you don’t know what the outside world will do to it, check out Jomo’s Barrel of Bugs blog post for details.

But seriously, is this safety worth it? Yes. If used correctly in the right situations, F# lists can be more efficient than .NET Arrays or generic Lists. So let’s dig into looking at how F# lists work.

Linked Lists

F# lists are represented as linked lists, which are a foundational Computer Science data structure abstracted by links in a chain. Each individual link has a piece of data associated with it and may be connected to another link. The last link isn’t connected to anything, so it ‘points to null’ or the empty list []. (Ed. Sorry the empty list looks like a square block in my images.)

clip_image001[1]

There are two main operations on a list: Cons (adding a single element) and Concat (joining two lists).

Cons

Cons is where you add an element to the beginning of a list. In F# the type of the cons function’s type is:

 ‘a –> ‘a list –> ‘a list

The important thing to understand is that cons executes in constant, O(1), time. To join an element to an immutable linked list all you need to do is put that value in a list node and set its ‘next list node’ to be the first element in the existing list. Everything ‘after’ the new node is none the wiser.

clip_image002[1]

> 1 :: [2 .. 4];;

val it : int list = [1; 2; 3; 4]

Concat

Concat joins two lists together. The function type for concat is:

 ‘a list –> ‘a list' –> ‘a list

Here is an example of it in action 

> [2; 3; 4] @ [5 .. 7];;

val it : int list = [2; 3; 4; 5; 6; 7]

In order to join two lists all you need to do is set the ‘next node’ value in the last element of the first list equal to the first node in the second list. This sounds like a great solution, but remember you cannot modify list elements. So to join to lists you actually need to make a copy of the first list just so you can change what the last element points to. For this reason, execution of concat is on order O(n) where n is the number of elements in the first list. Visually you can see this by:

clip_image003[1]

To summarize: Cons is fast and easy and Concat is slowish.

List Module Functions

Now that we understand lists, let’s look at some built-in functions for manipulating them.

List.hd

List.hd returns the first element or head of a list.

List.tl

List.tl returns the ‘tail’ of the first element. That is, the list of items after the first element.

> let l = [1; 2; 3];;

val l : int list

> List.hd l;;

val it : int = 1

> List.tl l;;

val it : int list = [2; 3]

List.length

List.length returns the length of the list, nothing special to see here.

> List.length [1; 2; 3; 4; 5];;

val it : int = 5

List.rev

List.rev reverses a list. This makes a copy of the entire list so be careful when calling it, as it if you use List.rev in an inner loop your performance is hosed.

> List.rev [1 .. 10];;

val it : int list = [10; 9; 8; 7; 6; 5; 4; 3; 2; 1]

List.find

List.find takes a boolean function and returns the first element where that function returns true. However, if List.find doesn’t ‘find’ the element you are searching for it throws an exception. Since generally throwing exceptions is bad for everyone involved, I recommend List.tryfind instead, which returns an Option value. In this example we look through a list of numbers for one which is equal to the square root of 144.

> List.find (fun i -> i * i = 144) [1 .. 10];;

System.Collections.Generic.KeyNotFoundException: The item was not found in the collection

at Microsoft.FSharp.Core.Operators.not_found[T]()

at <StartupCode$FSI_0014>.$FSI_0014._main()

stopped due to error

> List.tryfind (fun i -> i * i = 144) [1 .. 10];;

val it : int option = None

List.filter

List.filter takes a function and produces a new list with only the items on which the function returned true. This filters down the list to just what you want. In the example we filter down a list of numbers to only those which are even.

> List.filter (fun i -> i % 2 = 0) [1 .. 20];;

val it : int list = [2; 4; 6; 8; 10; 12; 14; 16; 18; 20]

Aggregate Operators

The List module also supports a trio of aggregate operators: iter, map, and fold. These three functions provide all the power you need to do heavy lifting with any collection type. (In fact you will find a subset of iter, map, and fold in the Set, Seq, Option, and Array modules.)

List.Iter

The iter function is perhaps the simplest aggregate operator. All it does is iterates through each element in a list taking calling a function for each element. Note that the iter function doesn’t produce a value. Iterating through a list is predominately used for evaluating the side effects of the provided function, such as printing to the console.

> List.iter (fun i -> printfn "List contains %d" i) [1 .. 5];;

List contains 1

List contains 2

List contains 3

List contains 4

List contains 5

val it : unit = ()

List.Map

List.map is used to transform a list of items using a given function. The type of List.map is

(‘a –> ‘b) –> ‘a list –> ‘b list

Visually we can see the effects of a map using function f by:

clip_image004[1]

Inverting a list of numbers

Imagine we wanted to invert a list of numbers. All we need to do is map the invert function to each element in our list.

 // Invert a list of numbers
let invertNumber x = x * -1

// Inverts numbers 1 through 10
let x = List.map invertNumber [1 .. 10]

// Prints: [-1; -2; -3; -4; -5; -6; ...]
printfn "%A" x

List.Fold

Folds are the most powerful aggregate operator and not surprisingly the most complicated. When you have a list of values and you want to distill it down to a single piece of data you use List.fold. List folds come in two flavors: fold_left and fold_right. The one you use determines the order in which elements are visited. (fold_left going left-to-right and fold_right going right-to-left.)

List.fold iterates through each element of the list and builds up an accumulator value. Once every element has been processed, List.fold returns the final value of the accumulator. The type of List.fold_left is:

(('a -> 'b -> 'a) -> 'a -> 'b list -> 'a)

To name the parameters fold_left takes:

List.fold_left : ([function] accumulator listElement -> accumulator) initialAccumulatorValue theListToFold

List.Fold_left
Sum a list of numbers

The simplest example of a fold is summing a list of numbers. Our accumulator function will simply add the list element, x, to our accumulator value.

 // Sum a list
let accumulate acc x = acc + x

let sumList = List.fold_left accumulate 0 [1 .. 10]

// Notice our accumulator function is identical to the (+) operator,
// so we can rewrite our fold as:
let conciseSum = List.fold_left (+) 0 [1 .. 10]

 

The accumulator value doesn’t need to be a primitive value though, perhaps you want to reduce a list to different pieces of data. You can have that accumulator be a tuple to store multiple values or even a record to store structured data.

Counting vowels

Before I go into the next example, let me quickly review an elegant syntax for cloning records:

 // Cloning Records
type Car = { Make : string; Model : string; Year : int }

let eclipse   = { Make = "Mitsubishi"; Model = "Eclipse"; Year = 2005 }

// You could make a copy like this ...
let nextYears1 = { Make = eclipse.Make; Model = eclipse.Model; Year = 2006 }
// .. but an easier way would be this.
let nextYears2 = { eclipse with Year = 2006 }

Here’s another example of List.fold_left, totaling the number of vowels in a sentence. For our accumulator we will use a record.

 // Counting vowels
type VowelCount = { A : int; E : int; I : int; O : int; U : int }

let countVowels acc c =
    match c with
    | 'a' -> { acc with A = acc.A + 1 }
    | 'e' -> { acc with E = acc.E + 1 }
    | 'i' -> { acc with I = acc.I + 1 }
    | 'o' -> { acc with O = acc.O + 1 }
    | 'u' -> { acc with U = acc.U + 1 }
    | _   -> acc
    
let listOfChars = Seq.to_list "The quick brown fox jumps over the lazy dog."

let accInitVal = { A = 0; E = 0; I = 0; O = 0; U = 0 }
 // Evaluates to: {A = 1; E = 3; I = 1; O = 4; U = 2;} 

List.fold_left countVowels accInitVal listOfChars

Fold_right

Choosing between List.fold_left and List.fold_right may seem like a cosmetic difference, but folding order can have a substantial impact on performance. Consider the problem of splitting a string. Given a list of characters, return a list of lists of characters representing words separated by spaces.

 // List.fold_left (bad)
let listOfChars2 = Seq.to_list "The quick brown fox jumps over the lazy dog"

let breakIntoWords (acc : char list list) c =
    // If the letter isn't a space, add it to our accumulator
    if c <> ' ' then
        // Words are stored in order, so the last word is at the end of the list...
        let revAcc = List.rev acc
        // Now get the first item in the reversed list
        let word = List.hd revAcc
        // And add this character at the end
        let updatedWord = word @ [ c ]
        // Finally put this updated word at the end of our accumulator
        let updatedRevAcc = updatedWord :: (List.tl revAcc)
        List.rev updatedRevAcc
    // If the letter is a space then add a new list of chars
    else 
        acc @ [ [] ]
        
let words = List.fold_left breakIntoWords [ [] ] listOfChars2

(* Prints:
[['T'; 'h'; 'e']; ['q'; 'u'; 'i'; 'c'; 'k']; ['b'; 'r'; 'o'; 'w'; 'n'];
 ['f'; 'o'; 'x']; ['j'; 'u'; 'm'; 'p'; 's']; ['o'; 'v'; 'e'; 'r'];
 ['t'; 'h'; 'e']; ['l'; 'a'; 'z'; 'y']; ['d'; 'o'; 'g']] *)
printfn "%A" words

The solution doesn’t seem terribly bad, we just need to deal with the inconvenience of reversing the lists and adding elements to the ‘back’. But remember that everything you call List.rev it takes O(n) time. And every type you call (@) that takes another O(n). In short, that fold was a steaming pile of slow. However, because of the nature of that problem if we go ‘backwards’ or fold in right-to-left order we can add characters and words to the front of the list, allowing us to use the much faster Cons function.

Solving this problem using List.fold_right allows us to remove all calls to List.rev and Concat. The resulting fold operation is much more performant.

 // List.fold_right (good)
let listOfChars3 = Seq.to_list "The quick brown fox jumps over the lazy dog"

let breakIntoWordsGood c (acc : char list list) =
    // If the letter isn't a space, add it to our accumulator
    if c <> ' ' then
        // Words are stored in reverse order, so get the first word
        let word = List.hd acc
        // And add this character at the beginning
        let updatedWord = c :: word
        // Finally put this updated word at the beginning of our accumulator
        updatedWord :: (List.tl acc)
    // If the letter is a space then add a new list of chars
    else 
        [ [] ] @ acc
        
let words2 = List.fold_right breakIntoWordsGood listOfChars3 [ [] ] 

(* Prints:
[['T'; 'h'; 'e']; ['q'; 'u'; 'i'; 'c'; 'k']; ['b'; 'r'; 'o'; 'w'; 'n'];
 ['f'; 'o'; 'x']; ['j'; 'u'; 'm'; 'p'; 's']; ['o'; 'v'; 'e'; 'r'];
 ['t'; 'h'; 'e']; ['l'; 'a'; 'z'; 'y']; ['d'; 'o'; 'g']] *)
printfn "%A" words2

While there is still plenty more to know about lists, now you should be armed with enough knowledge to tackle Project Euler problems with ease.

In closing, I hope you enjoyed this post and if there is any F# topic you would like me to cover in the future blog post please use the Contact link and let me know. Thanks!

Comments

  • Anonymous
    July 10, 2008
    PingBack from http://blog.a-foton.ru/2008/07/mastering-f-lists/

  • Anonymous
    July 10, 2008
    Ahhh! you beat me to this. I am learning F# and I was hoping to do a post like along these lines :-) However, thank you for doing it. It is well done and I like the samples you provided. And you covered the fold operations; I was having a little bit a difficulty with. Thanks again. ~sparky

  • Anonymous
    July 11, 2008
    Glad you found the post useful :)

  • Anonymous
    July 11, 2008
    The comment has been removed

  • Anonymous
    July 24, 2008
    从C# 3.0到F# Written by Allen Lee 缘起 当你看到这篇文章的标题时,你有什么感觉?是不是很想脱口而出:&quot;到底搞什么飞机啊,我C#还没来得及用好,现在又搞个F#,还让不让人活啊

  • Anonymous
    August 30, 2008
    接上一篇,本文继续介绍F#中的函数式编程范式,主要包含了操作符、列表、列表推导、类型推导、类型标注等概念。类型推导又称隐式类型,通常是——但不限于——函数式编程语言的特性,比如C# 3.0和VB.NET 9.0都提供了一定的支持,它使很多编程任务变得更为简单。

  • Anonymous
    May 26, 2009
    第一篇,从零开始编写我们的第一个F#程序。 什么是F#,我为何要学它? F#是一种.NET平台上的 函数式编程 语言。就像C#和VB.NET,F#可以利用.NET的核心类库,如 WPF , WCF ,