F#: How to check that tail recursion calls are optimised

F#: How to check that tail recursion calls are optimised

The tail recursion optimisation happens when a compiler decides that instead of performing recursive function call (and add new entry to the execution stack) it is possible to use loop-like approach and just jump to the beginning of the function. Typically it happens when the compiler is smart, the tail calls optimisation is enabled and the recursive call is a leaf of the expression tree.

Is this can be tail-optimised?

let rec fib n =
  if n = 1 || n = 2 then 1
  else fib(n-1) + fib(n-2)

No, actually not. In this example recurcive call is not last in the expression tree, which is + (fib (- n 1)) (fib (- n 2)). So the addition happens after recursive calls.

However, the tail call optimisation can be applied in this case:

let fib n =
  let rec rfib i p2 p1 =
    if i = n then p1
    else rfib (i+1) p1 (p2+p1)
  rfib 1 0 1

The second version will not exhaust the stack on its way to 10000 Fibonacci number. Tt will hit integer overflow much before that but at least the stack will be fine :-) Not only the stack will be fine, the performance will be much better than in the first version.

So the tail-call optimisation is worth checking for if the number is recursive calls is expected to be more than a hundred or the recursive function is on the hot (performance-critical) execution path.

Unfortunately, the F# compiler does not help you much with that (yet).

This leaves two options: either check it by actually running the code and observing stack overflow exception or check generated IL code.

Observing stack overflow exception is quite simple technically. You just run the code with input causing a lot if recusive calls and wait. However, there is two difficulties: a) it may be quite hard to isolate code from its context; b) it depends on the time and resources the function takes on every iteration. This way is hard to follow in some of the cases.

Checking the generated IL code, in contrast, is very simple if you know what to look for. In the few following paragraphs I will show how to do that for two functions from the beginning of this article so you can do it yourself when needed.

The code will be compiled with .NET core. The tool used for disassemble is monodis. Let's go.

$ dotnet new console -lang F# -o fib
$ cd fib

Now update the Program.fs file:

open System

let main argv =
    let rec fib1 n =
      if n = 1 || n = 2 then 1
      else fib1(n-1) + fib1(n-2)
    printfn "%A" (fib1 10)

    let fib2 n =
      let rec fib3 i p2 p1 =
        if i = n then p1
        else fib3 (i+1) p1 (p2+p1)
      fib3 1 0 1
    printfn "%A" (fib2 10000)


$ dotnet build

Now the fun part. Run monodis on the generated executable:

$ monodis bin/Debug/netcoreapp2.0/fib.dll > fib.il

And let's examine the generated file.

The generated file is a mixture of structural statements (.assembly, .mresource, .namespace, .module, .class, .method) and assembly code (IL_0032: ...). Everything with .assembly, .mresource, and .namespace do not look relevant. There are four classes, named Program, fib1, fib2 and fib3. So every function defined with let is compiled into the class.

Body of the function is compiled into the Invoke methods of the generated classes. Let's check these methods of the fib1 and fib3 classes.

    .method public virtual strict 
           instance default int32 Invoke (int32 n)  cil managed 
		IL_0019:  ldarg.0 
		IL_001a:  ldarg.1 
		IL_001b:  ldc.i4.1 
		IL_001c:  sub 
		IL_001d:  callvirt instance !1 class [FSharp.Core]Microsoft.FSharp.Core.FSharpFunc`2<int32, int32>::Invoke(!0)
		IL_0022:  ldarg.0 
		IL_0023:  ldarg.1 
		IL_0024:  ldc.i4.2 
		IL_0025:  sub 
		IL_0026:  callvirt instance !1 class [FSharp.Core]Microsoft.FSharp.Core.FSharpFunc`2<int32, int32>::Invoke(!0)
		IL_002b:  add 
		IL_002c:  ret 
    } // end of method fib1@6::Invoke
    .method public virtual strict 
           instance default int32 Invoke (int32 i, int32 p2, int32 p1)  cil managed 
		IL_000f:  ldarg.1 
		IL_0010:  ldc.i4.1 
		IL_0011:  add 
		IL_0012:  ldarg.3 
		IL_0013:  ldarg.2 
		IL_0014:  ldarg.3 
		IL_0015:  add 
		IL_0016:  starg.s 3
		IL_0018:  starg.s 2
		IL_001a:  starg.s 1
		IL_001c:  br.s IL_0000

    } // end of method fib3@12::Invoke

It is easy to see that differently from the fib1, the fib3 does not have callvirt calls so the tail call optimisation is applied.

At the end, there are two hints on how to make sure that code is optimised:

  • check that the result of function being called recursively is returned straight away, without been used in other computations within this function - this should already give you enough confidence that the tail calls are optimised
  • if unsure, remember the name of the recursive function (or change it so something inique within your project), decompile and search for the generated function class and its Invoke method. Make sure that it does not contain callvirts of itself