Type systems: the strong, the strict and the safe.
Or the weak, the loose, and the unsafe. But does it mean anything?
The SICP textbook shows in Exercise 1.10 a famous function known as Ackermann’s function. Actually the code there shows one version of that function; there are minor variations of it, all doing basically the same, and all known as Ackermann function. The exercise is not on the list of exercises officially discussed in the group sessions, but perhaps you have stumbled upon it or the group teacher discusses it.
As said, the function is very well known, and it’s always discussed in connection with a concept called primitive recursion (and also that is not on the pensum of the lecture). So if one reads about primitive recursion, invariably the Ackermann function is mentioned and if the Ackermann function is discussed, then it’s discussed in connection with primitive recursion, namely pointing out that Ackermann’s function is not primitive recursive. Actually, Exercise 1.10 in SICP is an exception to that rule, it gives the definition of the function and asks to observe how it behaves but does not mention primitive recursion.
Ackermann’s function is the first and most prominent example of a terminating function which is not-primitive recursive. That is largely the reason why it is famous. It’s also known as example for a function that grows extremely fast (that can be observed by playing around with it in Exercise 1.10). Both facts hang together; abstractly speaking, it grows too fast for any possible primitive-recursive function, while still terminating. The function is not famous for being practically useful. Also for that it grows too fast.
So if one has ever heard of the Ackermann function at all, it’s probably exactly for that: it’s ``the’‘ example of a function that is not primitive recursive. Also googling around in the internet, starting maybe at Wikipedia and at various different other pages that offer wisdom and answers on various questions, will confirm that. You can look for questions like ``What is an example of a total function that is not primitive recursive?’’ (answer ``Ackermann’’) or ``what’s the Ackermann function’’ (answer: an example for a non-primitive-recursive function), and on and on.
Alright, got it, Ackermann is not primitive recursive.
But that's actually not true!
Or maybe I should be more modest. It’s true under only under assumptions taken for granted and left often unmentioned. It’s an assumption that maybe never even crosses the mind of people who just ``know’’ that Ackermann is not primitive-recursive and people who write web-pages explaining Ackermann, that in fact there is a restriction.
Participants of a course on functional programming, however, may not suffer from or at least should not suffer from that blind spot. What unspoken restriction are we talking about? I come to that in a moment, but before that, we need at least sketch what is actually meant by primitive recursion.
General recursion is ubiquitous in the lecture, most functions half-way interesting use recursion. Primitive recursive functions are a restricted class of recursive functions. We don’t bother to give a precise definition of the concept; it’s easy to find it explained on the internet.
You will remember that SICP distinguishes between recursive procedures and recursive (resp. iterative) processes, where processes refers to what happens at run-time. Let’s focus on what the book calls ``procedures’’, the code representation, not the processes.
A recursive procedure is one that calls itself in its body. There is also indirect or mutual recursion, which is a situation where two or more procedures call each other; in software-engineering circles that’s also sometimes called a ``call-back’’ situation. Nevermind. There are no restrictions on how procedures can recursively call themselves (or each other). In other words, Scheme and Lisp (and most other modern languages) support recursion in its general form, unrestricted.
One can use recursion to easily come up with functions that don’t terminate. The simplest example is the one from Exercise 1.5, which does in fact nothing else than recursively calling itself (and thus will never terminate):
(define (p) (p))
One can then study restrictions on the use of recursion. One example is known as tail recursion. The book and the lecture uses that term in connection with the interpreter, stating that the scheme interpreter is an example of a tail recursive interpreter. More conventionally, one calls functions or procedures tail recursive and that characterizes functions, procedures, methods etc. which call themselves only ```at the end’’ of their body. In the terminology of SICP, that leads to an iterative process, not a recursive process (at least in an interpreter that knows how to deal with it adequately and efficiently).
So, a tail-recursive procedure is a restricted form of a recursive procedure.
But now to the restriction on recursion called primitive. The exact definition of primitive recursive functions involves fixing allowed elementary constructs, and projections and other details. The core of the concept, however, is the way recursion itself is allowed. It can be roughly stated as
A function can call itself recursively, but only on smaller arguments.
Instead of giving the formal definition of the primitive recursion operator, we give a feeling what’s allowed and what’s not allowed by small examples. Primitive recursive functions are classically defined as functions on natural numbers as arguments and as return value. For that, being smaller is pretty obvious. Let’s look at the following function
(define (f x y) (if (= x y) y (+ (f (+ x 1) y) 1)))
The function is supposed to take 2 natural numbers as argument. Additionally, let’s assume the argument for
x is smaller or equal than
y. Otherwise the corresponding process would not terminate. That’s a minor point, we can of course easily add a couple of lines, checking first whether the assumption is true, and if not, doing analogous calculation to cover also that situation, making it terminating for arbitrary natural numbers. But that’s not central to the discussion here.
f is recursive, calling itself (and it’s not tail-recursive). Now, is the function primitive-recursive? The definition of
(f x y) calls itself with
(f (+ x 1) y) and that is forbidden in primitive recursive schemas. So does that mean the function is not primitive recursive?
Not so fast. The way it’s defined is certainly not the primitive-recursive way But in the same way, that one may transform non-tail-recursive procedure definitions into tail-recursive ones (the lecture had examples for that), one may reformulate sometimes non-primitive-recursive definitions so that they fit the schema. What function is it anyway, given above? It’s easy enough, it calculates
2y - x (for
x <= y).
It turns out that this function indeed is primitive-recursive, in that one can easily define it using primitive recursion schemas. Indeed, it’s straightforward since one can define multiplication and addition and minus easily via primitive recursion. Defining the calculation
2y-x this way seems more natural than the slightly weird recursive definition where
f calls itself on
(+ n 1), but the definition was given to illustrate what is not allowed.
To illustrate what is allowed, let’s sketch how addition of two natural numbers can be defined with primitive recursion. Actually, it corresponds to the most straightforward definition of addition (assuming that the successor function is given as a more basic operation, here written as
x +1 is not meant as using binary addition on
x and 1 as arguments, but calculating the successor of
x. We also use infix notation and equations, not Lisp-like prefix and code, though one easily could).
0 + y = 0 (x +1) + y = (x + y) +1
The primitive recursion schema generally specifies a base case (the first line in the above example) and an induction case (the second line). In the case of addition, to define
(x +1) + y, the recursion scheme can use on the right-hand side of the equation a function
h that takes three arguments, and allowed as arguments are
(x + y). Besides it could rely on earlier defined functions and some primitive operations. In our very simple example, the
y are not needed in the construction,
x + y is the only relevant part and the successor function
+1 is built-in. (NB: to avoid confusion: the values of
y are not needed individually and directly as argument to the function
h, but of course they are needed indirectly in that their sum
x + y is used).
So addition is defined recursively in that the definition calls itself, and under the restriction that for defining the outcome for
x+1 in the induction case, only
x+y is used, not an arbitrary recursive call to plus.
The question then is:
Are all recursive functions also representable by primitive recursion. Or is being primitive recursive a restriction?
The answer is yes, it’s a restriction for sure. All primitive recursive functions terminate, which is a consequence of the fact that the recursion calls the function on a smaller argument. On the other hand, general recursion easily allows non-terminating procedures. Earlier in this post, there was a minimal example for that.
So far so good. We got a feeling that being primitive is a restriction on general recursion. To see that the Ackermann function is not primitive recursive is not obvious. Note that it’s not good enough to observe that its definition does not follow the required primitive-recursive schema: One has to make the argument that it cannot somehow be written up in a different way that fits the scheme.
Generally speaking, the Ackermann function is not primitive-recursive as it ``grows too fast’’. We don’t provide the argument formally, but the idea is quite simple. Looking at the primitive recursive schema sketched above, it has the feel of an iterative loop with a fixed bound, like
for i = 0 to n do .... Programming with for-loops with a fixed bound results in terminating programs, analogous to the fact that all primitive-recursive programs are terming. That’s in contrast to programs using general ``while’’ loop, resp. programs using general recursion.
A primitive recursive definition builds a new function using the primitive recursion schema corresponding to a for-loop iteration and using earlier defined primitive recursive functions as building block, which themselves are iterative schemes. That corresponds to a stack of nested iteration loops.
For illustration: as we have seen, addition can be defined using the successor function iteratively. One could continue to define multiplication as iterative addition. And exponentiation as iterated multiplication. SICP shows how that’s done in Scheme, though without mentioning that the recursions and iterations could be classified as ``primitive’’ (see Sections 1.1.3 and 1.1.4)
At any rate, taking the successor function as basic, multiplication can be represented by one loop (or using one primitive recursion scheme), exponentiation using 2 nested loops, and one could continue with iterated exponentiation, and then iterate that, piling up layer after layer of looping in a nested fashion, each layer really adds expressiveness (and the potential of faster growing functions).
So, using only such bounded loops for programming then leads to a hierarchy of functions. Those programmable with a nesting depths of at most one (like multiplication), one with a nesting depth of 2 (for example, exponentiation), etc., all programs terminating. It can be shown that this hierarchy is infinite. In other words, it’s not that there is some maximal looping depth, after which one does not need further nesting.
But where does Ackermann fit in?
Well, that’s the whole point: Ackermann does NOT fit into this looping hierarchy!
Ackermann’s function comed in different flavors, the one from Exercise 1.10 has 2 arguments and it’s not even the one most commonly found. There are also versions with 3 arguments and for the line of argument here, let’s assume for now we have a 3-argument formulation.
In all formulations, the Ackermann function has one argument that corresponds roughly to the nesting level of iterative loops resp. the amount of primitive-recursive schemes. So
Ack(x,y,1) corresponds to one looping level, and in a properly formulated 3-argument version,
x+y (Wikipedia starts counting at 0 instead of 1, but never mind). Continuing like that,
Ack(x,y,2) is exponentiation
exp(x, y) etc. This is the core of Ackermann’s idea: Define a function where one argument controls the nesting-depth of loops or the level of primitive-recursive schemes.
And that immediately shows that Ackermann cannot be primitive-recursive. If it were, it could be written using a fixed amount of for-loops or a given amount of primitive-recursive schemes. But that’s impossible, since we said, the hierarchy of looping constructs is a real hierarchy, each new level of nesting really adds a new class of functions. Thus,
Ack cannot fit into any specific layer, say level
Ack(x,y,m+1) would have to live in level
m+1. This was meant when stating at the start of the post, that
Ack grows too fast to be primitive recursive. Each layer limits the order of growth of functions inside that layer, but one argument of the Ackermann function, the one we called
m, controls the growth rate of Ackermann, and since it’s the input of the function, we can make Ackermann’s function growing arbitrarily fast and too fast to fit into any specific layer.
Indeed, that was the outline of the standard proof showing that Ackermann is not primitive recursive, and hopefully it was convincing. But then, why the claim that Ackermann can be captured primitive-recursively, it sure can’t be both ways?
The classic definition and the argument outlined here can be done more formally, exactly specifying what functions to use as primitive building blocks (basically successor and projection functions) and exactly the format of the primitive recursion schema (which we only sketched here on using addition as very simple example). In its standard form, primitive recursion is used to define functions over natural numbers. So functions that take natural numbers as input, and return a natural number. For instance, the Ackermann function
Ack(x,m) is a function of that type. (BTW: Let’s switch back to a two-argument version of Ackermann, but it is not crucial for what’s being said.) So this two argument Ackermann function is of type
Nat * Nat -> Nat, and the functions definable by primitive recursion are of type
Nat * Nat * ... * Nat -> Nat (though as argued,
Ack is not definable by primitive recursion, but it would be at least of a fitting type.)
In this and the whole construction and set-up lies a restriction, though one that is seldom drawn attention to. Namely not only that we focus on functions over natural numbers, but that we are dealing with first-order functions over natural numbers!
Ah, well, yaah, now that you mention it…
Is this important, are higher-order functions something to consider? Some may consider them as curious anomalies, but in a course about functional programming one sure is comfortable with higher-order functions, they are the bread and butter of functional programming. If embracing higher-order functions, instead of struggling to encode the first-order Ackermann function of type
Nat * Nat -> Nat primitive-recursively (and fail), we can look at Ackermann as a function of type
Nat -> Nat -> Nat. That’s the type of a higher-order function. It’s a function that takes a natural number as argument and returns a function (of type
Nat -> Nat).
With this type, it’s not really the same function: one cannot use one version as drop-in replacement for the other. But it can be seen still as conceptionally the same function. It’s indeed easy to transform any function of type
A * B -> C into a function of type
A -> B -> C and reversely. Actually, it’s not just easy to do the transformation manually, one can also easily write two functions that implement those two transformations. The transformations are known as currying and uncurrying in honor of Haskell Brooks Curry (that’s the name of a person, but of course there are also functional programming languages named after him, Haskell and the lesser known Brooks and Curry. In particular, Brooks is rather marginal. Note that Wikipedia in the article about H. Curry confuses the languages Brooks and Brook).
Now, with this switch of perspective and freeing one’s mind from the unspoken assumption that functions need to be first-order, one can observe:
With higher-order functions (and currying), Ackermann’s function can be defined by primitive recursion!
That’s known, but I think it’s fair to say, it’s much lesser known than the common knowledge that ``Ackermann is not primitive recursive.’’
Let’s wrap it up and simply show the primitive-recursive definition for (a version of) Ackermann. It corresponds to a two argument version of Ackermann, i.e., the uncurried, first-order version would have two arguments. The higher-order version has one argument, but gives back a function. Here it is:
Ack(0) = succ Ack(m+1) = Iter(Ack(m)) Iter(f)(0) = f(1) Iter(f)(m+1) = f(Iter(f)(m))
Ackermann can be defined in Scheme using general recursion; Exercise 1.10 in SICP shows a straightforward piece of code for that. Can one encode it primitive-recursively in Scheme, as well? Well, Scheme sure supports higher-order functions and it supports currying (defining functions using lambda-abstractions). Thus one can quite easily translate the above primitive-recursive definition into Scheme, and that is left as an exercise…
Or the weak, the loose, and the unsafe. But does it mean anything?
A small amount of background information (best read in the context of the first sets)
or maybe also not the first time.
A glance at Church numerals, inductive data types, and pattern matching
Or maybe how not to implement it?
(White-)space exploration, FORTRAN, lexing, antiquity & fetischism
or why Y?
A lesser known fact on Ackermann’s function
like for instance the compiler construction course.
Some clarification (I hope)
Formal methods as sales pitch
In the lecture, we covered type inference for generic polymorphism in a typed calculus with higher-order functions, including recursive functions and let-bin...
Some remarks about variations on the theme