Martin Steffen

The post is about evaluation strategies. The concept is discussed in the lecture (in week 2) and in SICP. Especially applicative order evaluation is covered, as the standard evaluation strategy of Scheme. Also, an alternative to that is discussed, namely normal order evaluation, and that’s done in connection with things that show up later in the lecture, namely delayed evaluation, streams, and also in the context of the meta-circular evaluator. That’s a Scheme interpreter written in Scheme and for that it will be discussed what needs to be done to have a non-standard interpreter, namely one that does normal order evaluation.

But what’s evaluation anyway? And why does one need a strategy for that?

Values, evaluation, and execution

Evaluation means to determine the value of something (like ``e-value-ation”), for us, the value of an expression or of (a part of) a program. The concept of evaluation is eminently functional; in absence of side effects, the value of an expression, in particular of function applications, is independent from when its evaluated, it does not depend on some state (which may change) and so it’s always the same. That property is also known as referential transparency. Since the value of an expression is always the same means, an expression represents nothing else than the value, it’s only not yet calculated. Like: (fac 5) is the same as 120, though (fac 5) is an unevaluated expression, and 120 is an evaluated expression (there’s nothing more to do): in ordinary language, we simply say

(fac 5) is 120

or write as equation

5! = 120

which is shorter than to say 120 is the value of (fac 5) (or of 5!). One can speak of “the” value of an expression, as opposed of “a” value of an expression, as in the absence of nondeterminism (random effects), there cannot be more than one value. Those observations are also underlying the substitution model from the lecture. Actually, not just from the lecture: substitution as explanation what happens when executing a program works as long as the program is purely functional, in Scheme or other languages. Substitution means replacement, and if two things are “the same”, one can replace one by the other without that it changes anything. For instance, if one calls a procedure, say square on (fac 5) as argument, then it in a way does not matter if the body of square does its calculating on (fac 5), the unevaluated argument expression, or on 120, because both represent the same value (namely 120). And evaluating (fac 5) before handing over the calculating formal parameter x. And referential transparency guarantees that it does not matter whether (fac 5) is being evaluated to 120 beforehand, i.e. before handing it over to square or handing over (fac 5) unevaluated, and let it be evaluated when evaluating the body.

Evaluation thus refers to the “execution mechanism” for purely functional programs and expressions. Sometimes one calls evaluation also reduction, like that an unevaluated expression such as (fac 5) is reduced in a number of steps closer and closer to it ultimate value.

Of course, also imperative programs need to be executed. Those are not primarily run to obtain their value, but (also or mainly) for their side effects. Sometimes they don’t even result in a value, but are executed for side effects only.

Later in the lecture, we will encounter set!, which assigns a value to a variable. Assuming that a variable x is introduced (via define, via let, or as formal parameter) and has some value, then (set! x (+ x 1)) changes the value or content of x and replaces it via the value increased by one.

The shown expression (set! x (+ x 1)) in Scheme has no value, i.e., it’s executed for its side effect alone, and trying to do something like

(* 10 (set! x (+ x 1)))

is meaningless i.e., leads to a runtime error. Being a functional language at its very core, imperative aspects take a bit of a backseat in Scheme/Lisp, and thus the syntax for assignment is specially marked by ! (“bang”), at least in the Scheme dialect we are using, as a warning sign to the programmer, not to expect referential transparency any longer (and, connected to that, the substitution model breaks down, as well).

Of course, there are many languages not centered around procedures, functions etc. but are imperative at their core. Many widely used programming languages, including object-oriented ones, are imperative. Destructive assignment is taken so much for granted in most languages that it often does not even specifically mentioned, like “let’s introduce assignment as destructive operation”, the qualifier “destructive” will not show up in any textbook about Java, maybe not even “imperative”. Also the syntax for assignment is typically less conspicuous. Since imperative operations are just the standard way of programming there’s not need to highlight them with a warning !, and so (set! x (+ x 1)) is just written as x = x+1, using = as symbol for assignment.

As a side remark: Especially disciples of functional programming find it unfortunate that the equality sign = is (mis-)used in many languages for something that is not equality, but imperative assignment. For example, in C-like languages, == represents equality, and = represents assignment, but there are other languages, where assignment may be written := or similar.

Back to evaluation and execution: as explained, some programs result in no value but are executed for their side-effects only, some have side effects and result in a value, and some, purely functional ones, have only a value and no side-effects. Details of which constructions gives a value and which not may differ from language to language. For instance, as said, (set! x (+ x 1)) does not result in a value in Scheme (of course the sub-expression (+ x 1) has a value, depending on the content of x), but in other languages, for instance Java and C, the corresponding assignment x = x+1 has not only a side-effect, changing x but also results in a value. Consequently, one can use constructions like

  y = (* 10 (x = x + 1))

even though it might not be a recommended coding style.

What about strategies?

Now that we know what evaluation is, determining the value of a (purely functional) piece of code and we know that in the presence of side-effects, one more typically speaks about execution instead (“running the program”). But why do we talk about strategies, especially evaluation strategies?

One speaks of strategies in situations when one faces choices, how to proceed, and a strategy is a plan to make those choices. As an example from a different field, given the task to explore a graph, one can do that in different ways, for instance following a strategy of depth-first traversal, or breadth-first, to name the two most prominent strategies. The depth-first strategy, after exploring one edge , faces the choice how to proceed: to explore subsequent edges first, or explore alternative, “sibling” edges first. Depth-first traversal consistently targets the subsequent edges first.

For evaluation, let’s look at a simple example, like

(6 + 4) - (5 * 2)

  (- (+ 6 4) (* 5 2))

in Scheme notation. The value of the expression obviously is 0, and it’s easy enough to calculate, i.e, easy enough to evaluate. However, thinking of the evaluation as a step-by-step process, one has a choice to either calculate 6 + 4 first and 5 * 2 afterwards (and then building the difference), or the other way around. Actually, if one had one interpreter of compiler using parallelism, one could even have the left and the right sub-expression evaluated in parallel (something we don’t really touch upon in the lecture)

But does it matter?

That’s a legitimate question, and the answer is: yes and no. Looking at the above simple numeric expression, the outcome is 0, independent of whether one calculates 6 + 4 before 5 * 2, or the other way around. That’s what referential transparency is about: the value of, for instance 6 + 4, namely 10, is independent from whether it’s calculated before 5 * 2 or afterward (or in parallel …), so in that sense the evaluation strategy or order does not matter. Of course, an interpreter will choose typically a particular order, like evaluating expressions like the one shown from left to right. Or a compiler realizes the same evaluation order, generating (machine) code that calculates the result of the left sub-expression before it calculates that of the one on the right (and before calculating the end result), since it has to calculate them in some order (if not parallelizing the task and using some multi-core architecture or similar) .

That was an argument that the evaluation strategy does not matter, at least is such purely functional or mathematical expressions, but actually the numerical example does not even touch on the two strategies mentioned above, applicative order vs. normal order. The reason being that the example does not really uses procedures, at least not user-defined ones, but calls only primitive procedures or operations, like + that do whatever needs to be done, without that one knows how they do it or how their procedure body looks like. Perhaps an operation using + is not even evaluated inside Scheme or the programming language , but handed over ultimately to some arithmetic hardware.

As said, strategy is about making choices, and the “evaluate sub-expressions or arguments to a procedure from left to right” is a “strategic” choice that was not even mentioned in the lecture. Simply because it’s not really relevant. A left-to-right version of Scheme (or some other language) is not different from a right-to-left version, at least not in any relevant way that would justify to give it special attention or names like “l-order evaluation” or “r-order evaluation” (and Scheme standards are explicit about that the standard does not prescribe an order, so the programmer should not assume one particular one).

Of course, in a language with side-effects, calling a function on arguments, where the argument have side-effects or using expressions where sub-expressions have side effects, it matters insofar as changing the order of evaluating the arguments may well change the outcome. Indeed, some imperative programming language explicitly specify that the order of evaluating the arguments of a procedure is unspecified. In other words, a programmer should not rely on that arguments are evaluated from left to right. For arguments without side-effects it would not matter anyhow, and arguments with side effects are bad coding style anyway. And as said initially, the word “evaluation” is best use for side-effect-free expressions anyway, as only then one is interested exclusively in an expression’s value. With side-effects the word “evaluation” and thus evaluation strategy is not too fitting anyway.

The strategic decision connected to the evaluation order does not regulate what happens if a procedure has multiple arguments (that’s a boring decision, as argued), but

when to evaluate an argument in a procedure application or function call (resp. the arguments of the application, if there are more than one)

Let’s take the example of the square-function. In Scheme, it’s plausibly defined as

  (define (square n) (* n n))

a purely functional procedure with one formal parameter. We may apply that to a (fac 5), like

 (square (fac 5))

The argument (fac 5) represents 120 its only not evaluated yet. One strategic decision is, that when applying a function to an unevaluated argument, one needs to evaluate the argument first, and evaluate the body afterwards, with the formal parameter replaced (= substituted) by the value. In the above example, the argument evaluates, as said, to 120 and replacing n by that value in the body of square yields

  (* 120 120)

That’s not yet evaluated, and requires one multiplication to reach the corresponding value 14400.

That’s how applicative order chooses to handle arguments in an application, namely evaluate them first. Alternatively, one can hand over the argument (fac 5) unevaluated, i.e. substituting the formal parameter in the body by (fac 5), yielding

  (* (fac 5) (fac 5))

That requires a number of evaluation or reduction steps, one needs to calculate (fac 5), actually one needs to calculate it two times, before * can do its thing. To leave unevaluated arguments unevaluated, but hands them over as is, that’s normal order evaluation

Note that that last step in the normal order evaluation example assumes that * is not a standard procedure, since the arguments (fac 5) are evaluated before being multiplied. While square in the example illustrates normal order evaluation, the multiplication is assumed to be built-in and is assumed to multiply numbers, i.e., numeric values, not unevaluated numeric expressions. We could alternatively assume that multiplication is not built-in, for instance implemented by a procedure multiply as follows (for simplicity, it works for non-negative arguments n only):

(define (multiply n m)
  (if (= n 0)
      0
      (+ m (multiply (- n 1) m))))

With that, square would be defined by (define (square n) (multiply n n)) and calling (square (fac 5)) leads with applicative order to

  (multiply (fac 5) (fac 5))

which in turn leads to

(if (= (fac 5) 0)
    0
    (+ (fac 5) (multiply (- (fac 5) 1) (fac 5))))

One could continue from here, doing further steps. It would require to remember that if is a special form, not a standard procedure, and thus the rules of applicative or normal order don’t apply in their pure form. If we did the same for + as we did for * namely redefining it maybe calling it plus, the evaluation would continue substituting unevaluated expressions to plus and the other functions. But I won’t do a further exploration of the normal order evaluation process on the example here, coming back to the initial question: AO or NO, does it matter?

In some way, no, it does not matter. Like in the example before, where left-to-right or right-to-left evaluation of sub-expressions did not matter, also for the strategic choice between AO vs NO in the example (square (fac 5)) does not matter, as far as the resulting value is concerned, namely 14400. That’s a general observation for purely functional programs: if the program results in a value under AO and under NO, it’s the same value.

In other ways, the choice between AO and NO does indeed matter! The end value may be the same, but the two strategies make different choices how to get there and that could mean, some strategy may get there quicker. The square example is a good illustration of that. It multiplies in its body (* n n) the argument n with itself. If, following NO, one hands over an unevaluated expression, like (fac 5) it means that (fac 5) needs to be evaluated (at least) two times, maybe more, if one uses a self-made multiply instead of the built-in *. For NO, the argument is evaluated only once, namely before handing it over to the caller. To avoid the potential performance penalty of repeatedly evaluated an expression, one could of course use memoization and the combination of NO and memoization is called lazy evaluation. Memoization can be used with AO as well, but it’s less urgent there, and there seems no specific word for that combination.

Of course, evaluating an argument only later, when unavoidable, can also lead to a situation that an argument is not evaluated at all under NO, whereas AO insists on evaluating it even if it turns out not being needed. As an example, take the following procedure that takes two arguments but returning only the first.

  (define (first x y) x)

If we apply that to 2 numeric expressions, say 42 and (/ 10 0), then AO will crash with a division-by-zero error or numerical overflow, whereas NO gives back 42 without crashing as the crashing division is never evaluated.

Of course, one could make the argument that (/ 10 0) is not a numerical expression, at least not a proper one, insofar that it does not evaluate to some value, it raises an error and potentially derails the overall evaluation. Actually, one can make the argument, that what happens when (/ 10 0), namely raising an exception, is not a purely functional behavior, and the “result”, the exception, is not a value, but a side-effect. Taking that view would basically say that it’s not a good example for distinguishing AO vs NO in the substitution model and for purely functional programs.

Fair enough, but the next example is harder to argue away. Instead of raising an error like division-by-zero, there are other programs that don’t yield a value. That’s programs that do not terminate. The simplest one in Scheme is probably a procedure without parameters that simply calls itself:

  (define (infiniteloop)  (infiniteloop))
  (infiniteloop)         ;; this never terminates

It hard to argue that this should not be a purely functional program, it does not change any state with things like set!, it does not produces output like with display, it does not crash or raise an exception. It simply keeps on evaluating without ever producing a value. If we use the procedure first from above with the infinite-loop program as its second argument

  (first 42   (infiniteloop))

it will not terminate under AO, but produces 42 under NO. That’s clearly an example where AO vs NO makes a difference. In a way it’s just an extreme example for the observation that AO and NO can make a difference in the number of steps it takes to reach at a value. If we had an program that takes an enormous amount of steps to evaluate and used the same set-up, like

  (first 42   expr-that-takes-super-long-to-evaluate)

then NO terminates very quick to produce 42, whereas AP takes super long before it produces 42. In the infinite-loop example, that super-long just means “forever” or infinitely long.

Evaluation strategies in imperative programming languages?

The discussion about evaluation strategies like AO and NO focused on functional languages, drawing examples mostly from the functional core of Scheme. Well, as explained evaluation is about reducing an expression to obtain its value. That’s an eminently functional way of explaining what happens when a program runs. Imperative programs may not result in a value, or if they yield a value, it’s the side-effects, like state-change, changing the value of variables, that is what primarily happens. Thus, we discussing a “standard” programming language (and most programming languages are imperative at their core), the word “evaluation” is seldom used, and no one speaks of evaluation strategies. A Java program is not evaluated, its run or executed, resp. it compiled mostly to byte-code and then run or executed, resp. interpreted on a virtual machine.

Not only is one much less interested in the resulting value of a program in an imperative, sequential setting, there is also not much room for (evaluation or execution) strategies. A strategy is a plan to resolve choices or alternatives, but in an sequential, imperative program, there is no room for such choices. Often code is arrange in sequences of statements, maybe using also loops and conditionals etc. The statements are separated (in many languages) syntactically by ; (semicolon). The semicolon is also called the sequential composition operator. And it goes without saying that of course the statements are evaluated one after the other from the beginning to end, leaving no room for an “strategic decisions”. It’s just how a program is executed, from beginning to end, and there’s loops, then of course the running program starts at the beginning of the loop when reaching the end of the loop (if not exiting).

Note incidentally, that at the current state of the lecture (in week 2) we have not even used composition in Scheme! In a purely functional language, it somehow makes no sense in explicitly programming to do ``first-this, then that’’. One does not specify the evaluation order, as mostly it does not matter, and where it matters to some extent, with procedure calls, the chosen evaluation strategy fixes the order. In Scheme, AO. Only somewhat later in week 6, when we introduce imperative feature, then there will be a need for sequential composition, and we will introduce the corresponding Scheme syntax: it will not be ;, but, you may guessed it, an S-expression. They keyword is begin, which you may guessed also that, is a special form…

Here is an example of some factorial in C, programmed very non-functional, and also doing input-output (something that we have not touched upon in Scheme).

#include <stdio.h>
 
int main()
{
  int c, n, f = 1;
 
  printf("Enter a number to calculate its factorial\n");
  scanf("%d", &n);

  for (c = 1; c <= n; c++)
    f = f * c;
 
  printf("Factorial of %d = %d\n", n, f);
 
  return 0;
}

However, when we said that there are no strategic decisions in an imperative language, like C, to make, that was an exaggeration. Also in imperative languages, one can nest expressions (as one commonly does on Scheme). Likewise one of course uses procedure calls, and of course can use procedures. Then the question arises: what is handed over when calling a function or procedure? The answer is often (like in C, Java, and most other languages): it’s a value of the argument expression, so the parameter passing mechanism is call-by-value. It corresponds to applicative order evaluation strategy, though, as said, in imperative languages, one does not much speak about evaluation (and consequently one does not speak about applicative order or normal order, as this refers to evaluation strategies).

So, call-by-value though it corresponds to AO, is not seen as an evaluation strategy, but a decision about parameter passing. And one main alternative to call-by-value is not something that corresponds to NO, but something called call-by-reference (which is never used in connection with Scheme). On the other hand, NO, which corresponds to the core of lazy evaluation, has no place in imperative languages. It’s not that it’s impossible to do, but basically imperative side effects and the loss of referential transparency messes things up, so to lazy evaluation unusable (it had been done, and called call-by-name, but abandoned as useless). That’s the reason why the only significant programming language based on lazy evaluation is purely function (Haskell). Scheme, and most other functional languages support side effects, and therefore must do call-by-value resp. applicative order.

2025 4
2024 9
2023 10
2022 1
2021 4

2022

Welcome to Compiler Construction, spring 2022

9 minute read

Evaluation strategies

Martin Steffen

Values, evaluation, and execution

What about strategies?

But does it matter?

Evaluation strategies in imperative programming languages?

2025

How to write a parser for the first time, e.g. for the oblig

How to prepare for an oral exam?

Did Fortran kill a Mars Mission?

Full employment theorem for compiler writers

2024

Y Y?

Recursion, primitive or otherwise

Processes and procedures

What’s in a name?

Evaluation strategies

Welcome to Functional programming (IN2040), autumn 2024

Worklist algorithms

Regular expressions, context-free grammars, and recursion

How to prepare for an oral exam?

2023

(Not so) abstract syntax trees

Y not code up $n!$ with no recursion and no Y tricks either?

Y Y?

Processes and procedures

Recursion, primitive or otherwise

Evaluation strategies

Welcome to Functional programming (IN2040), autumn 2023

How to prepare for an oral exam

Type systems: the strong, the strict and the safe.

How to implement regular expression matching

2022

Welcome to Compiler Construction, spring 2022

2021

Fol Soundandcomplete

Bitcoin, blockchain & formal methods, seriously now?

Welcome to program verification, autumn 2021

Welcome to Compiler Construction, spring 2021