Martin Steffen

From the group sessions I got feedback that the concepts of processes vs procedures and functions remained shady and a bit unclear (at least for some). Thus I write a few lines on that. I won’t repeat the examples and sentences from SICP; there is Section 1.2 (“Procedures and the processes generate”) which is intended to clarify the matter. Instead I “talk around” the concepts a bit.

To start with, it’s partly a terminology question, a question about (the use of) words. Using the correct words and using words correctly is course important, one needs to know the technical terms to communicate efficiently and understand texts. But terminology is only use up-to a point, and words may not and cannot be used 100% precise; natural language is not as precise like a mathematical definition or a piece of code, so there is always some slack.

Additionally different communities may use words differently. That applies also to the concepts discussed here, in particular “function” and “procedures”. In other programming languages, even in other Lisp/Scheme material, the words may be used slightly differently. In general, not only is Scheme a language that differs in many ways from many other languages, but the book SICP is quite idiosyncratic in its use of words, i.e., it makes choices sometimes to use words different from what the (programming languages) world outside the book does. Examples for that is the narrow definition of “tail-recursion” and the (non-)use of the word of “closures”.

However, the book is consistent and clear in its choices and use its words carefully.

Procedures and processes

Now, what’s procedures and processes? Those are different concepts and can be discussed in connection with every programming language. Whether elsewhere they call the concept “procedure” or whether they dicuss “processes” at all is a separate question. One can learn a language and to use it and solve problems without thinking about it, at least to some extent. Still, as long as the programming language one is studying is real in the sense of being run on a computer (interpreted on an interpreter or on a virtual machine or also compiled and then run), the concept of process exist in principle:

a process is a piece of code under execution

and here, in this section, the code is arranged in a functional manner based on procedures. When saying the code is under execution, it’s not literally meant that the the syntax of the program is transformed as it seems to be the case in the illustrations of the substitution model in the section discussing processes and procedures. In the larger context, it can also mean, that the code has been compiled to some machine code (in a compiled language), and it’s the machine code that is actually being run (as or in a process). Or some byte-code corresponding to the user code is run on a virtual machine, etc.

The discussion about processes and procedures is done in connection with recursion, in particular distinguishing between linear recursion, tail recursion, and tree recursion. That is done by looking at how corresponding pieces of code, the procedures, are “run”. That means, how they are evaluated. A procedure being run (or executed or evaluated) corresponds to a process. That’s why section 1.2 in SICP is also called “procedures and processes they generate”. And that’s why I said all programming languages will have the concept of “processes”, whether they use that particular word, or whether they discuss at all what happens when a program is run does not matter. As long as a piece of code is run, there is a run-time entity which we call process.

While other textbooks may do their presentation without mentioning of processes, SICP does. One should also keep in mind that the title of the book is not called “An introductory course to Scheme programming”, but “Structure and interpretation of computer programs”, so one important part is not just to learn Scheme, or to just learn to program in a functional way, but also to understand how programs are executed, in particular interpreted, i.e., run on an interpreter.

The pictures in Section 1.2 SICP are of course exactly that: pictorial representations (SICP also uses the word “visualization”) of what’s going on when a procedure (here fac, and the iterative version of fac and finally fib) is run. In that way the pictures describe (aspects of) the corresponding processes, resp. how those processes evolve.

While I said, that the concept of “process” exists in all programming languages insofar all programs in all languages are supposed to be run, the concrete pictures here are more specific for the current setting in SICP. The visualizations rely on the so-called substitution model. That’s an “explanation” of the behavior of a process where applying a procedures to values means substituting the formal parameters by the actual parameters (and then continue from there). The model as presented here not only relies on replacing formal parameters by the actual parameters. Additionally, it’s required that the arguments are evaluated first, i.e., the arguments are already values. This strategy thus corresponds to what is also known as call-by-value, one important, arguably the most important parameter passing mechanism for programming languages.

Using substitution as explanation of what happens when calling a function is also not unique for Scheme, one can use that also for other programming languages. But explaining program behavior via substitution is rarely done, as it works only in a purely functional setting, and most languages simply are not purely functional. Indeed, as soon as we introduce side-effects and things like set! in week 6, substitution as evaluation mechanism also breaks down for Scheme and has to be replaced by something more complex. Thus, it’s particular for the section here to use the substitution model when discussing the behavior of the processes.

The intention of the discussion and visualization is to give an impression of the memory usage (for instance comparing iterative vs. recursive versions of factorial). The illustration uses a sequence of S-expressions that evolves with substitutions (because that’s what we have seen so far). But even later, when we have abandoned the substitution model (or in other languages), the message that iterative processes (or loops) have a constant memory footprint, whereas recursive ones have a growing memory usage (also called the stack…) still holds true, independent from any visualization or model.

One could and probably should be more precise and saying that a process is a piece of sequential code under execution, at least in standard terminology. If one starts considering concurrency or parallelism, everything gets more complex. In the lecture, side-effects are presented as a drastic departure from the functional setting. But the departure would be only really radical, when introducing concurrency (and the terminology of process would need much more elaboration and would have a wider range of meanings). Concurrency and parallelism is an immensely large field in its own and Scheme is not the language that comes first to one’s mind when talking about concurrency and parallelism. Though some Scheme variations support parallelism and concurrent programming, and functional languages hold the promise to be easily be parallelized. The lecture will only touch upon concurrency and parallelism in the most superficial way, and also in this text, we cannot go deeper and for instance discuss the word process in a concurrent or parallel setting. For us, being concurrent or parallel as the alternative is not even on the screen, so we don’t even much mention that we are dealing sequential programs (though we do), and should a Scheme program be internally be executed in parallel, so the internal parallel evaluation would be invisible to us except that it may run faster.

Procedures and functions

SICP uses the words procedures and functions in a clear way and consistently (which is a good thing, especially for a textbook). So things are pretty clear on that front (within SICP). Functions are meant in a mathematical way, whereas procedures consist of Scheme code (using lambda and often using define if one wants to give a name to a procedure). As one of the first examples in the lecture, we had the factorial. The mathematical function is written conventionally with an exclamation mark $!$ whereas the procedure, the corresponding Scheme code, was called fac. Actually we had 2 procedures, one that was linear-recursive and one tail-recursive, both calculating the same function.

Since those two concepts are closely related, and since the whole thing is not really confusing, one sometimes of course relaxes a bit and says that fac is a function, namely the factorial function instead of saying fac is a Scheme variable giving a name to a procedure that represents an entity that is known in mathematics as the factorial function…

Of course, since functions as mathematical concept have no side effects, one can have procedures that do not represent mathematical functions, namely those which are not purely functional. Side-effects, for instance via set! in Scheme, is one way of breaking with pure functions. Another one would be non-determinism, for instance, functions that return output influenced by randomness. The procedure random, built-in in many Scheme dialects (though not in R5RS) is an example. Such a procedure is not purely functional, but has no side-effects either. Referential transparency, which is a characteristic property of a purely functional setting, does not hold for procedures like random. Side remark: it’s correct that a procedure like random breaks referential transparency and has no side-effects. To be more precise, it has no side-effects visible to the outside, to the user of random. Often random-number generators, when realized in software, generate not real random numbers, but so-called pseudo random numbers, numbers that looks random, but in fact really are not. A possible implementation might well rely on an internal state which is changed after each call to random, so, programmed that way, the procedure would internally make use to commands like set!, only that this is encapsulated (like the internal state of our bank-account examples). If realized in that way, the use of set! is another piece of evidence why random is not a mathematical function.

Clear as issue of functions vs. procedures is inside SICP, outside of the book and in other programming languages, the words are used often slightly differently, and one may stumble upon two other interpretations, themselves also slightly different. They don’t talk about functions in a mathematical sense at all, they focus on procedures or functions as programming constructions. One common interpretation is that procedures have no return value, and functions have a return value. Alternatively, one may find definitions that say functions don’t have side effects, procedures have. The latter is in line with our definition, because without side-effects, a procedure behaves like a mathematical function.

Both alternative definitions are slightly different, but hang together. If one has a procedure that does not return a value, then, to be useful at all, it will have side-effects (note that I/O or interacting with the environment count as side-effect). Analogously, if a procedures is not allowed to have side effects, it need to return a value. The only situation where the two definitions disagree is for procedures with side-effects and return values. One “definition” would call it a function, because of the returned value, the other definition would call it a procedure, because of its side-effect.

All is pretty simple (and uninteresting). This terminology sometimes are not (just) refer to concepts, but to actual language constructs. For instance, the slighted dated language Pascal uses procedure and function as keywords. So one could have

  procedure Hello;
  begin
     ShowMessage ('Hello world!');
  end;

  function Double (Value: Integer) : Integer;
  begin
     Double := Value * 2;
  end;

Most languages don’t feel the need to introduce different language level constructs or keywords to make a distinction on the programming language level. C, actually, at least the C standard, does not even talk about procedures, everything procedural is a function (with or without side-effect, with or without value). And object-oriented languages mostly call their “procedural” mechanism method (though methods often have some extra features over procedures).

2024 6
2023 10
2022 1
2021 4

2022

Welcome to Compiler Construction, spring 2022

9 minute read

Processes and procedures

Martin Steffen

Procedures and processes

Procedures and functions

2024

Worklist algorithms

How to write a parser for the first time, e.g. for the oblig

Regular expressions, context-free grammars, and recursion

Did Fortran kill a Mars Mission?

Full employment theorem for compiler writers

How to prepare for an oral exam?

2023

(Not so) abstract syntax trees

Y not code up $n!$ with no recursion and no Y tricks either?

Y Y?

Processes and procedures

Recursion, primitive or otherwise

Evaluation strategies

Welcome to Functional programming (IN2040), autumn 2023

How to prepare for an oral exam

Type systems: the strong, the strict and the safe.

How to implement regular expression matching

2022

Welcome to Compiler Construction, spring 2022

2021

Fol Soundandcomplete

Bitcoin, blockchain & formal methods, seriously now?

Welcome to program verification, autumn 2021

Welcome to Compiler Construction, spring 2021