Note that a procedure name is one level lower than the inside of the procedure body. When this program runs, main calls C which calls D which calls A which calls B.
The variables for procedures in a stack-based language such as Ada, Algol or Pascal are allocated in the stack. When a procedure is called, an activation record is created for its variables. The scope rules of the language determine which objects can be used at each point in the program.
Note that the stack is shown growing up, from a low address to a high address, but some implementations have stacks growing down.
And then the calls return and the stack contracts.
The stack management method must implement the scope rules of the programming language.
The first method of stack management uses static and dynamic links. A dynamic link points from each activation record to the activation record of the calling routine. This chains activation records together in reverse order of calling - it reflects the dynamic call history of the program. It allows the stack to be retracted on routine exit. The length of the dynamic chain is the number of active routines at any given time.
A static link points from each activation record to the activation record of the immediately enclosing routine. It allows accessing of non-local variables and objects. It reflects the static textual layout of the program. The length of the static chain is the nesting depth of the current block or routine. The maximum possible length is the maximum nesting depth in the program.
Other linkage information for a routine includes its return address and in the case of a function, its result. An activation record is usually organised as linkage information, parameters and then local variables.
Local variables are accessed relative to activation record base (ARB).
Global (main's) variables are accessed via stack base (SB).
Non-local variables are accessed by following the static chain the appropriate number of links to their activation record and then applying the variable's offset within that record.
From level n, to access a non-local variable at level m:
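The access rule above can be sketched in Python. This is only an illustration under stated assumptions: the `Frame` class, the helper names and the nesting of `main`, `a` and `b` are hypothetical and do not come from the original notes. Code running at level n reaches a variable declared at level m by following n-m static links.

```python
class Frame:
    """Hypothetical activation record: linkage information plus local slots."""

    def __init__(self, level, static_link, dynamic_link, n_locals):
        self.level = level                # lexical level of the routine body
        self.static_link = static_link    # frame of the enclosing routine
        self.dynamic_link = dynamic_link  # frame of the caller
        self.slots = [0] * n_locals       # parameters and local variables


def frame_at(current, m):
    """Follow current.level - m static links to reach the frame at level m."""
    frame = current
    for _ in range(current.level - m):
        frame = frame.static_link
    return frame


def load(current, m, offset):
    return frame_at(current, m).slots[offset]


def store(current, m, offset, value):
    frame_at(current, m).slots[offset] = value


# Assumed nesting: main (level 1) encloses a (level 2), which encloses b (level 3).
main = Frame(1, None, None, 2)
a = Frame(2, main, main, 1)   # a called from main
b = Frame(3, a, a, 1)         # b called from a

store(b, 1, 0, 42)            # b assigns to a variable of main: 2 links
store(b, 2, 0, 7)             # b assigns to a variable of a: 1 link
print(load(b, 1, 0), load(b, 2, 0))   # 42 7
```

A local access is the special case m = n, where no links are followed.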
A second method of stack management uses a "display".
A procedure 'p' at level n calls a procedure 'q' at level m. Note that m<=n (unless the procedure is a parameter). Note also that q may be called from many places and from different levels, so q cannot set up its own environment, but p can.
Caller p, in the case that m<n:
Caller p, in the case that m=n:
Since a routine q is written once but may be called from many places, as much of the calling code should be put in q as possible:
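A rough Python sketch of the calling convention with static and dynamic links follows. The `Frame` class, the `call` and `chain` helpers and the example nesting (main declares p and r, p declares q) are all assumptions made for illustration. The caller supplies the static link by walking n-m links up its own static chain; the dynamic link is simply the caller's frame.

```python
class Frame:
    def __init__(self, name, level, static_link, dynamic_link):
        self.name = name
        self.level = level                # level of this routine's body
        self.static_link = static_link
        self.dynamic_link = dynamic_link


def call(caller, name, m):
    """Caller (body at level n) calls a routine declared at level m <= n.

    The callee's body is at level m+1, so its static link must point to the
    frame at level m, found n-m links up the caller's static chain.
    When m == n that is the caller's own frame.
    """
    env = caller
    for _ in range(caller.level - m):
        env = env.static_link
    return Frame(name, m + 1, static_link=env, dynamic_link=caller)


def chain(frame, link):
    """Names along the static or dynamic chain starting at frame."""
    names = []
    while frame is not None:
        names.append(frame.name)
        frame = getattr(frame, link)
    return names


main = Frame("main", 1, None, None)
p = call(main, "p", 1)     # m = n: static link is main itself
q = call(p, "q", 2)        # m = n: static link is p
r = call(q, "r", 1)        # m < n: two links up from q reaches main

print(chain(r, "dynamic_link"))   # ['r', 'q', 'p', 'main']
print(chain(r, "static_link"))    # ['r', 'main']
```

The dynamic chain records the call history, while the static chain of r is short because r is declared directly inside main.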
A function result can be treated as an output parameter. Alternatively it can be returned in a register, or it can be left at word 0 of the activation record; this requires moving the other items up by 1 word. Word 0 becomes 0[SF] when the function has returned; this can be convenient for compiling expressions, especially if the stack is used to evaluate expressions:
The parameters are allocated in the activation record of a routine immediately after the linkage information, from word 3 onwards. A caller must evaluate the actual parameters and store them in the space the formal parameters will occupy. When the caller is running, this space will be from 3[SF] onwards.
There are various methods of passing parameters. The simplest method is to pass a parameter by-value or by-input. Here the actual parameter is evaluated and this value stored in the formal parameter.
A more complex method is to pass a parameter by-reference. Here the address of the actual parameter is stored in the formal parameter. Any access of the formal parameter accesses the actual parameter by indirection.
The most complex method is to pass a parameter by-name. The effect is as if the formal parameter had been replaced everywhere in the routine by the actual parameter. In fact a closure, or thunk, which calculates the actual parameter's value or address is passed.
The following example distinguishes the 3 methods:
Under by-value, the value of the actual parameter a[i], that is 2, is passed to x. i becomes 3, so does x, but a is unchanged. Output:
This would be the case in Pascal if x were a value parameter. The example is technically badly typed in Algol-68 because x is a value and Algol-68 argues that a value (e.g. 2) cannot be incremented.
Under by-reference, the address of the actual parameter a[i], as it stands at the call, is passed to x. i becomes 3, as does that element of a. Output:
This is the case in Pascal with a var parameter. Strictly, the actual parameter must have an address to pass to x. a[i] does have an address, but a constant such as '7' in p(7) does not. Fortran uses by-reference, but an expression such as 1+6 would be evaluated to 7 and stored in a temporary variable, and this hidden variable's address would be passed to x. On some compilers, p(1) would result in 1=2 thereafter, because the stored constant itself is incremented!
Under by-name, x "becomes" a[i]. i becomes 3, so x now denotes a[3], and it is a[3] that is incremented, becoming 4. Output:
Algol-60 has by-value and by-name parameters. By-name parameters are implemented in a way similar to procedure formal parameters (see later). Many functional languages implement by-name parameters in an efficient way called by-need or lazy evaluation. This is possible because there are no side-effects in a functional language (see later).
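The three methods can be compared with a small Python simulation. The original example program is not reproduced in this text, so the initial state used here (i = 2, a[2] = 2, a[3] = 3) and a routine that increments i and then x are assumptions chosen to agree with the descriptions above. By-reference is modelled by passing a (container, key) pair as the address, and by-name by passing a pair of thunks.

```python
def run(mode):
    env = {"i": 2, "a": {2: 2, 3: 3}}   # assumed initial state

    def p_value(x):
        env["i"] += 1
        x += 1                      # changes only the local copy

    def p_reference(addr):
        container, key = addr       # the "address" of the actual parameter
        env["i"] += 1
        container[key] += 1         # indirection through the address

    def p_name(get, set_):
        env["i"] += 1
        set_(get() + 1)             # thunks re-evaluate a[i] at each access

    if mode == "value":
        p_value(env["a"][env["i"]])
    elif mode == "reference":
        p_reference((env["a"], env["i"]))   # address fixed at the call
    else:
        p_name(lambda: env["a"][env["i"]],
               lambda v: env["a"].__setitem__(env["i"], v))
    return env["i"], env["a"][2], env["a"][3]


print(run("value"))       # (3, 2, 3)
print(run("reference"))   # (3, 3, 3)
print(run("name"))        # (3, 2, 4)
```

Each access through the by-name thunks costs a function call, which is the overhead the cost comparison refers to.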
One further method is by-input-output. Here the actual parameter is evaluated and copied to the formal parameter before routine entry and copied back afterwards. It usually behaves like by-reference unless an actual parameter is altered in a routine both via the formal parameter and via the actual parameter (as a non-local).
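The difference between by-input-output and by-reference only shows when the routine also alters the actual parameter directly. A minimal Python sketch, assuming a hypothetical routine q that increments both its formal parameter and the non-local variable that was passed to it (a one-element list stands in for a memory cell):

```python
def by_reference(cell):
    def q(ref):
        ref[0] += 1        # update via the formal parameter
        cell[0] += 1       # update via the non-local actual parameter
    q(cell)                # formal and actual are the same cell
    return cell[0]


def by_input_output(cell):
    def q(x):
        x += 1             # update the private copy
        cell[0] += 1       # update the non-local actual parameter
        return x
    cell[0] = q(cell[0])   # copy in, run, copy back (overwrites the update)
    return cell[0]


print(by_reference([10]))      # 12
print(by_input_output([10]))   # 11
```

Under by-reference both increments land on the same variable; under by-input-output the copy back overwrites the direct update.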
By-value requires an expensive copy if the actual parameter is a large structure. By-reference requires one indirection for each parameter access. By-name requires an implicit function call for each access.
The following technique, known as Jensen's device, uses call by-name parameters to sum a series:
Note that it depends on the loop control variable being non-local to sum (forbidden in Pascal) and on 1/i being passed by-name.
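Python has no by-name parameters, but Jensen's device can be imitated by passing the term as a thunk. In this sketch the names `env`, `sum_series` and `term` are hypothetical; the loop variable i is kept in a dictionary that is non-local to the summing routine, mirroring the requirement noted above.

```python
env = {"i": 0}   # the loop control variable, non-local to sum_series


def sum_series(term, lo, hi):
    """Sum term for i = lo..hi; term is re-evaluated on every iteration."""
    total = 0.0
    for k in range(lo, hi + 1):
        env["i"] = k        # assign to the non-local loop variable
        total += term()     # by-name: evaluate 1/i with the current i
    return total


harmonic = sum_series(lambda: 1 / env["i"], 1, 4)
squares = sum_series(lambda: env["i"] ** 2, 1, 3)
print(harmonic, squares)
```

Here `harmonic` is 1 + 1/2 + 1/3 + 1/4 and `squares` is 1 + 4 + 9; the same routine sums any series whose term is written in terms of i.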
In the absence of by-name parameters, series can be summed in a more obvious manner by passing a procedure or function as a parameter. Such a parameter is known as a procedure formal parameter.
Note that term is a procedure formal parameter. fact is passed to sum as an actual procedure parameter. The environment that fact runs in is derived from p, not from sum. Yet sum calls fact (via term); p does not call fact. Therefore p must give the environment for fact to sum as part of the procedure value. A procedure value is a closure. It consists of a pointer to the code and a pointer to the addressing environment for the code to run in. The stack situation as fact runs is:
p has access to its own ARB and so can store it as part of the routine value as term's env. fact can access its own local variables, p's variables and main's variables.
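The idea of a procedure value as a (code, environment) pair can be made explicit in Python. Everything named here is an assumption for illustration: the `Frame` class, name-based `lookup` (a compiler would use level and offset), and a variable `scale` belonging to p that fact uses. The point is that sum builds the callee's frame with the static link taken from the closure, not from sum's own frame.

```python
class Frame:
    def __init__(self, name, static_link, **variables):
        self.name = name
        self.static_link = static_link
        self.vars = variables


def lookup(frame, name):
    """Find a variable by walking the static chain (names instead of offsets)."""
    while name not in frame.vars:
        frame = frame.static_link
    return frame.vars[name]


def fact_code(frame):
    # Body of fact: uses its parameter k and 'scale', a variable of p.
    result = 1
    for j in range(2, frame.vars["k"] + 1):
        result *= j
    return result * lookup(frame, "scale")


def sum_code(frame):
    code, env = frame.vars["term"]           # the procedure value
    total = 0
    for k in range(1, frame.vars["n"] + 1):
        callee = Frame("term", env, k=k)     # static link comes from the closure
        total += code(callee)
    return total


main = Frame("main", None)
p = Frame("p", main, scale=10)
closure = (fact_code, p)                     # p stores its own ARB as the env
s = Frame("sum", main, term=closure, n=4)
total = sum_code(s)
print(total)   # 330 = 10 * (1 + 2 + 6 + 24)
```

sum never sees `scale` in its own environment; it is reachable only through the environment part of the closure.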
Procedure formal parameters can be used to implement call-by-name parameters if the latter are not provided by a language.
Procedure parameters can be used in a recursive descent parser to good effect:
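One way procedure parameters help in a recursive descent parser is to factor out a repeated grammar shape such as `item { separator item }`. The following Python sketch is an assumed example, not the original one: `parse_sequence` takes the item parser as a parameter and is reused for both terms and factors of a tiny expression grammar over whitespace-separated tokens.

```python
def parse_number(tokens, pos):
    if pos < len(tokens) and tokens[pos].isdigit():
        return int(tokens[pos]), pos + 1
    raise SyntaxError(f"number expected at token {pos}")


def parse_sequence(tokens, pos, parse_item, separator):
    """Parse: item { separator item }, with parse_item passed as a parameter."""
    item, pos = parse_item(tokens, pos)
    items = [item]
    while pos < len(tokens) and tokens[pos] == separator:
        item, pos = parse_item(tokens, pos + 1)
        items.append(item)
    return items, pos


def parse_term(tokens, pos):
    factors, pos = parse_sequence(tokens, pos, parse_number, "*")
    product = 1
    for f in factors:
        product *= f
    return product, pos


def parse_expression(tokens, pos=0):
    terms, pos = parse_sequence(tokens, pos, parse_term, "+")
    return sum(terms), pos


value, end = parse_expression("2 * 3 + 4 * 5 + 6".split())
print(value, end)   # 32 9
```

The same `parse_sequence` routine could serve parameter lists or statement sequences by passing a different item parser and separator.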
Algol-68 passes parameters by-value, but the notion of 'value' is very general so that all other methods can easily be programmed:
Here the value is an INT value.
A REF INT value, an integer variable location value can also be passed:
Note that '7' has no address to pass as a REF INT.
Lastly, a procedure value can be passed:
The value in this case is an anonymous procedure value.
Each routine allocates space for its parameters and other local variables in the stack. This space can be reserved by adding its size to the stack front register. In Pascal, all arrays and therefore all structures in the stack have a fixed size calculable at compile time. Therefore the stack space for each routine activation is known at compile time. Many languages allow dynamic array bounds:
It is impossible to know the size of arr until the program runs. However a descriptor for arr does have a fixed size:
Note that this declaration of arr has had a side-effect in advancing SF. Languages, such as Algol-68, allowing this must prohibit jumps forward over such declarations or the effect will be avoided. Also, a jump backwards over the declaration must be banned or the effect can happen several times.
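A small Python model of the descriptor idea, with hypothetical names throughout: the descriptor is a fixed-size triple of lower bound, upper bound and base address, while the elements are allocated by advancing the stack front by an amount known only at run time.

```python
class Stack:
    def __init__(self, size):
        self.words = [0] * size
        self.sf = 0                    # stack front: next free word

    def declare_array(self, lower, upper):
        """Advance SF by the run-time size and return a fixed-size descriptor."""
        base = self.sf
        self.sf += upper - lower + 1   # the side-effect on SF
        return (lower, upper, base)    # always 3 words, whatever the bounds

    def index(self, descriptor, i):
        """Address of element i, with a bounds check."""
        lower, upper, base = descriptor
        if not lower <= i <= upper:
            raise IndexError(i)
        return base + (i - lower)


stack = Stack(100)
n = 5                                  # value known only at run time
arr = stack.declare_array(1, n)
stack.words[stack.index(arr, 3)] = 9
print(arr, stack.sf)                   # (1, 5, 0) 5
```

In a real implementation the descriptor itself would sit at a fixed offset in the activation record, so code that refers to the array can be compiled without knowing its size.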
In the above, an array's size could not change once it was created. The size of an Algol-68 flex array can change after creation. Such an array must be allocated in a heap (see later). The descriptor can still be allocated in the stack.
Gotos involve more than is immediately obvious in a block-structured language. The first minor difficulty is due to forward jumps or forward references to labels (and routines). This problem also appears in assemblers.
A multi-pass compiler may be able to evaluate the location the label will refer to before the goto is compiled. If this information is not available a jump to an unknown location must be generated for the goto. A list of such incomplete jump instructions can be kept (a well known trick is to use the incomplete address fields in the jump instructions to link them together). The correct jump destination must be filled in when known. An alternative technique is to compile all forward jumps as indirect jumps via a jump table. The location of the jump table can be fixed at the start of compilation and its contents can be filled in at the end. Another technique is to have the linker-loader fill in forward jumps in the same way that it deals with external references.
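The trick of chaining incomplete jumps through their own address fields can be sketched as follows. The `Assembler` class and its instruction format are hypothetical; -1 is assumed as the end-of-chain marker.

```python
class Assembler:
    def __init__(self):
        self.code = []        # each instruction is [opcode, operand]
        self.defined = {}     # label -> address, once known
        self.pending = {}     # label -> index of latest unresolved jump

    def emit(self, opcode, operand=None):
        self.code.append([opcode, operand])

    def emit_jump(self, label):
        if label in self.defined:                 # backward jump: address known
            self.emit("JMP", self.defined[label])
        else:
            # Forward jump: the address field temporarily holds the index of
            # the previous unresolved jump to the same label (-1 ends the chain).
            self.emit("JMP", self.pending.get(label, -1))
            self.pending[label] = len(self.code) - 1

    def define_label(self, label):
        address = len(self.code)
        self.defined[label] = address
        link = self.pending.pop(label, -1)
        while link != -1:                         # walk the chain, patching
            next_link = self.code[link][1]
            self.code[link][1] = address
            link = next_link


asm = Assembler()
asm.emit_jump("lab")      # forward
asm.emit("NOP")
asm.emit_jump("lab")      # forward
asm.emit("NOP")
asm.define_label("lab")   # address 4
asm.emit("HALT")
asm.emit_jump("lab")      # backward
print(asm.code)
```

After `define_label` all three jumps carry the address 4, and no separate list of fix-up locations was needed beyond one entry per label.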
The second difficulty is identifying the correct label with the appropriate name:
The question is, does the goto go to the lab already seen as in:
or is it a forward jump as in:
A one-pass compiler may not be able to decide at the goto. Delaying action similar to that for forward references is needed. Note that Pascal avoids this problem by forcing labels to be declared at the head of a block separate from the defining instance of the label. All applied uses of a label follow the declaration and the lexical level of the destination of each goto is known at the goto.
The last problem is that of the non-local jump out of a routine:
p calls q and q jumps to lab outside q and inside p. When the jump happens the stack must be retracted to discard q's activation record and to use p's again. The label, lab, is visible in q so it is in q's environment. Therefore the label's environment is a subset of q's and the goto can retract the stack before doing the jump.
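The stack retraction for a non-local jump can be modelled in Python. The `Frame` class and `nonlocal_goto` are assumptions for illustration; q is made recursive here to show that more than one activation record may be discarded. The frame that owns the label is found on the static chain of the jumping routine, and the stack is popped back to it.

```python
class Frame:
    def __init__(self, name, level, static_link):
        self.name = name
        self.level = level              # level of the routine body
        self.static_link = static_link


def nonlocal_goto(stack, label_level):
    """Retract the stack to the frame whose body (at label_level) holds the label."""
    target = stack[-1]
    for _ in range(stack[-1].level - label_level):
        target = target.static_link     # the label is in the static environment
    while stack[-1] is not target:
        stack.pop()                     # discard intervening activation records
    return target


main = Frame("main", 1, None)
p = Frame("p", 2, main)
q1 = Frame("q", 3, p)                   # p calls q
q2 = Frame("q", 3, p)                   # q calls itself recursively
stack = [main, p, q1, q2]

resumed = nonlocal_goto(stack, 2)       # goto lab, where lab is in p's body
print(resumed.name, [f.name for f in stack])   # p ['main', 'p']
```

After the retraction the jump proceeds as an ordinary local jump within p.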
A few languages allow labels to be passed as parameters and allow label variables. In this case a label value consists of a code address and an addressing environment - a closure - labels have much in common with routines. A local label cannot be returned from a routine in a stack based language because it refers to an environment that vanished when the routine returned. If the language allows such results, activation records must persist after return and must be placed in a heap.
An alternative method of managing the stack uses a display. This consists of the pointers that form a static chain removed from the activation records and placed together in a block, usually in a block of fast registers.
Ignoring procedure formal parameters for now, if a procedure body p at level n calls a procedure q at level m, then m<=n. If m=n then q is local to the body of p and the body of q is at level n+1. The display when q runs is the same as that for p plus one extra entry (n+1). If m<n then q is non-local to p. The display when q runs should be the first m entries from p's display plus one new entry for q's activation record.
Assume that word 0 of the activation record holds the return address and word 1 holds the saved display entry.
If there are enough registers to hold the display then access to all variables is fast. To access a variable at 'offset,level' where d=display[level]:
Alternatively the display is held in memory and entries are loaded into registers when needed.
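The display scheme of the preceding paragraphs can be sketched in Python. The `Machine` class is hypothetical; it follows the assumed record layout (word 0 return address, word 1 saved display entry, locals from word 2) and indexes the display by the level of the routine body.

```python
class Machine:
    def __init__(self, max_depth=8):
        self.stack = []                         # flat list of words
        self.display = [None] * max_depth       # display[level] -> ARB

    def enter(self, level, n_locals, return_address=None):
        """Entry to a routine whose body is at `level`."""
        arb = len(self.stack)
        self.stack.append(return_address)       # word 0: return address
        self.stack.append(self.display[level])  # word 1: saved display entry
        self.stack.extend([0] * n_locals)       # locals from word 2 onwards
        self.display[level] = arb
        return arb

    def leave(self, level):
        arb = self.display[level]
        self.display[level] = self.stack[arb + 1]   # restore the saved entry
        del self.stack[arb:]                        # retract the stack

    def address(self, level, offset):
        """Variable at 'offset,level': one display lookup, no chain to follow."""
        return self.display[level] + offset


m = Machine()
main_arb = m.enter(1, 2)     # main's body at level 1
p_arb = m.enter(2, 1)        # p declared in main
q_arb = m.enter(3, 1)        # q declared in p
r_arb = m.enter(2, 1)        # q calls r, declared in main: reuses display[2]
during = m.display[2]
m.leave(2)                   # r returns; p's entry is restored
print(during == r_arb, m.display[2] == p_arb, m.address(1, 2))
```

While r runs, display[3] still holds q's record, but r's code never refers to level 3, so the stale entry is harmless in this simple scheme.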
The maximum number of display entries equals the maximum length of a static chain, which equals the maximum textual nesting depth in a program. In principle there is no limit to this. In practice it rarely exceeds 10. The Burroughs 7600 dedicated 16 registers to the display; these were automatically set and reset by subroutine entry and exit instructions.
In practice most procedures access local variables most of the time, global variables a few times and non-local variables rarely. Therefore the potential speed advantage of displays over static links is small, if it exists at all.
Further, the simple scheme above does not work when procedure formal parameters are allowed. A procedure value consists of (a pointer to) code and an addressing environment. The environment must consist of a complete display (several words) or a pointer to such a display. This means that the entry sequence must store the whole display and load the new one given, and that the exit sequence must restore the old display. This is a large overhead on procedure call.
BCPL and C allow recursive routines but not nesting of routines. This means that there are only local variables and global (static) variables; there are no intermediate non-local variables. A static link is therefore unnecessary.
↑ © L. Allison, www.allisons.org/ll/ (or as otherwise indicated).