Stacks

The stack is a simple but important example of an abstract data type. It is used to store elements where the Last element In is the First one Out (LIFO). Anyone with a cluttered desk or an overfull sink of washing-up will be familiar with a stack. New elements are added or pushed onto the top of the stack. The first element to be removed or popped is taken from the top - the last one in. All recursive programming languages use an implicit system-stack to allocate space for local variables of routines as they are called. Some algorithms require the use of an explicit stack.


Type:
   stack = empty_stack | push Element_Type × stack

Operations:
   empty :stack -> boolean
   top :stack ->Element_Type
   pop :stack->stack
   push:Element_Type × stack -> stack

Rules:
   empty(empty_stack) = true
   empty(push(e,s)) = false

   top(push(e,s)) = e
   top(empty_stack) = error

   pop(push(e,s)) = s
   pop(empty_stack) = error


The Stack Abstract Data Type.

The operations above are pure or free of side-effects.

It is frequently the case that stacks do not share components and then it is usual to write pop and push procedures ((sub-)routines, methods, ...) that change a stack as a side-effect. In addition, top and pop operations are often used in sequence to get the top value and then remove it. In this case the pop procedure is usually written to perform both operations.

Many programs use a single stack. In this case just one instance of a stack may be declared as a global variable (dangerous) or as part of a stack module (safer). The operations can then act on this instance implicitly.

Example: Reverse Polish Calculator

A small "pocket-calculator" program is given here. The input language is reverse-Polish (after Jan Lukasiewicz) or postfix notation which has the advantage that no brackets are needed. At least one commercially available pocket-calculator uses this notation.


  infix      reverse-Polish
  -------    --------------
  a+b*c      a b c * +
  (a+b)*c    a b + c *

The program uses a single stack which is implemented as a module in the following sections. The module specifies a clean interface between the use of the stack and the implementation of the stack. The latter can be changed so long as the interface is adhered to and the calculator program will still work. The implementation(s) can be verified and tested to ensure compliance with the interface. Use of modules in this way is a useful technique in the management of large programming projects.

Each operand is pushed onto the stack. Each operator takes two operands from the top of the stack and pushes its result onto the stack

[C/Stack/calculator.c]

See the later section on the shunting algorithm and the section on tree traversal in a later chapter for methods of generating reverse-Polish notation from infix notation.

Implementation

There are two natural ways to implement a stack depending on the circumstances. A stack can be implemented by an array of elements. An integer "pointer" top indicates the element currently at the top of the stack. Initially top is zero for an empty stack. The array must be declared to have some fixed size. This is acceptable if an accurate estimate of the stack's maximum size can be made before the program runs. The other implementation uses a linked list. This is more suitable if an accurate estimate of the stack's maximum size cannot be made in advance.

Both methods of implementing a single stack are given here as modules. The modules export only the abstract stack operations. Either module can be used with the previous calculator program. This illustrates that modules protect other parts of the program from information that they do not need to know. This design principle may not seem very convincing from the small examples that must of necessity be given in this book, but it is very important indeed in large programming projects.

Note that if multiple stacks are required an opaque stack type must be exported from the module so that separate stack instances can be declared; the operations must then have a stack parameter to indicate which stack is to be manipulated.

L
.
A
l
l
i
s
o
n

Stacks by Array

If a stack is known to be small or if an accurate estimate of its maximum size is known then an array gives an efficient implementation. On the other hand, if several stacks are used then enough space is reserved for them all to be full simultaneously even if it is known that this event can never occur.

Stacks by Lists

There is a very close correspondence between lists and stacks. The empty stack corresponds to the empty or null list and push corresponds to constructing a list cell. This makes an implementation by lists attractive if a good prior estimate of the stack's size is not available.

Example: Shunting Algorithm

Dijkstra's shunting algorithm is a method of generating reverse-Polish notation from conventional infix notation.


infix                 reverse-Polish
----------            --------------
 a+b*c = a+(b*c)       abc*+
 (a+b)*c               ab+c*
 a*b+c                 ab*c+
 a*(b+c)               abc+*

Infix and reverse-Polish.

We say that infix multiplication and division have a higher priority than addition and subtraction. The former bind more tightly to their operands than the latter.

The shunting algorithm uses a stack of operators.


<output<                            <input<
                                    a*b+c*d
-------------------------------------------
            .             .
              .         .
                .     .
                  . .
                   |
                   |  Operator Stack
                   |
                   |

 Shunting Algorithm.

It is named after a way of visualising its operation in terms of a small railway system. Imagine a main-line with the input coming from the right and going to the left. A siding is connected to the main-line by a triangle. Such a triangle and siding can be used to turn a train around by driving into the siding and reversing out again on the other branch - LIFO. Such triangles are used in real railway systems in the absence of a turn-table. In fact operands go straight through on the main-line but operators go onto the siding.


<output<                            <input<
ab                                  c*d
-------------------------------------------
            .             .
              .         .
                .     .
                  . . +
                   |
                   |  Operator Stack
                   |
                   | *

There is a rule that an operator of a given priority, such as `+', may not sit in the siding/stack immediately above an operator of higher or equal priority, such as `*'. Therefore, the `*' must be popped onto the main-line before `+' can be pushed onto the stack. The c and d pass through on the main-line and the second `*' can be pushed above the lower priority `+'.


<output<                            <input<
ab*cd
-------------------------------------------
            .             .
              .         .
                .     .
                  . .
                   |
                   |  Operator Stack
                   | *
                   | +

At the end of input, those operators that are left on the stack are popped to give the complete output


ab*cd*+

Parentheses have still be considered. They are used to change the binding of infix operators to operands.


eg.  a*b+c*d = (a*b)+(c*d) != a*(b+c)*d

Parentheses isolate a subexpression. The algorithm therefore treats the subexpression as a piece of sub-input.


<output<                            <input<
                                  a*(b+c)*d
-------------------------------------------
            .             .
              .         .
                .     .
                  . .
                   |
                   |  Operator Stack
                   |
                   |

When an opening `(' is encountered, it is pushed onto the stack and hides everything beneath. A low priority operator, such as `+' can sit above `(' while it cannot sit directly above `*'.


<output<                            <input<
abc                                )*d
-------------------------------------------
            .             .
              .         .
                .     .
                  . .
                   |
                   | +  Operator Stack
                   | (
                   | *

The closing `)' is treated as the end of the subexpression and pops any operators of the subexpression - up to the opening `('. Both parentheses are then discarded.


<output<                            <input<
abc+                              *d
-------------------------------------------
            .             .
              .         .
                .     .
                  . .
                   |
                   |  Operator Stack
                   |
                   | *

The algorithm then continues as before until the complete output


abc+*d*

is produced.

The basic shunting algorithm can easily be extended to check that the input expression is syntactically correct.

Exercises

Modify the stack-by-list implementation to free cells when they are popped.
Implement the shunting algorithm in Turing.
Modify the calculator program so that it uses the shunting algorithm to let it do calculations given in infix notation.
Use a stack to convert the following procedure to non-recursive, iterative form:
```
procedure r(x:T)
   if p(x) then
      a(x)
   else
      b(x)
      r(f(x))
      c(x)
   end if
end p
```
where T is an arbitrary type, p is an arbitrary predicate, a, b and c are arbitrary procedures and f is an arbitrary function.
Convert quick sort from recursive to a non-recursive, iterative form. To do this introduce a stack of (bounds of) array segments. Initially push the (bounds of) the whole array onto the stack. Repeatedly pop an array segment and, if its size is bigger than 1, partition it and push first the larger (why?) and then the smaller partition onto the stack. Terminate when the stack is empty.
Hard: Write a pretty-printer for some programming language. e.g. There are various "bracket structures" in Turing:


begin ... end
if ... then ... else ... end if
case ... label ... end case
loop ... end loop
for ... end for
record ... end record
union ... end union

procedure <ident> ... end <ident>
function <ident> ... end <ident>
module <ident> ... end <ident>


eg. |if ...
    |   |...
    |then ...
    |   |...
    |else ...
    |   |...
    |end if ...

forward