Descriptions of problems in the real world have many superfluous details. An essential step in problem solving is to identify the underlying abstract problem devoid of all unnecessary detail. Similarly, a particular model of computer has many details that are irrelevant to the problem, for example the processor architecture and the word length. One of the arts of computer programming is to suppress unnecessary detail of the problem and of the computer used. An invaluable aid is the use of a high level programming language such as Algol, C, Haskell, Java, ML, Turing, etc..
Once problems are abstracted, it becomes apparent that seemingly different problems are essentially similar or even equivalent in a deep sense. For example, the problems of maintaining a list of students taking a lecture course and of organising a dictionary structure in a compiler have much in common; both require the storage and manipulation of named things and the things have certain attributes or properties. Abstraction allows common solutions to seemingly different problems. By using abstract algorithms and data structures, a solution to a problem has maximum utility and scope for reuse.
Algorithms and data structures can be specified in any adequately precise language. English and other natural languages are satisfactory if used with care to avoid ambiguity but more precise mathematical languages and programming languages are generally preferred. The execution of the latter can also be automated. A program is the expression of some algorithm(s) and data structure(s) in a particular programming language. The program can be used on all types of computer for which suitable language compilers or interpreters exist, making it more valuable.
Sometimes details of a machine do intrude. The word size of the computer may be relevant if very big numbers are to be manipulated. The relative speed of real and integer arithmetic may be critical in simulating a physical system. There may be a trade-off between the speed of a computation and the space it uses. However such details should only be allowed to intrude if absolutely necessary.
Given a problem to solve, we select or design abstract data structures to store the data and algorithms to manipulate them. A data structure is an instance of a data type. The design step is largely independent of any programming language. The coding or programming step is to implement the abstract data structures and algorithms in a particular programming language such as Turing. An abstract data type defines certain static properties of the data and certain dynamic operations on them. It defines what the operations are, not how they work. An algorithm defines some process, how an operation works. It is specified in terms of simple steps. Each step must be clearly implementable in a mechanical way, perhaps by a single machine instruction, by a single programming language statement or by a previously defined algorithm. An algorithm to solve a problem must be correct; it must produce correct output when given valid input data. An algorithm must terminate when given valid input data. When the input data is invalid we usually want the algorithm to warn us and terminate gracefully. Often we want to know how much time an algorithm will take or how much space (computer memory) it will use and this leads to the analysis of algorithms and to the field of computational complexity.
Example: The Abstract Data Type Int
Mathematicians have long studied the integers and their properties. The consensus as to what integer-ness is amounts to a particular collection of axioms or rules that must be obeyed. The way in which integers are represented is unimportant provided only that all readers understand the notation - binary, octal, decimal, hexadecimal, twos complement, ones complement or sign and magnitude, the choice does not matter. What does matter is the way operations on integers behave. The rules define this behaviour, for example addition is commutative and zero is its identity value. Unary plus and minus take an integer and return an integer, denoted int->int. A binary operator like plus takes two integers and returns an integer, denoted int2->int.
These rules define an ideal for a programming language to attempt to live up to. In practice, a language can only approximate this ideal. For example, many languages place a limit on the size of integers that can be represented in the type int; it is equivalent to:
Long-multiplication, the usual manual method for multiplying large integers is an example of an algorithm. If the two integers have d1 and d2 digits respectively, the algorithm requires approximately d1×d2 steps.
The HTML FORM below runs a demonstration of integer long-multiplication. Change the values, click the `go' button and experiment:
Try numbers with 6, 8, 10, 12 or more digits and note how the execution time increases, but only increase the number of digits gradually; you have been warned!
It is possible to multiply integers more rapidly than above. This is particularly important for very long integers, e.g. in encryption.
Given two 2n-bit integers, u and v:
Using this recursively gives an O(nlog23)-time algorithm, better than O(n2); after:
Even Faster Integer Multiplication
Schonhage and Strassen showed how to multiply two N-digit integers in
Example: The Abstract Data Type Set
The set is an example of a structured abstract data type.
Pascal, say, provides sets where the element type is an index type - int or enumerated types. The reason is that these are easily and efficiently implemented as bit-maps. It is not possible to have a set of alphanumeric words for example and of course each set must be finite.
The sieve of Eratosthenes
is an algorithm that is over two thousand years old.
It is used to calculate prime numbers between 2
Note that the algorithm was expressed in English. Each step is sufficiently simple that it is obvious how to carry it out without giving the matter very much thought. More precision is needed to express the algorithm in a programming language. The set data type is provided in some programming languages. The repetitive nature of the algorithm is implemented by loops or recursion:
-- L.Allison, 1984 ©