
Preface
The first generation of digital control system implementations, still widely used, is based on the single general-purpose processor architecture arising from the von Neumann model. Further, the development of these systems, measured in terms of increased speed and improved performance, has historically been linked to developments in computing technology. This has led to the use of special-purpose digital signal processors and, subsequently, control-dedicated 'digital control processors', both of which use the uniprocessor architecture.
Rapid developments in very large scale integration (VLSI) circuit technology now permit the fabrication of multiple processing elements (PEs) onto a single silicon chip. This, in turn, has led to the development of architectures in the form of systolic and wavefront arrays which have been the subject of intensive research, motivated by a number of computationally intensive applications areas. For example, in digital signal processing, an area which shares strong structural links with control, these arrays have been applied to problems such as correlation and convolution, infinite impulse response filters, fast Fourier transform, orthogonal triangularization and LU decomposition.
In effect, a systolic array is a network of processors which rhythmically computes and passes data through the network. Its major features are (a) modularity and regularity in terms of VLSI implementation, (b) linear-rate pipelinability, (c) spatial and temporal locality, (d) data input to the array only pipelined through boundary PEs, and (e) synchrony of data flow and computation. A wavefront array has the same architecture but here the control of data flow and computation is self-timed and data-driven, rather than controlled by a synchronized global clock. At word-level, these two types of architecture directly map an algorithm onto the hardware PE level and in this section they are applied to control problems.
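As an illustration of features (b), (d) and (e), the following is a minimal software sketch of a linear systolic array computing a convolution (FIR filter), one of the signal processing problems mentioned above. It is not taken from any chapter of this book: the function name `systolic_fir`, the register names and the choice of a 'weights stay, data move' design are assumptions made for the example. Each PE holds one weight, data enter only at the left boundary, and all registers are updated together on each simulated clock tick.

```python
def systolic_fir(weights, samples):
    """Simulate a linear systolic array computing y[i] = sum_k w[k] * x[i-k].

    Hypothetical sketch: PE k stores weights[k]. Samples move right through
    two registers per PE, partial sums move right through one register per
    PE, so each partial sum meets the correct sample at every PE.
    """
    num_pes = len(weights)
    x_a = [0.0] * num_pes    # first x delay register in each PE
    x_b = [0.0] * num_pes    # second x delay register (x output of each PE)
    y_reg = [0.0] * num_pes  # partial-sum output register of each PE
    outputs = []

    # Append num_pes zeros so the last results are flushed out of the pipeline.
    for x_in in list(samples) + [0.0] * num_pes:
        # Each PE reads only from its left neighbour; PE 0 reads the boundary.
        x_inputs = [x_in] + x_b[:-1]
        y_inputs = [0.0] + y_reg[:-1]

        # The right boundary PE delivers one result per clock tick.
        outputs.append(y_reg[-1])

        # Synchronous update of every register (the global clock edge).
        y_reg = [y + w * x for y, w, x in zip(y_inputs, weights, x_inputs)]
        x_b = x_a
        x_a = x_inputs

    # The first num_pes outputs are pipeline latency, not results.
    return outputs[num_pes:]


result = systolic_fir([1.0, 2.0, 3.0], [1.0, 0.0, 0.0, 0.0])
print(result)  # [1.0, 2.0, 3.0, 0.0], the impulse response of the filter
```

In this sketch the array produces one output per tick once the pipeline is full (throughput), but the first valid output appears only after as many ticks as there are PEs (latency). Both quantities matter when such an array sits inside a feedback loop.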
After a brief review of previous work and the relevant features of systolic/wavefront arrays, Chapter 1 considers the design of these architectures for a generic class of real-time feedback control schemes. A crucial point stressed here is that, in contrast to other areas of application, both processing speed (or throughput rate) and processing delay (or system latency) must be included in measuring the performance of a parallel architecture for real-time feedback control schemes. Chapter 2 then gives a systematic treatment of the design of systolic arrays for adaptive control schemes and includes architectures for least-squares estimation and predictive control. Finally, Chapter 3 develops systolic architectures for the Kalman filter whose various forms have found wide applications in control/digital signal processing. Hence this chapter should be seen as complementing the already existing designs.
Y. Li and E. Rogers
L. Chisci and G. Zappa
R.W. Stewart

Intelligent control is an area which is receiving increasing attention from the general control systems community. The implementation of these schemes can impose a heavy computational load and this, coupled with their inherent parallelism, requires a parallel treatment. Amongst the various candidate architectures, artificial neural networks (ANNs) are currently the subject of much research effort and, in effect, provide a self-learning capability and ultra-high processing speed by massive parallelism. Typically, an ANN consists of a very large number of processing elements (the neurons) interconnected by weights which can be updated, or trained, during operation. The actual network topology varies according to the choice of various models and activation rules.
In the context of the work reported here, the well-known back-propagation model can be interpreted as a 'partitioned complete graph' with each partition termed a layer. Further, every neuron within a layer is run in parallel and every layer is run in a pipeline. Alternatively, in the Hopfield model all neurons are interconnected to each other with no topological hierarchy.
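The layered structure described above can be sketched in a few lines. The example below is only an illustrative forward pass through a small layered network, not the back-propagation training rule itself and not code from any chapter; the names `layer_forward` and `network_forward`, the sigmoid activation and the weight values are all assumptions chosen for the example. Within a layer, each neuron's output depends only on that layer's inputs, which is why the neurons of a layer can run in parallel, while successive layers form the stages of a pipeline.

```python
import math


def sigmoid(z):
    """Logistic activation (an assumed choice of activation rule)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(weights, biases, inputs):
    """Compute one layer. Every neuron (one row of weights) is independent
    of the others in the same layer, so these evaluations could be mapped
    onto separate processing elements."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def network_forward(layers, inputs):
    """Pass a signal through the layers in order. Layer k+1 needs only the
    output of layer k, so a stream of inputs can be pipelined."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_forward(weights, biases, signal)
    return signal


# Hypothetical 2-3-1 network with hand-picked weights (not trained).
layers = [
    ([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]], [0.0, 0.0, 0.0]),  # 2 inputs -> 3 hidden
    ([[1.0, 1.0, 1.0]], [-1.5]),                              # 3 hidden -> 1 output
]
output = network_forward(layers, [0.3, -0.7])
print(output)
```

With the zero first-layer weights used here, each hidden neuron outputs sigmoid(0) = 0.5, so the single output neuron also evaluates sigmoid(0).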
These networks can be implemented in either hardware or software. Further, their self-learning property, in particular, offers the promise of powerful techniques, in comparison to existing techniques such as adaptive schemes, for solving nonlinear control problems. The three chapters in this section are a cross-section of current 'state of the art' research on the application of ANNs to control problems.
Chapter 4 critically examines the role of ANNs in the important general area of process control with emphasis on currently difficult problems in modelling, estimation and prediction. Their use as 'software sensors' is also examined. Chapter 5 develops the multivariate B-spline based ANN for adaptive or learning controllers involving least-squares estimation, stochastic approximation and nonlinear time series prediction.
The work of Chapter 5 can also be viewed as extracting useful concepts from fuzzy logic based systems and a comparison between these two types of systems is also included. This leads naturally on to Chapter 6 where the group method of data handling is mapped onto a multilayer perceptron type configured parallel network, and a detailed treatment of some related self-organizing systems and fuzzy logic based control schemes is given.
M.J. Willis, G.A. Montague, A.J. Morris and M.T. Tham
M. Brown and C.J. Harris
D.A. Linkens

Instead of silicon fabrication, VLSI-oriented architectures, such as those detailed in the previous sections, can also be mapped onto, and simulated by, a parallel network configured by the user from commercially available microprocessors. This has obvious cost benefits but also retains benefits arising from the original architecture. These include modularity, spatial and temporal locality and boundary input/output (I/O) interfacing from the systolic array.
The application of processor networks to control engineering can be traced back to the 1970s when decentralized and distributed machines began to be used in implementations. Currently the design and application of transputer based networks is a very active subject across a very broad spectrum of areas. This section contains a representative cross-section of control oriented applications.
In effect, a transputer is a microprocessor with a reduced instruction set computer (RISC) architecture, which supports parallelism at the hardware level and can be used as a 'building block' for concurrent networking. Basically, all members of the transputer family have a main processor, local memory, external memory interface and four intertransputer communication 'links' integrated on a single silicon chip, together with external interrupt and internal timers for real-time applications. The most important feature of the transputer is, from the parallel processing point of view, the on-chip inclusion of four communication links, which permit 'easy' construction of concurrent networks and 'fast' intertransputer communications. An in-depth treatment of the transputer system can be found in, for example, the references cited in Chapters 9 and 10.
Occam is the 'companion language' of the transputer system. This is a high-level language but is the lowest-level coding dedicated to transputers. It can directly map and match a concurrent procedure/module in software to one transputer in the hardware network. One source of an in-depth treatment of the features of Occam is again the references cited in Chapters 9 and 10. Note also that other languages such as Parallel C, Pascal and Fortran can be run on a transputer network under certain compilers. Given that Occam is an 'easy to read' code, it, rather than other languages or pseudo-codes, is used to demonstrate the algorithms developed in the chapters of this section.
Application to the control of robotic systems is the subject of Chapter 7, which consists of three main sections. These are (a) problem definition for the UMI RTX robotic manipulator, (b) system specification and high-level design issues using the DeMarco methodology, and (c) implementation issues. A case study is used to highlight the very strong benefits of using transputers in this very topical applications area and the ease with which they can be interfaced with other integrated circuit devices.
Target tracking, an area which should benefit significantly from appropriate application of parallel processing, is the subject of Chapter 8. The particular aspect of this very wide ranging area considered is the problem of tracking a large number of objects using information, corrupted by noise, from one or more sensors. Further, the work detailed here on track partitioning, distribution and clustering configuration should also be of interest in developing similar parallel implementations of other schemes.
Chapter 9 is based on the so-called heterogeneous system approach to parallel control and simulation problems. The motivation for this is other work which has revealed shortcomings in the ability of the transputer to cope with the demands of real-time control software. In particular, there is evidence to suggest that the architecture granularity (a measure of computation power to interprocessor communications overhead) is not appropriately matched. This, in turn, suggests that fusing the finer granularity of parallel digital signal processing (DSP) chips, such as the IMS A100, with the transputer's ability to handle irregular computations and manage parallel operations should prove highly effective.
Use of parallel processing enables the computations to be organized in a distributed sense. Hence it is possible to provide a fault tolerance capability, i.e. an operational failure results in performance degradation rather than complete operational failure. This is one of the major benefits of 'parallel processing for control' and is the subject of Chapter 10. Both software (Occam based) and hardware (transputer systems) methods are developed, including some configured to operate in real time.
M.I. Barlow, S.E. Burge and A.P. Roskilly
D.P. Atherton and D.M.A. Hussain
Y. Li and E. Rogers
A.M. Tyrrell

An alternative to VLSI-oriented systems, or to the construction of user-configured networks using 'building block' type processors, is to employ existing parallel machines. An early class of these systems is the vector processor, of which the Cray is an example, but such systems do not fully exploit parallelism. A single-instruction multiple-data (SIMD) system does permit full parallelism and examples here include the Illiac IV and the ICL Distributed Array Processor. Further, in the CDC Cyber 205, pipelining techniques are combined with parallelism to increase efficiency and yield higher performance. They are, however, less suitable for general-purpose computing where the data are not inherently in the form of 'large' uniform arrays.
In contrast to SIMD systems, multiple-instruction multiple-data (MIMD) machines are best suited for general-purpose parallelism. This type of 'shared memory multiprocessor' is more cost effective but less scaleable than 'distributed memory multicomputers', since adding processors to a shared memory system can suffer from bus saturation. Further, the scaleability of a distributed memory system is strongly dependent on the topology used.
One of the most common and successful topologies is the hypercube architecture, which provides the best trade-off between the longest path between processors and the number of physical connections each processor must have. In the case of an n-dimensional machine, for example, the architecture typically consists of 2^n processors, each of which is nearest-neighbor connected by n bidirectional and asynchronous communication channels. Examples include the Intel iPSC, Ametek System 14, NCUBE/TEN and the Connection Machine from Thinking Machines.
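The trade-off mentioned above can be made concrete with a short sketch. It assumes the usual binary labelling of an n-dimensional hypercube, in which the 2^n processors are numbered 0 to 2^n - 1 and two processors are linked when their labels differ in exactly one bit; the function names are hypothetical and are not taken from any machine's software.

```python
def hypercube_neighbors(node, n):
    """Return the n nearest neighbours of `node` in an n-dimensional
    hypercube, obtained by flipping each of the n label bits in turn."""
    return [node ^ (1 << bit) for bit in range(n)]


def hop_distance(a, b):
    """Minimum number of links between processors a and b: the number of
    bit positions in which their labels differ (Hamming distance)."""
    return bin(a ^ b).count("1")


n = 3
num_processors = 2 ** n

# Each link is counted once from each end, hence the division by two.
num_links = sum(len(hypercube_neighbors(p, n)) for p in range(num_processors)) // 2

# Longest shortest path between any pair of processors.
diameter = max(
    hop_distance(a, b)
    for a in range(num_processors)
    for b in range(num_processors)
)

print(num_processors, num_links, diameter)  # 8 12 3
```

Under this labelling the number of connections per processor and the longest path both equal n, so each grows only logarithmically with the number of processors.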
Of the various parallel machines, user reconfigurable systems provide better flexibility for different applications. Further, they also allow topology reconfiguration by programs and are best suited for the design and simulation of VLSI-oriented architectures, such as those developed in Parts I and II. An early example is the Accelerated Processors Model 10 system which has between 4 and 12 groups of 8 arithmetic logic units. Recent examples based on transputers include the Microway Quadputer, the Meiko Computing Surface System and the Parsytec clusters, and Giga Cube machines. These machines are also well suited to the development of the architectures of Part III.
Chapter 11 presents an overview of the hardware of various parallel machines from a control applications standpoint. Strong emphasis is placed on interprocessor connection/communication schemes and, in particular, the hypercube case. This is supported by brief case studies on process and vehicle control systems applications.
In Chapter 12, the subject is the use of currently available parallel machines to develop software for control systems design. This takes the form of a tutorial-style survey of the current 'state of the art', plus some open research problems using, as an illustrative basis, some algorithms and programming techniques for use with hypercubes. A very important point emerging here is that the development of parallel versions of (commonly used) control systems design algorithms is not simply a case of modifying existing (sequentially based) software to run in parallel.
K.J. Hunt
E. Rogers and Y. Li