Abstract -- For a given control program different control unit (CU) architectures can result in very different sizes of the implementation. In the context of silicon compilation, CU synthesis should be able to automatically map a CU program onto a number of architectures and eventually choose an architecture optimal for the particular control program. In this paper an algorithm will be presented for mapping a state transition graph (STG) onto four basic target architectures aimed at reducing the CU area.
Automatic synthesis of datapath architectures from a program description is possible today and a whole range of solutions can be generated depending on performance constraints. For control unit (CU) synthesis there is no corresponding architectural synthesis. The general approach for control unit (CU) synthesis has been to generate and optimize logic for given, fixed control architectures. For a given control program different control architectures can, however, result in very different size and speed of the implementation. Therefore, in the context of silicon compilation, CU synthesis should be able to automatically map a single CU program onto a number of architectures and eventually automatically choose an architecture, optimal for the particular control program at hand and driven by performance constraints.
In this paper algorithms will be presented for mapping a STG onto a number of architectures. Four basic target architectures, which are aimed at reducing the CU area, are considered.
II. Target architectures
In many CU programs it is possible to find sequences that occur in several places in the STG. Two sequences are identical iff they produce the same output sequences for all possible input sequences. In figure 1 the state sequence a1...d1 and a2...d2 are identical.
The basic idea of a number of architectures is to somehow reuse these state sequences so that only one copy is needed in the implementation, thereby reducing the area of the CU. The
Control Unit Architecture Synthesis
Department of Computer Engineering, Lund University P.O. Box 118, S-221 00 Lund, Sweden e-mail: [email protected]
STG in the figure will be used to demonstrate how different architectures takes advantage of these sequences.
A typical example is the usage of subroutines which can be found in many microprogrammed CUs. There are numerous ways of implementing a subroutine mechanism. One possibility is to use a stack for saving and later restoring the current state. Also the traditional FSM built with a PLA and D-flipflops can easily be extended with a subroutine mechanism by replacing the state registers with a stack of state registers (figure 2).
This mechanism differs from the usual microprocessor type of subroutines in that the return operation returns the FSM to the calling state, not the state after the calling state. Traversing the STG in figure 2 would therefore lead to the sequence 1,2,a,b,c,d,2,3. This explains the need for an extra flip-flop to distinguish between the two outgoing edges of state 2. In  it is shown that characteristics for this architecture can be found in the STG and an algorithm for efficient mapping of an STG onto this architecture is also presented. This archi-
Figure 1: Identical sequences found in one STG.
3b 2 c2 d2
a1 b1 c1 d1
-/1 -/2 -/3 -/4 -/5 -/6
-/7 -/2 -/3 -/4 -/5 -/8
Figure 2: FSM with subroutine mechanism.
1 2 3