Reducing Write Latencies for Shared Data in a
Multiprocessor with a Multistage Network
Fredrik Dahlgren and Per Stenström
Department of Computer Engineering, Lund University
P.O. Box 118, S-221 00 Lund, Sweden
The performance of cache coherence protocols can be severely restricted by the consistency model of the architecture. If a packet-switched, cyclic network is used, such as a multistage network, pipelining of memory requests may violate strict consistency models such as sequential consistency.
In this paper, we show that store requests can be pipelined if the implementation of the cache coherence protocol meets a few constraints. The new requirements are applied to a previously proposed cache coherence protocol for a MIN-based network. The most important constraints are (1) augmenting the protocol with the notion of write-permission, (2) avoiding redundant paths between any two caches, and (3) using a distributed cache coherence protocol. We show how the ideas can be generalized to a wide class of cache coherence protocols.
An important design issue for shared-memory multiprocessors is reducing the performance degradation caused by network latency. One approach is to use packet-switched multistage interconnection networks (MINs), which allow memory requests to be pipelined. To reduce network and memory contention, it is widely agreed that shared-memory multiprocessors must rely on private caches, as shown in Figure 1.
A private cache organization introduces the cache coherence problem. In previous work, we proposed a new hardware scheme for maintaining cache coherence in MIN-based networks. In that paper, we demonstrated that the scheme satisfies the general coherence property. General coherence means that if two processors issue writes to the same location, all processors will consistently observe one of the writes after some finite amount of time.
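The general coherence property can be illustrated with a minimal sketch. This is not the protocol proposed in the paper; it is a toy model, assuming that each cache simply keeps the last write it receives. The names `apply_writes` and `generally_coherent` are hypothetical and introduced only for this illustration.

```python
def apply_writes(arrival_order):
    """Return the value a cache holds after applying writes in arrival order.

    Toy assumption: a later write simply overwrites an earlier one.
    """
    value = None
    for written in arrival_order:
        value = written
    return value


def generally_coherent(written_values, final_views):
    """Check the general coherence property on the final state.

    All processors must observe the same value, and that value must be
    one of the values actually written.
    """
    observed = set(final_views)
    return len(observed) == 1 and observed <= set(written_values)


writes = ["A", "B"]  # two processors write to the same location

# Every cache receives the writes in the same order: all converge on "B".
same_order = [apply_writes(["A", "B"]) for _ in range(3)]

# One cache receives the writes in the opposite order: the views diverge.
mixed_order = [
    apply_writes(["A", "B"]),
    apply_writes(["B", "A"]),
    apply_writes(["A", "B"]),
]

print(generally_coherent(writes, same_order))   # True
print(generally_coherent(writes, mixed_order))  # False
```

In this toy model the property holds when all caches apply the two writes in the same order and fails when one cache sees them reversed. This is why the order in which writes reach the caches matters once store requests are pipelined through the network.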