The Programmer's View of MARS
H. Kopetz, G. Fohler, G. Grünsteidl, H. Kantz, G. Pospischil, P. Puschner,
J. Reisinger, R. Schlatterbeck, W. Schütz, A. Vrchoticky, R. Zainlinger
Institut für Technische Informatik, Technische Universität Wien
A-1040 Vienna, Austria, E-mail: [email protected]
The systematic development of fault-tolerant real-time systems with guaranteed timeliness requires an appropriate system architecture and a rigorous design methodology. We propose a system with strict separation of the issues of synchronization, dependability aspects, and data transformation. Dependability aspects (error handling and redundancy management) are handled by the architecture. Synchronization and programming in the large are handled at the design level. The programmer only has to be concerned with a sequential program that must meet a so-called "time budget". The architecture and many tools have been implemented and can be demonstrated on a fault-tolerant prototype application.
Distributed real-time computer systems are replacing conventional mechanical or hydraulic control systems in many applications, e.g., flight control systems in airplanes, "drive by wire" systems in automobiles, and industrial process control systems. In addition to the specified functional capabilities, these applications demand predictable timeliness and specified levels of non-functional attributes, such as reliability, safety, and maintainability.
At present, the process of designing real-time systems is tedious and often unsystematic. Frequently the primary focus during the design is on the functional capabilities of the planned control system. Concerns about timeliness and dependability are usually deferred until the final testing phases, when all parts of the system have to be integrated. The application code implementing the specified transformations in the data domain is most often intertwined with the code for the synchronization of concurrent tasks and the code for error handling and recovery. As a consequence, it is very difficult to establish the timeliness of these systems by formal reasoning or by a constructive test methodology. Furthermore, minor changes in one part of the system can have a major effect on the timeliness of some other part.
We propose a system architecture, MARS, which supports a strict separation of the issues of synchronization and timeliness, data transformation, and the dependability aspects (e.g., error detection, error handling, and redundancy management).
In our view, such a separation of issues is only possible if the system architecture is time-triggered, i.e., all system activities are initiated as a consequence of the progression of real time. Although the occurrence of events in the environment is outside the sphere of control of the computer system, the points in time when these events are to be recognized by the computer are predetermined in a time-triggered architecture. This is in contrast to event-triggered architectures, where the system activities are initiated as a consequence of the occurrence of external or internal events.
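The contrast between the two activation styles can be illustrated with a small simulation. The following Python sketch is not taken from the MARS implementation; the function names, the integer time units, and the periodic sampling grid are assumptions made only for illustration. It shows that in the time-triggered case the recognition instants lie on a grid fixed before run time, whereas in the event-triggered case they follow the events themselves.

```python
from typing import Dict, List


def time_triggered_recognition(
    event_times: List[int], period: int, horizon: int
) -> Dict[int, int]:
    """Map each external event time to the sampling instant that recognizes it.

    Hypothetical model: the system observes its environment only at the
    predetermined instants 0, period, 2*period, ... up to `horizon`.
    """
    pending = sorted(event_times)
    recognized: Dict[int, int] = {}
    for now in range(0, horizon + 1, period):
        # Everything that occurred since the previous sample is observed now.
        while pending and pending[0] <= now:
            recognized[pending.pop(0)] = now
    return recognized


def event_triggered_recognition(event_times: List[int]) -> Dict[int, int]:
    """Event-triggered contrast: each event initiates activity when it occurs."""
    return {t: t for t in event_times}


# Events occur at times the computer system cannot control.
events = [3, 11, 12, 27]
tt = time_triggered_recognition(events, period=10, horizon=30)
et = event_triggered_recognition(events)

print(tt)  # recognition instants are multiples of the period
print(et)  # recognition instants coincide with the event times
```

In this sketch the set of activation instants in the time-triggered case is independent of the event pattern, which is what limits the number of behaviors to be analyzed; the price is a recognition delay of up to one period.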
Event-triggered real-time architectures are assumed to provide a high degree of flexibility and have therefore received considerable attention in the literature (ARTS, MAFT). Because of their event-triggered nature, however, an excessive number of possible behaviors must be analyzed in order to establish timeliness guarantees. Furthermore, the implementation of active redundancy by the replication of components is hard because of the issue of replica determinism. DELTA4 proposes the implementation of the rather complex "leader-follower" model to overcome the latter difficulty. Other event-triggered architectures, e.g., Spring, do not consider the issues of fault tolerance at all.
This paper is organized as follows. In the next section we give a short overview of the architecture and the separation of design issues from programming. The following two sections describe the tasks of the designer and the programmer in more detail.
2 Separation of Concerns
Our proposed system architecture supports a strict separation of the issues of synchronization and timeliness, data transformation, and the dependability aspects. The designer is responsible for handling these issues as far as they concern inter-task relationships, the programmer handles them at the task level, and the system architecture provides the base services for both. In the following we outline the MARS architecture and the concerns of the designer and the programmer, to provide the essentials needed to read the rest of the paper.
On the architectural level a MARS System is a distributed computer system that consists of a number of autonomous, fail-silent node computers called components which are interconnected by a real-time net-