
Object Template Abstractions for Light-Weight Data-Parallelism ?

Neelakantan Sundaresan Dennis Gannon

[email protected] [email protected]

Computer Science Department

215 Lindley Hall

Indiana University

Bloomington, IN 47405

Abstract

Data-parallelism is a widely used model for parallel programming. Control structures like parallel DO loops and data structures like collections have been used to express data-parallelism. In typical implementations these constructs are 'flat' in the sense that only one data-parallel operation is active at any time. A more dynamic model is required to support applications that can overlap synchronization with computation in data-parallel tasks, that have independent but limited data-parallelism, or that exhibit static or hierarchical nested parallelism. This paper describes how to combine a light-weight thread mechanism with object-oriented methodologies to provide light-weight and dynamic data-parallel constructs. We discuss the abstractions necessary to build objects with data-parallel semantics in the context of the Coir system [16, 17], a C++-based system for control and data-parallelism. We also study three classes of applications: pure data-parallel, static nested data-parallel, and hierarchical algorithms.

? This research was funded in part by ARPA contract DABT63-94-C-0029 and Rome Labs contract AF 30602-92-C-0135.

1 The Coir System

The Coir system provides a model for both control and data parallelism. It is implemented as a C++ library and is designed to build on reusable components such as standard pthreads thread libraries [10] and MPI message-passing libraries [6]. The architecture model subsumes both shared- and distributed-memory machines. The following paragraphs provide a brief description; for details on the design and implementation of Coir, refer to [15, 16, 17].

1.1 Control Parallelism

In Coir, control parallelism is modeled in terms of light-weight user-level threads. A thread is a sequence of instructions in execution within a program, with its own stack and program counter. Multiple threads can execute within a process, and threads can migrate across processes that share the same memory arena. In C++, this control mechanism is combined with inheritance, dynamic dispatch, and parameterized types to support control objects specific to user applications or to a target model of parallelism. In addition, thread objects can synchronize over shared memory using light-weight mutexes and condition variables. Thread objects within or across address domains may communicate using synchronous or asynchronous thread-to-thread communication; this mechanism can be used to support actor-style or message-driven semantics. The communication mechanism has a