page 1  (10 pages)
2to next section

Diagnosis of Ill-typed Programs

Venkatesh Choppella and Christopher T. Haynes

Indiana University, USA

fchoppell,[email protected]


A framework, based on syntactic and type constraints, is provided for defining program slices that contribute to a given type error or similar syntactic property. We specify soundness, minimality and completeness criterion for these slices and outline an algorithm for their lazy generation.

1 Introduction

Error diagnosis in current type-reconstruction algorithms either misses information that is relevant, presents irrelevant details, or both. We propose a general framework that describes a type error in terms of the error's symptom," type constraints logically deriving the symptom, and program slices contributing the type constraints that derive the symptom.

We present two frameworks for diagnostic reasoning that subsume the analysis of type errors. In the first framework, based on propositional logic, a program slice is a set of program subforms that contribute to a given type constraint. We call these slices location sets. The second, based on first-order logic, is a more fine-grained framework in which a program slice is a set of relations (syntactic constraints) between subforms that lead to a given type constraint. We call these program slices partial syntactic descriptions.

Our diagnostic error analysis is achieved using a variant, due to Port [17], of the standard unification algorithm, which we call diagnostic unification. This algorithm provides information from which it is possible to directly obtain the set of all minimal proofs of the derivation of a given type constraint in a deduction system consisting of type rules and type constraints. A proof is minimal if the result of deleting any step or set of steps is ill-formed. A type error is signaled by a type constraint that is trivially unsatisfiable. Section 2 reviews related work. Section 3 presents an

extended example of our technique based on the simplytyped lambda calculus with constants and if. Section 4 formalizes diagnostic analysis of type errors in the framework of propositional logic, including soundness, minimality and completeness criteria for location sets. A sound and complete algorithm that computes location sets for the simplytyped lambda calculus is presented. Section 5 formalizes the notions of syntactic constraints, partial syntactic descriptions, and soundness and completeness criteria for judging algorithms that generate partial syntactic descriptions. Section 6 adapts these techniques to ML-style polymorphism and polymorphic recursion. Section 7 concludes with directions for further work, including a discussion of how the framework of this paper may be useful in arriving at a general framework for diagnostic analysis of program properties.

2 Related work

Wand [21] was among the first to address the problem of reporting type errors. He modified the traditional unification algorithm to accumulate reason lists" that explained" the introduction and binding of type variables. He speculated that a completeness result" might be provable for his algorithm, but did not formulate one. We formalize soundness criteria which Wand's algorithm fails to satisfy. For example, for >=x>=y:f(y x)(y 3)(not x), Wand's algorithm returns the reason list f(y 3); (not x)g. This set does not imply enough type constraints to derive the type error.

The network flow approach of Johnson and Walz [9, 20] attempts to derive all multiply contradictory hypotheses" of the type of a program variable and assign them relative frequencies. The authors claim that their implementation associates text with type errors, but the details are not published. They provide no correctness criterion or characterization of their complex algorithm.

Gomard [6] isolates the untypable parts of a program with a two-level syntax and type rules that employ a special type called untyped." When an expression with well-typed subexpressions is ill-typed, the entire expression is flagged with no indication of which parts of the subexpressions contributed to the error. In other words, although his technique