page 1  (15 pages)
2to next section

Approximate Matching of Network Expressions with Spacers*

Eugene W. Myers

TR 92-5

ABSTRACT

Two algorithmic results are presented that are pertinent to the matching of patterns of interest in macromolecular sequences. The first result is an output sensitive algorithm for approximately matching network expressions, i.e., regular expressions without Kleene closure. This result generalizes the O (kn ) expected-time algorithm of Ukkonen for approximately matching keywords [Ukk85]. The second result concerns the problem of matching a pattern that is a network expression whose elements are approximate matches to network expressions interspersed with specifiable distance ranges. For this class of patterns, it is shown how to determine a backtracking procedure whose order of evaluation is optimal in the sense that its expected time is minimal over all such procedures.

Key words: Approximate Match, Backtracking, Network Expression, Proximity Search

January 16, 1992

Department of Computer Science

The University of Arizona

Tucson, Arizona 85721

*This work was supported in part by the National Institutes of Health under Grant R01 LM04960-01 and the Aspen Center for Physics.