Approximate Matching of Network Expressions with Spacers*
Eugene W. Myers
Two algorithmic results are presented that are pertinent to the matching of patterns of interest in macromolecular sequences. The first result is an output sensitive algorithm for approximately matching network expressions, i.e., regular expressions without Kleene closure. This result generalizes the O (kn ) expected-time algorithm of Ukkonen for approximately matching keywords [Ukk85]. The second result concerns the problem of matching a pattern that is a network expression whose elements are approximate matches to network expressions interspersed with specifiable distance ranges. For this class of patterns, it is shown how to determine a backtracking procedure whose order of evaluation is optimal in the sense that its expected time is minimal over all such procedures.
Key words: Approximate Match, Backtracking, Network Expression, Proximity Search
January 16, 1992
Department of Computer Science
The University of Arizona
Tucson, Arizona 85721
*This work was supported in part by the National Institutes of Health under Grant R01 LM04960-01 and the Aspen Center for Physics.