page 1  (17 pages)
2to next section

Neweyes: A System for Comparing Biological Sequences Using the Running Karp-Rabin Greedy String-Tiling Algorithm1

Michael J. Wise

Department of Computer Science,
University of Sydney, Australia
[email protected]
FAX: +61 2 692 3838

Keywords: Efficient sequence search, structure matrices, transposed subsequences.

A system for aligning nucleotide or amino acid biosequences is described. The system, called Neweyes, employs a novel string matching algorithm, Running Karp-Rabin Greedy String Tiling (RKR-GST), which involves tiling one string with matching substrings of a second string. In practice, RKR-GST has a computational complexity that appears close to linear. With RKR-GST, Neweyes is able to detect transposed substrings. Neweyes also supports a form of matching-bygroup that gives the effect of different amino acid mutation matrices. Neweyes can be used in a macro mode (searching a database for a list of biosequences that are similar to a given biosequence) or in a micro mode, where two biosequences are compared and more detailed output formats are available.

1. Submitted to Software - Practice and Experience

- 1 -