A fast algorithm for the all-pairs suffix–prefix problem

Citation data:

Theoretical Computer Science, ISSN: 0304-3975, Vol: 698, Page: 14-24

Publication Year:
2017
Usage 5
Link-outs 3
Abstract Views 2
Captures 5
Readers 5
Social Media 3
Tweets 3
DOI:
10.1016/j.tcs.2017.07.013
Author(s):
Jihyuk Lim; Kunsoo Park
Publisher(s):
Elsevier BV
Tags:
Mathematics; Computer Science
Most Recent Tweet View All Tweets
article description
The all-pairs suffix–prefix problem occurs as a subproblem of DNA sequence assembly where it is the most time-consuming part of the whole assembly. Although there are algorithms for the all-pairs suffix–prefix problem which are optimal in the asymptotic time complexity, they are slower than SOF and Readjoiner which are state-of-the-art algorithms used in practice. In this paper we present an algorithm for the all-pairs suffix–prefix problem that uses a simple data structure for storing input strings and advanced algorithmic techniques for matching, which together lead to fast running time in practice. Our algorithm is 14 times faster than SOF and 18 times faster than Readjoiner on average in real datasets and random datasets.