Fast Statistical Alignment

Cited 234 times in Web of Science; cited 262 times in Scopus
Authors

Bradley, Robert K.; Roberts, Adam; Smoot, Michael; Juvekar, Sudeep; Do, Jae Young; Dewey, Colin; Holmes, Ian; Pachter, Lior

Issue Date
2009-05
Publisher
Public Library of Science
Citation
PLoS Computational Biology, Vol.5 No.5, p. e1000392
Abstract
We describe a new program for the alignment of multiple biological sequences that is both statistically motivated and fast enough for problem sizes that arise in practice. Our Fast Statistical Alignment program is based on pair hidden Markov models which approximate an insertion/deletion process on a tree and uses a sequence annealing algorithm to combine the posterior probabilities estimated from these models into a multiple alignment. FSA uses its explicit statistical model to produce multiple alignments which are accompanied by estimates of the alignment accuracy and uncertainty for every column and character of the alignment (previously available only with alignment programs which use computationally-expensive Markov Chain Monte Carlo approaches), yet can align thousands of long sequences. Moreover, FSA utilizes an unsupervised query-specific learning procedure for parameter estimation which leads to improved accuracy on benchmark reference alignments in comparison to existing programs. The centroid alignment approach taken by FSA, in combination with its learning procedure, drastically reduces the amount of false-positive alignment on biological data in comparison to that given by other methods. The FSA program and a companion visualization tool for exploring uncertainty in alignments can be used via a web interface at http://orangutan.math.berkeley.edu/fsa/, and the source code is available at http://fsa.sourceforge.net/.
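As a rough illustration of how pairwise posterior probabilities could be turned into a per-column confidence estimate, the sketch below averages the posterior match probabilities over every pair of characters placed in one column. This is a toy scoring rule chosen for clarity, not FSA's actual accuracy measure or code; the function name `column_confidence`, the data layout, and the example probabilities are all hypothetical.

```python
from itertools import combinations


def column_confidence(column, posteriors):
    """Mean pairwise posterior match probability for one alignment column.

    column     -- dict mapping sequence name -> residue index for the
                  sequences that have a character (not a gap) in the column.
    posteriors -- dict mapping ((seq_a, i), (seq_b, j)) -> probability that
                  residue i of seq_a is aligned to residue j of seq_b, keyed
                  with the alphabetically smaller sequence name first.
                  Missing pairs are treated as probability 0.0.

    Hypothetical helper for illustration; not part of the FSA program.
    """
    cells = sorted(column.items())          # fixes the key order (seq_a < seq_b)
    pairs = list(combinations(cells, 2))    # every pair of aligned characters
    if not pairs:
        return None                         # one character: no pairwise evidence
    total = sum(posteriors.get((a, b), 0.0) for a, b in pairs)
    return total / len(pairs)


# Made-up posteriors for three sequences x, y, z.
example_posteriors = {
    (("x", 0), ("y", 0)): 0.9,
    (("x", 0), ("z", 1)): 0.6,
    (("y", 0), ("z", 1)): 0.3,
}
example_column = {"x": 0, "y": 0, "z": 1}
score = column_confidence(example_column, example_posteriors)
```

With these made-up inputs the score is the mean of the three pairwise probabilities, so a column held together by weakly supported pairs receives a visibly lower value than one whose pairs are all well supported.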
ISSN
1553-734X
URI
https://hdl.handle.net/10371/201379
DOI
https://doi.org/10.1371/journal.pcbi.1000392
Related Researcher

  • College of Engineering
  • Department of Electrical and Computer Engineering
Research Area: Algorithm-system co-design for AI applications, AI-powered Big Data Management, Generative AI, Large Language Model, ML, High-performance large-scale AI data analysis and processing, Modal AI

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.

Share