Multiple Alignment

LA home
Computing
Bioinformatics
 Glossary
 Three
 Multiple
  ACSC17
  JME94
  HICSS25

Given k>2 sequences, the multiple alignment problem is to pad them out with null characters, `-', so that they all have the same length and to do this in such a way as to optimise some criterion -- minimise a cost or maximise a score.

The most popular choices for specifying the cost (score) function in a mutiple alignment are:

  1. All-pairs costs: The multiple alignment is "projected" onto all k×(k-1)/2 possible pairs out of the k sequences, the cost (score) of every such two-way alignment is calculated, and they are all summed.
  2. Star costs: A single hypothetical ancestor is assumed connected to each one of the given k sequences by an edge. The minimum possible sum of costs (max score) from the ancestor to each leaf is calculated.
  3. Tree-costs: The leaf sequences are related by a phylogenetic tree. The cost is the sum of the costs corresponding to each edge in the tree. The tree can be unrooted or rooted and, especially in the latter case, a "molecular clock" may be assumed, or not. This models the evolutionary process most closely, but a problem is where to get the phylogenetic tree?

The simple alignment algorithm for 2 sequences of length ~n takes O(N2)-time. The corresponding algorithm for k≥3 sequences takes O(Nk)-time. This is infeasible of N is "large" and k is more than a "few" in which case either heuristics or a stochastic search must be used. For example, each internal hypothetical node in an unrooted phylogenetic tree has three neighbours so it is possible to do repeated three-way alignment in either a greedy or a stochastic way. Alternatively a k-way alignment may be "projected" onto any k'<k of those sequences. Therefore, under tree-costs the phylogenetic tree can be "broken" on any edge and the two corresponding sub-alignments of leaves can be realigned in about O(N2)-time. One can either (Gibbs-) sample random alignments or "cool" the process towards an optimal alignment [Allison & Wallace 1994]

www

free:
Linux operating-sys
OpenOffice office-suite, ver. 2.4+
The GIMP ~photoshop
Firefox web browser
FlashBlock flash on/off

© L. Allison   http://www.allisons.org/ll/   (or as otherwise indicated),
Created with "vi (Linux + Solaris)",  charset=iso-8859-1,  fetched Saturday, 11-Oct-2008 10:13:11 EST.