[Bioperl-l] Quickest Codon Based MSA?
johan.nilsson at sh.se
Thu Jan 24 17:33:42 EST 2008
I have a question which might not necessarily be related to Bioperl,
although I do believe the expertise is available here. I have a couple
of thousand FASTA files, each containing 20 CDS sequence orthologues of
rather high sequence similarity. I would like to create a codon-based
multiple sequence alignment for each of these FASTA files (i.e. a
nucleotide sequence alignment inferred from alignment of the translated
peptide sequences, to ensure that no frameshifts are introduced). I first
tried running Dialign2, which can perform the
translation/back-translation in one go, but this turned out to be far
too slow. I next tried to build protein alignments using ClustalW and
subsequently built the coding region alignment using EMBOSS 'tranalign',
but this was also too slow.
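
To make the goal concrete, the back-translation step (the part that
tranalign performs once a protein alignment exists) can be sketched in
Python as below. This is only a minimal illustration, not the tranalign
implementation: the function names thread_codons and codon_alignment are
hypothetical, and the sketch assumes that each CDS is in frame, has its
stop codon removed, and translates to exactly the ungapped peptide in
the protein alignment.

```python
from typing import Dict


def thread_codons(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Thread the codons of `cds` onto one gapped protein sequence.

    Each residue consumes the next codon; each protein gap becomes a
    three-character nucleotide gap, so gaps never split a codon.
    """
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    n_residues = sum(1 for ch in aligned_protein if ch != gap)
    if n_residues != len(codons):
        raise ValueError("protein and CDS lengths do not correspond")

    codon_iter = iter(codons)
    out = []
    for ch in aligned_protein:
        out.append(gap * 3 if ch == gap else next(codon_iter))
    return "".join(out)


def codon_alignment(protein_aln: Dict[str, str],
                    cds_seqs: Dict[str, str]) -> Dict[str, str]:
    """Apply thread_codons to every sequence, matched by identifier."""
    return {name: thread_codons(aligned, cds_seqs[name])
            for name, aligned in protein_aln.items()}


if __name__ == "__main__":
    # Toy input: two orthologues, one missing a single residue.
    protein_aln = {"seqA": "MK-L", "seqB": "MKAL"}
    cds_seqs = {"seqA": "ATGAAACTG", "seqB": "ATGAAAGCGCTG"}
    for name, row in codon_alignment(protein_aln, cds_seqs).items():
        print(name, row)
```

The threading itself is linear in the alignment length, so in a
pipeline of this shape the protein alignment step would be expected to
dominate the running time.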
Is there any method available that significantly speeds up
codon-preserving alignment? As I mentioned, the sequences to be
aligned are in general very conserved, so any heuristic taking advantage
of the low divergence would be very helpful! Also, is there any
adjustable parameter in Dialign2/Dialign-T that might speed up the
program when looking at highly similar sequences?