[Bioperl-l] Bio::AlignIO::metafasta tests

Sendu Bala sb at mrc-dunn.cam.ac.uk
Tue Jun 6 11:40:05 EDT 2006


Chris Fields wrote:
> Sendu,
> 
> This is Heikki's original submission for the specs for meta format:
> 
> http://article.gmane.org/gmane.comp.lang.perl.bio.general/1370/match=meta+fasta
> 
> So it's really a specialized FASTA format used to store meta information
> about sequences.  Seems mainly useful for amino acid sequences, but is
> extended to include properties of nucleotides like DNA content, RNA sec.
> structure, and so on.  

Thanks. It's not really clear to me if the meta data needs to be 
considered in the context of an alignment. That is, if you have two meta 
sequences with the same primary sequence, will all their meta data 
necessarily be the same? Or could they be different?

If the same, then the test data and test need to be fixed so my patched 
version of Bio::AlignIO::metafasta passes the tests.

If different, how should the meta data be handled? As the test implies 
with its expected value for the consensus (just treat the primary 
sequence and all meta data as one long string)?
Is it really the intent to include characters from the meta data names 
when considering what symbols we've seen with the symbol_chars() method?
Do we include the meta data name symbols when numbering?
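
To make the two readings concrete, here is a rough sketch in plain Python 
(not BioPerl code). The record layout, the example values and both helper 
names are made up for illustration, and nothing here reproduces what 
symbol_chars() actually does; it only shows how the set of "seen" symbols 
changes once meta data names and values are folded into one string.

```python
# Hypothetical record: a primary sequence plus one named meta data track.
record = {
    "primary": "ACDE",
    "meta": {"charge": "NNAA"},
}


def symbols_primary_only(rec):
    """Symbols seen when only the primary sequence is considered."""
    return set(rec["primary"])


def symbols_everything(rec):
    """Symbols seen when the primary sequence, meta data names and meta
    data values are all treated as one long string."""
    text = rec["primary"]
    for name, values in rec["meta"].items():
        text += name + values
    return set(text)


# Symbols that only appear because the meta data (and its name) was included.
extra = symbols_everything(record) - symbols_primary_only(record)
```

Under the second reading, the letters of the name "charge" count as 
sequence symbols, which is the behaviour being questioned.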

Thoughts, anyone?

