[Bioperl-l] SimpleAlign Ids

Chris Fields cjfields at illinois.edu
Thu Apr 22 19:31:01 EDT 2010


It's something to take into consideration if the SoC project goes through.  We should know for sure next week.

chris

On Apr 22, 2010, at 4:42 PM, Mark A. Jensen wrote:

> Hi Bernd,
> I'm for the idea of an internal uid for sequences; tree nodes already work like this, for example, and I think it's the Right Thing To Do.
> MAJ
> ----- Original Message ----- From: "Bernd Web" <bernd.web at gmail.com>
> To: "BioPerl List" <bioperl-l at bioperl.org>
> Sent: Wednesday, April 21, 2010 1:08 PM
> Subject: [Bioperl-l] SimpleAlign Ids
> 
> 
>> Hi
>> 
>> Would it be an idea to change SimpleAlign/AlignIO so that they do not
>> use sequence IDs as the hash keys under which sequences are stored?
>> Quite regularly I run into issues with alignments that do not have
>> unique IDs. This occurs especially with alignments from the CDD at NCBI.
>> When I know which input format I (or a user) will be using, I have a
>> pre-processing step to make all IDs unique.
>> However, when the input format can change every time, it is really
>> handy to use SimpleAlign with the format guesser.
>> If the sequence objects were stored using unique keys instead of
>> IDs, this issue would not occur.
>> I can imagine that for others using large alignments this might not be
>> handy, as I suppose an extra lookup step would be needed.
>> 
>> Is the above an issue that others run into too?
>> 
>> Kind regards,
>> Bernd
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 
> 
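
The idea discussed in this thread, storing sequences under an internal unique key rather than under their display ID, can be sketched roughly as follows. This is only an illustration in Python, not BioPerl code and not how SimpleAlign is implemented; the class name `Alignment` and the methods `add_seq`, `each_seq` and `seqs_with_id` are hypothetical names chosen for the example.

```python
from collections import defaultdict
from itertools import count


class Alignment:
    """Toy alignment container keyed by an internal uid, not the display ID.

    Hypothetical sketch: names and structure are assumptions for illustration.
    """

    def __init__(self):
        self._next_uid = count(1)        # source of internal unique keys
        self._seqs = {}                  # uid -> (display_id, sequence)
        self._by_id = defaultdict(list)  # display_id -> [uid, ...]

    def add_seq(self, display_id, sequence):
        # Duplicate display IDs are fine: each sequence gets its own uid.
        uid = next(self._next_uid)
        self._seqs[uid] = (display_id, sequence)
        self._by_id[display_id].append(uid)
        return uid

    def each_seq(self):
        # Python dicts preserve insertion order, so alignment order is kept.
        return list(self._seqs.values())

    def seqs_with_id(self, display_id):
        # The "extra lookup step": display ID -> uids -> sequences.
        return [self._seqs[uid][1] for uid in self._by_id.get(display_id, [])]


aln = Alignment()
aln.add_seq("gi|123", "ACGT-A")
aln.add_seq("gi|123", "ACGTTA")  # same display ID, stored separately
aln.add_seq("gi|456", "AC-TTA")
print(len(aln.each_seq()))
print(aln.seqs_with_id("gi|123"))
```

In this sketch a lookup by display ID returns a list, since the ID is no longer guaranteed to be unique; the secondary index is the cost of the extra lookup step mentioned in the original message.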



