[Bioperl-l] Possible bug in Bio::Tools::SeqStats->get_mol_wt?

Roy Chaudhuri roy.chaudhuri at gmail.com
Thu Mar 24 12:08:22 EDT 2011


Have a good break,

On 24/03/2011 15:47, Chris Fields wrote:
> On Mar 24, 2011, at 8:47 AM, Roy Chaudhuri wrote:
>> Hi all,
>> I have discovered a possible bug in Bioperl, although maybe it's my
>> expectations that are wrong, not the code.
>> I noticed that when calculating molecular weights for a bunch of
>> protein sequences using Bio::Tools::SeqStats->get_mol_wt, the
>> values I was getting were slightly different from the ones given by
>> Emboss pepstats. This was due to my protein sequences ending with
>> *, since they were derived from translating annotated genes
>> including the stop codon. Surprisingly (to me, at least)
>> Bio::Seq->length gives a value that counts the terminal *, so one
>> greater than the number of amino acids. SeqStats->get_mol_wt calls
>> Bio::Seq->length to determine the number of water molecules to
>> subtract from the total molecular weight, so the reported weights
>> for my sequence were the weight of one water molecule less than
>> they should have been. I'm not sure if this is a bug in get_mol_wt,
>> in Bio::Seq->length, or if it's bad practice to use protein
>> sequences with a terminal asterisk (I've never had a problem doing
>> so before).
> The method should account for the possibility that '*' is present;
> should be easy enough to fix with something like:
> my $len = $seq =~ tr/A-Za-z/A-Za-z/;
> I'm not able to do this right away (on fam vacation), can you file
> this on our new bug server?
> http://redmine.open-bio.org
>> Cheers, Roy.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> chris
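
For anyone reading this thread later, here is a minimal sketch of the off-by-one described above. It does not reproduce the actual SeqStats code. The example peptide, the variable names, and the "one water per peptide bond" arithmetic are assumptions made for illustration. It uses the counting form of tr/// suggested in the reply, with an empty replacement list, which counts without modifying the string.

```perl
use strict;
use warnings;

# Hypothetical peptide translated through its stop codon, hence the trailing '*'.
my $seq = 'MKVLA*';

# length() counts every character, including the terminal '*'.
my $raw_len = length $seq;

# tr/// with an empty replacement list only counts the matching
# characters; $seq itself is left unchanged.
my $residues = ($seq =~ tr/A-Za-z//);

# Assumption for illustration: one water is lost per peptide bond,
# so the number of waters is (number of residues - 1).
my $waters_from_length = $raw_len - 1;
my $waters_from_count  = $residues - 1;

# Approximate average mass of H2O in daltons.
my $water_mass = 18.015;

# Extra mass subtracted when the '*' is counted as a residue.
my $excess = ($waters_from_length - $waters_from_count) * $water_mass;

printf "length=%d residues=%d excess water subtracted=%.3f Da\n",
    $raw_len, $residues, $excess;
```

With a single trailing '*', the two counts differ by one, so the weight comes out low by one water, which matches the discrepancy against pepstats reported above. Stripping the asterisk before building the sequence object (for example with s/\*$//) would be an alternative workaround on the caller's side.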
