[Bioperl-l] Questions on Representing Protein Ambiguity

James Thompson tex at biosysadmin.com
Sun Oct 3 06:15:04 EDT 2004


Thanks for the feedback. You're definitely right about consensus sequences
being relatively worthless when compared to the information contained in the
whole profile.

On Friday afternoon I committed some changes to ProtMatrix.pm that allow the
regexp method to take a threshold as an argument, and the behavior is not too
hard to change if needed.
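
To make the threshold idea concrete, here is a minimal sketch in Python. It is
not the code in ProtMatrix.pm and does not use any BioPerl API: the profile
layout, the probabilities, and the names profile_to_regexp and
profile_to_consensus are all assumptions made up for illustration. It shows
both flattening options from the thread: the bracketed character-class pattern
and the "most probable residue, else X" consensus.

```python
# Hypothetical profile: one dict per position, mapping residue -> probability.
# The numbers are invented for illustration only.
profile = [
    {"M": 1.0},
    {"E": 0.6, "S": 0.4},
    {"N": 1.0},
    {"I": 0.5, "A": 0.3, "P": 0.2},
    {"S": 1.0},
]


def profile_to_regexp(profile, threshold=0.1):
    """Build a bracketed pattern, keeping residues at or above the threshold."""
    parts = []
    for column in profile:
        # Most probable residue first; ties broken alphabetically.
        kept = sorted(
            (res for res, prob in column.items() if prob >= threshold),
            key=lambda res: (-column[res], res),
        )
        if not kept:
            parts.append(".")  # nothing passes the threshold: match anything
        elif len(kept) == 1:
            parts.append(kept[0])
        else:
            parts.append("[" + "".join(kept) + "]")
    return "".join(parts)


def profile_to_consensus(profile, threshold=0.5):
    """Take the most probable residue, or X if it falls below the threshold."""
    residues = []
    for column in profile:
        best = max(column, key=column.get)
        residues.append(best if column[best] >= threshold else "X")
    return "".join(residues)


print(profile_to_regexp(profile, threshold=0.1))
print(profile_to_regexp(profile, threshold=0.35))
print(profile_to_consensus(profile, threshold=0.55))
```

Raising the threshold moves the pattern toward the plain consensus, which is
where the information loss discussed below comes from.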

The Bio::Tools::dpAlign idea looks interesting; I'd never seen it before
myself. Sometime down the road I'll look into making it use matrices from the
Bio::Matrix::PSM family. Right now I'll work on making sure all of my code is
release-worthy. :)

James Thompson

On Fri, 1 Oct 2004, Aaron J. Mackey wrote:

> On Sep 30, 2004, at 10:49 PM, James Thompson wrote:
> > An alternative would be to borrow an idea from Perl's regex character
> > classes and represent multiple residues at a position inside of a set of
> > brackets, like this:
> >
> > M[ES]N[IAP]S
>
> In general, you're always going to lose information moving from a 
> profile to a flat pattern.  This option prevents losing all the 
> information that flattening to "MENIS" would (although MENIS is a 
> reasonable "consensus" in this case), but there's still information 
> loss.  So in that sense it isn't really a better solution than "just 
> take the most probable residue, unless it's less than some threshold, 
> in which case X".
> I think the whole idea of a consensus sequence from a profile is a bit 
> worthless, to be honest.  What are you supposed to be able to do with 
> the consensus, search with it?  That's what the profile is for in the 
> first place ... [ speaking of which, I'd love to see 
> Bio::Tools::dpAlign make use of these protein profiles ].
> -Aaron
> --
> Aaron J. Mackey, Ph.D.
> Dept. of Biology, Goddard 212
> University of Pennsylvania       email:  amackey at pcbi.upenn.edu
> 415 S. University Avenue         office: 215-898-1205
> Philadelphia, PA  19104-6017     fax:    215-746-6697
