[Bioperl-l] Polyproteins, ribo slippage, and mat_peptide in viruses?

Peter biopython at maubp.freeserve.co.uk
Tue Oct 27 13:29:08 EDT 2009

On Tue, Oct 27, 2009 at 4:33 PM, Chris Larsen <clarsen at vecna.com> wrote:
> All,
> I am attempting to find some solutions to a DB loading problem we are
> encountering in viruses. It is multifold:
> Some viruses churn out a polyprotein rather than individual peptides;
> further they also slip the ribosome, so a source nucleotide is used more
> than once in translation (ribosome halts, backs up one nucleotide, and
> continues in a new frame); and finally we have post translational processing
> into mature peptides. The main thing is that the mature peptide is contained
> as a subset of the whole parent polyprotein, but is not provided as a single
> file in GBK for each mat_peptide CDS. We have to get that in order to run
> algorithms on the relevant processed proteins. Therefore we cannot directly
> load into GUS, but rather have to choose how to get the mat_peptide
> sequence. Actually I think the viruses know that, and are just messing with
> us out of spite, since we have iPods and they don't. Anyway, from anyone who
> has encountered this I seek guidance.
> We have as choices:
> 1. Get the locations of mature peptide children in /Protein/,
>    carve the mat_peptide sequence out of the whole polyprotein translation,
>    check that the mat_peptide is in fact an identical subset of the translated
>    protein, and
>    load that.
> OR
> 2. Use the locations of starts and stops in /Nucleotide/,
>    translate that, using the slippage information, and
>    get mature peptides that line up exactly to the parent polyprotein.
> If you know of BioPerl sequence handling support for this, I would love to
> hear more. Clearly this is a nonstandard thingamabob.
> Stupid viruses
> Chris

Cool viruses :)

Do you have some specific examples from GenBank? I'm starting
to deal with virus annotation in my work, so this is of interest.


P.S. As you might guess from my email address, I'm actually more
interested in Biopython than BioPerl, but the same algorithmic
approach could be tested in either.
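
To make the two options above concrete, here is a minimal sketch in plain
Python (no BioPerl or Biopython calls, so nothing here is either library's
API). Everything in it is an assumption for illustration: the 20-nucleotide
"genome", the coordinates, and the helper names splice, translate and carve
are all made up, and the slippage is modelled as a GenBank-style join whose
two segments share one nucleotide (position 9 is read twice). It uses the
standard genetic code only.

```python
from itertools import product

# Standard genetic code (NCBI table 1), with bases ordered T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def splice(genome, segments):
    """Concatenate 1-based inclusive (start, end) segments, like join(...)."""
    return "".join(genome[start - 1:end] for start, end in segments)


def translate(nucleotides):
    """Translate complete codons only; unrecognised codons become 'X'."""
    usable = len(nucleotides) - len(nucleotides) % 3
    return "".join(
        CODON_TABLE.get(nucleotides[i:i + 3], "X") for i in range(0, usable, 3)
    )


def carve(protein, start, end):
    """Option 1: slice a peptide out by 1-based inclusive protein coordinates."""
    return protein[start - 1:end]


# Hypothetical toy genome with a -1 slip: nucleotide 9 is used twice.
genome = "ATGGCTTTTGGAAACCTTAA"
polyprotein_cds = [(1, 9), (9, 20)]
polyprotein = translate(splice(genome, polyprotein_cds)).rstrip("*")

# Option 1: mature peptide given in protein coordinates (residues 2..5).
peptide_from_protein = carve(polyprotein, 2, 5)

# Option 2: the same peptide given in nucleotide coordinates, with the slip.
peptide_from_nucleotides = translate(splice(genome, [(4, 9), (9, 14)]))

# The check Chris describes: the peptide must be an exact piece of the parent.
assert peptide_from_nucleotides in polyprotein
assert peptide_from_protein == peptide_from_nucleotides
```

Either route gives the same peptide on this toy input, which is really the
point of the cross-check: if the two disagree on a real record, then the
coordinates or the slippage annotation need a closer look before loading.
Real records would also need strand handling and partial (< or >) locations,
which this sketch ignores.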
