[Bioperl-l] Translating alternate start codons

Jason Stajich jason at bioperl.org
Tue Nov 21 11:47:37 EST 2006


It seems, then, that you don't want to use 'complete', since you don't
always expect complete CDSes.

Shouldn't you just translate to generate the peptide and then operate on the
initial and terminal residues to achieve what you want?
substr($peptide,0,1,'M');
chop($peptide) if substr($peptide,-1,1) eq $TERMINATOR;
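
The two lines above could be wrapped up roughly as follows. This is only a
minimal sketch in plain Perl, not BioPerl API: the helper name fix_termini
and its $strip_stop flag are made up for illustration, and $TERMINATOR is
assumed to be '*'. It also assumes you have already decided that the first
codon is a valid start codon, since it overwrites the first residue
unconditionally.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: patch the ends of an
# already-translated peptide string.
#   $peptide    - translation as a plain string, e.g. $seqobj->translate->seq
#   $strip_stop - true to drop a trailing terminator, false to keep it
sub fix_termini {
    my ($peptide, $strip_stop) = @_;
    my $TERMINATOR = '*';    # assumed terminator symbol

    return $peptide if !defined $peptide || $peptide eq '';

    # Force the initial residue to M. Only sensible when the first codon
    # is known to be a valid start codon for the codon table in use.
    substr($peptide, 0, 1, 'M');

    # Remove only a trailing terminator; internal ones are left alone.
    chop $peptide if $strip_stop && substr($peptide, -1, 1) eq $TERMINATOR;

    return $peptide;
}

print fix_termini('IW*', 1), "\n";    # terminator removed
print fix_termini('IW*', 0), "\n";    # terminator kept
```

Passing a false $strip_stop keeps the '*', which may suit the pseudogene
case where you want the terminator left in place.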

This is a community project, so any help in improving apparent deficiencies
in documentation is gladly welcomed - the wiki is designed to let everyone
contribute.  I think that clarifications and better descriptions for modules
should reside there so please add your input on the webpages.  A dev with
write access can migrate updated documentation to the module's POD where
appropriate.

-jason

On 11/21/06, Amir Karger <akarger at cgr.harvard.edu> wrote:
>
> > From: Brian Osborne [mailto:bosborne11 at verizon.net]
> >
> > Amir,
> >
> > The best documentation for translate() is in the online
> > Bioperl Tutorial,
> > have you checked that?
> >
> > Brian O.
>
> Thanks for the quick response. The tutorial is quite informative.
> It seems to me that the POD needs to document -complete more thoroughly,
> though:
>
>                   Or if you expect a complete coding sequence (CDS) translation,
>                   with initiator at the beginning and terminator at the end:
>
>                   $protein_seq_obj = $cds_seq_obj->translate(-complete => 1);
>
> This doesn't really explain what it does.
>
> I guess -complete was chosen as a compromise between having too many
> options and having lots of functionality. In my case, I want to keep the
> *, and I don't want warnings about terminators in the middle, because
> I've got a bunch of pseudogenes. So I'll just translate the M myself.
>
> I'm sure you've had many "the documentation is spread out in too many
> places" discussions before, and I know keeping docs up to date is Hard.
> Oh well.
>
> -Amir
>
> >
> >
> > On 11/21/06 10:21 AM, "Amir Karger" <akarger at CGR.Harvard.edu> wrote:
> >
> > > I think this is more a Bio question than a Bioperl question.
> > >
> > > I did this:
> > >
> > > #########
> > > #!/usr/local/bin/perl
> > >
> > > use strict;
> > > use warnings;
> > >
> > > use Bio::Seq;
> > > use Bio::Tools::CodonTable;
> > >
> > > my $seqobj = Bio::PrimarySeq->new (
> > >     -seq => 'ATATGATAA',
> > >     -id  => 'GeneFragment-12',
> > >     -accession_number => 'X78121',
> > >     -alphabet => 'dna',
> > > );
> > >
> > > my $myCodonTable = Bio::Tools::CodonTable->new( -id => 4 );
> > > my $is = $myCodonTable->is_start_codon('ATA') ? "is" : "is not";
> > > print "ATA $is a valid start codon\n";
> > > print "Table 4: ", $seqobj->translate(-codontable_id => 4)->seq, "\n";
> > > print "Table 1: ", $seqobj->translate(-codontable_id => 1)->seq, "\n";
> > > ###########
> > >
> > > I got this:
> > > ATA is a valid start codon
> > > Table 4: IW*
> > > Table 1: I**
> > >
> > > But EMBL tells me that EMBLCDS:AAT64955 starts with an M:
> > >
> > > http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-id+3b6PL1TmQt3+-e+[EMBLCDS:'AAT64955']+-qnum+1+-enum+3
> > >
> > > So, does Bioperl purposely not translate start codons to M, while EMBL
> > > does? Am I supposed to just change the I to M explicitly in my code? I
> > > didn't see an obvious option to translate() to do it.
> > >
> > > Thanks,
> > >
> > > - Amir Karger
> > > Research Computing
> > > Life Sciences Division
> > > Harvard University
> > > 617-496-0626
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> >
>
>



-- 
Jason Stajich
jason at bioperl.org
http://www.duke.edu/~jes12/

