[Bioperl-l] Inframe stop codon

Jason Stajich jason at bioperl.org
Sat Aug 2 08:58:05 EDT 2008

[regarding PAML analyses]

You would need to translate the cDNA sequence and identify where the
stop codon is, then either remove that codon or remove that sequence
from your bulk analyses. It depends on why you think the stop codon is
in the sequence: mis-annotation, a pseudogene, or something else? If
this affects only a small percentage of a lot of sequences, I would
probably just skip them. If it is the terminal stop codon that is being
included in the sequences, you just need to remove the last codon from
each sequence before providing it to PAML. The Seq HOWTO has many
examples of how to manipulate a sequence object with substr and trunc,
as well as the simple seq() method, which gives you the sequence as a
string that you can manipulate and then use to update the sequence
object afterwards. For example:
my $str = $seq->seq;
# remove the last codon from this cDNA sequence
substr($str, -3, 3, '');
# store the shortened string back in the sequence object
$seq->seq($str);

Alternatively, you can use trunc to truncate the sequence:
my $trunc = $seq->trunc(1, $seq->length - 3);
$seq = $trunc;

You can translate the sequence with the $seq->translate method, then
test for the presence of a stop codon (this is exactly what the
pairwise_kaks script in the scripts/utilities/ directory does). If you
have a stop codon, you need to figure out whether it is at the end of
the sequence or not. If it is the terminal codon, you can just lop off
the last codon on all your sequences, but if it is internal, you need
to decide what you want to do with this sequence.
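
To make the terminal-versus-internal check concrete, here is a minimal
sketch in plain Perl that works on the string you get from $seq->seq.
The helper names stop_codon_positions and classify_stops are made up
for illustration (they are not BioPerl methods), and the stop codon set
assumes the standard genetic code. It does not reproduce the
pairwise_kaks script.

```perl
use strict;
use warnings;

# Stop codons of the standard genetic code (an assumption; other
# codon tables use different stops).
my %is_stop = map { $_ => 1 } qw(TAA TAG TGA);

# Hypothetical helper: return the 1-based codon indices of every
# in-frame stop codon. $dna is a plain string, e.g. from $seq->seq.
sub stop_codon_positions {
    my ($dna) = @_;
    $dna = uc $dna;
    $dna =~ tr/U/T/;    # tolerate RNA input
    my @stops;
    my $n_codons = int(length($dna) / 3);
    for my $i (0 .. $n_codons - 1) {
        my $codon = substr($dna, 3 * $i, 3);
        push @stops, $i + 1 if $is_stop{$codon};
    }
    return @stops;
}

# Hypothetical helper: 'clean' (no stops), 'terminal' (the only stop
# is the last complete codon) or 'internal' (anything else).
sub classify_stops {
    my ($dna) = @_;
    my @stops = stop_codon_positions($dna);
    return 'clean' unless @stops;
    my $last = int(length($dna) / 3);
    return (@stops == 1 && $stops[0] == $last) ? 'terminal' : 'internal';
}

for my $dna (qw(ATGGCCTAA ATGTAAGCC ATGGCCGCC)) {
    printf "%s\t%s\n", $dna, classify_stops($dna);
}
```

Sequences classified as 'terminal' can have the last codon trimmed;
'internal' ones are the cases where you have to decide whether to drop
the sequence.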

If there are multiple stop codons, I am not sure it is appropriate to
run PAML here, unless you are interested in some sort of pseudo-rate
calculation that has many of the codons omitted. Otherwise you may
just want to calculate a DNA substitution rate for the sequences and
use that for the comparison.
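
As one simple stand-in for a DNA substitution rate, here is a sketch of
the uncorrected p-distance (the fraction of differing sites) between
two aligned DNA strings. The function name p_distance is hypothetical,
and the choice of p-distance is an assumption on my part; corrected
models such as those in Bio::Align::DNAStatistics may be more
appropriate for real analyses.

```perl
use strict;
use warnings;

# Hypothetical helper: uncorrected p-distance between two aligned DNA
# strings of equal length. Columns where either sequence has a gap or
# a non-ACGT character are skipped. Returns undef if no column is
# comparable.
sub p_distance {
    my ($s1, $s2) = @_;
    die "sequences must be aligned to the same length\n"
        unless length($s1) == length($s2);
    ($s1, $s2) = (uc $s1, uc $s2);
    my ($compared, $diffs) = (0, 0);
    for my $i (0 .. length($s1) - 1) {
        my ($x, $y) = (substr($s1, $i, 1), substr($s2, $i, 1));
        next unless $x =~ /[ACGT]/ && $y =~ /[ACGT]/;
        $compared++;
        $diffs++ if $x ne $y;
    }
    return $compared ? $diffs / $compared : undef;
}

my $d = p_distance('ATGGCCTAAGCC', 'ATGGCATAAGC-');
printf "p-distance: %.3f\n", $d if defined $d;
```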

I suggest working through a single file by hand to get the appropriate
steps down; coding it up will then be easier.

I am sure folks on the list can help too, so it is important to post
to the mailing list. I don't see any messages from you on the list
about this query.

On Aug 2, 2008, at 5:42 AM, Tannistha wrote:

> Hi Jason,
> Please suggest me how to filter the inframe stop codons,  
> aa_to_dna_aln returns the sequence with in-frame stop codons.
> I have posted my query along with the input files to the forum.
> Thanks for your earlier advice, runmode =0 is working for me.
> Look forward to your reply
> Best Regards
> Tannistha
> Dr. Tannistha Nandi
> email: tannistha3 at yahoo.com
