[Bioperl-l] Inframe stop codon
jason at bioperl.org
Sat Aug 2 08:58:05 EDT 2008
[regarding PAML analyses]
You would need to translate the cDNA sequence and identify where the
stop codon is, then remove that codon or remove that sequence from
your bulk analyses. It depends on why you think the stop codon is in
the sequence - mis-annotation, a pseudogene, or something else? If
this is a small percentage of a lot of sequences I would probably
just skip these. If this is the terminal stop codon that is being
included in the sequences, you just need to remove the last codon
from each sequence before providing it to PAML. The Seq HOWTO has
many examples of how to manipulate a sequence object with substr,
trunc, as well as just the simple seq() method that gives you the
sequence as a string, which you can manipulate, then update the
sequence object afterwards. As in
my $str = $seq->seq;
# remove the last codon from this cDNA sequence
substr($str, -3, 3, '');
# store the shortened string back in the sequence object
$seq->seq($str);
Alternatively you can use trunc to truncate the sequence:
# trunc returns a new sequence object covering positions 1..length-3
my $trunc = $seq->trunc(1, $seq->length - 3);
$seq = $trunc;
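Both snippets above remove the last codon unconditionally. A minimal
sketch of a guarded version, which only removes the last codon when it
really is a stop codon, might look like the following. It works on the
plain string you get from $seq->seq. The helper name
strip_terminal_stop is made up for illustration, and the stop codon
set assumes the standard genetic code.

```perl
use strict;
use warnings;

# Stop codons of the standard genetic code (an assumption; other
# codon tables use different sets).
my %IS_STOP = map { $_ => 1 } qw(TAA TAG TGA);

# Hypothetical helper: return the sequence without its last codon,
# but only when the length is a multiple of 3 and that codon is a stop.
sub strip_terminal_stop {
    my ($str) = @_;
    return $str if length($str) < 3 || length($str) % 3;
    my $last = uc substr($str, -3);
    $last =~ tr/U/T/;    # accept RNA-style input as well
    return $IS_STOP{$last} ? substr($str, 0, length($str) - 3) : $str;
}

print strip_terminal_stop('ATGGCCTAA'), "\n";    # terminal TAA removed
print strip_terminal_stop('ATGGCCGGA'), "\n";    # no stop, left unchanged
```

The result can be stored back with $seq->seq(...) as in the first
snippet.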
You can translate the sequence with the $seq->translate method, then
test for the presence of a stop codon (this is exactly what the
pairwise_kaks script in the scripts/utilities/ directory does). If you
have a stop codon you need to figure out whether it is at the end of
the sequence or not. If it is the terminal codon, you can just lop off
the last codon on all your sequences, but if it is internal, you need
to decide what you want to do with this sequence.
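As a rough illustration of the terminal-versus-internal check, here is
a sketch that scans the cDNA string codon by codon instead of going
through $seq->translate. The helper names stop_codon_positions and
classify_stops are hypothetical, reading frame 1 is assumed, and the
stop codons are those of the standard genetic code.

```perl
use strict;
use warnings;

# Stop codons of the standard genetic code (an assumption).
my %IS_STOP = map { $_ => 1 } qw(TAA TAG TGA);

# Hypothetical helper: 1-based codon positions of every in-frame
# stop codon, reading in frame 1.
sub stop_codon_positions {
    my ($str) = @_;
    my $dna = uc $str;
    $dna =~ tr/U/T/;
    my @pos;
    for (my $i = 0; $i + 3 <= length($dna); $i += 3) {
        push @pos, $i / 3 + 1 if $IS_STOP{ substr($dna, $i, 3) };
    }
    return @pos;
}

# Hypothetical helper: 'none', 'terminal' (a single stop that is the
# last codon) or 'internal' (anything else).
sub classify_stops {
    my ($str) = @_;
    my @pos = stop_codon_positions($str);
    return 'none' unless @pos;
    my $n_codons = int(length($str) / 3);
    return (@pos == 1 && $pos[0] == $n_codons) ? 'terminal' : 'internal';
}

for my $cdna (qw(ATGGCC ATGGCCTAA ATGTAAGCCTGA)) {
    print "$cdna: ", classify_stops($cdna), "\n";
}
```

Gap codons such as '---' in an aligned sequence are not stop codons,
so they pass through this check untouched.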
If there are multiple stop codons, I am not sure it is appropriate to
run PAML here, unless you are interested in some sort of pseudo-rate
calculation that has many of the codons omitted. Otherwise you may
just want to calculate a DNA substitution rate for the sequences.
I suggest working through a single file by hand to get the appropriate
steps down; coding it up will then be easier.
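Once the steps are clear for one file, the bulk version could be
sketched roughly like this: drop a terminal stop codon where there is
one, and skip any sequence that still contains an in-frame stop. The
function name filter_for_paml and the id => string input layout are
assumptions made for this example, as is the standard genetic code.

```perl
use strict;
use warnings;

# Stop codons of the standard genetic code (an assumption).
my %IS_STOP = map { $_ => 1 } qw(TAA TAG TGA);

# Hypothetical helper: takes a hash ref of id => cDNA string and
# returns a hash ref of cleaned sequences plus an array ref of the
# ids that were skipped because of internal stop codons.
sub filter_for_paml {
    my ($seqs) = @_;
    my (%kept, @skipped);
    for my $id (sort keys %$seqs) {
        # split into codons; output is uppercased and a trailing
        # partial codon (1 or 2 bases) is dropped
        my @codons = uc($seqs->{$id}) =~ /(...)/g;
        pop @codons if @codons && $IS_STOP{ $codons[-1] };    # terminal stop
        if (grep { $IS_STOP{$_} } @codons) {
            push @skipped, $id;                               # internal stop
        }
        else {
            $kept{$id} = join '', @codons;
        }
    }
    return (\%kept, \@skipped);
}

my ($kept, $skipped) = filter_for_paml({
    a => 'ATGGCCTAA',       # terminal stop only
    b => 'ATGGCCGGA',       # no stop
    c => 'ATGTAAGCCTGA',    # internal stop
});
print "kept: $_ => $kept->{$_}\n" for sort keys %$kept;
print "skipped: @$skipped\n";
```

If the sequences are already aligned, removing the last codon from
only some of them leaves them with different lengths, so in that case
it is probably safer to drop the last codon from all of them, as
suggested above.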
I am sure folks on the list can help too so it is important to post
to the mailing list - I don't see any messages from you on the list
about this query.
On Aug 2, 2008, at 5:42 AM, Tannistha wrote:
> Hi Jason,
> Please suggest me how to filter the inframe stop codons,
> aa_to_dna_aln returns the sequence with in-frame stop codons.
> I have posted my query along with the input files to the forum.
> Thanks for your earlier advice, runmode =0 is working for me.
> Look forward to your reply
> Best Regards
> Dr. Tannistha Nandi
> email: tannistha3 at yahoo.com