[Bioperl-l] Bioperl-l Digest, Vol 78, Issue 26

Dan Kortschak dan.kortschak at adelaide.edu.au
Thu Oct 29 16:27:21 EDT 2009

Hi, longorf by default requires an ATG, but it can be asked to ignore that
with the notstrict option. It naturally runs translation off the end if
no stop is found.
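
In case it helps to see that behaviour spelled out, here is a minimal Python
sketch of the idea. It is not longorf.pl's actual code: the function name
longest_orf and the strict flag are made up for illustration, scanning all
six frames is my assumption, and it uses the standard genetic code.

```python
from itertools import product

# Standard genetic code (NCBI table 1), built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOPS = {"TAA", "TAG", "TGA"}


def revcomp(seq):
    """Reverse complement of an upper-case ACGT string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_orf(seq, strict=True):
    """Return (strand, frame, protein) for the longest open stretch.

    strict=True  : an open stretch must begin at an ATG.
    strict=False : an open stretch may begin at the start of the frame
                   or right after a stop codon.
    In both modes a stretch with no stop runs off the end of the sequence.
    """
    seq = seq.upper()
    best = ("+", 0, "")
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            codons = [s[i:i + 3] for i in range(frame, len(s) - 2, 3)]
            start = None if strict else 0
            # The trailing None is a sentinel marking the end of the sequence.
            for idx, codon in enumerate(codons + [None]):
                if start is None:
                    if codon == "ATG":
                        start = idx
                    continue
                if codon is None or codon in STOPS:
                    protein = "".join(
                        CODON_TABLE.get(c, "X") for c in codons[start:idx]
                    )
                    if len(protein) > len(best[2]):
                        best = (strand, frame, protein)
                    start = None if strict else idx + 1
    return best
```

For an exon fragment with no ATG, the strict call returns an empty protein,
while strict=False still reports the longest stretch without a stop.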

You may have problems with the 3' end, though, if the next in-frame stop
is in the adjacent intron (you'll end up with a bit of translated intron
up to that stop; this may or may not be a problem for you).
longorf makes no attempt to be intelligent with respect to intron/exon
boundaries, as it was designed with cDNA sets in mind. Also, if the
biologically relevant frame is not the longest, it will not be found
(this should be an obvious caveat).
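
To make the intron caveat concrete, here is a small Python sketch. The exon
and intron sequences are made up, translate_until_stop is a hypothetical
helper rather than anything in BioPerl, and the standard genetic code is
assumed.

```python
from itertools import product

# Standard genetic code (NCBI table 1), built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_until_stop(seq):
    """Translate frame 0 until the first stop codon or the end of seq."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3].upper(), "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Made-up sequences: an exon fragment with no stop, followed by intron
# sequence whose first in-frame stop (TAG) comes three codons in.
exon = "ATGGCTGAA"
intron = "GTAAGTCTTTAGCC"

exon_only = translate_until_stop(exon)
with_intron = translate_until_stop(exon + intron)

# Residues beyond the exon-only protein come from translated intron.
intron_derived = with_intron[len(exon_only):]
```

Anything in intron_derived is the "bit of translated intron" described above
and would need trimming if you know where the exon ends.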

If your sequences don't meet these criteria, you may want to look at writing
something that makes use of Bio::Tools::Genscan (I haven't tried this, so
can't vouch for it) or try out run_genscan.pl in the examples/tools directory.

I hope this helps.

On Thu, 2009-10-29 at 12:00 -0400, bioperl-l-request at lists.open-bio.org wrote:
> Date: Thu, 29 Oct 2009 11:03:39 +1100
> From: Chris <coldmeadow at gmail.com>
> Subject: [Bioperl-l] translate DNA whith unkown ORF
> To: bioperl List <bioperl-l at bioperl.org>
> Message-ID:
>         <b4cd33f80910281703u5bdfc50ah52072b6c1282f505 at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
> Hello,
> I have a fasta file of coding exon DNA. The problem is each sequence is not
> a full coding exon, but may be only a segment of an exon. I would like to
> translate this DNA to protein, but the ORF is unknown. I can see the actual
> protein sequence on the UCSC browser but when I only have segments of exons
> there is no way to download the protein sequence segments, just the sequence
> for the whole gene. Does anybody know a way I can access this protein
> sequence, given a bed file of the coordinates of the coding sequence
> segments? I have tried the script "longorf.pl" on the fasta file of the DNA
> sequence but this does not work (or I cannot get it to) - I think it looks
> for an ATG in the sequence and assumes that is the start codon.
> Thanks,
> Chris.
