[Bioperl-l] getting DNA sequence for exon features from GFF
cjfields at illinois.edu
Thu Aug 26 10:31:59 EDT 2010
On Aug 26, 2010, at 4:02 AM, Peter wrote:
> On Thu, Aug 26, 2010 at 9:53 AM, Dave Messina <David.Messina at sbc.su.se> wrote:
>> Admittedly I'm not up on the latest uses of GFF, but as far as I know, GFF
>> is an annotation format only — it does not contain the actual sequence.
>> Have you looked in your GFF file to see if there are nucleotides in there?
> Actually a GFF file can optionally include a FASTA format sequence
> at the end of the file, although it seems to be more common to just
> supply separate GFF and FASTA files and cross reference by ID.
IIRC, optionally including FASTA sequence is specified only in the GFF3 spec; earlier versions don't explicitly mention FASTA. We support embedded FASTA with earlier GFF versions only because the various GFF parsers have converged.
The original GFF spec proposed allowing sequence, but it's in the form of meta information and I have never seen it used in practice (as you mention, the FASTA is normally loaded separately).
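For illustration, here is a minimal sketch of the GFF3 case discussed above: feature lines followed by a `##FASTA` section, with exon sequences cut out by matching the seqid column against the FASTA IDs. It is plain Python using only the standard library, not BioPerl code. The function names (`read_gff3_with_fasta`, `exon_sequences`) and the sample data are made up for the example. It assumes 1-based inclusive coordinates and ignores the rest of the format (for instance, FASTA that starts without an explicit `##FASTA` line).

```python
import io

def read_gff3_with_fasta(handle):
    """Split a GFF3 stream into 9-column feature rows and embedded FASTA records."""
    features, seqs = [], {}
    in_fasta, current = False, None
    for raw in handle:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("##FASTA"):
            # Everything after this directive is FASTA, not features.
            in_fasta = True
            continue
        if in_fasta:
            if line.startswith(">"):
                # Use the first word of the header as the sequence ID.
                current = line[1:].split()[0]
                seqs[current] = []
            elif current is not None:
                seqs[current].append(line.strip())
            continue
        if line.startswith("#"):
            continue  # other directives and comments
        cols = line.split("\t")
        if len(cols) == 9:
            features.append(cols)
    return features, {k: "".join(v) for k, v in seqs.items()}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def exon_sequences(features, seqs):
    """Return (attributes, sequence) for each exon whose seqid has a FASTA record."""
    out = []
    for seqid, _src, ftype, start, end, _score, strand, _phase, attrs in features:
        if ftype != "exon" or seqid not in seqs:
            continue
        # GFF coordinates are 1-based and inclusive.
        sub = seqs[seqid][int(start) - 1:int(end)]
        if strand == "-":
            sub = sub.translate(_COMPLEMENT)[::-1]  # reverse complement
        out.append((attrs, sub))
    return out

# Hypothetical sample data, invented for this sketch.
EXAMPLE = (
    "##gff-version 3\n"
    "chr1\tdemo\texon\t3\t6\t.\t+\t.\tID=exon1\n"
    "chr1\tdemo\texon\t9\t12\t.\t-\t.\tID=exon2\n"
    "##FASTA\n"
    ">chr1\n"
    "AACCGGTTAACC\n"
)

features, seqs = read_gff3_with_fasta(io.StringIO(EXAMPLE))
result = exon_sequences(features, seqs)
```

When the GFF and FASTA are supplied as separate files, the same lookup by seqid applies; only the source of the `seqs` dictionary changes.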