[Bioperl-l] FeatureI GFF output is not GFF version 2 compatible?

Mark Wilkinson mwilkinson@gene.pbi.nrc.ca
Thu, 09 Nov 2000 14:58:22 -0600

Hi all!

Can someone confirm whether my understanding is correct?  According to
the GFF specifications page at Sanger, under the GFF version 2 format:
"From version 2 onwards, the attribute field must have an tag value
structure following the syntax used within objects in a .ace file,
flattened onto one line by semicolon separators. Tags must be standard
identifiers ([A-Za-z][A-Za-z0-9_]*). Free text values must be quoted
with double quotes."

I just dumped a bunch of SeqFeatures using $Feature->gff_string and got
output as follows:

PBICTGAt_2_000022  NCBI  NCBI_Gene  23468  24995  .   -   .  length=509 contig_stop=4995 chr_id=3 contig_start=3468 comment=Gene=At2g01500 Synonym=F2I9.12 Product=putative homeodomain transcription factor

It appears that neither of these two requirements is being followed
by the ->gff_string subroutine, i.e. the attributes are space-separated
rather than semicolon-separated, and the free text is not quoted.  Is
this my misunderstanding of the GFF format, or is it a bug in the module
(or is the module not meant to be GFF version 2 compatible, even though
the documentation says that it is)?
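
To make the question concrete, here is a minimal sketch of how I read
the quoted rules for the attribute field.  It is written in Python
rather than Perl and is not the BioPerl code; the function name
format_gff2_attributes is hypothetical.  It assumes that a tag and its
value are separated by a space, that numeric values may stay unquoted,
and that an embedded double quote is backslash-escaped.  The passage
quoted above does not spell out any of those three points.

```python
import re

# Tag syntax taken from the quoted spec text: [A-Za-z][A-Za-z0-9_]*
TAG_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def format_gff2_attributes(pairs):
    """Build a GFF2-style attribute field from (tag, value) pairs.

    Assumptions (not confirmed by the quoted passage): tag and value are
    separated by a space, numbers are left unquoted, and embedded double
    quotes are backslash-escaped.
    """
    fields = []
    for tag, value in pairs:
        if not TAG_PATTERN.fullmatch(tag):
            raise ValueError("not a standard identifier: %r" % (tag,))
        if isinstance(value, (int, float)):
            # Numeric values are written as-is.
            fields.append("%s %s" % (tag, value))
        else:
            # Free text is wrapped in double quotes.
            text = str(value).replace('"', '\\"')
            fields.append('%s "%s"' % (tag, text))
    # Flatten onto one line with semicolon separators.
    return " ; ".join(fields)


if __name__ == "__main__":
    attributes = [
        ("length", 509),
        ("chr_id", 3),
        ("comment", "Product=putative homeodomain transcription factor"),
    ]
    print(format_gff2_attributes(attributes))
```

Under those assumptions the free-text comment comes out quoted and the
fields are joined by semicolons, which is what I expected
->gff_string to produce.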


Any advice appreciated!

If I get the OK from you all I could go in and "fix" it myself, but I
want to make sure I don't step on anyone's toes/break anyone's parser
before doing so.



Dr. Mark Wilkinson
Bioinformatics Group
National Research Council of Canada
Plant Biotechnology Institute
110 Gymnasium Place
Saskatoon, SK