[Bioperl-l] Converting GFF2 records to GFF3

Razi Khaja razi at genet.sickkids.on.ca
Thu Dec 23 15:54:40 EST 2004

Sorry for cross-posting, but this may be relevant to both bioperl and song-devel.
I've written two small scripts to convert gff2 records to gff3 using bioperl, and vice versa (see gff2_to_gff3.pl and gff3_to_gff2.pl below).
In doing this I have noticed some problems in the conversion.
The method Bio::Tools::GFF::_gff3_string will quote an attribute value if it contains any character not in [a-zA-Z0-9,;=.:%^*$@!+_?-] (i.e. $value = '"'.$value.'"';) and will output empty quotes for tags without values (i.e. $value = "\"\"";).
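To make the behaviour concrete, here is a minimal sketch of the quoting rule as I have described it. It is a paraphrase, not the actual BioPerl source, and the function name quote_like_gff3_string is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the quoting step described above;
# NOT the real Bio::Tools::GFF code.
sub quote_like_gff3_string {
    my ($value) = @_;

    # Tags without a value come out as a pair of empty quotes.
    return '""' if !defined $value || $value eq '';

    # Any character outside the allowed set causes the whole value
    # to be wrapped in double quotes.
    if ( $value =~ /[^a-zA-Z0-9,;=.:%^*\$\@!+_?-]/ ) {
        $value = '"' . $value . '"';
    }
    return $value;
}

print quote_like_gff3_string('protein kinase'), "\n";
```

A value containing a space therefore picks up quotes, which is what conflicts with the spec.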
Currently the gff3 spec says: "Unescaped quotation marks, ... are explicitly forbidden." 
This brings up 2 questions:
(1) Are quotes necessary in gff3?
(2) When a value is empty, what should be output?
    a) Tag="";
    b) Tag=.;
    c) Tag=;
    d) nothing?
(Apart from not meeting the spec, this makes it difficult to do transformations from gff2 to gff3 and back to gff2 again.)
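One possible alternative to quoting, sketched below, is to percent-encode the troublesome characters, which is the escaping style the gff3 spec uses for reserved characters. The helper names escape_gff3_value and unescape_gff3_value are hypothetical (they are not part of Bio::Tools::GFF), and the exact set of characters to encode, including the double quote, is my assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode instead of quoting.
# Assumed set: '%', ';', '=', '&', ',', '"' and control characters.
sub escape_gff3_value {
    my ($value) = @_;
    return '' unless defined $value;
    $value =~ s/([%;=&,"\x00-\x1f\x7f])/sprintf("%%%02X", ord($1))/ge;
    return $value;
}

# Hypothetical inverse, for the gff3 -> gff2 direction.
sub unescape_gff3_value {
    my ($value) = @_;
    return '' unless defined $value;
    $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $value;
}

my $original = 'kinase; alpha=1, "putative"';
my $escaped  = escape_gff3_value($original);
print "$escaped\n";
print unescape_gff3_value($escaped), "\n";
```

Because '%' itself is encoded, unescaping returns the original string exactly, which would make the gff2 -> gff3 -> gff2 round trip lossless for values. It does not settle question (2): an empty value here is simply returned as an empty string.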

# =====  gff2_to_gff3.pl =====
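use strict;
use Bio::Tools::GFF;

# Read gff2 features from the file named on the command line
# and print each one as a gff3 line.
my( $gff2File ) = @ARGV;
my $gffio = Bio::Tools::GFF->new(-file=>$gff2File, -gff_version=>2);
while( my $feature = $gffio->next_feature() ) {
    my $gff3string = $gffio->_gff3_string( $feature );
    print "$gff3string\n";
}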

# =====  gff3_to_gff2.pl =====

use strict;
use Bio::Tools::GFF;

# Read gff3 features from the file named on the command line
# and print each one as a gff2 line.
my( $gff3File ) = @ARGV;
my $gffio = Bio::Tools::GFF->new(-file=>$gff3File, -gff_version=>3);
while( my $feature = $gffio->next_feature() ) {
    my $gff2string = $gffio->_gff2_string( $feature );
    print "$gff2string\n";
}


 * Razi Khaja, Bioinformatics Analyst
 * The Hospital for Sick Children, Toronto
