[Bioperl-l] What's the best way to produce GFF files from GenBank/EMBL formats?

Chris Fields cjfields at uiuc.edu
Thu Nov 15 13:43:02 EST 2007

There are currently many ways to get what you want, but not all are  
consistent (particularly re: GFF3).  We are aiming for more  
consistent, compliant GFF/GTF output in the next developer series  
(1.7) of Bioperl.

You can try bp_genbank2gff or bp_genbank2gff3 (both in the scripts
directory); these are probably the most common route when working
directly from a sequence record.  Bio::Tools::GFF is the most commonly
used class, though I'm unsure of its status for GFF3 output.  On a
Bio::SeqI object you can call write_gff() (currently not very
flexible), or on the SeqFeature itself you can call gff_string().
Bio::Graphics::Feature has the additional method gff3_string().
Bio::FeatureIO is also an option, though I would consider it very
experimental (it will likely undergo significant revision in the next
BioPerl dev series).

Any others anyone can think of, maybe non-BioPerl related as well?
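
As one non-BioPerl illustration, the hand-rolled approach from the
original question (printing the required fields tab-delimited) can be
sketched in a few lines of Python.  This is only a sketch under stated
assumptions, not anything BioPerl does internally: the function names
gff3_line and escape_attr, the source tag "GenBank", and the sample
identifiers are all made up for the example, and it covers only the
nine GFF3 columns and basic attribute escaping.

```python
# Characters that must be escaped inside GFF3 column 9 values.
# '%' comes first so already-escaped text is not escaped twice.
_RESERVED = ["%", ";", "=", "&", ",", "\t", "\n", "\r"]


def escape_attr(value):
    """Percent-encode reserved characters in a GFF3 attribute value."""
    text = str(value)
    for ch in _RESERVED:
        text = text.replace(ch, "%{:02X}".format(ord(ch)))
    return text


def gff3_line(seqid, source, ftype, start, end, strand, attributes,
              score=".", phase="."):
    """Build one tab-delimited GFF3 feature line.

    start/end are 1-based and inclusive; strand is 1, -1, or anything
    else for "unstranded"; attributes is a dict of tag -> value.
    """
    strand_symbol = {1: "+", -1: "-"}.get(strand, ".")
    attr_text = ";".join(
        "{}={}".format(tag, escape_attr(val))
        for tag, val in attributes.items()
    )
    columns = [seqid, source, ftype, str(start), str(end),
               str(score), strand_symbol, str(phase), attr_text or "."]
    return "\t".join(columns)


if __name__ == "__main__":
    # Hypothetical feature; the accession and IDs are placeholders.
    print("##gff-version 3")
    print(gff3_line("NC_000000", "GenBank", "gene", 190, 255, 1,
                    {"ID": "gene0001", "Name": "thrL"}))
```

In practice the feature coordinates, strand and tags would come from
whatever parser reads the GenBank/EMBL record; the point here is only
the column layout and the escaping of attribute values.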


On Nov 15, 2007, at 9:44 AM, Lucia Peixoto wrote:

> Hi,
> I was asked this question recently, and it occurred to me that I must
> be doing things inefficiently.
> To produce a GFF file I was using SeqIO to parse the required fields,
> then printing out whatever was required, tab-delimited, according to
> the conventions, which is easy.
> But generating a GenBank file by extracting features from a GFF file
> and a plain FASTA file was more complicated.
> Is there support for GFF in BioPerl now?
> Can anyone contribute a smart way to go from/to GFF, GenBank and EMBL?
> Thanks very much,
> Lucia Peixoto
> Department of Biology,SAS
> University of Pennsylvania
