[Bioperl-l] Genbank2gff3 script update

Brian Osborne bosborne11 at verizon.net
Tue Mar 27 22:20:59 EDT 2007


Don,

I took the file http://eugenes.org/gmod/genbank2chado/bin/genbank2gff3.PLS
and replaced the script of the same name with it, in scripts/Bio-DB-GFF.

Brian O.


On 3/27/07 7:42 PM, "Don Gilbert" <gilbertd at cricket.bio.indiana.edu> wrote:

> 
> Dear Bioperl developers,
> 
> Here is an improved bp_genbank2gff3.pl script, with bug fixes
> and enhancements.  The non-transparent changes in behavior are
> made via non-default command flags. I've updated these against current
> Bioperl CVS. Would one of you care to add this to your CVS repository?
> 
> Thanks, Don Gilbert
> 
> Find at  http://eugenes.org/gmod/genbank2chado/
> 
> =item Bioperl bp_genbank2gff3.pl
> 
>   bin/genbank2gff3.PLS   (Bioperl CVS scripts/Bio-DB-GFF/genbank2gff3.PLS)
>   lib/Bio-new/SeqFeature/Tools/TypeMapper.pm      (required for genbank2gff3 update)
>   lib/Bio-new/SeqFeature/Tools/Unflattener.pm     (minor change suggested for genbank2gff3)
>     (put into your Bioperl lib/Bio/... directories)
> 
> There is also this unrelated patch:
>   lib/Bio-new/Graphics/Glyph/processed_transcript.pm
>       -- new flag to ignore excess subfeatures from Chado's gene-mrna-polypeptide-exon model.
>   
> =item Genbank2gff3 changes
> 
>   * Polypeptide alternate gene model added (--noCDS option)
>     Standard gene model:  gene > mRNA > (UTR,CDS,exon)
>     G-R-P-E alternate model:   gene > mRNA > polypeptide > exon
>     Polypeptide contains all the important protein info (IDs, translation, GO terms)
> 
>   * IO pipes: curl ftp://ncbigenomes/... | genbank2gff3 --in stdin --out stdout | gff2chado ...
>   
>   * GenBank main record fields are added to source feature
>     and the sourcetype, commonly chromosome for genomes, is used.
>       
>   * Gene model handling for ncRNA and pseudogenes is added.
> 
>   * GFF header is cleaner and more informative, and a GFF_VERSION option is added.
>     
>   * GFF ##FASTA inclusion is improved, and translation sequence stored there.
>      
>   * FT -> GFF attribute mapping is improved.
>   
>   * --format choice of SeqIO input formats (GenBank default).
>     Uniprot/Swissprot and EMBL produce useful GFF.
>     
>   * SeqFeature::Tools::TypeMapper has a few FT -> SOFA additions, more flexible usage.
> 
> -- d.gilbert--bioinformatics--indiana-u--bloomington-in-47405
> -- gilbertd at indiana.edu--http://marmot.bio.indiana.edu/
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
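
To make the two gene models from the quoted change list concrete, here is a
minimal Python sketch that prints toy GFF3 lines for each one. It is not the
script's code or its real output. The IDs, coordinates, and the "example"
source column are invented, every feature shares the same span for brevity,
and the sketch reads the ">" in the email literally as a GFF3 Parent link.
That reading is an assumption; genbank2gff3 may link polypeptide and exon
features differently.

```python
# Illustrative only: IDs, coordinates, and the "example" source column are made up.
# Each model maps a feature type to its parent type (None = top level),
# reading the ">" in the email literally as a GFF3 Parent link (an assumption).
STANDARD_MODEL = {
    "gene": None,
    "mRNA": "gene",
    "UTR": "mRNA",
    "CDS": "mRNA",
    "exon": "mRNA",
}
NOCDS_MODEL = {
    "gene": None,
    "mRNA": "gene",
    "polypeptide": "mRNA",
    "exon": "polypeptide",
}


def gff3_lines(model, seqid="chr1", start=100, end=900, strand="+"):
    """Return GFF3 text lines (version header first) for one toy gene."""
    lines = ["##gff-version 3"]
    for ftype, parent_type in model.items():
        attrs = f"ID={ftype}1"
        if parent_type is not None:
            attrs += f";Parent={parent_type}1"
        # Phase is only meaningful for CDS features; "." elsewhere.
        phase = "0" if ftype == "CDS" else "."
        cols = [seqid, "example", ftype, str(start), str(end),
                ".", strand, phase, attrs]
        lines.append("\t".join(cols))
    return lines


if __name__ == "__main__":
    for name, model in (("standard", STANDARD_MODEL), ("--noCDS", NOCDS_MODEL)):
        print(f"# {name} model")
        print("\n".join(gff3_lines(model)))
```

In the second model no CDS rows are emitted at all; the polypeptide feature
is where the email says the protein IDs, translation, and GO terms are kept.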



