[Bioperl-l] genbank2gff.pl choking on CONTIG sections

Jason Stajich jason at bioperl.org
Wed Sep 24 19:05:18 EDT 2008

It should already if it is using Bio::DB::GenBank -- do you have an
example of a failure?  There seems to be some defaulting to EMBL for the
source in the biofetch code, so it might be worth twiddling.

From the Bio::DB::GenBank documentation:

Note that when querying for GenBank accessions starting with 'NT_' you
will need to call $gb->request_format('fasta') beforehand, because
in GenBank format (the default) the sequence part will be left out
(the reason is that NT contigs are rather annotation with references
to clones).
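
A minimal sketch of what that note describes, assuming network access to
NCBI and an installed BioPerl. The accession NT_006732 is only an
illustrative placeholder, not one taken from Scott's report. Keep in mind
that FASTA carries no feature table, so this returns the sequence but not
the annotation that genbank2gff needs.

```perl
use strict;
use warnings;
use Bio::DB::GenBank;

my $gb = Bio::DB::GenBank->new();

# Ask for FASTA instead of the default GenBank format, so the
# sequence itself comes back for an NT_ contig.
$gb->request_format('fasta');

# Placeholder accession; substitute the record you actually need.
my $seq = $gb->get_Seq_by_acc('NT_006732');
printf "%s: %d bp\n", $seq->display_id, $seq->length;
```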

Some work has been done to automatically detect and retrieve whole NT_
when the data is in that format (NCBI RefSeq clones). The former
behavior prior to bioperl 1.6 was to retrieve these from EBI, but now
these are directly from NCBI. The older behavior can be regained by
setting the 'redirect_refseq' flag to a value evaluating to TRUE.
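
For anyone who wants the older EBI behavior, a hedged sketch of setting
that flag. It assumes the constructor accepts a -redirect_refseq argument,
which is how I read the flag described above; check the POD of your
installed version before relying on it. The accession is again a
placeholder.

```perl
use strict;
use warnings;
use Bio::DB::GenBank;

# Assumption: -redirect_refseq is accepted by new(); a true value
# asks for RefSeq records to be fetched from EBI rather than NCBI.
my $gb = Bio::DB::GenBank->new(-redirect_refseq => 1);

# Placeholder accession, for illustration only.
my $seq = $gb->get_Seq_by_acc('NT_006732');
print $seq->display_id, "\n";
```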

On Sep 24, 2008, at 3:00 PM, Scott Cain wrote:

> Hi all,
> The BioPerl script bp_genbank2gff.pl, which will either convert a
> GenBank record to GFF or load it directly into a Bio::DB::GFF database,
> is choking on GenBank records with CONTIG sections.  Since I don't
> think these would ever be useful for generating GFF or loading into a
> database (i.e., the user will want to get all of the features on the
> parts, not know what the parts are), is there a way to force a
> Bio::DB::WebDBSeqI/Bio::DB::BioFetch to get the full record (like
> specifying view=gbwithparts in the URL at NCBI)?
> Thanks,
> Scott
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D. cain.cshl at gmail.com
> GMOD Coordinator (http://gmod.org/) 216-392-3087
> Cold Spring Harbor Laboratory

Jason Stajich
jason at bioperl.org
