[Bioperl-l] Re: GO dbxrefs in swissprot

Andreas Henschel henschel at mpi-cbg.de
Tue Jul 6 09:14:42 EDT 2004

Hilmar Lapp wrote:

> Pretty weird what you describe if it works for one entry but not 
> another. Also, the DR lines don't look suspiciously different.
> If there's no direct reason that prevents you from doing so you should 
> definitely upgrade to the 1.4.x series, possibly even to the latest 
> version of the stable branch from CVS. There were quite some fixes 
> meanwhile, some of which do affect how sequences get loaded into 
> biosql because they affect the annotation bundle.

Hi Hilmar,

I installed bioperl from CVS and reloaded the swissprot db into 
BioSQL. The entries I have checked so far appear to be correct. For the 
particular example, the problem was evidently a bug in the sequence 
annotation parsing of version 1.2.1. Sorry for bothering you with a 
version problem; I simply trusted the BioSQL installation instructions, 
which claimed a patched 1.2.1 would do.
What still puzzles me is the size of the database: starting with a 543MB 
flatfile, the first run (with the faulty parser) gave me a 600MB database 
and 9100 GO annotations. After the rerun with load_seqdatabase (...) 
--lookup --remove, I get a 1.1GB database but only 5100 GO annotations in 
the dbxref table. Is this due to the normalization?
Is there a full list of parseable database formats (GenBank, EMBL, 
ENSEMBL?, PDB?, etc.) and of where each database can be downloaded?
Thanks again.
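
A minimal sketch of how the GO count question could be checked. In the 
BioSQL schema the dbxref table holds one row per distinct cross-reference, 
while bioentry_dbxref holds one row per link from an entry to a 
cross-reference, so the two counts can differ. The sketch uses an in-memory 
SQLite database as a toy stand-in with only the columns needed here; the 
dbname value 'GO', the sample accessions, and the helper name go_counts are 
assumptions for illustration, not taken from the real database.

```python
import sqlite3


def go_counts(conn):
    """Return (distinct GO terms in dbxref, GO links in bioentry_dbxref)."""
    # One row per distinct GO cross-reference.
    terms = conn.execute(
        "SELECT COUNT(*) FROM dbxref WHERE dbname = 'GO'"
    ).fetchone()[0]
    # One row per (entry, GO cross-reference) pair.
    links = conn.execute(
        "SELECT COUNT(*) FROM bioentry_dbxref bd "
        "JOIN dbxref d ON d.dbxref_id = bd.dbxref_id "
        "WHERE d.dbname = 'GO'"
    ).fetchone()[0]
    return terms, links


# Toy stand-in for the two tables (assumption: reduced column set).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dbxref (
    dbxref_id INTEGER PRIMARY KEY,
    dbname    TEXT NOT NULL,
    accession TEXT NOT NULL,
    UNIQUE (dbname, accession)
);
CREATE TABLE bioentry_dbxref (
    bioentry_id INTEGER NOT NULL,
    dbxref_id   INTEGER NOT NULL REFERENCES dbxref (dbxref_id),
    PRIMARY KEY (bioentry_id, dbxref_id)
);
""")

# Illustrative rows: three entries share one GO term, one entry has a second.
conn.executemany(
    "INSERT INTO dbxref VALUES (?, ?, ?)",
    [(1, "GO", "GO:0005524"), (2, "GO", "GO:0016020"), (3, "InterPro", "IPR000719")],
)
conn.executemany(
    "INSERT INTO bioentry_dbxref VALUES (?, ?)",
    [(10, 1), (11, 1), (12, 1), (10, 2), (10, 3)],
)

print(go_counts(conn))  # (2, 4)
```

If the real database behaves like this toy one, a lower row count in dbxref 
would be compatible with more GO annotations overall, since the per-entry 
annotations would be counted in bioentry_dbxref instead.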

