[Bioperl-l] load_ontology and GO - progress!

Dave Howorth dhoworth at mrc-lmb.cam.ac.uk
Thu Apr 22 07:39:59 EDT 2004

Hi everybody,

I'm very pleased to report that the upgraded branch-1-4 now passes its 
tests to an acceptable/understandable level and has installed. 
Furthermore, it has fixed the original trouble with the load_ontology 
script.

So big thanks to everybody who has put effort into helping me, fixing 
problems and updating docs so quickly.

There's always a gotcha, of course ...

load_ontology is now failing with messages like this:

Loading ontology Gene Ontology:
	... terms

-------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values were 
("GO:0001529","elastin","OBSOLETE (was not defined before being made 
obsolete).","X") FKs (2)
Duplicate entry 'elastin-2' for key 2
Could not store GO:0001529 (elastin):

------------- EXCEPTION  -------------
MSG: create: object (Bio::Ontology::GOterm) failed to insert or to be 
found by unique key
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store 
STACK Bio::DB::Persistent::PersistentObject::store 
STACK (eval) load_ontology.pl:508
STACK toplevel load_ontology.pl:490


What appears to be happening is that the GO.defs file defines two terms 
called 'elastin' (0001528 and 0001529), one of which is used in 
component.ontology and the other in function.ontology.  Both are 
declared OBSOLETE.

This isn't an isolated case; it occurs for 'collagen' as well, for 
example.  Neither is it new in this release of the files; I found it in 
an old copy I had.
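
A quick way to see how widespread the duplication is would be to scan the defs file for term names that map to more than one GO ID. The following is only a minimal sketch: it assumes each stanza in GO.defs has a `term:` line followed by a `goid:` line, and the sample text is hand-written for illustration (the third entry is entirely hypothetical), not copied from the real file.

```python
from collections import defaultdict


def find_duplicate_term_names(lines):
    """Return {term name: [GO IDs]} for names that appear with more than one ID.

    Assumes a stanza layout with 'term:' and 'goid:' lines (an assumption
    about the GO.defs format, not taken from the format document).
    """
    ids_by_name = defaultdict(list)
    name = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("term:"):
            name = line[len("term:"):].strip()
        elif line.startswith("goid:") and name is not None:
            ids_by_name[name].append(line[len("goid:"):].strip())
            name = None
    return {n: ids for n, ids in ids_by_name.items() if len(ids) > 1}


# Hand-written sample in the assumed layout; not real GO.defs content.
SAMPLE = """\
term: elastin
goid: GO:0001528
definition: OBSOLETE (illustrative text).

term: elastin
goid: GO:0001529
definition: OBSOLETE (illustrative text).

term: hypothetical other term
goid: GO:9999999
definition: Illustrative only.
"""

duplicates = find_duplicate_term_names(SAMPLE.splitlines())
for term_name, go_ids in sorted(duplicates.items()):
    print(term_name, ", ".join(go_ids))
```

On the real file one would pass the open file handle instead of `SAMPLE.splitlines()`.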

The GO flat file definition format document isn't very helpful in 
describing what should be in the file, so my question is whether this 
duplication is a legitimate occurrence in GO?

That is, is there a bug in the GO distribution (duplicate term names) or 
a bug in the biosql model (each of the three GO ontologies needs a 
separate ontology ID)?
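
To make the two possibilities concrete, here is a small sketch using an in-memory SQLite table. It is a simplified stand-in, not the actual biosql schema: the table layout and the unique key on (name, ontology_id) are assumptions inferred from the 'elastin-2' in the error message. With both terms under one ontology ID the second insert is rejected; with separate ontology IDs both go in.

```python
import sqlite3


def try_load(terms):
    """Insert (identifier, name, ontology_id) rows; return rejected identifiers.

    The table is a hypothetical, cut-down stand-in for a term table with a
    unique key on (name, ontology_id).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE term ("
        " identifier TEXT PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " ontology_id INTEGER NOT NULL,"
        " UNIQUE (name, ontology_id))"
    )
    rejected = []
    for identifier, name, ontology_id in terms:
        try:
            conn.execute(
                "INSERT INTO term VALUES (?, ?, ?)",
                (identifier, name, ontology_id),
            )
        except sqlite3.IntegrityError:
            # Same name already stored under the same ontology ID.
            rejected.append(identifier)
    conn.close()
    return rejected


# Both 'elastin' terms under a single ontology ID: the second one collides.
one_ontology = try_load([
    ("GO:0001528", "elastin", 2),
    ("GO:0001529", "elastin", 2),
])

# Same terms, but each in its own ontology (the IDs 2 and 3 are arbitrary).
separate_ontologies = try_load([
    ("GO:0001528", "elastin", 2),
    ("GO:0001529", "elastin", 3),
])

print("rejected with one ontology ID:", one_ontology)
print("rejected with separate ontology IDs:", separate_ontologies)
```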

Also of interest would be any ideas for a workaround :)

Cheers, Dave
Dave Howorth
MRC Centre for Protein Engineering
Hills Road, Cambridge, CB2 2QH
01223 252960
