[Bioperl-l] GO ontology browser module available

Mark Wilkinson mwilkinson@gene.pbi.nrc.ca
Tue, 30 Jan 2001 15:35:03 -0600

Dear Group,

Most of you will be familiar with the GO consortium project of putting
together a common nomenclature for genome annotation.  As part of the
development of Workbench, I have thrown together a fairly simplistic
Gene Ontology ("GO") parser/browser widget.

It parses the XML files available on the GO website, cleans up
the XML to make it compatible with the XML::Parser module (available
from CPAN), and then dumps the resulting hash using Data::Dumper.  The
dumped file can then be read into the GO_browser (an extension
of the Tk::Text widget) and browsed as if it were a directory window, with
double-clicks to navigate up and down the tree, and color coding to
distinguish 'branches' from 'leaves'.  Middle-clicks can be trapped in
the external Tk::MainWindow to extract the selected ontology term and
definition.  It is more or less a "plug in" module, similar in design to
SeqCanvas: you create a Text widget, pass it to GO_Browser->new, and it
gives you back a browsable GO ontology.
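
As a rough illustration of the dump-and-reload step and of the
branch/leaf distinction (not the module's actual code), the sketch below
uses only core Perl modules.  The nested-hash layout, the term names, and
the list_children helper are all assumptions made for the example; the
real hash produced from the GO XML may be shaped quite differently.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempfile);

# Hypothetical stand-in for the parsed ontology: each key is a term and
# each value is a hash of its child terms.  An empty hash marks a leaf.
my $ontology = {
    'molecular_function' => {
        'enzyme' => {
            'kinase'      => {},
            'phosphatase' => {},
        },
        'transporter' => {},
    },
};

# Dump the hash to a file (in practice, once per GO release).
my ($fh, $dump_file) = tempfile(SUFFIX => '.dump', UNLINK => 1);
{
    local $Data::Dumper::Sortkeys = 1;    # stable key order between runs
    print {$fh} Dumper($ontology);        # writes "$VAR1 = { ... };"
}
close $fh or die "Cannot close $dump_file: $!";

# Reload the dump: "do FILE" evaluates the file and returns the value of
# its last statement, which here is the hash reference assigned to $VAR1.
my $tree = do $dump_file;
die "Could not reload $dump_file: ", ($@ || $!) unless ref $tree eq 'HASH';

# Follow @path down from the root and list the children found there,
# tagging each as a 'branch' (has children) or a 'leaf' (has none).
sub list_children {
    my ($root, @path) = @_;
    my $node = $root;
    for my $term (@path) {
        $node = $node->{$term} or die "No such term: $term";
    }
    my @entries;
    for my $name (sort keys %$node) {
        my $kind = %{ $node->{$name} } ? 'branch' : 'leaf';
        push @entries, [ $name, $kind ];
    }
    return @entries;
}

# Roughly what a double-click on 'molecular_function' would display.
for my $entry (list_children($tree, 'molecular_function')) {
    printf "%-12s %s\n", @$entry;
}
```

In a Tk::Text widget the 'branch' and 'leaf' labels would presumably
become text tags with different colors, but that part is left out here.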

Parsing the GO ontology files themselves takes about 4-5 minutes each,
but this only has to be done once per GO release; the resulting
hash-dump can be slurped into the GO_browser widget in a couple of
seconds.  I parse the GO ontology tree only to the point where GO-terms
end and hard gene-names, examples, and bibliographic data begin.  This
could easily be modified, however, as you wish.
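
To make the cutoff concrete, here is a minimal sketch of pruning a parsed
tree so that only GO terms survive.  It assumes a hypothetical record
layout in which every node has a 'kind' field and an optional 'children'
hash; the 'kind' values, the prune_to_terms helper, and the sample names
are invented for illustration and do not come from the GO XML files or
from the module itself.

```perl
use strict;
use warnings;

# Copy only the GO-term part of a parsed tree.  Children whose 'kind' is
# not 'term' (gene names, examples, references) are skipped, so nothing
# below them is visited either.
sub prune_to_terms {
    my ($node) = @_;
    my %copy;
    my $children = $node->{children} || {};
    for my $name (keys %$children) {
        my $child = $children->{$name};
        next unless $child->{kind} eq 'term';
        $copy{$name} = prune_to_terms($child);
    }
    return \%copy;
}

# Hypothetical parsed input mixing terms with gene-level data.
my $raw = {
    kind     => 'term',
    children => {
        'kinase' => {
            kind     => 'term',
            children => {
                'protein kinase' => { kind => 'term', children => {} },
                'example_gene_1' => { kind => 'gene' },
                'example_ref_1'  => { kind => 'reference' },
            },
        },
    },
};

# Result: a nested hash of term names only, with empty hashes as leaves.
my $terms = prune_to_terms($raw);
```

Changing where the tree stops would then come down to changing the test
on 'kind' (or whatever marks a non-term node in the real data).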

Because this module doesn't really "fit" anywhere in the current BioPerl
structure, and because the .xml files that it is based on are still
quite fluid (and thus the module will likely have to be tweaked quite
extensively until things settle down), I don't feel that it is worth
adding into the BioPerl repository at this time.  However, I would be
glad to share it with anyone who might find it useful, with all the
usual disclaimers :-)

Let me know,

Cheers all!


Dr. Mark Wilkinson
Bioinformatics Group
National Research Council of Canada
Plant Biotechnology Institute
110 Gymnasium Place
Saskatoon, SK