[Bioperl-l] [RFC] Interolog::Walk
G.Gallone at sms.ed.ac.uk
Wed Aug 18 10:57:01 EDT 2010
Hello BioPerl community - I've written a new module called
Interolog::Walk that I'm planning to put on CPAN. I would be grateful if
you could take a look at the brief description I attached and tell me
what you think. I'll be more than happy to post further details should
the module be of interest to anyone.
Also, I am not entirely sure I have chosen the right name for it. This
is my first module and it would be great if you could advise on naming
it appropriately. Hopefully the following description will give an idea
of what it does.
Interolog::Walk - Retrieve, score and visualize putative
Protein-Protein Interactions through the orthology-walk method
A common activity in computational biology is to mine
protein-protein interactions from publicly available databases in order
to build Protein-Protein Interaction (PPI) datasets.
In many instances, however, the number of experimentally obtained,
annotated PPIs is very small, and it would be helpful to enrich the
experimental dataset with high-quality, computationally inferred PPIs.
Such computationally obtained datasets can extend, support or enrich
experimental PPI datasets, and are of crucial importance in
high-throughput gene prioritization studies, where they help drive hypotheses
and restrict the dimensionality of many gene function discovery problems.
This Perl module, Interolog::Walk, is aimed at building putative PPI
datasets on the basis of a number of comparative biology paradigms: the
module implements a collection of computational biology algorithms based
on the concept of "orthology projection". If interacting proteins A and
B in organism X have orthologs A' and B' in organism Y, under certain
conditions one can assume that the interaction will be conserved in
organism Y, i.e. the A-B interaction can be "projected through the
orthologies" to obtain a putative A'-B' interaction. The pair of
interactions (A-B) and (A'-B') are called "interologs" (see for instance
Yu et al. and Wiles et al., listed at the end of this message).
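
To make the projection concrete, here is a minimal sketch in Python (chosen only for brevity; it is not the module's Perl code). The function name, the dictionary layout and the toy identifiers are assumptions made for illustration, and the "certain conditions" mentioned above are not modelled: every ortholog of A is simply paired with every ortholog of B.

```python
from itertools import product

def project_interactions(interactions, orthologs):
    """Project interactions from organism X onto organism Y.

    interactions: iterable of (A, B) pairs observed in organism X.
    orthologs:    dict mapping a protein in X to its orthologs in Y.
    Returns a set of unordered putative pairs in Y.
    """
    projected = set()
    for a, b in interactions:
        # Pair every ortholog of A with every ortholog of B.
        # If either protein has no ortholog, nothing is projected.
        for a_prime, b_prime in product(orthologs.get(a, ()), orthologs.get(b, ())):
            projected.add(frozenset((a_prime, b_prime)))
    return projected

# Toy data (hypothetical identifiers): C has no known ortholog in Y,
# so the A-C interaction cannot be projected.
interactions_x = [("A", "B"), ("A", "C")]
orthologs_x_to_y = {"A": ["A'"], "B": ["B'"]}
putative_y = project_interactions(interactions_x, orthologs_x_to_y)
```

With this toy input only the A-B interaction yields an interolog, A'-B'.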
Interolog::Walk collects, analyses and collates gene orthology data
provided by the Ensembl Consortium (www.ensembl.org) as well as PPI data
provided by EBI IntAct (http://www.ebi.ac.uk/intact/). It lets the
user rate the quality and reliability of the putative interactions
collected by means of confidence scores, and optionally outputs
network representations of the datasets that are compatible with
Cytoscape, a widely used tool for visualizing biological networks.
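
As an illustration of what a Cytoscape-compatible output could look like, the sketch below writes interaction pairs in SIF (Simple Interaction Format), one of the plain-text formats Cytoscape can import. This post does not say which format the module actually emits, so the choice of SIF, the function name and the "pp" relation label are assumptions.

```python
def to_sif(pairs, relation="pp"):
    """Render (source, target) pairs as SIF text, one edge per line.

    Each SIF line has the form: source <TAB> relation <TAB> target.
    The pairs are sorted so that the output is deterministic.
    """
    return "".join(f"{a}\t{relation}\t{b}\n" for a, b in sorted(pairs))

# Hypothetical putative interactions for a single query gene.
sif_text = to_sif([("geneA", "geneC"), ("geneA", "geneB")])
```

The resulting text can be saved with a .sif extension and loaded through Cytoscape's network import.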
In order to carry out an interolog walk we start with a set of gene
identifiers in one organism of interest. We query those ids against a
number of comparative biology databases to retrieve a list of
orthologues for each gene id of interest, in one or more species.
In the following step we rely on PPI databases to retrieve the list of
available interactors for the protein ids obtained. The output at this
stage consists of a list of interactors of the orthologues of the
initial gene set, plus several fields of ancillary data.
In the last step of the process we project the interactions - again
using orthology data - back to the original species of interest. The
output of the process is a list of PUTATIVE INTERACTORS of the initial
gene set, plus several fields of ancillary data.
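
The three steps above can be sketched as follows, again in Python and with in-memory dictionaries standing in for the orthology and PPI databases. All names and identifiers here are hypothetical, and the ancillary data fields and confidence scores are left out.

```python
def interolog_walk(gene_ids, orthologs, interactions, back_orthologs):
    """Return putative interactors for each gene of interest.

    orthologs:      gene in the species of interest -> orthologues elsewhere
    interactions:   orthologue -> its known interactors (same species as it)
    back_orthologs: interactor -> orthologues in the species of interest
    """
    putative = {}
    for gene in gene_ids:
        partners = set()
        # Step 1: orthologues of the query gene in other species.
        for orthologue in orthologs.get(gene, ()):
            # Step 2: known interactors of each orthologue.
            for interactor in interactions.get(orthologue, ()):
                # Step 3: project each interactor back to the original species.
                partners.update(back_orthologs.get(interactor, ()))
        putative[gene] = partners
    return putative

# Toy data: flyC has no orthologue back in the species of interest,
# and geneD has no orthologues at all.
orthologs = {"geneA": ["flyA"], "geneD": []}
interactions = {"flyA": ["flyB", "flyC"]}
back_orthologs = {"flyB": ["geneB"]}
result = interolog_walk(["geneA", "geneD"], orthologs, interactions, back_orthologs)
```

Here geneA ends up with one putative interactor, geneB, while geneD ends up with none.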
Given the scope and the focus of the project, I would imagine that
viable alternatives for the namespace might be
There are no similar projects as far as I can see, so I shouldn't run
the risk of overlapping namespaces. Still, I would love to hear your
informed opinion about it.

Yu H, Luscombe NM, Lu HX, Zhu X, Xia Y, Han JD, Bertin N, Chung S,
Vidal M, Gerstein M. "Annotation transfer between genomes:
protein-protein interologs and protein-DNA regulogs." Genome Research

Wiles AM, Doderer M, Ruan J, Gu T-T, Ravi D, Blackman BA, Bishop AJR.
"Building and Analyzing Protein Interactome Networks by Cross-species
Comparisons." BMC Systems Biology 2010, 4:36 - PMID: 20353594