[Bioperl-l] orthologous genes extracting

Nathan (Nat) Goodman natg at shore.net
Mon Aug 16 13:42:51 EDT 2004

Paulo said:
> I have been using Ensmart, Ensembl's web interface for batch retrieval,
> at http://www.ensembl.org/Multi/martview
> I think their orthologues are based on reciprocal BLASTs and sometimes
> also on synteny and whole genome alignment data.

Thanks.  I've seen problems with Ensembl similar to those with HomoloGene,
but as I look through my notes, these may have been pathological cases.  For
example, mouse Casp1 points to two human orthologs, CASP1 and COP.  CASP1 is
a biologically validated ortholog, but COP is pretty similar and adjacent to

I'm wondering if anyone has combined a curated data source, e.g., MGD, with
computed results to avoid cases like this.
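
As a rough illustration of that kind of combination, here is a minimal
Python sketch in which curated pairs override computed ones for any gene
that has a curated entry.  The function name, the input format (plain
(mouse_gene, human_gene) pairs) and the GeneX/GENEX example are assumptions
made for illustration; they are not taken from MGD, Ensembl or HomoloGene.

```python
from collections import defaultdict

def merge_orthologs(computed, curated):
    """Combine computed and curated ortholog pairs (hypothetical helper).

    Both inputs are iterables of (source_gene, target_gene) pairs.
    If a source gene has any curated ortholog, only the curated targets
    are kept; otherwise the computed targets are kept unchanged.
    """
    curated_by_gene = defaultdict(set)
    for src, dst in curated:
        curated_by_gene[src].add(dst)

    merged = defaultdict(set)
    for src, dst in computed:
        # Computed pairs survive only where no curated answer exists.
        if src not in curated_by_gene:
            merged[src].add(dst)
    for src, targets in curated_by_gene.items():
        merged[src] = set(targets)
    return dict(merged)

# Illustrative input only: Casp1 -> CASP1/COP mirrors the case described
# above; GeneX/GENEX is a made-up placeholder with no curated entry.
computed = [("Casp1", "CASP1"), ("Casp1", "COP"), ("GeneX", "GENEX")]
curated = [("Casp1", "CASP1")]
merged = merge_orthologs(computed, curated)
```

With this input the computed Casp1 to COP pair is dropped because a curated
entry exists for Casp1, while GeneX keeps its computed ortholog.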

Looking at the bigger picture, I see a need for "orthology" maps at various
levels of stringency.  What I'm discussing here is the most stringent case
in which we try hard to assign 1 ortholog per species to each gene.  (I
realize the issue is biologically complex and this simple notion will fail
for biological reasons in many cases, which is why I'm hoping someone else
has already solved the problem :)  It would be nice to also offer more
permissive maps including ones based entirely on sequence similarity.
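
One way to picture maps at two levels of stringency is the Python sketch
below.  It assumes a single list of (gene_a, gene_b, score) similarity hits
between two species, which simplifies a real reciprocal BLAST (that would
involve two directional searches).  The function names, the scores and the
GeneX/GeneY placeholders are all hypothetical.

```python
def permissive_map(hits, min_score):
    """Keep every pair whose similarity score reaches min_score."""
    result = {}
    for a, b, score in hits:
        if score >= min_score:
            result.setdefault(a, set()).add(b)
    return result

def strict_map(hits):
    """Keep at most one partner per gene: mutual best-scoring pairs only."""
    best_for_a = {}  # gene_a -> (best gene_b, score)
    best_for_b = {}  # gene_b -> (best gene_a, score)
    for a, b, score in hits:
        if a not in best_for_a or score > best_for_a[a][1]:
            best_for_a[a] = (b, score)
        if b not in best_for_b or score > best_for_b[b][1]:
            best_for_b[b] = (a, score)
    # A pair is kept only if each gene is the other's best hit.
    return {a: b for a, (b, _) in best_for_a.items()
            if best_for_b[b][0] == a}

# Made-up scores, for illustration only.
hits = [
    ("Casp1", "CASP1", 0.9),
    ("Casp1", "COP", 0.7),
    ("GeneX", "GENEX", 0.8),
    ("GeneY", "GENEX", 0.6),
]
strict = strict_map(hits)
permissive = permissive_map(hits, min_score=0.65)
```

The strict map assigns one partner per gene and drops anything that is not
a mutual best hit, while the permissive map keeps every pair above the
chosen threshold.  Note that a purely score-based strict map cannot by
itself tell a validated ortholog from a similar neighbouring gene when the
scores are close, which is the case where curated data would help.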

