[Bioperl-l] Reciprocal best blast hits using BioPerl?

Chris Larsen clarsen at vecna.com
Mon Jan 18 12:42:13 EST 2010


Bhakti, (and Chris, Mark)--

Yes, there is some Perl available to parse reciprocal best BLAST hits.

The archived post Mark referenced was mine; we were looking to do what
you want to do. I'll continue that thread here.

We ended up implementing OrthoMCL 1.4, as Chris F pointed to, and then
wrote a simple Perl parser that takes the raw OrthoMCL output, does
splits, and spits out a delimited table of all the orthologs in a
group (for, say, the genus Mycobacterium) so you can stuff it into
DBLoader.
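
For readers who want a feel for that parsing step, here is a minimal
sketch in Python. It is not the Perl script from the SOP. It assumes
each OrthoMCL 1.4 group line looks like
"ORTHOMCL1(3 genes,1 taxa): id1(taxon) id2(taxon) ...", so check that
against your own output file first. The function names, the "Ftu"
taxon code, and the genus label argument are made up for illustration.

```python
import re

# Assumed line layout (an assumption, verify against your own output):
#   ORTHOMCL1(3 genes,1 taxa):<tab> 110321310(Ftu) 110321361(Ftu) ...
GROUP_RE = re.compile(r"^ORTHOMCL(\d+)\([^)]*\):\s*(.*)$")
# One member looks like "sequence_id(taxon_code)".
MEMBER_RE = re.compile(r"([^\s()]+)\(([^)]+)\)")


def parse_groups(lines, label):
    """Yield (label, group_number, member_id) for each member of each group.

    Lines that do not match the assumed group layout are skipped.
    """
    for line in lines:
        match = GROUP_RE.match(line.strip())
        if not match:
            continue
        group_number = int(match.group(1))
        for member_id, _taxon in MEMBER_RE.findall(match.group(2)):
            yield (label, group_number, member_id)


def to_table(lines, label, sep=" "):
    """Return the parsed groups as a delimited table, one member per row."""
    rows = parse_groups(lines, label)
    return "\n".join(sep.join(str(field) for field in row) for row in rows)


if __name__ == "__main__":
    # Hypothetical input line; "Ftu" is an invented taxon code.
    sample = ["ORTHOMCL1(2 genes,1 taxa):\t 110321310(Ftu) 110321361(Ftu)\n"]
    print(to_table(sample, "Francisella"))
```

Passing sep="\t" gives a tab-delimited table instead, which may suit a
database loader better.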

The link to the script, SOP, and method is at:
http://www.biohealthbase.org/brcDocs/documents/BHB_ORTHOLOG_SOP.pdf

Giving e.g.:

Francisella 1 110321310
Francisella 1 110321361
Francisella 1 56707275
Francisella 1 56707366
Francisella 1 56707462

Those are five members of Ortholog Group 1, listed by GI number only.
You can see the results of that parsing, backed by a database and used
to load BioHealthbase with all the reciprocal best BLAST hits plus
other OrthoMCL parsing, for mycobacterial PolA at:

http://www.biohealthbase.org/brc/details.do?locus=MAV_3155&decorator=mycobacterium

See? Pretty? We were just interested in making ortholog groups on the
basis of paralog-conscious reciprocal BLAST hits. Like you. I think the
package and doc I've made do what you want, as long as you stay in
prokaryotes. But careful: garbage in, garbage out. We started with
clean Genuses (or rather, genera). You'll get more junky HUGE and TINY
ortholog groups if you put in different Orders of microbes. It's
taxa-sensitive. OrthoMCL author David Roos is great at it, though, and
designed it with higher unicellular eukaryotes in mind too. Comb the
docs for that; sorry, I was doing bacterial work at the time and can't
guide you if that's what you want. If you end up installing OrthoMCL
1.4, you can pipe the output to this method and get usable results out.

Hope it works for you.

Cheers,

Chris L

-- 

Christopher Larsen, Ph.D.
Sr. Scientist / Grants Manager
Vecna Technologies
6404 Ivy Lane #500
Greenbelt, MD 20770
Phone: (240) 965-4525
Fax: (240) 547-6133
240-737-4525


