[Bioperl-l] Bio::PopGen modules performance

Kevin Thornton kt234 at cornell.edu
Sun Nov 6 22:34:55 EST 2005


First, thanks to Jason and Albert for the plug.

This thread brings up something I've been meaning to post to this  
list for a while now.  Recent versions of libsequence now contain  
definitions for all the data types and functions necessary to do  
coalescent simulations with recombination.  The efficiency is quite  
good, easily on par with "ms", with a few extra nuts and bolts thrown  
in there that can lead to improved efficiency over ms.  Also, the
resulting data structure (i.e. the ancestral recombination graph)
can be accessed directly, and/or mutations can be thrown down on
it, and objects are returned that are compatible with the
summary-statistic calculation factories already in the library.

Here's where bioperl may come in.  I have attempted to create a  
python wrapper for the library (using boost::python), with the  
ultimate goal of mentioning or submitting it to biopython.   
Unfortunately, there appear to be some limitations to boost::python
that will prevent a full python interface to libsequence from  
appearing any time soon.  However, the code is all there for someone  
who's motivated to provide perl wrappers.  It is my understanding  
that a direct perl interface to a C++ API is not possible, or at  
least not easy.  If I'm wrong here, I'd be interested in hearing more  
about it.  However, some basic binaries could be provided which perl  
could call.  While this would be a pain in that it wouldn't be
self-contained perl, it would be quite fast, and potentially quite flexible.
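
To make the "binaries that a script calls" idea concrete, here is a minimal sketch of the wrapper side. It assumes a hypothetical simulator binary that writes ms-style text to standard output (a "//" line starting each replicate, then "segsites:", "positions:", and one haplotype string per sampled chromosome). The function names and the binary's arguments are illustrative assumptions, not part of libsequence or bioperl. The sketch is in Python for brevity; the same pattern should carry over to perl using a piped open or backticks.

```python
import subprocess


def parse_ms_output(text):
    """Parse ms-style text into a list of replicates.

    Each replicate is a dict with keys 'segsites' (int),
    'positions' (list of float) and 'haplotypes' (list of str).
    """
    replicates = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("//"):
            # A "//" line opens a new replicate.
            current = {"segsites": 0, "positions": [], "haplotypes": []}
            replicates.append(current)
        elif current is None or not line:
            # Skip header lines before the first replicate and blank lines.
            continue
        elif line.startswith("segsites:"):
            current["segsites"] = int(line.split(":", 1)[1])
        elif line.startswith("positions:"):
            current["positions"] = [float(x) for x in line.split(":", 1)[1].split()]
        else:
            # Anything else inside a replicate is treated as a haplotype row.
            current["haplotypes"].append(line)
    return replicates


def run_simulator(binary, args):
    """Run an external simulator (hypothetical binary) and parse its stdout."""
    result = subprocess.run(
        [binary, *args], capture_output=True, text=True, check=True
    )
    return parse_ms_output(result.stdout)
```

A call such as run_simulator("./coalsim", ["10", "100"]) (binary name and arguments made up here) would return one dict per replicate, which the calling script could then hand to its own summary-statistic code. The cost is the one noted above: the wrapper depends on an external executable being installed.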

If this sounds interesting to anybody, I'd be willing to discuss this.

Kevin Thornton
Molecular Biology and Genetics
Cornell University
