[Bioperl-l] Bio::PopGen modules performance
kt234 at cornell.edu
Sun Nov 6 22:34:55 EST 2005
First, thanks to Jason and Albert for the plug.
This thread brings up something I've been meaning to post to this
list for a while now. Recent versions of libsequence now contain
definitions for all the data types and functions necessary to do
coalescent simulations with recombination. The efficiency is quite
good, easily on par with "ms", with a few extra nuts and bolts thrown
in there that can lead to improved efficiency over ms. Also, the
resulting data structure (i.e. the ancestral recombination graph)
can be accessed directly, and/or mutations can be thrown down on
it, and the objects returned are compatible with the summary-
statistic calculation factories already in the library.
Here's where bioperl may come in. I have attempted to create a
python wrapper for the library (using boost::python), with the
ultimate goal of mentioning or submitting it to biopython.
Unfortunately, there appear to be some limitations to boost::python
that will prevent a full python interface to libsequence from
appearing any time soon. However, the code is all there for someone
who's motivated to provide perl wrappers. It is my understanding
that a direct perl interface to a C++ API is not possible, or at
least not easy. If I'm wrong here, I'd be interested in hearing more
about it. However, some basic binaries could be provided which perl
could call. While this would be a pain in that it wouldn't be self-
contained perl, it would be quite fast, and potentially quite flexible.
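To make the "binaries which a script could call" idea concrete, here is a minimal sketch, written in Python purely for illustration (the same pattern would apply from perl). It assumes the binary writes "ms"-style text to standard output; that is an assumption, not a description of any existing libsequence program. The names parse_ms_output and run_simulator are hypothetical and are not part of libsequence or bioperl.

```python
import subprocess


def parse_ms_output(text):
    """Parse ms-style text into a list of replicates.

    Each replicate is a dict with 'positions' (list of float) and
    'haplotypes' (list of strings made of '0' and '1').
    """
    replicates = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("//"):
            # "//" opens a new replicate
            current = {"positions": [], "haplotypes": []}
            replicates.append(current)
        elif current is None or not line:
            continue  # header lines (command line, seeds) and blank lines
        elif line.startswith("segsites:"):
            continue  # the count is implied by the haplotype length
        elif line.startswith("positions:"):
            current["positions"] = [float(x) for x in line.split()[1:]]
        elif set(line) <= {"0", "1"}:
            current["haplotypes"].append(line)
        # anything else (e.g. tree lines) is ignored in this sketch
    return replicates


def run_simulator(command):
    """Run an external simulator (hypothetical wrapper) and parse its stdout."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return parse_ms_output(result.stdout)
```

With "ms" itself on the PATH, a call such as run_simulator(["ms", "4", "2", "-t", "5.0"]) should return one entry per replicate. The cost of this design is exactly the one noted above: the script depends on an external program rather than being self-contained.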
If this sounds interesting to anybody, I'd be willing to discuss this
Molecular Biology and Genetics