[Bioperl-l] bioperl-db performance
Alex.Zelensky at anu.edu.au
Mon Sep 6 01:30:00 EDT 2004
I have a project which is based on bioperl-db. Until now I've been
using the old (bioperl-1.1 branch) version of the code and schema, but
it is becoming unacceptable (mainly because of the way taxonomy is
stored), so I decided to upgrade to the current version. The new code
is a huge leap forward in terms of design, clarity and consistency.
However, I am experiencing severe performance problems.
For example, retrieving a locally stored GenPept entry consistently
takes 16-17 seconds (by primary or unique key, it doesn't matter),
compared to the 2-3 seconds it takes to get it directly from SRS using
Bio::DB::GenBank, or about 1 second from the old bioperl-db. Also,
getting a species object (I use them a lot) from a local database (new
bioperl-db) that contains nothing but an import of the NCBI taxonomy
takes more than 15 seconds, compared to less than 1 second with the old
bioperl-db. In both cases I use MySQL 4.0.16 on a dual 866 MHz PowerPC
G4 with 768 MB RAM.
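To make timings like these easier to compare across setups, here is a minimal sketch of a repeatable wall-clock measurement. It is written in Python only for illustration and is not part of bioperl-db; `time_call`, `summarize` and `fetch_entry` are hypothetical names, and `fetch_entry` is a stand-in for whatever single lookup is being measured (for example, one retrieval by unique key).

```python
import statistics
import time


def time_call(fn, repeats=5):
    """Run fn() `repeats` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (min, median, max) of a list of durations."""
    return min(durations), statistics.median(durations), max(durations)


def fetch_entry():
    # Hypothetical stand-in: replace with the real lookup being measured.
    time.sleep(0.01)


if __name__ == "__main__":
    fastest, median, slowest = summarize(time_call(fetch_entry))
    print(f"min={fastest:.3f}s median={median:.3f}s max={slowest:.3f}s")
```

Reporting the minimum and median over several runs, rather than a single run, helps separate a consistently slow lookup from a one-off delay such as a cold cache or connection setup.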
So, my questions are:
1. Is this performance drop expected behavior (due to the increased
complexity of the code and the new schema)?
2. If the answer to (1) is yes, what is the way to improve it, and
how big an improvement can be achieved?
3. If the answer to (1) is no, where should I look for my problem?
There was a related question on this list in May 2004, but it described
sequence loading performance on a significantly slower machine, and the
suggestion was to increase the horsepower.
Thanks in advance!