[Bioperl-l] reducing time

Josh Lauricha laurichj at bioinfo.ucr.edu
Wed Jan 28 14:23:14 EST 2004

On Wed 01/28/04 12:30, Gregory Wilson wrote:
> Not to get into a FS war, but you could try ReiserFS.  Specifically the
> quote: "ReiserFS is about eight to fifteen times faster than Ext2 at
> handling files smaller than 1K." More about reiserfs is at
> http://www.namesys.com/

I've been using ReiserFS for quite some time, and it really does handle
very large directories of small files quite well. For instance:

$ time find -type f | wc
  28581   28581  389134

real    0m0.151s
user    0m0.060s
sys     0m0.100s

$ time find -type f | wc
 228648  228648 2312804

real    0m1.169s
user    0m0.330s
sys     0m0.910s

In both cases, each file holds a single protein: the first set has about
28k proteins, the second about 228k (about 900MB).
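
For anyone who would rather repeat this kind of measurement from a script
than from the shell, here is a rough Python sketch. It counts regular files
under a directory tree, much as `find -type f | wc -l` would, and reports
wall-clock time. The name `count_files` and the command-line handling are
only illustrative, and the numbers will depend heavily on whether the
directory is already in the kernel's cache.

```python
import os
import sys
import time


def count_files(root):
    """Count regular files under root, roughly like `find -type f | wc -l`."""
    total = 0
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                # Do not follow symlinks, so links are neither descended
                # into nor counted as regular files.
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    total += 1
    return total


def main():
    # Directory to scan: first argument, or the current directory.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    start = time.perf_counter()
    total = count_files(root)
    elapsed = time.perf_counter() - start
    print(f"{total} files in {elapsed:.3f}s")


if __name__ == "__main__":
    main()
```

Running it twice in a row on the same directory gives a feel for how much
of the time is cold-cache disk access rather than the filesystem's
directory lookup itself.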

Of course, ls is going to take FOREVER on these. That's not because of
the fs, but rather because ls sorts the list before printing it. Without
the sorting:

$ time ls -f -l >/dev/null 

real    0m3.961s
user    0m2.530s
sys     0m1.430s
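
The same sorted-versus-unsorted distinction can be sketched in Python. This
is only an approximation of what ls does, under the assumption that sorting
by name is the extra step; the helper names and the throwaway test
directory are made up for illustration.

```python
import os
import tempfile
import time


def names_unsorted(path):
    """Return entry names in the order the filesystem yields them.

    Roughly what `ls -f` prints, minus the `.` and `..` entries.
    """
    with os.scandir(path) as entries:
        return [entry.name for entry in entries]


def names_sorted(path):
    """Return entry names sorted by name, the extra step plain `ls` does."""
    return sorted(names_unsorted(path))


def main():
    # Hypothetical test data: 2000 empty files in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(2000):
            open(os.path.join(tmp, f"protein_{i:06d}.fasta"), "w").close()

        for label, func in (("unsorted", names_unsorted), ("sorted", names_sorted)):
            start = time.perf_counter()
            names = func(tmp)
            elapsed = time.perf_counter() - start
            print(f"{label}: {len(names)} entries in {elapsed:.4f}s")


if __name__ == "__main__":
    main()
```

Note that Python's `sorted()` compares strings by code point, while GNU ls
sorts using the locale's collation rules, so the sort cost here is likely
to understate what ls pays on a directory of a few hundred thousand names.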

That said, XFS and JFS will probably have similar performance.

Anyhow, the Perl hack would probably be simplest.

| Josh Lauricha            |
| laurichj at bioinfo.ucr.edu |
| Bioinformatics, UCR      |
