[Bioperl-l] Memory Leak in Bio::SearchIO

Clarke, Wayne ClarkeW at AGR.GC.CA
Tue May 16 17:24:51 EDT 2006

Thanks Chris, 

I did forget to mention however that I did parse one single report and
found no problems, it finished fast and with no noticeable memory usage.
I will consider getting my SA to update bioperl from CVS as a precaution
but he has already stated he prefers to wait for the release of v1.5.
Even a single job of 10000 will finish but the problem is that I am
trying to loop through many jobs of 10000 and it seems to be additive
for reasons I can not determine. During testing I noticed that the RSS
on top decreased around 80% MEM usage, but then the shared mem
increased. I am wondering if this is due to the perl garbage collector
freeing up memory but keeping it in its pool for use. If so, that is fine
as long as it does not then want to reach into swapped mem.
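
One way to test whether a helper object is holding references to parsed
results (a possibility raised elsewhere in this thread) is a small
instance counter. The sketch below is only an illustration under stated
assumptions: Tracked, LeakyHelper, CleanHelper, push_result and
live_after_loop are hypothetical names invented for the example, and they
are not part of BioPerl or of the AAFCBLAST/BEAST_UPDATE code. It relies
only on the fact that Perl calls DESTROY when the last reference to an
object goes away.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a parsed result object; counts live instances.
package Tracked;
our $live = 0;
sub new     { my ($class) = @_; $live++; return bless {}, $class; }
sub DESTROY { $live--; }

# Hypothetical helper that keeps a reference to every object it is handed.
package LeakyHelper;
sub new         { return bless { seen => [] }, shift; }
sub push_result { my ($self, $obj) = @_; push @{ $self->{seen} }, $obj; return; }

# Hypothetical helper that uses the object and lets it go.
package CleanHelper;
sub new         { return bless {}, shift; }
sub push_result { my ($self, $obj) = @_; return; }

package main;

# Run $n iterations and report how many Tracked objects are still alive.
sub live_after_loop {
    my ($helper, $n) = @_;
    for (1 .. $n) {
        my $obj = Tracked->new;       # goes out of scope each iteration
        $helper->push_result($obj);
    }
    return $Tracked::live;
}

my $clean      = CleanHelper->new;
my $clean_live = live_after_loop($clean, 1000);
my $leaky      = LeakyHelper->new;
my $leaky_live = live_after_loop($leaky, 1000);

print "clean helper: $clean_live live objects\n";
print "leaky helper: $leaky_live live objects\n";
```

If the count of live objects grows with the number of iterations, something
outside the loop is still referencing them. A flat count with a growing
process size would instead point at the interpreter keeping freed memory
in its own pool.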

Thanks again, Wayne

-----Original Message-----
From: Chris Fields [mailto:cjfields at uiuc.edu] 
Sent: Tuesday, May 16, 2006 3:15 PM
To: Clarke, Wayne; bioperl-l at lists.open-bio.org
Subject: RE: [Bioperl-l] Memory Leak in Bio::SearchIO

I mentioned two possibilities last time I posted: 1) that the BLAST file
is too large, or 2) that you are using an old version of bioperl that
is broken.  You seem to fit #2. 

The issue is that NCBI does not consider text BLAST output sacrosanct and
routinely makes changes to it that break parsing.  Due to this,
SearchIO::blast needs to be constantly updated, so much so that there are
normally a few updates a year to fix parsing issues in that module alone
compared to BioPerl as a whole.  And, BTW, although bioperl-1.4 is about
years old now, even bioperl-1.5.1 SearchIO is broken when it comes to the
latest NCBI BLAST (2.2.14 now).  I seriously suggest updating your local
bioperl distribution to the latest bioperl-live (from CVS).

Take one of those 10000 reports, just one, and try parsing it.  If you see
the same problem (a CPU spike and increasing memory usage) then it may be
fixed in CVS.


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Clarke, Wayne
> Sent: Tuesday, May 16, 2006 3:57 PM
> To: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Memory Leak in Bio::SearchIO
> With regards to the suggestions/comments made, thank you. However I think
> I should clear a few things up. I am running bioperl v1.4, I am looping
> through the blast reports, which should not be of absurd size since they
> only contain the top 5 hits, and I am using top to track (although I
> realize fairly inaccurately) the memory usage. I have looked through the
> code for both AAFCBLAST and BEAST_UPDATE but do not believe the
> leak/problem to be contained within them since they are almost
> exclusively using method calls and those variables should be destroyed
> upon leaving the scope of the method. I have used Devel::Size to check
> the size of the variables $bdbi and $searchio and $connector and on each
> iteration these variables have the same size. Any other suggestions
> would be greatly appreciated as I have nearly gone insane trying to
> track this problem down.
> Thanks, Wayne
> -----Original Message-----
> From: Torsten Seemann [mailto:torsten.seemann at infotech.monash.edu.au]
> Sent: Monday, May 15, 2006 6:19 PM
> To: Clarke, Wayne
> Cc: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Memory Leak in Bio::SearchIO
> > taking up a huge amount of RAM. For a single job of 10000 queries it
> > can consume as much as a couple hundred Mb inside an hour. I realize
> >  my $result = $connector->getQueryResult($query_id);
> >                 my $searchio = new Bio::SearchIO(-format => "blast",
> >                 while (my $o_blast = $searchio->next_result()) {
> >                         my $clone_id = $o_blast->query_name();
> >                         my $statement = $bdbi->form_push_SQL($o_blast, $clone_id, 5); }
> Some comments:
> Have you considered that whatever class/module $bdbi belongs to is
> causing the problem? ie. is it keeping a reference to $o_blast around?
> Are you aware that Perl garbage collection does not necessarily return
> freed memory back to the OS? This may affect how you were measuring
> "memory usage".
> --
> Dr Torsten Seemann               http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
