[Bioperl-l] A better fix for Species.pm memory leak, please REVIEW.

Chris Fields cjfields at illinois.edu
Tue Nov 25 11:06:57 EST 2008

George (and anyone else paying attention),

I have attached a Bio::Species patch to bug 2594:


I would like comments back on that patch as soon as anyone can test it  
out.  If there aren't any objections I'll probably commit it in a  
week's time and close the bug out.

For the record, I don't see memory issues and the API is intact (still  
a Bio::Taxon), but it's a fairly drastic change as Bio::Species is now  
just a proxy.  No idea on how this affects parsing speed either and it  
hasn't been tested with Devel::Cycle, so benchmarks are welcome; I  
would guess that it's slightly slower (extra object being generated),  
something I can deal with as long as it doesn't suck memory away on  
long parses.



On Nov 24, 2008, at 10:59 PM, George Hartzell wrote:

> Chris Fields writes:
>> On Nov 23, 2008, at 8:31 PM, George Hartzell wrote:
>>> Chris Fields writes:
>>>> On Nov 23, 2008, at 6:19 AM, Sendu Bala wrote:
>>>>> But anyway, thanks George, go ahead and commit if it all seems
>>>>> functional.
>>>>> If you could summarise your solution in the bug report(s?) as  
>>>>> well,
>>>>> that would be great.
>>>> Agreed, though I think we need to ensure that memory leaks don't
>>>> permeate into Bio::Taxon when we make the switch (it shouldn't, I
>>>> think).  BTW, checking Blame/Annotate it looks like some of this
>>>> stems
>>>> from r10974:
>>>> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=10974
>>> The change that I proposed wouldn't cause anything to permeate into
>>> Bio::Taxon.
>>> It's really a design problem with Bio::Species.  A Species "has-a"
>>> reference to a tree *and* when the Species creates that tree it asks
>>> the tree constructor to initialize itself using the Species itself
>>> which results in the tree including a reference to that Species
>>> object.  Et voila, circular reference.
>> My thoughts as well.
>>> It's not helped by the fact that Bio::Species explicitly disconnects
>>> all of the tree cleanup code that was supposed to handle those
>>> circular references.  On the other hand there are a bunch of  
>>> comments
>>> that suggest that the cleanup code never worked right.
>> The cleanup code relies on the object being destroyed (either via
>> garbage collection or explicitly).  Since there is a circular
>> dependency and DESTROY isn't explicitly called, the root cleanup
>> method won't work.
>>> ...
>>> I'm not in front of that machine at the moment, but I think that  
>>> there
>>> are a couple of other places in the src tree that use the -node or
>>> -root arg's to Bio::Tree::Tree::new and should probably be checked  
>>> for
>>> leaks.
>> A possible solution would be to make Bio::Species a shell or proxy
>> object that decorates a Bio::Taxon instance (and a Bio::Tree::Tree
>> where needed) and just delegates the appropriate methods to them.  It
>> could still inherit from Bio::Taxon; it would just override the
>> Bio::Taxon methods for delegation.  The Bio::Tree::Tree can refer to
>> the Bio::Taxon object (and vice versa), but neither refer back to the
>> Species object directly.
>> In this case, when the last ref to the Bio::Species is released it
>> should be garbage collected and trigger DESTROY (which is set up in
>> Bio::Root::Root).  This in turn triggers any cleanup methods, such as
>> calling DESTROY on the Tree and Taxon and get rid of memory leaks  
>> (and
>> wouldn't require weaken/is_weak).
>> [...]
> How will this interact with the places that Bio::Species unhooks the
> various cleanup mechanisms that Tree sets up.  There are comments that
> talk about the sub ref screwing with someone's use of Storable.   It
> seems like if the new Species has cleanup code registered then there's
> a problem with whom-ever had a problem with it before.
>> I'm perfectly willing to test that out if no one else is (I am  
>> working
>> on a few other bugs at the moment but I can probably get to it in the
>> next day or two).  Not to tread on any toes, but I would like to get
>> this one taken care of prior to 1.6 and these are fairly nasty memory
>> issues.  If there is any consolation it shouldn't be too hard to re-
>> rig the methods to delegate appropriately.
>>> I don't really feel hot or bothered about this fix vs. the  
>>> combination
>>> of weaken and stashing the subspecies.  It depends on what the  
>>> author
>>> intended for Species and its tree, neither are particularly sexy and
>>> if Species is really on its way out the door it may not matter.....
>>> g.
>> I don't think it will matter post-1.6, but we'll have to support
>> Bio::Species for the 1.6 release series.  So it will have to be
>> maintained for the short term, until 1.7 comes out.
> I can't poke at this stuff much until the weekend.  If you can kick
> it, go for it!
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Marie-Claude Hofmann
College of Veterinary Medicine
University of Illinois Urbana-Champaign

More information about the Bioperl-l mailing list