[Bioperl-guts-l] [Bug 2773] Bio::Tree::Node gets destroyed even though it is still live

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Mon Jun 1 14:52:00 EDT 2009


------- Comment #5 from cjfields at bioperl.org  2009-06-01 14:51 EST -------
(In reply to comment #4)
> (In reply to comment #3)
> Comment #2, that is. Sorry; doh.
> > I willing to look at this if the reporter responds to Comment #3. Please
> > isolate the bug with a short snippet of code.
> > We also love patches for hacks. 
> > cheers MAJ

Mark, you are more than welcome to take a look; a new pair of eyes on this
would help.  I think this is a good instance where we can take this to a
feature branch and sort it out.

Here are my extended thoughts from looking into this previously:

The problem appears to be related to Bio::Species memory issues brought about
by various refactors (introduced in 1.5.2).  Note that these refactors were
necessary for the eventual deprecation of Bio::Species and were possible thanks
to a lot of hard work by Sendu Bala.

Using weaken() is a nice convenience when it works as expected; Moose uses it
quite a bit.  In this case, as has been reported previously, it apparently
isn't working as expected with respect to Bio::Tree::Tree: nodes appear to be
GC'd prematurely.  We have two options: (1) fix weak refs in BioPerl so they
aren't GC'd early, or (2) remove use of weaken() completely from Bio::Tree::*
and explicitly GC everything ourselves.  I'm leaning toward the latter because
it's cleaner (we're not relying on weaken() 'black magic') and easier to debug.
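
To illustrate the kind of premature collection described above, here is a
minimal sketch.  It assumes a hypothetical Demo::Node class (not the actual
Bio::Tree::Node code) in which each child holds only a weak reference to its
parent.  If the caller keeps a child but lets the last strong reference to the
parent go, the parent is freed even though the child is still in use:

```perl
package Demo::Node;    # hypothetical class, for illustration only
use strict;
use warnings;
use Scalar::Util qw(weaken);

sub new {
    my ($class, $name) = @_;
    return bless { name => $name, children => [], parent => undef }, $class;
}

sub add_child {
    my ($self, $child) = @_;
    push @{ $self->{children} }, $child;   # parent -> child is a strong ref
    $child->{parent} = $self;
    weaken($child->{parent});              # child -> parent is a weak ref
    return $child;
}

package main;

my $leaf;
{
    my $root = Demo::Node->new('root');
    $leaf = $root->add_child(Demo::Node->new('leaf'));
}
# $root has gone out of scope; the only remaining reference to it was weak,
# so the weak ref in $leaf->{parent} has been set to undef.
print defined $leaf->{parent} ? "parent alive\n" : "parent gone\n";
# prints "parent gone"
```

Whether this is exactly what happens inside Bio::Tree::* is an assumption on
my part; the sketch only shows that weak parent links make a node's ancestors
depend on someone else holding them strongly.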

Bio::Tree::Tree should act as a decent enough proxy class, i.e. it is capable
of cleaning up the various contained Nodes when garbage-collected (primarily by
stepping through Nodes and destroying parent/child refs).  Thus it (and Nodes)
shouldn't need weaken() if set up correctly, namely by calling node_cleanup()
when the Tree is being gc'd, either via _root_cleanup_methods or DESTROY.  I
have toyed around with this by removing all uses of weaken() within Node and
found no memory leaks involving Bio::Tree::Tree or Node directly.
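
The explicit-cleanup approach can be sketched roughly as follows.  Demo::Node
and Demo::Tree are hypothetical stand-ins, and this node_cleanup() is my own
simplified version, not the BioPerl method of the same name.  Parent and child
links are both strong, so the nodes form cycles; the tree's DESTROY walks the
nodes and breaks those links so that ordinary reference counting can reclaim
them:

```perl
package Demo::Node;    # hypothetical class, for illustration only
use strict;
use warnings;

our $live = 0;         # number of live nodes, tracked for the demo

sub new {
    my ($class, $name) = @_;
    $live++;
    return bless { name => $name, children => [], parent => undef }, $class;
}

sub add_child {
    my ($self, $child) = @_;
    push @{ $self->{children} }, $child;
    $child->{parent} = $self;      # strong ref: parent and child form a cycle
    return $child;
}

# Break parent/child links on this node and, recursively, its descendants.
sub node_cleanup {
    my ($self) = @_;
    my @children = @{ $self->{children} };
    $self->{children} = [];
    $self->{parent}   = undef;
    $_->node_cleanup for @children;
    return;
}

sub DESTROY { $live-- }

package Demo::Tree;    # hypothetical owner of the root node

sub new {
    my ($class, $root) = @_;
    return bless { root => $root }, $class;
}

# When the tree goes away, explicitly dismantle the node cycles.
sub DESTROY {
    my ($self) = @_;
    $self->{root}->node_cleanup if $self->{root};
}

package main;

{
    my $root = Demo::Node->new('root');
    $root->add_child(Demo::Node->new('leaf'));
    my $tree = Demo::Tree->new($root);
}
print "live nodes after tree destruction: $Demo::Node::live\n";
# prints "live nodes after tree destruction: 0"
```

One trade-off of this design: a node that the caller still holds after the
tree is destroyed survives, but its parent and child links have been cleared.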

The example script below requires Devel::Cycle and Devel::Leak::Object and was
run on Mac OS X using perl 5.10; all instances of weaken() were removed from
Bio::Species and Bio::Tree::Node, and root cleanup methods are in place (i.e.
called upon DESTROY).  Bio::Species leaks, but Bio::Tree doesn't (it is gc'd
correctly).  Of note, removing weaken() causes many Tree-related tests to fail,
but my guess is those can be fixed and were probably part of the original 1.5.2
refactoring.  Also, I cheat a bit to get at the Bio::Tree::Tree within Species
(by reading $species->{tree} directly).

(Side note: interestingly, other uncollected instances are popping up as well:
Config, Errno, POSIX::SigRt.  This may indicate small issues with
Bio::Root::Root; they pop up with any 'use Bio::*' statement, no instantiation
needed.  These appear minor.)


#!/usr/bin/perl -w

use strict;
use warnings;

# GLOBAL_bless tracks every blessed object; survivors are reported at exit
use Devel::Leak::Object qw{ GLOBAL_bless };
use Devel::Cycle;
use Bio::Species;
use Bio::TreeIO;

my $species = Bio::Species->new();
$species->classification(qw( sapiens Homo Hominidae
                             Catarrhini Primates Eutheria
                             Mammalia Vertebrata Chordata
                             Metazoa Eukaryota ));

# Cheat: reach into the object to get at its internal Bio::Tree::Tree
print "Bio::Tree::Tree nodes:".$species->{tree}->get_nodes."\n";

print "Bio::Species cycles:\n";

find_cycle($species); # one cycle

# Same checks on a tree read directly from a file
my $treeio = Bio::TreeIO->new(
                             -format => 'nhx',
                             -file   => 'test.nhx');
my $tree = $treeio->next_tree;

print "Bio::Tree::Tree nodes:".$tree->get_nodes."\n";

print "Bio::Tree::Tree cycles:\n";

find_cycle($tree); # no cycles

Output:

Bio::Tree::Tree nodes:11
Bio::Species cycles:
Cycle (1):
            $Bio::Species::A->{'tree'} => \%Bio::Tree::Tree::B          
        $Bio::Tree::Tree::B->{'_rootnode'} => \%Bio::Taxon::C               
             $Bio::Taxon::C->{'_desc'} => \%D                           
                            $D->{'20'} => \%Bio::Taxon::E               
             $Bio::Taxon::E->{'_desc'} => \%F                           
                            $F->{'18'} => \%Bio::Taxon::G               
             $Bio::Taxon::G->{'_desc'} => \%H                           
                            $H->{'16'} => \%Bio::Taxon::I               
             $Bio::Taxon::I->{'_desc'} => \%J                           
                            $J->{'14'} => \%Bio::Taxon::K               
             $Bio::Taxon::K->{'_desc'} => \%L                           
                            $L->{'12'} => \%Bio::Taxon::M               
             $Bio::Taxon::M->{'_desc'} => \%N                           
                            $N->{'10'} => \%Bio::Taxon::O               
             $Bio::Taxon::O->{'_desc'} => \%P                           
                             $P->{'8'} => \%Bio::Taxon::Q               
             $Bio::Taxon::Q->{'_desc'} => \%R                           
                             $R->{'6'} => \%Bio::Taxon::S               
             $Bio::Taxon::S->{'_desc'} => \%T                           
                             $T->{'4'} => \%Bio::Taxon::U               
             $Bio::Taxon::U->{'_desc'} => \%V                           
                             $V->{'1'} => \%Bio::Species::A             

Bio::Tree::Tree nodes:13
Bio::Tree::Tree cycles:
Tracked objects by class:
Bio::DB::Taxonomy::list                  1
Bio::Species                             1
Bio::Taxon                               10
Bio::Tree::Tree                          1
Config                                   1
Errno                                    1
POSIX::SigRt                             1
