[Bioperl-guts-l] [Bug 2877] New: [Bio::Tree::Tree] some bootstrap scores assigned to the wrong node after root()

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Thu Jul 9 14:28:33 EDT 2009


http://bugzilla.open-bio.org/show_bug.cgi?id=2877

           Summary: [Bio::Tree::Tree] some bootstrap scores assigned to the
                    wrong node after root()
           Product: BioPerl
           Version: 1.6 branch
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Core Components
        AssignedTo: bioperl-guts-l at bioperl.org
        ReportedBy: tristan.lefebure at gmail.com
                CC: tristan.lefebure at gmail.com


Hello,

I have been running into problems when rerooting trees that
contain bootstrap scores. Basically, after rerooting the
tree, some scores end up at the wrong place (i.e. node) and
some nodes lose their score. I found this thread from Bank
Beszteri, back in 2007, that describes exactly the same
problem:

http://lists.open-bio.org/pipermail/bioperl-l/2007-May/025599.html

Below I paste a script that reproduces the bug and implements the
fix that Bank described (at least this is my understanding,
and it works on this example):


#! /usr/bin/perl

use strict;
use warnings;
use Bio::TreeIO;


my $in = Bio::TreeIO->new(-format => 'newick',
   -fh => \*DATA,
   -internal_node_id => 'bootstrap');

my $out = Bio::TreeIO->new(-format => 'newick',
   -file => ">out.tree");

while ( my $t = $in->next_tree ) {
    my $old_root = $t->get_root_node();
    my ($b_node) = $t->find_node(-id => 'B');
    my $b_anc    = $b_node->ancestor;

    # tree 1: the input tree, unchanged
    $out->write_tree($t);

    # tree 2: reroot on B -> wrong, and the tree is kind of weird
    $t->reroot($b_node);
    $out->write_tree($t);

    # tree 3: reroot on B's original ancestor -> wrong
    $t->reroot($b_anc);
    $out->write_tree($t);

    # tree 4: a fix, following Bank Beszteri's description: walk from the
    # old root up to the new root, moving each score from a node's new
    # ancestor onto the node itself
    my $node = $old_root;
    while ( my $anc_node = $node->ancestor ) {
        $node->bootstrap( $anc_node->bootstrap() );
        $anc_node->bootstrap('');
        $node = $anc_node;
    }
    $out->write_tree($t);    # -> good this time
}


__DATA__
(A:52,(B:46,C:50)68:11,D:70);


Here is the output:

(A:52,(B:46,C:50)68:11,D:70);
((C:50,(A:52,D:70):11)68:46)B;
(B:46,C:50,(A:52,D:70):11)68;
(B:46,C:50,(A:52,D:70)68:11);


Trees #2 and #3 have the score 68 moved to the wrong node,
while tree #4 is OK. (BTW, tree #2 is really weird: unless B
is the real ancestor (a fossil?), it does not make much sense
to me.)

My understanding is that the problem comes from the
well-known difficulty of distinguishing node labels from
branch labels in newick trees. Bootstrap scores are branch
attributes, not node attributes, but since Bio::TreeI has no
branch/edge/bipartition object they are attached to a node,
and in fact reflect the bootstrap score of the ancestral
branch leading to that node. Trouble naturally comes when
you deal with an unrooted tree or reroot a tree: a
child can become an ancestor, and if the bootstrap score
is not moved from the old child to the new child, it
ends up attached to the wrong node.
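
To make the mechanism concrete, here is a toy sketch in plain Perl. It
does not use BioPerl at all: the hash layout and the sub names
reroot_naive and shift_support are made up for this illustration, and
the tree is the example (A,(B,C)68,D) with R standing for the root and
X for the (B,C) node. It only assumes that each node stores the support
of the branch leading to its parent.

```perl
use strict;
use warnings;

# Toy model (NOT the BioPerl data structure): each node records its parent
# and the support of the branch leading to that parent.
my %node = (
    R => { parent => undef, support => undef },
    A => { parent => 'R',   support => undef },
    D => { parent => 'R',   support => undef },
    X => { parent => 'R',   support => 68    },
    B => { parent => 'X',   support => undef },
    C => { parent => 'X',   support => undef },
);

# Reroot by reversing the parent pointers on the path from the new root to
# the old root. Returns that path, new root first. Support values are left
# where they were, which is the situation described above.
sub reroot_naive {
    my ($tree, $new_root) = @_;
    my @path = ($new_root);
    while ( defined( my $p = $tree->{ $path[-1] }{parent} ) ) {
        push @path, $p;
    }
    $tree->{$new_root}{parent} = undef;
    for my $i ( 1 .. $#path ) {
        $tree->{ $path[$i] }{parent} = $path[ $i - 1 ];
    }
    return @path;
}

# Move each support value across its reversed branch, so that it sits on
# the new child end. Walking from the old root keeps unread values intact.
sub shift_support {
    my ($tree, @path) = @_;
    for ( my $i = $#path ; $i >= 1 ; $i-- ) {
        $tree->{ $path[$i] }{support} = $tree->{ $path[ $i - 1 ] }{support};
    }
    $tree->{ $path[0] }{support} = undef;
}

my @path = reroot_naive( \%node, 'X' );
print "naive:   68 is on ",
    ( grep { defined $node{$_}{support} } sort keys %node ), "\n";
shift_support( \%node, @path );
print "shifted: 68 is on ",
    ( grep { defined $node{$_}{support} } sort keys %node ), "\n";
```

After the naive reroot the score is still on X, which is now the root
and has no ancestral branch. After the shift it is on R, which is now
the (A,D) node, and this matches tree #4 in the output above.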

I see two possible fixes:

1- Incorporate Bank's fix into the reroot() method. I.e. if
there are bootstrap scores, then after rerooting, those on the
path from the old root to the new root should each be moved to
the node at the other end of their branch.

2- Modify the way trees are stored in BioPerl to incorporate
branch/edge/bipartition objects, and move the bootstrap
scores to them. That won't be easy and will break many
things...
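
As a rough idea of what option 2 could look like, here is a minimal
sketch in plain Perl that keys a support value by the bipartition of
the leaf set that its branch induces, rather than by a node. The sub
split_key and the %support hash are hypothetical and not part of
BioPerl. This only illustrates why such a key does not depend on where
the root is; it is not a proposal for the actual object design.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): canonical key for the split
# that a branch induces on the leaves. Of the two sides, keep the one that
# does not contain the alphabetically first leaf.
sub split_key {
    my ($below, $all_leaves) = @_;
    my %below  = map { $_ => 1 } @$below;
    my @sorted = sort @$all_leaves;
    my @other  = grep { !$below{$_} } @sorted;
    my @side   = $below{ $sorted[0] } ? @other : sort keys %below;
    return join ',', @side;
}

my @leaves = qw(A B C D);
my %support;

# Rooted as in the input tree: the branch labelled 68 has B and C below it.
$support{ split_key( [qw(B C)], \@leaves ) } = 68;

# After rerooting on the (B,C) node, the same branch has A and D below it,
# yet it maps to the same key, so the score is found without moving it.
my $key_after = split_key( [qw(A D)], \@leaves );
print "$key_after => $support{$key_after}\n";    # B,C => 68
```

With this kind of key, rerooting would not need to touch the scores at
all, but every reader and writer of bootstrap values would have to go
through the new lookup, which is where the breakage would come from.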

What do you think?

