[Bioperl-l] Bootstrap, root, reroot...

Tristan Lefebure tristan.lefebure at gmail.com
Thu Jul 9 14:30:57 EDT 2009


Done. bug #2877.
-Tristan

On Thursday 09 July 2009 14:02:01 Mark A. Jensen wrote:
> Hi Tristan--
> Would you enter this in bugzilla? I did an overhaul of
> the root/reroot a while back, and maybe you're running
> into some stuff I need to check out. Thanks a lot-
> Mark
> ----- Original Message -----
> From: "Tristan Lefebure" <tristan.lefebure at gmail.com>
> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Thursday, July 09, 2009 11:50 AM
> Subject: [Bioperl-l] Bootstrap, root, reroot...
>
> > Hello,
> >
> > I have been running into problems while rerooting trees
> > that contain bootstrap scores. Basically, after
> > rerooting the tree, some scores end up at the wrong
> > place (i.e. node) and some nodes lose their score. I
> > found this 2007 thread from Bank Beszteri that
> > describes exactly the same problem:
> >
> > http://lists.open-bio.org/pipermail/bioperl-l/2007-May/025599.html
> >
> > I attach a script that reproduces the bug and
> > implements the fix that Bank described (at least this
> > is my understanding, and it works on this example):
> >
> >
> > #! /usr/bin/perl
> >
> > use strict;
> > use warnings;
> > use Bio::TreeIO;
> >
> >
> > my $in = Bio::TreeIO->new(-format => 'newick',
> >    -fh => \*DATA,
> >    -internal_node_id => 'bootstrap');
> >
> > my $out = Bio::TreeIO->new(-format => 'newick',
> >    -file => ">out.tree");
> >
> > while( my $t = $in->next_tree ){
> >    my $old_root = $t->get_root_node();
> >    my ($b) = $t->find_node(-id =>"B");
> >    my $b_anc = $b->ancestor;
> >    $out->write_tree($t);
> >
> >    # reroot with B -> wrong, and the tree is kind of weird
> >    $t->reroot($b);
> >    $out->write_tree($t);
> >
> >    # reroot with B's ancestor -> wrong
> >    $t->reroot($b_anc);
> >    $out->write_tree($t);
> >
> >    # a fix, following Bank Beszteri's description: walk from
> >    # the old root up to the new root, handing each node the
> >    # bootstrap score currently stored on its new ancestor
> >    my $node = $old_root;
> >    while (my $anc_node = $node->ancestor) {
> >       $node->bootstrap($anc_node->bootstrap());
> >       $anc_node->bootstrap('');
> >       $node = $anc_node;
> >    }
> >    $out->write_tree($t); # -> good this time
> > }
> >
> >
> > __DATA__
> > (A:52,(B:46,C:50)68:11,D:70);
> >
> >
> > Here is the output:
> >
> > (A:52,(B:46,C:50)68:11,D:70);
> > ((C:50,(A:52,D:70):11)68:46)B;
> > (B:46,C:50,(A:52,D:70):11)68;
> > (B:46,C:50,(A:52,D:70)68:11);
> >
> >
> > Trees #2 and #3 have the score 68 moved to the wrong
> > node, while tree #4 is OK. (BTW, tree #2 is really
> > weird: unless B is the real ancestor (a fossil?), it
> > does not make much sense to me.)
> >
> > My understanding is that the problem is linked to the
> > well-known difficulty of distinguishing node labels
> > from branch labels in newick trees. Bootstrap scores
> > are branch attributes, not node attributes, but since
> > Bio::TreeI has no branch/edge/bipartition object, they
> > are attached to a node and in fact reflect the
> > bootstrap score of the ancestral branch leading to that
> > node. Trouble naturally comes when you are dealing with
> > an unrooted tree or rerooting a tree: a child can become
> > an ancestor, and if the bootstrap score is not moved
> > from the old child to the new child, it will end up
> > attached at the wrong place (i.e. the wrong node).
> >
> > I see two possible fixes:
> >
> > 1- Incorporate Bank's fix into the root() method, i.e.
> > if there are bootstrap scores, then after rerooting, the
> > ones on the path from the old root to the new root
> > should be moved to the right nodes.
> >
> > 2- Modify the way trees are stored in bioperl to
> > incorporate branch/edge/bipartition objects, and move
> > the bootstrap scores to them. That won't be easy and
> > will break many things...
> >
> >
> > What do you think?
> >
> > --Tristan
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
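
The relabelling described in the quoted message (fix 1) can be illustrated without BioPerl. What follows is only a minimal sketch under stated assumptions: nodes are plain hashes, and the helpers new_node, add_child, reroot, shift_bootstraps and to_newick are hypothetical names invented for this example, not Bio::Tree methods. Branch lengths are left out to keep it short. It builds the example topology (A,(B,C)68,D), reroots on the ancestor of B without touching the scores, and then moves the score along the path from the old root to the new one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical minimal node: a hash, not a Bio::Tree::Node object.
sub new_node {
    my (%args) = @_;
    return { id => $args{id}, bootstrap => $args{bootstrap},
             ancestor => undef, children => [] };
}

sub add_child {
    my ($parent, $child) = @_;
    $child->{ancestor} = $parent;
    push @{ $parent->{children} }, $child;
}

# Naive reroot: reverse the ancestor/child links on the path from the
# new root to the old root. Bootstrap values are deliberately left alone.
sub reroot {
    my ($new_root) = @_;
    my @path = ($new_root);
    while (my $anc = $path[-1]{ancestor}) { push @path, $anc }
    for my $i (0 .. $#path - 1) {
        my ($child, $parent) = @path[$i, $i + 1];
        $parent->{children} = [ grep { $_ != $child } @{ $parent->{children} } ];
        push @{ $child->{children} }, $parent;
        $parent->{ancestor} = $child;
    }
    $new_root->{ancestor} = undef;
    return $path[-1];    # the old root
}

# The fix from the thread: walk from the old root to the new root and
# hand each node the score currently stored on its new ancestor.
sub shift_bootstraps {
    my ($node) = @_;
    while (my $anc = $node->{ancestor}) {
        $node->{bootstrap} = $anc->{bootstrap};
        $anc->{bootstrap}  = undef;
        $node = $anc;
    }
}

# Serialize topology and labels only (no branch lengths).
sub to_newick {
    my ($node) = @_;
    my $s = '';
    if (@{ $node->{children} }) {
        $s = '(' . join(',', map { to_newick($_) } @{ $node->{children} }) . ')';
    }
    $s .= $node->{id}        if defined $node->{id};
    $s .= $node->{bootstrap} if defined $node->{bootstrap};
    return $s;
}

my $root = new_node();
my $x    = new_node(bootstrap => 68);    # ancestor of B and C
add_child($root, new_node(id => 'A'));
add_child($root, $x);
add_child($x, new_node(id => 'B'));
add_child($x, new_node(id => 'C'));
add_child($root, new_node(id => 'D'));

my $before   = to_newick($root) . ';';
my $old_root = reroot($x);
my $naive    = to_newick($x) . ';';      # score stuck on the new root
shift_bootstraps($old_root);
my $fixed    = to_newick($x) . ';';      # score back on the (A,D) branch

print "$_\n" for $before, $naive, $fixed;
```

In this sketch the score is stored on the node below the branch it describes, which is the convention the quoted message attributes to Bio::TreeI; that is why the values have to travel one node along the reversed path rather than stay put.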
