[Bioperl-l] Tree refactor? was Re: Bootstrap, root, reroot...
cjfields at illinois.edu
Wed Jul 15 18:11:44 EDT 2009
On Jul 11, 2009, at 2:52 AM, Aidan Budd wrote:
> On Thu, 9 Jul 2009, Tristan Lefebure wrote:
>> My understanding here is that the problem is linked to the
>> well-known difficulty of differentiating node labels from
>> branch labels in Newick trees. Bootstrap scores are branch
>> attributes, not node attributes, but since Bio::TreeI has no
>> branch/edge/bipartition object they are attached to a node,
>> and in fact reflect the bootstrap score of the ancestral
>> branch leading to that node. Trouble naturally comes when
>> you are dealing with an unrooted tree or reroot a tree: a
>> child can become an ancestor, and, if the bootstrap score
>> is not moved from the old child to the new child, it will
>> end up attached at the wrong place (i.e. the wrong node).
>> I see several fixes for that:
>> 1- incorporate Bank's fix into the root() method, i.e. if
>> there are bootstrap scores, after re-rooting, the ones on the
>> old-to-new ancestor path should be moved to the right node.
>> 2- Modify the way trees are stored in bioperl to incorporate
>> branch/edge/bipartition objects, and move the bootstrap
>> scores to them. That won't be easy and will break many
> Just wanted to add that, from my point of view, it would be great if
> it were possible to add edge/branch objects as part of the bioperl trees.
> Perhaps so that the previous set of methods still behaved as before,
> with some new methods on the trees such as get_splits() or
> get_branches() along with associated split/branch/etc. objects...?
> Being a bioperl user but keeping well away from coding objects in
> Perl, the lack of such methods/objects meant I chose, in the end, not to
> use a bioperl solution to work with my trees (going instead for a homemade
> clunky python solution, where I'm happier with the OO stuff)
> No idea how difficult/problematic this would be to implement, though -
> just my 2 cents worth...
Mark and Tristan have both indicated some of the problems that lie
here, so it's worth discussing this on the list. I think the best way
to approach this is to suggest what a proposed refactoring of
Bio::Tree-related classes would look like (i.e. how it would be done,
what is expected of said classes interface-wise, etc), and then come
up with data and cases where the current classes don't DTRT,
preferably as tests we can incorporate into the test suite.
Note this will affect some of the key core classes we now have (seq
classes specifically, so memory management will be important). I'll
have my hands full with a few other refactors, so anyone out there
willing to take the reins on this one?
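
As a possible starting point for the kind of test case asked for above, here is a minimal toy sketch in Python (not BioPerl code) of the problem Tristan describes. Every name in it (Node, splits, reroot, move_support) is hypothetical and does not correspond to any Bio::Tree method. It assumes support values are stored on the child node of each branch, and compares the supported bipartitions before and after rerooting, with and without moving the scores along the old-root-to-new-root path.

```python
# Toy model only: Node, splits() and reroot() are hypothetical names,
# not part of Bio::Tree or any other library.

class Node:
    def __init__(self, name=None, support=None):
        self.name = name
        self.support = support  # bootstrap of the branch leading to the parent
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def leaves(node):
    """Return the set of leaf names at or below this node."""
    if not node.children:
        return {node.name}
    found = set()
    for child in node.children:
        found |= leaves(child)
    return found


def splits(root):
    """Map each supported branch, keyed by its bipartition, to its score."""
    all_leaves = frozenset(leaves(root))
    result = {}
    stack = list(root.children)
    while stack:
        node = stack.pop()
        stack.extend(node.children)
        if node.support is not None:
            side = frozenset(leaves(node))
            result[frozenset([side, all_leaves - side])] = node.support
    return result


def reroot(new_root, move_support):
    """Reroot at new_root; optionally move scores along the reversed path."""
    path = []  # new_root, ..., old root
    node = new_root
    while node is not None:
        path.append(node)
        node = node.parent
    old_supports = [n.support for n in path]
    for child, parent in zip(path, path[1:]):
        parent.children.remove(child)
    for i in range(len(path) - 1):
        child, parent = path[i], path[i + 1]
        child.children.append(parent)  # reverse the parent/child relation
        parent.parent = child
        if move_support:
            # The score belongs to the branch, so it follows the new child.
            parent.support = old_supports[i]
    new_root.parent = None
    if move_support:
        new_root.support = None
    return new_root


def build():
    """Build (((A,B)90,C)70,D,E) and return (root, the (A,B) node)."""
    root = Node()
    p = root.add(Node(support=70))
    q = p.add(Node(support=90))
    q.add(Node("A"))
    q.add(Node("B"))
    p.add(Node("C"))
    root.add(Node("D"))
    root.add(Node("E"))
    return root, q


root, q = build()
before = splits(root)
naive = splits(reroot(q, move_support=False))
_, q2 = build()
fixed = splits(reroot(q2, move_support=True))
```

In this toy tree the naive reroot leaves the score 70 sitting on the node that now represents the A,B | C,D,E split (originally 90) and loses the A,B,C | D,E split entirely, whereas the version that moves the scores reproduces the original bipartition-to-score mapping. Keying the comparison on bipartitions rather than nodes is the same idea as the get_splits() suggestion above, and should make such a test independent of where the tree happens to be rooted.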