[Bioperl-l] Bio::Tree:Statistics parsimony question

Mark A. Jensen maj at fortinbras.us
Fri Oct 23 19:11:30 EDT 2009

I see. 
You can use the add_trait method on the tree object; to this, 
you provide a file containing a table of nodes X traits. 
The following code works; it creates a trait table on the fly, 
according to seq data given in the DATA section (after __END__).
Then in a loop, add_traits is used to add each position and calculate
the ps. There's probably a better way, but this could get you

use Bio::Tree::Statistics;
use Bio::TreeIO;
use strict;
use warnings;

# tree file can be found in t/data...
my $in = Bio::TreeIO->new(-format => 'nexus',
     -file   => 'traittree.nexus');
my $obj = Bio::Tree::Statistics->new();

my $tree = $in->next_tree;

# make a trait table from the sequences
open(my $tmpf, ">seq.dat");
my $len = <DATA>;
chomp $len;
print $tmpf join("\t", "id", map { "pos".$_ } (1..$len)), "\n";
while (<DATA>) {
    my ($id, $seq) = split /\s+/;
    print $tmpf join( "\t", $id, split('', $seq) ), "\n";

# cycle through
my @pars;
for (1..$len) {
    my $key = $tree->add_trait('seq.dat', $_);
    push @pars, $obj->ps($tree, $key);

print join("\n", @pars);

1  ACG
2  GCT
3  GGG
4  GGT
5  ACG
6  GCT
7  GGG
8  GGT
9  ACG
10 GCT
11 GGG
12 GGT
13 ACG
14 GCT
15 GGG
16 GGT

  ----- Original Message ----- 
  From: Chris 
  To: Mark A. Jensen 
  Cc: bioperl List 
  Sent: Friday, October 23, 2009 5:00 PM
  Subject: Re: [Bioperl-l] Bio::Tree:Statistics parsimony question

  Hi Mark,

  I might be missing something, but I cant see how that will work. ps()  takes as input a tree with the tags already in the leaf nodes and the key for those tags.  I think your suggested solution places the tags from the @seq array into ps() which throws an exception error.

  Maybe another explanaition;

  I would like to calculate the parsimony score at each position of an alignment. Consider 1 block with only 1 character in for each species;

  species1    A
  species 2   G
  species 3   G
  species 4   G

  I can do this for 1 position, by adding the tag (which Ive named 'seq') value pair  to each leaf node of the phylogeny tree for the 4 species, where the tag is a single character.  Then calculating the parsimony for that column is simply

  my $obj = Bio::Tree::Statistics->new();
  my $pars = $obj->ps($tree,'seq');
  print "Parsimony = $pars \n";

  Now consider an alignment block with more than 1 character
  species1    ACG
  species 2   GCT
  species 3   GGG
  species 4   GGT

   I see it is possible to add an array of values as a tag, now, instead of a single character as a tag on each leaf node, I have an array of nucleotides as tags at each leaf node. I was hoping for a way to calculate the parsimony given a tree with such tag value pairs.

  Regards, Chris.

  On Fri, Oct 23, 2009 at 11:12 PM, Mark A. Jensen <maj at fortinbras.us> wrote:

    up to list...
    ----- Original Message ----- From: "Mark A. Jensen" <maj at fortinbras.us>
    To: "Chris" <coldmeadow at gmail.com>
    Sent: Friday, October 23, 2009 8:14 AM
    Subject: Re: [Bioperl-l] Bio::Tree:Statistics parsimony question

      Hi Chris, 
      looks like you could simply do

      @ps = map { $obj->ps($tree, $_) } @seq;

      or into a hash %ps with tags as keys:

      @ps{@seq} = map { $obj->ps($tree,$_) } @seq;

      cheers MAJ
      ----- Original Message ----- From: "Chris" <coldmeadow at gmail.com>
      To: <bioperl-l at bioperl.org>
      Sent: Friday, October 23, 2009 1:10 AM
      Subject: [Bioperl-l] Bio::Tree:Statistics parsimony question

        I have been able to use the ps() method in Bio::Tree::Statistics to calulate
        the parsimony for tag values from a tree if they are strings.

        This is the way I have been implementing the method:

        my $obj = Bio::Tree::Statistics->new();
        my $pars = $obj->ps($tree,'seq');
        print "Parsimony = $pars \n";

        Is it possible to do so if they are an array?

        say I have an array @seq = qw(A,A,C,G), I can add these as values to a leaf
        node of a tree using;

        foreach my $char (@seq){

        Does anybody have a suggestion to get the parsimony for each label of the
        array of labels for such a tree ?

        Thanks in advance,
        Bioperl-l mailing list
        Bioperl-l at lists.open-bio.org

    Bioperl-l mailing list
    Bioperl-l at lists.open-bio.org

More information about the Bioperl-l mailing list