[Bioperl-guts-l] bioperl-live/doc/howto/sgml PAML.xml,1.2,1.3

Jason Stajich jason at pub.open-bio.org
Mon Mar 21 17:06:28 EST 2005


Update of /home/repository/bioperl/bioperl-live/doc/howto/sgml
In directory pub.open-bio.org:/tmp/cvs-serv9548

Modified Files:
	PAML.xml 
Log Message:
NSSites description


Index: PAML.xml
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/doc/howto/sgml/PAML.xml,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** PAML.xml	11 Mar 2005 03:21:45 -0000	1.2
--- PAML.xml	21 Mar 2005 22:06:26 -0000	1.3
***************
*** 30,46 ****
  	<orgname><ulink url="http://www.duke.edu"><citetitle>Duke
  	    University</citetitle></ulink></orgname>
- 	<orgdiv>
- 	  <ulink url="http://upg.duke.edu"><citetitle>University Program in
- 	  Genetics</citetitle></ulink>
- 	</orgdiv>
- 	<orgdiv><ulink url="http://cgt.genetics.duke.edu"><citetitle>Center for
- 	    Genome Technology</citetitle></ulink></orgdiv>
  	<address>
- 	  Duke University Medical Center
- 	  <pob>Box 3568</pob>
- 	  <city>Durham, </city>
- 	  <state>North Carolina</state> 
- 	  <postcode>27710-3568</postcode> 
- 	  <country>USA</country>
  	  <email>jason-at-bioperl.org</email>
          </address>
--- 30,34 ----
***************
*** 63,73 ****
  	<revremark>Added pairwise Ka,Ks example code and running code</revremark>
        </revision>
      </revhistory>
  
      <legalnotice>
        <para>
! 	This document is copyright Aaron Mackey, 2002.  For
! 	reproduction other than personal use please contact me at
! 	amackey at virginia.edu
        </para>
      </legalnotice>
--- 51,69 ----
  	<revremark>Added pairwise Ka,Ks example code and running code</revremark>
        </revision>
+       <revision>
+ 	<revnumber>0.3</revnumber>
+ 	<date>2005-03-15</date>
+ 	<authorinitials>jes</authorinitials>
+ 	<revremark>Added branch-specific paramater parsing (NSsites
+ 	and per-branch rates)</revremark>
+       </revision>
+ 
      </revhistory>
  
      <legalnotice>
        <para>
! 	This document is copyright Aaron Mackey, Jason Stajich, 2002-2005.  For
! 	reproduction other than personal use please contact us at
! 	amackey at virginia.edu or jason at bioperl.org.
        </para>
      </legalnotice>
***************
*** 343,346 ****
--- 339,463 ----
      <para>
      </para>
+   </section>
+   <section id="nssites">
+     <title>Parsing branch-specific rates and NSSites results</title>
+     <para>
+ <emphasis>PAML</emphasis> allows for several models of molecular
+ evolution.  Given a tree topology one can test whether or not
+ contraints of evolutionary rates on different parts of the topology
+ better explain the observe data than a null model where an overall
+ rate is assumed.  To do this a tree topology must be provided and
+ typically marked to specify which branches to test the alternative
+ hypotheses of differing rates.
+     </para>
+     <para>
+ To get access to this branch specific data we store it in the <classname>Bio::Tree::Tree</classname> object which is
+       parsed for each result.  The nodes in the tree will contain
+       additional tagged values for capturing the branch specific
+       evolutionary rates.  In cases where there are several
+       different models of evolution tested (i.e. M0, M1, etc) we
+       create
+       a <classname>Bio::Tools::Phylo::PAML::ModelResult</classname>
+       to store each of the model results separately.  A separate
+       Tree object will be associated with each one of these models.
+     </para>
+     <para>
+       <emphasis>Please note that the models underlting PAML 3.14 have
+ 	changed some from PAML 3.13 and earlier.  NSsites 1 and 2 mean
+ 	slightly different things than they did in previous versions.
+ 	</emphasis>
+     </para>
+     <para>Accessing Tree data</para>
+     <para> First we'll just descibe how to access data for a
+ 	topology for a single model or where NSsites=0. In this case
+ 	we'll just want to get the tree(s) associated with a give
+ 	result.  In this code we loop through all
+ 	the <classname>Bio::Tree::Tree</classname> associated with the <classname>Bio::Tools::Phylo::PAML::Result</classname>.
+       </para>
+       <programlisting>
+ use Bio::Tools::Phylo::PAML;
+ 
+ my $outcodeml = shift(@ARGV);
+ my $paml_parser = new Bio::Tools::Phylo::PAML(-file => $outcodeml, 
+                                               -dir => "./");
+ if( my $result = $paml_parser->next_result() ) {
+  while ( my $tree = $result->next_tree ) {
+   for my $node ( $tree->get_nodes ) {
+    my $id;
+    # first we do some work to figure out what the ID should be.
+    # for a leaf or tip node this is just the taxon label
+    if( $node->is_Leaf() ) {
+     $id = $node->id;
+    } else {
+    # for the internal nodes it is just the name of all the sub-nodes
+    # put together, much like how Sanderson represents internal nodes
+    # in r8s 
+     $id = "(".join(",", map { $_->id } grep { $_->is_Leaf } 
+                               $node->get_all_Descendents) .")";
+    }
+    if( ! $node->ancestor || ! $node->has_tag('t') ) {
+      # skip when no values have been associated with this node
+      # (like the root node)
+      next; 
+    }
+    printf "%s\tt=%.3f\tS=%.1f\tN=%.1f\tdN/dS=%.4f\tdN=%.4f\t".
+            "dS=%.4f\tS*dS=%.1f\tN*dN=%.1f\n",
+     $id,map { ($node->get_tag_values($_))[0] } 
+     qw(t S N dN/dS dN dS), 'S*dS', 'N*dN';
+   }
+  }
+ }
+       </programlisting>
+       <para>Accessing NSsites data</para>
+       <para>
+ 	In cases where nssites=1 or nssites=2 is provided the data for
+ 	the results is accessible through
+ 	the <classname>Bio::Tools::Phylo::PAML::ModelResult</classname>.
+ 	The function <function>get_NSSite_results</function> on
+ 	the <classname>Bio::Tool::Phylo::PAML::Result</classname>
+ 	object.  In this way multiple model results can be folded into
+ 	a single <classname>PAML::Result</classname> object.  The
+ 	code shown below is nearly identical to that in the previous
+ 	example, there is just an additional loop to process the
+ 	NSsite Result objects.
+       </para>
+       <programlisting>
+ use Bio::Tools::Phylo::PAML;
+ 
+ my $outcodeml = shift(@ARGV);
+ my $paml_parser = new Bio::Tools::Phylo::PAML(-file => $outcodeml, 
+                                               -dir => "./");
+ if( my $result = $paml_parser->next_result() ) {
+   for my $ns_result ( $result->get_NSSite_results ) {
+     print "model ", $ns_result->model_num, " ", 
+           $ns_result->model_description, "\n";
+   while ( my $tree = $ns_result->next_tree ) {
+    for my $node ( $tree->get_nodes ) {
+     my $id;
+     # first we do some work to figure out what the ID should be.
+     # for a leaf or tip node this is just the taxon label
+     if( $node->is_Leaf() ) {
+      $id = $node->id;
+     } else {
+     # for the internal nodes it is just the name of all the sub-nodes
+     # put together, much like how Sanderson represents internal nodes
+     # in r8s 
+      $id = "(".join(",", map { $_->id } grep { $_->is_Leaf } 
+                               $node->get_all_Descendents) .")";
+     }
+     if( ! $node->ancestor || ! $node->has_tag('t') ) {
+      # skip when no values have been associated with this node
+      # (like the root node)
+      next; 
+     }
+     printf "%s\tt=%.3f\tS=%.1f\tN=%.1f\tdN/dS=%.4f\tdN=%.4f\t".
+            "dS=%.4f\tS*dS=%.1f\tN*dN=%.1f\n",
+     $id,map { ($node->get_tag_values($_))[0] } 
+     qw(t S N dN/dS dN dS), 'S*dS', 'N*dN';
+    }
+   }
+  }
+ }
+       </programlisting>
    </section>
  </article>



More information about the Bioperl-guts-l mailing list