[Bioperl-l] Taxonomy hierarchy extraction

George Heller george.heller at yahoo.com
Mon Jun 18 20:04:00 EDT 2007

Ok, I installed the latest Scalar::Util and the script seems to be working. But I am confused about where exactly I need to look for the descendent taxon ids once the script has run. I did look in the /tmp/ directory, but I couldn't understand much there.
  Sorry to bother you, and I really appreciate your patience.

Jason Stajich <jason at bioperl.org> wrote:
  Try installing the latest Scalar::Util  
    On Jun 18, 2007, at 4:05 PM, George Heller wrote:

    This is the output of /usr/bin/perl -V

  Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
      osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
      uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
      config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
      hint=recommended, useposix=true, d_sigaction=define
      usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
      useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
      use64bitint=undef use64bitall=undef uselongdouble=undef
      usemymalloc=n, bincompat5005=undef
      cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
      optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
      cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
      ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
      intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
      d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
      ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
      alignbytes=4, prototype=define
    Linker and Libraries:
      ld='gcc', ldflags =' -L/usr/local/lib'
      libpth=/usr/local/lib /lib /usr/lib
      libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
      perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
      libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
    Dynamic Linking:
      dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
      cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

  Characteristics of this binary (from libperl):
    Built under linux
    Compiled at Jul 24 2006 18:28:10


  Hilmar Lapp <hlapp at gmx.net> wrote:
    The perl version appears to be 5.8.5 though, so something strange 
  appears to be going on too.

  George, can you please post the output of

  $ /usr/bin/perl -V


  On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:

    As the error implies, your local version of perl doesn't seem to support
  weak references, which means it doesn't have Scalar::Util (which was
  added to core after perl 5.6.1, I think). Try installing
  Scalar::Util to see what happens.
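
A minimal sketch of how one might check for weak-reference support before reinstalling anything. It uses only core Perl; the assumption (not confirmed in this thread) is that Bio/Tree/Node.pm line 76 imports weaken from Scalar::Util, which is where old pure-Perl builds of that module die with the message quoted below.

```perl
use strict;
use warnings;
use Scalar::Util ();

# Asking Scalar::Util to export 'weaken' is the step that fails on builds
# without weak-reference support, so wrap the import in an eval.
my $has_weaken = eval { Scalar::Util->import('weaken'); 1 } ? 1 : 0;

printf "perl %vd, Scalar::Util %s, weaken %s\n",
    $^V, $Scalar::Util::VERSION,
    $has_weaken ? 'available' : 'NOT available';
```

If this reports that weaken is not available, installing a current Scalar::Util (part of the Scalar-List-Utils distribution) with its compiled component is the likely fix.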


  On Jun 18, 2007, at 5:18 PM, George Heller wrote:

    I tried running the below mentioned script and I seem to be getting
  the following error:

  Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
  BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
  Compilation failed in require at my.pl line 7.
  BEGIN failed--compilation aborted at my.pl line 7.

  My script looks something like,

  use strict;
  #use warnings;
  use DBI;
  use Bio::Tree::Node;
  use Bio::DB::Taxonomy;
  use Bio::DB::Taxonomy::flatfile;
  my $idx_dir = '/tmp';

  my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
  my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
  -nodesfile => $nodesfile,
  -namesfile => $namesfile,
  -directory => $idx_dir);
  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node-

  foreach my $field (@extant_children) {
  print "$field";
  print "|";
  print "\n";
  }

  And I am running the script using the command,

  perl myscript.pl -v --names names.dmp --nodes nodes.dmp

  and I have the nodes.dmp and names.dmp files in the current



  Jason Stajich wrote:
  It is implemented in the implementing class - DB::Taxonomy is
  just the base class. For example see the flatfile implementation

  See scripts/taxa/local_taxonomydb_query.PLS for an example of using
  it; the nodes and names files are from the NCBI taxonomy database.


  Here is an un-debugged copy+paste for your question that *should*


  use Bio::DB::Taxonomy;
  my $idx_dir = '/tmp';


  my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
  my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
  -nodesfile => $nodesfile,
  -namesfile => $namesfile,
  -directory => $idx_dir);
  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node-





  On Jun 18, 2007, at 10:07 AM, George Heller wrote:

  What exactly is the "node n" in the query below? When I issue
  this query, it says,


  relation "node" does not exist.


  I tried to use the get_all_Descendents method but it looks like
  in order to do a recursive call it calls the method
  each_Descendent. This method is not implemented in
  Bio::DB::Taxonomy. It just has a single line,






  Hilmar Lapp wrote:
  I'm a bit confused - it sounds like you have set up a local 
  database and loaded the NCBI taxonomy into the database. You can 
  use simple SQL to retrieve all descendants of a node in the tree
  given its NCBI taxonID such as


  SELECT tn.*, tnm.name
  FROM taxon tn, taxon_name tnm, node n
  WHERE n.ncbi_taxon_id = :taxonID
  AND tn.left_value > n.left_value
  AND tn.right_value < n.right_value
  AND tn.taxon_id = tnm.taxon_id
  AND tn.name_class = 'scientific_name'


  BioPerl doesn't have a Taxonomy::biosql module yet (though this
  seems like a worthwhile thing to add), so you can't use the
  Bio::DB::Taxonomy interface to do this against a BioSQL instance.
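
The comparisons on left_value and right_value in the query above rely on nested-set numbering: every descendant's interval lies strictly inside its ancestor's interval. A small Perl sketch of that idea follows, assuming a toy tree with made-up ids (not real NCBI taxon ids) and no database at all; the sub and variable names are illustrative only.

```perl
use strict;
use warnings;

# Toy tree (hypothetical ids): parent => [children]
my %children = (
    1 => [ 2, 3 ],
    2 => [ 4, 5 ],
    3 => [],
    4 => [],
    5 => [ 6 ],
    6 => [],
);

# Assign nested-set numbers with a depth-first walk: a node gets its left
# value on the way down and its right value on the way back up.
my ( %left, %right );
my $counter = 0;

sub number_subtree {
    my ($id) = @_;
    $left{$id} = ++$counter;
    number_subtree($_) for @{ $children{$id} };
    $right{$id} = ++$counter;
    return;
}
number_subtree(1);

# Same test as the SQL: keep nodes whose interval lies strictly inside
# the interval of the query node.
sub descendants_of {
    my ($id) = @_;
    my @inside = grep { $left{$_} > $left{$id} && $right{$_} < $right{$id} }
        keys %left;
    return sort { $a <=> $b } @inside;
}

print join( ',', descendants_of(2) ), "\n";    # prints 4,5,6
```

Because the numbering is precomputed, finding all descendants is a single range comparison rather than a recursive walk, which is why the SQL needs no recursion.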


  However, BioPerl does support the flat-file download of the
  NCBI taxonomy database and indexes it, so you can simply use
  Taxonomy::{get_taxon,get_all_Descendents} using the flatfile
  to achieve what you wanted to do in less than 5 lines of perl.


  Although the recursive implementation of
  () won't be lightning fast, it may still be perfectly fine for your
  application - are you sure it is not?




  On Jun 18, 2007, at 12:21 AM, George Heller wrote:


  Thanks. And how can I assign the $node in the code below so
  that it refers to a particular taxon id record? I want to
  retrieve all the descendents from the taxonomy hierarchy, given a
  particular taxon id.


  I have a local db setup, in which I have uploaded data using the
  load_ncbi_taxonomy.pl script.




  Jason Stajich wrote:
  I assume you already figured out how to set up a local taxonomydb?




  You just want the extant species/leaves of the tree




  my @extant_children = grep { $_->is_Leaf } $node-






  On Jun 17, 2007, at 11:41 AM, George Heller wrote:


  Hi all,




  Can anyone point me to some example that uses the
  get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
  this, and I am not quite sure how to implement it.








  Sendu Bala wrote:
  George Heller wrote:
  Hi all,




  I am looking at extracting the taxonomy hierarchy for some taxon
  What I plan to do is, for a given taxon id, say 33090, I want to
  extract all taxon ids that are children of this species. I do not
  just want the immediate children, but the children's children 
  and so




  Any ideas on the way I can go about doing this?




  Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
  some kind of looping structure. Most easily a recursing sub.
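
A rough sketch of what such a recursing sub could look like. The only thing it assumes about the database handle is an each_Descendent($taxon) method that returns the direct children; the ToyTaxonomy package is a hypothetical stand-in so the example runs without BioPerl or the NCBI dump files, and the ids are made up.

```perl
use strict;
use warnings;

# Recursively collect every descendant of $taxon (children, their
# children, and so on), in depth-first order.
sub all_descendants {
    my ( $db, $taxon ) = @_;
    my @found;
    for my $child ( $db->each_Descendent($taxon) ) {
        push @found, $child, all_descendants( $db, $child );
    }
    return @found;
}

# Stand-in database (hypothetical, for illustration only): "taxa" are
# plain ids and the tree is a hash of parent => [children].
package ToyTaxonomy;

sub new {
    my ( $class, %kids ) = @_;
    my $self = {%kids};
    return bless $self, $class;
}

sub each_Descendent {
    my ( $self, $id ) = @_;
    return @{ $self->{$id} || [] };
}

package main;

my $db = ToyTaxonomy->new( 10 => [ 11, 12 ], 11 => [13], 13 => [14] );
print join( ',', all_descendants( $db, 10 ) ), "\n";    # prints 11,13,14,12
```

With a real taxonomy handle, $taxon would be whatever node object the database returns for the starting taxon id, and the leaves could then be picked out of the returned list.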




  If you happen to code up something neat and efficient, why not share it
  with us and we could add it to the Taxonomy module(s).












  Bioperl-l mailing list
  Bioperl-l at lists.open-bio.org




  Jason Stajich
  jason at bioperl.org














  : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
















  Christopher Fields
  Postdoctoral Researcher
  Lab of Dr. Robert Switzer
  Dept of Biochemistry
  University of Illinois Urbana-Champaign



