[Bioperl-l] Taxonomy hierarchy extraction

George Heller george.heller at yahoo.com
Mon Jun 18 19:05:42 EDT 2007


This is the output of /usr/bin/perl -V

Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
  Platform:
    osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
    uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.3.4'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
  
Characteristics of this binary (from libperl):
  Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
  Built under linux
  Compiled at Jul 24 2006 18:28:10
  @INC:
    /usr/lib/perl5/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/5.8.5
    /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.5
    /usr/lib/perl5/site_perl/5.8.4
    /usr/lib/perl5/site_perl/5.8.3
    /usr/lib/perl5/site_perl/5.8.2
    /usr/lib/perl5/site_perl/5.8.1
    /usr/lib/perl5/site_perl/5.8.0
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.5
    /usr/lib/perl5/vendor_perl/5.8.4
    /usr/lib/perl5/vendor_perl/5.8.3
    /usr/lib/perl5/vendor_perl/5.8.2
    /usr/lib/perl5/vendor_perl/5.8.1
    /usr/lib/perl5/vendor_perl/5.8.0
    /usr/lib/perl5/vendor_perl
   
  Thanks.
  George
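
For anyone hitting the same error: weak references have been part of perl since well before 5.8.5, so it may be worth checking whether the Scalar::Util that this perl actually loads can call weaken(). The following is only a sketch and was not posted in this thread; the sub name weak_refs_supported is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util ();   # empty import list: do not import weaken at compile time

# Hypothetical helper: returns 1 if weaken() can be called, 0 otherwise.
sub weak_refs_supported {
    my $ok = eval {
        my $data = { name => 'test' };
        my $copy = $data;
        Scalar::Util::weaken($copy);   # dies if weaken is unavailable
        1;
    };
    return $ok ? 1 : 0;
}

# Report which Scalar::Util was found and whether weaken() works.
print "Scalar::Util version: ", (Scalar::Util->VERSION || 'unknown'), "\n";
print "loaded from:          $INC{'Scalar/Util.pm'}\n";
print "weak references:      ",
    (weak_refs_supported() ? "available" : "NOT available"), "\n";
```

If this reports "NOT available", the "loaded from" path shows which copy of Scalar::Util is being picked up from @INC.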

Hilmar Lapp <hlapp at gmx.net> wrote:
The perl version appears to be 5.8.5 though, so something strange
appears to be going on, too.

George, can you please post the output of

$ /usr/bin/perl -V

-hilmar

On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:

> As the error implies, your local version of perl doesn't seem to support
> weak references, which means it doesn't have Scalar::Util (which was
> added to core after perl 5.6.1, I think). Try installing
> Scalar::Util to see what happens.
>
> chris
>
> On Jun 18, 2007, at 5:18 PM, George Heller wrote:
>
>> I tried running the below mentioned script and I seem to be getting
>> the following error:
>>
>> Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
>> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
>> Compilation failed in require at my.pl line 7.
>> BEGIN failed--compilation aborted at my.pl line 7.
>>
>> My script looks something like,
>>
>> #!/usr/bin/perl
>> use strict;
>> #use warnings;
>> use DBI;
>> use Bio::Tree::Node;
>> use Bio::DB::Taxonomy;
>> use Bio::DB::Taxonomy::flatfile;
>> my $idx_dir = '/tmp';
>>
>> # $nodesfile must match the name passed to -nodesfile below
>> my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
>> my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>> -nodesfile => $nodesfile,
>> -namesfile => $namesfile,
>> -directory => $idx_dir);
>> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>>
>> # "my" is required for the loop variable under "use strict"
>> foreach my $field (@extant_children) {
>> print "$field";
>> print "|";
>> print "\n";
>> }
>>
>> And I am running the script using the command,
>>
>> perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>>
>> and I have the nodes.dmp and names.dmp files in the current
>> directory.
>>
>> Thanks,
>> George
>>
>>
>> Jason Stajich wrote:
>> It is implemented in the implementing class - Bio::DB::Taxonomy is
>> just the base class. For example, see the flatfile implementation,
>> Bio::DB::Taxonomy::flatfile.
>>
>> See scripts/taxa/local_taxonomydb_query.PLS for an example that uses
>> it; the nodes and names files are from the NCBI taxonomy database.
>>
>>
>> Here is an un-debugged copy+paste for your question that *should*
>> work.
>>
>>
>> use Bio::DB::Taxonomy;
>> my $idx_dir = '/tmp';
>>
>>
>> my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
>> my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>> -nodesfile => $nodesfile,
>> -namesfile => $namesfile,
>> -directory => $idx_dir);
>> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>>
>>
>>
>>
>> -jason
>>
>> On Jun 18, 2007, at 10:07 AM, George Heller wrote:
>>
>> What exactly is the "node n" in the query below. When I issue
>> this query, it says,
>>
>>
>> relation "node" does not exist.
>>
>>
>> I tried to use the get_all_Descendents method but it looks like
>> in order to do a recursive call it calls the method
>> each_Descendent. This method is not implemented in
>> Bio::DB::Taxonomy. It just has a single line,
>>
>>
>> shift->throw_not_implemented();
>>
>>
>> Thanks.
>> George.
>>
>>
>> Hilmar Lapp wrote:
>> I'm a bit confused - it sounds like you have set up a local BioSQL
>> database and loaded the NCBI taxonomy into the database. You can now
>> use simple SQL to retrieve all descendants of a node in the tree
>> given its NCBI taxonID, such as
>>
>>
>> SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
>> WHERE
>> n.ncbi_taxon_id = :taxonID
>> AND tn.left_value > n.left_value
>> AND tn.right_value < n.right_value
>> AND tn.taxon_id = tnm.taxon_id
>> AND tn.name_class = 'scientific_name'
>>
>>
>> BioPerl doesn't have a Taxonomy::biosql module yet (though this would
>> seem like a worthwhile thing to add), so you can't use the
>> Bio::DB::Taxonomy interface to do this against a BioSQL instance.
>>
>>
>> However, BioPerl does support the flat-file download of the NCBI
>> taxonomy database and indexes it, so you can simply use
>> Taxonomy::{get_taxon,get_all_Descendents} with the flat-file download
>> to achieve what you wanted in fewer than 5 lines of perl.
>>
>>
>> Although the recursive implementation of Taxonomy::get_all_Descendents()
>> won't be lightning fast, it may still be perfectly fine for your
>> application - are you sure it is not?
>>
>>
>> -hilmar
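
The left_value/right_value comparison in the query above relies on nested-set numbering: every descendant's interval lies strictly inside its ancestor's interval. A minimal plain-Perl sketch of that test follows. The rows and the sub name descendant_ids are made up for illustration; only the column names come from the query.

```perl
use strict;
use warnings;

# Made-up rows standing in for the taxon table: each node stores the
# left/right numbers assigned by a depth-first walk of the tree.
#
#          1 (1,10)
#         /        \
#     2 (2,7)     5 (8,9)
#      /    \
#  3 (3,4)  4 (5,6)
my @taxon = (
    { taxon_id => 1, left_value => 1, right_value => 10 },
    { taxon_id => 2, left_value => 2, right_value => 7  },
    { taxon_id => 3, left_value => 3, right_value => 4  },
    { taxon_id => 4, left_value => 5, right_value => 6  },
    { taxon_id => 5, left_value => 8, right_value => 9  },
);

# Return the ids of all rows nested strictly inside the row with the given id.
sub descendant_ids {
    my ($id, @rows) = @_;
    my ($n) = grep { $_->{taxon_id} == $id } @rows;
    return () unless $n;
    return map  { $_->{taxon_id} }
           grep { $_->{left_value}  > $n->{left_value}
               && $_->{right_value} < $n->{right_value} } @rows;
}

print join(',', descendant_ids(2, @taxon)), "\n";   # prints 3,4
```

No recursion is needed here: a single pass over the rows finds descendants at every depth, which is the appeal of doing this in SQL.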
>>
>>
>> On Jun 18, 2007, at 12:21 AM, George Heller wrote:
>>
>>
>> Thanks. And how can I assign the $node in the code below, such
>> that I can reference it to a particular taxon id record? I want to
>> retrieve all the descendents from the taxonomy hierarchy, given a
>> particular taxon id.
>>
>>
>> I have a local db setup, in which I have uploaded data using the
>> load_ncbi_taxonomy.pl script.
>>
>>
>> Thanks.
>> George
>>
>>
>> Jason Stajich wrote:
>> I assume you already figured out how to setup a local taxonomydb?
>>
>>
>>
>>
>> You just want the extant species/leaves of the tree
>>
>>
>>
>>
>> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>>
>>
>>
>>
>>
>>
>> -jason
>> On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>>
>>
>> Hi all,
>>
>>
>>
>>
>> Can anyone point me to some example that uses the
>> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
>> this, and I am not quite sure how to implement it.
>>
>>
>>
>>
>> Thanks.
>> George
>>
>>
>>
>>
>> Sendu Bala wrote:
>> George Heller wrote:
>> Hi all,
>>
>>
>>
>>
>> I am looking at extracting the taxonomy hierarchy for some taxon ids.
>> What I plan to do is, for a given taxon id, say 33090, I want to
>> extract all taxon ids that are children of this species. I do not
>> just want the immediate children, but the children's children and
>> so on.
>>
>>
>>
>>
>> Any ideas on the way I can go about doing this?
>>
>>
>>
>>
>> Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent
>> in some kind of looping structure. Most easily a recursing sub.
>>
>>
>>
>>
>> If you happen to code up something neat and efficient, why not
>> share it with us and we could add it to the Taxonomy module(s).
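
The "recursing sub" idea can be sketched without BioPerl at all, working directly on nodes.dmp-style records (fields separated by tab, pipe, tab; tax_id first, parent tax_id second). This is only an illustration, not the BioPerl implementation; the sub names and the sample records are made up.

```perl
use strict;
use warnings;

# Build a parent => [children] map from nodes.dmp-style lines.
# Fields are separated by "\t|\t"; the first two are tax_id and parent tax_id.
sub read_children {
    my (@lines) = @_;
    my %children;
    for my $line (@lines) {
        my ($id, $parent) = split /\t\|\t/, $line;
        next if $id == $parent;              # the root lists itself as parent
        push @{ $children{$parent} }, $id;
    }
    return \%children;
}

# Recursing sub: each child, followed by that child's own descendants.
sub all_descendants {
    my ($children, $id) = @_;
    my @found;
    for my $child (@{ $children->{$id} || [] }) {
        push @found, $child, all_descendants($children, $child);
    }
    return @found;
}

# Made-up sample records, not real NCBI data.
my @sample = (
    "1\t|\t1\t|\tno rank\t|",
    "10\t|\t1\t|\tkingdom\t|",
    "20\t|\t10\t|\tphylum\t|",
    "30\t|\t10\t|\tphylum\t|",
    "40\t|\t20\t|\tclass\t|",
);
my $children = read_children(@sample);
print join(',', all_descendants($children, 10)), "\n";   # prints 20,40,30
```

With a real nodes.dmp the lines would be read from the file instead of @sample; leaves are simply the ids that never appear as a key in the children map.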
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>
>>
>> --
>> Jason Stajich
>> jason at bioperl.org
>> http://jason.open-bio.org/
>>
>>
>> --
>> ===========================================================
>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
>> ===========================================================
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================







 