[Bioperl-l] taxonomy ID

Smithies, Russell Russell.Smithies at agresearch.co.nz
Thu Apr 2 15:55:06 EDT 2009

We're here to help  - unless it's to do your homework  ;-)


From: shalabh sharma [mailto:shalabh.sharma7 at gmail.com]
Sent: Friday, 3 April 2009 8:51 a.m.
To: Sendu Bala
Cc: Smithies, Russell; bioperl-l
Subject: Re: [Bioperl-l] taxonomy ID

Thanks a lot, everyone. The information is really useful and it solved my problem.

On Wed, Apr 1, 2009 at 8:00 AM, Sendu Bala <bix at sendu.me.uk> wrote:
Smithies, Russell wrote:
The taxonomy information isn't in the blast output unless you created
custom fasta headers for your blast database. The easiest way to get
the tax_id for your accessions would be to download the gi->tax_id
list from ftp://ftp.ncbi.nih.gov/pub/taxonomy/gi_taxid_nucl.dmp.gz.
If you load that file into a hash, parse the GI numbers out of the
blast hits (the list is keyed by GI, not by accession), then look up
the tax_id in that hash, I think it should be fairly fast.
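
A minimal sketch of that lookup, in Python rather than Perl and only as
an illustration. It assumes the dump is plain text with one
"gi<TAB>tax_id" pair per line and that the blast hit IDs look like
"gi|12345|gb|...". The function names and the sample rows are made up
for the example.

```python
import re

def load_gi_taxid(lines):
    """Build a gi -> tax_id dict from lines of the form 'gi<TAB>tax_id'."""
    gi_to_taxid = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        gi_to_taxid[int(fields[0])] = int(fields[1])
    return gi_to_taxid

# Assumed hit ID layout: 'gi|<number>|...'; adjust if your headers differ.
GI_PATTERN = re.compile(r"gi\|(\d+)")

def taxid_for_hit(hit_id, gi_to_taxid):
    """Return the tax_id for a blast hit ID, or None if it cannot be resolved."""
    match = GI_PATTERN.search(hit_id)
    if match is None:
        return None
    return gi_to_taxid.get(int(match.group(1)))

# Invented sample rows standing in for the real dump. For the real file,
# something like gzip.open(path, "rt") would supply the lines instead.
sample_dump = ["2\t9913\n", "3\t9913\n", "5\t562\n"]
gi_to_taxid = load_gi_taxid(sample_dump)

print(taxid_for_hit("gi|5|gb|XX000001.1|", gi_to_taxid))
```

Hits whose GI is not in the table come back as None, so they can be
counted or reported separately rather than silently dropped.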

Checking which are prokaryotes and which are eukaryotes based on
tax_id is a separate problem  :-) If you grab the taxdump.tar.gz file
from the same site, the nodes.dmp file contained within lists what
division each tax_id belongs to (Bacteria, Invertebrates, Mammals,
Phages, Plants, etc) so you can probably work it out from that.
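
As a rough sketch of reading the division out of the taxdump files
(again Python, for illustration only): this assumes the usual taxdump
layout, where fields are separated by "<TAB>|<TAB>" and each row ends
in "<TAB>|", that the division id is the fifth column of nodes.dmp, and
that division.dmp from the same archive maps division ids to names. The
helper names and the few sample rows are placeholders, not real file
contents in full.

```python
def split_dmp(line):
    """Split one taxdump row ('a\t|\tb\t|\t...\t|') into its fields."""
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]  # drop the row terminator
    return [field.strip() for field in line.split("\t|\t")]

def load_divisions(lines):
    """division.dmp: division id -> division name (third column)."""
    names = {}
    for line in lines:
        fields = split_dmp(line)
        names[int(fields[0])] = fields[2]
    return names

def load_node_divisions(lines):
    """nodes.dmp: tax_id -> division id (fifth column)."""
    divisions = {}
    for line in lines:
        fields = split_dmp(line)
        divisions[int(fields[0])] = int(fields[4])
    return divisions

def division_name(taxid, node_divisions, division_names):
    """Return the division name for a tax_id, or None if it is unknown."""
    division_id = node_divisions.get(taxid)
    return division_names.get(division_id)

# Shortened sample rows for illustration; real rows carry more columns.
sample_divisions = [
    "0\t|\tBCT\t|\tBacteria\t|\t\t|\n",
    "5\t|\tPRI\t|\tPrimates\t|\t\t|\n",
]
sample_nodes = [
    "562\t|\t561\t|\tspecies\t|\tEC\t|\t0\t|\n",
    "9606\t|\t9605\t|\tspecies\t|\tHS\t|\t5\t|\n",
]

division_names = load_divisions(sample_divisions)
node_divisions = load_node_divisions(sample_nodes)
print(division_name(562, node_divisions, division_names))
print(division_name(9606, node_divisions, division_names))
```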

Check out the synopsis for Bio::Taxon.

If the division() function doesn't tell you what you need, you could use
get_lineage_nodes() and check the oldest ancestors to see if it's a
prokaryote or a eukaryote.
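
The same ancestor check can be sketched without BioPerl. The Python
below is only an illustration of the idea and does not call Bio::Taxon.
It assumes you already have a tax_id -> parent tax_id map (for example,
built from the first two columns of nodes.dmp) and uses the NCBI
tax_ids 2 (Bacteria), 2157 (Archaea) and 2759 (Eukaryota). The toy tree
is heavily shortened, and the function names are made up.

```python
PROKARYOTE_ROOTS = {2, 2157}  # NCBI tax_ids for Bacteria and Archaea
EUKARYOTE_ROOT = 2759         # NCBI tax_id for Eukaryota

def lineage(taxid, parent_of):
    """Return [taxid, parent, grandparent, ...] up to the root."""
    path = [taxid]
    seen = {taxid}
    # Stop at unknown ids and at the root, which is its own parent.
    while taxid in parent_of and parent_of[taxid] not in seen:
        taxid = parent_of[taxid]
        path.append(taxid)
        seen.add(taxid)
    return path

def classify(taxid, parent_of):
    """Label a tax_id as 'prokaryote', 'eukaryote' or 'unknown'."""
    for ancestor in lineage(taxid, parent_of):
        if ancestor in PROKARYOTE_ROOTS:
            return "prokaryote"
        if ancestor == EUKARYOTE_ROOT:
            return "eukaryote"
    return "unknown"  # e.g. viruses or unclassified sequences

# Toy parent map: intermediate ranks are left out, so 562 and 9606 hang
# directly off their top-level ancestors here, unlike the real tree.
parent_of = {
    1: 1,           # root
    131567: 1,      # cellular organisms
    2: 131567,      # Bacteria
    2759: 131567,   # Eukaryota
    562: 2,         # Escherichia coli (shortened lineage)
    9606: 2759,     # Homo sapiens (shortened lineage)
    10239: 1,       # Viruses
}

for taxid in (562, 9606, 10239):
    print(taxid, classify(taxid, parent_of))
```

Anything that reaches the root without passing one of those three
ancestors is reported as "unknown" instead of being forced into either
group.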
