[Bioperl-l] Genome Information
shalabh.sharma7 at gmail.com
Tue Oct 26 12:23:02 EDT 2010
This information is really useful.
I was actually using Bio::DB::Taxonomy for the taxonomy information
and Bio::DB::EUtilities to get the genome size (I didn't know that I could
just use Bio::DB::EUtilities for all of the information).
I was very confused about how to get the GC% and coding% info, but I
think WWW::Mechanize might help me out.
I really appreciate your help.
On Tue, Oct 26, 2010 at 12:11 PM, Chris Fields <cjfields at illinois.edu> wrote:
> I don't know if there is a quick one-step way of getting this information
> via NCBI w/o wrangling with query term limit magic, and even then you will
> be bound to whatever version of the genome is present within the database of
> For instance, via eutils you can get summary information for various taxa,
> genomes, and genome projects using the following example code (prints the
> first 10 archaeal genome project summaries; set the '-db' parameter to one
> of 'genomeprj', 'taxonomy', 'genome'):
> use Bio::DB::EUtilities;
>
> my $term = "Archaea[ORGN]";
>
> my $eutil = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                      -db         => 'genome',
>                                      -email      => 'cjfields at bioperl.org',
>                                      -usehistory => 'y',
>                                      -term       => $term);
>
> my $hist = $eutil->next_History || die "No history returned";
>
> $eutil->set_parameters(-eutil   => 'esummary',
>                        -history => $hist,
>                        -retmax  => 10);
>
> $eutil->print_all; # print summary info to STDOUT
> GC and coding % don't appear to be stored in any of the above databases,
> but they are displayed via the genome overview. You could probably use
> something like WWW::Mechanize to grab the summary table information
> displayed using the Genome UID:
> Just don't spam the server with a billion requests (use a timeout!) or
> you'll find yourself blocked. I may pop an email to NCBI to see if this
> information is programmatically accessible.
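The advice above about not flooding the server can be sketched in plain Perl. This is only a minimal illustration, not code from the thread: the helper names `batches` and `throttled_fetch`, the batch size, the half-second delay, and the placeholder IDs are all assumptions, and the callback stands in for whatever request (EUtilities or WWW::Mechanize) is actually made per batch.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep);   # core module; allows fractional-second sleeps

# Split a list of IDs into batches of at most $size elements.
# (Hypothetical helper, not part of BioPerl.)
sub batches {
    my ($size, @ids) = @_;
    die "batch size must be positive\n" if $size < 1;
    my @out;
    while (@ids) {
        push @out, [ splice(@ids, 0, $size) ];
    }
    return @out;
}

# Call $fetch->(\@batch) once per batch, pausing $delay seconds
# between consecutive calls so the server is not hit in a tight loop.
# $fetch is a caller-supplied callback (an assumption of this sketch).
sub throttled_fetch {
    my ($fetch, $delay, @batches) = @_;
    my @results;
    for my $i (0 .. $#batches) {
        sleep($delay) if $i > 0 && $delay > 0;
        push @results, $fetch->($batches[$i]);
    }
    return @results;
}

# Usage with a stand-in callback that only reports each batch's size.
# The IDs are placeholders, not real taxon IDs.
my @ids   = (1001 .. 1005);
my @sizes = throttled_fetch(sub { scalar @{ $_[0] } }, 0.5, batches(2, @ids));
print join(",", @sizes), "\n";
```

In real use the callback would issue one request per batch and return whatever it parsed; the appropriate delay depends on the server's usage policy, so the value here should be treated as a placeholder.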
> On Oct 26, 2010, at 9:09 AM, shalabh sharma wrote:
> > Hi All,
> > I have thousands of taxon IDs and I need to find the following
> > information about their genomes:
> > 1) Taxonomy information
> > 2) GC%
> > 3) total coding genes %
> > I can easily find the taxonomy info by using Bio::DB::Taxonomy, but for
> > the other two I am stuck.
> > Is there any way I can find this info?
> > I would really appreciate your help.
> > Thanks
> > Shalabh
> > -------------------------------
> > Shalabh Sharma
> > Scientific Computing Professional Associate (Bioinformatics Specialist)
> > Department of Marine Sciences
> > University of Georgia
> > Athens, GA 30602-3636
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l