Marc.Logghe at devgen.com
Mon Nov 29 09:17:55 EST 2004
I think you will always bump into that limit; it is the limit NCBI applies to efetch.
I don't know how Bio::DB::Query::GenBank does it internally, but it should go via a two-step process:
1) you perform a query and you get a webenv and query key back
2) you fetch your sequences by passing your webenv and query key and explicitly requesting your record numbers in chunks of 500.
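A minimal sketch of the two steps above, in Python rather than Perl. It only builds the request URLs and does not contact NCBI. The function names, the `nucleotide` database, and the `gb`/`text` return format are assumptions chosen for illustration; this is not a description of what Bio::DB::Query::GenBank does internally.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities (assumed current location).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, db="nucleotide"):
    """Step 1: search with usehistory=y so the reply carries a WebEnv and query key."""
    params = {"db": db, "term": term, "usehistory": "y"}
    return f"{EUTILS}/esearch.fcgi?" + urlencode(params)


def efetch_urls(webenv, query_key, total, chunk=500, db="nucleotide"):
    """Step 2: one efetch URL per chunk of at most `chunk` records."""
    urls = []
    for start in range(0, total, chunk):
        params = {
            "db": db,
            "WebEnv": webenv,
            "query_key": query_key,
            "retstart": start,                      # zero-based offset of the first record
            "retmax": min(chunk, total - start),    # last chunk may be smaller
            "rettype": "gb",                        # assumption: GenBank flat file
            "retmode": "text",
        }
        urls.append(f"{EUTILS}/efetch.fcgi?" + urlencode(params))
    return urls
```

For a query with 5000 hits, `efetch_urls(webenv, query_key, 5000)` yields ten URLs, each asking for 500 records, so no single request exceeds the limit discussed here.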
I have also never succeeded in fetching more than 500 sequences with Bio::DB::Query::GenBank.
I am currently using a non-BioPerl script based on http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_example.pl.
NCBI also asks that these kinds of queries be run at night (EST) or on weekends, with a sleep of at least 5 sec between every fetch of 500 records.
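The pause between fetches can be sketched as below. This is a hedged illustration in Python, not the referenced example script: `fetch_throttled` is a hypothetical helper, and the `fetch` and `sleep` parameters exist only so the loop can be exercised without touching the network. The 5-second default comes from the figure mentioned above.

```python
import time
from urllib.request import urlopen


def fetch_throttled(urls, delay=5.0, fetch=None, sleep=time.sleep):
    """Fetch each URL in order, pausing `delay` seconds between requests."""
    if fetch is None:
        # Default: plain HTTP GET returning the raw response body.
        def fetch(url):
            with urlopen(url) as response:
                return response.read()

    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay)  # no pause before the first request
        results.append(fetch(url))
    return results
```

Scheduling the run for night-time EST or the weekend is left to whatever starts the script (cron, for instance); the loop itself only enforces the gap between chunks.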
> -----Original Message-----
> From: Aaron J. Mackey [mailto:amackey at pcbi.upenn.edu]
> Sent: Monday, November 29, 2004 2:59 PM
> To: Wuming Gong
> Cc: Bioperl-l at portal.open-bio.org
> Subject: Re: [Bioperl-l] Bio::DB::Query::GenBank
> If you try again late at night (meaning late at night EST), you may get
> all 5000 hits; NCBI seems to have implemented a limit of 500 entries in
> batch retrieval when network load is already high, but you may be
> successful during non-peak hours ...
> On Nov 29, 2004, at 4:26 AM, Wuming Gong wrote:
> > Hi Mona,
> > I have run into the same kind of problem. If you pull down the sequences
> > in batches of fewer than 500, it works.
> > Wuming
> > On Thu, 04 Nov 2004 21:12:40 -0700, Ligia Mateiu <lmateiu at ualberta.ca> wrote:
> >> Hi all,
> >> I used a query for which there are >5000 hits in GenBank, but my code
> >> retrieved just the very first 500.
> >> Any idea why?
> >> Thanks a lot,
> >> Mona
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at portal.open-bio.org
> >> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> Aaron J. Mackey, Ph.D.
> Dept. of Biology, Goddard 212
> University of Pennsylvania email: amackey at pcbi.upenn.edu
> 415 S. University Avenue office: 215-898-1205
> Philadelphia, PA 19104-6017 fax: 215-746-6697