cjfields at uiuc.edu
Tue Feb 5 17:59:48 EST 2008
On Feb 5, 2008, at 4:31 PM, Susan J. Miller wrote:
> Chris Fields wrote:
>> The URL has changed. I'll fix this in bioperl-live.
>> You can fix this in your script directly for now (though I hate
>> use Bio::DB::SeqHound;
>> $Bio::DB::SeqHound::HOSTBASE = 'http://dogboxonline.unleashedinformatics.com/';
> Thanks Chris, that helps a little bit, but I'm still not having much
> luck with the SeqHound DB. The CPAN SeqHound.pm documentation for
> the get_Stream_by_Query method says:
No problem. It was an easy fix.
> I get the error:
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: Id list has been truncated even after maxids requested
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/
> STACK: Bio::DB::Query::WebQuery::_fetch_ids /usr/lib/perl5/site_perl/
> STACK: Bio::DB::Query::WebQuery::ids /usr/lib/perl5/site_perl/5.8.8/
> STACK: Bio::DB::SeqHound::get_Stream_by_query /usr/lib/perl5/
> STACK: SeqHoundQuery.pl:21
> There are only 5013 sequences that match this query so it seems odd
> that the Id list is too long...or am I using SeqHound improperly?
> (My reason for trying SeqHound is that I want to set up a monthly
> cron job to download nucest fasta sequences for Drosophila
> melanogaster, and I've tried NCBI E-Utilities and the script
> generated by the NCBI ebot and in both cases some of the 570828
> records get dropped, even after running repeated attempts.)
The URL is likely far too long (a common problem when using GET rather
than POST with LWP). NCBI's efetch has the same problem, which is why
using epost is a good idea (except that it only takes GIs!). You will
have to loop through the IDs in batches of at most 250-500 to get
what you want.
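
A minimal sketch of that batching, in plain Perl with no network calls.
The batch_ids helper and the placeholder ID list are made up for
illustration; the fetch step is left as a comment because which
retrieval call to use (get_Stream_by_id on the SeqHound object is an
assumption here) depends on your setup.

```perl
use strict;
use warnings;

# Hypothetical helper: split a list of IDs into batches of at most
# $size elements, so each request URL stays short.
sub batch_ids {
    my ($size, @ids) = @_;
    die "batch size must be positive\n" unless $size > 0;
    my @batches;
    while (@ids) {
        # splice removes up to $size IDs from the front of the list
        push @batches, [ splice(@ids, 0, $size) ];
    }
    return @batches;
}

# Placeholder IDs for illustration only; in practice these would be
# the IDs returned by your query.
my @ids = (1 .. 1200);

for my $batch (batch_ids(500, @ids)) {
    # Assumption: fetch each batch here with your retrieval call,
    # e.g. something like $db->get_Stream_by_id($batch).
    printf "would fetch %d ids (%s .. %s)\n",
        scalar(@$batch), $batch->[0], $batch->[-1];
}
```

Keeping the batch size as a parameter makes it easy to lower it if the
server still rejects the request.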
Don't know if there is a way to post to SeqHound but it might be worth
investigating at some point. I also see they have a SOAP interface up.