Module Discussion:Bio::Tools::Run::RemoteBlast

From BioPerl

Jason comments #1

One other thing to consider: the RemoteNCBI module (or whatever it ends up being named; I am not entirely happy with that name but haven't come up with anything much better) could still support submitting and retrieving reports in HTML or text. They just can't be parsed by the Bio::SearchIO system. Since this module is purely about fetch and retrieve, and then delegates to SearchIO for the parsing, there is no reason we can't keep supporting that aspect. So a submit, fetch, save cycle could still be supported for the non-supported file formats; the module would just throw a warning and return nothing if the user tried to get the result object out for a non-supported format. Presumably the -m9/-m8 format could be supported as well, as I believe it can be requested from the CGI script and is parsed by SearchIO just fine.

A similar system would work for the EBI SOAP services as well, which I believe support FASTA and WU-BLAST as search engines through the SOAP interface.

Thanks for taking this all on with a cast of backseat programmers here to drive you mad. --jason stajich 14:37, 10 February 2006 (EST)
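
The warn-and-return-nothing behaviour described above could look roughly like the following. This is only a sketch: the %PARSABLE table, the result_for() name, and the hash used as a stand-in for a parsed result are all hypothetical and are not part of any BioPerl module.

```perl
use strict;
use warnings;

# Hypothetical list of report formats a SearchIO-style parser can handle.
# Which formats belong here is an assumption made for illustration.
my %PARSABLE = map { $_ => 1 } qw(xml tabular);

# Return a result object for parsable formats; for anything else,
# warn and return nothing. The raw report is still available for saving.
sub result_for {
    my ($format, $raw_report) = @_;
    my $key = lc $format;

    unless ( $PARSABLE{$key} ) {
        warn "Format '$format' cannot be parsed; the report can only be saved\n";
        return;
    }

    # Stand-in for handing the report to a real parser.
    return { format => $key, raw => $raw_report };
}
```

With this arrangement the submit, fetch, save cycle is unaffected by the format; only the attempt to get a result object depends on it.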

Chris comments

Are there plans for each of the various Bio::Tools::Run::Blast modules to allow saving in Text/HTML/XML/ASN.1/Tabular formats, even if they aren't all parsed via Bio::SearchIO? I noticed StandAloneBlast doesn't have anything for XML, though admittedly I know very little about what formats a local BLAST setup will return. --Chris Fields 15:04, 10 February 2006 (EST)

There is no reason it shouldn't be able to. Local BLAST produces XML with the -m 7 command-line option and tabular output with -m 8 or -m 9; ASN.1 is available via -m 10 (ASCII) or -m 11 (binary), although ASN.1 is not an often-requested format. The lack of this in StandAloneBlast is presumably because no one ever asked for it... and I think the XML is a little less expressive than the text format. I don't mind us over-engineering here (meaning building a lot of stuff no one is using yet), but only if it doesn't take up a lot of effort. --jason stajich 16:16, 10 February 2006 (EST)
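
For reference, the -m values listed above can be captured in a small lookup table. The format names used as keys and the blastall_format_args() helper are made up for this sketch; only the numeric -m values come from the comment above and the legacy blastall options.

```perl
use strict;
use warnings;

# Legacy blastall -m output options. The key names are illustrative only.
my %M_OPTION = (
    text              => 0,     # default pairwise text
    xml               => 7,
    tabular           => 8,
    tabular_commented => 9,     # tabular with comment lines
    asn1_text         => 10,
    asn1_binary       => 11,
);

# Turn a format name into the matching command-line arguments.
sub blastall_format_args {
    my ($format) = @_;
    my $m = $M_OPTION{ lc $format };
    die "Unknown output format '$format'\n" unless defined $m;
    return ( '-m', $m );
}
```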

Roger comments #1

Query submission and report retrieval are central to the planned functionality as documented, so this seems easy enough, and once again, obvious upon reflection. :}

Rather than throw an exception though, how about we have two modules along the lines of:

Bio::Tools::Run::Blast::RemoteReports  (text only documents)
Bio::Tools::Run::Blast::RemoteHits     (parsable XML)

I just worry about a feature set that works with one parm value but not another, leading the uninitiated down the wrong (frustrating) path.

Are we terribly concerned with module bloat? Is there a preference for fewer modules with greater configurations, or is it okay to look at more modules with better defined boundaries? --Rogerhall 15:44, 10 February 2006 (EST)

I think that you should just have the one module dedicated to remote BLAST queries (via HTTP), though Jason may disagree. Since Jason changed to the Bio::SearchIO plugin structure, any fixes that need to be made are in those SearchIO plugins and not in RemoteBlast (or RemoteNCBI now, I guess). For example, the changes to the text output in BLAST 2.2.13 broke the Bio::SearchIO::blast module. Now, if NCBI decides to change anything in the way BLAST reports are sent through the web page, that's a whole different ballgame. Besides those doomsday scenarios, the only changes I can foresee would allow additional formats (ASN.1, text, tabular, XML, HTML) in save_output(). I think the workflow for a typical RemoteBlast user would go something like this (with those implemented in Bio::Tools::Run::RemoteBlast indicated):
  1. instantiate RemoteBlast object - implemented, of course
  2. change relevant parameters - implemented in various get/set methods or as described in the POD
  3. submit query and check for errors in sequence/parameters/etc - implemented in submit_blast() and various other methods
  4. retrieve using the rid, monitoring for errors on NCBI's end - implemented in retrieve_blast()
  5. save report in a temp file - implemented, but I can't remember which method now... maybe retrieve_blast()?
  6. if asked, save output in a named file - implemented for text (in 1.5.1) and XML (in the bugfix I sent) via save_output(). This could do a simple check for any allowed output and just save everything from the temp file into the file passed to save_output()
  7. if parsing output, check tempfile for XML, tabular, text - implemented in relevant next_result() in Bio::SearchIO modules
or something along those lines ;> --Chris Fields 18:09, 10 February 2006 (EST)
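
The workflow above might be sketched as follows, modelled on the synopsis in the released Bio::Tools::Run::RemoteBlast POD. The run_remote_blast() wrapper, the $query_file and $out_file names, and the parameter values are placeholders chosen for illustration; method behaviour may differ between BioPerl versions, so treat this as a rough outline rather than a reference.

```perl
use strict;
use warnings;

# Sketch of the submit / retrieve / save / parse cycle.
# $query_file and $out_file are placeholder names.
sub run_remote_blast {
    my ($query_file, $out_file) = @_;

    # Loaded lazily so this file compiles even without BioPerl installed.
    require Bio::Tools::Run::RemoteBlast;

    # Steps 1-2: instantiate the object and set parameters (example values).
    my $factory = Bio::Tools::Run::RemoteBlast->new(
        -prog       => 'blastp',
        -data       => 'swissprot',
        -expect     => '1e-10',
        -readmethod => 'SearchIO',
    );

    # Step 3: submit the query sequences.
    $factory->submit_blast($query_file);

    my @results;

    # Steps 4-7: poll each request ID until its report is ready.
    while ( my @rids = $factory->each_rid ) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);

            if ( !ref $rc ) {
                # A negative code signals an error; otherwise keep waiting.
                $factory->remove_rid($rid) if $rc < 0;
                sleep 5;
                next;
            }

            # Step 6: copy the report to a named file. With several
            # request IDs this overwrites the same file each time.
            $factory->save_output($out_file);

            # Step 7: parse via the SearchIO object that was returned.
            while ( my $result = $rc->next_result ) {
                push @results, $result;
            }
            $factory->remove_rid($rid);
        }
    }
    return @results;
}
```

Calling run_remote_blast('query.fa', 'report.out') would need BioPerl installed and network access to NCBI.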

Nimrod's comments

It appears that the perldoc version of the documentation doesn't match the CPAN version, in that the retrieve_parameter & co. methods have been removed from the module. The example showing how to change parameters also seems out of date. I believe

$remote_blastxml->retrieve_parameter('FORMAT_TYPE', 'XML'); # tells NCBI to send XML back

should be

$Bio::Tools::Run::RemoteBlast::HEADER{'FORMAT_TYPE'} = 'xml';

I just don't have the guts to change it because I have no idea what I'm doing, but I wanted to help. Btw, thanks for all the awesome work guys.

Also, I noticed the link to the available databases is outdated now. If I come across the new pages, I'll correct it. --Disneylord 03:30, 20 February 2008 (EST)

Thanks for the note. The BLAST tools are in need of revision and integration (see the main page for the overall idea), so it's no surprise there are disparities. --Chris Fields 11:38, 20 February 2008 (EST)

I added the new link to the available databases, and I see someone removed its outdated brother. Scratch the earlier comment regarding the retrieve_parameter method. The deobfuscator still lists it (not deprecated), and when I ran my script on ActivePerl today I didn't get any of the errors or warnings I got when I ran it on my Mac. At my level, I can only assume some of my modules there were outdated. --Disneylord 19:44, 20 February 2008 (EST)
