[Bioperl-l] Proposed bioperl module for local running of the NCBI standalone blast package

Hilmar Lapp hlapp@gmx.net
Mon, 16 Oct 2000 11:56:16 +0200

Peter Schattner wrote:
> So instead I propose to write a relatively "light weight"  bioperl
> wrapper module for running the NCBI standalone blast package.  Its
> format would be similar to that of the Clustalw.pm module. I believe its
> approach would also be similar to that of the Jeff Chang's biopython
> NCBIStandalone.py module (thanks to Brad Chapman for bringing this
> module to my attention).
> The syntax of the proposed module would involve creating a local blast
> "factory object". The constructor would be passed the name of the blast
> method and database to be used, the desired method for parsing the blast
> report (Blast or BPlite) and an optional array of (non-default)
> parameters to be used by the factory, eg:
> @params = ('method' => 'blastn', 'database' => 'ecoli.nt','outformat' => 'BPlite');
> $factory = Bio::Tools::StandAloneBlast->new(@params);

Sounds good to me, and would certainly be useful. We (and probably a lot
of others :) already call the stand-alone BLAST from within Perl as a
system call, but your proposal is much more transparent and re-usable,
and I like the factory idea.

Basically, I have one comment. It would be very helpful if such a module
could also support running stand-alone BLASTs in parallel, e.g., if
you've got a multi-processor machine. I know that the current NCBI BLAST
supports multi-threading, but on well-equipped machines it often scales
better to run multiple processes. So the idea is that I can pass an
array of Seq objects, these will be run in parallel, and I get back an
array of BPlite or Blast.pm objects. At a low level, there may be a
memory issue if the array is a few hundred or a few thousand seqs long
(which it is for us). So, instead of returning a full array of result
objects, one may consider a callback invoked for each finished report.
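
To make the callback idea concrete, here is a minimal sketch in plain
Perl with no bioperl dependency. The name run_parallel and its four
arguments are made up for illustration and are not part of any existing
or proposed module. The worker below just returns a string; a real one
would presumably run blastall on the sequence and return the report
text. Each child writes its report to a temporary file, and the parent
hands it to the callback as soon as that child exits, so no more than
$max_procs reports are pending at any time.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use POSIX ();

# Hypothetical helper (not a bioperl API): run $worker->($item) in up to
# $max_procs child processes and call $callback->($item, $report) in the
# parent each time a child finishes.
sub run_parallel {
    my ($items, $max_procs, $worker, $callback) = @_;
    my %running;             # pid => [item, temp file with that child's report]
    my @queue = @$items;

    # Block until one child exits, then pass its report to the callback.
    my $reap = sub {
        my $pid = waitpid(-1, 0);
        die "no child process to wait for" if $pid <= 0;
        my $job = delete $running{$pid} or return;
        my ($item, $file) = @$job;
        open my $in, '<', $file or die "cannot read $file: $!";
        my $report = do { local $/; <$in> };
        close $in;
        unlink $file;
        $callback->($item, $report);
    };

    while (@queue) {
        # Keep at most $max_procs children alive at once.
        $reap->() while scalar(keys %running) >= $max_procs;

        my $item = shift @queue;
        my ($fh, $file) = tempfile();
        close $fh;

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: a real worker might return the output of something like
            # `blastall -p blastn -d ecoli.nt -i query.fa` (an assumption here).
            open my $out, '>', $file or POSIX::_exit(1);
            print {$out} $worker->($item);
            close $out;
            POSIX::_exit(0);
        }
        $running{$pid} = [$item, $file];
    }
    $reap->() while %running;    # drain the remaining children
}

# Usage: four dummy "sequences", two processes at a time.
my @seen;
run_parallel(
    [qw(seq1 seq2 seq3 seq4)],
    2,
    sub { my ($id) = @_; return "report for $id\n" },
    sub { my ($id, $report) = @_; push @seen, $id; print $report },
);
```

Reports arrive in completion order, not input order, so the callback
gets the item along with its report. In a factory-based module the
callback could just as well receive a parsed BPlite or Blast.pm object
instead of raw text.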


Hilmar Lapp                                email: hlapp@gmx.net
NFI Vienna, IFD/Bioinformatics             phone: +43 1 86634 631
A-1235 Vienna                                fax: +43 1 86634 727