[Bioperl-l] Proposed bioperl module for local running of the NCBI standalone blast package

Peter Schattner schattner@alum.swarthmore.edu
Thu, 12 Oct 2000 08:31:20 -0700


Hello all,

I have a need for an easy-to-use perl interface for running the
standalone blast package from NCBI - including its blastall, blastpgp
and bl2seq programs.

I have attempted to modify Steve's "skeleton" LocalBlast.pm module for
this purpose, but have found that it does not seem to be a good fit. On
the one hand, it is quite complicated; on the other hand it does not
have an interface for some of the newer NCBI programs (eg psiblast,
phiblast).  

So instead I propose to write a relatively "lightweight" bioperl
wrapper module for running the NCBI standalone blast package.  Its
format would be similar to that of the Clustalw.pm module. I believe its
approach would also be similar to that of Jeff Chang's biopython
NCBIStandalone.py module (thanks to Brad Chapman for bringing this
module to my attention).

The syntax of the proposed module would involve creating a local blast
"factory object". The constructor would be passed the name of the blast
method and database to be used, the desired method for parsing the blast
report (Blast or BPlite) and an optional array of (non-default)
parameters to be used by the factory, eg:

@params = ('method' => 'blastn', 'database' => 'ecoli.nt','outformat' => 'BPlite');
$factory = Bio::Tools::StandAloneBlast->new(@params);

Once the factory has been created and the appropriate parameters set,
one can call one of the supported blast programs.  The input sequence to
these programs may be in the form of either a Bio::Seq object or the
filename of a fasta-formatted sequence; eg 

$input = Bio::Seq->new(-id=>"test query",-seq=>"ACTACCCTTTAAATCAGTGGGGG");
$blast_report = $factory->blastall($input);

In either case, blastall (or blastpgp) returns a reference to either a
Blast object or a BPlite object containing the results of the blast operation.
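
A minimal sketch of how such a factory might turn its parameters into a
blastall command line. BlastFactorySketch and blastall_command are
hypothetical names used only for illustration; they are not part of any
existing bioperl module. The sketch assumes the standard blastall flags
-p (program), -d (database), -i (query file) and -o (report file), and
it only builds the argument list rather than running the program.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the proposed factory; not the real module.
package BlastFactorySketch;

sub new {
    my ($class, %params) = @_;
    my $self = {
        method    => 'blastn',   # default program (an assumption)
        database  => undef,      # must be supplied by the caller
        outformat => 'BPlite',   # which parser the caller wants
        %params,                 # caller-supplied values override defaults
    };
    return bless $self, $class;
}

# Build the argument list for blastall without executing it.
sub blastall_command {
    my ($self, $query_file, $report_file) = @_;
    die "no database set\n" unless defined $self->{database};
    return (
        'blastall',
        '-p', $self->{method},
        '-d', $self->{database},
        '-i', $query_file,
        '-o', $report_file,
    );
}

package main;

my $factory = BlastFactorySketch->new(
    'method' => 'blastn', 'database' => 'ecoli.nt', 'outformat' => 'BPlite');
my @cmd = $factory->blastall_command('query.fa', 'query.blast');
print join(' ', @cmd), "\n";
```

A real wrapper would additionally write a Bio::Seq input to a temporary
fasta file, run the command, and hand the report file to the chosen parser.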

One specific question I have is whether anyone is familiar with a perl
blast parser capable of parsing terms specific to the psiblast
reports (eg 'iteration round')?  As far as I can tell neither the
current Blast.pm nor the BPlite modules have this capability.  If
necessary I would be willing to modify one of the bioperl parsers to
support parsing psiblast reports - but it would be nice to have a working
psiblast report parser to use as a model.
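
One possible starting point, sketched under the assumption that each
psiblast iteration in the report begins with a line of the form
"Results from round N": split the report text into per-round chunks
first, then pass each chunk to an existing parser. The function name
split_rounds and the sample report text are made up for illustration.

```perl
use strict;
use warnings;

# Split report text into per-round chunks keyed by round number.
# Assumes each round starts with a "Results from round N" line;
# anything before the first such line is ignored.
sub split_rounds {
    my ($text) = @_;
    my (%rounds, $current);
    for my $line (split /\n/, $text) {
        if ($line =~ /^Results from round\s+(\d+)/) {
            $current = $1;
            $rounds{$current} = '';
            next;
        }
        $rounds{$current} .= "$line\n" if defined $current;
    }
    return \%rounds;
}

# Made-up sample text, not real psiblast output.
my $report = join "\n",
    'header line',
    'Results from round 1',
    'hit A',
    'Results from round 2',
    'hit A',
    'hit B',
    '';

my $rounds = split_rounds($report);
for my $n (sort { $a <=> $b } keys %$rounds) {
    my $count = ($rounds->{$n} =~ tr/\n//);   # count lines in this round
    print "round $n: $count line(s)\n";
}
```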

In addition, I would appreciate any feedback re the usefulness,
structure, usage, interaction with other modules, etc. of this proposed object.

Thanks

Peter Schattner

PS I apologize if you have received two copies of this posting or of my
previous posting regarding Clustalw.pm; my e-mail program has been
acting rather erratically recently.