[Bioperl-l] arbitrary hashes, blast, statistics, parameters,
and java interoperability
Aaron J. Mackey
amackey at pcbi.upenn.edu
Fri May 14 07:57:42 EDT 2004
I'd be happy to see the statistics and parameters turn into full
objects, and could even imagine some useful functions that a
Bio::Search::Statistics::BLAST object might provide:
my $stats = $result->statistics;
# use the report's database size to get a bit score threshold
# that corresponds to a given expectation threshold:
my $bitscore_threshold = $stats->E_to_bits(1e-6);
# vice versa:
my $expect_threshold = $stats->bits_to_E(32.0);
# calculate a bitscore or expectation for a given comparison:
my $bitscore = $stats->bitscore($rawscore, $querylen, $liblen);
my $exp = $stats->expect($rawscore, $querylen, $liblen);
# Make Warren Gish happy:
my $nats = $stats->bits_to_nats($bitscore);
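As a rough illustration of what such methods might compute, here is a minimal sketch assuming the standard Karlin-Altschul relations, E = m * n * 2^(-S') and S' = (Lambda * S - ln K) / ln 2, where m * n is the effective search space. The function names echo the proposal above but are plain hypothetical helpers, not existing BioPerl code, and the lengths used are made up:

```perl
use strict;
use warnings;

# Hypothetical helpers; the names mirror the methods proposed above,
# but these are plain functions, not an existing BioPerl API.

# $search_space = effective query length * effective database length
sub E_to_bits {
    my ($expect, $search_space) = @_;
    return log($search_space / $expect) / log(2);
}

sub bits_to_E {
    my ($bits, $search_space) = @_;
    return $search_space * 2 ** (-$bits);
}

# Raw score to bit score, given Lambda and K (log is the natural log)
sub raw_to_bits {
    my ($raw, $lambda, $K) = @_;
    return ($lambda * $raw - log($K)) / log(2);
}

# One bit is ln(2) nats
sub bits_to_nats {
    my ($bits) = @_;
    return $bits * log(2);
}

my $space     = 300 * 1_000_000;    # made-up query and database lengths
my $threshold = E_to_bits(1e-6, $space);
printf "bit score threshold for E <= 1e-6: %.1f\n", $threshold;
printf "round-trip E: %.2g\n", bits_to_E($threshold, $space);
```

A real statistics object would presumably take the search space, Lambda and K from the parsed report rather than as arguments.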
I realize you (and 99.9% of the world) only care about BLAST statistics
and parameters, but I really do think you should subclass these things
so that we can plug in others when/if necessary. I would think that
all an interface should guarantee are generic data access methods
(get_param, set_param, etc).
$stats->set_param( Lambda => 0.123 );
$stats->set_param( K => 0.002 );
Specific subclasses might include direct parameter access:
But we shouldn't try to agree on "universal" statistical parameters,
because they really don't exist.
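One way this split could look, as a minimal sketch only: a generic base class that promises nothing beyond get_param/set_param, and a BLAST-specific subclass that layers direct accessors on top. The package names (My::Search::StatisticsI, My::Search::Statistics::BLAST) and the accessor names lambda and K are hypothetical, chosen for illustration, and are not existing BioPerl modules:

```perl
use strict;
use warnings;

# Hypothetical generic base class: only get_param/set_param are promised.
package My::Search::StatisticsI;

sub new {
    my ($class, %params) = @_;
    return bless { _params => {%params} }, $class;
}

sub get_param {
    my ($self, $name) = @_;
    return $self->{_params}{$name};
}

sub set_param {
    my ($self, $name, $value) = @_;
    $self->{_params}{$name} = $value;
    return $value;
}

# Hypothetical BLAST-specific subclass adding direct accessors.
package My::Search::Statistics::BLAST;
our @ISA = ('My::Search::StatisticsI');

sub lambda {
    my $self = shift;
    $self->set_param(Lambda => shift) if @_;
    return $self->get_param('Lambda');
}

sub K {
    my $self = shift;
    $self->set_param(K => shift) if @_;
    return $self->get_param('K');
}

package main;

my $stats = My::Search::Statistics::BLAST->new;
$stats->set_param(Lambda => 0.123);    # generic access
$stats->K(0.002);                      # direct access
printf "Lambda=%s K=%s\n", $stats->lambda, $stats->get_param('K');
```

Code that only knows the generic interface keeps working against any subclass, while BLAST-aware code can use the shorter accessors.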
In terms of run-time parameters, I would guess that a
Bio::Tools::Run::ParameterI kinda thing would be appropriate; that way,
you could build a runtime parameter object, pass it off to the
runnable, and get a result object back that included the (possibly
modified) parameter object.
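A minimal sketch of that round trip, with every name invented for illustration (My::Run::Parameters, My::Run::Runnable, and the "expect" and "matrix" keys are assumptions, not existing BioPerl classes or a real program's options):

```perl
use strict;
use warnings;

# Hypothetical run-time parameter holder (not an existing BioPerl class).
package My::Run::Parameters;

sub new       { my ($class, %p) = @_; return bless { %p }, $class }
sub get_param { my ($self, $name) = @_; return $self->{$name} }
sub set_param { my ($self, $name, $value) = @_; $self->{$name} = $value }
sub names     { my ($self) = @_; return sort keys %$self }

# Hypothetical runnable: fills in a default, then hands the parameters back.
package My::Run::Runnable;

sub run {
    my ($class, $params) = @_;
    # the wrapped program may default or adjust parameters at run time
    $params->set_param(expect => 10)
        unless defined $params->get_param('expect');
    # stand-in result: a real one would be a proper result object
    return { hits => [], parameters => $params };
}

package main;

my $params = My::Run::Parameters->new(matrix => 'BLOSUM62');
my $result = My::Run::Runnable->run($params);
print "$_ = ", $result->{parameters}->get_param($_), "\n"
    for $result->{parameters}->names;
```

The caller can then inspect the returned parameter object to see what was actually used, including any defaults the program filled in.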
On May 14, 2004, at 12:52 AM, Chad Matsalla wrote:
> Greetings all,
> I am writing a web service that provides Bio::Search::Result objects to
> a Java client. Yes, this does work and yes, it is very kewl.
> I created UML models for all of the components required to produce a
> Bio::Search::Result (Bio::Seq, Bio::HitI, etc) and used a code
> generation system to create Java classes that match. Would you like me
> to contribute this UML model (XMI format) to the project? I notice that
> the UML for Bioperl is a bit... dated.
> I tell a Java client to ask for a Bio::Search::Result from a SOAP::Lite
> service. This works, until...
> The _statistics and _parameters attributes of a Bio::Search::Result
> object are hashes. Although Java has a corresponding Hashtable class,
> it is not smart enough to deserialize a perl hash in an efficient,
> hack-free manner.
> I propose creating a SearchStatistics module that would hold these
> statistics and a SearchParameters object that would hold the
> I understand that hashes are used when you need an arbitrary data
> structure. At least in the case of Blast we know what the keys in a
> statistics and parameters hashtable are going to be so why not have
> At this time, I really only care about Blast results. Does anybody see
> why I should not change those two parameters to refer to objects rather
> than hashes in the Blast parts of the SearchIO subsystem?
> In the case that I create, for example, a SearchStatistics object I
> think that code based on the fact that _statistics is a hash would not
> break because _statistics is still a hash; it is just an object hash.
> Can anybody suggest what package these modules should belong to?
> I'm very eager to do this so unless there are reasonable objections I
> will do it this weekend. If it suddenly breaks tests or something I can
> undo it.
> I have invested significant time in Java<->BioPerl interoperability
> web services and if anybody is interested in my work just give me a
> shout (ISMB/BOSC?).
> Chad Matsalla
Aaron J. Mackey, Ph.D.
Dept. of Biology, Goddard 212
University of Pennsylvania email: amackey at pcbi.upenn.edu
415 S. University Avenue office: 215-898-1205
Philadelphia, PA 19104-6017 fax: 215-746-6697