[Bioperl-l] Bio::Score of interest?
cjfields at uiuc.edu
Tue Jun 27 10:08:57 EDT 2006
> Hilmar Lapp wrote:
> > So you basically want to attach semantic information to a number, and
> > type the number thereby?
> Basically, I want to be able to stick a bunch of (different kinds of)
> numbers into an object, and later get the 'best' one out (of a
> particular kind), or sort multiple of those objects.
The 'best one' might be tricky when dealing with different kinds of scores,
esp. scores calculated in different ways. For instance, I run RNA motif
programs quite frequently (RNAMotif, ERPIN, Infernal), but all generate
'scores' based on different criteria (algorithms, different parameters, how
the author slept, and so on). RNAMotif in particular is hard to deal with
(though a great program) b/c the scores are based on criteria in the
descriptor file (the file used to describe the motif), so aren't comparable
to other descriptors, which may have their own method of generating scores,
let alone output from other programs. Which one would be 'the best'? It's
a bit subjective since the scores are predictive based upon your input,
various program limitations, specific program parameter implementations,
and so on.
I do like the idea of grouping together scores for comparison, such as when
a particular region of DNA has multiple hits from different programs with
different scores. It would at least suffice as a test on how various
programs or experimental data would compare with one another.
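As a rough sketch of that grouping idea (in Python rather than Perl, purely
for illustration; ScoreSet, its methods and the score kind names are all made
up here and are not an existing BioPerl API), a container could keep each
score tagged with its kind and only ever compare scores of the same kind:

```python
class ScoreSet:
    """Holds (kind, value) pairs; only scores of the same kind are compared."""

    def __init__(self, higher_is_better):
        # Maps each score kind to True (larger is better) or False (smaller is better).
        self._higher_is_better = dict(higher_is_better)
        self._scores = []

    def add(self, kind, value):
        # Refuse kinds whose ordering was never declared.
        if kind not in self._higher_is_better:
            raise KeyError(f"unknown score kind: {kind!r}")
        self._scores.append((kind, float(value)))

    def of_kind(self, kind):
        return [v for k, v in self._scores if k == kind]

    def best(self, kind):
        # Returns None when no score of this kind has been stored.
        values = self.of_kind(kind)
        if not values:
            return None
        return max(values) if self._higher_is_better[kind] else min(values)

    def ranked(self, kind):
        # Best first, according to the declared direction for this kind.
        return sorted(self.of_kind(kind), reverse=self._higher_is_better[kind])


# Hypothetical kinds: a bit score (higher is better) and an E-value (lower is better).
hits = ScoreSet({"infernal_bits": True, "erpin_evalue": False})
hits.add("infernal_bits", 31.5)
hits.add("infernal_bits", 44.2)
hits.add("erpin_evalue", 1e-4)
hits.add("erpin_evalue", 3e-7)

print(hits.best("infernal_bits"))
print(hits.best("erpin_evalue"))
```

Note there is deliberately no overall 'best' across kinds; deciding that
would still be up to whoever knows how the scores were produced.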
> > If so, an ontology would be the more natural choice (and in the end more
> > flexible one) for expressing this kind of information.
> I'm not really sure I understand 'and type the number', or what (useful)
> flexibility doing it with an ontology would provide.
I'm not sure, but maybe something along the lines of what the number (the
score) actually means, especially when compared to other scores. In other
words, how you could compare one score or number versus the other. An
ontology would allow more complex information to be included along with the
score information so one could make more informed choices based on how the
score was obtained, the algorithm used, the program involved, etc. Hence
flexible. Is that close, Hilmar?
To use my RNA program example above, I could include the information about
how the scores were obtained, the programs involved, parameters used, the
various raw scores, the time it took to run the program, etc. (i.e. you
could make it as specific as you wanted). This could also be extended to
data types other than program output, such as wet-bench experimental
data, which I deal with quite a bit. I think there are a few XML
specs out there besides MAGE that do this as well but I can't think of any
off the top of my head.
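To make that a bit more concrete, here is a minimal sketch (again in Python,
just as illustration; AnnotatedScore, comparable and the file names are
hypothetical, and this is not how MAGE or any ontology actually models it).
Each score carries the program and whatever context produced it, and two
scores are treated as comparable only when that information matches:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedScore:
    value: float
    program: str          # which tool produced the score
    context: tuple = ()   # sorted (name, value) pairs: descriptor, parameters, ...

    @classmethod
    def make(cls, value, program, **context):
        # Sort the context so that keyword order does not affect equality.
        return cls(float(value), program, tuple(sorted(context.items())))


def comparable(a, b):
    """Scores are comparable only if program and context match exactly."""
    return (a.program, a.context) == (b.program, b.context)


# Hypothetical descriptor and model file names.
a = AnnotatedScore.make(12.0, "RNAMotif", descriptor="hairpin_a.descr")
b = AnnotatedScore.make(9.5, "RNAMotif", descriptor="hairpin_a.descr")
c = AnnotatedScore.make(30.0, "RNAMotif", descriptor="hairpin_b.descr")
d = AnnotatedScore.make(30.0, "Infernal", model="hairpin.cm")

print(comparable(a, b))  # True: same program, same descriptor
print(comparable(a, c))  # False: different descriptor
print(comparable(a, d))  # False: different program
```

An ontology would presumably go further than exact matching, e.g. by saying
which kinds of scores are related, but the tagging idea is the same.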
> > Have you looked at the concept of 'quantitation types', e.g. in MAGE
> > (the XML [MAGE-ML] or the object model [MAGE-OM])?
> I had a quick look, but not really sure what you intended to suggest here.
I think the idea is that MAGE, strictly as an example, deals with microarray
data from different sources or different data systems for comparison.
Sounds a little like what you want to do.
> > There is no quantitation type ontology at a repository I know of. I have
> > used my own ones in the past and they have been pretty useful.
> Can you provide a brief example of what you mean?
> If it would be appropriate to implement a Bio::Score with an ontology
> that's fine. Would we want a Bio::Score implemented though? Or are you
> suggesting each module make its own quantitation type ontology when it
> wants to deal with numerous scores?
> I like the idea of a Bio::Score because then you can compare complex
> scores from multiple different unrelated modules.
Which is what MAGE does in a way, but more narrowly, i.e. just for
microarray data from different sources. So the array data may be calculated
in different ways based upon the specs for different machines, the way array
slides were prepared, how the experimenter slept, etc.