[Bioperl-l] Hidden Markov Model in Bioperl?
Yee Man Chan
ymc at paxil.stanford.edu
Fri Mar 25 18:49:53 EST 2005
I just wrote a C module for Hidden Markov Model (HMM) calculations. As
far as I can tell, Bioperl has no HMM implementation anywhere (there are
only parsers for HMMER output). Would it be a good idea for me to add
this module to Bioperl?
I am thinking of an interface like this:
- instantiate an HMM object with a string of symbols (each character
corresponds to one symbol) and a string of states. The other parameters of
the model are generated randomly, which is a good starting point for
Baum-Welch training.
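As a rough illustration of the random start described above, here is a minimal C sketch. The Hmm struct, the MAX_N and MAX_M caps, and the function names are hypothetical and are not taken from the actual module. It assumes every row of the model (initial, transition, emission) only needs to be a strictly positive distribution that sums to 1.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_N 8   /* assumed cap on the number of hidden states */
#define MAX_M 8   /* assumed cap on the number of symbols */

/* Hypothetical model layout: one character per symbol and per state. */
typedef struct {
    const char *symbols;
    const char *states;
    double init[MAX_N];          /* P(start in state i) */
    double trans[MAX_N][MAX_N];  /* P(state i -> state j) */
    double emit[MAX_N][MAX_M];   /* P(state i emits symbol k) */
} Hmm;

/* Fill row[0..len-1] with strictly positive random values summing to 1. */
static void random_distribution(double *row, int len)
{
    double total = 0.0;
    for (int i = 0; i < len; i++) {
        /* The +1 avoids exact zeros, which re-estimation cannot revive. */
        row[i] = (double)rand() + 1.0;
        total += row[i];
    }
    for (int i = 0; i < len; i++)
        row[i] /= total;
}

/* Build a model from the two strings and randomize everything else. */
void hmm_init_random(Hmm *h, const char *symbols, const char *states)
{
    int n = (int)strlen(states), m = (int)strlen(symbols);
    assert(n > 0 && n <= MAX_N && m > 0 && m <= MAX_M);
    memset(h, 0, sizeof *h);
    h->symbols = symbols;
    h->states = states;
    random_distribution(h->init, n);
    for (int i = 0; i < n; i++) {
        random_distribution(h->trans[i], n);
        random_distribution(h->emit[i], m);
    }
}
```

Calling srand() with a fixed seed before hmm_init_random() makes the starting point reproducible, which helps when comparing training runs.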
Bio::Tools::HMM->new("symbols", "states", array of initial state
probabilities, matrix of state transition probabilities, matrix of
emission probabilities)
- similar to the one before, but here we explicitly assign the HMM
parameters.
Bio::Tools::HMM->ObsSeqProb("string of observed sequence")
- return the probability of an observed sequence.
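One common way to compute this probability is the forward algorithm. The C sketch below is only an illustration under assumptions: the Hmm struct, the size caps, and the name hmm_obs_seq_prob are hypothetical, and no scaling is applied, so very long sequences would underflow.

```c
#include <assert.h>
#include <string.h>

#define MAX_N 8   /* assumed cap on the number of hidden states */
#define MAX_M 8   /* assumed cap on the number of symbols */

/* Hypothetical model layout: one character per symbol and per state. */
typedef struct {
    const char *symbols;
    const char *states;
    double init[MAX_N];          /* P(start in state i) */
    double trans[MAX_N][MAX_N];  /* P(state i -> state j) */
    double emit[MAX_N][MAX_M];   /* P(state i emits symbol k) */
} Hmm;

/*
 * Forward algorithm: probability of obs summed over all hidden paths.
 * Returns -1.0 if obs contains a character outside the alphabet, and
 * 1.0 for the empty sequence.
 */
double hmm_obs_seq_prob(const Hmm *h, const char *obs)
{
    int n = (int)strlen(h->states);
    double alpha[MAX_N], next[MAX_N], total = 0.0;

    assert(n > 0 && n <= MAX_N);
    if (obs[0] == '\0')
        return 1.0;
    for (size_t t = 0; obs[t] != '\0'; t++) {
        const char *p = strchr(h->symbols, obs[t]);
        if (p == NULL)
            return -1.0;
        int k = (int)(p - h->symbols);
        for (int j = 0; j < n; j++) {
            /* Probability mass arriving in state j at time t. */
            double reach = (t == 0) ? h->init[j] : 0.0;
            for (int i = 0; t > 0 && i < n; i++)
                reach += alpha[i] * h->trans[i][j];
            next[j] = reach * h->emit[j][k];
        }
        memcpy(alpha, next, (size_t)n * sizeof alpha[0]);
    }
    for (int i = 0; i < n; i++)
        total += alpha[i];
    return total;
}
```

A production version would normally rescale alpha at each step or work in log space, and return a log-probability instead.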
Bio::Tools::HMM->Viterbi("string of observed sequence")
- return the hidden state sequence, as a string, that maximizes the
probability of the observed sequence.
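To make the Viterbi step concrete, here is a small C sketch of the dynamic program with backpointers. The Hmm struct, the caps, and the name hmm_viterbi are hypothetical. It multiplies raw probabilities for brevity, so it is only suitable for short sequences; a real implementation would usually work in log space.

```c
#include <assert.h>
#include <string.h>

#define MAX_N 8     /* assumed cap on the number of hidden states */
#define MAX_M 8     /* assumed cap on the number of symbols */
#define MAX_T 256   /* assumed cap on the sequence length */

/* Hypothetical model layout: one character per symbol and per state. */
typedef struct {
    const char *symbols;
    const char *states;
    double init[MAX_N];          /* P(start in state i) */
    double trans[MAX_N][MAX_N];  /* P(state i -> state j) */
    double emit[MAX_N][MAX_M];   /* P(state i emits symbol k) */
} Hmm;

/*
 * Write the most probable hidden state path for obs into path, which
 * must hold strlen(obs) + 1 bytes. Returns 0 on success, or -1 if obs
 * is too long or contains a character outside the alphabet.
 */
int hmm_viterbi(const Hmm *h, const char *obs, char *path)
{
    int n = (int)strlen(h->states);
    size_t olen = strlen(obs);
    double delta[MAX_N], next[MAX_N];
    int back[MAX_T][MAX_N];
    int best = 0;

    assert(n > 0 && n <= MAX_N);
    if (olen > MAX_T)
        return -1;
    int len = (int)olen;
    for (int t = 0; t < len; t++) {
        const char *p = strchr(h->symbols, obs[t]);
        if (p == NULL)
            return -1;
        int k = (int)(p - h->symbols);
        for (int j = 0; j < n; j++) {
            double score = h->init[j];
            back[t][j] = 0;
            if (t > 0) {
                /* Best predecessor of state j at time t. */
                score = -1.0;
                for (int i = 0; i < n; i++) {
                    double v = delta[i] * h->trans[i][j];
                    if (v > score) { score = v; back[t][j] = i; }
                }
            }
            next[j] = score * h->emit[j][k];
        }
        memcpy(delta, next, (size_t)n * sizeof delta[0]);
    }
    if (len > 0) {
        for (int i = 1; i < n; i++)
            if (delta[i] > delta[best]) best = i;
        /* Follow the backpointers from the last position to the first. */
        for (int t = len - 1; t >= 0; t--) {
            path[t] = h->states[best];
            best = back[t][best];
        }
    }
    path[len] = '\0';
    return 0;
}
```

Ties between equally probable predecessors are broken in favor of the state that appears first in the states string.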
Bio::Tools::HMM->BaumWelchTraining(array of observed sequences)
- uses an array of observed sequences to find HMM parameters that
locally maximize the probability of these observed sequences. Optional
parameters can be passed to change the tolerance and the maximum number of
iterations.
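For readers who want to see the shape of the training loop, here is a compact C sketch of Baum-Welch over several sequences. Everything in it is an assumption for illustration: the Hmm struct, the caps, the function names, and in particular the stopping rule, which compares the largest parameter change against the tolerance (the actual module may well test the likelihood instead). No scaling is done, so sequences must be short.

```c
#include <assert.h>
#include <string.h>

#define MAX_N 8    /* assumed cap on the number of hidden states */
#define MAX_M 8    /* assumed cap on the number of symbols */
#define MAX_T 64   /* assumed cap on sequence length (no scaling) */

/* Hypothetical model layout: one character per symbol and per state. */
typedef struct {
    const char *symbols;
    const char *states;
    double init[MAX_N], trans[MAX_N][MAX_N], emit[MAX_N][MAX_M];
} Hmm;

/* Turn expected counts into probabilities in dst; return the largest
   change. A row with no expected counts keeps its previous values. */
static double reestimate(double *dst, const double *cnt, int len)
{
    double total = 0.0, change = 0.0;
    for (int k = 0; k < len; k++) total += cnt[k];
    if (total <= 0.0) return 0.0;
    for (int k = 0; k < len; k++) {
        double v = cnt[k] / total;
        double d = v > dst[k] ? v - dst[k] : dst[k] - v;
        if (d > change) change = d;
        dst[k] = v;
    }
    return change;
}

/* One EM pass over all sequences; returns the largest parameter change. */
static double bw_step(Hmm *h, const char *const seqs[], int nseq)
{
    int n = (int)strlen(h->states), m = (int)strlen(h->symbols);
    double ci[MAX_N] = {0}, ct[MAX_N][MAX_N] = {{0}}, ce[MAX_N][MAX_M] = {{0}};
    double a[MAX_T][MAX_N], b[MAX_T][MAX_N], change;
    int o[MAX_T];

    for (int s = 0; s < nseq; s++) {
        int T = (int)strlen(seqs[s]);
        double p = 0.0;
        assert(T > 0 && T <= MAX_T);
        for (int t = 0; t < T; t++) {
            const char *sym = strchr(h->symbols, seqs[s][t]);
            assert(sym != NULL);
            o[t] = (int)(sym - h->symbols);
        }
        /* Forward pass: a[t][j] = P(obs[0..t], state j at time t). */
        for (int t = 0; t < T; t++)
            for (int j = 0; j < n; j++) {
                double reach = (t == 0) ? h->init[j] : 0.0;
                for (int i = 0; t > 0 && i < n; i++)
                    reach += a[t - 1][i] * h->trans[i][j];
                a[t][j] = reach * h->emit[j][o[t]];
            }
        /* Backward pass: b[t][i] = P(obs[t+1..] | state i at time t). */
        for (int t = T - 1; t >= 0; t--)
            for (int i = 0; i < n; i++) {
                b[t][i] = (t == T - 1) ? 1.0 : 0.0;
                for (int j = 0; t < T - 1 && j < n; j++)
                    b[t][i] += h->trans[i][j] * h->emit[j][o[t + 1]] * b[t + 1][j];
            }
        for (int i = 0; i < n; i++) p += a[T - 1][i];
        if (p <= 0.0) continue;   /* sequence impossible under this model */
        /* E-step: add this sequence's expected counts. */
        for (int t = 0; t < T; t++)
            for (int i = 0; i < n; i++) {
                double gamma = a[t][i] * b[t][i] / p;
                if (t == 0) ci[i] += gamma;
                ce[i][o[t]] += gamma;
                for (int j = 0; t < T - 1 && j < n; j++)
                    ct[i][j] += a[t][i] * h->trans[i][j]
                              * h->emit[j][o[t + 1]] * b[t + 1][j] / p;
            }
    }
    /* M-step: renormalize the expected counts into new parameters. */
    change = reestimate(h->init, ci, n);
    for (int i = 0; i < n; i++) {
        double dt = reestimate(h->trans[i], ct[i], n);
        double de = reestimate(h->emit[i], ce[i], m);
        if (dt > change) change = dt;
        if (de > change) change = de;
    }
    return change;
}

/* Repeat EM passes until the largest parameter change is below tol or
   max_iter passes have run; returns the number of passes performed. */
int hmm_baum_welch(Hmm *h, const char *const seqs[], int nseq,
                   double tol, int max_iter)
{
    int passes = 0;
    while (passes < max_iter) {
        passes++;
        if (bw_step(h, seqs, nseq) < tol) break;
    }
    return passes;
}
```

Because Baum-Welch only finds a local optimum, the result depends on the starting parameters, which is why a random start (possibly repeated several times) is useful.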
Bio::Tools::HMM->StatisticalTraining(array of observed sequences, array of
hidden state sequences)
- when the hidden state sequences are also known, use them to determine
the parameters of an HMM using a statistical method.
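One plausible reading of "statistical method" here is plain frequency counting: count how often each start state, transition, and emission occurs in the labeled data, then normalize. The C sketch below assumes that reading. The Hmm struct, the caps, and the function names are hypothetical, no pseudocounts are added, and a row with no observations falls back to a uniform distribution (an arbitrary choice made for this example).

```c
#include <assert.h>
#include <string.h>

#define MAX_N 8   /* assumed cap on the number of hidden states */
#define MAX_M 8   /* assumed cap on the number of symbols */

/* Hypothetical model layout: one character per symbol and per state. */
typedef struct {
    const char *symbols;
    const char *states;
    double init[MAX_N];          /* P(start in state i) */
    double trans[MAX_N][MAX_N];  /* P(state i -> state j) */
    double emit[MAX_N][MAX_M];   /* P(state i emits symbol k) */
} Hmm;

/* Convert counts to relative frequencies; uniform if nothing was seen. */
static void normalize(double *dst, const double *cnt, int len)
{
    double total = 0.0;
    for (int k = 0; k < len; k++)
        total += cnt[k];
    for (int k = 0; k < len; k++)
        dst[k] = (total > 0.0) ? cnt[k] / total : 1.0 / len;
}

/*
 * Estimate all parameters from paired observed/hidden sequences by
 * counting. Returns 0 on success, or -1 (leaving h unchanged) if a pair
 * differs in length or uses a character outside the model's alphabets.
 */
int hmm_statistical_training(Hmm *h, const char *const obs[],
                             const char *const hidden[], int nseq)
{
    int n = (int)strlen(h->states), m = (int)strlen(h->symbols);
    double ci[MAX_N] = {0}, ct[MAX_N][MAX_N] = {{0}}, ce[MAX_N][MAX_M] = {{0}};

    assert(n > 0 && n <= MAX_N && m > 0 && m <= MAX_M);
    for (int s = 0; s < nseq; s++) {
        int prev = -1;
        if (strlen(obs[s]) != strlen(hidden[s]))
            return -1;
        for (size_t t = 0; obs[s][t] != '\0'; t++) {
            const char *st = strchr(h->states, hidden[s][t]);
            const char *sy = strchr(h->symbols, obs[s][t]);
            if (st == NULL || sy == NULL)
                return -1;
            int i = (int)(st - h->states), k = (int)(sy - h->symbols);
            if (prev < 0)
                ci[i] += 1.0;        /* first position: start state */
            else
                ct[prev][i] += 1.0;  /* later positions: transition */
            ce[i][k] += 1.0;         /* every position: emission */
            prev = i;
        }
    }
    normalize(h->init, ci, n);
    for (int i = 0; i < n; i++) {
        normalize(h->trans[i], ct[i], n);
        normalize(h->emit[i], ce[i], m);
    }
    return 0;
}
```

With small training sets, events that never occur get probability zero under this scheme, so adding pseudocounts before normalizing is a common refinement.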
- return the array of initial state probabilities as an @array
- return the matrix of state transition probabilities as MatrixI
- return the matrix of emission probabilities as MatrixI
This should cover most HMM applications. What do you think? Do
you have other functions in mind?
I have already contributed Bio::Tools::dpAlign, so I am not a
newbie. If people think it is a good idea to have this in Bioperl, I can
start working on it as soon as possible.