[Bioperl-l] PSI-BLAST Matrix Parser?

Stefan Kirov skirov at utk.edu
Wed Sep 8 14:07:21 EDT 2004

This seems reasonable to me. The one thing you need to consider is the
structure that should contain the matrix. The current design of
Bio::Matrix::PSM::Psm and Bio::Matrix::PSM::SiteMatrix does not allow this,
as SiteMatrix is a DNA-only object.
There are two ways to go: either change SiteMatrix to accept protein matrix
data, or add a protein matrix class to Bio::Matrix::PSM (say
Bio::Matrix::PSM::ProtMatrix) to hold the data. In the second case,
Bio::Matrix::PSM::Psm would inherit from the new class and be able to contain
the object (Psm is actually a container right now).
So you will have something like:
my $psmIO = Bio::Matrix::PSM::IO->new(-file => $file, -format => 'psi-blast'); # this will call the actual parser (Bio::Matrix::PSM::IO::psiblast)
my $header = $psmIO->....; # I guess there will be some header data

while (my $psm = $psmIO->next_psm) {
    my $psimatrix = $psm->protmatrix; # this will be a Bio::Matrix::PSM::ProtMatrix object
    $psimatrix->.....; # now process the data parsed into this object through its methods...
}
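
A minimal sketch of what such a protein matrix container might look like is given here. It is not part of Bioperl: the package name My::ProtMatrix and the methods add_position, width and score are hypothetical, chosen only to illustrate a per-position store keyed by the 20 amino acids rather than by the four nucleotides.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the proposed Bio::Matrix::PSM::ProtMatrix.
# The class name and every method below are illustrative assumptions.
package My::ProtMatrix;

# The 20 standard amino acids, one matrix column each.
our @RESIDUES = qw(A R N D C Q E G H I L K M F P S T W Y V);

sub new {
    my ($class, %args) = @_;
    return bless { id => $args{-id}, rows => [] }, $class;
}

# Append one position: a hashref mapping residue letter => score.
# Returns the new number of positions.
sub add_position {
    my ($self, $scores) = @_;
    for my $aa (@RESIDUES) {
        die "missing score for residue $aa\n" unless defined $scores->{$aa};
    }
    push @{ $self->{rows} }, { %$scores };    # store a copy
    return scalar @{ $self->{rows} };
}

# Number of positions held in the matrix.
sub width {
    my ($self) = @_;
    return scalar @{ $self->{rows} };
}

# Score of residue $aa at 1-based position $pos.
sub score {
    my ($self, $pos, $aa) = @_;
    die "position $pos out of range\n" if $pos < 1 || $pos > $self->width;
    return $self->{rows}[ $pos - 1 ]{$aa};
}

package main;

# Made-up scores, for demonstration only.
my $m    = My::ProtMatrix->new(-id => 'example');
my %flat = map { $_ => 0 } @My::ProtMatrix::RESIDUES;
$m->add_position({ %flat, M => 6 });
$m->add_position({ %flat, K => 5, R => 2 });
printf "width=%d first=%d second=%d\n",
    $m->width, $m->score(1, 'M'), $m->score(2, 'K');
```

Keeping the residue alphabet in one place is the main design point: a DNA-only class hard-codes four columns, which is why it cannot hold this kind of data unchanged.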

If you do this, maybe you should get an account and commit it yourself?
Does this make sense to you?
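
For the parsing side, a rough sketch of reading one data row of the matrix file follows. It rests on an assumed layout that should be checked against a real -Q output file: each data row is taken to hold the position, the query residue, 20 scores, 20 weighted observed percentages, the information per position, and the relative weight of gapless real matches to pseudocounts, with columns in the order A R N D C Q E G H I L K M F P S T W Y V. The function name parse_pssm_row and the hash keys are hypothetical.

```perl
use strict;
use warnings;

# Assumed column order of the matrix file; verify against the file's own header.
my @RESIDUES = qw(A R N D C Q E G H I L K M F P S T W Y V);

# Parse one data row into a hashref.
# Returns undef (in scalar context) for header, blank or other non-data lines.
sub parse_pssm_row {
    my ($line) = @_;
    my @f = split ' ', $line;    # split on whitespace, ignoring leading blanks
    return unless @f == 44 && $f[0] =~ /^\d+$/ && $f[1] =~ /^[A-Za-z]$/;

    my (%score, %percent);
    @score{@RESIDUES}   = @f[ 2 .. 21 ];     # position-specific scores
    @percent{@RESIDUES} = @f[ 22 .. 41 ];    # weighted observed percentages

    my %row = (
        position    => $f[0],
        residue     => $f[1],
        score       => \%score,
        percent     => \%percent,
        information => $f[42],
        rel_weight  => $f[43],
    );
    return \%row;
}

# A made-up row for position 1, residue M (not taken from a real file).
my $line = join ' ', 1, 'M',
    (-1) x 12, 6, (-1) x 7,     # 20 scores, M in the 13th column
    (0) x 12, 100, (0) x 7,     # 20 percentages
    '0.55', '1.20';             # information, relative weight

my $row = parse_pssm_row($line);
printf "pos %d %s: score(M)=%d percent(M)=%d info=%s weight=%s\n",
    $row->{position}, $row->{residue},
    $row->{score}{M}, $row->{percent}{M},
    $row->{information}, $row->{rel_weight};
```

A next_psm method in a format parser could collect such rows and hand them to whatever object ends up holding the protein matrix.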

James Thompson wrote:

>Thanks for the response. For reading in the actual alignment I would use
>Bio::AlignIO to read the PSI-BLAST output as it's just another alignment file,
>but the matrix file that I'm talking about is slightly different. Now that
>I've perused CVS more and learned more about how the Bio::Matrix::PSM modules
>work, I think I have a clearer picture of what I'd like to do.
>If you run PSI-BLAST with the -Q option, it will take the matrix that it
>used for the position-specific search and output it to a file. I've put a
>link to one of my matrix files up here if you'd like to look at it:
>Basically I'd like to make some Bio::Matrix::PSM::Psm objects (or at least
>a PsmI-compliant object), and I think that the correct way to do this would
>be to add a file format parser to Bio::Matrix::PSM::IO. Currently in Bioperl
>there are three format parsers:
>   - mast
>   - meme
>   - transfac
>None of these work with the PSI-BLAST matrix files.  I'd like to write a new
>matrix file parser (perhaps called psi-blast?) in the spirit of the other three.
>If I were to write this, could someone commit it for me? 
>James Thompson
>On Tue, 7 Sep 2004, Stefan A Kirov wrote:
>>I am not sure what object you are going to store your data in... Are you
>>going to develop your own class to hold the data or use an existing one?
>>Also is there any reason not to use Bio::AlignIO (it reads PSI-Blast as
>>far as I know)?
>>On Tue, 7 Sep 2004, James Thompson wrote:
>>>Dear Bioperl-ers,
>>>I'd like to parse the output of a PSI-BLAST matrix, and I was wondering if
>>>there was a Bioperl way of parsing these files. If not, I'd like to make my
>>>code general enough to be committed, and I'd like some advice on where exactly
>>>to put such a module. From my cursory knowledge of Bioperl, I think that adding
>>>another format parser to Bio::Matrix::PSM::IO would be a good way to go.
>>>I have a couple of questions:
>>>- Does anyone know what the PSI-BLAST matrix format is called?
>>>- Is this the correct place in which to put code for parsing this type of file?
>>>The file format represents a position-specific scoring matrix with some added
>>>statistical information, here's a general overview of the information available
>>>from the matrix file:
>>>Last position-specific scoring matrix computed, weighted observed percentages
>>>rounded down, information per position, and relative weight of gapless real
>>>matches to pseudocounts.
>>>Any help is greatly appreciated.
>>>James Thompson

Stefan Kirov, Ph.D.
University of Tennessee/Oak Ridge National Laboratory
1060 Commerce Park, Oak Ridge
TN 37830-8026
tel +865 576 5120
fax +865 241 1965
e-mail: skirov at utk.edu
sao at ornl.gov
