[Bioperl-guts-l] [Bug 2576] New: SearchIO is ignoring an excellent match

bugzilla-daemon at portal.open-bio.org
Wed Aug 27 20:35:38 EDT 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2576

           Summary: SearchIO is ignoring an excellent match
           Product: BioPerl
           Version: 1.5 branch
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Bio::Search/Bio::SearchIO
        AssignedTo: bioperl-guts-l at bioperl.org
        ReportedBy: jayoung at fhcrc.org
                CC: jayoung at fhcrc.org


Hi again,

I'm parsing a lot of BLAST reports (NCBI blastall BLASTN v 2.2.18) using
SearchIO and have been spot-checking the output.

In some of the BLAST outputs, SearchIO seems to ignore the first hit, even
though it's a good one. In other outputs it picks up the first hit as
expected. A simplified version of my script is below.

In the example I'm attaching, there are 6 hits. The result object has 6 hits
according to $result->num_hits(), but when I cycle through the hits using
$result->next_hit(), the first, really good hit (E-value e-122) doesn't appear.

(An aside - did NCBI recently start leaving off the leading 1 in E-values such
as 1e-122? I doubt this is the problem, as the second hit has E-value e-105 and
it parses fine.)
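
To make the aside concrete: Perl on its own does not read a bare exponent such
as e-122 as a number (the string numifies to 0, with a warning under
"use warnings"), so a parser would need to restore the leading 1 before
comparing E-values. Below is a minimal sketch of that normalization.
normalize_evalue is a hypothetical helper name, not a BioPerl function, and
whether SearchIO does anything like this internally is not established here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): prepend the implied "1"
# when an E-value is written as a bare exponent such as "e-122".
sub normalize_evalue {
    my ($raw) = @_;
    return $raw =~ /^e/i ? "1$raw" : $raw;
}

# E-value strings mentioned in this report, plus the 1e-5 cutoff.
for my $raw (qw(e-122 e-105 1e-5)) {
    my $fixed = normalize_evalue($raw);
    printf "%-6s -> %-7s (numeric: %g)\n", $raw, $fixed, $fixed;
}
```

After normalization, both e-122 and e-105 compare as smaller than the 1e-5
cutoff, which is what one would expect from the report itself.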

Another odd thing that might shed some light: without a -signif parameter,
$result->num_hits() gives the correct answer (6 hits), but if I add
-signif=>'1e-5' when I create the SearchIO object, the result object has
only 5 hits, even though all 6 hits have E-values better than the cutoff I
specified.

I guess one solution is to redo all the BLAST runs with the -m 8 output format,
but I would love to stick to doing this with BioPerl if possible.

thanks,

Janet

The script:

#!/usr/bin/perl

use warnings;
use strict;
use Bio::SearchIO;

foreach my $file (@ARGV) {

    # Alternative constructor with an E-value cutoff; with this one,
    # num_hits() reports 5 instead of 6 for the attached example:
    #my $blastObj = Bio::SearchIO->new(-file=>$file, -format=>'blast', -signif=>'1e-5');
    my $blastObj = Bio::SearchIO->new(-file=>$file, -format=>'blast');

    while ( my $result = $blastObj->next_result() ) {
        # Number of hits the result object claims to hold
        print "num hits ", $result->num_hits(), "\n";
        # Hits actually returned by iteration
        while ( my $hit = $result->next_hit() ) {
            my $hitname = $hit->name();
            while ( my $hsp = $hit->next_hsp() ) {
                my $frac = $hsp->frac_identical();
                print "hitname $hitname frac_ident $frac\n";
            }
        }
    }
}

------------------------------------------------------------------- 

Dr. Janet Young (Trask lab)

Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168, 
P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung at fhcrc.org

-------------------------------------------------------------------

