[Bioperl-l] What does Expect(2) mean in a blast result?

Amir Karger akarger at CGR.Harvard.edu
Mon Nov 19 10:38:26 EST 2007


 

> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu] 
> Sent: Tuesday, November 13, 2007 12:42 PM
> To: Amir Karger
> Cc: Steve Chervitz; Dave Messina; bioperl-l
> Subject: Re: [Bioperl-l] What does Expect(2) mean in a blast result?
> 
> Amir,
> 
> Can you file this as a bug?  

Done.

http://bugzilla.open-bio.org/show_bug.cgi?id=2399

> Dave mentioned he would look into it, but I think it warrants tracking
> to make sure it gets fixed:
> 
> http://www.bioperl.org/wiki/Bugs
> 
> Attach the example BLAST report from your last post to the report.   
> BTW, I wonder how this appears in XML output?
> 
> chris
> 
> On Nov 13, 2007, at 11:30 AM, Amir Karger wrote:
> 
> >> From: trutane at gmail.com [mailto:trutane at gmail.com] On Behalf
> >> Of Steve Chervitz
> >>
> >> The Bioperl blast parser should extract that value and you can obtain
> >> it from an HSP object, via the HSPI::n() method, documented here:
> >>
> >> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Search/HSP/HSPI.html#POD23
> >
> > As I mentioned in my email:
> >
> > And does anyone know off-hand if Bioperl will tell me when situations
> > like this happen? I thought the Bio::Search::HSP::BlastHSP::n subroutine
> > would help, but I just get a bunch of empty strings for that, whether or
> > not there's a (2) in the Expect string. (hsp->n is empty, hsp->{"_n"} is
> > undef.)
> >
> > And the docs for n() actually say, "This value is not defined with NCBI
> > Blast2 with gapping" although they don't say why. Which may explain why,
> > when I ran the following code on the blast result I included in my last
> > email, I got empty values for all of the n's. (Why is n() undefined for
> > gapped blast if I'm getting n's in my results from that blast?)
> >
> > use warnings;
> > use strict;
> > use Bio::SearchIO;
> >
> > my $blast_out = $ARGV[0];
> > my $in = Bio::SearchIO->new(-format => 'blast',
> >                             -file   => $blast_out,
> >                             -report_type => 'tblastn');
> >
> > print join("\t", qw(Qname Qstart Qend Strand Sname Sstart Send Frame
> >                     N Evalue)), "\n";
> > while(my $query = $in->next_result) {
> >     while(my $subject = $query->next_hit) {
> >         while (my $hsp = $subject->next_hsp) {
> >             print join("\t",
> >                 $query->query_name,
> >                 $hsp->start("query"),
> >                 $hsp->end("query"),
> >                 $hsp->strand("hit"),
> >                 $subject->name,
> >                 $hsp->start("hit"),
> >                 $hsp->end("hit"),
> >                 $subject->frame,
> >                 $hsp->n,
> >                 $hsp->evalue,
> >             ),"\n";
> >         }
> >     }
> > }
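Until n() is populated for these reports, one way to see which HSPs were grouped by sum statistics is to read the Expect(N) annotation straight from the raw text report. This is only a minimal sketch, not BioPerl's parser. The helper name expect_n_from_line and the sample score lines are hypothetical, and treating a plain "Expect =" as a group of one is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): given one line of a BLAST text
# report, return N from "Expect(N) = ...", 1 for a plain "Expect = ..."
# (assumption: no sum-statistics grouping), or nothing if the line has no
# Expect field at all.
sub expect_n_from_line {
    my ($line) = @_;
    return $1 if $line =~ /Expect\((\d+)\)\s*=/;
    return 1  if $line =~ /Expect\s*=/;
    return;
}

# Made-up score lines in the style of NCBI BLAST text output.
my @lines = (
    " Score = 51.2 bits (121), Expect(2) = 3e-09",
    " Score = 98.6 bits (244), Expect = 2e-21",
    " Identities = 25/60 (41%), Positives = 37/60 (61%)",
);

# Print the extracted N (or '-' when absent) next to each line.
for my $line (@lines) {
    my $n = expect_n_from_line($line);
    printf "%s\t%s\n", defined $n ? $n : '-', $line;
}
```

Matching on the raw line keeps the check independent of whichever parser version is installed, but it has to be paired with the HSP coordinates by hand, so it is a stopgap and not a replacement for a fixed n().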
> >
> >> Dave's basically correct in his explanation. It's a result of the
> >> application of sum statistics by the blast algorithm. You can read all
> >> about it in Korf et al's BLAST book. Here's the relevant section:
> >
> > [snip]
> >
> > Thanks,
> >
> > -Amir
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 