[Bioperl-l] frac_aligned_query returning results >1.

Chris Fields cjfields at uiuc.edu
Sat Mar 3 17:07:40 EST 2007


Thiago,

Could you file a bug report and add the relevant files as attachments?

http://www.bioperl.org/wiki/Bugs
http://bugzilla.open-bio.org/

chris

On Mar 3, 2007, at 6:41 AM, Thiago Venancio wrote:

> Hi all.
>
> Sorry about this, but the bug persists. Although the number of
> problematic cases is very low (3 out of 35139), they are still present.
>
> Please find attached an example buggy blast report.
>
> The line I use to call the function is:
> print $result->query_name."\t".$hit->frac_aligned_query."\n";
>
> The warning below still appears many times while processing
> reports, so I think it is not due to the same bug.
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: Undefined sub-sequence (821,821). Valid range = 778 - 821
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
> STACK: Bio::Search::HSP::HSPI::matches /usr/share/perl5/Bio/Search/HSP/HSPI.pm:711
> STACK: Bio::Search::SearchUtils::_adjust_contigs /usr/share/perl5/Bio/Search/SearchUtils.pm:421
> STACK: Bio::Search::SearchUtils::tile_hsps /usr/share/perl5/Bio/Search/SearchUtils.pm:200
> STACK: Bio::Search::Hit::GenericHit::frac_aligned_query /usr/share/perl5/Bio/Search/Hit/GenericHit.pm:1145
> STACK: ./geraStatGenome.pl:34
> -----------------------------------------------------------
>
> I have checked the code, but I have no idea what is happening in this
> case. The attached file produces the ">1" result and triggers the
> exception, so it could be useful.
>
> Thiago
>
>
> On 3/2/07, Steve Chervitz <sac at bioperl.org> wrote:
>>
>> Glad you fixed the problem, Sendu.
>>
>> I thought this might have been due to a problem in HSPI::matches(), since
>> it was reporting (1507,1507) as an invalid range within (1444,1507), when
>> it should be valid (the last position). So it looked like an edge-condition
>> bug, but I didn't confirm. So there still could be a lingering problem in
>> the matches() function, or in the way the matches string is parsed from
>> the report.
>>
>> Speaking of which, HSPI::matches() is quite BLAST-specific. It's even
>> format-specific, since it won't work if you are parsing tabular BLAST
>> reports, as they lack any string of match symbols. I thought about moving
>> the matches implementation in HSPI into BlastHSP.pm, but that module
>> appears to not be used anymore. I'm not sure which way to go here.
>>
>> Steve
>>
>> On 3/2/07, Thiago Venancio <thiago.venancio at gmail.com> wrote:
>>
>> > Hi Sendu,
>> >
>> > Great to know you fixed the problem.
>> > I have updated SearchUtils and it seems to be correct now.
>> >
>> > Best!
>> >
>> > Thiago
>> >
>> >
>> > On 3/2/07, Sendu Bala <bix at sendu.me.uk> wrote:
>> > >
>> > > Thiago Venancio wrote:
>> > > > Hi Sendu and Chris,
>> > > >
>> > > > Thanks for the help.
>> > > > As I mentioned, I have updated my SearchUtils file from:
>> > > >
>> > > > http://code.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/Bio/Search/SearchUtils.pm
>> > > >
>> > > > I am also using the latest BioPerl version, installed from CPAN.
>> > > >
>> > > > Please find a buggy blast report attached.
>> > > > In this case, frac_aligned_query() outputs "1.04", but I have
>> > > > others with "1.57", for example.
>> > > >
>> > > > To quantify, I got ">1" values in only 61 / 53,377.
>> > >
>> > > Many thanks for that.
>> > >
>> > > I've committed another fix for SearchUtils, so please get revision 1.23
>> > > and try again. Hopefully all 61 will no longer be >1, but if any are,
>> > > please send me sample blast files again.
>> > >
>> > > For anyone interested, the bug was due to a completely unbelievable
>> > > oversight on my part in the contig merging algorithm: I forgot to deal
>> > > with contigs that were fully contained by others. Wow!
>> > >
>> >
>> >
>> >
>> > --
>> > "The way to get started is to quit talking and begin doing."
>> >       Walt Disney
>> >
>> > ========================
>> > Thiago Motta Venancio, MSc
>> > PhD student in Bioinformatics
>> > University of Sao Paulo
>> > ========================
>> > _______________________________________________
>> > Bioperl-l mailing list
>> > Bioperl-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >
>>
>>
>
> <buggyBlast.txt>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
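
The oversight Sendu describes in the quoted thread (contigs fully contained
by others) can be illustrated with a small, language-neutral sketch, written
in Python here. This is not the SearchUtils code. The function names
(merge_contigs, frac_aligned, naive_frac) and the sample coordinates are
hypothetical, and coordinates are assumed to be inclusive (start, end) pairs.
It only shows how counting a contained contig separately pushes the aligned
fraction above 1, and how merging with max() on the end coordinate avoids it.

```python
def merge_contigs(contigs):
    """Merge overlapping inclusive (start, end) intervals.

    Fully contained intervals are absorbed because the merged end is the
    max of both ends, so the outer contig is never shrunk.
    """
    merged = []
    for start, end in sorted(contigs):
        if merged and start <= merged[-1][1]:
            # Overlapping or fully contained: extend the end, never shrink it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(c) for c in merged]


def frac_aligned(contigs, query_length):
    """Fraction of the query covered by the merged contigs."""
    covered = sum(end - start + 1 for start, end in merge_contigs(contigs))
    return covered / query_length


def naive_frac(contigs, query_length):
    """Buggy variant: sums every contig length without merging."""
    return sum(end - start + 1 for start, end in contigs) / query_length


# Hypothetical contigs on a 100-residue query; (20, 40) lies inside (1, 80).
contigs = [(1, 80), (20, 40), (70, 100)]
print(naive_frac(contigs, 100))    # counts overlapping residues twice
print(frac_aligned(contigs, 100))  # stays within [0, 1]
```

With these sample values the naive sum counts 132 residues on a 100-residue
query, while the merged contigs collapse to a single (1, 100) interval.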




