[Bioperl-l] Question about the definition of 'gaps' in blast -m8 output...
dan.bolser at gmail.com
Fri Mar 20 13:23:00 EDT 2009
2009/3/19 Phillip San Miguel <pmiguel at purdue.edu>:
> Dan Bolser wrote:
>> 2009/3/18 Phillip San Miguel <pmiguel at purdue.edu>
>>> Dan Bolser wrote:
>>>> Can someone clarify the definition of the 'gaps' column in the blast -m8
>>>> output format for me?
>>>> I thought that the 'gaps' column was basically the number of columns in
>>>> the HSP that contain a gap character.
>>> Hi Dan,
>>> "gaps", to me, denotes the number of gaps, not the total length of all
>>> gaps.
>>> Just my interpretation, but given your results my guess is that whoever
>>> wrote blastall was thinking the way I do.
>> Yeah, I'll have to go look at the HSPs to confirm this... I'm just
>> surprised that there are not more gaps of length >1, i.e. my data (given
>> your interpretation) suggests that 90% of the HSPs have no gaps of
>> length >1.
> Sounds about right. Depends on how you have gap opening vs gap lengthening
> parameters set.
I see. I thought that by default the gap extension penalty was lower
than the gap opening penalty, so I had expected there to be more gaps
of length >1. Anyway, where can I read more about selecting parameters
for particular tasks?
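To make the two readings of 'gaps' concrete, here is a minimal Python
sketch that counts both quantities for a single pairwise alignment: the
number of gap openings (runs of '-') and the number of gap columns
(individual '-' characters). The function name gap_stats and the toy
alignment are made up for illustration. Which of the two numbers
blastall actually writes to the -m8 column should still be confirmed
against real HSPs.

```python
import re


def gap_stats(aligned_query, aligned_subject):
    """Return (gap_openings, gap_columns) for one pairwise alignment.

    gap_openings: number of contiguous runs of '-' in either row.
    gap_columns:  number of alignment columns containing a '-'.
    Assumes no column has a gap in both rows, as in a BLAST HSP.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned strings must have equal length")
    openings = 0
    columns = 0
    for row in (aligned_query, aligned_subject):
        runs = re.findall(r"-+", row)  # each run is one gap opening
        openings += len(runs)
        columns += sum(len(run) for run in runs)
    return openings, columns


# Hypothetical toy alignment, not real BLAST output.
query = "ACGT--ACGTACG-T"
subject = "ACGTTTAC-TACGAT"
print(gap_stats(query, subject))  # (3, 4)
```

If 90% of HSPs have equal values for the two counts, that would be
consistent with most gaps having length 1.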
Currently I'm BLASTing tomato sequence against potato sequence, and the
two organisms are known to be 'highly syntenic'. I'm just not sure how
that translates into how I should set the parameters. I'm after large
alignments of large regions of the chromosome. My thinking is to just
run through the list of HSPs and merge them based on gap / window size
(dynamic programming style). That way I can play with the set of HSPs
that I have and look at the effect of different merge settings, and
then I can just globally align the matching regions using SW (if I
need to). Does that sound reasonable, or is using the default settings
just dumb?
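The merging idea could be sketched roughly as below. This is a simple
greedy pass rather than true dynamic programming, it only handles
plus-strand hits (sstart <= send), and the names Hsp, parse_m8,
merge_hsps and max_gap are all hypothetical. It assumes the usual
12-column tab-separated -m8 layout, with query start/end in columns 7-8
and subject start/end in columns 9-10.

```python
from collections import namedtuple

Hsp = namedtuple("Hsp", "query subject qstart qend sstart send")


def parse_m8(lines):
    """Yield Hsp records from tab-separated -m8 lines (comments skipped)."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        yield Hsp(f[0], f[1], int(f[6]), int(f[7]), int(f[8]), int(f[9]))


def merge_hsps(hsps, max_gap):
    """Greedily merge neighbouring plus-strand HSPs into larger blocks.

    Two HSPs are merged when they share query and subject, the second
    starts no earlier than the first on the subject, and the distance
    between them is at most max_gap on both sequences.
    """
    merged = []
    order = sorted(hsps, key=lambda h: (h.query, h.subject, h.qstart, h.sstart))
    for h in order:
        if h.sstart > h.send:
            continue  # minus-strand hit: ignored in this sketch
        if merged:
            p = merged[-1]
            same_pair = (p.query, p.subject) == (h.query, h.subject)
            close = (h.qstart - p.qend <= max_gap
                     and h.sstart - p.send <= max_gap)
            if same_pair and close and h.sstart >= p.sstart:
                merged[-1] = p._replace(qend=max(p.qend, h.qend),
                                        send=max(p.send, h.send))
                continue
        merged.append(h)
    return merged


# Hypothetical example lines, not real results.
lines = [
    "tom\tpot\t95.0\t100\t5\t0\t1\t100\t1001\t1100\t1e-40\t180",
    "tom\tpot\t93.0\t100\t6\t1\t150\t249\t1160\t1259\t1e-38\t170",
    "tom\tpot\t90.0\t100\t10\t0\t5000\t5099\t9000\t9099\t1e-30\t150",
]
blocks = merge_hsps(parse_m8(lines), max_gap=200)
for b in blocks:
    print(b.query, b.subject, b.qstart, b.qend, b.sstart, b.send)
```

Re-running merge_hsps with different max_gap values on the same HSP
list is one cheap way to see how sensitive the merged blocks are to
the window size, without re-running BLAST.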