[Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter

Jason Stajich jason at bioperl.org
Mon May 12 19:53:15 EDT 2008


Okay, so there's a bug. I remember someone tried to fix something in
the writers recently, so I will have to look and see how that got
broken and how it can be fixed.
-j
On May 12, 2008, at 4:26 PM, Prachi Shah wrote:

> Hi Jason,
>
> The negative coordinates in the HSP show up when I generate a Text
> report regardless of how/if I sort the HSP order. I think it has
> something to do with the frame. In the example I gave, the Query
> sequence matches the subject sequence on the negative strand. My guess
> is that TextResultWriter somehow takes the strand into account and
> tries to recalculate the start and stop locations?
>
> Thanks,
> Prachi
>
> On Mon, May 12, 2008 at 4:21 PM, Jason Stajich <jason at bioperl.org> wrote:
>> That's a very strange bug - I don't quite understand where it is coming
>> from. If you don't mess with the HSP order and start with a report and
>> generate the Text report output, does it also give the negative
>> coordinates, or are you still reconstituting the Hit/HSP objects
>> "manually" in your code?
>>
>>  -jason
>>
>>
>>  On May 12, 2008, at 4:17 PM, Prachi Shah wrote:
>>
>>
>>> Thanks Jason for adding the sort_hsps method in
>>> Bio::Search::Hit::GenericHit. I tested it out and it works great.
>>>
>>> The other issue I have is the format of HSP start and stop coordinates
>>> when I write a new blast report (with HSPs sorted) using
>>> Bio::SearchIO::Writer::TextResultWriter. Below is an example of the
>>> same HSP alignment as output from BLAST and later when the blast
>>> report is generated by TextResultWriter. Notice the change in start
>>> and stop coordinates. I would like to keep the start and stop format
>>> as in the first case. How do I specify that? Any pointers are
>>> greatly appreciated.
>>>
>>> Thanks,
>>> Prachi
>>>
>>>
>>> ----------------------------------------------------------------------------------------------------
>>> **HSP alignment in blast report generated by BLAST itself:
>>>
>>>  Score = 10150 (1529.0 bits), Expect = 0., Sum P(3) = 0.
>>>  Identities = 2120/2345 (90%), Positives = 2120/2345 (90%), Strand = Minus / Plus
>>>
>>> Query:    2364 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2305
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251160 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2251219
>>>
>>> Query:    2304 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2245
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251220 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2251279
>>>
>>> Query:    2244 ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC 2185
>>>                ||||||||||||||                                             |
>>> Sbjct: 2251280 ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC 2251339
>>>
>>> Query:    2184 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2125
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251340 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2251399
>>>
>>> Query:    2124 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2065
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251400 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2251459
>>>
>>> Query:    2064 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2005
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251460 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2251519
>>>
>>> Query:    2004 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 1945
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251520 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 2251579
>>>
>>> Query:    1944 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 1885
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251580 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 2251639
>>>
>>> Query:    1884 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 1825
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251640 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 2251699
>>>
>>>
>>>
>>> ----------------------------------------------------------------------------------------------------
>>> ** HSP alignment written by TextResultWriter:
>>>
>>>  Score = 1529.0 bits (10150), Expect = 0., P = 0.
>>>  Identities = 2120/2345 (90%)
>>>  Frame =  -1 / +1
>>>
>>> Query: 20      CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG -39
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251160 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2251219
>>>
>>> Query: -40     TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA -99
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251220 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2251279
>>>
>>> Query: -100    ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC -159
>>>                ||||||||||||||                                             |
>>> Sbjct: 2251280 ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC 2251339
>>>
>>> Query: -160    CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG -219
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251340 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2251399
>>>
>>> Query: -220    TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG -279
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251400 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2251459
>>>
>>> Query: -280    TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG -339
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251460 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2251519
>>>
>>> Query: -340    CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG -399
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251520 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 2251579
>>>
>>> Query: -400    ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA -459
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251580 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 2251639
>>>
>>> Query: -460    CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA -519
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251640 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 2251699
>>>
>>
>>
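
A note on the numbers in the two reports above. The HSP is 2345 columns
long and the BLAST report starts the query at 2364, so if the query side
has no gaps (an assumption; the excerpt does not show the whole
alignment), the HSP spans query positions 20 to 2364. The TextResultWriter
labels (20, -39, -40, -99, ...) are then what you would get by counting
down 60 bases per line from the low end of that range instead of the
high end. The sketch below is plain Python arithmetic, not BioPerl code,
and the names line_labels and start_from_high are hypothetical; it only
illustrates that reading of the output and says nothing about where the
problem sits in the writer itself.

```python
def line_labels(low, high, strand, width=60, start_from_high=True):
    """Return (first, last) query coordinates for each line of an ungapped HSP.

    low/high are the smaller and larger query coordinates of the HSP.
    For a minus-strand query the labels should count down from `high`;
    start_from_high=False mimics counting down from `low` instead.
    """
    remaining = high - low + 1
    if strand >= 0:
        pos, step = low, 1
    else:
        pos, step = (high if start_from_high else low), -1

    labels = []
    while remaining > 0:
        n = min(width, remaining)          # bases shown on this line
        last = pos + step * (n - 1)        # coordinate of the line's last base
        labels.append((pos, last))
        pos = last + step                  # first base of the next line
        remaining -= n
    return labels


# Assumed HSP range: 20..2364 on the minus strand (2345 bases, no query gaps).
expected = line_labels(20, 2364, strand=-1)
observed = line_labels(20, 2364, strand=-1, start_from_high=False)

print(expected[:3])  # [(2364, 2305), (2304, 2245), (2244, 2185)]
print(observed[:3])  # [(20, -39), (-40, -99), (-100, -159)]
```

The first list matches the labels in the BLAST report and the second
matches the labels in the TextResultWriter report, which is consistent
with the guess that the strand is taken into account for the direction
of counting but not for the starting coordinate.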


