[Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter

Chris Fields cjfields at uiuc.edu
Mon May 12 20:33:25 EDT 2008


I made some fixes to the writers recently.  If we can get the BLAST report
that generates this, I can work on debugging it (I'll file a bug for
tracking).

chris

On May 12, 2008, at 6:53 PM, Jason Stajich wrote:

> okay - so there's a bug - I remember someone tried to fix something  
> in the writers recently so will have to look and see how that got  
> broken and can be fixed.
> -j
> On May 12, 2008, at 4:26 PM, Prachi Shah wrote:
>
>> Hi Jason,
>>
>> The negative coordinates in the HSP show up when I generate a Text
>> report regardless of how (or whether) I sort the HSPs. I think it has
>> something to do with the frame. In the example I gave, the Query
>> sequence matches the subject sequence on the negative strand. My guess
>> is that TextResultWriter somehow takes the strand into account and
>> tries to recalculate the start and stop locations?
>>
>> Thanks,
>> Prachi
>>
>> On Mon, May 12, 2008 at 4:21 PM, Jason Stajich <jason at bioperl.org> wrote:
>>> That's a very strange bug - I don't quite understand where it is coming
>>> from.  If you don't mess with the HSP order, and just start with a report
>>> and generate the Text report output, does it also give the negative
>>> coordinates? Or are you still reconstituting the Hit/HSP objects
>>> "manually" in your code?
>>>
>>> -jason
>>>
>>>
>>> On May 12, 2008, at 4:17 PM, Prachi Shah wrote:
>>>
>>>
>>>> Thanks Jason for adding the sort_hsps method in
>>>> Bio::Search::Hit::GenericHit. I tested it out and it works great.
>>>>
>>>> The other issue I have is the format of HSP start and stop coordinates
>>>> when I write a new blast report (with HSPs sorted) using
>>>> Bio::SearchIO::Writer::TextResultWriter. Below is an example of the
>>>> same HSP alignment as output by BLAST, and again as written by
>>>> TextResultWriter. Notice the change in the Query start and stop
>>>> coordinates. I would like to keep the start and stop format
>>>> as in the first case. How do I specify that? Any pointers are
>>>> greatly appreciated.
>>>>
>>>> Thanks,
>>>> Prachi
>>>>
>>>>
>>>> ----------------------------------------------------------------------------------------------------
>>>> ** HSP alignment in blast report generated by BLAST itself:
>>>>
>>>> Score = 10150 (1529.0 bits), Expect = 0., Sum P(3) = 0.
>>>> Identities = 2120/2345 (90%), Positives = 2120/2345 (90%), Strand = Minus / Plus
>>>>
>>>> Query:    2364 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2305
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251160 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2251219
>>>>
>>>> Query:    2304 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2245
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251220 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2251279
>>>>
>>>> Query:    2244 ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC 2185
>>>>                ||||||||||||||                                             |
>>>> Sbjct: 2251280 ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC 2251339
>>>>
>>>> Query:    2184 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2125
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251340 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2251399
>>>>
>>>> Query:    2124 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2065
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251400 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2251459
>>>>
>>>> Query:    2064 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2005
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251460 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2251519
>>>>
>>>> Query:    2004 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 1945
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251520 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 2251579
>>>>
>>>> Query:    1944 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 1885
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251580 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 2251639
>>>>
>>>> Query:    1884 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 1825
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251640 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 2251699
>>>>
>>>>
>>>>
>>>> ----------------------------------------------------------------------------------------------------
>>>> ** HSP alignment written by TextResultWriter:
>>>>
>>>> Score = 1529.0 bits (10150), Expect = 0., P = 0.
>>>> Identities = 2120/2345 (90%)
>>>> Frame =  -1 / +1
>>>>
>>>> Query: 20      CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG -39
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251160 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2251219
>>>>
>>>> Query: -40     TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA -99
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251220 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2251279
>>>>
>>>> Query: -100    ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC -159
>>>>                ||||||||||||||                                             |
>>>> Sbjct: 2251280 ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC 2251339
>>>>
>>>> Query: -160    CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG -219
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251340 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2251399
>>>>
>>>> Query: -220    TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG -279
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251400 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2251459
>>>>
>>>> Query: -280    TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG -339
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251460 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2251519
>>>>
>>>> Query: -340    CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG -399
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251520 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 2251579
>>>>
>>>> Query: -400    ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA -459
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251580 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 2251639
>>>>
>>>> Query: -460    CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA -519
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251640 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 2251699
>>>>
>>>
>>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
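
A note on the numbers quoted above, offered as a sketch and not as a diagnosis of the writer code. The BLAST report labels the first query row 2364..2305, and the HSP spans 2345 columns, so if the alignment is ungapped the query range would be 20..2364 (2364 - 2345 + 1 = 20). The TextResultWriter output starts at 20 and counts down, which is what you would get by decrementing from the range start instead of the range end on a minus-strand query. The Python below only reproduces that arithmetic. The function name `row_labels`, the 60-column row width, and the inferred start of 20 are assumptions for illustration; none of this is BioPerl code.

```python
def row_labels(first, step, length, width=60):
    """Return (left, right) coordinate labels for each row of an ungapped
    alignment that is `length` columns long, printed `width` columns per row.

    `first` is the coordinate of the first printed column and `step` is +1
    when coordinates increase along the row, -1 when they decrease.
    """
    rows = []
    pos = first
    while length > 0:
        n = min(width, length)           # columns on this row
        rows.append((pos, pos + step * (n - 1)))
        pos += step * n                  # first column of the next row
        length -= n
    return rows


# Assumed query range, inferred from the quoted report (not stated in it).
HSP_START, HSP_END = 20, 2364
hsp_length = HSP_END - HSP_START + 1     # 2345, matching "2120/2345"

# Minus-strand query labelled from the range END, as in the BLAST report.
expected = row_labels(HSP_END, -1, hsp_length)

# Same decrement applied from the range START: matches the odd output.
observed = row_labels(HSP_START, -1, hsp_length)

print(expected[:2])
print(observed[:2])
```

If this reading is right, the row labels the writer should produce for the query are the `expected` values, and the fix would be in which end of the range the countdown begins from. That still needs to be confirmed against the writer source and the original report.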



