[Bioperl-l] proposed change -- symbols SimpleAlign

Chris Fields cjfields at uiuc.edu
Sun Nov 25 11:39:01 EST 2007


Bernd,

That would be when generating Bio::LocatableSeq instances for
building a Bio::SimpleAlign object.  Judging by the test suite results,
that doesn't appear to be affected.

chris

On Nov 25, 2007, at 10:13 AM, Bernd Web wrote:

> Hi,
>
> I am not sure if this is related, but I remember SimpleAlign was
> adapted to cope with more gap symbols that can occur in
> alignments/FASTA sequences, such as: . _ - =
> Previous versions would throw an error on 'illegal' gap characters.
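
A minimal sketch of the idea of tolerating several gap symbols, assuming only the four characters listed above (. _ - =). The names $GAP, count_gaps and is_aligned_string are hypothetical and are not BioPerl API; the set of residue characters accepted here is also an assumption made for illustration.

```perl
use strict;
use warnings;

# Assumed gap alphabet, taken from the list in the message: . _ - =
# (hypothetical; not the pattern BioPerl itself uses)
my $GAP = qr/[._=-]/;

# Count gap characters in an aligned sequence string.
sub count_gaps {
    my ($str) = @_;
    my $n = () = $str =~ /$GAP/g;
    return $n;
}

# True if the string contains only letters, '*' and the assumed gap symbols.
sub is_aligned_string {
    my ($str) = @_;
    return $str =~ /\A(?:[A-Za-z*]|$GAP)+\z/ ? 1 : 0;
}

print count_gaps('AC-G.T_A='), "\n";
print is_aligned_string('AC-G.T_A='), "\n";
print is_aligned_string('AC#GT'), "\n";
```

A stricter checker that accepted only '-' would reject the first string, which is the kind of 'illegal gap character' error described here.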
>
> Regards,
> Bernd
>
> On Nov 25, 2007 4:38 PM, Chris Fields <cjfields at uiuc.edu> wrote:
>> Albert,
>>
>> I was getting a single AlignIO.t fail which appeared to be related to
>> this:
>>
>> ...
>> ok 122 - The object isa Bio::Align::AlignI
>> ok 123 - consensus_string on metafasta
>>
>> not ok 124 - symbol_chars() using metafasta
>> #   Failed test 'symbol_chars() using metafasta'
>> #   in t/AlignIO.t at line 346.
>> #          got: '0'
>> #     expected: '23'
>>
>> It was because the symbol hash was initialized in the constructor (so it
>> was present, just empty).  I have changed that in CVS; all tests pass
>> now.
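
One plausible reading of that failure, sketched in plain Perl rather than with the real SimpleAlign code: if a cache slot is pre-initialised to an empty hash, a "build it only if missing" check never fires and the count stays at 0. The helper count_symbols and the hash layout below are hypothetical and only illustrate the pitfall.

```perl
use strict;
use warnings;

# Hypothetical helper, not BioPerl code: counts distinct characters,
# filling the '_symbols' cache only when that slot is missing.
sub count_symbols {
    my ($obj) = @_;
    unless ( defined $obj->{_symbols} ) {
        my %seen;
        for my $str ( @{ $obj->{_seqs} } ) {
            $seen{$_} = 1 for split //, $str;
        }
        $obj->{_symbols} = \%seen;
    }
    return scalar keys %{ $obj->{_symbols} };
}

my @seqs = ( 'ACGT-', 'ACGTN' );

# Slot pre-initialised to an empty hash: defined, so never filled.
my $preset = { _seqs => [@seqs], _symbols => {} };

# Slot left out at construction time: the first call fills it.
my $lazy = { _seqs => [@seqs] };

print count_symbols($preset), "\n";   # prints 0
print count_symbols($lazy),   "\n";   # prints 6
```

Leaving the slot out of the constructor, or testing whether the hash has any keys instead of whether it is defined, would both avoid the empty result in this sketch.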
>>
>> chris
>>
>>
>> On Nov 25, 2007, at 5:50 AM, Albert Vilella wrote:
>>
>>> Committed to CVS now. It is calculated anyway when calling symbol_chars,
>>> so...
>>>
>>> On Nov 23, 2007 12:49 AM, Chris Fields <cjfields at uiuc.edu> wrote:
>>>> Albert,
>>>>
>>>> Found it:
>>>>
>>>> http://code.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/Bio/SimpleAlign.pm.diff?r1=1.36&r2=1.37
>>>>
>>>> If it slows performance that dramatically, maybe we can move this to
>>>> a separate AlignUtils method instead.  Maybe something to ask Jason
>>>> about?
>>>>
>>>> chris
>>>>
>>>> On Nov 22, 2007, at 3:55 PM, Albert Vilella wrote:
>>>>
>>>>
>>>>> Hi,
>>>>>
>>>>> Am I right in thinking that the '_symbols' hash in SimpleAlign is only
>>>>> used if one calls the symbol_chars method?
>>>>>
>>>>> When I comment out this line:
>>>>>
>>>>> map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq) if $seq->seq; # line 257
>>>>>
>>>>> I get a nice speed boost on loading alignments.
>>>>>
>>>>> Can I comment this line out in the CVS HEAD?
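
A rough sketch of the alternative implied by the question: skip the per-character bookkeeping when a sequence is added and compute the symbol set only when it is asked for. LazyAln is a hypothetical stand-in class, not Bio::SimpleAlign, and its method names merely echo the ones discussed here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an alignment container; not Bio::SimpleAlign.
package LazyAln;

sub new {
    my ($class) = @_;
    # No '_symbols' key yet; it is created on first request.
    return bless { _seqs => [] }, $class;
}

sub add_seq {
    my ($self, $str) = @_;
    push @{ $self->{_seqs} }, $str;   # cheap: no split over every residue
    delete $self->{_symbols};         # drop any stale cached symbol set
    return;
}

sub symbol_chars {
    my ($self) = @_;
    unless ( $self->{_symbols} ) {
        my %seen;
        for my $str ( @{ $self->{_seqs} } ) {
            $seen{$_} = 1 for split //, $str;
        }
        $self->{_symbols} = \%seen;
    }
    # List of symbols in list context, their count in scalar context.
    return keys %{ $self->{_symbols} };
}

package main;

my $aln = LazyAln->new;
$aln->add_seq('ACGT-ACGT');
$aln->add_seq('ACGTNACGT');
print scalar( $aln->symbol_chars ), "\n";   # prints 6
```

With this layout, loading many long sequences pays nothing for symbol tracking unless symbol_chars is actually called; the cost is one full pass over the stored sequences on the first call after any addition.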
>>>>>
>>>>> Cheers,
>>>>>
>>>>>     Albert.
>>>>>
>>>>> [init] 5.96046447753906e-06 secs...
>>>>> [loading aln /home/avilella/ensembl/exoseq/test/ENSG00000162399.chr1.fasta]
>>>>> 0.0022270679473877 secs...
>>>>> [loading aln /home/avilella/ensembl/exoseq/test/ENSG00000158022.chr1.fasta]
>>>>> 2.14348912239075 secs...
>>>>> [loading aln /home/avilella/ensembl/exoseq/test/ENSG00000162585.chr1.fasta]
>>>>> 6.91910791397095 secs...
>>>>> [loading aln /home/avilella/ensembl/exoseq/test/ENSG00000121957.chr1.fasta]
>>>>> 15.8402290344238 secs...
>>>>>
>>>>> avilella at magneto:~$ perl /home/avilella/src/ensembl_main/ensembl-personal/avilella/exoseq/ancestral_alleles.pl
>>>>> -dir /home/avilella/ensembl/exoseq/test -verbose
>>>>> [init] 1.21593475341797e-05 secs...
>>>>> [loading aln /home/avilella/ensembl/exoseq/test/ENSG00000162399.chr1.fasta]
>>>>> 0.00294303894042969 secs...
>>>>> [loading aln /home/avilella/ensembl/exoseq/test/ENSG00000158022.chr1.fasta]
>>>>> 0.510555982589722 secs...
>>>>> [loading aln /home/avilella/ensembl/exoseq/test/ENSG00000162585.chr1.fasta]
>>>>> 1.6192569732666 secs...
>>>>> [loading aln /home/avilella/ensembl/exoseq/test/ENSG00000121957.chr1.fasta]
>>>>> 3.86473417282104 secs...
>>>>> [loading aln /home/avilella/ensembl/exoseq/test/ENSG00000203717.chr1.fasta]
>>>>> 6.99602198600769 secs...
>>>>> [loading aln /home/avilella/ensembl/exoseq/test/ENSG00000196188.chr1.fasta]
>>>>> 7.26704716682434 secs...
>>>>> [loading aln /home/avilella/ensembl/exoseq/test/ENSG00000025800.chr1.fasta]
>>>>> 8.44332504272461 secs...
>>>>> [loading aln /home/avilella/ensembl/exoseq/test/ENSG00000117475.chr1.fasta]
>>>>> 12.103296995163 secs...
>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>> Christopher Fields
>>>> Postdoctoral Researcher
>>>> Lab of Dr. Robert Switzer
>>>> Dept of Biochemistry
>>>> University of Illinois Urbana-Champaign
>>>>
>>>>
>>>>
>>>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign




