[Bioperl-l] problem while parsing UniProt(ltaxon.pm)

Chris Fields cjfields at uiuc.edu
Thu Mar 29 11:39:55 EDT 2007


On Mar 29, 2007, at 9:29 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Mar 29, 2007, at 8:41 AM, Sendu Bala wrote:
>>> Thanks. I've made a tentative fix to swiss.pm. The only problem  
>>> might be common names/ descriptions don't get caught on some  
>>> strange OS lines. I don't have enough experience of OS lines to  
>>> know what they might look like.
>>>
>>> Still, at least there won't be thrown exceptions, which some  
>>> users may prefer ;)
>>>
>>> I'll add tests later if and when Ambrose/ yourself confirm all is  
>>> well.
>> I'm getting it to parse but there is a '.' appended to the  
>> scientific_name():
>> Venerupis (Ruditapes) philippinarum.
>
> Ok, that should be fixed as well now. How do/will these changes  
> feed into your driver stuff? What is the status on that work? The  
> intent? Are we switching over to using swissdriver.pm et al. at  
> some point?

It's fine now. I noticed that EMBL leaves the '.' in as well, so I
fixed that too.

As for SeqIO::swissdriver, it does remove the '.' from the OS line  
but leaves it in the classification line.  Doh!  I'll try fixing that...
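
To make the trailing-period issue concrete, here is a minimal sketch of the clean-up being discussed. It is written in Python purely for illustration and is not the code in swiss.pm or SeqIO::swissdriver. The function names are hypothetical, and it assumes the "classification line" refers to the OC lines of a UniProt/Swiss-Prot flat file (a semicolon-separated lineage terminated by a period). It deliberately does not try to separate common names from the OS line.

```python
import re


def strip_trailing_period(value: str) -> str:
    """Remove a single terminating '.' plus any whitespace around it."""
    return re.sub(r"\s*\.\s*$", "", value)


def parse_os_line(line: str) -> str:
    """Return the organism text from an 'OS' line without the final period."""
    # Drop the two-letter line code, then the padding that follows it.
    value = line[2:].strip() if line.startswith("OS") else line.strip()
    return strip_trailing_period(value)


def parse_oc_lines(lines: list[str]) -> list[str]:
    """Return the lineage from one or more 'OC' lines as a list of taxa."""
    # Join continuation lines first so only the true final period is removed.
    joined = " ".join(line[2:].strip() for line in lines)
    joined = strip_trailing_period(joined)
    return [taxon.strip() for taxon in joined.split(";") if taxon.strip()]
```

Applied to the example from this thread, parse_os_line("OS   Venerupis (Ruditapes) philippinarum.") yields the name without the trailing '.', and the same helper is reused for the OC lines so the last taxon in the lineage is cleaned up in the same way.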

In relation to that, the driver/handler-based SeqIO parsers are still
being worked on as I find time (and there hasn't been much of that
lately).  I don't see them immediately replacing SeqIO's
genbank/embl/swiss parsers, though the next_seq() implementation works
fine.  It is quite possible that they will replace the older parsers
down the road (maybe post-1.6).  They aren't intended for a stable
release for now, so they may not be included in v1.6, but they pass the
current genbank/embl/swiss tests and can be included in any dev
releases for testing.  As for the Handler.t tests, I cheat a little
since the handler-based parsers don't have a write_seq() implemented
yet; I may just delegate those calls to
SeqIO::genbank/embl/swiss::write_seq() for the time being.

The general idea of what I want to do is in the following link,  
though it's woefully incomplete at this stage.  If you have any ideas  
let me know or add your own thoughts to the Talk page there!

http://www.bioperl.org/wiki/Handler-based_SeqIO_parsers

chris


