[Bioperl-l] GenBank accession bug?

Chris Fields cjfields at uiuc.edu
Wed Feb 21 14:12:57 EST 2007


I'm forwarding this to the mailing list.  In the future please
post/respond to the regular mailing list so other BioPerl
developers/users can comment.  You'll get feedback much faster here
(and maybe even some support!).

The issue at hand is whether we can support GenBank
accession/display_id/version values that follow your naming scheme.
My feeling is that support for non-alphanumeric characters was removed
to comply with the GenBank standard for accessions, though I may be
wrong.  Maybe someone who was around during BioPerl 1.2 can elaborate?

From http://bugzilla.open-bio.org/show_bug.cgi?id=2214:
Thanks for the verbose explanation. It seems that I would need to apply
my local patches to the BioPerl module(s). With BioPerl-1.2 there was
no problem with '-' in sequence names.

The problem is that the project we participate in (the Vizier project)
has adopted the following sequence naming convention:

VZ##<virus_ICTV>-(<GenBank LOCUS ID>or<strain designation>)-<$$>

VZ Stands for Vizier

## Your 2-digit partner ID within the VIZIER consortium

<virus_ICTV> Virus name according to the ICTV nomenclature;

<GenBank LOCUS ID>,
<strain designation> If the sequence has not been assigned a GenBank
LOCUS ID, an available strain designation, as short as possible,
should be used

<$$> Unique 2-digit number, at your discretion, to label the sequence variant
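
To make the convention above concrete, here is a minimal sketch of a parser for such names, written in Python for brevity (it is not BioPerl code and not part of any patch). Several things are assumptions, not part of the stated convention: the parentheses in the template are treated as grouping only, not as literal characters; the virus name is assumed to contain no '-', while the LOCUS ID or strain designation may; and the names `VizierName`, `parse_vizier_name` and the sample identifier are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class VizierName(NamedTuple):
    """Fields of a Vizier-style sequence name (hypothetical container)."""
    partner_id: str   # the '##' field: 2-digit partner ID
    virus: str        # the <virus_ICTV> field
    identifier: str   # GenBank LOCUS ID or strain designation
    variant: str      # the '<$$>' field: 2-digit variant number


# Assumptions: the virus name has no '-', the middle field may contain '-',
# and the name always ends in '-' followed by exactly two digits.
_PATTERN = re.compile(r"VZ(\d{2})([^-]+)-(.+)-(\d{2})")


def parse_vizier_name(name: str) -> Optional[VizierName]:
    """Return the parsed fields, or None if the name does not fit the pattern."""
    match = _PATTERN.fullmatch(name)
    if match is None:
        return None
    return VizierName(*match.groups())


# Made-up example name, for illustration only.
print(parse_vizier_name("VZ07DENV-AB123456-01"))
```

The point of the sketch is that every such name contains at least two '-' characters, which is why a parser that rejects non-alphanumeric characters in these fields cannot accept names of this form.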


