[Bioperl-l] Bio::DB::BioDB - insert failed. Duplicate entry '' for key 2?

Hilmar Lapp hlapp at gmx.net
Mon Mar 6 12:44:37 EST 2006

On Mar 4, 2006, at 3:52 PM, Jay Hannah wrote:

> $ perl j2.pl
> Human adenovirus type 15 | Mastadenovirus | Adenoviridae | dsDNA 
> viruses, no RNA stage | Viruses
> -------------------- WARNING ---------------------
> MSG: insert in Bio::DB::BioSQL::BioNamespaceAdaptor (driver) failed, 
> values were ("","") FKs ()
> Duplicate entry '' for key 2
> ---------------------------------------------------

This means the namespace wasn't set. Within Bioperl you rarely need to 
deal with the namespace, but in BioSQL you do (as you generally do 
whenever you need to uniquely identify a sequence, see e.g. LSID). 
Bio::PrimarySeqI has a $seq->namespace() method; just set it to 
whatever you'd like.
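
As a minimal sketch of what that could look like in your script: the 
helper name ensure_namespace and the value 'MyViruses' are made up for 
illustration and are not part of Bioperl or Bioperl-db. The only call 
it relies on is the namespace() accessor. The demo at the bottom builds 
a sequence in memory and only runs if Bioperl is installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl): give a sequence a namespace
# if it has none. Works on any object that has a namespace() accessor,
# e.g. a Bio::Seq object.
sub ensure_namespace {
    my ($seq, $default) = @_;
    my $ns = $seq->namespace;
    # Treat both undef and the empty string as "not set".
    $seq->namespace($default) unless defined $ns && length $ns;
    return $seq->namespace;
}

# Demo, executed only if Bioperl is available.
if (eval { require Bio::Seq; 1 }) {
    my $seq = Bio::Seq->new(-display_id => 'example1', -seq => 'ATGC');
    # 'MyViruses' is an arbitrary example value; pick whatever suits you.
    print ensure_namespace($seq, 'MyViruses'), "\n";
}
```

In a script like yours you would call the helper on each sequence 
object you get from Bio::SeqIO before handing it to the database.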

load_seqdatabase.pl does that automatically for you (and its 
--namespace command line option lets you provide your own value). Most 
(I believe in fact all) Bio::SeqIO parsers do not set the namespace 
because there is no universal standard that would dictate or suggest 
the "right" value.

> mysql> select * from biodatabase;
> +----------------+------+-----------+-------------+
> | biodatabase_id | name | authority | description |
> +----------------+------+-----------+-------------+
> |             23 |      | NULL      | NULL        |
> +----------------+------+-----------+-------------+
> 1 row in set (0.00 sec)

Well, that's one of the more notorious MySQL artifacts: 
biodatabase.name is declared NOT NULL, so MySQL silently converts a NULL 
value (an undef attribute value) to an empty string instead of throwing 
an error, which would probably have told you much more directly what is 
going wrong.

HTH, and thanks Marc & Chris for chiming in; I saw you were on the 
right path. If you're dealing with viral sequences, then you should 
definitely consider pre-loading the taxonomy as Marc & Chris suggested, 
because virus canonical names often (if not always) don't follow the 
standard binomial convention, so Bioperl may frequently fail to parse 
them correctly. If the NCBI taxon ID is in the feature table, Bioperl-db 
will first look up by taxon ID instead of by a possibly mis-parsed 
name.


Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
