[Bioperl-l] Error while running load_seqdatabase.pl

George Heller george.heller at yahoo.com
Thu Jan 25 21:51:05 EST 2007


Hi Hilmar,
   
  I still seem to be having problems loading my FASTA file. I wrote a new package, SeqProcessor.pm, as below:
   
  package SeqProcessor::Accession;
  use strict;
  use vars qw(@ISA);
  use Bio::Seq::BaseSeqProcessor;
  use Bio::SeqFeature::Generic;

  @ISA = qw(Bio::Seq::BaseSeqProcessor);

  sub process_seq
  {
      my ($self, $seq) = @_;
      $seq->accession_number($seq->display_id);
      return ($seq);
  }

  1;

  I have this file SeqProcessor.pm in my home directory, and I have set the PERL5LIB variable accordingly. When I run load_seqdatabase.pl,
   
   perl load_seqdatabase.pl -host localhost -dbname biodb -format fasta -dbuser postgres -driver Pg --pipeline="SeqProcessor::Accession" maize_pep.fasta
   
  I still get the error,
   
  Loading maize_pep.fasta ...
  -------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::SeqAdaptor (driver) failed, values were ("FGENESHT0000001||AC155633|570|4400|1","FGENESHT0000001||AC155633|570|4400|1","FGENESHT0000001||AC155633|570|4400|1","","0","") FKs (1,<NULL>)
ERROR:  duplicate key violates unique constraint "bioentry_accession_key"
  ---------------------------------------------------
Could not store FGENESHT0000001||AC155633|570|4400|1:
------------- EXCEPTION  -------------
MSG: error while executing statement in Bio::DB::BioSQL::SeqAdaptor::find_by_unique_key: ERROR:  current transaction is aborted, commands ignored until end of transaction block
  STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key /home/akar/local/perl//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:951
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key /home/akar/local/perl//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:855
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /home/akar/local/perl//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:205
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /home/akar/local/perl//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:254
STACK Bio::DB::Persistent::PersistentObject::store /home/akar/local/perl//Bio/DB/Persistent/PersistentObject.pm:272
STACK (eval) load_seqdatabase.pl:620
STACK toplevel load_seqdatabase.pl:602
  --------------------------------------
   at load_seqdatabase.pl line 633

  Is there something I am missing?
   
  Thanks!
  George.

  

Hilmar Lapp <hlapp at gmx.net> wrote:
  Hi George, sorry for the sluggish response, I was tied up during the 
week. This is also why you always want to keep the thread on the list.

Perl is an interpreted language, so no compilation is necessary. The 
only thing you need to do is have the package in a place where perl 
can find it. The simplest way to achieve this is by setting the 
PERL5LIB environment variable:

$ export PERL5LIB=/where/you/put/your/perl/package

or if PERL5LIB was set already, you'd append it:

$ export PERL5LIB=${PERL5LIB}:/where/you/put/your/perl/package

I do assume that you didn't really add your code to the SeqAdaptor.pm 
package - there is no necessity for nor benefit from that, and at 
worst (and quite likely) perl won't be able to find the package. Note 
that there is plenty of documentation for how to write packages for 
perl and how to make them accessible to perl.

Hth,

-hilmar

On Jan 8, 2007, at 11:52 PM, George Heller wrote:

> Hi Hilmar.
>
> Thanks so much for the response. As I am new to Bioperl, I have 
> another question.
>
> I have made the changes as suggested by you, and have added the 
> code below to the SeqAdaptor.pm script.
>
> package SeqProcessor::Accession;
> use strict;
> use vars qw(@ISA);
> use Bio::Seq::BaseSeqProcessor;
> use Bio::SeqFeature::Generic;
>
> @ISA = qw(Bio::Seq::BaseSeqProcessor);
>
> sub process_seq
> {
> my ($self, $seq) = @_;
> $seq->accession_number($seq->display_id);
> return ($seq);
> }
>
> Now that I have made my changes, do I need to compile anything
> for them to take effect? If so, can you please let me know the
> command, or point me to a link that documents it?
>
> Thanks so much for the help.
> George.
>
> Hilmar Lapp wrote:
> George,
>
> this is almost certainly caused by using FASTA format and Bioperl's
> treatment of it. I am guilty of not having written a FAQ yet for
> Bioperl-db, as this would certainly be there.
>
> Specifically, the Bioperl fasta SeqIO parser (load_seqdatabase.pl
> uses Bioperl to parse sequence files) does not extract the accession
> number from the description line of the fasta sequence, and instead
> sets the accession_number property of sequence objects it creates to
> "unknown". Since there is a unique key constraint on
> (accession,version,namespace) the second sequence loaded will raise
> an exception as it will violate the constraint.
>
> The simplest way to deal with this is to write a SeqProcessor that
> massages the accession_number appropriately and then supply the
> module to load_seqdatabase.pl using the --pipeline command line 
> switch.
>
> There are several examples for how to do this in the email archives.
> See for example this thread on the Biosql list:
>
> http://lists.open-bio.org/pipermail/biosql-l/2005-August/000901.html
>
> with two links to examples, and Marc Logghe gives another one in the
> thread itself.
>
> Hth,
>
> -hilmar
>
> On Jan 8, 2007, at 3:17 PM, George Heller wrote:
>
> > Hi all.
> >
> > I am new to Bioperl and am trying to run the load_seqdatabase.pl
> > script to load sequence data from a file into Postgres database. I
> > am invoking the script through the following command:
> >
> > perl load_seqdatabase.pl -host localhost -dbname biodb06 -format fasta -dbuser postgres -driver Pg
> >
> > I am getting the following error:
> >
> > -------------------- WARNING ---------------------
> > MSG: insert in Bio::DB::BioSQL::SeqAdaptor (driver) failed, values were ("FGENESHT0000001||AC155633|570|4400|1","FGENESHT0000001||AC155633|570|4400|1","unknown","","0","") FKs (1,)
> > ERROR: duplicate key violates unique constraint "bioentry_accession_key"
> > ---------------------------------------------------
> > Could not store unknown:
> > ------------- EXCEPTION -------------
> > MSG: error while executing statement in Bio::DB::BioSQL::SeqAdaptor::find_by_unique_key: ERROR: current transaction is aborted, commands ignored until end of transaction block
> > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key /usr/lib/perl5/site_perl/5.8.5/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:948
> > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key /usr/lib/perl5/site_perl/5.8.5/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:852
> > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.5/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:203
> > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/lib/perl5/site_perl/5.8.5/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
> > STACK Bio::DB::Persistent::PersistentObject::store /usr/lib/perl5/site_perl/5.8.5/Bio/DB/Persistent/PersistentObject.pm:271
> > STACK (eval) load_seqdatabase.pl:620
> > STACK toplevel load_seqdatabase.pl:602
> > --------------------------------------
> > at load_seqdatabase.pl line 633
> >
> > Can anyone tell me how I can correct this error and get my script
> > running? Thanks!!!
> >
> > George.
>
> -- 
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================








 	 

