[Bioperl-l] One more load_seqdatabase.pl question

Hilmar Lapp hlapp at gmx.net
Tue Nov 28 23:54:53 EST 2006


These are the protein translations stored in the feature table as  
tags of features, right?

You can change the type of the column (although there may be some  
issues when you update the column because the NVL() clause won't work  
if I recall that correctly), but doing so will deprive you of any  
'normal' searches against that column. (You can still use functions  
from the DBMS_LOB package, but they will be much slower and are  
completely non-standard.)

It is up to you whether that is too high a price to pay for having  
some redundant protein translations (translating the feature's DNA  
sequence should give you the same) in the database. I always trimmed  
those feature tags off (using a custom SeqProcessor). An alternative  
is to convert these feature tags into actual bioentries (i.e.,  
Bio::Seq objects; again, a custom SeqProcessor will allow you to do  
that).
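
The trimming idea above can be sketched independently of BioPerl. This is only an illustration of the logic, not the processor I used: a real SeqProcessor would be a Perl class working on Bio::Seq objects. The Python function `strip_tag`, the dictionary layout of a feature, and the feature ids are all assumptions made up for this example.

```python
# Illustration only: features are plain dicts with a hypothetical layout,
# {"id": ..., "tags": {tag_name: [values, ...]}}. A real SeqProcessor would
# operate on BioPerl sequence and feature objects instead.

def strip_tag(features, tag="translation"):
    """Remove `tag` from every feature and return the removed values.

    The returned (feature_id, value) pairs could either be discarded
    (trimming) or turned into separate protein entries (the alternative
    mentioned above).
    """
    removed = []
    for feature in features:
        # pop() drops the tag if present and leaves other tags untouched
        for value in feature["tags"].pop(tag, []):
            removed.append((feature["id"], value))
    return removed


features = [
    {"id": "CDS_1", "tags": {"gene": ["abc1"], "translation": ["MVAVTGEV"]}},
    {"id": "gene_1", "tags": {"gene": ["abc1"]}},
]
proteins = strip_tag(features)
```

After the call, no feature carries a translation tag any more, so nothing oversized reaches the qualifier value column, and `proteins` holds the removed sequences in case they should be loaded as entries of their own.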

	-hilmar

On Nov 28, 2006, at 4:13 PM, gang wu wrote:

> Hi everyone,
>
> I'm using load_seqdatabase.pl to upload some GenBank genome sequences
> to my Oracle BioSQL database. I saw some errors (see the attached
> warning message) related to seqfeature_qualifier_value
> (the SG_SEQFEATURE_QUALIFIER_ASSOC.VALUE column), which has a VARCHAR2
> data type with a maximum of 4000 bytes. Has anybody mentioned this
> issue before? Should I just change the column to a type that can
> store more data, such as LONG or CLOB?
>
> Thanks.
>
> Gang
>
>
> Log information:
> ============================================
>
> load_seqdatabase.pl -host elegans -driver Oracle -dbname sparc -dbuser
> biosqldb-sgowner -dbpass PASS -format genbank -namespace genbank
> /genomeseq/arabidopsis//NC_003070.gbk
>
>
> Loading /genomeseq/arabidopsis//NC_003070.gbk ...
>
>
> -------------------- WARNING ---------------------
> MSG: SimpleValueAdaptor::add_assoc: unexpected failure of statement
> execution: ORA-01461: can bind a LONG value only for insert into a LONG
> column (DBD ERROR: error possibly near <*> indicator at char 12 in
> 'INSERT INTO <*>seqfeature_qualifier_value (fea_oid, trm_oid, value,
> rank) VALUES (:p1, :p2, :p3, :p4)')
>     name: INSERT ASSOC [2]
> Bio::SeqFeature::Generic;Bio::Annotation::SimpleValue
>     values: FK[Bio::SeqFeature::Generic]:14898,
> FK[Bio::Annotation::SimpleValue]:800,
> value:"MVAVTGEVLHLLRRYLGEYVHGLSTEALRISVWKGDVVLKDLKLKAEALNSLKLPVAVKSGFV 
> GTITLKVPWKSLGKEPVIVLIDRVFVLAYPAPDDRTLKFFTLVGTEFAYTNYIPGGRQGKASRNQASADR 
> GTSYFWLMELHGYEAETATLEARAKSKLGSPPQGNSWLGSIIATIIGNLKVSISNVHIRYEDSTRDSSEI 
> LASFFSYFNNICSSNPGHPFAAGITLAKLAAVTMDEEGNETFDTSGALDKLRKSLQLERLALYHDSNSFP 
> WEIEKQWDNITPEEWIEMFEDGIKEQTEHKIKSKWALNRHYLLSPINGSLKYHRLGNQERNNPEIPFERA 
> SVILNDVNVTITEEQYHDWIKLVEVVSRYKTYIEISHLRPMVPVSEAPRLWWRFAAQASLQQKRLWYTRY 
> IQLYANFLQQSSDVNYPEMREIEKDLDSKVILLWRLLAHAKVESVKSKEAAEQRKLKKGGWFSFNWRTEA 
> EDDPEVDSVAGGSKLMEERLTKDEWKAINKLLSHQPDEEMNLYSGKDMQNMTHFLVTVSIGQGAARIVDI 
> NQTEVLCGRFEQLDVTTKFRHRSTQCDVSLRFYGLSAPEGSLAQSVSSERKTNALMASFVNAPIGENIDW 
> RLSATISPCHATIWTESYDRVLEFVKRSNAVSPTVALETAAVLQMKLEEVTRRAQEQLQIVLEEQSRFAL 
> DIDIDAPKVRIPLRASGSSKCSSHFLLDFGNFTLTTMDTRSEEQRQNLYSRFCISGRDIAAFFTDCGSDN 
> QGCSLVMEDFTNQPILSPILEKADNVYSLIDRCGMAVIVDQIKVPHPSYPSTRISIQVPNIGVHFSPTRY 
> MRIMQLFDILYGAMKTYSQAPVDHMPDGIQPWSPTDLASDARILVWKGIGNSVATWQSCRLVLSGLYLYT 
> FESEKSLDYQRYLCMAGRQVFEVPPANIGGSPYCLAVGVRGTDLKKALESSSTWIIEFQGEEKAAWLRGL 
> VQATYQASA
> PLSGDVLGQTSDGDGDFHEPQTRNMKAADLVITGALVETKLYLYGKIKNECDEQVEEVLLLKVLASGGKV 
> HLISSESGLTVRTKLHSLKIKDELQQQQSGSAQYLAYSVLKNEDIQESLGTCDSFDKEMPVGHADDEDAY 
> TDALPEFLSPTEPGTPDMDMIQCSMMMDSDEHVGLEDTEGGFHEKDTSQGKSLCDEVFYEVQGGEFSDFV 
> SVVFLTRSSSSHDYNGIDTQMSIRMSKLEFFCSRPTVVALIGFGFDLSTASYIENDKDANTLVPEKSDSE 
> KETNDESGRIEGLLGYGKDRVVFYLNMNVDNVTVFLNKEDGSQLAMFVQERFVLDIKVHPSSLSVEGTLG 
> NFKLCDKSLDSGNCWSWLCDIRDPGVESLIKFKFSSYSAGDDDYEGYDYSLSGKLSAVRIVFLYRFVQEV 
> TAYFMGLATPHSEEVIKLVDKVGGFEWLIQKDEMDGATAVKLDLSLDTPIIVVPRDSLSKDYIQLDLGQL 
> EVSNEISWHGCPEKDATAVRVDVLHAKILGLNMSVGINGSIGKPMIREGQGLDIFVRRSLRDVFKKVPTL 
> SVEVKIDFLHAVMSDKEYDIIVSCTSMNLFEEPKLPPDFRGSSSGPKAKMRLLADKVNLNSQMIMSRTVT 
> ILAVDINYALLELRNSVNEESSLAHVAVRASEPNSSISWMTSLSETDLYVSVPKVSVLDIRPNTKPEMRL 
> MLGSSVDASKQASSESLPFSLNKGSFKRANSRAVLDFDAPCSTMLLMDYRWRASSQSCVLRVQQPRILAV 
> PDFLLAVGEFFVPALRAITGRDETLDPTNDPITRSRGIVLSEPLYKQTEDVVHLSPRRQLVADSLGIDEY 
> TYDGCGKVISLSEQGEKDLNVGRLEPIIIVGHGKKLRFVNVKIKNGSLLSKCIYLSNDSSCLFSPEDGVD 
> ISMLENASSNPENVLSNAHKSSDVSDTCQYDSKSGQSFTFEAQVVSPEFTFFDGTKSSLDDSSAVEKLLR 
> VKLDFNFM
> YASKEKDIWVRALLKNLVVETGSGLIILDPVDISGGYTSVKEKTNMSLTSTDIYMHLSLSALSLLLNLQS
> QVTGALQSGNAIPLASCTNFDRIWVSPKENGPRNNLTIWRPQAPSNYVILGDCVTSRAIPPTQAVMAVSN 
> TYGRVRKPIGFNRIGLFSVIQGLEGDNVQHSHNSNECSLWMPVAPVGYTAMGCVANIGSEQPPDHIVYCL 
> SIWRADNVLGAFYAHTSTAAPSKKYSPGLSHCLLWNPLQSKTSSSSDPSSTSGSRSEQSSDQTGNSSGWD 
> ILRSISKATSYHVSTPNFERIWWDKGGDLRRPVSIWRPVPRPGFAILGDSITEGLEPPALGILFKADDSE 
> IAAKPVQFNKVAHIVGKGFDEVFCWFPVAPPGYVSLGCVLSKFDEAPHVDSFCCPRIDLVNQANIYEASV 
> TRSSSSKSSQLWSIWKVDNQACTFLARSDLKRPPSRMAFAVGESVKPKTQENVNAEIKLRCFSLTLLDGL 
> HGMMTPLFDTTVTNIKLATHGRPEAMNAVLISSIAASTFNPQLEAWEPLLEPFDGIFKLETYDTALNQSS 
> KPGKRLRIAATNILNINVSAANLETLGDAVVSWRRQLELEERAAKMKEESAASRESGDLSAFSALDEDDF 
> QTIVVENKLGRDIYLKKLEENSDVVVKLCHDENTSVWVPPPRFSNRLNVADSSREARNYMTVQILEAKGL 
> HIIDDGNSHSFFCTLRLVVDSQGAEPQKLFPQSARTKCVKPSTTIVNDLMECTSKWNELFIFEIPRKGVA 
> RLEVEVTNLAAKAGKGEVVGSLSFPVGHGESTLRKVASVRMLHQSSDAENISSYTLQRKNAEDKHDNGCL 
> LISTSYFEKTTIPNTLRNMESKDFVDGDTGFWIGVRPDDSWHSIRSLLPLCIAPKSLQNDFIAMEVSMRN 
> GRKHATFRCLATVVNDSDVNLEISISSDQNVSSGVSNHNAVIASRSSYVLPWGCLSKDNEQCLHIRPKVE 
> NSHHSYAWGYCIAVSSGCGKDQPFVDQGLLTRQNTIKQSSRASTFFLRLNQLEKKDMLFCCQPSTGSKPL 
> WLSVGADAS
> VLHTDLNTPVYDWKISISSPLKLENRLPCPVKFTVWEKTKEGTYLERQHGVVSSRKSAHVYSADIQRPVY 
> LTLAVHGGWALEKDPIPVLDISSNDSVSSFWFVHQQSKRRLRVSIERDVGETGAAPKTIRFFVPYWITND 
> SYLPLSYRVVEIEPSENVEAGSPCLTRASKSFKKNPVFSMERRHQKKNVRVLESIEDTSPMPSMLSPQES 
> AGRSGVVLFPSQKDSYVSPRIGIAVAARDSDSYSPGISLLELEKKERIDVKAFCKDASYYMLSAVLNMTS 
> DRTKVIHLQPHTLFINRVGVSICLQQCDCQTEEWINPSDPPKLFGWQSSTRLELLKLRVKGYRWSTPFSV 
> FSEGTMRVPVPKEDGTDQLQLRVQVRSGTKNSRYEVIFRPNSISGPYRIENRSMFLPIRYRQVEGVSESW 
> QFLPPNAAASFYWENLGRRHLFELLVDGNDPSNSEKFDIDKIGDYPPRSESGPTRPIRVTILKEDKKNIV 
> RISDWMPAIEPTSSISRRLPASSLSELSGNESQQSHLLASEDSEFHVIVELAELGISVIDHAPEEILYMS 
> VQNLFVAYSTGLGSGLSRFKLRMQGIQVDNQLPLAPMPVLFRPQRTGDKADYILKFSVTLQSNAGLDLRV 
> YPYIDFQGRENTAFLINIHEPIIWRIHEMIQQANLSRLSDPNSTAVSVDPFIQIGVLNFSEVRFRVSMAM 
> SPSQRPRGVLGFWSSLMTALGNTENMPVRISERFHENISMRQSTMINNAIRNVKKDLLGQPLQLLSGVDI 
> LGNASSALGHMSQGIAALSMDKKFIQSRQRQENKGVEDFGDIIREGGGALAKGLFRGVTGILTKPLEGAK 
> SSGVEGFVSGFGKGIIGAAAQPVSGVLDLLSKTTEGANAMRMKIAAAITSDEQLLRRRLPRAVGADSLLR 
> PYNDYRAQGQVILQLAESGSFLGQVDLFKVRGKFALTDAYESHFILPKGKVLMITHRRVILLQQPSNIMG 
> QRKFIPAK
> DACSIQWDILWNDLVTMELSDGKKDPPNSPPSRLILYLKAKPHDPKEQFRVVKCIPNSKQAFDVYSAIDQ
> AINLYGQNALKGMVKNKVTRPYSPISESSWAEGASQQMPASVTPSSTFGTSPTTSSS",
> rank:"1"
> --------------------------------------------------
>
>
> =============================================
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================






