[Bioperl-l] Entrez Gene parser questions

Stefan Kirov skirov at utk.edu
Thu Jun 9 15:08:31 EDT 2005

It is slow: there is a lot of data and it goes into many bioperl 
objects. Performance is not the goal of this parser. If you need really 
high performance you may want to stick to the flat files.
One suggestion is to turn off the debug option (it won't make a big 
difference, but still...). The whole human file takes about an hour on 
my machine; it depends on what you have to do the analysis. Also i
Oh, by the way, you need bioperl-live, not 1.5. No need to uninstall; 
just install bioperl-live and it should overwrite the old stuff.

Law, Annie wrote:

>Thanks for everybody's responses.  Yes, if I turn off the warnings then the script works. I was sticking with bioperl 1.4 because I think I read somewhere that the even-numbered releases are the stable ones, and I was happy with how 1.4 was working for me (and it seems that 1.4 works okay with the Entrez Gene parser now). But if it is inadvisable to plug new modules into an older version, then I will look into installing Bioperl 1.5.  It seems to be fine for a lot of people :)
>Just some questions about getting rid of bioperl 1.4.  Am I correct in thinking that there is no uninstall command, so to 'uninstall' I would just delete the Bio directory (/usr/lib/perl5/site_perl/5.8.0/Bio) and then use CPAN to install bioperl 1.5?
>I am running my simple script with some print statements, and it takes about 10 minutes to print 2000 Entrez Gene IDs.  How long does it generally take the parser to finish, for example, the Homo sapiens file? My real goal is not just to print; I just wanted to do a test run.  Is this about the same performance others are getting, or are there other options to speed it up?
>Thanks again for all of your efforts!
>-----Original Message-----
>From: Stefan Kirov [mailto:skirov at utk.edu] 
>Sent: Wednesday, June 08, 2005 4:50 PM
>To: Mingyi Liu
>Cc: Law, Annie; Bioperl list
>Subject: Re: [Bioperl-l] Entrez Gene parser questions
>Sure. Thanks for letting me know.
>Annie, does it work for you now?
>Mingyi Liu wrote:
>>Stefan's right in suggesting you turn off -w, which would make your 
>>script work.  But thanks for finding this bug.  I just noticed that 
>>entrezgene.pm was actually calling the Bio::ASN1::EntrezGene->next_seq 
>>incorrectly (probably my documentation was a bit confusing & my module 
>>did not follow standard hash-based parameter passing of subroutines).
>>It should be called like ->next_seq(1), but entrezgene.pm called using 
>>->next_seq(-trimopt => 1).  This worked for all of us who do not use
>>-w, as it would fall back to option '1' in my next_seq function 
>>(exactly as Stefan's calling function wanted).  Therefore this bug 
>>went unnoticed until you turned on -w (I guess we were all spoiled by 
>>the easy (and sometimes messy) life of weak datatyping in Perl).
>>Stefan, would you please change the call to next_seq(1)?  
>>This would fix the error messages Annie was seeing.
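
The calling-convention mismatch Mingyi describes can be sketched in plain Perl. This is only an illustration, not the code of Bio::ASN1::EntrezGene: the sub next_seq_sketch and its 'trim level' return values are hypothetical stand-ins, assumed here to take a single positional numeric option. It shows why a hash-style call can silently give the same result as next_seq(1) yet produce a warning once -w (or "use warnings") is enabled.

```perl
#!/usr/bin/perl
use strict;
use warnings;    # same effect as running with -w for this file

# Hypothetical stand-in for a method that expects ONE positional,
# numeric argument rather than named (hash-style) parameters.
sub next_seq_sketch {
    my ($trimopt) = @_;
    # Numeric comparison: a string such as '-trimopt' is not numeric,
    # so Perl warns and treats it as 0 for the comparison.
    return 'trim level 2' if defined $trimopt && $trimopt == 2;
    return 'trim level 1';    # anything else falls back to option 1
}

# Collect warnings instead of printing them to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Positional call: the intended usage in this sketch.
my $positional = next_seq_sketch(1);

# Hash-style call: the sub receives ('-trimopt', 1), so the first
# argument is the string '-trimopt', not the number 1.
my $named = next_seq_sketch(-trimopt => 1);

print "positional call returned: $positional\n";
print "named call returned:      $named\n";
print "warnings captured:        ", scalar(@warnings), "\n";
print "  $_" for @warnings;
```

In this sketch both calls end up in the same fallback branch, which is why the mismatch goes unnoticed without warnings; only the hash-style call triggers an "isn't numeric" warning.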
