[Bioperl-l] dealing with large files
bix at sendu.me.uk
Thu Dec 20 18:29:30 EST 2007
Amir Karger wrote:
>> Amir Karger wrote:
>>>> It would be nice to code up a lazy sequence object and related
>>>> parsers; maybe for the next dev release.
>>> Also, BLAST parsing. Blasting the proteome against the
>>> genome makes for rather large result files.
>> This has already been done. Use Bio::SearchIO::blast_pull. In a
>> situation like yours I dropped run time from 20223s to
>> 951s (~20x faster) and memory usage from over 8GB to less
>> than 5GB (~40% less).
> Not in 1.5.1. Is it in 1.5.2 or just in cvs? Is there a single file I
> can put in my own perl lib for this, or does it require large bunches of
> new code? (I'm guessing the latter.) We're about to upgrade to 1.5.2
> here, but I don't see our whole center using CVS Bioperl.
blast_pull is only in CVS (and needs a whole bunch of associated modules
to work). That said, 1.5.2 contains significant improvements to SearchIO
generally, which should noticeably speed up blast parsing with the
normal Bio::SearchIO::blast.
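
For illustration, switching between the two parsers should mostly come down to the -format argument. The following is a minimal sketch, not a tested recipe: it assumes a BioPerl install that includes Bio::SearchIO::blast_pull (CVS only, as noted above), and the helper name print_hits and the script name hits.pl are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: print "query <tab> hit" for every hit in a BLAST report.
# $format is 'blast' (the normal parser) or 'blast_pull' (the lazy parser,
# assumed to be available from a CVS checkout).
sub print_hits {
    my ($file, $format) = @_;

    # Loaded at call time so this script still compiles without BioPerl.
    require Bio::SearchIO;

    my $in = Bio::SearchIO->new(-format => $format, -file => $file);

    # Results and hits are fetched one at a time rather than all at once.
    while (my $result = $in->next_result) {
        while (my $hit = $result->next_hit) {
            print join("\t", $result->query_name, $hit->name), "\n";
        }
    }
}

# Usage: perl hits.pl report.blast [blast|blast_pull]
print_hits($ARGV[0], $ARGV[1] || 'blast_pull') if @ARGV;
```

Passing 'blast' as the second argument runs the same loop through the normal Bio::SearchIO::blast parser, which makes it easy to compare the two on the same report.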