[Bioperl-l] SearchIO speed up
sdavis2 at mail.nih.gov
Mon Aug 14 10:36:26 EDT 2006
On 8/14/06 10:00 AM, "aaron.j.mackey at gsk.com" <aaron.j.mackey at gsk.com> wrote:
> I'm failing to understand, sorry.
> The UNIX utility "more" (or "less" if you prefer) is a pull parser; it
> reads the stream as much as it needs to satisfy the current iteration (the
> next iteration occurring when the user asks for an additional screen or
> line). It does not copy data from a pipe into temp storage.
> That said, you can't use "more" to page backwards in piped content (unless
> your "more" is keeping a buffer, which some do).
> So, I agree that you will need some form of storage for the *current*
> information to be parsed (and must process all of the stream necessary to
> obtain all such information), but not for any of the information yet to be
I hesitate to try to "clarify", but this is as much for my own good as for
that of others. I think the distinction here is between "random access",
which is probably not necessary for Blast parsing, and "pull parsing", which
only needs sequential, chunk-based parsing. Is this the source of some
> bioperl-l-bounces at lists.open-bio.org wrote on 08/14/2006 09:04:19 AM:
>> aaron.j.mackey at gsk.com wrote:
>>> A "pull parser" need not read everything (i.e. the entire file) into
>>> memory, just the current/next chunk, right?
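For illustration, the chunk-at-a-time reading asked about above can be sketched with a Python generator. This is only a sketch, not BioPerl's SearchIO code: the function name iter_records and the simplified one-marker-per-record layout are assumptions made up for the example. The point is that the reader only moves forward through the stream and holds a single record in memory at a time.

```python
import io
from typing import Iterator, List, TextIO


def iter_records(stream: TextIO, marker: str = "Query=") -> Iterator[List[str]]:
    """Yield one record (a list of lines) at a time, reading forward only.

    Hypothetical helper: a record is assumed to start at a line beginning
    with `marker`. Any lines before the first marker are returned as part
    of the first record.
    """
    current: List[str] = []
    for line in stream:                      # sequential read, never seek()
        if line.startswith(marker) and current:
            yield current                    # hand back the finished record
            current = []
        current.append(line)
    if current:
        yield current                        # last record in the stream


# Usage: nothing is read until the caller asks for a record.
report = io.StringIO("Query= a\nhit 1\nQuery= b\nhit 2\nhit 3\n")
records = iter_records(report)
first = next(records)
rest = list(records)
```

Because the generator never seeks, the same loop should work whether the stream is a regular file or a pipe such as sys.stdin.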
>> The problem arises when you need random-access to the input data in
>> order to do what you need to do, like get just the next chunk or bit of
>> So I don't see a way for a generalized pull-parser to cope with piped
>> input, because most operations are going to have to use seek() to work, and
>> you can't seek piped input.
>> What I do at the moment, then, is on detecting piped input, I'm forced
>> to read all the input data in one go and spit it out into seekable
>> memory or a temp file. After which normal behaviour resumes - you don't
>> read everything, just the bit you want.
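A minimal sketch of the workaround described above, written in Python for illustration rather than taken from the poster's code: if the input cannot seek, copy it once into a temporary file and carry on with the seekable copy. The names ensure_seekable and FakePipe are hypothetical and exist only for this example.

```python
import io
import shutil
import tempfile
from typing import BinaryIO


def ensure_seekable(stream: BinaryIO) -> BinaryIO:
    """Return `stream` if it supports seek(), else a seekable temp-file copy."""
    if stream.seekable():
        return stream
    spool = tempfile.TemporaryFile()     # removed automatically when closed
    shutil.copyfileobj(stream, spool)    # the one-off full read of the pipe
    spool.seek(0)                        # rewind so parsing starts at the top
    return spool


class FakePipe(io.BytesIO):
    """Stand-in for piped input: readable, but reports itself as not seekable."""

    def seekable(self) -> bool:
        return False


data = ensure_seekable(FakePipe(b"Query= a\nhit 1\n"))
data.seek(9)                             # random access works on the copy
tail = data.read()
```

In real use the argument would be something like sys.stdin.buffer. The cost is the one Aaron points at: the whole stream has to be consumed and stored before the first record can be returned.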
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org