[Bioperl-l] SearchIO speed up

Sean Davis sdavis2 at mail.nih.gov
Mon Aug 14 10:36:26 EDT 2006

On 8/14/06 10:00 AM, "aaron.j.mackey at gsk.com" <aaron.j.mackey at gsk.com> wrote:

> I'm failing to understand, sorry.
> The UNIX utility "more" (or "less" if you prefer) is a pull parser; it
> reads only as much of the stream as it needs to satisfy the current
> iteration (the next iteration occurring when the user asks for an
> additional screen or line).  It does not copy data from a pipe into temp
> storage.
> That said, you can't use "more" to page backwards in piped content (unless
> your "more" is keeping a buffer, which some do).
> So, I agree that you will need some form of storage for the *current*
> information to be parsed (and must process all of the stream necessary to
> obtain all such information), but not for any of the information yet to be
> accessed.

I hesitate to try to "clarify", but this is as much for my own good as for
that of others.  I think the distinction here is between "random access",
which is probably not necessary for Blast parsing, and "pull parsing", which
only needs sequential, chunk-based parsing.  Is this the source of some

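The distinction above can be illustrated with a minimal sketch in Python (not
BioPerl code). It assumes, as a simplification, that every record in the
stream starts with a line beginning "Query="; the names iter_records and
marker are made up for this illustration. The parser only ever moves forward
and holds just the record in progress, so it never needs seek() and works on
a pipe.

```python
import io
from typing import Iterator, List, TextIO


def iter_records(stream: TextIO, marker: str = "Query=") -> Iterator[List[str]]:
    """Yield one record (a list of lines) at a time from a forward-only stream.

    Only the record currently being assembled is held in memory. Any lines
    that precede the first marker are yielded as a record of their own.
    """
    current: List[str] = []
    for line in stream:
        # A marker line closes the record in progress and opens the next one.
        if line.startswith(marker) and current:
            yield current
            current = []
        current.append(line.rstrip("\n"))
    if current:
        yield current


if __name__ == "__main__":
    # Stand-in for piped input; sys.stdin could be passed in the same way.
    demo = io.StringIO("Query= seq1\nhit A\nhit B\nQuery= seq2\nhit C\n")
    for record in iter_records(demo):
        print(record[0], len(record) - 1)
```

Each call for the next record reads only up to the next marker line, which is
the "sequential, chunk-based" behaviour; going back to an earlier record
would require either a buffer or a seekable source.
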
> bioperl-l-bounces at lists.open-bio.org wrote on 08/14/2006 09:04:19 AM:
>> aaron.j.mackey at gsk.com wrote:
>>> A "pull parser" need not read everything (i.e. the entire file) into
>>> memory, just the current/next chunk, right?
>> The problem arises when you need random-access to the input data in
>> order to do what you need to do, like get just the next chunk or bit of
>> information.
>> So I don't see a way for a generalized pull-parser to cope with piped
>> input, because most operations are going to have to use seek() to work, and
>> you can't seek piped input.
>> What I do at the moment, then, is on detecting piped input, I'm forced
>> to read in all the input data in one go and spit it out into seekable
>> memory or a temp file. After which normal behaviour resumes - you don't
>> read everything, just the bit you want.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l

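The workaround described in the quoted message (copy piped input into
seekable memory or a temp file, then carry on as usual) might look roughly
like this in Python. The names ensure_seekable, max_memory and FakePipe are
hypothetical and exist only for this sketch; it is not a description of how
SearchIO or any BioPerl module does it.

```python
import io
import shutil
import tempfile
from typing import BinaryIO


def ensure_seekable(stream: BinaryIO, max_memory: int = 1 << 20) -> BinaryIO:
    """Return a seekable handle for stream, copying the data only if needed."""
    if stream.seekable():
        return stream
    # Non-seekable input (e.g. a pipe): drain it once into a buffer that stays
    # in memory up to max_memory bytes and rolls over to a temp file after that.
    spool = tempfile.SpooledTemporaryFile(max_size=max_memory)
    shutil.copyfileobj(stream, spool)
    spool.seek(0)
    return spool


class FakePipe(io.BytesIO):
    """Test stand-in that reports itself as non-seekable, like a real pipe."""

    def seekable(self) -> bool:
        return False


if __name__ == "__main__":
    handle = ensure_seekable(FakePipe(b"Query= seq1\nhit A\n"))
    handle.seek(7)          # random access now works on the spooled copy
    print(handle.read(4))
```

The cost is the one raised in the thread: the whole stream has to be read
before the first record can be served, which a forward-only parser avoids.
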