[Bioperl-l] what to do about Blast.pm, parsing

Jason Stajich jason@chg.mc.duke.edu
Tue, 16 Jan 2001 17:54:20 -0500 (EST)

On the refactor front -

I think BPlite is a good way to go for moving functionality out of Blast.pm;
however, things like to_html/from_html are very nice and I'd like to see them
migrated along.  Perhaps we could put together a poll or priority list of the
Blast.pm features we actually use, to be sure those are migrated first.
Another alternative is to go for a clean code base and
write a module like what I've started locally called YABP (Yet Another
Blast Parser).  I'd like us to really identify the functions we want
before starting to write it since porting all of Blast.pm to a new module
is sort of silly if we aren't going to see a significant benefit in
functionality or speed.  I do see the value in having a lightweight module
to accomplish some tasks and a heavyweight one for doing others.

I also have been playing with Parse::RecDescent some.  While writing a
grammar is not the most fun I've ever had, I've been able to write a
parser for GenBank files and get at least accession, locus, and sequence
lines parsed (I know, big deal).  The feature table will be a bit more fun,
but I think it may be a useful exercise.  Whether or not we will really just
write grammars for the sequence formats, I don't know.  Perhaps a grammar could be
written for blast files - might be more trouble than it's worth...
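
To make the GenBank experiment above a little more concrete, here is a rough
sketch of the "one rule per line type" idea.  Note that this is Python with
plain regular expressions, not Parse::RecDescent, and it is not the grammar
described above.  The function name parse_genbank_record, the rule names, and
the sample record are all made up for illustration, and only the locus,
accession, and sequence lines are handled.

```python
import re

# Hypothetical rules, one per line type, loosely mirroring how a grammar
# would have one production per kind of line.
LOCUS_RE = re.compile(r"^LOCUS\s+(\S+)")
ACCESSION_RE = re.compile(r"^ACCESSION\s+(\S+)")
SEQ_LINE_RE = re.compile(r"^\s*\d+\s+([A-Za-z ]+)$")


def parse_genbank_record(text):
    """Pull locus name, first accession, and sequence from one record."""
    record = {"locus": None, "accession": None, "sequence": ""}
    chunks = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("//"):          # end-of-record marker
            break
        if in_origin:
            match = SEQ_LINE_RE.match(line)
            if match:
                # Drop the spaces that separate the 10-base blocks.
                chunks.append(match.group(1).replace(" ", ""))
            continue
        if line.startswith("ORIGIN"):      # sequence lines follow
            in_origin = True
            continue
        match = LOCUS_RE.match(line)
        if match:
            record["locus"] = match.group(1)
            continue
        match = ACCESSION_RE.match(line)
        if match:
            record["accession"] = match.group(1)
    record["sequence"] = "".join(chunks)
    return record


# Made-up record for illustration only; not a real GenBank entry.
SAMPLE = """\
LOCUS       TEST0001                  20 bp    DNA     linear   UNA 01-JAN-2001
DEFINITION  Made-up record for illustration only.
ACCESSION   XX000001
ORIGIN
        1 acgtacgtac gtacgtacgt
//
"""

if __name__ == "__main__":
    print(parse_genbank_record(SAMPLE))
```

The feature table is deliberately left out: its nested qualifiers are where
a line-by-line approach like this stops being adequate and a real grammar
would start to earn its keep.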

Just some thoughts rattling around...

Jason Stajich
Center for Human Genetics
Duke University Medical Center