[Bioperl-l] problems with blast parser
cjfields at uiuc.edu
Thu Apr 6 13:42:16 EDT 2006
I didn't think of that, but it makes sense considering he mentioned the file is
huge and the process is killed off. I agree with Jason, that tabular output
is probably the best way to go here.
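
For what it's worth, here is a rough sketch of what the tabular route could look like. It is only an illustration under a few assumptions: the input is the standard 12-column -m 8 layout (query, subject, % identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score), with -m 9 comment lines starting with "#". The function name, the thresholds and the sample rows are made up for the example, and it is written in Python rather than Perl purely for brevity. The point is that each line is handled and discarded in turn, so memory use stays flat no matter how large the report is.

```python
from typing import Iterable, Iterator, NamedTuple

# Made-up sample rows in -m 8 / -m 9 layout, for illustration only.
SAMPLE = (
    "# Fields: Query id, Subject id, % identity, alignment length, ...\n"
    "q1\tsubjA\t95.50\t120\t5\t0\t1\t120\t10\t129\t1e-60\t230\n"
    "q1\tsubjB\t45.00\t200\t100\t10\t1\t200\t5\t204\t2e-10\t60.5\n"
    "q1\tsubjC\t99.00\t30\t0\t0\t1\t30\t1\t30\t0.001\t40.1\n"
)


class Hit(NamedTuple):
    query: str
    subject: str
    pct_identity: float
    aln_length: int
    evalue: float
    bit_score: float


def filter_hits(lines: Iterable[str],
                min_identity: float,
                min_length: int) -> Iterator[Hit]:
    """Yield hits passing both thresholds, one input line at a time."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the '#' comment lines that -m 9 adds.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 12:
            continue  # not a complete tabular record
        hit = Hit(cols[0], cols[1], float(cols[2]), int(cols[3]),
                  float(cols[10]), float(cols[11]))
        if hit.pct_identity >= min_identity and hit.aln_length >= min_length:
            yield hit


if __name__ == "__main__":
    for hit in filter_hits(SAMPLE.splitlines(), min_identity=90.0, min_length=100):
        print(hit.query, hit.subject, hit.pct_identity, hit.aln_length)
```

For a real report you would pass an open file handle instead of the sample lines; iterating over a file handle reads it line by line, so the whole file is never held in memory.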
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign
> -----Original Message-----
> From: Jason Stajich [mailto:jason.stajich at duke.edu]
> Sent: Thursday, April 06, 2006 12:30 PM
> To: Chris Fields; Alessandro S. Nascimento
> Cc: BioPerl list
> Subject: Re: [Bioperl-l] problems with blast parser
> I'm pretty sure that with thousands of HSPs this can be an out-of-memory
> problem. I've explained workarounds before on the list, but they
> basically mean building a new listener object that creates simple
> hashes (or arrays) instead of full-blown HSP objects. Personally I
> use a hybrid approach depending on the dataset - SearchIO can be too
> slow and too memory intensive for the cases where I am just getting
> top hits or summary stats, but if I want the alignment strings, more
> stats, etc then I use SearchIO.
> The question is: do you really want to be parsing a huge file, or can
> you get away with using tabular output (-m 8 or -m 9) from BLAST? If
> you are balking at re-running the BLAST, something like blast2table is
> a simple pure-Perl script that generates -m 8 tabular output from a BLAST report
> very efficiently. This is discussed on the bioperl BLAST wiki page I
> On Apr 6, 2006, at 11:56 AM, Chris Fields wrote:
> > Alessandro,
> > We need to know a few things first:
> > 1) What version of Bioperl?
> > 2) BLAST version?
> > 3) What OS?
> > 4) Perl version?
> > 5) Exactly how large is your file?
> > It would also be nice to see at least a chunk of your script to rule out a
> > logic error there. If you want you can also submit your script by filing
> > this as a bug in Bugzilla and attaching your script.
> > http://www.bioperl.org/wiki/Bugs
> > If you have an older version of Bioperl (such as 1.4) consider upgrading to
> > 1.5.1 or CVS. Lots of fixes have been incorporated since 1.4, including to
> > SearchIO.
> > Chris
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org]
> >> On Behalf Of Alessandro S. Nascimento
> >> Sent: Tuesday, April 04, 2006 10:28 AM
> >> To: bioperl-l at lists.open-bio.org
> >> Subject: [Bioperl-l] problems with blast parser
> >> Hi all
> >> I'm trying to parse a standalone BLAST (blastpgp) result file and filter
> >> some sequences using length and identity. The script used to work, but
> >> this time, after several minutes running at 99.9% of my processor, I get
> >> a "killed" message with no more information. The BLAST file is very
> >> large. Does anyone have any clue?
> >> Thanks in advance
> >> Alessandro
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> Jason Stajich
> Duke University