[Bioperl-l] About to tag the last RC...
cjfields at illinois.edu
Thu Jan 15 00:17:31 EST 2009
On Jan 14, 2009, at 5:45 PM, Scott Markel wrote:
> We've been testing 1.6 RC2 with our set of nightly Pipeline Pilot
> regressions and have noticed a few issues. Sorry we couldn't get
> this feedback to you sooner.
> 1) There is a problem with the output filename for bl2seq on
> Windows. In response to bug 2707, quotemeta was used when building
> the parameter string at line 507 in
> Bio::Tools::Run::StandAloneNCBIBlast (1.5.9_2). This causes a
> problem with the path to the output file on Windows. For example,
> "C:\DOCUME~1\outfile" becomes "C\:\\DOCUME\~1\\outfile". bl2seq
> can't open the output file and fails.
I've added an OS check so that quotemeta isn't applied on Windows (I
wondered whether quotemeta would bite me there). I'm seriously
considering ripping out that code altogether, though. I'm not sure we
want to wade into trying to escape shell characters accurately based
purely on OS differences.
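
For anyone who wants to see the mangling in isolation, here is a rough
sketch in Python rather than Perl. It assumes quotemeta's ASCII
behaviour, i.e. a backslash in front of every character outside
[A-Za-z0-9_]. The names quotemeta_ascii and escape_outfile are made up
for illustration and are not the code in StandAloneNCBIBlast.

```python
import re
import sys


def quotemeta_ascii(text):
    """Emulate Perl's quotemeta for ASCII input (an approximation):
    put a backslash before every character outside [A-Za-z0-9_]."""
    return re.sub(r'([^A-Za-z0-9_])', r'\\\1', text)


def escape_outfile(path, platform=sys.platform):
    """Hypothetical helper: skip the escaping on Windows, where the
    backslash is a path separator rather than an escape character."""
    if platform.startswith('win'):
        return path
    return quotemeta_ascii(path)


if __name__ == '__main__':
    windows_path = r'C:\DOCUME~1\outfile'
    # Escaped form is what bl2seq ends up receiving without the OS check.
    print(quotemeta_ascii(windows_path))
    # With the OS check the path is passed through untouched on Windows.
    print(escape_outfile(windows_path, platform='win32'))
```

The first print reproduces the broken path from the report; the second
leaves it alone, which is the effect the OS check is meant to have.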
> 2) Parsing megablast output (format 2) with Bio::SearchIO::blast.pm
> now returns an algorithm name of "BLASTN" instead of "MEGABLAST".
> This change seems to have been introduced in revision 11579 of
> blast.pm when a couple regex changes were made (lines 452 and 1201
> of blast.pm in 1.5.9_2). Subbing in the old regular expression for
> megablast in line 452 returned the correct "MEGABLAST" algorithm name.
I worked out why that regex isn't working (it doesn't match MEGABLAST
at all). I fixed it and added a check of the algorithm name to the
test suite for MEGABLAST output; it seems to work now.
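
The kind of check involved can be sketched roughly as follows (Python
for brevity). The pattern and the sample header lines are assumptions
for illustration only; the actual expression in blast.pm is not quoted
in this thread, and algorithm_name is a made-up helper.

```python
import re

# Hypothetical pattern, NOT the expression used in blast.pm: accept
# MEGABLAST explicitly alongside the usual (T)BLAST[NPX] program names.
ALGORITHM_RE = re.compile(r'^\s*(MEGABLAST|T?BLAST[NPX])\s+\d')


def algorithm_name(header_line):
    """Return the program name from a report header line, or None."""
    match = ALGORITHM_RE.match(header_line)
    return match.group(1) if match else None


if __name__ == '__main__':
    # Illustrative header lines; real reports may differ in detail.
    for line in ('MEGABLAST 2.2.18 [Mar-02-2008]',
                 'BLASTN 2.2.18 [Mar-02-2008]',
                 'Query= example_sequence'):
        print(repr(line), '->', algorithm_name(line))
```

A regression test then only has to assert that a MEGABLAST report comes
back as "MEGABLAST" and not as "BLASTN" or nothing at all.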
> We also see some minor differences that we can live with, e.g.,
> BLAST hit scores changing from 40 to 40.1 and e-values having
> trailing zeros. We'll just update our baselines.
Okay, but let me know if that becomes pressing. The e-value issue is
a bit odd and may be worth looking into.
> The change to using Bio::Annotation::TagTree for SwissProt sequence
> gene names broke a number of our tests but we'll fix that by
> modifying the adapters we use between our internal representation
> and BioPerl's.
That comes from the switch away from StructuredValue (which wasn't
really designed for storing such data). A layered
Bio::Annotation::Collection was the other option (TagTree is almost a
light version of that).
> One thing we haven't tracked down yet is a change in tag type, e.g.,
> b:integervalue to b:stringvalue, in the XML representations of our
> Pipeline Pilot data records. We're only seeing this for programs in
> NCBI's BLAST suite. At this point we don't know what's changed on
> the BioPerl side to trigger the change in our code. We'll continue
> to investigate this.
Again, if you find it's on our side let us know.
> Scott Markel, Ph.D.
> Principal Bioinformatics Architect email: smarkel at accelrys.com
> Accelrys (SciTegic R&D) mobile: +1 858 205 3653
> 10188 Telesis Court, Suite 100 voice: +1 858 799 5603
> San Diego, CA 92121 fax: +1 858 799 5222
> USA web: http://www.accelrys.com
> Board of Directors: International Society for Computational Biology
> Co-chair: ISCB Publications Committee
> Associate Editor: PLoS Computational Biology
> Editorial Board: Briefings in Bioinformatics
Thanks Scott! Let us know if you have any other problems. I've been
busier than expected but should get RC3 out soon.