[Bioperl-l] Windows bug in Bio::DB::Fasta?

Scott Cain cain at cshl.edu
Mon Aug 15 13:22:29 EDT 2005

Just to follow up on my own email with a little more information: in
Fasta.pm, line 697:

  $termination_length ||= /\r\n$/ ? 2 : 1;  # account for crlf-terminated Windows files

The pattern match is failing on DOS formatted files; I don't know why.
Does anyone else?
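
One possible explanation, offered as a guess rather than a confirmed diagnosis: on Windows, Perl's default text-mode I/O translates CRLF to LF as lines are read, so $_ never contains "\r" by the time the pattern is tested, even though the bytes on disk (and therefore seek offsets) still include it. The sketch below emulates that by opening the same file through the :crlf layer and through :raw. The helper termination_length and the sample file contents are made up for illustration and are not taken from Fasta.pm.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small DOS-formatted (CRLF) FASTA file byte-for-byte.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} ">seq1\r\nACGT\r\nTTGA\r\n";
close $out or die "close failed: $!";

# Hypothetical helper: apply the same test as the line quoted above to the
# first line of the file, read through the given PerlIO layer.
sub termination_length {
    my ($layer) = @_;
    open my $fh, "<$layer", $path or die "cannot open $path: $!";
    my $line = <$fh>;
    close $fh;
    return $line =~ /\r\n$/ ? 2 : 1;
}

my $text_mode = termination_length(':crlf');  # emulates Windows text mode
my $raw_mode  = termination_length(':raw');   # comparable to binmode($fh)

print "text mode: $text_mode, raw mode: $raw_mode\n";
```

If this is what is happening, the text-mode read reports a termination length of 1 for a CRLF file while the raw read reports 2, which would suggest reading the file in binary mode before running the test. I have not checked this against the module itself.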

On Mon, 2005-08-15 at 10:35 -0400, Scott Cain wrote:
> Hello all,
> I am investigating a bug in GBrowse that seems to surface only when
> people are using the memory (i.e., file) adaptor on Windows systems.
> Here's the bug report:
> https://sourceforge.net/tracker/?func=detail&atid=391291&aid=1256169&group_id=27707
> I've tracked the problem down to Bio::DB::Fasta.  When the file is DOS
> formatted (that is, its lines end with both a carriage return and a line
> feed), BDF returns the wrong string when a subsequence is requested, but
> when the file is Unix formatted (i.e., LF only), it returns the
> right string.  I wrote the very simple test script below and stepped it
> through the Perl debugger.  It looks like the bug is in the caloffset
> method, as it returns the same offsets regardless of the file type,
> which then makes the subsequent seek into the file go to the wrong
> coordinates in DOS-formatted files.
> Unfortunately, I don't really know what is going on in caloffset, so I
> don't know how to fix it, but it presumably has to check the format of
> the file somewhere and take that into account.
> Thanks,
> Scott
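
To make the offset problem in the quoted message concrete, here is a minimal sketch of how a byte offset depends on the line-terminator length. The function byte_offset and its parameters are hypothetical and are not the caloffset code from Bio::DB::Fasta; they only illustrate the arithmetic, assuming fixed-width sequence lines.

```perl
use strict;
use warnings;

# Hypothetical helper, not the BioPerl implementation: byte offset of the
# 0-based base $pos, given where the sequence data starts, the number of
# bases per line, and the number of line-terminator bytes per line.
sub byte_offset {
    my ($data_start, $bases_per_line, $termination_length, $pos) = @_;
    my $full_lines = int($pos / $bases_per_line);
    return $data_start
         + $full_lines * ($bases_per_line + $termination_length)
         + $pos % $bases_per_line;
}

my $dos        = ">seq1\r\nACGT\r\nTTGA\r\n";  # raw bytes of a CRLF file
my $data_start = length(">seq1\r\n");          # header line, including CRLF
my $pos        = 6;                            # the 'G' in ACGTTTGA

# Using the true terminator length (2) versus the Unix assumption (1).
my $right = substr($dos, byte_offset($data_start, 4, 2, $pos), 1);
my $wrong = substr($dos, byte_offset($data_start, 4, 1, $pos), 1);

print "termination length 2: $right\n";
print "termination length 1: $wrong\n";
```

With a terminator length of 2 the lookup lands on the expected base; with 1 it lands one byte early for every line boundary crossed, which matches the symptom of wrong subsequences on DOS-formatted files.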
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory
