[Bioperl-l] Merging separate sequence and quality files to FASTQ ?

Dan Bolser dan.bolser at gmail.com
Thu Dec 3 08:07:27 EST 2009

2009/12/3 Peter <biopython at maubp.freeserve.co.uk>:
> On Thu, Dec 3, 2009 at 11:44 AM, Dan Bolser <dan.bolser at gmail.com> wrote:
>> Hi, can someone test the script here on zero length fasta / qual files?
>> http://www.bioperl.org/wiki/Merging_separate_sequence_and_quality_files_to_FASTQ
>> It seems the output has an extra newline in the sequence part of the
>> output, which throws off scripts that rely on the 'four lines per
>> record' structure of FASTQ (although I'm not sure if it's illegal
>> FASTQ).
> Hi Dan,
> The OBF consensus was that FASTQ records with a zero-length
> sequence might be useful, and should be output as exactly
> four lines (one blank sequence line, one blank quality line).
> However, for parsing, any number of blank lines should be OK.
> http://lists.open-bio.org/pipermail/open-bio-l/2009-July/000522.html
> I can confirm the perl script currently outputs a FASTQ file
> with TWO blank lines for the sequence, giving five lines in
> total for the zero length record. That does suggest a bug.
> What version of BioPerl are you running?

Hi Peter,

Basically, I'm not running the 'latest' version of BP, which is why I
asked this question of the list rather than filing a bug report. What
version are you running? ;-)

Sounds like 5 lines instead of the expected 4 is a minor bug. (Thanks
for the info).
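
To make the 'exactly four lines' rule concrete, here is a minimal sketch in
Python (not the Perl script from the wiki, and not BioPerl's or Biopython's
own writer). The function name fastq_record and its parameters are made up
for illustration, and it assumes Sanger-style quality encoding (Phred score
plus an ASCII offset of 33). A zero-length record comes out as a header
line, a blank sequence line, the '+' line and a blank quality line.

```python
def fastq_record(identifier, sequence, qualities, description=""):
    """Format one FASTQ record as exactly four newline-terminated lines.

    Hypothetical helper for illustration only. Assumes Sanger encoding
    (Phred + 33), so each quality score must be in the range 0..93.
    """
    if len(sequence) != len(qualities):
        raise ValueError("sequence and quality lengths differ")
    if any(q < 0 or q > 93 for q in qualities):
        raise ValueError("quality score outside 0..93")
    # Keep the description (if any) after the identifier on the header line.
    title = f"{identifier} {description}".strip()
    quality_line = "".join(chr(q + 33) for q in qualities)
    # Always one sequence line and one quality line, even when both are empty.
    return f"@{title}\n{sequence}\n+\n{quality_line}\n"
```

Calling fastq_record("empty", "", []) should give a record of four lines,
two of them blank, rather than the five lines the wiki script produces.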

> Peter
> P.S. The script is throwing away any description after the
> identifier.

That's probably bad. Feel free to edit the script on the wiki. Sadly,
MediaWiki's diff features are less than optimal, so developing scripts
on the wiki isn't ideal. Does anyone know how to plug GitHub into a
script that is apparently hosted on a wiki?

Or is GitHub basically designed to be a 'wiki for code'?

I'm wondering, because with the FlaggedRevs extension you could
basically build a whole release in the wiki. Which would be fun if
nothing else!


JHP: Biology is bioinformatics and bioinformatics is biology.
