[Bioperl-l] fastq splitter

Peter Cock p.j.a.cock at googlemail.com
Wed Feb 29 10:32:55 EST 2012

On Wed, Feb 29, 2012 at 3:27 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> On Feb 29, 2012, at 4:32 AM, Peter Cock wrote:
>> On Wed, Feb 29, 2012 at 2:42 AM, Fields, Christopher J
>> <cjfields at illinois.edu> wrote:
>>> Frankly, there never seemed to be a real fixed standard in the way that FASTQ
>>> headers were written (and just when it seems there is some consensus, Illumina
>>> pulls the rug out from under you), hence the reason I leave it alone.  We could
>>> add some ID munging in there if needed, would just need a qr// with a standard
>>> fallback.
>>> chris
>> Indeed - just like FASTA, it seems every company/tool/database has its own
>> conventions about the FASTQ ID line and how to stuff as much meta-data
>> into it as possible. This is a major reason why I hope storing unaligned
>> reads in SAM/BAM takes off - places like the Sanger and the Broad use this
>> in their pipelines.
>> http://blastedbio.blogspot.com/2011/10/fastq-must-die-long-live-sambam.html
>> Peter
> Unaligned BAM makes the most sense.  I've also been talking with the
> HDF5 folks here sporadically; they're still keen on promoting BioHDF
> (it is pretty fast), though that has cooled considerably.
> Anyone working directly with CRAM in their pipelines?
> chris

I understand that Sanger are looking at moving their pipelines from BAM to
CRAM later this year, but CRAM is still quite new and in flux.
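
Going back to the ID munging idea quoted above (a regex per known header
style, with a standard fallback), here is a rough sketch of what that could
look like. It is in Python rather than Perl, and it is not BioPerl or
Biopython code: the function name parse_fastq_header, the PATTERNS list and
the field names are all made up for illustration. The two patterns assume the
commonly documented Illumina layouts (the older "instrument:lane:tile:x:y#index/pair"
form and the CASAVA 1.8+ form); real files may well deviate from both.

```python
import re

# Assumed older Illumina layout: instrument:lane:tile:x:y#index/pair
OLD_ILLUMINA = re.compile(
    r"^(?P<instrument>[^:\s]+):(?P<lane>\d+):(?P<tile>\d+)"
    r":(?P<x>\d+):(?P<y>\d+)#(?P<index>[^/\s]+)/(?P<pair>[12])$"
)

# Assumed CASAVA 1.8+ layout:
# instrument:run:flowcell:lane:tile:x:y read:filtered:control:index
CASAVA_18 = re.compile(
    r"^(?P<instrument>[^:\s]+):(?P<run>\d+):(?P<flowcell>[^:\s]+)"
    r":(?P<lane>\d+):(?P<tile>\d+):(?P<x>\d+):(?P<y>\d+)"
    r"\s+(?P<pair>[12]):(?P<filtered>[YN]):(?P<control>\d+):(?P<index>\S*)$"
)

# Hypothetical registry of known styles, tried in order.
PATTERNS = [("casava_1.8", CASAVA_18), ("old_illumina", OLD_ILLUMINA)]


def parse_fastq_header(line):
    """Split a FASTQ title line into named fields (illustrative helper)."""
    text = line.rstrip("\r\n")
    if text.startswith("@"):
        text = text[1:]
    for name, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            fields = match.groupdict()
            fields["style"] = name
            return fields
    # Standard fallback: first word is the ID, the rest is free text.
    parts = text.split(None, 1)
    return {
        "style": "generic",
        "id": parts[0] if parts else "",
        "description": parts[1] if len(parts) > 1 else "",
    }
```

Anything that matches neither pattern is left alone apart from the ID and
description split, which is in the spirit of not touching headers we do not
recognise.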

