[Bioperl-l] fastq splitter

Peter Cock p.j.a.cock at googlemail.com
Thu Mar 1 10:03:02 EST 2012

On Thu, Mar 1, 2012 at 2:41 PM, Pablo marin-garcia
<harpactocrates at googlemail.com> wrote:
> On Wed, Feb 29, 2012 at 4:32 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> I understand that Sanger are looking at moving their pipelines from BAM to
>> CRAM later this year, but CRAM is still quite new and in flux.
> My concern is that, since CRAM is based on delta compression (comparison
> against a reference), I am not sure how much compression it would
> achieve with unaligned BAMs.

Unaligned reads can still be handled with an appropriate dummy reference,
for instance one built from a mini-assembly of the unmapped reads.

> The other thing that CRAM does is
> remove a lot of extra tags and metadata (even from the header
> reference info), and here the strong point of BAM over FASTQ is the
> availability of structured metadata. CRAM is still in development in
> this area, so we will see where they go.

Did you miss Ewan's reply about CRAM 0.7 which is due soon?

Might this be better continued on the cram-dev list
or on this SEQanswers thread?

