[Bioperl-l] Fasta Qual files

James Gilbert jgrg@sanger.ac.uk
Fri, 15 Sep 2000 09:46:14 +0100 (BST)

On Thu, 14 Sep 2000 hilmar.lapp@pharma.Novartis.com wrote:

> >   my @qual = (40,34,35,99,99);
> >   my $qual_str = pack('C*', @qual);
> >   @qual = unpack('C*', $qual_str);
> >
> Reversal:
>      $qual_str = reverse($qual_str);
> Truncation:
>      $qual_str = substr($qual_str, 0, 3);
> <someone being forced to use an email client full of bells and whistles but
> lacking support for quoting -- because you can simply use color - what a
> great idea -- trying to indicate where his scribble begins>
>    I see. As you see I'm not so familiar with Perl guts. I was able to
>    verify that 0s are correctly retained (and don't terminate a string),
>    but negative values obviously lose their original values. Negative
>    values do occur in quality values, at least in Phred/Phrap. What about
>    signed chars? Are there quality values beyond 127? The highest value
>    used by Phred/Phrap is 100. (BTW you can obviously convert 'high' values
>    back to their original negative representations, but you'd still have to
>    make an assumption about the highest possible quality value).


Yes, we must use signed chars ('c*') if negative
quality values are used.  I believe that quality
values beyond 100 aren't used.  I think that a
quality value of 100 means that there is a 1e-10
chance of that base being wrong, and is anyone
really that sure?  (And I think that a quality of
99 in Phred/Phrap is used as a special flag for
manually curated bases.)
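
Here is a minimal sketch of the signed variant. It assumes every
quality value fits in a signed char (-128 to 127); the sample values
and the error_prob helper are made up for illustration. The helper
uses the usual Phred definition, Q = -10 * log10(P).

```perl
use strict;
use warnings;

# Hypothetical quality values, including a negative one.
my @qual = (40, -1, 0, 35, 99);

# 'c*' packs each value as a signed 8-bit char (range -128..127).
my $qual_str = pack('c*', @qual);

# Unpacking with the same signed template restores the negative value.
my @signed = unpack('c*', $qual_str);

# Unpacking with the unsigned template 'C*' turns -1 into 255.
my @unsigned = unpack('C*', $qual_str);

# Illustrative helper: error probability for a Phred-style quality.
sub error_prob {
    my ($q) = @_;
    return 10 ** (-$q / 10);
}

print "signed:   @signed\n";
print "unsigned: @unsigned\n";
printf "P(error) at Q=100: %g\n", error_prob(100);
```

The packed string is the same either way; only the template used to
unpack it decides whether the byte 0xFF comes back as -1 or 255.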

And yes, Perl is "8 bit clean", which means that
you won't get nasty effects if your string happens
to include null bytes or control characters.
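
A quick way to check this for yourself, as a sketch with made-up
values: pack some zeros into the string, then confirm that length,
reversal and truncation all treat them as ordinary bytes.

```perl
use strict;
use warnings;

# Hypothetical quality values containing zeros.
my @qual = (40, 0, 35, 0, 99);
my $qual_str = pack('C*', @qual);

# length counts every byte; a zero byte does not end the string.
my $len = length($qual_str);

# Reversal: force scalar context so the string itself is reversed.
my @reversed = unpack('C*', scalar reverse($qual_str));

# Truncation: keep the first three quality values.
my @first3 = unpack('C*', substr($qual_str, 0, 3));

print "length:   $len\n";
print "reversed: @reversed\n";
print "first 3:  @first3\n";
```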


James G.R. Gilbert
The Sanger Centre
Wellcome Trust Genome Campus
Cambridge                        Tel: 01223 494906
CB10 1SA                         Fax: 01223 494919