[Bioperl-l] seq doesn't validate error

Jason Stajich jason at bioperl.org
Sat Jun 16 20:17:58 EDT 2007


The error is clearly saying there must be a symbol or letter in
your sequence that violates the regexp.
I had modified the code in CVS to provide a more informative
mismatch message in the error, but this is probably not in the
release you are using.

Anyways, add this to see what is causing the problem:

print join(",", ($nstarthash{$_}[1] =~ /([^$Bio::PrimarySeq::MATCHPATTERN]+)/g)), "\n";
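
If the offending character is invisible (a newline, space, or tab), the one-liner above prints it but you may not notice it in the output. Here is a rough standalone sketch that reports the offset and numeric code of each bad character instead. Note the assumptions: $allowed is my local copy of what I believe the accepted character class to be, so check it against $Bio::PrimarySeq::MATCHPATTERN in your release. offending_chars is just a helper name made up for this example, and the test sequence is invented.

```perl
use strict;
use warnings;

# Assumption: local copy of the character class BioPerl accepts. The real
# value lives in $Bio::PrimarySeq::MATCHPATTERN and may differ by release.
my $allowed = 'A-Za-z\-\.\*\?=~';

# Hypothetical helper: return [offset, character code] for every character
# in the string that falls outside the allowed class.
sub offending_chars {
    my ($seq) = @_;
    my @hits;
    while ($seq =~ /([^$allowed])/g) {
        # pos() points just past the match, so subtract 1 for the offset.
        push @hits, [ pos($seq) - 1, ord($1) ];
    }
    return @hits;
}

# Invented input: a sequence with a stray trailing newline.
for my $hit (offending_chars("GCCCCTGTAATC\n")) {
    printf "offset %d: character code %d\n", @$hit;
}
```

A trailing newline shows up as character code 10, a space as 32, and a tab as 9. In your script you would pass $nstarthash{$_}[1] in place of the invented string.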

-jason
On Jun 15, 2007, at 4:53 PM, Sheri Simmons wrote:

> Thanks for the suggestion, but that still gives the same error as  
> before.
>
> On Friday 15 June 2007 4:11 pm, Kevin Brown wrote:
>>> I'm getting an error as follows when I try to reverse
>>> complement a sequence string stored in a hash of arrays. The
>>> storage code is:
>>>
>>> 		$nstarthash{$key} = [$sortchecks[0], join("", @nseq),
>>> 		                     join("", @{$seqhash{$key}})];
>>>
>>> the sequence of interest is the element at index 1.
>>>
>>> Later, I try to retrieve this string for a subset of keys so
>>> I can reverse complement it based on input from another hash
>>> (%complement):
>>>
>>> 			my %revcomphash = map { my $read = $_;
>>> 			grep $complement{$read} eq 'C', %complement;
>>> 			{$_, (Bio::Seq->new(-seq
>>> =>$nstarthash{$_}[1]))->revcom->seq()};}
>>> 			 keys(%nstarthash);
>>>
>>>
>>> I get the following warning (long sequence edited for clarity):
>>>
>>> -- -------------------- WARNING ---------------------
>>> MSG: seq doesn't validate, mismatch is 1
>>> ---------------------------------------------------
>>>
>>> ------------- EXCEPTION  -------------
>>> MSG: Attempting to set the sequence to
>>> [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
>>> which does not look healthy
>>> STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
>>> STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
>>> STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498
>>> STACK toplevel ../quality_wrapper.pl:103
>>>
>>> I cannot find any non-allowed characters in the sequence, and
>>> the de-referencing appears to work correctly. Can anyone help me?
>>> I'm using the latest Bioperl installation (1.5.2) with
>>> ActivePerl5.8 on a Mepis 6.5 system.
>>
>> Try telling the Bio::Seq object what alphabet to use when creating  
>> it.
>> I tend to create them like:
>>
>> Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')
>
> -- 
> Sheri Simmons
> Department of Earth and Planetary Sciences
> University of California, Berkeley
> Berkeley, CA 94720-4767
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/



