[Bioperl-l] Bio::Tools::RestrictionEnzyme

Chris Fields cjfields at uiuc.edu
Fri Nov 3 13:28:53 EST 2006


Nick,

Could you file this as a bug?

Chris

On Nov 3, 2006, at 10:29 AM, Staffa, Nick (NIH/NIEHS) wrote:

> The module Bio::Tools::RestrictionEnzyme uses the Perl split function
> to generate the fragments of a digestion, using the recognition
> pattern as the delimiter. It then glues onto the resulting strings
> the parts of the pattern representing the sequence before and after
> the cut. This is fine for non-ambiguous patterns, but starts looking
> funny for patterns having ambiguities.
> Worse, in a double digest (one enzyme after another), the ambiguity
> code character can mask a true cut site.
> I was using BsaHI [GRCGYC] followed by HpaII [CCGG].
> Below is an example of the CCGG pattern being masked by a Y,
> and the different results of the digestion.
>
>
> CGYCGGCATGTCGATGGTGACCATGTGACAGCACGAGTCACTGCTGCTTTCAAGTTCCGAACAGGAATTAGAAA
> As opposed to the real thing:
> CGCCGGCATGTCGATGGTGACCATGTGACAGCACGAGTCACTGCTGCTTTCAAGTTCCGAACAGGAATTAGAAA
>
> Which, when cut by HpaII [CCGG], really yields
> first A_B_frag:
> CGC
>
> instead of:
> first A_B_frag =
> CGYCGGCATGTCGATGGTGACCATGTGACAGCACGAGTCACTGCTGCTTTCAAGTTCCGAACAGGAATTA 
> GAAAG
> ACTTGCTAGTGCTGTTGGGTCTCC
> TTGACTCTGAGACAATGATAACAATGTTGAAGGTGGTCTAGGCATTTGGGTGCTGTGGAGTTATAAAGAG 
> GAAAAG
> AAAAGATAAAACAAAAAAAAATAG
> GAAACAAATGATTAAGCCACTACTAAGGGGTCTAGTCTAATGCCAACTGGGTAAATTCATGGGAACAATG 
> TGTGCC
> AGTCTTTAGAAACACTGTTTCATA
> TTGCATATATTATGGCATGGTATTACATTGATTAATTTTACTTTAGAGATGAAGAAGCTGAGATTTGGGG 
> TGAATA
> GCAATTATCCCAAAGTCTCTCAGA
> TAGCTGGAGGCAGCAGGGTCTGGGGTATTCACAGTCCCTACTCCATATTGTGTGGTCAGAACCAAATGAG 
> ACAGAT
> AAAGGGCAGACAAAAGAGAAAGTG
> GGGAGTATGATTTGAAAATGATGGTGTGACCCAGATTTCTGATGGAAATATCTAATGGCTGCAGACTGGA 
> TAGCTG
> TGACCATTTTAGTTACTGAATTCA
> GGAGATCTTATCTCAATGGAGGCATGTTGTCAACCAAAAGCCAGGATAAGCAAGGGTCAGTGTCTAGACA 
> TTGGAG
> TAAGGTTTGCCTGGATATTTCCAC
> AGGGAACCAAGTGTCATGGAGTCTTATTCATTGGGAGGTTATCTTTGTTACACACATGGACATATCATCA 
> AGCCAG
> CAATTCAGCAAAACTGTCAACACA
> CAAATAGAGATGTATTGACAACGGGGAACCACAAGTCATGCTTATTCCAAGCTAAAGCCCTCATGTGGAA 
> CTTGTT
> TTGTATGGCATTTGTCTCATCTAC
> ACATTGATGGGAAGGGTAAAAGGAAGTCTTTGGTGGGATTACAGAAGTCAGTAAAAAAGCAAAAGGAAAG 
> ATTTAG
> AAAACAAAGAAAAAGAAAAGGGAG
> GAAAGGAAAAGAAAAAAGATTTCAGAGATCTCAACATCAATTCAGACCAAGGGTGCCTCTTATACTATGT 
> CCAAGC
> CAGTAAGTGGGGTTGTTCTTGTTA
> ACTACAGCCATGTATAGAGGTGAACTTCAGGCTCCTGACTGATCCTCTGAGGTAGAAAGTAAACAGTACT 
> CTTATG
> ACACACGCAGTTGTTCAGTGCTGA
> CATGAAAATGTCATTGCTTACAGCGCTAGGAGAC
>
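
The masking described above can be reproduced on a much shorter string.
The following is only a sketch of the split-and-glue behaviour as the
message describes it, not the module's actual source. The test sequence
and the variable names are made up for illustration, and the cut
position (GR^CGYC for BsaHI) is assumed.

```perl
use strict;
use warnings;

# Illustration only: mimics the split-and-glue behaviour described in
# the message, not the real Bio::Tools::RestrictionEnzyme code.
my $seq     = 'AAGACGCCGGTT';   # made-up sequence with one BsaHI site
my $site    = 'GRCGYC';         # BsaHI recognition site with ambiguity codes
my $pattern = 'G[AG]CG[CT]C';   # the same site written as a regex
my $cut     = 2;                # assumed cut position: GR^CGYC

# Split on the pattern, then glue the literal site text back on.
my @pieces = split /$pattern/, $seq;
my @glued  = (
    $pieces[0] . substr($site, 0, $cut),
    substr($site, $cut) . $pieces[1],
);
# @glued is ('AAGR', 'CGYCGGTT'): the real bases A and C became R and Y.

# Cutting at the match position keeps the real bases instead.
my @real;
if ($seq =~ /$pattern/) {
    my $pos = $-[0] + $cut;     # $-[0] is the start of the match
    @real = (substr($seq, 0, $pos), substr($seq, $pos));
}
# @real is ('AAGA', 'CGCCGGTT')

for my $pair ([glued => $glued[1]], [real => $real[1]]) {
    my ($label, $frag) = @$pair;
    printf "%-5s %s  HpaII site found: %s\n",
        $label, $frag, ($frag =~ /CCGG/ ? 'yes' : 'no');
}
```

The glued fragment contains YCGG where the real one has CCGG, so a
second digest with HpaII finds nothing to cut.
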
>
> This subroutine yields, I believe, the true sequence, although I
> don't know how efficient it is. I'm thinking it must be more
> efficient than having to turn each fragment from the first digestion
> into a BioPerl sequence object before applying the cut_seq method.
>
> sub cut_seq {
>     # Arguments: the sequence string, the recognition site (may
>     # contain ambiguity codes), and the cut position within the site.
>     # Note: assumes $cutsite >= 1; a cut at offset 0 would never advance.
>     my ($bigline, $recognition_site, $cutsite) = @_;
>     my @frags = ();
>     # expanded_string() (not shown) turns the site into a regex pattern.
>     my $pat = expanded_string($recognition_site);
>     while (length $bigline) {
>         if ($bigline =~ /$pat/) {
>             # Everything up to the cut inside the matched site is one
>             # fragment; the remainder is searched again.
>             my $cut = $-[0] + $cutsite;
>             push @frags, substr($bigline, 0, $cut);
>             $bigline = substr($bigline, $cut);
>         }
>         else {
>             push @frags, $bigline;    # Last one
>             $bigline = "";
>         }
>     }
>     return @frags;
> }
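
For anyone who wants to try the idea without the rest of the script,
here is a self-contained sketch along the same lines. The names
expand_iupac and cut_string are hypothetical: expand_iupac is only a
guess at what a helper like expanded_string might do (mapping IUPAC
ambiguity codes to regex character classes), and the test sequence is
made up. It works on plain strings, so the second digest needs no
sequence objects, and it skips a cut that would make no progress
(offset 0 at the start of a fragment).

```perl
use strict;
use warnings;

# IUPAC nucleotide ambiguity codes as regex character classes.
my %IUPAC = (
    A => 'A', C => 'C', G => 'G', T => 'T',
    R => '[AG]',  Y => '[CT]',  S => '[CG]',  W => '[AT]',
    K => '[GT]',  M => '[AC]',  B => '[CGT]', D => '[AGT]',
    H => '[ACT]', V => '[ACG]', N => '[ACGT]',
);

# Hypothetical helper: GRCGYC becomes G[AG]CG[CT]C.
sub expand_iupac {
    my ($site) = @_;
    my $pat = '';
    for my $code (split //, uc $site) {
        die "unknown IUPAC code '$code'\n" unless exists $IUPAC{$code};
        $pat .= $IUPAC{$code};
    }
    return $pat;
}

# Hypothetical cutter: cut $seq at every match of $site, $offset bases
# into the site. Fragments are substrings of the real sequence.
sub cut_string {
    my ($seq, $site, $offset) = @_;
    my $pat = expand_iupac($site);
    my $re  = qr/$pat/;
    my @frags;
    my ($start, $search) = (0, 0);   # fragment start, next search position
    while (1) {
        pos($seq) = $search;
        last unless $seq =~ /$re/g;
        my $cut = $-[0] + $offset;
        if ($cut > $start) {
            push @frags, substr($seq, $start, $cut - $start);
            $start = $search = $cut;  # search the remainder from the cut
        }
        else {
            $search = $-[0] + 1;      # no progress: look past this site
        }
    }
    push @frags, substr($seq, $start) if $start < length $seq;
    return @frags;
}

# Double digest on a made-up sequence: BsaHI (GR^CGYC), then HpaII (C^CGG).
my $seq    = 'AAGACGCCGGTT';
my @first  = cut_string($seq, 'GRCGYC', 2);
my @second = map { cut_string($_, 'CCGG', 1) } @first;
print join(' ', @second), "\n";
```

Because the first digest returns real bases, the second fragment still
begins with CGCCGG and HpaII cuts it after CGC, as in the expected
result quoted above.
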
>
>
>
>
> Nick Staffa
> Telephone: 919-316-4569  (NIEHS: 6-4569)
> Scientific Computing Support Group
> NIEHS Information Technology Support Services Contract
> (Science Task Monitor: Jack L. Field( field1 at niehs.nih.gov )
> National Institute of Environmental Health Sciences
> National Institutes of Health
> Research Triangle Park, North Carolina
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign




