[Bioperl-l] problems to use parsing of rst in Bio::Tools::Run::Phylo::PAML::Codeml

Albert Vilella avilella at gmail.com
Fri Oct 26 14:45:29 EDT 2007


Hi Claude,

Did you submit the bug to Bugzilla?

Or, if you prefer, can you send me a tar.gz with your script and input files
so I can have a look at this?

Cheers,

    Albert.

On 10/24/07, Jason Stajich <jason at bioperl.org> wrote:
>
> Claude - Will try and take a look when I have time, but pretty overwhelmed
> with other things for the time being - best bet is to also submit this as a
> bug to bugzilla.open-bio.org
>
> -jason
> On Oct 22, 2007, at 12:56 PM, Claude Rispe wrote:
>
> Dear Dr Stajich,
> We are having difficulty using this BioPerl module to run PAML when we try to
> parse the reconstruction of ancestral sites (rst). I am sending a working extract
> of our program and two example FASTA files (alignments). The program
> seems to work fine (at least for running codeml) and generates a correct rst
> file (by visual inspection), but we have two problems when we try to parse
> the results.
>
> 1/ We have 36 taxa and 11 internal nodes (see attached .nwk file).
> However, when we use the output of "get_rst_persite", the lookups do not work
> for node 36 (the last terminal node) or for node 47 (the last internal
> node): they return errors. Yet on visual inspection of the rst file, there
> are no problems with these sequences.
>
> 2/ If alignments contain gaps (as in e.g. the file b1630.fas-GBL),
> get_rst_persite returns wrong values. It seems the site references are not
> correct (gaps being ignored?), so we obtain codons that are different
> from those actually expected for most sequences!
>
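
On point 2, one possible explanation (an assumption, not something confirmed in this thread) is that codeml ran with cleandata = 1. In that mode PAML removes every alignment column that contains a gap or ambiguity character, so site numbers in the rst file would refer to the reduced alignment rather than the original one. A minimal sketch in plain Perl, using a made-up toy alignment and a hypothetical helper gap_free_codon_columns (not part of BioPerl), shows how the two numberings could be related:

```perl
use strict;
use warnings;

# Hypothetical helper, not a BioPerl function.
# Returns the 1-based codon columns of the original alignment in which
# no sequence has a gap character. Sequences are plain strings of equal
# length (a multiple of 3).
sub gap_free_codon_columns {
    my @seqs     = @_;
    my $n_codons = length($seqs[0]) / 3;
    my @kept;
    for my $col (1 .. $n_codons) {
        my $has_gap = 0;
        for my $s (@seqs) {
            my $codon = substr($s, ($col - 1) * 3, 3);
            if ($codon =~ /-/) { $has_gap = 1; last; }
        }
        push @kept, $col unless $has_gap;
    }
    return @kept;
}

# Toy alignment (made up for illustration): codon 2 is gapped in one sequence.
my @toy  = ('ATGAAATTC', 'ATG---TTC', 'ATGAAATTT');
my @kept = gap_free_codon_columns(@toy);

# Site k in the gap-stripped numbering corresponds to original column $kept[k-1].
print "stripped site $_ -> original codon column $kept[$_ - 1]\n" for 1 .. @kept;
```

If the codons printed for a site match the original column given by this mapping, the parser is probably fine and only the coordinates differ.
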
> May be you will have a bit of time to have a glance at our code and
> example files. We might have not written the code correctly (I am not an
> expert at this task). Thanks a lot in advance for any advice or feedback,
> Claude Rispe
> #!/usr/bin/perl -w
>
> use Bio::SeqIO;
> use Bio::AlignIO;
> use Bio::Tools::Phylo::PAML;
> use Bio::Tools::Run::Phylo::PAML::Codeml;
> use Bio::TreeIO;
> use Bio::Tree::TreeI;
>
> use strict;
>
>
> my $treeio=Bio::TreeIO->new(-file =>"entero_topo.nwk", -format=>"newick");
> my $tree=$treeio->next_tree();
>
>
>    # runs PAML and gets ancestral sequences (for internal nodes)
>    # ------------------------------------------------------------
>       my $in=Bio::AlignIO->new(-file => "b3298.fas-GBL", -format => 'fasta');
>       while ( my $aln=$in->next_aln() ) {
>          my $kaks_factory=Bio::Tools::Run::Phylo::PAML::Codeml->new(
>              -tree=> $tree, -save_tempfiles=> 'TRUE', -dir=>'/tmp',
>              -params=> {'runmode'=>0,'seqtype'=>1,'model'=>0,'RateAncestor'=>1} );
>          $kaks_factory->alignment($aln);
>          my($rc,$parser)= $kaks_factory->run();
>  my $result=$parser->next_result;
>
>
>          # parsing of ancestral states and of mutations along branches
>          # ------------------------------------------------------------
>  my @trees = $result->get_rst_trees;
>  my $persite = $result->get_rst_persite;
>  my $nbsites=scalar(@$persite)-1;
>  print "$nbsites\n";
>     for my $t ( @trees ) {
>                for my $node ( $t->get_nodes ) {
>                    next unless $node->ancestor; # skip root node
>    printf "node:%s     ancestor: %s  node is leaf? %s\n",
>        $node->id, $node->ancestor->id, $node->is_Leaf;
>    if ($node->id=~/^(\d+)/) {
>        my $n=$1;   # leading node number, saved before $1 can be overwritten
>        for( my $i=1;$i<=$nbsites;$i++) {
>          print " $i ",$persite->[$i]->[$n]->{'codon'}, ' ',
>              $persite->[$i]->[$n]->{'aa'}," ";
>          unless ($node->is_Leaf) {print " ",$persite->[$i]->[$n]->{'prob'}," ";}
>
>
>        }
>    }
>            print "\n";
>                }
>
>
>            }
>
>
>
>
>       }
>
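
The loop above indexes $persite->[$site]->[$node] directly, so a node number with no entry in the parsed structure yields undefined values and warnings. A minimal sketch of a defensive lookup follows. It assumes only the array-of-arrays-of-hashes shape that the script itself relies on; rst_entry is a hypothetical helper and the data is mock data, not output from codeml:

```perl
use strict;
use warnings;

# Hypothetical helper, not a BioPerl function.
# Looks up one (site, node) entry in a structure shaped like
#   $persite->[$site]->[$node] = { codon => ..., aa => ..., prob => ... }
# and returns undef when the entry is absent.
sub rst_entry {
    my ($persite, $site, $node) = @_;
    return undef unless defined $node && $node =~ /^\d+$/;
    my $row = $persite->[$site];
    return undef unless ref($row) eq 'ARRAY';
    my $entry = $row->[$node];
    return ref($entry) eq 'HASH' ? $entry : undef;
}

# Mock data: one site (index 1), nodes 1 and 2 present, node 3 missing.
my $persite = [
    undef,
    [ undef, { codon => 'GTC', aa => 'V' }, { codon => 'GTG', aa => 'V' } ],
];

for my $node (1 .. 3) {
    my $e = rst_entry($persite, 1, $node);
    print "node $node: ", ($e ? "$e->{codon} $e->{aa}" : "no entry"), "\n";
}
```

Running such a check over every node number would show whether nodes 36 and 47 are really absent from the parsed structure or are only stored under a different index.
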
> Eco3
>
> GTCGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Shig6
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Shig4
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Shig5
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Shig2
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Erwca
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAAC ATACCGTTAT TGCGTTAACG
> TCAATTTTCG GTATCGGTAA AACTCGGTCG CAGGCTATTT GTGCTGCAAC GGAGATTGCC
> GAAAATGTTA AGATCAGTGA GCTGTCTGAA GAGCAAATCG ATAAGCTGCG TGACGAAGTT
> GCCAAGTTTG TTGTAGAAGG TGATCTGCGT CGTGAAGTTA CCCTGAGCAT CAAGCGTCTT
> ATGGACCTTG GTACTTATCG TGGTTTGCGT CATCGTCGTG GTCTGCCGGT TCGCGGTCAG
> CGTACCAAGA CTAACGCCCG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Shig3
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Shig1
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGTCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Yep2
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAAC ATACCGTTAT CGCTTTAACT
> GCGATCTTCG GCATCGGCAA AACTCGCTCA CAGGCTATCT GTGTTGCTGC GGGTATTGCT
> GAACATGTTA AGATCAGTGA GCTGTCTGAA GAGCAAATCG AGAAGCTGCG TGACGAAGTT
> GCCAAGTACG TTGTAGAAGG TGATCTGCGT CGTGAGGTGA CCCTGAGCAT CAAGCGTCTG
> ATGGACCTTG GGACTTATCG TGGTTTGCGT CATCGTCGTG GTCTACCAGT TCGCGGTCAG
> CGTACTAAGA CCAACGCACG TACCCGTAAG GGTCCGCGTA AACCGATCAA GAAA
>
> Yep3
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAAC ATACCGTTAT CGCTTTAACT
> GCGATCTTCG GCATCGGCAA AACTCGCTCA CAGGCTATCT GTGTTGCTGC GGGTATTGCT
> GAACATGTTA AGATCAGTGA GCTGTCTGAA GAGCAAATCG AGAAGCTGCG TGACGAAGTT
> GCCAAGTACG TTGTAGAAGG TGATCTGCGT CGTGAGGTGA CCCTGAGCAT CAAGCGTCTG
> ATGGACCTTG GGACTTATCG TGGTTTGCGT CATCGTCGTG GTCTACCAGT TCGCGGTCAG
> CGTACTAAGA CCAACGCACG TACCCGTAAG GGTCCGCGTA AACCGATCAA GAAA
>
> Klebs
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCACAAAC ATACCGTTAT CGCTTTAACC
> GCTATTTTCG GTATCGGCAA AACCCGTTCT AAAGCCATCT GCGCTGAAAC GGGCATCGCT
> GAAAATGTTA AGATCAGTGA GCTGTCTGAA GAACAAATTG ATATTCTGCG TGAAGCAGTA
> GGTAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA CCCTGAGCAT CAAGCGTCTG
> ATGGACCTTG GTTGCTACCG TGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGTA AACCGATCAA GAAA
>
> Yep1
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAAC ATACCGTTAT CGCTTTAACT
> GCGATCTTCG GCATCGGCAA AACTCGCTCA CAGGCTATCT GTGTTGCTGC GGGTATTGCT
> GAACATGTTA AGATCAGTGA GCTGTCTGAA GAGCAAATCG AGAAGCTGCG TGACGAAGTT
> GCCAAGTACG TTGTAGAAGG TGATCTGCGT CGTGAGGTGA CCCTGAGCAT CAAGCGTCTG
> ATGGACCTTG GGACTTATCG TGGTTTGCGT CATCGTCGTG GTCTACCAGT TCGCGGTCAG
> CGTACTAAGA CCAACGCACG TACCCGTAAG GGTCCGCGTA AACCGATCAA GAAA
>
> Yent
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAAC ATACCGTTAT CGCGTTAACT
> GCGATCTACG GCATCGGTAA GACCCGTTCA CAGGCTATCT GTGTTGCTGC GGGTATTGCT
> GAAAATGTTA AGATCAGTGA GCTGTCTGAA GAGCAAATCG AGAAGCTGCG TGACGAAGTT
> GCCAAGTACG TTGTAGAAGG TGATCTGCGT CGTGAGGTGA CCCTGAGCAT CAAGCGTCTG
> ATGGACCTTG GGACTTATCG TGGTTTGCGT CATCGTCGTG GTCTACCAGT TCGCGGTCAG
> CGTACTAAGA CCAACGCACG TACCCGTAAG GGTCCGCGTA AACCGATCAA GAAA
>
> Yep4
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAAC ATACCGTTAT CGCTTTAACT
> GCGATCTTCG GCATCGGCAA AACTCGCTCA CAGGCTATCT GTGTTGCTGC GGGTATTGCT
> GAACATGTTA AGATCAGTGA GCTGTCTGAA GAGCAAATCG AGAAGCTGCG TGACGAAGTT
> GCCAAGTACG TTGTAGAAGG TGATCTGCGT CGTGAGGTGA CCCTGAGCAT CAAGCGTCTG
> ATGGACCTTG GGACTTATCG TGGTTTGCGT CATCGTCGTG GTCTACCAGT TCGCGGTCAG
> CGTACTAAGA CCAACGCACG TACCCGTAAG GGTCCGCGTA AACCGATCAA GAAA
>
> Yep5
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAAC ATACCGTTAT CGCTTTAACT
> GCGATCTTCG GCATCGGCAA AACTCGCTCA CAGGCTATCT GTGTTGCTGC GGGTATTGCT
> GAACATGTTA AGATCAGTGA GCTGTCTGAA GAGCAAATCG AGAAGCTGCG TGACGAAGTT
> GCCAAGTACG TTGTAGAAGG TGATCTGCGT CGTGAGGTGA CCCTGAGCAT CAAGCGTCTG
> ATGGACCTTG GGACTTATCG TGGTTTGCGT CATCGTCGTG GTCTACCAGT TCGCGGTCAG
> CGTACTAAGA CCAACGCACG TACCCGTAAG GGTCCGCGTA AACCGATCAA GAAA
>
> Yeps2
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAAC ATACCGTTAT CGCTTTAACT
> GCGATCTTCG GCATCGGCAA AACTCGCTCA CAGGCTATCT GTGTTGCTGC GGGTATTGCT
> GAACATGTTA AGATCAGTGA GCTGTCTGAA GAGCAAATCG AGAAGCTGCG TGACGAAGTT
> GCCAAGTACG TTGTAGAAGG TGATCTGCGT CGTGAGGTGA CCCTGAGCAT CAAGCGTCTG
> ATGGACCTTG GGACTTATCG TGGTTTGCGT CATCGTCGTG GTCTACCAGT TCGCGGTCAG
> CGTACTAAGA CCAACGCACG TACCCGTAAG GGTCCGCGTA AACCGATCAA GAAA
>
> Yeps1
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAAC ATACCGTTAT CGCTTTAACT
> GCGATCTTCG GCATCGGCAA AACTCGCTCA CAGGCTATCT GTGTTGCTGC GGGTATTGCT
> GAACATGTTA AGATCAGTGA GCTGTCTGAA GAGCAAATCG AGAAGCTGCG TGACGAAGTT
> GCCAAGTACG TTGTAGAAGG TGATCTGCGT CGTGAGGTGA CCCTGAGCAT CAAGCGTCTG
> ATGGACCTTG GGACTTATCG TGGTTTGCGT CATCGTCGTG GTCTACCAGT TCGCGGTCAG
> CGTACTAAGA CCAACGCACG TACCCGTAAG GGTCCGCGTA AACCGATCAA GAAA
>
> Eco10
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Yep6
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAAC ATACCGTTAT CGCTTTAACT
> GCGATCTTCG GCATCGGCAA AACTCGCTCA CAGGCTATCT GTGTTGCTGC GGGTATTGCT
> GAACATGTTA AGATCAGTGA GCTGTCTGAA GAGCAAATCG AGAAGCTGCG TGACGAAGTT
> GCCAAGTACG TTGTAGAAGG TGATCTGCGT CGTGAGGTGA CCCTGAGCAT CAAGCGTCTG
> ATGGACCTTG GGACTTATCG TGGTTTGCGT CATCGTCGTG GTCTACCAGT TCGCGGTCAG
> CGTACTAAGA CCAACGCACG TACCCGTAAG GGTCCGCGTA AACCGATCAA GAAA
>
> Styphim
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAAC ACGCCGTGAT CGCGTTAACT
> TCGATCTACG GTGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAAATGTTA AGATCAGTGA GCTGTCTGAA GAACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCTCG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Sty2
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAAC ACGCCGTGAT CGCGTTAACT
> TCGATCTACG GTGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAAATGTTA AGATCAGTGA GCTGTCTGAA GAACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCTCG TACCCGTAAG GGTCCGTGCA AACCGATCAA GAAA
>
> Enterob
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAAAAAC ATGCTGTGAT CGCATTAACT
> TCGATCTATG GCGTCGGCAA GACCCGTTCC AAAGCCATTT TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GAACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAGTTA GCATGAGCAT CAAGCGCCTT
> ATGGACCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGCCAG
> CGCACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Sty1
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAAC ACGCCGTGAT CGCGTTAACT
> TCGATCTACG GTGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAAATGTTA AGATCAGTGA GCTGTCTGAA GAACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCTCG TACCCGTAAG GGTCCGTGCA AACCGATCAA GAAA
>
> Eco5
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Eco4
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Eco7
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Eco6
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Eco1
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Serrp
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAAC ATACCGTAAT CGCCTTAACG
> TCGATCTTCG GAATCGGTAA AACCCGCTCA CAGTCTATCT GTGCGTCTAC GGGTATTGCT
> GAACATGTTA AGATCAGTGA GCTGTCTGAA GAGCAAATTG AACAGCTGCG TGAAGCAGTC
> GCCAAATTCA CTGTAGAAGG TGATTTGCGT CGTGAAGTTA CCCTGAGCAT CAAGCGTCTG
> ATGGATCTTG GTACTTACCG TGGTTTGCGT CATCGTCGTG GTCTGCCAGT TCGCGGTCAG
> CGTACTAAGA CCAACGCACG TACCCGTAAG GGTCCGCGTA AACCAATCAA GAAA
>
> Sodalis
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAAC ATACCGTTAT TGCCCTAACG
> TCAATCTACG GTATCGGCAA AACCCGCTCG CAGCACATCT GCGCGGCTAC GGGTATTGCT
> GAACATGTTA AGATCAGTGA GCTGTCTGAA GAGCAGATTG ACACGCTGCG TGAAGCAGTT
> ACCAAGTTTG TTGTCGAAGG CGATCTGCGC CGTGAAGTCA CCCTGAGTAT CAAGCGTCTG
> ATGGACCTTG GTACCTATCG TGGTTTGCGT CATCGTCGTG GTCTTCCGGT TCGCGGTCAG
> CGTACCAAGA CCAATGCCCG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Eco2
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Sent2
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAAC ACGCCGTGAT CGCGTTAACT
> TCGATCTACG GTGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAAATGTTA AGATCAGTGA GCTGTCTGAA GAACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCTCG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Sent1
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAAC ACGCCGTGAT CGCGTTAACT
> TCGATCTACG GTGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAAATGTTA AGATCAGTGA GCTGTCTGAA GAACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGCCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCTCG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Pholu
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCAGAAGC ATACCGTGAT CGCTTTAACA
> TCGATCTACG GAATTGGCAA AACTCGCTCC CAAGCCATTT GTGCTGCGGC GGGTATTGCT
> GAGCATGTTA AGATCAGCGA GCTGTCTGAA GAGCAAATTG ATAAGCTGCG TGACGAAGTT
> GCCAAATACG TTGTAGAAGG CGATTTGCGT CGTGAAGTAA CCCTGAGCAT CAAACGTCTG
> ATGGATCTTG GTACTTATCG TGGTTTACGT CACCGTCGTG GTCTACCTGT TCGCGGCCAG
> CGTACTAAGA CCAACGCACG TACCCGTAAG GGTCCACGTA AGCCGATCAA GAAA
>
> Eco9
>
> GTGGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Eco8
>
> GTAGCCCGTA TAGCAGGCAT TAACATTCCT GATCATAAGC ATGCCGTAAT CGCATTAACT
> TCGATTTATG GCGTCGGCAA GACCCGTTCT AAAGCCATCC TGGCTGCAGC GGGTATCGCT
> GAAGATGTTA AGATCAGTGA GCTGTCTGAA GGACAAATCG ACACGCTGCG TGACGAAGTT
> GCCAAATTTG TCGTTGAAGG TGATCTGCGC CGTGAAATCA GCATGAGCAT CAAGCGCCTG
> ATGGATCTTG GTTGCTATCG CGGTTTGCGT CATCGTCGTG GTCTCCCGGT TCGCGGTCAG
> CGTACCAAGA CCAACGCACG TACCCGTAAG GGTCCGCGCA AACCGATCAA GAAA
>
> Eco3
>
> ATGGTATTCA TAGCTAGCTC CCCTTATACC CATAACCAGC GCCAGACATC GCGCATTATG
> CTGTTGGTAT TGCTCGCAGC CGTGCCAGGA ATCGCAGCGC AACTGTGGTT TTTTGGTTGG
> GGTACTCTCG TTCAGATCCT GTTGGCGTCG GTCAGTGCTC TTCTGGCCGA AGCTCTCGTC
> CTCAAACTAC GCAAGCAGTC GGTAGCTGCA ACGTTGAAAG ATAACTCAGC ATTGCTGACA
> GGCTTATTGC TGGCGGTAAG TATTCCCCCC CTCGCGCCAT GGTGGATGGT CGTGCTGGGT
> ACGGTGTTTG CGGTGATCAT CGCTAAACAG TTGTATGGCG GTCTGGGGCA AAACCCGTTT
> AATCCGGCAA TGATTGGTTA TGTGGTCTTA CTGATCTCCT TCCCGGTGCA GATGACCAGT
> TGGTTACCGC CACATGAAAT TGCGGTCAAC ATCCCTGGTT TTATCGACGC CATACAGGTT
> ATTTTCAGCG GTCATACCGC CAGTGGTGGT GATATGAACA CGCTACGCTT AGGTATTGAT
> GGCATTAGTC AGGCGACACC GCTGGATACA TTCAAAACCT CTGTCCGCGC AGGTCATTCG
> GTTGAACAGA TTATGCAATA TCCGATCTAC AGCGGTATTC TGGCGGGCGT TGGTTGGCAA
> TGGGTAAATC TCGCCTGGCT GGCTGGCGGC GTATGGTTGC TGTGGCAGAA AGCGATTCGC
> TGGCATATTC CCCTCAGCTT CTTAGTAACG CTGGCGTTAT GCGCAACGTT GGGCTGGTTG
> TTCTCACCAG AAACACTGGC AGCACCGCAA ATTCATCTGC TGTCTGGAGC GACCATGCTC
> GGCGCATTCT TTATTTTGAC TGACCCGGTT ACCGCTTCTA CGACCAATCG TGGTCGTCTT
> ATTTTCGGCG CGCTGGCGGG CTTATTAGTC TGGATGATCC GCAGTTTCGG CGGCTATCCT
> GACGGCGTTG CTTTTGCCGT CCTGCTGGCG AACATCACGG TTCCTCTGAT CGATTACTAC
> ACGCGTCCGC GCGTCTACGG CCATCGCAAA
>
> Shig6
>
> ATGGTATTCA TAGCTAGCTC CCCTTATACC CATAACCAGC GCCAGACATC GCGCATTATG
> CTGTTGGTGT TGCTCGCAGC CGTGCCAGGA ATCGCAGCGC AACTGTGGTT TTTTGGTTGG
> GGTACTCTCG TTCAGATCCT GTTGGCGTCG GTCAGTGCTC TGTTAGCCGA AGCTCTCGTA
> CTCAAACTAC GCAAGCAGTC GGTAGCCGCA ACGTTGAAAG ATAACTCAGC ATTGCTGACA
> GGCTTATTGC TGGCGGTAAG TATTCCCCCC CTAGCGCCAT GGTGGATGGT CGTACTGGGT
> ACGGTGTTTG CGGTGATCAT CGCTAAACAG TTGTATGGCG GTCTGGGGCA AAACCCGTTT
> AATCCGGCAA TGATTGGTTA TGTGGTCTTA CTGATCTCCT TCCCGGTGCA GATGACCAGC
> TGGTTACCGC CACATGAAAT TGCGGTCAAC ATCCCTGGTT TTATCGACGC CATACAGGTT
> ATTTTCAGCG GTCATACCGC CAGTGGTGGT GATATGAATA CGCTACGCTT AGGTATTGAT
> GGCATTAGTC AGGCGACACC GCTGGATACA TTCAAAACCT CTGTCCGCGC AGGTCATTCG
> GTTGAACAGA TTATGCAATA TCCGATCTAC AGCGGTATTC TGGCGGGCGC TGGTTGGCAA
> TGGGTAAATC TCGCCTGGCT GGCTGGTGGC GTGTGGTTGC TATGGCAGAA AGCGATTCGC
> TGGCATATTC CCCTCAGCTT CTTAGTAACG CTGGCGTTAT GCGCAACGTT GGGCTGGTTG
> TTCTCACCAG AAACACTGGC AGCACCGCAA ATTCATCTGC TGTCTGGTGC GACTATGCTC
> GGCGCATTCT TTATTTTGAC TGACCCAGTT ACTGCTTCTA CGACCAATCG TGGTCGGCTG
> ATGTTCGGCG CGCTGGCGGG CTTATTAGTC TGGTTGATCC GCAGTTTCGG CGGCTATCCT
> GACGGCGTGG CTTTTGCCGT CCTGCTGGCG AACATCACGG TTCCTCTGAT CGATTACTAC
> ACGCGTCCGC GCGTCTACGG CCATCGCAAA
>
> Shig4
>
> ATGGTATTCA TAGCTAGCTC CCCTTATACC CATAACCAGC GCCAGACATC GCGCATTATG
> CTGTTGGTGT TGCTCGCAGC CGTGCCAGGA ATCGCAGCGC AACTGCGGTT TTTTGGTTGG
> GGTACTCTCG TTCAGATCCT GTTGGCGTCG GTCAGTGCTC TGTTAGCCGA AGCTCTCGTA
> CTCAAACTAC GCAAGCAGTC GGTAGCCGCA ACGTTGAAAG ATAACTCAGC ATTGCTGACA
> GGCTTATTGC TGGCGGTAAG TATTCCCCCC CTAGCGCCAT GGTGGATGGT CGTACTGGGT
> ACGGTGTTTG CGGTGATCAT CGCTAAACAG TTGTATGGCG GTCTGGGGCA AAACCCGTTT
> AATCCGGCAA TGATTGGTTA TGTGGTCTTA CTGATCTCCT TCCCGGTGCA GATGACCAGC
> TGGTTACCGC CACATGAAAT TGCGGTCAAC ATCCCTGGTT TTATCGACGC CATACAGGTT
> ATTTTCAGCG GTCATACCGC CAGTGGTGGT GATATGAATA CGCTACGCTT AGGTATTGAT
> GGCATTAGTC AGGCGACACC GCTGGATACA TTCAAAACCT CTGTCCGCGC AGGTCATTCG
> GTTGAACAGA TTATGCAATA TCCGATCTAC AGCGGTATTC TGGCGGGCGC TGGTTGGCAA
> TGGGTAAATC TCGCCTGGCT GGCTGGTGGC GTGTGGTTGC TATGGCAGAA AGCGATTCGC
> TGGCATATTC CCCTCAGCTT CTTAGTAACG CTGGCGTTAT GCGCAACGTT GGGCTGGTTG
> TTCTCACCAG AAACACTGGC AGCACCGCAA ATTCATCTGC TGTCTGGTGC GACTATGCTC
> GGCGCATTCT TTATTTTGAC TGACCCAGTT ACTGCTTCTA CGACCAATCG TGGTCGGCTG
> ATGTTCGGCG CGCTGGCGGG CTTATTAGTC TGGTTGATCC GCAGTTTCGG CGGCTATCCT
> GACGGCGTGG CTTTTGCCGT CCTGCTGGCG AACATCACGG TTCCTCTGAT CGATTACTAC
> ACGCGTCCGC GCGTCTACGG CCATCGCAAA
>
> Shig5
>
> ATGGTATTCA TAGCTAGCTC CCCTTATACC CATAACCAGC GCCAGACATC GCGCATTATG
> CTGTTGGTGT TGCTCGCAGC CGTGCCAGGA ATCGCAGCGC AACTGCGGTT TTTTGGTTGG
> GGTACTCTCG TTCAGATCCT GTTGGCGTCG GTCAGTGCTC TGTTAGCCGA AGCTCTCGTA
> CTCAAACTAC GCAAGCAGTC GGTAGCCGCA ACGTTGAAAG ATAACTCAGC ATTGCTGACA
> GGCTTATTGC TGGCGGTAAG TATTCCCCCC CTAGCGCCAT GGTGGATGGT CGTACTGGGT
> ACGGTGTTTG CGGTGATCAT CGCTAAACAG TTGTATGGCG GTCTGGGGCA AAACCCGTTT
> AATCCGGCAA TGATTGGTTA TGTGGTCTTA CTGATCTCCT TCCCGGTGCA GATGACCAGC
> TGGTTACCGC CACATGAAAT TGCGGTCAAC ATCCCTGGTT TTATCGACGC CATACAGGTT
> ATTTTCAGCG GTCATACCGC CAGTGGTGGT GATATGAATA CGCTACGCTT AGGTATTGAT
> GGCATTAGTC AGGCGACACC GCTGGATACA TTCAAAACCT CTGTCCGCGC AGGTCATTCG
> GTTGAACAGA TTATGCAATA TCCGATCTAC AGCGGTATTC TGGCGGGCGC TGGTTGGCAA
> TGGGTAAATC TCGCCTGGCT GGCTGGTGGC GTGTGGTTGC TATGGCAGAA AGCGATTCGC
> TGGCATATTC CCCTCAGCTT CTTAGTAACG CTGGCGTTAT GCGCAACGTT GGGCTGGTTG
> TTCTCACCAG AAACACTGGC AGCACCGCAA ATTCATCTGC TGTCTGGTGC GACTATGCTC
> GGCGCATTCT TTATTTTGAC TGACCCAGTT ACTGCTTCTA CGACCAATCG TGGTCGGCTG
> ATGTTCGGCG CGCTGGCGGG CTTATTAGTC TGGTTGATCC GCAGTTTCGG CGGCTATCCT
> GACGGCGTGG CTTTTGCCGT CCTGCTGGCG AACATCACGG TTCCTCTGAT CGATTACTAC
> ACGCGTCCGC GCGTCTACGG CCATCGCAAA
>
> Shig2
>
> ATGGTATTCA TAGCTAGCTC CCCTTATACC CATAACCAGC GCCAGACATC GCGCATTATG
> CTGTTGGTGT TGCTCGCAGC CGTGCCAGGA ATCGCAGCGC AACTGTGGTT TTTTGGTTGG
> GGTACTCTCG TTCAGATCCT GTTGGCATCG GTTAGTACTC TGTTAGCCGA AGCTCTCGTA
> CTCAAACTAC GCAAGCAGTC GGTAGCCGCA ACGTTGAAAG ATAACTCAGC ATTGCTGACA
> GGCTTATTGC TGGCGGTAAG TATTCCCCCC CTCGCGCCAT GGTGGATGGT CGTGCTGGGT
> ACGGTGTTTG CGGTGATCAT CGCTAAACAG TTGTATGGCG GTCTGGGGCA AAACCCGTTT
> AATCCGGCAA TGATTGGTTA TGTGGTCTTA CTGATCTCCT TCCCGGTGCA GATGACCAGC
> TGGTTACCGC CACATGAAAT TGCGGTCAAC ATCCCTGGTT TTATCGACGC CATACAGGTT
> ATTTTCAGCG GGCATACCGC CAGTGGTGCT GATATGAACA CACTACACTT AGGTATTGAT
> GGCATTAGTC AGGCGACACC GCTGGATACA TTTAAAACCT CTGTCCGTGC CGGTCATTCG
> GTTGAACAGA TTATGCAATA TCCGATCTAC AGCGGTATTC TGGCGGGCGC TGGTTGGCAA
> TGGGTAAATC TCGCCTGGCT GGCTGGCGGC CTGTGGTTGC TATGGCAGAA AGCGATTCGC
> TGGCATATTC CCCTCAGCTT CTTAGTAACG CTGGCGTTAT GCGCAACGTT GGGCTGGTTG
> TTCTCACCAG AAACACTGGC AGCACCGCAA ATTCATCTGC TGTCTGGTGC GACCATGCTC
> GGCGCATTCT TTATTTTGAC TGACCCGGTT ACCGCTTCTA CGACCAATCG TGGTCGTCTT
> ATTTTCGGCG CGCTTGCGGG CTTATTAGTC TGGTTGATCC GCAGTTTCGG CGGCTATCCT
> GACGGCGTGG CTTTTGCCGT CCTGCTGGCG AACATCACGG TTCCTCTGAT CGATTACTAC
> ACGCGTCCGC GCGTCTACGG CCATCGCAAA
>
> Erwca
>
> ATGGCTTTTA TTGCGAGTTC ACCGTTCACA CATAACCAGC AGCGCACGCA GCGCATCATG
> CTGTGGGTTA TTCTGGCCTG CCTGCCGGGC ATGTTGGCGC AGGTCTATTT CTTTGGCTAC
> GGCAACCTGA TTCAGGTCGG GCTGGCGTCT GCCACCGCAC TCATTGCGGA AGGCGTGACG
> CTGTCACTGA GGAAATTTGC GGTTCGTACC ACCCTGGCTG ATAATTCCGC GTTACTGACC
> GCCGTGCTGC TCGGAATCAG TCTACCGCCA TTAGCGCCCT GGTGGATGGT GGTCATGGCA
> ACCGTCTTCG CTATCATTAT CGCTAAACAG CTGTATGGCG GATTAGGGCA AAATCCCTTT
> AACCCCGCGA TGATTGGTTA TGTGGTGTTG CTGATCTCTT TCCCTGTCCA GATGACGAGC
> TGGCTGCCAC CCGAGCCGCT GCAAACCATC TCGCTTAGTT TCCATGACTC GCTAGTCATC
> ATTTTTACCG GACACACGCC TGACGGGCAT ACCATGCAGC AGCTGATGCA CAATGTTGAT
> GGCGTGAGTC AGGCCACCCC GCTGGATACG TTTAAAACCA GCTTACGTTC CGGCCAAACG
> CCACAGAACA TCCTGCAACA GCCGATGTTT GCACAATCAC TGTCTGGTAT TGGCTGGCAG
> TGGGTTAATA TCGGTTTTCT CATCGGCGGG CTGTTCTTGC TGATGCGCGG CACGATTCGC
> TGGCATATTC CAGTCAGTTT CCTGCTGTCG CTGATGTTCT GCGCCCTGCT AAGCTGGATC
> ATCGCGCCGG ATAAATTTGC CCAACCGATG TTACATCTGT TGTCCGGTGC GACCATGCTC
> GGCGCATTTT TCATCGCCAC AGATCCCGTT ACAGCATCAA CCACGAACCG GGGCCGTCTG
> ATTTTCGGTG CGTTGATTGG GTTACTGGTG TGGCTGATTC GCACCTATGG CGGCTACCCA
> GACGGCGTTG CCTTTGCCGT TCTACTGGCG AACATCACCG TCCCGCTGAT CGACTATTAC
> ACCAAGCCAC GTGCTTACGG CCACCATCGC
>
> Shig3
>
> ATGGTATTCA TAGCTAGCTC CCCTTATACC CATAACCAGC GCCAGACATC GCGCATTATG
> CTGTTGGTGT TGCTCGCAGC CGTGCCAGGA ATCGCAGCGC AACTGTGGTT TTTTGGTTGG
> GGTACTCTCG TTCAGATCCT GTTGGCATCG GTTAGTACTC TGTTAGCCGA AGCTCTCGTA
> CTCAAACTAC GCAAGCAGTC GGTAGCCGCA ACGTTGAAAG ATAACTCAGC ATTGCTGACA
> GGCTTATTGC TGGCGGTAAG TATTCCCCCC CTCGCGCCAT GGTGGATGGT CGTGCTGGGT
> ACGGTGTTTG CGGTGATCAT CGCTAAACAG TTGTATGGCG GTCTGGGGCA AAACCCGTTT
> AATCCGGCAA TGATTGGTTA TGTGGTCTTA CTGATCTCCT TCCCGGTGCA GATGACCAGC
> TGGTTACCGC CACATGAAAT TGCGGTCAAC ATCCCTGGTT TTATCGACGC CATACAGGTT
> ATTTTCAGCG GGCATACCGC CAGTGGTGGT GATATGAACA CACTACGCTT AGGTATTGAT
> GGCATTAGTC AGGCGACACC GCTGGATACA TTTAAAACCT CTGTCCGTGC CGGTCATTCG
> GTTGAACAGA TTATGCAATA TCCGATCTAC AGCGGTATTC TGGCGGGCGC TGGTTGGCAA
> TGGGTAAATC TCGCCTGGCT GGCTGGCGGC CTGTGGTTGC TATGGCAGAA AGCGATTCGC
> TGGCATATTC CCCTCAGCTT CTTAGTAACG CTGGCGTTAT GCGCAACGTT GGGCTGGTTG
> TTCTCACCAG AAACACTGGC AGCACCGCAA ATTCATCTGC TGTCTGGTGC GACCATGCTC
> GGCGCATTCT TTATTTTGAC TGACCCGGTT ACCGCTTCTA CGACCAATCG TGGTCGTCTT
> ATTTTCGGCG CGCTTGCGGG CTTATTAGTC TGGTTGATCC GCAGTTTCGG CGGCTATCCT
> GACGGCGTGG CTTTTGCCGT CCTGCTGGCG AACATCACGG TTCCTCTGAT CGATTACTAC
> ACGCGTCCGC GCGTCTACGG CCATCGCAAA
>
> Shig1
>
> ATGGTATTCA TAGCTAGCTC CCCTTATACC CATAACCAGC GCCAGACATC GCGCATTATG
> CTGTTGGTGT TGCTCGCAGC CGTGCCAGGA ATCGCAGCGC AACTGTGGTT TTTTGGTTGG
> GGTACTCTCG TTCAGATCCT GTTGGCGTCG GTCAGTGCTC TGTTAGCCGA AGCTCTCGTA
> CTCAAACTAC GCAAACAGTC GGTAGCCGCC ACGTTGAAAG ATAACTCAGC ATTGCTGACA
> GGCTTATTGC TGGCGGTAAG TATTCCCCCC CTCGCGCCAT GGTGGATGGT CGTGCTGGGT
> ACGGTGTTTG CGGTGATCAT CGCTAAACAG TTATATGGCG GTCTGGGACA AAACCCGTTT
> AATCCGGCAA TGATTGGTTA TGTGGTCTTA CTGATCTCCT TCCCGGTGCA GATGACCAGC
> TGGTTACCGC CACATGAAAT TGCGGTCAAC ATCCTTGGTT TTATCGACGC CATCCAGGTT
> ATTTTCAGCG GTCATACCGC CAGTGGTGGT GATATGAACA CGCTACGCTT AGGTATTGAT
> GGCATTAGTC AGGCGACACC GCTGGATACA TTTAAAACCT CTGTCCGTGC CGGTCATTCG
> GTTGAACAGA TTATGCAATA TCCGATCTAC AGCGGTATTC TGGCGGGCGC TGGTTGGCAA
> TGGGTAAATC TCGCCTGGCT GGCTGGCGGC GTATGGTTGC TATGGCAGAA AGCGATTCGC
> TGGCATATTC CCCTCAGCTT CTTAGTAACG CTGGCGTTAT GCGCAACGTT GGGCTGGTTG
> TTCTCACCAG AAACACTGGC AGCACCGCAA ATTCATCTGC TGTCTGGAGC GACCATGCTC
> GGCGCATTCT TTATTTTGAC TGACCCGGTT ACCGCTTCTA CGACCAATCG TGGTCGTCTT
> ATTTTCGGCG CGCTTGCGGG CTTATTAGTC TGGTTGATCC GCAGTTTCGG CGGCTATCCT
> GACGGCGTGG CTTTTGCCGT CCTGCTGGCG AACATCACGG TTCCTCTGAT CGATTACTAC
> ACGCGTCCGC GCGTCTACGG CCATCGCAAA
>
> Yep2
>
> ATGAAATTCA TTGCCAGTTC CCCTTTTACC CACAATCAAC GCAGTACCCG TCGTATTATG
> CTGCTGGTTA TTCTGGCCTG TATTCCGGGG ATTATCGCCC AGACTTACTT CTTCGGTTAT
> GGCAGCCTGA TTCAAGTCAT GCTGGCTATG ATCACTGCAC TGTTGGCCGA AGGTGCCGTA
> TTGCAGTTAC GTAAACAACC CGTCATGGCG CGGTTACAAG ATAACTCAGC CCTACTGACT
> GCCTTATTGT TGGGGATAAG CCTCCCCCCA TTGGCCCCCT GGTGGATGAT CGTGCTGGGC
> ACGCTATTTG CTATTGTCAT TGCCAAACAA CTCTATGGTG GCCTAGGGCA GAACCCGTTT
> AACCCGGCAA TGGTCGGTTA TGTTGTGCTC CTTATTTCAT TTCCGGTACA AATGACCAGT
> TGGCTGCCAC CGCTACCGTT GCAAGGAACG TCGGTGGGAT TCTATGACAG TCTCTTAACC
> ATTTTCACCG GTTATACCCA CAGTGGTGAG AATATTCATC AGCTCCAAGT TGGCTATGAC
> GGCATCAGCC AAGCGACGCC ACTGGATACG TTTAAGACCT CGCTACGTTC A---CAGCCA
> GCAGATCAAA TTCTGCAACA GCCCATATTT GGCGGTGTGC TGGCGGGTTT AGGCTGGCAA
> TGGGTTAATA CCGGCTTTCT GGTTGGCGGC TTATTGCTAC TGTGGCGTAA AGCTATTCAC
> TGGCATATTC CCGTGAGCTT CCTGCTCGCC TTAGGAGGTT GTGCCGCAGT GAGCTGGATG
> ATTGCTCCAC AAAGCTTTGC CTCACCAATG CTACATTTGT TCTCCGGTGC CACCATGTTA
> GGTGCATTTT TCATTGCTAC CGATCCTGTC AGTGCCTCTA CAACACCCCG TGGCCGCCTG
> ATATTTGGTG CCCTCATTGG TATTCTGGTG TGGCTAATCC GGGTATATGG CGGCTATCCG
> GATAGTGTTG CTTTTGCCGT GTTGCTCGCC AATATTACAG TTCCGTTGAT TGACCACTAT
> ACCCAACCTC GGGTCTATGG CCATAAAAGC
>
> Yep3
>
> ---------A TTGCCAGTTC CCCTTTTACC CACAATCAAC GCAGTACCCG TCGTATTATG
> CTGCTGGTTA TTCTGGCCTG TATTCCGGGG ATTATCGCCC AGACTTACTT CTTCGGTTAT
> GGCAGCCTGA TTCAAGTCAT GCTGGCTATG ATCACTGCAC TGTTGGCCGA AGGTGCCGTA
> TTGCAGTTAC GTAAACAACC CGTCATGGCG CGGTTACAAG ATAACTCAGC CCTACTGACT
> GCCTTATTGT TGGGGATAAG CCTCCCCCCA TTGGCCCCCT GGTGGATGAT CGTGCTGGGC
> ACGCTATTTG CTATTGTCAT TGCCAAACAA CTCTATGGTG GCCTAGGGCA GAACCCGTTT
> AACCCGGCAA TGGTCGGTTA TGTTGTGCTC CTTATTTCAT TTCCGGTACA AATGACCAGT
> TGGCTGCCAC CGCTACCGTT GCAAGGAACG TCGGTGGGAT TCTATGACAG TCTCTTAACC
> ATTTTCACCG GTTATACCCA CAGTGGTGAG AATATTCATC AGCTCCAAGT TGGCTATGAC
> GGCATCAGCC AAGCGACGCC ACTGGATACG TTTAAGACCT CGCTACGTTC A---CAGCCA
> GCAGATCAAA TTCTGCAACA GCCCATATTT GGCGGTGTGC TGGCGGGTTT AGGCTGGCAA
> TGGGTTAATA CCGGCTTTCT GGTTGGCGGC TTATTGCTAC TGTGGCGTAA AGCTATTCAC
> TGGCATATTC CCGTGAGCTT CCTGCTCGCC TTAGGAGGTT GTGCCGCAGT GAGCTGGATG
> ATTGCTCCAC AAAGCTTTGC CTCACCAATG CTACATTTGT TCTCCGGTGC CACCATGTTA
> GGTGCATTTT TCATTGCTAC CGATCCTGTC AGTGCCTCTA CAACACCCCG TGGCCGCCTG
> ATATTTGGTG CCCTCATTGG TATTCTGGTG TGGCTAATCC GGGTATATGG CGGCTATCCG
> GATAGTGTTG CTTTTGCCGT GTTGCTCGCC AATATTACAG TTCCGTTGAT TGACCACTAT
> ACCCAACCTC GGGTCTATGG CCATAAAAGC
>
> Klebs
>
> ATGGTTTTCA TCGCAAGCTC CCCCTATACC CATAACCAGC GGCAGACCTC GCGTATTATG
> CTGCTGGTGC TGCTCGCCGC TGTGCCTGGC ATTGTGGTCC AGACCTGGTT TTTCGGCTGG
> GGCACCGTAC TGCAGATTGT CCTCGCCGCC CTGACGGCCT GGGCGACCGA AGCCGCTATT
> CTCAAACTGC GTAAACAGCC TGTTGCTGCC ACGCTGAAAG ATAATTCCGC CCTGCTCACC
> GGCCTGCTGC TGGCGGTGAG TATTCCGCCA CTGGCCCCGT GGTGGATGGT GGTGCTCGGC
> ACTGCCTTTG CCGTAGTCAT TGCCAAGCAG TTGTACGGAG GCCTGGGCCA TAACCCCTTC
> AACCCGGCAA TGATCGGCTA TGTCGTGCTG CTGATTTCGT TCCCGGTGCA GATGACCTCC
> TGGCTGCCTT CCTATGAGAT AGCCGCCCAT ATCCCGGCAT TCAGCGACGC GCTGCAGATG
> ATCTTCACCG GTCATACCGC CGCCGGAGGC GACATGGCCA GTCTGCGGCT GGGTATCGAC
> GGCGTCAGCC AGGCCACCCC GCTGGATACC TTTAAAACCT CCCTGCATGC CGGACATAGC
> GTTCAGCAGG TGCTGCAATT GCCGGTCTAC GGCGGCGTGC TGGCAGGTCT GGGCTGGCAG
> TGGGTGAATA TCGCCTGGCT GGCAGGCGGC CTTTTCCTGC TGTGGCAGAA GGCGATCCGC
> TGGCATATCC CGGTCAGCTT CCTTGTTAGC CTCGGCCTGT GCGCCACCCT CGGCTGGATT
> TTTTCGCCGC AGAGCCTGGC CTCACCGCAG ATGCATTTGC TCTCCGGGGC GACCATGCTT
> GGCGCCTTCT TTATCCTGAC CGATCCGGTG ACGGCCTCAA CAACCAACCG CGGCCGTCTG
> ATCTTTGGCG CCCTCGCTGG CCTGCTGGTC TGGCTCATCC GCAGCTTCGG TGGTTATCCG
> GACGGCGTGG CGTTTGCCGT TCTGCTGGCG AATATTACCG TGCCGCTGAT CGATTACTAC
> ACGCGCCCGC GCGTATATGG CCACCGT---
>
> Yep1
>
> ATGAAATTCA TTGCCAGTTC CCCTTTTACC CACAATCAAC GCAGTACCCG TCGTATTATG
> CTGCTGGTTA TTCTGGCCTG TATTCCGGGG ATTATCGCCC AGACTTACTT CTTCGGTTAT
> GGCAGCCTGA TTCAAGTCAT GCTGGCTATG ATCACTGCAC TGTTGGCCGA AGGTGCCGTA
> TTGCAGTTAC GTAAACAACC CGTCATGGCG CGGTTACAAG ATAACTCAGC CCTACTGACT
> GCCTTATTGT TGGGGATAAG CCTCCCCCCA TTGGCCCCCT GGTGGATGAT CGTGCTGGGC
> ACGCTATTTG CTATTGTCAT TGCCAAACAA CTCTATGGTG GCCTAGGGCA GAACCCGTTT
> AACCCGGCAA TGGTCGGTTA TGTTGTGCTC CTTATTTCAT TTCCGGTACA AATGACCAGT
> TGGCTGCCAC CGCTACCGTT GCAAGGAACG TCGGTGGGAT TCTATGACAG TCTCTTAACC
> ATTTTCACCG GTTATACCCA CAGTGGTGAG AATATTCATC AGCTCCAAGT TGGCTATGAC
> GGCATCAGCC AAGCGACGCC ACTGGATACG TTTAAGACCT CGCTACGTTC A---CAGCCA
> GCAGATCAAA TTCTGCAACA GCCCATATTT GGCGGTGTGC TGGCGGGTTT AGGCTGGCAA
> TGGGTTAATA CCGGCTTTCT GGTTGGCGGC TTATTGCTAC TGTGGCGTAA AGCTATTCAC
> TGGCATATTC CCGTGAGCTT CCTGCTCGCC TTAGGAGGTT GTGCCGCAGT GAGCTGGATG
> ATTGCTCCAC AAAGCTTTGC CTCACCAATG CTACATTTGT TCTCCGGTGC CACCATGTTA
> GGTGCATTTT TCATTGCTAC CGATCCTGTC AGTGCCTCTA CAACACCCCG TGGCCGCCTG
> ATATTTGGTG CCCTCATTGG TATTCTGGTG TGGCTAATCC GGGTATATGG CGGCTATCCG
> GATAGTGTTG CTTTTGCCGT GTTGCTCGCC AATATTACAG TTCCGTTGAT TGACCACTAT
> ACCCAACCTC GGGTCTATGG CCATAAAAGC
>
> Yent
>
> ---------A TAGCCAGTTC CCCTTTTACG CATAATCAAC GTAGCACCCG CAGTATTATG
> CTGCTGGTTA TTTTAGCCTG CATTCCGGGG ATTATCGCCC AGACTTACTT CTTTGGTTAT
> GGCAGCCTAA TCCAAGTTGC ACTGGCTATA ATGACCGCCG TGCTGGCCGA AGGCGCTGTC
> TTGCATTTAC GTAAACAGCC GGTACTGACA CGATTACAGG ATAACTCTGC CCTGCTCACT
> GGCTTATTGT TAGGTATCAG TCTGCCGCCA CTTGCCCCTT GGTGGATGAT TGTTCTGGGT
> ACGGCATTTG CCATTATTAT TGCCAAACAA CTTTACGGCG GTTTAGGTCA GAATCCGTTT
> AATCCTGCCA TGGTCGGCTA TGTTGTACTA CTGATTTCAT TCCCAGTACA AATGACCAGT
> TGGCTTCCTC CTCTGCCATT ACAGGGAACG CCGGTTGGAT TCTATGACAG CTTATTAACT
> ATTTTCACCG GATTCACCCA AAATGGTGCC GATATTCACC AACTGCAGAT TGGCTATGAT
> GGGATAAGTC AGGCCACGCC TCTCGATAAC TTTAAAACCT CACTGCGCTC C---CAACCC
> GTGGAACAGA TTCTGCAACA ACCGATTTTT ACCGCGGGGC TGGCGGGTAT TGGTTGGCAA
> TGGATTAACC TCGGTTTTCT GGCGGGCGGT CTGTTGCTGC TATGGCGTAA AGCCATCCAC
> TGGCATATTC CGGTGAGTTT CCTATTGGCT TTGGCAGGTT GTGCTGCTAT CAGTTGGATG
> ATTGCGCCAC ACAGCTTCGC CCCCCCTATG TTGCATCTGT TTTCCGGTGC CACCATGTTG
> GGTGCTTTCT TTATTGCCAC CGATCCGGTC AGCGCCTCGA CAACACCTCG TGGTCGGCTG
> ATTTTCGGCG CATTGATTGG TATTTTGGTG TGGCTGATTC GCGTTTACGG CGGTTATCCC
> GATGGGGTAG CATTTGCGGT GCTGCTGGCC AACATCTGTG TTCCACTGAT TGATCACTAC
> ACTCAACCAC GCGTTTATGG TCATCAGCGC
>
> Yep4
>
> ATGAAATTCA TTGCCAGTTC CCCTTTTACC CACAATCAAC GCAGTACCCG TCGTATTATG
> CTGCTGGTTA TTCTGGCCTG TATTCCGGGG ATTATCGCCC AGACTTACTT CTTCGGTTAT
> GGCAGCCTGA TTCAAGTCAT GCTGGCTATG ATCACTGCAC TGTTGGCCGA AGGTGCCGTA
> TTGCAGTTAC GTAAACAACC CGTCATGGCG CGGTTACAAG ATAACTCAGC CCTACTGACT
> GCCTTATTGT TGGGGATAAG CCTCCCCCCA TTGGCCCCCT GGTGGATGAT CGTGCTGGGC
> ACGCTATTTG CTATTGTCAT TGCCAAACAA CTCTATGGTG GCCTAGGGCA GAACCCGTTT
> AACCCGGCAA TGGTCGGTTA TGTTGTGCTC CTTATTTCAT TTCCGGTACA AATGACCAGT
> TGGCTGCCAC CGCTACCGTT GCAAGGAACG TCGGTGGGAT TCTATGACAG TCTCTTAACC
> ATTTTCACCG GTTATACCCA CAGTGGTGAG AATATTCATC AGCTCCAAGT TGGCTATGAC
> GGCATCAGCC AAGCGACGCC ACTGGATACG TTTAAGACCT CGCTACGTTC A---CAGCCA
> GCAGATCAAA TTCTGCAACA GCCCATATTT GGCGGTGTGC TGGCGGGTTT AGGCTGGCAA
> TGGGTTAATA CCGGCTTTCT GGTTGGCGGC TTATTGCTAC TGTGGCGTAA AGCTATTCAC
> TGGCATATTC CCGTGAGCTT CCTGCTCGCC TTAGGAGGTT GTGCCGCAGT GAGCTGGATG
> ATTGCTCCAC AAAGCTTTGC CTCACCAATG CTACATTTGT TCTCCGGTGC CACCATGTTA
> GGTGCATTTT TCATTGCTAC CGATCCTGTC AGTGCCTCTA CAACACCCCG TGGCCGCCTG
> ATATTTGGTG CCCTCATTGG TATTCTGGTG TGGCTAATCC GGGTATATGG CGGCTATCCG
> GATAGTGTTG CTTTTGCCGT GTTGCTCGCC AATATTACAG TTCCGTTGAT TGACCACTAT
> ACCCAACCTC GGGTCTATGG CCATAAAAGC
>
> Yep5
>
> ATGAAATTCA TTGCCAGTTC CCCTTTTACC CACAATCAAC GCAGTACCCG TCGTATTATG
> CTGCTGGTTA TTCTGGCCTG TATTCCGGGG ATTATCGCCC AGACTTACTT CTTCGGTTAT
> GGCAGCCTGA TTCAAGTCAT GCTGGCTATG ATCACTGCAC TGTTGGCCGA AGGTGCCGTA
> TTGCAGTTAC GTAAACAACC CGTCATGGCG CGGTTACAAG ATAACTCAGC CCTACTGACT
> GCCTTATTGT TGGGGATAAG CCTCCCCCCA TTGGCCCCCT GGTGGATGAT CGTGCTGGGC
> ACGCTATTTG CTATTGTCAT TGCCAAACAA CTCTATGGTG GCCTAGGGCA GAACCCGTTT
> AACCCGGCAA TGGTCGGTTA TGTTGTGCTC CTTATTTCAT TTCCGGTACA AATGACCAGT
> TGGCTGCCAC CGCTACCGTT GCAAGGAACG TCGGTGGGAT TCTATGACAG TCTCTTAACC
> ATTTTCACCG GTTATACCCA CAGTGGTGAG AATATTCATC AGCTCCAAGT TGGCTATGAC
> GGCATCAGCC AAGCGACGCC ACTGGATACG TTTAAGACCT CGCTACGTTC A---CAGCCA
> GCAGATCAAA TTCTGCAACA GCCCATATTT GGCGGTGTGC TGGCGGGTTT AGGCTGGCAA
> TGGGTTAATA CCGGCTTTCT GGTTGGCGGC TTATTGCTAC TGTGGCGTAA AGCTATTCAC
> TGGCATATTC CCGTGAGCTT CCTGCTCGCC TTAGGAGGTT GTGCCGCAGT GAGCTGGATG
> ATTGCTCCAC AAAGCTTTGC CTCACCAATG CTACATTTGT TCTCCGGTGC CACCATGTTA
> GGTGCATTTT TCATTGCTAC CGATCCTGTC AGTGCCTCTA CAACACCCCG TGGCCGCCTG
> ATATTTGGTG CCCTCATTGG TATTCTGGTG TGGCTAATCC GGGTATATGG CGGCTATCCG
> GATAGTGTTG CTTTTGCCGT GTTGCTCGCC AATATTACAG TTCCGTTGAT TGACCACTAT
> ACCCAACCTC GGGTCTATGG CCATAAAAGC
>
> Yeps2
>
> ---------A TTGCCAGTTC CCCTTTTACC CACAATCAAC GCAGTACCCG TCGTATTATG
> CTGCTGGTTA TTCTGGCCTG TATTCCGGGG ATTATCGCCC AGACTTACTT CTTCGGTTAT
> GGCAGCCTGA TTCAAGTCAT GCTGGCTATG ATCACTGCAC TGTTGGCCGA AGGTGCCGTA
> TTGCAGTTAC GTAAACAACC CGTCATGGCG CGGTTACAAG ATAACTCAGC CCTGCTGACT
> GCCTTATTGT TGGGGATAAG CCTCCCCCCA TTGGCCCCCT GGTGGATGAT CGTGCTGGGC
> ACGCTATTTG CTATTGTCAT TGCCAAACAA CTCTATGGTG GCCTAGGGCA GAACCCGTTT
> AACCCGGCAA TGGTCGGTTA TGTTGTGCTC CTTATTTCAT TTCCGGTACA AATGACCAGT
> TGGCTGCCAC CGCTACCGTT GCAAGGAACG TCGGTGGGAT TCTATGACAG TCTCTTAACC
> ATTTTCACCG GTTATACCCA CAGTGGTGAG AATATTCATC AGCTCCAAGT TGGCTATGAC
> GGCATCAGCC AAGCGACGCC ACTGGATACG TTTAAGACCT CGCTACGTTC A---CAGCCA
> GCAGATCAAA TTCTGCAACA GCCCATATTT GGCGGTGTGC TGGCGGGTTT AGGCTGGCAA
> TGGGTTAATC TCGGCTTTCT GGTTGGCGGT TTATTGCTAC TGTGGCGTAA AGCTATTCAC
> TGGCATATTC CCGTGAGCTT CCTGCTCGCC TTAGGAGGTT GTGCCGCAGT GAGCTGGATG
> ATTGCTCCAC AAAGCTTTGC CTCACCAATG CTACATTTGT TCTCCGGTGC CACCATGTTA
>
> ...
>
> [Message clipped]

