[Bioperl-l] How to change a fasta format alignment into clustalw format?

Tao Zhu taozhu at mail.bnu.edu.cn
Thu Sep 13 01:26:56 EDT 2012


Thank you! I'm using an old version, perhaps 1.6.1? I don't know how to
check the version.
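
One way to check the installed version, as a minimal sketch: BioPerl ships a Bio::Root::Version module whose $VERSION variable holds the release number. The helper name bioperl_version is made up for this example and is not part of BioPerl.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): returns the installed
# BioPerl version string, or undef if BioPerl cannot be loaded.
sub bioperl_version {
    my $loaded = eval { require Bio::Root::Version; 1 };
    return unless $loaded;
    no warnings 'once';    # the variable is only referenced once here
    return $Bio::Root::Version::VERSION;
}

my $version = bioperl_version();
print defined $version
    ? "BioPerl version: $version\n"
    : "BioPerl does not appear to be installed\n";
```

The same check can be run from the shell as: perl -MBio::Root::Version -e 'print $Bio::Root::Version::VERSION, "\n"'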

After I switched to version 1.6.9, the problem was solved.
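
For anyone who cannot upgrade right away, it may help to locate the non-standard symbols before handing the file to Bio::AlignIO. The sketch below is plain Perl and does not use BioPerl. The helper name find_unusual_symbols and the set of characters treated as ordinary (letters plus '-', '.' and '*') are assumptions for illustration, not BioPerl's actual validation rule.

```perl
use strict;
use warnings;

# Hypothetical helper: list characters in an aligned sequence that are
# neither letters nor the assumed gap/stop symbols '-', '.', '*'.
# Returns a list of [1-based column, character] pairs.
sub find_unusual_symbols {
    my ($seq) = @_;
    my @hits;
    while ($seq =~ /([^A-Za-z.*-])/g) {
        # After a one-character match, pos() equals the 1-based column.
        push @hits, [ pos($seq), $1 ];
    }
    return @hits;
}

# First row of the alignment from this thread.
my $row = 'MESRMTNSVRIRSITKKDVSVVFQFI2IELADFEDARDQVEATEESLLHAFGFT-';
for my $hit (find_unusual_symbols($row)) {
    my ($col, $char) = @$hit;
    print "column $col: '$char'\n";
}
```

Run on each sequence line of the FASTA file, this reports where the intron markers sit, which makes it easier to confirm that they survive the conversion.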

On 2012-09-12 21:37, Fields, Christopher J wrote:
> The below worked fine for me using the latest bioperl-live.  Are you using an older version?
> 
> chris
> 
> [cjfields at pyrimidine-laptop clustalw]$ cat convert.pl 
> #!/usr/bin/env perl
> use Modern::Perl;
> use Bio::AlignIO;
> 
> my $in = Bio::AlignIO->new(-file => shift,
>                            -format => 'fasta');
> 
> my $out = Bio::AlignIO->new(-format => 'clustalw');
> 
> while (my $aln = $in->next_aln) {
>     $out->write_aln($aln);
> }
> 
> [cjfields at pyrimidine-laptop clustalw]$ cat test.fa 
>> SPOG_04578#scry
> MESRMTNSVRIRSITKKDVSVVFQFI2IELADFEDARDQVEATEESLLHAFGFT-
>> SOCG_01498#soct
> ----MTNSVRVRPITNKDISTVIQFI2IELADFEEARDQVEATEESLLNVFGFNE
>> SPAC1002.07c#spom
> -----MGSVRIRSVIKEDLPTVYQFI2KELAEFEKCEDQVEATIPNLEVAFGFID
>> SJAG_03288#sjap
> --MTNKTTAVVRRLKREDCPVVLQFI2KELAEYQKEPQQVEATVEKLEKAFGFVE
> 
> [cjfields at pyrimidine-laptop clustalw]$ perl convert.pl test.fa 
> CLUSTAL W (1.81) multiple sequence alignment
> 
> 
> SPOG_04578#scry/1-54   MESRMTNSVRIRSITKKDVSVVFQFI2IELADFEDARDQVEATEESLLHAFGFT-
> SOCG_01498#soct/1-51   ----MTNSVRVRPITNKDISTVIQFI2IELADFEEARDQVEATEESLLNVFGFNE
> SPAC1002.07c#spom/1-50 -----MGSVRIRSVIKEDLPTVYQFI2KELAEFEKCEDQVEATIPNLEVAFGFID
> SJAG_03288#sjap/1-53   --MTNKTTAVVRRLKREDCPVVLQFI2KELAEYQKEPQQVEATVEKLEKAFGFVE
>                               :. :* : .:* ..* **** ***:::.  :*****  .*  .***  
> 
> On Sep 12, 2012, at 7:28 AM, Tao Zhu <taozhu at mail.bnu.edu.cn> wrote:
> 
>> Hello, everyone
>>
>> I have a multiple protein sequence alignment in FASTA format:
>>
>>> SPOG_04578#scry
>> MESRMTNSVRIRSITKKDVSVVFQFI2IELADFEDARDQVEATEESLLHAFGFT-
>>> SOCG_01498#soct
>> ----MTNSVRVRPITNKDISTVIQFI2IELADFEEARDQVEATEESLLNVFGFNE
>>> SPAC1002.07c#spom
>> -----MGSVRIRSVIKEDLPTVYQFI2KELAEFEKCEDQVEATIPNLEVAFGFID
>>> SJAG_03288#sjap
>> --MTNKTTAVVRRLKREDCPVVLQFI2KELAEYQKEPQQVEATVEKLEKAFGFVE
>>
>> I want to convert it to CLUSTALW format. It should have been easy:
>>
>> use strict;
>> use warnings;
>> use Bio::AlignIO;
>>
>> my $in  = shift;    # input alignment file (FASTA)
>> my $out = shift;    # output alignment file (CLUSTALW)
>> my $alignio = Bio::AlignIO->new(-file=>$in, -format=>'fasta');
>> my $writeio = Bio::AlignIO->new(-file=>">$out", -format=>'clustalw');
>> while ( my $align_obj = $alignio->next_aln ) {
>>    $writeio->write_aln($align_obj);
>> }
>>
>> That should be fine. However, it doesn't work; it fails with the
>> error "seq doesn't validate".
>>
>> The alignment does contain the character "2". I inserted each "2"
>> deliberately to mark the position of a phase-2 intron, and I would like
>> to keep these markers in the CLUSTALW output. Is there any way to do this?
>>
>> -- 
>> Tao Zhu, College of Life Sciences, Beijing Normal University, Beijing
>> 100875, China
>> Email: tzhu at mail.bnu.edu.cn
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 


-- 
Tao Zhu, College of Life Sciences, Beijing Normal University, Beijing
100875, China
Email: tzhu at mail.bnu.edu.cn


