[Bioperl-l] Recoding Bio::SimpleAlign

Jun Yin jun.yin at ucd.ie
Fri Jul 16 11:54:36 EDT 2010

Dear all,


I am the Google Summer of Code student working on refactoring the Bio::Align
subsystems. The first aim of the project is to recode Bio::SimpleAlign. This
package is really useful, but it was created a long time ago and has been
written by several people, so it has become a bit inconsistent.


I tried to keep the package consistent (e.g. method calling, coding style)
with the previous distribution. However, there are still a few changes.
Since this package is created and used by the community, I think it is
better to show it to everyone before it is merged into the main
distribution. Any suggestions and criticisms are welcome.


Here are the major improvements to Bio::SimpleAlign:


1. MSA modification and selection methods are more consistent and easier to
use. I have enabled multiple/reverse selections for all sequence/column
selection methods, and changed the names to be more understandable.


For example, 

$aln->select() and $aln->select_noncont() are both deprecated and replaced
by $aln->select_Seqs(). Selections can apply to either sequences or
columns, so the method name now states explicitly which one is meant.


For example, multiple sequence selections can be called by:



Or you can toggle the selection (reverse selection) using:


If you call the method the old way, e.g.


A warning will be shown:

select - deprecated method. Use select_Seqs() instead.

The call will then be redirected to $aln->select_Seqs().



2. Gap chars and missing chars are handled more consistently in the package.

Default values for the gap char and the missing char are now set in the
package.

The gap char should be read or set by calling $aln->gap_char("-").
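
As a rough illustration of what a package-level default combined with a single getter/setter can look like, here is a minimal sketch. The package My::GapDemo, the two default variables and the default values '-' and '?' are assumptions made for this example only; this is not the code of the recoded Bio::SimpleAlign.

```perl
use strict;
use warnings;

# Hypothetical toy class, used only to illustrate the pattern.
package My::GapDemo;

# Package-level defaults (values assumed for this sketch).
our $DEFAULT_GAP_CHAR     = '-';
our $DEFAULT_MISSING_CHAR = '?';

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Combined getter/setter: store the value when one is passed,
# otherwise fall back to the package default.
sub gap_char {
    my ($self, $char) = @_;
    $self->{gap_char} = $char if defined $char;
    return defined $self->{gap_char} ? $self->{gap_char} : $DEFAULT_GAP_CHAR;
}

sub missing_char {
    my ($self, $char) = @_;
    $self->{missing_char} = $char if defined $char;
    return defined $self->{missing_char}
        ? $self->{missing_char}
        : $DEFAULT_MISSING_CHAR;
}

package main;

my $demo = My::GapDemo->new;
print $demo->gap_char, "\n";    # no value set yet, so the default is returned
$demo->gap_char('.');           # set a new gap char
print $demo->gap_char, "\n";    # now returns the value that was set
```

Keeping the defaults in one place means every method that needs the gap char can ask the accessor instead of hard-coding a character.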


3. Some redundant methods are removed. The methods are moved to more
reasonable categories.

For example, $aln->select and $aln->select_noncont are deprecated now.
Please use $aln->select_Seqs.
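
The deprecate-and-redirect behaviour described above can be sketched with a small stand-alone class. My::ToyAlign is a hypothetical stand-in, and the argument convention of select_Seqs() used here (an array reference of 1-based indices) is an assumption for the sketch, not the actual signature in the recoded package. Only the warning text is taken from this message.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an alignment class; not Bio::SimpleAlign.
package My::ToyAlign;

sub new {
    my ($class, @seqs) = @_;
    return bless { seqs => [@seqs] }, $class;
}

# New-style method: takes an array ref of 1-based sequence indices
# (assumed convention for this sketch).
sub select_Seqs {
    my ($self, $indices) = @_;
    return map { $self->{seqs}[ $_ - 1 ] } @$indices;
}

# Old-style method kept as a thin wrapper: warn, then forward the call.
sub select {
    my ($self, $start, $end) = @_;
    warn "select - deprecated method. Use select_Seqs() instead.\n";
    return $self->select_Seqs([ $start .. $end ]);
}

package main;

my $toy    = My::ToyAlign->new(qw(seqA seqB seqC seqD));
my @picked = $toy->select(2, 3);    # warning goes to STDERR
print "@picked\n";                  # prints "seqB seqC"
```

Old scripts keep working through the wrapper, while the warning points users to the new method name.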



4. Some methods are renamed. Methods that select or return objects are
capitalized, e.g. each_seq becomes each_Seq.

Other methods are renamed to give clearer information, for example:

$aln->purge is renamed to $aln->remove_redundant_Seqs

$aln->splice_by_seq_pos is renamed to $aln->remove_gaps


For further information, you can visit:

hQUZ6WFE&hl=en&authkey=CJTCw4QL> &hl=en&authkey=CJTCw4QL



Jun Yin

Ph.D. student in U.C.D.


Bioinformatics Laboratory

Conway Institute

University College Dublin

