[Bioperl-l] RestrictionEnzyme.pm: a proposal

Paul-Christophe Varoutas paul-christophe.varoutas@curie.fr
Mon, 29 Jan 2001 11:28:02 +0100

Yesterday I studied RestrictionEnzyme.pm in more depth. I haven't yet added 
the methods I wanted to, because in my opinion redesigning this module is 
far more urgent.

The module suffers somewhat from poor design, and simply adding methods to 
it will only make the situation worse.

RestrictionEnzyme has methods that are specific to an individual 
restriction enzyme:
  - seq() is the accessor for the enzyme's recognition sequence.
  - cut_seq() "cuts" a Bio::Seq-derived object and returns an array of 
restriction fragments.
  - cuts_seq_at() does the same but returns an array of restriction site 
coordinates instead.

and methods that are specific to the list of enzymes:
  - is_available() tells whether a particular enzyme is in the list.
  - available_list() returns the list of all enzymes, or the list of n-base 
cutters.

Steve Chervitz already suggested in the module's documentation that 
is_available() "may be more appropriate for a REData.pm class", and I share 
his opinion. From a conceptual point of view, the existing 
RestrictionEnzyme.pm module corresponds to two object classes, not one.

Here is an outline of my proposal:

Split RestrictionEnzyme into two classes:

RestrictionEnzymeDBase (or a more appropriate name):
  - members: the list of restriction enzymes.
  - methods:
       - constructor using hardwired list of enzymes OR user file OR URL.
       - add/remove enzyme to/from list (adding will be the equivalent of 
_make_custom() ).
       - member accessor methods: already existing methods: is_available(), 
available_list().

RestrictionEnzyme:
  - members: the same as now (_name, _seq, _site, _cuts_after).
  - methods:
       - constructor (equivalent to the constructor calling the 
_make_standard() sub).
       - already existing accessor methods.
       - already existing methods: cut_seq, cuts_seq_at, etc.
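
To make the outline more concrete, here is a minimal sketch in plain Perl of 
how the split could look. It is only an illustration, not the existing 
Bioperl code: the package names under Sketch::, the constructor arguments, 
get_enzyme(), add_enzyme() and remove_enzyme() are all hypothetical, the 
target is a plain string rather than a Bio::Seq object, only exact matches 
are handled (no ambiguity codes), and I assume cuts_seq_at() returns 0-based 
cut positions.

```perl
use strict;
use warnings;

# Hypothetical sketch only: these packages are NOT the Bioperl modules.
package Sketch::RestrictionEnzyme;

sub new {
    my ($class, %args) = @_;
    my $self = {
        _name       => $args{name},
        _seq        => uc($args{seq}),     # recognition sequence (plain string)
        _cuts_after => $args{cuts_after},  # cut offset from the start of the site
    };
    return bless $self, $class;
}

sub name       { return $_[0]{_name} }
sub seq        { return $_[0]{_seq} }
sub cuts_after { return $_[0]{_cuts_after} }

# Return the 0-based positions at which the target string is cut.
sub cuts_seq_at {
    my ($self, $target) = @_;
    my $upper = uc($target);
    my @cuts;
    my $pos = 0;
    while (($pos = index($upper, $self->seq, $pos)) >= 0) {
        push @cuts, $pos + $self->cuts_after;
        $pos++;
    }
    return @cuts;
}

# Return the fragments obtained by cutting the target string.
sub cut_seq {
    my ($self, $target) = @_;
    my @fragments;
    my $start = 0;
    for my $cut ($self->cuts_seq_at($target)) {
        push @fragments, substr($target, $start, $cut - $start);
        $start = $cut;
    }
    push @fragments, substr($target, $start);
    return @fragments;
}

package Sketch::RestrictionEnzymeDBase;

# The constructor takes enzyme objects; a file or URL loader is left out here.
sub new {
    my ($class, @enzymes) = @_;
    my $self = bless { _enzymes => {} }, $class;
    $self->add_enzyme($_) for @enzymes;
    return $self;
}

sub add_enzyme {
    my ($self, $enzyme) = @_;
    $self->{_enzymes}{ $enzyme->name } = $enzyme;
    return $enzyme;
}

sub remove_enzyme {
    my ($self, $name) = @_;
    return delete $self->{_enzymes}{$name};
}

sub get_enzyme {
    my ($self, $name) = @_;
    return $self->{_enzymes}{$name};
}

sub is_available {
    my ($self, $name) = @_;
    return exists $self->{_enzymes}{$name} ? 1 : 0;
}

# Without an argument: all enzyme names. With $n: only the n-base cutters.
sub available_list {
    my ($self, $n) = @_;
    my @names = sort keys %{ $self->{_enzymes} };
    @names = grep { length($self->{_enzymes}{$_}->seq) == $n } @names
        if defined $n;
    return @names;
}

package main;

my $db = Sketch::RestrictionEnzymeDBase->new(
    Sketch::RestrictionEnzyme->new(name => 'EcoRI', seq => 'GAATTC', cuts_after => 1),
    Sketch::RestrictionEnzyme->new(name => 'AluI',  seq => 'AGCT',   cuts_after => 2),
);

my @frags = $db->get_enzyme('EcoRI')->cut_seq('AAGAATTCGGGAATTCTT');
print join(' ', @frags), "\n";    # prints: AAG AATTCGGG AATTCTT
```

The point of the sketch is only the division of labour: the enzyme object 
knows how to cut, and the database object knows which enzymes exist.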

This design, apart from being more "correct", will make future extensions 
of the two modules easier. The drawback of splitting RestrictionEnzyme into 
two classes is that all code using RestrictionEnzyme.pm will have to be 
updated.
Perhaps we should take advantage of the imminent 0.7 release and decide to 
go ahead with the redesign. Changing the design would also be an 
opportunity to slightly change or extend the public interface and add small 
new features, such as the ability to define and use asymmetric cutters and 
enzymes which cut outside the recognition site. To be in time for the 0.7 
release we could incorporate only the small changes now and leave the 
extensions for afterwards, especially if I do this alone, based on what we 
decide.
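
As an illustration of what supporting such enzymes would involve, here is a 
rough sketch in plain Perl. It is not based on the current module: 
revcomp() and top_strand_cuts() are hypothetical helpers, and I assume the 
two cut offsets are counted from the first base of the site in the enzyme's 
own orientation, so they may be larger than the site length. The FokI values 
(GGATG, cutting 9 and 13 bases past the site) are the ones commonly quoted 
and are used here only as an example.

```perl
use strict;
use warnings;

# Reverse-complement a plain DNA string (A, C, G, T only).
sub revcomp {
    my ($dna) = @_;
    my $rc = scalar reverse $dna;
    $rc =~ tr/ACGT/TGCA/;
    return $rc;
}

# Return the sorted 0-based positions at which the top strand of $target is
# cut. $top and $bottom are the cut offsets on the enzyme's top and bottom
# strand, both counted from the first base of $site.
sub top_strand_cuts {
    my ($site, $top, $bottom, $target) = @_;
    my %cuts;
    my $len = length($site);

    # Site found in forward orientation: the top-strand offset applies.
    my $pos = 0;
    while (($pos = index($target, $site, $pos)) >= 0) {
        my $cut = $pos + $top;
        $cuts{$cut} = 1 if $cut > 0 && $cut < length($target);
        $pos++;
    }

    # Site found in reverse orientation: the target's top strand is the
    # enzyme's bottom strand, and offsets run leftwards from the site's end.
    my $rc = revcomp($site);
    $pos = 0;
    while (($pos = index($target, $rc, $pos)) >= 0) {
        my $cut = $pos + $len - $bottom;
        $cuts{$cut} = 1 if $cut > 0 && $cut < length($target);
        $pos++;
    }

    return sort { $a <=> $b } keys %cuts;
}

# FokI-like enzyme: site GGATG (5 bases), offsets 5 + 9 = 14 and 5 + 13 = 18.
my @fwd = top_strand_cuts('GGATG', 14, 18, 'AAGGATG' . ('ACGT' x 5));
my @rev = top_strand_cuts('GGATG', 14, 18, ('ACGT' x 5) . 'CATCCAA');

# A palindromic enzyme (EcoRI, G^AATTC) is just a special case: both
# orientations give the same position, which the hash collapses into one.
my @eco = top_strand_cuts('GAATTC', 1, 5, 'AAGAATTCGG');

print "forward: @fwd\n";    # prints: forward: 16
print "reverse: @rev\n";    # prints: reverse: 7
print "EcoRI:   @eco\n";    # prints: EcoRI:   3
```

Storing two offsets instead of the single _cuts_after value would be one way 
to cover both cases; whether that fits the existing interface is of course 
open to discussion.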

Tell me what you think about it:
- First of all, is a redesign possible, or are we obliged to maintain 
compatibility? In the latter case I will just add functionality and keep 
the module's current design.
- If a redesign is possible, please send comments and suggestions.