[Bioperl-l] Re: [Bioperl-guts-l] RestrictionEnzyme.pm

Hilmar Lapp hlapp@gmx.net
Wed, 24 Jan 2001 01:55:01 -0800

Paul-Christophe Varoutas wrote:
> Hi again,
> Last night I started experimenting with RestrictionEnzyme.pm.
> I very much liked the '-MAKE' => 'custom' switch in the constructor, but I
> think it would nevertheless be a good idea to write a public method that
> updates the enzyme list from the NEBASE site.
> I suggest writing a sub (let's call it update_list or update_RE_list) that:
> - goes to the NEBASE site and gets the latest version of the restriction
> enzyme list. We can choose between http/ftp and various types of
> lists/formats. My preference would be to go to their ftp site and get what
> they call "format 18": DNAStrider format, list of all commercially
> available enzymes. The file is ftp://ftp.nebase.com/pub/nebase/striderc.*
> (the extension of the file reflects the version).
> - saves this list in a text file, in the Bio/Tools/ directory. An
> alternative is to update the enzyme list in the RestrictionEnzyme.pm file
> itself, at the beginning of the file, within the definition of the %RE
> hash, but intuitively I would not recommend it, as I don't know if
> writing to a file while it is being read by the Perl
> interpreter will behave well on all operating systems. Tell me what you
> think about it.

You normally can't write to Bio/Tools as a user (under Unix), and
a user client shouldn't attempt to do so under any circumstances.
Regarding the ability to update the list of known REs, I see the
following options.
1) Accept an additional (named!) parameter at initialization that
denotes a file (in DNAStrider format?) containing enzymes to be
made known in addition to the collection of hard-coded enzymes.
2) Same as before, but the parameter denotes a URL from where to
obtain this file.
3) Put all hard-coded enzymes into a file that resides at a known
place within the Bio/ directory tree, and read (parse) that upon
initialization of RestrictionEnzyme.pm. An update would mean
updating that file.
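
To make option 1) concrete, here is a minimal sketch, written in Python
for brevity rather than in the module's own Perl. Everything in it is an
assumption for illustration: the function names, the extra_file
parameter, and above all the one-enzyme-per-line "NAME SITE" layout,
which is a stand-in and not the DNAStrider format.

```python
# Hard-coded defaults (a tiny subset, for illustration only).
DEFAULT_ENZYMES = {"EcoRI": "G^AATTC", "BamHI": "G^GATCC"}


def parse_enzyme_lines(lines):
    """Parse 'NAME SITE' lines into a dict.

    Blank lines and lines starting with '#' are skipped. This layout is
    a hypothetical stand-in, NOT the DNAStrider format.
    """
    enzymes = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError("expected 'NAME SITE', got: %r" % line)
        name, site = parts
        enzymes[name] = site
    return enzymes


def load_enzymes(extra_file=None):
    """Return the defaults, extended by entries from extra_file if given."""
    enzymes = dict(DEFAULT_ENZYMES)  # copy, so the defaults stay untouched
    if extra_file is not None:
        with open(extra_file) as handle:
            enzymes.update(parse_enzyme_lines(handle))
    return enzymes
```

Entries from the file override hard-coded ones of the same name, which
seems the least surprising behaviour for a user-supplied list.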

I'm not sure option 3) would have compelling advantages over the
present layout. Options 1) and 2) are certainly worthwhile to
pursue and in essence are almost identical, the only difference
being how to open the stream containing the enzyme data. So, one
could try to combine both into one parameter, and have the code
figure out whether it's a file or an http/ftp URL.
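
A rough sketch of that dispatch, again in Python for brevity rather than
Perl. The names is_remote and open_enzyme_stream are hypothetical, and
treating https as remote as well is my own addition. The idea is only
that a single parameter is inspected for a URL scheme and opened
accordingly.

```python
from urllib.parse import urlparse
from urllib.request import urlopen

# Schemes treated as remote; anything else is assumed to be a local path.
# (https is included as an assumption; only http/ftp are discussed above.)
REMOTE_SCHEMES = {"http", "https", "ftp"}


def is_remote(source):
    """Return True if source looks like an http/https/ftp URL."""
    return urlparse(source).scheme.lower() in REMOTE_SCHEMES


def open_enzyme_stream(source):
    """Open source for reading bytes, whether it is a URL or a local file."""
    if is_remote(source):
        return urlopen(source)
    return open(source, "rb")
```

Either way the caller gets back a readable byte stream, so the parsing
code does not need to know where the data came from.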


Do you already have a CVS write account?

> - if the enzyme list is saved in a separate file, I will also modify the
> initialisation of the %RE hash, with code that reads and parses the enzyme
> list file.
> If this sounds OK to you, I will write it this weekend and submit it. Of
> course, if you had something completely different in mind, please say so; I
> will try to adapt to it.
> Paul-Christophe

Hilmar Lapp                                email: hlapp@gmx.net
GNF, San Diego, Ca. 92122                  phone: +1 858 812 1757