Bioperl-guts: Possible Seq.pm reimplementation
James Gilbert
jgrg@sanger.ac.uk
Mon, 2 Aug 1999 14:31:31 +0100 (BST)
It seems unanimous that Bio::Seq needs a rewrite!
Here are some random thoughts on the subject:
I support renaming the old module to Bio::OldSeq.
I'd like a name/id field to be kept. Most
external programs bioperlers will use need a name
of some kind to mark up the results they return.
This could be provided by the AnnSeq object, but I
think we should keep the core Seq object useable
in its own right. I can't think of when I'd
really need the desc() method though. Shouldn't
we take care of this with the Bio::AnnSeq::Comment
class?
The Bio::Seq object will be one of the first
modules that novice users look at, so we shouldn't
try to be too "clever" with it, but keep it as
simple and understandable as possible. I don't
want to see it inheriting from a large number of
different classes, which the user has to hunt
through for methods and documentation.
I agree that we should remove the counter-intuitive
capitalization of the strings (Dna, Rna, Amino)
returned from the type() method. I think
it would be nicer to have the different types of
sequences optionally blessable into sub-classes of
Bio::Seq, such as Bio::Seq::Amino (which would
then throw an exception if the revcom() method was
called).
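
As a rough sketch of the sub-class idea (all package names here are hypothetical stand-ins, not existing Bioperl modules, and the constructor arguments are assumed): a small base class keeps the id and the sequence string, a DNA sub-class implements revcom(), and an amino acid sub-class throws an exception if revcom() is called.

```perl
use strict;
use warnings;

# Hypothetical names throughout: Sketch::Seq stands in for a rewritten Bio::Seq.
package Sketch::Seq;

sub new {
    my ($class, %args) = @_;
    my $self = { id => $args{id}, seq => $args{seq} };
    return bless $self, $class;
}

# Simple read-only accessors for the two core fields.
sub id  { $_[0]{id} }
sub seq { $_[0]{seq} }

package Sketch::Seq::DNA;
our @ISA = ('Sketch::Seq');

# Reverse-complement; only meaningful for nucleotide sequences.
# Handles plain A/C/G/T only (ambiguity codes are left unchanged).
sub revcom {
    my ($self) = @_;
    my $rc = scalar reverse $self->seq;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return ref($self)->new(id => $self->id, seq => $rc);
}

package Sketch::Seq::Amino;
our @ISA = ('Sketch::Seq');

# Calling revcom() on a protein is a programming error, so die loudly.
sub revcom {
    die "revcom() is not defined for amino acid sequences\n";
}

package main;

my $dna = Sketch::Seq::DNA->new(id => 'test1', seq => 'ATGC');
print $dna->revcom->seq, "\n";

my $pep = Sketch::Seq::Amino->new(id => 'pep1', seq => 'MKV');
eval { $pep->revcom };
print "caught: $@" if $@;
```

The user only ever has to look in two places for a method: the sub-class and the one base class.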
I think we should have methods for checking the
integrity of the sequence string against standard
alphabets. Should this be in the Bio::Seq object?
Maybe Bio::Seq should have a simple method that
checks the string for non-printable characters,
which Bio::Seq::DNA would override with a check
for non-nucleotide characters.
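
A minimal sketch of the two levels of checking, written as plain functions with hypothetical names. The alphabets are assumptions chosen for illustration: printable, non-space ASCII for the generic check, and the IUPAC nucleotide codes (without U or gap characters) for the DNA check.

```perl
use strict;
use warnings;

# Hypothetical helper names; the alphabets are illustrative choices.

# Generic check: accept only printable, non-space ASCII characters.
sub has_only_printable {
    my ($seq) = @_;
    return $seq =~ /\A[\x21-\x7E]*\z/ ? 1 : 0;
}

# Stricter DNA check: IUPAC nucleotide codes, case-insensitive.
sub is_valid_dna {
    my ($seq) = @_;
    return $seq =~ /\A[ACGTRYKMSWBDHVN]*\z/i ? 1 : 0;
}

print has_only_printable("ACGT\x07") ? "ok" : "bad", "\n";
print is_valid_dna("ACGTNN")         ? "ok" : "bad", "\n";
print is_valid_dna("ACGTXZ")         ? "ok" : "bad", "\n";
```

In an object version, the generic check would live in the base class and the DNA sub-class would override it with the stricter pattern.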
I agree that the translate() method shouldn't be
in the core object. If it moves out, I don't see
the need for the start() method either. The module which
implements translation should know about the
common translation tables. There's a list at:
http://www3.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c
Inspired by Ian Korf's Eppendorf object, how about
a Ribosome object?
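
To make the Ribosome idea concrete, here is a rough sketch. The class name, constructor arguments and method signature are all hypothetical. Only the standard genetic code (table 1 in the NCBI list) is filled in; other tables would be added to the same hash. The offset argument to translate() is an assumption about how the reading frame could be chosen without a start() method on the sequence object.

```perl
use strict;
use warnings;

# Hypothetical class; only the standard code (NCBI table 1) is included.
package Sketch::Ribosome;

# Codons are enumerated in T, C, A, G order at each position, matching
# the order of the amino acid string below.
my @BASES  = qw(T C A G);
my %TABLES = (
    1 => 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG',
);

sub new {
    my ($class, %args) = @_;
    my $id  = defined $args{table} ? $args{table} : 1;
    my $aas = $TABLES{$id} or die "Unknown translation table '$id'\n";

    # Build the codon => amino acid lookup for the chosen table.
    my %codon;
    my $i = 0;
    for my $b1 (@BASES) {
        for my $b2 (@BASES) {
            for my $b3 (@BASES) {
                $codon{"$b1$b2$b3"} = substr($aas, $i++, 1);
            }
        }
    }
    return bless { codon => \%codon }, $class;
}

# Translate a nucleotide string, starting at a 0-based offset.
# Unrecognised codons become 'X'; a trailing partial codon is ignored.
sub translate {
    my ($self, $dna, $offset) = @_;
    $offset ||= 0;
    my $seq = uc $dna;
    $seq =~ tr/U/T/;    # accept RNA input as well
    my $pep = '';
    for (my $i = $offset; $i + 3 <= length($seq); $i += 3) {
        my $aa = $self->{codon}{ substr($seq, $i, 3) };
        $pep .= defined $aa ? $aa : 'X';
    }
    return $pep;
}

package main;

my $ribosome = Sketch::Ribosome->new(table => 1);
print $ribosome->translate('ATGGCCTAA'), "\n";
```

Keeping the tables inside the translating object means the core sequence class needs no knowledge of genetic codes at all.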
And a minor niggle. Why revcom() and not
revcomp(), which I imagine most people would more
naturally type? I guess we're stuck with
revcom().
James G.R. Gilbert
The Sanger Centre
Wellcome Trust Genome Campus
Hinxton
Cambridge        Tel: 01223 494906
CB10 1SA         Fax: 01223 494919