Bioperl-guts: Possible Seq.pm reimplementation

James Gilbert jgrg@sanger.ac.uk
Mon, 2 Aug 1999 14:31:31 +0100 (BST)



It seems unanimous that Bio::Seq needs a rewrite!

Here are some random thoughts on the subject:


I support renaming the old module to Bio::OldSeq.


I'd like a name/id field to be kept.  Most
external programs bioperlers will use need a name
of some kind to mark up the results they return.  
This could be provided by the AnnSeq object, but I
think we should keep the core Seq object useable
in its own right.  I can't think of when I'd
really need the desc() method though.  Shouldn't
we take care of this with the Bio::AnnSeq::Comment
class?


The Bio::Seq object will be one of the first
modules that novice users look at, so we shouldn't
try to be too "clever" with it, but keep it as
simple and understandable as possible.  I don't
want to see it inheriting from a large number of
different classes, which the user has to hunt
through for methods and documentation.


I agree that we should remove the counter-
intuitive capitalization of the strings (Dna, Rna,
Amino), returned from the type() method.  I think
it would be nicer to have the different types of
sequences optionally blessable into sub-classes of
Bio::Seq, such as Bio::Seq::Amino (which would
then throw an exception if the revcom() method was
called).
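
As a rough illustration of the sub-classing idea, a minimal Perl sketch
might look like the following.  The package names Sketch::Seq and
Sketch::Seq::Amino, the constructor arguments and the error message are
all placeholders invented for this example, not the real Bio::Seq
interface.

```perl
use strict;
use warnings;

# Hypothetical base class; NOT the real Bio::Seq interface.
package Sketch::Seq;

sub new {
    my ($class, %args) = @_;
    my $self = { id => $args{id}, seq => $args{seq} };
    return bless $self, $class;
}

sub id  { $_[0]{id} }
sub seq { $_[0]{seq} }

# Reverse complement; only meaningful for nucleotide sequences.
sub revcom {
    my ($self) = @_;
    my $rc = scalar reverse $self->{seq};
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return ref($self)->new(id => $self->{id}, seq => $rc);
}

# Hypothetical protein sub-class (assumed name).
package Sketch::Seq::Amino;
our @ISA = ('Sketch::Seq');

# A protein has no reverse complement, so throw an exception.
sub revcom {
    my ($self) = @_;
    die "revcom() is not defined for amino acid sequences\n";
}

package main;

my $dna    = Sketch::Seq->new(id => 'dna1', seq => 'ATGC');
my $rc_seq = $dna->revcom->seq;

# Optionally rebless a generic object into the sub-class.
my $pep = Sketch::Seq->new(id => 'pep1', seq => 'MKV');
bless $pep, 'Sketch::Seq::Amino';

# Catch the exception with eval so the script keeps running.
my $ok  = eval { $pep->revcom; 1 };
my $err = $ok ? '' : $@;

print "revcom of ATGC: $rc_seq\n";
print "amino revcom failed: $err";
```

The base class stays small, and a user who never reblesses an object
never has to know the sub-classes exist.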


I think we should have methods for checking the
integrity of the sequence string against standard
alphabets.  Should this be in the Bio::Seq object?
Maybe Bio::Seq should have a simple method that
checks there aren't any non-printable characters
in the string, which Bio::Seq::DNA would override
with a check for non-nucleotide characters.
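
One way the two levels of checking could fit together is sketched
below.  The package names and the is_valid() method are hypothetical,
and the DNA alphabet is assumed to be the IUPAC nucleotide codes
without gap characters; a real module might well choose differently.

```perl
use strict;
use warnings;

# Hypothetical base class; NOT the real Bio::Seq interface.
package Sketch::Seq;

sub new {
    my ($class, %args) = @_;
    return bless { seq => $args{seq} }, $class;
}

# Generic check: only printable, non-space ASCII characters allowed.
sub is_valid {
    my ($self) = @_;
    return $self->{seq} =~ /\A[\x21-\x7E]*\z/ ? 1 : 0;
}

# Hypothetical DNA sub-class (assumed name).
package Sketch::Seq::DNA;
our @ISA = ('Sketch::Seq');

# Stricter override: IUPAC nucleotide codes only, either case.
# Gap characters are deliberately left out (an assumption).
sub is_valid {
    my ($self) = @_;
    return $self->{seq} =~ /\A[ACGTURYSWKMBDHVN]*\z/i ? 1 : 0;
}

package main;

my $generic_ok  = Sketch::Seq->new(seq => 'MKVLE')->is_valid;
my $generic_bad = Sketch::Seq->new(seq => "AC\x07GT")->is_valid;
my $dna_ok      = Sketch::Seq::DNA->new(seq => 'ATGCNNRY')->is_valid;
my $dna_bad     = Sketch::Seq::DNA->new(seq => 'ATGCXZ')->is_valid;

print "generic: $generic_ok $generic_bad, dna: $dna_ok $dna_bad\n";
```

Because the sub-class overrides the same method name, calling code can
ask any sequence object whether it is valid without caring which
alphabet applies.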


I agree that the translate() method shouldn't be
in the core object.  I don't see the need for the
start() method if this is true.  The module which
implements translation should know about the
common translation tables.  There's a list at:

http://www3.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c

Inspired by Ian Korf's Eppendorf object, how about
a Ribosome object?
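
To make the Ribosome idea a little more concrete, here is a minimal
sketch of a stand-alone translator that knows about translation tables.
The class name Sketch::Ribosome, the "table" argument and the optional
frame offset are assumptions made for illustration.  Only NCBI tables 1
(standard) and 2 (vertebrate mitochondrial) are included, and codons
containing ambiguity codes are simply translated as X.

```perl
use strict;
use warnings;

# Hypothetical translator class (assumed name and interface).
package Sketch::Ribosome;

# Amino acids for the 64 codons in NCBI order (bases T, C, A, G),
# keyed by NCBI translation table number.
my %TABLE = (
    1 => 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG',
    2 => 'FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG',
);

sub new {
    my ($class, %args) = @_;
    my $id  = defined $args{table} ? $args{table} : 1;
    my $aas = $TABLE{$id} or die "Unknown translation table '$id'\n";

    # Build a codon => amino acid lookup from the 64-letter string.
    my @bases = qw(T C A G);
    my %codon;
    my $i = 0;
    for my $b1 (@bases) {
        for my $b2 (@bases) {
            for my $b3 (@bases) {
                $codon{"$b1$b2$b3"} = substr($aas, $i++, 1);
            }
        }
    }
    return bless { codon => \%codon }, $class;
}

# Translate a nucleotide string, starting at offset $frame (default 0).
# A trailing partial codon is ignored; unknown codons become 'X'.
sub translate {
    my ($self, $dna, $frame) = @_;
    $frame ||= 0;
    my $seq = uc $dna;
    $seq =~ tr/U/T/;    # accept RNA input as well
    my $protein = '';
    for (my $pos = $frame; $pos + 3 <= length($seq); $pos += 3) {
        my $aa = $self->{codon}{ substr($seq, $pos, 3) };
        $protein .= defined $aa ? $aa : 'X';
    }
    return $protein;
}

package main;

my $std      = Sketch::Ribosome->new;
my $mito     = Sketch::Ribosome->new(table => 2);
my $std_pep  = $std->translate('ATGGCCTGATAA');
my $mito_pep = $mito->translate('ATGGCCTGATAA');

print "standard: $std_pep\n";    # TGA is a stop codon in table 1
print "mito:     $mito_pep\n";   # TGA codes for W in table 2
```

Keeping the tables inside the translator means the core sequence object
needs neither translate() nor any knowledge of genetic codes.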


And a minor niggle.  Why revcom() and not
revcomp(), which I imagine most people would more
naturally type?  I guess we're stuck with
revcom().


James G.R. Gilbert
The Sanger Centre
Wellcome Trust Genome Campus
Hinxton
Cambridge                        Tel: 01223 494906
CB10 1SA                         Fax: 01223 494919

