[Bioperl-l] Hilmar and Ewan debate SeqFeatures some more...

David Block dblock@gene.pbi.nrc.ca
Mon, 22 Jan 2001 12:20:16 -0600 (CST)

Hello everyone!

Just back from Calgary, doing final bits of paperwork to prepare for my
defense.  After Feb 20, my mind will be a lot clearer!

Okay, I just read through everybody's arguments, and since you want my
opinion, I'll give it to you.

Our pathway to enlightenment here has been that we started with simple
cases, then met complex cases and had to tear everything down multiple
times to accommodate complexity.  So it looks like BioPerl is doing that
now with fuzzy locations (which have been tossed around the list for
longer than I've been on it).

We should bite the bullet and build for posterity.  Extensibility is a
major priority in this situation, and for that reason, Hilmar wins my vote.

Backwards compatibility: I would like it very much if, for simple cases, a
simple location object were created by default.  A complex location object
should only be created when complex location input is given.

Then the familiar start, end notation would refer to the default simple
location object.  I like the idea of some sort of global environment-type
variable that would set the policy for fuzzy instances.  A well-documented
default would be fine here as well.
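
To make the policy idea concrete, here is a minimal sketch, written in
Python rather than Perl purely for brevity.  Everything in it is an
assumption for illustration: FuzzyRange, DEFAULT_POLICY, resolve and the
policy names "widest" and "narrowest" are hypothetical and not part of any
existing BioPerl interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuzzyRange:
    """Hypothetical fuzzy location: each boundary lies within a range."""
    min_start: int
    max_start: int
    min_end: int
    max_end: int


# Hypothetical global, well-documented default policy.
DEFAULT_POLICY = "widest"


def resolve(loc, policy=None):
    """Collapse a fuzzy location to a hard (start, end) pair.

    If no policy is passed, the module-level default is used.
    """
    policy = policy or DEFAULT_POLICY
    if policy == "widest":
        # Outermost possible boundaries.
        return loc.min_start, loc.max_end
    if policy == "narrowest":
        # Innermost possible boundaries.
        return loc.max_start, loc.min_end
    raise ValueError("unknown fuzzy-location policy: %r" % (policy,))
```

With this, resolve(FuzzyRange(10, 15, 90, 100)) gives the widest span, and a
caller who wants something else passes a policy explicitly.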

Workbench would use the default behaviour (widest, probably) for fuzzy
locations, and then show the fuzziness at the base-pair level when details
were requested.  So it would be great if start, end returned hard locations
according to some policy that could be defined (at object creation?), and
details were returned only when requested.  In that case, could location be
an optional object, only created when needed?

So start, end would return numbers, either hard numbers given to them at
creation, or numbers computed by a location object.  A different call
($feature->detailedstart or something) would call $feature->start if there
was no more info on the location, and would call the location object
otherwise.  This could then return whatever array or hash we decide
on.  That would take care of the memory concerns (we create a lot of
objects with Workbench as well), since in most cases, the start/end pair
would be all that was stored.  The complexities could be handled whenever
the client desired complexity.

Would it be necessary to flag objects that have detailed location
information?  Well, that's a simple check for the presence of a LocationI
object attached to the SeqFeature object.
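
The start/end fallback and the presence check described above can be
sketched roughly as follows.  Again this is Python for brevity, not BioPerl
code: the class and method names (Feature, FuzzyLocation, detailed_start,
has_detailed_location) are hypothetical, the "widest" behaviour is hard-coded
as an assumed default, and returning a (min, max) tuple is just one choice
for "whatever array or hash we decide on".

```python
class FuzzyLocation:
    """Hypothetical detailed location with ranged boundaries."""

    def __init__(self, min_start, max_start, min_end, max_end):
        self.min_start = min_start
        self.max_start = max_start
        self.min_end = min_end
        self.max_end = max_end

    def start(self):
        # Assumed default policy: widest.
        return self.min_start

    def end(self):
        return self.max_end

    def detailed_start(self):
        return (self.min_start, self.max_start)


class Feature:
    """Hypothetical feature that stores only start/end in the simple case."""

    def __init__(self, start=None, end=None, location=None):
        if location is None and (start is None or end is None):
            raise ValueError("need either start and end, or a location")
        self._start = start
        self._end = end
        self._location = location  # optional, only set for complex input

    def has_detailed_location(self):
        # The "flag" is simply the presence of a location object.
        return self._location is not None

    def start(self):
        # Hard number: stored directly, or computed by the location.
        if self._location is None:
            return self._start
        return self._location.start()

    def end(self):
        if self._location is None:
            return self._end
        return self._location.end()

    def detailed_start(self):
        # Fall back to the plain start when there is no more information.
        if self._location is None:
            s = self.start()
            return (s, s)
        return self._location.detailed_start()
```

A simple Feature(start=5, end=50) never allocates a location object, which is
where the memory saving would come from; only features built from complex
input carry the extra object.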

Okay, there's my opinion.  Let me know what you think. 

David Block
Plant Biotechnology Institute
National Research Council of Canada
Saskatoon, Saskatchewan