[Bioperl-l] Re: Gene Structure / GenScan

hilmar.lapp@pharma.Novartis.com
Tue, 1 Aug 2000 16:09:10 +0100

> The Ensembl Genscan parser Ewan sent yesterday seems to be a good
> point. However, I'd prefer to have a gene structure represented
> independent of the/an underlying sequence (object), that is, as a feature
> which may or may not have a sequence attached. In addition, a parser
> need not rely on being provided with the source sequence, and the
> resulting gene structure representation can be attached to the pertaining
> source sequence by the client.
> I'd propose the following:
> Bio::SeqFeature::GeneStructure is-a Bio::SeqFeature::Generic (or just a
> Bio::SeqFeatureI ?)
> and offers specific support for gene structure related things, like

Aha. Now you want the appropriate Ensembl gene objects, not the Genscan
parser. Look at


Look at


Again, I would be happy if these moved "across" to bioperl.

You will want to add additional stuff to the Gene object to handle
promoters (or perhaps the transcript object). Don't forget about
alternative splicing.

     Well, that's not really what I was aiming at. I was thinking of a
     representation of the _data_ that make up a gene structure, e.g. as
     people find it or as programs predict it. IMHO all _interpretation_
     of the data (features in this case) belongs in separate classes,
     either derived ones or ones in another hierarchy (you could think of a
     GeneTranscriber that knows about alternative splicing). So the modules
     I proposed shouldn't do much with actual sequences, apart from maybe
     very basic things. They're just features, which is all you need in
     the first place to represent, e.g., GenScan results. And they should
     be rich enough to allow other modules to derive real things, like
     protein sequences, from them. So: lightweight, but heavy enough.
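
     To make the split concrete, here is a rough, language-neutral sketch
     (in Python rather than Perl, purely for illustration). It is not
     Bioperl or Ensembl code. GeneStructure and GeneTranscriber are the
     names used above; the Exon class, the attach_seq and spliced_sequence
     methods, and the 1-based inclusive coordinates are all assumptions
     made for this example, and only the forward strand is handled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Exon:
    """Hypothetical exon feature: 1-based, inclusive coordinates."""
    start: int
    end: int


@dataclass
class GeneStructure:
    """Pure data: a set of exon features, with or without a sequence."""
    exons: List[Exon] = field(default_factory=list)
    source: str = "unknown"      # e.g. the predicting program
    seq: Optional[str] = None    # attached later by the client, if at all

    def attach_seq(self, seq: str) -> None:
        # The parser never needs the sequence; the client attaches it.
        self.seq = seq

    @property
    def start(self) -> int:
        return min(e.start for e in self.exons)

    @property
    def end(self) -> int:
        return max(e.end for e in self.exons)


class GeneTranscriber:
    """Interpretation lives here, outside the feature classes."""

    def spliced_sequence(self, gene: GeneStructure) -> str:
        if gene.seq is None:
            raise ValueError("no sequence attached to this gene structure")
        # Join exon subsequences in coordinate order (forward strand only).
        ordered = sorted(gene.exons, key=lambda e: e.start)
        return "".join(gene.seq[e.start - 1:e.end] for e in ordered)
```

     A parser would only ever build the GeneStructure; whoever holds the
     source sequence calls attach_seq and then hands the feature to a
     GeneTranscriber (or any other interpreting class).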

     Am I missing something?