[Bioperl-l] Bio::LocatableSeq and Annotation vs Feature
chmille4 at gmail.com
Thu Jun 25 13:57:25 EDT 2009
Ok, I'll use the full length SeqFeature for now and mark it with a TODO.
Thanks for the help.
On Thu, Jun 25, 2009 at 1:02 PM, Chris Fields <cjfields at illinois.edu> wrote:
> On Jun 25, 2009, at 9:46 AM, Chase Miller wrote:
>> Hi all,
>> Quick question I came across while writing the Bio::Nexml module.
>> I'm trying to link taxon data to a Bio::LocatableSeq object inside a
>> Bio::SimpleAlign object. Bio::SimpleAlign has the ability to add
>> SeqFeatures, but according to this HowTo (
>> http://www.bioperl.org/wiki/HOWTO:Feature-Annotation) a feature is
>> considered to refer to a portion of a sequence, whereas something like
>> taxon data would refer to the entire sequence and should be handled as an
>> annotation. However, as far as I can tell Bio::LocatableSeq does not
>> support annotation objects.
>> What would be the best way to relate taxon data to a single sequence
>> in an alignment?
> From working with feature/annotation-rich alignment formats such as
> Stockholm, I found this is one of the areas for Align that needs some
> rethinking. One way to work around this w/o major refactoring is to have a
> full-length SeqFeature (pointing to the proper LocatableSeq) that stores the
> Bio::Annotation. I don't necessarily like that approach as a long-term
> solution, though, as it's a little hacky and indirect, but it might get you
> started (just mark it as TODO so we can catch it at some point).
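For reference, a minimal sketch of what that workaround might look like. It
assumes a reasonably recent BioPerl in which Bio::SimpleAlign provides
add_SeqFeature/get_SeqFeatures and Bio::SeqFeature::Generic creates its
annotation collection on demand. The sequences, the 'taxon' tag and annotation
key, and the species name are made up for illustration; this is not necessarily
how Bio::Nexml ends up doing it.

```perl
use strict;
use warnings;
use Bio::LocatableSeq;
use Bio::SimpleAlign;
use Bio::SeqFeature::Generic;
use Bio::Annotation::SimpleValue;

# Toy alignment of two sequences (made-up data).
my $seq1 = Bio::LocatableSeq->new(-id => 'seq1', -seq => 'ACGT-ACGT',
                                  -start => 1, -end => 8);
my $seq2 = Bio::LocatableSeq->new(-id => 'seq2', -seq => 'ACGTTACGT',
                                  -start => 1, -end => 9);
my $aln = Bio::SimpleAlign->new();
$aln->add_seq($seq1);
$aln->add_seq($seq2);

# TODO: stopgap until LocatableSeq/PrimarySeq can carry annotations directly.
# A feature spanning the whole of seq1 stands in for a per-sequence annotation.
my $feat = Bio::SeqFeature::Generic->new(
    -start       => $seq1->start,
    -end         => $seq1->end,
    -primary_tag => 'taxon',          # hypothetical tag name
);
$feat->attach_seq($seq1);             # point the feature at its LocatableSeq

# Store the taxon data in the feature's annotation collection.
$feat->annotation->add_Annotation(
    'taxon',                          # hypothetical annotation key
    Bio::Annotation::SimpleValue->new(-value => 'Homo sapiens'),
);
$aln->add_SeqFeature($feat);

# Later: recover the taxon for each annotated sequence in the alignment.
for my $f ($aln->get_SeqFeatures) {
    next unless $f->primary_tag eq 'taxon';
    my ($taxon) = $f->annotation->get_Annotations('taxon');
    printf "%s => %s\n", $f->entire_seq->display_id, $taxon->value;
}
```

The indirection is the hacky part: the link from annotation to sequence exists
only through the feature's attached sequence, so anything that copies or slices
the alignment has to carry the features along for the association to survive.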
> For a long-term solution I don't think the answer is as simple as making
> LocatableSeq Bio::AnnotatableI; that would not be congruent with the
> PrimarySeq implementation (which is not AnnotatableI). LocatableSeq is
> supposed to represent a simple PrimarySeq that can be mapped to other
> sequences via start/end/strand, and thus inherits from both Bio::PrimarySeq
> (note lack of 'I') and RangeI.
> Three options:
> 1) Bio::Seq could be refactored to handle both Bio::PrimarySeq and
> Bio::LocatableSeq, and SimpleAlign reworked to allow any simple RangeI.
> 2) Bio::PrimarySeq can be AnnotatableI (Bio::Seq would delegate to the
> PrimarySeq AnnotationCollection).
> 3) All AnnotationI need to be linked back to the PrimarySeqI somehow e.g.
> I personally think option #2 is easiest, as this means anything that is-a
> PrimarySeq is also AnnotatableI, and it might not break past scripts. Not
> sure how this would affect overall performance though.