[Bioperl-l] split location problems
jason at bioperl.org
Mon Oct 16 22:48:14 EDT 2006
This probably was exposed by the fact that the Split object used to
explicitly sort the features by start*strand always. But with remote
locations and needing to be able to explicitly set the order (for features
that are not required to be 5' -> 3') that code must have been removed. I
think there is just one place that must be missing a 'reverse' on the list
of sub-locations when the top-level feature is a complement. I'll wait for
your fix before wading in - we might want to figure out a
'consolidate' method to shrink redundant and equivalent representations to
the shortest possible form. Ugh, this really starts to resemble trying to
write a boolean logic toolkit....
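
To make the start*strand sort and the missing 'reverse' concrete, here is a
minimal sketch in Python. It is not BioPerl code: `sort_five_to_three` and
`to_ft_string` are hypothetical helper names, and segments are plain
(start, end) tuples. It assumes the feature-table convention that segments
inside complement(join(...)) are written in ascending coordinate order, so a
minus-strand list held in 5' -> 3' order has to be reversed on output.

```python
def sort_five_to_three(segments, strand):
    """Sort (start, end) segments by start*strand.

    For strand +1 this is ascending start; for strand -1 it is
    descending start, i.e. 5' -> 3' order on the minus strand.
    """
    return sorted(segments, key=lambda seg: seg[0] * strand)


def to_ft_string(segments, strand):
    """Render segments as a feature-table location string.

    Assumption: `segments` is already in 5' -> 3' order for the feature,
    and `strand` (+1 or -1) applies to the whole feature.
    """
    parts = [f"{start}..{end}" for start, end in segments]
    if strand < 0:
        # This is the 'reverse' step: inside complement(...) the
        # segments are listed in ascending coordinate order.
        parts.reverse()
    body = parts[0] if len(parts) == 1 else "join(" + ",".join(parts) + ")"
    return f"complement({body})" if strand < 0 else body


if __name__ == "__main__":
    minus = sort_five_to_three([(100, 200), (300, 400)], -1)
    print(to_ft_string(minus, -1))
```

Dropping the `parts.reverse()` line gives complement(join(300..400,100..200))
for the same input, which is the kind of segment-order change discussed in
this thread.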
On 10/16/06, Chris Fields <cjfields at uiuc.edu> wrote:
> On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote:
> > The whole point of split locations is to represent genes with
> > introns so that is not the "rare" case.
> > I'm confused where the problem is. The locations that I get out
> > with to_FTstring on the location object are exactly the same as
> > those input.
> The problem is with a subset of split locations described in the
> bug report. The following works:
> whereas this:
> gives this:
> which is not syntactically the same. It should be:
> since 'join' implies that the order of the segments to be joined is
> important ('order' and 'bond' do not, I guess).
> > I have processed the genbank fungal genomes into GFF3 and have had
> > no problems so I'm confused where you are breaking down. If I
> > write them out as embl I also get the correct thing. This is using
> > the CVS version of bioperl from the HEAD.
> > I've added code to test this to bug 2101 including a C.glabrata
> > chromosome downloaded from genbank. Perhaps the problem is on the
> > EMBL parsing side, I didn't test that.
> > On the technical side, I still am not sure I fully know where the
> > strand information should be stored - the top level container or
> > the sub-features. I'll try and stay up on the discussion if
> > anything has been decided that I should know about.
> > -jason
> Split::strand() sets the sublocations as well, which seems to confuse
> the situation more but it is consistent with LocationI, as Hilmar
> points out. I'm looking into a few solutions now, including a fix in
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign