[Bioperl-guts-l] [Bug 2630] Bio::LocatableSeq range validation does not work with translated sequence coordinates

bugzilla-daemon at portal.open-bio.org
Sun Nov 16 01:06:13 EST 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2630





------- Comment #2 from cjfields at bioperl.org  2008-11-16 01:06 EST -------
(In reply to comment #1)
> Initial mapping() implementation added to subversion. It takes a two-element
> array (or an array ref with a parameter) and returns a two-element array. The
> two elements are integers (not type-checked yet) which indicate:
> 
> 1) the number of residues which map to...
> 2) the number of positions based on the LocatableSeq start/end, or the
> calculated length of the sequence minus gaps
> 
> A translated sequence mapping to nucleotide coordinates would be [1,3], while a
> reverse-translated nucleotide sequence mapping to amino acid coordinates would
> be [3,1].
> 
> This passes all tests so far, but a few warnings are being thrown for
> Bio::SearchIO::fasta parsing which indicate that end coordinates are still
> being miscalculated (my guess is something wrong with the way gap symbols are
> being calculated in FASTA-based LocatableSeq).  Will close out when that bug is
> fixed.
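
To make the mapping arithmetic in the quoted comment concrete, here is a rough sketch in Python rather than Perl. It is not the code committed to subversion: the function name expected_end, the gap characters, and the use of integer division are all assumptions made for illustration.

```python
def expected_end(start, seq, mapping=(1, 1), gap_chars="-."):
    """Estimate the end coordinate implied by a start, a sequence and a mapping.

    mapping is (residues, positions): `residues` residues in the sequence
    string correspond to `positions` positions in the coordinate system.
    Hypothetical helper, not part of Bio::LocatableSeq.
    """
    residues, positions = mapping
    # Gap symbols do not consume coordinates, so count only real residues.
    ungapped = sum(1 for ch in seq if ch not in gap_chars)
    # Assumption: partial units are truncated by integer division.
    return start + (ungapped * positions) // residues - 1


# Translated sequence on nucleotide coordinates: 4 residues cover 12 positions.
protein_end = expected_end(1, "MKV-L", mapping=(1, 3))
# Reverse-translated nucleotides on amino acid coordinates: 9 bases cover 3 positions.
nucleotide_end = expected_end(10, "ATGAAAGTT", mapping=(3, 1))
```

With the default [1,1] mapping the result reduces to start + ungapped length - 1, which is the usual one-to-one case.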

To allow frameshifts for HSPs, I will be adding a frameshifts() method to
LocatableSeq.  The data structure will likely be a simple internal hash, with
each key being the sequence position of a frameshift and each value being the
number of positions it is shifted relative to the sequence itself (1 or -1 for
FASTA, but it could feasibly be more).  The end() calculation will start taking
this into account, along with the mapping criteria, when cross-checking the end
value passed to end().
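
As a rough illustration of the data structure described above, the following Python sketch stores frameshifts as a position-to-shift dictionary and folds them into the end check. None of this is the planned Perl implementation: the names end_with_frameshifts and end_is_consistent are hypothetical, and summing the shift values into the coordinate length is only one plausible way the end() calculation could take them into account.

```python
def end_with_frameshifts(start, seq, mapping=(1, 1), frameshifts=None,
                         gap_chars="-."):
    """Estimate the end coordinate, adjusted for frameshifts.

    frameshifts maps a sequence position to the number of coordinate
    positions it is shifted by (e.g. 1 or -1). Hypothetical helper.
    """
    residues, positions = mapping
    # Count residues only; gap symbols do not consume coordinates.
    ungapped = sum(1 for ch in seq if ch not in gap_chars)
    length = (ungapped * positions) // residues
    # Assumption: each frameshift lengthens or shortens the covered range.
    length += sum((frameshifts or {}).values())
    return start + length - 1


def end_is_consistent(start, end, seq, mapping=(1, 1), frameshifts=None):
    """Cross-check a caller-supplied end against the computed one."""
    return end == end_with_frameshifts(start, seq, mapping, frameshifts)


# A translated HSP with a single +1 frameshift at sequence position 2.
shifted_end = end_with_frameshifts(1, "MKVL", mapping=(1, 3), frameshifts={2: 1})
```

A +1 and a -1 frameshift in the same sequence cancel out under this interpretation, so the computed end matches the unshifted case.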

