[Bioperl-l] LocatableSeq::subseq(): bug or not?
Mark A. Jensen
maj at fortinbras.us
Sun Nov 23 21:40:06 EST 2008
Since subseq() returns a string, I would (and do) expect a 1-origin
substring of the actual character data, gaps included. It would be nice
to keep that as a no-thought data grab without having to call substr
directly. I find
I want to deal with both gapped sequence and the gap-stripped sequence
at various points in an app, and have the goodies that Locatable and
Align provide as well. It might be convenient to have another method
for dealing specifically with the gap-stripped sequence, say
subseq_nogap() or subseq_residues(), so that the expected (and
regression-tested) subseq() behavior is preserved. [I prefer 'residues' to
'bases' to highlight the generality of the representation.]
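
To make the distinction concrete, here is a minimal sketch in plain Perl, independent of BioPerl. Both helper names (subseq_columns, subseq_residues) are hypothetical, and treating '-' and '.' as the only gap symbols is an assumption; an actual LocatableSeq method would presumably take its gap characters from the object rather than hard-coding them.

```perl
use strict;
use warnings;

# Column-based grab, gaps included: a 1-origin, inclusive substring of
# the raw character data (the behavior subseq() shows in the examples).
sub subseq_columns {
    my ($str, $start, $end) = @_;
    return substr($str, $start - 1, $end - $start + 1);
}

# Hypothetical helper, not part of BioPerl: return residues
# $start..$end (1-origin, inclusive), counting only non-gap characters.
# Assumption: '-' and '.' are the gap symbols.
sub subseq_residues {
    my ($str, $start, $end) = @_;
    die "bad range $start..$end\n" if $start < 1 || $end < $start;
    (my $residues = $str) =~ tr/-.//d;    # strip gap characters
    die "range exceeds residue count\n" if $end > length $residues;
    return substr($residues, $start - 1, $end - $start + 1);
}

my $aligned = '--atg---gta--';
print subseq_columns($aligned, 3, 6),  "\n";   # prints 'atg-'
print subseq_residues($aligned, 3, 6), "\n";   # prints 'ggta'
```

Keeping the two as separate calls leaves the existing column-based behavior untouched, while the gap-stripped variant is there for code that thinks in residue coordinates.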
----- Original Message -----
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Sunday, November 23, 2008 7:31 PM
Subject: [Bioperl-l] LocatableSeq::subseq(): bug or not?
> Currently, we have Bio::LocatableSeq use the default
> (Bio::PrimarySeq) implementation of subseq(). However the returned
> data apparently clashes with the actual PrimarySeq documentation:
>   Function: returns the subseq from start to end, where the first
>             base is 1 and the number is inclusive, ie 1-2 are the
>             first two bases of the sequence
> So, should the following actually return the indicated range of
> bases (no gaps)? Or should we clarify the above documentation to
> indicate subseq() returns the first x positions/columns (anything)
> instead of 'bases' (no gaps)?
> my $seq = Bio::LocatableSeq->new(
>     -seq      => '--atg---gta--',
>     -strand   => 1,
>     -start    => 1,
>     -end      => 6,
>     -alphabet => 'dna'
> );
> # comments indicate current returned val
> $seq->subseq(1,3); # returns '--a'
> $seq->subseq(3,6); # returns 'atg-'
> $seq->subseq(1,10); # returns '--atg---gt'