[Bioperl-l] Still trouble with remote joined records
cjfields at uiuc.edu
Wed Oct 17 23:45:55 EDT 2007
On Oct 17, 2007, at 9:49 PM, Chris Fields wrote:
>> So, as far as I can see, passing the DB handle isn't causing the
>> spliced_seq method to go elsewhere for the nucleic acid sequence.
>> I thought that was the purpose.
>> Can anyone enlighten me, or is this a bug?
>> Warren Gallin
> This appears to be a bug. The remote sequence is designated in the
> CDS join
> /product="voltage-gated potassium channel Kv4.2"
> but the location is truncated when passed through SeqIO to genbank
> output, which explains the spliced_seq() problem:
> CDS 430..1544
> I'll try looking into this; if I can't get to it immediately I'll
> file a bug report. Thanks!
I looked into it, and the problem isn't in BioPerl. It appears to be
an error on NCBI's end affecting some full GenBank records with
remote locations. The default return format for records fetched with
Bio::DB::GenBank is 'gbwithparts' (which retrieves full records for
everything), but in that format this particular record version has a
truncated location. You can see the truncated version here via eutils:
You can get around this in your script by changing the requested
format to 'gb', which has the correct location string and returns the
full protein seq:
my $gbh = Bio::DB::GenBank->new(-format => 'gb');
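
For context, here is a minimal sketch of how that handle might be used end
to end. The accession is a hypothetical placeholder (the record's accession
isn't given in this message), and whether NCBI still returns the full join
for 'gb' may differ from what I saw, so treat it as an illustration rather
than a tested fix:

```perl
use strict;
use warnings;
use Bio::DB::GenBank;

# Hypothetical placeholder: substitute the accession of your own record.
my $acc = 'YOUR_ACCESSION';

# Request 'gb' rather than the default 'gbwithparts'.
my $gbh = Bio::DB::GenBank->new(-format => 'gb');
my $seq = $gbh->get_Seq_by_acc($acc);

for my $cds (grep { $_->primary_tag eq 'CDS' } $seq->get_SeqFeatures) {
    # Print the location as parsed, to check that the join is not truncated.
    print "CDS location: ", $cds->location->to_FTstring, "\n";

    # Pass the DB handle so remote sublocations can be fetched.
    my $spliced = $cds->spliced_seq(-db => $gbh);
    print "Spliced length: ", $spliced->length, "\n";
}
```

If the printed location still shows only the local range (e.g. 430..1544)
instead of a join, the record came back truncated and spliced_seq() will
only return the local part.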