[Bioperl-l] Still trouble with remote joined records

Chris Fields cjfields at uiuc.edu
Wed Oct 17 22:49:37 EDT 2007

On Oct 17, 2007, at 8:42 PM, Warren Gallin wrote:

> I must be missing something, but I cannot get the procedure outlined
> in FAQ 5.5 to do what I think it should (maybe my expectations are
> incorrect).
> I think I have an up-to-date release:
> GallinPowerbook:~ wgallin$ perl -MBio::Root::Version -e 'print
> $Bio::Root::Version::VERSION,"\n"'
> 1.005002102
> ...
> I get an error that a protein sequence cannot be translated.  I
> thought that feeding the GenBank handle into the spliced_seq
> method would retrieve the necessary nucleic acid sequence
> records and splice together the specified ranges.

No, the error is expected:

MSG: Can't translate an amino acid sequence.

The record in question is a protein record, so you are retrieving a
protein sequence, which can't be translated (in other words, the
exception is valid).  Note that the 'CDS' feature in this case has a
location of 1..630 (indicated by the arrow), which refers to positions
in the protein itself, not to nucleotide coordinates:

      CDS             1..630   <---------

The 'coded_by' tag holds the nucleotide location you want; however, it
is stored only as a plain string, not as a parsed location.
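
As a rough sketch of what working with that string could look like, the
plain-Perl snippet below splits a location string, such as the one
returned by get_tag_values('coded_by'), into accession/start/end
segments.  The sub name parse_coded_by and the example value are made up
for illustration (the accessions and coordinates are invented), and only
the simplest location forms are handled:

```perl
use strict;
use warnings;

# Parse a GenBank-style location string (e.g. a 'coded_by' value) into a
# list of segments.  Only these simple forms are handled:
#   ACC:start..end
#   join(ACC:start..end,ACC:start..end,...)
# complement(), fuzzy ends (< >) and single positions are NOT supported.
sub parse_coded_by {
    my ($str) = @_;
    (my $loc = $str) =~ s/\s+//g;             # drop whitespace from line wrapping
    $loc = $1 if $loc =~ /^join\((.*)\)$/;    # unwrap join(...)
    my @segments;
    for my $part (split /,/, $loc) {
        $part =~ /^([A-Za-z0-9_.]+):(\d+)\.\.(\d+)$/
            or die "Unsupported location segment: $part\n";
        push @segments, { acc => $1, start => $2, end => $3 };
    }
    return @segments;
}

# Hypothetical value; the accessions and coordinates are invented.
my $coded_by = 'join(XX000001.1:430..1544,XX000002.1:1..780)';
for my $seg (parse_coded_by($coded_by)) {
    printf "%s\t%d\t%d\n", @{$seg}{qw(acc start end)};
}
```

Each segment could then be fetched by accession from the nucleotide
database and the subsequences joined by hand.  BioPerl's
Bio::Factory::FTLocationFactory also has a from_string() method that
builds location objects from such strings, which is probably the better
route if it handles your record.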

> So I tried using the corresponding nucleic acid record, gi7648671,
> which holds the 5' end of the CDS (I used $gbh on the get_Seq step).
> That yielded the correct amino acid sequence for the first half of the
> protein, encoded by the sequence in the record itself, but it did not
> retrieve the other nucleic acid record that is specified to contain
> the 3' end of the sequence.
> ...
> So, as far as I can see, passing the DB handle isn't causing the
> spliced_seq method to go elsewhere for the nucleic acid sequence data.
> I thought that was the purpose.
> Can anyone enlighten me, or is this a bug?
> Warren Gallin

This appears to be a bug.  The remote sequence is designated in the
join location of the CDS feature:

      CDS             join 
                      /product="voltage-gated potassium channel Kv4.2"

but the location is truncated when the record is passed through SeqIO
and written back out as GenBank, which explains the spliced_seq()
problem:

      CDS             430..1544

I'll try looking into this; if I can't get to it immediately I'll  
file a bug report.  Thanks!

