[Bioperl-l] possible bug printing GenBank feature qualfiers

Chris Fields cjfields at uiuc.edu
Fri Mar 31 13:04:25 EST 2006


Okay.  I'm committing this change; it passes all tests for SeqIO.t and
genbank.t.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: Scott Markel [mailto:smarkel at scitegic.com]
> Sent: Friday, March 31, 2006 11:31 AM
> To: Chris Fields
> Cc: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] possible bug printing GenBank feature qualfiers
> 
> Chris,
> 
> Looks like I made my test case too simple.  In our application,
> which calls BioPerl, I'm creating the feature with the zero-
> valued qualifier.  It's not being read in from a file, so
> my only issue is with writing GenBank files.  The real feature
> is one for a primer binding site.  The qualifier contains the
> number of mismatches.  The one line change of
> 
>      $value = $value->{"value"}
> 
> definitely fixes our problem and causes no regression
> failures in our application.
> 
> Scott
> 
> Chris Fields wrote:
> 
> > I tried this on WinXP (I'm using bioperl-live) and got a warning:
> >
> > -------------------- WARNING ---------------------
> > MSG: Unexpected error in feature table for  Skipping feature, attempting
> to
> > recover
> > ---------------------------------------------------
> >
> > Running using debugging shows that no feature key was found in
> > _read_FTHelper_GenBank.  So I'm getting an error, but on input not
> output.
> > In fact, turning on -verbose in the SeqIO input object gives the below
> extra
> > output, whereas turning -verbose on only in the output object just gives
> the
> > warning above.
> >
> > ====================================
> > C:\Perl\Scripts\gb_test>test.pl
> > no feature key!
> >
> > -------------------- WARNING ---------------------
> > MSG: Unexpected error in feature table for  Skipping feature, attempting
> to
> > recover
> > STACK Bio::SeqIO::genbank::next_seq
> > C:\Perl\src\bioperl\core/Bio\SeqIO\genbank.pm:583
> > STACK toplevel C:\Perl\Scripts\gb_test\test.pl:18
> > sequence length is 10
> > ====================================
> >
> > The sequence came back w/o any features in the feature table, which is
> what
> > I would expect from this error:
> > ====================================
> > LOCUS       MY_LOCUS                  10 aa            linear   linear
> > DEFINITION  my description.
> > ACCESSION   12345
> > KEYWORDS    .
> > FEATURES             Location/Qualifiers
> > ORIGIN
> >         1 atggagaact
> > //
> > ====================================
> >
> > Adding the extra line before the s/// didn't help any (warning still
> pops
> > up, no change in output).  Anybody out there with any ideas?
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> >>-----Original Message-----
> >>From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> >>bounces at lists.open-bio.org] On Behalf Of Scott Markel
> >>Sent: Thursday, March 30, 2006 7:18 PM
> >>To: bioperl-l at lists.open-bio.org
> >>Subject: [Bioperl-l] possible bug printing GenBank feature qualfiers
> >>
> >>In our upgrade from BioPerl 1.4 to 1.5.1 we tripped over the
> >>following.
> >>
> >>Annotation tags used by Bio::SeqIO::FTHelper were strings and
> >>are now Bio::Annotation::SimpleValue.  In the _print_GenBank_FTHelper
> >>subroutine of Bio::SeqIO::genbank the following code still
> >>assumes that tags are strings.
> >>
> >>    foreach my $tag ( keys %{$fth->field} ) {
> >>        foreach my $value ( @{$fth->field->{$tag}} ) {
> >>        $value =~ s/\"/\"\"/g;
> >>
> >>If the tag value was a zero, an empty string is written.
> >>
> >>We think that
> >>
> >>            $value = $value->{"value"};
> >>
> >>should be added before the s/// call.
> >>
> >>Here's our test case.  Note that the qualifier value for "foo"
> >>is changed to an empty string.
> >>
> >>Input file
> >>
> >>====================================
> >>LOCUS       MY_LOCUS                  10 aa            linear   UNK
> >>DEFINITION  my description.
> >>ACCESSION   12345
> >>FEATURES             Location/Qualifiers
> >>      misc_feature    1..10
> >>                      /foo="0"
> >>ORIGIN
> >>         1 atggagaact
> >>//
> >>====================================
> >>
> >>Perl code
> >>====================================
> >>use strict;
> >>use warnings;
> >>
> >>use Bio::SeqIO;
> >>
> >>my $inputFilename = "input.gbff";
> >>my $outputFilename = "output.gbff";
> >>
> >>my $in  = Bio::SeqIO->new(-file   => $inputFilename,
> >>                           -format => "genbank");
> >>my $out = Bio::SeqIO->new(-file => ">$outputFilename",
> >>                           -format => "genbank");
> >>
> >>my $sequence = $in->next_seq();
> >>$out->write_seq($sequence);
> >>====================================
> >>
> >>Output file
> >>====================================
> >>LOCUS       MY_LOCUS                  10 aa            linear   linear
> >>DEFINITION  my description.
> >>ACCESSION   12345
> >>KEYWORDS    .
> >>FEATURES             Location/Qualifiers
> >>      misc_feature    1..10
> >>                      /foo=""
> >>ORIGIN
> >>         1 atggagaact
> >>//
> >>====================================
> >>
> >>I'll add this to bugzilla, but first I want to make sure
> >>I'm not missing something obvious.
> >>
> >>Scott
> 
> --
> Scott Markel, Ph.D.
> Principal Bioinformatics Architect  email:  smarkel at scitegic.com
> SciTegic Inc.                       mobile: +1 858 205 3653
> 9665 Chesapeake Drive, Suite 401    voice:  +1 858 279 8800, ext. 253
> San Diego, CA 92123                 fax:    +1 858 279 8804
> USA                                 web:    http://www.scitegic.com



More information about the Bioperl-l mailing list