[Bioperl-l] `get_feature_by_name` not working after migrating to Bio::DB::SeqFeature::Store from a Bio::DB::GFF backend

Scott Cain scott at scottcain.net
Wed Feb 1 14:43:56 EST 2012


Hi Vivek,

What I don't understand from the error stack that you copied in your
email is that it doesn't mention the get_features_by_name method,
rather it mentions the segment method--are you sure that the line
number reported in your get_annotation_db_features method corresponds
to the change you made in the get_features_by_name call?  I get the
feeling that it is dying somewhere else now.  I would say that
resorting to adding class to the feature names is unlikely to help.

Scott


On Wed, Feb 1, 2012 at 11:51 AM, Vivek Krishnakumar
<vivekkrishnakumar at gmail.com> wrote:
> Hi Scott,
>
> Thanks very much for your suggestions. Looks I did miss it somehow
> (confusion was caused because I was using both bioperl-l at googlegroups and
> bioperl-l at open-bio)
>
> Anyway, I had modified my function exactly like your suggestion:
>
> my ($locus_obj) = $gff_dbh->get_features_by_name(-name => $locus, -type =>
> 'gene');
>
> But doing so just returns the following error:
>
> -------------------- EXCEPTION --------------------
> MSG: segment() called in a scalar context but multiple features match.
> Either call in a list context or narrow your search using the -types or
> -class arguments
>
> STACK Bio::DB::SeqFeature::Store::segment
> /usr/local/packages/perl-5.10.1/lib/5.10.1/Bio/DB/SeqFeature/Store.pm:1322
> STACK main::get_annotation_db_features
> /opt/www/medicago/cgi-bin/medicago/eucap/eucap.pl:899
> STACK main::structural_annotation
> /opt/www/medicago/cgi-bin/medicago/eucap/eucap.pl:660
> STACK toplevel /opt/www/medicago/cgi-bin/medicago/eucap/eucap.pl:119
> -------------------------------------------
>
> which would suggest to oneself that there are several such features with the
> same ID. But in fact, I was able to verify by querying the database that I
> have only one such locus.
>
> As for your question regarding how $locus is populated, it is populated from
> a CGI parameter passed to the script. I know that I am only passing it one
> locus ID. And as I mentioned earlier in this thread, the warning statement I
> inserted before making the function call shows me that there is only one ID
> in the $locus variable.
>
> My last resort now is to try as you suggested and modify my GFF3 file and
> embed the -class => 'Gene' into the "Name" attribute. While doing so, should
> I also embed the 'mRNA' class into the "Name" attribute of the mRNA feature
> like so:
>
> chr2    working_models  mRNA    30427563    30429139    .   -   .
> ID=mrna_36255;Parent=gene_35804;Name=mRNA:Medtr2g097580.1;conf_class=F
>
> Subsequently, should I modify the function call to include the 'class':
> my ($locus_obj) = $gff_dbh->get_features_by_name(-name => $locus, -type =>
> 'gene', -class => 'Gene');
>
> Thank you.
> Vivek
>
>
> On Wed, Feb 1, 2012 at 11:26 AM, Scott Cain <scott at scottcain.net> wrote:
>>
>> Hi Vivek,
>>
>> I responded to your original email and I suspect you may have missed
>> it.  I'll copy it below.  Another few things: how does $locus get
>> populated?  Are you sure what you expect to be there is?
>>
>> Also, to answer your question about the other bioperl modules you're
>> using: no, I don't think that's interfering.
>>
>> Scott
>>
>> ------------------------------------------------
>> Hello Vivek,
>>
>> In your GFF3, you don't have any features of class "Gene".  In GFF2,
>> the class was the text string that started the ninth column, like
>> this:
>>
>> chr2   .  gene    30427563    30429139    .   -   .  Gene 35804
>>
>> where the class would be Gene and the name (also called group) would
>> be 35804.  Class is not a particularly well defined concept in GFF3,
>> so the easiest way to restore functionality to your script is to
>> change the call from this:
>>
>>  my ($locus_obj) = $gff_dbh->get_feature_by_name('Gene' => $locus);
>>
>> to this:
>>
>>  my ($locus_obj) = $gff_dbh->get_features_by_name(-name => $locus,
>> -type => 'gene');
>>
>> I believe (though haven't tested it myself in a very long time) that
>> you can embed class in the name of the feature, like this:
>>
>> chr2  .  gene    30427563    30429139    .   -   .
>>  Name=Gene:Medtr2g097580
>>
>> which may or may not be easier, depending on your data and your code base.
>>
>> Scott
>>
>>
>> On Wed, Feb 1, 2012 at 10:50 AM, Vivek Krishnakumar
>> <vivekkrishnakumar at gmail.com> wrote:
>> > Hi Lincoln,
>> >
>> > Thanks very much for your suggestions. Not sure how the single quotation
>> > marks appeared around the $locus variable. But looks like it was only in
>> > the email. Fortunately did not have quotes around the variable in my
>> > original code.
>> >
>> > Now, when I switch over to 'get_features_by_name()', my script does not
>> > run
>> > to completion.
>> >
>> > I want to mention that this snippet of code is part of a larger CGI
>> > script
>> > that interfaces with the SeqFeature backend DB. When I modify the
>> > function
>> > call to $gff_dbh->get_features_by_name($locus), the script just runs
>> > indefinitely and returns absolutely nothing. I did put in a warn
>> > statement
>> > to see if the correct locus ID is being passed to the function (I am
>> > able
>> > to see the warning message in my apache error log), which seems to be
>> > fine.
>> > But the moment it reaches the function call step, the CGI script freezes
>> > up
>> > and I am unable to do anything. It just ends up as a rogue process owned
>> > by
>> > the 'daemon' user and continues to use up a lot of memory.
>> >
>> > I am using the following BioPerl modules in this CGI script:
>> > use Bio::SeqIO;
>> > use Bio::SearchIO;
>> > use Bio::DB::SeqFeature::Store;
>> > use Bio::SeqFeature::Generic;
>> > use Bio::Graphics;
>> > use Bio::Graphics::Feature;
>> >
>> > Could any of these be interfering with get_features_by_name()?
>> >
>> > Thank you.
>> > Vivek
>> > _______________________________________________
>> > Bioperl-l mailing list
>> > Bioperl-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>
>> --
>> ------------------------------------------------------------------------
>> Scott Cain, Ph. D.                                   scott at scottcain
>> dot net
>> GMOD Coordinator (http://gmod.org/)                     216-392-3087
>> Ontario Institute for Cancer Research
>
>



-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research



More information about the Bioperl-l mailing list