[Bioperl-l] Bio::DB::SeqFeature::Store::memory -> filter_by_type very slow

Lincoln Stein lincoln.stein at gmail.com
Fri Feb 5 10:46:02 EST 2010


I think the problem with the filter function is that the type you request
may be "BAC" but the feature's type is "BAC:FPC", and you want to be able to
filter by the more generic type terms. Nevertheless I'm sure we can do
better than 60 min running time and so I'll have to look at how this
function works more carefully. I can't do this right now, unfortunately, so
perhaps someone on the mailing list would be willing to take a look?
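
For anyone who picks this up, here is a minimal, self-contained sketch of the matching rule described above. It does not use BioPerl and is not the actual filter_by_type code. It assumes a feature type has the form "method:source", and that a request without a source (e.g. "BAC") should match any source. The hash-based features and the names type_matches and filter_by_type_sketch are hypothetical, for illustration only.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for features; each carries a "method:source" type.
my @features = (
    { name => 'b1', type => 'BAC:FPC'   },
    { name => 'b2', type => 'BAC:other' },
    { name => 'g1', type => 'gene:FPC'  },
);

# True when the feature type satisfies the requested type.
# A request with a source ("BAC:FPC") must match exactly;
# a request without one ("BAC") matches on the method part only.
sub type_matches {
    my ($wanted, $type) = @_;
    return $type eq $wanted if index($wanted, ':') >= 0;
    my ($method) = split /:/, $type, 2;
    return $method eq $wanted;
}

# Assumed helper: keep only the features whose type matches the request.
sub filter_by_type_sketch {
    my ($wanted, @feats) = @_;
    return grep { type_matches($wanted, $_->{type}) } @feats;
}

my @generic = map { $_->{name} } filter_by_type_sketch('BAC',     @features);
my @exact   = map { $_->{name} } filter_by_type_sketch('BAC:FPC', @features);

print "generic: @generic\n";
print "exact: @exact\n";
```

A plain string comparison like this costs one pass over the features, so it should stay close to the speed of the grep workaround quoted below while still allowing the generic "BAC" request.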

Lincoln

On Mon, Feb 1, 2010 at 7:24 AM, Jelle Scholtalbers <j.scholtalbers at gmail.com> wrote:

> Hi,
> I used the Bio::DB::SeqFeature::Store::memory module to load in a GFF3 file
> which I could then use in my script in a 'queryable' way. To retrieve
> features I used for example
>        $db->features(-type => 'BAC:FPC', -seq_id=>'chromosome0')
> However, when profiling my script I found that 60% of the
> running time went into filter_by_type from
> Bio::DB::SeqFeature::Store::memory.
> Replacing this call with
>     my @features = grep { $_->type eq 'BAC:FPC' } $db->features(-seq_id=>'chromosome0');
> which gave me the same results, took just a fraction of the earlier run time.
> My script went from 60 min. to 4 min. for the same result by changing only
> this call (it is called often).
> Can/Should this be fixed, or is this just the faster way to do it?
>
> Cheers,
> Jelle



-- 
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <Renata.Musa at oicr.on.ca>

