[Bioperl-l] reading and writing GFF3

Scott Cain cain at cshl.edu
Tue Jun 20 12:03:26 EDT 2006


Hi Hilmar,

Of course you are right--I was under the influence of a Perl module that
I work with that does something similar, but both of your solutions are
better.
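
To check that I am reading the factory suggestion correctly, here is a
minimal sketch of the idea in plain Perl.  It is not BioPerl code: the
My::* packages, the -type_map argument, and the plain hash standing in
for a real ontology object are all assumptions made up for illustration.
The only point is that the conversion logic (mapping a tag to an ontology
term and failing loudly when there is no match) lives in the factory,
while the typed feature class just holds the data.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an ordinary, untyped feature (NOT a BioPerl class).
package My::PlainFeature;
sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}
sub primary_tag { $_[0]{primary_tag} }
sub start       { $_[0]{start} }
sub end         { $_[0]{end} }

# Hypothetical strongly typed feature: it only represents the data.
package My::TypedFeature;
sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}
sub type  { $_[0]{type} }
sub start { $_[0]{start} }
sub end   { $_[0]{end} }

# Hypothetical factory: all of the conversion logic lives here.
package My::TypedFeatureFactory;
sub new {
    my ($class, %args) = @_;
    return bless {
        # Assumption: a hash of allowed terms instead of an ontology object.
        ontology => $args{-type_ontology} || {},
        # Assumption: an optional map from old tags to ontology terms.
        type_map => $args{-type_map} || {},
    }, $class;
}
sub create_object {
    my ($self, $args) = @_;
    my $feature = $args->{-feature}
        or die "create_object needs a -feature argument\n";
    my $tag  = $feature->primary_tag;
    my $type = exists $self->{type_map}{$tag} ? $self->{type_map}{$tag} : $tag;
    # Refuse to build a typed feature whose type is not a known term.
    die "cannot convert feature: type '$type' is not in the ontology\n"
        unless $self->{ontology}{$type};
    return My::TypedFeature->new(
        type  => $type,
        start => $feature->start,
        end   => $feature->end,
    );
}

package main;

my $factory = My::TypedFeatureFactory->new(
    -type_ontology => { repeat_region => 1, gene => 1 },
    -type_map      => { repeat => 'repeat_region' },
);

# A tag that maps onto a known term converts cleanly.
my $typed = $factory->create_object({
    -feature => My::PlainFeature->new(primary_tag => 'repeat', start => 10, end => 42),
});
print join("\t", $typed->type, $typed->start, $typed->end), "\n";

# A tag with no matching term fails with an informative message.
my $ok = eval {
    $factory->create_object({
        -feature => My::PlainFeature->new(primary_tag => 'mystery', start => 1, end => 2),
    });
    1;
};
print "conversion failed: $@" unless $ok;
```

With something along these lines, a GFF3 writer would only ever see
features that had passed through create_object, which is also where the
informative error Rob asked for would be raised.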

I wasn't familiar with Bio::SeqFeature::TypedSeqFeatureI; I'll take a
look this week.

As for next week, I plan on spending the day at NESCent on Wednesday
(though I haven't told Todd or Jeff that I am arriving early yet) just
to make sure all the details are in place.  I imagine I'll have a fair
amount of free time to hash this stuff out.  Anyone else who is in town
(that is, in Durham, NC, USA) is welcome to come draw on a whiteboard
too. :-)

Scott


On Sat, 2006-06-17 at 12:20 -0400, Hilmar Lapp wrote:
> 
> You don't need a new method for this. Instead, support a -feature  
> argument.
> 
> 	my $bsfa = Bio::SeqFeature::Annotated->new(-feature => $feature);
> 
> This should work for any instance of Bio::SeqFeatureI. If it is a  
> B::SF::Annotated already it is obviously just a deep copy (if copy is  
> desired - could be another parameter). Otherwise more will be involved.
> 
> Alternatively, and possibly better, is to write a specialized  
> SeqFeatureI factory (that would implement  
> Bio::Factory::ObjectFactoryI) and then delegate this job to it:
> 
> 	my $feat_factory = Bio::SeqFeature::TypedFeatureFactory->new(
> 		-type_ontology => $sequence_ontology,
> 		-source_ontology => $feature_source_ontology,
> 		-unflatten => 1);
> 	my $bsfa = $feat_factory->create_object({-feature => $feature});
> 
> This is preferable because it separates business logic that isn't  
> necessarily related into defined units. I.e., the logic necessary to  
> convert an ordinary feature into a strongly typed one is different  
> from how to represent a strongly typed feature. IMHO anyway ...
> 
> Also, don't dismiss the Bio::SeqFeature::TypedSeqFeatureI that Ewan  
> started as the result of a discussion thread earlier this (or last?)  
> year. Bio::SeqFeature::Annotated as such may as well be obsoleted,  
> though not in concept.
> 
> Maybe we need to get together again and thrash out a strategy; or a
> BOF at the GMOD meeting? I feel this needs a core group of people who
> care to hash out a strategy that also solves the backwards
> compatibility problem with the current Bio::SeqFeatureI state of
> limbo, and that lets us implement the decisions with a few people in a
> concentrated effort. This would then also remove the only real large
> stumbling block towards a 1.6 release.
> 
> Maybe we should think about a little pre-GMOD hackathon to clear up
> this mess? Scott, you'll be there a day early? I'll be already back
> and Jason I believe will still be in town, although he may have other
> commitments already. Nonetheless, it shouldn't really take that much,
> just dedicated time, a whiteboard, and a few people who care
> thrashing this out and then doing it.
> 
> Thoughts?
> 
> 	-hilmar
> 
> On Jun 16, 2006, at 11:56 PM, Scott Cain wrote:
> 
> > Rob,
> >
> > I came to the same conclusion as well; I wrote my response as I was
> > heading out the door and while I was running errands, I realized the
> > right thing to do is to write a Bio::SeqFeature::Annotated method
> > called new_from_object, whose usage would be:
> >
> >   my $my_BSFA = Bio::SeqFeature::Annotated->new_from_object($my_BSFI, %args);
> >
> > where you would give it a Bio::SeqFeatureI compliant object and try to
> > create a BSFA like you suggested below.  You could allow passing in args
> > to control how different things are handled, like mapping non-SO types
> > to SO types.  I'll think about this over the weekend and let you know if
> > brilliance strikes me.
> >
> > Scott
> >
> >
> > On Fri, 2006-06-16 at 13:31 -0700, Robert Buels wrote:
> >> Rather than cobble together some ad-hoc solution, I would be interested
> >> in working on a good solution to this problem, because it seems like
> >> it's just going to get more common as more people start wanting to write
> >> GFF3.  What about some code in whatever customarily makes these objects
> >> (probably BSF::Annotated's new() method?) that could take another type
> >> of Feature object and attempt to shoehorn its data into a new
> >> BSF::Annotated?  If it failed (because the type isn't in SO or
> >> whatever), it could throw() some informative error message.
> >>
> >> Then, people could write straightforward code something like:
> >>
> >> while (my $oldstylefeature = $features_in->next_feature) {
> >>     $oldstylefeature->primary_tag('something_that_is_in_so');
> >>     $oldstylefeature->something_else('some other something that needs to be changed for compliance');
> >>     my $newfeature = Bio::SeqFeature::Annotated->new($oldstylefeature);
> >>     $gff3_out->write_feature($newfeature);
> >> }
> >>
> >> Does that sound like a good idea?  I'd be more than willing to implement
> >> this, since I'm going to need to do this sort of thing with many more
> >> things than just RepeatMasker.
> >>
> >> Rob
> >>
> >> Scott Cain wrote:
> >>> Um, yeah, good question.  The reason I didn't answer you when you wrote
> >>> before is that I was hoping for divine inspiration for an answer (or for
> >>> somebody else to answer, which would have been really great :-)
> >>>
> >>> The short answer (and easy one for me to type) is that you will probably
> >>> need an ad hoc method to do it, which is the same thing I do when I need
> >>> to convert gff2 to gff3, to make sure the things I need mapped get
> >>> mapped the 'right' way (that is, the way I want them to go).  I don't
> >>> have any sample code that does this, but if you want to start working up
> >>> an ad hoc method, I will certainly try to help you as much as I can.
> >>>
> >>> Scott
> >>>
> >>>
> >>> On Fri, 2006-06-16 at 12:34 -0700, Robert Buels wrote:
> >>>
> >>>> So about that converting ye olde feature objects into
> >>>> Bio::SeqFeature::Annotated objects.  How do I do it?
> >>>>
> >>>>
> >>>> Scott Cain wrote:
> >>>>
> >>>>> That's OK--You added a few items that should be escaped that weren't,
> >>>>> so I added those too.
> >>>>>
> >>>>> Thanks,
> >>>>> Scott
> >>>>>
> >>>>>
> >>>>> On Fri, 2006-06-16 at 12:30 -0700, Robert Buels wrote:
> >>>>>
> >>>>>
> >>>>>> Whoops, I should have said something about that.  I submitted it
> >>>>>> before I saw that Scott had already done the escaping in CVS.
> >>>>>>
> >>>>>> Chris Fields wrote:
> >>>>>>
> >>>>>>
> >>>>>>> Scott,
> >>>>>>>
> >>>>>>> Looks like Robert also submitted a bug report related to this as well
> >>>>>>> ------------------------------------------------------------------------
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> Bioperl-l mailing list
> >>>>>>> Bioperl-l at lists.open-bio.org
> >>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> > -- 
> > ------------------------------------------------------------------------
> > Scott Cain, Ph. D.                                         cain at cshl.edu
> > GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> > Cold Spring Harbor Laboratory
> 
> --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory