[Bioperl-l] Use of Bio namespace

Ewan Birney birney@ebi.ac.uk
Tue, 10 Oct 2000 13:33:33 +0100 (GMT)

On 10 Oct 2000, Keith James wrote:

> >>>>> "Ewan" == Ewan Birney <birney@ebi.ac.uk> writes:
>     Ewan> I would hope that sequence/feature stuff could be merged
>     Ewan> inside bioperl but there
>     Ewan> 	(a) may be good reasons not to
>     Ewan> 	(b) you may not want to ;)
>     Ewan> both of which are sensible complaints.
> I've been having a look at making my Sequence class SeqI compliant and
> my Feature class SeqFeatureI compliant. It's been a bit tricky trying
> to work out what is the best way to treat fuzzy ranges (which I've
> supported) in a bioperl Seq.

Right. This is the first step...

>     Ewan> I would also encourage you to
>     Ewan> 	- if possible, work with bioperl or criticise bioperl
>     Ewan> if it wasn't good enough for what you wanted to do.
> It seems like bad form to criticise when I haven't contributed very
> much to bioperl (if I don't like it, I should fix it...). I had a go
> at hacking bioperl a while back but found my limitations (never
> written a Perl module, knew nothing about OO coding) so I needed to
> write some stuff from scratch to see how it all worked.

I suspect that if I can talk you into rolling your stuff into bioperl then
that would be great. Remember in Bioperl the motto is

  "The person who codes it wins the argument"

so if you're motivated to do something and it doesn't break the tests, then
check it in!

I'll complain sometimes; sometimes with reason, but if someone wants to
put something in then that is ok by me!

> Stuff I wanted was:
>  Non-fussy but fairly complete EMBL parsing

I think we are closer to that, but there are still issues:

   (a) having this CDS_span with sub-features. Perhaps bad.

   (b) not handling fuzziness at all.

(I *hate* fuzziness. But I guess some people use it.)

Knowing what you do, Keith, I would expect that, in addition, your parsers
can make more "assumptions" about how to interpret the GenBank file, as
you know precisely how you want to use it. (Or am I wrong?)

>  Terse, but intuitive manipulation of feature qualifiers in scripts

I don't like this part of Bioperl 100% either. What is your suggestion?

>  Features with & without sequence

That is handled ok in theory, just small changes required.

>  Clone, trim, reverse-complement sequences with all the features
>  attached

We could do this. I have held off on it because it can lead to serious
complications for complex feature-compliant objects (cloneability of
features). I can now see a "way through".

>  Fuzzy ranges (parsed from EMBL, supported in other operations)

This could be a bugbear. What is your object model here?

>  Low memory Blast parsing

We have this now (BPLite).

>  Fasta search output parsing

We'd love to get this in...

> I'm in a better position to work on bioperl now, but still find a lot
> of it hard to follow (esp. where the methods have no documentation -
> this isn't just me, I know others who have been discouraged from
> working on it for this reason).

Indeed. Though... to be honest... I have seen projects with less
documentation than bioperl. I think that there is an energy barrier to get
over, but it ain't as bad as elsewhere.

> As I'm sure you can appreciate, there is the time aspect to this as
> well. Annotation projects need to keep to deadlines and if writing a
> new module is significantly quicker than modifying an existing one,
> that's the way it goes.

Yup. Somewhere this curve changes and working collaboratively gives you
payback in the 2-3 month look-ahead for development. And sometimes the
win is *huge* (like - someone has completely written something that does

> To be honest, these modules were not originally intended for release
> (hence their cutesy and non-CPAN acceptable names). However they have
> since been used in some scripts (cos we've found them easier than
> bioperl) which we now need to distribute, so the issue has come up. I
> would prefer to integrate at some point, if possible.

It should be possible. I guess we should go for coffee sometime, right?

> cheers,
> -- 
> -= Keith James - kdj@sanger.ac.uk - http://www.sanger.ac.uk/Users/kdj =-
> The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA

Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420