Bioperl: article for Dr. Dobb's Journal

Ewan Birney
Fri, 9 Oct 1998 17:34:55 +0100 (BST)

On Fri, 9 Oct 1998, Lincoln Stein wrote:

> Ewan Birney writes:
>  > It's a very nice article. How much do you need cut out? Here are some
>  > suggestions:
> The article needs to be cut by about 50% (I'd actually asked to make
> it a two-parter originally, but got turned down).  If I cut out the
> alignment stuff there will still need to be some substantial trimming
> in the rest.  Alternatively, I could focus on the alignment algorithm
> entirely, and this is what the editor has suggested.  I hate leaving
> out all the OO stuff, however.

To be honest I think the OOP stuff is more important than the algorithm,
and the fact that Perl is the *ideal* language to glue things together and
provide a development 'framework' is v. important. But the algorithm might
look more sexy to people. I'd go with OOP-Perl, to make the point that Perl
is more than a web/systems glue language.


>  > I think your biggest saving would be to drop the alignment class stuff
>  > altogether. It's sad because that's where this stops being simple
>  > data structures and starts getting interesting (and of course, I find
>  > alignments v.interesting), but I think trying to explain OOP-perl,
>  > bioinformatics and dynamic programming all in one small article is taking
>  > on quite a job.
> Do you think the alignment part is strong enough to stand on its own?
> The code actually runs pretty slowly and uses a lot of memory (and
> uses a horrible trick in which strings are turned into numbers
> automagically).  Maybe I should focus on the algorithm and then show
> how it can be turned into an XS module.

Perhaps. Does DDJ really want an explanation of dynamic programming? It
isn't very 'perly' then, and a lot of people have written a lot about
dynamic programming (i.e. you'd have to watch out that you didn't tick off
some computer science types with your explanation - I tend to do this a lot).

I think it is foolish to write dynamic programming in Perl if it is a
serious thing to be used in anger. DP is a v. CPU-intensive algorithm
which is almost perfect for a RISC chip + a good C optimiser. I think the
algorithm -> C implementation + C API -> bioperl integration via XS is a
much more realistic example of this... It makes the article much more
'here is a complex algorithm that we want to provide sensibly for non-C
users to use'.
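
To make the 'C implementation + C API' step concrete, here is a minimal
sketch of the sort of C kernel one might wrap with XS. It is an illustration
only and is not the bioperl code: it assumes plain match/mismatch scores and
a linear gap penalty (a protein aligner would use a substitution matrix), it
returns only the best local score rather than an alignment, and the name
sw_score and the scoring constants are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scoring parameters, chosen only for illustration. */
enum { MATCH = 2, MISMATCH = -1, GAP = -2 };

static int max2(int a, int b) { return a > b ? a : b; }

/*
 * Smith-Waterman local alignment score of strings a and b.
 * Keeps only two rows of the DP matrix, so memory is O(strlen(b)).
 * Returns the best local score (>= 0), or -1 if allocation fails.
 */
int sw_score(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t n = strlen(a), m = strlen(b);
    int *prev = calloc(m + 1, sizeof *prev);  /* row i-1, starts as all zeros */
    int *curr = calloc(m + 1, sizeof *curr);  /* row i */
    int best = 0;

    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return -1;
    }

    for (size_t i = 1; i <= n; i++) {
        curr[0] = 0;
        for (size_t j = 1; j <= m; j++) {
            int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? MATCH : MISMATCH);
            int up   = prev[j] + GAP;      /* gap in b */
            int left = curr[j - 1] + GAP;  /* gap in a */
            /* Local alignment: a cell never drops below zero. */
            int s = max2(0, max2(diag, max2(up, left)));
            curr[j] = s;
            if (s > best)
                best = s;
        }
        /* Swap rows; the old prev is overwritten on the next pass. */
        int *tmp = prev;
        prev = curr;
        curr = tmp;
    }

    free(prev);
    free(curr);
    return best;
}
```

A function of this shape (two strings in, one integer out) is the easy case
for an XS wrapper, since no C structures have to cross into Perl; returning
a full alignment object is where the real glue work starts.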

I might point out of course that the current dump from the bioperl CVS
directory has a protein Smith-Waterman implementation written in C and
stuck in via XS - it produces a Bio::SimpleAlign object, which is a pure
Perl object. Quite an interesting starting point if you are looking for
pre-cooked implementations... (guess who wrote it <grin>)

>  > b) I think the point about perl is that not only is it a rapid development
>  > cycle but that existing command line based solutions can be worked into
>  > it, as can C based APIs (a la AcePerl and the bioperl alignment
>  > routines).
> Very good point.  I'll add that to the intro.

I've been claiming that Perl (not Java) is the ideal driver language for
'components' of code that you want to put together - some components
written in Perl, some in C/C++, some CORBA'ized. (I had some odd looks at
Objects in Bioinformatics when I said that...).

There are lots of things you can focus on in this article. I guess you're
going to have to weigh up 'readability', 'sexiness' and 'importance'.

I'm happy to reread anything if you like. Have fun!

> Lincoln
> -- 
> ========================================================================
> Lincoln D. Stein                           Cold Spring Harbor Laboratory
>                                            Cold Spring Harbor, NY
> ========================================================================

Ewan Birney
