Naming the modules; Mailing lists

Georg Fuellen fuellen@dali.Mathematik.Uni-Bielefeld.DE
Fri, 21 Feb 1997 12:12:00 +0000 (GMT)

Hi SteveB,

you wrote,
> [...] 
> GF> I even consider Bio::Aln to be sufficiently general to
> GF> process alignments of numeric data, linguistic data, etc !
>   Indeed, right now it can hold any sort of data -- but that is perhaps a
> weakness!  This is a bioperl object and it should have features for
> supporting biological sequences.  Right now, the object lacks support
> for many types of operations that people would want to do on proteins.

OK, let's call the module Bio::UnivAln, for its universality. :-)
And let's call the module you're envisioning Bio::Seq::ProtAln.

> [...] As noted above, I suggest  Bio::Seq::NucAln and Bio::Seq::ProtAln, and 
> -- if possible -- the two are merged into Bio::Seq::Aln at some time in
> the future.  I don't see why you want to preempt future improvements.

It seems that protein researchers want fast access to data related to
the protein sequences. In the current Bio::Aln design, this can be done
by storing the data in $self->{'names'}{'seqs'}[$seq_index] (as is
currently done for the ID and description of the individual sequences),
or in a separate column/row. Maybe there are even better ways; we need
to brainstorm about this. (It's still unclear to me why the current
design makes your needs difficult to accomplish, but let's assume it does.)
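
To make the first option concrete, here is a minimal sketch. It assumes a
plain hash as a stand-in for the alignment object; only the
{'names'}{'seqs'}[$seq_index] path comes from the current design, while
the 'seqs' key, the field names, the data and the seq_info() helper are
made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the alignment object. This is NOT the real
# Bio::Aln layout; only the {'names'}{'seqs'}[$i] path is taken from above.
my $aln = {
    seqs  => [ 'MKV-LA', 'MKVQLA' ],   # aligned rows (made-up data)
    names => { seqs => [] },           # one slot per sequence
};

# Store an ID and description for each sequence, keyed by its row index.
$aln->{'names'}{'seqs'}[0] = { id => 'seqA', desc => 'first example protein' };
$aln->{'names'}{'seqs'}[1] = { id => 'seqB', desc => 'second example protein' };

# Hypothetical accessor: look up one field for one sequence index.
# Returns undef if nothing is stored for that index.
sub seq_info {
    my ($self, $seq_index, $field) = @_;
    my $slot = $self->{'names'}{'seqs'}[$seq_index] or return;
    return $slot->{$field};
}

print seq_info($aln, 1, 'id'), "\n";
```

Any further per-sequence data could go into the same slot, so a lookup
stays a single index operation.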

For phylogeny calculations, you need fast and flexible access to the
individual columns/rows of the alignment, and you need lots of slicing and
mapping of evaluation functions onto these columns/rows.
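
As a rough illustration of this kind of access, assuming the alignment is
held as a plain array of arrays (not the actual Bio::Aln internals), with
column() and is_invariant() as hypothetical helper names and made-up data:

```perl
use strict;
use warnings;

# Made-up alignment: each row is one aligned sequence of equal length.
my @rows = ( 'ACGT-A', 'ACGTTA', 'AC-TTA' );

# Split into a matrix (array of array refs) for cell-wise access.
my @matrix = map { [ split //, $_ ] } @rows;

# Hypothetical helper: return column $j as a list of characters.
sub column {
    my ($m, $j) = @_;
    return map { $_->[$j] } @$m;
}

# Hypothetical evaluation function: true if all characters are identical.
sub is_invariant {
    my %seen;
    $seen{$_} = 1 for @_;
    return keys(%seen) == 1;
}

# Map the evaluation function onto every column and keep the indices
# of the columns that pass.
my $ncols     = @{ $matrix[0] };
my @invariant = grep { is_invariant( column(\@matrix, $_) ) } 0 .. $ncols - 1;

print "invariant columns: @invariant\n";
```

The same pattern works for rows, or with any other evaluation function
in place of is_invariant().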

> Storing it as the raw sequence (without clear reference to the original
> sequence and attachments) is the problem. Suffice it to say that to get
> things to work for many protein operations, there would need to be
> relators for virtually every operation.  This would be complicated and
> inefficient. 

Maybe the fact that Bio::UnivAln can hold any sort of data can be put to
use here? I mean, you could perhaps put the additional data into the
zeroth column / zeroth row?!

> I think that anything relying on PerlDL should be a 'special feature' and
> not built into the core of the "basic" alignment module.  This is because
> PerlDL is non-trivial to install, which will prevent its use by a large
> fraction of potential bioperl users.  That said, I would of course
> heartily endorse development of modules using PerlDL if that does prove to
> be more efficient and effective.

Agreed; I hope PerlDL will be easy to install in a year or so.
(I've spent many hours trying to install it, with limited success,
so I can relate to this :-)

best wishes,