[Bioperl-l] TFBS databases, Bio::Matrix::PSM suitable?

Chris Fields cjfields at uiuc.edu
Tue Aug 22 11:27:23 EDT 2006

> > The test files are good only if there is access to the full data set.
> > By their nature, test files can span only a representation of
> > multiple scenarios to check the installation validity, this in no way
> > could be a check for synchronization between the full data set and
> > the code.
> I'm not sure what you mean. Do you think that before a genbank parser
> can be released, all genbank files in existence must be supplied in the
> test suite to ensure it really does work on everyone's machine? The test
> data need only be representative, and if it isn't good enough and a user
> discovers a problem, a bug is reported and fixed as normal.

I think he means that the full set of data needs to be publicly accessible.
If that were true then we couldn't use several of the SeqIO modules either
(lasergene and some of the chromatogram classes come to mind).  

My opinion: if the data is publicly available and we can use a subset for
testing, I have no problem with what Sendu proposes.  If there are
copyright/patent/legal issues, we must figure out what the restrictions are
on their use (i.e. do the restrictions cover the data, the format, any use
of the format (including manipulation, the use of TRANSFAC as a name, etc.))?


> Who says it won't be maintained? I will maintain it. The very second I
> can no longer maintain it and no one else can, it can be deprecated to
> avoid clutter. I don't see the problem. But in any case see below -
> anyone could probably maintain it.

Personally, I don't think this will be a problem.  

> > I agree that a transfac module is necessary and useful (this is why I
> > started developing one in the first place)  in general but I doubt it
> > is reasonable to support one without access to the underlying data
> > structure.
> I have access to the pro data files. Everyone has access to
> http://www.biobase-international.com/pages/index.php?id=117 which I
> think documents changes since the last version (in this case, there were
> no changes to the data format since 10.1). Everyone has access to the
> websites.

But Stefan has something of a point: if there are restrictions on the use of
the data, then we must find out specifically what those restrictions are and
play fair by their requirements.

