[Bioperl-l] Packaging bioperl for Fedora

Allen Day allenday at gmail.com
Fri Mar 30 20:30:27 EDT 2007

Hi Alex,

You've aptly noted that there are several classes of packages being
discussed here, and that they should not be treated equally.  From my
point of view and of specific relevance to the Bioperl community we
have at least:

1) "regular" CPAN dependencies and their occasional C/C++/Fortran
dependencies.  These should all be in Fedora Extras, as they are of
general utility.  Biopackages.net currently hosts about 200 packages
(.spec files, specifically) that are like this.  Maybe 80 of these are
needed for Bioperl.

2) academic packages, such as BLAT, NCBI Toolkit, CLUSTAL, genscan,
etc.  From what I've seen, these typically have strange/custom
licenses that may not be valid for some users.  BLAT has a dual
licensing scheme for academic and non-academic licensees, for
instance.  These packages are not of general utility.  For these two
reasons, my stance is that they should not be included in Fedora Extras.

3) Bioperl packages.  Several subsets here.  The Bioperl-run libraries
depend directly on type (2) packages, so aren't appropriate to include
in Fedora Extras.  Bioperl-live is not really that useful without type
(2) packages.  It is also sensible to keep all of the Bioperl-*
packages in the same repository.  For these reasons, my stance is that
they should not be included in Fedora Extras.

4) Bioinformatics / Comp. Bio. data sets.  These don't have licensing
problems, but they tend to be large.  Usually in the 10E7 - 10E10 byte
range.  RPM cannot even generate correct metadata for some of them
if the files are too large (overflow problems).  Probably
not appropriate to put in Fedora Extras because they are too large and
not generally useful.

5) Bioinformatics-specific System databases / daemons.  These
high-level packages depend on types (2), (3), and (4), and so are not
appropriate to put into Fedora Extras.  An example is a BLAT daemon,
which relies on the BLAT server, as well as NIB-formatted genome
sequence files.

That said, there are a lot of type (1) packages in the Biopackages.net
repository.  If you're interested in migrating the spec files from our
repository to the Fedora project it would save us (the Biopackages.net
maintainers) a ton of build and maintenance time, so please feel free
to take them, just let us know.  If we can reach some agreement on
where the bioinformatics-specific packages should be maintained/built
we may be able to work together on these as well.


On 3/30/07, Alex Lancaster <alexl at users.sourceforge.net> wrote:
> >>>>> "AD" == Allen Day  writes:
> AD> Hi Alex, The Biopackages.net project is still active, we are
> AD> regularly adding packages to it, mostly R packages lately.  Most
> AD> of the systems we use are running CentOS at this point, which is
> AD> why you have not seen support for FC6 yet.  There is nothing
> AD> preventing building FC6 packages aside from lack of time to set up
> AD> the FC6 build farm nodes.
> Hi Allen and others,
> Great news to hear that Biopackages.net is still active!  I would like
> to help out if possible.  I don't believe in "FUD" either... ;)
> AD> If you're interested in packaging BioPerl or other
> AD> bioinformatics-related software, please join the Biopackages
> AD> project on SourceForge.  We object to the Fedora Extras FUD
> AD> tactics used to discourage people from using 3rd party
> AD> repositories, and suspect they may not want to host some of our
> AD> data packages, such as the >2GB genome packages.  Biopackages
> AD> project is likely to partially merge with RPMForge.  We are
> AD> already discussing with them how best to do it.
> The packages that I created which are currently available in Fedora
> Packages are Perl dependencies which, as I said, are useful for
> packages outside the bioinformatics purview.  I do have a (base)
> bioperl package in review, but it is not yet released.
> As for third-party repos, I don't object to them at all, and for some
> kinds of projects they are indeed appropriate. (e.g. for non-free
> stuff like Livna or Freshrpms).  However, I do have practical concerns
> about repository mixing: I think that it does need to be handled
> carefully, but that co-operation between Fedora and third-party repos
> can make it work.
> For example, one practical concern is that as of the
> soon-to-be-released Fedora 7, Core+Extras will be merged, so there
> will be no distinction at the repository-level between formerly Extras
> packages and formerly Core packages (as of now there are only "Fedora
> Packages"), which means that it will not be possible for third-party
> repos to limit their dependencies to just those in a former base set
> (i.e. excluding Extras).
> I agree that a few years ago (circa 2003-2004) there was concern about
> the way some third party repositories were treated somewhat badly by
> the (then) Fedora Extras (with some people going so far as to say that
> third-party repos were bad in principle and should always be ignored,
> which I disagree with too).  But it seems to me that culture has
> shifted since, with some notable packagers such as Matthias Saou (of
> Freshrpms) and Axel Thimm (of Atrpms) now contributing packages to
> Fedora itself.  The process of contributing has also become much
> simpler and reviews are conducted speedily and efficiently; I had
> packages in the repository in a matter of a few days from initial
> submission.  Freshrpms itself now enables and depends on the (old)
> Extras.
> The real question for me, then, is which packages make sense to go
> in Fedora, and which should go in third-party repositories.  It seems
> to me that in the case of Perl packages which could be dependencies
> for other packages not specific to the third-party repo in question,
> it makes sense for them to go into Fedora itself, so I think I will
> continue to package them.  This lessens the load on the third-party
> repo, while making them available for all other third-party repos.
> (This is the approach that Freshrpms seems to be taking; Matthias has
> contributed most packages back to Fedora now other than the non-free
> ones).
> At the other end of the spectrum are packages like those you mention, genome
> packages, which may be of concern because of their size and/or highly
> specialised nature, and, as you say, may make sense to go in a
> third-party repo like Biopackages.net.  Also packages which can't be
> packaged by Fedora for legal reasons like Clustal could/should go in
> Biopackages.net.
> In the middle are packages like bioperl itself which are potentially
> useful to perhaps a wider group of people than the genome packages but
> may not necessarily be dependencies for other packages.  I lean
> towards making them part of Fedora so that they will be available out
> of the box on the planned "Everything" DVD ISO, but I welcome a
> discussion on this.
> As I said, I'm glad to hear that Biopackages.net is alive and well and
> I welcome a discussion on how upstream Fedora can usefully interact
> with Biopackages.net (I guess perhaps on the Biopackages.net list).
> Regards,
> Alex
> PS.  As the upstream author, if you could clarify the license on
> perl-SVG-Graph on CPAN (or on the mailing list), that would be great.
> --
> Alex Lancaster, Ph.D. | Ecology & Evolutionary Biology, University of Arizona
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
