[Bioperl-l] RFC: Bio::App::SELEX::RNAmotifAnalysis

Bottoms, Christopher A BottomsC at missouri.edu
Wed Aug 22 18:12:49 EDT 2012

Dear BioPerl community,

I developed this application for a research lab here at the University of Missouri. I would like to know whether the design sounds reasonable and whether it is okay to use the "Bio" namespace.

Thank you for all you do.


Christopher Bottoms

        perl perl5/lib/perl5/Bio/App/SELEX/RNAmotifAnalysis.pm --infile simple.seqs --cpus 4 --run

        This module pipelines steps in the analysis of SELEX (Systematic Evolution
        of Ligands through EXponential enrichment) data.

        This main module creates scripts to do the following: 

        (1) Cluster similar sequences based on edit distance.

        (2) Align sequences within each cluster (using mafft).

        (3) Calculate the secondary structure of the aligned sequences (using
            RNAalifold, from the Vienna RNA package) 

        (4) Build covariance models using cmbuild from Infernal. 
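
        To illustrate step (1), here is a minimal Python sketch of clustering
        by edit distance. It is not the code used by this module: the greedy
        strategy, the function names (edit_distance, cluster_sequences), and
        the max_distance threshold are all assumptions made for illustration.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions that turn a into b."""
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cluster_sequences(seqs: List[str], max_distance: int) -> List[List[str]]:
    """Greedy clustering (an assumed strategy): each sequence joins the
    first cluster whose founding sequence is within max_distance edits,
    otherwise it founds a new cluster."""
    clusters: List[List[str]] = []
    for seq in seqs:
        for cluster in clusters:
            if edit_distance(seq, cluster[0]) <= max_distance:
                cluster.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters


if __name__ == "__main__":
    demo = ["ACGUACGU", "ACGUACGA", "GGGGCCCC", "ACGAACGU", "GGGGCCCA"]
    for number, members in enumerate(cluster_sequences(demo, max_distance=2), start=1):
        print(number, members)
```

        With max_distance=2, the five demo sequences fall into two clusters:
        the three ACGU-like sequences and the two GGGGCCC-like sequences.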

        The module Bio::App::SELEX::CovarianceSearch can also be used to create
        scripts for doing iterative refinements of covariance models.

        perl perl5/lib/perl5/Bio/App/SELEX/RNAmotifAnalysis.pm --infile simple.seqs --cpus 4 --run

        (The file 'simple.seqs' should only contain sequences, one per line.)
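
        As an illustration of the expected input, the following Python sketch
        reads sequences given one per line and renders them as FASTA text. The
        function names and the 'seq_N' header scheme are hypothetical; the
        headers that this module actually writes are not described here.

```python
from typing import Iterable, List


def read_sequences(lines: Iterable[str]) -> List[str]:
    """Return one sequence per non-blank line, with surrounding whitespace removed."""
    return [line.strip() for line in lines if line.strip()]


def to_fasta(seqs: List[str], prefix: str = "seq") -> str:
    """Render sequences as FASTA text with headers prefix_1, prefix_2, ...
    (the header naming is an assumption made for this sketch)."""
    records = [f">{prefix}_{number}\n{seq}" for number, seq in enumerate(seqs, start=1)]
    return "\n".join(records) + "\n" if records else ""


if __name__ == "__main__":
    # Stand-in for the lines of a file such as 'simple.seqs'.
    example_lines = ["ACGUACGU\n", "\n", "GGGGCCCC\n"]
    print(to_fasta(read_sequences(example_lines)), end="")
```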

        This will cluster the sequences found in 'simple.seqs' and create a FASTA
        file for each cluster. The FASTA files will be grouped into batches (one
        batch per CPU requested). Each batch is placed in its own directory and
        processed within that directory. At the end of processing, each cluster
        will have a covariance model and PostScript illustration files. The
        batch script used to process each batch will be located in the
        respective batch directory. To produce the scripts without running them,
        simply omit the --run flag from the command line.
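
        The grouping of FASTA files into one batch per CPU can be pictured with
        the short Python sketch below. It assumes a round-robin assignment and
        uses hypothetical file names; how this module actually distributes the
        files among batches may differ.

```python
from typing import List


def make_batches(fasta_files: List[str], cpus: int) -> List[List[str]]:
    """Split files into at most `cpus` batches, dealing them out round-robin
    (an assumed policy, chosen only for illustration)."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    # Never create more batches than there are files to put in them.
    batches: List[List[str]] = [[] for _ in range(min(cpus, len(fasta_files)))]
    for index, name in enumerate(fasta_files):
        batches[index % len(batches)].append(name)
    return batches


if __name__ == "__main__":
    # Hypothetical cluster file names.
    files = [f"cluster_{n}.fasta" for n in range(1, 7)]
    for number, batch in enumerate(make_batches(files, cpus=4), start=1):
        print(f"batch_{number}: {batch}")
```

        With six files and --cpus 4, this yields four batches holding two,
        two, one, and one file respectively.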

        As written, this code makes heavy use of UNIX utilities and is
        therefore only supported on UNIX-like environments (e.g. Linux, UNIX, Mac
        OS X).

        Install Infernal, MAFFT, and the Vienna RNA package ahead of time and add
        the directories containing their executables to your PATH, so that the
        first time you run RNAmotifAnalysis.pm a configuration file (cluster.cfg)
        will be generated for you with all of the correct parameters. Otherwise,
        you'll need to update your cluster.cfg file manually. 

        After installing the mafft, Infernal, and Vienna RNA packages, add the
        directories in which their executables reside to your PATH.
        For example, assuming that the mafft executable is located in the directory
        '/usr/local/myapps/bin/', you would want to add that directory to your
        PATH. To make sure this is done every time you open a terminal window,
        add the export command to your .bashrc file, thus:

            echo 'export PATH=/usr/local/myapps/bin:$PATH' >> ~/.bashrc

        Then, to make it effective immediately, you can source your .bashrc file:

            source ~/.bashrc
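
        To confirm that the PATH changes took effect, a small check along the
        following lines may help. This Python sketch is not part of the module;
        it assumes the executables are named mafft, RNAalifold, and cmbuild (the
        names used in this document) and relies on the standard library
        function shutil.which.

```python
import shutil
from typing import Iterable, List, Optional

# Executable names taken from the tools named in this document.
REQUIRED_TOOLS = ("mafft", "RNAalifold", "cmbuild")


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS,
                  path: Optional[str] = None) -> List[str]:
    """Return the tools that cannot be found as executables.

    When path is None, the PATH of the current environment is searched.
    """
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required executables were found on PATH.")
```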

        These installation instructions assume that you can open and use a
        terminal window on Linux.

        (0) Some systems need several dependencies installed ahead of time.

            You may be able to skip this step. However, if subsequent steps don't
            work, then be sure that some basic libraries are installed, as shown
            below (or ask a system administrator to take care of it):

            For RedHat or CentOS 5.x systems (tested on CentOS 5.5)

                Open a terminal and then type the following command, answering all
                questions in the affirmative:

                    sudo yum install gcc

            For RedHat or CentOS 6.x systems (tested on CentOS 6.3)

                Open a terminal and then type the following commands, answering
                all questions in the affirmative:

                    sudo yum install gcc
                    sudo yum install perl-devel

            For Debian or Ubuntu systems (tested on Debian 5.0.6, Ubuntu 12.04 LTS)

                Open a terminal and then type the following commands, answering
                all questions in the affirmative:

                    sudo apt-get install gcc
                    sudo apt-get install make

        (1) Install the non-Perl dependencies:
            (Versions shown are those that we've tested. Please contact us if
            newer versions do not work.)

            Infernal            1.0.2    (http://infernal.janelia.org/)
            MAFFT               6.849b   (http://mafft.cbrc.jp/alignment/software/)
            Vienna RNA package  1.8.4    (http://www.tbi.univie.ac.at/~ivo/RNA/)

        (2) Either (a) download and run our installer or (b) use a CPAN client
            to install Bio::App::SELEX::RNAmotifAnalysis. Note that our installer
            creates the directory 'perl5' inside your home directory. This
            directory is for holding Perl modules, including this module and any
            Perl module dependencies not already included on your system. The
            installer also appends commands to your .bashrc file to make it easy
            for the Perl runtime to find these new modules (i.e. it includes your
            local 'perl5/lib/perl5' directory in the PERL5LIB environment
            variable).

            (a) Use the installer
                  i. Download installer (and name it "installer")

                        curl -o installer -L http://ircf.rnet.missouri.edu:8000/share.attachment/184

                 ii. Make it executable

                        chmod u+x installer

                iii. Run it. In a few cases (e.g. CentOS 5.5) we've had to run the
                     installer as many as three times to get all of the Perl
                     modules installed. Please contact us if this doesn't work
                     after three attempts.


            (b) If you prefer using a CPAN client, then we recommend that you install
                Bio::App::SELEX::RNAmotifAnalysis 'locally' instead of to system
                perl, to avoid overwriting core Perl modules. If this doesn't make
                sense to you, then please be sure to use the installer as
                described in (a) above.

        None known

         There are no known bugs in this module.
         Please report problems to molecules <at> cpan <dot> org
         Patches are welcome.

        Ditzler et al. Manuscript currently in review.
