[Bioperl-l] Re: Developing Improved Bioperl Documentation

Peter Schattner schattner@alum.mit.edu
Sun, 17 Dec 2000 15:28:33 -0800

Kris Boulez wrote:
> > 
> My plan is to start writing cookbook-like documentation (see an earlier
> mail from Ewan), based on the SYNOPSIS part of each module's
> documentation. In the meantime, checking these sections for correctness.
> Kris,

Kris, I gather now that the sort of cookbook that you (and Brian?) are
planning is rather different than the type of tutorial that I have been
envisioning.  I don't think this is bad.  After all, in learning new
software (as in coding in our favorite computer language) there is
something to be said for "having more than one way to do it".  I think
that your cookbook would be complementary (and I would hope also
complimentary ;-) to the tutorial I would like to write.

I plan to go less into the precise syntax for using the various modules
(I think that information would fit better in your cookbook) and more
into "motivation" - describing what tasks a (computational) biologist
could use bioperl for and where they should look to find those
capabilities within the Bioperl package.  I know that when I was
learning bioperl, not knowing what tools were available and where to
look for them was one of the bigger stumbling blocks for me.

I have attached an outline of the proposed tutorial, below.  I would be
grateful for feedback from anybody on the list regarding uses of bioperl
I've omitted, modules that should be included or omitted from a
tutorial, or any other suggestions that you think might be helpful in a
tutorial of the type I am describing.  Thanks. 

-- Peter

Bioperl Tutorial - outline

  What Bioperl is intended to do
  User required capabilities
  Software requirements
    Minimal installation
      Bioperl "core"
    Complete installation
      Perl - CPAN extensions (LWP, File::Temp, etc)
      Bioperl Perl extensions: bp-gui, bp-ensembl, bp-biocorba
      Bioperl C extensions
      Non-Perl bioinformatics C programs: clustalw, ncbi blast, tcoffee
    Obtaining the core components
    Installing the external components / extensions
  Additional info for non-unix users
  Where to go for more information

Brief intro to Bioperl's objects
  Motivation: (or why understanding a little about the relationships among
     Bioperl's basic objects will make the user's life easier)
  Sequence objects: (Seq, PrimarySeq, LocatableSeq, LiveSeq, LargeSeq)
  Alignment objects (SimpleAlign, UnivAln)
  Where to go for more information

Using Bioperl
  Overview of molecular biology tasks where bioperl can help
  Accessing sequence data from local and remote databases
    Accessing remote databases (Bio::DB::GenBank, etc)
    Indexing and accessing local databases (bpindex.pl, bpfetch.pl)
  Transforming formats of database/ file records
    Transforming sequence files (SeqIO)
    Transforming alignment files (AlignIO)
  Manipulating individual sequences
    Obtaining basic sequence statistics - eg MW, nucleotide & codon
      frequencies (SeqStats, SeqWords)
    Expanding sequences with ambiguous nts or aas (SeqPattern, IUPAC)
    Reverse-complementing nt seqs (SeqPattern)
    Translating nt seqs (CodonTable)
    Identifying aa characteristics - eg charge, hydrophobicity (OddCodes)
    Identifying restriction enzyme sites (RestrictionEnzyme)
    Identifying aa cleavage sites (Sigcleave)
  Searching for "similar" sequences
    Running BLAST locally  (StandAloneBlast)
    Running BLAST remotely (Blast)
    Parsing BLAST reports (Blast, BPlite, BPpsilite)
  Creating and manipulating sequence alignments
    Aligning 2 sequences with Smith-Waterman (pSW)
    Aligning 2 sequences with Blast (StandAloneBlast, BPbl2seq)
    Aligning multiple sequences (Clustalw, TCoffee)
    Manipulating / displaying alignments (SimpleAlign, UnivAln)
  Searching for genes and other structures on genomic DNA
    Parsing reports of gene-searching programs (Genscan, ESTScan, MZEF)
    Parsing HMM reports (HMMER::Results)
  Developing machine readable sequence annotations
    Representing sequence annotations for a single sequence (Annotation,
      SeqFeature, GeneStructure)
    Representing and annotating genomic and/or very large sequences
      (LiveSeq, LargeSeq)
    Representing related sequences - mutations, polymorphisms etc
      (Allele, SeqDiff, etc)
    Sequence XML representations - generation and parsing (SeqIO::game)
  Graphically displaying annotated sequences (Bioperl - gui)
  Where to go for more information