[Bioperl-l] search2gff

Hilmar Lapp hlapp at gmx.net
Thu Jan 19 18:06:57 EST 2006

I added a few capabilities to the scripts/utilities/search2gff
script written by Jason. In a nutshell, there are now options for
controlling the score, location, and method of the feature that
represents each HSP, as well as options for whether to print a parent,
which parent to use, and whether to skip all but the first HSP for
each hit.

As one possible application, you can blast SNP assay primers and use
the new options to create SNP features spanning a single basepair at
the end of each primer, ready to be piped to a GBrowse GFF3 loader.
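
The primer case could be handled with a --locfunc closure along the
lines of the sketch below. This is only an illustration, not code from
the script: the Demo::* packages and the local clone_loc() are stand-ins
so that the snippet runs without BioPerl (inside search2gff the closure
would receive real feature objects and the script's own clone_loc()),
and picking the last base of the hit in strand direction is an
assumption about where the SNP sits.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(dclone);

# Stand-in classes (hypothetical) mimicking the start/end/strand
# accessors of a location and the location() accessor of a feature.
package Demo::Location {
    sub new    { my ($class, %a) = @_; return bless {%a}, $class }
    sub start  { my $s = shift; $s->{start}  = shift if @_; return $s->{start} }
    sub end    { my $s = shift; $s->{end}    = shift if @_; return $s->{end} }
    sub strand { my $s = shift; $s->{strand} = shift if @_; return $s->{strand} }
}
package Demo::Feature {
    sub new      { my ($class, $loc) = @_; return bless { loc => $loc }, $class }
    sub location { return $_[0]->{loc} }
}

# Local stand-in for the script's clone_loc() helper, which the POD
# says is presently implemented with Storable::dclone().
sub clone_loc { return dclone(shift) }

# The closure receives the query and hit features and returns the
# location to use for the HSP feature. Here: one base at the end of
# the hit, taking the strand into account.
my $locfunc = sub {
    my ($query, $hit) = @_;
    my $loc = clone_loc($hit->location);   # leave the original untouched
    my $pos = ($loc->strand // 1) < 0 ? $loc->start : $loc->end;
    $loc->start($pos);
    $loc->end($pos);
    return $loc;
};

my $query = Demo::Feature->new(
    Demo::Location->new(start => 1, end => 20, strand => 1));
my $hit = Demo::Feature->new(
    Demo::Location->new(start => 101, end => 120, strand => 1));

my $snp = $locfunc->($query, $hit);
printf "%d..%d\n", $snp->start, $snp->end;   # prints 120..120
```

Working on a clone keeps the hit's own location at 101..120, so
anything else that reads the original feature afterwards is unaffected.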

I tried to preserve the original functionality in its entirety, i.e.,
if you don't use any of the new options the script should work as
before. If it doesn't, please let me know.

POD is attached.

: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
-------------- next part --------------
    Usage: search2gff [-o outputfile] [-f reportformat] [-i inputfilename]
    OR file1 file2 ..

    This script will turn a protein Search report (BLASTP, FASTP, SSEARCH,
    AXT, WABA) into a GFF File.

    The options are:

       -i infilename      - (optional) inputfilename, will read
                            either ARGV files or from STDIN
       -o filename        - the output filename [default STDOUT]
       -f format          - search result format (blast, fasta, waba, axt)
                            (ssearch is fasta format). default is blast.
       -t/--type seqtype  - if you want to see query or hit information
                            in the GFF report
       -s/--source        - specify the source (otherwise the algorithm
                            name is used, e.g. BLASTN)
       --method           - the method tag (primary_tag) of the features
                            (default is similarity)
       --scorefunc        - a string or a file that when parsed evaluates
                            to a closure which will be passed a feature
                            object and that returns the score to be printed
       --locfunc          - a string or a file that when parsed evaluates
                            to a closure which will be passed two
                            features, query and hit, and returns the
                            location (Bio::LocationI compliant) for the
                            GFF3 feature created for each HSP; the closure
                            may use the clone_loc() and create_loc()
                            functions for convenience, see their PODs
       --onehsp           - only print the first HSP feature for each hit
       -p/--parent        - the parent to which HSP features should refer
                            if not the name of the hit or query (depending
                            on --type)
       --target/--notarget - whether to always add the Target tag or not
       -h                 - this help menu
       --version          - GFF version to use (put a 3 here to use gff 3)
       --component        - generate GFF component fields (chromosome)
       -m/--match         - generate a 'match' line which is a container
                            of all the similarity HSPs
       --addid            - add ID tag in the absence of --match
       -c/--cutoff        - specify an evalue cutoff
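
    As an illustration of the --scorefunc option, the string passed
    could be an anonymous sub like the one below. This is a hedged
    sketch: Demo::Feature is a hypothetical stand-in so the snippet runs
    without BioPerl, and it assumes the feature handed to the closure
    offers a score() method.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the feature object passed to the closure.
package Demo::Feature {
    sub new   { my ($class, %a) = @_; return bless {%a}, $class }
    sub score { return $_[0]->{score} }
}

# The closure receives one feature and returns the score to print;
# here the score is rounded to a whole number.
my $scorefunc = sub {
    my ($feat) = @_;
    return sprintf("%.0f", $feat->score);
};

my $printed = $scorefunc->(Demo::Feature->new(score => 38.2));
print "$printed\n";   # prints 38
```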

    Additionally specify the filenames you want to process on the
    command-line: search2gff file1 file2 file3. If no files are specified
    then STDIN input is assumed, e.g.: search2gff < file1

    Jason Stajich, jason-at-bioperl-dot-org

    Hilmar Lapp, hlapp-at-gmx-dot-net

     Title   : clone_loc
     Usage   : my $l = clone_loc($feature->location);
     Function: Helper function to simplify the task of cloning locations
               for --locfunc closures.

               Presently simply implemented using Storable::dclone().
     Example :
     Returns : A L<Bio::LocationI> object of the same type and with the
               same properties as the argument, but physically different.
               All structured properties will be cloned as well.
     Args    : A L<Bio::LocationI> compliant object

     Title   : create_loc
     Usage   : my $l = create_loc("10..12");
     Function: Helper function to simplify the task of creating locations
               for --locfunc closures. Creates a location from a feature-
               table formatted string.

     Example :
     Returns : A L<Bio::LocationI> object representing the location given
               as formatted string.
     Args    : A GenBank feature-table formatted string.
