[Bioperl-l] BLAST output parsing
barry.moore at genetics.utah.edu
Thu Nov 1 00:03:01 EDT 2007
If you are using NCBI FASTA files, you can use files from NCBI's Gene
database to map your gene IDs to names and organisms. Look in
particular at the files gene2accession, gene2refseq, and gene_info.
For example, if you had RefSeq protein IDs like NP_123456, you could
use gene2refseq to map those RefSeq accessions to gene IDs and then
gene_info to map the gene IDs to organisms and gene names.
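
A rough sketch of that two-step lookup, written in Python for brevity (the same logic ports directly to Perl hashes). It assumes the tab-delimited layouts I believe these files use: in gene2refseq, tax_id in column 1, GeneID in column 2 and the protein accession.version in column 6; in gene_info, GeneID in column 2, Symbol in column 3 and description in column 9. Check the "#" header line of your own copies before trusting those positions. The function names and the sample rows are made up for illustration.

```python
import io

def load_protein_to_gene(handle):
    """Map RefSeq protein accession (version stripped) -> (tax_id, gene_id).

    Assumed gene2refseq columns: 1 = tax_id, 2 = GeneID, 6 = protein accession.version.
    """
    mapping = {}
    for line in handle:
        if line.startswith("#"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[5] == "-":
            continue  # no protein accession on this row
        accession = fields[5].split(".")[0]
        mapping[accession] = (fields[0], fields[1])
    return mapping

def load_gene_info(handle):
    """Map gene_id -> (symbol, description).

    Assumed gene_info columns: 2 = GeneID, 3 = Symbol, 9 = description.
    """
    info = {}
    for line in handle:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        info[fields[1]] = (fields[2], fields[8])
    return info

def annotate(accession, protein_to_gene, gene_info):
    """Return a dict of annotations for one RefSeq protein ID, or None if unknown."""
    key = accession.split(".")[0]
    if key not in protein_to_gene:
        return None
    tax_id, gene_id = protein_to_gene[key]
    symbol, description = gene_info.get(gene_id, ("?", "?"))
    return {"gene_id": gene_id, "tax_id": tax_id,
            "symbol": symbol, "description": description}

# Made-up rows standing in for the real downloaded files.
refseq_rows = io.StringIO(
    "#tax_id\tGeneID\tstatus\n"
    + "\t".join(["9606", "1234", "REVIEWED", "NM_000001.1", "111",
                 "NP_123456.1", "222", "-"]) + "\n"
)
info_rows = io.StringIO(
    "#tax_id\tGeneID\tSymbol\n"
    + "\t".join(["9606", "1234", "ABC1", "-", "-", "-", "1", "1p1",
                 "made-up example gene"]) + "\n"
)

p2g = load_protein_to_gene(refseq_rows)
ginfo = load_gene_info(info_rows)
result = annotate("NP_123456", p2g, ginfo)
print(result)
```

With the real files you would pass open file handles instead of the StringIO objects. Note that the organism comes back as a numeric NCBI Taxonomy ID; turning that into a species name needs a separate taxonomy lookup.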
On Oct 31, 2007, at 7:27 PM, Torsten Seemann wrote:
>> I am new to bioperl. I did a BLAST search of ~4000 genes and I need
>> to parse it. I used the -m 9 option to get tabular output of the
>> BLAST data, but it does not include the gene names or the names of
>> the organisms of each hit. Are there any parsers that can do this job?
> The -m 9 tabular output does not include gene descriptions and
> organisms. It only includes the "gene id" that was present immediately
> after the ">" sign in the FASTA file that was used to create the BLAST
> database you specified with the -d option when you ran BLAST.
> Hence, no parser will help you. You either have to re-do the BLAST
> with a different -m value that includes the information you desire, or
> write code to convert your gene IDs into what you want.
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
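
If you go the "write code" route Torsten mentions, the first step is pulling the subject IDs out of the -m 9 report. Below is a minimal Python sketch. It assumes the usual 12 tab-separated columns (query id, subject id, % identity, ..., e-value, bit score) with "#" comment lines, and subject IDs in the "gi|...|ref|ACCESSION|" style you get from NCBI-formatted databases. Both are assumptions about your particular output, and the function names and sample lines are invented for illustration.

```python
def parse_m9(lines):
    """Yield (query_id, subject_id, evalue_text) for each hit line.

    Comment lines (starting with '#') and short lines are skipped.
    The e-value is left as text rather than converted to a number.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue
        yield fields[0], fields[1], fields[10]

def refseq_accession(subject_id):
    """Extract the accession from a 'gi|123|ref|NP_123456.1|' style ID.

    If the ID has no 'ref' token, it is returned unchanged.
    """
    parts = subject_id.split("|")
    if "ref" in parts:
        i = parts.index("ref")
        if i + 1 < len(parts) and parts[i + 1]:
            return parts[i + 1]
    return subject_id

# Made-up report lines for illustration.
sample = [
    "# BLASTP 2.2.17\n",
    "# Fields: Query id, Subject id, % identity, ...\n",
    "\t".join(["query1", "gi|222|ref|NP_123456.1|", "98.5", "200", "3", "0",
               "1", "200", "1", "200", "1e-100", "350"]) + "\n",
]

hits = [(q, refseq_accession(s), e) for q, s, e in parse_m9(sample)]
print(hits)
```

The accessions collected this way are what you would then look up in gene2refseq and gene_info.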