[Bioperl-l] Remote blast

Roy Chaudhuri roy.chaudhuri at gmail.com
Thu Nov 19 11:10:28 EST 2009


Hi Roopa,

I think the -Organism parameter that you pass to 
Bio::Tools::Run::RemoteBlast is ignored; I can't find any reference to 
it in the documentation:
http://search.cpan.org/~cjfields/BioPerl-1.6.1/Bio/Tools/Run/RemoteBlast.pm

You already have the correct approach in your code (limiting the search 
with the Entrez query "Trypanosoma brucei[ORGN]"), but that line is 
commented out. If you uncomment it (and add a semicolon at the end), the 
program runs correctly, but no hits are reported below your e-value 
threshold of 1e-10. If you change the value of $e_val to 10, then some 
T. brucei hits are reported.
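
As a rough sketch of those two changes (not a complete script, and it assumes BioPerl 1.6.x is installed; the parameter names are the ones from your own code, minus the -Organism entry):

```perl
use strict;
use warnings;
use Bio::Tools::Run::RemoteBlast;

# Relaxed from 1e-10 so that weaker hits are still reported.
my $e_val = 10;

my $factory = Bio::Tools::Run::RemoteBlast->new(
    -prog       => 'blastn',
    -data       => 'nr',
    -expect     => $e_val,
    -readmethod => 'SearchIO',
);

# Restrict the search to one organism via an Entrez query.
# This is the line from your script, uncommented and ending in a semicolon.
$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma brucei[ORGN]';
```

The submit/retrieve loop that follows in your script should not need any changes.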

Roy.

Roopa Raghuveer wrote:
> Hello everybody,
> 
> I have a problem. I would like to use remote BLAST to find sequences
> matching an input sequence.
> 
> For example, I would like to find sequences that match a Trypanosoma brucei
> sequence.
> 
> I want the output to contain only Trypanosoma brucei sequences matching my
> query. When I tried to use RemoteBlast against the nr database, I got sequences
> from different organisms such as E. coli, Pseudomonas, etc.
> 
> Could you please tell me how this can be solved?
> 
> My code is as follows.
> 
> use Bio::Tools::Run::RemoteBlast;
>   use strict;
>   my $prog = 'blastn';
>   my $db   = 'nr';
>   my $e_val= '1e-10';
>   my $organism= 'Trypanosoma Brucei';
> 
>   my @params = ( '-prog' => $prog,
>          '-data' => $db,
>          '-expect' => $e_val,
>          '-readmethod' => 'SearchIO',
>          '-Organism'   => $organism );
> 
>   my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> 
>   #change a parameter
>   #$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma brucei[ORGN]'
> 
>   #remove a parameter
>   #delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
> 
>   my $v = 1;
>   #$v is just to turn on and off the messages
> 
>   my $str = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' ,
>                             '-organism' => 'Trypanosoma Brucei' );
> 
>   while (my $input = $str->next_seq()){
>     #Blast a sequence against a database:
>     my $r = $factory->submit_blast($input);
>     #my $r = $factory->submit_blast('amino.fa');
> 
>     print STDERR "waiting..." if( $v > 0 );
>     while ( my @rids = $factory->each_rid ) {
>       foreach my $rid ( @rids ) {
>         my $rc = $factory->retrieve_blast($rid);
>         if( !ref($rc) ) {
>           if( $rc < 0 ) {
>             $factory->remove_rid($rid);
>           }
>           print STDERR "." if ( $v > 0 );
>           sleep 5;
>         }
>         else {
>           my $result = $rc->next_result();
>           #save the output
>           my $filename = $result->query_name()."\.out";
>           $factory->save_output($filename);
>           $factory->remove_rid($rid);
>           print "\nQuery Name: ", $result->query_name(), "\n";
>           while ( my $hit = $result->next_hit ) {
>             next unless ( $v > 0);
>             print "\thit name is ", $hit->name, "\n";
>             while( my $hsp = $hit->next_hsp ) {
>               print "\t\tscore is ", $hsp->score, "\n";
>             }
>           }
>         }
>       }
>     }
>   }
> 
> My input sequence is
> 
> >ref|NC_009512.1|:385-1902
> GTGTCAGTGGAACTTTGGCAGCAGTGCGTGGAGCTTCTGCGCGATGAACTGCCTGCCCAGCAATTCAACA
> CCTGGATCCGTCCGCTACAGGTCGAAGCCGAAGGCGACGAGTTGCGCGTCTATGCGCCTAACCGTTTCGT
> TCTCGATTGGGTCAATGAAAAGTACCTGGGTCGTTTGCTCGAGCTGTTGGGTGAGAACGGTAGCGGCATT
> GCACCAGCCCTTTCCTTATTAATAGGTAGCCGCCGCAGCTCGGCCCCAAGGGCTGCACCCAACGCGCCGG
> TCAGCGCTGCCGTTGCGGCTTCGCTGGCGCAGACTCAGGCGCACAAGACGGCCCCGGCAGCAGCGGTTGA
> ACCCGTTGCCGTGGCCGCGGCCGAGCCTGTATTGGTCGAGACGTCTTCGCGTGACAGCTTTGATGCCATG
> GCCGAGCCTGCTGCTGCGCCGCCCAGTGGTGGCCGGGCTGAACAGCGCACCGTGCAGGTTGAAGGTGCGC
> TCAAGCACACCAGTTACCTGAACCGGACCTTTACCTTTGACACCTTCGTCGAAGGTAAGTCGAACCAGCT
> CGCCCGCGCGGCTGCCTGGCAGGTTGCGGACAACCCTAAGCATGGCTACAACCCACTGTTCCTTTATGGC
> GGTGTGGGTTTGGGTAAAACCCACCTTATGCATGCTGTGGGTAACCATCTGCTGAAGAAGAATCCGAACG
> CCAAGGTGGTGTACCTGCATTCGGAGCGCTTCGTCGCGGACATGGTCAAAGCGTTGCAACTCAACGCCAT
> CAACGAATTCAAGCGCTTCTACCGCTCGGTGGACGCGTTGCTGATCGACGATATCCAGTTCTTCGCTCGC
> AAAGAGCGCTCGCAAGAAGAGTTTTTCCACACCTTCAACGCCTTGCTTGAGGGTGGCCAGCAGGTAATCC
> TTACCTCTGACCGCTATCCCAAGGAAATCGAAGGCCTGGAAGAGCGTCTGAAGTCGCGCTTTGGTTGGGG
> CCTGACGGTGGCTGTCGAGCCGCCAGAGCTGGAGACCCGCGTAGCGATCCTGATGAAGAAGGCCGACCAG
> GCCAAAGTCGAGCTCCCGCATGACGCAGCCTTTTTCATCGCTCAGCGCATCCGGTCCAACGTCCGTGAGC
> TGGAAGGTGCACTGAAGCGAGTTATTGCTCACTCGCACTTCATGGGGCGTGACATCACCATCGAGCTGAT
> TCGTGAATCGCTCAAGGATCTGTTGGCGCTGCAAGACAAACTGGTCAGTGTGGATAACATTCAGCGTACC
> GTCGCTGAGTACTACAAGATCAAGATCTCCGATCTGTTGTCCAAGCGTCGTTCGCGTTCTGTCGCGCGCC
> CGCGTCAGGTAGCCATGGCCCTGTCCAAGGAGTTGACCAACCACAGTCTGCCGGAAATCGGCGACATGTT
> CGGTGGTCGCGACCATACGACCGTGCTGCACGCCTGCCGCAAAATCAATGAACTGAAGGAATCCGACGCG
> GACATCCGCGAGGACTACAAGAACCTGCTGCGGACGCTGACGACCTGA
> 
> Please mail me regarding any queries.
> 
> Regards,
> Roopa.
