From Laurence.Amilhat at toulouse.inra.fr Tue May 6 05:32:53 2008
From: Laurence.Amilhat at toulouse.inra.fr (Laurence Amilhat)
Date: Tue, 06 May 2008 11:32:53 +0200
Subject: [Bioperl-l] BioPerl and NHX tree
In-Reply-To: <24c96eca0801030615k44a1b188pb3aef683674f3153@mail.gmail.com>
References: <476A7736.109@toulouse.inra.fr>
	<24c96eca0712200732q20523c1co1075c15d056ff634@mail.gmail.com>
	<477CBFDC.8020503@toulouse.inra.fr>
	<24c96eca0801030615k44a1b188pb3aef683674f3153@mail.gmail.com>
Message-ID: <482025C5.1030805@toulouse.inra.fr>

Hello,

I am trying to convert a Newick tree file to an NHX file with species tags in order to visualize it with the ATV viewer. The script runs, but I think there is an error, because ATV returns this error message:

"Failed to read gene tree from "BX881913.1.p.om.4.tfa_prot.tfa.taxid.alltree.cons_outtree.rooted.long.nhx" [Error in NHX format: More than one distance to parent:"0.0"]"

When I compare the input tree and the output tree, they seem to be different: for example, the input file begins with (((((( and the output file begins with (((

Do you have any idea what I am doing wrong?

Here is my code:

use strict;
use Bio::TreeIO;
use Bio::Tree::NodeNHX;
use Getopt::Long;

my $tree_file;
my $outfile;
my $codefile;
my %corresp;

GetOptions('f|file:s' => \$tree_file,
           'o|out:s'  => \$outfile,
           'c|code:s' => \$codefile);

# Read the correspondence file
# For each sequence get:
# - the TAXID
# - the species name
# - the species name (with no space)
# - the complete fasta header
open (CODE, "< $codefile") or die "Cannot open $codefile: $!";
while (<CODE>) {
    chomp;
    my ($code, $a, $b, $c, $d, $e) = split (/\t/);
    $corresp{$code}{"taxid"}   = $b;
    $corresp{$code}{"species"} = $d;
    $corresp{$code}{"header"}  = $e;
    $corresp{$code}{"nom"}     = $c;
}
close (CODE);

my $treeio  = Bio::TreeIO->new(-format => 'nhx', -file => "$tree_file");
#my $treeout = Bio::TreeIO->new(-format => 'nhx', -file => ">$outfile", -binary => "1");
my $treeout = Bio::TreeIO->new(-format => 'nhx', -file => ">$outfile");

# Read the tree, change each sequence header and add an NHX tag to specify the species
while (my $tree = $treeio->next_tree) {
    my @nodes = $tree->get_nodes();
    foreach my $nd (@nodes) {
        if ($nd->is_Leaf()) {
            my $id = $nd->id();
            print STDOUT "ID $id\n";
            # add an NHX tag to the node: the species name
            $nd->nhx_tag({S => $corresp{$id}{"nom"}});
            # replace the sequence code by its complete fasta header
            $id = $corresp{$id}{"header"};
            $nd->id($id);
        }
    }
    $treeout->write_tree($tree);
}

Here is the infile:

((((((20:3.0,21:3.0):2.0,(((17:3.0,18:3.0):2.0,19:3.0):3.0,(15:3.0,16:3.0):3.0):1.0):2.0, 14:3.0):3.0,22:3.0):3.0,((13:3.0,(11:3.0,(10:3.0,12:3.0):1.0):3.0):3.0,(2:3.0, 1:3.0):3.0):3.0):0.0,((5:3.0,4:3.0):3.0,(3:3.0,((8:3.0,6:3.0):3.0,(9:3.0,7:6.0):3.0):3.0):2.0):3.0);

Here is the output file:

(((lcl|Fam_018802_Contig1_2_TAXID=8022_:3.0[&&NHX:S=Oncorhynchus mykiss],BX881913.1.p.om.4_1_1_-_501_TAXID=8022_:3.0[&&NHX:S=Oncorhynchus my kiss]):3.0[&&NHX],(lcl|Fam_013546_Contig1_PIMPR_6_TAXID=90988_:3.0[&&NHX:S=Pimephales promelas],(lcl|ENSDARP00000087648_pep_known_chromosome _ZFISH7_13_51517919_51522668_-1_gene_ENSDARG00000063670_t:3.0[&&NHX:S=Danio
rerio],(lcl|ENSDARP00000087661_pep_novel_chromosome_ZFISH7_13_51 517919_51522668_-1_gene_ENSDARG00000063670_t:3.0[&&NHX:S=Danio rerio],lcl|ENSDARP00000087654_pep_known_chromosome_ZFISH7_13_51517544_5152273 9_-1_gene_ENSDARG00000063670_t:3.0[&&NHX:S=Danio rerio]):1.0[&&NHX]):3.0[&&NHX]):3.0[&&NHX]):3.0[&&NHX],(lcl|Fam_012588_Contig3090_GADMO_2_T AXID=8049_:3.0[&&NHX:S=Gadus morhua],(lcl|GSTENP00018428001_pep_known_chromosome_TETRAODON7_14_8497414_8500061_-1_gene_GSTENG00018428001_t:3 .0[&&NHX:S=Tetraodon nigroviridis],((lcl|ENSORLP00000013438_pep_novel_chromosome_MEDAKA1_24_3589482_3594915_-1_gene_ENSORLG00000010721_tr:3. 0[&&NHX:S=Oryzias latipes],lcl|ENSGACP00000006915_pep_novel_group_BROADS1_groupXVIII_2150130_2155380_1_gene_ENSGACG00000005224_:3.0[&&NHX:S= Gasterosteus aculeatus]):2.0[&&NHX],((lcl|ENSDARP00000074838_pep_novel_chromosome_ZFISH7_20_12837032_12851267_1_gene_ENSDARG00000011000_tr:3 .0[&&NHX:S=Danio rerio],lcl|ENSDARP00000015974_pep_known_chromosome_ZFISH7_20_12836852_12852683_1_gene_ENSDARG00000011000_tr:3.0[&&NHX:S=Dan io rerio]):3.0[&&NHX],(lcl|Contig618_HIPHI_5_TAXID=8267_:3.0[&&NHX:S=Hippoglossus hippoglossus],(lcl|Fam_023545_Contig2_2_TAXID=8022_:3.0[&& NHX:S=Oncorhynchus mykiss],lcl|ENSTRUP00000046040_pep_novel_scaffold_FUGU4_scaffold_185_27966_32394_1_gene_ENSTRUG00000017961_t:3.0[&&NHX:S= Takifugu rubripes]):2.0[&&NHX]):3.0[&&NHX]):1.0[&&NHX]):2.0[&&NHX]):3.0[&&NHX]):3.0[&&NHX]):0.0[&&NHX],((lcl|ENSORLP00000013701_pep_novel_ch romosome_MEDAKA1_15_25438171_25450498_-1_gene_ENSORLG00000010924_:3.0[&&NHX:S=Oryzias latipes],lcl|ENSGACP00000007323_pep_novel_group_BROADS 1_groupVI_6476613_6485834_1_gene_ENSGACG00000005527_tra:3.0[&&NHX:S=Gasterosteus aculeatus]):3.0[&&NHX],(lcl|GSTENP00030753001_pep_known_chr omosome_TETRAODON7_17_3400689_3407671_1_gene_GSTENG00030753001_tr:3.0[&&NHX:S=Tetraodon nigroviridis],((lcl|ENSTRUP00000035694_pep_novel_sca ffold_FUGU4_scaffold_125_722763_725332_1_gene_ENSTRUG00000013959:3.0[&&NHX:S=Takifugu 
rubripes],lcl|ENSTRUP00000035693_pep_novel_scaffold_FU GU4_scaffold_125_722763_725332_1_gene_ENSTRUG00000013959:3.0[&&NHX:S=Takifugu rubripes]):3.0[&&NHX],(lcl|ENSTRUP00000035695_pep_novel_scaffo ld_FUGU4_scaffold_125_722853_725332_1_gene_ENSTRUG00000013959:3.0[&&NHX:S=Takifugu rubripes],lcl|ENSTRUP00000035691_pep_novel_scaffold_FUGU4 _scaffold_125_718572_725332_1_gene_ENSTRUG00000013959:6.0[&&NHX:S=Takifugu rubripes]):3.0[&&NHX]):3.0[&&NHX]):2.0[&&NHX]):3.0[&&NHX];

--
====================================================================
= Laurence Amilhat INRA Toulouse 31326 Castanet-Tolosan =
= Tel: 33 5 61 28 57 08 Email: laurence.amilhat at toulouse.inra.fr =
====================================================================

From shameer at ncbs.res.in Thu May 8 07:11:45 2008
From: shameer at ncbs.res.in (K. Shameer)
Date: Thu, 8 May 2008 16:41:45 +0530 (IST)
Subject: [Bioperl-l] HMMER - Parse hmmpfam output using bioperl - help!!
In-Reply-To: <952B0A4E-8A14-4E8E-B36D-14596B20E330@bioperl.org>
References: <946658.12337.qm@web36802.mail.mud.yahoo.com>
	<952B0A4E-8A14-4E8E-B36D-14596B20E330@bioperl.org>
Message-ID: <47772.192.168.1.1.1210245105.squirrel@mail.ncbs.res.in>

Dear All,

Here is the code snippet I used to get the hit name and hit length from an hmmpfam file. I also need the query start and end positions, the domain description, the score and the e-value.

I looked for suitable methods in the Deobfuscator, but I couldn't find the details I wanted. Are there methods for this in Bio::SearchIO or related modules?

__CODE__
use Bio::SearchIO;

my $input = shift;
my $in = Bio::SearchIO->new(-format => 'hmmer',
                            -file   => $input);
while( my $result = $in->next_result ) {
    while( my $hit = $result->next_hit ) {
        # hit name, then the length of each HSP
        print $hit->name(), "\t";
        while( my $hsp = $hit->next_hsp ) {
            print $hsp->length(), "\n";
        }
    }
}
_END_

Result for test.pf

TRAP_240kDa 680

--
Thanks in advance,
K. Shameer

From shameer at ncbs.res.in Thu May 8 08:31:22 2008
From: shameer at ncbs.res.in (K. Shameer)
Date: Thu, 8 May 2008 18:01:22 +0530 (IST)
Subject: [Bioperl-l] HMMER - Parse hmmpfam output using bioperl - help!!
In-Reply-To: <4822F8BE.6090507@sendu.me.uk>
References: <946658.12337.qm@web36802.mail.mud.yahoo.com>
	<952B0A4E-8A14-4E8E-B36D-14596B20E330@bioperl.org>
	<47772.192.168.1.1.1210245105.squirrel@mail.ncbs.res.in>
	<4822F8BE.6090507@sendu.me.uk>
Message-ID: <57429.192.168.1.1.1210249882.squirrel@mail.ncbs.res.in>

Hi Sendu,

Thanks for your quick reply. This is really useful.

Separately, one quick question: I don't have hmmer_pull.pm, because I am using an older version of Bioperl. Is there any way to download and install individual Bioperl modules separately?

Thanks,
K. Shameer

> You'll find the relevant documentation under Bio::Search::Result,
> Bio::Search::Hit and Bio::Search::HSP.
>
> Using Bio::SearchIO->new(-format => 'hmmer_pull') will also give you a
> faster parser that may behave more closely to your expectation during
> your loops.
>
> Anyway, there are various obvious-named methods you can use:
>
> $result->query_description
> $hit->score
> $hit->significance
> $hit->description
> $hit->start('query')
> $hit->end('query')
> $hsp->start('query')
> $hsp->evalue
>
> etc.
>

From bix at sendu.me.uk Thu May 8 08:57:34 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 08 May 2008 13:57:34 +0100
Subject: [Bioperl-l] HMMER - Parse hmmpfam output using bioperl - help!!
In-Reply-To: <47772.192.168.1.1.1210245105.squirrel@mail.ncbs.res.in>
References: <946658.12337.qm@web36802.mail.mud.yahoo.com>
	<952B0A4E-8A14-4E8E-B36D-14596B20E330@bioperl.org>
	<47772.192.168.1.1.1210245105.squirrel@mail.ncbs.res.in>
Message-ID: <4822F8BE.6090507@sendu.me.uk>

K. Shameer wrote:
> Dear All,
>
> Here is the code snippet I used to get the hit name and hit length from an
> hmmpfam file.
> I need to add the sequence start and end information
> (query), description of domain, score and e-value.
>
> I checked for the available method in deobfuscator, but I couldn't find
> the details i wanted. Is there methods available in the Bio::SearchIO or
> related modules.
> __CODE__
> $input = shift;
> use Bio::SearchIO;
> my $in = Bio::SearchIO->new(-format => 'hmmer',
>                             -file => $input);
> while( my $result = $in->next_result ) {
>     while( my $hit = $result->next_hit ) {
>         print $hit->name(),"\t";
>         while( my $hsp = $hit->next_hsp ) {
>             print $hsp->length(), "\n";
>         }
>     }
> }

You'll find the relevant documentation under Bio::Search::Result, Bio::Search::Hit and Bio::Search::HSP.

Using Bio::SearchIO->new(-format => 'hmmer_pull') will also give you a faster parser that may behave more closely to your expectation during your loops.

Anyway, there are various obvious-named methods you can use:

$result->query_description
$hit->score
$hit->significance
$hit->description
$hit->start('query')
$hit->end('query')
$hsp->start('query')
$hsp->evalue

etc.

From bix at sendu.me.uk Thu May 8 09:43:28 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 08 May 2008 14:43:28 +0100
Subject: [Bioperl-l] HMMER - Parse hmmpfam output using bioperl - help!!
In-Reply-To: <57429.192.168.1.1.1210249882.squirrel@mail.ncbs.res.in>
References: <946658.12337.qm@web36802.mail.mud.yahoo.com>
	<952B0A4E-8A14-4E8E-B36D-14596B20E330@bioperl.org>
	<47772.192.168.1.1.1210245105.squirrel@mail.ncbs.res.in>
	<4822F8BE.6090507@sendu.me.uk>
	<57429.192.168.1.1.1210249882.squirrel@mail.ncbs.res.in>
Message-ID: <48230380.5040908@sendu.me.uk>

K. Shameer wrote:
> Hi Sendu,
>
> Thanks for your quick reply. This is really useful.
>
> Separately, One quick question, I dont have hmmer_pull.pm. I am using an
> older version of Bioperl. Is there any provision to to download and
> install individual bioperl modules separately ?

Generally you can grab individual modules from svn, e.g.:
http://code.open-bio.org/svnweb/index.cgi/bioperl/view/bioperl-live/trunk/Bio/SearchIO/hmmer_pull.pm
(click the 'checkout' link).

However, in this particular case it's really complicated, because that module needs lots of other new modules to work. Use the normal hmmer.pm module; if you don't have any problems with it, stick with it. Otherwise I'd recommend upgrading your entire Bioperl to 1.5.2 or svn.

From prachi at stanford.edu Thu May 8 16:54:06 2008
From: prachi at stanford.edu (Prachi Shah)
Date: Thu, 8 May 2008 13:54:06 -0700
Subject: [Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter
Message-ID: <8684cf960805081354s6400b1eey917f6b9ae862eded@mail.gmail.com>

Hi all,

I am trying to order the HSPs within each BLAST Hit by ascending P-value. To do so, I parse my WU-BLAST report using Bio::SearchIO, create new Result, Hit and HSP objects in that order, and then write out another BLAST report with the Bio::SearchIO::Writer::TextResultWriter module. All this works fine. But when I try to parse this new BLAST report with Bio::SearchIO::blast, I get the following error:

------------- EXCEPTION -------------
MSG: no data for midline Query: 0 1
STACK Bio::SearchIO::blast::next_result /tools/perl/5.6.1/lib/site_perl/5.6.1/Bio/SearchIO/blast.pm:1151
STACK toplevel bin/testBlastParse.pl:12
--------------------------------------

I have copied below sample sections of both BLAST reports and the code. Any hints, pointers or suggestions are greatly appreciated.

Thanks,
Prachi

The old and new BLAST reports look slightly different; note especially the HSP start and stop coordinates for the QUERY sequence.

**Snippet of OLD blast report (generated by WU-BLAST):
----------------------------------------------------------------------------------------------------
Query= orf19.4890
        (4931 letters)

Database: Ca21_Chromosomes
           9 sequences; 14,324,492 total letters.
Searching....10....20....30....40....50....60....70....80....90....100% done

WARNING: hspmax=1000 was exceeded by 8 of the database sequences, causing the
         associated cutoff score, S2, to be transiently set as high as 113.

                                                                 Smallest
                                                                    Sum
                                                          High  Probability
Sequences producing High-scoring Segment Pairs:          Score  P(N)     N

Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides)     24655  0.       1
Ca21chr5 Assembly 21, Ca21chr5 (1190941 nucleotides)      1682  3.4e-68  3
Ca21chr6 Assembly 21, Ca21chr6 (1033553 nucleotides)       908  3.0e-34  3
Ca21chr2 Assembly 21, Ca21chr2 (2232049 nucleotides)       859  4.7e-30  1
Ca21chr7 Assembly 21, Ca21chr7 (949626 nucleotides)        492  7.3e-24  3
Ca21chr4 Assembly 21, Ca21chr4 (1603475 nucleotides)       528  9.8e-21  2
Ca21chrR Assembly 21, Ca21chrR (2286425 nucleotides)       520  1.4e-19  5
Ca21chr3 Assembly 21, Ca21chr3 (1799426 nucleotides)       502  1.7e-14  2
Ca19-mtDNA Assembly 19, Ca19-mtDNA (40420 nucleotides)     313  2.9e-06  2

>Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides)
        Length = 3,188,577

  Plus Strand HSPs:

 Score = 506 (82.0 bits), Expect = 4.9e-14, P = 4.9e-14
 Identities = 850/1549 (54%), Positives = 850/1549 (54%), Strand = Plus / Plus

Query:   3450 ATGCATATGGTAATGTTAA-AATCACTGATTTTGGA-TTTTGTGCTAAATTAAC-T-GAT 3505
              | | ||| | | || |||| ||| ||||| ||| | ||||| || | || | | | |
Sbjct: 155924 AGGGATACGATTAT-TTAAGAATT-CTGATATTGAAATTTTG-GC-ATTTTCATATAGTT 155979

Query:   3506 CAAAGA--AATAAACGTGCC-ACAATGGTGGGGACACCATATTGG-ATGGCACCTGAAGT 3561
              |||| | |||||| | | |||| || | ||| | | ||| | | | |
Sbjct: 155980 CAAACATTAATAAATATATTGAAAATGTTGATTTAATCAT-TAGTCATG---CTGGTACT 156035

Query:   3562 GGTTAAACAAAAGGAATATGATGAAAAAGTTGATGTTTGGTCATTGGGGATTATGACTAT 3621
              || | || | | || || | | | |||| | |||| |||| ||
Sbjct: 156036 GGATCAATCATTG--AT-TGTTTACAT--TTGAA--TAAACCATTAATTGTTATTGTTAA 156088

Query:   3622 TGAAATGATTGAAGGAGAACCACCTTATTTGAA-T-GAAGAACCATTAAAAGCATTATAT 3679
----------------------------------------------------------------------------------------------------

**Snippet of NEW blast report (generated using
Bio::SearchIO::Writer::TextResultWriter)
----------------------------------------------------------------------------------------------------
uery= orf19.4890
        (4,931 letters)

Database: Ca21_Chromosomes
           9 sequences; 14,324,492 total letters

                                                         Score    E
Sequences producing significant alignments:              (bits)   value
Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides)     24655    0.
Ca21chr5 Assembly 21, Ca21chr5 (1190941 nucleotides)      1682    3.4e-68
Ca21chr6 Assembly 21, Ca21chr6 (1033553 nucleotides)       908    3.0e-34
Ca21chr2 Assembly 21, Ca21chr2 (2232049 nucleotides)       859    4.7e-30
Ca21chr7 Assembly 21, Ca21chr7 (949626 nucleotides)        492    7.3e-24
Ca21chr4 Assembly 21, Ca21chr4 (1603475 nucleotides)       528    9.8e-21
Ca21chrR Assembly 21, Ca21chrR (2286425 nucleotides)       520    1.4e-19
Ca21chr3 Assembly 21, Ca21chr3 (1799426 nucleotides)       502    1.7e-14
Ca19-mtDNA Assembly 19, Ca19-mtDNA (40420 nucleotides)     313    2.9e-06

>Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides)
        Length = 3188577

 Score = 3705.3 bits (24655), Expect = 0., P = 0.
 Identities = 4931/4931 (100%)
 Frame = -1 / +1

Query: 1       ATAAAGGATGCCAAATAGTAGTAGTAAAATAGTAAATAGAATTGCAAAACAAAAATGATT -58
               ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2248574 ATAAAGGATGCCAAATAGTAGTAGTAAAATAGTAAATAGAATTGCAAAACAAAAATGATT 2248633

Query: -59     AAATAGCCCTTTATCAATAAATTTTTAAAGTTAGTTTCTTCTGGAACCCTACCCTCTTGG -118
               ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2248634 AAATAGCCCTTTATCAATAAATTTTTAAAGTTAGTTTCTTCTGGAACCCTACCCTCTTGG 2248693

Query: -119    TGTTAATCTTTTAAGTTAATATTTATAGTTAATAAAGTAGAAGTGTCTATTTATTGATTG -178
               ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2248694 TGTTAATCTTTTAAGTTAATATTTATAGTTAATAAAGTAGAAGTGTCTATTTATTGATTG 2248753

Query: -179    TTGTTGTTGTTGATTAAGAATATAAAGAAAAACAGAAAAGAAAAAAAGAAGGTTTAAAAA -238
               ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2248754 TTGTTGTTGTTGATTAAGAATATAAAGAAAAACAGAAAAGAAAAAAAGAAGGTTTAAAAA 2248813

Query: -239    AGTTAATTGTGAAGTAAAAGGGTTGAAAAATTTTTTTTTTTTCTGTTTCTCTCTTTGAGA -298
               ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2248814 AGTTAATTGTGAAGTAAAAGGGTTGAAAAATTTTTTTTTTTTCTGTTTCTCTCTTTGAGA 2248873

Query: -299    TTCTTTGACATATTTATTATTATAACACTATGCTATACTAAAAACAGTACTACCAATTGA -358
               ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2248874 TTCTTTGACATATTTATTATTATAACACTATGCTATACTAAAAACAGTACTACCAATTGA 2248933

Query: -359    ATTAAATTAAATTAAATTAAATTAAATTATTAGACCAATTTCAATAAAGATAAGCAATTT -418
----------------------------------------------------------------------------------------------------

**Here is the snippet of code that reads the old report, generates new objects and writes the new report:
----------------------------------------------------------------------------------------------------
# Modules used by this snippet
use Bio::SearchIO;
use Bio::SearchIO::Writer::TextResultWriter;
use Bio::Search::Result::BlastResult;
use Bio::Search::Hit::BlastHit;

my $blast_report = Bio::SearchIO->new(-format => 'blast',
                                      -file   => $blastOutputTmp);

my $writer = Bio::SearchIO::Writer::TextResultWriter->new(-no_wublastlinks => 0);
my $out_blast_report = Bio::SearchIO->new(-writer => $writer,
                                          -file   =>
                                          ">$blastOutputFile");

my $sorted_blast_report;

while( my $result = $blast_report->next_result ) {

    # Copy the parameters and statistics of the parsed result
    my (%parameters, %statistics);
    foreach my $param ($result->available_parameters) {
        $parameters{$param} = $result->get_parameter($param);
    }
    foreach my $stat ($result->available_statistics) {
        $statistics{$stat} = $result->get_statistic($stat);
    }

    # Build a new Result object carrying the same metadata
    my $generic_result = Bio::Search::Result::BlastResult->new(
        -query_name          => $result->query_name,
        -query_length        => $result->query_length,
        -database_name       => $result->database_name,
        -database_entries    => $result->database_entries,
        -parameters          => \%parameters,
        -statistics          => \%statistics,
        -algorithm           => $result->algorithm,
        -query_description   => $result->query_description,
        -algorithm_reference => $result->algorithm_reference,
        -algorithm_version   => $result->algorithm_version,
        -database_letters    => $result->database_letters);

    while( my $hit = $result->next_hit ) {

        # Build a new Hit object carrying the same metadata
        my $generic_hit = Bio::Search::Hit::BlastHit->new(
            -name         => $hit->name,
            -algorithm    => $hit->algorithm,
            -description  => $hit->description,
            -length       => $hit->length,
            -score        => $hit->score,
            -bits         => $hit->bits,
            -significance => $hit->significance);

        # Collect the HSPs of this hit and sort them by ascending P-value
        my (@hsp_sorted, @hsps);
        while( my $hsp = $hit->next_hsp ) {
            push(@hsps, $hsp);
        }
        @hsp_sorted = sort {$a->pvalue <=> $b->pvalue} @hsps;

        # Attach the original HSP objects to the new hit in sorted order
        $generic_hit->add_hsp($_) foreach @hsp_sorted;

        $generic_result->add_hit($generic_hit);
    }

    $out_blast_report->write_result($generic_result);
}
----------------------------------------------------------------------------------------------------

From jason at bioperl.org Thu May 8 18:29:40 2008
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 8 May 2008 15:29:40 -0700
Subject: [Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter
In-Reply-To: <8684cf960805081354s6400b1eey917f6b9ae862eded@mail.gmail.com>
References: <8684cf960805081354s6400b1eey917f6b9ae862eded@mail.gmail.com>
Message-ID: <27483384-0188-44F5-8AF8-5293A7A83547@bioperl.org>

I
suspect somehow you are not reconstituting the Hit or Result objects properly, but I didn't try and debug this myself.

You can now pass a sort function to the Result object to specify the Hit order; maybe we should add a sort function to the Hit object as well, for retrieving the underlying HSPs in a programmable order. That seems like it would be a cleaner fix.

-jason

On May 8, 2008, at 1:54 PM, Prachi Shah wrote:

> Hi all,
>
> I am trying to order of HSPs within each BLAST Hit in the order of
> ascending P-values. So, I parse my WU-BLAST report using Bio::SearchIO
> and create new Result, Hit and HSP objects in the order and then write
> out another BLAST report with the
> Bio::SearchIO::Writer::TextResultWriter module. All this works fine.
> But, when I try to parse this new blast report with
> Bio::SearchIO::blast, I get the following error:
>
> ------------- EXCEPTION -------------
> MSG: no data for midline Query: 0 1
> STACK Bio::SearchIO::blast::next_result
> /tools/perl/5.6.1/lib/site_perl/5.6.1/Bio/SearchIO/blast.pm:1151
> STACK toplevel bin/testBlastParse.pl:12
> --------------------------------------
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From prachi at stanford.edu Thu May 8 18:35:30 2008
From: prachi at stanford.edu (Prachi Shah)
Date: Thu, 8 May 2008 15:35:30 -0700
Subject: [Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter
In-Reply-To: <27483384-0188-44F5-8AF8-5293A7A83547@bioperl.org>
References: <8684cf960805081354s6400b1eey917f6b9ae862eded@mail.gmail.com>
	<27483384-0188-44F5-8AF8-5293A7A83547@bioperl.org>
Message-ID: <8684cf960805081535v2a8c8261hcd373612100cdaf5@mail.gmail.com>

> I suspect somehow you are not reconstituting the Hit or Result objects properly,
> but I didn't try and debug this myself.

It's possible, but I haven't been able to pinpoint what is going wrong. And yet the writer object is able to write the report without incident. I am at a loss.

> You can specify a sort order function to the Result object now to specify the Hit order,
> maybe we should add sort function to Hit object for retrieving the underlying HSPs in a
> programmable order. Seems like that would be a cleaner fix.

That would be ideal! But until that is available, I will have to make do with a solution like this one.

Thanks,
Prachi

> On May 8, 2008, at 1:54 PM, Prachi Shah wrote:
>
>> Hi all,
>>
>> I am trying to order of HSPs within each BLAST Hit in the order of
>> ascending P-values. So, I parse my WU-BLAST report using Bio::SearchIO
>> and create new Result, Hit and HSP objects in the order and then write
>> out another BLAST report with the
>> Bio::SearchIO::Writer::TextResultWriter module. All this works fine.
>> But, when I try to parse this new blast report with >> Bio::SearchIO::blast, I get the following error: >> >> ------------- EXCEPTION ------------- >> MSG: no data for midline Query: 0 1 >> STACK Bio::SearchIO::blast::next_result >> /tools/perl/5.6.1/lib/site_perl/5.6.1/Bio/SearchIO/blast.pm:1151 >> STACK toplevel bin/testBlastParse.pl:12 >> -------------------------------------- >> >> I have copied below sample sections of both blast reports and the >> code. Any hints/ pointers/ suggestions are greatly appreciated. >> >> Thanks, >> Prachi >> >> >> >> The old vs new blast reports look slightly different, esp. note the >> HSP start and stop coordinates for the QUERY sequence. >> >> **Snippet of OLD blast report (generated by WU-BLAST): >> ---------------------------------------------------------------------------------------------------- >> Query= orf19.4890 >> (4931 letters) >> >> Database: Ca21_Chromosomes >> 9 sequences; 14,324,492 total letters. >> Searching....10....20....30....40....50....60....70....80....90....100% done >> >> WARNING: hspmax=1000 was exceeded by 8 of the database sequences, causing the >> associated cutoff score, S2, to be transiently set as high as 113. >> >> Smallest >> Sum >> High Probability >> Sequences producing High-scoring Segment Pairs: Score P(N) N >> >> Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides) 24655 0. 
1 >> Ca21chr5 Assembly 21, Ca21chr5 (1190941 nucleotides) 1682 3.4e-68 3 >> Ca21chr6 Assembly 21, Ca21chr6 (1033553 nucleotides) 908 3.0e-34 3 >> Ca21chr2 Assembly 21, Ca21chr2 (2232049 nucleotides) 859 4.7e-30 1 >> Ca21chr7 Assembly 21, Ca21chr7 (949626 nucleotides) 492 7.3e-24 3 >> Ca21chr4 Assembly 21, Ca21chr4 (1603475 nucleotides) 528 9.8e-21 2 >> Ca21chrR Assembly 21, Ca21chrR (2286425 nucleotides) 520 1.4e-19 5 >> Ca21chr3 Assembly 21, Ca21chr3 (1799426 nucleotides) 502 1.7e-14 2 >> Ca19-mtDNA Assembly 19, Ca19-mtDNA (40420 nucleotides) 313 2.9e-06 2 >> >> >>> Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides) >> >> Length = 3,188,577 >> >> Plus Strand HSPs: >> >> Score = 506 (82.0 bits), Expect = 4.9e-14, P = 4.9e-14 >> Identities = 850/1549 (54%), Positives = 850/1549 (54%), Strand = Plus / Plus >> >> Query: 3450 ATGCATATGGTAATGTTAA-AATCACTGATTTTGGA-TTTTGTGCTAAATTAAC-T-GAT 3505 >> | | ||| | | || |||| ||| ||||| ||| | ||||| || | || | | | | >> Sbjct: 155924 AGGGATACGATTAT-TTAAGAATT-CTGATATTGAAATTTTG-GC-ATTTTCATATAGTT >> 155979 >> >> Query: 3506 CAAAGA--AATAAACGTGCC-ACAATGGTGGGGACACCATATTGG-ATGGCACCTGAAGT 3561 >> |||| | |||||| | | |||| || | ||| | | ||| | | | | >> Sbjct: 155980 CAAACATTAATAAATATATTGAAAATGTTGATTTAATCAT-TAGTCATG---CTGGTACT >> 156035 >> >> Query: 3562 GGTTAAACAAAAGGAATATGATGAAAAAGTTGATGTTTGGTCATTGGGGATTATGACTAT 3621 >> || | || | | || || | | | |||| | |||| |||| || >> Sbjct: 156036 GGATCAATCATTG--AT-TGTTTACAT--TTGAA--TAAACCATTAATTGTTATTGTTAA >> 156088 >> >> Query: 3622 TGAAATGATTGAAGGAGAACCACCTTATTTGAA-T-GAAGAACCATTAAAAGCATTATAT 3679 >> ---------------------------------------------------------------------------------------------------- >> >> **Snippet of NEW blast report (generated using >> Bio::SearchIO::Writer::TextResultWriter) >> ---------------------------------------------------------------------------------------------------- >> uery= orf19.4890 >> (4,931 letters) >> >> Database: Ca21_Chromosomes >> 9 sequences; 14,324,492 total letters >> 
>> Score E >> Sequences producing significant alignments: (bits) value >> Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides) 24655 0. >> Ca21chr5 Assembly 21, Ca21chr5 (1190941 nucleotides) >> 1682 3.4e-68 >> Ca21chr6 Assembly 21, Ca21chr6 (1033553 nucleotides) >> 908 3.0e-34 >> Ca21chr2 Assembly 21, Ca21chr2 (2232049 nucleotides) >> 859 4.7e-30 >> Ca21chr7 Assembly 21, Ca21chr7 (949626 nucleotides) >> 492 7.3e-24 >> Ca21chr4 Assembly 21, Ca21chr4 (1603475 nucleotides) >> 528 9.8e-21 >> Ca21chrR Assembly 21, Ca21chrR (2286425 nucleotides) >> 520 1.4e-19 >> Ca21chr3 Assembly 21, Ca21chr3 (1799426 nucleotides) >> 502 1.7e-14 >> Ca19-mtDNA Assembly 19, Ca19-mtDNA (40420 nucleotides) >> 313 2.9e-06 >> >> >>> Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides) >> >> Length = 3188577 >> >> Score = 3705.3 bits (24655), Expect = 0., P = 0. >> Identities = 4931/4931 (100%) >> Frame = -1 / +1 >> >> Query: 1 ATAAAGGATGCCAAATAGTAGTAGTAAAATAGTAAATAGAATTGCAAAACAAAAATGATT -58 >> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >> Sbjct: 2248574 ATAAAGGATGCCAAATAGTAGTAGTAAAATAGTAAATAGAATTGCAAAACAAAAATGATT >> 2248633 >> >> Query: -59 AAATAGCCCTTTATCAATAAATTTTTAAAGTTAGTTTCTTCTGGAACCCTACCCTCTTGG -118 >> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >> Sbjct: 2248634 AAATAGCCCTTTATCAATAAATTTTTAAAGTTAGTTTCTTCTGGAACCCTACCCTCTTGG >> 2248693 >> >> Query: -119 TGTTAATCTTTTAAGTTAATATTTATAGTTAATAAAGTAGAAGTGTCTATTTATTGATTG -178 >> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >> Sbjct: 2248694 TGTTAATCTTTTAAGTTAATATTTATAGTTAATAAAGTAGAAGTGTCTATTTATTGATTG >> 2248753 >> >> Query: -179 TTGTTGTTGTTGATTAAGAATATAAAGAAAAACAGAAAAGAAAAAAAGAAGGTTTAAAAA -238 >> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >> Sbjct: 2248754 TTGTTGTTGTTGATTAAGAATATAAAGAAAAACAGAAAAGAAAAAAAGAAGGTTTAAAAA >> 2248813 >> >> Query: -239 AGTTAATTGTGAAGTAAAAGGGTTGAAAAATTTTTTTTTTTTCTGTTTCTCTCTTTGAGA -298 >> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >> 
Sbjct: 2248814 AGTTAATTGTGAAGTAAAAGGGTTGAAAAATTTTTTTTTTTTCTGTTTCTCTCTTTGAGA >> 2248873 >> >> Query: -299 TTCTTTGACATATTTATTATTATAACACTATGCTATACTAAAAACAGTACTACCAATTGA -358 >> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >> Sbjct: 2248874 TTCTTTGACATATTTATTATTATAACACTATGCTATACTAAAAACAGTACTACCAATTGA >> 2248933 >> >> Query: -359 ATTAAATTAAATTAAATTAAATTAAATTATTAGACCAATTTCAATAAAGATAAGCAATTT -418 >> >> ---------------------------------------------------------------------------------------------------- >> >> **Here is the snippet of code that reads the old report, generates new >> objects and writes new report: >> ---------------------------------------------------------------------------------------------------- >> my $blast_report = Bio::SearchIO->new(-format => 'blast', >> -file => $blastOutputTmp); >> >> my $writer = >> Bio::SearchIO::Writer::TextResultWriter->new(-no_wublastlinks => 0); >> my $out_blast_report = Bio::SearchIO->new(-writer => $writer, >> -file => ">$blastOutputFile"); >> >> my $sorted_blast_report; >> >> while( my $result = $blast_report->next_result ) { >> >> my (%parameters, %statistics); >> >> foreach my $param ($result->available_parameters) { >> >> $parameters{$param} = $result->get_parameter($param); >> } >> >> foreach my $stat ($result->available_statistics) { >> >> $statistics{$stat} = $result->get_statistic($stat); >> } >> >> my $generic_result = >> Bio::Search::Result::BlastResult->new(-query_name => >> $result->query_name, >> -query_length => >> $result->query_length, >> -database_name => >> $result->database_name, >> -database_entries => >> $result->database_entries, >> -parameters => \%parameters, >> -statistics => \%statistics, >> -algorithm => $result->algorithm, >> -query_description => >> $result->query_description, >> -algorithm_reference => >> $result->algorithm_reference, >> -algorithm_version => >> $result->algorithm_version, >> -database_letters => >> $result->database_letters); >> >> while( my $hit = 
$result->next_hit ) {
>>
>>        my $generic_hit = Bio::Search::Hit::BlastHit->new(-name
>> => $hit->name,
>>            -algorithm => $hit->algorithm,
>>            -description => $hit->description,
>>            -length => $hit->length,
>>            -score => $hit->score,
>>            -bits => $hit->bits,
>>            -significance => $hit->significance);
>>
>>        my (@hsp_sorted, @hsps);
>>        while( my $hsp = $hit->next_hsp ) {
>>
>>            push(@hsps, $hsp);
>>        }
>>
>>        @hsp_sorted = sort {$a->pvalue <=> $b->pvalue} @hsps;
>>
>>        for(my $i=0; $i<=$#hsp_sorted; $i++) {
>>
>>            $generic_hit->add_hsp($hsp_sorted[$i]);
>>
>>        }
>>
>>        $generic_result->add_hit($generic_hit);
>>
>>    }
>>
>>    $out_blast_report->write_result($generic_result);
>>
>> }
>> ----------------------------------------------------------------------------------------------------
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From cjfields at uiuc.edu Thu May 8 20:03:11 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 8 May 2008 19:03:11 -0500
Subject: [Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter
In-Reply-To: <8684cf960805081535v2a8c8261hcd373612100cdaf5@mail.gmail.com>
References: <8684cf960805081354s6400b1eey917f6b9ae862eded@mail.gmail.com>
    <27483384-0188-44F5-8AF8-5293A7A83547@bioperl.org>
    <8684cf960805081535v2a8c8261hcd373612100cdaf5@mail.gmail.com>
Message-ID: <6661CE6F-0795-4EDE-9D05-CD95BAB3DBA4@uiuc.edu>

You can always post it as an enhancement request in bugzilla. I don't
think it would be too hard to implement.

chris

On May 8, 2008, at 5:35 PM, Prachi Shah wrote:

>> I suspect somehow you are not reconstituting the Hit or Result
>> objects properly,
>> but I didn't try and debug this myself.
>
> Its possible, but I haven't been to point out what is going wrong. But
> then, the writer object is able to write the report without incident.
> I am at a loss.
>
>> You can specify a sort order function to the Result object now to
>> specify the Hit order,
>> maybe we should add sort function to Hit object for retrieving the
>> underlying HSPs in a
>> programmable order. Seems like that would be a cleaner fix.
>
> That would be ideal! But, until that is available, I will have to
> make-do with such a solution.
>
> Thanks,
> Prachi
>
>
>> On May 8, 2008, at 1:54 PM, Prachi Shah wrote:
>>
>>> Hi all,
>>>
>>> I am trying to order of HSPs within each BLAST Hit in the order of
>>> ascending P-values. So, I parse my WU-BLAST report using
>>> Bio::SearchIO
>>> and create new Result, Hit and HSP objects in the order and then
>>> write
>>> out another BLAST report with the
>>> Bio::SearchIO::Writer::TextResultWriter module. All this works fine.
>>> But, when I try to parse this new blast report with
>>> Bio::SearchIO::blast, I get the following error:
>>>
>>> ------------- EXCEPTION -------------
>>> MSG: no data for midline Query: 0 1
>>> STACK Bio::SearchIO::blast::next_result
>>> /tools/perl/5.6.1/lib/site_perl/5.6.1/Bio/SearchIO/blast.pm:1151
>>> STACK toplevel bin/testBlastParse.pl:12
>>> --------------------------------------
>>>
>>> I have copied below sample sections of both blast reports and the
>>> code. Any hints/ pointers/ suggestions are greatly appreciated.
>>>
>>> Thanks,
>>> Prachi
>>>
>>>
>>>
>>> The old vs new blast reports look slightly different, esp. note the
>>> HSP start and stop coordinates for the QUERY sequence.
>>>
>>> **Snippet of OLD blast report (generated by WU-BLAST):
>>> ----------------------------------------------------------------------------------------------------
>>> Query= orf19.4890
>>> (4931 letters)
>>>
>>> Database: Ca21_Chromosomes
>>> 9 sequences; 14,324,492 total letters.
>>> Searching....
>>> 10....20....30....40....50....60....70....80....90....100% done >>> >>> WARNING: hspmax=1000 was exceeded by 8 of the database sequences, >>> causing the >>> associated cutoff score, S2, to be transiently set as high >>> as 113. >>> >>> >>> Smallest >>> Sum >>> High >>> Probability >>> Sequences producing High-scoring Segment Pairs: >>> Score P(N) N >>> >>> Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides) >>> 24655 0. 1 >>> Ca21chr5 Assembly 21, Ca21chr5 (1190941 nucleotides) >>> 1682 3.4e-68 3 >>> Ca21chr6 Assembly 21, Ca21chr6 (1033553 nucleotides) >>> 908 3.0e-34 3 >>> Ca21chr2 Assembly 21, Ca21chr2 (2232049 nucleotides) >>> 859 4.7e-30 1 >>> Ca21chr7 Assembly 21, Ca21chr7 (949626 nucleotides) >>> 492 7.3e-24 3 >>> Ca21chr4 Assembly 21, Ca21chr4 (1603475 nucleotides) >>> 528 9.8e-21 2 >>> Ca21chrR Assembly 21, Ca21chrR (2286425 nucleotides) >>> 520 1.4e-19 5 >>> Ca21chr3 Assembly 21, Ca21chr3 (1799426 nucleotides) >>> 502 1.7e-14 2 >>> Ca19-mtDNA Assembly 19, Ca19-mtDNA (40420 nucleotides) >>> 313 2.9e-06 2 >>> >>> >>>> Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides) >>> >>> Length = 3,188,577 >>> >>> Plus Strand HSPs: >>> >>> Score = 506 (82.0 bits), Expect = 4.9e-14, P = 4.9e-14 >>> Identities = 850/1549 (54%), Positives = 850/1549 (54%), Strand = >>> Plus / Plus >>> >>> Query: 3450 ATGCATATGGTAATGTTAA-AATCACTGATTTTGGA- >>> TTTTGTGCTAAATTAAC-T-GAT 3505 >>> | | ||| | | || |||| ||| ||||| ||| | ||||| || | || | >>> | | | >>> Sbjct: 155924 AGGGATACGATTAT-TTAAGAATT-CTGATATTGAAATTTTG-GC- >>> ATTTTCATATAGTT >>> 155979 >>> >>> Query: 3506 CAAAGA--AATAAACGTGCC-ACAATGGTGGGGACACCATATTGG- >>> ATGGCACCTGAAGT 3561 >>> |||| | |||||| | | |||| || | ||| | | ||| | >>> | | | >>> Sbjct: 155980 CAAACATTAATAAATATATTGAAAATGTTGATTTAATCAT-TAGTCATG--- >>> CTGGTACT >>> 156035 >>> >>> Query: 3562 >>> GGTTAAACAAAAGGAATATGATGAAAAAGTTGATGTTTGGTCATTGGGGATTATGACTAT 3621 >>> || | || | | || || | | | |||| | |||| >>> |||| || >>> Sbjct: 156036 GGATCAATCATTG--AT-TGTTTACAT--TTGAA-- >>> 
TAAACCATTAATTGTTATTGTTAA >>> 156088 >>> >>> Query: 3622 TGAAATGATTGAAGGAGAACCACCTTATTTGAA-T- >>> GAAGAACCATTAAAAGCATTATAT 3679 >>> ---------------------------------------------------------------------------------------------------- >>> >>> **Snippet of NEW blast report (generated using >>> Bio::SearchIO::Writer::TextResultWriter) >>> ---------------------------------------------------------------------------------------------------- >>> uery= orf19.4890 >>> (4,931 letters) >>> >>> Database: Ca21_Chromosomes >>> 9 sequences; 14,324,492 total letters >>> >>> >>> Score E >>> Sequences producing significant alignments: >>> (bits) value >>> Ca21chr1 Assembly 21, Ca21chr1 (3188577 >>> nucleotides) 24655 0. >>> Ca21chr5 Assembly 21, Ca21chr5 (1190941 nucleotides) >>> 1682 3.4e-68 >>> Ca21chr6 Assembly 21, Ca21chr6 (1033553 nucleotides) >>> 908 3.0e-34 >>> Ca21chr2 Assembly 21, Ca21chr2 (2232049 nucleotides) >>> 859 4.7e-30 >>> Ca21chr7 Assembly 21, Ca21chr7 (949626 nucleotides) >>> 492 7.3e-24 >>> Ca21chr4 Assembly 21, Ca21chr4 (1603475 nucleotides) >>> 528 9.8e-21 >>> Ca21chrR Assembly 21, Ca21chrR (2286425 nucleotides) >>> 520 1.4e-19 >>> Ca21chr3 Assembly 21, Ca21chr3 (1799426 nucleotides) >>> 502 1.7e-14 >>> Ca19-mtDNA Assembly 19, Ca19-mtDNA (40420 nucleotides) >>> 313 2.9e-06 >>> >>> >>>> Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides) >>> >>> Length = 3188577 >>> >>> Score = 3705.3 bits (24655), Expect = 0., P = 0. 
>>> Identities = 4931/4931 (100%) >>> Frame = -1 / +1 >>> >>> Query: 1 >>> ATAAAGGATGCCAAATAGTAGTAGTAAAATAGTAAATAGAATTGCAAAACAAAAATGATT -58 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2248574 >>> ATAAAGGATGCCAAATAGTAGTAGTAAAATAGTAAATAGAATTGCAAAACAAAAATGATT >>> 2248633 >>> >>> Query: -59 >>> AAATAGCCCTTTATCAATAAATTTTTAAAGTTAGTTTCTTCTGGAACCCTACCCTCTTGG -118 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2248634 >>> AAATAGCCCTTTATCAATAAATTTTTAAAGTTAGTTTCTTCTGGAACCCTACCCTCTTGG >>> 2248693 >>> >>> Query: -119 >>> TGTTAATCTTTTAAGTTAATATTTATAGTTAATAAAGTAGAAGTGTCTATTTATTGATTG -178 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2248694 >>> TGTTAATCTTTTAAGTTAATATTTATAGTTAATAAAGTAGAAGTGTCTATTTATTGATTG >>> 2248753 >>> >>> Query: -179 >>> TTGTTGTTGTTGATTAAGAATATAAAGAAAAACAGAAAAGAAAAAAAGAAGGTTTAAAAA -238 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2248754 >>> TTGTTGTTGTTGATTAAGAATATAAAGAAAAACAGAAAAGAAAAAAAGAAGGTTTAAAAA >>> 2248813 >>> >>> Query: -239 >>> AGTTAATTGTGAAGTAAAAGGGTTGAAAAATTTTTTTTTTTTCTGTTTCTCTCTTTGAGA -298 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2248814 >>> AGTTAATTGTGAAGTAAAAGGGTTGAAAAATTTTTTTTTTTTCTGTTTCTCTCTTTGAGA >>> 2248873 >>> >>> Query: -299 >>> TTCTTTGACATATTTATTATTATAACACTATGCTATACTAAAAACAGTACTACCAATTGA -358 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2248874 >>> TTCTTTGACATATTTATTATTATAACACTATGCTATACTAAAAACAGTACTACCAATTGA >>> 2248933 >>> >>> Query: -359 >>> ATTAAATTAAATTAAATTAAATTAAATTATTAGACCAATTTCAATAAAGATAAGCAATTT -418 >>> >>> ---------------------------------------------------------------------------------------------------- >>> >>> **Here is the snippet of code that reads the old report, generates >>> new >>> objects and writes new report: >>> 
---------------------------------------------------------------------------------------------------- >>> my $blast_report = Bio::SearchIO->new(-format => 'blast', >>> -file => $blastOutputTmp); >>> >>> my $writer = >>> Bio::SearchIO::Writer::TextResultWriter->new(-no_wublastlinks => 0); >>> my $out_blast_report = Bio::SearchIO->new(-writer => $writer, >>> -file => ">$blastOutputFile"); >>> >>> my $sorted_blast_report; >>> >>> while( my $result = $blast_report->next_result ) { >>> >>> my (%parameters, %statistics); >>> >>> foreach my $param ($result->available_parameters) { >>> >>> $parameters{$param} = $result->get_parameter($param); >>> } >>> >>> foreach my $stat ($result->available_statistics) { >>> >>> $statistics{$stat} = $result->get_statistic($stat); >>> } >>> >>> my $generic_result = >>> Bio::Search::Result::BlastResult->new(-query_name => >>> $result->query_name, >>> -query_length => >>> $result->query_length, >>> -database_name => >>> $result->database_name, >>> -database_entries => >>> $result->database_entries, >>> -parameters => \ >>> %parameters, >>> -statistics => \ >>> %statistics, >>> -algorithm => $result- >>> >algorithm, >>> -query_description => >>> $result->query_description, >>> -algorithm_reference => >>> $result->algorithm_reference, >>> -algorithm_version => >>> $result->algorithm_version, >>> -database_letters => >>> $result->database_letters); >>> >>> while( my $hit = $result->next_hit ) { >>> >>> my $generic_hit = Bio::Search::Hit::BlastHit->new(-name >>> => $hit->name, >>> -algorithm => $hit->algorithm, >>> -description => $hit->description, >>> -length => $hit->length, >>> -score => $hit->score, >>> -bits => $hit->bits, >>> -significance => $hit- >>> >significance); >>> >>> my (@hsp_sorted, @hsps); >>> while( my $hsp = $hit->next_hsp ) { >>> >>> push(@hsps, $hsp); >>> } >>> >>> @hsp_sorted = sort {$a->pvalue <=> $b->pvalue} @hsps; >>> >>> for(my $i=0; $i<=$#hsp_sorted; $i++) { >>> >>> $generic_hit->add_hsp($hsp_sorted[$i]); >>> >>> } 
>>>
>>>        $generic_result->add_hit($generic_hit);
>>>
>>>    }
>>>
>>>    $out_blast_report->write_result($generic_result);
>>>
>>> }
>>> ----------------------------------------------------------------------------------------------------
>>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From punit_vergoboy2004 at yahoo.co.in Fri May 9 07:45:36 2008
From: punit_vergoboy2004 at yahoo.co.in (punit kumar)
Date: Fri, 9 May 2008 17:15:36 +0530 (IST)
Subject: [Bioperl-l] help_to_acces_clustal-w
Message-ID: <937459.50783.qm@web8712.mail.in.yahoo.com>

Hi friends,

Can anyone suggest how I can fully install BioPerl on my computer, and
how I can access the ClustalW module in my program?
I am waiting for anyone's reply.

Punit Kumar Kadimi

From David.Messina at sbc.su.se Fri May 9 08:44:32 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 9 May 2008 14:44:32 +0200
Subject: [Bioperl-l] help_to_acces_clustal-w
In-Reply-To: <937459.50783.qm@web8712.mail.in.yahoo.com>
References: <937459.50783.qm@web8712.mail.in.yahoo.com>
Message-ID: <628aabb70805090544r2edc5fber7ce8fd49693fc041@mail.gmail.com>

Hi Punit,

You haven't said whether you've tried to install BioPerl already, or what
kind of computer you have, so I'm afraid we don't know what you need help
with.

There are detailed installation instructions on the website here:

http://www.bioperl.org/wiki/Installing_BioPerl

If you want to parse ClustalW output, you use the AlignIO module.
Details on that here:

http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/AlignIO.html

Dave

From prachi at stanford.edu Fri May 9 13:43:41 2008
From: prachi at stanford.edu (Prachi Shah)
Date: Fri, 9 May 2008 10:43:41 -0700
Subject: [Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter
In-Reply-To: <6661CE6F-0795-4EDE-9D05-CD95BAB3DBA4@uiuc.edu>
References: <8684cf960805081354s6400b1eey917f6b9ae862eded@mail.gmail.com>
    <27483384-0188-44F5-8AF8-5293A7A83547@bioperl.org>
    <8684cf960805081535v2a8c8261hcd373612100cdaf5@mail.gmail.com>
    <6661CE6F-0795-4EDE-9D05-CD95BAB3DBA4@uiuc.edu>
Message-ID: <8684cf960805091043j706d2aaej8584b1e7d4e2e4d7@mail.gmail.com>

Thanks. I have put in a bugzilla request. However, I still need
suggestions for solving my immediate problem. Any hints are greatly
appreciated.

Thanks,
Prachi

On Thu, May 8, 2008 at 5:03 PM, Chris Fields wrote:
> You can always post it as an enhancement request in bugzilla. I don't think
> it would be too hard to implement.
>
> chris
>
> On May 8, 2008, at 5:35 PM, Prachi Shah wrote:
>
>>> I suspect somehow you are not reconstituting the Hit or Result objects
>>> properly,
>>> but I didn't try and debug this myself.
>>
>> Its possible, but I haven't been to point out what is going wrong. But
>> then, the writer object is able to write the report without incident.
>> I am at a loss.
>>
>>> You can specify a sort order function to the Result object now to specify
>>> the Hit order,
>>> maybe we should add sort function to Hit object for retrieving the
>>> underlying HSPs in a
>>> programmable order. Seems like that would be a cleaner fix.
>>
>> That would be ideal! But, until that is available, I will have to
>> make-do with such a solution.
>>
>> Thanks,
>> Prachi
>>
>>
>>> On May 8, 2008, at 1:54 PM, Prachi Shah wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am trying to order of HSPs within each BLAST Hit in the order of
>>>> ascending P-values.
So, I parse my WU-BLAST report using Bio::SearchIO >>>> and create new Result, Hit and HSP objects in the order and then write >>>> out another BLAST report with the >>>> Bio::SearchIO::Writer::TextResultWriter module. All this works fine. >>>> But, when I try to parse this new blast report with >>>> Bio::SearchIO::blast, I get the following error: >>>> >>>> ------------- EXCEPTION ------------- >>>> MSG: no data for midline Query: 0 1 >>>> STACK Bio::SearchIO::blast::next_result >>>> /tools/perl/5.6.1/lib/site_perl/5.6.1/Bio/SearchIO/blast.pm:1151 >>>> STACK toplevel bin/testBlastParse.pl:12 >>>> -------------------------------------- >>>> >>>> I have copied below sample sections of both blast reports and the >>>> code. Any hints/ pointers/ suggestions are greatly appreciated. >>>> >>>> Thanks, >>>> Prachi >>>> >>>> >>>> >>>> The old vs new blast reports look slightly different, esp. note the >>>> HSP start and stop coordinates for the QUERY sequence. >>>> >>>> **Snippet of OLD blast report (generated by WU-BLAST): >>>> >>>> ---------------------------------------------------------------------------------------------------- >>>> Query= orf19.4890 >>>> (4931 letters) >>>> >>>> Database: Ca21_Chromosomes >>>> 9 sequences; 14,324,492 total letters. >>>> Searching....10....20....30....40....50....60....70....80....90....100% >>>> done >>>> >>>> WARNING: hspmax=1000 was exceeded by 8 of the database sequences, >>>> causing the >>>> associated cutoff score, S2, to be transiently set as high as >>>> 113. >>>> >>>> >>>> Smallest >>>> Sum >>>> High >>>> Probability >>>> Sequences producing High-scoring Segment Pairs: Score P(N) >>>> N >>>> >>>> Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides) 24655 0. 
>>>> 1 >>>> Ca21chr5 Assembly 21, Ca21chr5 (1190941 nucleotides) 1682 >>>> 3.4e-68 3 >>>> Ca21chr6 Assembly 21, Ca21chr6 (1033553 nucleotides) 908 >>>> 3.0e-34 3 >>>> Ca21chr2 Assembly 21, Ca21chr2 (2232049 nucleotides) 859 >>>> 4.7e-30 1 >>>> Ca21chr7 Assembly 21, Ca21chr7 (949626 nucleotides) 492 >>>> 7.3e-24 3 >>>> Ca21chr4 Assembly 21, Ca21chr4 (1603475 nucleotides) 528 >>>> 9.8e-21 2 >>>> Ca21chrR Assembly 21, Ca21chrR (2286425 nucleotides) 520 >>>> 1.4e-19 5 >>>> Ca21chr3 Assembly 21, Ca21chr3 (1799426 nucleotides) 502 >>>> 1.7e-14 2 >>>> Ca19-mtDNA Assembly 19, Ca19-mtDNA (40420 nucleotides) 313 >>>> 2.9e-06 2 >>>> >>>> >>>>> Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides) >>>> >>>> Length = 3,188,577 >>>> >>>> Plus Strand HSPs: >>>> >>>> Score = 506 (82.0 bits), Expect = 4.9e-14, P = 4.9e-14 >>>> Identities = 850/1549 (54%), Positives = 850/1549 (54%), Strand = Plus / >>>> Plus >>>> >>>> Query: 3450 >>>> ATGCATATGGTAATGTTAA-AATCACTGATTTTGGA-TTTTGTGCTAAATTAAC-T-GAT 3505 >>>> | | ||| | | || |||| ||| ||||| ||| | ||||| || | || | | | | >>>> Sbjct: 155924 >>>> AGGGATACGATTAT-TTAAGAATT-CTGATATTGAAATTTTG-GC-ATTTTCATATAGTT >>>> 155979 >>>> >>>> Query: 3506 >>>> CAAAGA--AATAAACGTGCC-ACAATGGTGGGGACACCATATTGG-ATGGCACCTGAAGT 3561 >>>> |||| | |||||| | | |||| || | ||| | | ||| | | | | >>>> Sbjct: 155980 >>>> CAAACATTAATAAATATATTGAAAATGTTGATTTAATCAT-TAGTCATG---CTGGTACT >>>> 156035 >>>> >>>> Query: 3562 >>>> GGTTAAACAAAAGGAATATGATGAAAAAGTTGATGTTTGGTCATTGGGGATTATGACTAT 3621 >>>> || | || | | || || | | | |||| | |||| |||| || >>>> Sbjct: 156036 >>>> GGATCAATCATTG--AT-TGTTTACAT--TTGAA--TAAACCATTAATTGTTATTGTTAA >>>> 156088 >>>> >>>> Query: 3622 >>>> TGAAATGATTGAAGGAGAACCACCTTATTTGAA-T-GAAGAACCATTAAAAGCATTATAT 3679 >>>> >>>> ---------------------------------------------------------------------------------------------------- >>>> >>>> **Snippet of NEW blast report (generated using >>>> Bio::SearchIO::Writer::TextResultWriter) >>>> >>>> 
---------------------------------------------------------------------------------------------------- >>>> uery= orf19.4890 >>>> (4,931 letters) >>>> >>>> Database: Ca21_Chromosomes >>>> 9 sequences; 14,324,492 total letters >>>> >>>> Score >>>> E >>>> Sequences producing significant alignments: (bits) >>>> value >>>> Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides) >>>> 24655 0. >>>> Ca21chr5 Assembly 21, Ca21chr5 (1190941 nucleotides) >>>> 1682 3.4e-68 >>>> Ca21chr6 Assembly 21, Ca21chr6 (1033553 nucleotides) >>>> 908 3.0e-34 >>>> Ca21chr2 Assembly 21, Ca21chr2 (2232049 nucleotides) >>>> 859 4.7e-30 >>>> Ca21chr7 Assembly 21, Ca21chr7 (949626 nucleotides) >>>> 492 7.3e-24 >>>> Ca21chr4 Assembly 21, Ca21chr4 (1603475 nucleotides) >>>> 528 9.8e-21 >>>> Ca21chrR Assembly 21, Ca21chrR (2286425 nucleotides) >>>> 520 1.4e-19 >>>> Ca21chr3 Assembly 21, Ca21chr3 (1799426 nucleotides) >>>> 502 1.7e-14 >>>> Ca19-mtDNA Assembly 19, Ca19-mtDNA (40420 nucleotides) >>>> 313 2.9e-06 >>>> >>>> >>>>> Ca21chr1 Assembly 21, Ca21chr1 (3188577 nucleotides) >>>> >>>> Length = 3188577 >>>> >>>> Score = 3705.3 bits (24655), Expect = 0., P = 0. 
>>>> Identities = 4931/4931 (100%) >>>> Frame = -1 / +1 >>>> >>>> Query: 1 >>>> ATAAAGGATGCCAAATAGTAGTAGTAAAATAGTAAATAGAATTGCAAAACAAAAATGATT -58 >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2248574 >>>> ATAAAGGATGCCAAATAGTAGTAGTAAAATAGTAAATAGAATTGCAAAACAAAAATGATT >>>> 2248633 >>>> >>>> Query: -59 >>>> AAATAGCCCTTTATCAATAAATTTTTAAAGTTAGTTTCTTCTGGAACCCTACCCTCTTGG -118 >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2248634 >>>> AAATAGCCCTTTATCAATAAATTTTTAAAGTTAGTTTCTTCTGGAACCCTACCCTCTTGG >>>> 2248693 >>>> >>>> Query: -119 >>>> TGTTAATCTTTTAAGTTAATATTTATAGTTAATAAAGTAGAAGTGTCTATTTATTGATTG -178 >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2248694 >>>> TGTTAATCTTTTAAGTTAATATTTATAGTTAATAAAGTAGAAGTGTCTATTTATTGATTG >>>> 2248753 >>>> >>>> Query: -179 >>>> TTGTTGTTGTTGATTAAGAATATAAAGAAAAACAGAAAAGAAAAAAAGAAGGTTTAAAAA -238 >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2248754 >>>> TTGTTGTTGTTGATTAAGAATATAAAGAAAAACAGAAAAGAAAAAAAGAAGGTTTAAAAA >>>> 2248813 >>>> >>>> Query: -239 >>>> AGTTAATTGTGAAGTAAAAGGGTTGAAAAATTTTTTTTTTTTCTGTTTCTCTCTTTGAGA -298 >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2248814 >>>> AGTTAATTGTGAAGTAAAAGGGTTGAAAAATTTTTTTTTTTTCTGTTTCTCTCTTTGAGA >>>> 2248873 >>>> >>>> Query: -299 >>>> TTCTTTGACATATTTATTATTATAACACTATGCTATACTAAAAACAGTACTACCAATTGA -358 >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2248874 >>>> TTCTTTGACATATTTATTATTATAACACTATGCTATACTAAAAACAGTACTACCAATTGA >>>> 2248933 >>>> >>>> Query: -359 >>>> ATTAAATTAAATTAAATTAAATTAAATTATTAGACCAATTTCAATAAAGATAAGCAATTT -418 >>>> >>>> >>>> ---------------------------------------------------------------------------------------------------- >>>> >>>> **Here is the snippet of code that reads the old report, generates new >>>> objects and writes new report: >>>> >>>> 
---------------------------------------------------------------------------------------------------- >>>> my $blast_report = Bio::SearchIO->new(-format => 'blast', >>>> -file => $blastOutputTmp); >>>> >>>> my $writer = >>>> Bio::SearchIO::Writer::TextResultWriter->new(-no_wublastlinks => 0); >>>> my $out_blast_report = Bio::SearchIO->new(-writer => $writer, >>>> -file => ">$blastOutputFile"); >>>> >>>> my $sorted_blast_report; >>>> >>>> while( my $result = $blast_report->next_result ) { >>>> >>>> my (%parameters, %statistics); >>>> >>>> foreach my $param ($result->available_parameters) { >>>> >>>> $parameters{$param} = $result->get_parameter($param); >>>> } >>>> >>>> foreach my $stat ($result->available_statistics) { >>>> >>>> $statistics{$stat} = $result->get_statistic($stat); >>>> } >>>> >>>> my $generic_result = >>>> Bio::Search::Result::BlastResult->new(-query_name => >>>> $result->query_name, >>>> -query_length => >>>> $result->query_length, >>>> -database_name => >>>> $result->database_name, >>>> -database_entries => >>>> $result->database_entries, >>>> -parameters => \%parameters, >>>> -statistics => \%statistics, >>>> -algorithm => >>>> $result->algorithm, >>>> -query_description => >>>> $result->query_description, >>>> -algorithm_reference => >>>> $result->algorithm_reference, >>>> -algorithm_version => >>>> $result->algorithm_version, >>>> -database_letters => >>>> $result->database_letters); >>>> >>>> while( my $hit = $result->next_hit ) { >>>> >>>> my $generic_hit = Bio::Search::Hit::BlastHit->new(-name >>>> => $hit->name, >>>> -algorithm => $hit->algorithm, >>>> -description => $hit->description, >>>> -length => $hit->length, >>>> -score => $hit->score, >>>> -bits => $hit->bits, >>>> -significance => $hit->significance); >>>> >>>> my (@hsp_sorted, @hsps); >>>> while( my $hsp = $hit->next_hsp ) { >>>> >>>> push(@hsps, $hsp); >>>> } >>>> >>>> @hsp_sorted = sort {$a->pvalue <=> $b->pvalue} @hsps; >>>> >>>> for(my $i=0; $i<=$#hsp_sorted; $i++) { >>>> >>>> 
$generic_hit->add_hsp($hsp_sorted[$i]); >>>> >>>> } >>>> >>>> $generic_result->add_hit($generic_hit); >>>> >>>> } >>>> >>>> $out_blast_report->write_result($generic_result); >>>> >>>> } >>>> >>>> ---------------------------------------------------------------------------------------------------- >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > From vdar at yorku.ca Fri May 9 21:10:23 2008 From: vdar at yorku.ca (nisa_dar) Date: Fri, 9 May 2008 18:10:23 -0700 (PDT) Subject: [Bioperl-l] problems with clustalw Message-ID: <17158917.post@talk.nabble.com> Hi, I need to do multiple sequence alignments of DNA sequences by using Bioperl. 
I am using the following module

Bio::Tools::Run::Alignment::Clustalw;

and I am getting the following error message

Can't locate Bio/Tools/Run/Alignment/Clustalw.pm in @INC (@INC contains: /share/iNquiry/perl/lib/5.8.5/x86_64-linux-thread-multi /share/iNquiry/perl/lib/5.8.5 /share/iNquiry/perl/lib/x86_64-linux-thread-multi /share/iNquiry/perl/lib/5.8.4 /share/iNquiry/perl/lib/5.8.3 /share/iNquiry/perl/lib/5.8.2 /share/iNquiry/perl/lib/5.8.1 /share/iNquiry/perl/lib/5.8.0 /share/iNquiry/perl/lib /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.4/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.3/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.2/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.1/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.0/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl/5.8.4 /usr/lib/perl5/site_perl/5.8.3 /usr/lib/perl5/site_perl/5.8.2 /usr/lib/perl5/site_perl/5.8.1 /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.4/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.2/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.1/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.0/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl/5.8.4 /usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/perl5/vendor_perl/5.8.2 /usr/lib/perl5/vendor_perl/5.8.1 /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl .) at mult_align.pl line 9.
BEGIN failed--compilation aborted at mult_align.pl line 9.
Here is the piece of code that gives this message

#!/usr/bin/perl -w

use Bio::SeqIO;
use Bio::Align::AlignI;
use Bio::AlignIO;
use Bio::AlignIO::msf;
use Bio::SimpleAlign;
use Bio::PrimarySeq;
use Bio::Tools::Run::Alignment::Clustalw;
use Bio::Root::IO;
use Bio::Seq;

my $query_string = "tatgtggctggcgagacacgacacttcatatggttttacctctacgtttgagtaattaagtacaatgagctatcact";
my $hit_string = "tatgtggctggcgagacacgacacttcatatggttttacctctacgtttgagtaattaagtacaatgagctatcact";
my $hit_string_two = "tatgtggctggcgagacacgacacttcatatggttttacctctacgtttgagtaattaagtacaatgagctatcact";

my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
my $ktuple = 2;
$factory->ktuple($ktuple);

my $seq_obj_on = Bio::Seq->new(-id =>"thal", -seq =>"$query_string");
my $seq_obj_too = Bio::Seq->new(-id =>"lyrata", -seq =>"$hit_string");
my $seq_obj_thre = Bio::Seq->new(-id =>"boechera", -seq =>"$hit_string_two");

my @seq_array = qw/$seq_obj_on $seq_obj_too $seq_obj_thre/;
my $seq_array_ref = \@seq_array;
my $aln = $factory->align($seq_array_ref);

I would appreciate it if anyone could help. I don't know how to supply
environment variables on Unix, so if that is the solution, please explain
how I can do that.

Thanks!
--
View this message in context: http://www.nabble.com/problems-with-clustalw-tp17158917p17158917.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From bix at sendu.me.uk Sat May 10 02:02:56 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Sat, 10 May 2008 07:02:56 +0100
Subject: [Bioperl-l] problems with clustalw
In-Reply-To: <17158917.post@talk.nabble.com>
References: <17158917.post@talk.nabble.com>
Message-ID: <48253A90.8060300@sendu.me.uk>

nisa_dar wrote:
> I need to do multiple sequence alignments of DNA sequences by using Bioperl.
> I am using the following module
> Bio::Tools::Run::Alignment::Clustalw;
> and I am getting the following error message
>
> Can't locate Bio/Tools/Run/Alignment/Clustalw.pm in @INC

You need to install Bioperl-run, eg.

cpan
cpan>install S/SE/SENDU/bioperl-run-1.5.2_100.tar.gz

(if you have core 1.5.2)

From bix at sendu.me.uk Mon May 12 13:07:18 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 12 May 2008 18:07:18 +0100
Subject: [Bioperl-l] problems with clustalw
In-Reply-To: <1210611353.48287699ba469@mymail.yorku.ca>
References: <17158917.post@talk.nabble.com> <48253A90.8060300@sendu.me.uk>
	<1210608557.48286bad2f2c5@mymail.yorku.ca>
	<48286CF8.2060405@sendu.me.uk>
	<1210609239.48286e57324cd@mymail.yorku.ca>
	<4828700A.4050709@sendu.me.uk>
	<1210610375.482872c72d63e@mymail.yorku.ca>
	<482874C7.2050606@sendu.me.uk>
	<1210611353.48287699ba469@mymail.yorku.ca>
Message-ID: <48287946.8030007@sendu.me.uk>

vdar at yorku.ca wrote:
> Hi,
>
> Yes, I have clustalw installed and following is the result o which command
>
> $ which clustalw
> /opt/Bio/bin/clustalw
>
> Please see aa.txt as output of perl -V and mult_align.pl is my script

I've CC'd back the bioperl mailing list so other people can learn.
Please keep it CC'd.

Your script has two main errors:

use Clustalw;
$ENV{CLUSTALDIR} = '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/Run/Alignment/';

These should be:
use Bio::Tools::Run::Alignment::Clustalw;
$ENV{CLUSTALDIR} = '/opt/Bio/bin/clustalw';

There is also something very wrong with your installation, since you are
using perl 5.8.5 yet have bioperl-run installed into a directory for
5.8.8. This is why Bio::Tools::Run::Alignment::Clustalw wasn't being
found in the normal way; the 5.8.8 directory was never checked.

PERL5LIB="/opt/rocks/lib/perl5/site_perl/5.8.8" should let it be found.
If not, you might have to move the Bio folder from 5.8.8 to 5.8.5.
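
[Editorial sketch, not part of the original message.] The @INC point
above can be checked directly. The script below is a minimal sketch in
plain Perl (no BioPerl needed) that reports which directories actually
contain a given module file. The helper names module_to_path and
dirs_containing are made up for this illustration, and the extra
directory is simply the 5.8.8 site_perl path quoted in the thread; adjust
it for your own system. To make perl search that directory, either set
the variable in the shell before running the script (in bash:
export PERL5LIB=/opt/rocks/lib/perl5/site_perl/5.8.8) or add a
"use lib" line at the top of the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Turn a module name such as Foo::Bar into the relative path
# (Foo/Bar.pm) that perl looks for under each @INC directory.
sub module_to_path {
    my ($module) = @_;
    return File::Spec->catfile(split /::/, $module) . '.pm';
}

# Return those directories from @dirs that really contain the module.
# @INC may hold code references, so non-strings are skipped.
sub dirs_containing {
    my ($module, @dirs) = @_;
    my $rel = module_to_path($module);
    return grep { !ref($_) && -e File::Spec->catfile($_, $rel) } @dirs;
}

# Assumption: the directory named in the thread. Inside a script the
# same effect as PERL5LIB is obtained with:
#   use lib '/opt/rocks/lib/perl5/site_perl/5.8.8';
my $extra  = '/opt/rocks/lib/perl5/site_perl/5.8.8';
my $module = 'Bio::Tools::Run::Alignment::Clustalw';

my @found = dirs_containing($module, @INC, $extra);
if (@found) {
    print "$module found in: $_\n" for @found;
} else {
    print "$module not found in \@INC or $extra\n";
}
```

If the module only shows up under the extra directory, that directory is
the one to put in PERL5LIB (or in "use lib").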
From bix at sendu.me.uk Mon May 12 13:37:37 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 12 May 2008 18:37:37 +0100
Subject: [Bioperl-l] problems with clustalw
In-Reply-To: <1210613062.48287d46ba7c8@mymail.yorku.ca>
References: <17158917.post@talk.nabble.com> <48253A90.8060300@sendu.me.uk>
	<1210608557.48286bad2f2c5@mymail.yorku.ca>
	<48286CF8.2060405@sendu.me.uk>
	<1210609239.48286e57324cd@mymail.yorku.ca>
	<4828700A.4050709@sendu.me.uk>
	<1210610375.482872c72d63e@mymail.yorku.ca>
	<482874C7.2050606@sendu.me.uk>
	<1210611353.48287699ba469@mymail.yorku.ca>
	<48287946.8030007@sendu.me.uk>
	<1210613062.48287d46ba7c8@mymail.yorku.ca>
Message-ID: <48288061.9030001@sendu.me.uk>

vdar at yorku.ca wrote:
> Yes, seems like it worked, now I am having the following error message which is
> not because of the errors in installation..right?
>
> $ perl mult_align.pl
> Can't call method "isa" without a package or object reference at
> /opt/rocks/lib/perl5/site_perl/5.8.8//Bio/Tools/Run/Alignment/Clustalw.pm line
> 617.

You weren't passing sequence objects to align() due to another error in
your script:

Instead of:
my @seq_array = qw/$seq_obj_on $seq_obj_too $seq_obj_thre/;
my $seq_array_ref = \@seq_array;
my $aln = $factory->align($seq_array_ref);

You can have:
my @seq_array = ($seq_obj_on, $seq_obj_too, $seq_obj_thre);
my $seq_array_ref = \@seq_array;
my $aln = $factory->align($seq_array_ref);

Or just:
my $aln = $factory->align([$seq_obj_on, $seq_obj_too, $seq_obj_thre]);

From vdar at yorku.ca Mon May 12 13:24:22 2008
From: vdar at yorku.ca (vdar at yorku.ca)
Date: Mon, 12 May 2008 13:24:22 -0400
Subject: [Bioperl-l] problems with clustalw
In-Reply-To: <48287946.8030007@sendu.me.uk>
References: <17158917.post@talk.nabble.com> <48253A90.8060300@sendu.me.uk>
	<1210608557.48286bad2f2c5@mymail.yorku.ca>
	<48286CF8.2060405@sendu.me.uk>
	<1210609239.48286e57324cd@mymail.yorku.ca>
	<4828700A.4050709@sendu.me.uk>
	<1210610375.482872c72d63e@mymail.yorku.ca>
	<482874C7.2050606@sendu.me.uk>
	<1210611353.48287699ba469@mymail.yorku.ca>
	<48287946.8030007@sendu.me.uk>
Message-ID: <1210613062.48287d46ba7c8@mymail.yorku.ca>

Yes, seems like it worked, now I am having the following error message
which is not because of the errors in installation..right?

$ perl mult_align.pl
Can't call method "isa" without a package or object reference at
/opt/rocks/lib/perl5/site_perl/5.8.8//Bio/Tools/Run/Alignment/Clustalw.pm line 617.

Quoting Sendu Bala :

> vdar at yorku.ca wrote:
> > Hi,
> >
> > Yes, I have clustalw installed and following is the result o which command
> >
> > $ which clustalw
> > /opt/Bio/bin/clustalw
> >
> > Please see aa.txt as output of perl -V and mult_align.pl is my script
>
> I've CC'd back the bioperl mailing list so other people can learn.
> Please keep it CC'd.
>
> Your script has two main errors:
>
> use Clustalw;
> $ENV{CLUSTALDIR} =
> '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/Run/Alignment/';
>
> These should be:
> use Bio::Tools::Run::Alignment::Clustalw;
> $ENV{CLUSTALDIR} = '/opt/Bio/bin/clustalw';
>
> There is also something very wrong with your installation, since you are
> using perl 5.8.5 yet have bioperl-run installed into a directory for
> 5.8.8. This is why Bio::Tools::Run::Alignment::Clustalw wasn't being
> found in the normal way; the 5.8.8 directory was never checked.
>
> PERL5LIB="/opt/rocks/lib/perl5/site_perl/5.8.8" should let it be found.
> If not, you might have to move the Bio folder from 5.8.8 to 5.8.5.
>

From vdar at yorku.ca Mon May 12 14:19:27 2008
From: vdar at yorku.ca (vdar at yorku.ca)
Date: Mon, 12 May 2008 14:19:27 -0400
Subject: [Bioperl-l] problems with clustalw
In-Reply-To: <48288061.9030001@sendu.me.uk>
References: <17158917.post@talk.nabble.com> <48253A90.8060300@sendu.me.uk>
	<1210608557.48286bad2f2c5@mymail.yorku.ca>
	<48286CF8.2060405@sendu.me.uk>
	<1210609239.48286e57324cd@mymail.yorku.ca>
	<4828700A.4050709@sendu.me.uk>
	<1210610375.482872c72d63e@mymail.yorku.ca>
	<482874C7.2050606@sendu.me.uk>
	<1210611353.48287699ba469@mymail.yorku.ca>
	<48287946.8030007@sendu.me.uk>
	<1210613062.48287d46ba7c8@mymail.yorku.ca>
	<48288061.9030001@sendu.me.uk>
Message-ID: <1210616367.48288a2f1191a@mymail.yorku.ca>

Thanks a lot! You have solved the problem. I am getting the following
output now. Would it be possible for you to let me know how can I print
the alignments? I need the alignments as we get after running web-based
clustalw or multalin programs.

 CLUSTAL W (1.83) Multiple Sequence Alignments

Sequence format is Pearson
Sequence 1: thal 77 bp
Sequence 2: lyrata 77 bp
Sequence 3: boechera 77 bp
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score: 100
Sequences (1:3) Aligned. Score: 100
Sequences (2:3) Aligned. Score: 100
Guide tree file created: [/tmp/EIZp1pI1gi/jKZ8gRG2dY.dnd]
Start of Multiple Alignment
There are 2 groups
Aligning...
Group 1: Sequences: 2 Score:1463
Group 2: Sequences: 3 Score:1463
Alignment Score 1551
GCG-Alignment file created [/tmp/EIZp1pI1gi/Wa9Du2UIum]

Nisa

Quoting Sendu Bala :

> vdar at yorku.ca wrote:
> > Yes, seems like it worked, now I am having the following error message which is
> > not because of the errors in installation..right?
> >
> > $ perl mult_align.pl
> > Can't call method "isa" without a package or object reference at
> > /opt/rocks/lib/perl5/site_perl/5.8.8//Bio/Tools/Run/Alignment/Clustalw.pm line
> > 617.
>
> You weren't passing sequence objects to align() due to another error in
> your script:
>
> Instead of:
> my @seq_array = qw/$seq_obj_on $seq_obj_too $seq_obj_thre/;
> my $seq_array_ref = \@seq_array;
> my $aln = $factory->align($seq_array_ref);
>
> You can have:
> my @seq_array = ($seq_obj_on, $seq_obj_too, $seq_obj_thre);
> my $seq_array_ref = \@seq_array;
> my $aln = $factory->align($seq_array_ref);
>
> Or just:
> my $aln = $factory->align([$seq_obj_on, $seq_obj_too, $seq_obj_thre]);
>

From bix at sendu.me.uk Mon May 12 14:50:45 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 12 May 2008 19:50:45 +0100
Subject: [Bioperl-l] problems with clustalw
In-Reply-To: <1210616367.48288a2f1191a@mymail.yorku.ca>
References: <17158917.post@talk.nabble.com> <48253A90.8060300@sendu.me.uk>
	<1210608557.48286bad2f2c5@mymail.yorku.ca>
	<48286CF8.2060405@sendu.me.uk>
	<1210609239.48286e57324cd@mymail.yorku.ca>
	<4828700A.4050709@sendu.me.uk>
	<1210610375.482872c72d63e@mymail.yorku.ca>
	<482874C7.2050606@sendu.me.uk>
	<1210611353.48287699ba469@mymail.yorku.ca>
	<48287946.8030007@sendu.me.uk>
	<1210613062.48287d46ba7c8@mymail.yorku.ca>
	<48288061.9030001@sendu.me.uk>
	<1210616367.48288a2f1191a@mymail.yorku.ca>
Message-ID: <48289185.1030005@sendu.me.uk>

vdar at yorku.ca wrote:
> Thanks a lot! You have solved the problem. I am getting the following output
> now. Would it be possible for you to let me know how can I print the
> alignments? I need the alignments as we get after running web-based clustalw or
> multalin programs.
>[...]
>> Or just:
>> my $aln = $factory->align([$seq_obj_on, $seq_obj_too, $seq_obj_thre]);

$aln is a Bio::SimpleAlign object.
Check the docs for how to use it:
http://docs.bioperl.org/bioperl-live/Bio/SimpleAlign.html

For printing, you'll want to use AlignIO:
http://docs.bioperl.org/bioperl-live/Bio/AlignIO.html
For an example, see:
http://www.bioperl.org/wiki/HOWTO:SearchIO#Using_the_methods

From vdar at yorku.ca Mon May 12 16:19:37 2008
From: vdar at yorku.ca (vdar at yorku.ca)
Date: Mon, 12 May 2008 16:19:37 -0400
Subject: [Bioperl-l] problems with clustalw
In-Reply-To: <48289185.1030005@sendu.me.uk>
References: <17158917.post@talk.nabble.com> <48253A90.8060300@sendu.me.uk>
	<1210608557.48286bad2f2c5@mymail.yorku.ca>
	<48286CF8.2060405@sendu.me.uk>
	<1210609239.48286e57324cd@mymail.yorku.ca>
	<4828700A.4050709@sendu.me.uk>
	<1210610375.482872c72d63e@mymail.yorku.ca>
	<482874C7.2050606@sendu.me.uk>
	<1210611353.48287699ba469@mymail.yorku.ca>
	<48287946.8030007@sendu.me.uk>
	<1210613062.48287d46ba7c8@mymail.yorku.ca>
	<48288061.9030001@sendu.me.uk>
	<1210616367.48288a2f1191a@mymail.yorku.ca>
	<48289185.1030005@sendu.me.uk>
Message-ID: <1210623577.4828a659ea351@mymail.yorku.ca>

Thank you so much!

Nisa

Quoting Sendu Bala :

> vdar at yorku.ca wrote:
> > Thanks a lot! You have solved the problem. I am getting the following output
> > now. Would it be possible for you to let me know how can I print the
> > alignments? I need the alignments as we get after running web-based clustalw or
> > multalin programs.
> >[...]
> >> Or just:
> >> my $aln = $factory->align([$seq_obj_on, $seq_obj_too, $seq_obj_thre]);
>
> $aln is a Bio::SimpleAlign object.
> Check the docs for how to use it:
> http://docs.bioperl.org/bioperl-live/Bio/SimpleAlign.html
>
> For printing, you'll want to use AlignIO:
> http://docs.bioperl.org/bioperl-live/Bio/AlignIO.html
> For an example, see:
> http://www.bioperl.org/wiki/HOWTO:SearchIO#Using_the_methods
>
>

From vdar at yorku.ca Mon May 12 17:22:45 2008
From: vdar at yorku.ca (nisa_dar)
Date: Mon, 12 May 2008 14:22:45 -0700 (PDT)
Subject: [Bioperl-l] automated stand alone blast with repeat masker
Message-ID: <17189995.post@talk.nabble.com>

Hi,

I'm running a stand alone blast against my local databases by using the
following code

use Bio::Seq;
use Bio::Tools::Run::StandAloneBlast;

@params = (program => 'blastn', database => 'db.fa');
$blast_obj = Bio::Tools::Run::StandAloneBlast->new(@params);
$seq_obj = Bio::Seq->new(-id =>"test query", -seq =>"TTTAAATATATTTTGAAGTATAGATTATATGTT");
$report_obj = $blast_obj->blastall($seq_obj);
$result_obj = $report_obj->next_result;
print $result_obj->num_hits;

How can I include the code for repeat masker in it?

Thanks
Nisa
--
View this message in context: http://www.nabble.com/automated-stand-alone-blast-with-repeat-masker-tp17189995p17189995.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From David.Messina at sbc.su.se Mon May 12 17:58:40 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 12 May 2008 23:58:40 +0200
Subject: [Bioperl-l] automated stand alone blast with repeat masker
In-Reply-To: <17189995.post@talk.nabble.com>
References: <17189995.post@talk.nabble.com>
Message-ID: <628aabb70805121458o5bc808f8jf46869b08e65e8ac@mail.gmail.com>

I haven't done this myself, but from a quick search on the BioPerl
website, it looks like you'll want to use the
Bio::Tools::Run::RepeatMasker module to create a repeat-masked fasta
file.

If you RepeatMask your query sequence(s), then you need to specify that
sequence when you create your Bio::Seq object.
If you instead RepeatMask your database, you'll need to create a blast
database from the repeat-masked sequences and specify that db in your
@params. I don't think there's a module for running formatdb, but you
can do it through a system call.

Dave

From prachi at stanford.edu Mon May 12 19:17:53 2008
From: prachi at stanford.edu (Prachi Shah)
Date: Mon, 12 May 2008 16:17:53 -0700
Subject: [Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter
In-Reply-To: <8684cf960805091043j706d2aaej8584b1e7d4e2e4d7@mail.gmail.com>
References: <8684cf960805081354s6400b1eey917f6b9ae862eded@mail.gmail.com>
	<27483384-0188-44F5-8AF8-5293A7A83547@bioperl.org>
	<8684cf960805081535v2a8c8261hcd373612100cdaf5@mail.gmail.com>
	<6661CE6F-0795-4EDE-9D05-CD95BAB3DBA4@uiuc.edu>
	<8684cf960805091043j706d2aaej8584b1e7d4e2e4d7@mail.gmail.com>
Message-ID: <8684cf960805121617h9e2cf4ftdd5aee0f81635c47@mail.gmail.com>

Thanks Jason for adding the sort_hsps method in
Bio::Search::Hit::GenericHit. I tested it out and it works great.

The other issue I have is the format of HSP start and stop coordinates
when I write a new blast report (with HSPs sorted) using
Bio::SearchIO::Writer::TextResultWriter. Below is an example of the
same HSP alignment as output from BLAST and later when the blast
report is generated by TextResultWriter. Notice, the change in start
and stop coordinates. I would like to keep the start and stop format
as in the first case. How do I specify that? Any indicators are
greatly appreciated.

Thanks,
Prachi

----------------------------------------------------------------------------------------------------
**HSP alignment in blast report generated by BLAST itself:

 Score = 10150 (1529.0 bits), Expect = 0., Sum P(3) = 0.
Identities = 2120/2345 (90%), Positives = 2120/2345 (90%), Strand = Minus / Plus Query: 2364 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2305 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251160 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2251219 Query: 2304 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2245 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251220 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2251279 Query: 2244 ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC 2185 |||||||||||||| | Sbjct: 2251280 ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC 2251339 Query: 2184 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2125 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251340 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2251399 Query: 2124 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2065 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251400 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2251459 Query: 2064 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2005 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251460 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2251519 Query: 2004 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 1945 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251520 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 2251579 Query: 1944 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 1885 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251580 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 2251639 Query: 1884 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 1825 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251640 
CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 2251699 ---------------------------------------------------------------------------------------------------- ** HSP alignment written by TextResultWriter: Score = 1529.0 bits (10150), Expect = 0., P = 0. Identities = 2120/2345 (90%) Frame = -1 / +1 Query: 20 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG -39 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251160 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2251219 Query: -40 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA -99 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251220 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2251279 Query: -100 ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC -159 |||||||||||||| | Sbjct: 2251280 ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC 2251339 Query: -160 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG -219 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251340 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2251399 Query: -220 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG -279 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251400 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2251459 Query: -280 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG -339 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251460 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2251519 Query: -340 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG -399 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251520 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 2251579 Query: -400 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA -459 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251580 
ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 2251639 Query: -460 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA -519 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 2251640 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 2251699 From jason at bioperl.org Mon May 12 19:21:58 2008 From: jason at bioperl.org (Jason Stajich) Date: Mon, 12 May 2008 16:21:58 -0700 Subject: [Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter In-Reply-To: <8684cf960805121617h9e2cf4ftdd5aee0f81635c47@mail.gmail.com> References: <8684cf960805081354s6400b1eey917f6b9ae862eded@mail.gmail.com> <27483384-0188-44F5-8AF8-5293A7A83547@bioperl.org> <8684cf960805081535v2a8c8261hcd373612100cdaf5@mail.gmail.com> <6661CE6F-0795-4EDE-9D05-CD95BAB3DBA4@uiuc.edu> <8684cf960805091043j706d2aaej8584b1e7d4e2e4d7@mail.gmail.com> <8684cf960805121617h9e2cf4ftdd5aee0f81635c47@mail.gmail.com> Message-ID: <6DAEC561-D4C6-4F52-9359-84E4A336FD01@bioperl.org> that's a very strange bug - I don't quite understand where it is coming from. IF you don't mess with the HSP order and start with a report and generate the Text report output, does it also give the negative coordinates or are you still reconstituting the Hit/HSP objects "manually" in your code? -jason On May 12, 2008, at 4:17 PM, Prachi Shah wrote: > Thanks Jason for adding the sort_hsps method in > Bio::Search::Hit::GenericHit. I tested it out and it works great. > > The other issue I have is the format of HSP start and stop coordinates > when I write a new blast report (with HSPs sorted) using > Bio::SearchIO::Writer::TextResultWriter. Below is an example of the > same HSP alignment as output from BLAST and later when the blast > report is generated by TextResultWriter. Notice, the change in start > and stop coordinates. I would like to keep the start and stop format > as in the first case. How do I specify that? Any indicators are > greatly appreciated. 
> > Thanks, > Prachi > > ---------------------------------------------------------------------- > ------------------------------ > **HSP alignment in blast report generated by BLAST itself: > > Score = 10150 (1529.0 bits), Expect = 0., Sum P(3) = 0. > Identities = 2120/2345 (90%), Positives = 2120/2345 (90%), Strand = > Minus / Plus > > Query: 2364 > CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2305 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251160 > CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG > 2251219 > > Query: 2304 > TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2245 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251220 > TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA > 2251279 > > Query: 2244 > ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC 2185 > > |||||||||||||| | > Sbjct: 2251280 > ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC > 2251339 > > Query: 2184 > CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2125 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251340 > CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG > 2251399 > > Query: 2124 > TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2065 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251400 > TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG > 2251459 > > Query: 2064 > TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2005 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251460 > TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG > 2251519 > > Query: 2004 > CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 1945 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251520 > CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG > 2251579 > > Query: 1944 > 
ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 1885 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251580 > ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA > 2251639 > > Query: 1884 > CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 1825 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251640 > CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA > 2251699 > > > ---------------------------------------------------------------------- > ------------------------------ > ** HSP alignment written by TextResultWriter: > > Score = 1529.0 bits (10150), Expect = 0., P = 0. > Identities = 2120/2345 (90%) > Frame = -1 / +1 > > Query: 20 > CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG -39 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251160 > CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG > 2251219 > > Query: -40 > TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA -99 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251220 > TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA > 2251279 > > Query: -100 > ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC -159 > > |||||||||||||| | > Sbjct: 2251280 > ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC > 2251339 > > Query: -160 > CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG -219 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251340 > CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG > 2251399 > > Query: -220 > TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG -279 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251400 > TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG > 2251459 > > Query: -280 > TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG -339 > > 
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251460 > TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG > 2251519 > > Query: -340 > CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG -399 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251520 > CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG > 2251579 > > Query: -400 > ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA -459 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251580 > ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA > 2251639 > > Query: -460 > CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA -519 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 2251640 > CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA > 2251699 From prachi at stanford.edu Mon May 12 19:26:41 2008 From: prachi at stanford.edu (Prachi Shah) Date: Mon, 12 May 2008 16:26:41 -0700 Subject: [Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter In-Reply-To: <6DAEC561-D4C6-4F52-9359-84E4A336FD01@bioperl.org> References: <8684cf960805081354s6400b1eey917f6b9ae862eded@mail.gmail.com> <27483384-0188-44F5-8AF8-5293A7A83547@bioperl.org> <8684cf960805081535v2a8c8261hcd373612100cdaf5@mail.gmail.com> <6661CE6F-0795-4EDE-9D05-CD95BAB3DBA4@uiuc.edu> <8684cf960805091043j706d2aaej8584b1e7d4e2e4d7@mail.gmail.com> <8684cf960805121617h9e2cf4ftdd5aee0f81635c47@mail.gmail.com> <6DAEC561-D4C6-4F52-9359-84E4A336FD01@bioperl.org> Message-ID: <8684cf960805121626y2fb9e8a1n7bbfc81e3a61a2bc@mail.gmail.com> Hi Jason, The negative coordinates in the HSP show up when I generate a Text report regardless of how/if I sort the HSP order. I think it has something to do with the frame. In the example I gave, the Query sequence matches the subject sequence on the negative strand. 
My guess is that TextResultWriter somehow takes the strand into account
and tries to recalculate the start and stop locations?

Thanks,
Prachi

On Mon, May 12, 2008 at 4:21 PM, Jason Stajich wrote:
> that's a very strange bug - I don't quite understand where it is coming
> from. IF you don't mess with the HSP order and start with a report and
> generate the Text report output, does it also give the negative coordinates
> or are you still reconstituting the Hit/HSP objects "manually" in your code?
>
> -jason
>
>
> On May 12, 2008, at 4:17 PM, Prachi Shah wrote:
>
> >
> > Thanks Jason for adding the sort_hsps method in
> > Bio::Search::Hit::GenericHit. I tested it out and it works great.
> >
> > The other issue I have is the format of HSP start and stop coordinates
> > when I write a new blast report (with HSPs sorted) using
> > Bio::SearchIO::Writer::TextResultWriter. Below is an example of the
> > same HSP alignment as output from BLAST and later when the blast
> > report is generated by TextResultWriter. Notice, the change in start
> > and stop coordinates. I would like to keep the start and stop format
> > as in the first case. How do I specify that? Any indicators are
> > greatly appreciated.
> >
> > Thanks,
> > Prachi
> >
> >
> ----------------------------------------------------------------------------------------------------
> > **HSP alignment in blast report generated by BLAST itself:
> >
> > Score = 10150 (1529.0 bits), Expect = 0., Sum P(3) = 0.
> > Identities = 2120/2345 (90%), Positives = 2120/2345 (90%), Strand = > > Minus / Plus > > > > Query: 2364 > CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2305 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251160 > CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG > > 2251219 > > > > Query: 2304 > TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2245 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251220 > TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA > > 2251279 > > > > Query: 2244 > ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC 2185 > > |||||||||||||| | > > Sbjct: 2251280 > ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC > > 2251339 > > > > Query: 2184 > CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2125 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251340 > CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG > > 2251399 > > > > Query: 2124 > TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2065 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251400 > TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG > > 2251459 > > > > Query: 2064 > TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2005 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251460 > TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG > > 2251519 > > > > Query: 2004 > CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 1945 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251520 > CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG > > 2251579 > > > > Query: 1944 > ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 1885 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251580 > 
ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA > > 2251639 > > > > Query: 1884 > CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 1825 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251640 > CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA > > 2251699 > > > > > > > ---------------------------------------------------------------------------------------------------- > > ** HSP alignment written by TextResultWriter: > > > > Score = 1529.0 bits (10150), Expect = 0., P = 0. > > Identities = 2120/2345 (90%) > > Frame = -1 / +1 > > > > Query: 20 > CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG -39 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251160 > CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG > > 2251219 > > > > Query: -40 > TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA -99 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251220 > TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA > > 2251279 > > > > Query: -100 > ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC -159 > > |||||||||||||| | > > Sbjct: 2251280 > ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC > > 2251339 > > > > Query: -160 > CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG -219 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251340 > CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG > > 2251399 > > > > Query: -220 > TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG -279 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251400 > TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG > > 2251459 > > > > Query: -280 > TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG -339 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251460 > 
TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG > > 2251519 > > > > Query: -340 > CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG -399 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251520 > CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG > > 2251579 > > > > Query: -400 > ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA -459 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251580 > ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA > > 2251639 > > > > Query: -460 > CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA -519 > > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > > Sbjct: 2251640 > CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA > > 2251699 > > > > From jason at bioperl.org Mon May 12 19:53:15 2008 From: jason at bioperl.org (Jason Stajich) Date: Mon, 12 May 2008 16:53:15 -0700 Subject: [Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter In-Reply-To: <8684cf960805121626y2fb9e8a1n7bbfc81e3a61a2bc@mail.gmail.com> References: <8684cf960805081354s6400b1eey917f6b9ae862eded@mail.gmail.com> <27483384-0188-44F5-8AF8-5293A7A83547@bioperl.org> <8684cf960805081535v2a8c8261hcd373612100cdaf5@mail.gmail.com> <6661CE6F-0795-4EDE-9D05-CD95BAB3DBA4@uiuc.edu> <8684cf960805091043j706d2aaej8584b1e7d4e2e4d7@mail.gmail.com> <8684cf960805121617h9e2cf4ftdd5aee0f81635c47@mail.gmail.com> <6DAEC561-D4C6-4F52-9359-84E4A336FD01@bioperl.org> <8684cf960805121626y2fb9e8a1n7bbfc81e3a61a2bc@mail.gmail.com> Message-ID: <83452C02-671E-4468-85FB-F7F4FA556D71@bioperl.org> okay - so there's a bug - I remember someone tried to fix something in the writers recently so will have to look and see how that got broken and can be fixed. 
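
[Editorial sketch, not part of the original message.] Comparing the two
reports quoted above, the BLAST output labels the first query row
2364..2305 while the TextResultWriter output labels it 20..-39. That is
consistent with a writer that starts counting at the low end of the HSP
and then subtracts on the minus strand; this is only a reading of the
numbers, not a confirmed diagnosis of the writer's code. The plain-Perl
sketch below shows the labelling one would expect instead. The function
name row_coords is made up for this illustration, the HSP bounds 20 and
2364 are inferred from the two reports, and gaps in the query are
ignored.

```perl
use strict;
use warnings;

# Compute the [first, last] query coordinates printed on each row of an
# alignment. $start <= $end are the HSP bounds on the query, $strand is
# 1 or -1, and $width is the number of residues shown per row.
# Assumption: the query row contains no gap characters.
sub row_coords {
    my ($start, $end, $strand, $width) = @_;
    my $len = $end - $start + 1;
    my @rows;
    for (my $off = 0; $off < $len; $off += $width) {
        my $left = $len - $off;
        my $n    = $left < $width ? $left : $width;
        if ($strand >= 0) {
            # Plus strand: count upwards from the start.
            push @rows, [ $start + $off, $start + $off + $n - 1 ];
        } else {
            # Minus strand: begin at the high end and count down.
            push @rows, [ $end - $off, $end - $off - $n + 1 ];
        }
    }
    return @rows;
}

my @rows = row_coords(20, 2364, -1, 60);
printf "Query: %d ... %d\n", @$_ for @rows[0, 1];
# prints:
# Query: 2364 ... 2305
# Query: 2304 ... 2245
```

These two rows match the coordinates in the report generated by BLAST
itself, and no row ever goes below the HSP start of 20.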
-j On May 12, 2008, at 4:26 PM, Prachi Shah wrote: > Hi Jason, > > The negative coordinates in the HSP show up when I generate a Text > report regardless of how/if I sort the HSP order. I think it has > something to do with the frame. In the example I gave, the Query > sequence matches the subject sequence on the negative strand. My guess > is that TextResultWriter somehow takes the strand into account and > tries to recalculates the start and stop locations? > > Thanks, > Prachi > > On Mon, May 12, 2008 at 4:21 PM, Jason Stajich > wrote: >> that's a very strange bug - I don't quite understand where it is >> coming >> from. IF you don't mess with the HSP order and start with a >> report and >> generate the Text report output, does it also give the negative >> coordinates >> or are you still reconstituting the Hit/HSP objects "manually" in >> your code? >> >> -jason >> >> >> On May 12, 2008, at 4:17 PM, Prachi Shah wrote: >> >> >>> Thanks Jason for adding the sort_hsps method in >>> Bio::Search::Hit::GenericHit. I tested it out and it works great. >>> >>> The other issue I have is the format of HSP start and stop >>> coordinates >>> when I write a new blast report (with HSPs sorted) using >>> Bio::SearchIO::Writer::TextResultWriter. Below is an example of the >>> same HSP alignment as output from BLAST and later when the blast >>> report is generated by TextResultWriter. Notice, the change in start >>> and stop coordinates. I would like to keep the start and stop format >>> as in the first case. How do I specify that? Any indicators are >>> greatly appreciated. >>> >>> Thanks, >>> Prachi >>> >>> >> --------------------------------------------------------------------- >> ------------------------------- >>> **HSP alignment in blast report generated by BLAST itself: >>> >>> Score = 10150 (1529.0 bits), Expect = 0., Sum P(3) = 0. 
>>> Identities = 2120/2345 (90%), Positives = 2120/2345 (90%), Strand = >>> Minus / Plus >>> >>> Query: 2364 >> CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2305 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251160 >> CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG >>> 2251219 >>> >>> Query: 2304 >> TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2245 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251220 >> TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA >>> 2251279 >>> >>> Query: 2244 >> ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC 2185 >>> >>> |||||||||||||| | >>> Sbjct: 2251280 >> ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC >>> 2251339 >>> >>> Query: 2184 >> CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2125 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251340 >> CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG >>> 2251399 >>> >>> Query: 2124 >> TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2065 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251400 >> TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG >>> 2251459 >>> >>> Query: 2064 >> TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2005 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251460 >> TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG >>> 2251519 >>> >>> Query: 2004 >> CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 1945 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251520 >> CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG >>> 2251579 >>> >>> Query: 1944 >> ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 1885 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251580 >> 
ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA >>> 2251639 >>> >>> Query: 1884 >> CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 1825 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251640 >> CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA >>> 2251699 >>> >>> >>> >> --------------------------------------------------------------------- >> ------------------------------- >>> ** HSP alignment written by TextResultWriter: >>> >>> Score = 1529.0 bits (10150), Expect = 0., P = 0. >>> Identities = 2120/2345 (90%) >>> Frame = -1 / +1 >>> >>> Query: 20 >> CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG -39 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251160 >> CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG >>> 2251219 >>> >>> Query: -40 >> TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA -99 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251220 >> TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA >>> 2251279 >>> >>> Query: -100 >> ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC -159 >>> >>> |||||||||||||| | >>> Sbjct: 2251280 >> ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC >>> 2251339 >>> >>> Query: -160 >> CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG -219 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251340 >> CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG >>> 2251399 >>> >>> Query: -220 >> TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG -279 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251400 >> TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG >>> 2251459 >>> >>> Query: -280 >> TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG -339 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251460 
>> TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG >>> 2251519 >>> >>> Query: -340 >> CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG -399 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251520 >> CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG >>> 2251579 >>> >>> Query: -400 >> ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA -459 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251580 >> ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA >>> 2251639 >>> >>> Query: -460 >> CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA -519 >>> >>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>> Sbjct: 2251640 >> CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA >>> 2251699 >>> >> >> From cjfields at uiuc.edu Mon May 12 20:33:25 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 12 May 2008 19:33:25 -0500 Subject: [Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter In-Reply-To: <83452C02-671E-4468-85FB-F7F4FA556D71@bioperl.org> References: <8684cf960805081354s6400b1eey917f6b9ae862eded@mail.gmail.com> <27483384-0188-44F5-8AF8-5293A7A83547@bioperl.org> <8684cf960805081535v2a8c8261hcd373612100cdaf5@mail.gmail.com> <6661CE6F-0795-4EDE-9D05-CD95BAB3DBA4@uiuc.edu> <8684cf960805091043j706d2aaej8584b1e7d4e2e4d7@mail.gmail.com> <8684cf960805121617h9e2cf4ftdd5aee0f81635c47@mail.gmail.com> <6DAEC561-D4C6-4F52-9359-84E4A336FD01@bioperl.org> <8684cf960805121626y2fb9e8a1n7bbfc81e3a61a2bc@mail.gmail.com> <83452C02-671E-4468-85FB-F7F4FA556D71@bioperl.org> Message-ID: I ran some fixes on the writers recently. If we have the BLAST report generating this I can work on debugging it (I'll file a bug for tracking). 
chris On May 12, 2008, at 6:53 PM, Jason Stajich wrote: > okay - so there's a bug - I remember someone tried to fix something > in the writers recently so will have to look and see how that got > broken and can be fixed. > -j > On May 12, 2008, at 4:26 PM, Prachi Shah wrote: > >> Hi Jason, >> >> The negative coordinates in the HSP show up when I generate a Text >> report regardless of how/if I sort the HSP order. I think it has >> something to do with the frame. In the example I gave, the Query >> sequence matches the subject sequence on the negative strand. My >> guess >> is that TextResultWriter somehow takes the strand into account and >> tries to recalculates the start and stop locations? >> >> Thanks, >> Prachi >> >> On Mon, May 12, 2008 at 4:21 PM, Jason Stajich >> wrote: >>> that's a very strange bug - I don't quite understand where it is >>> coming >>> from. IF you don't mess with the HSP order and start with a >>> report and >>> generate the Text report output, does it also give the negative >>> coordinates >>> or are you still reconstituting the Hit/HSP objects "manually" in >>> your code? >>> >>> -jason >>> >>> >>> On May 12, 2008, at 4:17 PM, Prachi Shah wrote: >>> >>> >>>> Thanks Jason for adding the sort_hsps method in >>>> Bio::Search::Hit::GenericHit. I tested it out and it works great. >>>> >>>> The other issue I have is the format of HSP start and stop >>>> coordinates >>>> when I write a new blast report (with HSPs sorted) using >>>> Bio::SearchIO::Writer::TextResultWriter. Below is an example of the >>>> same HSP alignment as output from BLAST and later when the blast >>>> report is generated by TextResultWriter. Notice, the change in >>>> start >>>> and stop coordinates. I would like to keep the start and stop >>>> format >>>> as in the first case. How do I specify that? Any indicators are >>>> greatly appreciated. 
>>>> >>>> Thanks, >>>> Prachi >>>> >>>> >>> ---------------------------------------------------------------------------------------------------- >>>> **HSP alignment in blast report generated by BLAST itself: >>>> >>>> Score = 10150 (1529.0 bits), Expect = 0., Sum P(3) = 0. >>>> Identities = 2120/2345 (90%), Positives = 2120/2345 (90%), Strand = >>>> Minus / Plus >>>> >>>> Query: 2364 >>> CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2305 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251160 >>> CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG >>>> 2251219 >>>> >>>> Query: 2304 >>> TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2245 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251220 >>> TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA >>>> 2251279 >>>> >>>> Query: 2244 >>> ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC 2185 >>>> >>>> |||||||||||||| | >>>> Sbjct: 2251280 >>> ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC >>>> 2251339 >>>> >>>> Query: 2184 >>> CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2125 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251340 >>> CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG >>>> 2251399 >>>> >>>> Query: 2124 >>> TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2065 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251400 >>> TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG >>>> 2251459 >>>> >>>> Query: 2064 >>> TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2005 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251460 >>> TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG >>>> 2251519 >>>> >>>> Query: 2004 >>> CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 1945 >>>> >>>> 
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251520 >>> CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG >>>> 2251579 >>>> >>>> Query: 1944 >>> ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 1885 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251580 >>> ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA >>>> 2251639 >>>> >>>> Query: 1884 >>> CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 1825 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251640 >>> CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA >>>> 2251699 >>>> >>>> >>>> >>> ---------------------------------------------------------------------------------------------------- >>>> ** HSP alignment written by TextResultWriter: >>>> >>>> Score = 1529.0 bits (10150), Expect = 0., P = 0. >>>> Identities = 2120/2345 (90%) >>>> Frame = -1 / +1 >>>> >>>> Query: 20 >>> CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG -39 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251160 >>> CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG >>>> 2251219 >>>> >>>> Query: -40 >>> TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA -99 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251220 >>> TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA >>>> 2251279 >>>> >>>> Query: -100 >>> ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC -159 >>>> >>>> |||||||||||||| | >>>> Sbjct: 2251280 >>> ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC >>>> 2251339 >>>> >>>> Query: -160 >>> CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG -219 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251340 >>> CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG >>>> 2251399 >>>> >>>> Query: -220 >>> 
TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG -279 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251400 >>> TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG >>>> 2251459 >>>> >>>> Query: -280 >>> TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG -339 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251460 >>> TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG >>>> 2251519 >>>> >>>> Query: -340 >>> CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG -399 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251520 >>> CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG >>>> 2251579 >>>> >>>> Query: -400 >>> ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA -459 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251580 >>> ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA >>>> 2251639 >>>> >>>> Query: -460 >>> CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA -519 >>>> >>>> |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| >>>> Sbjct: 2251640 >>> CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA >>>> 2251699 >>>> >>> >>> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From David.Messina at sbc.su.se Tue May 13 11:29:16 2008 From: David.Messina at sbc.su.se (Dave Messina) Date: Tue, 13 May 2008 17:29:16 +0200 Subject: [Bioperl-l] help_to_acces_clustal-w In-Reply-To: <462687.68349.qm@web8713.mail.in.yahoo.com> References: <462687.68349.qm@web8713.mail.in.yahoo.com> Message-ID: <628aabb70805130829r1e9b7c2fpbc6ecf036f01286f@mail.gmail.com> Hi Punit, Please make sure that you use 'reply to all' when responding so that this gets seen on the BioPerl mailing list, too. 
On Mon, May 12, 2008 at 1:25 PM, punit kumar wrote:

> in actually i use the perl version 5.6 and i have tried before to install
> the bioperl on the windows workstation before.
>
> but in actually it was so painfull i do not wanna change my prior version
> of the perl which is used by my
> and wanna install too so that is my problem.
>

I have never tried installing Perl on Windows, but if you read the BioPerl
installation guide for Windows that I pointed you to, you'll see that a
straightforward ActivePerl installer is available. So you shouldn't have to
install Perl from the source code.

Again, I haven't done it personally, but I would expect that the ActivePerl
installer allows you to specify where it installs Perl. This would enable you
to keep your existing Perl 5.6 installation and have a separate Perl 5.8.x
installation for use with BioPerl.

According to the BioPerl Windows installation guide, once you install
ActivePerl, there is a Perl Package Manager with a graphical interface that
makes it very easy to install the latest version of BioPerl.

Dave

From jay at jays.net Tue May 13 12:28:36 2008
From: jay at jays.net (Jay Hannah)
Date: Tue, 13 May 2008 11:28:36 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] Script to convert blastall output into gff format.
In-Reply-To:
References:
Message-ID: <7419081D-CAEC-4F9C-ABD6-D5F8BBBA3106@jays.net>

On May 13, 2008, at 8:59 AM, Gabriel Dalmazo wrote:

> I've been serching for such tool, but couldn't find this especific
> type of script, there are many parsers, but don't know which one is
> more apropriated.

I think this does what you're asking for:

bioperl-live/scripts/utilities/search2gff.PLS
http://code.open-bio.org/svnweb/index.cgi/bioperl/view/bioperl-live/trunk/scripts/utilities/search2gff.PLS

If your goal is visualization you might find this interesting:

http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output

HTH,

j
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah

From jason at bioperl.org Tue May 13 13:25:42 2008
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 13 May 2008 10:25:42 -0700
Subject: [Bioperl-l] MSA manipulation
In-Reply-To:
References:
Message-ID:

Jon -

[CC-ing list in case others have input.]

AlignI is the interface module but the actual implementation is in
Bio::SimpleAlign. You want to read in alignments with Bio::AlignIO and that
will give you Bio::SimpleAlign objects that can then be manipulated. There
are methods for removing columns, etc.

-jason

On May 13, 2008, at 2:35 AM, Jon Wright ((JIC)) wrote:

> Hi Jason,
>
> I am looking for a bioperl module into which you can load a multiple
> sequence alignment and manipulate it programmatically (similar to
> something like Jalview). The closest thing in BioPerl that I can find
> is your implementation of the Bio::Align::AlignI but this doesn't allow
> any manipulation as far as I can tell. Are you aware of anything else
> that might do the job?
>
> Thanks for your help.
>
> Jon
>
> *********************************************************
>
> Jonathan Wright
>
> Computational and Systems Biology Department
>
> John Innes Centre
>
> Norwich
>
> UK
>
> www.jic.bbsrc.ac.uk
>
> Tel. +44 (0)1603 450811
>
> *********************************************************
>

From vdar at yorku.ca Tue May 13 16:45:13 2008
From: vdar at yorku.ca (vdar at yorku.ca)
Date: Tue, 13 May 2008 16:45:13 -0400
Subject: [Bioperl-l] automated stand alone blast with repeat masker
In-Reply-To: <628aabb70805121458o5bc808f8jf46869b08e65e8ac@mail.gmail.com>
References: <17189995.post@talk.nabble.com> <628aabb70805121458o5bc808f8jf46869b08e65e8ac@mail.gmail.com>
Message-ID: <1210711513.4829fdd9b756d@mymail.yorku.ca>

Do we have to install it separately because seems like its not there on my
system although I have bioperl installed on my system.

Quoting Dave Messina :

> I haven't done this myself, but from a quick search on the BioPerl website,
> it looks like you'll want to use the
> Bio::Tools::Run::RepeatMasker module
> to create a repeat-masked fasta file.
>
> If you RepeatMask your query sequence(s), then you need to specify that
> sequence when you create your Bio::Seq object.
>
> If you instead RepeatMask your database, you'll need to create a blast
> database from the repeat-masked sequences and specify that db in your
> @params. I don't think there's a module for running formatdb, but you can do
> it through a system call.
> > > > Dave > From vdar at yorku.ca Tue May 13 16:56:18 2008 From: vdar at yorku.ca (vdar at yorku.ca) Date: Tue, 13 May 2008 16:56:18 -0400 Subject: [Bioperl-l] automated stand alone blast with repeat masker In-Reply-To: <628aabb70805121458o5bc808f8jf46869b08e65e8ac@mail.gmail.com> References: <17189995.post@talk.nabble.com> <628aabb70805121458o5bc808f8jf46869b08e65e8ac@mail.gmail.com> Message-ID: <1210712178.482a0072c9499@mymail.yorku.ca> Following is the path to repeatmasker.pm on my system /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/RepeatMasker.pm but when I run my program, the error message comes RepeatMasker program not found as or not executable Here is my piece of code which gives this error, #!/usr/bin/perl use strict; use warnings; use Bio::Seq; use Bio::Tools::Run::StandAloneBlast; use Bio::Search::Hit::HitI; use Bio::Search::Hit::BlastHit; use Bio::Search::HSP::BlastHSP; use Bio::Search::HSP::HSPI; use Bio::SearchIO; use Bio::Tools::Run::RepeatMasker; BEGIN { $ENV{REPEATMASKERDIR} = '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; } my @params = ("mam" => 1,"noint"=>1); my $factory = Bio::Tools::Run::RepeatMasker->new(@params); my $in = Bio::SeqIO->new(-file => "boechera.fasta", -format => 'fasta'); I tried finding RepeatMasker directory by typing which RepeatMasker but the error message was /usr/bin/which: no RepeatMasker in (/opt/openmpi/1.1.4/bin:/opt/lsfhpc/ego/1.2/linux2.6-glibc2.3-x86_64/etc:/opt/lsfhpc/ego/1.2/linux2.6-glibc2.3-x86_64/bin:/opt/lsfhpc/7.0/linux2.6-glibc2.3-x86_64/etc:/opt/lsfhpc/7.0/linux2.6-glibc2.3-x86_64/bin:/usr/kerberos/bin:/usr/java/jdk1.5.0_07/bin:/share/iNquiry/biotools/bin:/share/iNquiry/bin/lx24-x86:/share/iNquiry/bin/lx24-amd64:/opt/Bio/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/opt/modules/current/bin/:/opt/modules/bin/:/opt/Bio/glimmer/scripts:/opt/Bio/gromacs/bin:/opt/eclipse:/opt/ganglia/bin:/opt/maven/bin:/opt/rocks/bin:/opt/rocks/sbin:/home/vdar/bin) Quoting Dave Messina : > I haven't done this 
myself, but from a quick search on the BioPerl website, > it looks like you'll want to use the > Bio::Tools::Run::RepeatMaskermodule > to create a repeat-masked fasta file. > > If you RepeatMask your query sequence(s), then you need to specify that > sequence when you create your Bio::Seq object. > > If you instead RepeatMask your database, you'll need to create a blast > database from the repeat-masked sequences and specify that db in your > @params. I don't think there's a module for running formatdb, but you can do > it through a system call. > > > > Dave > From vdar at yorku.ca Tue May 13 16:59:00 2008 From: vdar at yorku.ca (nisa_dar) Date: Tue, 13 May 2008 13:59:00 -0700 (PDT) Subject: [Bioperl-l] Re peatMasker not found Message-ID: <17218229.post@talk.nabble.com> Following is the path to repeatmasker.pm on my system /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/RepeatMasker.pm but when I run my program, the error message comes RepeatMasker program not found as or not executable Here is my piece of code which gives this error, #!/usr/bin/perl use strict; use warnings; use Bio::Seq; use Bio::Tools::Run::StandAloneBlast; use Bio::Search::Hit::HitI; use Bio::Search::Hit::BlastHit; use Bio::Search::HSP::BlastHSP; use Bio::Search::HSP::HSPI; use Bio::SearchIO; use Bio::Tools::Run::RepeatMasker; BEGIN { $ENV{REPEATMASKERDIR} = '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; } my @params = ("mam" => 1,"noint"=>1); my $factory = Bio::Tools::Run::RepeatMasker->new(@params); my $in = Bio::SeqIO->new(-file => "boechera.fasta", -format => 'fasta'); I tried finding RepeatMasker directory by typing which RepeatMasker but the error message was /usr/bin/which: no RepeatMasker in 
(/opt/openmpi/1.1.4/bin:/opt/lsfhpc/ego/1.2/linux2.6-glibc2.3-x86_64/etc:/opt/lsfhpc/ego/1.2/linux2.6-glibc2.3-x86_64/bin:/opt/lsfhpc/7.0/linux2.6-glibc2.3-x86_64/etc:/opt/lsfhpc/7.0/linux2.6-glibc2.3-x86_64/bin:/usr/kerberos/bin:/usr/java/jdk1.5.0_07/bin:/share/iNquiry/biotools/bin:/share/iNquiry/bin/lx24-x86:/share/iNquiry/bin/lx24-amd64:/opt/Bio/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/opt/modules/current/bin/:/opt/modules/bin/:/opt/Bio/glimmer/scripts:/opt/Bio/gromacs/bin:/opt/eclipse:/opt/ganglia/bin:/opt/maven/bin:/opt/rocks/bin:/opt/rocks/sbin:/home/vdar/bin) what should I do? Thanks -- View this message in context: http://www.nabble.com/RepeatMasker-not-found-tp17218229p17218229.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From jason at bioperl.org Tue May 13 17:06:31 2008 From: jason at bioperl.org (Jason Stajich) Date: Tue, 13 May 2008 14:06:31 -0700 Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: <17218229.post@talk.nabble.com> References: <17218229.post@talk.nabble.com> Message-ID: <4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org> Dare I ask, did you install repeat masker? 
http://www.repeatmasker.org/ On May 13, 2008, at 1:59 PM, nisa_dar wrote: > > Following is the path to repeatmasker.pm on my system > > /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/RepeatMasker.pm > > but when I run my program, the error message comes > > RepeatMasker program not found as or not executable > > Here is my piece of code which gives this error, > #!/usr/bin/perl > > use strict; > use warnings; > > use Bio::Seq; > use Bio::Tools::Run::StandAloneBlast; > use Bio::Search::Hit::HitI; > use Bio::Search::Hit::BlastHit; > use Bio::Search::HSP::BlastHSP; > use Bio::Search::HSP::HSPI; > use Bio::SearchIO; > use Bio::Tools::Run::RepeatMasker; > > BEGIN { > > $ENV{REPEATMASKERDIR} = '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/ > Tools/'; > > } > > > my @params = ("mam" => 1,"noint"=>1); > my $factory = Bio::Tools::Run::RepeatMasker->new(@params); > my $in = Bio::SeqIO->new(-file => "boechera.fasta", -format => > 'fasta'); > > I tried finding RepeatMasker directory by typing > > which RepeatMasker > > but the error message was > > /usr/bin/which: no RepeatMasker in > (/opt/openmpi/1.1.4/bin:/opt/lsfhpc/ego/1.2/linux2.6-glibc2.3- > x86_64/etc:/opt/lsfhpc/ego/1.2/linux2.6-glibc2.3-x86_64/bin:/opt/ > lsfhpc/7.0/linux2.6-glibc2.3-x86_64/etc:/opt/lsfhpc/7.0/linux2.6- > glibc2.3-x86_64/bin:/usr/kerberos/bin:/usr/java/jdk1.5.0_07/bin:/ > share/iNquiry/biotools/bin:/share/iNquiry/bin/lx24-x86:/share/ > iNquiry/bin/lx24-amd64:/opt/Bio/bin:/usr/local/bin:/bin:/usr/bin:/ > usr/X11R6/bin:/opt/modules/current/bin/:/opt/modules/bin/:/opt/Bio/ > glimmer/scripts:/opt/Bio/gromacs/bin:/opt/eclipse:/opt/ganglia/bin:/ > opt/maven/bin:/opt/rocks/bin:/opt/rocks/sbin:/home/vdar/bin) > > > what should I do? > > Thanks > > > > > -- > View this message in context: http://www.nabble.com/RepeatMasker- > not-found-tp17218229p17218229.html > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. 
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From David.Messina at sbc.su.se Tue May 13 17:24:45 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 13 May 2008 23:24:45 +0200
Subject: [Bioperl-l] automated stand alone blast with repeat masker
In-Reply-To: <1210711513.4829fdd9b756d@mymail.yorku.ca>
References: <17189995.post@talk.nabble.com> <628aabb70805121458o5bc808f8jf46869b08e65e8ac@mail.gmail.com> <1210711513.4829fdd9b756d@mymail.yorku.ca>
Message-ID: <628aabb70805131424w79bca41cu1b872fd6695f3aef@mail.gmail.com>

> Do we have to install it separately because seems like its not there on my
> system although I have bioperl installed on my system.
>

Yes. BioPerl doesn't include all of the many programs it potentially
interacts with.

It'd be great if you could get everything in one shebang, but practically
this isn't possible because there are so many bioinformatics programs,
because they are written and maintained by their authors not by the BioPerl
group, and because they are constantly being updated and the version included
in BioPerl would quickly become out of sync.

From the Bio::Tools::Run::RepeatMasker documentation:

"To use this module, the RepeatMasker program (and probably database) must be
installed. RepeatMasker is a program that screens DNA sequences for
interspersed repeats known to exist in mammalian genomes as well as for low
complexity DNA sequences. For more information, on the program and its usage,
please refer to http://www.repeatmasker.org/."

Dave

From aparna_pall at hotmail.com Fri May 16 10:32:26 2008
From: aparna_pall at hotmail.com (Aparna Pallavajjala)
Date: Fri, 16 May 2008 10:32:26 -0400
Subject: [Bioperl-l] bl2seq for many
Message-ID:

Hi,

I would like to know if anyone tried to do bl2seq for multiple sequences at a time?

Please let me know how.

Thx,
Aparna

From cjfields at uiuc.edu Fri May 16 11:24:31 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 16 May 2008 10:24:31 -0500
Subject: [Bioperl-l] bl2seq for many
In-Reply-To:
References:
Message-ID:

Not sure what you mean here, as bl2seq uses two sequences. From the
bl2seq doc:

"Bl2seq performs a comparison between two sequences using either the
blastn or blastp algorithm. Both sequences must be either nucleotides
or proteins. The options may be obtained by executing 'bl2seq -'."

If you mean running multiple rounds of bl2seq with varying sequences,
the boilerplate demo in the Bio::Tools::Run::StandAloneBlast synopsis
could be modified to do what you want:

    my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => 'blastp');
    # grab two sequences at a time (loop?) and run bl2seq
    my $bl2seq_report = $factory->bl2seq($seq1, $seq2);

Does this answer your question?

-chris

On May 16, 2008, at 9:32 AM, Aparna Pallavajjala wrote:

> Hi,
>
> I would like to know if anyone tried to do bl2seq for multiple
> sequences at a time?
>
> Please let me know how.
>
> Thx,
> Aparna
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Marie-Claude Hofmann
College of Veterinary Medicine
University of Illinois Urbana-Champaign

From cjfields at uiuc.edu Fri May 16 11:32:44 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 16 May 2008 10:32:44 -0500
Subject: [Bioperl-l] [BioPython] Parsing the pairwise alignments from the FASTA tool
In-Reply-To: <320fb6e00805160810s75f27329yc4fa8d2a1676a1dd@mail.gmail.com>
References: <320fb6e00805160810s75f27329yc4fa8d2a1676a1dd@mail.gmail.com>
Message-ID: <3FE79510-1A20-4863-B2F1-C9AFA13572B7@uiuc.edu>

Peter,

An enhancement request is in place in bugzilla for this, but BioPerl
hasn't implemented parsing -m10 yet. As you indicated this shouldn't
be too hard to implement; just needs someone with the time to code it
up.

chris

On May 16, 2008, at 10:10 AM, Peter wrote:

> ...
> P.S. For anyone interested, BioPerl have had support for the human
> readable FASTA output for a while, and judging from this thread, they
> added support for the FASTA m10 variant last year:
> http://bioperl.org/pipermail/bioperl-l/2007-April/025465.html
> _______________________________________________
> BioPython mailing list - BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Marie-Claude Hofmann
College of Veterinary Medicine
University of Illinois Urbana-Champaign

From bioperlanand at yahoo.com Fri May 16 18:52:27 2008
From: bioperlanand at yahoo.com (Anand Venkatraman)
Date: Fri, 16 May 2008 15:52:27 -0700 (PDT)
Subject: [Bioperl-l] Question on extracting Molecular Weight fields from UniProt records
Message-ID: <631075.78579.qm@web36807.mail.mud.yahoo.com>

Hi everybody,

I would like to know if there is a way in which one can extract the value
from the "Molecular Weight" field of a UniProt record (for example, that value
is "140080 Da" for http://www.pir.uniprot.org/cgi-bin/upEntry?id=MSH6_YEAST ).

I am able to use Bio::DB::SwissProt to get the length and sequence:

    $seq_length = $seq_object->length();
    $sequence_as_a_string = $seq_object->seq();

Is there a similar method to extract the Molecular Weight field?

Thanks in advance,
Anand

From jason at bioperl.org Fri May 16 19:15:44 2008
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 16 May 2008 16:15:44 -0700
Subject: [Bioperl-l] Bio::Align::DNAStatistics
In-Reply-To:
References:
Message-ID: <170D48AE-125E-4F22-8F9C-A8934A3F243E@bioperl.org>

Sounds like an old bioperl version mixture, but I'm not sure; you'll
need to let us know what version of bioperl you have installed.
Please include the mailing list in these types of questions so others
can help.

-jason

On May 15, 2008, at 1:49 PM, Steve Beckstrom-Sternberg wrote:

> Hi Jason,
>
> I have bioperl-run installed and I am testing pairwise_kaks.PLS.
> I am getting an error from Clustalw.pm, and it looks like this is
> because subroutine "new" is calling a subroutine that is not there
> (_set_from_args).
>
> Here is the error:
>
> ./pairwise_kaks.PLS
> Can't locate object method "_set_from_args" via package
> "Bio::Tools::Run::Alignment::Clustalw" at
> /usr/lib/perl5/site_perl/5.8.8/Bio/Tools/Run/Alignment/Clustalw.pm
> line 409.
>
> Here is the version of Clustalw.pm:
> # $Id: Clustalw.pm,v 1.53 2007/06/14 15:23:08 sendu Exp $
>
>
> Any direction on how to resolve this?
>
> Thanks,
>
> Steve
>
>

From jason at bioperl.org Fri May 16 21:33:25 2008
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 16 May 2008 18:33:25 -0700
Subject: [Bioperl-l] give a BioPerl talk at BOSC!
Message-ID: <84BEDFFC-7CBA-4101-B23D-3B7DF7569D36@bioperl.org>

There is an opportunity for someone to speak at BOSC this year about BioPerl. This could be an interesting application of BioPerl in your work or some ideas about designing new components for the toolkit. It doesn't need to be an overview talk about the toolkit.

If you are interested please submit an abstract on the BOSC website.
http://events.open-bio.org/BOSC2008/openconf.php

We'd like submissions to be in by May 18th - if you need additional time please email the committee bosc - at - open-bio.org.

-jason
--
Jason Stajich
jason at bioperl.org

From aparna_pall at hotmail.com Sun May 18 19:07:51 2008
From: aparna_pall at hotmail.com (Aparna Pallavajjala)
Date: Sun, 18 May 2008 19:07:51 -0400
Subject: [Bioperl-l] bl2seq for many
In-Reply-To:
References:
Message-ID:

Hi,

Does anyone know how to run a DOS command from a Perl script? I am trying to run the bl2seq command in a for loop, with the inputs being two filehandles. The general syntax of the command is:

bl2seq -p programname -i firstfilename -j secondfilename -o

which in my program is:

C:\\xxx\\xxx\\xxx\\bl2seq\\bin\\bl2seq -p blastp -i C:\\xxx\\xxx\\xxx\\xx\\FH1 -j C:\\xxx\\xxx\\xxx\\xxx\\FH2 -o C:\\xxx\\xxx\\xxx\\xxx\\FH1.txt");

I am trying to call this using system();

Please help me.

Thanks,
Aparna

> CC: bioperl-l at lists.open-bio.org
> From: cjfields at uiuc.edu
> To: aparna_pall at hotmail.com
> Subject: Re: [Bioperl-l] bl2seq for many
> Date: Fri, 16 May 2008 10:24:31 -0500
>
> Not sure what you mean here, as bl2seq uses two sequences.
> From the bl2seq doc:
> "Bl2seq performs a comparison between two sequences using either the
> blastn or blastp algorithm. Both sequences must be either nucleotides
> or proteins. The options may be obtained by executing 'bl2seq -'."
> If you mean running multiple rounds of bl2seq with varying sequences,
> the boilerplate demo in the Bio::Tools::Run::StandAloneBlast synopsis
> could be modified to do what you want:
>
> my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => 'blastp');
> # grab two sequences at a time (loop?) and run bl2seq
> my $bl2seq_report = $factory->bl2seq($seq1, $seq2);
>
> Does this answer your question?
>
> -chris
>
> On May 16, 2008, at 9:32 AM, Aparna Pallavajjala wrote:
>
> > Hi,
> >
> > I would like to know if anyone tried to do bl2seq for multiple
> > sequences at a time?
> >
> > Please let me know how.
> >
> > Thx,
> > Aparna
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Marie-Claude Hofmann
> College of Veterinary Medicine
> University of Illinois Urbana-Champaign

From Marc.Logghe at ablynx.com Mon May 19 11:35:37 2008
From: Marc.Logghe at ablynx.com (Marc Logghe)
Date: Mon, 19 May 2008 17:35:37 +0200
Subject: [Bioperl-l] Question on extracting Molecular Weight fields from UniProt records
In-Reply-To: <631075.78579.qm@web36807.mail.mud.yahoo.com>
References: <631075.78579.qm@web36807.mail.mud.yahoo.com>
Message-ID: <03C512635899144083CADB0EE222018901B14C32@alpaca.lan.ablynx.com>

Hi Anand,
I don't believe that field is extracted. But you can calculate it using Bio::Tools::SeqStats.

my $seq_stats = Bio::Tools::SeqStats->new(-seq => $seq_object);
my $weight = $seq_stats->get_mol_wt;

HTH,
Marc

Marc Logghe
Senior Bioinformatician
Ablynx nv

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Anand Venkatraman
> Sent: Saturday, May 17, 2008 12:52 AM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Question on extracting Molecular Weight fields from UniProt records
>
> Hi everybody,
>
> I would like to know if there is a way in which one can extract the value
> from "Molecular Weight" field of a UniProt Record (for example that value
> is "140080 Da" for http://www.pir.uniprot.org/cgi-bin/upEntry?id=MSH6_YEAST )
>
> I am able to use Bio::DB::SwissProt to get the length & seq
> $seq_length = $seq_object->length()
> $sequence_as_a_string = $seq_object->seq();
>
> Is there a similar method to extract the Molecular Weight field?
> > Thanks in advance, > > Anand > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Mon May 19 13:07:35 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 19 May 2008 12:07:35 -0500 Subject: [Bioperl-l] Question on extracting Molecular Weight fields fromUniProt records In-Reply-To: <03C512635899144083CADB0EE222018901B14C32@alpaca.lan.ablynx.com> References: <631075.78579.qm@web36807.mail.mud.yahoo.com> <03C512635899144083CADB0EE222018901B14C32@alpaca.lan.ablynx.com> Message-ID: <013C8A2F-CDCE-4100-A8FF-4D14364ED605@uiuc.edu> Information for the SQ line (length, MW, crc64) is not parsed. On output everything is recalculated from scratch, with the MW being recalculated using SeqStats (as indicated by Marc). I would consider this a feature, for if you change/modify the sequence record you would have to rerun all those calculations anyway. chris On May 19, 2008, at 10:35 AM, Marc Logghe wrote: > Hi Anand, > I don't believe that field is extracted. But you can calculate it > using > Bio::Tools::SeqStats. 
> > my $seq_stats = Bio::Tools::SeqStats->new(-seq => $seq_object); > my $weight = $seq_stats->get_mol_wt; > > > HTH, > Marc > > Marc Logghe > Senior Bioinformatician > Ablynx nv >> -----Original Message----- >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >> bounces at lists.open-bio.org] On Behalf Of Anand Venkatraman >> Sent: Saturday, May 17, 2008 12:52 AM >> To: bioperl-l at lists.open-bio.org >> Subject: [Bioperl-l] Question on extracting Molecular Weight fields >> fromUniProt records >> >> Hi everybody, >> >> I would like to know if there is a way in which one can extract the > value >> from "Molecular Weight" field of a UniProt Record (for example that > value >> is "140080 Da" for http://www.pir.uniprot.org/cgi- >> bin/upEntry?id=MSH6_YEAST ) >> >> I am able to use Bio::DB::SwissProt to get the length & seq >> $seq_length = $seq_object->length() >> $sequence_as_a_string = $seq_object->seq(); >> >> Is there a similar method to extract the Molecular Weight field? >> >> Thanks in advance, >> >> Anand >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Marie-Claude Hofmann
College of Veterinary Medicine
University of Illinois Urbana-Champaign

From sbeckstrom at tgen.org Mon May 19 20:14:53 2008
From: sbeckstrom at tgen.org (sbeckstrom at tgen.org)
Date: Mon, 19 May 2008 17:14:53 -0700
Subject: [Bioperl-l] Bio::Align::DNAStatistics
In-Reply-To: <170D48AE-125E-4F22-8F9C-A8934A3F243E@bioperl.org>
Message-ID:

Hi Jason,

The problem seems to be specific to Clustalw.pm, with version listed here:
(# $Id: Clustalw.pm,v 1.53 2007/06/14 15:23:08 sendu Exp $)

Below is subroutine "new", which calls _set_from_args in line 409 of Clustalw.pm:

sub new {
    my ($class, @args) = @_;
    my $self = $class->SUPER::new(@args);

    $self->_set_from_args(\@args, -methods => [@CLUSTALW_PARAMS,
                                               @CLUSTALW_SWITCHES,
                                               @OTHER_SWITCHES],
                                  -create => 1);

    return $self;
}

The error states that it cannot find the _set_from_args method.
And I cannot find a subroutine called _set_from_args in Clustalw.pm.
What am I missing?

Thanks,

Steve

On 5/16/08 4:15 PM, "Jason Stajich" wrote:

> Sounds like an old bioperl version mixture, but I'm not sure, you'll
> need to let us know what version of bioperl you have installed.
>
> Please include the mailing list in these types of questions so others
> can help.
>
> -jason
> On May 15, 2008, at 1:49 PM, Steve Beckstrom-Sternberg wrote:
>
>> Hi Jason,
>>
>> I have bioperl-run installed and I am testing pairwise_kaks.PLS.
>> I am getting an error from Clustalw.pm, and it looks like this is
>> because
>> subroutine "new" is calling a subroutine that is not there
>> (_set_from_args).
>>
>> Here is the error:
>>
>> ./pairwise_kaks.PLS
>> Can't locate object method "_set_from_args" via package
>> "Bio::Tools::Run::Alignment::Clustalw" at
>> /usr/lib/perl5/site_perl/5.8.8/Bio/Tools/Run/Alignment/Clustalw.pm
>> line 409.
>>
>> Here is the version of Clustalw.pm:
>> # $Id: Clustalw.pm,v 1.53 2007/06/14 15:23:08 sendu Exp $
>>
>>
>> Any direction on how to resolve this?
>> >> Thanks, >> >> Steve >> >> From arareko at campus.iztacala.unam.mx Mon May 19 22:07:47 2008 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Mon, 19 May 2008 21:07:47 -0500 Subject: [Bioperl-l] Bio::Align::DNAStatistics In-Reply-To: References: Message-ID: <48323273.8080105@campus.iztacala.unam.mx> Hi Steve, The _set_from_args() method is implemented in the Bio::Root::RootI class, which is a module from the bioperl (core) distribution, not from bioperl-run. In order to use any of the bioperl-run wrappers you'll need to install the core bioperl distribution as well. Hope this helps. Regards, Mauricio. sbeckstrom at tgen.org wrote: > Hi Jason, > > The problem seems to be specific to Clustalw.pm, with version listed here: > (# $Id: Clustalw.pm,v 1.53 2007/06/14 15:23:08 sendu Exp $) > > Below is subroutine "new", which calls _set_from_args in line 409 of > Clustalw.pm: > > sub new { > my ($class, at args) = @_; > my $self = $class->SUPER::new(@args); > > $self->_set_from_args(\@args, -methods => [@CLUSTALW_PARAMS, > @CLUSTALW_SWITCHES, > @OTHER_SWITCHES], > -create => 1); > > return $self; > } > > The error states that it cannot find the _set_from_args method. > And I cannot find a subroutine called _set_from_args in Clustalw.pm. > What am I missing? > > Thanks, > > Steve > > > > > On 5/16/08 4:15 PM, "Jason Stajich" wrote: > >> Sounds like an old bioperl version mixture, but I'm not sure, you'll >> need to let us know what version of bioperl you have installed. >> >> Please include the mailing list in these types of questions so others >> can help. >> >> -jason >> On May 15, 2008, at 1:49 PM, Steve Beckstrom-Sternberg wrote: >> >>> Hi Jason, >>> >>> I have bioperl-run installed and I am testing pairwise_kaks.PLS. >>> I am getting an error from Clustalw.pm, and it looks like this is >>> because >>> subroutine ?new? is calling a subroutine that is not there >>> (_set_from_args). 
>>> >>> Here is the error: >>> >>> ./pairwise_kaks.PLS >>> Can't locate object method "_set_from_args" via package >>> "Bio::Tools::Run::Alignment::Clustalw" at >>> /usr/lib/perl5/site_perl/5.8.8/Bio/Tools/Run/Alignment/Clustalw.pm >>> line 409. >>> >>> Here is the version of Clustalw.pm: >>> # $Id: Clustalw.pm,v 1.53 2007/06/14 15:23:08 sendu Exp $ >>> >>> >>> Any direction on how to resolve this? >>> >>> Thanks, >>> >>> Steve >>> >>> > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Gen?tica Unidad de Morfofisiolog?a y Funci?n Facultad de Estudios Superiores Iztacala, UNAM From arareko at campus.iztacala.unam.mx Mon May 19 23:18:38 2008 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Mon, 19 May 2008 22:18:38 -0500 Subject: [Bioperl-l] Bio::Align::DNAStatistics In-Reply-To: References: Message-ID: <4832430E.7060508@campus.iztacala.unam.mx> The version you have is quite old, Sendu added _set_from_args() just after the one you have. Current one has the following id: $Id: RootI.pm 11557 2007-07-05 15:46:22Z sendu $ You should checkout a more recent version of bioperl. Mauricio. Steve Beckstrom-Sternberg wrote: > Hi Mauricio, > > I checked out RootI.pm in Bio::Root and did not find the _set_from_args() > method. Below is the version of RootI.pm. Is it too old? > > # $Id: RootI.pm,v 1.69.4.4 2006/10/02 23:10:23 sendu Exp $ > > > > Thanks, > > Steve > > > > On 5/19/08 7:07 PM, "Mauricio Herrera Cuadra" > wrote: > >> Hi Steve, >> >> The _set_from_args() method is implemented in the Bio::Root::RootI >> class, which is a module from the bioperl (core) distribution, not from >> bioperl-run. In order to use any of the bioperl-run wrappers you'll need >> to install the core bioperl distribution as well. >> >> Hope this helps. >> >> Regards, >> Mauricio. 
>> >> sbeckstrom at tgen.org wrote: >>> Hi Jason, >>> >>> The problem seems to be specific to Clustalw.pm, with version listed here: >>> (# $Id: Clustalw.pm,v 1.53 2007/06/14 15:23:08 sendu Exp $) >>> >>> Below is subroutine "new", which calls _set_from_args in line 409 of >>> Clustalw.pm: >>> >>> sub new { >>> my ($class, at args) = @_; >>> my $self = $class->SUPER::new(@args); >>> >>> $self->_set_from_args(\@args, -methods => [@CLUSTALW_PARAMS, >>> @CLUSTALW_SWITCHES, >>> @OTHER_SWITCHES], >>> -create => 1); >>> >>> return $self; >>> } >>> >>> The error states that it cannot find the _set_from_args method. >>> And I cannot find a subroutine called _set_from_args in Clustalw.pm. >>> What am I missing? >>> >>> Thanks, >>> >>> Steve >>> >>> >>> >>> >>> On 5/16/08 4:15 PM, "Jason Stajich" wrote: >>> >>>> Sounds like an old bioperl version mixture, but I'm not sure, you'll >>>> need to let us know what version of bioperl you have installed. >>>> >>>> Please include the mailing list in these types of questions so others >>>> can help. >>>> >>>> -jason >>>> On May 15, 2008, at 1:49 PM, Steve Beckstrom-Sternberg wrote: >>>> >>>>> Hi Jason, >>>>> >>>>> I have bioperl-run installed and I am testing pairwise_kaks.PLS. >>>>> I am getting an error from Clustalw.pm, and it looks like this is >>>>> because >>>>> subroutine ?new? is calling a subroutine that is not there >>>>> (_set_from_args). >>>>> >>>>> Here is the error: >>>>> >>>>> ./pairwise_kaks.PLS >>>>> Can't locate object method "_set_from_args" via package >>>>> "Bio::Tools::Run::Alignment::Clustalw" at >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Tools/Run/Alignment/Clustalw.pm >>>>> line 409. >>>>> >>>>> Here is the version of Clustalw.pm: >>>>> # $Id: Clustalw.pm,v 1.53 2007/06/14 15:23:08 sendu Exp $ >>>>> >>>>> >>>>> Any direction on how to resolve this? 
>>>>> >>>>> Thanks, >>>>> >>>>> Steve >>>>> >>>>> >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> > > > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Gen?tica Unidad de Morfofisiolog?a y Funci?n Facultad de Estudios Superiores Iztacala, UNAM From RChu at coh.org Tue May 20 03:09:56 2008 From: RChu at coh.org (Chu, Roy) Date: Tue, 20 May 2008 00:09:56 -0700 Subject: [Bioperl-l] SearchIO: write/read database Message-ID: <58C9944D6E1A894EA9BC47DEB89EB5A633876C@EXCH-VS2.coh.org> Hi, I tried a mailing-list and module search, but I couldn't find what I was looking for--but I know it's there. On the bioperl page (http://www.bioperl.org/wiki/HOWTO:SearchIO#Writing_and_formatting_output) it says: "If your data is instead stored in a database you could build the Bio::Search objects up in memory directly from your database and then use the Writer object to output the data." I want to write/read BLAST results to a database w/o having to write to files or parse any xml. Can someone direct me to any available module(s) that will help me in my endeavor? Thanks in advance, Roy --------------------------------------------------------------------- SECURITY/CONFIDENTIALITY WARNING: This message and any attachments are intended solely for the individual or entity to which they are addressed. This communication may contain information that is privileged, confidential, or exempt from disclosure under applicable law (e.g., personal health information, research data, financial information). Because this e-mail has been sent without encryption, individuals other than the intended recipient may be able to view the information, forward it to others or tamper with the information without the knowledge or consent of the sender. 
If you are not the intended recipient, or the employee or person responsible for delivering the message to the intended recipient, any dissemination, distribution or copying of the communication is strictly prohibited. If you received the communication in error, please notify the sender immediately by replying to this message and deleting the message and any accompanying files from your system. If, due to the security risks, you do not wish to receive further communications via e-mail, please reply to this message and inform the sender that you do not wish to receive further e-mail from the sender. --------------------------------------------------------------------- From bix at sendu.me.uk Tue May 20 05:52:03 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 20 May 2008 10:52:03 +0100 Subject: [Bioperl-l] SearchIO: write/read database In-Reply-To: <58C9944D6E1A894EA9BC47DEB89EB5A633876C@EXCH-VS2.coh.org> References: <58C9944D6E1A894EA9BC47DEB89EB5A633876C@EXCH-VS2.coh.org> Message-ID: <48329F43.6000709@sendu.me.uk> Chu, Roy wrote: > Hi, > > I tried a mailing-list and module search, but I couldn't find what I > was looking for--but I know it's there. > > On the bioperl page > (http://www.bioperl.org/wiki/HOWTO:SearchIO#Writing_and_formatting_output) > it says: "If your data is instead stored in a database you could > build the Bio::Search objects up in memory directly from your > database and then use the Writer object to output the data." > > I want to write/read BLAST results to a database w/o having to write > to files or parse any xml. Can someone direct me to any available > module(s) that will help me in my endeavor? I've not used Writer so perhaps someone has a better answer, but... Bio::Tools::Run::StandAloneBlast always creates a temporary file at least, so you may not want to use that to run your blasts. Let's say you arrange for the output of your blast executable to be (perhaps indirectly) piped into your database. 
If your question is then how to parse that data in your database without creating a file on disc, just arrange to pull the data out of your database into memory as a perl string, treat that string as a file handle, then supply that FH to a SearchIO. Using Bio::SearchIO::blast_pull here may be beneficial.

Alternatively, if you wanted to store parsed results in your db, again you could arrange to run your blast executable but this time pipe into blast_pull (which avoids any temp files), and store the desired Bio::Search::* objects in your db for later retrieval.

From David.Messina at sbc.su.se Tue May 20 13:10:26 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 20 May 2008 19:10:26 +0200
Subject: [Bioperl-l] SearchIO: write/read database
In-Reply-To: <48329F43.6000709@sendu.me.uk>
References: <58C9944D6E1A894EA9BC47DEB89EB5A633876C@EXCH-VS2.coh.org> <48329F43.6000709@sendu.me.uk>
Message-ID: <628aabb70805201010r3dfa993ei26750e6584b39fb@mail.gmail.com>

Echoing what Sendu said, the most straightforward way to read output from a program without writing to a file first is to open a pipe from the program as a filehandle.

e.g.
my @args = ('/usr/bin/blastp', '/path/to/my/blastdb', '/path/to/my/blastquery');
open (my $blast_fh, '-|', @args) or die "couldn't open blast stream";

Now you can use that filehandle to read in data and manipulate it just as if you had read it in from a file (except of course since it's a stream you can't rewind).
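
As a rough sketch of the two approaches mentioned in this thread (a string treated as a filehandle, and a pipe opened from a program), the following uses only core Perl. The report text and the child command are stand-ins rather than real BLAST output, slurp_lines is a hypothetical helper, and the Bio::SearchIO call in the comment only marks where such a handle could be handed to a parser, assuming BioPerl is installed.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line from an already-open filehandle.
sub slurp_lines {
    my ($fh) = @_;
    my @lines = <$fh>;
    chomp @lines;
    return @lines;
}

# 1. A string treated as a filehandle (core Perl 5.8+, no file on disk).
#    $report_text is a stand-in for report text fetched from a database.
my $report_text = "BLASTP placeholder line 1\nplaceholder line 2\n";
open(my $string_fh, '<', \$report_text)
    or die "couldn't open in-memory filehandle: $!";
my @from_string = slurp_lines($string_fh);
close $string_fh;

# 2. A pipe opened from a child process, in list form.
#    The child here is the running perl itself ($^X) printing one line;
#    a real script would put the blast executable and its arguments in @args.
my @args = ($^X, '-e', 'print "output from child\n"');
open(my $pipe_fh, '-|', @args) or die "couldn't open pipe: $!";
my @from_pipe = slurp_lines($pipe_fh);
close $pipe_fh;

# Either handle could instead be passed to a parser, e.g. (assumes BioPerl):
#   my $searchio = Bio::SearchIO->new(-format => 'blast', -fh => $string_fh);

print scalar(@from_string), " line(s) from string, ",
      scalar(@from_pipe), " line(s) from pipe\n";
```

The list form of the pipe open avoids passing the command through a shell, so paths containing spaces need no extra quoting.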
Dave

From RChu at coh.org Tue May 20 13:22:50 2008
From: RChu at coh.org (Chu, Roy)
Date: Tue, 20 May 2008 10:22:50 -0700
Subject: [Bioperl-l] SearchIO: write/read database
In-Reply-To: <628aabb70805201010r3dfa993ei26750e6584b39fb@mail.gmail.com>
References: <58C9944D6E1A894EA9BC47DEB89EB5A633876C@EXCH-VS2.coh.org> <48329F43.6000709@sendu.me.uk> <628aabb70805201010r3dfa993ei26750e6584b39fb@mail.gmail.com>
Message-ID: <58C9944D6E1A894EA9BC47DEB89EB5A62724EC@EXCH-VS2.coh.org>

I've never really explored using pipes beyond the shell environment and writing images w/ the bioperl GD module, but it's always a good time to learn! If I can't get it to work, then I'll just parse the xml output.

Thanks,
-Roy

________________________________

From: dave at davemessina.com [mailto:dave at davemessina.com] On Behalf Of Dave Messina
Sent: Tuesday, May 20, 2008 10:10 AM
To: Chu, Roy
Cc: Sendu Bala; bioperl-l at bioperl.org
Subject: Re: [Bioperl-l] SearchIO: write/read database

Echoing what Sendu said, the most straightforward way to read output from a program without writing to a file first is to open a pipe from the program as a filehandle.

e.g.
my @args = ('/usr/bin/blastp', '/path/to/my/blastdb', '/path/to/my/blastquery');
open (my $blast_fh, '-|', @args) or die "couldn't open blast stream";

Now you can use that filehandle to read in data and manipulate just as if you had read it in from a file (except of course since it's a stream you can't rewind).

Dave

From cjfields at uiuc.edu Tue May 20 13:23:19 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 20 May 2008 12:23:19 -0500
Subject: [Bioperl-l] SearchIO: write/read database
In-Reply-To: <628aabb70805201010r3dfa993ei26750e6584b39fb@mail.gmail.com>
References: <58C9944D6E1A894EA9BC47DEB89EB5A633876C@EXCH-VS2.coh.org> <48329F43.6000709@sendu.me.uk> <628aabb70805201010r3dfa993ei26750e6584b39fb@mail.gmail.com>
Message-ID:

On May 20, 2008, at 12:10 PM, Dave Messina wrote:

> Echoing what Sendu said, the most straightforward way to read output
> from a
> program without writing to a file first is to open a pipe from the
> program
> as a filehandle.
>
> e.g.
> my @args = ('/usr/bin/blastp', '/path/to/my/blastdb',
> '/path/to/my/blastquery');
> open (my $blast_fh, '-|', @args) or die "couldn't open blast stream";
>
> Now you can use that filehandle to read in data and manipulate just
> as if
> you had read it in from a file (except of course since it's a stream
> you
> can't rewind).
> > > Dave I think that works with everything except Win32 (unless more progress has been made on that front), which if I recall doesn't deal well with pipes/forks. That may have changed, though; anyone know? chris From cjfields at uiuc.edu Tue May 20 13:40:30 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 20 May 2008 12:40:30 -0500 Subject: [Bioperl-l] SearchIO: write/read database In-Reply-To: <58C9944D6E1A894EA9BC47DEB89EB5A62724EC@EXCH-VS2.coh.org> References: <58C9944D6E1A894EA9BC47DEB89EB5A633876C@EXCH-VS2.coh.org> <48329F43.6000709@sendu.me.uk> <628aabb70805201010r3dfa993ei26750e6584b39fb@mail.gmail.com> <58C9944D6E1A894EA9BC47DEB89EB5A62724EC@EXCH-VS2.coh.org> Message-ID: Jason has a starter script available in scripts/graphics/ search_overview.PLS (should be installed as bp_search_overview.pl if you elected to install scripts). It can draw a simple graphical search overview using Bio::Search objects; might be worth a look. chris On May 20, 2008, at 12:22 PM, Chu, Roy wrote: > I've never really explored using pipes beyond the shell envrionment > and > writing images w/ the bioperl GD module but always a good time to > learn! > If I can't get it to work, then I'll just parse the xml output. > > Thanks, > -Roy > > ________________________________ > > From: dave at davemessina.com [mailto:dave at davemessina.com] On Behalf Of > Dave Messina > Sent: Tuesday, May 20, 2008 10:10 AM > To: Chu, Roy > Cc: Sendu Bala; bioperl-l at bioperl.org > Subject: Re: [Bioperl-l] SearchIO: write/read database > > > Echoing what Sendu said, the most straightforward way to read output > from a program without writing to a file first is to open a pipe from > the program as a filehandle. > > e.g. 
> my @args = ('/usr/bin/blastp', '/path/to/my/blastdb', > '/path/to/my/blastquery'); > open (my $blast_fh, '-|', @args) or die "couldn't open blast stream"; > > Now you can use that filehandle to read in data and manipulate just as > if you had read it in from a file (except of course since it's a > stream > you can't rewind). > > > Dave > > > --------------------------------------------------------------------- > > SECURITY/CONFIDENTIALITY WARNING: > This message and any attachments are intended solely for the > individual or entity to which they are addressed. This communication > may contain information that is privileged, confidential, or exempt > from disclosure under applicable law (e.g., personal health > information, research data, financial information). Because this e- > mail has been sent without encryption, individuals other than the > intended recipient may be able to view the information, forward it > to others or tamper with the information without the knowledge or > consent of the sender. If you are not the intended recipient, or the > employee or person responsible for delivering the message to the > intended recipient, any dissemination, distribution or copying of > the communication is strictly prohibited. If you received the > communication in error, please notify the sender immediately by > replying to this message and deleting the message and any > accompanying files from your system. If, due to the security risks, > you do not wi! > sh to receive further communications via e-mail, please reply to > this message and inform the sender that you do not wish to receive > further e-mail from the sender. > > --------------------------------------------------------------------- > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From artendulkar at gmail.com Tue May 20 16:20:32 2008 From: artendulkar at gmail.com (artendulkar at gmail.com) Date: Tue, 20 May 2008 15:20:32 -0500 Subject: [Bioperl-l] How to extract list of SNPs for a given gene? Message-ID: <6dbb54fe0805201320s7382aba9k3af82a98e1cee5c1@mail.gmail.com> Hi, Can anyone please tell me how to get list of SNPs in any particular gene using BioPerl, given NCBI Gene ID? Is there any method, which takes NCBI gene ID as argument and returns list of SNPs by connecting to dbSNP? Thank you. Abhijit From jason at bioperl.org Tue May 20 23:40:49 2008 From: jason at bioperl.org (Jason Stajich) Date: Tue, 20 May 2008 21:40:49 -0600 Subject: [Bioperl-l] How to extract list of SNPs for a given gene? In-Reply-To: <6dbb54fe0805201320s7382aba9k3af82a98e1cee5c1@mail.gmail.com> References: <6dbb54fe0805201320s7382aba9k3af82a98e1cee5c1@mail.gmail.com> Message-ID: <33EE9D35-22F3-4E5C-8E05-AFF9878C6A4E@bioperl.org> is this in humans? If so, You can do this in BioMart http:// www.biomart.org/ For humans go here: http://www.ensembl.org/biomart/martview/ -jason On May 20, 2008, at 2:20 PM, artendulkar at gmail.com wrote: > Hi, > Can anyone please tell me how to get list of SNPs in any particular > gene > using BioPerl, given NCBI Gene ID? > Is there any method, which takes NCBI gene ID as argument and > returns list > of SNPs by connecting to dbSNP? > Thank you. > Abhijit > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From vdar at yorku.ca Wed May 21 14:41:00 2008 From: vdar at yorku.ca (nisa_dar) Date: Wed, 21 May 2008 11:41:00 -0700 (PDT) Subject: [Bioperl-l] I can't access clustalw from my cgi perl program... 
Message-ID: <17367665.post@talk.nabble.com> Hi, My multiple alignment program works fine from command line but when i put the same piece of code in my cgi perl program, it gives me the same error, which is, Can't locate Bio/Tools/Run/Alignment/Clustalw.pm in @INC (@INC contains: /export/share/iNquiry/perl/lib/5.8.5/x86_64-linux-thread-multi /export/share/iNquiry/perl/lib/5.8.5 /export/share/iNquiry/perl/lib/x86_64-linux-thread-multi /export/share/iNquiry/perl/lib/5.8.4 /export/share/iNquiry/perl/lib/5.8.3 /export/share/iNquiry/perl/lib/5.8.2 /export/share/iNquiry/perl/lib/5.8.1 /export/share/iNquiry/perl/lib/5.8.0 /export/share/iNquiry/perl/lib /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.4/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.3/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.2/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.1/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.0/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl/5.8.4 /usr/lib/perl5/site_perl/5.8.3 /usr/lib/perl5/site_perl/5.8.2 /usr/lib/perl5/site_perl/5.8.1 /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.4/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.2/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.1/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.0/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl/5.8.4 /usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/perl5/vendor_perl/5.8.2 /usr/lib/perl5/vendor_perl/5.8.1 /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl .) at /export/share/iNquiry/www/cgi-bin/bipod/nisa/snpfinder.cgi line 14. 
BEGIN failed--compilation aborted at /export/share/iNquiry/www/cgi-bin/bipod/nisa/snpfinder.cgi line 14.

Here is my CGI Perl program:

    #!/usr/bin/perl -w

    use strict;
    use CGI qw(:standard);
    use CGI::Carp qw/fatalsToBrowser/;
    use Bio::SeqIO;
    use Bio::Align::AlignI;
    use Bio::AlignIO;
    use Bio::AlignIO::msf;
    use Bio::SimpleAlign;
    use Bio::PrimarySeq;
    use Bio::Tools::Run::Alignment::Clustalw;
    use Bio::PrimarySeqI;
    use Bio::Root::IO;
    use Bio::Seq;
    use Bio::TreeIO;
    use Bio::Root::Root;
    use Bio::Tools::Run::WrapperBase;
    use Bio::LocatableSeq;

    BEGIN {
        $ENV{CLUSTALDIR} = '/opt/Bio/bin/clustalw';
    }
    print "Content-type: text/html\n\n";

    if (param()) {    # condition if user supplied some data
        my $new_seq   = param("sequence");      # to store sequence from text box of form
        my $selection = param("size");          # to store drop down menu selection for translation type
        my $file1     = param("uploadfile");    # variable for name of the file
        my $address   = param("email");         # variable to hold e.mail address
        my $report    = '';                     # text of the report that is mailed below

        if ($file1 =~ /.+/) {    # test condition, if user supplied a file
            my $R_FH = upload("uploadfile");    # get file handle

            my $in  = Bio::AlignIO->new(-file => $file1, -format => 'fasta');
            my $out = Bio::AlignIO->new(-file => ">out.aln.pfam", -format => 'pfam');

            while ( my $aln1 = $in->next_aln() ) {
                $out->write_aln($aln1);
            }

            close $R_FH;
            # "or" instead of "||", so that die applies to open and not to the file name
            open(FH, "<", "out.aln.pfam") or die "Alignment file doesn't exist\n";
            while (<FH>) {
                print $_, "\n";
                $report .= $_;    # collect the lines for the e.mail
            }
            close FH;

            # To send results through e.mail
            $report =~ s/<br>/\n/g;       # change new line character
            $report =~ s/(<.+?>)//g;      # remove html tags
            $report =~ s/(&nbsp;)/ /g;    # change space character
            open(MAIL, "|/usr/sbin/sendmail -t") or die "can't compose an e.mail";
            print MAIL "To:$address\n";
            print MAIL "From: nisa\n";
            print MAIL "Subject: results of $file1\n";
            print MAIL "\n$report\n";
            close(MAIL);    # close mail
        } else {    # error message if no file name is entered
            print "Error: You did not enter a file name, please upload a file";
        }
    }

Please let me know what I should do, because clustalw is there and my other (non web-based) programs are working.

Thanks
Nisa

From sharpton at berkeley.edu Wed May 21 14:51:14 2008
From: sharpton at berkeley.edu (Thomas Sharpton)
Date: Wed, 21 May 2008 11:51:14 -0700
Subject: [Bioperl-l] I can't access clustalw from my cgi perl program...
In-Reply-To: <17367665.post@talk.nabble.com>
References: <17367665.post@talk.nabble.com>
Message-ID: <48346F22.5060806@berkeley.edu>

Hi Nisa,

Looks to me like you need to add BioPerl to your PERL5LIB environment variable. Until you set this up, Perl doesn't know where on your machine to find the BioPerl code. On a UNIX/Linux/Mac OS X based system, the fix is as follows:

From http://www.bioperl.org/wiki/Using_Subversion :

Tell perl where to find BioPerl (assuming you checked out the code in $HOME/src; set this in your .bash_profile, .profile, or .cshrc):

bash: $ export PERL5LIB="$HOME/src/bioperl-live:$PERL5LIB"
tcsh: $ setenv PERL5LIB "$HOME/src/bioperl-live:$PERL5LIB"

On a Windows machine, you can set environment variables either through autoexec.bat or via the Control Panel.

Feel free to back channel me if you need help.
Cheers,
Tom

--
Thomas Sharpton
PhD Candidate - UC Berkeley
Search smarter: www.siphs.com

From vdar at yorku.ca Wed May 21 15:18:22 2008
From: vdar at yorku.ca (vdar at yorku.ca)
Date: Wed, 21 May 2008 15:18:22 -0400
Subject: [Bioperl-l] I can't access clustalw from my cgi perl program...
In-Reply-To: <48346F22.5060806@berkeley.edu>
References: <17367665.post@talk.nabble.com> <48346F22.5060806@berkeley.edu>
Message-ID: <1211397502.4834757e732c3@mymail.yorku.ca>

Hi,

This is my .bashrc file and perl can see bioperl because my commandline programs work fine, the problem is with web-based programs

##############################################
# .bashrc

# User specific aliases and functions

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi
export PERL5LIB="/opt/rocks/lib/perl5/site_perl/5.8.8"
################################################

Please let me know if I need to change it now...

Thanks
Nisa

Quoting Thomas Sharpton :

> Hi Nisa,
>
> Looks to me like you need to add BioPerl to your PERL5LIB environmental
> variable. Until you set this up, Perl doesn't know where on your
> machine to find the Bioperl code. On a UNIX/Linux/Mac OSX based system,
> the fix is as follows:
>
> From http://www.bioperl.org/wiki/Using_Subversion :
>
> Tell perl where to find BioPerl (assuming you checked out the code in
> $HOME/src; set this in your .bash_profile, .profile, or .cshrc):
>
> bash: $ export PERL5LIB="$HOME/src/bioperl-live:$PERL5LIB"
> tcsh: $ setenv PERL5LIB "$HOME/src/bioperl-live:$PERL5LIB"
>
> On a windows machine, you can either environmental variables through
> autoexec.bat or via the Control Panel.
>
> Feel free to back channel me if you need help.
>
> Cheers,
> Tom

From David.Messina at sbc.su.se Wed May 21 15:44:28 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 21 May 2008 21:44:28 +0200
Subject: [Bioperl-l] I can't access clustalw from my cgi perl program...
In-Reply-To: <17367665.post@talk.nabble.com>
References: <17367665.post@talk.nabble.com>
Message-ID: <628aabb70805211244r63cce0detc585b629d6677b81@mail.gmail.com>

Since your script runs correctly from the command line, this doesn't look like it's a BioPerl problem.

The error message you got is:

Can't locate Bio/Tools/Run/Alignment/Clustalw.pm

followed by a long list of directories where it looked for that module. So the first thing to check is:

Is Bio/Tools/Run/Alignment/Clustalw.pm in one of those @INC directories?

The fact that other Bioperl modules are 'use'd in your script first and didn't produce an error suggests that you might have the Bioperl core installation in those directories, but not Bio::Tools::Run.

If Bio/Tools/Run/Alignment/Clustalw.pm is in fact in the @INC directories listed, then it's probably a CGI/web issue. Do you know which user web scripts are run as on your machine? That user probably has limited permissions compared to your regular user account.

Dave

From arareko at campus.iztacala.unam.mx Wed May 21 16:01:42 2008
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 21 May 2008 15:01:42 -0500
Subject: [Bioperl-l] I can't access clustalw from my cgi perl program...
In-Reply-To: <1211397502.4834757e732c3@mymail.yorku.ca>
References: <17367665.post@talk.nabble.com> <48346F22.5060806@berkeley.edu> <1211397502.4834757e732c3@mymail.yorku.ca>
Message-ID: <48347FA6.4090301@campus.iztacala.unam.mx>

Hi Nisa,

CGI scripts are generally run by a different user than you, and which user (e.g. apache, nobody) depends on the platform you're running the script on. The environment variables you currently have in your login shell are therefore not inherited by the web interface. The best workaround for this is to add a 'use lib' pragma at the top of your CGI script:

use lib '/path/to/your/bioperl/installation/';

Also, it's good practice to use taint mode for CGI scripts:

#!/usr/bin/perl -wT

Hope this helps.

Regards,
Mauricio.

--
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From vdar at yorku.ca Wed May 21 16:34:05 2008
From: vdar at yorku.ca (vdar at yorku.ca)
Date: Wed, 21 May 2008 16:34:05 -0400
Subject: [Bioperl-l] I can't access clustalw from my cgi perl program...
In-Reply-To: <48347FA6.4090301@campus.iztacala.unam.mx>
References: <17367665.post@talk.nabble.com> <48346F22.5060806@berkeley.edu> <1211397502.4834757e732c3@mymail.yorku.ca> <48347FA6.4090301@campus.iztacala.unam.mx>
Message-ID: <1211402045.4834873deb708@mymail.yorku.ca>

How can I find where bioperl is installed?

Quoting Mauricio Herrera Cuadra :

> Hi Nisa,
>
> CGI scripts are generally run by a different user than you, and which
> user (e.g. apache, nobody) will depend on the platform you're running
> the script on, thus the environment variables you currently have for
> your login shell are not being inherited to the web interface. The best
> workaround for this is to add a 'use lib' pragma at the top of your CGI
> script:
>
> use lib '/path/to/your/bioperl/installation/';
>
> Also, it's a good practice to use taint mode for CGI scripts:
>
> #!/usr/bin/perl -wT
>
> Hope this helps.
>
> Regards,
> Mauricio.
From bix at sendu.me.uk Wed May 21 17:02:08 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 21 May 2008 22:02:08 +0100
Subject: [Bioperl-l] I can't access clustalw from my cgi perl program...
In-Reply-To: <1211402045.4834873deb708@mymail.yorku.ca>
References: <17367665.post@talk.nabble.com> <48346F22.5060806@berkeley.edu> <1211397502.4834757e732c3@mymail.yorku.ca> <48347FA6.4090301@campus.iztacala.unam.mx> <1211402045.4834873deb708@mymail.yorku.ca>
Message-ID: <48348DD0.2040705@sendu.me.uk>

vdar at yorku.ca wrote:
> How can I find where bioperl is installed?

You already know where it's installed. See below.

> Quoting Mauricio Herrera Cuadra :
>
>> Hi Nisa,
>>
>> CGI scripts are generally run by a different user than you, and which
>> user (e.g. apache, nobody) will depend on the platform you're running
>> the script on, thus the environment variables you currently have for
>> your login shell are not being inherited to the web interface. The best
>> workaround for this is to add a 'use lib' pragma at the top of your CGI
>> script:
>>
>> use lib '/path/to/your/bioperl/installation/';

[...]
>> vdar at yorku.ca wrote:
>>> export PERL5LIB="/opt/rocks/lib/perl5/site_perl/5.8.8"

From cjfields at uiuc.edu Wed May 21 17:49:15 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 21 May 2008 16:49:15 -0500
Subject: [Bioperl-l] I can't access clustalw from my cgi perl program...
In-Reply-To: <1211402045.4834873deb708@mymail.yorku.ca>
References: <17367665.post@talk.nabble.com> <48346F22.5060806@berkeley.edu> <1211397502.4834757e732c3@mymail.yorku.ca> <48347FA6.4090301@campus.iztacala.unam.mx> <1211402045.4834873deb708@mymail.yorku.ca>
Message-ID: <5FF34D3D-4B09-4F3D-9579-7F6D7FB1C185@uiuc.edu>

Bio::Tools::Run::Alignment::Clustalw is not in bioperl-live; it is in bioperl-run (a separate distribution). You'll need to install that as well (after bioperl-live is installed).

Use 'perldoc -l' to find the location of the module or script that it finds in @INC (which is the one used by the system most of the time):

cjfields$ perldoc -l Bio::Root::Root
/Users/cjfields/bioperl/bioperl-live/Bio/Root/Root.pm

I don't think this will help for CGI, though, if you have PERL5LIB set up only for your local environment (as Mauricio indicates). You'll need to 'use lib'.

-chris

On May 21, 2008, at 3:34 PM, vdar at yorku.ca wrote:

> How can I find where bioperl is installed?
>
> Quoting Mauricio Herrera Cuadra :
>
>> Hi Nisa,
>>
>> CGI scripts are generally run by a different user than you, and which
>> user (e.g. apache, nobody) will depend on the platform you're running
>> the script on, thus the environment variables you currently have for
>> your login shell are not being inherited to the web interface. The best
>> workaround for this is to add a 'use lib' pragma at the top of your CGI
>> script:
>>
>> use lib '/path/to/your/bioperl/installation/';
>>
>> Also, it's a good practice to use taint mode for CGI scripts:
>>
>> #!/usr/bin/perl -wT
>>
>> Hope this helps.
>>
>> Regards,
>> Mauricio.
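For anyone wanting the same check as 'perldoc -l' from inside a script (for example one running under the web server), something along these lines might do. It is only a sketch: locate_module is a made-up helper, not a BioPerl or core Perl function, and $candidate_dir is a placeholder that has to be replaced with a real directory.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder (assumption): the directory that contains the Bio/ tree.
my $candidate_dir = '/path/to/your/bioperl/installation';

# Hypothetical helper: return the path of the first file in @dirs that
# would satisfy "use $module", or undef if none of them has it.
sub locate_module {
    my ($module, @dirs) = @_;
    (my $rel = "$module.pm") =~ s{::}{/}g;    # Bio::SeqIO -> Bio/SeqIO.pm
    for my $dir (@dirs) {
        next if ref $dir;                     # skip @INC hooks (code refs)
        return "$dir/$rel" if -e "$dir/$rel";
    }
    return;
}

# Plain-text header so the script can also be run through the web server.
print "Content-type: text/plain\n\n";

for my $module ('Bio::Root::Root', 'Bio::Tools::Run::Alignment::Clustalw') {
    my $found = locate_module($module, $candidate_dir, @INC);
    print "$module: ", (defined $found ? $found : 'not found'), "\n";
}
```

If a module only shows up under the candidate directory and not under @INC, that directory is presumably the one to pass to 'use lib' at the top of the CGI script.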
>> >> vdar at yorku.ca wrote: >>> Hi, >>> >>> This is my .bashrc file and perl can see bioperl because my >>> commandline >> programs >>> work fine, the problem is with web-based programs >>> >>> ############################################## >>> # .bashrc >>> >>> # User specific aliases and functions >>> >>> # Source global definitions >>> if [ -f /etc/bashrc ]; then >>> . /etc/bashrc >>> >>> fi >>> export PERL5LIB="/opt/rocks/lib/perl5/site_perl/5.8.8" >>> >>> >>> ################################################ >>> >>> Please let me know if I need to change it now... >>> >>> Thanks >>> Nisa >>> >>> >>> Quoting Thomas Sharpton : >>> >>>> Hi Nisa, >>>> >>>> Looks to me like you need to add BioPerl to your PERL5LIB >>>> environmental >>>> variable. Until you set this up, Perl doesn't know where on your >>>> machine to find the Bioperl code. On a UNIX/Linux/Mac OSX based >>>> system, >>>> the fix is as follows: >>>> >>>> From http://www.bioperl.org/wiki/Using_Subversion : >>>> >>>> Tell perl where to find BioPerl (assuming you checked out the >>>> code in >>>> $HOME/src; set this in your .bash_profile, .profile, or .cshrc): >>>> >>>> bash: $ export PERL5LIB="$HOME/src/bioperl-live:$PERL5LIB" >>>> tcsh: $ setenv PERL5LIB "$HOME/src/bioperl-live:$PERL5LIB" >>>> >>>> On a windows machine, you can either environmental variables >>>> through >>>> autoexec.bat or via the Control Panel. >>>> >>>> Feel free to back channel me if you need help. 
>>>> >>>> Cheers, >>>> Tom >>>> >>>> >>>> nisa_dar wrote: >>>>> Hi, >>>>> >>>>> My multiple alignment program works fine from command line but >>>>> when i put >>>>> the >>>>> same piece of code in my cgi perl program, it gives me the same >>>>> error, >>>> which >>>>> is, >>>>> >>>>> Can't locate Bio/Tools/Run/Alignment/Clustalw.pm in @INC (@INC >>>>> contains: >>>>> /export/share/iNquiry/perl/lib/5.8.5/x86_64-linux-thread-multi >>>>> /export/share/iNquiry/perl/lib/5.8.5 >>>>> /export/share/iNquiry/perl/lib/x86_64-linux-thread-multi >>>>> /export/share/iNquiry/perl/lib/5.8.4 /export/share/iNquiry/perl/ >>>>> lib/5.8.3 >>>>> /export/share/iNquiry/perl/lib/5.8.2 /export/share/iNquiry/perl/ >>>>> lib/5.8.1 >>>>> /export/share/iNquiry/perl/lib/5.8.0 /export/share/iNquiry/perl/ >>>>> lib >>>>> /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi /usr/lib/ >>>>> perl5/5.8.5 >>>>> /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi >>>>> /usr/lib64/perl5/site_perl/5.8.4/x86_64-linux-thread-multi >>>>> /usr/lib64/perl5/site_perl/5.8.3/x86_64-linux-thread-multi >>>>> /usr/lib64/perl5/site_perl/5.8.2/x86_64-linux-thread-multi >>>>> /usr/lib64/perl5/site_perl/5.8.1/x86_64-linux-thread-multi >>>>> /usr/lib64/perl5/site_perl/5.8.0/x86_64-linux-thread-multi >>>>> /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl/5.8.4 >>>>> /usr/lib/perl5/site_perl/5.8.3 /usr/lib/perl5/site_perl/5.8.2 >>>>> /usr/lib/perl5/site_perl/5.8.1 /usr/lib/perl5/site_perl/5.8.0 >>>>> /usr/lib/perl5/site_perl >>>>> /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi >>>>> /usr/lib64/perl5/vendor_perl/5.8.4/x86_64-linux-thread-multi >>>>> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi >>>>> /usr/lib64/perl5/vendor_perl/5.8.2/x86_64-linux-thread-multi >>>>> /usr/lib64/perl5/vendor_perl/5.8.1/x86_64-linux-thread-multi >>>>> /usr/lib64/perl5/vendor_perl/5.8.0/x86_64-linux-thread-multi >>>>> /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl/5.8.4 >>>>> 
/usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/perl5/vendor_perl/5.8.2 >>>>> /usr/lib/perl5/vendor_perl/5.8.1 /usr/lib/perl5/vendor_perl/5.8.0 >>>>> /usr/lib/perl5/vendor_perl .) at >>>>> /export/share/iNquiry/www/cgi-bin/bipod/nisa/snpfinder.cgi line >>>>> 14. >>>>> BEGIN failed--compilation aborted at >>>>> /export/share/iNquiry/www/cgi-bin/bipod/nisa/snpfinder.cgi line >>>>> 14. >>>>> >>>>> >>>>> >>>>> Here is my cgi perl program >>>>> >>>>> #!/usr/bin/perl -w >>>>> >>>>> use strict; >>>>> use CGI qw(:standard); >>>>> use CGI::Carp qw/fatalsToBrowser/; >>>>> use Bio::SeqIO; >>>>> use Bio::Align::AlignI; >>>>> use Bio::AlignIO; >>>>> use Bio::AlignIO::msf; >>>>> use Bio::SimpleAlign; >>>>> use Bio::PrimarySeq; >>>>> use Bio::Tools::Run::Alignment::Clustalw; >>>>> use Bio::PrimarySeqI; >>>>> use Bio::Root::IO; >>>>> use Bio::Seq; >>>>> use Bio::TreeIO; >>>>> use Bio::Root::Root Bio::Tools::Run::WrapperBase; >>>>> use Bio::LocatableSeq; >>>>> >>>>> >>>>> BEGIN { >>>>> $ENV{CLUSTALDIR} = '/opt/Bio/bin/clustalw'; >>>>> } >>>>> print"Content-type: text/html\n\n"; >>>>> >>>>> >>>>> if (param()){ #condition if user supplied some data >>>>> my $new_seq=param("sequence"); #to store sequence from text box >>>>> of form >>>>> my $selection=param("size");#to store drop down menu selection >>>>> for >>>>> translation type >>>>> my $file1=param("uploadfile");#variable for name of the file >>>>> my $address=param("email");#variable to hold e.mail address >>>>> >>>>> if ($file1=~/.+/) {#test condition, if user supplied a file >>>>> >>>>> my $R_FH = upload("uploadfile");#get file handle >>>>> >>>>> >>>>> >>>>> >>>>> my $in = Bio::AlignIO->new(-file => $file1 , >>>>> -format => 'fasta'); >>>>> my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" , >>>>> -format => 'pfam'); >>>>> >>>>> while ( my $aln1 = $in->next_aln() ) { >>>>> $out->write_aln($aln1); >>>>> } >>>>> >>>>> close $R_FH; >>>>> open FH, "out.aln.pfam" || die "Alignment file doesn't exist\n"; >>>>> while(){ >>>>> >>>>> 
print $_,"\n"; >>>>> } >>>>> close FH; >>>>> >>>>> #To send results through e.mail >>>>> $report=~s/
/\n/g;#change new line character >>>>> $report=~s/(<.+?>)//g;#remove html tags >>>>> $report=~s/( )/ /g;#change space character >>>>> open(MAIL, "|/usr/sbin/sendmail -t") or die"can't compose an >>>>> e.mail"; >>>>> print MAIL "To:$address\n"; >>>>> print MAIL "From: nisa\n"; >>>>> print MAIL "Subject: results of $file1\n"; >>>>> print MAIL "\n$report\n"; >>>>> close(MAIL);#close mail >>>>> >>>>> >>>>> }else{#error message if no file name is entered >>>>> print "

Error: You did not enter a file name, please upload a file
"; >>>>> >>>>> } >>>>> >>>>> } >>>>> >>>>> Please let me know what should I do, because clustalw is there >>>>> and my >> other >>>>> programs (non web-based) are working. >>>>> >>>>> Thanks >>>>> Nisa >>>>> >>>> >>>> -- >>>> Thomas Sharpton >>>> PhD Candidate - UC Berkeley >>>> Search smarter: www.siphs.com >>>> >>>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> -- >> MAURICIO HERRERA CUADRA >> arareko at campus.iztacala.unam.mx >> Laboratorio de Gen??tica >> Unidad de Morfofisiolog? a y Funci??n >> Facultad de Estudios Superiores Iztacala, UNAM >> > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From vdar at yorku.ca Wed May 21 18:12:07 2008 From: vdar at yorku.ca (vdar at yorku.ca) Date: Wed, 21 May 2008 18:12:07 -0400 Subject: [Bioperl-l] I can't access clustalw from my cgi perl program... In-Reply-To: <48348DD0.2040705@sendu.me.uk> References: <17367665.post@talk.nabble.com> <48346F22.5060806@berkeley.edu> <1211397502.4834757e732c3@mymail.yorku.ca> <48347FA6.4090301@campus.iztacala.unam.mx> <1211402045.4834873deb708@mymail.yorku.ca> <48348DD0.2040705@sendu.me.uk> Message-ID: <1211407927.48349e3770e94@mymail.yorku.ca> ok thanks, its not giving me any error now, but its not doing anything too, the following code works from commandline but not from my cgi script. I have added the path to bioperl and have tried everything else that I could find... 
my $in = Bio::AlignIO->new(-file => $inputfilename , -format => 'fasta'); my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" , -format => 'pfam'); while ( my $aln1 = $in->next_aln() ) { $out->write_aln($aln1); } output file is not produced. what should I do? thanks nisa Quoting Sendu Bala : > vdar at yorku.ca wrote: > > How can I find where bioperl is installed? > > You already know where it's installed. See below. > > > > Quoting Mauricio Herrera Cuadra : > > > >> Hi Nisa, > >> > >> CGI scripts are generally run by a different user than you, and which > >> user (e.g. apache, nobody) will depend on the platform you're running > >> the script on, thus the environment variables you currently have for > >> your login shell are not being inherited to the web interface. The best > >> workaround for this is to add a 'use lib' pragma at the top of your CGI > >> script: > >> > >> use lib '/path/to/your/bioperl/installation/'; > [...] > >> vdar at yorku.ca wrote: > >>> export PERL5LIB="/opt/rocks/lib/perl5/site_perl/5.8.8" > From arareko at campus.iztacala.unam.mx Wed May 21 18:27:24 2008 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Wed, 21 May 2008 17:27:24 -0500 Subject: [Bioperl-l] I can't access clustalw from my cgi perl program... In-Reply-To: <1211407927.48349e3770e94@mymail.yorku.ca> References: <17367665.post@talk.nabble.com> <48346F22.5060806@berkeley.edu> <1211397502.4834757e732c3@mymail.yorku.ca> <48347FA6.4090301@campus.iztacala.unam.mx> <1211402045.4834873deb708@mymail.yorku.ca> <48348DD0.2040705@sendu.me.uk> <1211407927.48349e3770e94@mymail.yorku.ca> Message-ID: <4834A1CC.5070709@campus.iztacala.unam.mx> You're using '>out.aln.pfam' as the full path for the output file. Most probably, the file is being produced but in the same location where the CGI script lives. Check inside the same directory where you installed your script. Mauricio. 
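To take the guesswork out of where the file ends up, here is a minimal sketch using core Perl only. It assumes a scratch directory that the web server user can write to (File::Spec->tmpdir here, purely as an illustration), and the writable_output_path helper is hypothetical:

```perl
use strict;
use warnings;
use File::Spec;
use File::Basename qw(dirname);

# Hypothetical helper: build an absolute path for $name inside $dir and
# die with a readable message if that directory is not writable.
sub writable_output_path {
    my ($dir, $name) = @_;
    my $path   = File::Spec->rel2abs($name, $dir);
    my $parent = dirname($path);
    die "Directory $parent is not writable by effective user id " . $> . "\n"
        unless -d $parent && -w $parent;
    return $path;
}

# Assumption: the system temp directory stands in for whatever directory
# the CGI user is really allowed to write to.
my $path = writable_output_path(File::Spec->tmpdir, 'out.aln.pfam');

# With BioPerl the same path would be passed as  -file => ">$path"
open(my $fh, '>', $path) or die "Cannot write $path: $!\n";
print {$fh} "placeholder content\n";
close($fh) or die "Cannot close $path: $!\n";
print "Wrote $path\n";
```

Note that `open FH, "out.aln.pfam" || die ...` can never die, because `||` binds to the file name string rather than to the open call. Using `or`, or parenthesising the arguments to open as above, lets the failure and its reason in `$!` actually show up.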
vdar at yorku.ca wrote: > ok thanks, its not giving me any error now, but its not doing anything too, the > following code works from commandline but not from my cgi script. I have added > the path to bioperl and have tried everything else that I could find... > > my $in = Bio::AlignIO->new(-file => $inputfilename , > -format => 'fasta'); > my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" , > -format => 'pfam'); > > > > while ( my $aln1 = $in->next_aln() ) { > $out->write_aln($aln1); > } > > > output file is not produced. what should I do? > > thanks > nisa > > > Quoting Sendu Bala : > >> vdar at yorku.ca wrote: >>> How can I find where bioperl is installed? >> You already know where it's installed. See below. >> >> >>> Quoting Mauricio Herrera Cuadra : >>> >>>> Hi Nisa, >>>> >>>> CGI scripts are generally run by a different user than you, and which >>>> user (e.g. apache, nobody) will depend on the platform you're running >>>> the script on, thus the environment variables you currently have for >>>> your login shell are not being inherited to the web interface. The best >>>> workaround for this is to add a 'use lib' pragma at the top of your CGI >>>> script: >>>> >>>> use lib '/path/to/your/bioperl/installation/'; >> [...] >>>> vdar at yorku.ca wrote: >>>>> export PERL5LIB="/opt/rocks/lib/perl5/site_perl/5.8.8" > > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Gen?tica Unidad de Morfofisiolog?a y Funci?n Facultad de Estudios Superiores Iztacala, UNAM From vdar at yorku.ca Wed May 21 18:40:10 2008 From: vdar at yorku.ca (vdar at yorku.ca) Date: Wed, 21 May 2008 18:40:10 -0400 Subject: [Bioperl-l] I can't access clustalw from my cgi perl program... 
In-Reply-To: <4834A1CC.5070709@campus.iztacala.unam.mx>
References: <17367665.post@talk.nabble.com> <48346F22.5060806@berkeley.edu> <1211397502.4834757e732c3@mymail.yorku.ca> <48347FA6.4090301@campus.iztacala.unam.mx> <1211402045.4834873deb708@mymail.yorku.ca> <48348DD0.2040705@sendu.me.uk> <1211407927.48349e3770e94@mymail.yorku.ca> <4834A1CC.5070709@campus.iztacala.unam.mx>
Message-ID: <1211409610.4834a4ca98853@mymail.yorku.ca>

Yes, I've looked in that directory, but it doesn't exist. Another weird thing which is happening is that if I make this output file manually in that directory, it is read by the following code and printed on screen

if ("out.aln.pfam"){
open FH, "out.aln.pfam" || die "Alignment file doesn't exist<br>";
while(<FH>){
print $_,"<br>";
}
close FH;
}

only when this code is not followed by the original code. When it's followed by the original code i.e.

my $in = Bio::AlignIO->new(-file => $file1 ,
-format => 'fasta');
my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" ,
-format => 'pfam');

while ( my $aln1 = $in->next_aln() ) {
$out->write_aln($aln1);
}

if ("out.aln.pfam"){
open FH, "out.aln.pfam" || die "Alignment file doesn't exist<br>";
while(<FH>){
print $_,"<br>
"; } close FH; } Nothing is printed on screen, while it stays there in the directory. I have changed the name of output file in this code but new file is not produced by this program. Does anyone know what is going on? thanks nisa Quoting Mauricio Herrera Cuadra : > You're using '>out.aln.pfam' as the full path for the output file. Most > probably, the file is being produced but in the same location where the > CGI script lives. Check inside the same directory where you installed > your script. > > Mauricio. > > vdar at yorku.ca wrote: > > ok thanks, its not giving me any error now, but its not doing anything too, > the > > following code works from commandline but not from my cgi script. I have > added > > the path to bioperl and have tried everything else that I could find... > > > > my $in = Bio::AlignIO->new(-file => $inputfilename , > > -format => 'fasta'); > > my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" , > > -format => 'pfam'); > > > > > > > > while ( my $aln1 = $in->next_aln() ) { > > $out->write_aln($aln1); > > } > > > > > > output file is not produced. what should I do? > > > > thanks > > nisa > > > > > > Quoting Sendu Bala : > > > >> vdar at yorku.ca wrote: > >>> How can I find where bioperl is installed? > >> You already know where it's installed. See below. > >> > >> > >>> Quoting Mauricio Herrera Cuadra : > >>> > >>>> Hi Nisa, > >>>> > >>>> CGI scripts are generally run by a different user than you, and which > >>>> user (e.g. apache, nobody) will depend on the platform you're running > >>>> the script on, thus the environment variables you currently have for > >>>> your login shell are not being inherited to the web interface. The best > >>>> workaround for this is to add a 'use lib' pragma at the top of your CGI > >>>> script: > >>>> > >>>> use lib '/path/to/your/bioperl/installation/'; > >> [...] 
> >>>> vdar at yorku.ca wrote:
> >>>>> export PERL5LIB="/opt/rocks/lib/perl5/site_perl/5.8.8"
> >
> >
>
> --
> MAURICIO HERRERA CUADRA
> arareko at campus.iztacala.unam.mx
> Laboratorio de Genética
> Unidad de Morfofisiología y Función
> Facultad de Estudios Superiores Iztacala, UNAM
>

From arareko at campus.iztacala.unam.mx Wed May 21 19:14:46 2008
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 21 May 2008 18:14:46 -0500
Subject: [Bioperl-l] I can't access clustalw from my cgi perl program...
In-Reply-To: <1211409610.4834a4ca98853@mymail.yorku.ca>
References: <17367665.post@talk.nabble.com> <48346F22.5060806@berkeley.edu> <1211397502.4834757e732c3@mymail.yorku.ca> <48347FA6.4090301@campus.iztacala.unam.mx> <1211402045.4834873deb708@mymail.yorku.ca> <48348DD0.2040705@sendu.me.uk> <1211407927.48349e3770e94@mymail.yorku.ca> <4834A1CC.5070709@campus.iztacala.unam.mx> <1211409610.4834a4ca98853@mymail.yorku.ca>
Message-ID: <4834ACE6.1090402@campus.iztacala.unam.mx>

A couple of things inlined:

vdar at yorku.ca wrote:
> Yes, I've looked in that directory, but it doesn't exist. Another weird thing
> which is happening is that if I make this output file manually in that
> directory, it is read by the following code and printed on screen
>
>
> if ("out.aln.pfam"){
> open FH, "out.aln.pfam" || die "Alignment file doesn't exist<br>";
> while(<FH>){
>
> print $_,"<br>";
> }
> close FH;
> }
>
> only when this code is not followed by the original code. When it's followed by
> the original code i.e.

Yeah, this works because you're placing the file there by hand, so it's found by the open() function, not the 'if ("out.aln.pfam")' statement (which, btw, always evaluates as TRUE). Something simpler like this will work as you expect and it's easier to understand:

open FH, "out.aln.pfam" or die "Alignment file doesn't exist<br>";
while (<FH>) {
    print $_, "<br>";
}
close FH;

> my $in = Bio::AlignIO->new(-file => $file1 ,
> -format => 'fasta');
> my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" ,
> -format => 'pfam');
>
>
>
> while ( my $aln1 = $in->next_aln() ) {
> $out->write_aln($aln1);
> }
>
>
>
> if ("out.aln.pfam"){
> open FH, "out.aln.pfam" || die "Alignment file doesn't exist<br>";
> while(<FH>){
>
> print $_,"<br>
"; > } > close FH; > } > > Nothing is printed on screen, while it stays there in the directory. I have > changed the name of output file in this code but new file is not produced by > this program. Does anyone know what is going on? Apparently, this is a problem with permissions. Maybe your script lives under some directory (i.e. your home directory) which is owned by a different user than the one who is actually running the CGI interface (e.g. apache, nobody) ?? Check your Apache logs from another shell screen to see what is really happening while you run your script: $ tail -f /path/to/your/apache/error.log In a CGI environment, all Perl messages/warnings are printed to the webserver's log, not the standard output (your shell). Mauricio. > thanks > nisa > > > Quoting Mauricio Herrera Cuadra : > >> You're using '>out.aln.pfam' as the full path for the output file. Most >> probably, the file is being produced but in the same location where the >> CGI script lives. Check inside the same directory where you installed >> your script. >> >> Mauricio. >> >> vdar at yorku.ca wrote: >>> ok thanks, its not giving me any error now, but its not doing anything too, >> the >>> following code works from commandline but not from my cgi script. I have >> added >>> the path to bioperl and have tried everything else that I could find... >>> >>> my $in = Bio::AlignIO->new(-file => $inputfilename , >>> -format => 'fasta'); >>> my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" , >>> -format => 'pfam'); >>> >>> >>> >>> while ( my $aln1 = $in->next_aln() ) { >>> $out->write_aln($aln1); >>> } >>> >>> >>> output file is not produced. what should I do? >>> >>> thanks >>> nisa >>> >>> >>> Quoting Sendu Bala : >>> >>>> vdar at yorku.ca wrote: >>>>> How can I find where bioperl is installed? >>>> You already know where it's installed. See below. 
>>>> >>>> >>>>> Quoting Mauricio Herrera Cuadra : >>>>> >>>>>> Hi Nisa, >>>>>> >>>>>> CGI scripts are generally run by a different user than you, and which >>>>>> user (e.g. apache, nobody) will depend on the platform you're running >>>>>> the script on, thus the environment variables you currently have for >>>>>> your login shell are not being inherited to the web interface. The best >>>>>> workaround for this is to add a 'use lib' pragma at the top of your CGI >>>>>> script: >>>>>> >>>>>> use lib '/path/to/your/bioperl/installation/'; >>>> [...] >>>>>> vdar at yorku.ca wrote: >>>>>>> export PERL5LIB="/opt/rocks/lib/perl5/site_perl/5.8.8" >>> >> -- >> MAURICIO HERRERA CUADRA >> arareko at campus.iztacala.unam.mx >> Laboratorio de Gen??tica >> Unidad de Morfofisiolog??a y Funci??n >> Facultad de Estudios Superiores Iztacala, UNAM >> > > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Gen?tica Unidad de Morfofisiolog?a y Funci?n Facultad de Estudios Superiores Iztacala, UNAM From jason at bioperl.org Thu May 22 00:09:28 2008 From: jason at bioperl.org (Jason Stajich) Date: Wed, 21 May 2008 22:09:28 -0600 Subject: [Bioperl-l] quick question about Yn00.pm In-Reply-To: <9e0c5a210805211525p684c6d15u93d65685aac2c57b@mail.gmail.com> References: <9e0c5a210805211525p684c6d15u93d65685aac2c57b@mail.gmail.com> Message-ID: <9E477B60-34DD-4FEE-8F39-A03127FFCF4E@bioperl.org> please email the list with questions as well. The program is called yn00 so I don't know what you get that message. Did you actually install PAML? what does 'which yn00' say? Did you specify the path to the executable with $yn->executable('/path/to/yn00'); -jason On May 21, 2008, at 4:25 PM, zhuangli liang wrote: > Dear Jason, > > I am really sorry to bother you at this email address, but I am > working on a project and really need to run the KaKs calculation > urgently. 
I am using your Yn00.pm module as in > Bio::Tools::Run::Phylo::PAML::Yn00, which I installed from bioperl- > run-1.4. > > When I run the example script given in the Synopsis of Yn00.pm, I > got the following error: > ------------- EXCEPTION ------------- > MSG: unable to find executable for 'yn' > STACK Bio::Tools::Run::Phylo::PAML::Yn00::run Yn00.pm:265 > STACK toplevel test.pl:13 > > Could you please give me some suggestions to make it work? Thanks > a lot!!! > I am using perl, v5.8.8 built for i386-linux-thread-multi. > > Zhuangli > From jason at bioperl.org Thu May 22 01:45:10 2008 From: jason at bioperl.org (Jason Stajich) Date: Wed, 21 May 2008 23:45:10 -0600 Subject: [Bioperl-l] SearchIO: write/read database In-Reply-To: <58C9944D6E1A894EA9BC47DEB89EB5A633876C@EXCH-VS2.coh.org> References: <58C9944D6E1A894EA9BC47DEB89EB5A633876C@EXCH-VS2.coh.org> Message-ID: <4BEF6E33-BB3F-4FD4-851E-2F1CE89898C6@bioperl.org> Do you already have a database schema structure designed for storing BLAST/similarity results? As it tries to say in the quote, this would be possible, but as it stands there is no "BLAST result database" that is part of BioPerl at this time. It would be a relatively straight-forward solution, but I don't know if it makes more sense to try and store things in a database schema like Chado so that this is not Yet-Another-Standalone- Database-Schema-in-Informatics. -jason On May 20, 2008, at 1:09 AM, Chu, Roy wrote: > Hi, > > I tried a mailing-list and module search, but I couldn't find what > I was looking for--but I know it's there. > > On the bioperl page (http://www.bioperl.org/wiki/ > HOWTO:SearchIO#Writing_and_formatting_output) it says: > "If your data is instead stored in a database you could build the > Bio::Search objects up in memory directly from your database and > then use the Writer object to output the data." > > I want to write/read BLAST results to a database w/o having to > write to files or parse any xml. 
Can someone direct me to any > available module(s) that will help me in my endeavor? > > Thanks in advance, > Roy > > > > --------------------------------------------------------------------- > > SECURITY/CONFIDENTIALITY WARNING: > This message and any attachments are intended solely for the > individual or entity to which they are addressed. This > communication may contain information that is privileged, > confidential, or exempt from disclosure under applicable law (e.g., > personal health information, research data, financial information). > Because this e-mail has been sent without encryption, individuals > other than the intended recipient may be able to view the > information, forward it to others or tamper with the information > without the knowledge or consent of the sender. If you are not the > intended recipient, or the employee or person responsible for > delivering the message to the intended recipient, any > dissemination, distribution or copying of the communication is > strictly prohibited. If you received the communication in error, > please notify the sender immediately by replying to this message > and deleting the message and any accompanying files from your > system. If, due to the security risks, you do not wi! > sh to receive further communications via e-mail, please reply to > this message and inform the sender that you do not wish to receive > further e-mail from the sender. > > --------------------------------------------------------------------- > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From yezhiqiang at gmail.com Thu May 22 04:07:22 2008 From: yezhiqiang at gmail.com (Zhi-Qiang Ye) Date: Thu, 22 May 2008 16:07:22 +0800 Subject: [Bioperl-l] FASTA format check Message-ID: <34198fe40805220107m450c5fb4g109b98c20a1c67b9@mail.gmail.com> Dear all, I was looking for a module or library for the check of FASTA format. 
I have a program taking FASTA format protein sequence as input. If the user provides invalid input, the program has to check it and output friendly error messeges. Is there any packages or modules with these functionalities? Thank you very much! Best Regards! -- Zhi-Qiang Ye Ph.D in Bioinformatics From ewijaya at gmail.com Thu May 22 04:22:55 2008 From: ewijaya at gmail.com (Edward Wijaya) Date: Thu, 22 May 2008 17:22:55 +0900 Subject: [Bioperl-l] FASTA format check In-Reply-To: <34198fe40805220107m450c5fb4g109b98c20a1c67b9@mail.gmail.com> References: <34198fe40805220107m450c5fb4g109b98c20a1c67b9@mail.gmail.com> Message-ID: <3521d3670805220122v79308fcfr70b7835aede16bd2@mail.gmail.com> Yes there is. Check this out: http://search.cpan.org/~birney/bioperl-1.4/Bio/Tools/GuessSeqFormat.pm - Edward On Thu, May 22, 2008 at 5:07 PM, Zhi-Qiang Ye wrote: > Dear all, > > I was looking for a module or library for the check of FASTA > format. I have a program taking FASTA format protein sequence as > input. > If the user provides invalid input, the program has to check it and > output friendly error messeges. Is there any packages or modules with > these functionalities? > > Thank you very much! > > Best Regards! > -- > Zhi-Qiang Ye > Ph.D in Bioinformatics > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From yezhiqiang at gmail.com Thu May 22 05:06:25 2008 From: yezhiqiang at gmail.com (Zhi-Qiang Ye) Date: Thu, 22 May 2008 17:06:25 +0800 Subject: [Bioperl-l] FASTA format check In-Reply-To: <3521d3670805220122v79308fcfr70b7835aede16bd2@mail.gmail.com> References: <34198fe40805220107m450c5fb4g109b98c20a1c67b9@mail.gmail.com> <3521d3670805220122v79308fcfr70b7835aede16bd2@mail.gmail.com> Message-ID: <34198fe40805220206n563adac8v52d796fcef2dc00@mail.gmail.com> Thanks, Edward and Fabian. 
The Guess module is for guessing formats of valid input, which is not my requirement. As for my testing,

>protein\nAAAAAAAAAAAAAAAAAAAAAAAAAAaaa ==> fasta
>proteinAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA ==> fasta
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAaaaa ==> undefined value
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA ==> raw

so, what I need is not this one. Thanks anyway.

Best wishes!
Zhi-Qiang Ye

2008/5/22 Edward Wijaya :
> Yes there is. Check this out:
>
> http://search.cpan.org/~birney/bioperl-1.4/Bio/Tools/GuessSeqFormat.pm
>
> - Edward
>
> On Thu, May 22, 2008 at 5:07 PM, Zhi-Qiang Ye wrote:
>> Dear all,
>>
>> I was looking for a module or library for the check of FASTA
>> format. I have a program taking FASTA format protein sequence as
>> input.
>> If the user provides invalid input, the program has to check it and
>> output friendly error messages. Is there any packages or modules with
>> these functionalities?
>>
>> Thank you very much!
>>
>> Best Regards!
>> --
>> Zhi-Qiang Ye
>> Ph.D in Bioinformatics
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>

From David.Messina at sbc.su.se Thu May 22 05:24:31 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 22 May 2008 11:24:31 +0200
Subject: [Bioperl-l] FASTA format check
In-Reply-To: <34198fe40805220107m450c5fb4g109b98c20a1c67b9@mail.gmail.com>
References: <34198fe40805220107m450c5fb4g109b98c20a1c67b9@mail.gmail.com>
Message-ID: <628aabb70805220224n59e07193s8a5519474ac78352@mail.gmail.com>

I don't think there's something like this in BioPerl (strangely -- I would have thought Bio::SeqIO would have some validate() method. Perhaps if you turn on debugging or verbosity flags?). Someone who knows The Truth will probably chime in here.

The Squid package that comes with HMMer has a program called seqstat.
If you specify a file format (such as fasta) and your file is not in that format, seqstat will complain (nicely). Dave From bix at sendu.me.uk Thu May 22 06:08:17 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 22 May 2008 11:08:17 +0100 Subject: [Bioperl-l] FASTA format check In-Reply-To: <628aabb70805220224n59e07193s8a5519474ac78352@mail.gmail.com> References: <34198fe40805220107m450c5fb4g109b98c20a1c67b9@mail.gmail.com> <628aabb70805220224n59e07193s8a5519474ac78352@mail.gmail.com> Message-ID: <48354611.2080700@sendu.me.uk> Dave Messina wrote: > I don't think there's something like this in BioPerl (strangely -- I would > have thought Bio::SeqIO would have some validate() method. Perhaps if you > turn on debugging or verbosity flags?). Someone who knows The Truth will > probably chime in here. BioPerl doesn't validate. It's on the project priority list though: http://www.bioperl.org/wiki/Project_priority_list#Parsing_code (number 7) From vdar at yorku.ca Thu May 22 11:47:20 2008 From: vdar at yorku.ca (nisa_dar) Date: Thu, 22 May 2008 08:47:20 -0700 (PDT) Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: <4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org> References: <17218229.post@talk.nabble.com> <4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org> Message-ID: <17407201.post@talk.nabble.com> ok now I have installed repeat masker, with its prerequisites as given on http://www.repeatmasker.org/ but now I am getting this error message. RepeatMasker program not found as or not executable. what should I do? Jason Stajich-3 wrote: > > Dare I ask, did you install repeat masker? 
> > http://www.repeatmasker.org/ > > On May 13, 2008, at 1:59 PM, nisa_dar wrote: > >> >> Following is the path to repeatmasker.pm on my system >> >> /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/RepeatMasker.pm >> >> but when I run my program, the error message comes >> >> RepeatMasker program not found as or not executable >> >> Here is my piece of code which gives this error, >> #!/usr/bin/perl >> >> use strict; >> use warnings; >> >> use Bio::Seq; >> use Bio::Tools::Run::StandAloneBlast; >> use Bio::Search::Hit::HitI; >> use Bio::Search::Hit::BlastHit; >> use Bio::Search::HSP::BlastHSP; >> use Bio::Search::HSP::HSPI; >> use Bio::SearchIO; >> use Bio::Tools::Run::RepeatMasker; >> >> BEGIN { >> >> $ENV{REPEATMASKERDIR} = '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/ >> Tools/'; >> >> } >> >> >> my @params = ("mam" => 1,"noint"=>1); >> my $factory = Bio::Tools::Run::RepeatMasker->new(@params); >> my $in = Bio::SeqIO->new(-file => "boechera.fasta", -format => >> 'fasta'); >> >> I tried finding RepeatMasker directory by typing >> >> which RepeatMasker >> >> but the error message was >> >> /usr/bin/which: no RepeatMasker in >> (/opt/openmpi/1.1.4/bin:/opt/lsfhpc/ego/1.2/linux2.6-glibc2.3- >> x86_64/etc:/opt/lsfhpc/ego/1.2/linux2.6-glibc2.3-x86_64/bin:/opt/ >> lsfhpc/7.0/linux2.6-glibc2.3-x86_64/etc:/opt/lsfhpc/7.0/linux2.6- >> glibc2.3-x86_64/bin:/usr/kerberos/bin:/usr/java/jdk1.5.0_07/bin:/ >> share/iNquiry/biotools/bin:/share/iNquiry/bin/lx24-x86:/share/ >> iNquiry/bin/lx24-amd64:/opt/Bio/bin:/usr/local/bin:/bin:/usr/bin:/ >> usr/X11R6/bin:/opt/modules/current/bin/:/opt/modules/bin/:/opt/Bio/ >> glimmer/scripts:/opt/Bio/gromacs/bin:/opt/eclipse:/opt/ganglia/bin:/ >> opt/maven/bin:/opt/rocks/bin:/opt/rocks/sbin:/home/vdar/bin) >> >> >> what should I do? >> >> Thanks >> >> >> >> >> -- >> View this message in context: http://www.nabble.com/RepeatMasker- >> not-found-tp17218229p17218229.html >> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. 
>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- View this message in context: http://www.nabble.com/RepeatMasker-not-found-tp17218229p17407201.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From bix at sendu.me.uk Thu May 22 12:12:59 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 22 May 2008 17:12:59 +0100 Subject: [Bioperl-l] RepeatMasker not found In-Reply-To: <17407201.post@talk.nabble.com> References: <17218229.post@talk.nabble.com> <4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org> <17407201.post@talk.nabble.com> Message-ID: <48359B8B.10507@sendu.me.uk> nisa_dar wrote: > ok now I have installed repeat masker, with its prerequisites as given on > http://www.repeatmasker.org/ > but now I am getting this error message. > > RepeatMasker program not found as or not executable. > > what should I do? Well now you have to correct your code to tell it where you installed RepeatMasker: >> On May 13, 2008, at 1:59 PM, nisa_dar wrote: [...] >>> BEGIN { >>> >>> $ENV{REPEATMASKERDIR} = '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; >>> >>> } From vdar at yorku.ca Thu May 22 12:59:19 2008 From: vdar at yorku.ca (nisa_dar) Date: Thu, 22 May 2008 09:59:19 -0700 (PDT) Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: <48359B8B.10507@sendu.me.uk> References: <17218229.post@talk.nabble.com> <4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org> <17407201.post@talk.nabble.com> <48359B8B.10507@sendu.me.uk> Message-ID: <17408831.post@talk.nabble.com> please see my full message and all the approaches that i have been doing to tell my pogram where repeat masker is...what else is correct if these are not? 
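One quick way to check this, as a sketch under assumptions: look for the RepeatMasker executable on PATH the way `which` does, and print the directory it lives in. The find_executable helper is hypothetical, and the idea that REPEATMASKERDIR should name the directory holding the executable (rather than the directory of the BioPerl wrapper module) is an assumption worth checking against the wrapper's own documentation:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: search the given directories for an executable
# file called $name, roughly what `which` does with PATH.
sub find_executable {
    my ($name, @dirs) = @_;
    for my $dir (@dirs) {
        my $candidate = File::Spec->catfile($dir, $name);
        return $candidate if -f $candidate && -x $candidate;
    }
    return;    # nothing found
}

my $exe = find_executable('RepeatMasker', File::Spec->path);
if (defined $exe) {
    my ($volume, $dir) = File::Spec->splitpath($exe);
    print "Found executable: $exe\n";
    print "Candidate value for REPEATMASKERDIR: ",
        File::Spec->catpath($volume, $dir, ''), "\n";
}
else {
    print "RepeatMasker is not on PATH; REPEATMASKERDIR would then need ",
        "the directory where the program was unpacked\n";
}
```

If the program is not found, the directory where the RepeatMasker distribution was unpacked can be passed to find_executable explicitly to confirm that the file there is really executable by the current user.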
Sendu Bala-2 wrote: > > nisa_dar wrote: >> ok now I have installed repeat masker, with its prerequisites as given on >> http://www.repeatmasker.org/ >> but now I am getting this error message. >> >> RepeatMasker program not found as or not executable. >> >> what should I do? > > Well now you have to correct your code to tell it where you installed > RepeatMasker: > > >>> On May 13, 2008, at 1:59 PM, nisa_dar wrote: > [...] >>>> BEGIN { >>>> >>>> $ENV{REPEATMASKERDIR} = >>>> '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; >>>> >>>> } > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- View this message in context: http://www.nabble.com/RepeatMasker-not-found-tp17218229p17408831.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From florent.angly at gmail.com Thu May 22 13:10:12 2008 From: florent.angly at gmail.com (Florent Angly) Date: Thu, 22 May 2008 10:10:12 -0700 Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: <17408831.post@talk.nabble.com> References: <17218229.post@talk.nabble.com> <4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org> <17407201.post@talk.nabble.com> <48359B8B.10507@sendu.me.uk> <17408831.post@talk.nabble.com> Message-ID: <4835A8F4.2090702@gmail.com> The path that you set up for $ENV{REPEATMASKERDIR} should be the path to the RepeatMasker program executable, not the path to the RepeatMasker BioPerl module. Florent nisa_dar wrote: > please see my full message and all the approaches that i have been doing to > tell my pogram where repeat masker is...what else is correct if these are > not? > > > > > Sendu Bala-2 wrote: > >> nisa_dar wrote: >> >>> ok now I have installed repeat masker, with its prerequisites as given on >>> http://www.repeatmasker.org/ >>> but now I am getting this error message. >>> >>> RepeatMasker program not found as or not executable. >>> >>> what should I do?
>>> >> Well now you have to correct your code to tell it where you installed >> RepeatMasker: >> >> >> >>>> On May 13, 2008, at 1:59 PM, nisa_dar wrote: >>>> >> [...] >> >>>>> BEGIN { >>>>> >>>>> $ENV{REPEATMASKERDIR} = >>>>> '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; >>>>> >>>>> } >>>>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> > > From arareko at campus.iztacala.unam.mx Thu May 22 13:39:23 2008 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Thu, 22 May 2008 12:39:23 -0500 Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: <17408831.post@talk.nabble.com> References: <17218229.post@talk.nabble.com> <4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org> <17407201.post@talk.nabble.com> <48359B8B.10507@sendu.me.uk> <17408831.post@talk.nabble.com> Message-ID: <4835AFCB.6030204@campus.iztacala.unam.mx> Sendu already pointed it out: $ENV{REPEATMASKERDIR} needs to be changed. The RepeatMasker docs should say where it gets installed on your platform; otherwise, use `which`. Mauricio. nisa_dar wrote: > please see my full message and all the approaches that i have been doing to > tell my pogram where repeat masker is...what else is correct if these are > not? > > > > > Sendu Bala-2 wrote: >> nisa_dar wrote: >>> ok now I have installed repeat masker, with its prerequisites as given on >>> http://www.repeatmasker.org/ >>> but now I am getting this error message. >>> >>> RepeatMasker program not found as or not executable. >>> >>> what should I do? >> Well now you have to correct your code to tell it where you installed >> RepeatMasker: >> >> >>>> On May 13, 2008, at 1:59 PM, nisa_dar wrote: >> [...]
>>>>> BEGIN { >>>>> >>>>> $ENV{REPEATMASKERDIR} = >>>>> '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; >>>>> >>>>> } >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From bix at sendu.me.uk Thu May 22 13:51:38 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 22 May 2008 18:51:38 +0100 Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: <17408831.post@talk.nabble.com> References: <17218229.post@talk.nabble.com> <4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org> <17407201.post@talk.nabble.com> <48359B8B.10507@sendu.me.uk> <17408831.post@talk.nabble.com> Message-ID: <4835B2AA.50505@sendu.me.uk> nisa_dar wrote: > please see my full message and all the approaches that i have been doing to > tell my pogram where repeat masker is...what else is correct if these are > not? We don't know where you installed RepeatMasker. Only you do. You need to supply that installation directory to $ENV{REPEATMASKERDIR} in your code. > Sendu Bala-2 wrote: >> nisa_dar wrote: >>> ok now I have installed repeat masker, with its prerequisites as given on >>> http://www.repeatmasker.org/ >>> but now I am getting this error message. >>> >>> RepeatMasker program not found as or not executable. >>> >>> what should I do? >> Well now you have to correct your code to tell it where you installed >> RepeatMasker: >> >> >>>> On May 13, 2008, at 1:59 PM, nisa_dar wrote: >> [...]
>>>>> BEGIN { >>>>> >>>>> $ENV{REPEATMASKERDIR} = >>>>> '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; >>>>> >>>>> } >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > From vdar at yorku.ca Thu May 22 13:53:51 2008 From: vdar at yorku.ca (nisa_dar) Date: Thu, 22 May 2008 10:53:51 -0700 (PDT) Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: <4835A8F4.2090702@gmail.com> References: <17218229.post@talk.nabble.com> <4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org> <17407201.post@talk.nabble.com> <48359B8B.10507@sendu.me.uk> <17408831.post@talk.nabble.com> <4835A8F4.2090702@gmail.com> Message-ID: <17409898.post@talk.nabble.com> I am trying to find the RepeatMasker program executable by typing `which RepeatMasker`, but it is nowhere on my system, although I installed it, and in my previous message I mentioned the paths of its .pm file and its directory. I don't know how to find the path my program needs, because I have tried all the paths I mentioned in my previous message. Any suggestions? Florent Angly wrote: > > The path that you set up for $ENV{REPEATMASKERDIR} should be the path to > the RepeatMasker program executable, not the path to the ReapeatMasker > BioPerl module. > Florent > > nisa_dar wrote: >> please see my full message and all the approaches that i have been doing >> to >> tell my pogram where repeat masker is...what else is correct if these are >> not? >> >> >> >> >> Sendu Bala-2 wrote: >> >>> nisa_dar wrote: >>> >>>> ok now I have installed repeat masker, with its prerequisites as given >>>> on >>>> http://www.repeatmasker.org/ >>>> but now I am getting this error message. >>>> >>>> RepeatMasker program not found as or not executable. >>>> >>>> what should I do?
>>>> >>> Well now you have to correct your code to tell it where you installed >>> RepeatMasker: >>> >>> >>> >>>>> On May 13, 2008, at 1:59 PM, nisa_dar wrote: >>>>> >>> [...] >>> >>>>>> BEGIN { >>>>>> >>>>>> $ENV{REPEATMASKERDIR} = >>>>>> '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; >>>>>> >>>>>> } >>>>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >>> >> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- View this message in context: http://www.nabble.com/RepeatMasker-not-found-tp17218229p17409898.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From vdar at yorku.ca Thu May 22 14:12:49 2008 From: vdar at yorku.ca (nisa_dar) Date: Thu, 22 May 2008 11:12:49 -0700 (PDT) Subject: [Bioperl-l] I can't access clustalw from my cgi perl program...
In-Reply-To: <4834ACE6.1090402@campus.iztacala.unam.mx> References: <17367665.post@talk.nabble.com> <48346F22.5060806@berkeley.edu> <1211397502.4834757e732c3@mymail.yorku.ca> <48347FA6.4090301@campus.iztacala.unam.mx> <1211402045.4834873deb708@mymail.yorku.ca> <48348DD0.2040705@sendu.me.uk> <1211407927.48349e3770e94@mymail.yorku.ca> <4834A1CC.5070709@campus.iztacala.unam.mx> <1211409610.4834a4ca98853@mymail.yorku.ca> <4834ACE6.1090402@campus.iztacala.unam.mx> Message-ID: <17410291.post@talk.nabble.com> I have seen in my error log file that output file is not being written bcs of no permissions [Thu May 22 14:07:35 2008] [error] [client xxxxxxxxxx] MSG: Could not open >/home/vdar/in.aln.pfam: Permission denied, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html Does any one know how can i give permissions to others so that my program can write this file in my home directory? All I knew was chmod 777 ... but in this case this doesn't work... thanks Mauricio Herrera Cuadra-3 wrote: > > A couple of things inlined: > > vdar at yorku.ca wrote: >> Yes, I've seen in that directory, but it doesn't exist. Another wierd >> thing >> which is being happening is that If I make this output file manually in >> that >> directory, it is read by the following code and printed on screen >> >> >> if ("out.aln.pfam"){ >> open FH, "out.aln.pfam" || die "Alignment file doesn't exist
"; >> while(<FH>){ >> >> print $_,"
"; >> } >> close FH; >> } >> >> only when this code is not followed by the original code. When its >> followed by >> the original code i.e. > > Yeah, this works because you're placing the file there by hand, so it's > found by the open() function, not the 'if ("out.aln.pfam")' statement > (which, btw, always evaluates as TRUE). Something simpler like this will > work as you expect and it's easier to understand: > > open FH, "out.aln.pfam" or die "Alignment file doesn't exist
"; > while (<FH>) { > print $_, "
"; > } > close FH; > >> my $in = Bio::AlignIO->new(-file => $file1 , >> -format => 'fasta'); >> my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" , >> -format => 'pfam'); >> >> >> >> while ( my $aln1 = $in->next_aln() ) { >> $out->write_aln($aln1); >> } >> >> >> >> if ("out.aln.pfam"){ >> open FH, "out.aln.pfam" || die "Alignment file doesn't exist
"; >> while(<FH>){ >> >> print $_,"
"; >> } >> close FH; >> } >> >> Nothing is printed on screen, while it stays there in the directory. I >> have >> changed the name of output file in this code but new file is not produced >> by >> this program. Does anyone know what is going on? > > Apparently, this is a problem with permissions. Maybe your script lives > under some directory (i.e. your home directory) which is owned by a > different user than the one who is actually running the CGI interface > (e.g. apache, nobody) ?? Check your Apache logs from another shell > screen to see what is really happening while you run your script: > > $ tail -f /path/to/your/apache/error.log > > In a CGI environment, all Perl messages/warnings are printed to the > webserver's log, not the standard output (your shell). > > Mauricio. > >> thanks >> nisa >> >> >> Quoting Mauricio Herrera Cuadra : >> >>> You're using '>out.aln.pfam' as the full path for the output file. Most >>> probably, the file is being produced but in the same location where the >>> CGI script lives. Check inside the same directory where you installed >>> your script. >>> >>> Mauricio. >>> >>> vdar at yorku.ca wrote: >>>> ok thanks, its not giving me any error now, but its not doing anything >>>> too, >>> the >>>> following code works from commandline but not from my cgi script. I >>>> have >>> added >>>> the path to bioperl and have tried everything else that I could find... >>>> >>>> my $in = Bio::AlignIO->new(-file => $inputfilename , >>>> -format => 'fasta'); >>>> my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" , >>>> -format => 'pfam'); >>>> >>>> >>>> >>>> while ( my $aln1 = $in->next_aln() ) { >>>> $out->write_aln($aln1); >>>> } >>>> >>>> >>>> output file is not produced. what should I do? >>>> >>>> thanks >>>> nisa >>>> >>>> >>>> Quoting Sendu Bala : >>>> >>>>> vdar at yorku.ca wrote: >>>>>> How can I find where bioperl is installed? >>>>> You already know where it's installed. See below. 
>>>>> >>>>> >>>>>> Quoting Mauricio Herrera Cuadra : >>>>>> >>>>>>> Hi Nisa, >>>>>>> >>>>>>> CGI scripts are generally run by a different user than you, and >>>>>>> which >>>>>>> user (e.g. apache, nobody) will depend on the platform you're >>>>>>> running >>>>>>> the script on, thus the environment variables you currently have for >>>>>>> your login shell are not being inherited to the web interface. The >>>>>>> best >>>>>>> workaround for this is to add a 'use lib' pragma at the top of your >>>>>>> CGI >>>>>>> script: >>>>>>> >>>>>>> use lib '/path/to/your/bioperl/installation/'; >>>>> [...] >>>>>>> vdar at yorku.ca wrote: >>>>>>>> export PERL5LIB="/opt/rocks/lib/perl5/site_perl/5.8.8" >>>> >>> -- >>> MAURICIO HERRERA CUADRA >>> arareko at campus.iztacala.unam.mx >>> Laboratorio de Genética >>> Unidad de Morfofisiología y Función >>> Facultad de Estudios Superiores Iztacala, UNAM >>> >> >> > > -- > MAURICIO HERRERA CUADRA > arareko at campus.iztacala.unam.mx > Laboratorio de Genética > Unidad de Morfofisiología y Función > Facultad de Estudios Superiores Iztacala, UNAM > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- View this message in context: http://www.nabble.com/I-can%27t-access-clustalw-from-my-cgi-perl-program...-tp17367665p17410291.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From vdar at yorku.ca Thu May 22 14:23:45 2008 From: vdar at yorku.ca (nisa_dar) Date: Thu, 22 May 2008 11:23:45 -0700 (PDT) Subject: [Bioperl-l] I can't access clustalw from my cgi perl program...
In-Reply-To: <628aabb70805211244r63cce0detc585b629d6677b81@mail.gmail.com> References: <17367665.post@talk.nabble.com> <628aabb70805211244r63cce0detc585b629d6677b81@mail.gmail.com> Message-ID: <17410519.post@talk.nabble.com> this is from my error log file, does this make any sense to anyone in order to help me out [Thu May 22 14:20:01 2008] [error] [client ] STACK: Error::throw, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::Root::Root::throw /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:328, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::Root::IO::_initialize_io /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Root/IO.pm:313, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::AlignIO::_initialize /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/AlignIO.pm:379, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::AlignIO::new /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/AlignIO.pm:305, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::AlignIO::new /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/AlignIO.pm:326, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: /export/share/iNquiry/www/cgi-bin/bipod/nisa/snpfinder.cgi:64, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] -----------------------------------------------------------, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:02 2008] [error] [client ] File does not exist: /var/www/html/favicon.ico [Thu May 22 14:20:04 2008] [error] [client ] File does not exist: /var/www/html/favicon.ico Thanks Dave Messina-3 wrote: > > Since your script runs correctly from 
the command line, this doesn't look > like it's a BioPerl problem. > > The error message you got is: > > Can't locate Bio/Tools/Run/Alignment/Clustalw.pm > > > followed by a long list of directories where it looked for that module. So > the first thing to check is > > Is Bio/Tools/Run/Alignment/Clustalw.pm in one of those @INC directories? > > The fact that other Bioperl modules are 'use'd in your script first and > didn't produce an error suggests that you might have the BIoperl core > installation in those directories, but not Bio::Tools::Run. > > If Bio/Tools/Run/Alignment/Clustalw.pm is in fact in the @INC directories > listed, then it's probably a CGI/web issue. Do you know as what user on > your > machine web scripts are run? That user probably has limited permissions > compared to your regular user account. > > > Dave > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- View this message in context: http://www.nabble.com/I-can%27t-access-clustalw-from-my-cgi-perl-program...-tp17367665p17410519.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From vdar at yorku.ca Thu May 22 14:25:22 2008 From: vdar at yorku.ca (nisa_dar) Date: Thu, 22 May 2008 11:25:22 -0700 (PDT) Subject: [Bioperl-l] I can't access clustalw from my cgi perl program... 
In-Reply-To: <4834ACE6.1090402@campus.iztacala.unam.mx> References: <17367665.post@talk.nabble.com> <48346F22.5060806@berkeley.edu> <1211397502.4834757e732c3@mymail.yorku.ca> <48347FA6.4090301@campus.iztacala.unam.mx> <1211402045.4834873deb708@mymail.yorku.ca> <48348DD0.2040705@sendu.me.uk> <1211407927.48349e3770e94@mymail.yorku.ca> <4834A1CC.5070709@campus.iztacala.unam.mx> <1211409610.4834a4ca98853@mymail.yorku.ca> <4834ACE6.1090402@campus.iztacala.unam.mx> Message-ID: <17410577.post@talk.nabble.com> this is from my error log file, does this make any sense to anyone in order to help me out [Thu May 22 14:20:01 2008] [error] [client ] STACK: Error::throw, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::Root::Root::throw /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:328, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::Root::IO::_initialize_io /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Root/IO.pm:313, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::AlignIO::_initialize /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/AlignIO.pm:379, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::AlignIO::new /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/AlignIO.pm:305, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::AlignIO::new /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/AlignIO.pm:326, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: /export/share/iNquiry/www/cgi-bin/bipod/nisa/snpfinder.cgi:64, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] -----------------------------------------------------------, referer: 
https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:02 2008] [error] [client ] File does not exist: /var/www/html/favicon.ico [Thu May 22 14:20:04 2008] [error] [client ] File does not exist: /var/www/html/favicon.ico Thanks Mauricio Herrera Cuadra-3 wrote: > > A couple of things inlined: > > vdar at yorku.ca wrote: >> Yes, I've seen in that directory, but it doesn't exist. Another wierd >> thing >> which is being happening is that If I make this output file manually in >> that >> directory, it is read by the following code and printed on screen >> >> >> if ("out.aln.pfam"){ >> open FH, "out.aln.pfam" || die "Alignment file doesn't exist
"; >> while(<FH>){ >> >> print $_,"
"; >> } >> close FH; >> } >> >> only when this code is not followed by the original code. When its >> followed by >> the original code i.e. > > Yeah, this works because you're placing the file there by hand, so it's > found by the open() function, not the 'if ("out.aln.pfam")' statement > (which, btw, always evaluates as TRUE). Something simpler like this will > work as you expect and it's easier to understand: > > open FH, "out.aln.pfam" or die "Alignment file doesn't exist
"; > while (<FH>) { > print $_, "
"; > } > close FH; > >> my $in = Bio::AlignIO->new(-file => $file1 , >> -format => 'fasta'); >> my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" , >> -format => 'pfam'); >> >> >> >> while ( my $aln1 = $in->next_aln() ) { >> $out->write_aln($aln1); >> } >> >> >> >> if ("out.aln.pfam"){ >> open FH, "out.aln.pfam" || die "Alignment file doesn't exist
"; >> while(<FH>){ >> >> print $_,"
"; >> } >> close FH; >> } >> >> Nothing is printed on screen, while it stays there in the directory. I >> have >> changed the name of output file in this code but new file is not produced >> by >> this program. Does anyone know what is going on? > > Apparently, this is a problem with permissions. Maybe your script lives > under some directory (i.e. your home directory) which is owned by a > different user than the one who is actually running the CGI interface > (e.g. apache, nobody) ?? Check your Apache logs from another shell > screen to see what is really happening while you run your script: > > $ tail -f /path/to/your/apache/error.log > > In a CGI environment, all Perl messages/warnings are printed to the > webserver's log, not the standard output (your shell). > > Mauricio. > >> thanks >> nisa >> >> >> Quoting Mauricio Herrera Cuadra : >> >>> You're using '>out.aln.pfam' as the full path for the output file. Most >>> probably, the file is being produced but in the same location where the >>> CGI script lives. Check inside the same directory where you installed >>> your script. >>> >>> Mauricio. >>> >>> vdar at yorku.ca wrote: >>>> ok thanks, its not giving me any error now, but its not doing anything >>>> too, >>> the >>>> following code works from commandline but not from my cgi script. I >>>> have >>> added >>>> the path to bioperl and have tried everything else that I could find... >>>> >>>> my $in = Bio::AlignIO->new(-file => $inputfilename , >>>> -format => 'fasta'); >>>> my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" , >>>> -format => 'pfam'); >>>> >>>> >>>> >>>> while ( my $aln1 = $in->next_aln() ) { >>>> $out->write_aln($aln1); >>>> } >>>> >>>> >>>> output file is not produced. what should I do? >>>> >>>> thanks >>>> nisa >>>> >>>> >>>> Quoting Sendu Bala : >>>> >>>>> vdar at yorku.ca wrote: >>>>>> How can I find where bioperl is installed? >>>>> You already know where it's installed. See below. 
>>>>> >>>>> >>>>>> Quoting Mauricio Herrera Cuadra : >>>>>> >>>>>>> Hi Nisa, >>>>>>> >>>>>>> CGI scripts are generally run by a different user than you, and >>>>>>> which >>>>>>> user (e.g. apache, nobody) will depend on the platform you're >>>>>>> running >>>>>>> the script on, thus the environment variables you currently have for >>>>>>> your login shell are not being inherited to the web interface. The >>>>>>> best >>>>>>> workaround for this is to add a 'use lib' pragma at the top of your >>>>>>> CGI >>>>>>> script: >>>>>>> >>>>>>> use lib '/path/to/your/bioperl/installation/'; >>>>> [...] >>>>>>> vdar at yorku.ca wrote: >>>>>>>> export PERL5LIB="/opt/rocks/lib/perl5/site_perl/5.8.8" >>>> >>> -- >>> MAURICIO HERRERA CUADRA >>> arareko at campus.iztacala.unam.mx >>> Laboratorio de Genética >>> Unidad de Morfofisiología y Función >>> Facultad de Estudios Superiores Iztacala, UNAM >>> >> >> > > -- > MAURICIO HERRERA CUADRA > arareko at campus.iztacala.unam.mx > Laboratorio de Genética > Unidad de Morfofisiología y Función > Facultad de Estudios Superiores Iztacala, UNAM > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- View this message in context: http://www.nabble.com/I-can%27t-access-clustalw-from-my-cgi-perl-program...-tp17367665p17410577.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From vdar at yorku.ca Thu May 22 14:33:44 2008 From: vdar at yorku.ca (nisa_dar) Date: Thu, 22 May 2008 11:33:44 -0700 (PDT) Subject: [Bioperl-l] I can't access clustalw from my cgi perl program... Message-ID: <17410577.post@talk.nabble.com> this is from my error log file, does this make any sense to anyone in order to help me out also how can i give write permissions so that /home/vdar/in.aln.pfam is actually written by this program?
[Thu May 22 14:31:20 2008] [error] [client ] MSG: Could not open >/home/vdar/in.aln.pfam: Permission denied, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Error::throw, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::Root::Root::throw /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:328, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::Root::IO::_initialize_io /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Root/IO.pm:313, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::AlignIO::_initialize /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/AlignIO.pm:379, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::AlignIO::new /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/AlignIO.pm:305, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: Bio::AlignIO::new /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/AlignIO.pm:326, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] STACK: /export/share/iNquiry/www/cgi-bin/bipod/nisa/snpfinder.cgi:64, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:01 2008] [error] [client ] -----------------------------------------------------------, referer: https://capsella.ccs.yorku.ca/nisa/snp_finder.html [Thu May 22 14:20:02 2008] [error] [client ] File does not exist: /var/www/html/favicon.ico [Thu May 22 14:20:04 2008] [error] [client ] File does not exist: /var/www/html/favicon.ico Thanks Mauricio Herrera Cuadra-3 wrote: > > A couple of things inlined: > > vdar at yorku.ca wrote: >> Yes, I've seen in that directory, but it doesn't exist. 
Another wierd >> thing >> which is being happening is that If I make this output file manually in >> that >> directory, it is read by the following code and printed on screen >> >> >> if ("out.aln.pfam"){ >> open FH, "out.aln.pfam" || die "Alignment file doesn't exist
"; >> while(<FH>){ >> >> print $_,"
"; >> } >> close FH; >> } >> >> only when this code is not followed by the original code. When its >> followed by >> the original code i.e. > > Yeah, this works because you're placing the file there by hand, so it's > found by the open() function, not the 'if ("out.aln.pfam")' statement > (which, btw, always evaluates as TRUE). Something simpler like this will > work as you expect and it's easier to understand: > > open FH, "out.aln.pfam" or die "Alignment file doesn't exist
"; > while (<FH>) { > print $_, "
"; > } > close FH; > >> my $in = Bio::AlignIO->new(-file => $file1 , >> -format => 'fasta'); >> my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" , >> -format => 'pfam'); >> >> >> >> while ( my $aln1 = $in->next_aln() ) { >> $out->write_aln($aln1); >> } >> >> >> >> if ("out.aln.pfam"){ >> open FH, "out.aln.pfam" || die "Alignment file doesn't exist
"; >> while(<FH>){ >> >> print $_,"
"; >> } >> close FH; >> } >> >> Nothing is printed on screen, while it stays there in the directory. I >> have >> changed the name of output file in this code but new file is not produced >> by >> this program. Does anyone know what is going on? > > Apparently, this is a problem with permissions. Maybe your script lives > under some directory (i.e. your home directory) which is owned by a > different user than the one who is actually running the CGI interface > (e.g. apache, nobody) ?? Check your Apache logs from another shell > screen to see what is really happening while you run your script: > > $ tail -f /path/to/your/apache/error.log > > In a CGI environment, all Perl messages/warnings are printed to the > webserver's log, not the standard output (your shell). > > Mauricio. > >> thanks >> nisa >> >> >> Quoting Mauricio Herrera Cuadra : >> >>> You're using '>out.aln.pfam' as the full path for the output file. Most >>> probably, the file is being produced but in the same location where the >>> CGI script lives. Check inside the same directory where you installed >>> your script. >>> >>> Mauricio. >>> >>> vdar at yorku.ca wrote: >>>> ok thanks, its not giving me any error now, but its not doing anything >>>> too, >>> the >>>> following code works from commandline but not from my cgi script. I >>>> have >>> added >>>> the path to bioperl and have tried everything else that I could find... >>>> >>>> my $in = Bio::AlignIO->new(-file => $inputfilename , >>>> -format => 'fasta'); >>>> my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" , >>>> -format => 'pfam'); >>>> >>>> >>>> >>>> while ( my $aln1 = $in->next_aln() ) { >>>> $out->write_aln($aln1); >>>> } >>>> >>>> >>>> output file is not produced. what should I do? >>>> >>>> thanks >>>> nisa >>>> >>>> >>>> Quoting Sendu Bala : >>>> >>>>> vdar at yorku.ca wrote: >>>>>> How can I find where bioperl is installed? >>>>> You already know where it's installed. See below. 
>>>>> >>>>> >>>>>> Quoting Mauricio Herrera Cuadra : >>>>>> >>>>>>> Hi Nisa, >>>>>>> >>>>>>> CGI scripts are generally run by a different user than you, and >>>>>>> which >>>>>>> user (e.g. apache, nobody) will depend on the platform you're >>>>>>> running >>>>>>> the script on, thus the environment variables you currently have for >>>>>>> your login shell are not being inherited to the web interface. The >>>>>>> best >>>>>>> workaround for this is to add a 'use lib' pragma at the top of your >>>>>>> CGI >>>>>>> script: >>>>>>> >>>>>>> use lib '/path/to/your/bioperl/installation/'; >>>>> [...] >>>>>>> vdar at yorku.ca wrote: >>>>>>>> export PERL5LIB="/opt/rocks/lib/perl5/site_perl/5.8.8" >>>> >>> -- >>> MAURICIO HERRERA CUADRA >>> arareko at campus.iztacala.unam.mx >>> Laboratorio de Genética >>> Unidad de Morfofisiología y Función >>> Facultad de Estudios Superiores Iztacala, UNAM >>> >> >> > > -- > MAURICIO HERRERA CUADRA > arareko at campus.iztacala.unam.mx > Laboratorio de Genética > Unidad de Morfofisiología y Función > Facultad de Estudios Superiores Iztacala, UNAM > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- View this message in context: http://www.nabble.com/I-can%27t-access-clustalw-from-my-cgi-perl-program...-tp17367665p17410577.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
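The permissions explanation suggested in this thread can be checked from a shell before reading the Apache error log. What follows is only a minimal sketch, assuming a POSIX shell: the function name can_write_dir is hypothetical (it is not part of BioPerl or Apache), and /tmp is a stand-in for the directory the CGI script actually runs in.

```shell
#!/bin/sh
# can_write_dir DIR
# Hypothetical helper: report whether the user running this shell
# could create a file such as out.aln.pfam inside DIR.
can_write_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 2
    fi
    # Creating a file needs both write and search (execute) permission
    # on the directory itself.
    if [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "writable: $dir"
        return 0
    fi
    echo "not writable: $dir"
    return 1
}

# /tmp is a stand-in; substitute the directory the CGI script lives in.
can_write_dir /tmp
```

The result only describes the account that runs the check. To learn what the web server account can do, the same script would have to be run as that user, for example with `sudo -u apache sh check.sh`, assuming sudo is available, the account is really named apache, and the sketch was saved as check.sh.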
From sharpton at berkeley.edu Thu May 22 14:39:04 2008 From: sharpton at berkeley.edu (Thomas Sharpton) Date: Thu, 22 May 2008 11:39:04 -0700 Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: <17409898.post@talk.nabble.com> References: <17218229.post@talk.nabble.com> <4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org> <17407201.post@talk.nabble.com> <48359B8B.10507@sendu.me.uk> <17408831.post@talk.nabble.com> <4835A8F4.2090702@gmail.com> <17409898.post@talk.nabble.com> Message-ID: <4835BDC8.4040509@berkeley.edu> 'which repeatmasker' can only find executables that are part of your system path (which you can set in your .bashrc file). Did you follow these instructions when installing? http://www.repeatmasker.org/RMDownload.html If so, I'd venture a guess that the RepeatMasker executable is sitting in /usr/local/bin or /usr/local/RepeatMasker. Good luck, -T nisa_dar wrote: > I am trying to find RepeatMasker program executable by typing > which RepeatMasker > but its no where on my system, although I installed it and in my revious > message I have mentioned the paths of its .pm file and its directory. > I don't know how to find the path of what my program needs, bcs I have tried > all these paths which i mentioned in my previous message. > any suggestions? > > > > Florent Angly wrote: > >> The path that you set up for $ENV{REPEATMASKERDIR} should be the path to >> the RepeatMasker program executable, not the path to the ReapeatMasker >> BioPerl module. >> Florent >> >> nisa_dar wrote: >> >>> please see my full message and all the approaches that i have been doing >>> to >>> tell my pogram where repeat masker is...what else is correct if these are >>> not? >>> >>> >>> >>> >>> Sendu Bala-2 wrote: >>> >>> >>>> nisa_dar wrote: >>>> >>>> >>>>> ok now I have installed repeat masker, with its prerequisites as given >>>>> on >>>>> http://www.repeatmasker.org/ >>>>> but now I am getting this error message. >>>>> >>>>> RepeatMasker program not found as or not executable. 
>>>>> >>>>> what should I do? >>>>> >>>>> >>>> Well now you have to correct your code to tell it where you installed >>>> RepeatMasker: >>>> >>>> >>>> >>>> >>>>>> On May 13, 2008, at 1:59 PM, nisa_dar wrote: >>>>>> >>>>>> >>>> [...] >>>> >>>> >>>>>>> BEGIN { >>>>>>> >>>>>>> $ENV{REPEATMASKERDIR} = >>>>>>> '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; >>>>>>> >>>>>>> } >>>>>>> >>>>>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> >>>> >>>> >>> >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> > > -- Thomas Sharpton PhD Candidate - UC Berkeley Search smarter: www.siphs.com From vdar at yorku.ca Thu May 22 14:55:48 2008 From: vdar at yorku.ca (vdar at yorku.ca) Date: Thu, 22 May 2008 14:55:48 -0400 Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: <4835BDC8.4040509@berkeley.edu> References: <17218229.post@talk.nabble.com> <4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org> <17407201.post@talk.nabble.com> <48359B8B.10507@sendu.me.uk> <17408831.post@talk.nabble.com> <4835A8F4.2090702@gmail.com> <17409898.post@talk.nabble.com> <4835BDC8.4040509@berkeley.edu> Message-ID: <1211482548.4835c1b4e70e9@mymail.yorku.ca> I can see that RepeatMasker program is there in this directory /home/vdar/RepeatMasker Here is a whole list of whats inside this directory [vdar at capsella RepeatMasker]$ ls -al total 35048 drwxr-xr-x 5 vdar users 4096 May 22 11:41 . drwx------ 10 vdar users 4096 May 22 14:15 .. 
-rwxr-xr-x 1 vdar users 11386 May 20 14:26 ArrayListIterator.pm -rwxr-xr-x 1 vdar users 13888 May 20 14:26 ArrayList.pm -rw-r--r-- 1 vdar users 1683 May 20 14:26 bluegrad.jpg -rwxr-xr-x 1 vdar users 16439 May 20 14:26 configure -rwxr-xr-x 1 vdar users 26098 May 20 14:26 CrossmatchSearchEngine.pm -rwxr-xr-x 1 vdar users 26342 May 20 14:26 DateRepeats -rw-r--r-- 1 vdar users 5935 May 20 14:26 daterepeats.help -rwxr-xr-x 1 vdar users 33871 May 20 14:26 DeCypherSearchEngine.pm -rwxr-xr-x 1 vdar users 33265 May 20 14:26 DupMasker -rwxr-xr-x 1 vdar users 54550 May 20 14:26 FastaDB.pm -rw-r--r-- 1 vdar users 6966 May 20 14:26 HTMLAnnotHeader.html -rwxr--r-- 1 vdar users 2543 May 20 14:26 INSTALL drwxr-xr-x 2 vdar users 4096 May 20 14:26 Libraries -rw-r--r-- 1 vdar users 9862 May 20 14:26 license.txt -rwxr-xr-x 1 vdar users 425419 May 20 14:26 LineHash.pm drwxr-xr-x 4 vdar users 4096 May 20 14:26 Matrices -rwxr-xr-x 1 vdar users 322391 May 20 14:26 ProcessRepeats -rwxr-xr-x 1 vdar users 9381 May 20 14:26 PubRef.pm -rwxr--r-- 1 vdar users 1836 May 20 14:26 README -rwxr-xr-x 1 vdar users 22385 May 20 14:26 RepbaseEMBL.pm -rwxr-xr-x 1 vdar users 49282 May 20 14:26 RepbaseRecord.pm -rwxr-xr-x 1 vdar users 3929708 May 20 14:26 RepeatAnnotationData.pm -rwxr-xr-x 1 vdar users 213727 May 20 14:26 RepeatMasker -rw-r--r-- 1 vdar users 5721 May 20 14:26 RepeatMaskerConfig.pm -rwxr-xr-x 1 vdar users 5689 May 20 14:26 RepeatMaskerConfig.tmpl -rw-r--r-- 1 vdar users 85557 May 20 14:26 repeatmasker.help -rwxr-xr-x 1 vdar users 19538 May 20 14:26 RepeatProteinMask -rwxr-xr-x 1 vdar users 17666 May 20 14:26 SearchEngineI.pm -rwxr-xr-x 1 vdar users 24600 May 20 14:26 SearchResultCollection.pm -rwxr-xr-x 1 vdar users 35040 May 20 14:26 SearchResult.pm -rwxr-xr-x 1 vdar users 11565 May 20 14:26 SeqDBI.pm -rwxr-xr-x 1 vdar users 25427 May 20 14:26 SimpleBatcher.pm -rwxr--r-- 1 vdar users 30140103 May 20 14:26 taxonomy.dat -rwxr-xr-x 1 vdar users 23897 May 20 14:26 Taxonomy.pm -rwxr-xr-x 1 vdar 
users 20151 May 20 14:26 TRF.pm -rwxr-xr-x 1 vdar users 14587 May 20 14:26 TRFResult.pm drwxr-xr-x 2 vdar users 4096 May 20 14:26 util -rwxr-xr-x 1 vdar users 41181 May 20 14:26 WUBlastSearchEngine.pm -rwxr-xr-x 1 vdar users 35404 May 20 14:26 WUBlastXSearchEngine.pm [vdar at capsella RepeatMasker]$ pwd /home/vdar/RepeatMasker But when I enter this path in my program like this BEGIN { $ENV{REPEATMASKERDIR} = '/home/vdar/RepeatMasker'; } It gives the same error, do you still think that its not installed and I should consult someone at campus? Thanks! Quoting Thomas Sharpton : > 'which repeatmasker' can only find executables that are part of your > system path (which you can set in your .bashrc file). > > Did you follow these instructions when installing? > > http://www.repeatmasker.org/RMDownload.html > > If so, I'd venture a guess that the RepeatMasker executable is sitting > in /usr/local/bin or /usr/local/RepeatMasker. > > Good luck, > > -T > > nisa_dar wrote: > > I am trying to find RepeatMasker program executable by typing > > which RepeatMasker > > but its no where on my system, although I installed it and in my revious > > message I have mentioned the paths of its .pm file and its directory. > > I don't know how to find the path of what my program needs, bcs I have > tried > > all these paths which i mentioned in my previous message. > > any suggestions? > > > > > > > > Florent Angly wrote: > > > >> The path that you set up for $ENV{REPEATMASKERDIR} should be the path to > >> the RepeatMasker program executable, not the path to the ReapeatMasker > >> BioPerl module. > >> Florent > >> > >> nisa_dar wrote: > >> > >>> please see my full message and all the approaches that i have been doing > >>> to > >>> tell my pogram where repeat masker is...what else is correct if these are > >>> not? 
> >>> > >>> > >>> > >>> > >>> Sendu Bala-2 wrote: > >>> > >>> > >>>> nisa_dar wrote: > >>>> > >>>> > >>>>> ok now I have installed repeat masker, with its prerequisites as given > >>>>> on > >>>>> http://www.repeatmasker.org/ > >>>>> but now I am getting this error message. > >>>>> > >>>>> RepeatMasker program not found as or not executable. > >>>>> > >>>>> what should I do? > >>>>> > >>>>> > >>>> Well now you have to correct your code to tell it where you installed > >>>> RepeatMasker: > >>>> > >>>> > >>>> > >>>> > >>>>>> On May 13, 2008, at 1:59 PM, nisa_dar wrote: > >>>>>> > >>>>>> > >>>> [...] > >>>> > >>>> > >>>>>>> BEGIN { > >>>>>>> > >>>>>>> $ENV{REPEATMASKERDIR} = > >>>>>>> '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; > >>>>>>> > >>>>>>> } > >>>>>>> > >>>>>>> > >>>> _______________________________________________ > >>>> Bioperl-l mailing list > >>>> Bioperl-l at lists.open-bio.org > >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>>> > >>>> > >>>> > >>>> > >>> > >>> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > >> > >> > > > > > > > -- > Thomas Sharpton > PhD Candidate - UC Berkeley > Search smarter: www.siphs.com > > From vdar at yorku.ca Thu May 22 14:59:32 2008 From: vdar at yorku.ca (nisa_dar) Date: Thu, 22 May 2008 11:59:32 -0700 (PDT) Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: <4835AFCB.6030204@campus.iztacala.unam.mx> References: <17218229.post@talk.nabble.com> <4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org> <17407201.post@talk.nabble.com> <48359B8B.10507@sendu.me.uk> <17408831.post@talk.nabble.com> <4835AFCB.6030204@campus.iztacala.unam.mx> Message-ID: <17411230.post@talk.nabble.com> I can see that RepeatMasker program is there in this directory /home/vdar/RepeatMasker Here is a whole list of whats inside this directory [vdar at capsella RepeatMasker]$ ls -al total 35048 
drwxr-xr-x 5 vdar users 4096 May 22 11:41 . drwx------ 10 vdar users 4096 May 22 14:15 .. -rwxr-xr-x 1 vdar users 11386 May 20 14:26 ArrayListIterator.pm -rwxr-xr-x 1 vdar users 13888 May 20 14:26 ArrayList.pm -rw-r--r-- 1 vdar users 1683 May 20 14:26 bluegrad.jpg -rwxr-xr-x 1 vdar users 16439 May 20 14:26 configure -rwxr-xr-x 1 vdar users 26098 May 20 14:26 CrossmatchSearchEngine.pm -rwxr-xr-x 1 vdar users 26342 May 20 14:26 DateRepeats -rw-r--r-- 1 vdar users 5935 May 20 14:26 daterepeats.help -rwxr-xr-x 1 vdar users 33871 May 20 14:26 DeCypherSearchEngine.pm -rwxr-xr-x 1 vdar users 33265 May 20 14:26 DupMasker -rwxr-xr-x 1 vdar users 54550 May 20 14:26 FastaDB.pm -rw-r--r-- 1 vdar users 6966 May 20 14:26 HTMLAnnotHeader.html -rwxr--r-- 1 vdar users 2543 May 20 14:26 INSTALL drwxr-xr-x 2 vdar users 4096 May 20 14:26 Libraries -rw-r--r-- 1 vdar users 9862 May 20 14:26 license.txt -rwxr-xr-x 1 vdar users 425419 May 20 14:26 LineHash.pm drwxr-xr-x 4 vdar users 4096 May 20 14:26 Matrices -rwxr-xr-x 1 vdar users 322391 May 20 14:26 ProcessRepeats -rwxr-xr-x 1 vdar users 9381 May 20 14:26 PubRef.pm -rwxr--r-- 1 vdar users 1836 May 20 14:26 README -rwxr-xr-x 1 vdar users 22385 May 20 14:26 RepbaseEMBL.pm -rwxr-xr-x 1 vdar users 49282 May 20 14:26 RepbaseRecord.pm -rwxr-xr-x 1 vdar users 3929708 May 20 14:26 RepeatAnnotationData.pm -rwxr-xr-x 1 vdar users 213727 May 20 14:26 RepeatMasker -rw-r--r-- 1 vdar users 5721 May 20 14:26 RepeatMaskerConfig.pm -rwxr-xr-x 1 vdar users 5689 May 20 14:26 RepeatMaskerConfig.tmpl -rw-r--r-- 1 vdar users 85557 May 20 14:26 repeatmasker.help -rwxr-xr-x 1 vdar users 19538 May 20 14:26 RepeatProteinMask -rwxr-xr-x 1 vdar users 17666 May 20 14:26 SearchEngineI.pm -rwxr-xr-x 1 vdar users 24600 May 20 14:26 SearchResultCollection.pm -rwxr-xr-x 1 vdar users 35040 May 20 14:26 SearchResult.pm -rwxr-xr-x 1 vdar users 11565 May 20 14:26 SeqDBI.pm -rwxr-xr-x 1 vdar users 25427 May 20 14:26 SimpleBatcher.pm -rwxr--r-- 1 vdar users 30140103 May 20 
14:26 taxonomy.dat -rwxr-xr-x 1 vdar users 23897 May 20 14:26 Taxonomy.pm -rwxr-xr-x 1 vdar users 20151 May 20 14:26 TRF.pm -rwxr-xr-x 1 vdar users 14587 May 20 14:26 TRFResult.pm drwxr-xr-x 2 vdar users 4096 May 20 14:26 util -rwxr-xr-x 1 vdar users 41181 May 20 14:26 WUBlastSearchEngine.pm -rwxr-xr-x 1 vdar users 35404 May 20 14:26 WUBlastXSearchEngine.pm [vdar at capsella RepeatMasker]$ pwd /home/vdar/RepeatMasker But when I enter this path in my program like this BEGIN { $ENV{REPEATMASKERDIR} = '/home/vdar/RepeatMasker'; } It gives the same error. Thanks! Mauricio Herrera Cuadra-3 wrote: > > Sendu already pointed it out: $ENV{REPEATMASKER} needs to be changed. > RepeatMasker docs must have some info on where it was installed > depending on your platform, use `which` otherwise. > > Mauricio. > > nisa_dar wrote: >> please see my full message and all the approaches that i have been doing >> to >> tell my pogram where repeat masker is...what else is correct if these are >> not? >> >> >> >> >> Sendu Bala-2 wrote: >>> nisa_dar wrote: >>>> ok now I have installed repeat masker, with its prerequisites as given >>>> on >>>> http://www.repeatmasker.org/ >>>> but now I am getting this error message. >>>> >>>> RepeatMasker program not found as or not executable. >>>> >>>> what should I do? >>> Well now you have to correct your code to tell it where you installed >>> RepeatMasker: >>> >>> >>>>> On May 13, 2008, at 1:59 PM, nisa_dar wrote: >>> [...] 
>>>>>> BEGIN { >>>>>> >>>>>> $ENV{REPEATMASKERDIR} = >>>>>> '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; >>>>>> >>>>>> } >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >> > > -- > MAURICIO HERRERA CUADRA > arareko at campus.iztacala.unam.mx > Laboratorio de Genética > Unidad de Morfofisiología y Función > Facultad de Estudios Superiores Iztacala, UNAM > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- View this message in context: http://www.nabble.com/RepeatMasker-not-found-tp17218229p17411230.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From vdar at yorku.ca Thu May 22 15:28:50 2008 From: vdar at yorku.ca (nisa_dar) Date: Thu, 22 May 2008 12:28:50 -0700 (PDT) Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: <4835B2AA.50505@sendu.me.uk> References: <17218229.post@talk.nabble.com> <4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org> <17407201.post@talk.nabble.com> <48359B8B.10507@sendu.me.uk> <17408831.post@talk.nabble.com> <4835B2AA.50505@sendu.me.uk> Message-ID: <17411731.post@talk.nabble.com> Finally I have put my whole code inside the Directory RepeatMasker and now this error message comes ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Repeat Masker Call(RepeatMasker -noint -mam /tmp/uZzDGdH78C/oyiUOrQOer 2> /dev/null 1>/dev/null) crashed: 32512 STACK: Error::throw STACK: Bio::Root::Root::throw /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:328 STACK: Bio::Tools::Run::RepeatMasker::_run /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/Run/RepeatMasker.pm:266 STACK: Bio::Tools::Run::RepeatMasker::run /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/Run/RepeatMasker.pm:220 STACK: try.pl:28 ----------------------------------------------------------- Here is
my code #!/usr/bin/perl use strict; use warnings; use Bio::Seq; use Bio::Tools::Run::StandAloneBlast; use Bio::Search::Hit::HitI; use Bio::Search::Hit::BlastHit; use Bio::Search::HSP::BlastHSP; use Bio::Search::HSP::HSPI; use Bio::SearchIO; use Bio::Tools::Run::RepeatMasker; BEGIN { $ENV{REPEATMASKERDIR} = '/home/vdar/RepeatMasker'; } my @params = ("mam" => 1,"noint"=>1); my $factory = Bio::Tools::Run::RepeatMasker->new(@params); my $in = Bio::SeqIO->new(-file => "boechera.fasta", -format => 'fasta'); my $seq = $in->next_seq(); # #return an array of Bio::SeqFeature::FeaturePair objects my @feats = $factory->run($seq); # # # or # # $factory->run($seq); # my @feats = $factory->repeat_features; # # #return the masked sequence, a Bio::SeqI object my $masked_seq = $factory->run; # if ($masked_seq){ print "yes\n"; # } Does anyone know what does that mean and what to do now? bcs I have seen that RepeatMasker program resides in this directory and I didn't get the previous message this time that program not found. Thanks! Sendu Bala-2 wrote: > > nisa_dar wrote: >> please see my full message and all the approaches that i have been doing >> to >> tell my pogram where repeat masker is...what else is correct if these are >> not? > > We don't know where you installed RepeatMasker. Only you do. You need to > supply that installation directory to $ENV{REPEATMASKERDIR} in your code. > > >> Sendu Bala-2 wrote: >>> nisa_dar wrote: >>>> ok now I have installed repeat masker, with its prerequisites as given >>>> on >>>> http://www.repeatmasker.org/ >>>> but now I am getting this error message. >>>> >>>> RepeatMasker program not found as or not executable. >>>> >>>> what should I do? >>> Well now you have to correct your code to tell it where you installed >>> RepeatMasker: >>> >>> >>>>> On May 13, 2008, at 1:59 PM, nisa_dar wrote: >>> [...] 
>>>>>> BEGIN { >>>>>> >>>>>> $ENV{REPEATMASKERDIR} = >>>>>> '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; >>>>>> >>>>>> } >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- View this message in context: http://www.nabble.com/RepeatMasker-not-found-tp17218229p17411731.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From Russell.Smithies at agresearch.co.nz Thu May 22 16:58:30 2008 From: Russell.Smithies at agresearch.co.nz (Smithies, Russell) Date: Fri, 23 May 2008 08:58:30 +1200 Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: <17411731.post@talk.nabble.com> References: <17218229.post@talk.nabble.com><4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org><17407201.post@talk.nabble.com> <48359B8B.10507@sendu.me.uk><17408831.post@talk.nabble.com> <4835B2AA.50505@sendu.me.uk> <17411731.post@talk.nabble.com> Message-ID: Have you had tried RepeatMasker running on the command line? If it doesn't run there, it's very unlikely to run through BioPerl. Eg. /home/vdar/RepeatMasker/RepeatMasker -species Ruminantia -xsmall test.fa --Russell > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open- > bio.org] On Behalf Of nisa_dar > Sent: Friday, 23 May 2008 7:29 a.m. 
> To: Bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Re peatMasker not found > > > Finally I have put my whole code inside the Directory RepeatMasker and now > this error message comes > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Repeat Masker Call(RepeatMasker -noint -mam > /tmp/uZzDGdH78C/oyiUOrQOer > 2> /dev/null 1>/dev/null) crashed: 32512 > > STACK: Error::throw > STACK: Bio::Root::Root::throw > /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:328 > STACK: Bio::Tools::Run::RepeatMasker::_run > /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/Run/RepeatMasker.pm:266 > STACK: Bio::Tools::Run::RepeatMasker::run > /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/Run/RepeatMasker.pm:220 > STACK: try.pl:28 > ----------------------------------------------------------- > > > Here is my code > > #!/usr/bin/perl > > use strict; > use warnings; > > use Bio::Seq; > use Bio::Tools::Run::StandAloneBlast; > use Bio::Search::Hit::HitI; > use Bio::Search::Hit::BlastHit; > use Bio::Search::HSP::BlastHSP; > use Bio::Search::HSP::HSPI; > use Bio::SearchIO; > use Bio::Tools::Run::RepeatMasker; > > BEGIN { > > $ENV{REPEATMASKERDIR} = '/home/vdar/RepeatMasker'; > > } > > > my @params = ("mam" => 1,"noint"=>1); > my $factory = Bio::Tools::Run::RepeatMasker->new(@params); > my $in = Bio::SeqIO->new(-file => "boechera.fasta", -format => 'fasta'); > my $seq = $in->next_seq(); > # > #return an array of Bio::SeqFeature::FeaturePair objects > my @feats = $factory->run($seq); > # > # # or > # > # $factory->run($seq); > # my @feats = $factory->repeat_features; > # > # #return the masked sequence, a Bio::SeqI object > my $masked_seq = $factory->run; > # > if ($masked_seq){ > print "yes\n"; > # > } > > > Does anyone know what does that mean and what to do now? bcs I have seen > that RepeatMasker program resides in this directory and I didn't get the > previous message this time that program not found. > > Thanks! 
> > > > > > > > > > Sendu Bala-2 wrote: > > > > nisa_dar wrote: > >> please see my full message and all the approaches that i have been doing > >> to > >> tell my pogram where repeat masker is...what else is correct if these are > >> not? > > > > We don't know where you installed RepeatMasker. Only you do. You need to > > supply that installation directory to $ENV{REPEATMASKERDIR} in your code. > > > > > >> Sendu Bala-2 wrote: > >>> nisa_dar wrote: > >>>> ok now I have installed repeat masker, with its prerequisites as given > >>>> on > >>>> http://www.repeatmasker.org/ > >>>> but now I am getting this error message. > >>>> > >>>> RepeatMasker program not found as or not executable. > >>>> > >>>> what should I do? > >>> Well now you have to correct your code to tell it where you installed > >>> RepeatMasker: > >>> > >>> > >>>>> On May 13, 2008, at 1:59 PM, nisa_dar wrote: > >>> [...] > >>>>>> BEGIN { > >>>>>> > >>>>>> $ENV{REPEATMASKERDIR} = > >>>>>> '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; > >>>>>> > >>>>>> } > >>> _______________________________________________ > >>> Bioperl-l mailing list > >>> Bioperl-l at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>> > >>> > >> > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > -- > View this message in context: http://www.nabble.com/RepeatMasker-not-found- > tp17218229p17411731.html > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l ======================================================================= Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately. ======================================================================= From vdar at yorku.ca Thu May 22 17:13:52 2008 From: vdar at yorku.ca (vdar at yorku.ca) Date: Thu, 22 May 2008 17:13:52 -0400 Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: References: <17218229.post@talk.nabble.com><4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org><17407201.post@talk.nabble.com> <48359B8B.10507@sendu.me.uk><17408831.post@talk.nabble.com> <4835B2AA.50505@sendu.me.uk> <17411731.post@talk.nabble.com> Message-ID: <1211490832.4835e210c1e06@mymail.yorku.ca> I have tried it and it gives the following message RepeatMasker could not find the repeat library at: /home/vdar/RepeatMasker/Libraries/RepeatMasker.lib or /home/vdar/RepeatMasker/Libraries/RepeatMaskerLib.embl Please download the latest RepeatMasker library from http://www.girinst.org and install before using RepeatMasker I've downloaded and installed censor-4.2 from www.girinst.org already but of no use. I have seen that RepeatMasker/Libraries/RepeatMasker.lib is empty. What should I do? Quoting "Smithies, Russell" : > Have you had tried RepeatMasker running on the command line? > If it doesn't run there, it's very unlikely to run through BioPerl. > > Eg. 
/home/vdar/RepeatMasker/RepeatMasker -species Ruminantia > -xsmall test.fa > > > --Russell > > > -----Original Message----- > > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open- > > bio.org] On Behalf Of nisa_dar > > Sent: Friday, 23 May 2008 7:29 a.m. > > To: Bioperl-l at lists.open-bio.org > > Subject: Re: [Bioperl-l] Re peatMasker not found > > > > > > Finally I have put my whole code inside the Directory RepeatMasker and > now > > this error message comes > > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > > MSG: Repeat Masker Call(RepeatMasker -noint -mam > > /tmp/uZzDGdH78C/oyiUOrQOer > > 2> /dev/null 1>/dev/null) crashed: 32512 > > > > STACK: Error::throw > > STACK: Bio::Root::Root::throw > > /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:328 > > STACK: Bio::Tools::Run::RepeatMasker::_run > > /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/Run/RepeatMasker.pm:266 > > STACK: Bio::Tools::Run::RepeatMasker::run > > /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/Run/RepeatMasker.pm:220 > > STACK: try.pl:28 > > ----------------------------------------------------------- > > > > > > Here is my code > > > > #!/usr/bin/perl > > > > use strict; > > use warnings; > > > > use Bio::Seq; > > use Bio::Tools::Run::StandAloneBlast; > > use Bio::Search::Hit::HitI; > > use Bio::Search::Hit::BlastHit; > > use Bio::Search::HSP::BlastHSP; > > use Bio::Search::HSP::HSPI; > > use Bio::SearchIO; > > use Bio::Tools::Run::RepeatMasker; > > > > BEGIN { > > > > $ENV{REPEATMASKERDIR} = '/home/vdar/RepeatMasker'; > > > > } > > > > > > my @params = ("mam" => 1,"noint"=>1); > > my $factory = Bio::Tools::Run::RepeatMasker->new(@params); > > my $in = Bio::SeqIO->new(-file => "boechera.fasta", -format => > 'fasta'); > > my $seq = $in->next_seq(); > > # > > #return an array of Bio::SeqFeature::FeaturePair objects > > my @feats = $factory->run($seq); > > # > > # # or > > # > > # $factory->run($seq); > > # my @feats = 
$factory->repeat_features; > > # > > # #return the masked sequence, a Bio::SeqI object > > my $masked_seq = $factory->run; > > # > > if ($masked_seq){ > > print "yes\n"; > > # > > } > > > > > > Does anyone know what does that mean and what to do now? bcs I have > seen > > that RepeatMasker program resides in this directory and I didn't get > the > > previous message this time that program not found. > > > > Thanks! > > > > > > > > > > > > > > > > > > > > Sendu Bala-2 wrote: > > > > > > nisa_dar wrote: > > >> please see my full message and all the approaches that i have been > doing > > >> to > > >> tell my pogram where repeat masker is...what else is correct if > these are > > >> not? > > > > > > We don't know where you installed RepeatMasker. Only you do. You > need to > > > supply that installation directory to $ENV{REPEATMASKERDIR} in your > code. > > > > > > > > >> Sendu Bala-2 wrote: > > >>> nisa_dar wrote: > > >>>> ok now I have installed repeat masker, with its prerequisites as > given > > >>>> on > > >>>> http://www.repeatmasker.org/ > > >>>> but now I am getting this error message. > > >>>> > > >>>> RepeatMasker program not found as or not executable. > > >>>> > > >>>> what should I do? > > >>> Well now you have to correct your code to tell it where you > installed > > >>> RepeatMasker: > > >>> > > >>> > > >>>>> On May 13, 2008, at 1:59 PM, nisa_dar wrote: > > >>> [...] 
> > >>>>>> BEGIN { > > >>>>>> > > >>>>>> $ENV{REPEATMASKERDIR} = > > >>>>>> '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; > > >>>>>> > > >>>>>> } > > >>> _______________________________________________ > > >>> Bioperl-l mailing list > > >>> Bioperl-l at lists.open-bio.org > > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > >>> > > >>> > > >> > > > > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > > > > -- > > View this message in context: > http://www.nabble.com/RepeatMasker-not-found- > > tp17218229p17411731.html > > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > ======================================================================= > Attention: The information contained in this message and/or attachments > from AgResearch Limited is intended only for the persons or entities > to which it is addressed and may contain confidential and/or privileged > material. Any review, retransmission, dissemination or other use of, or > taking of any action in reliance upon, this information by persons or > entities other than the intended recipients is prohibited by AgResearch > Limited. If you have received this message in error, please notify the > sender immediately. 
> ======================================================================= > > From Russell.Smithies at agresearch.co.nz Thu May 22 17:18:48 2008 From: Russell.Smithies at agresearch.co.nz (Smithies, Russell) Date: Fri, 23 May 2008 09:18:48 +1200 Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: <1211490832.4835e210c1e06@mymail.yorku.ca> References: <17218229.post@talk.nabble.com><4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org><17407201.post@talk.nabble.com> <48359B8B.10507@sendu.me.uk><17408831.post@talk.nabble.com> <4835B2AA.50505@sendu.me.uk> <17411731.post@talk.nabble.com> <1211490832.4835e210c1e06@mymail.yorku.ca> Message-ID: AFAIK, you need to pay for Repbase (we pay around US$6,200 per year) If you don't want to pay for Repbase, you'll need to create your own repeat library. --Russell > -----Original Message----- > From: vdar at yorku.ca [mailto:vdar at yorku.ca] > Sent: Friday, 23 May 2008 9:14 a.m. > To: Smithies, Russell > Cc: Bioperl-l at lists.open-bio.org > Subject: RE: [Bioperl-l] Re peatMasker not found > > I have tried it and it gives the following message > > RepeatMasker could not find the repeat library at: > /home/vdar/RepeatMasker/Libraries/RepeatMasker.lib > or > /home/vdar/RepeatMasker/Libraries/RepeatMaskerLib.embl > Please download the latest RepeatMasker library from > http://www.girinst.org and install before using RepeatMasker > > > I've downloaded and installed censor-4.2 from www.girinst.org already but of no > use. I have seen that RepeatMasker/Libraries/RepeatMasker.lib is empty. What > should I do? > > > > Quoting "Smithies, Russell" : > > > Have you had tried RepeatMasker running on the command line? > > If it doesn't run there, it's very unlikely to run through BioPerl. > > > > Eg. 
/home/vdar/RepeatMasker/RepeatMasker -species Ruminantia > > -xsmall test.fa > > > > > > --Russell > > > > > -----Original Message----- > > > From: bioperl-l-bounces at lists.open-bio.org > > [mailto:bioperl-l-bounces at lists.open- > > > bio.org] On Behalf Of nisa_dar > > > Sent: Friday, 23 May 2008 7:29 a.m. > > > To: Bioperl-l at lists.open-bio.org > > > Subject: Re: [Bioperl-l] Re peatMasker not found > > > > > > > > > Finally I have put my whole code inside the Directory RepeatMasker and > > now > > > this error message comes > > > > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > > > MSG: Repeat Masker Call(RepeatMasker -noint -mam > > > /tmp/uZzDGdH78C/oyiUOrQOer > > > 2> /dev/null 1>/dev/null) crashed: 32512 > > > > > > STACK: Error::throw > > > STACK: Bio::Root::Root::throw > > > /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:328 > > > STACK: Bio::Tools::Run::RepeatMasker::_run > > > /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/Run/RepeatMasker.pm:266 > > > STACK: Bio::Tools::Run::RepeatMasker::run > > > /opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/Run/RepeatMasker.pm:220 > > > STACK: try.pl:28 > > > ----------------------------------------------------------- > > > > > > > > > Here is my code > > > > > > #!/usr/bin/perl > > > > > > use strict; > > > use warnings; > > > > > > use Bio::Seq; > > > use Bio::Tools::Run::StandAloneBlast; > > > use Bio::Search::Hit::HitI; > > > use Bio::Search::Hit::BlastHit; > > > use Bio::Search::HSP::BlastHSP; > > > use Bio::Search::HSP::HSPI; > > > use Bio::SearchIO; > > > use Bio::Tools::Run::RepeatMasker; > > > > > > BEGIN { > > > > > > $ENV{REPEATMASKERDIR} = '/home/vdar/RepeatMasker'; > > > > > > } > > > > > > > > > my @params = ("mam" => 1,"noint"=>1); > > > my $factory = Bio::Tools::Run::RepeatMasker->new(@params); > > > my $in = Bio::SeqIO->new(-file => "boechera.fasta", -format => > > 'fasta'); > > > my $seq = $in->next_seq(); > > > # > > > #return an array of Bio::SeqFeature::FeaturePair 
objects > > > my @feats = $factory->run($seq); > > > # > > > # # or > > > # > > > # $factory->run($seq); > > > # my @feats = $factory->repeat_features; > > > # > > > # #return the masked sequence, a Bio::SeqI object > > > my $masked_seq = $factory->run; > > > # > > > if ($masked_seq){ > > > print "yes\n"; > > > # > > > } > > > > > > > > > Does anyone know what does that mean and what to do now? bcs I have > > seen > > > that RepeatMasker program resides in this directory and I didn't get > > the > > > previous message this time that program not found. > > > > > > Thanks! > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Sendu Bala-2 wrote: > > > > > > > > nisa_dar wrote: > > > >> please see my full message and all the approaches that i have been > > doing > > > >> to > > > >> tell my pogram where repeat masker is...what else is correct if > > these are > > > >> not? > > > > > > > > We don't know where you installed RepeatMasker. Only you do. You > > need to > > > > supply that installation directory to $ENV{REPEATMASKERDIR} in your > > code. > > > > > > > > > > > >> Sendu Bala-2 wrote: > > > >>> nisa_dar wrote: > > > >>>> ok now I have installed repeat masker, with its prerequisites as > > given > > > >>>> on > > > >>>> http://www.repeatmasker.org/ > > > >>>> but now I am getting this error message. > > > >>>> > > > >>>> RepeatMasker program not found as or not executable. > > > >>>> > > > >>>> what should I do? > > > >>> Well now you have to correct your code to tell it where you > > installed > > > >>> RepeatMasker: > > > >>> > > > >>> > > > >>>>> On May 13, 2008, at 1:59 PM, nisa_dar wrote: > > > >>> [...] 
> > > >>>>>> BEGIN { > > > >>>>>> > > > >>>>>> $ENV{REPEATMASKERDIR} = > > > >>>>>> '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/'; > > > >>>>>> > > > >>>>>> } > > > >>> _______________________________________________ > > > >>> Bioperl-l mailing list > > > >>> Bioperl-l at lists.open-bio.org > > > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > >>> > > > >>> > > > >> > > > > > > > > _______________________________________________ > > > > Bioperl-l mailing list > > > > Bioperl-l at lists.open-bio.org > > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > > > > > > > > -- > > > View this message in context: > > http://www.nabble.com/RepeatMasker-not-found- > > > tp17218229p17411731.html > > > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > > > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > ============================================================= > ========== > > Attention: The information contained in this message and/or attachments > > from AgResearch Limited is intended only for the persons or entities > > to which it is addressed and may contain confidential and/or privileged > > material. Any review, retransmission, dissemination or other use of, or > > taking of any action in reliance upon, this information by persons or > > entities other than the intended recipients is prohibited by AgResearch > > Limited. If you have received this message in error, please notify the > > sender immediately. 
> > > ============================================================= > ========== > > > > > ======================================================================= Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately. ======================================================================= From bix at sendu.me.uk Thu May 22 18:01:53 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 22 May 2008 23:01:53 +0100 Subject: [Bioperl-l] Re peatMasker not found In-Reply-To: References: <17218229.post@talk.nabble.com><4333DE4A-B3A7-4AD5-9F4A-408C58D03925@bioperl.org><17407201.post@talk.nabble.com> <48359B8B.10507@sendu.me.uk><17408831.post@talk.nabble.com> <4835B2AA.50505@sendu.me.uk> <17411731.post@talk.nabble.com> <1211490832.4835e210c1e06@mymail.yorku.ca> Message-ID: <4835ED51.5050403@sendu.me.uk> Smithies, Russell wrote: > AFAIK, you need to pay for Repbase (we pay around US$6,200 per year) > If you don't want to pay for Repbase, you'll need to create your own > repeat library. If you're academic you can register and download the RepeatMasker edition from here: http://www.girinst.org/server/RepBase/index.php vdar then needs to extract the files into the location that RepeatMasker is trying look for them and it will work. From vdar at yorku.ca Thu May 22 18:07:27 2008 From: vdar at yorku.ca (nisa_dar) Date: Thu, 22 May 2008 15:07:27 -0700 (PDT) Subject: [Bioperl-l] WUBlastSearchEngine Message-ID: <17414548.post@talk.nabble.com> Does anyone know what does that mean and how to fix it? 
I'm running repeatmasker from commandline and WUBlastSearchEngine.pm is inside this directory. I installed WU blast. I can't understand how to fix this error now? [vdar at capsella RepeatMasker]$ ./RepeatMasker boechera.fasta RepeatMasker version open-3.2.3 Search engine: WUBlast WUBlastSearchEngine::setPathToEngine( /state/partition1/bin/blast2/blastp ): Cannot determine engine variant and version! at ./RepeatMasker line 534 I have seen line 534 on this program but I couldn't understand what I need to do? Can you help? -- View this message in context: http://www.nabble.com/WUBlastSearchEngine-tp17414548p17414548.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From vdar at yorku.ca Thu May 22 19:00:57 2008 From: vdar at yorku.ca (nisa_dar) Date: Thu, 22 May 2008 16:00:57 -0700 (PDT) Subject: [Bioperl-l] WUBlastSearchEngine In-Reply-To: <17414548.post@talk.nabble.com> References: <17414548.post@talk.nabble.com> Message-ID: <17415278.post@talk.nabble.com> I have the older version of WUBlast and its path is the same which is being mentioned in the error message below. I have changed this path in the WuBlastSearchEngine.pm as well, my $wuEngine = WUBlastSearchEngine->new( pathToEngine=>"/home/vdar/RepeatMasker/blast2.linux-5.1/blastn" ); although i didn't know if it was needed or no. These are a few more lines which i haven't changed but I don't know if I need to change them $wuEngine->setMatrix( "/users/bob/simple.matrix" ); $wuEngine->setQuery( "/users/bob/query.fasta" ); $wuEngine->setSubject( "/users/bob/subject.fasta" ); my $searchResults = $wuEngine->search(); .. does anyone know what should I change in order to get results instead of an error? Thanks nisa_dar wrote: > > Does anyone know what does that mean and how to fix it? I'm running > repeatmasker from commandline and WUBlastSearchEngine.pm is inside this > directory. I installed WU blast. I can't understand how to fix this error > now? 
> > [vdar at capsella RepeatMasker]$ ./RepeatMasker boechera.fasta > RepeatMasker version open-3.2.3 > Search engine: WUBlast > WUBlastSearchEngine::setPathToEngine( /state/partition1/bin/blast2/blastp > ): Cannot determine engine variant and version! > at ./RepeatMasker line 534 > > I have seen line 534 on this program but I couldn't understand what I need > to do? > > Can you help? > > > > > -- View this message in context: http://www.nabble.com/WUBlastSearchEngine-tp17414548p17415278.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From CMB.zhouw at gmail.com Thu May 22 21:48:07 2008 From: CMB.zhouw at gmail.com (Wei Zhou) Date: Fri, 23 May 2008 09:48:07 +0800 Subject: [Bioperl-l] I can't access clustalw from my cgi perl program... In-Reply-To: <17410291.post@talk.nabble.com> References: <17367665.post@talk.nabble.com> <1211397502.4834757e732c3@mymail.yorku.ca> <48347FA6.4090301@campus.iztacala.unam.mx> <1211402045.4834873deb708@mymail.yorku.ca> <48348DD0.2040705@sendu.me.uk> <1211407927.48349e3770e94@mymail.yorku.ca> <4834A1CC.5070709@campus.iztacala.unam.mx> <1211409610.4834a4ca98853@mymail.yorku.ca> <4834ACE6.1090402@campus.iztacala.unam.mx> <17410291.post@talk.nabble.com> Message-ID: <4c0ae1150805221848r6f25560ct6f23802ed0d24e17@mail.gmail.com> check out the permission of your HOME directory, /home/vdar/ perhaps its mode is 700. if so, chmod 755 ~ Hope this helps. Best regards, Wei Zhou On Fri, May 23, 2008 at 2:12 AM, nisa_dar wrote: > > I have seen in my error log file that output file is not being written bcs > of > no permissions > > [Thu May 22 14:07:35 2008] [error] [client xxxxxxxxxx] MSG: Could not open > >/home/vdar/in.aln.pfam: Permission denied, referer: > https://capsella.ccs.yorku.ca/nisa/snp_finder.html > > Does any one know how can i give permissions to others so that my program > can write this file in my home directory? > All I knew was chmod 777 ... but in this case this doesn't work... 
> > thanks > > > > > > > > Mauricio Herrera Cuadra-3 wrote: > > > > A couple of things inlined: > > > > vdar at yorku.ca wrote: > >> Yes, I've seen in that directory, but it doesn't exist. Another weird > >> thing > >> which is being happening is that If I make this output file manually in > >> that > >> directory, it is read by the following code and printed on screen > >> > >> > >> if ("out.aln.pfam"){ > >> open FH, "out.aln.pfam" || die "Alignment file doesn't exist<br>"; > >> while(<FH>){ > >> > >> print $_,"<br>"; > >> } > >> close FH; > >> } > >> > >> only when this code is not followed by the original code. When it's > >> followed by > >> the original code i.e. > > > > Yeah, this works because you're placing the file there by hand, so it's > > found by the open() function, not the 'if ("out.aln.pfam")' statement > > (which, btw, always evaluates as TRUE). Something simpler like this will > > work as you expect and it's easier to understand: > > > > open FH, "out.aln.pfam" or die "Alignment file doesn't exist<br>"; > > while (<FH>) { > > print $_, "<br>"; > > } > > close FH; > > > >> my $in = Bio::AlignIO->new(-file => $file1 , > >> -format => 'fasta'); > >> my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" , > >> -format => 'pfam'); > >> > >> > >> > >> while ( my $aln1 = $in->next_aln() ) { > >> $out->write_aln($aln1); > >> } > >> > >> > >> > >> if ("out.aln.pfam"){ > >> open FH, "out.aln.pfam" || die "Alignment file doesn't exist<br>"; > >> while(<FH>){ > >> > >> print $_,"<br>"; > >> } > >> close FH; > >> } > >> > >> Nothing is printed on screen, while it stays there in the directory. I > >> have > >> changed the name of output file in this code but new file is not > produced > >> by > >> this program. Does anyone know what is going on? > > > > Apparently, this is a problem with permissions. Maybe your script lives > > under some directory (i.e. your home directory) which is owned by a > > different user than the one who is actually running the CGI interface > > (e.g. apache, nobody) ?? Check your Apache logs from another shell > > screen to see what is really happening while you run your script: > > > > $ tail -f /path/to/your/apache/error.log > > > > In a CGI environment, all Perl messages/warnings are printed to the > > webserver's log, not the standard output (your shell). > > > > Mauricio. > > > >> thanks > >> nisa > >> > >> > >> Quoting Mauricio Herrera Cuadra : > >> > >>> You're using '>out.aln.pfam' as the full path for the output file. Most > >>> probably, the file is being produced but in the same location where the > >>> CGI script lives. Check inside the same directory where you installed > >>> your script. > >>> > >>> Mauricio. > >>> > >>> vdar at yorku.ca wrote: > >>>> ok thanks, its not giving me any error now, but its not doing anything > >>>> too, > >>> the > >>>> following code works from commandline but not from my cgi script. I > >>>> have > >>> added > >>>> the path to bioperl and have tried everything else that I could > find... > >>>> > >>>> my $in = Bio::AlignIO->new(-file => $inputfilename , > >>>> -format => 'fasta'); > >>>> my $out = Bio::AlignIO->new(-file => ">out.aln.pfam" , > >>>> -format => 'pfam'); > >>>> > >>>> > >>>> > >>>> while ( my $aln1 = $in->next_aln() ) { > >>>> $out->write_aln($aln1); > >>>> } > >>>> > >>>> > >>>> output file is not produced. what should I do?
> >>>> > >>>> thanks > >>>> nisa > >>>> > >>>> > >>>> Quoting Sendu Bala : > >>>> > >>>>> vdar at yorku.ca wrote: > >>>>>> How can I find where bioperl is installed? > >>>>> You already know where it's installed. See below. > >>>>> > >>>>> > >>>>>> Quoting Mauricio Herrera Cuadra : > >>>>>> > >>>>>>> Hi Nisa, > >>>>>>> > >>>>>>> CGI scripts are generally run by a different user than you, and > >>>>>>> which > >>>>>>> user (e.g. apache, nobody) will depend on the platform you're > >>>>>>> running > >>>>>>> the script on, thus the environment variables you currently have > for > >>>>>>> your login shell are not being inherited to the web interface. The > >>>>>>> best > >>>>>>> workaround for this is to add a 'use lib' pragma at the top of your > >>>>>>> CGI > >>>>>>> script: > >>>>>>> > >>>>>>> use lib '/path/to/your/bioperl/installation/'; > >>>>> [...] > >>>>>>> vdar at yorku.ca wrote: > >>>>>>>> export PERL5LIB="/opt/rocks/lib/perl5/site_perl/5.8.8" > >>>> > >>> -- > >>> MAURICIO HERRERA CUADRA > >>> arareko at campus.iztacala.unam.mx > >>> Laboratorio de Genética > >>> Unidad de Morfofisiología y Función > >>> Facultad de Estudios Superiores Iztacala, UNAM > >>> > >> > >> > > > > -- > > MAURICIO HERRERA CUADRA > > arareko at campus.iztacala.unam.mx > > Laboratorio de Genética > > Unidad de Morfofisiología y Función > > Facultad de Estudios Superiores Iztacala, UNAM > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > View this message in context: > http://www.nabble.com/I-can%27t-access-clustalw-from-my-cgi-perl-program...-tp17367665p17410291.html > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l >
From bug-bioperl at rt.cpan.org Thu May 22 20:24:04 2008 From: bug-bioperl at rt.cpan.org (Serguei Trouchelle via RT) Date: Thu, 22 May 2008 20:24:04 -0400 Subject: [Bioperl-l] [rt.cpan.org #36117] Circular dependency References: Message-ID: Thu May 22 20:23:58 2008: Request 36117 was acted upon. Transaction: Ticket created by STRO Queue: bioperl Subject: Circular dependency Broken in: (no value) Severity: Critical Owner: Nobody Requestors: stro at railways.dp.ua Status: new Ticket bioperl-1.5.2_102 depends on Bio::ASN1::EntrezGene Bio-ASN1-EntrezGene-1.091 depends on Bio::Index::AbstractSeq Bio::Index::AbstractSeq is a part of bioperl So we have a circular dependency. -- Serguei Trouchelle
From tataetis at gmail.com Fri May 23 05:35:48 2008 From: tataetis at gmail.com (Giwrgos Tataetis) Date: Fri, 23 May 2008 12:35:48 +0300 Subject: [Bioperl-l] torsion angle Message-ID: Hi guys. I am an undergraduate student in Biology and I've been assigned to calculate the torsion angles in a molecule... The whole idea is that we have 4 points in a 3D system A-B-C-D and the only thing we know is their coordinates, i.e. A(xa, ya, za). The goal is to find the torsion angle between A-B and C-D. I have repeatedly searched and asked around but I still can't figure out what a possible formula or algorithm would be. I have found this in the archives of a mailing list but I really don't know if it holds or is true: http://www.pikipimp.com/pp/pimped_ph...=1211535101226 Even if this formula holds, I have a lot of questions. 1) How can I calculate r1, r2, r3 from the coordinates of the points? 2) How can I multiply them? Well, actually I think this is the cross product but I am not sure. 3) I found how to calculate the cross product of two vectors and I think that this is it: *a* × *b* = (a2b3 − a3b2) *i* + (a3b1 − a1b3) *j* + (a1b2 − a2b1) *k* = (a2b3 − a3b2, a3b1 − a1b3, a1b2 − a2b1). However, I still can't understand how a vector, which from what I understood is not just a point, can be defined by only three coordinates, and if so, how it is calculated from its start point and end point... 4) In the third and fourth equations, from what I can tell, there are two or more unknown elements, so how can each be calculated? 5) If I can find the cos or the sin of the angle, what do I need atan for, and what does atan stand for, since the only things I know are cos, sin and tan? 6) Last but not least, I face two major problems which make it even harder to find an answer: I am a biologist and maths was never a first priority for me (regrettably, I must admit), and English is not my native language, so the math terms are not easily understood. Please can anyone help me find some answers or even point me in some direction here? Thank you very much!!!
From jason at bioperl.org Fri May 23 12:44:57 2008 From: jason at bioperl.org (Jason Stajich) Date: Fri, 23 May 2008 10:44:57 -0600 Subject: [Bioperl-l] Fwd: about get_all_tags BioPerl method References: Message-ID: <7100D8D0-1D54-486C-80F2-3DF01CA0591F@bioperl.org> resending from correct addr. Begin forwarded message: > > Date: May 23, 2008 9:44:18 AM MDT > To: Mgavi Brathwaite > Cc: bioperl list > Subject: Fwd: about get_all_tags BioPerl method > > Mgavi - > you'd have to change the module to accept a custom sort function I > suppose. I don't have any time to help hack it though, but others > in the project may so it is important to ask on the list as I've > CC-ed. > > -jason > > Begin forwarded message: > >> From: "Mgavi Brathwaite" >> Date: May 23, 2008 8:07:10 AM MDT >> Subject: Re: about get_all_tags BioPerl method >> >> Hi Jason! >> >> I have a genbank file and I want to arrange the feature qualifiers >> for CDS in a particular format.
Presently the output is: >> >> CDS join(34018..34488,35055..35136,35636..35757) >> /db_xref="CCDS:CCDS15306.1" >> /db_xref="MGI:Myog" >> /protein_id="ENSMUSP00000027730" >> /gene="ENSMUSG00000026459" >> /note="transcript_id=ENSMUST00000027730" >> >> I want it to be: >> >> CDS join(34018..34488,35055..35136,35636..35757) >> /note="transcript_id=ENSMUST00000027730" >> /protein_id="ENSMUSP00000027730 >> /gene="ENSMUSG00000026459 >> /db_xref="CCDS:CCDS15306.1" >> /db_xref="MGI:Myog" >> >> How can I control the format and output? >> >> Thanks, >> Mgavi >> >> On Thu, May 22, 2008 at 4:10 PM, Jason Stajich > wrote: >> I don't understand what you want - can you be more specific? >> >> On May 22, 2008, at 1:31 PM, Mgavi Brathwaite wrote: >> >>> Hi Jason, >>> >>> You commented on the get_all_tags method: >>> # added a sort so that tags will be returned in a predictable order >>> # I still think we should be able to specify a sort function >>> # to the object at some point >>> # -js >>> >>> Would u be willing to share that code with me? >>> >>> M >> >> From reece at harts.net Fri May 23 16:26:36 2008 From: reece at harts.net (Reece Hart) Date: Fri, 23 May 2008 20:26:36 +0000 Subject: [Bioperl-l] torsion angle In-Reply-To: References: Message-ID: <1211574396.6958.137.camel@snafu> On Fri, 2008-05-23 at 12:35 +0300, Giwrgos Tataetis wrote: > I am an undergraduate student in Biology and I've been assigned to > calculate the torsion angles in a molecule... > Please can anyone help me find some answers or even point me in some > direction Giwrgos- I don't know why you couldn't find this with Google -- searching for "protein torsion angles cross product" (unquoted) turns up two good documents: http://www.math.fsu.edu/~quine/IntroMathBio_05/torsion_pdb/torsion_pdb.pdf http://en.wikipedia.org/wiki/Dihedral_angle The first does a good job of explaining the vector math at a fairly elementary level.
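For readers following the thread, the vector math those documents describe might be sketched in Perl roughly as follows. This is only an illustration, not code from the thread or from BioPerl: the helper names (vsub, dot, cross, torsion_angle) are made up here, and each point is assumed to be an [x, y, z] array reference. It uses the usual atan2 form of the dihedral angle, which is why atan is needed on top of cos and sin: atan2 takes both a sine-like and a cosine-like term and returns a signed angle, so the sign indicates clockwise versus counterclockwise when looking along B->C.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Vector from point $p to point $q (each an [x, y, z] array reference).
sub vsub { my ($q, $p) = @_; return [ map { $q->[$_] - $p->[$_] } 0 .. 2 ]; }

# Dot product of two 3D vectors.
sub dot {
    my ($u, $v) = @_;
    return $u->[0] * $v->[0] + $u->[1] * $v->[1] + $u->[2] * $v->[2];
}

# Cross product of two 3D vectors.
sub cross {
    my ($u, $v) = @_;
    return [
        $u->[1] * $v->[2] - $u->[2] * $v->[1],
        $u->[2] * $v->[0] - $u->[0] * $v->[2],
        $u->[0] * $v->[1] - $u->[1] * $v->[0],
    ];
}

# Signed torsion (dihedral) angle in degrees for four points A-B-C-D.
sub torsion_angle {
    my ($pa, $pb, $pc, $pd) = @_;
    my $b1 = vsub($pb, $pa);          # bond vector A -> B
    my $b2 = vsub($pc, $pb);          # bond vector B -> C
    my $b3 = vsub($pd, $pc);          # bond vector C -> D
    my $n1 = cross($b1, $b2);         # normal to the plane through A, B, C
    my $n2 = cross($b2, $b3);         # normal to the plane through B, C, D
    my $y  = sqrt(dot($b2, $b2)) * dot($b1, $n2);   # proportional to sin(angle)
    my $x  = dot($n1, $n2);                         # proportional to cos(angle)
    my $pi = 4 * atan2(1, 1);
    return atan2($y, $x) * 180 / $pi;
}

# Toy coordinates (hypothetical, not taken from any real molecule).
printf "%.1f\n", torsion_angle([1, 0, 0], [0, 0, 0], [0, 0, 1], [0,  1, 1]);  # prints 90.0
printf "%.1f\n", torsion_angle([1, 0, 0], [0, 0, 0], [0, 0, 1], [0, -1, 1]);  # prints -90.0
```

With this convention a positive result means that, looking from B towards C, the A-B bond has to be turned clockwise to line up with C-D; a negative result means counterclockwise.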
-Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0 From cjfields at uiuc.edu Fri May 23 13:18:36 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 23 May 2008 12:18:36 -0500 Subject: [Bioperl-l] Fwd: about get_all_tags BioPerl method In-Reply-To: <7100D8D0-1D54-486C-80F2-3DF01CA0591F@bioperl.org> References: <7100D8D0-1D54-486C-80F2-3DF01CA0591F@bioperl.org> Message-ID: <72A7A685-AE8C-4D80-B75F-FEBA6B54310F@uiuc.edu> We can probably hack this in fairly easily; just need to think about the best way to do it. Maybe a way of generating the SeqFeature string via a callback (which would allow some customization)? chris On May 23, 2008, at 11:44 AM, Jason Stajich wrote: > > resending from correct addr. > > Begin forwarded message: > >> >> Date: May 23, 2008 9:44:18 AM MDT >> To: Mgavi Brathwaite >> Cc: bioperl list >> Subject: Fwd: about get_all_tags BioPerl method >> >> Mgavi - >> you'd have to change the module to accept a custom sort function I >> suppose. I don't have any time to help hack it though, but others >> in the project may so it is important to ask on the list as I've CC- >> ed. >> >> -jason >> >> Begin forwarded message: >> >>> From: "Mgavi Brathwaite" >>> Date: May 23, 2008 8:07:10 AM MDT >>> Subject: Re: about get_all_tags BioPerl method >>> >>> Hi Jason! >>> >>> I have a genbank file and I want to arrange the feature qualifiers >>> for CDS in a particular format. Presently the output is: >>> >>> CDS join(34018..34488,35055..35136,35636..35757) >>> /db_xref="CCDS:CCDS15306.1" >>> /db_xref="MGI:Myog" >>> /protein_id="ENSMUSP00000027730" >>> /gene="ENSMUSG00000026459" >>> /note="transcript_id=ENSMUST00000027730" >>> >>> I want it to be: >>> >>> CDS join(34018..34488,35055..35136,35636..35757) >>> /note="transcript_id=ENSMUST00000027730" >>> /protein_id="ENSMUSP00000027730 >>> /gene="ENSMUSG00000026459 >>> /db_xref="CCDS:CCDS15306.1" >>> /db_xref="MGI:Myog" >>> >>> How can I control the format and output? 
>>> >>> Thanks, >>> Mgavi >>> >>> On Thu, May 22, 2008 at 4:10 PM, Jason Stajich > wrote: >>> I don't understand what you want - can you be more specific? >>> >>> On May 22, 2008, at 1:31 PM, Mgavi Brathwaite wrote: >>> >>>> Hi Jason, >>>> >>>> You commented on the get_all_tags method: >>>> # added a sort so that tags will be returned in a predictable order >>>> # I still think we should be able to specify a sort function >>>> # to the object at some point >>>> # -js >>>> >>>> Would u be willing to share that code with me? >>>> >>>> M >>> >>> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From arareko at campus.iztacala.unam.mx Fri May 23 19:11:13 2008 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Fri, 23 May 2008 18:11:13 -0500 Subject: [Bioperl-l] OT: Microarray analysis software - testers wanted Message-ID: <48374F11.9000509@campus.iztacala.unam.mx> Hi all, A friend of mine is looking for some volunteers who would like to help testing a visual/interactive system for microarray data analysis he's working on for his thesis. Experience with array data and its analysis is best but not required. If any of you is interested please reply to me privately. Thanks and regards, Mauricio. -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Gen?tica Unidad de Morfofisiolog?a y Funci?n Facultad de Estudios Superiores Iztacala, UNAM From arshad25 at gmail.com Fri May 23 23:36:54 2008 From: arshad25 at gmail.com (arshad mohammed) Date: Sat, 24 May 2008 11:36:54 +0800 Subject: [Bioperl-l] Format Validator Message-ID: Hi All, I am putting some effort to write a sub module to validate bio sequences, as my first attempt to contribute something to bioperl. 
It is quite premature as it can validate only FASTA format at this stage. I would really like to have some feedback from all of you guys to improve and bioperlify it. Following is the module and the snippet how to use it. ############################## ############################# ##FormatValidator.pm package FormatValidator; use strict; use Carp; use warnings; use version; our $VERSION = qv('0.0.1'); #I am a poor constructor with no attributes sub new { my ($class) = @_; my $object = bless {}, $class; return $object; } #I will check if the sequence satisfy rules of "Fasta" and if it is, I will return "1" sub is_fasta { my ($self, $file) = @_; return 0 if !defined $_[1]; #I am not sure the user passed in a "File-Handler" or an "Array reference". #I am flexible enough to read the data from both. So I will make sure #what actually the user passed in and read it accordingly with the help of my bro "readfile". my @file_data = $self->readfile($file); for (@file_data) { if ($_ =~ /^>/) { #Bang!! 
This is the only identifier I know 'bt FASTA, so no point in further reading return 1; } else { return 0; } } return undef; } #I will Read the file data from either File Handler or Array and pass it to the caller sub readfile { my ($self, $file) = @_; #If it is file Handler if (ref($file) eq 'IO::Handle') { my @file = <$file>; return @file; } #Or if it is Array reference elsif (ref($file) eq 'ARRAY') { return @{$file}; } #If it is anything else else { carp "I can read only Array reference or File Handler, But this is something else !\n"; return; } } 1; ################################## ################################## #test.pl use FormatValidator; use strict; open FH, 'new(); #pass either file handler if ($validator->is_fasta(*FH{IO})) { print "Its a FASTA..\n"; } else { print "Its not a FASTA\n"; } close FH; # or the array reference of the file content open FH, '; if ($validator->is_fasta(\@file_data)) { print "Its a FASTA..\n"; } else { print "Its not a FASTA\n"; } #Test for an invalid condition my $invalid = $validator->is_fasta($validator); if (!defined $invalid) { print "DIE..\n"; } else { print "Something Wrong in my module"; } Perl ly Arshad Mohammed -- \\\|/// \\ - - // ( @ @ ) --------o00o-(_)-o00o----------- From karchana at ibab.ac.in Sat May 24 05:33:24 2008 From: karchana at ibab.ac.in (karchana at ibab.ac.in) Date: Sat, 24 May 2008 15:03:24 +0530 (IST) Subject: [Bioperl-l] How to find introns? Message-ID: <41893.192.168.1.254.1211621604.squirrel@webmail.ibab.ac.in> Hi, I have mRNA positions like this: (12873..12907,13120..13355,15952..16054,18291..18468). I have sequence also. I want to find introns and represent graphically introns and exons !! How to find introns and represent graphically? With regards Archana From tataetis at gmail.com Sun May 25 04:43:39 2008 From: tataetis at gmail.com (Giwrgos Tataetis) Date: Sun, 25 May 2008 11:43:39 +0300 Subject: [Bioperl-l] torsion angles. Message-ID: Nevermind I figured it out... 
Can anyone tell me how can I find, given the coordinates of the points, whether an angle is clockwise or counterclockwise? From bamboowarrior at gmail.com Mon May 26 16:18:05 2008 From: bamboowarrior at gmail.com (Arkady) Date: Mon, 26 May 2008 15:18:05 -0500 Subject: [Bioperl-l] Get genes/annotations associated with a probe Message-ID: <91656c3f0805261318k1f0965e8r76f4802bea6e23ec@mail.gmail.com> Hi everyone, I've got a list of genomic positions (e.g., the coordinates used by UCSC) for transcribed fragments from a microarray. Is there a simple way to get a list of annotations for those regions using bioperl? e.g., every annotation that spans the entire length of the transfrag, or 80% of it; or annotations that overlap the transfrag? Thanks in advance! Cheers, John Woods -- Institute for Cellular and Molecular Biology The University of Texas at Austin From jason at bioperl.org Mon May 26 16:23:06 2008 From: jason at bioperl.org (Jason Stajich) Date: Mon, 26 May 2008 13:23:06 -0700 Subject: [Bioperl-l] Get genes/annotations associated with a probe In-Reply-To: <91656c3f0805261318k1f0965e8r76f4802bea6e23ec@mail.gmail.com> References: <91656c3f0805261318k1f0965e8r76f4802bea6e23ec@mail.gmail.com> Message-ID: <07D3C30A-B452-4A05-ABFC-45BDD0DF5456@bioperl.org> As always, what species are you talking about; not everyone works on human only? If human I would suggest using EnsemblMart or Ensembl API for this kind of data. http://biomart.org/ There is a perl API in Ensembl which can be used to query this sort of thing. It will probably be easier to just download the EnsMart full set of annotations with genomic locations included and then you can do an overlap lookup - or load into Bio::DB::GFF to get the transcript name and overlap lookup. -jason On May 26, 2008, at 1:18 PM, Arkady wrote: > Hi everyone, > > I've got a list of genomic positions (e.g., the coordinates used by > UCSC) > for transcribed fragments from a microarray. 
Is there a simple way > to get a > list of annotations for those regions using bioperl? e.g., every > annotation > that spans the entire length of the transfrag, or 80% of it; or > annotations > that overlap the transfrag? > > Thanks in advance! > > Cheers, > John Woods > > -- > Institute for Cellular and Molecular Biology > The University of Texas at Austin > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bug-bioperl at rt.cpan.org Tue May 27 11:01:38 2008 From: bug-bioperl at rt.cpan.org (Sendu Bala via RT) Date: Tue, 27 May 2008 11:01:38 -0400 Subject: [Bioperl-l] [rt.cpan.org #36117] Circular dependency References: Message-ID: Queue: bioperl Ticket On Thu May 22 20:23:58 2008, STRO wrote: > bioperl-1.5.2_102 depends on Bio::ASN1::EntrezGene > Bio-ASN1-EntrezGene-1.091 depends on Bio::Index::AbstractSeq > Bio::Index::AbstractSeq is a part of bioperl > > So we have a circular dependency. Yes. However, it isn't a strict dependency of BioPerl, but an optional installation. So choose not to install it. From bug-bioperl at rt.cpan.org Tue May 27 11:03:07 2008 From: bug-bioperl at rt.cpan.org (Sendu Bala via RT) Date: Tue, 27 May 2008 11:03:07 -0400 Subject: [Bioperl-l] [rt.cpan.org #29533] Bio::SeqIO::interpro depends on XML::DOM::XPath References: Message-ID: Queue: bioperl Ticket resolved, closing out From bug-bioperl at rt.cpan.org Tue May 27 11:06:49 2008 From: bug-bioperl at rt.cpan.org (Sendu Bala via RT) Date: Tue, 27 May 2008 11:06:49 -0400 Subject: [Bioperl-l] [rt.cpan.org #17505] fails to install References: Message-ID: Queue: bioperl Ticket Newer version of Bioperl released, shouldn't have those problems.
From bug-bioperl at rt.cpan.org Tue May 27 11:16:51 2008 From: bug-bioperl at rt.cpan.org (Sendu Bala via RT) Date: Tue, 27 May 2008 11:16:51 -0400 Subject: [Bioperl-l] [rt.cpan.org #31796] SeqIO In-Reply-To: <5F694A96-AC4B-4279-8060-9E28A92837ED@afmb.univ-mrs.fr> References: <5F694A96-AC4B-4279-8060-9E28A92837ED@afmb.univ-mrs.fr> Message-ID: Queue: bioperl Ticket Need more information, was it ever submitted to bugzilla? From bix at sendu.me.uk Tue May 27 11:12:23 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 27 May 2008 16:12:23 +0100 Subject: [Bioperl-l] Someone available to apply a documentation patch? Message-ID: <483C24D7.3050601@sendu.me.uk> http://rt.cpan.org/Ticket/Display.html?id=12802 If someone could resolve this that would be great. I checked the first file he patched and the error is still there. I doubt the patch can be applied directly, however, since it's based on BioPerl 1.4 - manual effort required. From cjfields at uiuc.edu Tue May 27 13:48:14 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 27 May 2008 12:48:14 -0500 Subject: [Bioperl-l] Someone available to apply a documentation patch? In-Reply-To: <483C24D7.3050601@sendu.me.uk> References: <483C24D7.3050601@sendu.me.uk> Message-ID: <8AE1A09C-31E5-4FA1-A6B7-594031E9A961@uiuc.edu> I'm working on it now by hand (some have been corrected already). I'll commit when finished. chris On May 27, 2008, at 10:12 AM, Sendu Bala wrote: > http://rt.cpan.org/Ticket/Display.html?id=12802 > > If someone could resolve this that would be great. I checked the > first file he patched and the error is still there. I doubt the > patch can be applied directly, however, since it's based on BioPerl > 1.4 - manual effort required. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue May 27 14:07:15 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 27 May 2008 13:07:15 -0500 Subject: [Bioperl-l] Someone available to apply a documentation patch? In-Reply-To: <483C24D7.3050601@sendu.me.uk> References: <483C24D7.3050601@sendu.me.uk> Message-ID: Sendu, I can't mark this as resolved (I'm not listed as a maintainer via RT). Could you close this out? -c On May 27, 2008, at 10:12 AM, Sendu Bala wrote: > http://rt.cpan.org/Ticket/Display.html?id=12802 > > If someone could resolve this that would be great. I checked the > first file he patched and the error is still there. I doubt the > patch can be applied directly, however, since it's based on BioPerl > 1.4 - manual effort required. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From bug-bioperl at rt.cpan.org Tue May 27 13:52:38 2008 From: bug-bioperl at rt.cpan.org (Kashi Revanna via RT) Date: Tue, 27 May 2008 13:52:38 -0400 Subject: [Bioperl-l] [rt.cpan.org #36216] Unable to parse score from BLAST output file References: Message-ID: Tue May 27 13:52:36 2008: Request 36216 was acted upon. Transaction: Ticket created by kajack Queue: bioperl Subject: Unable to parse score from BLAST output file Broken in: (no value) Severity: (no value) Owner: Nobody Requestors: kashi.mail at gmail.com Status: new Ticket Hi, I think there is a small bug in the Bio::SearchIO module. I am parsing the BLAST output file using this module. It works great except for one thing. I have included a part of the BLAST output file (I have modified the lines to fit into this box).
Most of the times the score of Sequences producing significant alignments is in the format of 6.149e+04. This module picks up only 6 and ignores other digits. Can you please look into this for me. Thank you in advance Kashi Attached: The sample Blast output file is here ==================================================================== BLASTN 2.2.15 [Oct-15-2006] Query= Contig_1011 (31,018 letters) Database: scaffold_3.fsa 84 sequences; 3,615,155 total letters Searching..................................................done Score E Sequences producing significant alignments: (bits) Value Contig_1011 6.149e+04 0.0 Contig_8873 2397 0.0 Contig_1482 2042 0.0 Contig_9461 1475 0.0 Contig_1977 339 7e-92 >Contig_1011 Length = 31018 Score = 6.149e+04 bits (31018), Expect = 0.0 Identities = 31018/31018 (100%) Strand = Plus / Plus Query: 1 cttcaacaaacacgtatttctgaatgaaattgtttagagtttgttgaaggtcacgatcag 60 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 1 cttcaacaaacacgtatttctgaatgaaattgtttagagtttgttgaaggtcacgatcag 60 Query: 61 gctcatagaccagcggtcctgaaagaggattgcctttaagtttgttggaaaaaacgatta 120 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 61 gctcatagaccagcggtcctgaaagaggattgcctttaagtttgttggaaaaaacgatta 120 ==================================================================== From bug-bioperl at rt.cpan.org Tue May 27 14:10:04 2008 From: bug-bioperl at rt.cpan.org (Chris Fields via RT) Date: Tue, 27 May 2008 14:10:04 -0400 Subject: [Bioperl-l] [rt.cpan.org #36216] Unable to parse score from BLAST output file References: Message-ID: Queue: bioperl Ticket Please submit this using the BioPerl bugzilla: http://bugzilla.open-bio.org/ Attach a full BLAST example to the bug report so we can track it. On Tue May 27 13:52:36 2008, kajack wrote: > Hi, > > I think there is a small bug in the Bio::SearchIO module. I am parsing > the BLAST output file using this module. It works great except for one > thing. 
> > I have included a part of the blast outputfile ( I have modified the > lines to fit into this box). Most of the times the score of Sequences > producing significant alignments is in the format of 6.149e+04. This > module picks up only 6 and ignores other digits. > > Can you please look into this for me. > > Thank you in advance > Kashi > > Attached: The sample Blast output file is here > > ==================================================================== > BLASTN 2.2.15 [Oct-15-2006] > > Query= Contig_1011 (31,018 letters) > > Database: scaffold_3.fsa > 84 sequences; 3,615,155 total letters > > Searching..................................................done > > > > Score E > Sequences producing significant alignments: (bits) Value > > Contig_1011 6.149e+04 0.0 > Contig_8873 2397 0.0 > Contig_1482 2042 0.0 > Contig_9461 1475 0.0 > Contig_1977 339 7e-92 > > >Contig_1011 > Length = 31018 > > Score = 6.149e+04 bits (31018), Expect = 0.0 > Identities = 31018/31018 (100%) > Strand = Plus / Plus > > > Query: 1 cttcaacaaacacgtatttctgaatgaaattgtttagagtttgttgaaggtcacgatcag 60 > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 1 cttcaacaaacacgtatttctgaatgaaattgtttagagtttgttgaaggtcacgatcag 60 > > > Query: 61 > gctcatagaccagcggtcctgaaagaggattgcctttaagtttgttggaaaaaacgatta 120 > |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| > Sbjct: 61 > gctcatagaccagcggtcctgaaagaggattgcctttaagtttgttggaaaaaacgatta 120 > > > ==================================================================== From bug-bioperl at rt.cpan.org Tue May 27 14:12:33 2008 From: bug-bioperl at rt.cpan.org (Sendu Bala via RT) Date: Tue, 27 May 2008 14:12:33 -0400 Subject: [Bioperl-l] [rt.cpan.org #12802] Documentation patch References: Message-ID: Queue: bioperl Ticket Applied, courtesy of Chris Fields From jason at bioperl.org Tue May 27 17:24:47 2008 From: jason at bioperl.org (Jason Stajich) Date: Tue, 27 May 2008 14:24:47 -0700 Subject: [Bioperl-l] Sort for get_all_tags method 
In-Reply-To: References: Message-ID:

Again, please ask these questions on the list.

Yes, it is possible. I don't specifically have any code for doing the sorting myself, but I have hacked up some untested code at the end of this msg. The main problem for bioperl is the Feature object needs a new method to accept setting a sort function, something we can easily do, but someone just needs to put in a little time to test it out and make sure it all works.

It would boil down to updating the function that is implemented in Bio::SeqFeature::Generic :

    sub get_all_tags {
        my ($self, @args) = @_;
        return sort keys %{ $self->{'_gsf_tag_hash'}};
    }

If you are looking for more example code, you might try the Perl Cookbook from O'Reilly. I would implement a specific sort function by probably mapping the keys to a number based on your pre-determined preference (ie 'note' would map to 0, 'gene' to 1, etc) and then all tags without a mapping would get ordered alphabetically. Something like this.

The solution of updating the module to accept a general sort function would be a more general solution, but you can always override this get_all_tags function in your local script by including this code in the beginning of your script:

    use Bio::SeqFeature::Generic;

    # preferred order for known tags; all other tags sort alphabetically after them
    my %lookup = ( 'note' => 0, 'gene' => 1, 'locus' => 2 );

    no warnings 'redefine';    # we are deliberately replacing the stock method
    sub Bio::SeqFeature::Generic::get_all_tags {
        my ($self, @args) = @_;
        return sort {
            my ($amap, $bmap) = map { $lookup{$_} } ($a, $b);
            my $rc;
            if ( defined $amap && !defined $bmap ) {
                $rc = -1;                # only $a is in the lookup, it should come first
            } elsif ( !defined $amap && defined $bmap ) {
                $rc = 1;                 # only $b is in the lookup, it should come first
            } elsif ( defined $amap && defined $bmap ) {
                $rc = $amap <=> $bmap;   # numeric compare, these are both in the lookup
            } else {
                $rc = $a cmp $b;         # alpha compare of the tag names, neither is in the lookup
            }
            $rc;                         # return code from the block passed to sort
        } keys %{ $self->{'_gsf_tag_hash'} };
    }    # end routine

-jason

On May 22, 2008, at 12:24 PM, Mgavi Brathwaite wrote: > Hi Jason, > > I looked at your comment and I agree that there should be a sort > method to > return the tags in a predictable order. Would you be willing to > share you > code for sorting tags with the get_all_tags method ? > > M >

From cjfields at uiuc.edu Tue May 27 21:48:45 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 27 May 2008 20:48:45 -0500 Subject: [Bioperl-l] Sort for get_all_tags method In-Reply-To: References: Message-ID:

I posted this to bugzilla as an enhancement request (so it doesn't get lost in the mail archives). The question is, do we want to pass in a code ref as a sort function, defaulting to the original 'no sort' behavior?

chris

On May 27, 2008, at 4:24 PM, Jason Stajich wrote: > Again, please ask these questions on the list. > > yes it is possible, but I don't specifically have any code for doing > the sorting myself, but have hacked up some untested code at the end > of this msg. The main problem for bioperl is the Feature object > needs a new method to accept setting a sort function, something we > can easily do, but someone just needs to put in a little time to > test it out and make sure it all works. > > It would boil down to updating the function that is implemented in > Bio::SeqFeature::Generic : > > sub get_all_tags { > my ($self, @args) = @_; > return sort keys %{ $self->{'_gsf_tag_hash'}}; > } > > If you are looking for more example code, you might try the Perl > Cookbook from O'Reilly.
I would implement a specific sort function > by probably mapping the keys to a number based on your pre- > determined preference (ie 'note' would map to 0, 'gene' to 1, etc) > and then all tags without a mapping would get ordered alphabetically. > Something like this > > > The solution of updating the module to accept a general sort > function would be a more general solution, but you can always > override this all_tags function in your local script by including > this code in the beginning of your script: > > > use Bio::SeqFeature::Generic; > my %lookup = ( 'note' => 0, 'gene' => 1, 'locus' => 2 ); > sub Bio::SeqFeature::Generic::get_all_tags { > my ($self, @args) = @_; > return sort { my ($amap,$bmap) = map { $lookup{$_} } ($a,$b); > my $rc = undef; > if( defined $amap && ! defined $bmap ) { > $rc = -1; # only $a is in the lookup, it should come first > } elsif( ! defined $amap && defined $bmap ) { > $rc = 1; # only $b is in the lookup, it should > come first > } elsif( defined $amap && defined $bmap ) { > $rc = $amap <=> $bmap; # numeric compare, these are > both in the lookup > } else { > $rc = $amap cmp $bmap; # alpha compare, neither are > in the lookup; > } > $rc; # return code from the function passed to sort > } keys %{ $self->{'_gsf_tag_hash'}}; > } # end routine > > -jason > > On May 22, 2008, at 12:24 PM, Mgavi Brathwaite wrote: > >> Hi Jason, >> >> I looked at your comment and I agree that there should be a sort >> method to >> return the tags in a predictable order. Would you be willing to >> share you >> code for sorting tags with the get_all_tags method ? >> >> M >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From heikki at sanbi.ac.za Wed May 28 04:31:21 2008 From: heikki at sanbi.ac.za (Heikki Lehvaslaiho) Date: Wed, 28 May 2008 10:31:21 +0200 Subject: [Bioperl-l] bioperl anonymous SVN checkout fails Message-ID: <200805281031.22095.heikki@sanbi.ac.za> Checking out the bioperl-live repository ( as suggested in http://www.bioperl.org/wiki/Using_Subversion): svn co svn://code.open-bio.org/bioperl/bioperl-live/trunk bioperl-live fails with the following error message: svn: Can't find a temporary directory: Error string not specified yet -Heikki -- ______ _/ _/_____________________________________________________ _/ _/ _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho _/ _/ _/ SANBI, South African National Bioinformatics Institute _/ _/ _/ University of Western Cape, South Africa _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 ___ _/_/_/_/_/________________________________________________________ From heikki at sanbi.ac.za Wed May 28 04:23:51 2008 From: heikki at sanbi.ac.za (Heikki Lehvaslaiho) Date: Wed, 28 May 2008 10:23:51 +0200 Subject: [Bioperl-l] extending the PHYLIP format Message-ID: <200805281023.51697.heikki@sanbi.ac.za> I just learned that a number of phylogenetics packages (PAUP, PHYML, Mr Bayes at least ) now allow longer than 10 character IDs in PHYLIP format. The documentation is scarce but the rules seem to be: 1. There can be spaces before the ID. 2. The ID can be up to 50 characters long. 3. ID can contain any characters. If you are using spaces within the ID, you have to put the whole ID in single quotes ('). Single quotes can be used for all IDs and are removed when parsing in. 4. It is customary to have two spaces between the ID and the sequence. This custom seems to have come into PHYLIP format from Nexus. Note that this allows sequences in a file to start at different columns. 
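To make the four rules above concrete, here is a minimal sketch of how one name line in this relaxed PHYLIP layout might be split into ID and sequence. It is only an illustration under the rules as listed: the helper name parse_relaxed_phylip_line and the sample lines are hypothetical, and none of this is the code in Bio::AlignIO::phylip.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split one name line of a
# "relaxed" PHYLIP alignment into (id, sequence) following the rules above.
# Returns an empty list if the line does not fit those rules.
sub parse_relaxed_phylip_line {
    my ($line) = @_;
    my ($id, $seq);
    if ( $line =~ /^\s*'([^']{1,50})'\s+(\S.*)$/ ) {
        # Rule 3: a quoted ID may contain spaces; the quotes are dropped.
        ($id, $seq) = ($1, $2);
    }
    elsif ( $line =~ /^\s*(\S{1,50})\s+(\S.*)$/ ) {
        # Rules 1-2: optional leading spaces, unquoted ID of up to 50 chars.
        ($id, $seq) = ($1, $2);
    }
    else {
        return;
    }
    $seq =~ s/\s+//g;    # sequence blocks are often separated by spaces
    return ($id, $seq);
}

for my $line ( "  'HIV1 B FR 83 HXB2'  ATGGGTGCGA GAGCGTCAGT",
               "B.FR.83.HXB2_LAI_IIIB_BRU.K03455  ATGGGTGCGA" ) {
    my ($id, $seq) = parse_relaxed_phylip_line($line);
    print "$id => $seq\n";
}
```

Because the ID is delimited by whitespace or quotes rather than by a fixed 10-column field, the sequence can start at a different column on every line, which is the point made in the note above.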
Can anyone shed more light into matter? I need to get this into bioperl as the names in HIV sequences that I work with are very long and can not be sensibly truncated. What would be the best way to do this? 1. Add more options to the already heavily hacked Bio::AlignIO::phylip.pm 2. Create a Bio::AlignIO::phyliplong.pm Do those ugly hacks for supporting fixed length long IDs really really belong in the vanilla phylip.pm file? Opinions? -Heikki -- ______ _/ _/_____________________________________________________ _/ _/ _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho _/ _/ _/ SANBI, South African National Bioinformatics Institute _/ _/ _/ University of Western Cape, South Africa _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 ___ _/_/_/_/_/________________________________________________________ From avilella at gmail.com Wed May 28 06:31:07 2008 From: avilella at gmail.com (Albert Vilella) Date: Wed, 28 May 2008 11:31:07 +0100 Subject: [Bioperl-l] extending the PHYLIP format In-Reply-To: <200805281023.51697.heikki@sanbi.ac.za> References: <200805281023.51697.heikki@sanbi.ac.za> Message-ID: <358f4d650805280331h7b1387cat76794d9135f7577d@mail.gmail.com> Hi Heikki, About a year ago, some code was added to deal with "the more than 10 chars" ids problem. ( https://www.nescent.org/wg_phyloinformatics/Phylohackathon_1/BioPerl_Targets) Basically: (1) mapping the long ids to 10-char numeric ids, (2) running the program with the id limitation, (3) reverting the ids back to the originals in the output. The pods explain how to do it. So I would say that the solution is at least "partially" there :-) Albert. On Wed, May 28, 2008 at 9:23 AM, Heikki Lehvaslaiho wrote: > > I just learned that a number of phylogenetics packages (PAUP, PHYML, Mr > Bayes > at least ) now allow longer than 10 character IDs in PHYLIP format. The > documentation is scarce but the rules seem to be: > > 1. There can be spaces before the ID. > 2. 
The ID can be up to 50 characters long. > 3. ID can contain any characters. If you are using spaces within the ID, > you > have to put the whole ID in single quotes ('). Single quotes can be used > for > all IDs and are removed when parsing in. > 4. It is customary to have two spaces between the ID and the sequence. > > This custom seems to have come into PHYLIP format from Nexus. > Note that this allows sequences in a file to start at different columns. > > Can anyone shed more light into matter? > > > I need to get this into bioperl as the names in HIV sequences that I work > with > are very long and can not be sensibly truncated. > > What would be the best way to do this? > 1. Add more options to the already heavily > hacked Bio::AlignIO::phylip.pm > 2. Create a Bio::AlignIO::phyliplong.pm > > Do those ugly hacks for supporting fixed length long IDs really really > belong > in the vanilla phylip.pm file? > > Opinions? > > -Heikki > > -- > ______ _/ _/_____________________________________________________ > _/ _/ > _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za > _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho > _/ _/ _/ SANBI, South African National Bioinformatics Institute > _/ _/ _/ University of Western Cape, South Africa > _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 > ___ _/_/_/_/_/________________________________________________________ > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From calum.robb at gmail.com Wed May 28 10:10:57 2008 From: calum.robb at gmail.com (calum robb) Date: Wed, 28 May 2008 15:10:57 +0100 Subject: [Bioperl-l] Installation problem Message-ID: <273360f60805280710l154f982dyc46b42823fec6ab2@mail.gmail.com> Hi there. I use Microsoft Windows XP, Home Edition, Version 2002, Service Pack 2. 
I have installed Perl version 5.10.0, and now want to install bioperl version 1.5.2. Can you give me easy steps to do this, please? Regards, -- Mr Calum Robb. BSc (Hons) Phd Student in Comparative Immunology School of Life Sciences John Muir Building Heriot Watt University Riccarton, Edinburgh, EH14 4AS Scotland Tel: 07849539610 Email: calum.robb at gmail.com or cr70 at hw.ac.uk From cjfields at uiuc.edu Wed May 28 09:51:55 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 28 May 2008 08:51:55 -0500 Subject: [Bioperl-l] extending the PHYLIP format In-Reply-To: <200805281023.51697.heikki@sanbi.ac.za> References: <200805281023.51697.heikki@sanbi.ac.za> Message-ID: <7C9CC210-AD03-466E-B27C-791FB8DA78DD@uiuc.edu> Could you post a few example phylip sequences with long names to svn and add a ticket to bugzilla? I would consider this a somewhat high-priority enhancement. I think keeping this in a single phylip module would be best, but we'll have to see how feasible it is. I think it is possible to do so, however, and still retain some backwards compatibility (I may even have an idea how, just need to test it out). chris On May 28, 2008, at 3:23 AM, Heikki Lehvaslaiho wrote: > I just learned that a number of phylogenetics packages (PAUP, PHYML, > Mr Bayes > at least ) now allow longer than 10 character IDs in PHYLIP format. > The > documentation is scarce but the rules seem to be: > > 1. There can be spaces before the ID. > 2. The ID can be up to 50 characters long. > 3. ID can contain any characters. If you are using spaces within the > ID, you > have to put the whole ID in single quotes ('). Single quotes can be > used for > all IDs and are removed when parsing in. > 4. It is customary to have two spaces between the ID and the sequence. > > This custom seems to have come into PHYLIP format from Nexus. > Note that this allows sequences in a file to start at different > columns. > > Can anyone shed more light into matter?
> > > I need to get this into bioperl as the names in HIV sequences that I > work with > are very long and can not be sensibly truncated. > > What would be the best way to do this? > 1. Add more options to the already heavily > hacked Bio::AlignIO::phylip.pm > 2. Create a Bio::AlignIO::phyliplong.pm > > Do those ugly hacks for supporting fixed length long IDs really > really belong > in the vanilla phylip.pm file? > > Opinions? > > -Heikki > > -- > ______ _/ _/_____________________________________________________ > _/ _/ > _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za > _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho > _/ _/ _/ SANBI, South African National Bioinformatics Institute > _/ _/ _/ University of Western Cape, South Africa > _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 > ___ _/_/_/_/_/________________________________________________________ > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From bix at sendu.me.uk Wed May 28 10:49:54 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 28 May 2008 15:49:54 +0100 Subject: [Bioperl-l] Installation problem In-Reply-To: <273360f60805280710l154f982dyc46b42823fec6ab2@mail.gmail.com> References: <273360f60805280710l154f982dyc46b42823fec6ab2@mail.gmail.com> Message-ID: <483D7112.4080403@sendu.me.uk> calum robb wrote: > I use Microsoft Windows XP, Home Edition, Version 2002, Service Pack 2. I > have installed Perl version 5.10.0, and want to now install bioperl version > 1.5.2, can you give me easy steps to do this please. You subject mentions a problem. What was it? 
Installation instructions are on the website: http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows From jason at bioperl.org Wed May 28 13:28:50 2008 From: jason at bioperl.org (Jason Stajich) Date: Wed, 28 May 2008 10:28:50 -0700 Subject: [Bioperl-l] extending the PHYLIP format In-Reply-To: <7C9CC210-AD03-466E-B27C-791FB8DA78DD@uiuc.edu> References: <200805281023.51697.heikki@sanbi.ac.za> <7C9CC210-AD03-466E-B27C-791FB8DA78DD@uiuc.edu> Message-ID: Should also ask Weigang what the status is, I think he implemented a lot of it. -jason On May 28, 2008, at 6:51 AM, Chris Fields wrote: > Could you post a few example phylip sequences with long names to > svn and add a ticket to bugzilla? I would consider this a somewhat > high-priority enhancement. > > I think keeping this in a single phylip module would be best, but > we'll to see how feasible it is. I think it is possible to do so, > however, and still retain some backwards compatibility (I may even > have an idea how, just need to test it out). > > chris > > On May 28, 2008, at 3:23 AM, Heikki Lehvaslaiho wrote: > >> I just learned that a number of phylogenetics packages (PAUP, >> PHYML, Mr Bayes >> at least ) now allow longer than 10 character IDs in PHYLIP >> format. The >> documentation is scarce but the rules seem to be: >> >> 1. There can be spaces before the ID. >> 2. The ID can be up to 50 characters long. >> 3. ID can contain any characters. If you are using spaces within >> the ID, you >> have to put the whole ID in single quotes ('). Single quotes can >> be used for >> all IDs and are removed when parsing in. >> 4. It is customary to have two spaces between the ID and the >> sequence. >> >> This custom seems to have come into PHYLIP format from Nexus. >> Note that this allows sequences in a file to start at different >> columns. >> >> Can anyone shed more light into matter? 
>> >> >> I need to get this into bioperl as the names in HIV sequences that >> I work with >> are very long and can not be sensibly truncated. >> >> What would be the best way to do this? >> 1. Add more options to the already heavily >> hacked Bio::AlignIO::phylip.pm >> 2. Create a Bio::AlignIO::phyliplong.pm >> >> Do those ugly hacks for supporting fixed length long IDs really >> really belong >> in the vanilla phylip.pm file? >> >> Opinions? >> >> -Heikki >> >> -- >> ______ _/ _/ >> _____________________________________________________ >> _/ _/ >> _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za >> _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho >> _/ _/ _/ SANBI, South African National Bioinformatics Institute >> _/ _/ _/ University of Western Cape, South Africa >> _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 >> ___ _/_/_/_/_/ >> ________________________________________________________ >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Marie-Claude Hofmann > College of Veterinary Medicine > University of Illinois Urbana-Champaign > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From weigang at GENECTR.HUNTER.CUNY.EDU Wed May 28 14:01:55 2008 From: weigang at GENECTR.HUNTER.CUNY.EDU (Weigang Qiu) Date: Wed, 28 May 2008 14:01:55 -0400 Subject: [Bioperl-l] extending the PHYLIP format In-Reply-To: References: <200805281023.51697.heikki@sanbi.ac.za> <7C9CC210-AD03-466E-B27C-791FB8DA78DD@uiuc.edu> Message-ID: <483D9E13.8090809@genectr.hunter.cuny.edu> Let me summarize a few things I implemented during the 07 Nescent Hackathon (with a lot of help from Sandu, Aaron, and Jason): 1. A "longname.aln" is included in the bioperl-live. 
Which turns out to be the same file as "pep-266.aln" (one of them should be removed). This file is in clustalw format. Also, it doesn't contain the tough cases like names 50 chars long, with spaces and single quotes.

2. The solution to this ugly restriction that has been implemented includes the following pair of SimpleAlignI methods:

set_displayname_safe

 Title     : set_displayname_safe
 Usage     : ($new_aln, $ref_name)=$ali->set_displayname_safe(4)
 Function  : Assign machine-generated serial names to sequences in input order.
             Designed to protect names during PHYLIP runs. Assign 10-char string
             in the form of "S000000001" to "S999999999". Restore the original
             names using "restore_displayname".
 Returns   : 1. a new $aln with system names;
             2. a hash ref for restoring names
 Argument  : Number for id length (default 10)

restore_displayname

 Title     : restore_displayname
 Usage     : $aln_name_restored=$ali->restore_displayname($hash_ref)
 Function  : Restore original sequence names (after running
             $ali->set_displayname_safe)
 Returns   : a new $aln with names restored.
 Argument  : a hash reference of names from "set_displayname_safe".

3. Added the following tests in "SimpleAlign.t":

    # test set_displayname_safe & restore_displayname:
    $str = Bio::AlignIO->new(-file=> Bio::Root::IO->catfile("t","data","pep-266.aln"));
    $aln=$str->next_aln();
    is $aln->get_seq_by_pos(3)->display_id, 'Smik_Contig1103.1', 'initial display id ok';
    my ($new_aln, $ref)=$aln->set_displayname_safe();
    is $new_aln->get_seq_by_pos(3)->display_id, 'S000000003', 'safe display id ok';
    my $restored_aln=$new_aln->restore_displayname($ref);
    is $restored_aln->get_seq_by_pos(3)->display_id, 'Smik_Contig1103.1', 'restored display id ok';

I would be happy to contribute more if additional work or design is needed.

ps. We developed a module for graphic annotation of alignments using GD (modeled after Bio::Graphics). This should be useful for people who are annotating alignments manually (such as highlighting alignment positions, labeling domains, etc).
Someone help me to deposit it in bioperl-live through subversion would be great (my cvs developer's account was told to be not useful any more). Jason Stajich wrote: > Should also ask Weigang what the status is, I think he implemented a > lot of it. > > -jason > On May 28, 2008, at 6:51 AM, Chris Fields wrote: > >> Could you post a few example phylip sequences with long names to svn >> and add a ticket to bugzilla? I would consider this a somewhat >> high-priority enhancement. >> >> I think keeping this in a single phylip module would be best, but >> we'll to see how feasible it is. I think it is possible to do so, >> however, and still retain some backwards compatibility (I may even >> have an idea how, just need to test it out). >> >> chris >> >> On May 28, 2008, at 3:23 AM, Heikki Lehvaslaiho wrote: >> >>> I just learned that a number of phylogenetics packages (PAUP, PHYML, >>> Mr Bayes >>> at least ) now allow longer than 10 character IDs in PHYLIP format. The >>> documentation is scarce but the rules seem to be: >>> >>> 1. There can be spaces before the ID. >>> 2. The ID can be up to 50 characters long. >>> 3. ID can contain any characters. If you are using spaces within the >>> ID, you >>> have to put the whole ID in single quotes ('). Single quotes can be >>> used for >>> all IDs and are removed when parsing in. >>> 4. It is customary to have two spaces between the ID and the sequence. >>> >>> This custom seems to have come into PHYLIP format from Nexus. >>> Note that this allows sequences in a file to start at different >>> columns. >>> >>> Can anyone shed more light into matter? >>> >>> >>> I need to get this into bioperl as the names in HIV sequences that I >>> work with >>> are very long and can not be sensibly truncated. >>> >>> What would be the best way to do this? >>> 1. Add more options to the already heavily >>> hacked Bio::AlignIO::phylip.pm >>> 2. 
Create a Bio::AlignIO::phyliplong.pm >>> >>> Do those ugly hacks for supporting fixed length long IDs really >>> really belong >>> in the vanilla phylip.pm file? >>> >>> Opinions? >>> >>> -Heikki >>> >>> --______ _/ >>> _/_____________________________________________________ >>> _/ _/ >>> _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za >>> _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho >>> _/ _/ _/ SANBI, South African National Bioinformatics Institute >>> _/ _/ _/ University of Western Cape, South Africa >>> _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 >>> ___ _/_/_/_/_/________________________________________________________ >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Marie-Claude Hofmann >> College of Veterinary Medicine >> University of Illinois Urbana-Champaign >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Weigang Qiu, Assist. Professor Department of Biological Sciences Hunter College, City University of New York 695 Park Ave, New York, NY 10021 1-212-772-5296 (Office, Room 839, Hunter North) From heikki at sanbi.ac.za Thu May 29 06:06:12 2008 From: heikki at sanbi.ac.za (Heikki Lehvaslaiho) Date: Thu, 29 May 2008 12:06:12 +0200 Subject: [Bioperl-l] extending the PHYLIP format In-Reply-To: <483D9E13.8090809@genectr.hunter.cuny.edu> References: <200805281023.51697.heikki@sanbi.ac.za> <483D9E13.8090809@genectr.hunter.cuny.edu> Message-ID: <200805291206.13063.heikki@sanbi.ac.za> I committed my first take on long IDs in phylip format. It turned out to be quite a simple thing to add an option 'longid=>1' to it. I had to change the interleaved() to return the set value (it was returning the previous value).
It might have been something I did a long time ago when I wrote the first version of the module! Strangely, it did not have any knock-on effects. The longid version of phylip format seems to work fine with phyml as long as there are no spaces in the IDs (but that is stated clearly in the phyml docs). This still needs careful testing with other programs. Please try it out with your favorite phylo programs! -Heikki On Wednesday 28 May 2008 20:01:55 Weigang Qiu wrote: > Let me summarize a few things I implemented during the 07 Nescent > Hackathon (with a lot of help from Sandu, Aaron, and Jason): > > 1. A "longname.aln" is included in the bioperl-live. Which turns out to > be the same file as "pep-266.aln" (one of them showed be removed). This > file is in clustalw format. Also, it doesn't contain the tough cases > like 50 chars long, with spaces and single quotes. > > 2. The solution to this ugly restriction that had been implemented > include the following pair of SimpleAlignI methods: > set_displayname_safe > > Title : set_displayname_safe > Usage : ($new_aln, $ref_name)=$ali->set_displayname_safe(4) > Function : Assign machine-generated serial names to sequences in > input order. > Designed to protect names during PHYLIP runs. Assign > 10-char string > in the form of "S000000001" to "S999999999". Restore > the original > names using "restore_displayname". > Returns : 1. a new $aln with system names; > 2. a hash ref for restoring names > Argument : Number for id length (default 10) > > restore_displayname > > Title : restore_displayname > Usage : $aln_name_restored=$ali->restore_displayname($hash_ref) > Function : Restore original sequence names (after running > $ali->set_displayname_safe) > Returns : a new $aln with names restored. > Argument : a hash reference of names from "set_displayname_safe". > > 3. 
Added following tests in "SimpleAlign.t": > # test set_displayname_safe & restore_displayname: > $str = Bio::AlignIO->new(-file=> > Bio::Root::IO->catfile("t","data","pep-266.aln")); > $aln=$str->next_aln(); > is $aln->get_seq_by_pos(3)->display_id, 'Smik_Contig1103.1', 'initial > display id ok'; > my ($new_aln, $ref)=$aln->set_displayname_safe(); > is $new_aln->get_seq_by_pos(3)->display_id, 'S000000003', 'safe display > id ok'; > my $restored_aln=$new_aln->restore_displayname($ref); > is $restored_aln->get_seq_by_pos(3)->display_id, 'Smik_Contig1103.1', > 'restored display id ok'; > > I would be happy to contribute more if additional work or design is needed. > > ps. We developed a module for graphic annotation of alignments using GD > (modeled after Bio::Graphics). This should be useful for people who are > annotating alignments manually (such as highlight alignment positions, > labeling domains, etc). Someone help me to deposit it in bioperl-live > through subversion would be great (my cvs developer's account was told > to be not useful any more). > > Jason Stajich wrote: > > Should also ask Weigang what the status is, I think he implemented a > > lot of it. > > > > -jason > > > > On May 28, 2008, at 6:51 AM, Chris Fields wrote: > >> Could you post a few example phylip sequences with long names to svn > >> and add a ticket to bugzilla? I would consider this a somewhat > >> high-priority enhancement. > >> > >> I think keeping this in a single phylip module would be best, but > >> we'll to see how feasible it is. I think it is possible to do so, > >> however, and still retain some backwards compatibility (I may even > >> have an idea how, just need to test it out). > >> > >> chris > >> > >> On May 28, 2008, at 3:23 AM, Heikki Lehvaslaiho wrote: > >>> I just learned that a number of phylogenetics packages (PAUP, PHYML, > >>> Mr Bayes > >>> at least ) now allow longer than 10 character IDs in PHYLIP format. 
The > >>> documentation is scarce but the rules seem to be: > >>> > >>> 1. There can be spaces before the ID. > >>> 2. The ID can be up to 50 characters long. > >>> 3. ID can contain any characters. If you are using spaces within the > >>> ID, you > >>> have to put the whole ID in single quotes ('). Single quotes can be > >>> used for > >>> all IDs and are removed when parsing in. > >>> 4. It is customary to have two spaces between the ID and the sequence. > >>> > >>> This custom seems to have come into PHYLIP format from Nexus. > >>> Note that this allows sequences in a file to start at different > >>> columns. > >>> > >>> Can anyone shed more light into matter? > >>> > >>> > >>> I need to get this into bioperl as the names in HIV sequences that I > >>> work with > >>> are very long and can not be sensibly truncated. > >>> > >>> What would be the best way to do this? > >>> 1. Add more options to the already heavily > >>> hacked Bio::AlignIO::phylip.pm > >>> 2. Create a Bio::AlignIO::phyliplong.pm > >>> > >>> Do those ugly hacks for supporting fixed length long IDs really > >>> really belong > >>> in the vanilla phylip.pm file? > >>> > >>> Opinions? > >>> > >>> -Heikki > >>> > >>> --______ _/ > >>> _/_____________________________________________________ > >>> _/ _/ > >>> _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za > >>> _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho > >>> _/ _/ _/ SANBI, South African National Bioinformatics Institute > >>> _/ _/ _/ University of Western Cape, South Africa > >>> _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 > >>> ___ _/_/_/_/_/________________________________________________________ > >>> _______________________________________________ > >>> Bioperl-l mailing list > >>> Bioperl-l at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > >> Christopher Fields > >> Postdoctoral Researcher > >> Lab of Dr. 
Marie-Claude Hofmann > >> College of Veterinary Medicine > >> University of Illinois Urbana-Champaign > >> > >> > >> > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho _/ _/ _/ SANBI, South African National Bioinformatics Institute _/ _/ _/ University of Western Cape, South Africa _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 ___ _/_/_/_/_/________________________________________________________ 
From dave.burt at roslin.ed.ac.uk Thu May 29 04:10:31 2008 From: dave.burt at roslin.ed.ac.uk (dave burt (RI)) Date: Thu, 29 May 2008 09:10:31 +0100 Subject: [Bioperl-l] bio-alignio-maf Message-ID: <1F16910BB8546C4DA5526FABB0C98D09EFA694@ebre2ksrv1.ebrc.bbsrc.ac.uk> Dear All, Testing a simple script #!/bin/perl.exe use strict; use warnings; use Bio::AlignIO; my $alignment_file = "test1.maf"; printf STDERR "%s\n", $alignment_file; my $alignio = Bio::AlignIO->new( -file => $alignment_file, -format => 'maf'); while(my $aln = $alignio->next_aln()){ my $match_line = $aln->match_line; print $aln, "\n"; print $aln->length, "\n"; print $aln->no_residues, "\n"; print $aln->is_flush, "\n"; print $aln->no_sequences, "\n"; $aln->splice_by_seq_pos(1); print $aln->consensus_string(60), "\n"; print $aln->get_seq_by_pos(1)->seq, "\n"; print $aln->match_line(), "\n"; print "\n"; } exit(); Note: test1.maf is attached Problem: the while loop is never enetered - any ideas? Dave -------------- next part -------------- A non-text attachment was scrubbed... 
Name: test1.maf Type: application/octet-stream Size: 1721 bytes Desc: test1.maf URL: From bix at sendu.me.uk Thu May 29 09:52:26 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 29 May 2008 14:52:26 +0100 Subject: [Bioperl-l] Installation problem In-Reply-To: <273360f60805290646w2541097es80c7798d81e8ed0c@mail.gmail.com> References: <273360f60805280710l154f982dyc46b42823fec6ab2@mail.gmail.com> <483D7112.4080403@sendu.me.uk> <273360f60805290646w2541097es80c7798d81e8ed0c@mail.gmail.com> Message-ID: <483EB51A.1080801@sendu.me.uk> calum robb wrote: > hi. Please include the list when you reply. > i can get to this step on the installation process --- > 11) Type 'perl Build.PL' and answer the questions appropriately. > > but when i type this, i get the message, can't open perl script > "build.pl": No such file or directory. You should use the ActivePerl Perl Package Manager to install BioPerl. If for some reason you absolutely have to do a manual installation, your error is most likely caused by you not being in the correct directory. From bosborne11 at verizon.net Thu May 29 11:11:55 2008 From: bosborne11 at verizon.net (Brian Osborne) Date: Thu, 29 May 2008 11:11:55 -0400 Subject: [Bioperl-l] bio-alignio-maf In-Reply-To: <1F16910BB8546C4DA5526FABB0C98D09EFA694@ebre2ksrv1.ebrc.bbsrc.ac.uk> References: <1F16910BB8546C4DA5526FABB0C98D09EFA694@ebre2ksrv1.ebrc.bbsrc.ac.uk> Message-ID: Dave, Try your script on some *maf files in the Bioperl package instead of test1.maf: bioperl-live/t/data/bug2453.maf bioperl-live/t/data/humor.maf What is the result? Brian O. 
On May 29, 2008, at 4:10 AM, dave burt (RI) wrote: > Dear All, > > Testing a simple script > > #!/bin/perl.exe > > use strict; > use warnings; > use Bio::AlignIO; > > my $alignment_file = "test1.maf"; > > printf STDERR "%s\n", $alignment_file; > > my $alignio = Bio::AlignIO->new( -file => $alignment_file, -format => > 'maf'); > > while(my $aln = $alignio->next_aln()){ > my $match_line = $aln->match_line; > > print $aln, "\n"; > > print $aln->length, "\n"; > print $aln->no_residues, "\n"; > print $aln->is_flush, "\n"; > print $aln->no_sequences, "\n"; > > $aln->splice_by_seq_pos(1); > > print $aln->consensus_string(60), "\n"; > print $aln->get_seq_by_pos(1)->seq, "\n"; > print $aln->match_line(), "\n"; > > print "\n"; > } > > exit(); > > Note: test1.maf is attached > > Problem: the while loop is never enetered - any ideas? > > Dave > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jason at bioperl.org Thu May 29 12:37:32 2008 From: jason at bioperl.org (Jason Stajich) Date: Thu, 29 May 2008 11:37:32 -0500 Subject: [Bioperl-l] Bio::SearchIO::megablast References: <001101c8c163$906acaa0$55fe010a@LENOVOC4FA0004> Message-ID: <02E202A0-6F0F-422B-994A-E6FA7B8A1831@bioperl.org> Please ask questions on the mailing list. There is a link to speed issues in Bio::SearchIO on the wiki. Depending on what you are doing, you can dump out data as tabular format when you run BLAST. 
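As a rough illustration of the tabular route suggested above, here is a minimal sketch that reads tabular BLAST output without any BioPerl objects. It assumes the usual 12-column tab-separated layout (for example what NCBI blastall writes with -m 8); the column names, the parse_tab_line helper and the sample line are made up for this example and are not part of Bio::SearchIO.

```perl
use strict;
use warnings;

# Column names for the standard 12-column tabular BLAST layout
# (names are this sketch's own choice, not an official vocabulary).
my @COLS = qw(query subject pct_identity aln_length mismatches gap_opens
              q_start q_end s_start s_end evalue bit_score);

# Parse one line of tabular output into a hash ref.
# Returns undef for blank lines, '#' comment lines and malformed lines.
sub parse_tab_line {
    my ($line) = @_;
    chomp $line;
    return if $line =~ /^\s*(?:#|$)/;
    my @fields = split /\t/, $line;
    return unless @fields == @COLS;
    my %hit;
    @hit{@COLS} = @fields;
    return \%hit;
}

# Hypothetical example line (invented IDs and numbers).
my $hit = parse_tab_line(
    "read1\tchr2\t98.50\t200\t3\t0\t1\t200\t5001\t5200\t1e-100\t370\n");
printf "%s vs %s: %.1f%% identity over %d columns\n",
    @{$hit}{qw(query subject pct_identity aln_length)};
```

Splitting lines like this avoids building a result object per hit, which is usually where the time goes on very large reports; the trade-off is that only the twelve summary fields are available, not the alignments themselves.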
-jason Begin forwarded message: > From: "Thomas Nussbaumer" > Date: May 29, 2008 3:11:14 AM CDT > To: > Subject: Bio::SearchIO::megablast > > hi, > is there any possibility to > > a) > improve the performance, make search faster, > currently i am using a .txt file which contains the hsp > under XP > > b) > have to used this module with more than ~1000000 > how long did it take to parse the file > > best regards > Thomas From dave.burt at roslin.ed.ac.uk Thu May 29 12:46:10 2008 From: dave.burt at roslin.ed.ac.uk (dave burt (RI)) Date: Thu, 29 May 2008 17:46:10 +0100 Subject: [Bioperl-l] bio-alignio-maf References: <1F16910BB8546C4DA5526FABB0C98D09EFA694@ebre2ksrv1.ebrc.bbsrc.ac.uk> Message-ID: <1F16910BB8546C4DA5526FABB0C98D09015CA471@ebre2ksrv1.ebrc.bbsrc.ac.uk> Brian - good idea! Worked fine. The problem was traced to the header line. It should be ##maf version=1 scoring=zero and not the following, which is included in the file downloaded from biomart: ##maf version=1 #Tue Apr 29 16:49:45 2008 #The start coordinate is a zero-based number. #For segments in the negative strand, the start #is relative to the end of the chromosome. Please, refer to #http://genome.ucsc.edu/FAQ/FAQformat#format5 for a #description of this file format. A simple solution, as it is 99% of the time. Many thanks for the clues. Dave ________________________________ From: Brian Osborne [mailto:bosborne11 at verizon.net] Sent: Thu 29/05/2008 16:11 To: dave burt (RI) Cc: bioperl-l at lists.open-bio.org Subject: Re: [Bioperl-l] bio-alignio-maf Dave, Try your script on some *maf files in the Bioperl package instead of test1.maf: bioperl-live/t/data/bug2453.maf bioperl-live/t/data/humor.maf What is the result? Brian O. 
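The header difference described above can be worked around before parsing. The following is only a sketch under assumptions: the thread does not say whether the parser stumbled on the missing scoring= field or on the extra '#' comment lines, so the hypothetical normalize_maf_header helper below handles both, and the sample alignment block is invented.

```perl
use strict;
use warnings;

# Rewrite the header of a MAF document held in a string:
#  - append "scoring=zero" to the "##maf" line if it has no scoring field
#    (assumption: copied from the header that worked in this thread)
#  - drop single-'#' comment lines that precede the first alignment block
sub normalize_maf_header {
    my ($text) = @_;
    my @out;
    my $in_header = 1;
    for my $line (split /\n/, $text, -1) {
        if ($in_header) {
            if ($line =~ /^##maf\b/) {
                $line .= ' scoring=zero' unless $line =~ /\bscoring=/;
            }
            elsif ($line =~ /^#/) {
                next;               # header comment: skip it
            }
            elsif ($line =~ /^a(?:\s|$)/) {
                $in_header = 0;     # first alignment block reached
            }
        }
        push @out, $line;
    }
    return join "\n", @out;
}

# Invented two-sequence block with a biomart-style header.
my $maf = <<'END';
##maf version=1
#Tue Apr 29 16:49:45 2008
#The start coordinate is a zero-based number.

a score=0
s seqA 0 4 + 10 ACGT
s seqB 2 4 + 12 ACGT

END
print normalize_maf_header($maf);
```

The cleaned text can be written to a new file and handed to the original script unchanged, which keeps the workaround separate from the parsing code.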
On May 29, 2008, at 4:10 AM, dave burt (RI) wrote: > Dear All, > > Testing a simple script > > #!/bin/perl.exe > > use strict; > use warnings; > use Bio::AlignIO; > > my $alignment_file = "test1.maf"; > > printf STDERR "%s\n", $alignment_file; > > my $alignio = Bio::AlignIO->new( -file => $alignment_file, -format => > 'maf'); > > while(my $aln = $alignio->next_aln()){ > my $match_line = $aln->match_line; > > print $aln, "\n"; > > print $aln->length, "\n"; > print $aln->no_residues, "\n"; > print $aln->is_flush, "\n"; > print $aln->no_sequences, "\n"; > > $aln->splice_by_seq_pos(1); > > print $aln->consensus_string(60), "\n"; > print $aln->get_seq_by_pos(1)->seq, "\n"; > print $aln->match_line(), "\n"; > > print "\n"; > } > > exit(); > > Note: test1.maf is attached > > Problem: the while loop is never enetered - any ideas? > > Dave > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Thu May 29 14:48:28 2008 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 29 May 2008 13:48:28 -0500 Subject: [Bioperl-l] bio-alignio-maf In-Reply-To: <1F16910BB8546C4DA5526FABB0C98D09015CA471@ebre2ksrv1.ebrc.bbsrc.ac.uk> References: <1F16910BB8546C4DA5526FABB0C98D09EFA694@ebre2ksrv1.ebrc.bbsrc.ac.uk> <1F16910BB8546C4DA5526FABB0C98D09015CA471@ebre2ksrv1.ebrc.bbsrc.ac.uk> Message-ID: <27E7E283-D138-47FA-AA4D-723149A41838@uiuc.edu> If this is MAF output from biomart we should probably try to support it directly. I'll try to take a look at it. chris On May 29, 2008, at 11:46 AM, dave burt (RI) wrote: > Brain > > - good idea ! worked fine > > - the problem was traced to the header line > > - should be > > ##maf version=1 scoring=zero > > - and not the following, which is included in the biomart downloaded > file > > ##maf version=1 > #Tue Apr 29 16:49:45 2008 > #The start coordinate is a zero-based number. 
> #For segments in the negative strand, the start > #is relative to the end of the chromosome. Please, refer to > #http://genome.ucsc.edu/FAQ/FAQformat#format5 for a > #description of this file format. > > simple solution - usually is 99% of time > > many thanks for the clues > > Dave > > > > ________________________________ > > From: Brian Osborne [mailto:bosborne11 at verizon.net] > Sent: Thu 29/05/2008 16:11 > To: dave burt (RI) > Cc: bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] bio-alignio-maf > > > > Dave, > > Try your script on some *maf files in the Bioperl package instead of > test1.maf: > > bioperl-live/t/data/bug2453.maf > bioperl-live/t/data/humor.maf > > What is the result? > > > Brian O. > > > > On May 29, 2008, at 4:10 AM, dave burt (RI) wrote: > >> Dear All, >> >> Testing a simple script >> >> #!/bin/perl.exe >> >> use strict; >> use warnings; >> use Bio::AlignIO; >> >> my $alignment_file = "test1.maf"; >> >> printf STDERR "%s\n", $alignment_file; >> >> my $alignio = Bio::AlignIO->new( -file => $alignment_file, -format => >> 'maf'); >> >> while(my $aln = $alignio->next_aln()){ >> my $match_line = $aln->match_line; >> >> print $aln, "\n"; >> >> print $aln->length, "\n"; >> print $aln->no_residues, "\n"; >> print $aln->is_flush, "\n"; >> print $aln->no_sequences, "\n"; >> >> $aln->splice_by_seq_pos(1); >> >> print $aln->consensus_string(60), "\n"; >> print $aln->get_seq_by_pos(1)->seq, "\n"; >> print $aln->match_line(), "\n"; >> >> print "\n"; >> } >> >> exit(); >> >> Note: test1.maf is attached >> >> Problem: the while loop is never enetered - any ideas? 
>> >> Dave >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri May 30 07:16:56 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 30 May 2008 12:16:56 +0100 Subject: [Bioperl-l] Installation problem In-Reply-To: <273360f60805300410k40485801ydbac75cc41da468b@mail.gmail.com> References: <273360f60805280710l154f982dyc46b42823fec6ab2@mail.gmail.com> <483D7112.4080403@sendu.me.uk> <273360f60805290646w2541097es80c7798d81e8ed0c@mail.gmail.com> <483EB51A.1080801@sendu.me.uk> <273360f60805300410k40485801ydbac75cc41da468b@mail.gmail.com> Message-ID: <483FE228.8010307@sendu.me.uk> calum robb wrote: > Hi again. Do not reply to me directly. Reply to the BioPerl mailing list. > I can nearly install bioperl 1.5.2_102, but this message comes up near > the end of the installation process:- > > cannot create symbolic link named bp_pg_bulk_load_gff.pl on your system > for bp_bulk_load_gff.pl in C:\Perl\site\bin. I don't know, but perhaps this is harmless. Just try and see if a BioPerl-using script now works. 
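One way to "try and see" is a small script that only attempts to load a few modules and reports what it finds. This is a minimal sketch; the module_version helper is hypothetical and the list of modules checked is an arbitrary choice, so adjust it to whatever your own scripts use.

```perl
use strict;
use warnings;

# Try to load a module at run time. Returns its version string,
# 'unknown version' if it loads without one, or undef if it cannot be loaded.
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return unless eval { require $file; 1 };
    my $version = $module->VERSION;
    return defined $version ? "$version" : 'unknown version';
}

# Arbitrary selection of BioPerl modules to probe.
for my $module (qw(Bio::Root::Version Bio::SeqIO Bio::AlignIO)) {
    my $version = module_version($module);
    printf "%-20s %s\n", $module,
        defined $version ? "ok ($version)" : 'NOT FOUND';
}
```

If every line reports ok, the libraries are on Perl's search path and a missing convenience link in the bin directory should not matter for ordinary scripts.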
From cain.cshl at gmail.com Fri May 30 09:10:06 2008 From: cain.cshl at gmail.com (Scott Cain) Date: Fri, 30 May 2008 09:10:06 -0400 Subject: [Bioperl-l] Installation problem In-Reply-To: <483FE228.8010307@sendu.me.uk> References: <273360f60805280710l154f982dyc46b42823fec6ab2@mail.gmail.com> <483D7112.4080403@sendu.me.uk> <273360f60805290646w2541097es80c7798d81e8ed0c@mail.gmail.com> <483EB51A.1080801@sendu.me.uk> <273360f60805300410k40485801ydbac75cc41da468b@mail.gmail.com> <483FE228.8010307@sendu.me.uk> Message-ID: <1212153006.6480.15.camel@frissell> It is harmless--that symbolic link is just a convenience for the 10 or so people who use a Bio::DB::GFF database with a PostgreSQL server. It used to be that the PostgreSQL loader was a separate script, but the two loaders were folded into one some time ago. Scott On Fri, 2008-05-30 at 12:16 +0100, Sendu Bala wrote: > calum robb wrote: > > Hi again. > > Do not reply to me directly. Reply to the BioPerl mailing list. > > > > I can nearly install bioperl 1.5.2_102, but this message comes up near > > the end of the installation process:- > > > > cannot create symbolic link named bp_pg_bulk_load_gff.pl on your system > > for bp_bulk_load_gff.pl in C:\Perl\site\bin. > > I don't know, but perhaps this is harmless. Just try and see if a > BioPerl-using script now works. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ------------------------------------------------------------------------ Scott Cain, Ph. D. 
cain at cshl.edu GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory From mmokrejs at ribosome.natur.cuni.cz Sat May 31 07:10:53 2008 From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Sat, 31 May 2008 13:10:53 +0200 Subject: [Bioperl-l] extending the PHYLIP format In-Reply-To: <358f4d650805280331h7b1387cat76794d9135f7577d@mail.gmail.com> References: <200805281023.51697.heikki@sanbi.ac.za> <358f4d650805280331h7b1387cat76794d9135f7577d@mail.gmail.com> Message-ID: <4841323D.10307@ribosome.natur.cuni.cz> BTW, fixing the truncated IDs could be done also using t-coffee, at least it is described in the docs: http://www.tcoffee.org/Documentation/t_coffee/t_coffee_tutorial.htm#_Toc148261714 Albert Vilella wrote: > Hi Heikki, > > About a year ago, some code was added to deal with "the more than 10 chars" > ids > problem. ( > https://www.nescent.org/wg_phyloinformatics/Phylohackathon_1/BioPerl_Targets) > > > Basically: (1) mapping the long ids to 10-char numeric ids, (2) running the > program > with the id limitation, (3) reverting the ids back to the originals in the > output. The pods explain how to do it. > > So I would say that the solution is at least "partially" there :-) > > Albert. > > On Wed, May 28, 2008 at 9:23 AM, Heikki Lehvaslaiho > wrote: > >> I just learned that a number of phylogenetics packages (PAUP, PHYML, Mr >> Bayes >> at least ) now allow longer than 10 character IDs in PHYLIP format. The >> documentation is scarce but the rules seem to be: >> >> 1. There can be spaces before the ID. >> 2. The ID can be up to 50 characters long. >> 3. ID can contain any characters. If you are using spaces within the ID, >> you >> have to put the whole ID in single quotes ('). Single quotes can be used >> for >> all IDs and are removed when parsing in. >> 4. It is customary to have two spaces between the ID and the sequence. >> >> This custom seems to have come into PHYLIP format from Nexus. 
>> Note that this allows sequences in a file to start at different columns. >> >> Can anyone shed more light into matter? >> >> >> I need to get this into bioperl as the names in HIV sequences that I work >> with >> are very long and can not be sensibly truncated. >> >> What would be the best way to do this? >> 1. Add more options to the already heavily >> hacked Bio::AlignIO::phylip.pm >> 2. Create a Bio::AlignIO::phyliplong.pm >> >> Do those ugly hacks for supporting fixed length long IDs really really >> belong >> in the vanilla phylip.pm file? >> >> Opinions? >> >> -Heikki From jason at bioperl.org Sat May 31 17:53:34 2008 From: jason at bioperl.org (Jason Stajich) Date: Sat, 31 May 2008 16:53:34 -0500 Subject: [Bioperl-l] Fwd: BPlite References: <48412383.5080201@ucsf.edu> Message-ID: Begin forwarded message: > From: Anatoly Urisman > Date: May 31, 2008 5:08:03 AM CDT > To: jason... > Subject: BPlite > > Hi Jason, > I was wondering if you are aware of a fix to the BPlite.pm module > that supports the new NCBI blastall output (i.e. reports are not > delimited by something like BLASTN 2.2.8 [Jan-05-2004]). > Thanks. > Anatoly Urisman, MD-PhD