From shachigahoimbi at gmail.com Wed Jun 1 05:25:50 2011
From: shachigahoimbi at gmail.com (Shachi Gahoi)
Date: Wed, 1 Jun 2011 14:55:50 +0530
Subject: [Bioperl-l] bioperl graphics
Message-ID:

Dear All,

I have one sequence,
'ACCTTCTGTTTTCAGCAAGTAGGGTCTTATAACCTTCAAAGAAATATTCCTTCAA',
and I want to highlight (graphically) the portion of this sequence
from position 15 to 35. How can I do this with BioPerl?

If anyone knows, please tell me.

Thanks in advance.

--
Regards,
Shachi

From roy.chaudhuri at gmail.com Wed Jun 1 06:17:33 2011
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Wed, 01 Jun 2011 11:17:33 +0100
Subject: [Bioperl-l] Error calling alignment method
In-Reply-To: <4DE53BA5.1020803@upvnet.upv.es>
References: <4DDD4337.3030107@upvnet.upv.es> <4DDD51CD.90208@upvnet.upv.es> <4DDD8155.5060002@upvnet.upv.es> <4DE52529.2020403@gmail.com> <4DE53BA5.1020803@upvnet.upv.es>
Message-ID: <4DE611BD.5070305@gmail.com>

Again, works fine for me (after changing the file and directory
locations, and uncommenting the subroutine calls). Which BioPerl
version are you using?

On 31/05/2011 20:04, Lorenzo Carretero Paulet wrote:
> Thanks for the reply,
> I attach again the revised version of the script I'm working with as
> well as the sequences. I still have the same error. The funny thing is
> that the yn00 subroutine runs properly when the alignment is
> passed directly, but the codeml one does not (using either PAML 4.2
> or 4.4). However, both PAML versions and programs work fine with my
> data when I run them manually.
> Any suggestion will be much appreciated.
> Cheers,
> Lorenzo
>
> On 31/05/11 19:28, Roy Chaudhuri wrote:
>> Hi Lorenzo,
>>
>> I tried your code (the one you attached as testa.pl), and the only
>> errors reported were uninitialized values of $Ka at lines 90
>> and 150 when you print the output. This is because of typos in your
>> script: you have "dA" instead of "dN" (PAML uses the terms "dN" and
>> "dS" for Ka and Ks, respectively).
>>
>> I can only think that the problem you are experiencing is because of
>> some change to the PAML output format (although it worked fine for me
>> with a just-downloaded PAML4.4 and an older PAML4). From what I
>> recall, PAML always did have quite volatile output formats. Older
>> versions of PAML are archived, so you could try downgrading:
>> http://abacus.gene.ucl.ac.uk/software/pamlOld.html
>>
>> Cheers,
>> Roy.
>>
>> On 25/05/2011 23:23, Lorenzo Carretero wrote:
>>> Dave, Jason:
>>>
>>> I had already tried running PAML manually with the alignment (I always
>>> do this to confirm software is properly installed and set up), and ran
>>> it again with an edited version of the alignment with the stop codons
>>> removed (I didn't know that stop codons at the ends of the alignment
>>> could affect PAML, only in-frame stop codons). It worked properly in
>>> both cases. I ran my script again (see attached testa.pl) using two
>>> different methods, one constructing the codon alignment using
>>> aa_to_dna_aln and another one passing the aligned sequences (in both
>>> cases after removing the stop codons). I again got the message:
>>>
>>> ------------- EXCEPTION: Bio::Root::NotImplemented -------------
>>> MSG: Unknown format of PAML output did not see seqtype
>>> STACK: Error::throw
>>> STACK: Bio::Root::Root::throw /Library/Perl/5.10.0/Bio/Root/Root.pm:368
>>> STACK: Bio::Tools::Phylo::PAML::_parse_summary
>>> /Library/Perl/5.10.0/Bio/Tools/Phylo/PAML.pm:461
>>> STACK: Bio::Tools::Phylo::PAML::next_result
>>> /Library/Perl/5.10.0/Bio/Tools/Phylo/PAML.pm:270
>>> STACK: main::GettingBioperlAlignmentAAtoDNAplusPAMLcalculation
>>> /Users/Lorenzo/Documents/workspace/PlantEvolGen/testa.pl:83
>>> STACK: /Users/Lorenzo/Documents/workspace/PlantEvolGen/testa.pl:23
>>> ----------------------------------------------------------------
>>>
>>> Thanks,
>>> Lorenzo
>>>
>>> On 5/25/11 10:24 PM, Jason Stajich wrote:
>>>>> ------------------------------------------
>>>>>
>>>>> I think the codon alignment is being properly constructed by the
>>>>> method aa_to_dna_aln, as I can do a Dumper printing of it. So the
>>>>> problem must be in the PAML codeml wrapper not properly recognizing
>>>>> the codon alignment. Could it be related to the alignment format
>>>>> (PAML runs on PHYLIP formatted files)?
>>>> The writing out in phylip format is taken care of by the factory -
>>>> you are passing in an alignment object so that is not typically the
>>>> problem.
>>>>
>>>> I would repeat Dave's idea that you just dump the codon alignment
>>>> file out and run PAML manually with it. The parsing error
>>>> sounds like there are problems when running PAML and you may want to
>>>> check that you don't have stop codons in your alignment. It looks
>>>> like your CDS file has stops as the last codon so if you drop those
>>>> last 3 bases, how does it work?
>>>>
>>>>> Cheers,
>>>>> Lorenzo
>>>>>
>>>>> --
>>>>> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
>>>>> Lorenzo Carretero Paulet
>>>>> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV)
>>>>> Integrative Systems Biology Group
>>>>> C/ Ingeniero Fausto Elio s/n.
>>>>> 46022 Valencia, Spain
>>>>>
>>>>> Phone: +34 963879934
>>>>> Fax: +34 963877859
>>>>> e-mail: locarpau at upvnet.upv.es
>>>>> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From locarpau at upvnet.upv.es Wed Jun 1 06:52:51 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Wed, 01 Jun 2011 12:52:51 +0200
Subject: [Bioperl-l] Error calling alignment method
In-Reply-To: <4DE611BD.5070305@gmail.com>
References: <4DDD4337.3030107@upvnet.upv.es> <4DDD51CD.90208@upvnet.upv.es> <4DDD8155.5060002@upvnet.upv.es> <4DE52529.2020403@gmail.com> <4DE53BA5.1020803@upvnet.upv.es> <4DE611BD.5070305@gmail.com>
Message-ID: <4DE61A03.3080904@upvnet.upv.es>

perl -MBio::Root::Version -e 'print $Bio::Root::Version::VERSION,"\n"' : 1.006001

Thanks,
Lorenzo

--
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
Lorenzo Carretero Paulet
Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV)
Integrative Systems Biology Group
C/ Ingeniero Fausto Elio s/n.
46022 Valencia, Spain

Phone: +34 963879934
Fax: +34 963877859
e-mail: locarpau at upvnet.upv.es
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

From cjfields at illinois.edu Wed Jun 1 09:31:37 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 1 Jun 2011 08:31:37 -0500
Subject: [Bioperl-l] Error calling alignment method
In-Reply-To: <4DE61A03.3080904@upvnet.upv.es>
References: <4DDD4337.3030107@upvnet.upv.es> <4DDD51CD.90208@upvnet.upv.es> <4DDD8155.5060002@upvnet.upv.es> <4DE52529.2020403@gmail.com> <4DE53BA5.1020803@upvnet.upv.es> <4DE611BD.5070305@gmail.com> <4DE61A03.3080904@upvnet.upv.es>
Message-ID: <4E30E318-015C-4B84-977D-C8A9D9D2CC80@illinois.edu>

I think Dave made some changes after 1.6.1; try upgrading to the latest
version on CPAN (1.6.901):

http://search.cpan.org/~cjfields/BioPerl-1.6.901/

chris

From lvu.jun at gmail.com Wed Jun 1 08:30:07 2011
From: lvu.jun at gmail.com (lvu.jun)
Date: Wed, 1 Jun 2011 20:30:07 +0800
Subject: [Bioperl-l] questions about the bioperl module Bio::PopGen::Statistics
Message-ID: <201106012030039537050@gmail.com>

Hi there,

I am trying to calculate population genetics parameters such as pi
using the BioPerl module Bio::PopGen::Statistics. But I found that the
method only requires the marker genotypes of every individual in the
population as input. I don't know why the module does not take the DNA
sequence length into consideration when calculating the pi value.
According to the definition of pi, besides the polymorphic sites, the
monomorphic sites should also be counted in the denominator. Is that
right? I'm therefore confused about the module: can anyone tell me how
it can correctly calculate pi from only the marker (polymorphic)
genotypes?

Another question: if I want to calculate pi in sliding windows along
the genome, how can I do this using the Bio::PopGen::Statistics module?

Thanks for your help!

Yours sincerely,
Jun
Chinese Academy of Sciences
2011-06-01

lvu.jun

From alicarea at gmail.com Wed Jun 1 10:32:57 2011
From: alicarea at gmail.com (Sophie)
Date: Wed, 1 Jun 2011 15:32:57 +0100
Subject: [Bioperl-l] Problem installing BioPerl
Message-ID:

Hi there,

I've been trying to install BioPerl 1.6.1 on my Mac (OS X), but no
matter which method I use (CPAN, Build.PL, etc.), I can't install it,
even when I try to force it. Should I try an older version of BioPerl?

Thank you,
Best Regards,
Sophie.
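
On the pi question in Jun's message above: when nucleotide diversity is summed over sites, every monomorphic site contributes zero, so the total can be computed from the polymorphic markers alone; the sequence length only enters when that total is divided to give a per-site value. The sketch below illustrates this arithmetic in plain Python. It is not BioPerl code and does not reproduce Bio::PopGen::Statistics; the function names, the 20-bp length and the toy genotypes are all made up for illustration, and it assumes one allele per sampled chromosome with no missing data.

```python
from itertools import combinations


def site_diversity(alleles):
    """Fraction of sample pairs that carry different alleles at one site.

    Needs at least two alleles (one per sampled chromosome).
    """
    pairs = list(combinations(alleles, 2))
    return sum(a != b for a, b in pairs) / len(pairs)


def pi_total(sites):
    """Average number of pairwise differences, summed over the given sites.

    `sites` maps a 1-based position to the alleles observed there
    (a hypothetical input layout chosen for this sketch).
    """
    return sum(site_diversity(column) for column in sites.values())


def pi_per_site(sites, length):
    """Total pi divided by the full sequence length, monomorphic sites included."""
    return pi_total(sites) / length


def sliding_pi(sites, length, window, step):
    """Per-site pi for windows [start, start + window - 1] along the sequence."""
    results = []
    for start in range(1, length - window + 2, step):
        end = start + window - 1
        total = sum(site_diversity(column)
                    for pos, column in sites.items() if start <= pos <= end)
        results.append((start, end, total / window))
    return results


# Toy data: 4 chromosomes, a 20-bp region, two polymorphic sites.
polymorphic = {3: ["A", "A", "G", "G"], 12: ["C", "T", "T", "T"]}
# Same data plus an explicitly listed monomorphic site.
with_monomorphic = dict(polymorphic)
with_monomorphic[7] = ["A", "A", "A", "A"]

# The monomorphic site adds 0, so both totals are equal.
print(pi_total(polymorphic), pi_total(with_monomorphic))
# The length only matters for the per-site value.
print(pi_per_site(polymorphic, 20))
# Two non-overlapping 10-bp windows.
print(sliding_pi(polymorphic, 20, window=10, step=10))
```

Per-window values come from summing only the sites that fall inside each window and dividing by the window width. Whether the module accepts a site count directly, or needs a wrapper like this written by hand around its total, is worth checking in its POD.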
From scott at scottcain.net Wed Jun 1 10:40:54 2011 From: scott at scottcain.net (Scott Cain) Date: Wed, 1 Jun 2011 10:40:54 -0400 Subject: [Bioperl-l] Problem installing BioPerl In-Reply-To: References: Message-ID: Hi Sophie, If anything, I'd suggest a newer version (the current version is 1.6.901), and I'd also suggest indicating what the problem is; there might be a simple fix. Scott On Wed, Jun 1, 2011 at 10:32 AM, Sophie wrote: > Hi there, > > I've been trying to install BioPerl 1.6.1 on my Mac, (OS X), but no matter > which method I use, CPAN, Buil.PL, etc, I can't install it, even when I try > to force it. Should I try an older version of BioPerl? > > Thank you, > Best Regards, > Sophie. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- ------------------------------------------------------------------------ Scott Cain, Ph. D.? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?? scott at scottcain dot net GMOD Coordinator (http://gmod.org/)? ? ? ? ? ? ? ? ? ?? 216-392-3087 Ontario Institute for Cancer Research From scott at scottcain.net Wed Jun 1 10:49:49 2011 From: scott at scottcain.net (Scott Cain) Date: Wed, 1 Jun 2011 10:49:49 -0400 Subject: [Bioperl-l] bioperl graphics In-Reply-To: References: Message-ID: Hello Shachi, Can you describe in more detail what it is you want to see? And is this something that you'll want to generally do for any number of random sequences? How do you identify the region you want to highlight? Scott On Wed, Jun 1, 2011 at 5:25 AM, Shachi Gahoi wrote: > Dear All, > > I have one sequence > 'ACCTTCTGTTTTCAGCAAGTAGGGTCTTATAACCTTCAAAGAAATATTCCTTCAA' > > and now I want to highlight (graphically), ?sequence portion of this > sequence starting from position 15 to 35. How can i do this with bioperl. > > If anyone know please tell me. > > > Thanks in advance. 
> > -- > Regards, > Shachi > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- ------------------------------------------------------------------------ Scott Cain, Ph. D.? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?? scott at scottcain dot net GMOD Coordinator (http://gmod.org/)? ? ? ? ? ? ? ? ? ?? 216-392-3087 Ontario Institute for Cancer Research From cjfields at illinois.edu Wed Jun 1 11:04:00 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 1 Jun 2011 10:04:00 -0500 Subject: [Bioperl-l] Problem installing BioPerl In-Reply-To: References: Message-ID: <96F5C958-86C9-462F-B9DA-BABBB6C4D6C2@illinois.edu> I agree with Scott, in that you should use the latest version on CPAN. It's very possible it will fix some of the issues you are seeing, as the build step has been simplified substantially to get around bioperl-specific API conflicts with Module::Build. So far, v 1.6.901 is installing on most OS's, including darwin: http://matrix.cpantesters.org/?dist=BioPerl+1.6.901 However, please let us know more details if you still run into problems. chris On Jun 1, 2011, at 9:40 AM, Scott Cain wrote: > Hi Sophie, > > If anything, I'd suggest a newer version (the current version is > 1.6.901), and I'd also suggest indicating what the problem is; there > might be a simple fix. > > Scott > > > On Wed, Jun 1, 2011 at 10:32 AM, Sophie wrote: >> Hi there, >> >> I've been trying to install BioPerl 1.6.1 on my Mac, (OS X), but no matter >> which method I use, CPAN, Buil.PL, etc, I can't install it, even when I try >> to force it. Should I try an older version of BioPerl? >> >> Thank you, >> Best Regards, >> Sophie. 
>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > > -- > ------------------------------------------------------------------------ > Scott Cain, Ph. D. scott at scottcain dot net > GMOD Coordinator (http://gmod.org/) 216-392-3087 > Ontario Institute for Cancer Research > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From alicarea at gmail.com Wed Jun 1 11:17:26 2011 From: alicarea at gmail.com (Sophie) Date: Wed, 1 Jun 2011 16:17:26 +0100 Subject: [Bioperl-l] Problem installing BioPerl In-Reply-To: <96F5C958-86C9-462F-B9DA-BABBB6C4D6C2@illinois.edu> References: <96F5C958-86C9-462F-B9DA-BABBB6C4D6C2@illinois.edu> Message-ID: Hi Chris, Scott, Thank you for the quick response, I tried the latest version, and I get this on the terminal: ERROR: Can't create '/usr/local/bin' Do not have write permissions on '/usr/local/bin' !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! at /Users/sophie/.cpan/build/BioPerl-1.6.901-yH47zc/Bio/Root/Build.pm line 853 CJFIELDS/BioPerl-1.6.901.tar.gz ./Build install -- NOT OK Warning (usually harmless): 'YAML' not installed, will not store persistent state Failed during this command: CJFIELDS/BioPerl-1.6.901.tar.gz : install NO I can't think of any other solution than to chmod 777 the /usr/local/bin but I really didn't want to do that. Thanks in advance for your help, Sophie On 1 June 2011 16:04, Chris Fields wrote: > I agree with Scott, in that you should use the latest version on CPAN. > It's very possible it will fix some of the issues you are seeing, as the > build step has been simplified substantially to get around bioperl-specific > API conflicts with Module::Build. 
So far, v 1.6.901 is installing on most > OS's, including darwin: > > http://matrix.cpantesters.org/?dist=BioPerl+1.6.901 > > However, please let us know more details if you still run into problems. > > chris > > On Jun 1, 2011, at 9:40 AM, Scott Cain wrote: > > > Hi Sophie, > > > > If anything, I'd suggest a newer version (the current version is > > 1.6.901), and I'd also suggest indicating what the problem is; there > > might be a simple fix. > > > > Scott > > > > > > On Wed, Jun 1, 2011 at 10:32 AM, Sophie wrote: > >> Hi there, > >> > >> I've been trying to install BioPerl 1.6.1 on my Mac, (OS X), but no > matter > >> which method I use, CPAN, Buil.PL, etc, I can't install it, even when I > try > >> to force it. Should I try an older version of BioPerl? > >> > >> Thank you, > >> Best Regards, > >> Sophie. > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > > > > > > > > -- > > ------------------------------------------------------------------------ > > Scott Cain, Ph. D. 
scott at scottcain > dot net > > GMOD Coordinator (http://gmod.org/) 216-392-3087 > > Ontario Institute for Cancer Research > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From locarpau at upvnet.upv.es Wed Jun 1 11:37:43 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet) Date: Wed, 01 Jun 2011 17:37:43 +0200 Subject: [Bioperl-l] Error calling alignment method In-Reply-To: <4E30E318-015C-4B84-977D-C8A9D9D2CC80@illinois.edu> References: <4DDD4337.3030107@upvnet.upv.es> <4DDD51CD.90208@upvnet.upv.es> <4DDD8155.5060002@upvnet.upv.es> <4DE52529.2020403@gmail.com> <4DE53BA5.1020803@upvnet.upv.es> <4DE611BD.5070305@gmail.com> <4DE61A03.3080904@upvnet.upv.es> <4E30E318-015C-4B84-977D-C8A9D9D2CC80@illinois.edu> Message-ID: <4DE65CC7.60305@upvnet.upv.es> Chris, I just downloaded and installed Bioperl 1.6.901 and... everything is working fine now!!! Thank you very much, Lorenzo El 01/06/11 15:31, Chris Fields escribi?: > I think Dave made some changes after 1.6.1; try upgrading to the latest version on CPAN (1.6.901): > > http://search.cpan.org/~cjfields/BioPerl-1.6.901/ > > chris > > On Jun 1, 2011, at 5:52 AM, Lorenzo Carretero Paulet wrote: > >> perl -MBio::Root::Version -e 'print $Bio::Root::Version::VERSION,"\n"' : 1.006001 >> Thanks, >> Lorenzo >> El 01/06/11 12:17, Roy Chaudhuri escribi?: >>> Again, works fine for me (after changing the file and directory locations, and uncommenting the subroutine calls). Which Bioperl version are you using? >>> >>> On 31/05/2011 20:04, Lorenzo Carretero Paulet wrote: >>>> Thanks for the reply, >>>> I attach again the revised versiont of the script I'm working with as >>>> well as the sequences. I still have the same error. The funny thing is >>>> that the yn00 subroutine is running properly when the alignment is >>>> passed directly, but not the codeml one (using either PAML 4.2 or 4.4). 
>>
>> --
>> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
>> Lorenzo Carretero Paulet
>> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV)
>> Integrative Systems Biology Group
>> C/ Ingeniero Fausto Elio s/n.
>> 46022 Valencia, Spain >> >> Phone: +34 963879934 >> Fax: +34 963877859 >> e-mail: locarpau at upvnet.upv.es >> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail: locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* From scott at scottcain.net Wed Jun 1 11:39:26 2011 From: scott at scottcain.net (Scott Cain) Date: Wed, 1 Jun 2011 11:39:26 -0400 Subject: [Bioperl-l] Problem installing BioPerl In-Reply-To: References: <96F5C958-86C9-462F-B9DA-BABBB6C4D6C2@illinois.edu> Message-ID: Hi Sophie, You need to use sudo to do the install. Options include "sudo cpan" to use the cpan shell, and "sudo ./Build install" to install from the command line. Scott On Wed, Jun 1, 2011 at 11:17 AM, Sophie wrote: > Hi Chris, Scott, > Thank you for the quick response, I tried the latest version, and I get this > on the terminal: > ERROR: Can't create '/usr/local/bin' > Do not have write permissions on '/usr/local/bin' > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
>  at /Users/sophie/.cpan/build/BioPerl-1.6.901-yH47zc/Bio/Root/Build.pm line
> 853
>   CJFIELDS/BioPerl-1.6.901.tar.gz
>   ./Build install  -- NOT OK
> Warning (usually harmless): 'YAML' not installed, will not store persistent
> state
> Failed during this command:
>  CJFIELDS/BioPerl-1.6.901.tar.gz              : install NO
>
> I can't think of any other solution than to chmod 777 the /usr/local/bin but
> I really didn't want to do that.
> Thanks in advance for your help,
> Sophie
> On 1 June 2011 16:04, Chris Fields wrote:
>>
>> I agree with Scott, in that you should use the latest version on CPAN.
>> It's very possible it will fix some of the issues you are seeing, as the
>> build step has been simplified substantially to get around bioperl-specific
>> API conflicts with Module::Build. So far, v 1.6.901 is installing on most
>> OS's, including darwin:
>>
>> http://matrix.cpantesters.org/?dist=BioPerl+1.6.901
>>
>> However, please let us know more details if you still run into problems.
>>
>> chris
>>
>> On Jun 1, 2011, at 9:40 AM, Scott Cain wrote:
>>
>> > Hi Sophie,
>> >
>> > If anything, I'd suggest a newer version (the current version is
>> > 1.6.901), and I'd also suggest indicating what the problem is; there
>> > might be a simple fix.
>> >
>> > Scott
>> >
>> >
>> > On Wed, Jun 1, 2011 at 10:32 AM, Sophie wrote:
>> >> Hi there,
>> >>
>> >> I've been trying to install BioPerl 1.6.1 on my Mac, (OS X), but no
>> >> matter
>> >> which method I use, CPAN, Buil.PL, etc, I can't install it, even when I
>> >> try
>> >> to force it. Should I try an older version of BioPerl?
>> >>
>> >> Thank you,
>> >> Best Regards,
>> >> Sophie.
>> >> _______________________________________________
>> >> Bioperl-l mailing list
>> >> Bioperl-l at lists.open-bio.org
>> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >>
>> >
>> >
>> >
>> > --
>> > ------------------------------------------------------------------------
>> > Scott Cain, Ph. D.                                   scott at scottcain
>> > dot net
>> > GMOD Coordinator (http://gmod.org/)                     216-392-3087
>> > Ontario Institute for Cancer Research
>> >
>> > _______________________________________________
>> > Bioperl-l mailing list
>> > Bioperl-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

From alicarea at gmail.com Wed Jun 1 11:46:35 2011
From: alicarea at gmail.com (Sophie)
Date: Wed, 1 Jun 2011 16:46:35 +0100
Subject: [Bioperl-l] Problem installing BioPerl
In-Reply-To:
References: <96F5C958-86C9-462F-B9DA-BABBB6C4D6C2@illinois.edu>
Message-ID:

Hi there,

I wrote this on the cpan shell:

o conf make_install_make_command 'sudo make'
o conf mbuild_install_build_command 'sudo ./Build'
o conf commit
install CJFIELDS/BioPerl-1.6.901.tar.gz

and it seems like it installed, at least when I type perl -MBio::Seq -e 0 it gives me no error, and when I type perl -MBio::Root::Version -e 'print $Bio::Root::Version::VERSION,"\n"' it gives me this 1.006901

Hopefully it worked, sorry to bother you,
Thank you for your time,
Best wishes,
Sophie.

On 1 June 2011 16:39, Scott Cain wrote:

> Hi Sophie,
>
> You need to use sudo to do the install. Options include "sudo cpan"
> to use the cpan shell, and "sudo ./Build install" to install from the
> command line.
>
> Scott
>
>
> On Wed, Jun 1, 2011 at 11:17 AM, Sophie wrote:
> > Hi Chris, Scott,
> > Thank you for the quick response, I tried the latest version, and I get
> this
> > on the terminal:
> > ERROR: Can't create '/usr/local/bin'
> > Do not have write permissions on '/usr/local/bin'
> > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

From cjfields at illinois.edu Wed Jun 1 11:57:12 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 1 Jun 2011 10:57:12 -0500
Subject: [Bioperl-l] Problem installing BioPerl
In-Reply-To:
References: <96F5C958-86C9-462F-B9DA-BABBB6C4D6C2@illinois.edu>
Message-ID:

Just to note, you can also install locally if you do not want the distribution available for everyone (see local::lib on CPAN). In fact, I suggest this in most cases for end users.

chris

On Jun 1, 2011, at 10:46 AM, Sophie wrote:

> Hi there,
>
> I wrote this on the cpan shell:
>
> o conf make_install_make_command 'sudo make'
> o conf mbuild_install_build_command 'sudo ./Build'
> o conf commit
> install CJFIELDS/BioPerl-1.6.901.tar.gz
>
> and it seems like it installed, at least when I type perl -MBio::Seq -e 0 it gives me no error, and when I type perl -MBio::Root::Version -e 'print $Bio::Root::Version::VERSION,"\n"' it gives me this 1.006901
>
> Hopefully it worked, sorry to bother you,
> Thank you for your time,
> Best wishes,
> Sophie.

From alicarea at gmail.com Wed Jun 1 13:51:48 2011
From: alicarea at gmail.com (Sophie)
Date: Wed, 1 Jun 2011 18:51:48 +0100
Subject: [Bioperl-l] Problem installing BioPerl
In-Reply-To:
References: <96F5C958-86C9-462F-B9DA-BABBB6C4D6C2@illinois.edu>
Message-ID:

Thank you!

On 1 June 2011 16:57, Chris Fields wrote:

> Just to note, you can also install locally if you do not want the
> distribution available for everyone (see local::lib on CPAN). In fact, I
> suggest this in most cases for end users.

From chapmanb at 50mail.com Thu Jun 2 15:48:30 2011
From: chapmanb at 50mail.com (Brad Chapman)
Date: Thu, 2 Jun 2011 15:48:30 -0400
Subject: [Bioperl-l] Early registration for BOSC ends tomorrow, Friday June 3
Message-ID: <20110602194830.GE21074@sobchak>

If you haven't already registered for BOSC, now is your chance--after June 3, prices will go up! Registration for BOSC is through the ISMB main conference website: http://www.iscb.org/ismbeccb2011-registration#sigs . Since BOSC is a two-day SIG, the price is 2x the one-day SIG price listed on the ISMB website. You can register for BOSC without registering for the main ISMB conference, if you want.

The preliminary BOSC schedule (subject to change) is now up at http://www.open-bio.org/wiki/BOSC_2011_Schedule (more details will be added soon).

There is also a two-day Codefest preceding BOSC; please add yourself to the list of attendees if you are interested: http://www.open-bio.org/wiki/Codefest_2011

The BOSC talks have already been chosen, but we have spaces for last-minute posters. If you'd like your poster abstract to appear in the BOSC program, you should submit it now--see http://www.open-bio.org/wiki/BOSC_2011#Abstract_Submission_Information

We hope to see you at BOSC!
Nomi Harris Co-Chair, BOSC 2011 From shachigahoimbi at gmail.com Fri Jun 3 01:07:33 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Fri, 3 Jun 2011 10:37:33 +0530 Subject: [Bioperl-l] use of Bio::Graphics::Glyph::primers Message-ID: Dear All I want to use bioperl module "Bio::Graphics::Glyph::primers". If anyone knows please tell me how to use this module. Thanks is advance -- Regards, Shachi From jonas.zierer at googlemail.com Fri Jun 3 04:51:54 2011 From: jonas.zierer at googlemail.com (Bilbo) Date: Fri, 3 Jun 2011 01:51:54 -0700 (PDT) Subject: [Bioperl-l] adding tags to BAM aligned reads Message-ID: <854ff1e0-a627-4fcd-9536-89a049d5bdfb@hd10g2000vbb.googlegroups.com> Hi everybody, I have a bam file and use bioperl (Bio::DB::Sam) to work with it. Now i wanted to ask if there is any possibility to add tags to aligned reads in this File? i use my $iterator = $bam->features(-iterator => 1, -flags => {M_UNMAPPED=>0}); while (my $align = $iterator->next_seq) { ... } to loop through the aligned reads and I am searching for anthing like $align->addTag(key=>value) and i have one more question: is it possible to get the coverage of a segment using only the read start positions? Thx bye From scott at scottcain.net Fri Jun 3 09:17:49 2011 From: scott at scottcain.net (Scott Cain) Date: Fri, 3 Jun 2011 09:17:49 -0400 Subject: [Bioperl-l] use of Bio::Graphics::Glyph::primers In-Reply-To: References: Message-ID: Hello Shachi, What do you need to know? If you don't know how to use Bio::Graphics yet, start with the tutorial: http://www.bioperl.org/wiki/HOWTO:Graphics If you want to use it with GBrowse and you are starting from scratch, I suggest you look at the GBrowse tutorial that comes with GBrowse. It can be found on the server that GBrowse is installed on: http://localhost/gbrowse2/tutorial/tutorial.html If neither of these situations apply, please ask a more detailed question. 
Scott

On Fri, Jun 3, 2011 at 1:07 AM, Shachi Gahoi wrote:
> Dear All
>
> I want to use bioperl module "Bio::Graphics::Glyph::primers".
>
> If anyone knows please tell me how to use this module.
>
>
> Thanks is advance
>
> --
> Regards,
> Shachi
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

From shachigahoimbi at gmail.com Sat Jun 4 07:03:37 2011
From: shachigahoimbi at gmail.com (Shachi Gahoi)
Date: Sat, 4 Jun 2011 16:33:37 +0530
Subject: [Bioperl-l] Bioperl graphics
Message-ID:

Dear All,

My question is can I insert image file in text output file using perl?

Half portion of my perl script output is "a1.txt" file and another half portion of perl script output is "a1.png" (created by Bio::Graphics module).

Now I want to insert a1.png image file in a1.txt file. or I can say that I want to store image output file a1.png in a1.txt file.

####################################################################
* Here is sample of my a1.txt file *-

Left primer: CARGAYATHATHTTYGCN Length:18 Start:125 End:142 Opt_Tm:44.84 GC%: 22.22
Right primer:GCNCGNGCNTAYAAYACN Length:18 Start:828 End:845 Opt_Tm:53.82 GC%: 38.89
####################################################################
*Desired output *

Left primer: CARGAYATHATHTTYGCN Length:18 Start:125 End:142 Opt_Tm:44.84 GC%: 22.22
Right primer:GCNCGNGCNTAYAAYACN Length:18 Start:828 End:845 Opt_Tm:53.82 GC%: 38.89

and now i want to store a1.png file in the below of a1.txt file

*Here I want to print my image file *
###################################################################

How can I do this. If anyone know please my help me.
Thanks in advance

--
Regards,
Shachi

From wkretzsch at gmail.com Mon Jun 6 07:21:42 2011
From: wkretzsch at gmail.com (Warren W. Kretzschmar)
Date: Mon, 6 Jun 2011 12:21:42 +0100
Subject: [Bioperl-l] questions about the bioperl module Bio::PopGen::Statistics
In-Reply-To: <201106012030039537050@gmail.com>
References: <201106012030039537050@gmail.com>
Message-ID:

Hi Jun,

First, in case you haven't seen the howto page, here it is:

Now to address your question. Unfortunately I can only give you a short, inadequate answer: pi is the average number of pairwise differences between the sequences of n individuals and is also an estimator of the scaled mutation rate (theta). Pi assumes a Wright-Fisher model, which is not directly based on the concept of DNA. Theta is not the per base-pair mutation rate. Instead theta = 2 * M * mu, where M is the effective population size and mu is the per base-pair mutation rate.

If you want to learn more, I suggest reading the lecture notes that can be found at , especially the notes for lectures 5-7.

Cheers,
Warren

On Wed, Jun 1, 2011 at 1:30 PM, lvu.jun wrote:
> Hi, there,
> I am trying to calculate the population genetics parameters such as pi using the bioperl module Bio::PopGen::Statistics. But I found that the method only requires the input of the marker genotype of every individuals for the population. I don't know why the module does not take the DNA sequence length into consideration when calculating the pi value. According to the definition of the pi value, besides the polymorphic sites, we also need the monomorphic sites that should be incorporated in the denominator when doing the calculation. Is it right? therefore I'm confused about the module, who can tell me why it can correctly calculate the pi value only with the marker(polymorphic) genotype?
> Another question, if I want to calculate the pi value using the sliding window along the genome, how can I do this using the Bio::PopGen::Statistics module?
> Thanks for your help!
> Yours sincerely,
> Jun
>
> Chinese Academy of Sciences
>
> 2011-06-01
>
>
>
> lvu.jun
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From tmcmahon at cae.wisc.edu Mon Jun 6 08:21:16 2011
From: tmcmahon at cae.wisc.edu (Trina McMahon)
Date: Mon, 6 Jun 2011 14:21:16 +0200
Subject: [Bioperl-l] Using Bio::SeqUtils->cat()
Message-ID: <32357411-0FB9-47A9-A78C-CDBF23D47EC8@cae.wisc.edu>

Hello everyone,

I am a complete (bio)perl neophyte and am struggling with what should be some very basic code.

I am trying to write a simple script to merge a bunch of gbk sequences into one big sequence, while preserving the feature coordinates. I was so proud of myself because I found this function in Bio::SeqUtils called cat(). But I can't figure out how the hell to make it work!!!

I'm attaching the script that I wrote below. Can you help me figure out what is wrong? I did some troubleshooting and I am pretty sure it is the Bio::SeqUtils->cat() function that is the problem. It seems like it should return a seq object but if you look at the original code in the documentation online I am not sure.

thanks!!
trina

#!/usr/local/ActivePerl-5.10/bin/perl
#
#

use strict;
use warnings;
use Bio::SeqIO;
use Bio::SeqUtils;
use Getopt::Long;

# get command-line arguments, or die with a usage statement

my $USAGE=<<USAGE;

     * catgb *

concatenates multiple genbank records and adjusts feature coordinates
in the final merged sequence

     catgb -i listofseqs.gbk -o mergedseqs.gbk


USAGE

my ($help,$infile,$outfile);

GetOptions (
    'h|help'      => \$help,
    'i|infile=s'  => \$infile,
    'o|outfile=s' => \$outfile,
);

########################################################################

if ($help)     { die $USAGE; }
if (!$infile)  { die "$USAGE\nNo input file! (-infile)\n"; }
if (!$outfile) { die "$USAGE\nNo output file! (-outfile)\n"; }

my @seqs;


my $seqin  = Bio::SeqIO->new(-file => $infile, -format=>"genbank");
my $seqout = Bio::SeqIO->new(-file=> ">$outfile", -format => "genbank");


# create an array to hold the sequences from infile
while (my $seq = $seqin->next_seq) {
    push (@seqs, $seq);

}

# concatenate the sequences held in the array @seqs
my $mergedseqs = Bio::SeqUtils->cat(@seqs);

# write the new merged sequence into the specified file

$seqout->writeseq($mergedseqs);

exit;
-------------------------------------------------------------------------------------
Katherine (Trina) McMahon, Associate Professor
Goddess of Funkosity
Departments of Civil and Environmental Engineering and Bacteriology
Environmental Chemistry and Technology Program
Limnology and Marine Science Program
Microbiology Doctoral Training Program

**On sabbatical leave starting August 20, 2010**

Mailing Address:
3204 Engineering Hall, 1415 Engineering Drive
University of Wisconsin - Madison, Madison, WI 53706-1691

Alternate Office: 5552 Microbial Sciences Building

Phone: 608/890-2836
Fax: 608/262-9865
Email: tmcmahon at engr.wisc.edu
McMahon Lab: http://www.engr.wisc.edu/cee/faculty/mcmahon_katherine.html
North Temperate Lakes Microbial Observatory: http://microbes.limnology.wisc.edu/
-------------------------------------------------------------------------------------

From roy.chaudhuri at gmail.com Mon Jun 6 08:36:53 2011
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Mon, 06 Jun 2011 13:36:53 +0100
Subject: [Bioperl-l] Using Bio::SeqUtils->cat()
In-Reply-To: <32357411-0FB9-47A9-A78C-CDBF23D47EC8@cae.wisc.edu>
References: <32357411-0FB9-47A9-A78C-CDBF23D47EC8@cae.wisc.edu>
Message-ID: <4DECC9E5.5040302@gmail.com>

Hi Trina,

That's probably how cat should work, but in fact the first sequence in the list is modified by concatenating on the other sequences. The function just returns 1 if it worked.
So you need code like:

Bio::SeqUtils->cat(@seqs);
my $mergedseqs=$seqs[0];

You can blame the idiot who wrote that code for the slightly awkward interface (sorry). I think the module documentation is correct, but let us know if you think it should be clarified.

Cheers,
Roy.

On 06/06/2011 13:21, Trina McMahon wrote:
> Hello everyone,
>
> I am a complete (bio)perl neophyte and am struggling with what should
> be some very basic code.
>
> I am trying to write a simple script to merge a bunch of gbk
> sequences into one big sequence, while preserving the feature
> coordinates. I was so proud of myself because I found this function
> in Bio::SeqUtils called cat(). But I can't figure out how the hell
> to make it work!!!
>
> I'm attaching the script that I wrote below. Can you help me figure
> out what is wrong? I did some troubleshooting and I am pretty sure
> it is the Bio::SeqUtils->cat() function that is the problem. It
> seems like it should return a seq object but if you look at the
> original code in the documentation online I am not sure.
>
> thanks!! trina
>
> #!/usr/local/ActivePerl-5.10/bin/perl # #
>
> use strict; use warnings; use Bio::SeqIO; use Bio::SeqUtils; use
> Getopt::Long;
>
> # get command-line arguments, or die with a usage statement
>
> my $USAGE=<<USAGE;
>
> * catgb *
>
> concatenates multiple genbank records and adjusts feature coordinates
> in the final merged sequence
>
> catgb -i listofseqs.gbk -o mergedseqs.gbk
>
>
> USAGE
>
> my ($help,$infile,$outfile);
>
> GetOptions ( 'h|help' => \$help, 'i|infile=s' =>
> \$infile, 'o|outfile=s' => \$outfile, );
>
> ########################################################################
>
> if ($help) { die $USAGE; } if (!$infile) { die "$USAGE\nNo
> input file! (-infile)\n"; } if (!$outfile) { die "$USAGE\nNo
> output file!
(-outfile)\n"; } > > my @seqs; > > > my $seqin = Bio::SeqIO->new(-file => $infile, -format=>"genbank"); > my $seqout = Bio::SeqIO->new(-file=> ">$outfile", -format => > "genbank"); > > > # create an array to hold the sequences from infile while (my $seq = > $seqin->next_seq) { push (@seqs, $seq); > > } > > # concatenate the sequenes held in the array @seqs my $mergedseqs = > Bio::SeqUtils->cat(@seqs); > > # write the new merged sequence into the specified file > > $seqout->writeseq($mergedseqs); > > exit; > ------------------------------------------------------------------------------------- > > Katherine (Trina) McMahon, Associate Professor > Goddess of Funkosity Departments of Civil and Environmental > Engineering and Bacteriology Environmental Chemistry and Technology > Program Limnology and Marine Science Program Microbiology Doctoral > Training Program > > **On sabbatical leave starting August 20, 2010** > > Mailing Address: 3204 Engineering Hall, 1415 Engineering Drive > University of Wisconsin - Madison, Madison, WI 53706-1691 > > Alternate Office: 5552 Microbial Sciences Building > > Phone: 608/890-2836 Fax: 608/262-9865 Email: > tmcmahon at engr.wisc.edu McMahon Lab: > http://www.engr.wisc.edu/cee/faculty/mcmahon_katherine.html North > Temperate Lakes Microbial Observatory: > http://microbes.limnology.wisc.edu/ > > ------------------------------------------------------------------------------------- > > > > > > > > > > > > > > > > _______________________________________________ Bioperl-l mailing > list Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jonas.zierer at campus.lmu.de Mon Jun 6 05:01:24 2011 From: jonas.zierer at campus.lmu.de (Jonas Zierer) Date: Mon, 06 Jun 2011 11:01:24 +0200 Subject: [Bioperl-l] adding tags Message-ID: <1307350884.6625.2.camel@hollerith.bio.ifi.lmu.de> Hi everybody, I have a bam file of aligned transcriptome reads and use bioperl (Bio::DB::Sam) to work with it. 
Now i wanted to ask if there is any possibility to add tags to alignments (=reads) in this File? I use my $iterator = $bam->features(-iterator => 1, -flags => {M_UNMAPPED=>0}); while (my $align = $iterator->next_seq) { ... } to loop through the aligned reads and I am searching for anthing like $align->addTag(key=>value) and i have one more question: is it possible to get the coverage of a segment using only the read start positions? (that means that each read is treated as it would have Length 1) Thx bye From scott at scottcain.net Mon Jun 6 09:31:47 2011 From: scott at scottcain.net (Scott Cain) Date: Mon, 6 Jun 2011 09:31:47 -0400 Subject: [Bioperl-l] Bioperl graphics In-Reply-To: References: Message-ID: Hi Sachi, You can't put graphics in a text file--that's why it's called a text file. You need a format that can do more, like pdf, html or rdf. What are these files going to be used for? Scott On Sat, Jun 4, 2011 at 7:03 AM, Shachi Gahoi wrote: > Dear All, > > My question is can I insert image file in text output file using perl? > > Half portion of my perl script output is "a1.txt" file and another half > portion of perl script output is "a1.png" (created by Bio::Graphics module). > > Now I want to insert a1.png image file in a1.txt file. or I can say that I > want to store image output file a1.png in a1.txt file. 
>
> ####################################################################*
> Here is sample of my a1.txt file *-
>
> Left primer: CARGAYATHATHTTYGCN  Length:18  Start:125  End:142
> Opt_Tm:44.84  GC%: 22.22
>
> Right primer:GCNCGNGCNTAYAAYACN  Length:18  Start:828  End:845
> Opt_Tm:53.82  GC%: 38.89
>
> ####################################################################
> *Desired output *
>
> Left primer: CARGAYATHATHTTYGCN  Length:18  Start:125  End:142
> Opt_Tm:44.84  GC%: 22.22
>
> Right primer:GCNCGNGCNTAYAAYACN  Length:18  Start:828  End:845
> Opt_Tm:53.82  GC%: 38.89
>
> and now i want to store a1.png file in the below of a1.txt file
>
> *Here I want to print my image file
> *
>
>
> ###################################################################
>
> How can I do this. If anyone know please my help me.
>
>
> Thanks in advance
>
>
> --
> Regards,
> Shachi
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

From bosborne11 at verizon.net Mon Jun 6 08:41:26 2011
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 06 Jun 2011 08:41:26 -0400
Subject: [Bioperl-l] Using Bio::SeqUtils->cat()
In-Reply-To: <32357411-0FB9-47A9-A78C-CDBF23D47EC8@cae.wisc.edu>
References: <32357411-0FB9-47A9-A78C-CDBF23D47EC8@cae.wisc.edu>
Message-ID: <813EA044-3458-48B6-90EB-B1CBA7F9C32C@verizon.net>

Trina,

Never used this before, so I'm not entirely sure but the docs are saying that the first sequence in the array is the target, so something like:

Bio::SeqUtils->cat(@seqs);
my $mergedseq = shift @seqs;

?

Brian O.
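
The in-place behaviour described in the replies above can be tried on toy data. What follows is only a minimal sketch: the two sequences, their IDs and the single feature are made up for illustration, and it assumes a BioPerl install in which cat() appends onto the first element and returns a true value, as described in this thread.

```perl
#!/usr/bin/perl
# Sketch only: sequences, IDs and the feature are invented test data.
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;
use Bio::SeqUtils;

# Two toy records standing in for the GenBank entries read via Bio::SeqIO.
my $first  = Bio::Seq->new(-id => 'rec1', -seq => 'ACGTACGT', -alphabet => 'dna');
my $second = Bio::Seq->new(-id => 'rec2', -seq => 'TTGGCC',   -alphabet => 'dna');

# A feature at positions 2..4 of the second record.
$second->add_SeqFeature(
    Bio::SeqFeature::Generic->new(
        -start       => 2,
        -end         => 4,
        -primary_tag => 'misc_feature',
    )
);

my @seqs = ($first, $second);

# cat() modifies the first element in place and returns a boolean,
# so its return value is checked rather than assigned.
Bio::SeqUtils->cat(@seqs) or die "cat() failed\n";
my $merged = $seqs[0];

# Print the merged sequence and the (shifted) feature coordinates.
print $merged->seq, "\n";
for my $feat ($merged->get_SeqFeatures) {
    printf "%s %d..%d\n", $feat->primary_tag, $feat->start, $feat->end;
}
```

With real GenBank input the merged object would then go to the output stream; note that the Bio::SeqIO method for that is write_seq().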
On Jun 6, 2011, at 8:21 AM, Trina McMahon wrote: > Hello everyone, > > I am a complete (bio)perl neophyte and am struggling with what should be some very basic code. > > I am trying to write a simple script to merge a bunch of gbk sequences into one big sequence, while preserving the feature coordinates. I was so proud of myself because I found this function in Bio::SeqUtils called cat(). But I can't figure out how the hell to make it work!!! > > I'm attaching the script that I wrote below. Can you help me figure out what is wrong? I did some troubleshooting and I am pretty sure it is the Bio::SeqUtils->cat() function that is the problem. It seems like it should return a seq object but if you look at the original code in the documentation online I am not sure. > > thanks!! > trina > > #!/usr/local/ActivePerl-5.10/bin/perl > # > # > > use strict; > use warnings; > use Bio::SeqIO; > use Bio::SeqUtils; > use Getopt::Long; > > # get command-line arguments, or die with a usage statement > > my $USAGE=< > * catgb * > > concatenates multiple genbank records and adjusts feature coordinates in > the final merged sequence > > catgb -i listofseqs.gbk -o mergedseqs.gbk > > > USAGE > > my ($help,$infile,$outfile); > > GetOptions ( > 'h|help' => \$help, > 'i|infile=s' => \$infile, > 'o|outfile=s' => \$outfile, > ); > > ######################################################################## > > if ($help) { die $USAGE; } > if (!$infile) { die "$USAGE\nNo input file! (-infile)\n"; } > if (!$outfile) { die "$USAGE\nNo output file! 
(-outfile)\n"; } > > my @seqs; > > > my $seqin = Bio::SeqIO->new(-file => $infile, -format=>"genbank"); > my $seqout = Bio::SeqIO->new(-file=> ">$outfile", -format => "genbank"); > > > # create an array to hold the sequences from infile > > while (my $seq = $seqin->next_seq) { > > push (@seqs, $seq); > > } > > # concatenate the sequenes held in the array @seqs > > my $mergedseqs = Bio::SeqUtils->cat(@seqs); > > # write the new merged sequence into the specified file > > $seqout->writeseq($mergedseqs); > > exit; > ------------------------------------------------------------------------------------- > Katherine (Trina) McMahon, Associate Professor > Goddess of Funkosity > Departments of Civil and Environmental Engineering and Bacteriology > Environmental Chemistry and Technology Program > Limnology and Marine Science Program > Microbiology Doctoral Training Program > > **On sabbatical leave starting August 20, 2010** > > Mailing Address: > 3204 Engineering Hall, 1415 Engineering Drive > University of Wisconsin - Madison, Madison, WI 53706-1691 > > Alternate Office: 5552 Microbial Sciences Building > > Phone: 608/890-2836 Fax: 608/262-9865 > Email: tmcmahon at engr.wisc.edu > McMahon Lab: http://www.engr.wisc.edu/cee/faculty/mcmahon_katherine.html > North Temperate Lakes Microbial Observatory: http://microbes.limnology.wisc.edu/ > > ------------------------------------------------------------------------------------- > > > > > > > > > > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From fs5 at sanger.ac.uk Mon Jun 6 10:11:23 2011 From: fs5 at sanger.ac.uk (Frank Schwach) Date: Mon, 06 Jun 2011 15:11:23 +0100 Subject: [Bioperl-l] adding tags In-Reply-To: <1307350884.6625.2.camel@hollerith.bio.ifi.lmu.de> References: <1307350884.6625.2.camel@hollerith.bio.ifi.lmu.de> Message-ID: <1307369483.18718.113.camel@deskpro15336.internal.sanger.ac.uk> Hi 
Jonas,

I have a module where I need to do the same, but I re-cast the Bio::DB::Align object into a Bio::SeqFeature::Lite before adding tags, simply because the Align object contained a lot of data that drained memory and I didn't really need it.
However, it should be possible to add tags directly to a Bio::DB::Align object, as it is supposed to be Bio::SeqFeatureI compatible, although it doesn't inherit from it. Have you tried this:

$align->add_tag_value('key','value');

?
If that doesn't work then you may have to re-cast into a Bio::SeqFeature object before adding tags.

Hope this helps,

Frank

On Mon, 2011-06-06 at 11:01 +0200, Jonas Zierer wrote:
> Hi everybody,
>
> I have a bam file of aligned transcriptome reads and use bioperl
> (Bio::DB::Sam) to work with it.
> Now i wanted to ask if there is any possibility to add tags to
> alignments (=reads) in this File?
>
>
> I use
> my $iterator = $bam->features(-iterator => 1, -flags => {M_UNMAPPED=>0});
> while (my $align = $iterator->next_seq) {
> ...
> }
> to loop through the aligned reads and I am searching for anthing like
> $align->addTag(key=>value)
>
> and i have one more question: is it possible to get the coverage of a
> segment using only the read start positions? (that means that each read
> is treated as it would have Length 1)
>
> Thx
> bye
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.
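
The re-casting approach Frank describes might look roughly like the sketch below. The %read hash is a hypothetical stand-in for the coordinates one alignment would report (no BAM file is opened here), and 'my_tag' is an arbitrary key chosen for illustration. This only annotates an in-memory feature; nothing in the sketch writes the tag back into the BAM file.

```perl
#!/usr/bin/perl
# Sketch only: %read stands in for one alignment; no BAM file is read.
use strict;
use warnings;
use Bio::SeqFeature::Lite;

# Hypothetical values of the kind an alignment object would report
# (reference name, start, end, strand, read name).
my %read = (
    seq_id => 'chr1',
    start  => 100,
    end    => 135,
    strand => 1,
    name   => 'read_001',
);

# Copy just the coordinates into a lightweight feature object.
my $feature = Bio::SeqFeature::Lite->new(
    -seq_id => $read{seq_id},
    -start  => $read{start},
    -end    => $read{end},
    -strand => $read{strand},
    -name   => $read{name},
    -type   => 'match',
);

# Attach a custom tag; 'my_tag' and its value are illustrative only.
$feature->add_tag_value(my_tag => 'some value');

print join(',', $feature->get_tag_values('my_tag')), "\n";
```

Inside the iterator loop from the original question, the hash values would presumably be filled from the alignment's own accessors instead of literals.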
From cjfields at illinois.edu Mon Jun 6 11:24:40 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 6 Jun 2011 10:24:40 -0500 Subject: [Bioperl-l] Bioperl graphics In-Reply-To: References: Message-ID: <714F90EB-6A33-47BE-B843-64A670EF91D9@illinois.edu> Agreed, except you probably mean 'rtf' instead of 'rdf', correct? chris On Jun 6, 2011, at 8:31 AM, Scott Cain wrote: > Hi Sachi, > > You can't put graphics in a text file--that's why it's called a text > file. You need a format that can do more, like pdf, html or rdf. > What are these files going to be used for? > > Scott > > > On Sat, Jun 4, 2011 at 7:03 AM, Shachi Gahoi wrote: >> Dear All, >> >> My question is can I insert image file in text output file using perl? >> >> Half portion of my perl script output is "a1.txt" file and another half >> portion of perl script output is "a1.png" (created by Bio::Graphics module). >> >> Now I want to insert a1.png image file in a1.txt file. or I can say that I >> want to store image output file a1.png in a1.txt file. >> >> ####################################################################* >> Here is sample of my a1.txt file *- >> >> Left primer: CARGAYATHATHTTYGCN Length:18 Start:125 End:142 >> Opt_Tm:44.84 GC%: 22.22 >> >> Right primer:GCNCGNGCNTAYAAYACN Length:18 Start:828 End:845 >> Opt_Tm:53.82 GC%: 38.89 >> >> #################################################################### >> *Desired output * >> >> Left primer: CARGAYATHATHTTYGCN Length:18 Start:125 End:142 >> Opt_Tm:44.84 GC%: 22.22 >> >> Right primer:GCNCGNGCNTAYAAYACN Length:18 Start:828 End:845 >> Opt_Tm:53.82 GC%: 38.89 >> >> and now i want to store a1.png file in the below of a1.txt file >> >> *Here I want to print my image file >> * >> >> >> ################################################################### >> >> How can I do this. If anyone know please my help me. 
>> >> >> Thanks in advance >> >> >> -- >> Regards, >> Shachi >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > > -- > ------------------------------------------------------------------------ > Scott Cain, Ph. D. scott at scottcain dot net > GMOD Coordinator (http://gmod.org/) 216-392-3087 > Ontario Institute for Cancer Research > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From scott at scottcain.net Mon Jun 6 11:36:53 2011 From: scott at scottcain.net (Scott Cain) Date: Mon, 6 Jun 2011 11:36:53 -0400 Subject: [Bioperl-l] Bioperl graphics In-Reply-To: <714F90EB-6A33-47BE-B843-64A670EF91D9@illinois.edu> References: <714F90EB-6A33-47BE-B843-64A670EF91D9@illinois.edu> Message-ID: Hi Chris, Thanks for the catch--yes, that is exactly what I meant. Scott On Mon, Jun 6, 2011 at 11:24 AM, Chris Fields wrote: > Agreed, except you probably mean 'rtf' instead of 'rdf', correct? > > chris > > On Jun 6, 2011, at 8:31 AM, Scott Cain wrote: > >> Hi Sachi, >> >> You can't put graphics in a text file--that's why it's called a text >> file. ?You need a format that can do more, like pdf, html or rdf. >> What are these files going to be used for? >> >> Scott >> >> >> On Sat, Jun 4, 2011 at 7:03 AM, Shachi Gahoi wrote: >>> Dear All, >>> >>> My question is can I insert image file in text output file using perl? >>> >>> Half portion of my perl script output is "a1.txt" file and another half >>> portion of perl script output is "a1.png" (created by Bio::Graphics module). >>> >>> Now I want to insert a1.png image file in a1.txt file. or I can say that I >>> want to store image output file a1.png in a1.txt file. 
>>>
>>> ####################################################################*
>>> Here is sample of my a1.txt file *-
>>>
>>> Left primer: CARGAYATHATHTTYGCN  Length:18  Start:125  End:142
>>> Opt_Tm:44.84  GC%: 22.22
>>>
>>> Right primer:GCNCGNGCNTAYAAYACN  Length:18  Start:828  End:845
>>> Opt_Tm:53.82  GC%: 38.89
>>>
>>> ####################################################################
>>> *Desired output *
>>>
>>> Left primer: CARGAYATHATHTTYGCN  Length:18  Start:125  End:142
>>> Opt_Tm:44.84  GC%: 22.22
>>>
>>> Right primer:GCNCGNGCNTAYAAYACN  Length:18  Start:828  End:845
>>> Opt_Tm:53.82  GC%: 38.89
>>>
>>> and now i want to store a1.png file in the below of a1.txt file
>>>
>>> *Here I want to print my image file
>>> *
>>>
>>>
>>> ###################################################################
>>>
>>> How can I do this. If anyone know please my help me.
>>>
>>>
>>> Thanks in advance
>>>
>>>
>>> --
>>> Regards,
>>> Shachi
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>>
>>
>> --
>> ------------------------------------------------------------------------
>> Scott Cain, Ph. D.                                   scott at scottcain dot net
>> GMOD Coordinator (http://gmod.org/)                     216-392-3087
>> Ontario Institute for Cancer Research
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)
216-392-3087 Ontario Institute for Cancer Research From roy.chaudhuri at gmail.com Mon Jun 6 13:53:24 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Mon, 06 Jun 2011 18:53:24 +0100 Subject: [Bioperl-l] Using Bio::SeqUtils->cat() In-Reply-To: References: <32357411-0FB9-47A9-A78C-CDBF23D47EC8@cae.wisc.edu> <4DECC9E5.5040302@gmail.com> Message-ID: <4DED1414.8000203@gmail.com> Hi Trina, Thanks for noticing that, I have modified the documentation and submitted it as a patch to Redmine. Just a note for the future - please remember to cc the list when replying, so others can follow the conversation. Cheers, Roy. On 06/06/2011 13:49, Trina McMahon wrote: > oops, actually the documentation does recommend this usage: > > my $catseq =*Bio::SeqUtils*->cat(@seqs) > > so maybe it would help to change this... > > best, > trina > > On Jun 6, 2011, at 2:36 PM, Roy Chaudhuri wrote: > >> Hi Trina, >> >> That's probably how cat should work, but in fact the first sequence in >> the list is modified by concatenating on the other sequences. The >> function just returns 1 if it worked. So you need code like: >> >> Bio::SeqUtils->cat(@seqs); >> my $mergedseqs=$seqs[0]; >> >> You can blame the idiot who wrote that code for the slightly awkward >> interface (sorry). I think the module documentation is correct, but >> let us know if you think it should be clarified. >> >> Cheers, >> Roy. >> >> On 06/06/2011 13:21, Trina McMahon wrote: >>> Hello everyone, >>> >>> I am a complete (bio)perl neophyte and am struggling with what should >>> be some very basic code. >>> >>> I am trying to write a simple script to merge a bunch of gbk >>> sequences into one big sequence, while preserving the feature >>> coordinates. I was so proud of myself because I found this function >>> in Bio::SeqUtils called cat(). But I can't figure out how the hell >>> to make it work!!! >>> >>> I'm attaching the script that I wrote below. Can you help me figure >>> out what is wrong? 
I did some troubleshooting and I am pretty sure >>> it is the Bio::SeqUtils->cat() function that is the problem. It >>> seems like it should return a seq object but if you look at the >>> original code in the documentation online I am not sure. >>> >>> thanks!! trina >>> >>> #!/usr/local/ActivePerl-5.10/bin/perl # # >>> >>> use strict; use warnings; use Bio::SeqIO; use Bio::SeqUtils; use >>> Getopt::Long; >>> >>> # get command-line arguments, or die with a usage statement >>> >>> my $USAGE=<>> >>> * catgb * >>> >>> concatenates multiple genbank records and adjusts feature coordinates >>> in the final merged sequence >>> >>> catgb -i listofseqs.gbk -o mergedseqs.gbk >>> >>> >>> USAGE >>> >>> my ($help,$infile,$outfile); >>> >>> GetOptions ( 'h|help'=> \$help, 'i|infile=s' => >>> \$infile, 'o|outfile=s' => \$outfile, ); >>> >>> ######################################################################## >>> >>> if ($help) { die $USAGE; } if (!$infile) { die "$USAGE\nNo >>> input file! (-infile)\n"; } if (!$outfile) { die "$USAGE\nNo >>> output file! 
(-outfile)\n"; } >>> >>> my @seqs; >>> >>> >>> my $seqin = Bio::SeqIO->new(-file => $infile, -format=>"genbank"); >>> my $seqout = Bio::SeqIO->new(-file=> ">$outfile", -format => >>> "genbank"); >>> >>> >>> # create an array to hold the sequences from infile while (my $seq = >>> $seqin->next_seq) { push (@seqs, $seq); >>> >>> } >>> >>> # concatenate the sequenes held in the array @seqs my $mergedseqs = >>> Bio::SeqUtils->cat(@seqs); >>> >>> # write the new merged sequence into the specified file >>> >>> $seqout->writeseq($mergedseqs); >>> >>> exit; >>> ------------------------------------------------------------------------------------- >>> >>> >> Katherine (Trina) McMahon, Associate Professor >>> Goddess of Funkosity Departments of Civil and Environmental >>> Engineering and Bacteriology Environmental Chemistry and Technology >>> Program Limnology and Marine Science Program Microbiology Doctoral >>> Training Program >>> >>> **On sabbatical leave starting August 20, 2010** >>> >>> Mailing Address: 3204 Engineering Hall, 1415 Engineering Drive >>> University of Wisconsin - Madison, Madison, WI 53706-1691 >>> >>> Alternate Office: 5552 Microbial Sciences Building >>> >>> Phone: 608/890-2836 Fax: 608/262-9865 Email: >>> tmcmahon at engr.wisc.edu McMahon Lab: >>> http://www.engr.wisc.edu/cee/faculty/mcmahon_katherine.html North >>> Temperate Lakes Microbial Observatory: >>> http://microbes.limnology.wisc.edu/ >>> >>> ------------------------------------------------------------------------------------- >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> _______________________________________________ Bioperl-l mailing >>> list Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > ------------------------------------------------------------------------------------- > Katherine (Trina) McMahon, Associate Professor > Goddess of Funkosity > Departments of Civil and Environmental Engineering and Bacteriology > 
Environmental Chemistry and Technology Program > Limnology and Marine Science Program > Microbiology Doctoral Training Program > > **On sabbatical leave starting August 20, 2010** > > Mailing Address: > 3204 Engineering Hall, 1415 Engineering Drive > University of Wisconsin - Madison, Madison, WI 53706-1691 > > Alternate Office: 5552 Microbial Sciences Building > > Phone: 608/890-2836 Fax: 608/262-9865 > Email: tmcmahon at engr.wisc.edu > McMahon Lab: http://www.engr.wisc.edu/cee/faculty/mcmahon_katherine.html > North Temperate Lakes Microbial Observatory: > http://microbes.limnology.wisc.edu/ > > ------------------------------------------------------------------------------------- > > > > > > > > > > > > > From tmcmahon at cae.wisc.edu Mon Jun 6 17:37:31 2011 From: tmcmahon at cae.wisc.edu (Trina McMahon) Date: Mon, 6 Jun 2011 23:37:31 +0200 Subject: [Bioperl-l] Installing Bioperl 1.6.1 on Mac OS 10.6.7 Message-ID: <35BCC454-7B4A-46A0-9D5F-C3DCE2054A64@cae.wisc.edu> Hi everyone, I was having some trouble with Bio::SeqUtils->cat() earlier today and with Roy's help I debugged my code. But it was still not working properly. I sent my script to a friend and he successfully ran it on his machine, producing the desired output. I am now suspecting that I need to upgrade my perl and bioperl. I hadn't upgraded anything in about 6 months (I was running ActivePerl-5.10 and BioPerl 1.6.0).. So I ran my MacPorts upgrade, installed Perl 5.12, installed GraphViz, and configured CPAN as directed here: http://www.sysarchitects.com/bioperl but when I went to build bioperl 1.6.1, I got a lot of what look like fatal errors. Also, none of my scripts using bioperl work using this new build. Inspecting the errors, it seems like it is having trouble with AcePerl and maybe GraphViz (though this seemed to do ok with MacPorts??). Any advice welcome! I saw in the archive that I might not be the only one with this problem, so I hope someone has a work-around?? thanks, trina p.s. 
note that I am not using the perl that is installed with MacOS, I did a fresh installation in /opt/local/bin/ using MacPorts. I also am pretty sure my path is not the problem (I changed it to reflect the new Perl location)

errors look like this:

/opt/local/bin/perl "-Iblib/arch" "-Iblib/lib" util/install.PLS util/install.pl
Extracting install.pl (with variable substitutions)
LDS/AcePerl-1.92.tar.gz
/Developer/usr/bin/make -- OK
Running make test
PERL_DL_NONLAZY=1 /opt/local/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/basic.t ..... Waiting for remote acedb regression database to start up. This may take a few minutes.
t/basic.t ..... 2/10 Couldn't establish connection to database. Aborting tests.
t/basic.t ..... Dubious, test returned 60 (wstat 15360, 0x3c00)
Failed 9/10 subtests
t/object.t .... 2/36 Couldn't establish connection to database. Aborting tests.
t/object.t .... Dubious, test returned 60 (wstat 15360, 0x3c00)
Failed 35/36 subtests
t/sequence.t .. 2/54 Couldn't establish connection to database. Aborting tests.
t/sequence.t .. Dubious, test returned 60 (wstat 15360, 0x3c00)
Failed 53/54 subtests
t/update.t .... 2/17 Couldn't establish connection to database. Aborting tests.
t/update.t .... Dubious, test returned 60 (wstat 15360, 0x3c00)
Failed 16/17 subtests

<...snip...>

Result: PASS
CJFIELDS/BioPerl-1.6.1.tar.gz
Tests succeeded but one dependency not OK (Ace)
CJFIELDS/BioPerl-1.6.1.tar.gz
[dependencies] -- NA
Running Build install
make test had returned bad status, won't install without force
Failed during this command:
LDS/AcePerl-1.92.tar.gz : make_test NO
CJFIELDS/BioPerl-1.6.1.tar.gz : make_test NO one dependency not OK (Ace)


-------------------------------------------------------------------------------------
Katherine (Trina) McMahon, Associate Professor
Goddess of Funkosity
Departments of Civil and Environmental Engineering and Bacteriology
Environmental Chemistry and Technology Program
Limnology and Marine Science Program
Microbiology Doctoral Training Program

**On sabbatical leave starting August 20, 2010**

Mailing Address:
3204 Engineering Hall, 1415 Engineering Drive
University of Wisconsin - Madison, Madison, WI 53706-1691

Alternate Office: 5552 Microbial Sciences Building

Phone: 608/890-2836 Fax: 608/262-9865
Email: tmcmahon at engr.wisc.edu
McMahon Lab: http://www.engr.wisc.edu/cee/faculty/mcmahon_katherine.html
North Temperate Lakes Microbial Observatory: http://microbes.limnology.wisc.edu/

-------------------------------------------------------------------------------------

From cjfields at illinois.edu Mon Jun 6 17:46:32 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 6 Jun 2011 16:46:32 -0500
Subject: [Bioperl-l] Installing Bioperl 1.6.1 on Mac OS 10.6.7
In-Reply-To: <35BCC454-7B4A-46A0-9D5F-C3DCE2054A64@cae.wisc.edu>
References: <35BCC454-7B4A-46A0-9D5F-C3DCE2054A64@cae.wisc.edu>
Message-ID: <5A652229-B22F-4631-B4F6-8220D8A866DD@illinois.edu>

Trina,

You shouldn't need AcePerl nor Graphviz. BTW, Lincoln has, for all intents and purposes, deprecated AcePerl now, so we have removed it as a requirement.

Also, the latest version of BioPerl on CPAN is 1.6.901; I think some of these issues have been specifically addressed in that release. Can you try that release? chris On Jun 6, 2011, at 4:37 PM, Trina McMahon wrote: > Hi everyone, > > I was having some trouble with Bio::SeqUtils->cat() earlier today and with Roy's help I debugged my code. But it was still not working properly. I sent my script to a friend and he successfully ran it on his machine, producing the desired output. I am now suspecting that I need to upgrade my perl and bioperl. I hadn't upgraded anything in about 6 months (I was running ActivePerl-5.10 and BioPerl 1.6.0).. So I ran my MacPorts upgrade, installed Perl 5.12, installed GraphViz, and configured CPAN as directed here: > > http://www.sysarchitects.com/bioperl > > but when I went to build bioperl 1.6.1, I got a lot of what look like fatal errors. Also, none of my scripts using bioperl work using this new build. > > Inspecting the errors, it seems like it is having trouble with AcePerl and maybe GraphViz (though this seemed to do ok with MacPorts??). > > Any advice welcome! I saw in the archive that I might not be the only one with this problem, so I hope someone has a work-around?? > > thanks, > trina > > p.s. note that I am not using the perl that is installed with MacOS, I did a fresh installation in /opt/local/bin/ using MacPorts. I also am pretty sure my path is not the problem (I changed it to reflect the new Perl location) > > errors look like this: > > /opt/local/bin/perl "-Iblib/arch" "-Iblib/lib" util/install.PLS util/install.pl > Extracting install.pl (with variable substitutions) > LDS/AcePerl-1.92.tar.gz > /Developer/usr/bin/make -- OK > Running make test > PERL_DL_NONLAZY=1 /opt/local/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t > t/basic.t ..... Waiting for remote acedb regression database to start up. This may take a few minutes. > t/basic.t ..... 
2/10 Couldn't establish connection to database. Aborting tests. > t/basic.t ..... Dubious, test returned 60 (wstat 15360, 0x3c00) > Failed 9/10 subtests > t/object.t .... 2/36 Couldn't establish connection to database. Aborting tests. > t/object.t .... Dubious, test returned 60 (wstat 15360, 0x3c00) > Failed 35/36 subtests > t/sequence.t .. 2/54 Couldn't establish connection to database. Aborting tests. > t/sequence.t .. Dubious, test returned 60 (wstat 15360, 0x3c00) > Failed 53/54 subtests > t/update.t .... 2/17 Couldn't establish connection to database. Aborting tests. > t/update.t .... Dubious, test returned 60 (wstat 15360, 0x3c00) > Failed 16/17 subtests > > <...snip...> > > Result: PASS > CJFIELDS/BioPerl-1.6.1.tar.gz > Tests succeeded but one dependency not OK (Ace) > CJFIELDS/BioPerl-1.6.1.tar.gz > [dependencies] -- NA > Running Build install > make test had returned bad status, won't install without force > Failed during this command: > LDS/AcePerl-1.92.tar.gz : make_test NO > CJFIELDS/BioPerl-1.6.1.tar.gz : make_test NO one dependency not OK (Ace) > > > ------------------------------------------------------------------------------------- > Katherine (Trina) McMahon, Associate Professor > Goddess of Funkosity > Departments of Civil and Environmental Engineering and Bacteriology > Environmental Chemistry and Technology Program > Limnology and Marine Science Program > Microbiology Doctoral Training Program > > **On sabbatical leave starting August 20, 2010** > > Mailing Address: > 3204 Engineering Hall, 1415 Engineering Drive > University of Wisconsin - Madison, Madison, WI 53706-1691 > > Alternate Office: 5552 Microbial Sciences Building > > Phone: 608/890-2836 Fax: 608/262-9865 > Email: tmcmahon at engr.wisc.edu > McMahon Lab: http://www.engr.wisc.edu/cee/faculty/mcmahon_katherine.html > North Temperate Lakes Microbial Observatory: http://microbes.limnology.wisc.edu/ > > ------------------------------------------------------------------------------------- 
> > > > > > > > > > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From Russell.Smithies at agresearch.co.nz Mon Jun 6 17:56:11 2011 From: Russell.Smithies at agresearch.co.nz (Smithies, Russell) Date: Tue, 7 Jun 2011 09:56:11 +1200 Subject: [Bioperl-l] Slightly OT - how to create a blast archive file? Message-ID: <18DF7D20DFEC044098A1062202F5FFF338731C9A60@exchsth.agresearch.co.nz> Slightly off-topic perhaps, but does anyone know how to create a blast archive file from existing blast output? I have 25GB of compressed blastp output that I need to trim to the top 50 descriptions and alignments from each of 2,500,000 queries as the application that's going to post-process the data (MEGAN) can't handle the volume. I thought the blast_formatter would be ideal but as far as I can see, it only accepts blast archive files :( Plan B is SearchIO to read then write each file but I suspect it might take a while. Any better suggestions? Russell Smithies ======================================================================= Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately. 
=======================================================================

From Russell.Smithies at agresearch.co.nz Mon Jun 6 18:02:45 2011
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 7 Jun 2011 10:02:45 +1200
Subject: [Bioperl-l] Bioperl graphics
In-Reply-To:
References: <714F90EB-6A33-47BE-B843-64A670EF91D9@illinois.edu>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF338731C9A61@exchsth.agresearch.co.nz>

Depending on what browser you're using, you can embed images in HTML using base64 encoding:

<img alt="Embedded Image" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAADIA..." />

We've used it before (a few years ago) but I think it only worked with Firefox then.

--Russell

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Scott Cain
> Sent: Tuesday, 7 June 2011 3:37 a.m.
> To: Chris Fields
> Cc: bioperl-l at lists.open-bio.org; Shachi Gahoi
> Subject: Re: [Bioperl-l] Bioperl graphics
>
> Hi Chris,
>
> Thanks for the catch--yes, that is exactly what I meant.
>
> Scott
>
>
> On Mon, Jun 6, 2011 at 11:24 AM, Chris Fields wrote:
> > Agreed, except you probably mean 'rtf' instead of 'rdf', correct?
> >
> > chris
> >
> > On Jun 6, 2011, at 8:31 AM, Scott Cain wrote:
> >
> >> Hi Sachi,
> >>
> >> You can't put graphics in a text file--that's why it's called a text
> >> file. You need a format that can do more, like pdf, html or rdf.
> >> What are these files going to be used for?
> >>
> >> Scott
> >>
> >>
> >> On Sat, Jun 4, 2011 at 7:03 AM, Shachi Gahoi wrote:
> >>> Dear All,
> >>>
> >>> My question is can I insert image file in text output file using perl?
> >>>
> >>> Half portion of my perl script output is "a1.txt" file and another half
> >>> portion of perl script output is "a1.png" (created by Bio::Graphics module).
> >>>
> >>> Now I want to insert a1.png image file in a1.txt file. or I can say that I
> >>> want to store image output file a1.png in a1.txt file.
> >>>
> >>> ####################################################################
> >>> Here is sample of my a1.txt file -
> >>>
> >>> Left primer: CARGAYATHATHTTYGCN  Length:18  Start:125  End:142
> >>> Opt_Tm:44.84  GC%: 22.22
> >>>
> >>> Right primer:GCNCGNGCNTAYAAYACN  Length:18  Start:828  End:845
> >>> Opt_Tm:53.82  GC%: 38.89
> >>>
> >>> ####################################################################
> >>> Desired output
> >>>
> >>> Left primer: CARGAYATHATHTTYGCN  Length:18  Start:125  End:142
> >>> Opt_Tm:44.84  GC%: 22.22
> >>>
> >>> Right primer:GCNCGNGCNTAYAAYACN  Length:18  Start:828  End:845
> >>> Opt_Tm:53.82  GC%: 38.89
> >>>
> >>> and now i want to store a1.png file in the below of a1.txt file
> >>>
> >>> Here I want to print my image file
> >>>
> >>> ###################################################################
> >>>
> >>> How can I do this. If anyone know please my help me.
> >>>
> >>> Thanks in advance
> >>>
> >>> --
> >>> Regards,
> >>> Shachi
>
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                            scott at scottcain dot net
> GMOD Coordinator (http://gmod.org/)           216-392-3087
> Ontario Institute for Cancer Research

From tmcmahon at cae.wisc.edu Mon Jun 6 18:24:38 2011
From: tmcmahon at cae.wisc.edu (Trina McMahon)
Date: Tue, 7 Jun 2011 00:24:38 +0200
Subject: [Bioperl-l] Installing Bioperl 1.6.1 on Mac OS 10.6.7
In-Reply-To: <5A652229-B22F-4631-B4F6-8220D8A866DD@illinois.edu>
References: <35BCC454-7B4A-46A0-9D5F-C3DCE2054A64@cae.wisc.edu> <5A652229-B22F-4631-B4F6-8220D8A866DD@illinois.edu>
Message-ID: <93366B9A-19E6-421B-8A24-9F9FDD1217C3@cae.wisc.edu>

duh! Thanks, Chris. That fixed it!! And now I know how to check which version is most recent on CPAN. :-)

trina

On Jun 6, 2011, at 11:46 PM, Chris Fields wrote:

> Trina,
>
> You shouldn't need AcePerl nor Graphviz. BTW, Lincoln has, for all intents and purposes, deprecated AcePerl now, so we have removed it as a requirement.
>
> Also, the latest version of BioPerl on CPAN is 1.6.901; I think some of these issues have been specifically addressed in that release. Can you try that release?
>
> chris
>
> On Jun 6, 2011, at 4:37 PM, Trina McMahon wrote:
>
>> Hi everyone,
>>
>> I was having some trouble with Bio::SeqUtils->cat() earlier today and with Roy's help I debugged my code. But it was still not working properly. I sent my script to a friend and he successfully ran it on his machine, producing the desired output. I am now suspecting that I need to upgrade my perl and bioperl. I hadn't upgraded anything in about 6 months (I was running ActivePerl-5.10 and BioPerl 1.6.0). So I ran my MacPorts upgrade, installed Perl 5.12, installed GraphViz, and configured CPAN as directed here:
>>
>> http://www.sysarchitects.com/bioperl
>>
>> but when I went to build bioperl 1.6.1, I got a lot of what look like fatal errors. Also, none of my scripts using bioperl work using this new build.
>>
>> Inspecting the errors, it seems like it is having trouble with AcePerl and maybe GraphViz (though this seemed to do ok with MacPorts??).
>>
>> Any advice welcome! I saw in the archive that I might not be the only one with this problem, so I hope someone has a work-around??
>>
>> thanks,
>> trina
>>
>> p.s. note that I am not using the perl that is installed with MacOS, I did a fresh installation in /opt/local/bin/ using MacPorts. I also am pretty sure my path is not the problem (I changed it to reflect the new Perl location)
>>
>> errors look like this:
>>
>> /opt/local/bin/perl "-Iblib/arch" "-Iblib/lib" util/install.PLS util/install.pl
>> Extracting install.pl (with variable substitutions)
>> LDS/AcePerl-1.92.tar.gz
>> /Developer/usr/bin/make -- OK
>> Running make test
>> PERL_DL_NONLAZY=1 /opt/local/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
>> t/basic.t ..... Waiting for remote acedb regression database to start up. This may take a few minutes.
>> t/basic.t ..... 2/10 Couldn't establish connection to database. Aborting tests.
>> t/basic.t ..... Dubious, test returned 60 (wstat 15360, 0x3c00)
>> Failed 9/10 subtests
>> t/object.t .... 2/36 Couldn't establish connection to database. Aborting tests.
>> t/object.t .... Dubious, test returned 60 (wstat 15360, 0x3c00)
>> Failed 35/36 subtests
>> t/sequence.t .. 2/54 Couldn't establish connection to database. Aborting tests.
>> t/sequence.t .. Dubious, test returned 60 (wstat 15360, 0x3c00)
>> Failed 53/54 subtests
>> t/update.t .... 2/17 Couldn't establish connection to database. Aborting tests.
>> t/update.t .... Dubious, test returned 60 (wstat 15360, 0x3c00)
>> Failed 16/17 subtests
>>
>> <...snip...>
>>
>> Result: PASS
>> CJFIELDS/BioPerl-1.6.1.tar.gz
>> Tests succeeded but one dependency not OK (Ace)
>> CJFIELDS/BioPerl-1.6.1.tar.gz
>> [dependencies] -- NA
>> Running Build install
>> make test had returned bad status, won't install without force
>> Failed during this command:
>> LDS/AcePerl-1.92.tar.gz : make_test NO
>> CJFIELDS/BioPerl-1.6.1.tar.gz : make_test NO one dependency not OK (Ace)
>>
>>
>> -------------------------------------------------------------------------------------
>> Katherine (Trina) McMahon, Associate Professor
>> Goddess of Funkosity
>> Departments of Civil and Environmental Engineering and Bacteriology
>> Environmental Chemistry and Technology Program
>> Limnology and Marine Science Program
>> Microbiology Doctoral Training Program
>>
>> **On sabbatical leave starting August 20, 2010**
>>
>> Mailing Address:
>> 3204 Engineering Hall, 1415 Engineering Drive
>> University of Wisconsin - Madison, Madison, WI 53706-1691
>>
>> Alternate Office: 5552 Microbial Sciences Building
>>
>> Phone: 608/890-2836 Fax: 608/262-9865
>> Email: tmcmahon at engr.wisc.edu
>> McMahon Lab: http://www.engr.wisc.edu/cee/faculty/mcmahon_katherine.html
>> North Temperate Lakes Microbial Observatory: http://microbes.limnology.wisc.edu/
>>
>> -------------------------------------------------------------------------------------

From shachigahoimbi at gmail.com Tue Jun 7 00:08:57 2011
From: shachigahoimbi at gmail.com (Shachi Gahoi)
Date: Tue, 7 Jun 2011 09:38:57 +0530
Subject: [Bioperl-l] Bioperl graphics
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF338731C9A61@exchsth.agresearch.co.nz>
References: <714F90EB-6A33-47BE-B843-64A670EF91D9@illinois.edu> <18DF7D20DFEC044098A1062202F5FFF338731C9A61@exchsth.agresearch.co.nz>
Message-ID:

If I cant put graphics in my text file then is it possible to insert graphics in .doc file using perl script?

Actually I want to save graphics output as well as text output in one file using perl script.

Is there any other way to do that please help me.

On Tue, Jun 7, 2011 at 3:32 AM, Smithies, Russell <Russell.Smithies at agresearch.co.nz> wrote:

> Depending on what browser you're using, you can embed images in HTML using
> base64 encoding:
>
> <img alt="Embedded Image" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAADIA..."
> />
>
> We've used it before (a few years ago) but I think it only worked with
> Firefox then.
>
> --Russell

--
Regards,
Shachi

From Russell.Smithies at agresearch.co.nz Tue Jun 7 00:19:21 2011
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 7 Jun 2011 16:19:21 +1200
Subject: [Bioperl-l] Bioperl graphics
In-Reply-To:
References: <714F90EB-6A33-47BE-B843-64A670EF91D9@illinois.edu> <18DF7D20DFEC044098A1062202F5FFF338731C9A61@exchsth.agresearch.co.nz>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF338731C9A66@exchsth.agresearch.co.nz>

I assume you're using Windows if you're trying to write a .doc?
Try Win32::OLE
Example here: http://www.perlmonks.org/?node_id=153486

--Russell

From: Shachi Gahoi [mailto:shachigahoimbi at gmail.com]
Sent: Tuesday, 7 June 2011 4:09 p.m.
To: Smithies, Russell
Cc: Scott Cain; Chris Fields; bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Bioperl graphics

If I cant put graphics in my text file then is it possible to insert graphics in .doc file using perl script?

Actually I want to save graphics output as well as text output in one file using perl script.

Is there any other way to do that please help me.

--
Regards,
Shachi

From Russell.Smithies at agresearch.co.nz Tue Jun 7 00:28:05 2011
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 7 Jun 2011 16:28:05 +1200
Subject: [Bioperl-l] Bioperl graphics
In-Reply-To:
References: <714F90EB-6A33-47BE-B843-64A670EF91D9@illinois.edu> <18DF7D20DFEC044098A1062202F5FFF338731C9A61@exchsth.agresearch.co.nz>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF338731C9A67@exchsth.agresearch.co.nz>

http://tinyurl.com/3mebmjj

--Russell

From: Shachi Gahoi [mailto:shachigahoimbi at gmail.com]
Sent: Tuesday, 7 June 2011 4:09 p.m.
To: Smithies, Russell
Cc: Scott Cain; Chris Fields; bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Bioperl graphics

If I cant put graphics in my text file then is it possible to insert graphics in .doc file using perl script?

Actually I want to save graphics output as well as text output in one file using perl script.

Is there any other way to do that please help me.

--
Regards,
Shachi

From shachigahoimbi at gmail.com Tue Jun 7 00:36:25 2011
From: shachigahoimbi at gmail.com (Shachi Gahoi)
Date: Tue, 7 Jun 2011 10:06:25 +0530
Subject: [Bioperl-l] Bioperl graphics
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF338731C9A67@exchsth.agresearch.co.nz>
References: <714F90EB-6A33-47BE-B843-64A670EF91D9@illinois.edu> <18DF7D20DFEC044098A1062202F5FFF338731C9A61@exchsth.agresearch.co.nz> <18DF7D20DFEC044098A1062202F5FFF338731C9A67@exchsth.agresearch.co.nz>
Message-ID:

No i am working on ubuntu

On Tue, Jun 7, 2011 at 9:58 AM, Smithies, Russell <Russell.Smithies at agresearch.co.nz> wrote:

> http://tinyurl.com/3mebmjj
>
> --Russell
>
> From: Shachi Gahoi [mailto:shachigahoimbi at gmail.com]
> Sent: Tuesday, 7 June 2011 4:09 p.m.
> To: Smithies, Russell
> Cc: Scott Cain; Chris Fields; bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Bioperl graphics
>
> If I cant put graphics in my text file then is it possible to insert
> graphics in .doc file using perl script?
>
> Actually I want to save graphics output as well as text output in one file
> using perl script.
>
> Is there any other way to do that please help me.
>
> --
> Regards,
> Shachi

--
Regards,
Shachi

From Russell.Smithies at agresearch.co.nz Tue Jun 7 00:39:13 2011
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 7 Jun 2011 16:39:13 +1200
Subject: [Bioperl-l] Bioperl graphics
In-Reply-To:
References: <714F90EB-6A33-47BE-B843-64A670EF91D9@illinois.edu> <18DF7D20DFEC044098A1062202F5FFF338731C9A61@exchsth.agresearch.co.nz> <18DF7D20DFEC044098A1062202F5FFF338731C9A67@exchsth.agresearch.co.nz>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF338731C9A68@exchsth.agresearch.co.nz>

Try PDF::Create and use pdfs instead.

--Russell

From: Shachi Gahoi [mailto:shachigahoimbi at gmail.com]
Sent: Tuesday, 7 June 2011 4:36 p.m.
To: Smithies, Russell Cc: Scott Cain; Chris Fields; bioperl-l at lists.open-bio.org Subject: Re: [Bioperl-l] Bioperl graphics No i am working on ubuntu On Tue, Jun 7, 2011 at 9:58 AM, Smithies, Russell > wrote: http://tinyurl.com/3mebmjj --Russell From: Shachi Gahoi [mailto:shachigahoimbi at gmail.com] Sent: Tuesday, 7 June 2011 4:09 p.m. To: Smithies, Russell Cc: Scott Cain; Chris Fields; bioperl-l at lists.open-bio.org Subject: Re: [Bioperl-l] Bioperl graphics If I cant put graphics in my text file then is it possible to insert graphics in .doc file using perl script? Actually I want to save graphics output as well as text output in one file using perl script. Is there any other way to do that please help me. On Tue, Jun 7, 2011 at 3:32 AM, Smithies, Russell > wrote: Depending on what browser you're using, you can embed images in HTML using base64 encoding: Embedded Image We've used it before (a few years ago) but I think it only worked with Firefox then. --Russell > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Scott Cain > Sent: Tuesday, 7 June 2011 3:37 a.m. > To: Chris Fields > Cc: bioperl-l at lists.open-bio.org; Shachi Gahoi > Subject: Re: [Bioperl-l] Bioperl graphics > > Hi Chris, > > Thanks for the catch--yes, that is exactly what I meant. > > Scott > > > On Mon, Jun 6, 2011 at 11:24 AM, Chris Fields > > wrote: > > Agreed, except you probably mean 'rtf' instead of 'rdf', correct? > > > > chris > > > > On Jun 6, 2011, at 8:31 AM, Scott Cain wrote: > > > >> Hi Sachi, > >> > >> You can't put graphics in a text file--that's why it's called a text > >> file. You need a format that can do more, like pdf, html or rdf. > >> What are these files going to be used for? > >> > >> Scott > >> > >> > >> On Sat, Jun 4, 2011 at 7:03 AM, Shachi Gahoi > > wrote: > >>> Dear All, > >>> > >>> My question is can I insert image file in text output file using > perl? 
> >>> > >>> Half portion of my perl script output is "a1.txt" file and another > half > >>> portion of perl script output is "a1.png" (created by Bio::Graphics > module). > >>> > >>> Now I want to insert a1.png image file in a1.txt file. or I can say > that I > >>> want to store image output file a1.png in a1.txt file. > >>> > >>> > ####################################################################* > >>> Here is sample of my a1.txt file *- > >>> > >>> Left primer: CARGAYATHATHTTYGCN Length:18 Start:125 End:142 > >>> Opt_Tm:44.84 GC%: 22.22 > >>> > >>> Right primer:GCNCGNGCNTAYAAYACN Length:18 Start:828 End:845 > >>> Opt_Tm:53.82 GC%: 38.89 > >>> > >>> > #################################################################### > >>> *Desired output * > >>> > >>> Left primer: CARGAYATHATHTTYGCN Length:18 Start:125 End:142 > >>> Opt_Tm:44.84 GC%: 22.22 > >>> > >>> Right primer:GCNCGNGCNTAYAAYACN Length:18 Start:828 End:845 > >>> Opt_Tm:53.82 GC%: 38.89 > >>> > >>> and now i want to store a1.png file in the below of a1.txt file > >>> > >>> *Here I want to print my image file > >>> * > >>> > >>> > >>> ################################################################### > >>> > >>> How can I do this. If anyone know please my help me. > >>> > >>> > >>> Thanks in advance > >>> > >>> > >>> -- > >>> Regards, > >>> Shachi > >>> _______________________________________________ > >>> Bioperl-l mailing list > >>> Bioperl-l at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>> > >> > >> > >> > >> -- > >> -------------------------------------------------------------------- > ---- > >> Scott Cain, Ph. D. 
======================================================================= -- Regards, Shachi -- Regards, Shachi From shachigahoimbi at gmail.com Tue Jun 7 01:04:55 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Tue, 7 Jun 2011 10:34:55 +0530 Subject: [Bioperl-l] Bioperl graphics In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF338731C9A68@exchsth.agresearch.co.nz> References: <714F90EB-6A33-47BE-B843-64A670EF91D9@illinois.edu> <18DF7D20DFEC044098A1062202F5FFF338731C9A61@exchsth.agresearch.co.nz> <18DF7D20DFEC044098A1062202F5FFF338731C9A67@exchsth.agresearch.co.nz> <18DF7D20DFEC044098A1062202F5FFF338731C9A68@exchsth.agresearch.co.nz> Message-ID: thanks On Tue, Jun 7, 2011 at 10:09 AM, Smithies, Russell < Russell.Smithies at agresearch.co.nz> wrote: > Try PDF::Create and use pdfs instead. > > > > --Russell > > > > > > *From:* Shachi Gahoi [mailto:shachigahoimbi at gmail.com] > *Sent:* Tuesday, 7 June 2011 4:36 p.m. > > *To:* Smithies, Russell > *Cc:* Scott Cain; Chris Fields; bioperl-l at lists.open-bio.org > *Subject:* Re: [Bioperl-l] Bioperl graphics > > > > No i am working on ubuntu > > On Tue, Jun 7, 2011 at 9:58 AM, Smithies, Russell < > Russell.Smithies at agresearch.co.nz> wrote: > > http://tinyurl.com/3mebmjj > > > --Russell > > > > From: Shachi Gahoi [mailto:shachigahoimbi at gmail.com] > Sent: Tuesday, 7 June 2011 4:09 p.m. > To: Smithies, Russell > > Cc: Scott Cain; Chris Fields; bioperl-l at lists.open-bio.org > > Subject: Re: [Bioperl-l] Bioperl graphics > > If I cant put graphics in my text file then is it possible to insert > graphics in .doc file using perl script? > > Actually I want to save graphics output as well as text output in one file > using perl script. > > Is there any other way to do that please help me. 
> > On Tue, Jun 7, 2011 at 3:32 AM, Smithies, Russell < > Russell.Smithies at agresearch.co.nz> wrote: > Depending on what browser you're using, you can embed images in HTML using > base64 encoding: > > Embedded Image src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAADIA..." /> > > We've used it before (a few years ago) but I think it only worked with > Firefox then. > > --Russell > > > > > > -----Original Message----- > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > > bounces at lists.open-bio.org] On Behalf Of Scott Cain > > Sent: Tuesday, 7 June 2011 3:37 a.m. > > To: Chris Fields > > Cc: bioperl-l at lists.open-bio.org; Shachi Gahoi > > Subject: Re: [Bioperl-l] Bioperl graphics > > > > Hi Chris, > > > > Thanks for the catch--yes, that is exactly what I meant. > > > > Scott > > > > > > On Mon, Jun 6, 2011 at 11:24 AM, Chris Fields > > wrote: > > > Agreed, except you probably mean 'rtf' instead of 'rdf', correct? > > > > > > chris > > > > > > On Jun 6, 2011, at 8:31 AM, Scott Cain wrote: > > > > > >> Hi Sachi, > > >> > > >> You can't put graphics in a text file--that's why it's called a text > > >> file. You need a format that can do more, like pdf, html or rdf. > > >> What are these files going to be used for? > > >> > > >> Scott > > >> > > >> > > >> On Sat, Jun 4, 2011 at 7:03 AM, Shachi Gahoi > > wrote: > > >>> Dear All, > > >>> > > >>> My question is can I insert image file in text output file using > > perl? > > >>> > > >>> Half portion of my perl script output is "a1.txt" file and another > > half > > >>> portion of perl script output is "a1.png" (created by Bio::Graphics > > module). > > >>> > > >>> Now I want to insert a1.png image file in a1.txt file. or I can say > > that I > > >>> want to store image output file a1.png in a1.txt file. 
> > >>> > > >>> > > ####################################################################* > > >>> Here is sample of my a1.txt file *- > > >>> > > >>> Left primer: CARGAYATHATHTTYGCN Length:18 Start:125 End:142 > > >>> Opt_Tm:44.84 GC%: 22.22 > > >>> > > >>> Right primer:GCNCGNGCNTAYAAYACN Length:18 Start:828 End:845 > > >>> Opt_Tm:53.82 GC%: 38.89 > > >>> > > >>> > > #################################################################### > > >>> *Desired output * > > >>> > > >>> Left primer: CARGAYATHATHTTYGCN Length:18 Start:125 End:142 > > >>> Opt_Tm:44.84 GC%: 22.22 > > >>> > > >>> Right primer:GCNCGNGCNTAYAAYACN Length:18 Start:828 End:845 > > >>> Opt_Tm:53.82 GC%: 38.89 > > >>> > > >>> and now i want to store a1.png file in the below of a1.txt file > > >>> > > >>> *Here I want to print my image file > > >>> * > > >>> > > >>> > > >>> ################################################################### > > >>> > > >>> How can I do this. If anyone know please my help me. > > >>> > > >>> > > >>> Thanks in advance > > >>> > > >>> > > >>> -- > > >>> Regards, > > >>> Shachi > > >>> _______________________________________________ > > >>> Bioperl-l mailing list > > >>> Bioperl-l at lists.open-bio.org > > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > >>> > > >> > > >> > > >> > > >> -- > > >> -------------------------------------------------------------------- > > ---- > > >> Scott Cain, Ph. D. scott at > > scottcain dot net > > >> GMOD Coordinator (http://gmod.org/) 216-392-3087 > > >> Ontario Institute for Cancer Research > > >> > > >> _______________________________________________ > > >> Bioperl-l mailing list > > >> Bioperl-l at lists.open-bio.org > > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > > > > > > > > -- > > ----------------------------------------------------------------------- > > - > > Scott Cain, Ph. D. 
scott at scottcain > > dot net > > GMOD Coordinator (http://gmod.org/) 216-392-3087 > > Ontario Institute for Cancer Research > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > ======================================================================= > Attention: The information contained in this message and/or attachments > from AgResearch Limited is intended only for the persons or entities > to which it is addressed and may contain confidential and/or privileged > material. Any review, retransmission, dissemination or other use of, or > taking of any action in reliance upon, this information by persons or > entities other than the intended recipients is prohibited by AgResearch > Limited. If you have received this message in error, please notify the > sender immediately. > ======================================================================= > > > > -- > Regards, > Shachi > > > > > -- > Regards, > Shachi > -- Regards, Shachi From tejaminnu at gmail.com Tue Jun 7 00:20:00 2011 From: tejaminnu at gmail.com (tejaminnu) Date: Mon, 6 Jun 2011 21:20:00 -0700 (PDT) Subject: [Bioperl-l] Installing Bioperl 1.6.1 on Mac OS 10.6.7 In-Reply-To: <35BCC454-7B4A-46A0-9D5F-C3DCE2054A64@cae.wisc.edu> References: <35BCC454-7B4A-46A0-9D5F-C3DCE2054A64@cae.wisc.edu> Message-ID: <31788966.post@talk.nabble.com> Hi Trina, Try re-installing. But this time install as 'su' (Super user) in this Unix shell. Regards Teja -- View this message in context: http://old.nabble.com/Installing-Bioperl-1.6.1-on-Mac-OS-10.6.7-tp31787371p31788966.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. 
From tejaminnu at gmail.com Tue Jun 7 00:21:35 2011
From: tejaminnu at gmail.com (tejaminnu)
Date: Mon, 6 Jun 2011 21:21:35 -0700 (PDT)
Subject: [Bioperl-l] Problem installing BioPerl
In-Reply-To:
References:
Message-ID: <31788976.post@talk.nabble.com>

Hi Sophie,

Try installing as a super user ('su') in the Unix shell.

--
View this message in context: http://old.nabble.com/Problem-installing-BioPerl-tp31749997p31788976.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From jonas.zierer at campus.lmu.de Tue Jun 7 06:01:27 2011
From: jonas.zierer at campus.lmu.de (Jonas Zierer)
Date: Tue, 07 Jun 2011 12:01:27 +0200
Subject: [Bioperl-l] read coverage
Message-ID: <1307440887.6417.36.camel@hensel.bio.ifi.lmu.de>

Hi,

I use Bio::DB::Sam to analyze BAM files which contain aligned transcriptome reads. Can I use the coverage function to get

1. the coverage of all reads on one strand (not all strands, as is the default)
2. the coverage of all read start positions only

thx
bye

From sheena.scroggins at gmail.com Tue Jun 7 10:40:47 2011
From: sheena.scroggins at gmail.com (Sheena Scroggins)
Date: Tue, 7 Jun 2011 07:40:47 -0700
Subject: [Bioperl-l] GSoC Progess Update
Message-ID:

I've updated my GSoC project blog at http://techomics.com/ . I'm still in the process of making the Dist::Zilla pluginbundle and testing it. I'll make sure to do weekly updates to keep everyone informed of the progress.

Sheena

From liam.elbourne at mq.edu.au Tue Jun 7 17:30:15 2011
From: liam.elbourne at mq.edu.au (Liam Elbourne)
Date: Wed, 8 Jun 2011 07:30:15 +1000
Subject: [Bioperl-l] Installing Bioperl 1.6.1 on Mac OS 10.6.7
In-Reply-To: <31788966.post@talk.nabble.com>
References: <35BCC454-7B4A-46A0-9D5F-C3DCE2054A64@cae.wisc.edu> <31788966.post@talk.nabble.com>
Message-ID:

"sudo" is the preferred option on Mac OS X (and other systems). But it shouldn't matter if it's being installed in a user directory.

Liam Elbourne.
On 07/06/2011, at 2:20 PM, tejaminnu wrote: > > Hi Trina, > > Try re-installing. But this time install as 'su' (Super user) in this Unix > shell. > > Regards > Teja > -- > View this message in context: http://old.nabble.com/Installing-Bioperl-1.6.1-on-Mac-OS-10.6.7-tp31787371p31788966.html > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From locarpau at upvnet.upv.es Tue Jun 7 22:26:28 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero) Date: Wed, 08 Jun 2011 04:26:28 +0200 Subject: [Bioperl-l] model 3 on Codeml.pm In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF338731C9A60@exchsth.agresearch.co.nz> References: <18DF7D20DFEC044098A1062202F5FFF338731C9A60@exchsth.agresearch.co.nz> Message-ID: <4DEEDDD4.2040002@upvnet.upv.es> Hi, I'm trying to run the clade model D (Model D: model = 3, NSsites = 3 ncatG = 2, See reference. Bielawski, J. P., and Z. Yang. 2004. A maximum likelihood method for detecting functional divergence at individual codon sites, with application to gene family evolution. Journal of Molecular Evolution 59:121-132. and PAML 4.4 manual page 30) from the Bio::Tools::Run::Phylo::PAML::Codeml module. However, 3 is not among the valid values that can be passed to the module (line 275 'model' => [0..2,7],) and consequently the following Warning message is returned from lines 689-690: 'MSG: parameter ncatG specified value is not recognized, please see the documentation and the code for this module or set the no_param_checks to a true value.' Can line 275: 'model' => [0..2,7], be changed to 'model' => [0..3,7], to accept value 3 or additional changes must be done in other modules to properly run the so-called clade models of PAML. 
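To make the question concrete, here is a minimal sketch of how a list-based check of allowed values would treat model = 3 before and after the proposed edit. The hash names and the is_allowed helper are hypothetical and are not copied from Codeml.pm; whether the module performs its check exactly this way is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the table of allowed values; the real
# structure inside Codeml.pm may differ.
my %before = ( model => [ 0 .. 2, 7 ] );
my %after  = ( model => [ 0 .. 3, 7 ] );

# Return 1 if $value is listed for $param in the given table, else 0.
sub is_allowed {
    my ( $table, $param, $value ) = @_;
    my $allowed = $table->{$param} or return 0;
    return ( grep { $_ == $value } @$allowed ) ? 1 : 0;
}

my $ok_before = is_allowed( \%before, 'model', 3 );
my $ok_after  = is_allowed( \%after,  'model', 3 );

print "model=3 accepted before the change: $ok_before\n";
print "model=3 accepted after the change:  $ok_after\n";
```

This only shows the list lookup itself; it says nothing about whether other parts of the wrapper or the result parser also need changes for the clade models.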
Thanks, Lorenzo -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail: locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* From jason.stajich at gmail.com Wed Jun 8 01:15:05 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Wed, 8 Jun 2011 00:15:05 -0500 Subject: [Bioperl-l] model 3 on Codeml.pm In-Reply-To: <4DEEDDD4.2040002@upvnet.upv.es> References: <18DF7D20DFEC044098A1062202F5FFF338731C9A60@exchsth.agresearch.co.nz> <4DEEDDD4.2040002@upvnet.upv.es> Message-ID: <894644EC-C39E-4AF2-99E9-E46DBF3EFDD3@gmail.com> it is in github so you can fork a version, make the change, and submit a patch which we can pick up. I am concerned that this change can't be tested without example code. Did you just edit the code and make sure your changes worked, the error message seems to refer to a different parameter (ncatG) not model. Thanks, > MSG: parameter ncatG specified value is not recognized, please see the documentation and the code for this module or set the no_param_checks to a true value.' On Jun 7, 2011, at 9:26 PM, Lorenzo Carretero wrote: > Hi, > I'm trying to run the clade model D (Model D: model = 3, NSsites = 3 ncatG = 2, See reference. Bielawski, J. P., and Z. Yang. 2004. A maximum likelihood method for detecting functional divergence at individual codon sites, with application to gene family evolution. Journal of Molecular Evolution 59:121-132. and PAML 4.4 manual page 30) from the Bio::Tools::Run::Phylo::PAML::Codeml module. 
However, 3 is not among the valid values that can be passed to the module (line 275 'model' => [0..2,7],) and consequently the following Warning message is returned from lines 689-690: > 'MSG: parameter ncatG specified value is not recognized, please see the documentation and the code for this module or set the no_param_checks to a true value.' > Can line 275: 'model' => [0..2,7], be changed to 'model' => [0..3,7], to accept value 3 or additional changes must be done in other modules to properly run the so-called clade models of PAML. > Thanks, > Lorenzo > > > -- > *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* > Lorenzo Carretero Paulet > Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) > Integrative Systems Biology Group > C/ Ingeniero Fausto Elio s/n. > 46022 Valencia, Spain > > Phone: +34 963879934 > Fax: +34 963877859 > e-mail: locarpau at upvnet.upv.es > *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From shachigahoimbi at gmail.com Wed Jun 8 04:48:02 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Wed, 8 Jun 2011 14:18:02 +0530 Subject: [Bioperl-l] problem in PDF::Create module Message-ID: Dear All, I am using PDF::Create module of bioperl to insert image in pdf file, Here is the script on which I am working ###################################################################### #!/usr/bin/perl use PDF; use PDF::Create; my $pdf = new PDF::Create('filename' => 'image.pdf'); my $a4 = $pdf->new_page('MediaBox' => $pdf->get_page_size('A4')); my $page = $a4->new_page; $page->image('domain.jpeg'); $pdf->close; ############################################################################# In this script I am trying to insert domain.jpeg image in new pdf file image.pdf. 
But when I am running the script, It is giving one error--- Error- ############################################################################# *Can't use string ("1.2") as a HASH ref while "strict refs" in use at /usr/local/share/perl/5.10.1/PDF/Create/Page.pm line 474. * ############################################################################# I am new to this module (PDF::Create). Is there any error in my script. Please help me. Thanks in advance * * -- Regards, Shachi From dhoworth at mrc-lmb.cam.ac.uk Wed Jun 8 05:31:41 2011 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Wed, 08 Jun 2011 10:31:41 +0100 Subject: [Bioperl-l] problem in PDF::Create module In-Reply-To: References: Message-ID: <4DEF417D.8070409@mrc-lmb.cam.ac.uk> Shachi Gahoi wrote: > Dear All, > > I am using PDF::Create module of bioperl to insert image in pdf file, > > Here is the script on which I am working > > ###################################################################### > > #!/usr/bin/perl > > use PDF; > use PDF::Create; > > my $pdf = new PDF::Create('filename' => 'image.pdf'); > > my $a4 = $pdf->new_page('MediaBox' => $pdf->get_page_size('A4')); > > my $page = $a4->new_page; > > $page->image('domain.jpeg'); > > $pdf->close; > > ############################################################################# > > In this script I am trying to insert domain.jpeg image in new pdf file > image.pdf. But when I am running the script, It is giving one error--- > > Error- > ############################################################################# > *Can't use string ("1.2") as a HASH ref while "strict refs" in use at > /usr/local/share/perl/5.10.1/PDF/Create/Page.pm line 474. > * > ############################################################################# > > I am new to this module (PDF::Create). Is there any error in my script. > Please help me. I'm new to it too. 
It seems PDF::Create is not very well documented :( Here's some code that works: #!/usr/bin/perl use strict; use warnings; #use PDF; use PDF::Create; my $pdf = new PDF::Create('filename' => 'image.pdf'); my $a4 = $pdf->new_page('MediaBox' => $pdf->get_page_size('A4')); my $page = $a4->new_page; my $image = $pdf->image('domain.jpeg'); #$page->image($image, 10, 10); $page->image(image => $image, xpos => 10, ypos => 10); $pdf->close; Hints: (1) always use strict; use warnings; (2) there's no need to use PDF (3) the image method you called should be invoked by the document, not the page. That's documented, though not very clearly. (4) you then need to invoke the page's image method. It is described later in the POD, but the description of the arguments is plain wrong. (Read its code to find out what the real arguments are!) (5) general perl questions like this are probably better asked at perlmonks rather than here HTH, Dave From shachigahoimbi at gmail.com Wed Jun 8 06:07:00 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Wed, 8 Jun 2011 15:37:00 +0530 Subject: [Bioperl-l] problem in PDF::Create module In-Reply-To: <4DEF417D.8070409@mrc-lmb.cam.ac.uk> References: <4DEF417D.8070409@mrc-lmb.cam.ac.uk> Message-ID: Thanks, It works On Wed, Jun 8, 2011 at 3:01 PM, Dave Howorth wrote: > Shachi Gahoi wrote: > > Dear All, > > > > I am using PDF::Create module of bioperl to insert image in pdf file, > > > > Here is the script on which I am working > > > > ###################################################################### > > > > #!/usr/bin/perl > > > > use PDF; > > use PDF::Create; > > > > my $pdf = new PDF::Create('filename' => 'image.pdf'); > > > > my $a4 = $pdf->new_page('MediaBox' => $pdf->get_page_size('A4')); > > > > my $page = $a4->new_page; > > > > $page->image('domain.jpeg'); > > > > $pdf->close; > > > > > ############################################################################# > > > > In this script I am trying to insert domain.jpeg image in new 
pdf file > > image.pdf. But when I am running the script, It is giving one error--- > > > > Error- > > > ############################################################################# > > *Can't use string ("1.2") as a HASH ref while "strict refs" in use at > > /usr/local/share/perl/5.10.1/PDF/Create/Page.pm line 474. > > * > > > ############################################################################# > > > > I am new to this module (PDF::Create). Is there any error in my script. > > Please help me. > > I'm new to it too. It seems PDF::Create is not very well documented :( > > Here's some code that works: > > #!/usr/bin/perl > use strict; > use warnings; > > #use PDF; > use PDF::Create; > > my $pdf = new PDF::Create('filename' => 'image.pdf'); > > my $a4 = $pdf->new_page('MediaBox' => $pdf->get_page_size('A4')); > > my $page = $a4->new_page; > > my $image = $pdf->image('domain.jpeg'); > > #$page->image($image, 10, 10); > $page->image(image => $image, xpos => 10, ypos => 10); > > $pdf->close; > > Hints: > (1) always use strict; use warnings; > (2) there's no need to use PDF > (3) the image method you called should be invoked by the document, not > the page. That's documented, though not very clearly. > (4) you then need to invoke the page's image method. It is described > later in the POD, but the description of the arguments is plain wrong. > (Read its code to find out what the real arguments are!) 
> (5) general perl questions like this are probably better asked at > perlmonks rather than here > > HTH, Dave > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Regards, Shachi From tristan.lefebure at gmail.com Wed Jun 8 08:45:22 2011 From: tristan.lefebure at gmail.com (Tristan Lefebure) Date: Wed, 8 Jun 2011 14:45:22 +0200 Subject: [Bioperl-l] Bio::AlignIO::Mase Message-ID: <201106081445.22322.tristan.lefebure@gmail.com> Hi there, I have some weird alignments with some numerical code stored within the sequence strings (eg. frameshift genewise code). Most AlignIO module I have tried eat them without any trouble except for Bio::AlignIO::Mase. The following patch seems to do the trick: diff -u mase.pm mase_mod.pm --- mase.pm 2011-06-08 14:08:58.558033996 +0200 +++ mase_mod.pm 2011-06-08 14:09:20.388066014 +0200 @@ -109,7 +109,7 @@ while( $entry = $self->_readline) { $entry =~ /^;/ && last; - $entry =~ s/[^A-Za-z\.\-]//g; + $entry =~ s/[^A-Za-z0-9\.\-]//g; $seq .= $entry; } if( $end == -1) { But I am left with the feeling that I don't really understand why this works (which I don't quite like before pushing a patch...) Why doing a s///g instead of a simple m//, and why doing '/[^' and not '/^['... Is that linked to that fact that $/ was modified to read chunks of files? BTW where is $/ set? I searched in Bio::Root::IO but didn't find it... Oh so many questions... Thanks! -- Tristan From cjfields at illinois.edu Wed Jun 8 09:21:41 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 8 Jun 2011 08:21:41 -0500 Subject: [Bioperl-l] Bio::AlignIO::Mase In-Reply-To: <201106081445.22322.tristan.lefebure@gmail.com> References: <201106081445.22322.tristan.lefebure@gmail.com> Message-ID: <837A9E9C-F0D9-4C96-B1C8-1FB5895AA6EA@illinois.edu> Tristan, This very well may be a bug. Have you run the test suite for this module? 
It should be something like t/AlignIO/mase.t. chris On Jun 8, 2011, at 7:45 AM, Tristan Lefebure wrote: > Hi there, > > I have some weird alignments with some numerical code stored > within the sequence strings (eg. frameshift genewise code). > Most AlignIO module I have tried eat them without any > trouble except for Bio::AlignIO::Mase. > > The following patch seems to do the trick: > > diff -u mase.pm mase_mod.pm > --- mase.pm 2011-06-08 14:08:58.558033996 +0200 > +++ mase_mod.pm 2011-06-08 14:09:20.388066014 +0200 > @@ -109,7 +109,7 @@ > > while( $entry = $self->_readline) { > $entry =~ /^;/ && last; > - $entry =~ s/[^A-Za-z\.\-]//g; > + $entry =~ s/[^A-Za-z0-9\.\-]//g; > $seq .= $entry; > } > if( $end == -1) { > > But I am left with the feeling that I don't really > understand why this works (which I don't quite like before > pushing a patch...) > > Why doing a s///g instead of a simple m//, and why doing > '/[^' and not '/^['... Is that linked to that fact that $/ > was modified to read chunks of files? BTW where is $/ set? I > searched in Bio::Root::IO but didn't find it... > > Oh so many questions... > > Thanks! > > -- > Tristan > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From locarpau at upvnet.upv.es Wed Jun 8 09:22:58 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero) Date: Wed, 08 Jun 2011 15:22:58 +0200 Subject: [Bioperl-l] model 3 on Codeml.pm In-Reply-To: <894644EC-C39E-4AF2-99E9-E46DBF3EFDD3@gmail.com> References: <18DF7D20DFEC044098A1062202F5FFF338731C9A60@exchsth.agresearch.co.nz> <4DEEDDD4.2040002@upvnet.upv.es> <894644EC-C39E-4AF2-99E9-E46DBF3EFDD3@gmail.com> Message-ID: <4DEF77B2.60300@upvnet.upv.es> Jason, I tried to change codeml.pm on my computer to accept 3 as model value but still have the same message. 
The warning message I have, even after doing the mentioned change in codeml.pm, is actually complaining about the value passed to model (I wrongly posted a warning message from a different script). Despite that, the program seems to run fine but results seem not to be exactly the same as those resulting from manual run of codeml. Thanks a lot, Lorenzo PS: Let me take a look at github and all this. I really don't know what is it. On 6/8/11 7:15 AM, Jason Stajich wrote: > it is in github so you can fork a version, make the change, and submit a patch which we can pick up. > > I am concerned that this change can't be tested without example code. > > Did you just edit the code and make sure your changes worked, the error message seems to refer to a different parameter > (ncatG) not model. > > Thanks, > >> MSG: parameter ncatG specified value is not recognized, please see the documentation and the code for this module or set the no_param_checks to a true value.' > > On Jun 7, 2011, at 9:26 PM, Lorenzo Carretero wrote: > >> Hi, >> I'm trying to run the clade model D (Model D: model = 3, NSsites = 3 ncatG = 2, See reference. Bielawski, J. P., and Z. Yang. 2004. A maximum likelihood method for detecting functional divergence at individual codon sites, with application to gene family evolution. Journal of Molecular Evolution 59:121-132. and PAML 4.4 manual page 30) from the Bio::Tools::Run::Phylo::PAML::Codeml module. However, 3 is not among the valid values that can be passed to the module (line 275 'model' => [0..2,7],) and consequently the following Warning message is returned from lines 689-690: >> 'MSG: parameter ncatG specified value is not recognized, please see the documentation and the code for this module or set the no_param_checks to a true value.' >> Can line 275: 'model' => [0..2,7], be changed to 'model' => [0..3,7], to accept value 3 or additional changes must be done in other modules to properly run the so-called clade models of PAML. 
>> Thanks, >> Lorenzo >> >> >> -- >> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >> Lorenzo Carretero Paulet >> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) >> Integrative Systems Biology Group >> C/ Ingeniero Fausto Elio s/n. >> 46022 Valencia, Spain >> >> Phone: +34 963879934 >> Fax: +34 963877859 >> e-mail: locarpau at upvnet.upv.es >> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail: locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* From jason.stajich at gmail.com Wed Jun 8 10:27:44 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Wed, 8 Jun 2011 09:27:44 -0500 Subject: [Bioperl-l] Bio::AlignIO::Mase In-Reply-To: <201106081445.22322.tristan.lefebure@gmail.com> References: <201106081445.22322.tristan.lefebure@gmail.com> Message-ID: <5AD73D48-FB3B-4F5C-8AF8-0670C90EE8CE@gmail.com> Hi Tristan - This regular expression is to is to strip everything that isn't a letter, . or - the [^] means match everything EXCEPT what follows. I guess if numeric values are valid in these type of alignments you would just add \d (instead of 0-9) So you are asking for the parser to not strip out frameshift info from a MASE parser? This doesn't have anything to do with the chunk pattern or size set with $/ AFAIK. 
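As a small illustration of the point above, the following sketch applies the original and the patched substitution to one made-up alignment row. The sample string is hypothetical (it is not taken from a real MASE file); the two regular expressions are the ones from the diff.

```perl
use strict;
use warnings;

# Hypothetical alignment row with digits standing in for frameshift codes.
my $entry = "ACG-T2ACG.T 1AC\n";

# Original filter: delete every character that is not a letter, '.' or '-'.
( my $letters_only = $entry ) =~ s/[^A-Za-z\.\-]//g;

# Patched filter: digits are kept as well.
( my $with_digits = $entry ) =~ s/[^A-Za-z0-9\.\-]//g;

# Anchored form: ^[...] only tests the first character and removes nothing.
my $starts_ok = ( $entry =~ /^[A-Za-z\.\-]/ ) ? 1 : 0;

print "original filter: $letters_only\n";
print "patched filter:  $with_digits\n";
print "starts with an allowed character: $starts_ok\n";
```

With the original class the digits, the space and the newline are all stripped; with the patched class only the space and the newline go.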
On Jun 8, 2011, at 7:45 AM, Tristan Lefebure wrote: > Hi there, > > I have some weird alignments with some numerical code stored > within the sequence strings (eg. frameshift genewise code). > Most AlignIO module I have tried eat them without any > trouble except for Bio::AlignIO::Mase. > > The following patch seems to do the trick: > > diff -u mase.pm mase_mod.pm > --- mase.pm 2011-06-08 14:08:58.558033996 +0200 > +++ mase_mod.pm 2011-06-08 14:09:20.388066014 +0200 > @@ -109,7 +109,7 @@ > > while( $entry = $self->_readline) { > $entry =~ /^;/ && last; > - $entry =~ s/[^A-Za-z\.\-]//g; > + $entry =~ s/[^A-Za-z0-9\.\-]//g; > $seq .= $entry; > } > if( $end == -1) { > > But I am left with the feeling that I don't really > understand why this works (which I don't quite like before > pushing a patch...) > > Why doing a s///g instead of a simple m//, and why doing > '/[^' and not '/^['... Is that linked to that fact that $/ > was modified to read chunks of files? BTW where is $/ set? I > searched in Bio::Root::IO but didn't find it... > > Oh so many questions... > > Thanks! > > -- > Tristan > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jun.yin at ucd.ie Wed Jun 8 09:38:30 2011 From: jun.yin at ucd.ie (Jun Yin) Date: Wed, 08 Jun 2011 14:38:30 +0100 Subject: [Bioperl-l] Bio::AlignIO::Mase In-Reply-To: <201106081445.22322.tristan.lefebure@gmail.com> References: <201106081445.22322.tristan.lefebure@gmail.com> Message-ID: <001601cc25e1$5a9a9050$0fcfb0f0$%yin@ucd.ie> Hi, Tristan, For your first two questions, $entry =~ s/[^A-Za-z0-9\.\-]//g; # It recursively remove all non "A-Za-z0-9.-" If you change it to $entry =~ m/[^A-Za-z0-9\.\-]/; #It will find the first non "A-Za-z0-9.-", and do nothing (except return 1). '/[^' and '/^[' are two different things in the reg-exp. [^abc] means non-abc in the string. ^[abc] means the string should start with abc. 
I don't understand why you are looking for $/. $/ is the INPUT_RECORD_SEPARATOR.
You can set it in your own script, for example:

$old_seperator=$/;
$/="\t";

Then the line should end with "\t". After that, you can change it back
using:
$/=$old_seperator;

For your patch, I think it is written well. Since you don't want to remove
the digits in your sequence, this is why

$entry =~ s/[^A-Za-z\.\-]//g;
is changed into
$entry =~ s/[^A-Za-z0-9\.\-]//g;

Otherwise, all your digits will be removed.

Cheers,
Jun Yin
Ph.D. student in U.C.D.

Bioinformatics Laboratory
Conway Institute
University College Dublin

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Tristan Lefebure
Sent: Wednesday, June 08, 2011 1:45 PM
To: bioperl-l at lists.open-bio.org
Subject: [Bioperl-l] Bio::AlignIO::Mase

Hi there,

I have some weird alignments with some numerical code stored
within the sequence strings (eg. frameshift genewise code).
Most AlignIO module I have tried eat them without any
trouble except for Bio::AlignIO::Mase.

The following patch seems to do the trick:

diff -u mase.pm mase_mod.pm
--- mase.pm 2011-06-08 14:08:58.558033996 +0200
+++ mase_mod.pm 2011-06-08 14:09:20.388066014 +0200
@@ -109,7 +109,7 @@

 while( $entry = $self->_readline) {
 $entry =~ /^;/ && last;
- $entry =~ s/[^A-Za-z\.\-]//g;
+ $entry =~ s/[^A-Za-z0-9\.\-]//g;
 $seq .= $entry;
 }
 if( $end == -1) {

But I am left with the feeling that I don't really
understand why this works (which I don't quite like before
pushing a patch...)

Why doing a s///g instead of a simple m//, and why doing
'/[^' and not '/^['... Is that linked to that fact that $/
was modified to read chunks of files? BTW where is $/ set? I
searched in Bio::Root::IO but didn't find it...

Oh so many questions...

Thanks!
--
Tristan

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at illinois.edu Wed Jun 8 10:51:48 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 8 Jun 2011 09:51:48 -0500
Subject: [Bioperl-l] Bio::AlignIO::Mase
In-Reply-To: <001601cc25e1$5a9a9050$0fcfb0f0$%yin@ucd.ie>
References: <201106081445.22322.tristan.lefebure@gmail.com> <001601cc25e1$5a9a9050$0fcfb0f0$%yin@ucd.ie>
Message-ID: <44F52EA5-F9B1-4492-A8D9-0EDDC2A872A2@illinois.edu>

On Jun 8, 2011, at 8:38 AM, Jun Yin wrote:

> Hi, Tristan,
>
> For your first two questions,
>
> $entry =~ s/[^A-Za-z0-9\.\-]//g; # It recursively remove all non
> "A-Za-z0-9.-"
> If you change it to $entry =~ m/[^A-Za-z0-9\.\-]/; #It will find the first
> non "A-Za-z0-9.-", and do nothing (except return 1).
>
> '/[^' and '/^[' are two different things in the reg-exp. [^abc] means
> non-abc in the string. ^[abc] means the string should start with abc.
>
> I don't understand why you are looking for $/. $/ is the INPUT_RECORD_SEPARATOR.
> You can set it in your own script, for example:
>
> $old_seperator=$/;
> $/="\t";
>
> Then the line should end with "\t". After that, you can change it back
> using:
> $/=$old_seperator;

or you can localize it to a block/sub, which is considered safer:

{
    local $/ = undef;
    # do rotten things here
}
# outside the block $/ returns to prior setting

> For your patch, I think it is written well. Since you don't want to remove
> the digits in your sequence, this is why
>
> $entry =~ s/[^A-Za-z\.\-]//g;
> is changed into
> $entry =~ s/[^A-Za-z0-9\.\-]//g;
>
> Otherwise, all your digits will be removed.
>
> Cheers,
> Jun Yin
> Ph.D. student in U.C.D.

We do have a nice suite of regression tests; I would check to see whether changes affect the test results, then (upon a failure) determine why the tests fail. In many cases the tests are possibly wrong or aren't complete.
The latter may be the case for mase tests (seem pretty minimal):

[cjfields at pyrimidine bioperl-live (master)]$ ./Build test --test-files t/AlignIO/mase.t --verbose
...
t/AlignIO/mase.t ..
1..3
ok 1 - use Bio::AlignIO::mase;
ok 2 - The object isa Bio::Align::AlignI
ok 3 - mase input test
ok
All tests successful.
Files=1, Tests=3, 0 wallclock secs ( 0.03 usr 0.01 sys + 0.17 cusr 0.02 csys = 0.23 CPU)
Result: PASS
[cjfields at pyrimidine bioperl-live (master)]$

chris

PS - nice to hear from you Jun! Need to talk to you about last year's GSoC code at some point.

From tristan.lefebure at gmail.com Wed Jun 8 11:05:33 2011
From: tristan.lefebure at gmail.com (Tristan Lefebure)
Date: Wed, 8 Jun 2011 17:05:33 +0200
Subject: [Bioperl-l] Bio::AlignIO::Mase
In-Reply-To: <5AD73D48-FB3B-4F5C-8AF8-0670C90EE8CE@gmail.com>
References: <201106081445.22322.tristan.lefebure@gmail.com> <5AD73D48-FB3B-4F5C-8AF8-0670C90EE8CE@gmail.com>
Message-ID: <201106081705.33856.tristan.lefebure@gmail.com>

Thanks all for your answers. I didn't know about [^] (always
something new to learn with perl...).

Yes, the point is to keep numerical characters (like
frameshift) in the MASE parser (and others actually, the
FASTA and NEXUS don't try to strip anything AFAIK).

Maybe I should rephrase it this way: should any parser try
to strip anything from the sequence string? I actually
wonder what the original purpose of the line
$entry =~ s/[^A-Za-z\.\-]//g;
in the MASE parser is. Couldn't we just replace it with:
chomp $entry;

If I do so and run the test I get:

[tristan at picodon bioperl-live] perl t/AlignIO/mase.t
1..3
ok 1 - use Bio::AlignIO::mase;
ok 2 - The object isa Bio::Align::AlignI
ok 3 - mase input test

(Chris, how do you run this:
./Build test --test-files t/AlignIO/mase.t --verbose

The only thing I manage to do is:

[tristan at picodon bioperl-live] ./Build.PL test --test-files t/AlignIO/mase.t --verbose
Too early to specify a build action 'test'. Do 'Build test' instead.
) -- Tristan On Wednesday 08 June 2011 16:27:44 Jason Stajich wrote: > Hi Tristan - > > This regular expression is to is to strip everything that > isn't a letter, . or - the [^] means match everything > EXCEPT what follows. I guess if numeric values are > valid in these type of alignments you would just add \d > (instead of 0-9) > > So you are asking for the parser to not strip out > frameshift info from a MASE parser? > > This doesn't have anything to do with the chunk pattern > or size set with $/ AFAIK. > > On Jun 8, 2011, at 7:45 AM, Tristan Lefebure wrote: > > Hi there, > > > > I have some weird alignments with some numerical code > > stored within the sequence strings (eg. frameshift > > genewise code). Most AlignIO module I have tried eat > > them without any trouble except for > > Bio::AlignIO::Mase. > > > > The following patch seems to do the trick: > > > > diff -u mase.pm mase_mod.pm > > --- mase.pm 2011-06-08 14:08:58.558033996 +0200 > > +++ mase_mod.pm 2011-06-08 14:09:20.388066014 +0200 > > @@ -109,7 +109,7 @@ > > > > while( $entry = $self->_readline) { > > > > $entry =~ /^;/ && last; > > > > - $entry =~ s/[^A-Za-z\.\-]//g; > > + $entry =~ s/[^A-Za-z0-9\.\-]//g; > > > > $seq .= $entry; > > > > } > > if( $end == -1) { > > > > But I am left with the feeling that I don't really > > understand why this works (which I don't quite like > > before pushing a patch...) > > > > Why doing a s///g instead of a simple m//, and why > > doing '/[^' and not '/^['... Is that linked to that > > fact that $/ was modified to read chunks of files? BTW > > where is $/ set? I searched in Bio::Root::IO but > > didn't find it... > > > > Oh so many questions... > > > > Thanks! 
> > > > -- > > Tristan > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jason.stajich at gmail.com Wed Jun 8 11:27:48 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Wed, 8 Jun 2011 10:27:48 -0500 Subject: [Bioperl-l] Bio::AlignIO::Mase In-Reply-To: <201106081705.33856.tristan.lefebure@gmail.com> References: <201106081445.22322.tristan.lefebure@gmail.com> <5AD73D48-FB3B-4F5C-8AF8-0670C90EE8CE@gmail.com> <201106081705.33856.tristan.lefebure@gmail.com> Message-ID: It is a good question of what should it keep? This could be too much of nanny-state bioinformatics... =) It is mainly to get rid of internal whitespaces. Some of the alphabets supported by the Bio::Seq objects could balk at the non-standard symbols. This is also to perhaps sanitize the data before building the object so that it can be translated back out to different formats that cannot support non-character symbols. It is critical to remove the whitespaces though since the sequence object should just have data and so that everything is still aligned. Jason On Jun 8, 2011, at 10:05 AM, Tristan Lefebure wrote: > Thanks all for your answers. I didn't know about [^] (always > something new to learn with perl...). > > Yes, the point is to keep numerical characters (like > frameshift) in the MASE parser (and others actually, the > FASTA and NEXUS don't try to strip anything AFAIK). > > May be I should rephrase it this way: should any parser try > to strip anything from the sequence string? I actually > wonder what is the original purpose of the line > $entry =~ s/[^A-Za-z\.\-]//g; > in the MASE parser. 
Couldn't we just replace it with: > chomp $entry; > > If I do so and run the test I get: > > [tristan at picodon bioperl-live] perl t/AlignIO/mase.t > 1..3 > ok 1 - use Bio::AlignIO::mase; > ok 2 - The object isa Bio::Align::AlignI > ok 3 - mase input test > > (Chris, how do you run this: > ./Build test --test-files t/AlignIO/mase.t --verbose > > The only thing I manage to do is: > > [tristan at picodon bioperl-live] ./Build.PL test --test-files > t/AlignIO/mase.t --verbose > Too early to specify a build action 'test'. Do 'Build test' > instead. > ) > > -- > Tristan > > On Wednesday 08 June 2011 16:27:44 Jason Stajich wrote: >> Hi Tristan - >> >> This regular expression is to is to strip everything that >> isn't a letter, . or - the [^] means match everything >> EXCEPT what follows. I guess if numeric values are >> valid in these type of alignments you would just add \d >> (instead of 0-9) >> >> So you are asking for the parser to not strip out >> frameshift info from a MASE parser? >> >> This doesn't have anything to do with the chunk pattern >> or size set with $/ AFAIK. >> >> On Jun 8, 2011, at 7:45 AM, Tristan Lefebure wrote: >>> Hi there, >>> >>> I have some weird alignments with some numerical code >>> stored within the sequence strings (eg. frameshift >>> genewise code). Most AlignIO module I have tried eat >>> them without any trouble except for >>> Bio::AlignIO::Mase. >>> >>> The following patch seems to do the trick: >>> >>> diff -u mase.pm mase_mod.pm >>> --- mase.pm 2011-06-08 14:08:58.558033996 +0200 >>> +++ mase_mod.pm 2011-06-08 14:09:20.388066014 +0200 >>> @@ -109,7 +109,7 @@ >>> >>> while( $entry = $self->_readline) { >>> >>> $entry =~ /^;/ && last; >>> >>> - $entry =~ s/[^A-Za-z\.\-]//g; >>> + $entry =~ s/[^A-Za-z0-9\.\-]//g; >>> >>> $seq .= $entry; >>> >>> } >>> if( $end == -1) { >>> >>> But I am left with the feeling that I don't really >>> understand why this works (which I don't quite like >>> before pushing a patch...) 
>>> >>> Why doing a s///g instead of a simple m//, and why >>> doing '/[^' and not '/^['... Is that linked to that >>> fact that $/ was modified to read chunks of files? BTW >>> where is $/ set? I searched in Bio::Root::IO but >>> didn't find it... >>> >>> Oh so many questions... >>> >>> Thanks! >>> >>> -- >>> Tristan >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From locarpau at upvnet.upv.es Wed Jun 8 15:10:17 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet) Date: Wed, 08 Jun 2011 21:10:17 +0200 Subject: [Bioperl-l] model 3 on Codeml.pm In-Reply-To: <894644EC-C39E-4AF2-99E9-E46DBF3EFDD3@gmail.com> References: <18DF7D20DFEC044098A1062202F5FFF338731C9A60@exchsth.agresearch.co.nz> <4DEEDDD4.2040002@upvnet.upv.es> <894644EC-C39E-4AF2-99E9-E46DBF3EFDD3@gmail.com> Message-ID: <4DEFC919.4060506@upvnet.upv.es> Jason, I edited the Codeml.pm and posted the version in github. The only change I made was in line 275 'model' => [0..2,7] was changed to 'model' => [0..3,7] , so that 3 could be passed as valid value to model. However, I still got the same warning message, and I tested with different alignments and results are different from those obtained running PAML locally. Maybe there are additional changes to be made. Lorenzo El 08/06/11 07:15, Jason Stajich escribi?: > it is in github so you can fork a version, make the change, and submit a patch which we can pick up. > > I am concerned that this change can't be tested without example code. > > Did you just edit the code and make sure your changes worked, the error message seems to refer to a different parameter > (ncatG) not model. 
> > Thanks, > >> MSG: parameter ncatG specified value is not recognized, please see the documentation and the code for this module or set the no_param_checks to a true value.' > > On Jun 7, 2011, at 9:26 PM, Lorenzo Carretero wrote: > >> Hi, >> I'm trying to run the clade model D (Model D: model = 3, NSsites = 3 ncatG = 2, See reference. Bielawski, J. P., and Z. Yang. 2004. A maximum likelihood method for detecting functional divergence at individual codon sites, with application to gene family evolution. Journal of Molecular Evolution 59:121-132. and PAML 4.4 manual page 30) from the Bio::Tools::Run::Phylo::PAML::Codeml module. However, 3 is not among the valid values that can be passed to the module (line 275 'model' => [0..2,7],) and consequently the following Warning message is returned from lines 689-690: >> 'MSG: parameter ncatG specified value is not recognized, please see the documentation and the code for this module or set the no_param_checks to a true value.' >> Can line 275: 'model' => [0..2,7], be changed to 'model' => [0..3,7], to accept value 3 or additional changes must be done in other modules to properly run the so-called clade models of PAML. >> Thanks, >> Lorenzo >> >> >> -- >> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >> Lorenzo Carretero Paulet >> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) >> Integrative Systems Biology Group >> C/ Ingeniero Fausto Elio s/n. 
>> 46022 Valencia, Spain >> >> Phone: +34 963879934 >> Fax: +34 963877859 >> e-mail: locarpau at upvnet.upv.es >> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail: locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* From jason.stajich at gmail.com Wed Jun 8 16:04:43 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Wed, 8 Jun 2011 15:04:43 -0500 Subject: [Bioperl-l] model 3 on Codeml.pm In-Reply-To: <4DEFC919.4060506@upvnet.upv.es> References: <18DF7D20DFEC044098A1062202F5FFF338731C9A60@exchsth.agresearch.co.nz> <4DEEDDD4.2040002@upvnet.upv.es> <894644EC-C39E-4AF2-99E9-E46DBF3EFDD3@gmail.com> <4DEFC919.4060506@upvnet.upv.es> Message-ID: yes - as I said, a simple script that demonstrates the problem will make this all easier to debug. If you reported the actual error message that might be a good start. The msg before refers to a different parameter so I'm suspicious. On Jun 8, 2011, at 2:10 PM, Lorenzo Carretero Paulet wrote: > Jason, > I edited the Codeml.pm and posted the version in github. The only change I made was in line 275 'model' => [0..2,7] was changed to 'model' => [0..3,7] , so that 3 could be passed as valid value to model. 
However, I still got the same warning message, and I tested with different alignments and results are different from those obtained running PAML locally. Maybe there are additional changes to be made. > Lorenzo > > El 08/06/11 07:15, Jason Stajich escribi?: >> it is in github so you can fork a version, make the change, and submit a patch which we can pick up. >> >> I am concerned that this change can't be tested without example code. >> >> Did you just edit the code and make sure your changes worked, the error message seems to refer to a different parameter >> (ncatG) not model. >> >> Thanks, >> >>> MSG: parameter ncatG specified value is not recognized, please see the documentation and the code for this module or set the no_param_checks to a true value.' >> >> On Jun 7, 2011, at 9:26 PM, Lorenzo Carretero wrote: >> >>> Hi, >>> I'm trying to run the clade model D (Model D: model = 3, NSsites = 3 ncatG = 2, See reference. Bielawski, J. P., and Z. Yang. 2004. A maximum likelihood method for detecting functional divergence at individual codon sites, with application to gene family evolution. Journal of Molecular Evolution 59:121-132. and PAML 4.4 manual page 30) from the Bio::Tools::Run::Phylo::PAML::Codeml module. However, 3 is not among the valid values that can be passed to the module (line 275 'model' => [0..2,7],) and consequently the following Warning message is returned from lines 689-690: >>> 'MSG: parameter ncatG specified value is not recognized, please see the documentation and the code for this module or set the no_param_checks to a true value.' >>> Can line 275: 'model' => [0..2,7], be changed to 'model' => [0..3,7], to accept value 3 or additional changes must be done in other modules to properly run the so-called clade models of PAML. 
>>> Thanks,
>>> Lorenzo

From locarpau at upvnet.upv.es Wed Jun 8 16:21:08 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Wed, 08 Jun 2011 22:21:08 +0200
Subject: [Bioperl-l] model 3 on Codeml.pm
In-Reply-To:
References: <18DF7D20DFEC044098A1062202F5FFF338731C9A60@exchsth.agresearch.co.nz> <4DEEDDD4.2040002@upvnet.upv.es> <894644EC-C39E-4AF2-99E9-E46DBF3EFDD3@gmail.com> <4DEFC919.4060506@upvnet.upv.es>
Message-ID: <4DEFD9B4.8010503@upvnet.upv.es>

Jason,
This is the exact error message:

--------------------- WARNING ---------------------
MSG: parameter model specified value 3 is not recognized, please see the documentation and the code for this module or set the no_param_checks to a true value
---------------------------------------------------

See below a version of the script I'm trying to run. It runs well for model values 0, 1 and 2, but returns that message with 3 (even after editing Codeml.pm). It is within a subroutine to which I pass aa and nt sequence files as well as the tree file. Actually I'm only interested in the resulting lnL of the data to do LRT comparisons. I ran PAML manually and, with the same datasets, the lnL values for model 3 are not the same as those returned when run from the script.
Thanks,
Lorenzo

PS:

my %hashNTseqs = ();
my $dna_aln;
my $pep_aln;
my $lnL;
my $seq_ids;

#Create a Bio::SeqIO object with the pep sequences
my $inseq_pep = Bio::SeqIO->new(-file   => "<$sequencesfilenameAA",
                                -format => $format );
# Remove the stop (*) from the end of every protein sequence
while (my $seqin = $inseq_pep->next_seq) {
    my $seq = $seqin->seq();
    $seq =~ s/\*//g;
}

#Create a Bio::SeqIO object with the nt sequences
my $inseq_nt = Bio::SeqIO->new(-file   => "<$sequencesfilenameNT",
                               -format => $format );

#Get an alignment of protein sequences
my $aln_factory = Bio::Tools::Run::Alignment::Clustalw->new ();
#my $aln_factory = Bio::Tools::Run::Alignment::Muscle->new ();
#say "Aligning peptide sequences globally (clustalw) ...";
$pep_aln = $aln_factory->align($sequencesfilenameAA);

while (my $seqin = $inseq_nt->next_seq) {
    my $seq = $seqin->seq();
    # Replace all Xs and missing characters (?) with Ns
    $seq =~ s/X/N/gi;
    $seq =~ s/\?/N/g;
    my $seq_id = $seqin->display_id();
    # Create a reference to a hash with keys : display_ids for the aa sequences in the alignment
    # and the values are a Bio::PrimarySeqI object for the corresponding spliced cDNA sequence.
    $hashNTseqs{$seq_id} = $seqin;
}

#Generating codon alignment on the basis of the corresponding peptide one
#say "Aligning codon sequences according to corresponding peptide MSA ...";
$dna_aln = aa_to_dna_aln($pep_aln, \%hashNTseqs);

#Remove all positions containing a codon gap from the codon alignment???
#say "positions: ",$dna_aln->num_residues();
$dna_aln = $dna_aln-> remove_columns(['gaps']);
#say "positions: ",$dna_aln->num_residues();

#Generating tree from input string of ids
my $io = IO::String->new($tree);
my $biotreeio = Bio::TreeIO->new(-fh => $io, -format => 'newick');
my $biotree = $biotreeio->next_tree;

my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml (
    -alignment => $dna_aln,
    -tree      => $biotree,
    -params    => {
        #'verbose' => 0,
        #'noisy' => 9,
        'runmode'   => 0, #user tree
        'seqtype'   => 1,
        'model'     => 3,
        'NSsites'   => 3,
        'fix_omega' => 0,
        'omega'     => 0,
        'ncatG'     => 2,
        #'icode' => 0,
        #'fix_alpha' => 0,
        #'fix_kappa' => 0,
        #'RateAncestor' => 0,
        'CodonFreq' => 2,
        'cleandata' => 1,
        'ndata'     => 1
    },
);
#$codeml_factory->alignment($dna_aln);
#$codeml_factory->tree($biotree);
#$codeml_factory->set_parameter('model',2);
my ($rc,$parser) = $codeml_factory->run(); # or run($dna_aln,$biotree)
$codeml_factory->cleanup();
my $result = $parser->next_result();
#my $MLmatrix_free = $result_free->get_MLmatrix();
my $intree = $result->next_tree;
$lnL = $intree->score;
say "lnL=$lnL";

for my $node ( $intree->get_nodes ) {
    my $id;
    # first we do some work to figure out what the ID should be.
    # for a leaf or tip node this is just the taxon label
    if( $node->is_Leaf() ) {
        $id = $node->id;
    } else {
        # for the internal nodes it is just the name of all the sub-nodes
        # put together, much like how Sanderson represents internal nodes
        # in r8s
        $id = "(".join(",", map { $_->id } grep { $_->is_Leaf }$node->get_all_Descendents) .")";
    }
    if( ! $node->ancestor or ! $node->has_tag('t') ) {
        # skip when no values have been associated with this node
        # (like the root node)
        next;
    }
    print join ("\t",$id,map { ($node->get_tag_values($_))[0] }qw(dN/dS)), "\n";
}
$result->reset_seqs;

On 08/06/11 22:04, Jason Stajich wrote:
> yes - as I said, a simple script that demonstrates the problem will make this all easier to debug. If you reported the actual error message that might be a good start.
The msg before refers to a different parameter so I'm suspicious. > > On Jun 8, 2011, at 2:10 PM, Lorenzo Carretero Paulet wrote: > >> Jason, >> I edited the Codeml.pm and posted the version in github. The only change I made was in line 275 'model' => [0..2,7] was changed to 'model' => [0..3,7] , so that 3 could be passed as valid value to model. However, I still got the same warning message, and I tested with different alignments and results are different from those obtained running PAML locally. Maybe there are additional changes to be made. >> Lorenzo >> >> El 08/06/11 07:15, Jason Stajich escribi?: >>> it is in github so you can fork a version, make the change, and submit a patch which we can pick up. >>> >>> I am concerned that this change can't be tested without example code. >>> >>> Did you just edit the code and make sure your changes worked, the error message seems to refer to a different parameter >>> (ncatG) not model. >>> >>> Thanks, >>> >>>> MSG: parameter ncatG specified value is not recognized, please see the documentation and the code for this module or set the no_param_checks to a true value.' >>> On Jun 7, 2011, at 9:26 PM, Lorenzo Carretero wrote: >>> >>>> Hi, >>>> I'm trying to run the clade model D (Model D: model = 3, NSsites = 3 ncatG = 2, See reference. Bielawski, J. P., and Z. Yang. 2004. A maximum likelihood method for detecting functional divergence at individual codon sites, with application to gene family evolution. Journal of Molecular Evolution 59:121-132. and PAML 4.4 manual page 30) from the Bio::Tools::Run::Phylo::PAML::Codeml module. However, 3 is not among the valid values that can be passed to the module (line 275 'model' => [0..2,7],) and consequently the following Warning message is returned from lines 689-690: >>>> 'MSG: parameter ncatG specified value is not recognized, please see the documentation and the code for this module or set the no_param_checks to a true value.' 
>>>> Can line 275: 'model' => [0..2,7], be changed to 'model' => [0..3,7], to accept value 3 or additional changes must be done in other modules to properly run the so-called clade models of PAML. >>>> Thanks, >>>> Lorenzo >>>> >>>> >>>> -- >>>> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >>>> Lorenzo Carretero Paulet >>>> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) >>>> Integrative Systems Biology Group >>>> C/ Ingeniero Fausto Elio s/n. >>>> 46022 Valencia, Spain >>>> >>>> Phone: +34 963879934 >>>> Fax: +34 963877859 >>>> e-mail: locarpau at upvnet.upv.es >>>> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >> Lorenzo Carretero Paulet >> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) >> Integrative Systems Biology Group >> C/ Ingeniero Fausto Elio s/n. >> 46022 Valencia, Spain >> >> Phone: +34 963879934 >> Fax: +34 963877859 >> e-mail: locarpau at upvnet.upv.es >> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 
46022 Valencia, Spain

Phone: +34 963879934
Fax: +34 963877859
e-mail: locarpau at upvnet.upv.es
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

From cjfields at illinois.edu Wed Jun 8 16:35:31 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 8 Jun 2011 15:35:31 -0500
Subject: [Bioperl-l] Bio::AlignIO::Mase
In-Reply-To: <201106081705.33856.tristan.lefebure@gmail.com>
References: <201106081445.22322.tristan.lefebure@gmail.com> <5AD73D48-FB3B-4F5C-8AF8-0670C90EE8CE@gmail.com> <201106081705.33856.tristan.lefebure@gmail.com>
Message-ID: <9D745651-1477-4F93-A272-6EE1183E2317@illinois.edu>

On Jun 8, 2011, at 10:05 AM, Tristan Lefebure wrote:

> Thanks all for your answers. I didn't know about [^] (always
> something new to learn with perl...).
>
> ...
> (Chris, how do you run this:
> ./Build test --test-files t/AlignIO/mase.t --verbose
>
> The only thing I manage to do is:
>
> [tristan at picodon bioperl-live] ./Build.PL test --test-files
> t/AlignIO/mase.t --verbose
> Too early to specify a build action 'test'. Do 'Build test'
> instead.
> )

Note the lack of '.PL' on './Build test --test-files t/AlignIO/mase.t --verbose'. You must run 'perl Build.PL' first.

Arguably a better general way to run tests is to use 'prove -lv t/AlignIO/mase.t', which adds the local 'lib' directory while testing. Just running the tests using 'perl t/AlignIO/mase.t' doesn't do this (it can be done with '-I./lib'), but 'prove' has more flexibility than simply using perl directly.
chris From jason.stajich at gmail.com Wed Jun 8 17:01:14 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Wed, 8 Jun 2011 16:01:14 -0500 Subject: [Bioperl-l] model 3 on Codeml.pm In-Reply-To: <4DEFD9B4.8010503@upvnet.upv.es> References: <18DF7D20DFEC044098A1062202F5FFF338731C9A60@exchsth.agresearch.co.nz> <4DEEDDD4.2040002@upvnet.upv.es> <894644EC-C39E-4AF2-99E9-E46DBF3EFDD3@gmail.com> <4DEFC919.4060506@upvnet.upv.es> <4DEFD9B4.8010503@upvnet.upv.es> Message-ID: <0D92B323-E89C-4D0C-8F3C-EB673BDCB9B3@gmail.com> And did you try setting -no_param_checks => 1 so that it ignores the testing for valid parameters, this is your fastest route to running the program without it validating parameters... On Jun 8, 2011, at 3:21 PM, Lorenzo Carretero Paulet wrote: > no_param_checks From locarpau at upvnet.upv.es Wed Jun 8 17:32:02 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero) Date: Wed, 08 Jun 2011 23:32:02 +0200 Subject: [Bioperl-l] model 3 on Codeml.pm In-Reply-To: <0D92B323-E89C-4D0C-8F3C-EB673BDCB9B3@gmail.com> References: <18DF7D20DFEC044098A1062202F5FFF338731C9A60@exchsth.agresearch.co.nz> <4DEEDDD4.2040002@upvnet.upv.es> <894644EC-C39E-4AF2-99E9-E46DBF3EFDD3@gmail.com> <4DEFC919.4060506@upvnet.upv.es> <4DEFD9B4.8010503@upvnet.upv.es> <0D92B323-E89C-4D0C-8F3C-EB673BDCB9B3@gmail.com> Message-ID: <4DEFEA52.3020208@upvnet.upv.es> On 6/8/11 11:01 PM, Jason Stajich wrote: > And did you try setting > -no_param_checks => 1 > so that it ignores the testing for valid parameters, this is your fastest route to running the program without it validating parameters... > > On Jun 8, 2011, at 3:21 PM, Lorenzo Carretero Paulet wrote: > >> no_param_checks > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Yes. The script runs but returns the same message and seemingly unreliable results (different from those obtained through manual runs of PAML). 
This is really odd...
Cheers,
Lorenzo

--
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
Lorenzo Carretero Paulet
Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV)
Integrative Systems Biology Group
C/ Ingeniero Fausto Elio s/n.
46022 Valencia, Spain

Phone: +34 963879934
Fax: +34 963877859
e-mail: locarpau at upvnet.upv.es
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

From jason.stajich at gmail.com Wed Jun 8 18:27:18 2011
From: jason.stajich at gmail.com (Jason Stajich)
Date: Wed, 8 Jun 2011 17:27:18 -0500
Subject: [Bioperl-l] model 3 on Codeml.pm
In-Reply-To: <4DEFEA52.3020208@upvnet.upv.es>
References: <18DF7D20DFEC044098A1062202F5FFF338731C9A60@exchsth.agresearch.co.nz> <4DEEDDD4.2040002@upvnet.upv.es> <894644EC-C39E-4AF2-99E9-E46DBF3EFDD3@gmail.com> <4DEFC919.4060506@upvnet.upv.es> <4DEFD9B4.8010503@upvnet.upv.es> <0D92B323-E89C-4D0C-8F3C-EB673BDCB9B3@gmail.com> <4DEFEA52.3020208@upvnet.upv.es>
Message-ID:

Well, I would just run it so that it doesn't clean up the tempfiles, and look at the produced .ctl file so you can see how it differs from what you expected. All this module is doing is making a .ctl file and a temporary folder, so there is likely a setting that isn't getting passed to the .ctl file properly.

On Jun 8, 2011, at 4:32 PM, Lorenzo Carretero wrote:

> On 6/8/11 11:01 PM, Jason Stajich wrote:
>> And did you try setting
>> -no_param_checks => 1
>> so that it ignores the testing for valid parameters, this is your fastest route to running the program without it validating parameters...
>>
>> On Jun 8, 2011, at 3:21 PM, Lorenzo Carretero Paulet wrote:
>>
>>> no_param_checks
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> Yes.
The script runs but returns the same message and seemingly unreliable results (different from those obtained through manual runs of PAML). This is really odd... > Cheers, > Lorenzo > > -- > *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* > Lorenzo Carretero Paulet > Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) > Integrative Systems Biology Group > C/ Ingeniero Fausto Elio s/n. > 46022 Valencia, Spain > > Phone: +34 963879934 > Fax: +34 963877859 > e-mail: locarpau at upvnet.upv.es > *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From senthil.debian at gmail.com Wed Jun 8 22:25:34 2011 From: senthil.debian at gmail.com (Senthil Kumar M) Date: Wed, 8 Jun 2011 19:25:34 -0700 Subject: [Bioperl-l] Bio::DB::EUtilities and RefSeq Message-ID: Hi, I tried the Bio::DB::EUtilities example mentioned in this discussion: http://biostar.stackexchange.com/questions/3043/how-can-i-get-protein-sequence-in-fasta-format-using-taxon-id, with " -term => 'Escherichia coli BW2952[Orgn] AND rpoB[Gene/Protein Name]' and -db => 'protein' ". This retrieves three identical rpoB amino acid sequences, for brevity I provide only the fasta headers and not the actual sequences below: >gi|259494181|sp|C5A0S7.1|RPOB_ECOBW RecName: Full=DNA-directed RNA polymerase subunit beta; Short=RNAP subunit beta; AltName: Full=RNA polymerase subunit beta; AltName: Full=Transcriptase subunit beta >gi|238863495|gb|ACR65493.1| RNA polymerase, beta subunit [Escherichia coli BW2952] >gi|238903043|ref|YP_002928839.1| RNA polymerase, beta subunit [Escherichia coli BW2952] I am only interested in the RefSeq entry, ie YP_002928839.1 and not the other two. 
I can filter such duplicate entries after I download them from NCBI, but it would be nicer if there is a way to retrieve ONLY the RefSeq entries that match my query and download just them. I am aware that it is easier to do this online at the NCBI protein site, where there is a filter option (http://www.ncbi.nlm.nih.gov/protein?term=escherichia coli BW2952[organism] AND rpoB[Gene%2FProtein Name]), but I would like to know if the same is achievable through EUtilities since I have many sequences to download from NCBI. Reading "$ perldoc /usr/share/perl5/Bio/Tools/EUtilities/EUtilParameters.pm" and searching google did not provide any clues, but I might have missed something that was blindingly obvious. Any help would be much appreciated. Thanks in advance, Senthil -/ For I am Vader, Darth Vader, Lord Vader. I can kill you with a single thought." "Well, you'll still need a tray." "No, I will not need a tray. I do not need a tray to kill you. I can kill you without a tray, with the power of the Force, which is strong within me. Even though I could kill you with a tray if I so wished, for I would hack at your neck with the thin bit until the blood flowed across the canteen floor." "No, the food is hot. You'll need a tray to put the food on." "Oh, I see, the food is hot. I'm sorry, I did not realize." -- Eddie Izzard From cjfields at illinois.edu Wed Jun 8 22:40:29 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 8 Jun 2011 21:40:29 -0500 Subject: [Bioperl-l] Bio::DB::EUtilities and RefSeq In-Reply-To: References: Message-ID: If you can find a 'details' link on the NCBI search page it gives you a hint as to how it's done (in a fairly archaic way :). Here is your term: "Escherichia coli BW2952"[Organism] AND rpoB[Gene] AND srcdb_refseq[PROP] So, adding 'srcdb_refseq[PROP]' to your searches should limit to RefSeq only. 
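A minimal sketch of how that term might be used from Bio::DB::EUtilities, assuming the usual esearch-then-efetch pattern. The helper names refseq_term and fetch_refseq_fasta are made up for illustration, the -retmax of 100 is arbitrary, and the network part only runs when an e-mail address is passed on the command line:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Append the RefSeq property filter to an existing Entrez query term.
# (Hypothetical helper, not part of BioPerl.)
sub refseq_term {
    my ($term) = @_;
    return "($term) AND srcdb_refseq[PROP]";
}

# Search the protein database and return the matching records as FASTA text.
# Needs BioPerl and network access, so the module is loaded only when called.
sub fetch_refseq_fasta {
    my ($term, $email) = @_;
    require Bio::DB::EUtilities;

    my $search = Bio::DB::EUtilities->new(
        -eutil  => 'esearch',
        -db     => 'protein',
        -term   => refseq_term($term),
        -email  => $email,
        -retmax => 100,          # arbitrary cap for this sketch
    );
    my @ids = $search->get_ids;
    return '' unless @ids;

    my $fetch = Bio::DB::EUtilities->new(
        -eutil   => 'efetch',
        -db      => 'protein',
        -id      => \@ids,
        -rettype => 'fasta',
        -email   => $email,
    );
    return $fetch->get_Response->content;
}

# Run the search only when an e-mail address is given on the command line.
if (@ARGV) {
    print fetch_refseq_fasta(
        '"Escherichia coli BW2952"[Organism] AND rpoB[Gene]', $ARGV[0]);
}
```

Wrapping the original term in parentheses keeps the filter from binding to only the last clause if the term contains an OR.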
chris On Jun 8, 2011, at 9:25 PM, Senthil Kumar M wrote: > Hi, > > I tried the Bio::DB::EUtilities example mentioned in this discussion: > http://biostar.stackexchange.com/questions/3043/how-can-i-get-protein-sequence-in-fasta-format-using-taxon-id, > with " -term => 'Escherichia coli BW2952[Orgn] AND rpoB[Gene/Protein > Name]' and -db => 'protein' ". This retrieves three identical rpoB > amino acid sequences, for brevity I provide only the fasta headers and > not the actual sequences below: > >> gi|259494181|sp|C5A0S7.1|RPOB_ECOBW RecName: Full=DNA-directed RNA polymerase subunit beta; Short=RNAP subunit beta; AltName: Full=RNA polymerase subunit beta; AltName: Full=Transcriptase subunit beta >> gi|238863495|gb|ACR65493.1| RNA polymerase, beta subunit [Escherichia coli BW2952] >> gi|238903043|ref|YP_002928839.1| RNA polymerase, beta subunit [Escherichia coli BW2952] > > I am only interested in the RefSeq entry, ie YP_002928839.1 and not > the other two. I can filter such duplicate entries after I download > them from NCBI, but it would be nicer if there is a way to retrieve > ONLY the RefSeq entries that match my query and download just them. > > I am aware that it is easier to do this online at the NCBI protein > site, where there is a filter option > (http://www.ncbi.nlm.nih.gov/protein?term=escherichia coli > BW2952[organism] AND rpoB[Gene%2FProtein Name]), but I would like to > know if the same is achievable through EUtilities since I have many > sequences to download from NCBI. > > Reading "$ perldoc > /usr/share/perl5/Bio/Tools/EUtilities/EUtilParameters.pm" and > searching google did not provide any clues, but I might have missed > something that was blindingly obvious. Any help would be much > appreciated. > > Thanks in advance, > > Senthil > > -/ > For I am Vader, Darth Vader, Lord Vader. I can kill you with a single thought." > "Well, you'll still need a tray." > "No, I will not need a tray. I do not need a tray to kill you. 
I can > kill you without a tray, with the power of the Force, which is strong > within me. Even though I could kill you with a tray if I so wished, > for I would hack at your neck with the thin bit until the blood flowed > across the canteen floor." > "No, the food is hot. You'll need a tray to put the food on." > "Oh, I see, the food is hot. I'm sorry, I did not realize." > -- Eddie Izzard > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l
From tristan.lefebure at gmail.com Thu Jun 9 03:24:39 2011 From: tristan.lefebure at gmail.com (Tristan Lefebure) Date: Thu, 9 Jun 2011 09:24:39 +0200 Subject: [Bioperl-l] Bio::AlignIO::Mase In-Reply-To: <9D745651-1477-4F93-A272-6EE1183E2317@illinois.edu> References: <201106081445.22322.tristan.lefebure@gmail.com> <5AD73D48-FB3B-4F5C-8AF8-0670C90EE8CE@gmail.com> <201106081705.33856.tristan.lefebure@gmail.com> <9D745651-1477-4F93-A272-6EE1183E2317@illinois.edu> Message-ID: Thanks again. Based on your comments, would you mind considering the following patch?

[tristan at picodon bioperl-live] diff -u Bio/AlignIO/mase_original.pm Bio/AlignIO/mase.pm
--- Bio/AlignIO/mase_original.pm        2011-06-09 09:12:47.929957767 +0200
+++ Bio/AlignIO/mase.pm 2011-06-09 09:12:11.029903646 +0200
@@ -109,7 +109,7 @@
        while( $entry = $self->_readline) {
            $entry =~ /^;/ && last;
-           $entry =~ s/[^A-Za-z\.\-]//g;
+           $entry =~ s/[^A-Za-z\d\.\-]//g;
            $seq .= $entry;
        }
        if( $end == -1) {

[tristan at picodon bioperl-live] prove -lv t/AlignIO/mase.t
t/AlignIO/mase.t ..
1..3
ok 1 - use Bio::AlignIO::mase;
ok 2 - The object isa Bio::Align::AlignI
ok 3 - mase input test
ok
All tests successful.
Files=1, Tests=3,  0 wallclock secs ( 0.03 usr  0.00 sys +  0.14 cusr  0.01 csys =  0.18 CPU)
Result: PASS
[tristan at picodon bioperl-live]

I see that the Bio::AlignIO::Mase write_aln method is yet to be implemented. Is that on someone's todo list? I may try to do this... -- Tristan On Wed, Jun 8, 2011 at 10:35 PM, Chris Fields wrote: > > On Jun 8, 2011, at 10:05 AM, Tristan Lefebure wrote: > > > Thanks all for your answers. I didn't know about [^] (always > > something new to learn with perl...). > > > > ... > > (Chris, how do you run this: > > ./Build test --test-files t/AlignIO/mase.t --verbose > > > > The only thing I manage to do is: > > > > [tristan at picodon bioperl-live] ./Build.PL test --test-files > > t/AlignIO/mase.t --verbose > > Too early to specify a build action 'test'. Do 'Build test' > > instead. > > ) > > Note the lack of '.PL' on './Build test --test-files t/AlignIO/mase.t --verbose'. You must run 'perl Build.PL' first. > > Arguably a better general way to run tests is to use 'prove -lv t/AlignIO/mase.t', which adds the local 'lib' directory while testing. Just running the tests using 'perl t/AlignIO/mase.t' doesn't do this (it can be done with '-I./lib'), but 'prove' has more flexibility than simply using perl directly. > > chris
From senthil.debian at gmail.com Thu Jun 9 12:58:41 2011 From: senthil.debian at gmail.com (Senthil Kumar M) Date: Thu, 9 Jun 2011 09:58:41 -0700 Subject: [Bioperl-l] Bio::DB::EUtilities and RefSeq In-Reply-To: References: Message-ID: On Wed, Jun 8, 2011 at 7:40 PM, Chris Fields wrote: > If you can find a 'details' link on the NCBI search page it gives you a hint as to how it's done (in a fairly archaic way :). Here is your term: > > "Escherichia coli BW2952"[Organism] AND rpoB[Gene] AND srcdb_refseq[PROP] > > So, adding 'srcdb_refseq[PROP]' to your searches should limit to RefSeq only. > > chris > Hi Chris, Thanks, that works perfectly! Senthil -/ You have no control over your cat! You can't say to your cat, "Cat, heel! Stay! Wait! Lie down! Roll over!"
'Cause the cat's just gonna be sitting there going, "Interesting words? have you finished?" -- Eddie Izzard From locarpau at upvnet.upv.es Thu Jun 9 14:14:24 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero) Date: Thu, 09 Jun 2011 20:14:24 +0200 Subject: [Bioperl-l] model 3 on Codeml.pm In-Reply-To: References: <18DF7D20DFEC044098A1062202F5FFF338731C9A60@exchsth.agresearch.co.nz> <4DEEDDD4.2040002@upvnet.upv.es> <894644EC-C39E-4AF2-99E9-E46DBF3EFDD3@gmail.com> <4DEFC919.4060506@upvnet.upv.es> <4DEFD9B4.8010503@upvnet.upv.es> <0D92B323-E89C-4D0C-8F3C-EB673BDCB9B3@gmail.com> <4DEFEA52.3020208@upvnet.upv.es> Message-ID: <4DF10D80.7040603@upvnet.upv.es> Jason, Thanks for your reply. I also changed the line in Bio::Tools::Run::Phylo::PAML::Codeml and 3 is now properly passed as value to model, no warning message and the returned lnL values corresponded to manual PAML runs (keeping the rest of parameters identical). Cheers On 6/9/11 12:27 AM, Jason Stajich wrote: > well I would just have it run where it doesn't cleanup the tempfiles and look at the produced .ctl file so you can see how it is different from what you expected. All this module is doing is making a .ctl file and a temporary folder so there is likely a setting that isn't get get passed the ctl file properly. > > On Jun 8, 2011, at 4:32 PM, Lorenzo Carretero wrote: > >> On 6/8/11 11:01 PM, Jason Stajich wrote: >>> And did you try setting >>> -no_param_checks => 1 >>> so that it ignores the testing for valid parameters, this is your fastest route to running the program without it validating parameters... >>> >>> On Jun 8, 2011, at 3:21 PM, Lorenzo Carretero Paulet wrote: >>> >>>> no_param_checks >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> Yes. 
The script runs but returns the same message and seemingly unreliable results (different from those obtained through manual runs of PAML). This is really odd... >> Cheers, >> Lorenzo >> >> -- >> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >> Lorenzo Carretero Paulet >> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) >> Integrative Systems Biology Group >> C/ Ingeniero Fausto Elio s/n. >> 46022 Valencia, Spain >> >> Phone: +34 963879934 >> Fax: +34 963877859 >> e-mail: locarpau at upvnet.upv.es >> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail: locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* From cjfields at illinois.edu Sat Jun 11 19:26:27 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sat, 11 Jun 2011 18:26:27 -0500 Subject: [Bioperl-l] A question about Bioperl SeqUtil In-Reply-To: References: Message-ID: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> Hui, Please email questions to the mail list, not directly to individual developers. You probably could translate whole chromosomes this way, but it might take a while. I would also look at tools like EMBOSS to do this, should be faster. 
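For anyone who wants to see what six-frame translation involves before reaching for BioPerl or EMBOSS, here is a plain-Perl sketch. It is not the Bio::SeqUtils implementation: it assumes the standard genetic code, writes X for any codon containing an ambiguous base, and the function names revcomp, translate_frame and six_frames are made up for this example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Standard genetic code, built from the conventional TCAG ordering.
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_table;
my $i = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon_table{"$b1$b2$b3"} = substr($aas, $i++, 1);
        }
    }
}

# Reverse complement of an unambiguous DNA string.
sub revcomp {
    my ($dna) = @_;
    my $rc = scalar reverse(uc $dna);
    $rc =~ tr/ACGT/TGCA/;
    return $rc;
}

# Translate one frame codon by codon, starting at a 0-based offset.
sub translate_frame {
    my ($dna, $offset) = @_;
    my $protein = '';
    for (my $pos = $offset; $pos + 3 <= length($dna); $pos += 3) {
        my $codon = substr($dna, $pos, 3);
        $protein .= exists $codon_table{$codon} ? $codon_table{$codon} : 'X';
    }
    return $protein;
}

# Return a hash ref keyed +1..+3 (forward strand) and -1..-3 (reverse strand).
sub six_frames {
    my ($dna) = @_;
    $dna = uc $dna;
    my $rc = revcomp($dna);
    my %frames;
    for my $offset (0 .. 2) {
        $frames{ '+' . ($offset + 1) } = translate_frame($dna, $offset);
        $frames{ '-' . ($offset + 1) } = translate_frame($rc,  $offset);
    }
    return \%frames;
}

my $frames = six_frames('ATGGCCTAA');
print "$_\t$frames->{$_}\n" for sort keys %$frames;
```

For the toy input ATGGCCTAA, frame +1 is MA* and frame -1 is LGH. A whole chromosome would still be held in memory as a single string here, so this shows the logic rather than a fix for very large inputs.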
chris On Jun 11, 2011, at 10:31 AM, Hui Liu wrote: > Hi Chris, > Now I am having trouble translating large chromosome sequences in 6 frames. Do you know any available tools that supports large sequence translation? Does Bioperl 1.6.90 support that? > Thanks a lot :-) > Hui > -- > Nothing shocks me. I'm a scientist. > -Indiana Jones From locarpau at upvnet.upv.es Sun Jun 12 21:35:50 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero) Date: Mon, 13 Jun 2011 03:35:50 +0200 Subject: [Bioperl-l] passing twice a codon MSA to codeml factory In-Reply-To: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> Message-ID: <4DF56976.8080704@upvnet.upv.es> Hi, I'm trying to pass the same codon MSA several times to a sub which runs codeml with the parameters passed as arguments. However, the second time it is passed, the program stops with the error message: --------------------- WARNING --------------------- MSG: There was an error - see error_string for the program output --------------------------------------------------- ------------- EXCEPTION: Bio::Root::NotImplemented ------------- MSG: Unknown format of PAML output did not see seqtype STACK: Error::throw STACK: Bio::Root::Root::throw /Library/Perl//5.10.0/Bio/Root/Root.pm:472 STACK: Bio::Tools::Phylo::PAML::_parse_summary /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:526 STACK: Bio::Tools::Phylo::PAML::next_result /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:271 STACK: main::BranchSiteEvolAnalysis /Users/Lorenzo/Documents/workspace/PlantEvolGen/test.pl:364 STACK: /Users/Lorenzo/Documents/workspace/PlantEvolGen/test.pl:233 Here is just some partial code to illustrate what I'm saying: my $codon_MSA = Method_to_get_codonMSA ( $sequencesfilenameAA, $sequencesfilenameNT ); ( $lnL, $omegas, $pamlrun ) = BranchSiteEvolAnalysis ( $codon_MSA, 2, $tree, 0, 0, 0, 8 ); #The first time runs OK ( $lnL, $omegas, $pamlrun ) = BranchSiteEvolAnalysis ( $codon_MSA, 0, 
$tree, 0, 0, 0, 8 ); #The second time crashes #Method to_run PAML with the codon_MSA, tree, and codeml parameters passed as arguments sub BranchSiteEvolAnalysis { my ( $codon_MSA, $model, $tree, $NSsites, $fix_omega, $omega, $ncatG ) = @_; . . . my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml ( -alignment => $codon_MSA, -tree => $biotree, -params => { #'verbose' => 0, #'noisy' => 9, 'runmode' => 0, #user tree 'seqtype' => 1, 'model' => $model, 'NSsites' => $NSsites, 'fix_omega' => $fix_omega, 'omega' => $omega, 'ncatG' => $ncatG, #'icode' => 0, #'fix_alpha' => 0, #'fix_kappa' => 0, #'RateAncestor' => 0, 'CodonFreq' => 2, 'cleandata' => 0, # remove sites with amibguity data (1 yes, 0 no), 'ndata' => 2 }, ); . . . } I verified and the $codon_MSA ref point to the same location in memory before and after running the codeml_factory, so I guess it is not modified by the package in such a way that it couldn't be passed more than once. DO you know of any way to avoid redoing the $codon_MSA each time i want to pass it to the codeml_factory. Thank you very much, Lorenzo -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail: locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* From jordi.durban at gmail.com Mon Jun 13 07:50:04 2011 From: jordi.durban at gmail.com (Jordi Durban) Date: Mon, 13 Jun 2011 13:50:04 +0200 Subject: [Bioperl-l] API for frame() has changed?? Message-ID: Hi all! I tried to use an outdated perl script and I got this warning: * --------------------- WARNING --------------------- MSG: API for frame() has changed. 
Please refer to documentation for Bio::Search::HSP::GenericHSP; returning query frame ------------------------------**---------------------* It could be possible as the script has: *my $strand = $hsp->strand(); my $frame = $hsp->frame();* And I didn't found the strand() and frame() in Bio::Search::HSP::GenericHSP. Anyone knows how can I obtain such information from Bio::Search::HSP::GenericHSP?? I tried to get it from Bio::Search::Hit::HitI but I think it returns the information from the best blast hit, doesn't it? Thanks a lot -- Jordi From bosborne11 at verizon.net Mon Jun 13 09:07:15 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 13 Jun 2011 09:07:15 -0400 Subject: [Bioperl-l] API for frame() has changed?? In-Reply-To: References: Message-ID: http://www.bioperl.org/wiki/HOWTO:SearchIO On Jun 13, 2011, at 7:50 AM, Jordi Durban wrote: > Hi all! > I tried to use an outdated perl script and I got this warning: > * > --------------------- WARNING --------------------- > MSG: API for frame() has changed. > Please refer to documentation for Bio::Search::HSP::GenericHSP; > returning query frame > ------------------------------**---------------------* > > It could be possible as the script has: > > *my $strand = $hsp->strand(); > my $frame = $hsp->frame();* > > And I didn't found the strand() and frame() in Bio::Search::HSP::GenericHSP. > Anyone knows how can I obtain such information from > Bio::Search::HSP::GenericHSP?? > I tried to get it from Bio::Search::Hit::HitI but I think it returns the > information from the best blast hit, doesn't it? 
> Thanks a lot > > -- > Jordi > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jonas.zierer at campus.lmu.de Wed Jun 8 09:10:45 2011 From: jonas.zierer at campus.lmu.de (Jonas Zierer) Date: Wed, 08 Jun 2011 15:10:45 +0200 Subject: [Bioperl-l] reading sam files Message-ID: <1307538645.24614.1.camel@stanhope.bio.ifi.lmu.de> Hi, i tried to read a sam file insted of a bam file using DB::BIO::Sam, but that didn't work. Is there any other possibility to read sam files with bioperl? thx, bye From lhphanto at gmail.com Mon Jun 13 10:09:33 2011 From: lhphanto at gmail.com (Hui Liu) Date: Mon, 13 Jun 2011 09:09:33 -0500 Subject: [Bioperl-l] A question about Bioperl SeqUtil In-Reply-To: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> Message-ID: Hi Chris, Sorry. I will send to bioperl List next time. So my current problem is that large sequence will 'out of memory' using Bio::SeqUtil. It is of 500Mb size. Anyway, I will try to do something in my Perl code to divide the sequences. Thanks :-) Hui On Sat, Jun 11, 2011 at 6:26 PM, Chris Fields wrote: > Hui, > > Please email questions to the mail list, not directly to individual > developers. > > You probably could translate whole chromosomes this way, but it might take > a while. I would also look at tools like EMBOSS to do this, should be > faster. > > chris > > On Jun 11, 2011, at 10:31 AM, Hui Liu wrote: > > > Hi Chris, > > Now I am having trouble translating large chromosome sequences in 6 > frames. Do you know any available tools that supports large sequence > translation? Does Bioperl 1.6.90 support that? > > Thanks a lot :-) > > Hui > > -- > > Nothing shocks me. I'm a scientist. > > -Indiana Jones > > -- Nothing shocks me. I'm a scientist. 
-Indiana Jones From joshpk105 at gmail.com Mon Jun 13 12:22:51 2011 From: joshpk105 at gmail.com (josh katz) Date: Mon, 13 Jun 2011 12:22:51 -0400 Subject: [Bioperl-l] Filtering Hits and HSP Message-ID: I was wondering if anyone had a way of filtering hits by top-k using the Bio::SearchIO::Writer::TextResultWriter. A specific example of what I want, I have a Blast file that contains 50 hits and 50 alignments per hit. I would instead like the top 10 hits and 10 alignments from this file, I can filter the HSP by using $hsp->rank value but I can't come up with a counting method for the hits. Any ideas would be appreciated. Thanks, Josh -- ________________________________________________________ "Only extraneous observations, can explain self." Joshua P. Katz From scott at scottcain.net Mon Jun 13 12:39:03 2011 From: scott at scottcain.net (Scott Cain) Date: Mon, 13 Jun 2011 12:39:03 -0400 Subject: [Bioperl-l] reading sam files In-Reply-To: <1307538645.24614.1.camel@stanhope.bio.ifi.lmu.de> References: <1307538645.24614.1.camel@stanhope.bio.ifi.lmu.de> Message-ID: Hello Jonas, When you say it didn't work, what do you mean? Did you read the documentation, and it didn't work the way the documentation indicated? Could you show the code you used and any error messages you got? Scott On Wed, Jun 8, 2011 at 9:10 AM, Jonas Zierer wrote: > Hi, > > i tried to read a sam file insted of a bam file using DB::BIO::Sam, but > that didn't work. > > Is there any other possibility to read sam files with bioperl? > > thx, bye > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- ------------------------------------------------------------------------ Scott Cain, Ph. D.? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?? scott at scottcain dot net GMOD Coordinator (http://gmod.org/)? ? ? ? ? ? ? ? ? ?? 
216-392-3087 Ontario Institute for Cancer Research
From rondonbio at yahoo.com.br Mon Jun 13 13:44:40 2011 From: rondonbio at yahoo.com.br (Rondon Neto) Date: Mon, 13 Jun 2011 10:44:40 -0700 (PDT) Subject: [Bioperl-l] coverage percentage Message-ID: <418099.75703.qm@web130206.mail.mud.yahoo.com> I need the subject's coverage percentage from an alignment (in this case, a BLAT alignment). I saw that there is a coverage map in Bio::Search::Tiling, but it doesn't help. So I wrote the script below, but as you can see, it doesn't take into account that there can be non-aligned regions between HSPs. Can anyone help me? Thank you, Rondon

#!/usr/bin/perl
use warnings;
use strict;
use Bio::SearchIO;

my $in = Bio::SearchIO->new( -format => 'psl', -file => $ARGV[0] );
my %hash;

# Record the outermost hit coordinates seen for each subject.
while ( my $result = $in->next_result ) {
    while ( my $hit = $result->next_hit ) {
        while ( my $hsp = $hit->next_hsp ) {
            my $subject = $hit->name();
            my $start   = $hsp->start('hit');
            my $end     = $hsp->end('hit');
            my $length  = $hit->length();
            if ( defined $hash{$subject} ) {
                # Two separate tests: a single HSP can extend both ends.
                $hash{$subject}{start} = $start if $start < $hash{$subject}{start};
                $hash{$subject}{end}   = $end   if $end   > $hash{$subject}{end};
            }
            else {
                $hash{$subject}{start} = $start;
                $hash{$subject}{end}   = $end;
            }
            $hash{$subject}{length} = $length;
        }
    }
}

# Percentage of each subject spanned (gaps between HSPs are not subtracted).
foreach my $key ( keys %hash ) {
    my $tam = $hash{$key}{end} - $hash{$key}{start};
    my $per = $tam * 100 / $hash{$key}{length};
    printf "%s\t%4.2f%%\n", $key, $per;
}
exit;

From fanx0038 at umn.edu Mon Jun 13 17:40:36 2011 From: fanx0038 at umn.edu (Flora Fan) Date: Mon, 13 Jun 2011 16:40:36 -0500 Subject: [Bioperl-l] BioPerl and undefined value Message-ID: Hi, I am trying to retrieve UTR for a list of RefSeq IDs using get_sequence.
However, because some of my identifiers were removed from the most current DB, BioPerl stopped when it first encounters an undefined ID , and gave the following error message: ------------- EXCEPTION ------------- MSG: acc does not exist STACK Bio::DB::WebDBSeqI::get_Seq_by_acc /project/ccbioinf/Software/Ensembl/ensembl_55/bioperl-live/Bio/DB/WebDBSeqI. pm:177 STACK Bio::DB::GenBank::get_Seq_by_acc /project/ccbioinf/Software/Ensembl/ensembl_55/bioperl-live/Bio/DB/GenBank.pm :216 STACK Bio::Perl::get_sequence /project/ccbioinf/Software/Ensembl/ensembl_55/bioperl-live/Bio/Perl.pm:510 STACK main::BEGIN test_retrieve_UTR.pl:11 STACK (eval) test_retrieve_UTR.pl:11 STACK toplevel test_retrieve_UTR.pl:11 Does anybody know how to let the script move on to the next ID, and skip those undefined values? ?if (defined get_sequence('Genbank',$ID)) {}? doesn?t work.... Thank you very much. -Flora -- Flora Danhua Fan, Ph.D Biostatistics and Bioinformatics Masonic Cancer Center, University of Minnesota 425 Delaware St SE MMC 806, Minneapolis, MN 55455 Office: 2-152 Moos Tower Phone: 612-625-3648 http://www.cancer.umn.edu/research/cores/biostats/bioinformatics.html From scott at scottcain.net Mon Jun 13 21:33:19 2011 From: scott at scottcain.net (Scott Cain) Date: Mon, 13 Jun 2011 21:33:19 -0400 Subject: [Bioperl-l] BioPerl and undefined value In-Reply-To: References: Message-ID: next unless $ID; On Mon, Jun 13, 2011 at 5:40 PM, Flora Fan wrote: > Hi, > I am trying to retrieve UTR for a list of RefSeq IDs using get_sequence. > However, because some of my identifiers were removed from the most current > DB, BioPerl stopped when it first encounters an undefined ID , and gave the > following error message: > ------------- EXCEPTION ?------------- > MSG: acc does not exist > STACK Bio::DB::WebDBSeqI::get_Seq_by_acc > /project/ccbioinf/Software/Ensembl/ensembl_55/bioperl-live/Bio/DB/WebDBSeqI. 
> pm:177 > STACK Bio::DB::GenBank::get_Seq_by_acc > /project/ccbioinf/Software/Ensembl/ensembl_55/bioperl-live/Bio/DB/GenBank.pm > :216 > STACK Bio::Perl::get_sequence > /project/ccbioinf/Software/Ensembl/ensembl_55/bioperl-live/Bio/Perl.pm:510 > STACK main::BEGIN test_retrieve_UTR.pl:11 > STACK (eval) test_retrieve_UTR.pl:11 > STACK toplevel test_retrieve_UTR.pl:11 > > Does anybody know how to let the script move on to the next ID, and skip > those undefined values? "if (defined get_sequence('Genbank',$ID)) {}" > doesn't work.... > > Thank you very much. > -Flora > > -- > Flora Danhua Fan, Ph.D > Biostatistics and Bioinformatics > Masonic Cancer Center, University of Minnesota > 425 Delaware St SE MMC 806, Minneapolis, MN 55455 > Office: 2-152 Moos Tower > Phone: 612-625-3648 > http://www.cancer.umn.edu/research/cores/biostats/bioinformatics.html > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research
From roy.chaudhuri at gmail.com Tue Jun 14 06:26:01 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Tue, 14 Jun 2011 11:26:01 +0100 Subject: [Bioperl-l] BioPerl and undefined value In-Reply-To: References: Message-ID: <4DF73739.7010006@gmail.com> I don't think that will work - as I understand it, the problem is that $ID is not found in GenBank, not that it has an undefined value. Flora, maybe you could try wrapping the get_sequence call in an eval block, something like:

for my $ID (@ID) {
    my $seq;
    eval { $seq = get_sequence('genbank', $ID) };
    next if $@;
    # do stuff with $seq here
}

Hope this helps, Roy.
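A self-contained sketch of the eval approach, for trying the pattern without network access. The fetch_all helper, the %fake_db table and the accession strings in it are invented placeholders standing in for get_sequence and real GenBank records:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Call $fetch on each ID; collect the results and the IDs whose lookup died.
# (Hypothetical helper, not part of BioPerl.)
sub fetch_all {
    my ($fetch, @ids) = @_;
    my (%seq_for, @failed);
    for my $id (@ids) {
        my $seq = eval { $fetch->($id) };
        if ($@ or not defined $seq) {
            push @failed, $id;    # exception thrown or nothing returned
            next;
        }
        $seq_for{$id} = $seq;
    }
    return (\%seq_for, \@failed);
}

# Stand-in for get_sequence('genbank', $id): dies on unknown accessions.
# The accessions and sequences below are made up.
my %fake_db = (NM_000001 => 'ACGT', NM_000002 => 'TTGA');
my $fake_fetch = sub {
    my ($id) = @_;
    die "acc does not exist\n" unless exists $fake_db{$id};
    return $fake_db{$id};
};

my ($found, $failed) = fetch_all($fake_fetch, qw(NM_000001 NM_999999 NM_000002));
print "fetched: ", join(', ', sort keys %$found), "\n";
print "skipped: ", join(', ', @$failed), "\n";
```

With BioPerl installed, the same helper could presumably be called as fetch_all(sub { get_sequence('genbank', $_[0]) }, @ID), which also leaves a list of the accessions that were skipped.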
On 14/06/2011 02:33, Scott Cain wrote: > next unless $ID; > > > On Mon, Jun 13, 2011 at 5:40 PM, Flora Fan wrote: >> Hi, >> I am trying to retrieve UTR for a list of RefSeq IDs using get_sequence. >> However, because some of my identifiers were removed from the most current >> DB, BioPerl stopped when it first encounters an undefined ID , and gave the >> following error message: >> ------------- EXCEPTION ------------- >> MSG: acc does not exist >> STACK Bio::DB::WebDBSeqI::get_Seq_by_acc >> /project/ccbioinf/Software/Ensembl/ensembl_55/bioperl-live/Bio/DB/WebDBSeqI. >> pm:177 >> STACK Bio::DB::GenBank::get_Seq_by_acc >> /project/ccbioinf/Software/Ensembl/ensembl_55/bioperl-live/Bio/DB/GenBank.pm >> :216 >> STACK Bio::Perl::get_sequence >> /project/ccbioinf/Software/Ensembl/ensembl_55/bioperl-live/Bio/Perl.pm:510 >> STACK main::BEGIN test_retrieve_UTR.pl:11 >> STACK (eval) test_retrieve_UTR.pl:11 >> STACK toplevel test_retrieve_UTR.pl:11 >> >> Does anybody know how to let the script move on to the next ID, and skip >> those undefined values? ?if (defined get_sequence('Genbank',$ID)) {}? >> doesn?t work.... >> >> Thank you very much. 
>> -Flora >> >> -- >> Flora Danhua Fan, Ph.D >> Biostatistics and Bioinformatics >> Masonic Cancer Center, University of Minnesota >> 425 Delaware St SE MMC 806, Minneapolis, MN 55455 >> Office: 2-152 Moos Tower >> Phone: 612-625-3648 >> http://www.cancer.umn.edu/research/cores/biostats/bioinformatics.html >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > From jonas.zierer at campus.lmu.de Tue Jun 14 07:47:23 2011 From: jonas.zierer at campus.lmu.de (Jonas Zierer) Date: Tue, 14 Jun 2011 13:47:23 +0200 Subject: [Bioperl-l] reading sam files In-Reply-To: References: <1307538645.24614.1.camel@stanhope.bio.ifi.lmu.de> Message-ID: <1308052043.30552.35.camel@hensel.bio.ifi.lmu.de> Hi Scott, no, i didn't find any way to read sam files in the documentation (http://cpan.uwinnipeg.ca/htdocs/Bio-SamTools/Bio/DB/Sam.html ) the BIO::DB::Sam->new method only takes a -bam argument and give a sam file to this method it says [bam_header_read] EOF marker is absent. [bam_index_load] fail to load BAM index. [bam_header_read] EOF marker is absent. [bam_header_read] EOF marker is absent. Jonas Am Monday, den 13.06.2011, 12:39 -0400 schrieb Scott Cain: > Hello Jonas, > > When you say it didn't work, what do you mean? Did you read the > documentation, and it didn't work the way the documentation indicated? > Could you show the code you used and any error messages you got? > > Scott > > > On Wed, Jun 8, 2011 at 9:10 AM, Jonas Zierer wrote: > > Hi, > > > > i tried to read a sam file insted of a bam file using DB::BIO::Sam, but > > that didn't work. > > > > Is there any other possibility to read sam files with bioperl? 
> > > > thx, bye > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > From shachigahoimbi at gmail.com Tue Jun 14 05:37:40 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Tue, 14 Jun 2011 15:07:40 +0530 Subject: [Bioperl-l] problem in inserting image in pdf file Message-ID: Dear All, I want to insert my image file in newly created pdf file. For this I am using bioperl module PDF::Create to create new pdf file. But when I inserted my image file in newly created pdf file, whole image is not coming in pdf file. I have attached used image file and created pdf file. Please tell me the required changes in script. Code is given below ---- ############################## ###################################### #!/usr/bin/perl use strict; use warnings; use PDF::Create; my $pdf = new PDF::Create('filename' => 'IMAGE11.pdf'); my $a4 = $pdf->new_page('MediaBox' => $pdf->get_page_size('A4')); my $page = $a4->new_page; my $image = $pdf->image('img.jpeg'); $page->image(image => $image, xpos => -5, ypos => 650); $pdf->close; ############################################################################## -- Regards, Shachi -------------- next part -------------- A non-text attachment was scrubbed... Name: img.jpeg Type: image/jpeg Size: 12851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: IMAGE11.pdf Type: application/pdf Size: 13903 bytes Desc: not available URL: From R.A.Vos at reading.ac.uk Tue Jun 14 09:53:38 2011 From: R.A.Vos at reading.ac.uk (Rutger Vos) Date: Tue, 14 Jun 2011 14:53:38 +0100 Subject: [Bioperl-l] problem in inserting image in pdf file In-Reply-To: References: Message-ID: Dear Shachi, this question doesn't seem related to BioPerl, perhaps you might instead ask the author(s) of PDF::Create, or read its documentation? 
Best wishes, Rutger On Tue, Jun 14, 2011 at 10:37 AM, Shachi Gahoi wrote: > Dear All, > > I want to insert my image file in newly created pdf file. For this I am > using bioperl module PDF::Create to create new pdf file. > > But when I inserted my image file in newly created pdf file, whole image is > not coming in pdf file. I have attached used image file and created pdf > file. > > Please tell me the required changes in script. > > Code is given below ---- > > ############################## > ###################################### > #!/usr/bin/perl > use strict; > use warnings; > > use PDF::Create; > > my $pdf = new PDF::Create('filename' => 'IMAGE11.pdf'); > > my $a4 = $pdf->new_page('MediaBox' => $pdf->get_page_size('A4')); > > my $page = $a4->new_page; > > my $image = $pdf->image('img.jpeg'); > > $page->image(image => $image, xpos => -5, ypos => 650); > > $pdf->close; > ############################################################################## > > > -- > Regards, > Shachi > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Dr. Rutger A. Vos School of Biological Sciences Philip Lyle Building, Level 4 University of Reading Reading, RG6 6BX, United Kingdom Tel: +44 (0) 118 378 7535 http://rutgervos.blogspot.com
From cjfields at illinois.edu Tue Jun 14 09:56:58 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 14 Jun 2011 08:56:58 -0500 Subject: [Bioperl-l] problem in inserting image in pdf file In-Reply-To: References: Message-ID: On Jun 14, 2011, at 4:37 AM, Shachi Gahoi wrote: > Dear All, > > I want to insert my image file in newly created pdf file. For this I am > using bioperl module PDF::Create to create new pdf file. I know we own a substantial bit of real estate on CPAN, but PDF::Create is NOT a BioPerl module. You should contact the developer (from the PDF::Create README, bcc'd): Markus Baertschi, I have taken over maintenence of PDF::Create as Fabien has disappeared and did no longer maintain it in the last years. The last version of PDF::Create from Fabien is 0.06. All never versions have been modified by me.
I maintain PDF::Create in git. You can access the repository directly at http://github.com/markusb/pdf-create. chris > But when I inserted my image file in newly created pdf file, whole image is > not coming in pdf file. I have attached used image file and created pdf > file. > > Please tell me the required changes in script. > > Code is given below ---- > > ############################## > ###################################### > #!/usr/bin/perl > use strict; > use warnings; > > use PDF::Create; > > my $pdf = new PDF::Create('filename' => 'IMAGE11.pdf'); > > my $a4 = $pdf->new_page('MediaBox' => $pdf->get_page_size('A4')); > > my $page = $a4->new_page; > > my $image = $pdf->image('img.jpeg'); > > $page->image(image => $image, xpos => -5, ypos => 650); > > $pdf->close; > ############################################################################## > > > -- > Regards, > Shachi > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l
From scott at scottcain.net Tue Jun 14 10:29:34 2011 From: scott at scottcain.net (Scott Cain) Date: Tue, 14 Jun 2011 10:29:34 -0400 Subject: [Bioperl-l] reading sam files In-Reply-To: <1308052043.30552.35.camel@hensel.bio.ifi.lmu.de> References: <1307538645.24614.1.camel@stanhope.bio.ifi.lmu.de> <1308052043.30552.35.camel@hensel.bio.ifi.lmu.de> Message-ID: Hi Jonas, Then I think you are correct that there isn't a way to read sam files directly. You could convert to bam, or extend Bio::DB::Sam to do what you want (patches welcome!). Scott On Tue, Jun 14, 2011 at 7:47 AM, Jonas Zierer wrote: > Hi Scott, > > no, i didn't find any way to read sam files in the documentation > (http://cpan.uwinnipeg.ca/htdocs/Bio-SamTools/Bio/DB/Sam.html ) > > the BIO::DB::Sam->new method only takes a -bam argument and give a sam > file to this method it says > [bam_header_read] EOF marker is > absent. > [bam_index_load] fail to load BAM > index.
> [bam_header_read] EOF marker is > absent. > [bam_header_read] EOF marker is absent. > > Jonas > > > Am Monday, den 13.06.2011, 12:39 -0400 schrieb Scott Cain: >> Hello Jonas, >> >> When you say it didn't work, what do you mean? Did you read the >> documentation, and it didn't work the way the documentation indicated? >> Could you show the code you used and any error messages you got? >> >> Scott >> >> >> On Wed, Jun 8, 2011 at 9:10 AM, Jonas Zierer wrote: >> > Hi, >> > >> > i tried to read a sam file insted of a bam file using DB::BIO::Sam, but >> > that didn't work. >> > >> > Is there any other possibility to read sam files with bioperl? >> > >> > thx, bye >> > >> > _______________________________________________ >> > Bioperl-l mailing list >> > Bioperl-l at lists.open-bio.org >> > http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > >> >> >> > > > -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research From avilella at gmail.com Tue Jun 14 10:35:23 2011 From: avilella at gmail.com (Albert Vilella) Date: Tue, 14 Jun 2011 15:35:23 +0100 Subject: [Bioperl-l] blast2sam Bio::AlignIO::sam was this ever implemented? Message-ID: http://seqanswers.com/forums/showpost.php?p=12992&postcount=4 I would be interested in using it myself right now :-p From dhoworth at mrc-lmb.cam.ac.uk Tue Jun 14 10:26:32 2011 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Tue, 14 Jun 2011 15:26:32 +0100 Subject: [Bioperl-l] problem in inserting image in pdf file In-Reply-To: References: Message-ID: <4DF76F98.1020106@mrc-lmb.cam.ac.uk> Chris Fields wrote: > On Jun 14, 2011, at 4:37 AM, Shachi Gahoi wrote: >> I want to insert my image file in newly created pdf file. For this >> I am using bioperl module PDF::Create to create new pdf file.
> > I know we own a substantial bit of real estate on CPAN, but > PDF::Create is NOT a BioPerl module. You should contact the > developer (from the PDF::Create README, bcc'd): Well, better not to suggest bothering the author, but read the documentation as Rutger suggested, since the problem is obvious and the cure is documented. Cheers, Dave PS and Shachi, please stop sending offlist mail. From cjfields at illinois.edu Tue Jun 14 13:26:38 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 14 Jun 2011 12:26:38 -0500 Subject: [Bioperl-l] reading sam files In-Reply-To: References: <1307538645.24614.1.camel@stanhope.bio.ifi.lmu.de> <1308052043.30552.35.camel@hensel.bio.ifi.lmu.de> Message-ID: <007C526E-B99C-4D9F-927A-B25BB3B92A9B@illinois.edu> Jonas, I think you need TAM (aka text-based SAM, which I think most users call SAM output). It does handle that, look here: http://search.cpan.org/~lds/Bio-SamTools-1.28/lib/Bio/DB/Sam.pm#TAM_Files I'm not sure why you would want to do this though, as the BAM file can be manipulated directly as well (would think it is faster working with BAM). Any reason you are doing this? chris On Jun 14, 2011, at 9:29 AM, Scott Cain wrote: > Hi Jonas, > > Then I think you are correct that there isn't a way to read sam files > directly. You could convert to bam, or extend Bio::DB::Sam to do what > you want (patches welcome!). > > Scott > > > On Tue, Jun 14, 2011 at 7:47 AM, Jonas Zierer > wrote: >> Hi Scott, >> >> no, i didn't find any way to read sam files in the documentation >> (http://cpan.uwinnipeg.ca/htdocs/Bio-SamTools/Bio/DB/Sam.html ) >> >> the BIO::DB::Sam->new method only takes a -bam argument and give a sam >> file to this method it says >> [bam_header_read] EOF marker is >> absent. >> [bam_index_load] fail to load BAM >> index. >> [bam_header_read] EOF marker is >> absent. >> [bam_header_read] EOF marker is absent. 
>> >> Jonas >> >> >> Am Monday, den 13.06.2011, 12:39 -0400 schrieb Scott Cain: >>> Hello Jonas, >>> >>> When you say it didn't work, what do you mean? Did you read the >>> documentation, and it didn't work the way the documentation indicated? >>> Could you show the code you used and any error messages you got? >>> >>> Scott >>> >>> >>> On Wed, Jun 8, 2011 at 9:10 AM, Jonas Zierer wrote: >>>> Hi, >>>> >>>> i tried to read a sam file insted of a bam file using DB::BIO::Sam, but >>>> that didn't work. >>>> >>>> Is there any other possibility to read sam files with bioperl? >>>> >>>> thx, bye >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> >>> >>> >> >> >> > > > > -- > ------------------------------------------------------------------------ > Scott Cain, Ph. D. scott at scottcain dot net > GMOD Coordinator (http://gmod.org/) 216-392-3087 > Ontario Institute for Cancer Research > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jonas.zierer at campus.lmu.de Wed Jun 15 04:50:15 2011 From: jonas.zierer at campus.lmu.de (Jonas Zierer) Date: Wed, 15 Jun 2011 10:50:15 +0200 Subject: [Bioperl-l] reading sam files In-Reply-To: <007C526E-B99C-4D9F-927A-B25BB3B92A9B@illinois.edu> References: <1307538645.24614.1.camel@stanhope.bio.ifi.lmu.de> <1308052043.30552.35.camel@hensel.bio.ifi.lmu.de> <007C526E-B99C-4D9F-927A-B25BB3B92A9B@illinois.edu> Message-ID: <1308127815.28546.3.camel@boole.bio.ifi.lmu.de> Hi, the reason is, that i want do add tags to the sam file. as far as i know this is only possible in sam files and not in bam files. now i have to read the sam file analyze the read and add the tag. These tags are written to the sam file which can then be converted to bam. 
If i have to read a bam file with bioperl i have to convert the sam to bam, read it with bioperl, change the sam file and convert it again. my target was to avoid this second conversion, but i think that's the best solution. thx for your help! bye Am Tuesday, den 14.06.2011, 12:26 -0500 schrieb Chris Fields: > Jonas, > > I think you need TAM (aka text-based SAM, which I think most users call SAM output). It does handle that, look here: > > http://search.cpan.org/~lds/Bio-SamTools-1.28/lib/Bio/DB/Sam.pm#TAM_Files > > I'm not sure why you would want to do this though, as the BAM file can be manipulated directly as well (would think it is faster working with BAM). Any reason you are doing this? > > chris > > On Jun 14, 2011, at 9:29 AM, Scott Cain wrote: > > > Hi Jonas, > > > > Then I think you are correct that there isn't a way to read sam files > > directly. You could convert to bam, or extend Bio::DB::Sam to do what > > you want (patches welcome!). > > > > Scott > > > > > > On Tue, Jun 14, 2011 at 7:47 AM, Jonas Zierer > > wrote: > >> Hi Scott, > >> > >> no, i didn't find any way to read sam files in the documentation > >> (http://cpan.uwinnipeg.ca/htdocs/Bio-SamTools/Bio/DB/Sam.html ) > >> > >> the BIO::DB::Sam->new method only takes a -bam argument and give a sam > >> file to this method it says > >> [bam_header_read] EOF marker is > >> absent. > >> [bam_index_load] fail to load BAM > >> index. > >> [bam_header_read] EOF marker is > >> absent. > >> [bam_header_read] EOF marker is absent. > >> > >> Jonas > >> > >> > >> Am Monday, den 13.06.2011, 12:39 -0400 schrieb Scott Cain: > >>> Hello Jonas, > >>> > >>> When you say it didn't work, what do you mean? Did you read the > >>> documentation, and it didn't work the way the documentation indicated? > >>> Could you show the code you used and any error messages you got? 
> >>> > >>> Scott > >>> > >>> > >>> On Wed, Jun 8, 2011 at 9:10 AM, Jonas Zierer wrote: > >>>> Hi, > >>>> > >>>> i tried to read a sam file insted of a bam file using DB::BIO::Sam, but > >>>> that didn't work. > >>>> > >>>> Is there any other possibility to read sam files with bioperl? > >>>> > >>>> thx, bye > >>>> > >>>> _______________________________________________ > >>>> Bioperl-l mailing list > >>>> Bioperl-l at lists.open-bio.org > >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>>> > >>> > >>> > >>> > >> > >> > >> > > > > > > > > -- > > ------------------------------------------------------------------------ > > Scott Cain, Ph. D. scott at scottcain dot net > > GMOD Coordinator (http://gmod.org/) 216-392-3087 > > Ontario Institute for Cancer Research > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From p.j.a.cock at googlemail.com Wed Jun 15 05:03:05 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 15 Jun 2011 10:03:05 +0100 Subject: [Bioperl-l] reading sam files In-Reply-To: <1308127815.28546.3.camel@boole.bio.ifi.lmu.de> References: <1307538645.24614.1.camel@stanhope.bio.ifi.lmu.de> <1308052043.30552.35.camel@hensel.bio.ifi.lmu.de> <007C526E-B99C-4D9F-927A-B25BB3B92A9B@illinois.edu> <1308127815.28546.3.camel@boole.bio.ifi.lmu.de> Message-ID: On Wed, Jun 15, 2011 at 9:50 AM, Jonas Zierer wrote: > Hi, > > the reason is, that i want do add tags to the sam file. as far as i know > this is only possible in sam files and not in bam files. now i have to > read the sam file analyze the read and add the tag. These tags are > written to the sam file which can then be converted to bam. > If i have to read a bam file with bioperl i have to convert the sam to > bam, read it with bioperl, change the sam file and convert it again. 
> > my target was to avoid this second conversion, but i think that's the > best solution. > thx for your help! > bye Both SAM and BAM support plain text header text, e.g. @CO lines. If you are looking for an efficient way to update this header in a BAM file, try the samtools reheader command. Peter From jonas.zierer at campus.lmu.de Wed Jun 15 05:07:30 2011 From: jonas.zierer at campus.lmu.de (Jonas Zierer) Date: Wed, 15 Jun 2011 11:07:30 +0200 Subject: [Bioperl-l] reading sam files In-Reply-To: References: <1307538645.24614.1.camel@stanhope.bio.ifi.lmu.de> <1308052043.30552.35.camel@hensel.bio.ifi.lmu.de> <007C526E-B99C-4D9F-927A-B25BB3B92A9B@illinois.edu> <1308127815.28546.3.camel@boole.bio.ifi.lmu.de> Message-ID: <1308128850.28546.5.camel@boole.bio.ifi.lmu.de> i don't want to update the header, but add tags to every single aligned read! (in plain text format X[A-Z]:: at the end of the line) Jonas Am Wednesday, den 15.06.2011, 10:03 +0100 schrieb Peter Cock: > On Wed, Jun 15, 2011 at 9:50 AM, Jonas Zierer > wrote: > > Hi, > > > > the reason is, that i want do add tags to the sam file. as far as i know > > this is only possible in sam files and not in bam files. now i have to > > read the sam file analyze the read and add the tag. These tags are > > written to the sam file which can then be converted to bam. > > If i have to read a bam file with bioperl i have to convert the sam to > > bam, read it with bioperl, change the sam file and convert it again. > > > > my target was to avoid this second conversion, but i think that's the > > best solution. > > thx for your help! > > bye > > Both SAM and BAM support plain text header text, e.g. @CO lines. > If you are looking for an efficient way to update this header in a BAM > file, try the samtools reheader command. 
> > Peter From shachigahoimbi at gmail.com Wed Jun 15 06:15:38 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Wed, 15 Jun 2011 15:45:38 +0530 Subject: [Bioperl-l] problem in Bio::Graphics module Message-ID: Dear all, Is it possible to generate images in jpeg or gif format using Bio::Graphics module? I am using Bio::Graphics::Panel module to generate image. But it is generating image in *png* format but I want image in jpeg or gif format. Is it possible to generate image in jpeg or gif format using Bio::Graphics module? If anyone knows please help me. Thanks in advance -- Regards, Shachi From roy.chaudhuri at gmail.com Wed Jun 15 06:32:04 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Wed, 15 Jun 2011 11:32:04 +0100 Subject: [Bioperl-l] problem in Bio::Graphics module In-Reply-To: References: Message-ID: <4DF88A24.6000702@gmail.com> From the documentation to Bio::Graphics::Panel: $gd = $panel->gd([$gd]) The gd() method lays out the image and returns a GD::Image object containing it. You may then call the GD::Image object's png() or jpeg() methods to get the image data. So you'll want $panel->gd->jpeg (or $panel->gd->gif). Please ensure you read the documentation thoroughly before asking questions to the mailing list. Cheers, Roy. On 15/06/2011 11:15, Shachi Gahoi wrote: > Dear all, > > Is it possible to generate images in jpeg or gif format using Bio::Graphics > module? > > I am using Bio::Graphics::Panel module to generate image. But it is > generating image in *png* format but I want image in jpeg or gif format. > > Is it possible to generate image in jpeg or gif format using Bio::Graphics > module? If anyone knows please help me.
> > > Thanks in advance > From chad.a.davis at gmail.com Wed Jun 15 07:21:02 2011 From: chad.a.davis at gmail.com (Chad Davis) Date: Wed, 15 Jun 2011 13:21:02 +0200 Subject: [Bioperl-l] Per-column conservation of multiple alignment in Perl Message-ID: I asked this on BioStar, but then started thinking a patch to Bio::SimpleAlign would be easy, depending on what people here think ... http://biostar.stackexchange.com/questions/9196/per-column-conservation-of-multiple-alignment-in-perl Given a Bio::SimpleAlign, what is the best way to get per-column conservation scores. E.g. into an array of values in [0:1] where the array length would be the same as $align->length. I don't find anything like this in Bio::SimpleAlign. I'm looking for a function that allows: my $io = Bio::AlignIO->new(-file=>$file); my $align = $io->next_aln; my @cons = $align->percentage_identity_by_column(); # <- does this exist? print "@cons"; # 0.75 1.0 1.0 1.0 0.64 .... Or should I just concat the gapped sequence, use substr() to extract the characters and count them with a hash and return the frequency of the most frequent character per column? It looks like the private method Bio::SimpleAlign::_consensus_aa() already does most of this, but it returns the character rather than the fraction, which is what I was looking for. Short of submitting a patch for that, is there a better approach? Would there be general interest in such a patch to get per-column conservation of multiple alignments? Chad From jun.yin at ucd.ie Wed Jun 15 08:53:25 2011 From: jun.yin at ucd.ie (Jun Yin) Date: Wed, 15 Jun 2011 13:53:25 +0100 Subject: [Bioperl-l] Per-column conservation of multiple alignment in Perl In-Reply-To: References: Message-ID: <01de01cc2b5b$371dcd20$a5596760$%yin@ucd.ie> Hi, There is no function in Bio::SimpleAlign calculating per-column-conservation. I think it is a good idea to implement it. Just one suggestion, you can also define a window size parameter to get per-window-conservation. 
This will make this function more useful. Cheers, Jun Yin Ph.D. student in U.C.D. Bioinformatics Laboratory Conway Institute University College Dublin -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chad Davis Sent: Wednesday, June 15, 2011 12:21 PM To: bioperl-l at lists.open-bio.org Subject: [Bioperl-l] Per-column conservation of multiple alignment in Perl I asked this on BioStar, but then started thinking a patch to Bio::SimpleAlign would be easy, depending on what people here think ... http://biostar.stackexchange.com/questions/9196/per-column-conservation-of-multiple-alignment-in-perl Given a Bio::SimpleAlign, what is the best way to get per-column conservation scores. E.g. into an array of values in [0:1] where the array length would be the same as $align->length. I don't find anything like this in Bio::SimpleAlign. I'm looking for a function that allows: my $io = Bio::AlignIO->new(-file=>$file); my $align = $io->next_aln; my @cons = $align->percentage_identity_by_column(); # <- does this exist? print "@cons"; # 0.75 1.0 1.0 1.0 0.64 .... Or should I just concat the gapped sequence, use substr() to extract the characters and count them with a hash and return the frequency of the most frequent character per column? It looks like the private method Bio::SimpleAlign::_consensus_aa() already does most of this, but it returns the character rather than the fraction, which is what I was looking for. Short of submitting a patch for that, is there a better approach? Would there be general interest in such a patch to get per-column conservation of multiple alignments?
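The substr()-and-hash approach described in the question above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the input is a list of plain gapped strings of equal length, column_conservation is a hypothetical helper name (it is not an existing Bio::SimpleAlign method), and gap characters are counted like any other symbol.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::SimpleAlign: for each alignment
# column, return the frequency of the most common character (0..1).
# Assumes all rows are gapped strings of the same length; gaps are
# counted like any other character.
sub column_conservation {
    my @rows = @_;
    return () unless @rows;
    my $len = length $rows[0];
    my @cons;
    for my $col (0 .. $len - 1) {
        my %count;
        # Tally the (upper-cased) character of every row at this column.
        $count{ uc substr($_, $col, 1) }++ for @rows;
        # Highest count in this column.
        my ($max) = sort { $b <=> $a } values %count;
        push @cons, $max / scalar @rows;
    }
    return @cons;
}

my @cons = column_conservation(qw(ACGT ACGA ACTA AGTA));
print join(' ', @cons), "\n";    # prints "1 0.75 0.5 0.75"
```

With a Bio::SimpleAlign object, the rows could presumably be collected with something like map { $_->seq } $align->each_seq before calling the helper; that wiring is an assumption here and is not tested against any particular BioPerl release.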
Chad _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From Russell.Smithies at agresearch.co.nz Wed Jun 15 17:05:23 2011 From: Russell.Smithies at agresearch.co.nz (Smithies, Russell) Date: Thu, 16 Jun 2011 09:05:23 +1200 Subject: [Bioperl-l] A question about Bioperl SeqUtil In-Reply-To: References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> Message-ID: <18DF7D20DFEC044098A1062202F5FFF3396074D1A6@exchsth.agresearch.co.nz> How about running your chromosomes through getorf or similar from EMBOSS then translating just the ORFs (with 6-frame or BioPerl)? --Russell > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Hui Liu > Sent: Tuesday, 14 June 2011 2:10 a.m. > To: Chris Fields > Cc: BioPerl List > Subject: Re: [Bioperl-l] A question about Bioperl SeqUtil > > Hi Chris, > Sorry. I will send to bioperl List next time. So my current problem > is > that large sequence will 'out of memory' using Bio::SeqUtil. It is of > 500Mb > size. Anyway, I will try to do something in my Perl code to divide the > sequences. > Thanks :-) > Hui > On Sat, Jun 11, 2011 at 6:26 PM, Chris Fields > wrote: > > > Hui, > > > > Please email questions to the mail list, not directly to individual > > developers. > > > > You probably could translate whole chromosomes this way, but it might > take > > a while. I would also look at tools like EMBOSS to do this, should > be > > faster. > > > > chris > > > > On Jun 11, 2011, at 10:31 AM, Hui Liu wrote: > > > > > Hi Chris, > > > Now I am having trouble translating large chromosome sequences > in 6 > > frames. Do you know any available tools that supports large sequence > > translation? Does Bioperl 1.6.90 support that? > > > Thanks a lot :-) > > > Hui > > > -- > > > Nothing shocks me. I'm a scientist. 
> > > -Indiana Jones > > > > > > > -- > Nothing shocks me. I'm a scientist. > -Indiana Jones > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l ======================================================================= Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately. =======================================================================
From mconte at umd.edu Fri Jun 17 14:17:49 2011 From: mconte at umd.edu (Matthew Conte) Date: Fri, 17 Jun 2011 14:17:49 -0400 Subject: [Bioperl-l] upgrading GBrowse from 2.08 to 2.38, problems with BioPerl 1.6.901 upgrade Message-ID: Hello, I've emailed both the GBrowse and BioPerl mailing lists since I'm trying to upgrade GBrowse from 2.08 to 2.38, but I'm running into some problems due to the BioPerl 1.6.901 installation that is required. I'm long overdue for the upgrade and I'd especially like to take advantage of the newer user accounts feature. I'm installing this on OS X 10.6.7 server with Perl 5.10.0. I've run "sudo perl -MCPAN -e 'install Bio::Graphics::Browser2'" and the relevant output is:

...
CPAN.pm: Going to build L/LD/LDS/GBrowse-2.38.tar.gz

Checking prerequisites...
  requires:
    ! Bio::Root::Version (1.006001) is installed, but we need version >= 1.0069
  recommends:
    * Bio::DB::Sam (1.19) is installed, but we prefer to have 1.2
    * Bio::Das is not installed
    * DBD::Pg is not installed
    * Net::OpenID::Consumer is not installed

...

---- Unsatisfied dependencies detected during ----
---- LDS/GBrowse-2.38.tar.gz ----
    Bio::Root::Version [requires]
Shall I follow them and prepend them to the queue
of modules we are processing right now? [yes]
Running Build test
  Delayed until after prerequisites
Running Build install
  Delayed until after prerequisites
Running install for module 'Bio::Root::Version'

...

Building BioPerl
  CJFIELDS/BioPerl-1.6.901.tar.gz
  ./Build -- OK
Warning (usually harmless): 'YAML' not installed, will not store persistent state
Running Build test
t/Align/AlignStats.t ......................... ok
t/Align/AlignUtil.t .......................... ok
t/Align/Graphics.t ........................... Bareword found where operator expected at (eval 14) line 2, near "'The optional module GD generated the following error:
Can't"
  (Might be a runaway multi-line '' string starting on line 1)
  (Missing operator before t?)
Bareword found where operator expected at (eval 14) line 2, near "'boot_GD' symbol"
  (Missing operator before symbol?)
Bareword found where operator expected at (eval 14) line 2, near "2level"
  (Missing operator before level?)
Bareword found where operator expected at (eval 14) line 3, near ") line"
  (Missing operator before line?)
Number found where operator expected at (eval 14) line 3, near "line 1"
  (Do you need to predeclare line?)
Bareword found where operator expected at (eval 14) line 4, near "1
Compilation"
  (Missing operator before Compilation?)
Bareword found where operator expected at (eval 14) line 4, near ") line"
  (Missing operator before line?)
Number found where operator expected at (eval 14) line 4, near "line 1."
  (Do you need to predeclare line?)
String found where operator expected at (eval 14) line 5, at end of line
  (Missing operator before ?)
t/Align/Graphics.t ........................... 1/?
# Failed test 'use Bio::Align::Graphics;'
# at t/Align/Graphics.t line 9.
# Tried to use 'Bio::Align::Graphics'.
# Error: Attempt to reload GD.pm aborted.
# Compilation failed in require at /Users/Matt/.cpan/build/BioPerl-1.6.901-blYCfR/blib/lib/Bio/Align/Graphics.pm line 41.
# BEGIN failed--compilation aborted at /Users/Matt/.cpan/build/BioPerl-1.6.901-blYCfR/blib/lib/Bio/Align/Graphics.pm line 41.
# Compilation failed in require at (eval 15) line 2.
# BEGIN failed--compilation aborted at (eval 15) line 2.

# Failed test 'require Bio::Align::Graphics;'
# at t/Align/Graphics.t line 10.
# Tried to require 'Bio::Align::Graphics'.
# Error: Attempt to reload Bio/Align/Graphics.pm aborted.
# Compilation failed in require at (eval 16) line 2.

# Failed test 'Bio::Align::Graphics->can(...)'
# at t/Align/Graphics.t line 11.
# Bio::Align::Graphics->can('new') failed
# Bio::Align::Graphics->can('draw') failed
# Bio::Align::Graphics->can('height') failed
# Bio::Align::Graphics->can('width') failed
# Bio::Align::Graphics->can('aln_length') failed
# Bio::Align::Graphics->can('aln_format') failed
# Bio::Align::Graphics->can('no_sequences') failed
Can't locate object method "catfile" via package "Bio::Root::IO" (perhaps you forgot to load "Bio::Root::IO"?) at t/Align/Graphics.t line 15.
# Tests were run but no plan was declared and done_testing() was not seen.
t/Align/Graphics.t ........................... Dubious, test returned 2 (wstat 512, 0x200)
Failed 3/3 subtests
t/Align/SimpleAlign.t ........................ ok
t/Align/TreeBuild.t .......................... ok

...

t/Assembly/ContigSpectrum.t .................. ok
t/Assembly/IO/bowtie.t ....................... skipped: The optional module Bio::Tools::Run::Samtools (or dependencies thereof) was not installed
t/Assembly/IO/sam.t .......................... Bareword found where operator expected at (eval 41) line 2, near "'The optional module Bio::DB::Sam generated the following error:
Can't"
  (Might be a runaway multi-line '' string starting on line 1)
  (Missing operator before t?)
Bareword found where operator expected at (eval 41) line 2, near "2level"
  (Missing operator before level?)
Bareword found where operator expected at (eval 41) line 3, near "2level"
  (Missing operator before level?)
Bareword found where operator expected at (eval 41) line 5, near "2level"
  (Missing operator before level?)
Bareword found where operator expected at (eval 41) line 5, near "2level"
  (Missing operator before level?)
Bareword found where operator expected at (eval 41) line 6, near "207.
 at"
  (Missing operator before at?)
Bareword found where operator expected at (eval 41) line 6, near ") line"
  (Missing operator before line?)
Number found where operator expected at (eval 41) line 6, near "line 1"
  (Do you need to predeclare line?)
Bareword found where operator expected at (eval 41) line 7, near "1
Compilation"
  (Missing operator before Compilation?)
Bareword found where operator expected at (eval 41) line 7, near ") line"
  (Missing operator before line?)
Number found where operator expected at (eval 41) line 7, near "line 1."
  (Do you need to predeclare line?)
String found where operator expected at (eval 41) line 8, at end of line
  (Missing operator before ?)
t/Assembly/IO/sam.t .......................... 1/? Bio::Assembly::IO: could not load sam - for more details on supported formats please see the Assembly::IO docs
Exception
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Failed to load module Bio::Assembly::IO::sam.
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: __PACKAGE__ requires installation of samtools (libbam) and Bio::DB::Sam (available on CPAN; not part of BioPerl)
STACK: Error::throw
STACK: Bio::Root::Root::throw Bio/Root/Root.pm:472
STACK: Bio::Assembly::IO::sam::BEGIN Bio/Assembly/IO/sam.pm:189
STACK: Bio::Root::Root::_load_module Bio/Assembly/IO/sam.pm:195
STACK: Bio::Assembly::IO::_load_format_module Bio/Assembly/IO.pm:296
STACK: Bio::Assembly::IO::new Bio/Assembly/IO.pm:138
STACK: t/Assembly/IO/sam.t:29
-----------------------------------------------------------
BEGIN failed--compilation aborted at Bio/Assembly/IO/sam.pm line 195.
Compilation failed in require at Bio/Root/Root.pm line 543.

STACK: Error::throw
STACK: Bio::Root::Root::throw Bio/Root/Root.pm:472
STACK: Bio::Root::Root::_load_module Bio/Root/Root.pm:545
STACK: Bio::Assembly::IO::_load_format_module Bio/Assembly/IO.pm:296
STACK: Bio::Assembly::IO::new Bio/Assembly/IO.pm:138
STACK: t/Assembly/IO/sam.t:29
-----------------------------------------------------------

# Failed test 'init sam IO object'
# at t/Assembly/IO/sam.t line 29.

# Failed test 'The thing isa Bio::Assembly::IO'
# at t/Assembly/IO/sam.t line 32.
# The thing isn't defined
Can't call method "sam" on an undefined value at t/Assembly/IO/sam.t line 33.
# Tests were run but no plan was declared and done_testing() was not seen.
t/Assembly/IO/sam.t .......................... Dubious, test returned 255 (wstat 65280, 0xff00)
Failed 2/7 subtests
*t/Assembly/core.t ............................ 
1/890 * *--------------------- WARNING ---------------------* *MSG: Setting end to equal start[1]* *---------------------------------------------------* * * *--------------------- WARNING ---------------------* *MSG: Setting end to equal start[1]* *---------------------------------------------------* * * *--------------------- WARNING ---------------------* *MSG: Setting end to equal start[1]* *---------------------------------------------------* * * *--------------------- WARNING ---------------------* *MSG: Setting end to equal start[1]* *---------------------------------------------------* * * *--------------------- WARNING ---------------------* *MSG: Setting end to equal start[1]* *---------------------------------------------------* *t/Assembly/core.t ............................ ok * *t/Biblio/Biblio.t ............................ ok * *t/Biblio/References.t ........................ ok * *...* The rest of the tests pass so it looks like I'm having issues with GD and Bio::DB::Sam. So I tried to upgrade to the most current version of these modules. I've got GD version 2.45 installed and when I try to upgrade it to the current 2.46, it fails with: * LDS/GD-2.46.tar.gz* * /usr/bin/make -- OK* *Warning (usually harmless): 'YAML' not installed, will not store persistent state* *Running make test* *PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t* *t/GD.t ........ Can't find 'boot_GD' symbol in ./blib/arch/auto/GD/GD.bundle* * at t/GD.t line 14* *Compilation failed in require at t/GD.t line 14.* *BEGIN failed--compilation aborted at t/GD.t line 14.* *t/GD.t ........ Dubious, test returned 2 (wstat 512, 0x200)* *Failed 12/12 subtests * *t/Polyline.t .. 
Can't find 'boot_GD' symbol in /Users/Matt/.cpan/build/GD-2.46-E2Ep66/blib/arch/auto/GD/GD.bundle* * at /Users/Matt/.cpan/build/GD-2.46-E2Ep66/blib/lib/GD/Polyline.pm line 45* *Compilation failed in require at /Users/Matt/.cpan/build/GD-2.46-E2Ep66/blib/lib/GD/Polyline.pm line 45.* *BEGIN failed--compilation aborted at /Users/Matt/.cpan/build/GD-2.46-E2Ep66/blib/lib/GD/Polyline.pm line 45.* *Compilation failed in require at t/Polyline.t line 10.* *BEGIN failed--compilation aborted at t/Polyline.t line 10.* *t/Polyline.t .. Dubious, test returned 2 (wstat 512, 0x200)* *Failed 1/1 subtests * I've got Bio::DB::Sam version 1.19 installed and when I try to upgrade to the current 1.29 it also fails: * CPAN.pm: Going to build L/LD/LDS/Bio-SamTools-1.29.tar.gz* * * *This module requires samtools 0.1.9 or higher (samtools.sourceforge.net).* *Please enter the location of the bam.h and compiled libbam.a files: /Users/Matt/Desktop/Matt/software/samtools-0.1.16* * * *Found /Users/Matt/Desktop/Matt/software/samtools-0.1.16/bam.h and /Users/Matt/Desktop/Matt/software/samtools-0.1.16/libbam.a.* *Creating new 'MYMETA.yml' with configuration results* *Creating new 'Build' script for 'Bio-SamTools' version '1.29'* *Could not read '/Users/Matt/.cpan/build/Bio-SamTools-1.29-o_guh5/META.yml'. 
Falling back to other methods to determine prerequisites* *CPAN: Module::Build loaded ok (v0.3603)* *Building Bio-SamTools* *gcc-4.2 -I/Users/Matt/Desktop/Matt/software/samtools-0.1.16 -I/System/Library/Perl/5.10.0/darwin-thread-multi-2level/CORE -DXS_VERSION="1.29" -DVERSION="1.29" -D_IOLIB=2 -D_FILE_OFFSET_BITS=64 -Wformat=0 -c -arch x86_64 -arch i386 -arch ppc -g -pipe -fno-common -DPERL_DARWIN -fno-strict-aliasing -I/usr/local/include -Os -o lib/Bio/DB/Sam.o lib/Bio/DB/Sam.c* *ExtUtils::Mkbootstrap::Mkbootstrap('blib/arch/auto/Bio/DB/Sam/Sam.bs')* *env LD_RUN_PATH=/System/Library/Perl/5.10.0/darwin-thread-multi-2level/CORE gcc-4.2 -mmacosx-version-min=10.6 -arch x86_64 -arch i386 -arch ppc -bundle -undefined dynamic_lookup -L/usr/local/lib -o blib/arch/auto/Bio/DB/Sam/Sam.bundle lib/Bio/DB/Sam.o -L/Users/Matt/Desktop/Matt/software/samtools-0.1.16 -lbam -lz* *ld: warning: in /Users/Matt/Desktop/Matt/software/samtools-0.1.16/libbam.a, file was built for unsupported file format which is not the architecture being linked (i386)* *ld: warning: in /Users/Matt/Desktop/Matt/software/samtools-0.1.16/libbam.a, file was built for unsupported file format which is not the architecture being linked (ppc)* * LDS/Bio-SamTools-1.29.tar.gz* * ./Build -- OK* *Warning (usually harmless): 'YAML' not installed, will not store persistent state* *Running Build test* *t/01sam.t .. 
Can't load '/Users/Matt/.cpan/build/Bio-SamTools-1.29-o_guh5/t/../blib/arch/auto/Bio/DB/Sam/Sam.bundle' for module Bio::DB::Sam: dlopen(/Users/Matt/.cpan/build/Bio-SamTools-1.29-o_guh5/t/../blib/arch/auto/Bio/DB/Sam/Sam.bundle, 1): Symbol not found: _bam_nt16_rev_table* * Referenced from: /Users/Matt/.cpan/build/Bio-SamTools-1.29-o_guh5/t/../blib/arch/auto/Bio/DB/Sam/Sam.bundle * * Expected in: flat namespace* * in /Users/Matt/.cpan/build/Bio-SamTools-1.29-o_guh5/t/../blib/arch/auto/Bio/DB/Sam/Sam.bundle at /System/Library/Perl/5.10.0/darwin-thread-multi-2level/DynaLoader.pm line 207.* * at t/01sam.t line 26* *Compilation failed in require at t/01sam.t line 26.* *BEGIN failed--compilation aborted at t/01sam.t line 26.* *t/01sam.t .. Dubious, test returned 2 (wstat 512, 0x200)* *Failed 112/112 subtests * BioPerl fails to install and so GBrowse won't install either after this. Interestingly, I'm able to install this on my development server which is a very similar setup (same OS 10.6.7 server and Perl 5.10.0 universal binary), but is 32-bit opposed to this server which is 64-bit. I can also upgrade GD and Bio::DB::Sam on the 32-bit development server. I get all of the same architecture warnings, so I'm not sure if that is the issue.. Any help would be appreciated. Thanks, Matt -- Matthew Conte Bioinformatician Department of Biology University of Maryland mconte at umd.edu Bouillabase.org From cjfields at illinois.edu Fri Jun 17 14:52:13 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 17 Jun 2011 13:52:13 -0500 Subject: [Bioperl-l] upgrading GBrowse from 2.08 to 2.38, problems with BioPerl 1.6.901 upgrade In-Reply-To: References: Message-ID: I normally would say that (from the BioPerl perspective) GD is an optional module, but you will need it for GBrowse. Bio::DB::Sam likewise. Have you looked at the README for the two distributions? They have pointers regarding issues installing both modules. 
In particular, I know Bio::Samtools may require adding the -fPIC flag and recompiling samtools:

http://cpansearch.perl.org/src/LDS/Bio-SamTools-1.29/README

GD issues are a little more complex:

http://cpansearch.perl.org/src/LDS/GD-2.46/README

chris

From ross at cuhk.edu.hk Sun Jun 19 19:53:07 2011
From: ross at cuhk.edu.hk (Ross KK Leung)
Date: Mon, 20 Jun 2011 07:53:07 +0800
Subject: [Bioperl-l] resources for read simulation?
In-Reply-To: References: <01f901cb7203$f66e4040$e34ac0c0$%yin@ucd.ie> <004001cbe2d5$76598200$630c8600$@edu.hk> <005301cbe31b$a3bee550$eb3caff0$@edu.hk> <9CD1455E-88B4-4E2A-B3BC-398C10D5AAA9@tamu.edu> <3E73745F-A687-4229-B71E-5C56B2D1FBAE@illinois.edu> Message-ID: <009001cc2edc$09b80740$1d2815c0$@edu.hk> Is there any bioperl modules/programs that make short reads out of an input sequence file? e.g. input: a genome of 3Mb fasta file output: a read sequence fasta file with parameters: i) read length (either fixed, or in a range) ii) proportion of sampling (e.g. randomly select 30% coverage of the input file) From florent.angly at gmail.com Sun Jun 19 20:12:45 2011 From: florent.angly at gmail.com (Florent Angly) Date: Mon, 20 Jun 2011 10:12:45 +1000 Subject: [Bioperl-l] resources for read simulation? In-Reply-To: <009001cc2edc$09b80740$1d2815c0$@edu.hk> References: <01f901cb7203$f66e4040$e34ac0c0$%yin@ucd.ie> <004001cbe2d5$76598200$630c8600$@edu.hk> <005301cbe31b$a3bee550$eb3caff0$@edu.hk> <9CD1455E-88B4-4E2A-B3BC-398C10D5AAA9@tamu.edu> <3E73745F-A687-4229-B71E-5C56B2D1FBAE@illinois.edu> <009001cc2edc$09b80740$1d2815c0$@edu.hk> Message-ID: <4DFE907D.1000204@gmail.com> Hi Ross, You could try Grinder: https://sourceforge.net/projects/biogrinder/files/biogrinder/ It uses BioPerl. Florent On 20/06/11 09:53, Ross KK Leung wrote: > Is there any bioperl modules/programs that make short reads out of an input sequence file? > > e.g. > > input: a genome of 3Mb fasta file > output: a read sequence fasta file with parameters: > i) read length (either fixed, or in a range) > ii) proportion of sampling (e.g. 
randomly select 30% coverage of the input file)

From a.gautam168 at yahoo.co.in Mon Jun 20 10:26:47 2011
From: a.gautam168 at yahoo.co.in (anna_sh)
Date: Mon, 20 Jun 2011 07:26:47 -0700 (PDT)
Subject: [Bioperl-l] Split genomes
Message-ID: <31886043.post@talk.nabble.com>

Hey,

I want to split the mitochondrial genomes of certain mammals into fragments of ~250 bp, which I then use for running RepeatMasker. I have posted below the script I use for this; the issue is that it does not split the genomes into fragments of the size I require. Can anybody please modify my script so that it splits the genome into fragments of 200-275 bp?

Thanks a lot!

Script (minus the shebang line) -->

# USE: file_split.fasta.pl fastafile
# USE: Splits a given fasta file into smaller files
# USE: -n splits into files of n numbered sequences
# USE: -f splits into n numbered files
# USE: -l splits into files of a total of l length

use strict;
use Getopt::Long;

my ($nseq, $nfiles, $seql);
GetOptions( "n:i" => \$nseq,
            "f:i" => \$nfiles,
            "l:i" => \$seql);

my $usage = "file_split.fasta.pl -n -f fasta file\n";
if (!($ARGV[0])) { die "$usage";}
#checks to make sure that some form of split variable has been given

my $infile = shift or die;
chomp $infile;
if ($infile) { open (INITIAL, $infile) or die "cannot open $infile\n"; }
#take initial_input line 1
my $fnum = 0;

if ($nseq && !($nfiles) && !($seql)){
    my $seqcount = 0;
    my $outfile = $infile.".".$fnum;
    open (OUTFILE, ">$outfile");
    while(<INITIAL>) {
        if (m/^>/) {$seqcount++;}
        if ($seqcount > $nseq) {
            $seqcount = 1;
            $fnum++;
            close OUTFILE;
            $outfile = $infile.".".$fnum;
            open (OUTFILE, ">$outfile");
        }
        print OUTFILE "$_";
    }
    close OUTFILE;
    close INITIAL;
} #ends split on n-sequences loop

if ($nfiles && !($nseq) && !($seql)){
    my $totalseqs = totalcount($infile);
    my $numseqs = int ($totalseqs/$nfiles);
    my $outfile = $infile.".".$fnum;
    my $seqcount = 0;
    open (OUTFILE, ">$outfile");
    while(<INITIAL>) {
        if (m/^>/) {$seqcount++;}
        if (($seqcount > $numseqs) && ($fnum < ($nfiles-1))) {
            $seqcount = 1;
            $fnum++;
            close OUTFILE;
            $outfile = $infile.".".$fnum;
            open (OUTFILE, ">$outfile");
        }
        print OUTFILE "$_";
    }
    close OUTFILE;
    close INITIAL;
} #ends split into $nfile files loop

if ($seql && !($nfiles) && !($nseq)){
    my $totallength = getlength($infile);
    my $filetotal = int ($totallength/$seql);
    my $outfile = $infile.".".$fnum;
    open (OUTFILE, ">$outfile");
    my $sumlength = 0;
    while(<INITIAL>) {
        if (!(m/^>/)) {$sumlength += length ($_) }
        if (($sumlength > $seql) && (m/^>/)) {
            $sumlength = 0;
            $fnum++;
            close OUTFILE;
            $outfile = $infile.".".$fnum;
            open (OUTFILE, ">$outfile");
        }
        print OUTFILE "$_";
    }
    close OUTFILE;
    close INITIAL;
}
$fnum++;
print "Split $infile into $fnum files\n";

sub totalcount {
    open (FILE, "$_[0]") or die "cannot find $_[0]\n";
    my $seqnum = 0;
    while (<FILE>){
        if (m/^>/) {$seqnum++;}
    }
    close FILE;
    return $seqnum;
} #returns number of sequences in the file

sub getlength{
    open (FILE, "$_[0]") or die "cannot find $_[0]\n";
    my $totalbp = 0;
    while (<FILE>){
        if (!(m/^>/)){ $totalbp += length ($_);}
    }
    print "Total bp = $totalbp\n";
    close FILE;
    return $totalbp;
} #returns the total basepairs in the file.

--
View this message in context: http://old.nabble.com/Split-genomes-tp31886043p31886043.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From a.gautam168 at yahoo.co.in Mon Jun 20 10:14:21 2011
From: a.gautam168 at yahoo.co.in (anna_sh)
Date: Mon, 20 Jun 2011 07:14:21 -0700 (PDT)
Subject: [Bioperl-l] Split genomes
Message-ID: <31886043.post@talk.nabble.com>

Hey,

I want to split the mitochondrial genomes of certain mammals into fragments of ~250 bp, which I then use for running RepeatMasker. I have posted below the script I use for this; the issue is that it does not split the genomes into fragments of the size I require. Can anybody please modify my script so that it splits the genome into fragments of 200-275 bp?

Thanks a lot!
Script (minus the shebang line) --> # USE: file_split.fasta.pl fastafile # USE: Splits a given fasta file into smaller files # USE: -n splits into files of n numbered sequences # USE: -f splits into n numbered files # USE: -l splits into files of a total of l length use strict; use Getopt::Long; my ($nseq, $nfiles, $seql); GetOptions( "n:i" => \$nseq, "f:i" => \$nfiles, "l:i" => \$seql); my $usage = "file_split.fasta.pl -n -f fasta file\n"; if (!($ARGV[0])) { die "$usage";} #checks to make sure that some form of split variable has been given my $infile = shift or die; chomp $infile; if ($infile) { open (INITIAL, $infile) or die "cannot open $infile\n"; } #take initial_input line 1 my $fnum = 0; if ($nseq && !($nfiles) && !($seql)){ my $seqcount = 0; my $outfile = $infile.".".$fnum; open (OUTFILE, ">$outfile"); while() { if (m/^>/) {$seqcount++;} if ($seqcount > $nseq) { $seqcount = 1; $fnum++; close OUTFILE; $outfile = $infile.".".$fnum; open (OUTFILE, ">$outfile"); } print OUTFILE "$_"; } close OUTFILE; close INITIAL; } #ends split on n-sequences loop if ($nfiles && !($nseq) && !($seql)){ my $totalseqs = totalcount($infile); my $numseqs = int ($totalseqs/$nfiles); my $outfile = $infile.".".$fnum; my $seqcount = 0; open (OUTFILE, ">$outfile"); while() { if (m/^>/) {$seqcount++;} if (($seqcount > $numseqs) && ($fnum < ($nfiles-1))) { $seqcount = 1; $fnum++; close OUTFILE; $outfile = $infile.".".$fnum; open (OUTFILE, ">$outfile"); } print OUTFILE "$_"; } close OUTFILE; close INITIAL; } #ends split into $nfile files loop if ($seql && !($nfiles) && !($nseq)){ my $totallength = getlength($infile); my $filetotal = int ($totallength/$seql); my $outfile = $infile.".".$fnum; open (OUTFILE, ">$outfile"); my $sumlength = 0; while() { if (!(m/^>/)) {$sumlength += length ($_) } if (($sumlength > $seql) && (m/^>/)) { $sumlength = 0; $fnum++; close OUTFILE; $outfile = $infile.".".$fnum; open (OUTFILE, ">$outfile"); } print OUTFILE "$_"; } close OUTFILE; close INITIAL; } $fnum++; 
print "Split $infile into $fnum files\n"; sub totalcount { open (FILE, "$_[0]") or die "cannot find $_[0]\n"; my $seqnum = 0; while (){ if (m/^>/) {$seqnum++;} } close FILE; return $seqnum; } #returns number of sequences in the file sub getlength{ open (FILE, "$_[0]") or die "cannot find $_[0]\n"; my $totalbp = 0; while (){ if (!(m/^>/)){ $totalbp += length ($_);} } print "Total bp = $totalbp\n"; close FILE; return $totalbp; } #returns the total basepairs in the file. -- View this message in context: http://old.nabble.com/Split-genomes-tp31886043p31886043.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From projectbasu at gmail.com Tue Jun 21 12:55:45 2011 From: projectbasu at gmail.com (ashwoo) Date: Tue, 21 Jun 2011 09:55:45 -0700 (PDT) Subject: [Bioperl-l] AlignIO Message-ID: <2213d37b-5145-4fde-a239-68ab756ab902@l6g2000vbn.googlegroups.com> Dear All, I have a script which parses the best HSP alignment out of BLAST result and writes it to a temporary file. my $aln = $hsp->get_aln; 1. my $out = Bio::AlignIO->new(-file => ">tmp.aln", 2. -format => 'clustalw'); 3. $out->write_aln($aln); I randomize the alignment within the "tmp.aln" file to generate a new file "mult_rand.aln" containing all randomized alignments in clustalw format. Now I want to read each alignment in the randomized file hence I use 4. my $in = Bio::AlignIO->new(-file => "mult_rand.aln", 5. -format => 'clustalw'); 6. while ( my $aln = $in->next_aln() ) { 7. #"RUN RNAZ to check the conservedness of each randomized alignment" 8. #NOT GETTING ANY VALUES HERE 9. } But I am not able to access each alignment. When I open the randomized aln file with a separate script and same code in lines 4-9 it works fine. Is this happening due to my using AlignIO objects twice in the same script. Please Help. 
yours sincerely,
Perl Novice

From chad.a.davis at gmail.com Tue Jun 21 13:23:06 2011
From: chad.a.davis at gmail.com (Chad Davis)
Date: Tue, 21 Jun 2011 19:23:06 +0200
Subject: [Bioperl-l] AlignIO
In-Reply-To: <2213d37b-5145-4fde-a239-68ab756ab902@l6g2000vbn.googlegroups.com>
References: <2213d37b-5145-4fde-a239-68ab756ab902@l6g2000vbn.googlegroups.com>
Message-ID:

It sounds like your randomization procedure is breaking the alignment format, such that the clustal parser can no longer read it.
How are you randomizing the file? An example might help.

Chad

From Russell.Smithies at agresearch.co.nz Tue Jun 21 17:01:07 2011
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Wed, 22 Jun 2011 09:01:07 +1200
Subject: [Bioperl-l] Split genomes
In-Reply-To: <31886043.post@talk.nabble.com>
References: <31886043.post@talk.nabble.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF3396074D1E0@exchsth.agresearch.co.nz>

I don't see any BioPerl being used so why are you asking on this list?
Which bit isn't working?
From what I can see, your first bit where you split a large file into a number of smaller files is going to discard the end of the file, i.e., if you have a file with 1000 sequences and split it into chunks of 300, your script is going to give you 3 files of 300 and discard the other 100.

I think you need to simplify and try again as it appears overly complex.
Or try the built-in BioPerl script bp_split_seq.pl for some guidance.

--Russell

From jason.stajich at gmail.com Wed Jun 22 12:04:59 2011
From: jason.stajich at gmail.com (Jason Stajich)
Date: Wed, 22 Jun 2011 09:04:59 -0700
Subject: [Bioperl-l] bp_classify_hits_kingdom.pl
In-Reply-To: <7EAD4222-BDD8-497F-A0E6-AE94EA4D0D10@gwdg.de>
References: <7EAD4222-BDD8-497F-A0E6-AE94EA4D0D10@gwdg.de>
Message-ID: <8283D9A4-34C2-4AC5-AF0B-B1662D5393C6@gmail.com>

Hi Dan

Looks like -outfmt 6 is the right one for the BLAST+ toolkit; it is -m 8 or -m 9 for the C toolkit blastall application.

I think DB_File was falling over because the now 40M+ GI-to-taxid pairs were overwhelming it and the underlying BerkeleyDB implementation.

To solve it I reimplemented it with SQLite -- which will require you to install DBD::SQLite.

I've checked in code to the main trunk in the github repo if you want to take a look -- you can either download the file https://github.com/bioperl/bioperl-live/blob/master/scripts/taxa/classify_hits_kingdom.PLS or check it out via git (recommended).

-jason

On Jun 21, 2011, at 7:25 AM, Jackson, Daniel wrote:

> Hi Jason,
>
> My name is Dan and I'm hoping to use your bioperl script bp_classify_hits_kingdom.pl to categorise some ESTs I recently acquired. I've been stuck on this problem for days now - can you help?!? I suspect it's an easy and obvious solution.... I'm not a complete newbie to using scripts, but wouldn't say I'm experienced! I've just installed Bioperl and have generated a small BLASTx test file of my sequences searched against a local installation of GenBank's nr database.
> The BLAST search was run locally as follows:
>
> gzgbio-48:~ djackson$ blastx -query /Users/djackson/Desktop/10_Vaceltia_contigs_fna.txt -db /Users/djackson/BLAST-2.2.25+/db/nr/nr -outfmt 6 -out /Users/djackson/Desktop/BLASTx_10_Vaceltia_contigs_m6.txt -numthreads 2 -evalue .00001 -show_gis -num_descriptions 10 -num_alignments 10 -max_target_seqs 10
>
> The results of this file are attached (BLASTx_10_Vaceltia_contigs_m6.txt). I realise the BLASTx output is supposed to be in -outfmt 8 or -outfmt 9, but providing these files to bp_classify_hits_kingdom.pl generates the following error:
>
> gzgbio-48:~ djackson$ bp_classify_hits_kingdom.pl -t /Users/djackson/taxdump -g /Users/djackson/taxdump/gi_taxid_prot.dmp -i /Users/djackson/Desktop/BLASTx_10_Vaceltia_contigs_m9.txt -e .0001
> /Users/djackson/Desktop/BLASTx_10_Vaceltia_contigs_m9.txt
> Use of uninitialized value $evalue in numeric gt (>) at /usr/local/bin/bp_classify_hits_kingdom.pl line 143, <$fh> line 1.
> Use of uninitialized value $evalue in numeric gt (>) at /usr/local/bin/bp_classify_hits_kingdom.pl line 143, <$fh> line 1.
> Use of uninitialized value $hname in pattern match (m//) at /usr/local/bin/bp_classify_hits_kingdom.pl line 148, <$fh> line 1.
> Use of uninitialized value $hname in concatenation (.) or string at /usr/local/bin/bp_classify_hits_kingdom.pl line 195, <$fh> line 1.
> no GI in
> Use of uninitialized value $evalue in numeric gt (>) at /usr/local/bin/bp_classify_hits_kingdom.pl line 143, <$fh> line 2.
> Use of uninitialized value $evalue in numeric gt (>) at /usr/local/bin/bp_classify_hits_kingdom.pl line 143, <$fh> line 2.
> Use of uninitialized value $hname in pattern match (m//) at /usr/local/bin/bp_classify_hits_kingdom.pl line 148, <$fh> line 2.
> Use of uninitialized value $hname in concatenation (.) or string at /usr/local/bin/bp_classify_hits_kingdom.pl line 195, <$fh> line 2.
> no GI in
> Use of uninitialized value $evalue in numeric gt (>) at /usr/local/bin/bp_classify_hits_kingdom.pl line 143, <$fh> line 3.
> Use of uninitialized value $evalue in numeric gt (>) at /usr/local/bin/bp_classify_hits_kingdom.pl line 143, <$fh> line 3.
> .
> .
> etc...
> .
> .
> no GI in
> /Users/djackson/Desktop/BLASTx_10_Vaceltia_contigs_m9 total=1182
>                     1182 100.00%
> gzgbio-48:~ djackson$
>
> Providing the bp_classify_hits_kingdom.pl script with an -outfmt 6 format seems to get closer to a meaningful output, but still generates the following error:
>
> gzgbio-48:~ djackson$ bp_classify_hits_kingdom.pl -t /Users/djackson/taxdump -g /Users/djackson/taxdump/gi_taxid_prot.dmp -i /Users/djackson/Desktop/BLASTx_10_Vaceltia_contigs_m6.txt -e .0001 -v
> /Users/djackson/Desktop/BLASTx_10_Vaceltia_contigs_m6.txt
> no taxid for 51127506
> no taxid for 51127506
> no taxid for 51127506
> no taxid for 317419045
> no taxid for 47219014
> .
> .
> etc...
> .
> .
> /Users/djackson/Desktop/BLASTx_10_Vaceltia_contigs_m6 total=10
>                     10 100.00%
> gzgbio-48:~ djackson$
>
> Kind regards and thanks in advance,
> Dan
>
> ---------------------------------------------------------------
> Junior Professor Daniel J. Jackson
> Courant Research Centre Geobiology
> Georg-August University of Göttingen
> Goldschmidtstr.3
> 37077 Göttingen
> Germany
>
> Tel: +49 (0) 551 39 14177
> Fax: +49 (0) 551 39 7918
>
> djackso at uni-goettingen.de
> http://www.uni-goettingen.de/en/102705.html
> ---------------------------------------------------------------

From a.gautam168 at yahoo.co.in Wed Jun 22 13:39:45 2011
From: a.gautam168 at yahoo.co.in (anna_sh)
Date: Wed, 22 Jun 2011 10:39:45 -0700 (PDT)
Subject: [Bioperl-l] Split genomes
In-Reply-To: <31886043.post@talk.nabble.com>
References: <31886043.post@talk.nabble.com>
Message-ID: <31905239.post@talk.nabble.com>

Thanks for the script. It was just the perfect fix.
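
For anyone finding this thread later, the fixed-length splitting asked about above can be sketched in plain Perl roughly as follows. This is only an illustration, not the code of bp_split_seq.pl: the sub names read_fasta and split_into_fragments, the 250 bp default and the "id_start-end" naming of fragments are all assumptions made for the example.

```perl
use strict;
use warnings;

# Read FASTA records from an open filehandle.
# Returns a list of [id, sequence] pairs (hypothetical helper).
sub read_fasta {
    my ($fh) = @_;
    my (@records, $id, $seq);
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\r$//;                       # tolerate DOS line endings
        if ($line =~ /^>(\S+)/) {
            push @records, [$id, $seq] if defined $id;
            ($id, $seq) = ($1, '');
        }
        elsif (defined $id) {
            $line =~ s/\s+//g;                  # drop stray whitespace
            $seq .= $line;
        }
    }
    push @records, [$id, $seq] if defined $id;
    return @records;
}

# Cut one sequence into consecutive fragments of at most $size bases.
# Returns [fragment_id, fragment_sequence] pairs; IDs carry 1-based coordinates.
sub split_into_fragments {
    my ($id, $seq, $size) = @_;
    die "fragment size must be a positive integer\n" unless $size && $size > 0;
    my @fragments;
    for (my $start = 0; $start < length($seq); $start += $size) {
        my $piece = substr($seq, $start, $size);
        my $end   = $start + length($piece);
        push @fragments, [ sprintf('%s_%d-%d', $id, $start + 1, $end), $piece ];
    }
    return @fragments;
}

# Usage: perl split_fragments.pl genome.fasta [size] > fragments.fasta
if (@ARGV) {
    my ($file, $size) = @ARGV;
    $size ||= 250;
    open my $fh, '<', $file or die "cannot open $file: $!\n";
    for my $rec (read_fasta($fh)) {
        print ">$_->[0]\n$_->[1]\n" for split_into_fragments(@$rec, $size);
    }
    close $fh;
}
```

Note that the last fragment of each sequence may be shorter than the requested size, so if every piece has to fall within 200-275 bp the remainder would need to be merged into the preceding fragment or redistributed.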

From bernd.web at gmail.com Wed Jun 22 14:07:39 2011
From: bernd.web at gmail.com (Bernd Web)
Date: Wed, 22 Jun 2011 20:07:39 +0200
Subject: [Bioperl-l] bp_classify_hits_kingdom.pl
In-Reply-To: <8283D9A4-34C2-4AC5-AF0B-B1662D5393C6@gmail.com>
References: <7EAD4222-BDD8-497F-A0E6-AE94EA4D0D10@gwdg.de> <8283D9A4-34C2-4AC5-AF0B-B1662D5393C6@gmail.com>
Message-ID:

Hi Jason,

I did GI to TAX mapping in Perl alone. Nice to know this script exists. Thanks for this.
Just one question, I noticed on https://github.com/bioperl/bioperl-live/blob/master/scripts/taxa/classify_hits_kingdom.PLS:

line 96:  my $dbh = tie(%gi2node, 'DB_File', 'gi2class');
and
line 100: my $dbh2 = my $dbh = DBI->connect("dbi:SQLite:dbname=$giidxfile","","");

So the second $dbh masks earlier declaration.

Cheers,
Bernd

From bosborne11 at verizon.net Wed Jun 22 14:13:40 2011
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 22 Jun 2011 14:13:40 -0400
Subject: [Bioperl-l] Split genomes
In-Reply-To: <31905239.post@talk.nabble.com>
References: <31886043.post@talk.nabble.com> <31905239.post@talk.nabble.com>
Message-ID: <50A0CCD4-4819-40C1-9A52-E0419B8516B5@verizon.net>

Anna,

Start with this:

http://www.bioperl.org/wiki/HOWTO:Beginners

The script you showed had no Bioperl in it, so it was too long and too complicated. Use Bioperl, it's less work.

BIO

From cjfields at illinois.edu Wed Jun 22 15:51:21 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 22 Jun 2011 14:51:21 -0500
Subject: [Bioperl-l] bp_classify_hits_kingdom.pl
In-Reply-To: <8283D9A4-34C2-4AC5-AF0B-B1662D5393C6@gmail.com>
References: <7EAD4222-BDD8-497F-A0E6-AE94EA4D0D10@gwdg.de> <8283D9A4-34C2-4AC5-AF0B-B1662D5393C6@gmail.com>
Message-ID:

We should actually do a general switchover to SQLite I think, or at least abstract that.

chris

From jason.stajich at gmail.com Wed Jun 22 16:07:54 2011
From: jason.stajich at gmail.com (Jason Stajich)
Date: Wed, 22 Jun 2011 13:07:54 -0700
Subject: [Bioperl-l] bp_classify_hits_kingdom.pl
In-Reply-To:
References: <7EAD4222-BDD8-497F-A0E6-AE94EA4D0D10@gwdg.de> <8283D9A4-34C2-4AC5-AF0B-B1662D5393C6@gmail.com>
Message-ID: <992064A0-B165-461D-B1BB-130BAA1CD98F@gmail.com>

Bernd - oops - thanks very much for noticing this - I was too fast in copy & paste.
I see another typo in there now that the midday light is shining on the code that I'll fix. Should be able to check in this in a second. Jason On Jun 22, 2011, at 11:07 AM, Bernd Web wrote: > Hi Jason, > > I did GI to TAX mapping in Perl alone. Nice to know this script > exists. Thanks for this. > Just one question, I noticed on > https://github.com/bioperl/bioperl-live/blob/master/scripts/taxa/classify_hits_kingdom.PLS: > > line 96: my $dbh = tie(%gi2node, 'DB_File', 'gi2class'); > and > line 100: my $dbh2 = my $dbh = > DBI->connect("dbi:SQLite:dbname=$giidxfile","",""); > > So the second $dbh masks earlier declaration. > > > Cheers, > Bernd > > On Wed, Jun 22, 2011 at 6:04 PM, Jason Stajich wrote: >> Hi Dan >> >> Looks like the mformat 6 is the right one for the blastplus toolkit - it is m8 or m9 for the C toolkit blastall application. >> >> I think DB_File was falling over with the now 40M+ gi to taxid pairs that I think were overwhelming DB_File and the berkkeleyDB implementation there. >> >> To solve it I reimplemented it with SQLite -- which will require you to install DBD::SQLite. >> >> I've checked in code to the main trunk in the github repo if you want to take a look -- you can either download the file https://github.com/bioperl/bioperl-live/blob/master/scripts/taxa/classify_hits_kingdom.PLS or check it out via git (recommended). >> >> -jason >> On Jun 21, 2011, at 7:25 AM, Jackson, Daniel wrote: >> >>> Hi Jason, >>> >>> My name is Dan and I'm hoping to use your bioperl script bp_classify_hits_kingdom.pl to categorise some ESTs I recently acquired. I've been stuck on this problem for days now - can you help?!? I suspect it's an easy and obvious solution.... I'm not a complete newbie to using scripts, but wouldn't say I'm experienced! I've just installed Bioperl and have generated a small BLASTx test file of my sequences searched against a local installation of GenBank's nr database. 
The BLAST search was run locally as follows: >>> >>> gzgbio-48:~ djackson$ blastx -query /Users/djackson/Desktop/10_Vaceltia_contigs_fna.txt -db /Users/djackson/BLAST-2.2.25+/db/nr/nr -outfmt 6 -out /Users/djackson/Desktop/BLASTx_10_Vaceltia_contigs_m6.txt -numthreads 2 -evalue .00001 -show_gis -num_descriptions 10 -num_alignments 10 -max_target_seqs 10 >>> >>> >>> The results of this file are attached (BLASTx_10_Vaceltia_contigs_m6.txt). I realise the BLASTx output is supposed to be in -outfmt 8 or -outfmt 9, but providing these files to bp_classify_hits_kingdom.pl generates the following error: >>> >>> gzgbio-48:~ djackson$ bp_classify_hits_kingdom.pl -t /Users/djackson/taxdump -g /Users/djackson/taxdump/gi_taxid_prot.dmp -i /Users/djackson/Desktop/BLASTx_10_Vaceltia_contigs_m9.txt -e .0001 >>> /Users/djackson/Desktop/BLASTx_10_Vaceltia_contigs_m9.txt >>> Use of uninitialized value $evalue in numeric gt (>) at /usr/local/bin/bp_classify_hits_kingdom.pl line 143, <$fh> line 1. >>> Use of uninitialized value $evalue in numeric gt (>) at /usr/local/bin/bp_classify_hits_kingdom.pl line 143, <$fh> line 1. >>> Use of uninitialized value $hname in pattern match (m//) at /usr/local/bin/bp_classify_hits_kingdom.pl line 148, <$fh> line 1. >>> Use of uninitialized value $hname in concatenation (.) or string at /usr/local/bin/bp_classify_hits_kingdom.pl line 195, <$fh> line 1. >>> no GI in >>> Use of uninitialized value $evalue in numeric gt (>) at /usr/local/bin/bp_classify_hits_kingdom.pl line 143, <$fh> line 2. >>> Use of uninitialized value $evalue in numeric gt (>) at /usr/local/bin/bp_classify_hits_kingdom.pl line 143, <$fh> line 2. >>> Use of uninitialized value $hname in pattern match (m//) at /usr/local/bin/bp_classify_hits_kingdom.pl line 148, <$fh> line 2. >>> Use of uninitialized value $hname in concatenation (.) or string at /usr/local/bin/bp_classify_hits_kingdom.pl line 195, <$fh> line 2. 
>>> no GI in >>> Use of uninitialized value $evalue in numeric gt (>) at /usr/local/bin/bp_classify_hits_kingdom.pl line 143, <$fh> line 3. >>> Use of uninitialized value $evalue in numeric gt (>) at /usr/local/bin/bp_classify_hits_kingdom.pl line 143, <$fh> line 3. >>> . >>> . >>> etc... >>> . >>> . >>> no GI in >>> /Users/djackson/Desktop/BLASTx_10_Vaceltia_contigs_m9 total=1182 >>> 1182 100.00% >>> gzgbio-48:~ djackson$ >>> >>> >>> Providing the bp_classify_hits_kingdom.pl script with an -outfmt 6 format seems to get closer to a meaningful output, but still generates the following error: >>> >>> >>> gzgbio-48:~ djackson$ bp_classify_hits_kingdom.pl -t /Users/djackson/taxdump -g /Users/djackson/taxdump/gi_taxid_prot.dmp -i /Users/djackson/Desktop/BLASTx_10_Vaceltia_contigs_m6.txt -e .0001 -v >>> /Users/djackson/Desktop/BLASTx_10_Vaceltia_contigs_m6.txt >>> no taxid for 51127506 >>> no taxid for 51127506 >>> no taxid for 51127506 >>> no taxid for 317419045 >>> no taxid for 47219014 >>> . >>> . >>> etc... >>> . >>> . >>> /Users/djackson/Desktop/BLASTx_10_Vaceltia_contigs_m6 total=10 >>> 10 100.00% >>> gzgbio-48:~ djackson$ >>> >>> >>> Kind regards and thanks in advance, >>> Dan >>> >>> >>> >>> >>> --------------------------------------------------------------- >>> Junior Professor Daniel J. 
Jackson
>>> Courant Research Centre Geobiology
>>> Georg-August University of Göttingen
>>> Goldschmidtstr.3
>>> 37077 Göttingen
>>> Germany
>>>
>>> Tel: +49 (0) 551 39 14177
>>> Fax: +49 (0) 551 39 7918
>>>
>>> djackso at uni-goettingen.de
>>> http://www.uni-goettingen.de/en/102705.html
>>> ---------------------------------------------------------------
>>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>

From htafer at gmail.com Thu Jun 23 08:20:54 2011
From: htafer at gmail.com (htafer)
Date: Thu, 23 Jun 2011 05:20:54 -0700 (PDT)
Subject: [Bioperl-l] UTR regions in LiveSeq
Message-ID: <23127982-f3ab-4aab-b267-451fdff62110@s17g2000yqs.googlegroups.com>

Hi

I am currently working on SNPs and would like to use BioPerl. I am using LiveSeq objects to store the sequence and I am looking at mutations by using the Mutation/Mutator objects.

More explicitly: given a set of annotations, I extract the boundaries for all exons/introns, the boundaries for the transcript, and the boundaries for the coding sequence. Based on these boundaries and the corresponding genomic sequence I construct the Bio::LiveSeq::{DNA/Exon/Transcript/Translation} objects, which allow me to construct a Bio::LiveSeq::Gene object. Then, based on a list of SNPs, I generate a set of Bio::LiveSeq::Mutation objects and a Bio::LiveSeq::Mutator, which I then use to mutate my gene.

My main problem here is that I don't know how to handle transcripts having UTRs, or ncRNAs, i.e. transcripts/exons that do not code for protein. According to the documentation of Bio::LiveSeq::Transcript, this class is aimed at storing information about coding sequences (CDS) only. The following code for transcripts with UTRs is somewhat working, delivering the expected results. Still, the alignment function of Bio::LiveSeq::Mutator does not work.

So is there some plan to introduce Bio::LiveSeq::UTR, similar to Bio::SeqFeature::Gene::UTR, and to better support ncRNA/UTR with the mutator object? Or is it possible to do this with the current BioPerl implementation?

my $DNAsequence = Bio::LiveSeq::DNA->new( -seq => "GGGGGGGGGGATGAAAAAAATTTTTTTTTTAAAAAAAATAGCCCCCCCCCC");
my $utr5 = Bio::LiveSeq::Exon->new(-seq => $DNAsequence, -start => 1, -end => 10, -strand => 1);
my $exon1 = Bio::LiveSeq::Exon->new(-seq => $DNAsequence, -start => 11, -end => 20, -strand => 1);
my $intron1 = Bio::LiveSeq::Intron->new(-seq => $DNAsequence, -start => 21, -end => 30, -strand => 1);
my $exon2 = Bio::LiveSeq::Exon->new(-seq => $DNAsequence, -start => 31, -end => 41, -strand => 1);
my $utr3 = Bio::LiveSeq::Exon->new(-seq => $DNAsequence, -start => 42, -end => 51, -strand => 1);
my @tarray = ($exon1, $exon2);
my @uarray = ($utr5, $exon1, $exon2, $utr3);
my @iarray = ($intron1);
my $Transcript = Bio::LiveSeq::Transcript->new( -exons => \@uarray);
my $translationTranscript = Bio::LiveSeq::Transcript->new( -exons => \@tarray);
my $Translation = Bio::LiveSeq::Translation->new( -transcript => $translationTranscript);
#need to do this to avoid change_error()
$Transcript->{'translation'} = $Translation;
my $features;
$features->{DNA} = $DNAsequence;
$features->{Transcripts} = [$Transcript];
$features->{Translations} = [$Translation];
$features->{Exons} = \@uarray;
$features->{Introns} = \@iarray;
my $gene = Bio::LiveSeq::Gene->new(-name => "bla", -features => $features);
my $mutation = new Bio::LiveSeq::Mutation (-seq => '', -pos => 32, -len => 3 );
my $mutate = Bio::LiveSeq::Mutator->new(-gene => $gene, -numbering => 'entry' );
$mutate->add_Mutation($mutation);

dna_mut: GGGGGGGGGGATGAAAAAAATTTTTTTTTTAAAAATAGCCCCCCCCCC
dna_ori: GGGGGGGGGGATGAAAAAAATTTTTTTTTTAAAAAAAATAGCCCCCCCCCC
rna_mut: GGGGGGGGGGATGAAAAAAAAAAAATAGCCCCCCCCCC
rna_ori: GGGGGGGGGGATGAAAAAAAAAAAAAAATAGCCCCCCCCCC
aa_mut: MKKKK*
aa_ori: MKKKKK*

print $results->alignment();

Variant : GAT
GAA AAA AAA AAA ATA GCC CCC Reference: GAT GAA AAA AAA Bio AAA ATA GCC CCC E K K X K I A From locarpau at upvnet.upv.es Thu Jun 23 14:17:47 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet) Date: Thu, 23 Jun 2011 20:17:47 +0200 Subject: [Bioperl-l] Mapping GO slim terms In-Reply-To: <4DF56976.8080704@upvnet.upv.es> References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> Message-ID: <4E03834B.8000801@upvnet.upv.es> Hi all, Anyone knows if there is a way of mapping a set fo GO terms to their GO slim terms (providing the corresponding OBO file) using Bioperl? Cheers, Lorenzo -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail: locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* From florent.angly at gmail.com Thu Jun 23 15:34:39 2011 From: florent.angly at gmail.com (Florent Angly) Date: Fri, 24 Jun 2011 05:34:39 +1000 Subject: [Bioperl-l] Bioperl module for simulated reads Message-ID: <4E03954F.8030909@gmail.com> Hi all, I just wanted to let me know that, in the context of my program that generates simulated amplicon and shotgun datasets, Grinder (https://sourceforge.net/projects/biogrinder/files/biogrinder/), I developed a Bioperl module that facilitates taking simulated reads from a reference sequences. I called it Bio::Seq::SimulatedRead: https://github.com/bioperl/bioperl-live/blob/master/Bio/Seq/SimulatedRead.pm. I welcome testing and feedback! 
Florent From jason.stajich at gmail.com Thu Jun 23 16:10:50 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Thu, 23 Jun 2011 13:10:50 -0700 Subject: [Bioperl-l] Bioperl module for simulated reads In-Reply-To: <4E03954F.8030909@gmail.com> References: <4E03954F.8030909@gmail.com> Message-ID: that's great that you are contributing this, but we should see about spliting any new modules/code off from the main trunk though - per the GSoC project. On Jun 23, 2011, at 12:34 PM, Florent Angly wrote: > Hi all, > > I just wanted to let me know that, in the context of my program that generates simulated amplicon and shotgun datasets, Grinder (https://sourceforge.net/projects/biogrinder/files/biogrinder/), I developed a Bioperl module that facilitates taking simulated reads from a reference sequences. I called it Bio::Seq::SimulatedRead: https://github.com/bioperl/bioperl-live/blob/master/Bio/Seq/SimulatedRead.pm. I welcome testing and feedback! > > Florent > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From rjbuzz at gmail.com Fri Jun 24 08:13:31 2011 From: rjbuzz at gmail.com (ravikumar jayachandran) Date: Fri, 24 Jun 2011 17:43:31 +0530 Subject: [Bioperl-l] Reg. Objects created by default on Bioperl Message-ID: Hi, I have a doubt regarding the object name created automatically on the perl script. What basis are the objects assigned to a specific module on? Please find the link below which I found on the link, http://www.bioperl.org/wiki/HOWTO:Beginners. 
---------------------------------
use Bio::Seq;
use Bio::Tools::Run::StandAloneBlast;

$blast_obj = Bio::Tools::Run::StandAloneBlast->new(-program => 'blastn', -database => 'db.fa');
$seq_obj = Bio::Seq->new(-id => "test query", -seq => "TTTAAATATATTTTGAAGTATAGATTATATGTT");
$report_obj = $blast_obj->blastall($seq_obj);
$result_obj = $report_obj->next_result;
print $result_obj->num_hits;
-------------------------------------------------

For example, the "next_result" method is present in the Bio::SearchIO module. $report_obj is used to call this "next_result" method, but I don't understand on what basis $report_obj is treated as a Bio::SearchIO object by default. Please help me understand the concept.

Thanks & Regards,
Ravi.

From bosborne11 at verizon.net Mon Jun 27 11:00:14 2011
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 27 Jun 2011 11:00:14 -0400
Subject: [Bioperl-l] Building CMs from Rfam.full?
Message-ID:

bioperl-l,

Is there a script or module around that will take Rfam.full (multi-Stockholm file with all alignments) and build each individual *cm file from it, creating separate files?

Thanks again,

Brian O.

From bosborne11 at verizon.net Mon Jun 27 11:29:31 2011
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 27 Jun 2011 11:29:31 -0400
Subject: [Bioperl-l] Building CMs from Rfam.full?
In-Reply-To:
References:
Message-ID: <6B74F1A1-E5E3-46D6-B43D-25D57A71B5C5@verizon.net>

Please disregard, just wrote it.

On Jun 27, 2011, at 11:00 AM, Brian Osborne wrote:

> bioperl-l,
>
> Is there a script or module around that will take Rfam.full (multi-Stockholm file with all alignments) and build each individual *cm file from it, creating separate files?
>
> Thanks again,
>
> Brian O.
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Mon Jun 27 12:56:44 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 27 Jun 2011 11:56:44 -0500 Subject: [Bioperl-l] Building CMs from Rfam.full? In-Reply-To: References: Message-ID: <06F1185A-B0C9-4475-9889-86270307F5F0@illinois.edu> You *should* be able to do this, maybe via combinations of Bio::Index::Stockholm, Bio::AlignIO::stockholm, Bio::Tools::Run::Infernal, and File::Temp (indexing rfam, pulling out ind. alignments into temp files by ID, running cmbuild..) chris (typing with one hand, child in other. in a cabin in Nova Scotia) On Jun 27, 2011, at 10:00 AM, Brian Osborne wrote: > bioperl-l, > > Is there a script or module around that will take Rfam.full (multi-Stockholm file with all alignments) and build each individual *cm file from it, creating separate files? > > Thanks again, > > Brian O. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From carandraug+dev at gmail.com Mon Jun 27 21:46:26 2011 From: carandraug+dev at gmail.com (=?ISO-8859-1?Q?Carn=EB_Draug?=) Date: Tue, 28 Jun 2011 02:46:26 +0100 Subject: [Bioperl-l] retrieve refseq ids from UIDs Message-ID: Hi I've been having some trouble with bioperl and I was hoping that someone could help me. I'm trying to obtain the transcript and protein id of RefSeq given a gene UID. For example, for the gene with UID 9555 http://www.ncbi.nlm.nih.gov/gene/9555 I'd like to get the transcripts and proteins ids as in the section (NCBI Reference Sequence (RefSeq)) of that page, the ones like: NM_001040158.1 ? NP_001035248.1 core histone macro-H2A.1 isoform 2 NM_004893.2 ? NP_004884.1 core histone macro-H2A.1 isoform 2 I believe to be fairly good with perl, just not with bioperl yet. 
Currently, after getting the UIDs, I use EUtilities esummary to get the genomic coordinates, then use efetch to extract that sequence and parse it to obtain the protein_id and transcript_id. Until now that was OK because I actually wanted all those things, but now I'd like to skip the first parts and get only the transcripts or proteins. I'm sure there must be a "smarter" way to do this.

I've tried to use efetch, supply the UID as id and use get_Response->content, which gives me some kind of structure with the info that I want, but how do I access it properly? Like this:

my @ids = qw(9555);
my $factory = Bio::DB::EUtilities->new(-eutil => 'efetch',
                                       -db    => 'gene',
                                       -id    => \@ids,
                                       );
say $factory->get_Response->content;

Also, the einfo script that comes with bioperl (run as einfo -d=gene) lists the fields available when searching the gene database, and according to it I should be able to get this information. At least one of the fields is the following:

Field Code   :ACCN
Field Name   :Nucleotide/Protein Accession
Description  :Nucleotide or protein accession(s) associated with this gene
Term Count   :49104652
Attributes   :is_singletoken

How do I get to this field?

When using esummary to get a docsum and then using the to_string method, I still can't see anything useful.

my $summaries = Bio::DB::EUtilities->new(
                                         -eutil => 'esummary',
                                         -db    => 'gene',
                                         -id    => \@ids,
                                         );

while (my $docsum = $summaries->next_DocSum) {
  say $docsum->to_string();
}

Any help would be very appreciated. Thanks in advance,
Carnë
Draug From Russell.Smithies at agresearch.co.nz Mon Jun 27 23:20:48 2011 From: Russell.Smithies at agresearch.co.nz (Smithies, Russell) Date: Tue, 28 Jun 2011 15:20:48 +1200 Subject: [Bioperl-l] retrieve refseq ids from UIDs In-Reply-To: References: Message-ID: <18DF7D20DFEC044098A1062202F5FFF3396074D228@exchsth.agresearch.co.nz> I assume you've had a look at the cookbook http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook Also take a look at elink, it might do what you are after http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#I_want_a_list_of_database_.27x.27_UIDs_that_are_linked_from_a_list_of_database_.27y.27_UIDs The Scrapbook is a good place to get ideas as well http://www.bioperl.org/wiki/Category:Scrapbook --Russell > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Carn? Draug > Sent: Tuesday, 28 June 2011 1:46 p.m. > To: bioperl mailing list > Subject: [Bioperl-l] retrieve refseq ids from UIDs > > Hi > > I've been having some trouble with bioperl and I was hoping that > someone could help me. I'm trying to obtain the transcript and protein > id of RefSeq given a gene UID. For example, for the gene with UID 9555 > http://www.ncbi.nlm.nih.gov/gene/9555 I'd like to get the transcripts > and proteins ids as in the section (NCBI Reference Sequence (RefSeq)) > of that page, the ones like: > > NM_001040158.1 ? NP_001035248.1 core histone macro-H2A.1 isoform 2 > NM_004893.2 ? NP_004884.1 core histone macro-H2A.1 isoform 2 > > I believe to be fairly good with perl, just not with bioperl yet. > Currently, what I'm doing is after getting the UIDs, using EUtilities > esummary to get the genomic coordinates, then use efecth to extract > that sequence and parse it to obtain the protein_id and transcript_id. > Until now it's ok because I actually wanted all those things but now > I'd like to skip the first parts and get only the transcripts or > proteins. 
I'm sure there must be a "smarter" way to do this. > > I've tried to use efetch, suply the UID as id and use > get_Response->content which gives me a some kind of structure with the > info that I want but how do access it properly? Like this > > my @ids = qw(9555); > my $factory = Bio::DB::EUtilities->new(-eutil => 'efetch', > -db => 'gene', > -id => \@ids, > ); > say $factory->get_Response->content; > > Also, when using the einfo script that comes with bioperl, to get the > info I should be able to get when searching the database gene (running > as einfo -d=gene), it says that I should be able to get it. At least > one of the fields is the following: > > Field Code :ACCN > Field Name :Nucleotide/Protein Accession > Description :Nucleotide or protein accession(s) associated > with this gene > Term Count :49104652 > Attributes :is_singletoken > > How do I get to this field? > > When using esummary to get a docum and then use the to_string method, > I still can't see anything useful. > > my $summaries = Bio::DB::EUtilities->new( > -eutil => 'esummary', > -db => 'gene', > -id => \@ids, > ); > > while (my $docsum = $summaries->next_DocSum) { > say $docsum->to_string(); > } > > Any help would be very appreciated. Thanks in advance, > Carn? Draug > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l ======================================================================= Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. 
If you have received this message in error, please notify the sender immediately.
=======================================================================

From carandraug+dev at gmail.com Tue Jun 28 07:41:06 2011
From: carandraug+dev at gmail.com (Carnë Draug)
Date: Tue, 28 Jun 2011 12:41:06 +0100
Subject: [Bioperl-l] retrieve refseq ids from UIDs
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF3396074D228@exchsth.agresearch.co.nz>
References: <18DF7D20DFEC044098A1062202F5FFF3396074D228@exchsth.agresearch.co.nz>
Message-ID:

On 28 June 2011 04:20, Smithies, Russell wrote:
> I assume you've had a look at the cookbook http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
> Also take a look at elink, it might do what you are after http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#I_want_a_list_of_database_.27x.27_UIDs_that_are_linked_from_a_list_of_database_.27y.27_UIDs
> The Scrapbook is a good place to get ideas as well http://www.bioperl.org/wiki/Category:Scrapbook

Hi Russell,

thank you for your answer. I had indeed looked at the cookbook. I'd never tried elink and it works sometimes. I have a couple of problems with it though. Basically, using that approach, I have to get the UID from gene, and use elink to get the transcripts by searching what links to 'nucleotide' (with link name gene_nuccore_refseqrna). Then, I have to search where each of them links to in the protein db. Also, if I search with an array of UIDs, I get all the linked UIDs in one list, so I have to use a single UID at a time to know where each one comes from. This is true both for searching which nucleotides come from gene and which proteins come from nucleotide.

This implies a lot of connections and it may be why I sometimes get the warning

--------------------- WARNING ---------------------
MSG: No linksets returned
---------------------------------------------------

Does NCBI have some sort of mechanism to avoid flooding with requests?

Here's the code I used http://pastebin.com/DsCh2JuL

Also, the many connections make it slower. There must be a simpler way, since one of the pieces of code I showed in the first mail

my @ids = qw(9555);
my $factory = Bio::DB::EUtilities->new(-eutil => 'efetch',
                                       -db    => 'gene',
                                       -id    => \@ids,
                                       );
say $factory->get_Response->content;

does retrieve a weird structure with all that info. Isn't there a method to access this data properly? Or maybe use some other module?

Thanks,
Carnë

From dan.halligan at gmail.com Tue Jun 28 08:46:09 2011
From: dan.halligan at gmail.com (wannymahoots)
Date: Tue, 28 Jun 2011 05:46:09 -0700 (PDT)
Subject: [Bioperl-l] Make edits to a large sequence
Message-ID: <18fe54bb-5dc7-4185-83de-a957d1a6c6bb@a31g2000vbt.googlegroups.com>

Hi,

I'm looking for the quickest / most efficient way to make many edits (mutations) to a long fasta sequence using bioperl. The sequences are of the order of 200Mb long, and I would like to make 1,000s of changes to single bases (e.g. A->T at position 1,000, G->C at position 1,201 etc.). The only way I've come across to do this is reading in the sequence and then making edits using SeqUtils, so something like:

my $in = Bio::SeqIO->new('-file' => "file.fa", '-format' => "fasta");

while(my $seq = $in->next_seq()) {
    my $mut = Bio::LiveSeq::Mutation->new(-seq => 'c',-pos => 3);
    Bio::SeqUtils->mutate($seq,$mut);
}

However, I'm concerned that this might be making multiple copies of the large sequence, and that using substr (which is how mutate works) is perhaps not the most efficient. Would it be better to save the fasta sequence as an array and change individual array positions directly?

Many thanks for any advice.
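
[Editorial sketch] The question above is about applying many single-base edits to one very long sequence. One possible approach, sketched here in plain Perl, is to keep the sequence in a single string and replace characters in place with the four-argument form of substr. The helper name apply_point_edits and the position-to-base hash are hypothetical and chosen only for illustration; the example assumes 1-based positions, as in the question, and same-length (single-base) replacements.

```perl
use strict;
use warnings;

# Hypothetical helper: apply single-base substitutions to a sequence that is
# held as one Perl string and passed by reference, so the string is not copied
# on the call. $edits maps 1-based positions to the replacement base.
sub apply_point_edits {
    my ($seq_ref, $edits) = @_;
    my $len = length $$seq_ref;
    for my $pos (sort { $a <=> $b } keys %$edits) {
        die "position $pos out of range 1..$len\n" if $pos < 1 || $pos > $len;
        # 4-argument substr overwrites one character of the string in place;
        # substr offsets are 0-based, hence $pos - 1.
        substr($$seq_ref, $pos - 1, 1, $edits->{$pos});
    }
    return;
}

my $sequence = 'ACCTTCTGTT';
apply_point_edits(\$sequence, { 3 => 'T', 8 => 'C' });
print "$sequence\n";    # prints ACTTTCTCTT
```

With a BioPerl object, the string could presumably be taken once with $seq->seq, edited this way, and written back once with $seq->seq($sequence), so that the large string is copied only at those two points.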
From David.Messina at sbc.su.se Tue Jun 28 09:45:33 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 28 Jun 2011 15:45:33 +0200
Subject: [Bioperl-l] Make edits to a large sequence
In-Reply-To: <18fe54bb-5dc7-4185-83de-a957d1a6c6bb@a31g2000vbt.googlegroups.com>
References: <18fe54bb-5dc7-4185-83de-a957d1a6c6bb@a31g2000vbt.googlegroups.com>
Message-ID:

Hi Dan,

On Tue, Jun 28, 2011 at 14:46, wannymahoots wrote:

> Would it be better to save the
> fasta sequence as an array and change individual array positions
> directly?
>

I think this would probably be the way to go.

Dave

From Kevin.M.Brown at asu.edu Tue Jun 28 11:23:29 2011
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Tue, 28 Jun 2011 08:23:29 -0700
Subject: [Bioperl-l] Make edits to a large sequence
In-Reply-To: <18fe54bb-5dc7-4185-83de-a957d1a6c6bb@a31g2000vbt.googlegroups.com>
References: <18fe54bb-5dc7-4185-83de-a957d1a6c6bb@a31g2000vbt.googlegroups.com>
Message-ID: <1A4207F8295607498283FE9E93B775B407AF4AE8@EX02.asurite.ad.asu.edu>

An array might work, or just hold the whole thing in a string and use substr on that rather than the BioPerl objects.

while (my $seq = $in->next_seq()) {
    my $sequence = $seq->seq;
    substr($sequence,2,1,'c');
    substr($sequence,8,1,'t');
    $seq->seq($sequence);
    ...
}

Just remember that substr works on a 0-indexed string rather than a 1-indexed one. So the 3rd position is 2 rather than 3.

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of wannymahoots
> Sent: Tuesday, June 28, 2011 5:46 AM
> To: bioperl-l at bioperl.org
> Subject: [Bioperl-l] Make edits to a large sequence
>
> Hi,
>
> I'm looking for the quickest / most efficient way to make many edits
> (mutations) to a long fasta sequence using bioperl. The sequences are
> of the order of 200Mb long, and I would like to make 1,000s of changes
> to single bases (e.g. A->T at position 1,000, G->C at position 1,201
> etc.).
The only way I've come across to do this is reading in the > sequence and then making edits using SeqUtils, so something like: > > my $in = Bio::SeqIO->new('-file' => "file.fa", '-format' => "fasta"); > > while(my $seq = $in->next_seq()) { > my $mut = Bio::LiveSeq::Mutation->new(-seq => 'c',-pos => 3); > Bio::SeqUtils->mutate($seq,$mut); > } > > However, I'm concerned that this might be making multiple copies of > the large sequence, and that using substr (which is how mutate works), > is perhaps not the most efficient. Would it be better to save the > fasta sequence as an array and change individual array positions > directly? > > Many thanks for any advice. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From sofia at bioinfo.hr Tue Jun 28 11:29:12 2011 From: sofia at bioinfo.hr (Sofia) Date: Tue, 28 Jun 2011 17:29:12 +0200 Subject: [Bioperl-l] problem with Bio::Align::Utilities - aa_to_dna_aln Message-ID: <4E09F348.2080907@bioinfo.hr> Hi, I'm using the aa_to_dna_aln function from the BioPerl module Bio::Align::Utilities. I've used it some time ago and now i'm just re-running an old script and i get this error repeatedly: substr outside of string at /common/software/API/bioperl-1.6.1/lib/perl5/site_perl/5.8.8//Bio/Align/Utilities.pm line 160. Use of uninitialized value in string eq at /common/software/API/bioperl-1.6.1/lib/perl5/site_perl/5.8.8//Bio/Align/Utilities.pm line 161. --------------------- WARNING --------------------- MSG: In sequence ENSP00000372224 residue count gives end value 1161. Overriding value [1107] with value 1161 for Bio::LocatableSeq::end(). 
ATGGGGCGCTGGGCCTGGGTCCCCAGCCCCTGGCCCCCACCGGGGCTGGGCCCCTTCCTCCTCCTCCTCCTGCTGCTGCTGCTGCTGCCACGGGGGTTCCAGCCCCAGCCTGGCGGGAACCGTACGGAGTCCCCAGAACCTAATGCCACAGCGACCCCTGCGATCCCCACTATCCTGGTGACCTCTGTGACCTCTGAGACCCCAGCAACAAGTGCTCCAGAGGCAGAGGGACCCCAAAGTGGGGGGCTCCCGCCCCCGCCCAGGGCAGTTCCCTCGAGCAGTAGCCCCCAGGCCCAAGCACTCACCGAGGAC --------------------------------------------------- I had a look at the code of Utilities.pm and the line 160 is: my $char = substr($aa_seqstr,$i + $start_offset,1); I was wondering why is the $start_offset applied to the amino acid sequence $aa_seqstr and not to the dna sequence $nt_seqstr(line 164)? When i remove the $start_offset from the line 160 and add it to the line 164 i don't get any errors. I don't know if that is a problem or if the problem are the arguments sending to the function. The arguments i'm using are the same i used before: an alignment of protein sequences and a set of dna sequences. Did something change regarding the arguments? I would appreciate any help you could provide. Best regards, Sofia Pinto 
From Russell.Smithies at agresearch.co.nz Tue Jun 28 16:54:57 2011 From: Russell.Smithies at agresearch.co.nz (Smithies, Russell) Date: Wed, 29 Jun 2011 08:54:57 +1200 Subject: [Bioperl-l] retrieve refseq ids from UIDs In-Reply-To: References: <18DF7D20DFEC044098A1062202F5FFF3396074D228@exchsth.agresearch.co.nz> Message-ID: <18DF7D20DFEC044098A1062202F5FFF3396074D22F@exchsth.agresearch.co.nz> It's fairly common for NCBI to return partial or incomplete data, often 1/2 a record is missing or requests will time-out at random. If you have a lot of records, it may be better to download all the data from the ftp site then parse it locally. This is what we tend to do if there's more than a few hundred queries. I'd like to point out that it's NCBIs problem, not the BioPerl code at fault. You'll run into the same problems if you use NCBIs Perl API (http://www.ncbi.nlm.nih.gov/books/NBK1058/) directly. Take a look at the gene2accession, gene2refseq, and gene_info data at ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/ and at the tax data ftp://ftp.ncbi.nih.gov/pub/taxonomy/ if you need to decode the taxids without doing web queries. It's much easier/faster to download these files, index them, then search rather than do queries against NCBI. And as all the data is local, you don't need to worry about connection problems. --Russell > -----Original Message----- > From: carandraug at gmail.com [mailto:carandraug at gmail.com] On Behalf Of > Carnë 
Draug > Sent: Tuesday, 28 June 2011 11:41 p.m. > To: Smithies, Russell > Cc: bioperl mailing list > Subject: Re: [Bioperl-l] retrieve refseq ids from UIDs > > On 28 June 2011 04:20, Smithies, Russell > wrote: > > I assume you've had a look at the cookbook > http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook > > Also take a look at elink, it might do what you are after > http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#I_want_a_list_of_ > database_.27x.27_UIDs_that_are_linked_from_a_list_of_database_.27y.27_U > IDs > > The Scrapbook is a good place to get ideas as well > http://www.bioperl.org/wiki/Category:Scrapbook > > Hi Russel, > > thank you for your answer. I had indeed looking at the cookbook. I'd > never tried elink and it works sometimes. I have a couple of problems > with it tough. > > Basically, using that approach, I have to get the UID from gene, and > use elink to get the transcripts by searching what links to > 'nucleotide' (with link name gene_nuccore_refseqrna). Then, I have to > search to where each of them links to the protein db. Also, since if I > use an array of uids to search, I get all the UIDS that links in one > list, I have to use a single UID so I know from where each comes. This > is true for searching what nucleotides come from gene and what > proteins come from nucleotide. This implies a lot of connections and > it may be why sometimes I get the warning > > --------------------- WARNING --------------------- > MSG: No linksets returned > --------------------------------------------------- > > Does NCBI have some sort of mechanism to avoid flooding with requests? > Here's the code I used http://pastebin.com/DsCh2JuL > > Also, the several connections makes it slower. 
There must be a simpler > way since one of the pieces of code I showed on the first mail > > my @ids = qw(9555); > my $factory = Bio::DB::EUtilities->new(-eutil => 'efetch', > -db => 'gene', > -id => \@ids, > ); > say $factory->get_Response->content; > > does retrieve a weird structure with all that info. Isn't there a > method to access this data properly? Or maybe use some other module? > Thanks, > Carn? ======================================================================= Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately. ======================================================================= From abhiram.das at gatech.edu Tue Jun 28 16:26:53 2011 From: abhiram.das at gatech.edu (Das, Abhiram) Date: Tue, 28 Jun 2011 16:26:53 -0400 (EDT) Subject: [Bioperl-l] Runnig bl2seq throws error + Failure to parse BlastResult In-Reply-To: <1732299973.569076.1309292672541.JavaMail.root@mail3.gatech.edu> Message-ID: <156252619.569129.1309292813565.JavaMail.root@mail3.gatech.edu> Hi, Running the following code throws the message: "Use of uninitialized value in numeric le (<=) at /Library/Perl/5.8.8/Bio/SearchIO/IteratedSearchResultEventBuilder.pm line 315, line 26." Even though the output file shows no hit's, the BlastResult->num_hits returns 1. 
CODE: use strict; use Bio::Seq; use Bio::SeqIO; use Bio::DB::GenBank; use Bio::Tools::Run::StandAloneBlastPlus; use Bio::Search::Result::BlastResult; use Bio::Search::Hit::HitI; my $fac = Bio::Tools::Run::StandAloneBlastPlus->new(); my $seq_obj = Bio::SeqIO->new(-file => "all_reads.fasta", -format => "fasta", -alphabet => "dna"); #loop through each seq in the seq-obj and blast it with the next sequence my @seq_list; while(my $seq = $seq_obj->next_seq()){ push(@seq_list, $seq); } my $eval = 0.000001; my $word = 16; my $match = 0; no strict; for(my $i = 0; $i < @seq_list; $i++){ for(my $j = $i+1; $j < @seq_list; $j++){ $fac->bl2seq(-method=>'blastn', -query => $seq_list[$j], -subject => $seq_list[$i], -outfile=>'test.out'); $fac->rewind_results; if($result = $fac->next_result){ print "# hits found:: ", $result->num_hits,"\n"; if(my $hit = $result->next_hit){ $match++; } } } } $fac->cleanup(); Appreciate any help. Thanks -Abhi From jason.stajich at gmail.com Tue Jun 28 17:02:39 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Tue, 28 Jun 2011 14:02:39 -0700 Subject: [Bioperl-l] Runnig bl2seq throws error + Failure to parse BlastResult In-Reply-To: <156252619.569129.1309292813565.JavaMail.root@mail3.gatech.edu> References: <156252619.569129.1309292813565.JavaMail.root@mail3.gatech.edu> Message-ID: <1CA575DB-FDA1-4107-8711-2378456969B5@gmail.com> I'm sorry but I don't understand your question, are you worried that the num_hits is 1 simply? I think this is a function of a forced result from bl2seq -- can you better explain what you are trying to do or provide your sequence file so the issue you are concerned with can be replicated. Are you trying to construct an all-vs-all pairwise distances based on bl2seq? On Jun 28, 2011, at 1:26 PM, Das, Abhiram wrote: > Hi, > > Running the following code throws the message: > > "Use of uninitialized value in numeric le (<=) at /Library/Perl/5.8.8/Bio/SearchIO/IteratedSearchResultEventBuilder.pm line 315, line 26." 
> > Even though the output file shows no hit's, the BlastResult->num_hits returns 1. > > > > CODE: > > use strict; > use Bio::Seq; > use Bio::SeqIO; > use Bio::DB::GenBank; > use Bio::Tools::Run::StandAloneBlastPlus; > use Bio::Search::Result::BlastResult; > use Bio::Search::Hit::HitI; > > my $fac = Bio::Tools::Run::StandAloneBlastPlus->new(); > > my $seq_obj = Bio::SeqIO->new(-file => "all_reads.fasta", -format => "fasta", -alphabet => "dna"); > > #loop through each seq in the seq-obj and blast it with the next sequence > my @seq_list; > while(my $seq = $seq_obj->next_seq()){ > push(@seq_list, $seq); > } > > my $eval = 0.000001; > my $word = 16; > my $match = 0; > no strict; > for(my $i = 0; $i < @seq_list; $i++){ > for(my $j = $i+1; $j < @seq_list; $j++){ > $fac->bl2seq(-method=>'blastn', -query => $seq_list[$j], -subject => $seq_list[$i], -outfile=>'test.out'); > $fac->rewind_results; > if($result = $fac->next_result){ > print "# hits found:: ", $result->num_hits,"\n"; > if(my $hit = $result->next_hit){ > $match++; > } > } > } > } > $fac->cleanup(); > > > Appreciate any help. > > > > > > > > Thanks > -Abhi > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From Russell.Smithies at agresearch.co.nz Tue Jun 28 17:03:17 2011 From: Russell.Smithies at agresearch.co.nz (Smithies, Russell) Date: Wed, 29 Jun 2011 09:03:17 +1200 Subject: [Bioperl-l] Runnig bl2seq throws error + Failure to parse BlastResult In-Reply-To: <156252619.569129.1309292813565.JavaMail.root@mail3.gatech.edu> References: <1732299973.569076.1309292672541.JavaMail.root@mail3.gatech.edu> <156252619.569129.1309292813565.JavaMail.root@mail3.gatech.edu> Message-ID: <18DF7D20DFEC044098A1062202F5FFF3396074D230@exchsth.agresearch.co.nz> Did you try printing the result with Data::Dumper to see what's in it? use Data::Dumper; if($result = $fac->next_result){ print Dumper $result; . . . . 
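As a slightly fuller sketch of the Data::Dumper suggestion above: result objects can be deeply nested, so limiting the depth and sorting the keys tends to make the dump easier to read. The hash below is only a hypothetical stand-in for the object returned by $fac->next_result, so that the example runs without BLAST; Sortkeys, Indent and Maxdepth are standard Data::Dumper settings.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for a parsed result. In the real script this
# would be the object returned by $fac->next_result.
my $result = {
    query_name => 'read_1',
    hits       => [
        { name => 'read_2', hsps => [] },    # a hit that carries no HSPs
    ],
};

# Keep the dump readable: stable key order, compact indentation,
# and do not descend more than two levels into the structure.
$Data::Dumper::Sortkeys = 1;
$Data::Dumper::Indent   = 1;
$Data::Dumper::Maxdepth = 2;

my $dump = Dumper($result);
print $dump;
```

Raising Maxdepth one level at a time is usually enough to see whether a reported hit actually contains any HSPs.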
--Russell > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Das, Abhiram > Sent: Wednesday, 29 June 2011 8:27 a.m. > To: bioperl-l at bioperl.org > Subject: [Bioperl-l] Runnig bl2seq throws error + Failure to parse > BlastResult > > Hi, > > Running the following code throws the message: > > "Use of uninitialized value in numeric le (<=) at > /Library/Perl/5.8.8/Bio/SearchIO/IteratedSearchResultEventBuilder.pm > line 315, line 26." > > Even though the output file shows no hit's, the BlastResult->num_hits > returns 1. > > > > CODE: > > use strict; > use Bio::Seq; > use Bio::SeqIO; > use Bio::DB::GenBank; > use Bio::Tools::Run::StandAloneBlastPlus; > use Bio::Search::Result::BlastResult; > use Bio::Search::Hit::HitI; > > my $fac = Bio::Tools::Run::StandAloneBlastPlus->new(); > > my $seq_obj = Bio::SeqIO->new(-file => "all_reads.fasta", -format => > "fasta", -alphabet => "dna"); > > #loop through each seq in the seq-obj and blast it with the next > sequence > my @seq_list; > while(my $seq = $seq_obj->next_seq()){ > push(@seq_list, $seq); > } > > my $eval = 0.000001; > my $word = 16; > my $match = 0; > no strict; > for(my $i = 0; $i < @seq_list; $i++){ > for(my $j = $i+1; $j < @seq_list; $j++){ > $fac->bl2seq(-method=>'blastn', -query => $seq_list[$j], -subject => > $seq_list[$i], -outfile=>'test.out'); > $fac->rewind_results; > if($result = $fac->next_result){ > print "# hits found:: ", $result->num_hits,"\n"; > if(my $hit = $result->next_hit){ > $match++; > } > } > } > } > $fac->cleanup(); > > > Appreciate any help. 
> > > > > > > > Thanks > -Abhi > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From carandraug+dev at gmail.com Tue Jun 28 17:24:17 2011 From: carandraug+dev at gmail.com (Carnë Draug) Date: Tue, 28 Jun 2011 22:24:17 +0100 Subject: [Bioperl-l] retrieve refseq ids from UIDs In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF3396074D22F@exchsth.agresearch.co.nz> References: <18DF7D20DFEC044098A1062202F5FFF3396074D228@exchsth.agresearch.co.nz> <18DF7D20DFEC044098A1062202F5FFF3396074D22F@exchsth.agresearch.co.nz> Message-ID: 2011/6/28 Smithies, Russell : > It's fairly common for NCBI to return partial or incomplete data, often 1/2 a record is missing or requests will time-out at random. > If you have a lot of records, it may be better to download all the data from the ftp site then parse it locally. This is what we tend to do if there's more than a few hundred queries. I'd like to point out that it's NCBIs problem, not the BioPerl code at fault. You'll run into the same problems if you use NCBIs Perl API (http://www.ncbi.nlm.nih.gov/books/NBK1058/) directly. Is there any way to catch this kind of errors? 
Other than repeatedly fetching the data until two consecutive requests return the same result? > Take a look at the gene2accession, gene2refseq, and gene_info data at ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/ and at the tax data ftp://ftp.ncbi.nih.gov/pub/taxonomy/ if you need to decode the taxids without doing web queries. > It's much easier/faster to download these files, index them, then search rather than do queries against NCBI. Is there any module already written to parse these files? Thanks for all your answers, Carnë From Russell.Smithies at agresearch.co.nz Tue Jun 28 17:26:34 2011 From: Russell.Smithies at agresearch.co.nz (Smithies, Russell) Date: Wed, 29 Jun 2011 09:26:34 +1200 Subject: [Bioperl-l] retrieve refseq ids from UIDs In-Reply-To: References: <18DF7D20DFEC044098A1062202F5FFF3396074D228@exchsth.agresearch.co.nz> <18DF7D20DFEC044098A1062202F5FFF3396074D22F@exchsth.agresearch.co.nz> Message-ID: <18DF7D20DFEC044098A1062202F5FFF3396074D231@exchsth.agresearch.co.nz> Gene and tax data are all tab-separated so it's just a matter of splitting into a hash then querying. There's a readme in each file (I think) that describes what goes where. --Russell > -----Original Message----- > From: carandraug at gmail.com [mailto:carandraug at gmail.com] On Behalf Of > Carnë Draug > Sent: Wednesday, 29 June 2011 9:24 a.m. > To: Smithies, Russell > Cc: bioperl mailing list > Subject: Re: [Bioperl-l] retrieve refseq ids from UIDs > > 2011/6/28 Smithies, Russell : > > It's fairly common for NCBI to return partial or incomplete data, > often 1/2 a record is missing or requests will time-out at random. > > If you have a lot of records, it may be better to download all the > data from the ftp site then parse it locally. This is what we tend to > do if there's more than a few hundred queries. I'd like to point out > that it's NCBIs problem, not the BioPerl code at fault. 
You'll run into > the same problems if you use NCBIs Perl API > (http://www.ncbi.nlm.nih.gov/books/NBK1058/) directly. > > Is there any way to catch this kind of errors? Other than repeat > fetching the data until there's two consecutive results that have the > same result? > > > Take a look at the gene2accession, gene2refseq, and gene_info data at > ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/ and at the tax data > ftp://ftp.ncbi.nih.gov/pub/taxonomy/ if you need to decode the taxids > without doing web queries. > > It's much easier/faster to download these files, index them, them > search rather than do queries against NCBI. > > Any module already done written to parse these guys? > > Thanks for all your answers, > Carnë From jason.stajich at gmail.com Tue Jun 28 17:39:00 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Tue, 28 Jun 2011 14:39:00 -0700 Subject: [Bioperl-l] Runnig bl2seq throws error + Failure to parse BlastResult In-Reply-To: <680320907.570662.1309296734345.JavaMail.root@mail3.gatech.edu> References: <680320907.570662.1309296734345.JavaMail.root@mail3.gatech.edu> Message-ID: <6423A46F-622D-4FF2-946E-1660D989C324@gmail.com> To answer your question "there might be a bug" But there won't be any hsps or hits so why do you care if the counter is wrong? 
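If the counter cannot be trusted, one option is to ignore num_hits and decide per pair by walking the hits and their HSPs. What follows is a minimal sketch, assuming only the next_hit, next_hsp and evalue methods of the BioPerl result, hit and HSP objects. The Mock::* classes and the has_significant_hsp helper are hypothetical names invented here so that the example runs without BLAST.

```perl
use strict;
use warnings;

# Return 1 if any hit in the result carries at least one HSP whose
# e-value is at or below the cutoff, 0 otherwise. Uses only the
# next_hit, next_hsp and evalue methods of the objects passed in.
sub has_significant_hsp {
    my ($result, $max_evalue) = @_;
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            return 1 if $hsp->evalue <= $max_evalue;
        }
    }
    return 0;
}

# --- Hypothetical stand-ins for the real result/hit/HSP objects ---
{
    package Mock::HSP;
    sub new    { my ($class, $evalue) = @_; return bless { evalue => $evalue }, $class }
    sub evalue { my ($self) = @_; return $self->{evalue} }
}
{
    package Mock::Hit;
    sub new      { my ($class, @hsps) = @_; return bless { hsps => [@hsps] }, $class }
    sub next_hsp { my ($self) = @_; return shift @{ $self->{hsps} } }
}
{
    package Mock::Result;
    sub new      { my ($class, @hits) = @_; return bless { hits => [@hits] }, $class }
    sub next_hit { my ($self) = @_; return shift @{ $self->{hits} } }
}

# A hit object with no HSPs, like the one reported in this thread.
my $empty = Mock::Result->new( Mock::Hit->new() );
# A hit with one strong HSP.
my $real = Mock::Result->new( Mock::Hit->new( Mock::HSP->new(1e-20) ) );

my $empty_verdict = has_significant_hsp($empty, 1e-6) ? 'match' : 'no match';
my $real_verdict  = has_significant_hsp($real,  1e-6) ? 'match' : 'no match';
print "empty hit: $empty_verdict\n";    # prints "empty hit: no match"
print "real hit: $real_verdict\n";      # prints "real hit: match"
```

In the loop from the original post this would replace the num_hits check, for example `$match++ if has_significant_hsp($result, 1e-6);`, so an empty hit object never counts as a match.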
On Jun 28, 2011, at 2:32 PM, Das, Abhiram wrote: > Jason, thanks for your reply. > > I have two issues: > > 1. While running the program it shows up the following message: > > "Use of uninitialized value in numeric le (<=) at /Library/Perl/5.8.8/Bio/SearchIO/IteratedSearchResultEventBuilder.pm line 367, line 26." > > 2. Even though there is no hit; $results->num_hits get's me 1 hit. > > > What I am trying to do: > > "Find number of pairwise similar sequences from a set of sequences." > > Please find attached the sequence file. > > Thanks again > -Abhi > > ----- Original Message ----- > From: "Jason Stajich" > To: "Abhiram Das" > Cc: bioperl-l at bioperl.org > Sent: Tuesday, June 28, 2011 5:02:39 PM > Subject: Re: [Bioperl-l] Runnig bl2seq throws error + Failure to parse BlastResult > > I'm sorry but I don't understand your question, are you worried that the num_hits is 1 simply? I think this is a function of a forced result from bl2seq -- can you better explain what you are trying to do or provide your sequence file so the issue you are concerned with can be replicated. > > Are you trying to construct an all-vs-all pairwise distances based on bl2seq? > > On Jun 28, 2011, at 1:26 PM, Das, Abhiram wrote: > >> Hi, >> >> Running the following code throws the message: >> >> "Use of uninitialized value in numeric le (<=) at /Library/Perl/5.8.8/Bio/SearchIO/IteratedSearchResultEventBuilder.pm line 315, line 26." >> >> Even though the output file shows no hit's, the BlastResult->num_hits returns 1. 
>> >> >> >> CODE: >> >> use strict; >> use Bio::Seq; >> use Bio::SeqIO; >> use Bio::DB::GenBank; >> use Bio::Tools::Run::StandAloneBlastPlus; >> use Bio::Search::Result::BlastResult; >> use Bio::Search::Hit::HitI; >> >> my $fac = Bio::Tools::Run::StandAloneBlastPlus->new(); >> >> my $seq_obj = Bio::SeqIO->new(-file => "all_reads.fasta", -format => "fasta", -alphabet => "dna"); >> >> #loop through each seq in the seq-obj and blast it with the next sequence >> my @seq_list; >> while(my $seq = $seq_obj->next_seq()){ >> push(@seq_list, $seq); >> } >> >> my $eval = 0.000001; >> my $word = 16; >> my $match = 0; >> no strict; >> for(my $i = 0; $i < @seq_list; $i++){ >> for(my $j = $i+1; $j < @seq_list; $j++){ >> $fac->bl2seq(-method=>'blastn', -query => $seq_list[$j], -subject => $seq_list[$i], -outfile=>'test.out'); >> $fac->rewind_results; >> if($result = $fac->next_result){ >> print "# hits found:: ", $result->num_hits,"\n"; >> if(my $hit = $result->next_hit){ >> $match++; >> } >> } >> } >> } >> $fac->cleanup(); >> >> >> Appreciate any help. >> >> >> >> >> >> >> >> Thanks >> -Abhi >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From abhiram.das at gatech.edu Tue Jun 28 17:32:14 2011 From: abhiram.das at gatech.edu (Das, Abhiram) Date: Tue, 28 Jun 2011 17:32:14 -0400 (EDT) Subject: [Bioperl-l] Runnig bl2seq throws error + Failure to parse BlastResult In-Reply-To: <1CA575DB-FDA1-4107-8711-2378456969B5@gmail.com> Message-ID: <680320907.570662.1309296734345.JavaMail.root@mail3.gatech.edu> Jason, thanks for your reply. I have two issues: 1. While running the program it shows up the following message: "Use of uninitialized value in numeric le (<=) at /Library/Perl/5.8.8/Bio/SearchIO/IteratedSearchResultEventBuilder.pm line 367, line 26." 2. Even though there is no hit; $results->num_hits get's me 1 hit. 
What I am trying to do: "Find number of pairwise similar sequences from a set of sequences." Please find attached the sequence file. Thanks again -Abhi ----- Original Message ----- From: "Jason Stajich" To: "Abhiram Das" Cc: bioperl-l at bioperl.org Sent: Tuesday, June 28, 2011 5:02:39 PM Subject: Re: [Bioperl-l] Runnig bl2seq throws error + Failure to parse BlastResult I'm sorry but I don't understand your question, are you worried that the num_hits is 1 simply? I think this is a function of a forced result from bl2seq -- can you better explain what you are trying to do or provide your sequence file so the issue you are concerned with can be replicated. Are you trying to construct an all-vs-all pairwise distances based on bl2seq? On Jun 28, 2011, at 1:26 PM, Das, Abhiram wrote: > Hi, > > Running the following code throws the message: > > "Use of uninitialized value in numeric le (<=) at /Library/Perl/5.8.8/Bio/SearchIO/IteratedSearchResultEventBuilder.pm line 315, line 26." > > Even though the output file shows no hit's, the BlastResult->num_hits returns 1. 
> > > > CODE: > > use strict; > use Bio::Seq; > use Bio::SeqIO; > use Bio::DB::GenBank; > use Bio::Tools::Run::StandAloneBlastPlus; > use Bio::Search::Result::BlastResult; > use Bio::Search::Hit::HitI; > > my $fac = Bio::Tools::Run::StandAloneBlastPlus->new(); > > my $seq_obj = Bio::SeqIO->new(-file => "all_reads.fasta", -format => "fasta", -alphabet => "dna"); > > #loop through each seq in the seq-obj and blast it with the next sequence > my @seq_list; > while(my $seq = $seq_obj->next_seq()){ > push(@seq_list, $seq); > } > > my $eval = 0.000001; > my $word = 16; > my $match = 0; > no strict; > for(my $i = 0; $i < @seq_list; $i++){ > for(my $j = $i+1; $j < @seq_list; $j++){ > $fac->bl2seq(-method=>'blastn', -query => $seq_list[$j], -subject => $seq_list[$i], -outfile=>'test.out'); > $fac->rewind_results; > if($result = $fac->next_result){ > print "# hits found:: ", $result->num_hits,"\n"; > if(my $hit = $result->next_hit){ > $match++; > } > } > } > } > $fac->cleanup(); > > > Appreciate any help. > > > > > > > > Thanks > -Abhi > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -------------- next part -------------- A non-text attachment was scrubbed... Name: all_reads.fasta Type: application/octet-stream Size: 1756 bytes Desc: not available URL: From cjfields at illinois.edu Tue Jun 28 21:00:16 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 28 Jun 2011 20:00:16 -0500 Subject: [Bioperl-l] problem with Bio::Align::Utilities - aa_to_dna_aln In-Reply-To: <4E09F348.2080907@bioinfo.hr> References: <4E09F348.2080907@bioinfo.hr> Message-ID: <250694DF-98CB-4E88-B98D-B97EA2A845AB@illinois.edu> Sofia, Not sure, it's hard to diagnose unless we have something more to go on. Unfortunately I can't look at it for a bit, but I suggest filing this as a possible bug, or maybe someone else has an answer? 
chris On Jun 28, 2011, at 10:29 AM, Sofia wrote: > Hi, > > I'm using the aa_to_dna_aln function from the BioPerl module Bio::Align::Utilities. > > I've used it some time ago and now i'm just re-running an old script and i get this error repeatedly: > > substr outside of string at /common/software/API/bioperl-1.6.1/lib/perl5/site_perl/5.8.8//Bio/Align/Utilities.pm line 160. > Use of uninitialized value in string eq at /common/software/API/bioperl-1.6.1/lib/perl5/site_perl/5.8.8//Bio/Align/Utilities.pm line 161. > > --------------------- WARNING --------------------- > MSG: In sequence ENSP00000372224 residue count gives end value 1161. > Overriding value [1107] with value 1161 for Bio::LocatableSeq::end(). > ATGGGGCGCTGGGCCTGGGTCCCCAGCCCCTGGCCCCCACCGGGGCTGGGCCCCTTCCTCCTCCTCCTCCTGCTGCTGCTGCTGCTGCCACGGGGGTTCCAGCCCCAGCCTGGCGGGAACCGTACGGAGTCCCCAGAACCTAATGCCACAGCGACCCCTGCGATCCCCACTATCCTGGTGACCTCTGTGACCTCTGAGACCCCAGCAACAAGTGCTCCAGAGGCAGAGGGACCCCAAAGTGGGGGGCTCCCGCCCCCGCCCAGGGCAGTTCCCTCGAGCAGTAGCCCCCAGGCCCAAGCACTCACCGAGGAC > --------------------------------------------------- > > I had a look at the code of Utilities.pm and the line 160 is: > > my $char = substr($aa_seqstr,$i + $start_offset,1); > > I was wondering why is the $start_offset applied to the amino acid sequence $aa_seqstr and not to the dna sequence $nt_seqstr(line 164)? > > When i remove the $start_offset from the line 160 and add it to the line 164 i don't get any errors. > > I don't know if that is a problem or if the problem are the arguments sending to the function. > > The arguments i'm using are the same i used before: an alignment of protein sequences and a set of dna sequences. Did something change regarding the arguments? > > I would appreciate any help you could provide. 
> > Best regards, > Sofia Pinto > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jason.stajich at gmail.com Wed Jun 29 01:28:34 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Tue, 28 Jun 2011 22:28:34 -0700 Subject: [Bioperl-l] Runnig bl2seq throws error + Failure to parse BlastResult In-Reply-To: <237763757.571237.1309299016179.JavaMail.root@mail3.gatech.edu> References: <237763757.571237.1309299016179.JavaMail.root@mail3.gatech.edu> Message-ID: <22E4FEA2-4184-44CE-8EB0-D0B289D123A1@gmail.com> You should continue to ask your question to the mailing list too. I presume when query == subject there is in fact an alignment, which is why you don't get an error. However, I don't understand why you are calling rewind_results -- this is what is causing the # of hits == 1 even when there is no hit. =head2 rewind Title : rewind Usage : $searchio->rewind; Function: Allow one to reset the Result iterator to the beginning, so that next_result() will subsequently return the first result and so on. NB: result objects are not cached, so you will get new result objects each time you rewind. Also, note that result_count() counts the number of times you have called next_result(), so will not be able tell you how many results there were in the file if you use rewind(). My guess is that the parser does not handle the new BLAST+ bl2seq output properly: it is still creating hit objects even when there are no results. I'm not entirely sure, but I think this has to do with a workaround for that format not providing the Query= at the beginning of the output in some cases. BLAST+ may be better, but staring at the output format suggests it is different. You can try the older bl2seq with StandAloneBlast, I suppose. You may want to consider another alignment approach anyway if you are trying to: "Find number of pairwise similar sequences from a set of sequences." 
Usually one searches a sequence against a database if they are looking for all hits, using BLAST or FASTA on a database for each query, not enumerating all pairwise as you seem to be doing in your code. You can always stop having messages displaying on the console by not having perl run with -w as for this script that will turn off the error message. On Jun 28, 2011, at 3:10 PM, Das, Abhiram wrote: > I wanted to stop further checking of hits and then hsps to decide if there was a match. > > The message is annoying when I am running 100s of sequences. Is there any way I can stop the message printing to the console? > > The message do not show up when the 'query' is same as 'subject'. > > Thanks again. > -Abhi > > > ----- Original Message ----- > From: "Jason Stajich" > To: "Abhiram Das" > Cc: bioperl-l at bioperl.org > Sent: Tuesday, June 28, 2011 5:39:00 PM > Subject: Re: [Bioperl-l] Runnig bl2seq throws error + Failure to parse BlastResult > > To answer your question "there might be a bug" > > But there won't be any hsps or hits so why do you care if the counter is wrong? > > > On Jun 28, 2011, at 2:32 PM, Das, Abhiram wrote: > >> Jason, thanks for your reply. >> >> I have two issues: >> >> 1. While running the program it shows up the following message: >> >> "Use of uninitialized value in numeric le (<=) at /Library/Perl/5.8.8/Bio/SearchIO/IteratedSearchResultEventBuilder.pm line 367, line 26." >> >> 2. Even though there is no hit; $results->num_hits get's me 1 hit. >> >> >> What I am trying to do: >> >> "Find number of pairwise similar sequences from a set of sequences." >> >> Please find attached the sequence file. 
>> >> Thanks again >> -Abhi >> >> ----- Original Message ----- >> From: "Jason Stajich" >> To: "Abhiram Das" >> Cc: bioperl-l at bioperl.org >> Sent: Tuesday, June 28, 2011 5:02:39 PM >> Subject: Re: [Bioperl-l] Runnig bl2seq throws error + Failure to parse BlastResult >> >> I'm sorry but I don't understand your question, are you worried that the num_hits is 1 simply? I think this is a function of a forced result from bl2seq -- can you better explain what you are trying to do or provide your sequence file so the issue you are concerned with can be replicated. >> >> Are you trying to construct an all-vs-all pairwise distances based on bl2seq? >> >> On Jun 28, 2011, at 1:26 PM, Das, Abhiram wrote: >> >>> Hi, >>> >>> Running the following code throws the message: >>> >>> "Use of uninitialized value in numeric le (<=) at /Library/Perl/5.8.8/Bio/SearchIO/IteratedSearchResultEventBuilder.pm line 315, line 26." >>> >>> Even though the output file shows no hit's, the BlastResult->num_hits returns 1. 
>>> >>> >>> >>> CODE: >>> >>> use strict; >>> use Bio::Seq; >>> use Bio::SeqIO; >>> use Bio::DB::GenBank; >>> use Bio::Tools::Run::StandAloneBlastPlus; >>> use Bio::Search::Result::BlastResult; >>> use Bio::Search::Hit::HitI; >>> >>> my $fac = Bio::Tools::Run::StandAloneBlastPlus->new(); >>> >>> my $seq_obj = Bio::SeqIO->new(-file => "all_reads.fasta", -format => "fasta", -alphabet => "dna"); >>> >>> #loop through each seq in the seq-obj and blast it with the next sequence >>> my @seq_list; >>> while(my $seq = $seq_obj->next_seq()){ >>> push(@seq_list, $seq); >>> } >>> >>> my $eval = 0.000001; >>> my $word = 16; >>> my $match = 0; >>> no strict; >>> for(my $i = 0; $i < @seq_list; $i++){ >>> for(my $j = $i+1; $j < @seq_list; $j++){ >>> $fac->bl2seq(-method=>'blastn', -query => $seq_list[$j], -subject => $seq_list[$i], -outfile=>'test.out'); >>> $fac->rewind_results; >>> if($result = $fac->next_result){ >>> print "# hits found:: ", $result->num_hits,"\n"; >>> if(my $hit = $result->next_hit){ >>> $match++; >>> } >>> } >>> } >>> } >>> $fac->cleanup(); >>> >>> >>> Appreciate any help. >>> >>> >>> >>> >>> >>> >>> >>> Thanks >>> -Abhi >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > From jason.stajich at gmail.com Wed Jun 29 01:29:51 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Tue, 28 Jun 2011 22:29:51 -0700 Subject: [Bioperl-l] problem with Bio::Align::Utilities - aa_to_dna_aln In-Reply-To: <4E09F348.2080907@bioinfo.hr> References: <4E09F348.2080907@bioinfo.hr> Message-ID: <54FF3FA6-BFE4-48F8-A180-576D33BF5BC3@gmail.com> Sofia - If you provided a test example of a file you are starting with we could test this out and determine the error. 
I don't know if you are relying on the offset from the alignment where the cDNA won't match the protein exactly - I am not sure that behavior will work, all my usage for this has been where the cDNA and protein are 1:1 and all the code is doing is inserting gaps in to make the codon alignment work. Jason On Jun 28, 2011, at 8:29 AM, Sofia wrote: > Hi, > > I'm using the aa_to_dna_aln function from the BioPerl module Bio::Align::Utilities. > > I've used it some time ago and now i'm just re-running an old script and i get this error repeatedly: > > substr outside of string at /common/software/API/bioperl-1.6.1/lib/perl5/site_perl/5.8.8//Bio/Align/Utilities.pm line 160. > Use of uninitialized value in string eq at /common/software/API/bioperl-1.6.1/lib/perl5/site_perl/5.8.8//Bio/Align/Utilities.pm line 161. > > --------------------- WARNING --------------------- > MSG: In sequence ENSP00000372224 residue count gives end value 1161. > Overriding value [1107] with value 1161 for Bio::LocatableSeq::end(). > ATGGGGCGCTGGGCCTGGGTCCCCAGCCCCTGGCCCCCACCGGGGCTGGGCCCCTTCCTCCTCCTCCTCCTGCTGCTGCTGCTGCTGCCACGGGGGTTCCAGCCCCAGCCTGGCGGGAACCGTACGGAGTCCCCAGAACCTAATGCCACAGCGACCCCTGCGATCCCCACTATCCTGGTGACCTCTGTGACCTCTGAGACCCCAGCAACAAGTGCTCCAGAGGCAGAGGGACCCCAAAGTGGGGGGCTCCCGCCCCCGCCCAGGGCAGTTCCCTCGAGCAGTAGCCCCCAGGCCCAAGCACTCACCGAGGAC > --------------------------------------------------- > > I had a look at the code of Utilities.pm and the line 160 is: > > my $char = substr($aa_seqstr,$i + $start_offset,1); > > I was wondering why is the $start_offset applied to the amino acid sequence $aa_seqstr and not to the dna sequence $nt_seqstr(line 164)? > > When i remove the $start_offset from the line 160 and add it to the line 164 i don't get any errors. > > I don't know if that is a problem or if the problem are the arguments sending to the function. > > The arguments i'm using are the same i used before: an alignment of protein sequences and a set of dna sequences. 
> Did something change regarding the arguments?
>
> I would appreciate any help you could provide.
>
> Best regards,
> Sofia Pinto

From jason.stajich at gmail.com Wed Jun 29 01:34:16 2011
From: jason.stajich at gmail.com (Jason Stajich)
Date: Tue, 28 Jun 2011 22:34:16 -0700
Subject: [Bioperl-l] passing twice a codon MSA to codeml factory
In-Reply-To: <4DF56976.8080704@upvnet.upv.es>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es>
Message-ID:

Lorenzo -

Did you ever find a workaround? I lost track of this bug and I still don't really have time to figure it out; maybe this is elevated on someone else's radar?

When I did try to debug this, I was not convinced that it is because you are re-using the codon_MSA object. I would have to go back to a simple code example that reproduces the error to track this down.

Usually the seqtype error occurs because codeml crashed. You can also print out the error string from the program and save the mlc output file to see what the errors were.

On Jun 12, 2011, at 6:35 PM, Lorenzo Carretero wrote:

> Hi,
>
> I'm trying to pass the same codon MSA several times to a sub which runs codeml with the parameters passed as arguments.
However, the second time it is passed, the program stops with the error message: > > --------------------- WARNING --------------------- > MSG: There was an error - see error_string for the program output > --------------------------------------------------- > > ------------- EXCEPTION: Bio::Root::NotImplemented ------------- > MSG: Unknown format of PAML output did not see seqtype > STACK: Error::throw > STACK: Bio::Root::Root::throw /Library/Perl//5.10.0/Bio/Root/Root.pm:472 > STACK: Bio::Tools::Phylo::PAML::_parse_summary /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:526 > STACK: Bio::Tools::Phylo::PAML::next_result /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:271 > STACK: main::BranchSiteEvolAnalysis /Users/Lorenzo/Documents/workspace/PlantEvolGen/test.pl:364 > STACK: /Users/Lorenzo/Documents/workspace/PlantEvolGen/test.pl:233 > > Here is just some partial code to illustrate what I'm saying: > > my $codon_MSA = Method_to_get_codonMSA ( $sequencesfilenameAA, $sequencesfilenameNT ); > ( $lnL, $omegas, $pamlrun ) = BranchSiteEvolAnalysis ( $codon_MSA, 2, $tree, 0, 0, 0, 8 ); > #The first time runs OK > ( $lnL, $omegas, $pamlrun ) = BranchSiteEvolAnalysis ( $codon_MSA, 0, $tree, 0, 0, 0, 8 ); > #The second time crashes > #Method to_run PAML with the codon_MSA, tree, and codeml parameters passed as arguments > sub BranchSiteEvolAnalysis > { > my ( $codon_MSA, $model, $tree, $NSsites, $fix_omega, $omega, $ncatG ) = @_; > . > . > . > my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml > ( > -alignment => $codon_MSA, > -tree => $biotree, > -params => { > #'verbose' => 0, > #'noisy' => 9, > 'runmode' => 0, #user tree > 'seqtype' => 1, > 'model' => $model, > 'NSsites' => $NSsites, > 'fix_omega' => $fix_omega, > 'omega' => $omega, > 'ncatG' => $ncatG, > #'icode' => 0, > #'fix_alpha' => 0, > #'fix_kappa' => 0, > #'RateAncestor' => 0, > 'CodonFreq' => 2, > 'cleandata' => 0, # remove sites with amibguity data (1 yes, 0 no), > 'ndata' => 2 > }, > ); > . > . > . 
> } > > I verified and the $codon_MSA ref point to the same location in memory before and after running the codeml_factory, so I guess it is not modified by the package in such a way that it couldn't be passed more than once. DO you know of any way to avoid redoing the $codon_MSA each time i want to pass it to the codeml_factory. > > Thank you very much, > > Lorenzo > > > > -- > *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* > Lorenzo Carretero Paulet > Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) > Integrative Systems Biology Group > C/ Ingeniero Fausto Elio s/n. > 46022 Valencia, Spain > > Phone: +34 963879934 > Fax: +34 963877859 > e-mail: locarpau at upvnet.upv.es > *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jason.stajich at gmail.com Wed Jun 29 01:38:24 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Tue, 28 Jun 2011 22:38:24 -0700 Subject: [Bioperl-l] Mapping GO slim terms In-Reply-To: <4E03834B.8000801@upvnet.upv.es> References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <4E03834B.8000801@upvnet.upv.es> Message-ID: <2D0FA564-AD89-4DD6-A870-7FD486841270@gmail.com> Lorenzo - you probably want to look at the GO-perl modules from Chris Mungall and others (http://search.cpan.org/~cmungall/go-perl/) -- these are not part of BioPerl however. I think this script does something akin to what you are asking. http://search.cpan.org/~cmungall/go-perl/scripts/map2slim On Jun 23, 2011, at 11:17 AM, Lorenzo Carretero Paulet wrote: > Hi all, > Anyone knows if there is a way of mapping a set fo GO terms to their GO slim terms (providing the corresponding OBO file) using Bioperl? 
> Cheers,
> Lorenzo
>
> --
> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
> Lorenzo Carretero Paulet
> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV)
> Integrative Systems Biology Group
> C/ Ingeniero Fausto Elio s/n.
> 46022 Valencia, Spain
>
> Phone: +34 963879934
> Fax: +34 963877859
> e-mail: locarpau at upvnet.upv.es
> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

From jason.stajich at gmail.com Wed Jun 29 01:40:46 2011
From: jason.stajich at gmail.com (Jason Stajich)
Date: Tue, 28 Jun 2011 22:40:46 -0700
Subject: [Bioperl-l] Reg. Objects created by default on Bioperl
In-Reply-To:
References:
Message-ID: <7EA73DDD-2A1F-4CE4-B83A-AB337B0FB215@gmail.com>

The documentation here
http://search.cpan.org/~cjfields/BioPerl-1.6.901/Bio/Tools/Run/StandAloneBlast.pm
says: "For blastall and non-psiblast blastpgp runs, report object is a Bio::SearchIO object"

On Jun 24, 2011, at 5:13 AM, ravikumar jayachandran wrote:

> Hi,
>
> I have a question regarding the objects created automatically in the Perl
> script. On what basis are the objects assigned to a specific module? Please
> find below the code, which I found at
> http://www.bioperl.org/wiki/HOWTO:Beginners.
>
> ---------------------------------
> use Bio::Seq;
> use Bio::Tools::Run::StandAloneBlast;
>
> $blast_obj = Bio::Tools::Run::StandAloneBlast->new(-program => 'blastn',
>                                                    -database => 'db.fa');
>
> $seq_obj = Bio::Seq->new(-id => "test query",
>                          -seq => "TTTAAATATATTTTGAAGTATAGATTATATGTT");
>
> $report_obj = $blast_obj->blastall($seq_obj);
>
> $result_obj = $report_obj->next_result;
>
> print $result_obj->num_hits;
>
> -------------------------------------------------
>
> For example, the "next_result" method is present in the Bio::SearchIO module.
> $report_obj has been used to access this "next_result" method. I don't
> understand on what basis $report_obj has been defined as a Bio::SearchIO
> object by default. Please help me understand the concept.
>
> Thanks & Regards,
> Ravi.

From jason.stajich at gmail.com Wed Jun 29 01:45:01 2011
From: jason.stajich at gmail.com (Jason Stajich)
Date: Tue, 28 Jun 2011 22:45:01 -0700
Subject: [Bioperl-l] UTR regions in LiveSeq
In-Reply-To: <23127982-f3ab-4aab-b267-451fdff62110@s17g2000yqs.googlegroups.com>
References: <23127982-f3ab-4aab-b267-451fdff62110@s17g2000yqs.googlegroups.com>
Message-ID: <40AA3E17-A9B0-4CD4-9F41-F91B5576A5A2@gmail.com>

To answer you simply: no, there is no plan to add this capability unless someone jumps in and wants to code it. This is a volunteer project, and the person working on those modules stopped ~8 years ago, after they were developed for their needs. There are other toolkits that implement aspects of calculating the effects of mutations, but it would be nice to have something that worked within the code base here.

I would encourage you to look into writing the modules you need. This is a collaborative project where people hopefully develop the new code they need in collaboration with other developers to make the job easier, and we are happy to help improve code that someone starts and contributes to the toolkit.

-Jason

On Jun 23, 2011, at 5:20 AM, htafer wrote:

> Hi
> I am currently working on SNPs and would like to use BioPerl.
> I am using LiveSeq objects to store the sequence and I am looking at
> mutations by using the Mutation/Mutator objects.
>
> More explicitly:
> Given a set of annotations, I extract the boundaries for all exons/
> introns, the boundaries for the transcript as well as for the coding
> sequence.
> Based on these boundaries and the corresponding genomic
> sequence I construct the Bio::LiveSeq::{DNA/Exon/Transcript/
> Translation} objects, which allow me to construct a Bio::LiveSeq::Gene
> object.
>
> Then, based on a list of SNPs, I generate a set of
> Bio::LiveSeq::Mutation objects and a Bio::LiveSeq::Mutator,
> which I then use to mutate my gene. My main problem here is that I don't
> know how to handle transcripts having UTRs, or ncRNAs, i.e. transcripts/
> exons that do not code for protein. According to the documentation of
> Bio::LiveSeq::Transcript, this class is aimed at storing information
> about coding sequences (CDS) only.
>
> The following code for transcripts with UTRs is somewhat working,
> delivering the expected results. Still, the alignment function of
> Bio::LiveSeq::Mutator does not work. So is there some plan to introduce
> Bio::LiveSeq::UTR, similar to Bio::SeqFeature::Gene::UTR, and to better
> support ncRNA/UTR with the mutator object? Or is it possible to do this
> with the current BioPerl implementation?
> > my $DNAsequence = Bio::LiveSeq::DNA->new( -seq => > "GGGGGGGGGGATGAAAAAAATTTTTTTTTTAAAAAAAATAGCCCCCCCCCC"); > > > > my $utr5 = Bio::LiveSeq::Exon-> new(-seq => $DNAsequence, > -start => 1, > -end => 10, > -strand => 1); > my $exon1 = Bio::LiveSeq::Exon-> new(-seq => $DNAsequence, > -start => 11, > -end => 20, > -strand => 1); > my $intron1 =Bio::LiveSeq::Intron-> new(-seq => $DNAsequence, > -start => 21, > -end => 30, > -strand => 1); > > my $exon2 = Bio::LiveSeq::Exon-> new(-seq => $DNAsequence, > -start => 31, > -end => 41, > -strand => 1); > my $utr3 = Bio::LiveSeq::Exon-> new(-seq => $DNAsequence, > -start => 42, > -end => 51, > -strand => 1); > > my @tarray = ($exon1, $exon2); > my @uarray = ($utr5, $exon1, $exon2,$utr3); > my @iarray = ($intron1); > my $Transcript = Bio::LiveSeq::Transcript->new( -exons => \@uarray); > my $translationTranscript = Bio::LiveSeq::Transcript->new( -exons => > \@tarray); > my $Translation= Bio::LiveSeq::Translation->new( -transcript => > $translationTranscript); > #need to do this to avoid change_error() > $Transcript->{'translation'}=$Translation; > my $features; > $features->{DNA} = $DNAsequence; > $features->{Transcripts} = [$Transcript]; > $features->{Translations} = [$Translation]; > $features->{Exons} = \@uarray; > $features->{Introns} = \@iarray; > > > my $gene=Bio::LiveSeq::Gene->new(-name => "bla", > -features => $features); > > my $mutation = new Bio::LiveSeq::Mutation (-seq =>'', > -pos => 32, > -len => 3 > ); > > my $mutate = Bio::LiveSeq::Mutator->new(-gene => $gene, > -numbering => 'entry' > ); > > $mutate->add_Mutation($mutation); > > dna_mut:GGGGGGGGGGATGAAAAAAATTTTTTTTTTAAAAATAGCCCCCCCCCC > dna_ori: GGGGGGGGGGATGAAAAAAATTTTTTTTTTAAAAAAAATAGCCCCCCCCCC > rna_mut: GGGGGGGGGGATGAAAAAAAAAAAATAGCCCCCCCCCC > rna_ori: GGGGGGGGGGATGAAAAAAAAAAAAAAATAGCCCCCCCCCC > aa_mut: MKKKK* > aa_ori: MKKKKK* > > print $results->alignment(); > Variant : GAT GAA AAA AAA AAA ATA GCC CCC > Reference: GAT GAA AAA AAA Bio AAA ATA GCC CCC 
> E K K X K I A > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jason.stajich at gmail.com Wed Jun 29 01:46:29 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Tue, 28 Jun 2011 22:46:29 -0700 Subject: [Bioperl-l] AlignIO In-Reply-To: References: <2213d37b-5145-4fde-a239-68ab756ab902@l6g2000vbn.googlegroups.com> Message-ID: There are other randomizing tools out there -- I believe Sean Eddy's shuffle does this http://www.bioperl.org/wiki/SQUID#shuffle On Jun 21, 2011, at 10:23 AM, Chad Davis wrote: > It sounds like your randomization procedure is breaking the alignment > format, such that the clustal parser can no longer read it. How are > you randomizing the file? An example might help. > > Chad > > On Tue, Jun 21, 2011 at 18:55, ashwoo wrote: >> Dear All, >> I have a script which parses the best HSP alignment out >> of BLAST result and writes it to a temporary file. >> >> my $aln = $hsp->get_aln; >> 1. my $out = Bio::AlignIO->new(-file => ">tmp.aln", >> 2. -format => 'clustalw'); >> 3. $out->write_aln($aln); >> >> >> I randomize the alignment within the "tmp.aln" file to generate a new >> file "mult_rand.aln" containing all randomized alignments in clustalw >> format. >> Now I want to read each alignment in the randomized file hence I use >> >> 4. my $in = Bio::AlignIO->new(-file => "mult_rand.aln", >> 5. -format => 'clustalw'); >> 6. while ( my $aln = $in->next_aln() ) { >> 7. #"RUN RNAZ to check the conservedness of each randomized >> alignment" >> 8. #NOT GETTING ANY VALUES HERE >> 9. } >> >> But I am not able to access each alignment. When I open the randomized >> aln file with a separate script and same code in lines 4-9 it works >> fine. Is this happening due to my using AlignIO objects twice in the >> same script. Please Help. 
>>
>> yours sincerely,
>> Perl Novice

From sofia at bioinfo.hr Wed Jun 29 05:25:44 2011
From: sofia at bioinfo.hr (Sofia)
Date: Wed, 29 Jun 2011 11:25:44 +0200
Subject: [Bioperl-l] problem with Bio::Align::Utilities - aa_to_dna_aln
In-Reply-To: <54FF3FA6-BFE4-48F8-A180-576D33BF5BC3@gmail.com>
References: <4E09F348.2080907@bioinfo.hr> <54FF3FA6-BFE4-48F8-A180-576D33BF5BC3@gmail.com>
Message-ID: <4E0AEF98.10905@bioinfo.hr>

Hi,

I'm using only a part of the protein sequence and the entire cDNA, so they are not 1:1. That's how I used it before, and it used to work. So the current function only works when the cDNA and protein match exactly? Is there no other way?

Attached you can find 4 files:

prot_seq.txt - contains parts of protein sequences; the start and end of each sequence are provided in the FASTA header
prot_align.out - alignment of the proteins by Muscle
dna_seq.txt - contains cDNA sequences
dna_align.txt - alignment of the cDNA parts corresponding to the proteins, given by aa_to_dna_aln after the errors (see errors.out)

Thanks.
Sofia
-------------- next part --------------
A non-text attachment was scrubbed...
Name: files_Sofia.rar
Type: application/octet-stream
Size: 19737 bytes
Desc: not available
URL:

From awitney at sgul.ac.uk Wed Jun 29 07:44:06 2011
From: awitney at sgul.ac.uk (Adam Witney)
Date: Wed, 29 Jun 2011 12:44:06 +0100
Subject: [Bioperl-l] running a simple example with Bio::Tools::Run::BWA
Message-ID:

Hi,

I am trying to run bwa according to the HOWTO listed here:

http://www.bioperl.org/wiki/HOWTO:Short-read_assemblies_with_BWA

To test it I am just trying to reproduce the simple example at the beginning, using this code:

#!/usr/local/bin/perl -w
use warnings;
use strict;
use Bio::Tools::Run::BWA;

my $bwa = Bio::Tools::Run::BWA->new();
$bwa->out_type('test.sam');
my $assy = $bwa->run( 'myseq.fastq', 'myref.fa' );
print $assy."\n";

My understanding of the example in the HOWTO is that the file "test.sam" should be created (although it looks like it should be producing a .bam file now anyway); is that correct? If so, my problem is that the file is not created, and from a little digging in Bio::Tools::Run::BWA I can't see where it would be (it does create a tmp file in /tmp/xxxx/, but this is eventually deleted).

I think the process is working, as I can see bwa and samtools running using "top", and without the out_type line I do get a Bio::Assembly::ScaffoldI object created in $assy. It's just that the output file is not being copied/moved to the requested out_type.

Have I just misunderstood how this works?

Thanks for any help

Adam

From locarpau at upvnet.upv.es Wed Jun 29 09:42:01 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Wed, 29 Jun 2011 15:42:01 +0200
Subject: [Bioperl-l] Mapping GO slim terms
In-Reply-To: <2D0FA564-AD89-4DD6-A870-7FD486841270@gmail.com>
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <4E03834B.8000801@upvnet.upv.es> <2D0FA564-AD89-4DD6-A870-7FD486841270@gmail.com>
Message-ID: <4E0B2BA9.4000202@upvnet.upv.es>

Jason,
Thank you very much. I'll take a look at that module.
Lorenzo

--
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
Lorenzo Carretero Paulet
Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV)
Integrative Systems Biology Group
C/ Ingeniero Fausto Elio s/n.
46022 Valencia, Spain

Phone: +34 963879934
Fax: +34 963877859
e-mail: locarpau at upvnet.upv.es
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

From nickl at ebi.ac.uk Wed Jun 29 09:44:01 2011
From: nickl at ebi.ac.uk (Nick Langridge)
Date: Wed, 29 Jun 2011 14:44:01 +0100
Subject: [Bioperl-l] Bio::Species bug report - cannot handle scientific names containing brackets
Message-ID: <4E0B2C21.3050403@ebi.ac.uk>

Hi,

I'm having problems with Bio::Species and species that have brackets in their scientific names, e.g. "Buchnera aphidicola (subsp. Acyrthosiphon pisum, strain 5A)".

Bio::Species tries to extract the genus, species, and subspecies from the scientific name, but it ends up with mismatched brackets, e.g.
genus: "Buchnera"
species: "aphidicola (subsp."
subspecies: "Acyrthosiphon pisum, strain 5A)"

This causes an 'Unmatched ( in regex' runtime error when the module later tries to use the species value directly in a regex (in the binomial sub).

I'm not sure if this module is maintained (?), as the docs say it is deprecated (unfortunately my circumstances mean I can't avoid it), but if it is and you'd like a more detailed bug report then I can provide one.

Best regards,
Nick

-----------------------------
Nick Langridge
Ensembl Genomes Web Developer
EMBL-EBI
http://www.ensemblgenomes.org

From cjfields at illinois.edu Wed Jun 29 09:48:44 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 29 Jun 2011 08:48:44 -0500
Subject: [Bioperl-l] passing twice a codon MSA to codeml factory
In-Reply-To:
References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es>
Message-ID: <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu>

There was a lot of PAML refactoring done over the last several releases; I wouldn't be surprised if there is some discordance between the BioPerl-supported version (I think the policy was that only the latest PAML is supported in bioperl-live) and the one here.
Lorenzo, can you give us a bit more information on what versions of BioPerl and PAML you are using? chris On Jun 29, 2011, at 12:34 AM, Jason Stajich wrote: > Lorenzo - > > Did you ever find a workaround - I lost track of this bug and I still don't really have time to figure it out, maybe this is elevated one someone else's radar? > > When I did try to debug this I was not convinced this it is because you are re-using the codon_MSA object. I would have to go back to a simple code example that we can reproduce the error from to track this down. > > Usually the seqtype error is because codeml crashed and you can also print out the error string from the program and save the mlc out file to see what the errors were. > > > > On Jun 12, 2011, at 6:35 PM, Lorenzo Carretero wrote: > >> Hi, >> >> I'm trying to pass the same codon MSA several times to a sub which runs codeml with the parameters passed as arguments. However, the second time it is passed, the program stops with the error message: >> >> --------------------- WARNING --------------------- >> MSG: There was an error - see error_string for the program output >> --------------------------------------------------- >> >> ------------- EXCEPTION: Bio::Root::NotImplemented ------------- >> MSG: Unknown format of PAML output did not see seqtype >> STACK: Error::throw >> STACK: Bio::Root::Root::throw /Library/Perl//5.10.0/Bio/Root/Root.pm:472 >> STACK: Bio::Tools::Phylo::PAML::_parse_summary /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:526 >> STACK: Bio::Tools::Phylo::PAML::next_result /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:271 >> STACK: main::BranchSiteEvolAnalysis /Users/Lorenzo/Documents/workspace/PlantEvolGen/test.pl:364 >> STACK: /Users/Lorenzo/Documents/workspace/PlantEvolGen/test.pl:233 >> >> Here is just some partial code to illustrate what I'm saying: >> >> my $codon_MSA = Method_to_get_codonMSA ( $sequencesfilenameAA, $sequencesfilenameNT ); >> ( $lnL, $omegas, $pamlrun ) = BranchSiteEvolAnalysis ( 
$codon_MSA, 2, $tree, 0, 0, 0, 8 ); >> #The first time runs OK >> ( $lnL, $omegas, $pamlrun ) = BranchSiteEvolAnalysis ( $codon_MSA, 0, $tree, 0, 0, 0, 8 ); >> #The second time crashes >> #Method to_run PAML with the codon_MSA, tree, and codeml parameters passed as arguments >> sub BranchSiteEvolAnalysis >> { >> my ( $codon_MSA, $model, $tree, $NSsites, $fix_omega, $omega, $ncatG ) = @_; >> . >> . >> . >> my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml >> ( >> -alignment => $codon_MSA, >> -tree => $biotree, >> -params => { >> #'verbose' => 0, >> #'noisy' => 9, >> 'runmode' => 0, #user tree >> 'seqtype' => 1, >> 'model' => $model, >> 'NSsites' => $NSsites, >> 'fix_omega' => $fix_omega, >> 'omega' => $omega, >> 'ncatG' => $ncatG, >> #'icode' => 0, >> #'fix_alpha' => 0, >> #'fix_kappa' => 0, >> #'RateAncestor' => 0, >> 'CodonFreq' => 2, >> 'cleandata' => 0, # remove sites with amibguity data (1 yes, 0 no), >> 'ndata' => 2 >> }, >> ); >> . >> . >> . >> } >> >> I verified and the $codon_MSA ref point to the same location in memory before and after running the codeml_factory, so I guess it is not modified by the package in such a way that it couldn't be passed more than once. DO you know of any way to avoid redoing the $codon_MSA each time i want to pass it to the codeml_factory. >> >> Thank you very much, >> >> Lorenzo >> >> >> >> -- >> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >> Lorenzo Carretero Paulet >> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) >> Integrative Systems Biology Group >> C/ Ingeniero Fausto Elio s/n. 
>> 46022 Valencia, Spain >> >> Phone: +34 963879934 >> Fax: +34 963877859 >> e-mail: locarpau at upvnet.upv.es >> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From locarpau at upvnet.upv.es Wed Jun 29 10:23:40 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet) Date: Wed, 29 Jun 2011 16:23:40 +0200 Subject: [Bioperl-l] passing twice a codon MSA to codeml factory In-Reply-To: <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> References: <9645AF32-5EC3-41FA-9A32-45B6B92E31FD@illinois.edu> <4DF56976.8080704@upvnet.upv.es> <9866C4A4-AC36-4A25-B38F-3006A7BB0F11@illinois.edu> Message-ID: <4E0B356C.9020609@upvnet.upv.es> Hi, I'm using what I think are the latest versions of both Bioperl (1.006901 ) and PAML (4.4). I didn't found the way of passing the same alignment object to the BioPerl codeml factory several times. I had to realign the sequences and create a new alignment object each time. Cheers, Lorenzo El 29/06/11 15:48, Chris Fields escribi?: > There was a lot of PAML refactoring done over the last several releases; wouldn't be surprised if there is some discordance between the BioPerl-supported version (I think the policy was only the latest PAML is supported in bioperl-live) and the one here. Lorenzo, can you give us a bit more information on what versions of BioPerl and PAML you are using? > > chris > > On Jun 29, 2011, at 12:34 AM, Jason Stajich wrote: > >> Lorenzo - >> >> Did you ever find a workaround - I lost track of this bug and I still don't really have time to figure it out, maybe this is elevated one someone else's radar? 
>> >> When I did try to debug this I was not convinced this it is because you are re-using the codon_MSA object. I would have to go back to a simple code example that we can reproduce the error from to track this down. >> >> Usually the seqtype error is because codeml crashed and you can also print out the error string from the program and save the mlc out file to see what the errors were. >> >> >> >> On Jun 12, 2011, at 6:35 PM, Lorenzo Carretero wrote: >> >>> Hi, >>> >>> I'm trying to pass the same codon MSA several times to a sub which runs codeml with the parameters passed as arguments. However, the second time it is passed, the program stops with the error message: >>> >>> --------------------- WARNING --------------------- >>> MSG: There was an error - see error_string for the program output >>> --------------------------------------------------- >>> >>> ------------- EXCEPTION: Bio::Root::NotImplemented ------------- >>> MSG: Unknown format of PAML output did not see seqtype >>> STACK: Error::throw >>> STACK: Bio::Root::Root::throw /Library/Perl//5.10.0/Bio/Root/Root.pm:472 >>> STACK: Bio::Tools::Phylo::PAML::_parse_summary /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:526 >>> STACK: Bio::Tools::Phylo::PAML::next_result /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:271 >>> STACK: main::BranchSiteEvolAnalysis /Users/Lorenzo/Documents/workspace/PlantEvolGen/test.pl:364 >>> STACK: /Users/Lorenzo/Documents/workspace/PlantEvolGen/test.pl:233 >>> >>> Here is just some partial code to illustrate what I'm saying: >>> >>> my $codon_MSA = Method_to_get_codonMSA ( $sequencesfilenameAA, $sequencesfilenameNT ); >>> ( $lnL, $omegas, $pamlrun ) = BranchSiteEvolAnalysis ( $codon_MSA, 2, $tree, 0, 0, 0, 8 ); >>> #The first time runs OK >>> ( $lnL, $omegas, $pamlrun ) = BranchSiteEvolAnalysis ( $codon_MSA, 0, $tree, 0, 0, 0, 8 ); >>> #The second time crashes >>> #Method to_run PAML with the codon_MSA, tree, and codeml parameters passed as arguments >>> sub 
BranchSiteEvolAnalysis >>> { >>> my ( $codon_MSA, $model, $tree, $NSsites, $fix_omega, $omega, $ncatG ) = @_; >>> . >>> . >>> . >>> my $codeml_factory = new Bio::Tools::Run::Phylo::PAML::Codeml >>> ( >>> -alignment => $codon_MSA, >>> -tree => $biotree, >>> -params => { >>> #'verbose' => 0, >>> #'noisy' => 9, >>> 'runmode' => 0, #user tree >>> 'seqtype' => 1, >>> 'model' => $model, >>> 'NSsites' => $NSsites, >>> 'fix_omega' => $fix_omega, >>> 'omega' => $omega, >>> 'ncatG' => $ncatG, >>> #'icode' => 0, >>> #'fix_alpha' => 0, >>> #'fix_kappa' => 0, >>> #'RateAncestor' => 0, >>> 'CodonFreq' => 2, >>> 'cleandata' => 0, # remove sites with amibguity data (1 yes, 0 no), >>> 'ndata' => 2 >>> }, >>> ); >>> . >>> . >>> . >>> } >>> >>> I verified and the $codon_MSA ref point to the same location in memory before and after running the codeml_factory, so I guess it is not modified by the package in such a way that it couldn't be passed more than once. DO you know of any way to avoid redoing the $codon_MSA each time i want to pass it to the codeml_factory. >>> >>> Thank you very much, >>> >>> Lorenzo >>> >>> >>> >>> -- >>> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >>> Lorenzo Carretero Paulet >>> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) >>> Integrative Systems Biology Group >>> C/ Ingeniero Fausto Elio s/n. 
>>> 46022 Valencia, Spain >>> >>> Phone: +34 963879934 >>> Fax: +34 963877859 >>> e-mail: locarpau at upvnet.upv.es >>> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail: locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* From carandraug+dev at gmail.com Wed Jun 29 16:53:42 2011 From: carandraug+dev at gmail.com (Carnë Draug) Date: Wed, 29 Jun 2011 21:53:42 +0100 Subject: [Bioperl-l] How (from where) to retrieve FieldInfo objects? Message-ID: Hi By using the einfo script that comes with bioperl I get a list of fields of info that I want to access $ einfo --database=gene If I understood correctly the man page of the script, this returns the list of info I should be able to obtain when searching the 'gene' database. As such, I tried to use EUtilities esearch to access this info. Using the deobfuscator, I understood that I should create a Bio::Tools::EUtilities::Query EUtilities object with Bio::DB::EUtilities esearch. With this object I could use the get_FieldInfo method to obtain an array of FieldInfo objects or the next_FieldInfo method to obtain one of them. However, I'm failing miserably at this. My search returns no FieldInfo objects whatsoever. 
my $factory = Bio::DB::EUtilities->new( -eutil => 'esearch', -db => 'gene', -term => 'h2afx[sym] AND human[organism]', -retmax => 5, ); ## this returns nothing my @fields = $factory->get_FieldInfos; ## it also never gets into this loop while (my $field = $factory->next_FieldInfo) { say "hello"; } My question is how do I access the info fields mentioned when I use the einfo script. I can't find any info on this. Thanks in advance, Carnë From cjfields at illinois.edu Wed Jun 29 17:39:33 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 29 Jun 2011 16:39:33 -0500 Subject: [Bioperl-l] Bio::Species bug report - cannot handle scientific names containing brackets In-Reply-To: <4E0B2C21.3050403@ebi.ac.uk> References: <4E0B2C21.3050403@ebi.ac.uk> Message-ID: <250A8ACB-4F05-42EA-8174-67DFBEC2C271@illinois.edu> You are more than welcome to file a bug, but we don't really support parsing this data anymore (the reason why Bio::Species is deprecated). It's just too fraught with problematic edge cases to work in all instances. chris On Jun 29, 2011, at 8:44 AM, Nick Langridge wrote: > Hi, > > I'm having problems with Bio::Species and species that have brackets in their scientific names, e.g. "Buchnera aphidicola (subsp. Acyrthosiphon pisum, strain 5A)". > > Bio::Species tries to extract the genus, species, and subspecies from the scientific name, but it ends up with mismatched brackets, e.g. > genus: "Buchnera" > species: "aphidicola (subsp." > subspecies: "Acyrthosiphon pisum, strain 5A)" > > This causes an 'Unmatched ( in regex' runtime error when the module later tries to use the species value directly in a regex (in the binomial sub). > > I'm not sure if this module is maintained (?) as the docs say it is deprecated (unfortunately my circumstances mean I can't avoid it) but if it is and you'd like a more detailed bug report then I can provide one. 
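The 'Unmatched ( in regex' error described above can be reproduced outside Bio::Species. The following is only a minimal sketch, assuming that the species string ends up interpolated into a pattern as the report says; it is not the module's actual code. It shows the failure and the usual Perl remedy of escaping the string with \Q...\E (quotemeta) before matching. The variable names are illustrative.

```perl
use strict;
use warnings;

# Values taken from the bug report quoted above.
my $species = 'aphidicola (subsp.';
my $name    = 'Buchnera aphidicola (subsp. Acyrthosiphon pisum, strain 5A)';

# Interpolating $species unescaped makes Perl compile "(" as the start of
# a group, which is fatal at run time; eval catches the exception.
my $ok = eval { $name =~ /$species/; 1 };
print "unquoted pattern failed: $@" unless $ok;

# \Q...\E escapes regex metacharacters, so the string matches literally.
my $matched = $name =~ /\Q$species\E/ ? 1 : 0;
print "quoted pattern matched: $matched\n";
```

A caller who cannot avoid the deprecated module might apply the same escaping to any name fragment before using it in a pattern of their own.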
> > Best regards, > Nick > > ----------------------------- > Nick Langridge > Ensembl Genomes Web Developer > EMBL-EBI > http://www.ensemblgenomes.org > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From carandraug+dev at gmail.com Wed Jun 29 18:12:37 2011 From: carandraug+dev at gmail.com (Carnë Draug) Date: Wed, 29 Jun 2011 23:12:37 +0100 Subject: [Bioperl-l] How (from where) to retrieve FieldInfo objects? In-Reply-To: <58D258F0-7A80-4CEC-ACF6-99110AA623A4@verizon.net> References: <58D258F0-7A80-4CEC-ACF6-99110AA623A4@verizon.net> Message-ID: > On Jun 29, 2011, at 4:53 PM, Carnë Draug wrote: > > Hi > > By using the einfo script that comes with bioperl I get a list of > fields of info that I want to access > > $ einfo --database=gene > > If I understood correctly the man page of the script, this returns the > list of info I should be able to obtain when searching the 'gene' > database. As such, I tried to use EUtilities esearch to access this > info. > > Using the deobfuscator, I understood that I should create a > Bio::Tools::EUtilities::Query EUtilities object with > Bio::DB::EUtilities esearch. With this object I could use the > get_FieldInfo method to obtain an array of FieldInfo objects or the > next_FieldInfo method to obtain one of them. However, I'm failing > miserably at this. My search returns no FieldInfo objects whatsoever. > > my $factory = Bio::DB::EUtilities->new( > -eutil => 'esearch', > -db => 'gene', > -term => 'h2afx[sym] AND > human[organism]', > -retmax => 5, > ); > > ## this returns nothing > my @fields = $factory->get_FieldInfos; > > ## it also never gets into this loop > while (my $field = $factory->next_FieldInfo) { > say "hello"; > } > > My question is how do I access the info fields mentioned when I use > the einfo script. I can't find any info on this. 
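An aside on the question quoted above: as far as I understand NCBI Entrez, the fields that einfo lists are the search fields a database is indexed by, so their codes belong inside the -term string of an esearch (as 'sym' and 'organism' do in the query above) rather than coming back as FieldInfo objects from a search. The sketch below only builds such a term string in plain Perl and makes no network call; build_term is a hypothetical helper, not a BioPerl method.

```perl
use strict;
use warnings;

# build_term() is a hypothetical helper (not part of BioPerl): it tags each
# value with an Entrez field code and joins the clauses with AND.
sub build_term {
    my @clauses = @_;    # list of [ value, field_code ] pairs
    return join ' AND ', map { sprintf '%s[%s]', @$_ } @clauses;
}

# Field codes as used in the query quoted above.
my $term = build_term( [ 'h2afx', 'sym' ], [ 'human', 'organism' ] );
print "$term\n";
```

The resulting string would be passed as -term when constructing the esearch factory, in the same way as the hand-written term in the original query.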
2011/6/29 Brian Osborne : > Carne, something like: > > #!/usr/bin/perl > use Bio::DB::EUtilities; > # Get all the fields for a db: > # > my $factory = Bio::DB::EUtilities->new(-eutil => 'einfo', > -db => 'genomeprj', ); > $factory->print_all; Hi Brian thanks for your answer. Maybe I'm just misunderstanding the point of the FieldInfo objects. The script you sent gives me a list of the Field Infos, which if I understood correctly is the list of infos I can obtain for a specific database (genomeprj in your case). But when I search that specific database, how do I get that info? For example, running the code you mentioned, one of the Field Infos has the name 'Sequencing Center' and the code CEN. So when I search the genomeprj database with a query, how do I get that value from the results? Thanks Carnë From bosborne11 at verizon.net Wed Jun 29 17:13:04 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 29 Jun 2011 17:13:04 -0400 Subject: [Bioperl-l] How (from where) to retrieve FieldInfo objects? In-Reply-To: References: Message-ID: <58D258F0-7A80-4CEC-ACF6-99110AA623A4@verizon.net> Carne, something like: #!/usr/bin/perl use Bio::DB::EUtilities; # Get all the fields for a db: # my $factory = Bio::DB::EUtilities->new(-eutil => 'einfo', -db => 'genomeprj', ); $factory->print_all; On Jun 29, 2011, at 4:53 PM, Carnë Draug wrote: > Hi > > By using the einfo script that comes with bioperl I get a list of > fields of info that I want to access > > $ einfo --database=gene > > If I understood correctly the man page of the script, this returns the > list of info I should be able to obtain when searching the 'gene' > database. As such, I tried to use EUtilities esearch to access this > info. > > Using the deobfuscator, I understood that I should create a > Bio::Tools::EUtilities::Query EUtilities object with > Bio::DB::EUtilities esearch. 
With this object I could use the > get_FieldInfo method to obtain an array of FieldInfo objects or the > next_FieldInfo method to obtain one of them. However, I'm failing > miserably at this. My search returns no FieldInfo objects whatsoever. > > my $factory = Bio::DB::EUtilities->new( > -eutil => 'esearch', > -db => 'gene', > -term => 'h2afx[sym] AND > human[organism]', > -retmax => 5, > ); > > ## this returns nothing > my @fields = $factory->get_FieldInfos; > > ## it also never gets into this loop > while (my $field = $factory->next_FieldInfo) { > say "hello"; > } > > My question is how do I access the info fields mentioned when I use > the einfo script. I can't find any info on this. > > Thanks in advance, > Carnë > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From Russell.Smithies at agresearch.co.nz Wed Jun 29 18:30:04 2011 From: Russell.Smithies at agresearch.co.nz (Smithies, Russell) Date: Thu, 30 Jun 2011 10:30:04 +1200 Subject: [Bioperl-l] How (from where) to retrieve FieldInfo objects? In-Reply-To: References: <58D258F0-7A80-4CEC-ACF6-99110AA623A4@verizon.net> Message-ID: <18DF7D20DFEC044098A1062202F5FFF3396074D238@exchsth.agresearch.co.nz> How about just returning ASN.1 then parsing that? There's far more data in that format than any of the others. my $factory = Bio::DB::EUtilities->new(-eutil => 'esearch', -term => 'h2afx[sym] AND human[organism]', -db => 'gene', -usehistory => 'y'); my $hist = $factory->next_History || die "No history data returned"; $factory->set_parameters(-eutil => 'efetch',-history => $hist); print Dumper $factory->get_Response; --Russell > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Carnë Draug > Sent: Thursday, 30 June 2011 10:13 a.m. 
> To: Brian Osborne > Cc: bioperl mailing list > Subject: Re: [Bioperl-l] How (from where) to retrieve FieldInfo > objects? > > > On Jun 29, 2011, at 4:53 PM, Carnë Draug wrote: > > > > Hi > > > > By using the einfo script that comes with bioperl I get a list of > > fields of info that I want to access > > > > $ einfo --database=gene > > > > If I understood correctly the man page of the script, this returns > the > > list of info I should be able to obtain when searching the 'gene' > > database. As such, I tried to use EUtilities esearch to access this > > info. > > > > Using the deobfuscator, I understood that I should create a > > Bio::Tools::EUtilities::Query EUtilities object with > > Bio::DB::EUtilities esearch. With this object I could use the > > get_FieldInfo method to obtain an array of FieldInfo objects or the > > next_FieldInfo method to obtain one of them. However, I'm failing > > miserably at this. My search returns no FieldInfo objects whatsoever. > > > > my $factory = Bio::DB::EUtilities->new( > > -eutil => 'esearch', > > -db => 'gene', > > -term => 'h2afx[sym] AND > > human[organism]', > > -retmax => 5, > > ); > > > > ## this returns nothing > > my @fields = $factory->get_FieldInfos; > > > > ## it also never gets into this loop > > while (my $field = $factory->next_FieldInfo) { > > say "hello"; > > } > > > > My question is how do I access the info fields mentioned when I use > > the einfo script. I can't find any info on this. > > 2011/6/29 Brian Osborne : > > Carne, something like: > > > > #!/usr/bin/perl > > use Bio::DB::EUtilities; > > # Get all the fields for a db: > > # > > my $factory = Bio::DB::EUtilities->new(-eutil => 'einfo', > > -db => 'genomeprj', ); > > $factory->print_all; > > Hi Brian > > thanks for your answer. Maybe I'm just misunderstanding the point of the > FieldInfo objects. 
The script you sent gives me a list of the Field > Infos, which if I understood correctly is the list of infos I can > obtain for a specific database (genomeprj in your case). But when I > search that specific database, how do I get that info? > > For example, running the code you mentioned, one of the Field Infos has > the name 'Sequencing Center' and the code CEN. So when I search the > genomeprj database with a query, how do I get that value from the > results? > > Thanks > Carnë > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l ======================================================================= Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately. ======================================================================= From carandraug+dev at gmail.com Wed Jun 29 19:39:00 2011 From: carandraug+dev at gmail.com (Carnë Draug) Date: Thu, 30 Jun 2011 00:39:00 +0100 Subject: [Bioperl-l] How (from where) to retrieve FieldInfo objects? In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF3396074D238@exchsth.agresearch.co.nz> References: <58D258F0-7A80-4CEC-ACF6-99110AA623A4@verizon.net> <18DF7D20DFEC044098A1062202F5FFF3396074D238@exchsth.agresearch.co.nz> Message-ID: On 29 June 2011 23:30, Smithies, Russell wrote: > How about just returning ASN.1 then parsing that? > There's far more data in that format than any of the others. 
> > my $factory = Bio::DB::EUtilities->new(-eutil => 'esearch', > -term => 'h2afx[sym] AND human[organism]', > -db => 'gene', > -usehistory => 'y'); > > > my $hist = $factory->next_History || die "No history data returned"; > > $factory->set_parameters(-eutil => 'efetch',-history => $hist); > > print Dumper $factory->get_Response; When I do this, I get an XML document with the ASN.1 inside the pre tag. Is it supposed to be this way? Should I extract it myself? Shouldn't the method do this? It's nice that I can get so much information, but wouldn't it be lighter on the NCBI server if I could ask only for the info that I need rather than the whole record? Also, I still can't understand what the point of einfo is, and when/why one would get the FieldInfo objects from a search. Carnë From bernd.web at gmail.com Thu Jun 30 08:19:59 2011 From: bernd.web at gmail.com (Bernd Web) Date: Thu, 30 Jun 2011 14:19:59 +0200 Subject: [Bioperl-l] bp_classify_hits_kingdom.pl In-Reply-To: <992064A0-B165-461D-B1BB-130BAA1CD98F@gmail.com> References: <7EAD4222-BDD8-497F-A0E6-AE94EA4D0D10@gwdg.de> <8283D9A4-34C2-4AC5-AF0B-B1662D5393C6@gmail.com> <992064A0-B165-461D-B1BB-130BAA1CD98F@gmail.com> Message-ID: Hi Jason, I was reading through your script and running it. I noticed that output from the new BLASTP+, at least, is parsed incorrectly. The % positives field is missing. All fields are: # Fields: query id, subject ids, % identity, % positives, alignment length, mismatches, gap opens, q. start, q. end, s. start, s. end, evalue, bit score So my ($qname,$hname,$pid,$qaln,$mismatch,$gaps, $qstart,$qend,$hstart,$hend, $evalue,$bits,$score) = split(/\t/,$_); should be: my ($qname,$hname,$pid, $posid, $qaln,$mismatch,$gaps, $qstart,$qend,$hstart,$hend, $evalue,$bit_score) = split(/\t/,$_); $evalue was set to $hend instead of the E-value, and $bit_score is a single field. 
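The column shift described above is easier to avoid when the split is driven by a list of column names. The following is a minimal sketch, assuming the 13-column layout given in the "# Fields:" header quoted above; parse_hit_line and the hash keys are hypothetical names, not taken from bp_classify_hits_kingdom.pl, and the sample line is made up for illustration.

```perl
use strict;
use warnings;

# Column order as listed in the "# Fields:" header quoted above (13 columns).
my @COLUMNS = qw(qname hname pid ppos aln_len mismatch gaps
                 qstart qend hstart hend evalue bits);

# parse_hit_line() is a hypothetical helper: it returns a hash reference
# keyed by column name, or undef for comment, blank, or malformed lines.
sub parse_hit_line {
    my ($line) = @_;
    chomp $line;
    return if $line =~ /^#/ || $line !~ /\S/;
    my @values = split /\t/, $line;
    return unless @values == @COLUMNS;    # reject unexpected column counts
    my %hit;
    @hit{@COLUMNS} = @values;
    return \%hit;
}

# Made-up example line with the 13 tab-separated fields.
my $example = join "\t", qw(q1 s1 98.5 99.0 200 3 0 1 200 5 204 1e-100 390);
my $hit = parse_hit_line($example);
print "$hit->{qname} vs $hit->{hname}: evalue=$hit->{evalue} bits=$hit->{bits}\n";
```

Rejecting lines with an unexpected number of columns means that 12-column m8 input fails loudly instead of silently shifting the E-value into the wrong variable.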
Cheers, Bernd On Wed, Jun 22, 2011 at 10:07 PM, Jason Stajich wrote: > Bernd - oops - thanks very much for noticing this - I was too fast in copy & paste. I see another typo in there now that the midday light is shining on the code that I'll fix. > > Should be able to check in this in a second. > > Jason > > On Jun 22, 2011, at 11:07 AM, Bernd Web wrote: > >> Hi Jason, >> >> I did GI to TAX mapping in Perl alone. Nice to know this script >> exists. Thanks for this. >> Just one question, I noticed on >> https://github.com/bioperl/bioperl-live/blob/master/scripts/taxa/classify_hits_kingdom.PLS: >> >> line 96: my $dbh = tie(%gi2node, 'DB_File', 'gi2class'); >> and >> line 100: my $dbh2 = my $dbh = >> DBI->connect("dbi:SQLite:dbname=$giidxfile","",""); >> >> So the second $dbh masks earlier declaration. >> >> >> Cheers, >> Bernd From jason.stajich at gmail.com Thu Jun 30 13:10:40 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Thu, 30 Jun 2011 10:10:40 -0700 Subject: [Bioperl-l] bp_classify_hits_kingdom.pl In-Reply-To: References: <7EAD4222-BDD8-497F-A0E6-AE94EA4D0D10@gwdg.de> <8283D9A4-34C2-4AC5-AF0B-B1662D5393C6@gmail.com> <992064A0-B165-461D-B1BB-130BAA1CD98F@gmail.com> Message-ID: <6DC7A063-0550-4D1E-A2AA-8B0A711F0FFB@gmail.com> Bernd - Thanks for seeing it. However, it was written for the original BLAST m8 format, so supporting differently formatted input would require an option to handle either format, and/or manipulation of the input, I would say. So some more code. I'm not going to work on this, but if you want to report it as a bug, it can be tracked and someone can then volunteer to work on it and submit a patch via git. Thanks. On Jun 30, 2011, at 5:19 AM, Bernd Web wrote: > Hi Jason, > > I was reading through your script and running it. > I noticed that output from the new BLASTP+, at least, is parsed incorrectly. > The % positives field is missing. 
All fields are: > > # Fields: query id, subject ids, % identity, % positives, alignment > length, mismatches, gap opens, q. start, q. end, s. start, s. end, > evalue, bit score > > So > my ($qname,$hname,$pid,$qaln,$mismatch,$gaps, > $qstart,$qend,$hstart,$hend, > $evalue,$bits,$score) = split(/\t/,$_); > > should be: > my ($qname,$hname,$pid, $posid, $qaln,$mismatch,$gaps, > $qstart,$qend,$hstart,$hend, > $evalue,$bit_score) = split(/\t/,$_); > > $evalue was set to $hend instead of the E-value, and $bit_score is a single field. > > > Cheers, > Bernd > > On Wed, Jun 22, 2011 at 10:07 PM, Jason Stajich wrote: >> Bernd - oops - thanks very much for noticing this - I was too fast in copy & paste. I see another typo in there now that the midday light is shining on the code that I'll fix. >> >> Should be able to check in this in a second. >> >> Jason >> >> On Jun 22, 2011, at 11:07 AM, Bernd Web wrote: >> >>> Hi Jason, >>> >>> I did GI to TAX mapping in Perl alone. Nice to know this script >>> exists. Thanks for this. >>> Just one question, I noticed on >>> https://github.com/bioperl/bioperl-live/blob/master/scripts/taxa/classify_hits_kingdom.PLS: >>> >>> line 96: my $dbh = tie(%gi2node, 'DB_File', 'gi2class'); >>> and >>> line 100: my $dbh2 = my $dbh = >>> DBI->connect("dbi:SQLite:dbname=$giidxfile","",""); >>> >>> So the second $dbh masks earlier declaration. >>> >>> >>> Cheers, >>> Bernd From randy.hancock at gmail.com Thu Jun 30 14:41:43 2011 From: randy.hancock at gmail.com (randy909) Date: Thu, 30 Jun 2011 11:41:43 -0700 (PDT) Subject: [Bioperl-l] debian package for bioperl-db and bioperl-ext Message-ID: <32364563-3050-4805-85ec-1fedd24587b6@h12g2000pro.googlegroups.com> Hi, I need a debian package for bioperl-db and bioperl-ext. Is there somewhere I can get these or get help on creating the packages myself? 
Thanks, Randy From plessy at debian.org Thu Jun 30 18:37:54 2011 From: plessy at debian.org (Charles Plessy) Date: Fri, 1 Jul 2011 07:37:54 +0900 Subject: [Bioperl-l] debian package for bioperl-db and bioperl-ext In-Reply-To: <32364563-3050-4805-85ec-1fedd24587b6@h12g2000pro.googlegroups.com> References: <32364563-3050-4805-85ec-1fedd24587b6@h12g2000pro.googlegroups.com> Message-ID: <20110630223754.GA19757@merveille.plessy.net> On Thu, Jun 30, 2011 at 11:41:43AM -0700, randy909 wrote: > Hi, I need a debian package for bioperl-db and bioperl-ext. Is there > somewhere I can get these or get help on creating the packages > myself? Dear Randy, you are most welcome to contact the Debian Med team, which already maintains Debian packages for bioperl and bioperl-run. Our electronic address is debian-med at lists.debian.org. It is a publicly archived mailing list, see «http://lists.debian.org/debian-med/». Have a nice day, -- Charles Plessy Debian Med packaging team, http://www.debian.org/devel/debian-med Tsurumi, Kanagawa, Japan