From duxroq at hotmail.com Sun May 1 18:26:02 2011
From: duxroq at hotmail.com (duxroq)
Date: Sun, 1 May 2011 15:26:02 -0700 (PDT)
Subject: [Bioperl-l] Clustalw problems!!!
In-Reply-To:
References: <31509401.post@talk.nabble.com>
Message-ID: <31519669.post@talk.nabble.com>

Thank you so much for your help, it now runs perfectly :-) I guess I misunderstood how CLUSTALWDIR works.

Chris Fields-5 wrote:
>
> I don't have access to a Windows machine to test this, unfortunately. I
> did notice you set CLUSTALWDIR to the actual executable, NOT the directory
> it is in. Also, the executable name is 'clustalw.exe', not 'clustalw', so
> possibly change that prior to instantiation:
>
> $Bio::Tools::Run::Alignment::ClustalW::PROGRAM_NAME = 'clustalw.exe';
>
> Maybe that's a start?
>
> chris
>
> On Apr 29, 2011, at 5:45 PM, duxroq wrote:
>
>>
>> Hi, I'm not sure whether my program is not finding the clustal w executable
>> or if it is having trouble with the module itself. Here is my error, in the
>> image below:
>>
>> http://old.nabble.com/file/p31509401/Untitled.png Untitled.png
>>
>> here is my code:
>>
>>
>> #!/usr/bin/local/perl -w
>>
>> use Bio::Perl;
>> use Bio::AlignIO;
>> use Bio::Root::IO;
>> use Bio::Seq;
>> use Bio::SeqIO;
>> use Bio::SimpleAlign;
>> use Bio::TreeIO;
>>
>> #--------------------------------------------------------------------------------#
>>
>> # Main
>>
>> unless(($#ARGV + 1) > 1) {
>>     print_start_message();
>>     exit;
>> }
>>
>> BEGIN { $ENV{CLUSTALDIR} = 'C:\Program Files\clustalw.exe'} #-Set CLUSTALDIR to correct directory
>> use Bio::Tools::Run::Alignment::Clustalw;
>>
>> my $file_name1 = $ARGV[0]; #-name of file containing all sequences
>> my @sequences = read_all_sequences($file_name1,'fasta');
>>
>> # alt_revcom(\@sequences); #-reverse complement of every other seq
>> # print_sequences(@sequences);
>>
>> my @params = ('outfile' => 'mult_aln.aln'); #-sets parameters for alignment factory
>>
>> ###################################################
>> #
>> #
>> #
>> #The error is probably in the next few lines! agh!#
>> #
>> #
>> #
>> ###################################################
>> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
>> $clustalfound = Bio::Tools::Run::Alignment::Clustalw->exists_clustal();
>> if ($clustalfound) {
>>     print "\n we found it!!!\n" }
>> my $aln = $factory->align(\@sequences);
>> print "\nDevins name is bob. \n";
>> #-creates alignment
>> print "\nPercentage Identity:\n",$aln->percentage_identity,"\n\n";
>>
>> my $cons_str = $aln->consensus_iupac(); #-configures consensus using IUPAC codes
>> my $cons_name = "Consensus_".save_id(@sequences);
>> my $cons_seq = new_sequence($cons_str, $cons_name);
>>
>> write_seq_to_file(">cons.fa",$cons_seq); #-writes consensus to file
>>
>> my $file_name2 = $ARGV[1];
>> my $lead_seq = read_sequence($file_name2,'fasta'); #-compare consensus sequence to leader
>>
>> # print_sequences($lead_seq);
>> # print_sequences($cons_seq);
>>
>> my @lead_cons_seqs = ($lead_seq, $cons_seq); #-forms array for alignment of leader and consensus
>> @params = ('pairgap' => 50);
>>
>> print "\n bobisnotmyname\n";
>>
>> $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
>> my $aln_lead_cons = $factory->align(\@lead_cons_seqs);
>> print "\nPercentage Identity:\n",$aln_lead_cons->percentage_identity,"\n\n";
>>
>> my $seq_len = $aln_lead_cons->length();
>>
>> my @aln_seqs = (); #-array of aligned sequences, including gaps
>> my $i = 0;
>> foreach $seq ($aln_lead_cons->each_seq() ) {
>>     $aln_seqs[$i] = $seq;
>>     $i++;
>> }
>> # print_sequences(@aln_seqs);
>>
>> my $l_aln_str = ''; #-str of leader sequence from alignment
>> my $c_aln_str = ''; #-str of consensus sequence from alignment
>> $found = 0;
>> $i = 1;
>> while ($found == 0 && $i < $seq_len) {
>>
>>     $l_aln_str = substr($aln_seqs[0]->seq(),$i,1); #-gets a substring from l_aln_str
>>     if ($l_aln_str !~ m/\./i) {
>>         #-checks if substring has gap characters
>>         $cons_slice_str = substr($aln_seqs[1]->seq(),$i,490);
>>         $found = 1; #-retrieves 490 characters of consensus where
>>     }
>>     $i++; #-leader begins in alignment
>> }
>>
>> $cons_slice_seq = new_sequence($cons_slice_str,"Sliced_".$cons_name);
>> # print_sequences($cons_slice_seq);
>> write_seq_to_file(">cons_slice.fa",$cons_slice_seq);
>>
>>
>> # End Main
>>
>> #--------------------------------------------------------------------------------#
>>
>> # Subroutines
>>
>> sub alt_revcom {
>>     my @sequences = @_;
>>     for ( $i=0; $i <= $#sequences; $i++) {
>>         if ($i%2==1) {
>>             $sequences[$i] = reverse_complement($sequences[$i]);
>>         }
>>     }
>> }
>>
>> sub save_id {
>>     my @sequences = @_;
>>     $id = $sequences[0]->display_id;
>>     return $id;
>> }
>>
>> sub print_sequences {
>>     my @sequences = @_;
>>     for ($i = 0; $i <= $#sequences; $i++) {
>>         print "Sequence name:",$sequences[$i]->display_id,"\n";
>>         print "Sequence acc:",$sequences[$i]->accession_number,"\n";
>>         print $sequences[$i]->seq(),"\n";
>>     }
>> }
>>
>> sub write_seq_to_file {
>>     my ($file_name,$seq) = @_;
>>     write_sequence($file_name,'fasta',$seq);
>>     print "\n",$seq->display_id," written to file.\n\n";
>> }
>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Clustalw-problems%21%21%21-tp31509401p31509401.html
>> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

--
View this message in context: http://old.nabble.com/Clustalw-problems%21%21%21-tp31509401p31519669.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

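A minimal sketch of the point resolved in this thread: the environment variable should hold the directory containing the ClustalW executable, not the path of the executable itself. The helper `split_exe_path` is hypothetical (it is not a BioPerl function), and the path is the one from the original script. The sketch uses only core Perl and does not load BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: split a full path to an
# executable into (directory, file name). Both '/' and '\' are accepted
# as separators so the Windows path from the thread works anywhere.
sub split_exe_path {
    my ($exe_path) = @_;
    my ($dir, $name) = $exe_path =~ m{^(.*)[\\/]([^\\/]+)$}
        or die "Not a full path: $exe_path\n";
    return ($dir, $name);
}

# Path as posted in the original script (single quotes keep the backslashes).
my ($dir, $name) = split_exe_path('C:\Program Files\clustalw.exe');

# Assumption: the directory part is what CLUSTALDIR should contain.
$ENV{CLUSTALDIR} = $dir;

print "CLUSTALDIR = $ENV{CLUSTALDIR}\n";
print "program    = $name\n";
```

In the posted script the assignment would stay inside the BEGIN block ahead of `use Bio::Tools::Run::Alignment::Clustalw;`, as the poster already had it. Whether the program name also needs the `.exe` suffix on Windows is Chris's suggestion above and is not tested here.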
From florent.angly at gmail.com Sun May 1 22:17:56 2011
From: florent.angly at gmail.com (Florent Angly)
Date: Mon, 02 May 2011 12:17:56 +1000
Subject: [Bioperl-l] Convert fastq to fasta
In-Reply-To: <31492543.post@talk.nabble.com>
References: <31492543.post@talk.nabble.com>
Message-ID: <4DBE1454.3010800@gmail.com>

You need to use this module: Bio::SeqIO::fastq
Read this for explanations:
http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/SeqIO/fastq.html
Regards,
Florent

On 28/04/11 10:26, perlbio007 wrote:
> I am new to Bioperl. Pls help.
> I have a zip folder of sequences which is in fastq format. I need to convert
> it to fasta format.
> How do I do that using bioperl? What module do I need?

From bioinfo.khush at gmail.com Mon May 2 02:02:28 2011
From: bioinfo.khush at gmail.com (khush ........)
Date: Mon, 2 May 2011 11:32:28 +0530
Subject: [Bioperl-l] Bioperl-l Digest, Vol 96, Issue 28
In-Reply-To: <4DBC991B.80002@gmail.com>
References: <4DBC991B.80002@gmail.com>
Message-ID:

Dear Florent,

Thanks for your reply. Yes, it's clustalw, but I have clustalw2 and clustalx installed on my fc13 machine. I am not sure where to set the path for these. I have some 400 nucleotide sequences to analyse, which is why I found this script useful. Help me...

Thank you
Kamal

On Sun, May 1, 2011 at 4:49 AM, Florent Angly wrote:

> Kamal,
> It looks like you have a typo somewhere: what is 'clustaw'? You probably
> mean 'clustalw'.
> Florent
>
>
>
> On 29/04/11 16:34, khush ........ wrote:
>
>> Dear,
>>
>> I am trying to calculate the Ka/Ks ratio of my sequences aligned by
>> clustalx.
>>
>> I am using the script given at
>> https://github.com/bioperl/bioperl-live/blob/master/scripts/utilities/pairwise_kaks.PLS
>>
>> When I try to run it, it alerts me to change the line
>>
>> "warn("Could not find the executable for $aln_prog, make sure you have
>> installed it and have either set ".uc($aln_prog)."DIR or it is in your
>> PATH");"
>>
>> "Could not find the executable for clustaw, make sure you have installed
>> it and have either set CLUSTAWDIR or it is in your PATH at kaks.pl line 52."
>>
>> I have clustalw2 and clustalx installed on my system. How and where do I
>> set the path for these, and how do I calculate the Ka/Ks ratio for my
>> sequences?
>>
>> Thank you
>> Kamal
>>
>>
>> On Fri, Apr 29, 2011 at 11:16 AM, wrote:
>>
>>> Send Bioperl-l mailing list submissions to
>>> bioperl-l at lists.open-bio.org
>>>
>>> To subscribe or unsubscribe via the World Wide Web, visit
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>> or, via email, send a message with subject or body 'help' to
>>> bioperl-l-request at lists.open-bio.org
>>>
>>> You can reach the person managing the list at
>>> bioperl-l-owner at lists.open-bio.org
>>>
>>> When replying, please edit your Subject line so it is more specific
>>> than "Re: Contents of Bioperl-l digest..."
>>>
>>>
>>> Today's Topics:
>>>
>>> 1. Re: GSoC/BioPerl Reorganization Project (Sheena Scroggins)
>>> 2. Re: GSoC/BioPerl Reorganization Project (Chris Fields)
>>> 3. Re: GSoC/BioPerl Reorganization Project (Robert Buels)
>>> 4. Re: GSoC/BioPerl Reorganization Project (Siddhartha Basu)
>>> 5. Re: Standalone blast (khush ........)
>>> 6. Re: GSoC/BioPerl Reorganization Project (Robert Buels)
>>> 7. Re: Standalone blast (Florent Angly)
>>> 8. Re: Standalone blast (khush ........)
>>> >>> >>> ---------------------------------------------------------------------- >>> >>> Message: 1 >>> Date: Thu, 28 Apr 2011 12:53:49 -0700 >>> From: Sheena Scroggins >>> Subject: Re: [Bioperl-l] GSoC/BioPerl Reorganization Project >>> To: Chris Fields >>> Cc: bioperl-l at lists.open-bio.org >>> Message-ID: >>> Content-Type: text/plain; charset=ISO-8859-1 >>> >>> Chris, >>> >>> We haven't talked much about the versioning yet, but it will be on the >>> list >>> to figure out asap. >>> >>> So far, the plan is to split out Bio::Root first, followed by a couple >>> modules that depend only on Bio::Root. The plan I proposed was Bio::Das, >>> Bio::Event then Bio::Location. Depending on how much time is remaining >>> for >>> the GSoC project, the next to split out would be Bio::Factory and >>> Bio::Coordinate, because they depend on Bio::Root and Bio::Location. I >>> plan >>> to still help with the reorganization after the internship is over, but I >>> obviously have to have a stopping point for the GSoC project. >>> >>> Rob provide me with a really nice scrip to list dependencies of the >>> modules, >>> so I plan to make a roadmap towards to end of the summer that will help >>> guide the rest of the reorganization. At that point, we'll have to deal >>> with >>> the circular dependencies carefully. >>> >>> This is a huge project, much bigger than I can do in one summer. But I >>> plan >>> to get it started in a way that makes it easy for others to contribute. >>> >>> Sheena >>> >>> >>> On Wed, Apr 27, 2011 at 12:35 PM, Chris Fields>> >>>> wrote: >>>> Sheena, >>>> >>>> Congrats on being accepted! We've talked about doing this over the >>>> years, >>>> but it's not an easy task and it needs a dedicated project to get the >>>> >>> ball >>> >>>> rolling, so to speak. Hopefully this isn't tl;dr. 
I'll start off with >>>> a >>>> few of my questions/thoughts (Rob could probably chime in as well, but I >>>> think his general thoughts on the project parallel mine): >>>> >>>> 1) The current BioPerl CPAN could just be a simple install script, >>>> acting >>>> like a 'Task' or 'Bundle' module, installing the actual Bio-specific >>>> distributions. Doing it this way would allow you to iteratively split >>>> >>> off >>> >>>> additional code but retain the original Task/Bundle-based approach to >>>> installation. For instance, the first pass could split out Root, then >>>> >>> have >>> >>>> a dependency-light and 'extras' distribution, 2nd round split further >>>> >>> based >>> >>>> on function, and so on: >>>> >>>> 1st round (v 1.9) : BioPerl (just an installer) -> installs root, >>>> min-deps, extra-deps >>>> 2nd round (v 1.901) : BioPerl (just an installer) -> root, >>>> seq/feature, >>>> other-min-deps, extra-deps >>>> ... >>>> Xth round (v 1.99) : BioPerl (just an installer) -> root, tools, >>>> seq, >>>> tree, align, coord, map, everything-else >>>> ... >>>> >>>> Also, one could potentially install modules in various ways: >>>> >>> interactively, >>> >>>> in predetermined groups, using a user-defined list, etc (one could >>>> effectively create custom BioPerl installs for GBrowse or other tools >>>> for >>>> instance). Of course I would only pick the easiest route to start, but >>>> maybe that gives some ideas. Regardless, if the dependency tree is set >>>> >>> up >>> >>>> correctly any reliance on other Bio* modules would be defined in the >>>> >>> various >>> >>>> Build.PL/Makefile.PL and then installed via CPAN (as is any dependency). >>>> >>>> 2) The Bio::Root modules are probably the true core modules and are the >>>> most stable with regards to changes, so those could be moved to >>>> something >>>> like BioPerl-Core. Beyond that, what are the proposed splits? 
(we've >>>> discussed this on-list before, but it's appropriate to bring this up >>>> >>> again) >>> >>>> 3) How do we want to handle versioning? We can't (and probably >>>> >>> shouldn't) >>> >>>> release everything on a synchronized versioning scheme (via >>>> Bio::Root::Version, for instance), that'll quickly fall apart. >>>> >>> Personally I >>> >>>> can foresee each split-off dist having it's own version, with the >>>> BioPerl >>>> network of modules being in effect it's own mini-CPAN. >>>> >>>> 5) Related to versioning, in my opinion we should maybe aim on >>>> eventually >>>> calling this BioPerl v2.0 and starting with a simpler X.Y versioning >>>> >>> scheme. >>> >>>> Lincoln has already done something like this with Bio::Graphics, which >>>> >>> was >>> >>>> originally part of BioPerl but split off prior to v 1.6.0. >>>> >>>> 6) In some cases I can see particularly thorny problems, such as >>>> circular >>>> dependencies. I can think of a few ways to address that (creating a >>>> >>> simple >>> >>>> lightweight Bio::Species class as a fallback if Bio::Tree code isn't >>>> present, for instance), but any additional thoughts on this would be >>>> helpful. >>>> >>>> 7) Do we want to set up something like 'git submodule' for the devs to >>>> >>> pull >>> >>>> down all BioPerl-relevant code? >>>> >>>> Other thoughts? >>>> >>>> chris >>>> >>>> On Apr 27, 2011, at 12:17 AM, Sheena Scroggins wrote: >>>> >>>> Hey everyone, >>>>> >>>>> I wanted to take a minute to introduce myself as one of the Google >>>>> >>>> Summer >>> >>>> of >>>> >>>>> Code interns. I was the lucky one chosen to work on the BioPerl >>>>> Reorganization (*crowd cheers*). I am a grad student in bioinformatics, >>>>> >>>> and >>>> >>>>> somewhat new to this level of programming so bear with me as I learn >>>>> >>>> the >>> >>>> technical jargon. Luckily I have both Rob and Chris to mentor me this >>>>> summer! 
>>>>> >>>>> Reading through the mailing list archives, I see there have been many >>>>> discussion and differing opinions about tackling this project. Given >>>>> >>>> the >>> >>>> time frame for GSoC and my limited experience, there is no way I will >>>>> complete this project on my own but I will at least be able to start >>>>> >>>> it, >>> >>>> which will hopefully motivate others to pitch in. So far, the plan for >>>>> >>>> the >>>> >>>>> GSoC project is to start by breaking out Bio::Root, followed by a >>>>> >>>> couple >>> >>>> other modules based on their dependencies and the time allowed. Each >>>>> >>>> will >>> >>>> be >>>> >>>>> published to CPAN independently. You can follow the project (once it >>>>> >>>> starts) >>>> >>>>> on github at https://github.com/sheenams. >>>>> >>>>> I look forward to collaborating with many of you on the reorganization >>>>> >>>> (hint >>>> >>>>> hint)! >>>>> >>>>> Sheena >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>> >>>> >>>> >>> ------------------------------ >>> >>> Message: 2 >>> Date: Thu, 28 Apr 2011 16:04:51 -0500 >>> From: Chris Fields >>> Subject: Re: [Bioperl-l] GSoC/BioPerl Reorganization Project >>> To: Sheena Scroggins >>> Cc: BioPerl List, Robert Buels >>> >>> Message-ID:<1FF62DC3-941A-4DCB-8464-89D220E4A9C5 at illinois.edu> >>> Content-Type: text/plain; charset="us-ascii" >>> >>> Sounds fine; I think (as you indicate) we can deal with issues along the >>> way. Rob, anything to add? >>> >>> chris >>> >>>
>>> ------------------------------ >>> >>> Message: 3 >>> Date: Thu, 28 Apr 2011 16:19:51 -0700 >>> From: Robert Buels >>> Subject: Re: [Bioperl-l] GSoC/BioPerl Reorganization Project >>> To: Chris Fields >>> Cc: Sheena Scroggins, BioPerl List >>> >>> Message-ID:<4DB9F617.6070705 at cornell.edu> >>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >>> >>> I think you guys are on the right track, here are some slightly more >>> detailed plans. I'll use Chris's subject numbering. >>> >>> 1,2,3,5.) I envision the splitting algorithm going like this: >>> >>> no strict; # this is pseudocode! >>> >>> my $split_count = 0; >>> for $subsystem (qw( Bio::Root Bio::Das Bio::Event ... )) { >>> >>> - take $subsystem modules and tests out of bioperl-live >>> >>> (my $new_dist_name = $subsystem) =~ s/::/-/g; >>> - extract $subsystem modules into new dist called >>> $new_dist_name. Make sure all its tests pass, and write >>> some more tests if necessary. >>> >>> - add dep on $subsystem to bioperl-live/Build.PL >>> >>> - push $new_dist_name and bioperl-live to CPAN. >>> $new_dist_name has version '2.000', and bioperl-live has >>> version "1.7.$split_count". >>> } >>> >>> and then, at the end of this loop, bioperl-live will be >>> nothing but a Build.PL and a couple of other things >>> for backcompat, like Bio::Root::Version, Bio::Perl, etc. >>> >>> Important things to notice about this algorithm are that, at each >>> step in the loop: >>> >>> a.) 
For users that install bioperl with CPAN, >>> doing cpan 'Bio::Perl' or cpan 'Bio::Root::Version' will >>> get you the same set of modules as before the split >>> started, with the split-off modules at 2.000 versions, and >>> the non-split-off ones at 1.7.x versions. >>> >>> b.) For users (not developers) that are git cloning >>> bioperl-live, even though they are naughty (wink), they >>> can do 'perl Build.PL; ./Build installdeps' to get the >>> split-off modules, downloaded like any other CPAN >>> dependency. There may be some lag before the split-off >>> thing is downloadable from CPAN, >>> >>> c.) For BioPerl developers, unless they are working on a >>> certain module, they should install the split-off modules >>> from CPAN like everybody else, and git clone only the piece >>> they are working on. >>> >>> d.) The version of bioperl-live keeps increasing by 0.001 with >>> each split. The systems that are split off have a 2.x >>> version number, each slightly different depending on when it >>> was split off. After this point, their release schedules >>> and version numbers are independent of eachother and of >>> bioperl-live. For Bio::Perl and Bio::Root::Version, the >>> things that stay in bioperl-live, installing the latest >>> version will get you all the split-off modules. >>> >>> >>> 6.) (thorny circular dependencies and stuff) Those will become quickly >>> apparent as this process proceeds. They'll take some finesse and/or >>> ruthlessness and/or hacking to get around. We'll burn those bridges as >>> we come to them. >>> >>> 7.) (git submodules) Git submodules probably won't be necessary, since >>> at each step in the process BioPerl devs can use ./Build installdeps or >>> cpanm --installdeps . 
to install whatever the dependencies are for the >>> piece they are working on, whether it's bioperl-live (in the case of a >>> module that has not yet been split off), or one of the distributions >>> that has already been split off (in which case their improvements will >>> probably be releasable to CPAN immediately!). >>> >>> Lots of detail there. I tried to make it structured and easy to skim >>> though. Thoughts? >>> >>> Rob >>> >>> >>>
>>> ------------------------------ >>> >>> Message: 4 >>> Date: Thu, 28 Apr 2011 21:15:01 -0500 >>> From: Siddhartha Basu >>> Subject: [Bioperl-l] Re: GSoC/BioPerl Reorganization Project >>> To: bioperl-l at lists.open-bio.org >>> Message-ID:<20110429021457.GA351 at Macintosh-235.local> >>> Content-Type: text/plain; charset=us-ascii >>> >>> Hi Robert, >>> At what point in flow the dependencies between the split modules will be >>> added. Is there any particular order the split modules would be created. 
>>> And how those split off modules will be released in CPAN, one by one as >>> they being generated or all of them in a batch after which they will >>> follow their release schedule. >>> >>> -siddhartha >>> >>> >>> >>> On Thu, 28 Apr 2011, Robert Buels wrote: >>> >>> I think you guys are on the right track, here are some slightly more >>>> detailed plans. I'll use Chris's subject numbering. >>>> >>>> 1,2,3,5.) I envision the splitting algorithm going like this: >>>> >>>> no strict; # this is pseudocode! >>>> >>>> my $split_count = 0; >>>> for $subsystem (qw( Bio::Root Bio::Das Bio::Event ... )) { >>>> >>>> - take $subsystem modules and tests out of bioperl-live >>>> >>>> (my $new_dist_name = $subsystem) =~ s/::/-/g; >>>> - extract $subsystem modules into new dist called >>>> $new_dist_name. Make sure all its tests pass, and write >>>> some more tests if necessary. >>>> >>>> - add dep on $subsystem to bioperl-live/Build.PL >>>> >>>> - push $new_dist_name and bioperl-live to CPAN. >>>> $new_dist_name has version '2.000', and bioperl-live has >>>> version "1.7.$split_count". >>>> } >>>> >>>> and then, at the end of this loop, bioperl-live will be >>>> nothing but a Build.PL and a couple of other things >>>> for backcompat, like Bio::Root::Version, Bio::Perl, etc. >>>> >>>> Important things to notice about this algorithm are that, at each >>>> step in the loop: >>>> >>>> a.) For users that install bioperl with CPAN, >>>> doing cpan 'Bio::Perl' or cpan 'Bio::Root::Version' will >>>> get you the same set of modules as before the split >>>> started, with the split-off modules at 2.000 versions, and >>>> the non-split-off ones at 1.7.x versions. >>>> >>>> b.) For users (not developers) that are git cloning >>>> bioperl-live, even though they are naughty (wink), they >>>> can do 'perl Build.PL; ./Build installdeps' to get the >>>> split-off modules, downloaded like any other CPAN >>>> dependency. 
There may be some lag before the split-off >>>> thing is downloadable from CPAN, >>>> >>>> c.) For BioPerl developers, unless they are working on a >>>> certain module, they should install the split-off modules >>>> from CPAN like everybody else, and git clone only the piece >>>> they are working on. >>>> >>>> d.) The version of bioperl-live keeps increasing by 0.001 with >>>> each split. The systems that are split off have a 2.x >>>> version number, each slightly different depending on when it >>>> was split off. After this point, their release schedules >>>> and version numbers are independent of eachother and of >>>> bioperl-live. For Bio::Perl and Bio::Root::Version, the >>>> things that stay in bioperl-live, installing the latest >>>> version will get you all the split-off modules. >>>> >>>> >>>> 6.) (thorny circular dependencies and stuff) Those will become quickly >>>> apparent as this process proceeds. They'll take some finesse and/or >>>> ruthlessness and/or hacking to get around. We'll burn those bridges as >>>> >>> we >>> >>>> come to them. >>>> >>>> 7.) (git submodules) Git submodules probably won't be necessary, since >>>> at >>>> each step in the process BioPerl devs can use ./Build installdeps or >>>> >>> cpanm >>> >>>> --installdeps . to install whatever the dependencies are for the piece >>>> they are working on, whether it's bioperl-live (in the case of a module >>>> that has not yet been split off), or one of the distributions that has >>>> already been split off (in which case their improvements will probably >>>> be >>>> releasable to CPAN immediately!). >>>> >>>> Lots of detail there. I tried to make it structured and easy to skim >>>> though. Thoughts? >>>> >>>> Rob >>>> >>>> >>>> >>>> On 04/28/2011 02:04 PM, Chris Fields wrote: >>>> >>>>> Sounds fine; I think (as you indicate) we can deal with issues along >>>>> >>>> the >>> >>>> way. Rob, anything to add? 
>>>>> >>>>> chris >>>>> >>>>> On Apr 28, 2011, at 2:53 PM, Sheena Scroggins wrote: >>>>> >>>>> Chris, >>>>>> >>>>>> We haven't talked much about the versioning yet, but it will be on the >>>>>> list to figure out asap. >>>>>> >>>>>> So far, the plan is to split out Bio::Root first, followed by a couple >>>>>> modules that depend only on Bio::Root. The plan I proposed was >>>>>> >>>>> Bio::Das, >>> >>>> Bio::Event then Bio::Location. Depending on how much time is remaining >>>>>> for the GSoC project, the next to split out would be Bio::Factory and >>>>>> Bio::Coordinate, because they depend on Bio::Root and Bio::Location. I >>>>>> plan to still help with the reorganization after the internship is >>>>>> >>>>> over, >>> >>>> but I obviously have to have a stopping point for the GSoC project. >>>>>> >>>>>> Rob provide me with a really nice scrip to list dependencies of the >>>>>> modules, so I plan to make a roadmap towards to end of the summer that >>>>>> will help guide the rest of the reorganization. At that point, we'll >>>>>> >>>>> have >>> >>>> to deal with the circular dependencies carefully. >>>>>> >>>>>> This is a huge project, much bigger than I can do in one summer. But I >>>>>> plan to get it started in a way that makes it easy for others to >>>>>> contribute. >>>>>> >>>>>> Sheena >>>>>> >>>>>> >>>>>> On Wed, Apr 27, 2011 at 12:35 PM, Chris Fields >>>>>> wrote: >>>>>> Sheena, >>>>>> >>>>>> Congrats on being accepted! We've talked about doing this over the >>>>>> >>>>> years, >>> >>>> but it's not an easy task and it needs a dedicated project to get the >>>>>> ball rolling, so to speak. Hopefully this isn't tl;dr. 
I'll start >>>>>> >>>>> off >>> >>>> with a few of my questions/thoughts (Rob could probably chime in as >>>>>> >>>>> well, >>> >>>> but I think his general thoughts on the project parallel mine): >>>>>> >>>>>> 1) The current BioPerl CPAN could just be a simple install script, >>>>>> >>>>> acting >>> >>>> like a 'Task' or 'Bundle' module, installing the actual Bio-specific >>>>>> distributions. Doing it this way would allow you to iteratively split >>>>>> off additional code but retain the original Task/Bundle-based approach >>>>>> >>>>> to >>> >>>> installation. For instance, the first pass could split out Root, then >>>>>> have a dependency-light and 'extras' distribution, 2nd round split >>>>>> further based on function, and so on: >>>>>> >>>>>> 1st round (v 1.9) : BioPerl (just an installer) -> installs >>>>>> >>>>> root, >>> >>>> min-deps, extra-deps >>>>>> 2nd round (v 1.901) : BioPerl (just an installer) -> root, >>>>>> seq/feature, other-min-deps, extra-deps >>>>>> ... >>>>>> Xth round (v 1.99) : BioPerl (just an installer) -> root, tools, >>>>>> seq, tree, align, coord, map, everything-else >>>>>> ... >>>>>> >>>>>> Also, one could potentially install modules in various ways: >>>>>> interactively, in predetermined groups, using a user-defined list, etc >>>>>> (one could effectively create custom BioPerl installs for GBrowse or >>>>>> other tools for instance). Of course I would only pick the easiest >>>>>> >>>>> route >>> >>>> to start, but maybe that gives some ideas. Regardless, if the >>>>>> >>>>> dependency >>> >>>> tree is set up correctly any reliance on other Bio* modules would be >>>>>> defined in the various Build.PL/Makefile.PL and then installed via >>>>>> >>>>> CPAN >>> >>>> (as is any dependency). >>>>>> >>>>>> 2) The Bio::Root modules are probably the true core modules and are >>>>>> >>>>> the >>> >>>> most stable with regards to changes, so those could be moved to >>>>>> >>>>> something >>> >>>> like BioPerl-Core. 
Beyond that, what are the proposed splits? (we've >>>>>> discussed this on-list before, but it's appropriate to bring this up >>>>>> again) >>>>>> >>>>>> 3) How do we want to handle versioning? We can't (and probably >>>>>> shouldn't) release everything on a synchronized versioning scheme (via >>>>>> Bio::Root::Version, for instance), that'll quickly fall apart. >>>>>> Personally I can foresee each split-off dist having it's own version, >>>>>> with the BioPerl network of modules being in effect it's own >>>>>> >>>>> mini-CPAN. >>> >>>> 5) Related to versioning, in my opinion we should maybe aim on >>>>>> >>>>> eventually >>> >>>> calling this BioPerl v2.0 and starting with a simpler X.Y versioning >>>>>> scheme. Lincoln has already done something like this with >>>>>> >>>>> Bio::Graphics, >>> >>>> which was originally part of BioPerl but split off prior to v 1.6.0. >>>>>> >>>>>> 6) In some cases I can see particularly thorny problems, such as >>>>>> >>>>> circular >>> >>>> dependencies. I can think of a few ways to address that (creating a >>>>>> simple lightweight Bio::Species class as a fallback if Bio::Tree code >>>>>> isn't present, for instance), but any additional thoughts on this >>>>>> >>>>> would >>> >>>> be helpful. >>>>>> >>>>>> 7) Do we want to set up something like 'git submodule' for the devs to >>>>>> pull down all BioPerl-relevant code? >>>>>> >>>>>> Other thoughts? >>>>>> >>>>>> chris >>>>>> >>>>>> On Apr 27, 2011, at 12:17 AM, Sheena Scroggins wrote: >>>>>> >>>>>> Hey everyone, >>>>>>> >>>>>>> I wanted to take a minute to introduce myself as one of the Google >>>>>>> Summer of >>>>>>> Code interns. I was the lucky one chosen to work on the BioPerl >>>>>>> Reorganization (*crowd cheers*). I am a grad student in >>>>>>> >>>>>> bioinformatics, >>> >>>> and >>>>>>> somewhat new to this level of programming so bear with me as I learn >>>>>>> >>>>>> the >>> >>>> technical jargon. Luckily I have both Rob and Chris to mentor me this >>>>>>> summer! 
>>>>>>> >>>>>>> Reading through the mailing list archives, I see there have been many >>>>>>> discussion and differing opinions about tackling this project. Given >>>>>>> >>>>>> the >>> >>>> time frame for GSoC and my limited experience, there is no way I will >>>>>>> complete this project on my own but I will at least be able to start >>>>>>> >>>>>> it, >>> >>>> which will hopefully motivate others to pitch in. So far, the plan >>>>>>> >>>>>> for >>> >>>> the >>>>>>> GSoC project is to start by breaking out Bio::Root, followed by a >>>>>>> >>>>>> couple >>> >>>> other modules based on their dependencies and the time allowed. Each >>>>>>> will be >>>>>>> published to CPAN independently. You can follow the project (once it >>>>>>> starts) >>>>>>> on github at https://github.com/sheenams. >>>>>>> >>>>>>> I look forward to collaborating with many of you on the >>>>>>> >>>>>> reorganization >>> >>>> (hint >>>>>>> hint)! >>>>>>> >>>>>>> Sheena >>>>>>> _______________________________________________ >>>>>>> Bioperl-l mailing list >>>>>>> Bioperl-l at lists.open-bio.org >>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>>>> >>>>>> >>>>>> >>>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> >>> ------------------------------ >>> >>> Message: 5 >>> Date: Fri, 29 Apr 2011 10:23:50 +0530 >>> From: "khush ........" >>> Subject: Re: [Bioperl-l] Standalone blast >>> To: Dave Messina >>> Cc: bioperl-l at lists.open-bio.org >>> Message-ID: >>> Content-Type: text/plain; charset=ISO-8859-1 >>> >>> Dear Dave, >>> >>> Thank you for your support. 
>>> >>> If need to change the following lines like >>> >>> $blast_obj = Bio::Tools::Run::StandAloneBlast->new(-program => >>> 'blastx', >>> -database => 'nr.fa')); >>> >>> $seq_obj = Bio::Seq->new(-id =>"test query", -seq =>"file.fa"); >>> >>> I have a simple and basic query for you, as I am beginners in bioperl, >>> that >>> if I need to download the whole nr database from NCBI to run the code or >>> It >>> will directly fetch information from the NCBI website. I do not >>> understand >>> it, because downloading the whole nr d/b itself takes long time for me. >>> >>> How could I read whole file instead of simple string "TTTATAGATAGAGACAG" >>> in >>> -seq (a fasta file). Is there a simple way to do the exercise according >>> to >>> my conditions. >>> >>> Thank you >>> Kamal >>> >>> >>> On Thu, Apr 28, 2011 at 12:59 PM, Dave Messina>> >>>> wrote: >>>> Hi Kamal, >>>> >>>> This is covered in the beginners' HOWTO: >>>> http://www.bioperl.org/wiki/HOWTO:Beginners#BLAST >>>> >>>> >>>> Dave >>>> >>>> >>>> On Thu, Apr 28, 2011 at 07:22, khush ........>>> wrote: >>>> >>>> Hi, >>>>> >>>>> I have some sequences ~250 and wanted to use BLASTX to blast against nr >>>>> database of NCBI, as this is time consuming using web based search. Can >>>>> some >>>>> one please tell me how to start BIOPERL with scuh problems. I know that >>>>> this >>>>> is possible with bioperl, but do not know how. >>>>> >>>>> Any suggestion will be appreciable. 
>>>>> >>>>> Thanks in advance >>>>> Kamal >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>> >>>>> >>>> >>> ------------------------------ >>> >>> Message: 6 >>> Date: Thu, 28 Apr 2011 22:15:01 -0700 >>> From: Robert Buels >>> Subject: Re: [Bioperl-l] GSoC/BioPerl Reorganization Project >>> To: BioPerl List >>> Message-ID:<4DBA4955.2030003 at cornell.edu> >>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >>> >>> On 04/28/2011 07:15 PM, Siddhartha Basu wrote: >>> >>>> At what point in flow the dependencies between the split modules will be >>>> added. Is there any particular order the split modules would be created. >>>> >>> Dependencies are added and characterized at the time each distribution >>> is created. That's why the splitting order starts at Bio::Root, so that >>> you can proceed up the hierarchy of dependencies without having to >>> modify the dependency lists of the distributions that have already been >>> extracted. >>> >>> And how those split off modules will be released in CPAN, one by one as >>>> they being generated or all of them in a batch after which they will >>>> follow their release schedule. >>>> >>> One by one, as they are generated. I think it would be a good idea to >>> re-release bioperl-live with each split as well. This will probably >>> lead to bioperl-live being released nearly every week as the split is >>> ongoing. As a consequence, the master branch of bioperl-live will need >>> to be kept in very good shape. This is easy if you just follow good >>> practice: develop in branches, run *all* the tests before committing, go >>> on IRC and send pull requests for code review, etc. 
>>> >>> Rob >>> >>> >>> ------------------------------ >>> >>> Message: 7 >>> Date: Fri, 29 Apr 2011 15:24:45 +1000 >>> From: Florent Angly >>> Subject: Re: [Bioperl-l] Standalone blast >>> To: bioinfo.khush at gmail.com >>> Cc: bioperl-l at lists.open-bio.org >>> Message-ID:<4DBA4B9D.1010400 at gmail.com> >>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >>> >>> Hi Kamal, >>> >>> To run BLAST the way Dave described, you need to have BLAST installed on >>> your computer, and you need to download BLAST databases to your computer >>> (or make them yourself with the formatdb command). There are plenty of >>> databases available on the NCBI FTP website: ftp://ftp.ncbi.nih.gov/. >>> And yes, some of these databases are very large and will take a long >>> time to download. By the way, the BLAST may also take a very long time >>> to execute if you use large databases, so, you'd better run the analysis >>> on a powerful computer or a server. >>> >>> Also read this documentation: >>> >>> >>> http://search.cpan.org/~cjfields/BioPerl-1.6.900/Bio/Tools/Run/StandAloneBlast.pm >>> < >>> >>> http://search.cpan.org/%7Ecjfields/BioPerl-1.6.900/Bio/Tools/Run/StandAloneBlast.pm >>> It stipulates that you can BLAST an entire FASTA file (not just a >>> sequence object): >>> >>> $inputfilename = 't/testquery.fa'; >>> $blast_report = $factory->blastall($inputfilename); >>> >>> >>> Regards, >>> >>> Florent >>> >>> >>> >>> >>> On 29/04/11 14:53, khush ........ wrote: >>> >>>> Dear Dave, >>>> >>>> Thank you for your support. 
>>>> >>>> If need to change the following lines like >>>> >>>> $blast_obj = Bio::Tools::Run::StandAloneBlast->new(-program => >>>> >>> 'blastx', >>> >>>> -database => 'nr.fa')); >>>> >>>> $seq_obj = Bio::Seq->new(-id =>"test query", -seq =>"file.fa"); >>>> >>>> I have a simple and basic query for you, as I am beginners in bioperl, >>>> >>> that >>> >>>> if I need to download the whole nr database from NCBI to run the code or >>>> >>> It >>> >>>> will directly fetch information from the NCBI website. I do not >>>> >>> understand >>> >>>> it, because downloading the whole nr d/b itself takes long time for me. >>>> >>>> How could I read whole file instead of simple string "TTTATAGATAGAGACAG" >>>> >>> in >>> >>>> -seq (a fasta file). Is there a simple way to do the exercise according >>>> >>> to >>> >>>> my conditions. >>>> >>>> Thank you >>>> Kamal >>>> >>>> >>>> On Thu, Apr 28, 2011 at 12:59 PM, Dave Messina>>> wrote: >>>> >>>> Hi Kamal, >>>>> >>>>> This is covered in the beginners' HOWTO: >>>>> http://www.bioperl.org/wiki/HOWTO:Beginners#BLAST >>>>> >>>>> >>>>> Dave >>>>> >>>>> >>>>> On Thu, Apr 28, 2011 at 07:22, khush ........>>>> >>>> wrote: >>>> >>>>> Hi, >>>>>> >>>>>> I have some sequences ~250 and wanted to use BLASTX to blast against >>>>>> nr >>>>>> database of NCBI, as this is time consuming using web based search. >>>>>> Can >>>>>> some >>>>>> one please tell me how to start BIOPERL with scuh problems. I know >>>>>> that >>>>>> this >>>>>> is possible with bioperl, but do not know how. >>>>>> >>>>>> Any suggestion will be appreciable. 
>>>>>> >>>>>> Thanks in advance >>>>>> Kamal >>>>>> _______________________________________________ >>>>>> Bioperl-l mailing list >>>>>> Bioperl-l at lists.open-bio.org >>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>>> >>>>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> >>> >>> ------------------------------ >>> >>> Message: 8 >>> Date: Fri, 29 Apr 2011 11:16:38 +0530 >>> From: "khush ........" >>> Subject: Re: [Bioperl-l] Standalone blast >>> To: Florent Angly >>> Cc: bioperl-l at lists.open-bio.org >>> Message-ID: >>> Content-Type: text/plain; charset=ISO-8859-1 >>> >>> Dear Florent, >>> >>> Thank you very much for your kind reply and let me clear the concept of >>> running the blast. I am working with simple machine so I need to take >>> permission from my administrator to work on some good server to have >>> whole >>> nr database from NCBI and run the blastx. >>> >>> Thank you >>> >>> Kamal >>> Bioperl is great. >>> >>> >>> On Fri, Apr 29, 2011 at 10:54 AM, Florent Angly>> >>>> wrote: >>>> Hi Kamal, >>>> >>>> To run BLAST the way Dave described, you need to have BLAST installed on >>>> your computer, and you need to download BLAST databases to your computer >>>> >>> (or >>> >>>> make them yourself with the formatdb command). There are plenty of >>>> >>> databases >>> >>>> available on the NCBI FTP website: ftp://ftp.ncbi.nih.gov/. And yes, >>>> >>> some >>> >>>> of these databases are very large and will take a long time to download. >>>> >>> By >>> >>>> the way, the BLAST may also take a very long time to execute if you use >>>> large databases, so, you'd better run the analysis on a powerful >>>> computer >>>> >>> or >>> >>>> a server. 
>>>> >>>> Also read this documentation: >>>> >>>> >>> http://search.cpan.org/~cjfields/BioPerl-1.6.900/Bio/Tools/Run/StandAloneBlast.pm >>> < >>> >>> http://search.cpan.org/%7Ecjfields/BioPerl-1.6.900/Bio/Tools/Run/StandAloneBlast.pm >>> >>>> It stipulates that you can BLAST an entire FASTA file (not just a >>>> >>> sequence >>> >>>> object): >>>> >>>> $inputfilename = 't/testquery.fa'; >>>> $blast_report = $factory->blastall($inputfilename); >>>> >>>> >>>> Regards, >>>> >>>> Florent >>>> >>>> >>>> >>>> >>>> >>>> On 29/04/11 14:53, khush ........ wrote: >>>> >>>> Dear Dave, >>>>> >>>>> Thank you for your support. >>>>> >>>>> If need to change the following lines like >>>>> >>>>> $blast_obj = Bio::Tools::Run::StandAloneBlast->new(-program => >>>>> >>>> 'blastx', >>> >>>> -database => 'nr.fa')); >>>>> >>>>> $seq_obj = Bio::Seq->new(-id =>"test query", -seq =>"file.fa"); >>>>> >>>>> I have a simple and basic query for you, as I am beginners in bioperl, >>>>> that >>>>> if I need to download the whole nr database from NCBI to run the code >>>>> or >>>>> It >>>>> will directly fetch information from the NCBI website. I do not >>>>> >>>> understand >>> >>>> it, because downloading the whole nr d/b itself takes long time for me. >>>>> >>>>> How could I read whole file instead of simple string >>>>> "TTTATAGATAGAGACAG" >>>>> in >>>>> -seq (a fasta file). Is there a simple way to do the exercise according >>>>> >>>> to >>> >>>> my conditions. 
>>>>> >>>>> Thank you >>>>> Kamal >>>>> >>>>> >>>>> On Thu, Apr 28, 2011 at 12:59 PM, Dave Messina>>>> >>>>>> wrote: >>>>>> >>>>> Hi Kamal, >>>>> >>>>>> This is covered in the beginners' HOWTO: >>>>>> http://www.bioperl.org/wiki/HOWTO:Beginners#BLAST >>>>>> >>>>>> >>>>>> Dave >>>>>> >>>>>> >>>>>> On Thu, Apr 28, 2011 at 07:22, khush ........>>>>> >>>>>>> wrote: >>>>>>> >>>>>> Hi, >>>>>> >>>>>>> I have some sequences ~250 and wanted to use BLASTX to blast against >>>>>>> >>>>>> nr >>> >>>> database of NCBI, as this is time consuming using web based search. >>>>>>> >>>>>> Can >>> >>>> some >>>>>>> one please tell me how to start BIOPERL with scuh problems. I know >>>>>>> >>>>>> that >>> >>>> this >>>>>>> is possible with bioperl, but do not know how. >>>>>>> >>>>>>> Any suggestion will be appreciable. >>>>>>> >>>>>>> Thanks in advance >>>>>>> Kamal >>>>>>> _______________________________________________ >>>>>>> Bioperl-l mailing list >>>>>>> Bioperl-l at lists.open-bio.org >>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>> >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>> >>>>> >>>> >>> ------------------------------ >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> End of Bioperl-l Digest, Vol 96, Issue 28 >>> ***************************************** >>> >>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > From tejaminnu at gmail.com Mon May 2 14:12:03 2011 From: tejaminnu at gmail.com (sukeerthi teja Rallapalli) Date: Mon, 2 May 2011 14:12:03 -0400 Subject: [Bioperl-l] Doubt - Regarding Stand Alone Blast In-Reply-To: References: Message-ID: 
Dear Sir,

I am trying to create a stand-alone database on a Mac using BLAST. I have installed BLAST. I have also pasted my 2 files (the input file and the large database file) in the same folder as BLAST.

So far these are the commands that I have used:

"formatdb -i sequences.fasta -p T -o T

blastall -p blastp -d homologene_result.fasta -i sequences.fasta -o stdout -a 1 -e 10

OR

makeblastdb -in sequences.fasta -dbtype prot

blastn -query test -db homologene_result.fasta -out stdout -outfmt 6"

I think these are the two versions of BLAST, but I am using the latest version of BLAST on my system.

I am always getting an error report; this is what it says:

"sukeerthi-tejas-macbook-pro:bin sukeerthiteja$ blastall -p blastp -d homologene_result.fasta -i sequences.fasta -o stdout -a 1 -e 10
[NULL_Caption] WARNING: Unable to open homologene_result.fasta.pin
BLASTP 2.2.16 [Mar-25-2007]

Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.

Query= gi|293651546|ref|NP_001170840.1| nuclear factor NF-kappa-B p100
subunit isoform a [Mus musculus]
         (899 letters)

[NULL_Caption] WARNING: gi|293651546|ref|NP_001170840.1|: Unable to open homologene_result.fasta.pin"

Please help me understand.

Thanking you
Teja

From jonathan at leto.net Mon May 2 14:24:40 2011
From: jonathan at leto.net (Jonathan "Duke" Leto)
Date: Mon, 2 May 2011 11:24:40 -0700
Subject: [Bioperl-l] GSoC/BioPerl Reorganization Project
In-Reply-To: 
References: <1FF62DC3-941A-4DCB-8464-89D220E4A9C5@illinois.edu> <4DB9F617.6070705@cornell.edu> <20110429021457.GA351@Macintosh-235.local> <4DBA4955.2030003@cornell.edu>
Message-ID: 

Howdy,

>> One additional question: how are we dealing with commit history?
>> I don't think there is an easy way of carrying that over to a brand-new repo...
>>
>> Not that it's a problem, but something to think about.
>
> I believe git filter-branch can be used:
>
> "filter-branch is commonly used on a clone of the repo to split a too-large
> repo into smaller ones."
>
> https://github.com/matthewmccullough/git-workshop/raw/master/workbook/htmls/27-Filter-Branch.html

No commit history needs to get dropped on the floor. I can help with the git
filter-branch stuff, just let me know.

Duke

--
Jonathan "Duke" Leto
209.691.DUKE // http://leto.net
NOTE: Personal email is only checked twice a day at 10am/2pm PST,
please call/text for time-sensitive matters.

From cjfields at illinois.edu Mon May 2 14:30:17 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 2 May 2011 13:30:17 -0500
Subject: [Bioperl-l] GSoC/BioPerl Reorganization Project
In-Reply-To: 
References: <1FF62DC3-941A-4DCB-8464-89D220E4A9C5@illinois.edu> <4DB9F617.6070705@cornell.edu> <20110429021457.GA351@Macintosh-235.local> <4DBA4955.2030003@cornell.edu>
Message-ID: <64077D3C-1C1F-4F95-8219-693FE2B97198@illinois.edu>

On May 2, 2011, at 1:24 PM, Jonathan Duke Leto wrote:

> Howdy,
>
>>> One additional question: how are we dealing with commit history? I don't
>>> think there is an easy way of carrying that over to a brand-new repo...
>>>
>>> Not that it's a problem, but something to think about.
>>
>> I believe git filter-branch can be used:
>>
>> "filter-branch is commonly used on a clone of the repo to split a too-large
>> repo into smaller ones."
>>
>> https://github.com/matthewmccullough/git-workshop/raw/master/workbook/htmls/27-Filter-Branch.html
>
> No commit history needs to get dropped on the floor. I can help with the git
> filter-branch stuff, just let me know.
>
> Duke

I *hate* it when I drop my commit history on the floor. :)

On a more serious note, it's very possible as we edge towards coding we will need help, so that would be great!
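As a rough illustration of the filter-branch approach Duke describes, here is a minimal sketch that splits one subsystem out of a repository while keeping its commit history. Everything concrete in it is an assumption made for the example: the toy repository stands in for bioperl-live, and the Bio/Root and Bio/Das paths, file names and author details are hypothetical. The real layout would likely need more care (for instance, keeping a lib/ prefix instead of moving the subsystem to the top level).

```shell
#!/bin/sh
# Sketch only: a toy repository stands in for bioperl-live; all paths are hypothetical.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the notice newer git prints for filter-branch
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"

# Build a tiny stand-in repository with two subsystems and three commits.
git init -q toy-bioperl
cd toy-bioperl
mkdir -p Bio/Root Bio/Das
echo 'package Bio::Root::Root; 1;' > Bio/Root/Root.pm
echo 'package Bio::Das::Stub; 1;'  > Bio/Das/Stub.pm
git add .
git commit -q -m "Add Root and Das"
echo '# tweak' >> Bio/Root/Root.pm
git commit -q -am "Touch Root only"
echo '# tweak' >> Bio/Das/Stub.pm
git commit -q -am "Touch Das only"
cd ..

# Work on a clone, as the quoted advice suggests, and keep only Bio/Root.
# --subdirectory-filter makes Bio/Root the new top level and keeps only
# the commits that touched it.
git clone -q toy-bioperl bio-root-split
cd bio-root-split
git filter-branch --prune-empty --subdirectory-filter Bio/Root HEAD >/dev/null 2>&1

split_commits=$(git rev-list --count HEAD)
echo "commits kept: $split_commits"
git ls-files
```

In this toy case the clone ends up containing only Root.pm, with the two commits that touched Bio/Root; the original toy-bioperl repository is left untouched, which is the reason for running the rewrite on a clone.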
chris

From p.j.a.cock at googlemail.com Tue May 3 05:24:08 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 3 May 2011 10:24:08 +0100
Subject: [Bioperl-l] Interesting BLAST 2.2.25+ XML behaviour
In-Reply-To: 
References: 
Message-ID: 

Hello all,

I've CC'd the BioPerl, BioRuby, BioJava and Biopython development mailing
lists to make sure you're aware of this, but can we continue any discussion
on the cross-project open-bio-l mailing list please?

I noticed that recent versions of BLAST are not using a single <Iteration>
block for each query, which was the historical behaviour and assumed
by the Biopython BLAST XML parser. This may be a bug in BLAST.
See link below for an example.

Has anyone else noticed this, and has it been reported to the NCBI yet?

Thanks,

Peter

(Not for the first time, I wish there was a public bug tracker for BLAST,
or at least a private bug tracker so we could talk about issues with an
NCBI assigned reference number.)

---------- Forwarded message ----------
From: Peter Cock
Date: Wed, Apr 20, 2011 at 6:08 PM
Subject: Interesting BLAST 2.2.25+ XML behaviour
To: Biopython-Dev Mailing List

Hi all,

Have a look at this XML file from a FASTA vs FASTA search using blastp
from BLAST 2.2.25+ (current release), which is a test file I created for the
BLAST+ wrappers in Galaxy:

https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/test-data/blastp_four_human_vs_rhodopsin.xml

I just put it through the Biopython BLAST XML parser, and was surprised
not to get four records back (since as you might guess from the filename, there
were four queries). It appears this version of BLAST+ is incrementing the
iteration counter for each match... or something like that.

Has anyone else noticed this? I wonder if it is accidental...
Peter

From David.Messina at sbc.su.se Tue May 3 05:16:33 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 3 May 2011 11:16:33 +0200
Subject: [Bioperl-l] Doubt - Regarding Stand Alone Blast
In-Reply-To: 
References: 
Message-ID: 

> formatdb -i sequences.fasta -p T -o T

The above command makes a database of the sequences.fasta file.

> blastall -p blastp -d homologene_result.fasta -i sequences.fasta -o stdout
> -a 1 -e 10

But then you ask for a database made from the homologene_result.fasta file.

For the above blastall command to work, you will have to use the following
formatdb command:

formatdb -i homologene_result.fasta -p T -o T

Dave

From David.Messina at sbc.su.se Tue May 3 08:06:31 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 3 May 2011 14:06:31 +0200
Subject: [Bioperl-l] Doubt - Regarding Stand Alone Blast
In-Reply-To: 
References: 
Message-ID: 

Hi Teja,

Please 'reply all' to keep the mailing list Cc'd so that it can be archived
and everyone can follow along.

On Tue, May 3, 2011 at 13:45, sukeerthi teja Rallapalli wrote:

> Hi David,
>
> I basically have these 2 files, the
>
> Input file = sequences.txt
> Database file = homologene_result.txt
>
> Isn't "blastall" an older version command ? That is why i was using
> makeblastdb

In your previous email, you made the same mistake with both formatdb and
makeblastdb. (making a database for one file but specifying the other file's
name as the database when you tried to run blast)

> This is what i typed
>
> 172-30-8-231:bin sukeerthiteja$ makeblastdb -in homologene_result.txt
> -dbtype prot -parse_seqids -out teja1 blastp -query sequences.txt -db teja2

You've combined two separate commands on the same line here.
This is one command:

makeblastdb -in homologene_result.txt -dbtype prot -parse_seqids -out teja1

This is another command:

blastp -query sequences.txt -db teja2

Incidentally, this blastp again won't work because you've named your
database teja1 when you run makeblastdb, but then you tell blastp that the
database is named teja2. Instead, you should do

blastp -query sequences.txt -db teja1

because that's the name you specified with the -out option to makeblastdb.

Two things:

#1 You need to read the NCBI-BLAST+ documentation available here:
http://www.ncbi.nlm.nih.gov/books/NBK1763/

#2 This is the BioPerl mailing list, and your questions do not relate to
BioPerl. seqanswers.com is probably a more appropriate place to ask them. But
I would suggest doing #1 first.

Dave

> AND i also tried the code using .txt next to teja2.txt, but in both cases
> i got this error.
>
> Error: (106.18)
> NCBI C++ Exception:
> Error: (CArgException::eSynopsis) Too many positional arguments (1),
> the offending value: blastp
> Error: (CArgException::eSynopsis) Application's initialization failed
>
> Thanking you
> Teja

From duxroq at hotmail.com Mon May 2 23:22:00 2011
From: duxroq at hotmail.com (duxroq)
Date: Mon, 2 May 2011 20:22:00 -0700 (PDT)
Subject: [Bioperl-l] Clustalw and sensitivity to changes in 'pairgap' parameter??
Message-ID: <31529229.post@talk.nabble.com>

Hi,

For a project I am doing I wanted to maximize the gap penalty for a pairwise
alignment of a leader and a consensus sequence. I experimented with changing
the 'pairgap' parameter. However, Clustalw generated the same exact alignment
file no matter what I changed 'pairgap' to! Why would this happen?

(As a side question, does 'pairgap' refer to the gap opening penalty for
pairwise alignments? And does clustalw.pm have a parameter that controls gap
penalties for multiple alignments or for gap extensions?
My computer did not recognize 'fixedgap' or 'floatgap' and therefore I assumed I could only use 'pairgap' to modify my results). Here are the relevant pieces of code: # I set 'pairgap' to 50 because I think the default is 10 and I was seeking to maximize the gap penalty, although I do not know the upper limit for 'pairgap' @params = ('pairgap' => 50); $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params); my $aln_lead_cons = $factory->align(\@lead_cons_seqs); Thank you! Alexandria -- View this message in context: http://old.nabble.com/Clustalw-and-sensitivity-to-changes-in-%27pairgap%27-parameter---tp31529229p31529229.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From tejaminnu at gmail.com Tue May 3 08:10:47 2011 From: tejaminnu at gmail.com (sukeerthi teja Rallapalli) Date: Tue, 3 May 2011 08:10:47 -0400 Subject: [Bioperl-l] Fwd: Doubt - Regarding Stand Alone Blast In-Reply-To: References: Message-ID: ---------- Forwarded message ---------- From: sukeerthi teja Rallapalli Date: Tue, May 3, 2011 at 7:45 AM Subject: Re: [Bioperl-l] Doubt - Regarding Stand Alone Blast To: Dave Messina Hi David, I basically have these 2 files, the Input file = sequences.txt Database file = homologene_result.txt Isn't "blastall" an older version command ? That is why i was using makeblastdb This is what i typed 172-30-8-231:bin sukeerthiteja$ makeblastdb -in homologene_result.txt -dbtype prot -parse_seqids -out teja1 blastp -query sequences.txt -db teja2 AND i also tried the code using .txt next to *teja2.txt, * but in both cases i got this error. 
Error: (106.18) NCBI C++ Exception: Error: (CArgException::eSynopsis) Too many positional arguments (1), the offending value: blastp Error: (CArgException::eSynopsis) Application's initialization failed Thanking you Teja -- Regards Rallapalli Sukeerthi Teja From cjfields at illinois.edu Tue May 3 09:31:55 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 3 May 2011 08:31:55 -0500 Subject: [Bioperl-l] [BioRuby] Interesting BLAST 2.2.25+ XML behaviour In-Reply-To: References: Message-ID: <398303E2-1195-4CC2-8B73-09C6C1117892@illinois.edu> Haven't tried this using the latest BLAST+ myself, but it doesn't surprise me too much. Also agree re: some kind of bug tracking with NCBI; I believe they have an internal one, but it would be nice to have a public interface to it. chris On May 3, 2011, at 4:24 AM, Peter Cock wrote: > Hello all, > > I've CC'd the BioPerl, BioRuby, BioJava and Biopython development mailing > lists to make sure you're aware of this, but can we continue any discussion > on the cross-project open-bio-l mailing list please? > > I noticed that recent versions of BLAST are not using a single > block for each query, which was the historical behaviour and assumed > by the Biopython BLAST XML parser. This may be a bug in BLAST. > See link below for an example. > > Has anyone else noticed this, and has it been reported to the NCBI yet? > > Thanks, > > Peter > > (Not for the first time, I wish there was a public bug tracker for BLAST, > or at least a private bug tracker so we could talk about issues with an > NCBI assigned reference number.) 
> > ---------- Forwarded message ---------- > From: Peter Cock > Date: Wed, Apr 20, 2011 at 6:08 PM > Subject: Interesting BLAST 2.2.25+ XML behaviour > To: Biopython-Dev Mailing List > > > Hi all, > > Have a look at this XML file from a FASTA vs FASTA search > using blastp from BLAST 2.2.25+ (current release), which > is a test file I created for the BLAST+ wrappers in Galaxy: > > https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/test-data/blastp_four_human_vs_rhodopsin.xml > > I just put it though the Biopython BLAST XML parser, and > was surprised not to get four records back (since as you > might guess from the filename, there were four queries). > > It appears this version of BLAST+ is incrementing the > iteration counter for each match... or something like that. > > Has anyone else noticed this? I wonder if it is accidental... > > Peter > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From joyeux2000 at hotmail.fr Wed May 4 09:51:12 2011 From: joyeux2000 at hotmail.fr (debutant.bioperl) Date: Wed, 4 May 2011 06:51:12 -0700 (PDT) Subject: [Bioperl-l] Bioperl Message-ID: <31542007.post@talk.nabble.com> Good morning, I download a file containing the upstream sequences of genes from a database. these sequences are in FASTA format. explanatory example of sequence : > XX1G56520 | :2695538-2696537 FORWARD CHR1 LENGTH = > 500ATCGATCGATCGATCGGAGAGAGATCGATCGATCGATCGATCG> YY1G56520 | > :2695538-2696539 FORWARD CHR1 LENGTH = > 500ATCGATCGATCGATCGGAGAGAGATCGATCGATCGATCGATCG .... I want to write code in Bioperl that : *reads the file containing the sequences (you can open it with notepad, Word ...) *looking for a pattern (eg. GAGAGAGATCGA) *gives what is this gene sequence (eg the pattern italic we note gene code "XX1G56520. 
ID, which is just past the sign"> "before the pattern) *determines the position of this motif from the start codon, otherwise count the letter starting with the first letter pattern until the last letter before the sign ">" Next. *give the results as a table containing the code for each gene and the position corresponding pattern (number of letters to find plus a negative sign "- ") if you want more explanation I am always at your disposal. Please help me is urgent %-| cordially -- View this message in context: http://old.nabble.com/Bioperl-tp31542007p31542007.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From David.Messina at sbc.su.se Wed May 4 10:21:05 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Wed, 4 May 2011 16:21:05 +0200 Subject: [Bioperl-l] Bioperl In-Reply-To: <31542007.post@talk.nabble.com> References: <31542007.post@talk.nabble.com> Message-ID: Um, this sounds an awful lot like homework to me. Dave On Wed, May 4, 2011 at 15:51, debutant.bioperl wrote: > > Good morning, > I download a file containing the upstream sequences of genes from a > database. > these sequences are in FASTA format. > explanatory example of sequence : > > XX1G56520 | :2695538-2696537 FORWARD CHR1 LENGTH = > > 500ATCGATCGATCGATCGGAGAGAGATCGATCGATCGATCGATCG> YY1G56520 | > > :2695538-2696539 FORWARD CHR1 LENGTH = > > 500ATCGATCGATCGATCGGAGAGAGATCGATCGATCGATCGATCG .... > I want to write code in Bioperl that : > *reads the file containing the sequences (you can open it with notepad, > Word > ...) > *looking for a pattern (eg. GAGAGAGATCGA) > *gives what is this gene sequence (eg the pattern italic we note gene code > "XX1G56520. ID, which is just past the sign"> "before the pattern) > *determines the position of this motif from the start codon, otherwise > count > the letter starting with the first letter pattern until the last letter > before the sign ">" Next. 
> *give the results as a table containing the code for each gene and the > position corresponding pattern (number of letters to find plus a negative > sign "- ") > if you want more explanation I am always at your disposal. > Please help me is urgent %-| > cordially > -- > View this message in context: > http://old.nabble.com/Bioperl-tp31542007p31542007.html > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at illinois.edu Wed May 4 10:28:09 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 4 May 2011 09:28:09 -0500 Subject: [Bioperl-l] Bioperl In-Reply-To: References: <31542007.post@talk.nabble.com> Message-ID: Yep. Strange that, being the end of the academic year is upon us... :) chris On May 4, 2011, at 9:21 AM, Dave Messina wrote: > Um, this sounds an awful lot like homework to me. > > > Dave > > > On Wed, May 4, 2011 at 15:51, debutant.bioperl wrote: > >> >> Good morning, >> I download a file containing the upstream sequences of genes from a >> database. >> these sequences are in FASTA format. >> explanatory example of sequence : >>> XX1G56520 | :2695538-2696537 FORWARD CHR1 LENGTH = >>> 500ATCGATCGATCGATCGGAGAGAGATCGATCGATCGATCGATCG> YY1G56520 | >>> :2695538-2696539 FORWARD CHR1 LENGTH = >>> 500ATCGATCGATCGATCGGAGAGAGATCGATCGATCGATCGATCG .... >> I want to write code in Bioperl that : >> *reads the file containing the sequences (you can open it with notepad, >> Word >> ...) >> *looking for a pattern (eg. GAGAGAGATCGA) >> *gives what is this gene sequence (eg the pattern italic we note gene code >> "XX1G56520. ID, which is just past the sign"> "before the pattern) >> *determines the position of this motif from the start codon, otherwise >> count >> the letter starting with the first letter pattern until the last letter >> before the sign ">" Next. 
>> *give the results as a table containing the code for each gene and the >> position corresponding pattern (number of letters to find plus a negative >> sign "- ") >> if you want more explanation I am always at your disposal. >> Please help me is urgent %-| >> cordially >> -- >> View this message in context: >> http://old.nabble.com/Bioperl-tp31542007p31542007.html >> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From joyeux2000 at hotmail.fr Wed May 4 10:49:33 2011 From: joyeux2000 at hotmail.fr (debutant.bioperl) Date: Wed, 4 May 2011 07:49:33 -0700 (PDT) Subject: [Bioperl-l] Bioperl In-Reply-To: References: <31542007.post@talk.nabble.com> Message-ID: <31542480.post@talk.nabble.com> please help me world, help me even with simples ideas :-( Chris Fields-5 wrote: > > Yep. Strange that, being the end of the academic year is upon us... :) > > chris > > On May 4, 2011, at 9:21 AM, Dave Messina wrote: > >> Um, this sounds an awful lot like homework to me. >> >> >> Dave >> >> >> On Wed, May 4, 2011 at 15:51, debutant.bioperl >> wrote: >> >>> >>> Good morning, >>> I download a file containing the upstream sequences of genes from a >>> database. >>> these sequences are in FASTA format. >>> explanatory example of sequence : >>>> XX1G56520 | :2695538-2696537 FORWARD CHR1 LENGTH = >>>> 500ATCGATCGATCGATCGGAGAGAGATCGATCGATCGATCGATCG> YY1G56520 | >>>> :2695538-2696539 FORWARD CHR1 LENGTH = >>>> 500ATCGATCGATCGATCGGAGAGAGATCGATCGATCGATCGATCG .... >>> I want to write code in Bioperl that : >>> *reads the file containing the sequences (you can open it with notepad, >>> Word >>> ...) >>> *looking for a pattern (eg. 
GAGAGAGATCGA) >>> *gives what is this gene sequence (eg the pattern italic we note gene >>> code >>> "XX1G56520. ID, which is just past the sign"> "before the pattern) >>> *determines the position of this motif from the start codon, otherwise >>> count >>> the letter starting with the first letter pattern until the last letter >>> before the sign ">" Next. >>> *give the results as a table containing the code for each gene and the >>> position corresponding pattern (number of letters to find plus a >>> negative >>> sign "- ") >>> if you want more explanation I am always at your disposal. >>> Please help me is urgent %-| >>> cordially >>> -- >>> View this message in context: >>> http://old.nabble.com/Bioperl-tp31542007p31542007.html >>> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- View this message in context: http://old.nabble.com/Bioperl-tp31542007p31542480.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From joyeux2000 at hotmail.fr Wed May 4 10:51:18 2011 From: joyeux2000 at hotmail.fr (debutant.bioperl) Date: Wed, 4 May 2011 07:51:18 -0700 (PDT) Subject: [Bioperl-l] Bioperl In-Reply-To: References: <31542007.post@talk.nabble.com> Message-ID: <31542491.post@talk.nabble.com> Dave Messina-3 wrote: > > Um, this sounds an awful lot like homework to me. 
> > > Dave > > > On Wed, May 4, 2011 at 15:51, debutant.bioperl > wrote: > >> >> Good morning, >> I download a file containing the upstream sequences of genes from a >> database. >> these sequences are in FASTA format. >> explanatory example of sequence : >> > XX1G56520 | :2695538-2696537 FORWARD CHR1 LENGTH = >> > 500ATCGATCGATCGATCGGAGAGAGATCGATCGATCGATCGATCG> YY1G56520 | >> > :2695538-2696539 FORWARD CHR1 LENGTH = >> > 500ATCGATCGATCGATCGGAGAGAGATCGATCGATCGATCGATCG .... >> I want to write code in Bioperl that : >> *reads the file containing the sequences (you can open it with notepad, >> Word >> ...) >> *looking for a pattern (eg. GAGAGAGATCGA) >> *gives what is this gene sequence (eg the pattern italic we note gene >> code >> "XX1G56520. ID, which is just past the sign"> "before the pattern) >> *determines the position of this motif from the start codon, otherwise >> count >> the letter starting with the first letter pattern until the last letter >> before the sign ">" Next. >> *give the results as a table containing the code for each gene and the >> position corresponding pattern (number of letters to find plus a negative >> sign "- ") >> if you want more explanation I am always at your disposal. >> Please help me is urgent %-| >> cordially >> -- >> View this message in context: >> http://old.nabble.com/Bioperl-tp31542007p31542007.html >> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > %-|%-| -- View this message in context: http://old.nabble.com/Bioperl-tp31542007p31542491.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. 
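For readers following this thread, the task described above (read a FASTA file, look for a motif, and report each gene ID with the motif position counted back from the end of the upstream sequence) might be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the helper names motif_positions and report_motif and the tab-separated output are made up for illustration, and the position convention (number of letters from the first letter of the motif to the last letter of the sequence, with a minus sign) is one reading of the request. Only the Bio::SeqIO calls (new, next_seq, display_id, seq) come from BioPerl itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the position of every occurrence of $motif in
# $sequence, counted backwards from the end of the upstream sequence (i.e.
# from the start codon). Each position is the number of letters from the
# first letter of the motif to the last letter of the sequence, negated.
sub motif_positions {
    my ($sequence, $motif) = @_;
    my ($seq_uc, $motif_uc) = (uc $sequence, uc $motif);
    return unless length $motif_uc;    # an empty motif would loop forever
    my $length = length $seq_uc;
    my @positions;
    my $from = 0;
    while ((my $hit = index($seq_uc, $motif_uc, $from)) >= 0) {
        push @positions, -($length - $hit);
        $from = $hit + 1;    # advance by one so overlapping hits are reported
    }
    return @positions;
}

# Hypothetical helper: read a FASTA file with Bio::SeqIO and print one
# "gene ID <TAB> position" row per motif occurrence.
sub report_motif {
    my ($fasta_file, $motif) = @_;
    require Bio::SeqIO;    # loaded here so motif_positions() works without BioPerl
    my $in = Bio::SeqIO->new(-file => $fasta_file, -format => 'fasta');
    print "gene_id\tposition\n";
    while (my $seq = $in->next_seq) {
        print join("\t", $seq->display_id, $_), "\n"
            for motif_positions($seq->seq, $motif);
    }
}

# Runs only when both arguments are given, e.g.:
#   perl find_motif.pl upstream.fasta GAGAGAGATCGA
report_motif(@ARGV) if @ARGV == 2;
```

The search uses index() rather than a regular expression so that the motif is treated as a literal string; a degenerate (IUPAC) motif would need a different matching step.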
From biopython at maubp.freeserve.co.uk Wed May 4 10:57:45 2011 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 4 May 2011 15:57:45 +0100 Subject: [Bioperl-l] Bioperl In-Reply-To: <31542491.post@talk.nabble.com> References: <31542007.post@talk.nabble.com> <31542491.post@talk.nabble.com> Message-ID: On Wed, May 4, 2011 at 3:51 PM, debutant.bioperl wrote: > > %-|%-| > Was that some cryptic Perl code that I don't get, or two smilies? Try reading about BioPerl's SeqIO module for reading FASTA files. http://www.bioperl.org/wiki/HOWTO:SeqIO Peter From cjfields at illinois.edu Wed May 4 11:07:35 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 4 May 2011 10:07:35 -0500 Subject: [Bioperl-l] Bioperl In-Reply-To: References: <31542007.post@talk.nabble.com> <31542491.post@talk.nabble.com> Message-ID: <182E51ED-754D-4D32-BB28-291C377F333B@illinois.edu> On May 4, 2011, at 9:57 AM, Peter wrote: > On Wed, May 4, 2011 at 3:51 PM, debutant.bioperl wrote: >> >> %-|%-| >> > > Was that some cryptic Perl code that I don't get, or two smilies? > > Try reading about BioPerl's SeqIO module for reading FASTA files. > http://www.bioperl.org/wiki/HOWTO:SeqIO > > Peter Oh, you pythonista :) chris From wkretzsch at gmail.com Wed May 4 11:10:55 2011 From: wkretzsch at gmail.com (Warren W. Kretzschmar) Date: Wed, 4 May 2011 16:10:55 +0100 Subject: [Bioperl-l] Bioperl In-Reply-To: <182E51ED-754D-4D32-BB28-291C377F333B@illinois.edu> References: <31542007.post@talk.nabble.com> <31542491.post@talk.nabble.com> <182E51ED-754D-4D32-BB28-291C377F333B@illinois.edu> Message-ID: Hmm, if in doubt, it's probably perl. On Wed, May 4, 2011 at 4:07 PM, Chris Fields wrote: > On May 4, 2011, at 9:57 AM, Peter wrote: > >> On Wed, May 4, 2011 at 3:51 PM, debutant.bioperl wrote: >>> >>> %-|%-| >>> >> >> Was that some cryptic Perl code that I don't get, or two smilies? >> >> Try reading about BioPerl's SeqIO module for reading FASTA files. 
>> http://www.bioperl.org/wiki/HOWTO:SeqIO >> >> Peter > > Oh, you pythonista :) > > chris > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields1 at gmail.com Wed May 4 20:50:22 2011 From: cjfields1 at gmail.com (Christopher Fields) Date: Wed, 4 May 2011 19:50:22 -0500 Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl In-Reply-To: References: Message-ID: (cc'ing the main bioperl list) Amy, For future correspondence you should contact the bioperl mailing list. You can subscribe to it here if needed: http://lists.open-bio.org/mailman/listinfo/bioperl-l As for the modules in question (AcePerl and GraphViz), these aren't absolutely required for most BioPerl functionality; tests that required them are designed to skip if the modules aren't present. Have you tried *not* installing those and running tests? chris On May 4, 2011, at 6:08 PM, Powell Phd, Amy Jo wrote: > Hi there, > > FY: I made another attempt at installing an Ace package that BioPerl can (hopefully) use, to no avail: > > > LDS/AcePerl-1.92.tar.gz > /usr/bin/make test -- NOT OK > //hint// to see the cpan-testers results for installing this module, try: > reports LDS/AcePerl-1.92.tar.gz > Running make install > make test had returned bad status, won't install without force > Failed during this command: > LDS/AcePerl-1.92.tar.gz : make_test NO > From lmrodriguezr at gmail.com Thu May 5 08:40:12 2011 From: lmrodriguezr at gmail.com (Luis-Miguel Rodríguez Rojas) Date: Thu, 5 May 2011 14:40:12 +0200 Subject: [Bioperl-l] Bioperl In-Reply-To: References: <31542007.post@talk.nabble.com> <31542491.post@talk.nabble.com> <182E51ED-754D-4D32-BB28-291C377F333B@illinois.edu> Message-ID: Hash tied to named capture buffer -- Luis M.
Rodriguez-R [ http://thebio.me/lrr ] --------------------------------- UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible Institut de Recherche pour le Développement, Montpellier, France [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ] +33 (0) 6.29.74.55.93 Unidad de Bioinformática del Laboratorio de Micología y Fitopatología Universidad de Los Andes, Bogotá, Colombia [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ] +57 (1) 3.39.49.49 ext 2777 On Wed, May 4, 2011 at 5:10 PM, Warren W. Kretzschmar wrote: > Hmm, if in doubt, it's probably perl. > > On Wed, May 4, 2011 at 4:07 PM, Chris Fields > wrote: > > On May 4, 2011, at 9:57 AM, Peter wrote: > > > >> On Wed, May 4, 2011 at 3:51 PM, debutant.bioperl > wrote: > >>> > >>> %-|%-| > >>> > >> > >> Was that some cryptic Perl code that I don't get, or two smilies? > >> > >> Try reading about BioPerl's SeqIO module for reading FASTA files. > >> http://www.bioperl.org/wiki/HOWTO:SeqIO > >> > >> Peter > > > > Oh, you pythonista :) > > > > chris > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From lmrodriguezr at gmail.com Thu May 5 08:47:57 2011 From: lmrodriguezr at gmail.com (Luis-Miguel Rodríguez Rojas) Date: Thu, 5 May 2011 14:47:57 +0200 Subject: [Bioperl-l] Handling hierarchical phylogeny based data in bio-/perl In-Reply-To: References: Message-ID: I am not sure I understand your problem, but it seems to me that you could use Bio::Taxon [1] for this. [1] http://search.cpan.org/~cjfields/BioPerl-1.6.900/Bio/Taxon.pm, http://bioperl.org/wiki/Module:Bio::Taxon -- Luis M.
Rodriguez-R [ http://thebio.me/lrr ] --------------------------------- UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible Institut de Recherche pour le Développement, Montpellier, France [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ] +33 (0) 6.29.74.55.93 Unidad de Bioinformática del Laboratorio de Micología y Fitopatología Universidad de Los Andes, Bogotá, Colombia [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ] +57 (1) 3.39.49.49 ext 2777 On Wed, Apr 27, 2011 at 8:32 PM, Abhishek Pratap wrote: > Hi Guys > > I have lineage for many contigs blasted to nt dbase. The goal is to > arrange them in a hierarchical data structure something like hash of > hash and also store some other ancillary data like contig names for > each bin and coverage etc. > > For example > > if my input is from a tsv file with lineage as one column and others > like contig name, coverage etc > > Eukaryota Viridiplantae Streptophyta Embryophyta Tracheophyta > Spermatophyta Magnoliophyta eudicotyledons core_eudicotyledons > Eukaryota Viridiplantae Streptophyta Streptophytina Charophyceae > Charales Characeae Chara > Eukaryota Viridiplantae Streptophyta Streptophytina Embryophyta > > then I would like to store data as follows > > Eukaryota -> count = 3 > Eukaryota -> coverage = 6.3 > Eukaryota->Viridplantae->count=3 > Eukaryota->Viridplantae->coverage=4.3 > Eukaryota->Viridplantae->Streptophyta->count=3 > Eukaryota->Viridplantae->Streptophyta->coverage2=2.3 > -------etc > > I could create such hash explicitly but it is a tiring process as num > of words on each line(lineage) increases I have to keep on increasing > my data structure manually. Also all lines(lineage) wont have same > number of words. > > Also I would like to print such a tree with count/coverage information > associated for each bin. > > Wondering if I can use some Tree based built in capability of > perl/bio-perl to do this.
I did have a look at > http://bioperl.org/wiki/HOWTO:Trees but I dont think I could find > example to read from tsv file and create a data structure where I am > also storing count/coverage for each bin. > > Any pointers will help. > > Best, > -Abhi > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From lincoln.stein at gmail.com Thu May 5 09:46:08 2011 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Thu, 5 May 2011 09:46:08 -0400 Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl In-Reply-To: References: Message-ID: I think we should remove Ace from bioperl; there can't be that many people still using it, and I'm not devoting any cycles to maintaining the package. I'm happy to do the dirty deed unless there's a strong objection. Lincoln On Wed, May 4, 2011 at 8:50 PM, Christopher Fields wrote: > (cc'ing the main bioperl list) > > Amy, > > For future correspondence you should contact the bioperl mailing list. You > can subscribe to it here if needed: > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > As for the modules in question (AcePerl and GraphViz), these aren't > absolutely required for most BioPerl functionality; tests that required them > are designed to skip if the modules aren't present. Have you tried *not* > installing those and running tests? 
> > chris > > On May 4, 2011, at 6:08 PM, Powell Phd, Amy Jo wrote: > > > Hi there, > > > > FY: I made another attempt at installing an Ace package that BioPerl can > (hopefully) use, to no avail: > > > > > > LDS/AcePerl-1.92.tar.gz > > /usr/bin/make test -- NOT OK > > //hint// to see the cpan-testers results for installing this module, try: > > reports LDS/AcePerl-1.92.tar.gz > > Running make install > > make test had returned bad status, won't install without force > > Failed during this command: > > LDS/AcePerl-1.92.tar.gz : make_test NO > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Director, Informatics and Biocomputing Platform Ontario Institute for Cancer Research 101 College St., Suite 800 Toronto, ON, Canada M5G0A3 416 673-8514 Assistant: Renata Musa From cjfields1 at gmail.com Thu May 5 10:21:26 2011 From: cjfields1 at gmail.com (Christopher Fields) Date: Thu, 5 May 2011 09:21:26 -0500 Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl In-Reply-To: References: Message-ID: <663C8C04-ED94-43F0-BA50-15D59C1E5D1A@gmail.com> Lincoln, Go for it. Not sure if you want to split that code out or just deprecate it's usage (latter is easy enough to do within BioPerl code, just a dep warning). chris On May 5, 2011, at 8:46 AM, Lincoln Stein wrote: > I think we should remove Ace from bioperl; there can't be that many people > still using it, and I'm not devoting any cycles to maintaining the package. > I'm happy to do the dirty deed unless there's a strong objection. > > Lincoln > > On Wed, May 4, 2011 at 8:50 PM, Christopher Fields wrote: > >> (cc'ing the main bioperl list) >> >> Amy, >> >> For future correspondence you should contact the bioperl mailing list. 
You >> can subscribe to it here if needed: >> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> As for the modules in question (AcePerl and GraphViz), these aren't >> absolutely required for most BioPerl functionality; tests that required them >> are designed to skip if the modules aren't present. Have you tried *not* >> installing those and running tests? >> >> chris >> >> On May 4, 2011, at 6:08 PM, Powell Phd, Amy Jo wrote: >> >>> Hi there, >>> >>> FY: I made another attempt at installing an Ace package that BioPerl can >> (hopefully) use, to no avail: >>> >>> >>> LDS/AcePerl-1.92.tar.gz >>> /usr/bin/make test -- NOT OK >>> //hint// to see the cpan-testers results for installing this module, try: >>> reports LDS/AcePerl-1.92.tar.gz >>> Running make install >>> make test had returned bad status, won't install without force >>> Failed during this command: >>> LDS/AcePerl-1.92.tar.gz : make_test NO >>> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > > -- > Lincoln D. Stein > Director, Informatics and Biocomputing Platform > Ontario Institute for Cancer Research > 101 College St., Suite 800 > Toronto, ON, Canada M5G0A3 > 416 673-8514 > Assistant: Renata Musa > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu May 5 12:45:47 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 5 May 2011 11:45:47 -0500 Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl In-Reply-To: References: Message-ID: <87D2F87B-381B-4FD2-8860-4FFD2D05B314@illinois.edu> Amy, Are you installing this using the system perl (which I think is perl 5.10.1)? 
I'm running BioPerl on the same OS w/o problems, but I am also using a bit more of a custom setup (perlbrew in particular, which allows me to switch perl versions, and cpanminus). In general I have found the UNIX install instructions to work fine. Having a more detailed list of problems you have encountered would help tremendously as well. Specifically, how exactly are you installing BioPerl? What tests are failing? Also, by distributions, do you mean something like fink or macports? I think a bioperl version exists on macports, but I'm not sure how up-to-date it is. chris On May 5, 2011, at 11:20 AM, Powell Phd, Amy Jo wrote: > Hi Chris & Lincoln, > > A fully installable & useable package/bundle/distribution of BioPerl for the (large) community of Mac users would be greatly appreciated. > > My colleagues and I, all Mac users, have not been able to load BioPerl successfully onto our machines. > > Here are the specs for my particular machine: MacBook Pro running SnowLeopardOSX (Mac OS X 10.6.7 (10J869)). > > Detailed instructions on how to install a new version would also be greatly appreciated. Thanks very much. Regards, AJP > > > > From: Lincoln Stein > Date: Thu, 5 May 2011 09:46:08 -0400 > To: Christopher Fields > Cc: Amy Jo Powell , BioPerl List , "Supinger, Adam W" > Subject: Re: [Bioperl-l] ps-another attempt at installing Ace for BioPerl > >> I think we should remove Ace from bioperl; there can't be that many people still using it, and I'm not devoting any cycles to maintaining the package. I'm happy to do the dirty deed unless there's a strong objection. >> >> Lincoln >> >> On Wed, May 4, 2011 at 8:50 PM, Christopher Fields wrote: >>> (cc'ing the main bioperl list) >>> >>> Amy, >>> >>> For future correspondence you should contact the bioperl mailing list. 
You can subscribe to it here if needed: >>> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> As for the modules in question (AcePerl and GraphViz), these aren't absolutely required for most BioPerl functionality; tests that required them are designed to skip if the modules aren't present. Have you tried *not* installing those and running tests? >>> >>> chris >>> >>> On May 4, 2011, at 6:08 PM, Powell Phd, Amy Jo wrote: >>> >>> > Hi there, >>> > >>> > FY: I made another attempt at installing an Ace package that BioPerl can (hopefully) use, to no avail: >>> > >>> > >>> > LDS/AcePerl-1.92.tar.gz >>> > /usr/bin/make test -- NOT OK >>> > //hint// to see the cpan-testers results for installing this module, try: >>> > reports LDS/AcePerl-1.92.tar.gz >>> > Running make install >>> > make test had returned bad status, won't install without force >>> > Failed during this command: >>> > LDS/AcePerl-1.92.tar.gz : make_test NO >>> > >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> >> -- >> Lincoln D. Stein >> Director, Informatics and Biocomputing Platform >> Ontario Institute for Cancer Research >> 101 College St., Suite 800 >> Toronto, ON, Canada M5G0A3 >> 416 673-8514 >> Assistant: Renata Musa From ajpowel at sandia.gov Thu May 5 12:20:41 2011 From: ajpowel at sandia.gov (Powell Phd, Amy Jo) Date: Thu, 5 May 2011 16:20:41 +0000 Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl In-Reply-To: Message-ID: Hi Chris & Lincoln, A fully installable & useable package/bundle/distribution of BioPerl for the (large) community of Mac users would be greatly appreciated. My colleagues and I, all Mac users, have not been able to load BioPerl successfully onto our machines. Here are the specs for my particular machine: MacBook Pro running SnowLeopardOSX (Mac OS X 10.6.7 (10J869)). 
Detailed instructions on how to install a new version would also be greatly appreciated. Thanks very much. Regards, AJP From: Lincoln Stein > Date: Thu, 5 May 2011 09:46:08 -0400 To: Christopher Fields > Cc: Amy Jo Powell >, BioPerl List >, "Supinger, Adam W" > Subject: Re: [Bioperl-l] ps-another attempt at installing Ace for BioPerl I think we should remove Ace from bioperl; there can't be that many people still using it, and I'm not devoting any cycles to maintaining the package. I'm happy to do the dirty deed unless there's a strong objection. Lincoln On Wed, May 4, 2011 at 8:50 PM, Christopher Fields > wrote: (cc'ing the main bioperl list) Amy, For future correspondence you should contact the bioperl mailing list. You can subscribe to it here if needed: http://lists.open-bio.org/mailman/listinfo/bioperl-l As for the modules in question (AcePerl and GraphViz), these aren't absolutely required for most BioPerl functionality; tests that required them are designed to skip if the modules aren't present. Have you tried *not* installing those and running tests? chris On May 4, 2011, at 6:08 PM, Powell Phd, Amy Jo wrote: > Hi there, > > FY: I made another attempt at installing an Ace package that BioPerl can (hopefully) use, to no avail: > > > LDS/AcePerl-1.92.tar.gz > /usr/bin/make test -- NOT OK > //hint// to see the cpan-testers results for installing this module, try: > reports LDS/AcePerl-1.92.tar.gz > Running make install > make test had returned bad status, won't install without force > Failed during this command: > LDS/AcePerl-1.92.tar.gz : make_test NO > _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Lincoln D. 
Stein Director, Informatics and Biocomputing Platform Ontario Institute for Cancer Research 101 College St., Suite 800 Toronto, ON, Canada M5G0A3 416 673-8514 Assistant: Renata Musa > From ajpowel at sandia.gov Thu May 5 13:38:27 2011 From: ajpowel at sandia.gov (Powell Phd, Amy Jo) Date: Thu, 5 May 2011 17:38:27 +0000 Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl In-Reply-To: <87D2F87B-381B-4FD2-8860-4FFD2D05B314@illinois.edu> Message-ID: Hi Chris, I've (repeatedly) tried to install (all relevant possible versions) using by sudo'ing a cpan window. Here is the version of perl on my machine: s930712:go_daily-termdb-tables ajpowel$ perl -v This is perl, v5.10.0 built for darwin-thread-multi-2level (with 2 registered patches, see perl -V for more detail) Here are the errors that are typically generated (in my attempts to install): cpan[7]> i /BioPerl/ Bundle Bundle::BioPerl (CRAFFI/Bundle-BioPerl-2.1.8.tar.gz) Distribution BOZO/Fry-Lib-BioPerl-0.15.tar.gz Distribution CJFIELDS/BioPerl-1.6.1.tar.gz Distribution CJFIELDS/BioPerl-1.6.900.tar.gz Distribution CJFIELDS/BioPerl-Run-1.006900.tar.gz Distribution CRAFFI/Bundle-BioPerl-2.1.8.tar.gz Module < Bio::LiveSeq::IO::BioPerl (CJFIELDS/BioPerl-1.6.1.tar.gz) Module < Fry::Lib::BioPerl (BOZO/Fry-Lib-BioPerl-0.15.tar.gz) Author BIOPERLML ("Bioperl-l" ) 9 items found install Bundle::BioPerl Test Summary Report ------------------- t/testindexer.t (Wstat: 768 Tests: 5 Failed: 3) Failed tests: 2-3, 5 Non-zero exit status: 3 Parse errors: Bad plan. You planned 11 tests but ran 5. Files=2, Tests=15, 0 wallclock secs ( 0.02 usr 0.01 sys + 0.06 cusr 0.01 csys = 0.10 CPU) Result: FAIL Failed 1/2 test programs. 3/15 subtests failed. 
make: *** [test_dynamic] Error 255 MINGYILIU/Bio-ASN1-EntrezGene-1.10-withoutworldwriteables.tar.gz one dependency not OK (Bio::Index::AbstractSeq); additionally test harness failed /usr/bin/make test -- NOT OK //hint// to see the cpan-testers results for installing this module, try: reports MINGYILIU/Bio-ASN1-EntrezGene-1.10-withoutworldwriteables.tar.gz Running make install make test had returned bad status, won't install without force Failed during this command: LDS/AcePerl-1.92.tar.gz : make_test NO TWH/GD-SVG-0.33.tar.gz : make_test NO one dependency not OK (GD); additionally test harness failed MINGYILIU/Bio-ASN1-EntrezGene-1.10-withoutworldwriteables.tar.gz: make_test NO one dependency not OK (Bio::Index::AbstractSeq); additionally test harness failed CJFIELDS/BioPerl-1.6.1.tar.gz : make_test NO 2 dependencies missing (Ace,GraphViz) Bottom line, the whole installation appears to fail (no matter what package try to install) b/c of dependency issues (I.e., GraphViz & Ace). My colleague has run into the same problems. Sorry, I maybe used the wrong jargon when I said 'distribution.' I always use CPAN when installing perl modules. Make sense? Any guidance/help you could offer would be greatly appreciated. Regards, AJP On 5/5/11 10:45 AM, "Chris Fields" wrote: >Amy, > >Are you installing this using the system perl (which I think is perl >5.10.1)? I'm running BioPerl on the same OS w/o problems, but I am also >using a bit more of a custom setup (perlbrew in particular, which allows >me to switch perl versions, and cpanminus). In general I have found the >UNIX install instructions to work fine. > >Having a more detailed list of problems you have encountered would help >tremendously as well. Specifically, how exactly are you installing >BioPerl? What tests are failing? > >Also, by distributions, do you mean something like fink or macports? I >think a bioperl version exists on macports, but I'm not sure how >up-to-date it is. 
>
>chris

From Kevin.M.Brown at asu.edu Thu May 5 14:04:11 2011
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Thu, 5 May 2011 11:04:11 -0700
Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl
In-Reply-To:
References: <87D2F87B-381B-4FD2-8860-4FFD2D05B314@illinois.edu>
Message-ID: <1A4207F8295607498283FE9E93B775B4079C3689@EX02.asurite.ad.asu.edu>

I notice that CPAN is returning quite a few different bundles.

Try "install CJFIELDS/BioPerl-1.6.900.tar.gz" and see what that gives you.

Kevin Brown
Center for Innovations in Medicine
Biodesign Institute
Arizona State University
From cjfields at illinois.edu Thu May 5 14:31:54 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 5 May 2011 13:31:54 -0500
Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl
In-Reply-To: <1A4207F8295607498283FE9E93B775B4079C3689@EX02.asurite.ad.asu.edu>
References: <87D2F87B-381B-4FD2-8860-4FFD2D05B314@illinois.edu> <1A4207F8295607498283FE9E93B775B4079C3689@EX02.asurite.ad.asu.edu>
Message-ID: <249B7A1A-8E18-4F42-B7B0-43D2749FF2C1@illinois.edu>

Or even 'install Bio::Perl' should work, if the CPAN index is up-to-date (it should install the latest).

chris
From ajpowel at sandia.gov Thu May 5 14:46:37 2011
From: ajpowel at sandia.gov (Powell Phd, Amy Jo)
Date: Thu, 5 May 2011 18:46:37 +0000
Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl
In-Reply-To: <87D2F87B-381B-4FD2-8860-4FFD2D05B314@illinois.edu>
Message-ID:

PS-

More errors obtained in trying to install base bioperl progs:

cpan[3]> install Bio::Root::Build

(...)

Result: FAIL
Failed 21/348 test programs. 38273/59614 subtests failed.
CJFIELDS/BioPerl-1.6.900.tar.gz
./Build test -- NOT OK
//hint// to see the cpan-testers results for installing this module, try:
reports CJFIELDS/BioPerl-1.6.900.tar.gz
Running Build install
make test had returned bad status, won't install without force
Failed during this command:
CJFIELDS/BioPerl-1.6.900.tar.gz : make_test NO

From cjfields at illinois.edu Thu May 5 15:12:40 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 5 May 2011 14:12:40 -0500
Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl
In-Reply-To:
References:
Message-ID: <1B6DFB64-B3B9-4F22-AE95-D9CFB0F8FFFC@illinois.edu>

Amy,

I need that (...) part you excised, I'm assuming it contains the actual test failures (the rest is the summary of the overall test fails but doesn't indicate *why* they failed). You can attach it if needed, or register and file a bug report and add it there (probably the best way, so we don't lose track of this):

https://redmine.open-bio.org/projects/bioperl

I can do that if you don't have time.

chris

PS - Just so you don't lose faith in case I don't get back to you, I will be tied up with setting up a symposium the rest of today and all day tomorrow.

From ajpowel at sandia.gov Thu May 5 15:14:53 2011
From: ajpowel at sandia.gov (Powell Phd, Amy Jo)
Date: Thu, 5 May 2011 19:14:53 +0000
Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl
In-Reply-To: <1B6DFB64-B3B9-4F22-AE95-D9CFB0F8FFFC@illinois.edu>
Message-ID:

Hi Chris,

OOOOOOOH PHOOOOOEY... I already closed that CPAN window : (

I can regenerate the error, tho', if you would like?
>>>>>> >>>>>> chris >>>>>> >>>>>> On May 4, 2011, at 6:08 PM, Powell Phd, Amy Jo wrote: >>>>>> >>>>>>> Hi there, >>>>>>> >>>>>>> FY: I made another attempt at installing an Ace package that >>>>>> BioPerl can (hopefully) use, to no avail: >>>>>>> >>>>>>> >>>>>>> LDS/AcePerl-1.92.tar.gz >>>>>>> /usr/bin/make test -- NOT OK >>>>>>> //hint// to see the cpan-testers results for installing this >>>>>> module, try: >>>>>>> reports LDS/AcePerl-1.92.tar.gz >>>>>>> Running make install >>>>>>> make test had returned bad status, won't install without force >>>>>>> Failed during this command: >>>>>>> LDS/AcePerl-1.92.tar.gz : make_test NO >>>>>>> >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> Bioperl-l mailing list >>>>>> Bioperl-l at lists.open-bio.org >>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>> >>>>> >>>>> >>>>> -- >>>>> Lincoln D. Stein >>>>> Director, Informatics and Biocomputing Platform >>>>> Ontario Institute for Cancer Research >>>>> 101 College St., Suite 800 >>>>> Toronto, ON, Canada M5G0A3 >>>>> 416 673-8514 >>>>> Assistant: Renata Musa >>> >>> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From jason.stajich at gmail.com Thu May 5 18:01:53 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Thu, 5 May 2011 18:01:53 -0400 Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl In-Reply-To: References: Message-ID: <57AE5D73-E2DA-44B2-8000-0511001A5867@gmail.com> macports works pretty well now - I have submitted some additional packages so that it will be as easy as running this - which i was able to install on a clean machine. $ port install bioperl I'm not sure they accepted my bioperl one yet though - but it is attached as #3 portfile on that page or in my github below. 
I think I got some of the syntax not quite right, but it definitely worked when it comes to doing the installation. https://trac.macports.org/ticket/26468 My ports are here but I've submitted them to macports so they'll be part of the master repo: https://github.com/hyphaltip/macports -jason On May 5, 2011, at 12:20 PM, Powell Phd, Amy Jo wrote: > Hi Chris & Lincoln, > > A fully installable & useable package/bundle/distribution of BioPerl for the (large) community of Mac users would be greatly appreciated. > > My colleagues and I, all Mac users, have not been able to load BioPerl successfully onto our machines. > > Here are the specs for my particular machine: MacBook Pro running SnowLeopardOSX (Mac OS X 10.6.7 (10J869)). > > Detailed instructions on how to install a new version would also be greatly appreciated. Thanks very much. Regards, AJP > > > > From: Lincoln Stein > > Date: Thu, 5 May 2011 09:46:08 -0400 > To: Christopher Fields > > Cc: Amy Jo Powell >, BioPerl List >, "Supinger, Adam W" > > Subject: Re: [Bioperl-l] ps-another attempt at installing Ace for BioPerl > > I think we should remove Ace from bioperl; there can't be that many people still using it, and I'm not devoting any cycles to maintaining the package. I'm happy to do the dirty deed unless there's a strong objection. > > Lincoln > > On Wed, May 4, 2011 at 8:50 PM, Christopher Fields > wrote: > (cc'ing the main bioperl list) > > Amy, > > For future correspondence you should contact the bioperl mailing list. You can subscribe to it here if needed: > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > As for the modules in question (AcePerl and GraphViz), these aren't absolutely required for most BioPerl functionality; tests that required them are designed to skip if the modules aren't present. Have you tried *not* installing those and running tests? 
> > chris > > On May 4, 2011, at 6:08 PM, Powell Phd, Amy Jo wrote: > >> Hi there, >> >> FY: I made another attempt at installing an Ace package that BioPerl can (hopefully) use, to no avail: >> >> >> LDS/AcePerl-1.92.tar.gz >> /usr/bin/make test -- NOT OK >> //hint// to see the cpan-testers results for installing this module, try: >> reports LDS/AcePerl-1.92.tar.gz >> Running make install >> make test had returned bad status, won't install without force >> Failed during this command: >> LDS/AcePerl-1.92.tar.gz : make_test NO >> > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > Lincoln D. Stein > Director, Informatics and Biocomputing Platform > Ontario Institute for Cancer Research > 101 College St., Suite 800 > Toronto, ON, Canada M5G0A3 > 416 673-8514 > Assistant: Renata Musa > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From David.Messina at sbc.su.se Fri May 6 03:32:03 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Fri, 6 May 2011 09:32:03 +0200 Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl In-Reply-To: <1A4207F8295607498283FE9E93B775B4079C3689@EX02.asurite.ad.asu.edu> References: <87D2F87B-381B-4FD2-8860-4FFD2D05B314@illinois.edu> <1A4207F8295607498283FE9E93B775B4079C3689@EX02.asurite.ad.asu.edu> Message-ID: I know I'm jumping into this thread late, but this: > Bundle Bundle::BioPerl (CRAFFI/Bundle-BioPerl-2.1.8.tar.gz) > is outdated I believe and should probably not be used anymore. Chris D., is that right? If so, could you remove it from CPAN? 
Dave From anandksrao at gmail.com Fri May 6 03:30:40 2011 From: anandksrao at gmail.com (onlyIDleft) Date: Fri, 6 May 2011 00:30:40 -0700 (PDT) Subject: [Bioperl-l] domain aligned global protein multiple sequence alignment Message-ID: <31556697.post@talk.nabble.com> Are there tools in BioPerl that will allow me to obtain multiple sequence alignment of proteins but with a twist All the proteins I deal with have at least one protein domain (as defined by Pfam HMM models) So I want to constrain this global multiple sequence alignment with the fact that all domains should be aligned to one another.... I could do it iteratively by using HMM align in HMMER3 for each domain, and then use this alignment as a constraint for the global alignment by T-COFFEE or some such strategy. But this is too manually intensive and cumbersome, are there BioPerl tricks that can be used to achieve my goal? -- View this message in context: http://old.nabble.com/domain-aligned-global-protein-multiple-sequence-alignment-tp31556697p31556697.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From jason.stajich at gmail.com Fri May 6 10:42:09 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Fri, 6 May 2011 07:42:09 -0700 Subject: [Bioperl-l] domain aligned global protein multiple sequence alignment In-Reply-To: <31556697.post@talk.nabble.com> References: <31556697.post@talk.nabble.com> Message-ID: <3F28E6EA-6B5A-4A6E-BC44-4F230B31E80D@gmail.com> We use DIALIGN for this type of problem of multi-copy domain proteins. http://bibiserv.techfak.uni-bielefeld.de/dialign/ or you can use bioperl to parse a HMMer report, get the domain locations, cut all these out and make a multiple alignment of the domains alone if you want to see the history of them. For our project on such a family we present the domain tree/alignment and also the whole protein sequence history. 
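
The "cut the domains out" step suggested above can be sketched in plain Perl. This is only a minimal illustration under stated assumptions: the sequences, the Pfam accession and the 1-based inclusive coordinates below are invented for the example, and cut_domains() is a hypothetical helper, not a BioPerl function. In a real pipeline the coordinates would come from a parsed HMMER report (Bio::SearchIO can read such reports) rather than being hard-coded.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input: protein sequences keyed by ID (made-up data).
my %proteins = (
    protA => 'MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ',
    protB => 'MSHFSRQLEEKLGLVEVAAAKQRQ',
);

# Hypothetical domain hits, 1-based and inclusive, standing in for
# coordinates that would normally be read from a HMMER report.
my @domain_hits = (
    { id => 'protA', domain => 'PF00001', start => 5,  end => 12 },
    { id => 'protA', domain => 'PF00001', start => 20, end => 30 },
    { id => 'protB', domain => 'PF00001', start => 3,  end => 10 },
);

# Cut each domain out of its parent sequence.
# Returns a list of [name, subsequence] pairs; hits that refer to an
# unknown sequence or fall outside the sequence are skipped.
sub cut_domains {
    my ($seqs, $hits) = @_;
    my @records;
    for my $hit (@$hits) {
        my $seq = $seqs->{ $hit->{id} };
        next unless defined $seq;
        my ($start, $end) = @{$hit}{qw(start end)};
        next if $start < 1 || $end > length($seq) || $start > $end;
        my $name = sprintf '%s/%d-%d', $hit->{id}, $start, $end;
        # substr() is 0-based, so shift the 1-based start by one.
        push @records, [ $name, substr($seq, $start - 1, $end - $start + 1) ];
    }
    return @records;
}

# Print the domain fragments as FASTA, ready for a multiple aligner.
print ">$_->[0]\n$_->[1]\n" for cut_domains(\%proteins, \@domain_hits);
```

The resulting FASTA of domain fragments could then be handed to whichever aligner is preferred; naming each fragment as id/start-end keeps multi-copy domains from the same protein distinguishable in the domain tree.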
jason

On May 6, 2011, at 12:30 AM, onlyIDleft wrote:

>
> Are there tools in BioPerl that will allow me to obtain multiple sequence
> alignment of proteins but with a twist
>
> All the proteins I deal with have at least one protein domain (as defined by
> Pfam HMM models)
>
> So I want to constrain this global multiple sequence alignment with the fact
> that all domains should be aligned to one another....
>
> I could do it iteratively by using HMM align in HMMER3 for each domain, and
> then use this alignment as a constraint for the global alignment by T-COFFEE
> or some such strategy.
>
> But this is too manually intensive and cumbersome, are there BioPerl tricks
> that can be used to achieve my goal?
> --
> View this message in context: http://old.nabble.com/domain-aligned-global-protein-multiple-sequence-alignment-tp31556697p31556697.html
> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From Kevin.M.Brown at asu.edu Fri May 6 11:09:16 2011
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 6 May 2011 08:09:16 -0700
Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl
In-Reply-To:
References: <87D2F87B-381B-4FD2-8860-4FFD2D05B314@illinois.edu> <1A4207F8295607498283FE9E93B775B4079C3689@EX02.asurite.ad.asu.edu>
Message-ID: <1A4207F8295607498283FE9E93B775B4079C379F@EX02.asurite.ad.asu.edu>

Bundle-BioPerl-2.1.8 A bundle to install external CPAN modules used by BioPerl 1.5.2 [Download ] [Browse ] 18 Nov 2006

Yes, very much out of date. Plus it isn't for installing BioPerl, but just the modules that BioPerl depends on, and an old version of BioPerl at that.
Kevin Brown Center for Innovations in Medicine Biodesign Institute Arizona State University From: Dave Messina [mailto:David.Messina at sbc.su.se] Sent: Friday, May 06, 2011 12:32 AM To: Kevin Brown Cc: BioPerl List; dag at sonsorol.org Subject: Re: [Bioperl-l] ps-another attempt at installing Ace for BioPerl I know I'm jumping into this thread late, but this: > Bundle Bundle::BioPerl (CRAFFI/Bundle-BioPerl-2.1.8.tar.gz) is outdated I believe and should probably not be used anymore. Chris D., is that right? If so, could you remove it from CPAN? Dave From ajpowel at sandia.gov Fri May 6 13:18:00 2011 From: ajpowel at sandia.gov (Powell Phd, Amy Jo) Date: Fri, 6 May 2011 17:18:00 +0000 Subject: [Bioperl-l] ps-another attempt at installing Ace for BioPerl In-Reply-To: <57AE5D73-E2DA-44B2-8000-0511001A5867@gmail.com> Message-ID: Hey thanks, I'll give the MacPorts method a go today & let you know how it turns out. Regards, AJP From: Jason Stajich > Date: Thu, 5 May 2011 18:01:53 -0400 To: Amy Jo Powell > Cc: Lincoln Stein >, Christopher Fields >, BioPerl List >, "Supinger, Adam W" > Subject: Re: [Bioperl-l] ps-another attempt at installing Ace for BioPerl macports works pretty well now - I have submitted some additional packages so that it will be as easy as running this - which i was able to install on a clean machine. $ port install bioperl I'm not sure they accepted my bioperl one yet though - but it is attached as #3 portfile on that page or in my github below. I think I got some of the syntax not quite right, but it definitely worked when it comes to doing the installation. https://trac.macports.org/ticket/26468 My ports are here but I've submitted them to macports so they'll be part of the master repo: https://github.com/hyphaltip/macports -jason On May 5, 2011, at 12:20 PM, Powell Phd, Amy Jo wrote: Hi Chris & Lincoln, A fully installable & useable package/bundle/distribution of BioPerl for the (large) community of Mac users would be greatly appreciated. 
My colleagues and I, all Mac users, have not been able to load BioPerl successfully onto our machines. Here are the specs for my particular machine: MacBook Pro running SnowLeopardOSX (Mac OS X 10.6.7 (10J869)). Detailed instructions on how to install a new version would also be greatly appreciated. Thanks very much. Regards, AJP From: Lincoln Stein > Date: Thu, 5 May 2011 09:46:08 -0400 To: Christopher Fields > Cc: Amy Jo Powell >, BioPerl List >, "Supinger, Adam W" > Subject: Re: [Bioperl-l] ps-another attempt at installing Ace for BioPerl I think we should remove Ace from bioperl; there can't be that many people still using it, and I'm not devoting any cycles to maintaining the package. I'm happy to do the dirty deed unless there's a strong objection. Lincoln On Wed, May 4, 2011 at 8:50 PM, Christopher Fields > wrote: (cc'ing the main bioperl list) Amy, For future correspondence you should contact the bioperl mailing list. You can subscribe to it here if needed: http://lists.open-bio.org/mailman/listinfo/bioperl-l As for the modules in question (AcePerl and GraphViz), these aren't absolutely required for most BioPerl functionality; tests that required them are designed to skip if the modules aren't present. Have you tried *not* installing those and running tests? chris On May 4, 2011, at 6:08 PM, Powell Phd, Amy Jo wrote: Hi there, FY: I made another attempt at installing an Ace package that BioPerl can (hopefully) use, to no avail: LDS/AcePerl-1.92.tar.gz /usr/bin/make test -- NOT OK //hint// to see the cpan-testers results for installing this module, try: reports LDS/AcePerl-1.92.tar.gz Running make install make test had returned bad status, won't install without force Failed during this command: LDS/AcePerl-1.92.tar.gz : make_test NO _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Lincoln D. 
Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa

>
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l

From rmb32 at cornell.edu Fri May 6 16:35:30 2011
From: rmb32 at cornell.edu (Robert Buels)
Date: Fri, 06 May 2011 13:35:30 -0700
Subject: [Bioperl-l] Bioperl
In-Reply-To:
References: <31542007.post@talk.nabble.com> <31542491.post@talk.nabble.com> <182E51ED-754D-4D32-BB28-291C377F333B@illinois.edu>
Message-ID: <4DC45B92.90900@cornell.edu>

Egad. Whether or not that's what it actually is, stuff like this is what gives Perl a bad name. Newbies: if you run across things like that in Perl that you are trying to maintain, find the author and hit them.

R

On 05/05/2011 05:40 AM, Luis-Miguel Rodríguez Rojas wrote:
> Hash tied to named capture buffer

From rutgeraldo at gmail.com Tue May 10 09:47:38 2011
From: rutgeraldo at gmail.com (Rutger Vos)
Date: Tue, 10 May 2011 14:47:38 +0100
Subject: [Bioperl-l] Announcement: Registration open for Computational Phyloinformatics course
Message-ID:

COMPUTATIONAL PHYLOINFORMATICS
August 1 2011 through August 11 2011
Bioinformatics Center of Kyoto University
Application Deadline: May 31, 2011
http://academy.nescent.org/wiki/Computational_phyloinformatics

Computational Phyloinformatics is an 11-day international course (August 1-11, 2011) co-organized by the Computational Biology Research Center (CBRC/AIST), the Bioinformatics Center of Kyoto University, the Database Center for Life Science (DBCLS/JST), and the National Evolutionary Synthesis Center (NESCent). This course, which will take place at Kyoto University directly following the SMBE Meeting (http://smbe2011.com/), aims to give participants practical knowledge and hands-on skills in phyloinformatics.
The venue in Kyoto is completely unaffected by the unfortunate events in Fukushima and the power shortages in Tokyo. We encourage biologists from other countries to participate in the SMBE meeting and/or this special international course, in solidarity with the scientific community of Japan in their effort to return to normalcy and to help minimize any negative impacts that the earthquake may have on scientific activities in Japan. SYNOPSIS Biologists are faced with ever-larger datasets, more complex evolutionary models, and increasingly elaborate analytical methods. Seldom is it sufficient to run a dataset with an off-the-shelf program on a desktop PC; increasingly, biologists need to write scripts to interface with internet services and databases, build analytical pipelines, customize analyses, and distribute computation over multiple processors. This course is designed for graduate students, postdocs, faculty, and researchers in phylogenetics interested in receiving practical, hands-on training in the use of Perl and SQL for workflows and applications in phyloinformatics. The course is divided into four parts: PART I: A tutorial review of Perl, including object oriented programming and building packages. PART II: Introduction and practical use of BioPerl and Bio::Phylo, (e.g. scripting for large tree inference engines, automating model testing, genomic-scale data mining and acquisition, supertree assembly, rate smoothing and branch calibration, tree traversal, etc). PART III: Introduction and practical use of BioRuby for molecular evolution and functional genomics (e.g. scripting multiple sequence alignment, gene duplication inference, tree inference, etc.). PART IV: Introduction to SQL and database design; computing and querying nested sets and transitive closure; querying both large trees (e.g. NCBI) and large collections of trees (e.g. TreeBASE). 
Participants will learn how to write basic phylogenetic or comparative analysis scripts, parse NEXUS files, traverse and compute over trees, and make practical use of phylogenetic software libraries. These skills will be learned in a biological context, touching on a diverse array of topics such as analysis of large datasets, automation of supertree assembly, querying for topological patterns in large collections of trees, etc. Participants will leave the course with a full set of installations and libraries on their computer ready to build phyloinformatic workflows for their own research projects, as well as continued access to a 50+ page wiki "textbook" containing step-by-step instructions, problem sets, and examples.

INSTRUCTORS AND COURSE ORGANIZERS
Christian Zmasek, Karen Cranston, Rutger A. Vos, Susumu Goto, Toshiaki Katayama, William H. Piel

APPLICATION DEADLINE
May 31, 2011

TUITION
¥40,000 (~$500)
Participants are responsible for their own travel costs, including transportation and accommodation -- see the website for more information. International participants will benefit by combining attendance with the 2011 SMBE meeting. A limited number of travel scholarships from NESCent are available for US-based students. Preference will be given to students from under-represented minorities.

SUBSIDIES AND SCHOLARSHIPS
A limited number of travel scholarships from NESCent are available for US-based students. Preference will be given to students from under-represented minorities. The Asia-Pacific Bioinformatics Network (APBioNet) is happy to provide travel assistance for a limited number of students/early career researchers from the Asia-Pacific region. Applicants are requested to contact Dr Asif Khan, APBioNet Secretariat: asif -$- bic.nus.edu.sg (replace -$- with @) for details.

PREREQUISITES
BIOLOGY: A good understanding of phylogenetics,
for example, having already taken the Workshop on Molecular Evolution (http://www.molecularevolution.org/) or equivalent coursework or experience.
COMPUTING: Prior experience with Perl or careful study of the suggested reading materials in advance of the class (see web site). Participants should have some experience with basic Unix shell commands.
EQUIPMENT: Participants are expected to bring their own Mac OS X or Linux computer; otherwise they will be provided with an iMac. Participants who cannot bring their own computer and will be using a supplied iMac should consider bringing their own portable FireWire/USB drive so that they can also leave the course with a full suite of phyloinformatic software tools.

--
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading RG6 6BX
United Kingdom
Tel: +44 (0) 118 378 7535
http://www.nexml.org
http://rutgervos.blogspot.com

From lthiberiol at gmail.com Tue May 10 11:36:11 2011
From: lthiberiol at gmail.com (Luiz Thiberio Rangel)
Date: Tue, 10 May 2011 12:36:11 -0300
Subject: [Bioperl-l] Problem rendering a BLAST result
Message-ID:

Hi...

I've been trying to parse a BLAST output and generate a figure based on its result.
I'm using this example (http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output), but when I execute it I get the following error message:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: asking for tag value that does not exist bgcolor
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:368
STACK: Bio::SeqFeature::Generic::get_tag_values /usr/local/share/perl/5.10.1/Bio/SeqFeature/Generic.pm:517
STACK: Bio::Graphics::Glyph::bgcolor /usr/local/share/perl/5.10.1/Bio/Graphics/Glyph.pm:703
STACK: Bio::Graphics::Glyph::graded_segments::bgcolor /usr/local/share/perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:124
STACK: Bio::Graphics::Glyph::graded_segments::draw /usr/local/share/perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:59
STACK: Bio::Graphics::Glyph::track::draw /usr/local/share/perl/5.10.1/Bio/Graphics/Glyph/track.pm:35
STACK: Bio::Graphics::Panel::gd /usr/local/share/perl/5.10.1/Bio/Graphics/Panel.pm:588
STACK: Bio::Graphics::Panel::png /usr/local/share/perl/5.10.1/Bio/Graphics/Panel.pm:1067
STACK: yeah.pl:68
-----------------------------------------------------------

Thanks!

--
Luiz Thibério Rangel

From lthiberiol at gmail.com Tue May 10 09:20:31 2011
From: lthiberiol at gmail.com (lthiberiol)
Date: Tue, 10 May 2011 06:20:31 -0700 (PDT)
Subject: [Bioperl-l] Problem rendering a BLAST result
Message-ID: <75954ae2-2b23-4866-bcce-9c97644c6f84@t16g2000vbi.googlegroups.com>

Hi...

I've been trying to parse a BLAST output and generate a figure based on its result.
I'm using this example (http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output), but when I execute it I get the following error message:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: asking for tag value that does not exist bgcolor
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:368
STACK: Bio::SeqFeature::Generic::get_tag_values /usr/local/share/perl/5.10.1/Bio/SeqFeature/Generic.pm:517
STACK: Bio::Graphics::Glyph::bgcolor /usr/local/share/perl/5.10.1/Bio/Graphics/Glyph.pm:703
STACK: Bio::Graphics::Glyph::graded_segments::bgcolor /usr/local/share/perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:124
STACK: Bio::Graphics::Glyph::graded_segments::draw /usr/local/share/perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:59
STACK: Bio::Graphics::Glyph::track::draw /usr/local/share/perl/5.10.1/Bio/Graphics/Glyph/track.pm:35
STACK: Bio::Graphics::Panel::gd /usr/local/share/perl/5.10.1/Bio/Graphics/Panel.pm:588
STACK: Bio::Graphics::Panel::png /usr/local/share/perl/5.10.1/Bio/Graphics/Panel.pm:1067
STACK: yeah.pl:68
-----------------------------------------------------------

Thanks!

From belaid_moa at hotmail.com Tue May 10 16:40:31 2011
From: belaid_moa at hotmail.com (Belaid MOA)
Date: Tue, 10 May 2011 20:40:31 +0000
Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc()
Message-ID:

Dear All,

I installed the latest version of BioPerl and ran a very simple script: it goes through each line (an ACC) in a file and uses GenBank to get the sequence via get_Seq_by_acc(). A look at ps shows that a lot of zombie processes (with attribute) were created. The list grows over time. This means that Bio::DB::GenBank is forking and not cleaning up the children. Is there any way to overcome the issue? Moreover, is there any way to specify the number of forked processes?

With best regards,
-Belaid.
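
For readers hitting the same symptom, here is a minimal, generic sketch of how a Perl parent process can reap finished children so they do not linger as zombies. It is an illustration under stated assumptions, not a description of what Bio::DB::GenBank does internally: the accession list is made up, the child does no real work, and reap_finished_children() is a hypothetical helper written for this example. Only core Perl (fork, wait, waitpid and POSIX's WNOHANG) is used.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# Collect every child that has already exited, without blocking.
# Returns the number of children reaped on this call.
sub reap_finished_children {
    my $reaped = 0;
    # waitpid(-1, WNOHANG) returns a PID for each finished child,
    # 0 if children exist but none has exited yet, and -1 if there
    # are no children at all.
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $reaped++;
    }
    return $reaped;
}

my @accessions = qw(ACC0001 ACC0002 ACC0003);    # hypothetical IDs
my $collected  = 0;

for my $acc (@accessions) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: stand-in for whatever work a forked fetch would do.
        exit 0;
    }
    # Parent: clean up any children that have finished so far.
    $collected += reap_finished_children();
}

# Block until the remaining children are done, reaping each one.
while ((my $pid = wait()) > 0) {
    $collected++;
}

print "reaped $collected children\n";
```

Calling the non-blocking reaper once per loop iteration keeps the zombie list from growing while the script runs; the final blocking loop catches any stragglers. If the forking happens inside a library rather than in your own code, it may also be worth checking that library's documentation for a non-forking retrieval mode; whether one applies here is an assumption to verify, not something confirmed in this thread.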
From scott at scottcain.net Tue May 10 16:50:31 2011
From: scott at scottcain.net (Scott Cain)
Date: Tue, 10 May 2011 16:50:31 -0400
Subject: [Bioperl-l] Problem rendering a BLAST result
In-Reply-To: <75954ae2-2b23-4866-bcce-9c97644c6f84@t16g2000vbi.googlegroups.com>
References: <75954ae2-2b23-4866-bcce-9c97644c6f84@t16g2000vbi.googlegroups.com>
Message-ID:

I'd say you're asking for a tag value that doesn't exist (bgcolor) :-)

Perhaps a code snippet and sample data that we could take a look at would help.

Scott

On Tue, May 10, 2011 at 9:20 AM, lthiberiol wrote:
> Hi...
>
> I've been trying to parse a BLAST output and generate a figure based
> on it's result. I'm using this example (http://www.bioperl.org/wiki/
> HOWTO:Graphics#Parsing_Real_BLAST_Output), but when I execute it i get
> the following error message:
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: asking for tag value that does not exist bgcolor
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.1/Bio/Root/
> Root.pm:368
> STACK: Bio::SeqFeature::Generic::get_tag_values /usr/local/share/perl/
> 5.10.1/Bio/SeqFeature/Generic.pm:517
> STACK: Bio::Graphics::Glyph::bgcolor /usr/local/share/perl/5.10.1/Bio/
> Graphics/Glyph.pm:703
> STACK: Bio::Graphics::Glyph::graded_segments::bgcolor /usr/local/share/
> perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:124
> STACK: Bio::Graphics::Glyph::graded_segments::draw /usr/local/share/
> perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:59
> STACK: Bio::Graphics::Glyph::track::draw /usr/local/share/perl/5.10.1/
> Bio/Graphics/Glyph/track.pm:35
> STACK: Bio::Graphics::Panel::gd /usr/local/share/perl/5.10.1/Bio/
> Graphics/Panel.pm:588
> STACK: Bio::Graphics::Panel::png /usr/local/share/perl/5.10.1/Bio/
> Graphics/Panel.pm:1067
> STACK: yeah.pl:68
> -----------------------------------------------------------
>
> Thanks!
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                        scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)       216-392-3087
Ontario Institute for Cancer Research

From Kevin.M.Brown at asu.edu Tue May 10 16:57:04 2011
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Tue, 10 May 2011 13:57:04 -0700
Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc()
In-Reply-To:
References:
Message-ID: <1A4207F8295607498283FE9E93B775B4079C3AB5@EX02.asurite.ad.asu.edu>

Seeing your code might help. They could just be forked children waiting for the script to exit before they go away or something else forked them and failed to clean up before quitting.

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Belaid MOA
> Sent: Tuesday, May 10, 2011 1:41 PM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc()
>
>
> Dear All,
> I installed the latest version of BioPerl and I ran a very simple
> code: it goes through each line (an ACC) in a file and uses GenBank to
> get the sequence
> via get_Seq_by_acc(). A look at ps shows that there were a lot of
> zombie processes (with attribute) created. The list grows
> with the time.
> This means that Bio:DB:GenBank is forking and not cleaning the
> children. Is there any way to overcome the issue? Moreover, is there
> any way
> to specify the number of forked processes?
>
> With best regards,
> -Belaid.
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From Kevin.M.Brown at asu.edu Tue May 10 16:55:34 2011 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Tue, 10 May 2011 13:55:34 -0700 Subject: [Bioperl-l] Problem rendering a BLAST result In-Reply-To: <75954ae2-2b23-4866-bcce-9c97644c6f84@t16g2000vbi.googlegroups.com> References: <75954ae2-2b23-4866-bcce-9c97644c6f84@t16g2000vbi.googlegroups.com> Message-ID: <1A4207F8295607498283FE9E93B775B4079C3AB4@EX02.asurite.ad.asu.edu> Would help if we can see your code as line 68 of yeah.pl is the final line of the script you linked. It is trying to set the bgcolor based on a value that it thinks should be held in the Bio::SeqFeature::Generic object as a tag, but no such tag exists. > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of lthiberiol > Sent: Tuesday, May 10, 2011 6:21 AM > To: bioperl-l at bioperl.org > Subject: [Bioperl-l] Problem rendering a BLAST result > > Hi... > > I've been trying to parse a BLAST output and generate a figure based > on it's result. 
I'm using this example (http://www.bioperl.org/wiki/ > HOWTO:Graphics#Parsing_Real_BLAST_Output), but when I execute it i get > the following error message: > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: asking for tag value that does not exist bgcolor > STACK: Error::throw > STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.1/Bio/Root/ > Root.pm:368 > STACK: Bio::SeqFeature::Generic::get_tag_values /usr/local/share/perl/ > 5.10.1/Bio/SeqFeature/Generic.pm:517 > STACK: Bio::Graphics::Glyph::bgcolor /usr/local/share/perl/5.10.1/Bio/ > Graphics/Glyph.pm:703 > STACK: Bio::Graphics::Glyph::graded_segments::bgcolor /usr/local/share/ > perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:124 > STACK: Bio::Graphics::Glyph::graded_segments::draw /usr/local/share/ > perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:59 > STACK: Bio::Graphics::Glyph::track::draw /usr/local/share/perl/5.10.1/ > Bio/Graphics/Glyph/track.pm:35 > STACK: Bio::Graphics::Panel::gd /usr/local/share/perl/5.10.1/Bio/ > Graphics/Panel.pm:588 > STACK: Bio::Graphics::Panel::png /usr/local/share/perl/5.10.1/Bio/ > Graphics/Panel.pm:1067 > STACK: yeah.pl:68 > ----------------------------------------------------------- > > Thanks! > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From lthiberiol at gmail.com Tue May 10 20:14:52 2011 From: lthiberiol at gmail.com (Luiz Thiberio Rangel) Date: Tue, 10 May 2011 21:14:52 -0300 Subject: [Bioperl-l] Problem rendering a BLAST result In-Reply-To: <1A4207F8295607498283FE9E93B775B4079C3AB4@EX02.asurite.ad.asu.edu> References: <75954ae2-2b23-4866-bcce-9c97644c6f84@t16g2000vbi.googlegroups.com> <1A4207F8295607498283FE9E93B775B4079C3AB4@EX02.asurite.ad.asu.edu> Message-ID: The code is this one --> http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output. The same used in this example. 

On Tue, May 10, 2011 at 5:55 PM, Kevin Brown wrote:

> Would help if we can see your code as line 68 of yeah.pl is the final
> line of the script you linked.
>
> It is trying to set the bgcolor based on a value that it thinks should
> be held in the Bio::SeqFeature::Generic object as a tag, but no such tag
> exists.
>
> > -----Original Message-----
> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> > bounces at lists.open-bio.org] On Behalf Of lthiberiol
> > Sent: Tuesday, May 10, 2011 6:21 AM
> > To: bioperl-l at bioperl.org
> > Subject: [Bioperl-l] Problem rendering a BLAST result
> >
> > Hi...
> >
> > I've been trying to parse a BLAST output and generate a figure based
> > on it's result. I'm using this example
> > (http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output),
> > but when I execute it i get the following error message:
> >
> > ------------- EXCEPTION: Bio::Root::Exception -------------
> > MSG: asking for tag value that does not exist bgcolor
> > STACK: Error::throw
> > STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:368
> > STACK: Bio::SeqFeature::Generic::get_tag_values /usr/local/share/perl/5.10.1/Bio/SeqFeature/Generic.pm:517
> > STACK: Bio::Graphics::Glyph::bgcolor /usr/local/share/perl/5.10.1/Bio/Graphics/Glyph.pm:703
> > STACK: Bio::Graphics::Glyph::graded_segments::bgcolor /usr/local/share/perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:124
> > STACK: Bio::Graphics::Glyph::graded_segments::draw /usr/local/share/perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:59
> > STACK: Bio::Graphics::Glyph::track::draw /usr/local/share/perl/5.10.1/Bio/Graphics/Glyph/track.pm:35
> > STACK: Bio::Graphics::Panel::gd /usr/local/share/perl/5.10.1/Bio/Graphics/Panel.pm:588
> > STACK: Bio::Graphics::Panel::png /usr/local/share/perl/5.10.1/Bio/Graphics/Panel.pm:1067
> > STACK: yeah.pl:68
> > -----------------------------------------------------------
> >
> > Thanks!
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
Luiz Thibério Rangel

From shachigahoimbi at gmail.com  Wed May 11 02:20:37 2011
From: shachigahoimbi at gmail.com (Shachi Gahoi)
Date: Wed, 11 May 2011 11:50:37 +0530
Subject: [Bioperl-l] Run PROSITE using bioperl
Message-ID:

I want to run PROSITE for my sequence remotely ( I want to run PROSITE
through my script) . Is there any module in BioPerl to run PROSITE and also
I want to parse "Domain signature" from PROSITE output.

Please tell me if anyone knows about module to run PROSITE and parse protein
domain signature form PROSITE output.

Thanks in advance

--
Regards,
Shachi

From David.Messina at sbc.su.se  Wed May 11 05:15:15 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 11 May 2011 11:15:15 +0200
Subject: [Bioperl-l] Run PROSITE using bioperl
In-Reply-To:
References:
Message-ID:

Hi Shachi,

I thought PROSITE was a database rather than a program, but I might be
mistaken about that.

http://www.expasy.org/prosite/

There is an (old) program for comparing a sequence against the PROSITE
library called pfscan, which I believe can be run with bioperl-run using
Bio::Tools::Run::Profile

http://doc.bioperl.org/bioperl-run/lib/Bio/Tools/Run/Profile.html

Dave

On Wed, May 11, 2011 at 08:20, Shachi Gahoi wrote:

> I want to run PROSITE for my sequence remotely ( I want to run PROSITE
> through my script) . Is there any module in BioPerl to run PROSITE and also
> I want to parse "Domain signature" from PROSITE output.
>
> Please tell me if anyone knows about module to run PROSITE and parse protein
> domain signature form PROSITE output.
>
> Thanks in advance
>
> --
> Regards,
> Shachi
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From jun.yin at ucd.ie  Wed May 11 05:42:55 2011
From: jun.yin at ucd.ie (Jun Yin)
Date: Wed, 11 May 2011 10:42:55 +0100
Subject: [Bioperl-l] Run PROSITE using bioperl
In-Reply-To:
References:
Message-ID: <039f01cc0fbf$cdf61b10$69e25130$%yin@ucd.ie>

Hi,

I don't think there is any module related with PROSITE in BioPerl. However,
PROSITE is based on RESTful service, which is a standard website structure.
You can use BioPerl or Perl packages to fetch the data by yourself.

I wrote a module last year for PROSITE using LWP and HTTP packages, though
it is not published with BioPerl. You can try this link and edit the code by
yourself. If you can not see the code, just let me know.

https://github.com/yinjun111/bioperl-live/blob/master/Bio/DB/Align/Prosite.pm

Cheers,
Jun Yin
Ph.D. student in U.C.D.

Bioinformatics Laboratory
Conway Institute
University College Dublin

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Shachi Gahoi
Sent: Wednesday, May 11, 2011 7:21 AM
To: bioperl-l at lists.open-bio.org
Subject: [Bioperl-l] Run PROSITE using bioperl

I want to run PROSITE for my sequence remotely ( I want to run PROSITE
through my script) . Is there any module in BioPerl to run PROSITE and also
I want to parse "Domain signature" from PROSITE output.

Please tell me if anyone knows about module to run PROSITE and parse protein
domain signature form PROSITE output.

Thanks in advance

--
Regards,
Shachi
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l

From lmrodriguezr at gmail.com  Wed May 11 10:45:13 2011
From: lmrodriguezr at gmail.com (Luis-Miguel Rodríguez Rojas)
Date: Wed, 11 May 2011 16:45:13 +0200
Subject: [Bioperl-l] Splitting BLAST report
Message-ID:

Dear all,

Is there a way to split BlastXML report with multiple queries into several
BlastXML reports with one query each?

So far, I have something similar to the following code:

#!/usr/bin/perl
use strict;
use Bio::SearchIO;

# $file contains the output file
# $severalQueries contain the queries
# %args contains other BLAST parameters
# [...] First, run the large BLAST:
my $factory = Bio::Tools::Run::StandAloneBlast->new(%args);
$factory->o($file);
$factory->m(7); # BlastXML
my $report = $factory->blastall($severalQueries);

# $dir is the output directory
# [...] Now, in another script, or another part of the script:
my $report = Bio::SearchIO->new(-file=>$file, -format=>'blastxml');
mkdir $dir unless -d $dir;
while(my $result = $report->next_result){
   my $newFile = $dir."/".$result->query_accession.".xml";
   my $searchIO = Bio::SearchIO->new(-file=>">$newFile",
                                     -output_format=>'blastxml');
   $searchIO->write_result($result);
}

# [...]
__END__

The BLAST runs correctly, and the first output (the large XML) is there.
However, the second part fails with the following message, clearly stating
that I can't create a Bio::SearchIO object with output format BlastXML:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Failed to load module Bio::SearchIO::Writer::blastxml.
Can't locate Bio/SearchIO/Writer/blastxml.pm in @INC (@INC contains:
/home/equipe/resistance/lrodrigu/Runbox/Unus/lib
/home/equipe/resistance/lrodrigu/lib/perl5
/home/equipe/resistance/lrodrigu/lib/perl5/5.8.8
/home/equipe/resistance/lrodrigu/lib/perl5/site_perl/5.8.8
/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl
/usr/lib/perl5/5.8.8/i386-linux-thread-multi /usr/lib/perl5/5.8.8 .) at
/usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm line 439, line 48675.
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:368
STACK: Bio::Root::Root::_load_module /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:441
STACK: Bio::SearchIO::new /usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO.pm:180
STACK: Unus::Blast::run /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Blast.pm:88
STACK: Unus::Orth::BsrAuto::extract_values /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/BsrAuto.pm:47
STACK: Unus::Orth::BsrAuto::thresholds /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/BsrAuto.pm:31
STACK: Unus::Orth::Bsr::build_orthref_file /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/Bsr.pm:42
STACK: Unus::Orth::Bsr::run /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/Bsr.pm:25
STACK: Unus::Unus::calculate_orthologs /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Unus.pm:175
STACK: Unus::Unus::run /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Unus.pm:149
STACK: /home/equipe/resistance/lrodrigu/bin/unus2:45
-----------------------------------------------------------

I checked the supported output formats (Bio::SearchIO::Writer::*) and none
of the supported formats seem to be either BlastXML or Blast, so I assume I
am no in the right direction.

Thanks in advance!

Best,
LRR

--
Luis M.
Rodriguez-R
[ http://thebio.me/lrr ]
---------------------------------
UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible
Institut de Recherche pour le Développement, Montpellier, France
[ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ]
+33 (0) 6.29.74.55.93

Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
Universidad de Los Andes, Bogotá, Colombia
[ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ]
+57 (1) 3.39.49.49 ext 2777

From David.Messina at sbc.su.se  Wed May 11 11:28:39 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 11 May 2011 17:28:39 +0200
Subject: [Bioperl-l] Splitting BLAST report
In-Reply-To:
References:
Message-ID:

Hi Luis-Miguel,

That's right, you can't write out blastxml from within BioPerl.

Have you looked at blast_formatter that comes with BLAST+, though? From the
manual:

"It may be helpful to view the same BLAST results in different formats. A
user may first parse the tabular format looking for matches meeting a
certain criteria, then go back and examine the relevant alignments in the
full BLAST report."

So you might be able to do what you want using command-line tools from the
NCBI.

Otherwise, xml is extremely structured, so if you just want to break it up
into files query by query, it would probably be pretty straightforward using
good old-fashioned Perl.

Dave

2011/5/11 Luis-Miguel Rodríguez Rojas

> Dear all,
>
> Is there a way to split BlastXML report with multiple queries into several
> BlastXML reports with one query each?
>
> So far, I have something similar to the following code:
>
> #!/usr/bin/perl
> use strict;
> use Bio::SearchIO;
>
> # $file contains the output file
> # $severalQueries contain the queries
> # %args contains other BLAST parameters
> # [...]
> # First, run the large BLAST:
> my $factory = Bio::Tools::Run::StandAloneBlast->new(%args);
> $factory->o($file);
> $factory->m(7); # BlastXML
> my $report = $factory->blastall($severalQueries);
>
> # $dir is the output directory
> # [...] Now, in another script, or another part of the script:
> my $report = Bio::SearchIO->new(-file=>$file, -format=>'blastxml');
> mkdir $dir unless -d $dir;
> while(my $result = $report->next_result){
>    my $newFile = $dir."/".$result->query_accession.".xml";
>    my $searchIO = Bio::SearchIO->new(-file=>">$newFile",
>                                      -output_format=>'blastxml');
>    $searchIO->write_result($result);
> }
>
> # [...]
> __END__
>
> The BLAST runs correctly, and the first output (the large XML) is there.
> However, the second part fails with the following message, clearly stating
> that I can't create a Bio::SearchIO object with output format BlastXML:
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: Failed to load module Bio::SearchIO::Writer::blastxml.
> Can't locate Bio/SearchIO/Writer/blastxml.pm in @INC (@INC contains:
> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib
> /home/equipe/resistance/lrodrigu/lib/perl5
> /home/equipe/resistance/lrodrigu/lib/perl5/5.8.8
> /home/equipe/resistance/lrodrigu/lib/perl5/site_perl/5.8.8
> /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi
> /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl
> /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi
> /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl
> /usr/lib/perl5/5.8.8/i386-linux-thread-multi /usr/lib/perl5/5.8.8 .) at
> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm line 439, line 48675.
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:368
> STACK: Bio::Root::Root::_load_module /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:441
> STACK: Bio::SearchIO::new /usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO.pm:180
> STACK: Unus::Blast::run /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Blast.pm:88
> STACK: Unus::Orth::BsrAuto::extract_values /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/BsrAuto.pm:47
> STACK: Unus::Orth::BsrAuto::thresholds /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/BsrAuto.pm:31
> STACK: Unus::Orth::Bsr::build_orthref_file /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/Bsr.pm:42
> STACK: Unus::Orth::Bsr::run /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/Bsr.pm:25
> STACK: Unus::Unus::calculate_orthologs /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Unus.pm:175
> STACK: Unus::Unus::run /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Unus.pm:149
> STACK: /home/equipe/resistance/lrodrigu/bin/unus2:45
> -----------------------------------------------------------
>
> I checked the supported output formats (Bio::SearchIO::Writer::*) and none
> of the supported formats seem to be either BlastXML or Blast, so I assume I
> am no in the right direction.
>
> Thanks in advance!
>
> Best,
> LRR
>
> --
> Luis M.
> Rodriguez-R
> [ http://thebio.me/lrr ]
> ---------------------------------
> UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible
> Institut de Recherche pour le Développement, Montpellier, France
> [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ]
> +33 (0) 6.29.74.55.93
>
> Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
> Universidad de Los Andes, Bogotá, Colombia
> [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ]
> +57 (1) 3.39.49.49 ext 2777
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From lmrodriguezr at gmail.com  Wed May 11 11:32:50 2011
From: lmrodriguezr at gmail.com (Luis-Miguel Rodríguez Rojas)
Date: Wed, 11 May 2011 17:32:50 +0200
Subject: [Bioperl-l] Splitting BLAST report
In-Reply-To:
References:
Message-ID:

Hello Dave,

Thanks for your answer. Yes, I am currently working in your second
suggestion. It is pretty simple, indeed.

Thanks!
LRR

--
Luis M. Rodriguez-R
[ http://thebio.me/lrr ]
---------------------------------
UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible
Institut de Recherche pour le Développement, Montpellier, France
[ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ]
+33 (0) 6.29.74.55.93

Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
Universidad de Los Andes, Bogotá, Colombia
[ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ]
+57 (1) 3.39.49.49 ext 2777

2011/5/11 Dave Messina

> Hi Luis-Miguel,
>
> That's right, you can't write out blastxml from within BioPerl.
>
> Have you looked at blast_formatter that comes with BLAST+, though? From the
> manual:
>
> "It may be helpful to view the same BLAST results in different formats.
> A user may first parse the tabular format looking for matches meeting a
> certain criteria, then go back and examine the relevant alignments in the
> full BLAST report."
>
> So you might be able to do what you want using command-line tools from the
> NCBI.
>
> Otherwise, xml is extremely structured, so if you just want to break it up
> into files query by query, it would probably be pretty straightforward using
> good old-fashioned Perl.
>
>
> Dave
>
>
> 2011/5/11 Luis-Miguel Rodríguez Rojas
>
>> Dear all,
>>
>> Is there a way to split BlastXML report with multiple queries into several
>> BlastXML reports with one query each?
>>
>> So far, I have something similar to the following code:
>>
>> #!/usr/bin/perl
>> use strict;
>> use Bio::SearchIO;
>>
>> # $file contains the output file
>> # $severalQueries contain the queries
>> # %args contains other BLAST parameters
>> # [...] First, run the large BLAST:
>> my $factory = Bio::Tools::Run::StandAloneBlast->new(%args);
>> $factory->o($file);
>> $factory->m(7); # BlastXML
>> my $report = $factory->blastall($severalQueries);
>>
>> # $dir is the output directory
>> # [...] Now, in another script, or another part of the script:
>> my $report = Bio::SearchIO->new(-file=>$file, -format=>'blastxml');
>> mkdir $dir unless -d $dir;
>> while(my $result = $report->next_result){
>>    my $newFile = $dir."/".$result->query_accession.".xml";
>>    my $searchIO = Bio::SearchIO->new(-file=>">$newFile",
>>                                      -output_format=>'blastxml');
>>    $searchIO->write_result($result);
>> }
>>
>> # [...]
>> __END__
>>
>> The BLAST runs correctly, and the first output (the large XML) is there.
>> However, the second part fails with the following message, clearly stating
>> that I can't create a Bio::SearchIO object with output format BlastXML:
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: Failed to load module Bio::SearchIO::Writer::blastxml.
>> Can't locate Bio/SearchIO/Writer/blastxml.pm in @INC (@INC contains:
>> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib
>> /home/equipe/resistance/lrodrigu/lib/perl5
>> /home/equipe/resistance/lrodrigu/lib/perl5/5.8.8
>> /home/equipe/resistance/lrodrigu/lib/perl5/site_perl/5.8.8
>> /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi
>> /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl
>> /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi
>> /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl
>> /usr/lib/perl5/5.8.8/i386-linux-thread-multi /usr/lib/perl5/5.8.8 .) at
>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm line 439, line 48675.
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:368
>> STACK: Bio::Root::Root::_load_module /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:441
>> STACK: Bio::SearchIO::new /usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO.pm:180
>> STACK: Unus::Blast::run /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Blast.pm:88
>> STACK: Unus::Orth::BsrAuto::extract_values /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/BsrAuto.pm:47
>> STACK: Unus::Orth::BsrAuto::thresholds /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/BsrAuto.pm:31
>> STACK: Unus::Orth::Bsr::build_orthref_file /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/Bsr.pm:42
>> STACK: Unus::Orth::Bsr::run /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/Bsr.pm:25
>> STACK: Unus::Unus::calculate_orthologs /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Unus.pm:175
>> STACK: Unus::Unus::run /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Unus.pm:149
>> STACK: /home/equipe/resistance/lrodrigu/bin/unus2:45
>> -----------------------------------------------------------
>>
>> I checked the supported output formats (Bio::SearchIO::Writer::*) and none
>> of the supported formats seem to be either BlastXML or Blast, so
>> I assume I am no in the right direction.
>>
>> Thanks in advance!
>>
>> Best,
>> LRR
>>
>> --
>> Luis M. Rodriguez-R
>> [ http://thebio.me/lrr ]
>> ---------------------------------
>> UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible
>> Institut de Recherche pour le Développement, Montpellier, France
>> [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ]
>> +33 (0) 6.29.74.55.93
>>
>> Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
>> Universidad de Los Andes, Bogotá, Colombia
>> [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ]
>> +57 (1) 3.39.49.49 ext 2777
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
>

From ocarnorsk138 at gmail.com  Wed May 11 18:16:15 2011
From: ocarnorsk138 at gmail.com (O'car Johann Campos)
Date: Wed, 11 May 2011 22:16:15 +0000 (UTC)
Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc()
References: <1A4207F8295607498283FE9E93B775B4079C3AB5@EX02.asurite.ad.asu.edu>
Message-ID:

Kevin Brown <Kevin.M.Brown at asu.edu> writes:

>
> Seeing your code might help. They could just be forked children waiting
> for the script to exit before they go away or something else forked them
> and failed to clean up before quitting.
>
> > -----Original Message-----
> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> > bounces at lists.open-bio.org] On Behalf Of Belaid MOA
> > Sent: Tuesday, May 10, 2011 1:41 PM
> > To: bioperl-l at lists.open-bio.org
> > Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc()
> >
> >
> > Dear All,
> > I installed the latest version of BioPerl and I ran a very simple
> > code: it goes through each line (an ACC) in a file and uses GenBank to
> > get the sequence
> > via get_Seq_by_acc().
> > A look at ps shows that there were a lot of
> > zombie processes (with attribute) created. The list grows
> > with the time.
> > This means that Bio:DB:GenBank is forking and not cleaning the
> > children. Is there any way to overcome the issue? Moreover, is there
> > any way
> > to specify the number of forked processes?
> >
> > With best regards,
> > -Belaid.
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

Kevin, Belaid, All:

    Recently I've been working with genbank too and ran a code to get
Genbank info from accession numbers, I also noticed the weird behavior and
the zombie processes that are in the background, altough the code works and
I get the info I need there are a lot of zombie processes in the background
and for example running this task with 8000 accession numbers would be a
pain where you all know. I'm not a bioperl expert and I may be missing some
piece of code to quit the forked children as may be happening to belaid, so
this is my piece of code in case any get and idea why is this happening.

http://pastebin.com/Zq88cpwb

Thanks in advance.
Cheers.

O'car.

From shachigahoimbi at gmail.com  Thu May 12 02:22:21 2011
From: shachigahoimbi at gmail.com (Shachi Gahoi)
Date: Thu, 12 May 2011 11:52:21 +0530
Subject: [Bioperl-l] parsing interproscan result
Message-ID:

I want to parse interproscan result and through parsing of interproscan
result I want PROSITE signature as an output.

Like, I entered a particular sequence in Interproscan and now i want to
parse a domain signature it contains.

So how can i parse a particular domain signature from interproscan output.

Is there any module in bioperl to parse a domain signature from interproscan
output.

Please tell me if anyone knows.

Thanks in advance

--
Regards,
Shachi

From David.Messina at sbc.su.se  Thu May 12 03:22:04 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 12 May 2011 09:22:04 +0200
Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc()
In-Reply-To:
References: <1A4207F8295607498283FE9E93B775B4079C3AB5@EX02.asurite.ad.asu.edu>
Message-ID:

Thanks for posting the code, O'car.

I haven't tried running it, but one thing that occurs to me is that on line
18 when you create your Bio::DB::Genbank object, there's no 'my', so those
objects may be hanging around longer than you expect. The zombies may be
those objects' forked processes for connecting to Genbank. Similar to what
Kevin said earlier.

But that's all speculation.

The other thing I'll say as a general comment is that fetching thousands of
records from Genbank this way (or really fetching any more than 100) is
inefficient and probably slow also.

Instead you might try using Genbank's own fetching tools, EUtilities, either
directly or via the two BioPerl interfaces to them (Bio::DB::EUtilities and
Bio::DB::SoapEUtilities).

Dave

On Thu, May 12, 2011 at 00:16, O'car Johann Campos wrote:

> Kevin Brown <Kevin.M.Brown at asu.edu> writes:
>
> >
> > Seeing your code might help. They could just be forked children waiting
> > for the script to exit before they go away or something else forked them
> > and failed to clean up before quitting.
> >
> > > -----Original Message-----
> > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> > > bounces at lists.open-bio.org] On Behalf Of Belaid MOA
> > > Sent: Tuesday, May 10, 2011 1:41 PM
> > > To: bioperl-l at lists.open-bio.org
> > > Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc()
> > >
> > >
> > > Dear All,
> > > I installed the latest version of BioPerl and I ran a very simple
> > > code: it goes through each line (an ACC) in a file and uses GenBank to
> > > get the sequence
> > > via get_Seq_by_acc().
> > > A look at ps shows that there were a lot of
> > > zombie processes (with attribute) created. The list grows
> > > with the time.
> > > This means that Bio:DB:GenBank is forking and not cleaning the
> > > children. Is there any way to overcome the issue? Moreover, is there
> > > any way
> > > to specify the number of forked processes?
> > >
> > > With best regards,
> > > -Belaid.
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
>
> Kevin, Belaid, All:
>
>     Recently I've been working with genbank too and ran a code to get
> Genbank info from accession numbers, I also noticed the weird behavior and
> the zombie processes that are in the background, altough the code works and
> I get the info I need there are a lot of zombie processes in the background
> and for example running this task with 8000 accession numbers would be a
> pain where you all know. I'm not a bioperl expert and I may be missing some
> piece of code to quit the forked children as may be happening to belaid, so
> this is my piece of code in case any get and idea why is this happening.
>
> http://pastebin.com/Zq88cpwb
>
> Thanks in advance.
> Cheers.
>
> O'car.
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From David.Messina at sbc.su.se  Thu May 12 03:32:37 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 12 May 2011 09:32:37 +0200
Subject: [Bioperl-l] parsing interproscan result
In-Reply-To:
References:
Message-ID:

Hi Shachi,

There is a SeqIO module for parsing InterProScan XML which I think will do
what you want.
Bio::SeqIO::interpro

The basic structure of your code will follow the SeqIO idiom (see the SeqIO
HOWTO), and the test file t/SeqIO/interpro.t has some bits of example code
that might be helpful for the interpro-specific stuff.

Dave

On Thu, May 12, 2011 at 08:22, Shachi Gahoi wrote:

> I want to parse interproscan result and through parsing of interproscan
> result I want PROSITE signature as an output.
>
> Like, I entered a particular sequence in Interproscan and now i want to
> parse a domain signature it contains.
>
> So how can i parse a particular domain signature from interproscan output.
>
> Is there any module in bioperl to parse a domain signature from interproscan
> output.
>
> Please tell me if anyone knows.
>
>
> Thanks in advance
>
> --
> Regards,
> Shachi
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From David.Messina at sbc.su.se  Thu May 12 05:39:39 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 12 May 2011 11:39:39 +0200
Subject: [Bioperl-l] parsing interproscan result
In-Reply-To:
References:
Message-ID:

Please remember to "reply all" so that this conversation stays on the
mailing list.

On Thu, May 12, 2011 at 11:32, Shachi Gahoi wrote:

> How to use Bio::SeqIO::interpromodule of bioperl.
>

As I said in my previous, email, see the SeqIO HOWTO:

http://www.bioperl.org/wiki/HOWTO:SeqIO

> In bioperl site only following code is available. Where is test file
> "t/SeqIO/interpro". Please help me.
>

It's part of the BioPerl distribution.

https://github.com/bioperl/bioperl-live/blob/master/t/SeqIO/interpro.t

Dave

> ################################################
>
> use strict;
> use Bio::SeqIO;
>
> my $io = Bio::SeqIO->new(-format => "interpro",
>                          -file   => $interpro_file);
>
> while (my $seq = $io->next_seq) {
>     # use the Sequence object
> }
>
> ##################################################
>
>
> On Thu, May 12, 2011 at 1:02 PM, Dave Messina wrote:
>
>> Hi Shachi,
>>
>> There is a SeqIO module for parsing InterProScan XML which I think will do
>> what you want. Bio::SeqIO::interpro
>>
>> The basic structure of your code will follow the SeqIO idiom (see the
>> SeqIO HOWTO), and the test file t/SeqIO/interpro.t has some bits of example
>> code that might be helpful for the interpro-specific stuff.
>>
>> Dave
>>
>>
>> On Thu, May 12, 2011 at 08:22, Shachi Gahoi wrote:
>>
>>> I want to parse interproscan result and through parsing of interproscan
>>> result I want PROSITE signature as an output.
>>>
>>> Like, I entered a particular sequence in Interproscan and now i want to
>>> parse a domain signature it contains.
>>>
>>> So how can i parse a particular domain signature from interproscan
>>> output.
>>>
>>> Is there any module in bioperl to parse a domain signature from
>>> interproscan
>>> output.
>>>
>>> Please tell me if anyone knows.
>>>
>>>
>>> Thanks in advance
>>>
>>> --
>>> Regards,
>>> Shachi
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>>
>
>
> --
> Regards,
> Shachi
>

From ocarnorsk138 at gmail.com  Thu May 12 09:56:59 2011
From: ocarnorsk138 at gmail.com (O'car Campos)
Date: Thu, 12 May 2011 09:56:59 -0400
Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc()
In-Reply-To:
References: <1A4207F8295607498283FE9E93B775B4079C3AB5@EX02.asurite.ad.asu.edu>
Message-ID:

Dave:

    Thanks for checking the code, I tried with what you said, adding a
"my" to line 18 but I still get the zombies. I was exaggerating with the
8000 genbank codes, also I didn't know about those other tools I will check
them, thanks for the tip. So I'm still in a zombieland.

Cheers.

O'car

On 12 May 2011 03:22, Dave Messina wrote:

> Thanks for posting the code, O'car.
>
> I haven't tried running it, but one thing that occurs to me is that on line
> 18 when you create your Bio::DB::Genbank object, there's no 'my', so those
> objects may be hanging around longer than you expect. The zombies may be
> those objects' forked processes for connecting to Genbank. Similar to what
> Kevin said earlier.
>
> But that's all speculation.
>
> The other thing I'll say as a general comment is that fetching thousands of
> records from Genbank this way (or really fetching any more than 100) is
> inefficient and probably slow also.
>
> Instead you might try using Genbank's own fetching tools, EUtilities,
> either directly or via the two BioPerl interfaces to them
> (Bio::DB::EUtilities and Bio::DB::SoapEUtilities).
>
>
> Dave
>
>
> On Thu, May 12, 2011 at 00:16, O'car Johann Campos wrote:
>
>> Kevin Brown <Kevin.M.Brown at asu.edu> writes:
>>
>> >
>> > Seeing your code might help.
They could just be forked children waiting >> > for the script to exit before they go away or something else forked them >> > and failed to clean up before quitting. >> > >> > > -----Original Message----- >> > > From: bioperl-l-bounces lists.open-bio.org [mailto:bioperl-l- >> > > bounces lists.open-bio.org] On Behalf Of Belaid MOA >> > > Sent: Tuesday, May 10, 2011 1:41 PM >> > > To: bioperl-l lists.open-bio.org >> > > Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc() >> > > >> > > >> > > Dear All, >> > > I installed the latest version of BioPerl and I ran a very simple >> > > code: it goes through each line (an ACC) in a file and uses GenBank to >> > > get the sequence >> > > via get_Seq_by_acc(). A look at ps shows that there were a lot of >> > > zombie processes (with attribute) created. The list grows >> > > with the time. >> > > This means that Bio:DB:GenBank is forking and not cleaning the >> > > children. Is there any way to overcome the issue? Moreover, is there >> > > any way >> > > to specify the number of forked processes? >> > > >> > > With best regards, >> > > -Belaid. >> > > >> > > _______________________________________________ >> > > Bioperl-l mailing list >> > > Bioperl-l lists.open-bio.org >> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > >> >> Kevin, Belaid, All: >> >> Recently I've been working with genbank too and ran a code to get >> Genbank info from accession numbers, I also noticed the weird behavior and >> the >> zombie processes that are in the background, altough the code works and I >> get >> the info I need there are a lot of zombie processes in the background and >> for >> example running this task with 8000 accession numbers would be a pain >> where you >> all know. I'm not a bioperl expert and I may be missing some piece of code >> to >> quit the forked children as may be happening to belaid, so this is my >> piece of >> code in case any get and idea why is this happening. 
>> >> http://pastebin.com/Zq88cpwb >> >> Thanks in advance. >> Cheers. >> >> O'car. >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > From cjfields at illinois.edu Thu May 12 10:27:43 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 12 May 2011 09:27:43 -0500 Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc() In-Reply-To: References: <1A4207F8295607498283FE9E93B775B4079C3AB5@EX02.asurite.ad.asu.edu> Message-ID: <4749257A-E9ED-4070-8D1A-2781140BDAB1@illinois.edu> All, Sorry for coming to the thread late; this has been reported as a bug: https://redmine.open-bio.org/issues/3200 I don't think the PIDs for the child processes are stored. Truthfully, I actually think running requests as a forked process 'under the hood' isn't a good idea even if it speeds things up for this very reason (kind of violates the least surprise rule). The forking should be done at a higher level by the user. That being said, one should not be hammering NCBI with tons of requests anyway (you will be blocked). This is mentioned explicitly in the POD. chris On May 12, 2011, at 8:56 AM, O'car Campos wrote: > Dave: > > Thanks for checking the code, I tried with what you said, adding a > "my" to line 18 but I still get the zombies. I was exaggerating with the > 8000 genbank codes, also I didn't know about those other tools I will check > them, thanks for the tip. So I'm still in a zombieland. > > Cheers. > > O'car > > > On 12 May 2011 03:22, Dave Messina wrote: > >> Thanks for posting the code, O'car. >> >> I haven't tried running it, but one thing that occurs to me is that on line >> 18 when you create your Bio::DB::Genbank object, there's no 'my', so those >> objects may be hanging around longer than you expect. The zombies may be >> those objects' forked processes for connecting to Genbank. Similar to what >> Kevin said earlier. 
>> >> But that's all speculation. >> >> The other thing I'll say as a general comment is that fetching thousands of >> records from Genbank this way (or really fetching any more than 100) is >> inefficient and probably slow also. >> >> Instead you might try using Genbank's own fetching tools, EUtilities, >> either directly or via the two BioPerl interfaces to them >> (Bio::DB::EUtilities and Bio::DB::SoapEUtilities). >> >> >> Dave >> >> >> >> >> On Thu, May 12, 2011 at 00:16, O'car Johann Campos >> wrote: >> >>> Kevin Brown asu.edu> writes: >>> >>>> >>>> Seeing your code might help. They could just be forked children waiting >>>> for the script to exit before they go away or something else forked them >>>> and failed to clean up before quitting. >>>> >>>>> -----Original Message----- >>>>> From: bioperl-l-bounces lists.open-bio.org [mailto:bioperl-l- >>>>> bounces lists.open-bio.org] On Behalf Of Belaid MOA >>>>> Sent: Tuesday, May 10, 2011 1:41 PM >>>>> To: bioperl-l lists.open-bio.org >>>>> Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc() >>>>> >>>>> >>>>> Dear All, >>>>> I installed the latest version of BioPerl and I ran a very simple >>>>> code: it goes through each line (an ACC) in a file and uses GenBank to >>>>> get the sequence >>>>> via get_Seq_by_acc(). A look at ps shows that there were a lot of >>>>> zombie processes (with attribute) created. The list grows >>>>> with the time. >>>>> This means that Bio:DB:GenBank is forking and not cleaning the >>>>> children. Is there any way to overcome the issue? Moreover, is there >>>>> any way >>>>> to specify the number of forked processes? >>>>> >>>>> With best regards, >>>>> -Belaid. 
>>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> >>> Kevin, Belaid, All: >>> >>> Recently I've been working with genbank too and ran a code to get >>> Genbank info from accession numbers, I also noticed the weird behavior and >>> the >>> zombie processes that are in the background, altough the code works and I >>> get >>> the info I need there are a lot of zombie processes in the background and >>> for >>> example running this task with 8000 accession numbers would be a pain >>> where you >>> all know. I'm not a bioperl expert and I may be missing some piece of code >>> to >>> quit the forked children as may be happening to belaid, so this is my >>> piece of >>> code in case any get and idea why is this happening. >>> >>> http://pastebin.com/Zq88cpwb >>> >>> Thanks in advance. >>> Cheers. >>> >>> O'car. >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From Kevin.M.Brown at asu.edu Thu May 12 11:20:58 2011 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Thu, 12 May 2011 08:20:58 -0700 Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc() In-Reply-To: References: <1A4207F8295607498283FE9E93B775B4079C3AB5@EX02.asurite.ad.asu.edu> Message-ID: <1A4207F8295607498283FE9E93B775B4079C3C4C@EX02.asurite.ad.asu.edu> Utilizing keywords like "my", "our", etc... is only useful in perl if you also "use strict;" in your script. This forces perl to deal with variable scope and should cause those threads to die when the parent object goes out of scope. 
Instead they aren't dying because their parent object never gets destroyed. In fact, going through your code showed that you had a large number of errors such as @lines[$i] is NOT how you get something out of an array. $lines[$i] is. http://pastebin.com/vx9Y0GW7 Kevin Brown Center for Innovations in Medicine Biodesign Institute Arizona State University > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of O'car Campos > Sent: Thursday, May 12, 2011 6:57 AM > To: Dave Messina > Cc: bioperl-l at bioperl.org > Subject: Re: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc() > > Dave: > > Thanks for checking the code, I tried with what you said, adding > a > "my" to line 18 but I still get the zombies. I was exaggerating with > the > 8000 genbank codes, also I didn't know about those other tools I will > check > them, thanks for the tip. So I'm still in a zombieland. > > Cheers. > > O'car > > > On 12 May 2011 03:22, Dave Messina wrote: > > > Thanks for posting the code, O'car. > > > > I haven't tried running it, but one thing that occurs to me is that > on line > > 18 when you create your Bio::DB::Genbank object, there's no 'my', so > those > > objects may be hanging around longer than you expect. The zombies may > be > > those objects' forked processes for connecting to Genbank. Similar to > what > > Kevin said earlier. > > > > But that's all speculation. > > > > The other thing I'll say as a general comment is that fetching > thousands of > > records from Genbank this way (or really fetching any more than 100) > is > > inefficient and probably slow also. > > > > Instead you might try using Genbank's own fetching tools, EUtilities, > > either directly or via the two BioPerl interfaces to them > > (Bio::DB::EUtilities and Bio::DB::SoapEUtilities). 
> > > > > > Dave > > > > > > > > > > On Thu, May 12, 2011 at 00:16, O'car Johann Campos > > > wrote: > > > >> Kevin Brown asu.edu> writes: > >> > >> > > >> > Seeing your code might help. They could just be forked children > waiting > >> > for the script to exit before they go away or something else > forked them > >> > and failed to clean up before quitting. > >> > > >> > > -----Original Message----- > >> > > From: bioperl-l-bounces lists.open-bio.org [mailto:bioperl- > l- > >> > > bounces lists.open-bio.org] On Behalf Of Belaid MOA > >> > > Sent: Tuesday, May 10, 2011 1:41 PM > >> > > To: bioperl-l lists.open-bio.org > >> > > Subject: [Bioperl-l] Zombie processes with GenBank > get_Seq_by_acc() > >> > > > >> > > > >> > > Dear All, > >> > > I installed the latest version of BioPerl and I ran a very > simple > >> > > code: it goes through each line (an ACC) in a file and uses > GenBank to > >> > > get the sequence > >> > > via get_Seq_by_acc(). A look at ps shows that there were a lot > of > >> > > zombie processes (with attribute) created. The list > grows > >> > > with the time. > >> > > This means that Bio:DB:GenBank is forking and not cleaning the > >> > > children. Is there any way to overcome the issue? Moreover, is > there > >> > > any way > >> > > to specify the number of forked processes? > >> > > > >> > > With best regards, > >> > > -Belaid. 
> >> > > > >> > > _______________________________________________ > >> > > Bioperl-l mailing list > >> > > Bioperl-l lists.open-bio.org > >> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > > >> > >> Kevin, Belaid, All: > >> > >> Recently I've been working with genbank too and ran a code to > get > >> Genbank info from accession numbers, I also noticed the weird > behavior and > >> the > >> zombie processes that are in the background, altough the code works > and I > >> get > >> the info I need there are a lot of zombie processes in the > background and > >> for > >> example running this task with 8000 accession numbers would be a > pain > >> where you > >> all know. I'm not a bioperl expert and I may be missing some piece > of code > >> to > >> quit the forked children as may be happening to belaid, so this is > my > >> piece of > >> code in case any get and idea why is this happening. > >> > >> http://pastebin.com/Zq88cpwb > >> > >> Thanks in advance. > >> Cheers. > >> > >> O'car. > >> > >> > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From roy.chaudhuri at gmail.com Thu May 12 11:49:18 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Thu, 12 May 2011 16:49:18 +0100 Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc() In-Reply-To: <1A4207F8295607498283FE9E93B775B4079C3C4C@EX02.asurite.ad.asu.edu> References: <1A4207F8295607498283FE9E93B775B4079C3AB5@EX02.asurite.ad.asu.edu> <1A4207F8295607498283FE9E93B775B4079C3C4C@EX02.asurite.ad.asu.edu> Message-ID: <4DCC017E.4070709@gmail.com> That's not really true. Turning on strict just makes it a compile time error if you don't declare the scope of a variable. 
"my" and "our" will still work without turning on strict, but they are not mandatory. Of course, you should always use strict, it's there to help you. @lines[$i] is valid syntax, although you will get the message "Scalar value @lines[$i] better written as $lines[$i]" if warnings are switched on (again, this should always be the case). Roy. On 12/05/2011 16:20, Kevin Brown wrote: > Utilizing keywords like "my", "our", etc... is only useful in perl if > you also "use strict;" in your script. This forces perl to deal with > variable scope and should cause those threads to die when the parent > object goes out of scope. Instead they aren't dying because their parent > object never gets destroyed. > > In fact, going through your code showed that you had a large number of > errors such as @lines[$i] is NOT how you get something out of an array. > $lines[$i] is. > > http://pastebin.com/vx9Y0GW7 > > Kevin Brown > Center for Innovations in Medicine > Biodesign Institute > Arizona State University > >> -----Original Message----- >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >> bounces at lists.open-bio.org] On Behalf Of O'car Campos >> Sent: Thursday, May 12, 2011 6:57 AM >> To: Dave Messina >> Cc: bioperl-l at bioperl.org >> Subject: Re: [Bioperl-l] Zombie processes with GenBank > get_Seq_by_acc() >> >> Dave: >> >> Thanks for checking the code, I tried with what you said, > adding >> a >> "my" to line 18 but I still get the zombies. I was exaggerating with >> the >> 8000 genbank codes, also I didn't know about those other tools I will >> check >> them, thanks for the tip. So I'm still in a zombieland. >> >> Cheers. >> >> O'car >> >> >> On 12 May 2011 03:22, Dave Messina wrote: >> >>> Thanks for posting the code, O'car. >>> >>> I haven't tried running it, but one thing that occurs to me is that >> on line >>> 18 when you create your Bio::DB::Genbank object, there's no 'my', so >> those >>> objects may be hanging around longer than you expect. 
The zombies > may >> be >>> those objects' forked processes for connecting to Genbank. Similar > to >> what >>> Kevin said earlier. >>> >>> But that's all speculation. >>> >>> The other thing I'll say as a general comment is that fetching >> thousands of >>> records from Genbank this way (or really fetching any more than 100) >> is >>> inefficient and probably slow also. >>> >>> Instead you might try using Genbank's own fetching tools, > EUtilities, >>> either directly or via the two BioPerl interfaces to them >>> (Bio::DB::EUtilities and Bio::DB::SoapEUtilities). >>> >>> >>> Dave >>> >>> >>> >>> >>> On Thu, May 12, 2011 at 00:16, O'car Johann Campos >> >>> wrote: >>> >>>> Kevin Brown asu.edu> writes: >>>> >>>>> >>>>> Seeing your code might help. They could just be forked children >> waiting >>>>> for the script to exit before they go away or something else >> forked them >>>>> and failed to clean up before quitting. >>>>> >>>>>> -----Original Message----- >>>>>> From: bioperl-l-bounces lists.open-bio.org > [mailto:bioperl- >> l- >>>>>> bounces lists.open-bio.org] On Behalf Of Belaid MOA >>>>>> Sent: Tuesday, May 10, 2011 1:41 PM >>>>>> To: bioperl-l lists.open-bio.org >>>>>> Subject: [Bioperl-l] Zombie processes with GenBank >> get_Seq_by_acc() >>>>>> >>>>>> >>>>>> Dear All, >>>>>> I installed the latest version of BioPerl and I ran a very >> simple >>>>>> code: it goes through each line (an ACC) in a file and uses >> GenBank to >>>>>> get the sequence >>>>>> via get_Seq_by_acc(). A look at ps shows that there were a lot >> of >>>>>> zombie processes (with attribute) created. The list >> grows >>>>>> with the time. >>>>>> This means that Bio:DB:GenBank is forking and not cleaning the >>>>>> children. Is there any way to overcome the issue? Moreover, is >> there >>>>>> any way >>>>>> to specify the number of forked processes? >>>>>> >>>>>> With best regards, >>>>>> -Belaid. 
>>>>>> >>>>>> _______________________________________________ >>>>>> Bioperl-l mailing list >>>>>> Bioperl-l lists.open-bio.org >>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>> >>>> >>>> Kevin, Belaid, All: >>>> >>>> Recently I've been working with genbank too and ran a code > to >> get >>>> Genbank info from accession numbers, I also noticed the weird >> behavior and >>>> the >>>> zombie processes that are in the background, altough the code works >> and I >>>> get >>>> the info I need there are a lot of zombie processes in the >> background and >>>> for >>>> example running this task with 8000 accession numbers would be a >> pain >>>> where you >>>> all know. I'm not a bioperl expert and I may be missing some piece >> of code >>>> to >>>> quit the forked children as may be happening to belaid, so this is >> my >>>> piece of >>>> code in case any get and idea why is this happening. >>>> >>>> http://pastebin.com/Zq88cpwb >>>> >>>> Thanks in advance. >>>> Cheers. >>>> >>>> O'car. 
>>>> >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu May 12 12:38:10 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 12 May 2011 11:38:10 -0500 Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc() In-Reply-To: <4DCC017E.4070709@gmail.com> References: <1A4207F8295607498283FE9E93B775B4079C3AB5@EX02.asurite.ad.asu.edu> <1A4207F8295607498283FE9E93B775B4079C3C4C@EX02.asurite.ad.asu.edu> <4DCC017E.4070709@gmail.com> Message-ID: <3B4A69F2-696A-4D3F-8FE1-1BCF97F47B5B@illinois.edu> At one point the latest version of perl was to have strictures automatically turned on, but I don't think this is implemented yet. Re: strictures, I've answered too many emails on list where a simple 'use strict' revealed bugs. That should always be on, along with 'use warnings'. chris On May 12, 2011, at 10:49 AM, Roy Chaudhuri wrote: > That's not really true. Turning on strict just makes it a compile time error if you don't declare the scope of a variable. "my" and "our" will still work without turning on strict, but they are not mandatory. Of course, you should always use strict, it's there to help you. > > @lines[$i] is valid syntax, although you will get the message "Scalar value @lines[$i] better written as $lines[$i]" if warnings are switched on (again, this should always be the case). > > Roy. > > On 12/05/2011 16:20, Kevin Brown wrote: >> Utilizing keywords like "my", "our", etc... is only useful in perl if >> you also "use strict;" in your script. 
This forces perl to deal with >> variable scope and should cause those threads to die when the parent >> object goes out of scope. Instead they aren't dying because their parent >> object never gets destroyed. >> >> In fact, going through your code showed that you had a large number of >> errors such as @lines[$i] is NOT how you get something out of an array. >> $lines[$i] is. >> >> http://pastebin.com/vx9Y0GW7 >> >> Kevin Brown >> Center for Innovations in Medicine >> Biodesign Institute >> Arizona State University >> >>> -----Original Message----- >>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >>> bounces at lists.open-bio.org] On Behalf Of O'car Campos >>> Sent: Thursday, May 12, 2011 6:57 AM >>> To: Dave Messina >>> Cc: bioperl-l at bioperl.org >>> Subject: Re: [Bioperl-l] Zombie processes with GenBank >> get_Seq_by_acc() >>> >>> Dave: >>> >>> Thanks for checking the code, I tried with what you said, >> adding >>> a >>> "my" to line 18 but I still get the zombies. I was exaggerating with >>> the >>> 8000 genbank codes, also I didn't know about those other tools I will >>> check >>> them, thanks for the tip. So I'm still in a zombieland. >>> >>> Cheers. >>> >>> O'car >>> >>> >>> On 12 May 2011 03:22, Dave Messina wrote: >>> >>>> Thanks for posting the code, O'car. >>>> >>>> I haven't tried running it, but one thing that occurs to me is that >>> on line >>>> 18 when you create your Bio::DB::Genbank object, there's no 'my', so >>> those >>>> objects may be hanging around longer than you expect. The zombies >> may >>> be >>>> those objects' forked processes for connecting to Genbank. Similar >> to >>> what >>>> Kevin said earlier. >>>> >>>> But that's all speculation. >>>> >>>> The other thing I'll say as a general comment is that fetching >>> thousands of >>>> records from Genbank this way (or really fetching any more than 100) >>> is >>>> inefficient and probably slow also. 
>>>> >>>> Instead you might try using Genbank's own fetching tools, >> EUtilities, >>>> either directly or via the two BioPerl interfaces to them >>>> (Bio::DB::EUtilities and Bio::DB::SoapEUtilities). >>>> >>>> >>>> Dave >>>> >>>> >>>> >>>> >>>> On Thu, May 12, 2011 at 00:16, O'car Johann Campos >>> >>>> wrote: >>>> >>>>> Kevin Brown asu.edu> writes: >>>>> >>>>>> >>>>>> Seeing your code might help. They could just be forked children >>> waiting >>>>>> for the script to exit before they go away or something else >>> forked them >>>>>> and failed to clean up before quitting. >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: bioperl-l-bounces lists.open-bio.org >> [mailto:bioperl- >>> l- >>>>>>> bounces lists.open-bio.org] On Behalf Of Belaid MOA >>>>>>> Sent: Tuesday, May 10, 2011 1:41 PM >>>>>>> To: bioperl-l lists.open-bio.org >>>>>>> Subject: [Bioperl-l] Zombie processes with GenBank >>> get_Seq_by_acc() >>>>>>> >>>>>>> >>>>>>> Dear All, >>>>>>> I installed the latest version of BioPerl and I ran a very >>> simple >>>>>>> code: it goes through each line (an ACC) in a file and uses >>> GenBank to >>>>>>> get the sequence >>>>>>> via get_Seq_by_acc(). A look at ps shows that there were a lot >>> of >>>>>>> zombie processes (with attribute) created. The list >>> grows >>>>>>> with the time. >>>>>>> This means that Bio:DB:GenBank is forking and not cleaning the >>>>>>> children. Is there any way to overcome the issue? Moreover, is >>> there >>>>>>> any way >>>>>>> to specify the number of forked processes? >>>>>>> >>>>>>> With best regards, >>>>>>> -Belaid. 
>>>>>>> >>>>>>> _______________________________________________ >>>>>>> Bioperl-l mailing list >>>>>>> Bioperl-l lists.open-bio.org >>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>>> >>>>> >>>>> Kevin, Belaid, All: >>>>> >>>>> Recently I've been working with genbank too and ran a code >> to >>> get >>>>> Genbank info from accession numbers, I also noticed the weird >>> behavior and >>>>> the >>>>> zombie processes that are in the background, altough the code works >>> and I >>>>> get >>>>> the info I need there are a lot of zombie processes in the >>> background and >>>>> for >>>>> example running this task with 8000 accession numbers would be a >>> pain >>>>> where you >>>>> all know. I'm not a bioperl expert and I may be missing some piece >>> of code >>>>> to >>>>> quit the forked children as may be happening to belaid, so this is >>> my >>>>> piece of >>>>> code in case any get and idea why is this happening. >>>>> >>>>> http://pastebin.com/Zq88cpwb >>>>> >>>>> Thanks in advance. >>>>> Cheers. >>>>> >>>>> O'car. 
>>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>> >>>> >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From lthiberiol at gmail.com Thu May 12 13:34:06 2011 From: lthiberiol at gmail.com (Luiz Thiberio Rangel) Date: Thu, 12 May 2011 14:34:06 -0300 Subject: [Bioperl-l] Problem rendering a BLAST result In-Reply-To: References: <75954ae2-2b23-4866-bcce-9c97644c6f84@t16g2000vbi.googlegroups.com> <1A4207F8295607498283FE9E93B775B4079C3AB4@EX02.asurite.ad.asu.edu> Message-ID: I've been using the same code as in the BioPerl HOWTO ( http://www.bioperl.org/wiki/Render_blast4), and even if the "-bgcolor" definition is removed the error persists. The same script works fine with an older version of BioPerl, but I can't find where the problem is. Best regards On Tue, May 10, 2011 at 9:14 PM, Luiz Thiberio Rangel wrote: > The code is this one --> > http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output. The > same used in this example. > > > On Tue, May 10, 2011 at 5:55 PM, Kevin Brown wrote: > >> Would help if we can see your code as line 68 of yeah.pl is the final >> line of the script you linked. >> >> It is trying to set the bgcolor based on a value that it thinks should >> be held in the Bio::SeqFeature::Generic object as a tag, but no such tag >> exists.
>> >> > -----Original Message----- >> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >> > bounces at lists.open-bio.org] On Behalf Of lthiberiol >> > Sent: Tuesday, May 10, 2011 6:21 AM >> > To: bioperl-l at bioperl.org >> > Subject: [Bioperl-l] Problem rendering a BLAST result >> > >> > Hi... >> > >> > I've been trying to parse a BLAST output and generate a figure based >> > on it's result. I'm using this example (http://www.bioperl.org/wiki/ >> > HOWTO:Graphics#Parsing_Real_BLAST_Output), but when I execute it i get >> > the following error message: >> > >> > ------------- EXCEPTION: Bio::Root::Exception ------------- >> > MSG: asking for tag value that does not exist bgcolor >> > STACK: Error::throw >> > STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.1/Bio/Root/ >> > Root.pm:368 >> > STACK: Bio::SeqFeature::Generic::get_tag_values /usr/local/share/perl/ >> > 5.10.1/Bio/SeqFeature/Generic.pm:517 >> > STACK: Bio::Graphics::Glyph::bgcolor /usr/local/share/perl/5.10.1/Bio/ >> > Graphics/Glyph.pm:703 >> > STACK: Bio::Graphics::Glyph::graded_segments::bgcolor >> /usr/local/share/ >> > perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:124 >> > STACK: Bio::Graphics::Glyph::graded_segments::draw /usr/local/share/ >> > perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:59 >> > STACK: Bio::Graphics::Glyph::track::draw /usr/local/share/perl/5.10.1/ >> > Bio/Graphics/Glyph/track.pm:35 >> > STACK: Bio::Graphics::Panel::gd /usr/local/share/perl/5.10.1/Bio/ >> > Graphics/Panel.pm:588 >> > STACK: Bio::Graphics::Panel::png /usr/local/share/perl/5.10.1/Bio/ >> > Graphics/Panel.pm:1067 >> > STACK: yeah.pl:68 >> > ----------------------------------------------------------- >> > >> > Thanks! 
>> > _______________________________________________ >> > Bioperl-l mailing list >> > Bioperl-l at lists.open-bio.org >> > http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > > -- > Luiz Thibério Rangel > -- Luiz Thibério Rangel From scott at scottcain.net Thu May 12 15:08:46 2011 From: scott at scottcain.net (Scott Cain) Date: Thu, 12 May 2011 15:08:46 -0400 Subject: [Bioperl-l] Problem rendering a BLAST result In-Reply-To: References: <75954ae2-2b23-4866-bcce-9c97644c6f84@t16g2000vbi.googlegroups.com> <1A4207F8295607498283FE9E93B775B4079C3AB4@EX02.asurite.ad.asu.edu> Message-ID: Hi Luiz, If you could also send the blast file that is causing the problem, that would help too. I don't have one handy to test with. Scott On Thu, May 12, 2011 at 1:34 PM, Luiz Thiberio Rangel wrote: > I've been using the same code used in the BioPerl's HOWTO ( > http://www.bioperl.org/wiki/Render_blast4) and even if the "-bgcolor" > definition is removed the error persists. > > The same script is working well into a older version of the bioperl, but I > can't find where is the problem. > > > best regards > > On Tue, May 10, 2011 at 9:14 PM, Luiz Thiberio Rangel > wrote: > >> The code is this one --> >> http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output. The >> same used in this example. >> >> >> On Tue, May 10, 2011 at 5:55 PM, Kevin Brown wrote: >> >>> Would help if we can see your code as line 68 of yeah.pl is the final >>> line of the script you linked. >>> >>> It is trying to set the bgcolor based on a value that it thinks should >>> be held in the Bio::SeqFeature::Generic object as a tag, but no such tag >>> exists.
>>> >>> > -----Original Message----- >>> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >>> > bounces at lists.open-bio.org] On Behalf Of lthiberiol >>> > Sent: Tuesday, May 10, 2011 6:21 AM >>> > To: bioperl-l at bioperl.org >>> > Subject: [Bioperl-l] Problem rendering a BLAST result >>> > >>> > Hi... >>> > >>> > I've been trying to parse a BLAST output and generate a figure based >>> > on it's result. I'm using this example (http://www.bioperl.org/wiki/ >>> > HOWTO:Graphics#Parsing_Real_BLAST_Output), but when I execute it i get >>> > the following error message: >>> > >>> > ------------- EXCEPTION: Bio::Root::Exception ------------- >>> > MSG: asking for tag value that does not exist bgcolor >>> > STACK: Error::throw >>> > STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.1/Bio/Root/ >>> > Root.pm:368 >>> > STACK: Bio::SeqFeature::Generic::get_tag_values /usr/local/share/perl/ >>> > 5.10.1/Bio/SeqFeature/Generic.pm:517 >>> > STACK: Bio::Graphics::Glyph::bgcolor /usr/local/share/perl/5.10.1/Bio/ >>> > Graphics/Glyph.pm:703 >>> > STACK: Bio::Graphics::Glyph::graded_segments::bgcolor >>> /usr/local/share/ >>> > perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:124 >>> > STACK: Bio::Graphics::Glyph::graded_segments::draw /usr/local/share/ >>> > perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:59 >>> > STACK: Bio::Graphics::Glyph::track::draw /usr/local/share/perl/5.10.1/ >>> > Bio/Graphics/Glyph/track.pm:35 >>> > STACK: Bio::Graphics::Panel::gd /usr/local/share/perl/5.10.1/Bio/ >>> > Graphics/Panel.pm:588 >>> > STACK: Bio::Graphics::Panel::png /usr/local/share/perl/5.10.1/Bio/ >>> > Graphics/Panel.pm:1067 >>> > STACK: yeah.pl:68 >>> > ----------------------------------------------------------- >>> > >>> > Thanks! 
>>> > _______________________________________________ >>> > Bioperl-l mailing list >>> > Bioperl-l at lists.open-bio.org >>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> >> >> -- >> Luiz Thibério Rangel >> > > > > -- > Luiz Thibério Rangel > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research From duxroq at hotmail.com Tue May 10 20:42:13 2011 From: duxroq at hotmail.com (duxroq) Date: Tue, 10 May 2011 17:42:13 -0700 (PDT) Subject: [Bioperl-l] Installing Clustalw and bioperl on Windows 7 Message-ID: <31590447.post@talk.nabble.com> Hi, I was able to run BioPerl and ClustalW 1.8 without trouble on a Windows XP computer; however, when I did the same on a Windows 7 machine, it did not run. Is there anything I should know about which versions of BioPerl, ClustalW, and Perl to use on Windows 7? Thank you, Alex -- View this message in context: http://old.nabble.com/Installing-Clustalw-and-bioperl-on-Windows-7-tp31590447p31590447.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
From lthiberiol at gmail.com Thu May 12 15:28:25 2011 From: lthiberiol at gmail.com (Luiz Thiberio Rangel) Date: Thu, 12 May 2011 16:28:25 -0300 Subject: [Bioperl-l] Problem rendering a BLAST result In-Reply-To: References: <75954ae2-2b23-4866-bcce-9c97644c6f84@t16g2000vbi.googlegroups.com> <1A4207F8295607498283FE9E93B775B4079C3AB4@EX02.asurite.ad.asu.edu> Message-ID: The problem is not a specific BLAST result. Anyway, I'm attaching a small one. Thank you, again... On Thu, May 12, 2011 at 4:08 PM, Scott Cain wrote: > Hi Luiz, > > If you could also send the blast file that is causing the problem, > that would help too. I don't have one handy to test with. > > Scott > > > On Thu, May 12, 2011 at 1:34 PM, Luiz Thiberio Rangel > wrote: > > I've been using the same code used in the BioPerl's HOWTO ( > > http://www.bioperl.org/wiki/Render_blast4) and even if the "-bgcolor" > > definition is removed the error persists. > > > > The same script is working well into a older version of the bioperl, but > I > > can't find where is the problem. > > > > > > best regards > > > > On Tue, May 10, 2011 at 9:14 PM, Luiz Thiberio Rangel > > wrote: > > > >> The code is this one --> > >> http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output. > The > >> same used in this example. > >> > >> > >> On Tue, May 10, 2011 at 5:55 PM, Kevin Brown >wrote: > >> > >>> Would help if we can see your code as line 68 of yeah.pl is the final > >>> line of the script you linked. > >>> > >>> It is trying to set the bgcolor based on a value that it thinks should > >>> be held in the Bio::SeqFeature::Generic object as a tag, but no such > tag > >>> exists. > >>> > >>> > -----Original Message----- > >>> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > >>> > bounces at lists.open-bio.org] On Behalf Of lthiberiol > >>> > Sent: Tuesday, May 10, 2011 6:21 AM > >>> > To: bioperl-l at bioperl.org > >>> > Subject: [Bioperl-l] Problem rendering a BLAST result > >>> > > >>> > Hi... 
> >>> > > >>> > I've been trying to parse a BLAST output and generate a figure based > >>> > on it's result. I'm using this example (http://www.bioperl.org/wiki/ > >>> > HOWTO:Graphics#Parsing_Real_BLAST_Output), but when I execute it i > get > >>> > the following error message: > >>> > > >>> > ------------- EXCEPTION: Bio::Root::Exception ------------- > >>> > MSG: asking for tag value that does not exist bgcolor > >>> > STACK: Error::throw > >>> > STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.1/Bio/Root/ > >>> > Root.pm:368 > >>> > STACK: Bio::SeqFeature::Generic::get_tag_values > /usr/local/share/perl/ > >>> > 5.10.1/Bio/SeqFeature/Generic.pm:517 > >>> > STACK: Bio::Graphics::Glyph::bgcolor > /usr/local/share/perl/5.10.1/Bio/ > >>> > Graphics/Glyph.pm:703 > >>> > STACK: Bio::Graphics::Glyph::graded_segments::bgcolor > >>> /usr/local/share/ > >>> > perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:124 > >>> > STACK: Bio::Graphics::Glyph::graded_segments::draw /usr/local/share/ > >>> > perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:59 > >>> > STACK: Bio::Graphics::Glyph::track::draw > /usr/local/share/perl/5.10.1/ > >>> > Bio/Graphics/Glyph/track.pm:35 > >>> > STACK: Bio::Graphics::Panel::gd /usr/local/share/perl/5.10.1/Bio/ > >>> > Graphics/Panel.pm:588 > >>> > STACK: Bio::Graphics::Panel::png /usr/local/share/perl/5.10.1/Bio/ > >>> > Graphics/Panel.pm:1067 > >>> > STACK: yeah.pl:68 > >>> > ----------------------------------------------------------- > >>> > > >>> > Thanks! 
> >>> > _______________________________________________
> >>> > Bioperl-l mailing list
> >>> > Bioperl-l at lists.open-bio.org
> >>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>
> >>
> >>
> >>
> >> --
> >> Luiz Thibério Rangel
> >>
> >
> >
> >
> > --
> > Luiz Thibério Rangel
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
>
>
>
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                          scott at scottcain dot net
> GMOD Coordinator (http://gmod.org/)         216-392-3087
> Ontario Institute for Cancer Research
>

--
Luiz Thibério Rangel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: yeah.bls
Type: application/octet-stream
Size: 6960 bytes
Desc: not available
URL:

From rgoldade at sfu.ca Thu May 12 16:44:40 2011
From: rgoldade at sfu.ca (rgoldade)
Date: Thu, 12 May 2011 13:44:40 -0700 (PDT)
Subject: [Bioperl-l] Blastp filters - no hits on similar sequence
Message-ID: <31606350.post@talk.nabble.com>

I'm having a problem with blastp filters... I'm using blast to compare artificial sequences:

Calling
>blastp -query test1.fa -subject test2.fa
Returns no hits.

From what I've read, there are no filters active by default with blastp but I'm still having nothing returned. Does anyone have a suggestion how to overcome this issue?

Thank you,
Ryan

Example
>Line1
HHHHHAAAAAAAAAAAAAAIHHHHHAAAAAAAAAAAAAAIHHHHHAAAAAAAAAAAAAAI
>Line2
HHHHHAAAAAAAAAAAAAAIHHHHHAAALAAAALALAAAIHHHHHAAAAAAAAAAAAAAI
--
View this message in context: http://old.nabble.com/Blastp-filters---no-hits-on-similar-sequence-tp31606350p31606350.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
From scott at scottcain.net Thu May 12 23:05:59 2011 From: scott at scottcain.net (Scott Cain) Date: Thu, 12 May 2011 23:05:59 -0400 Subject: [Bioperl-l] Problem rendering a BLAST result In-Reply-To: References: <75954ae2-2b23-4866-bcce-9c97644c6f84@t16g2000vbi.googlegroups.com> <1A4207F8295607498283FE9E93B775B4079C3AB4@EX02.asurite.ad.asu.edu> Message-ID: Hi Luiz, There was a bug introduced in the last release of Bio::Graphics. If you replace your copy of Bio::Graphics::Glyph with the attached one, it should work. Scott On Thu, May 12, 2011 at 3:28 PM, Luiz Thiberio Rangel wrote: > The problem is not a specific BLAST result. Anyway, I'm attaching a small > one. > > Thank you, again... > > On Thu, May 12, 2011 at 4:08 PM, Scott Cain wrote: >> >> Hi Luiz, >> >> If you could also send the blast file that is causing the problem, >> that would help too. ?I don't have one handy to test with. >> >> Scott >> >> >> On Thu, May 12, 2011 at 1:34 PM, Luiz Thiberio Rangel >> wrote: >> > I've been using the same code used in the BioPerl's HOWTO ( >> > http://www.bioperl.org/wiki/Render_blast4) and even if the "-bgcolor" >> > definition ?is removed the error persists. >> > >> > The same script is working well into a older version of the bioperl, but >> > I >> > can't find where is the problem. >> > >> > >> > best regards >> > >> > On Tue, May 10, 2011 at 9:14 PM, Luiz Thiberio Rangel >> > wrote: >> > >> >> The code is this one --> >> >> http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output. >> >> The >> >> same used in this example. >> >> >> >> >> >> On Tue, May 10, 2011 at 5:55 PM, Kevin Brown >> >> wrote: >> >> >> >>> Would help if we can see your code as line 68 of yeah.pl is the final >> >>> line of the script you linked. >> >>> >> >>> It is trying to set the bgcolor based on a value that it thinks should >> >>> be held in the Bio::SeqFeature::Generic object as a tag, but no such >> >>> tag >> >>> exists. 
>> >>> >> >>> > -----Original Message----- >> >>> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >> >>> > bounces at lists.open-bio.org] On Behalf Of lthiberiol >> >>> > Sent: Tuesday, May 10, 2011 6:21 AM >> >>> > To: bioperl-l at bioperl.org >> >>> > Subject: [Bioperl-l] Problem rendering a BLAST result >> >>> > >> >>> > Hi... >> >>> > >> >>> > I've been trying to parse a BLAST output and generate a figure based >> >>> > on it's result. I'm using this example (http://www.bioperl.org/wiki/ >> >>> > HOWTO:Graphics#Parsing_Real_BLAST_Output), but when I execute it i >> >>> > get >> >>> > the following error message: >> >>> > >> >>> > ------------- EXCEPTION: Bio::Root::Exception ------------- >> >>> > MSG: asking for tag value that does not exist bgcolor >> >>> > STACK: Error::throw >> >>> > STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.1/Bio/Root/ >> >>> > Root.pm:368 >> >>> > STACK: Bio::SeqFeature::Generic::get_tag_values >> >>> > /usr/local/share/perl/ >> >>> > 5.10.1/Bio/SeqFeature/Generic.pm:517 >> >>> > STACK: Bio::Graphics::Glyph::bgcolor >> >>> > /usr/local/share/perl/5.10.1/Bio/ >> >>> > Graphics/Glyph.pm:703 >> >>> > STACK: Bio::Graphics::Glyph::graded_segments::bgcolor >> >>> /usr/local/share/ >> >>> > perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:124 >> >>> > STACK: Bio::Graphics::Glyph::graded_segments::draw /usr/local/share/ >> >>> > perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:59 >> >>> > STACK: Bio::Graphics::Glyph::track::draw >> >>> > /usr/local/share/perl/5.10.1/ >> >>> > Bio/Graphics/Glyph/track.pm:35 >> >>> > STACK: Bio::Graphics::Panel::gd /usr/local/share/perl/5.10.1/Bio/ >> >>> > Graphics/Panel.pm:588 >> >>> > STACK: Bio::Graphics::Panel::png /usr/local/share/perl/5.10.1/Bio/ >> >>> > Graphics/Panel.pm:1067 >> >>> > STACK: yeah.pl:68 >> >>> > ----------------------------------------------------------- >> >>> > >> >>> > Thanks! 
>> >>> > _______________________________________________
>> >>> > Bioperl-l mailing list
>> >>> > Bioperl-l at lists.open-bio.org
>> >>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >>>
>> >>> _______________________________________________
>> >>> Bioperl-l mailing list
>> >>> Bioperl-l at lists.open-bio.org
>> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Luiz Thibério Rangel
>> >>
>> >
>> >
>> >
>> > --
>> > Luiz Thibério Rangel
>> >
>> > _______________________________________________
>> > Bioperl-l mailing list
>> > Bioperl-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >
>>
>>
>>
>> --
>> ------------------------------------------------------------------------
>> Scott Cain, Ph. D.                          scott at scottcain dot net
>> GMOD Coordinator (http://gmod.org/)         216-392-3087
>> Ontario Institute for Cancer Research
>
>
>
> --
> Luiz Thibério Rangel
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                          scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)         216-392-3087
Ontario Institute for Cancer Research
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Glyph.pm
Type: application/octet-stream
Size: 73313 bytes
Desc: not available
URL:

From lthiberiol at gmail.com Fri May 13 09:33:26 2011
From: lthiberiol at gmail.com (Luiz Thiberio Rangel)
Date: Fri, 13 May 2011 10:33:26 -0300
Subject: [Bioperl-l] Problem rendering a BLAST result
In-Reply-To:
References: <75954ae2-2b23-4866-bcce-9c97644c6f84@t16g2000vbi.googlegroups.com> <1A4207F8295607498283FE9E93B775B4079C3AB4@EX02.asurite.ad.asu.edu>
Message-ID:

Hi Scott, thanks very much. It worked very well after replacing the Glyph.pm!

On Fri, May 13, 2011 at 12:05 AM, Scott Cain wrote: > Hi Luiz, > > There was a bug introduced in the last release of Bio::Graphics. If > you replace your copy of Bio::Graphics::Glyph with the attached one, > it should work. > > Scott > > > On Thu, May 12, 2011 at 3:28 PM, Luiz Thiberio Rangel > wrote: > > The problem is not a specific BLAST result. Anyway, I'm attaching a small > > one. > > > > Thank you, again... > > > > On Thu, May 12, 2011 at 4:08 PM, Scott Cain wrote: > >> > >> Hi Luiz, > >> > >> If you could also send the blast file that is causing the problem, > >> that would help too. I don't have one handy to test with. > >> > >> Scott > >> > >> > >> On Thu, May 12, 2011 at 1:34 PM, Luiz Thiberio Rangel > >> wrote: > >> > I've been using the same code used in the BioPerl's HOWTO ( > >> > http://www.bioperl.org/wiki/Render_blast4) and even if the "-bgcolor" > >> > definition is removed the error persists. > >> > > >> > The same script is working well into a older version of the bioperl, > but > >> > I > >> > can't find where is the problem. > >> > > >> > > >> > best regards > >> > > >> > On Tue, May 10, 2011 at 9:14 PM, Luiz Thiberio Rangel > >> > wrote: > >> > > >> >> The code is this one --> > >> >> http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output > . > >> >> The > >> >> same used in this example. > >> >> > >> >> > >> >> On Tue, May 10, 2011 at 5:55 PM, Kevin Brown > >> >> wrote: > >> >> > >> >>> Would help if we can see your code as line 68 of yeah.pl is the > final > >> >>> line of the script you linked. > >> >>> > >> >>> It is trying to set the bgcolor based on a value that it thinks > should > >> >>> be held in the Bio::SeqFeature::Generic object as a tag, but no such > >> >>> tag > >> >>> exists. 
> >> >>> > >> >>> > -----Original Message----- > >> >>> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > >> >>> > bounces at lists.open-bio.org] On Behalf Of lthiberiol > >> >>> > Sent: Tuesday, May 10, 2011 6:21 AM > >> >>> > To: bioperl-l at bioperl.org > >> >>> > Subject: [Bioperl-l] Problem rendering a BLAST result > >> >>> > > >> >>> > Hi... > >> >>> > > >> >>> > I've been trying to parse a BLAST output and generate a figure > based > >> >>> > on it's result. I'm using this example ( > http://www.bioperl.org/wiki/ > >> >>> > HOWTO:Graphics#Parsing_Real_BLAST_Output), but when I execute it i > >> >>> > get > >> >>> > the following error message: > >> >>> > > >> >>> > ------------- EXCEPTION: Bio::Root::Exception ------------- > >> >>> > MSG: asking for tag value that does not exist bgcolor > >> >>> > STACK: Error::throw > >> >>> > STACK: Bio::Root::Root::throw > /usr/local/share/perl/5.10.1/Bio/Root/ > >> >>> > Root.pm:368 > >> >>> > STACK: Bio::SeqFeature::Generic::get_tag_values > >> >>> > /usr/local/share/perl/ > >> >>> > 5.10.1/Bio/SeqFeature/Generic.pm:517 > >> >>> > STACK: Bio::Graphics::Glyph::bgcolor > >> >>> > /usr/local/share/perl/5.10.1/Bio/ > >> >>> > Graphics/Glyph.pm:703 > >> >>> > STACK: Bio::Graphics::Glyph::graded_segments::bgcolor > >> >>> /usr/local/share/ > >> >>> > perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:124 > >> >>> > STACK: Bio::Graphics::Glyph::graded_segments::draw > /usr/local/share/ > >> >>> > perl/5.10.1/Bio/Graphics/Glyph/graded_segments.pm:59 > >> >>> > STACK: Bio::Graphics::Glyph::track::draw > >> >>> > /usr/local/share/perl/5.10.1/ > >> >>> > Bio/Graphics/Glyph/track.pm:35 > >> >>> > STACK: Bio::Graphics::Panel::gd /usr/local/share/perl/5.10.1/Bio/ > >> >>> > Graphics/Panel.pm:588 > >> >>> > STACK: Bio::Graphics::Panel::png /usr/local/share/perl/5.10.1/Bio/ > >> >>> > Graphics/Panel.pm:1067 > >> >>> > STACK: yeah.pl:68 > >> >>> > ----------------------------------------------------------- > >> >>> > 
> >> >>> > Thanks! > >> >>> > _______________________________________________ > >> >>> > Bioperl-l mailing list > >> >>> > Bioperl-l at lists.open-bio.org > >> >>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> >>> > >> >>> _______________________________________________ > >> >>> Bioperl-l mailing list > >> >>> Bioperl-l at lists.open-bio.org > >> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> >>> > >> >> > >> >> > >> >> > >> >> -- > >> >> Luiz Thib?rio Rangel > >> >> > >> > > >> > > >> > > >> > -- > >> > Luiz Thib?rio Rangel > >> > > >> > _______________________________________________ > >> > Bioperl-l mailing list > >> > Bioperl-l at lists.open-bio.org > >> > http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > > >> > >> > >> > >> -- > >> ------------------------------------------------------------------------ > >> Scott Cain, Ph. D. scott at scottcain > >> dot net > >> GMOD Coordinator (http://gmod.org/) 216-392-3087 > >> Ontario Institute for Cancer Research > > > > > > > > -- > > Luiz Thib?rio Rangel > > > > > > -- > ------------------------------------------------------------------------ > Scott Cain, Ph. D. scott at scottcain dot > net > GMOD Coordinator (http://gmod.org/) 216-392-3087 > Ontario Institute for Cancer Research > -- Luiz Thib?rio Rangel From juettemann at gmail.com Fri May 13 11:33:30 2011 From: juettemann at gmail.com (Thomas Juettemann) Date: Fri, 13 May 2011 17:33:30 +0200 Subject: [Bioperl-l] translate issue Message-ID:

Dear all,

I am trying to translate a DNA sequence into protein in all three forward reading frames (same strand). However, when I pass options to translate() on the sequence object (such as -frame => 1 or -complete => 1), the option name itself ends up in the translated sequence:

Source:

use strict;
use warnings;
use Data::Dumper;
use Bio::Tools::CodonTable;
use Bio::Seq;

my $dna = 'ATGAAAGGAACATCCATTTTATTCAAAGCACCTCCAAACCTGCAATCCTAAGTTCCAGGCAACTCAATCCCAAAAATCCACTGTAGATGCCCAAAGGCTGGGGTGTTCGGTCTTCAACATTTTTGCCTTTGTGGCTCCCAGTCAAGATAGAGCTGCACCAAGTCCAATTCCATTCCTCATCACAGATGATTTTTTCTACTTTAAGATCAGAACTATACAAGCTTCTTGCTTTGTGTCAGCATGCTGTTGTACCCATGGGCAAATTCTTAGGTAAGACAAAAACACAGTCCCAAGGGCAGGTAGTAATTTTTTCAGAAAAAGGTAAGGCAATCATTTATCTCAGTCTGCCCAGGACAGTCCCAATTTACACATGTATATTCTCCCAATCTGTAGGCTGTCTTTTCATTTTGTTGATTATTTCACTTAATTTTTTATTATTTATTTATTTTATAGAGACAGATCTCATTATGTTGCCCAGGGTGATCCTTGATCTCCTGGCCTCAAGTGATCCTCCAACCTTGGTCTCCCAAAGTGCTGGGATTACAGATGTGAACTACCACACCCAGTCAACGTGCAGAAGGTTTTCAGTTTGATGTAGTCTGATGTAGTCTCATGTATTTATCCTTCTTGTTGTTGCCTGAGCTTTTGGTGTGATATCCAAAAATATCATTGCCAAGATCAATATCAAGAAACTTTCCCCCTATGTTTCTTACAGAAATTTTATGGTTTCAGATTTTTCATCCATTTTGAGTATATTTGTGTGTATGATGTAAGATAAGGGTCCAGTCTCCCCAGTGTTGGATATCCAATTTTCATAACACCATTTATTGAAGAGATTATTCTTTCTCCACTGTGTTTTCTTGATGTCCTTGTCAAAAATTAGTTGACTTTTATATGCTTGGGTTTATTTCTGGGCTCTATTCTGTTTCATTGCTTTACATCTCTGTTTTCATGCCAGTGCCACAGTGTTTTGATTACTATAGCTTTGTAATATAATTTGAAATCAGAATGTGTAATACCTATAACTTTGTTTTTTGCTCTAAAGATTTATTTATTTATTTATTTTTGCCATTTCAGGTCTTTTGTGGTTTCATATGAATTTCAGAATTGTTTTTCCTATTTCTGTGAAAAATGCCATTGACATTTTGATAGGGATTGTGTTGAATCTATATATTGCTTTGGATAGTATGGATG';

my $seq_obj  = Bio::Seq->new( -seq => $dna, -alphabet => 'dna' );
my $prot_obj = $seq_obj->translate(-complete => 1);
print $prot_obj->seq, "\n";

Output:

MKGTSILFKAPPNLQS-completeVPGNSIPKIHCRCPKAGVFGLQHFCLCGSQSR-completeSCTKSNSIPHHR-completeFFLL-completeDQNYTSFLLCVSMLLYPWANS-completeVRQKHSPKGR-complete-completeFFQKKVRQSFISVCPGQSQFTHVYSPNL-completeAVFSFC-completeLFHLIFYYLFIL-completeRQISLCCPG-completeSLISWPQVILQPWSPKVLGLQM-completeTTTPSQRAEGFQFDVV-completeCSLMYLSFLLLPELLV-completeYPKISLPRSISRNFPPMFLTEILWFQIFHPF-completeVYLCV-completeCKIRVQSPQCWISNFHNTIY-completeRDYSFSTVFS-completeCPCQKLVDFYMLGFISGLYSVSLLYISVFMPVPQCFDYYSFVI-completeFEIRMCNTYNFVFCSKDLFIYLFLPFQVFCGFI-completeISELFFLFL-completeKMPLTF-complete-completeGLC-completeIYILLWIVWM

Versions:

Perl version v5.10.0
BioPerl 1.2.3

How do I correctly pass an option?
Any help is greatly appreciated!

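A note on the output above, offered as a sketch rather than a confirmed diagnosis: the "-complete" strings sit exactly where stop codons fall, which is what one would expect if translate() in this old release reads its arguments by position (terminator symbol first, then the symbol for unknown residues, then the frame) instead of as named options. The pure-Perl example below reproduces that effect on the first 57 bases of the posted sequence. translate_positional and translate_named are hypothetical stand-ins written for illustration, not BioPerl functions, and the positional argument order is an assumption about the old interface.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Standard genetic code (NCBI table 1), bases ordered T, C, A, G.
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_table;
my $idx = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon_table{"$b1$b2$b3"} = substr( $aas, $idx++, 1 );
        }
    }
}

# Hypothetical stand-in for an old-style, positional translate():
# arguments are (sequence, terminator symbol, unknown symbol, frame).
sub translate_positional {
    my ( $dna, $terminator, $unknown, $frame ) = @_;
    $terminator = '*' unless defined $terminator;
    $unknown    = 'X' unless defined $unknown;
    $frame      = 0   unless defined $frame;

    my $protein = '';
    for ( my $pos = $frame ; $pos + 3 <= length($dna) ; $pos += 3 ) {
        my $aa = $codon_table{ uc( substr( $dna, $pos, 3 ) ) };
        $aa = $unknown unless defined $aa;
        $protein .= $aa eq '*' ? $terminator : $aa;
    }
    return $protein;
}

# Hypothetical named-argument wrapper: options are looked up by name,
# so they cannot be mistaken for the terminator symbol.
sub translate_named {
    my ( $dna, %opt ) = @_;
    return translate_positional(
        $dna,
        exists $opt{-terminator} ? $opt{-terminator} : '*',
        exists $opt{-unknown}    ? $opt{-unknown}    : 'X',
        exists $opt{-frame}      ? $opt{-frame}      : 0,
    );
}

# First 57 bases of the sequence posted above.
my $dna = 'ATGAAAGGAACATCCATTTTATTCAAAGCACCTCCAAACCTGCAATCCTAAGTTCCA';

# Named-style arguments handed to the positional interface:
# '-complete' becomes the terminator symbol and 1 the unknown symbol.
print "positional: ", translate_positional( $dna, '-complete', 1 ), "\n";

# The same sequence through the named wrapper, one line per forward frame.
for my $frame ( 0 .. 2 ) {
    print "frame $frame:    ", translate_named( $dna, -frame => $frame ), "\n";
}
```

As far as I know, recent BioPerl releases accept named arguments to translate(), including -frame (with values 0, 1 or 2) and -complete, so after an upgrade a loop over frames 0 to 2 with -frame should give the three forward-frame translations.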
From Kevin.M.Brown at asu.edu Fri May 13 11:43:19 2011
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 13 May 2011 08:43:19 -0700
Subject: [Bioperl-l] translate issue
In-Reply-To:
References:
Message-ID: <1A4207F8295607498283FE9E93B775B4079C3D85@EX02.asurite.ad.asu.edu>

> Versions:
>
> Perl version v5.10.0
> BioPerl 1.2.3
>
>
>

> How do I correctly pass an option?
> Any help is greatly appreciated! >

Upgrade your version of BioPerl. Version 1.2.3 is ancient. Current version is 1.6.1.
http://www.bioperl.org/wiki/Category:Installation

From cjfields at illinois.edu Fri May 13 11:46:59 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 13 May 2011 10:46:59 -0500
Subject: [Bioperl-l] translate issue
In-Reply-To: <1A4207F8295607498283FE9E93B775B4079C3D85@EX02.asurite.ad.asu.edu>
References: <1A4207F8295607498283FE9E93B775B4079C3D85@EX02.asurite.ad.asu.edu>
Message-ID: <6F8A95AF-4110-4B63-B0D0-CFDFBB23FAC9@illinois.edu>

On May 13, 2011, at 10:43 AM, Kevin Brown wrote:

>> Versions:
>>
>> Perl version v5.10.0
>> BioPerl 1.2.3
>>
>>
>>

>> How do I correctly pass an option?
>> Any help is greatly appreciated! >>

>
> Upgrade your version of BioPerl. Version 1.2.3 is ancient. Current
> version is 1.6.1.
> http://www.bioperl.org/wiki/Category:Installation

Actually, the current version is v1.6.9 now; it looks as if we need some wiki updating :)

chris

From lcpaulet at googlemail.com Sat May 14 11:29:12 2011
From: lcpaulet at googlemail.com (Lorenzo Carretero)
Date: Sat, 14 May 2011 17:29:12 +0200
Subject: [Bioperl-l] Problems parsing blast reports
Message-ID:

Hi all,
I'm trying to parse blasttable ('-m 8' format) reports from whole intra-genome all-vs-all comparisons to get the best non-self hit and the diverged best non-self hit as separate hashes (key = query => value = non-self hit). The following script seems to run OK, but it returns unexpected results (i.e. it apparently doesn't catch the best hit but a random one). I assume hits are iterated in the while loop (while (my $hit = $result->next_hit)) in the order they appear in the BLAST results (i.e. sorted by hit bit score).
Any help would be much appreciated.
Cheers,
Lorenzo

my ( $filename, $format, $minimumbitsscore, $thresholdevalue, $minimumidentity, $maximumredundancy, $minimumalnlength ) = @_;
>     my ( %besthits, %diverged, %redundant ,%genefusion ) = ();
>     my ( $refbh, $refdiv, $cb, $cd, $refred, $reffus, $cr, $cf);
>     my $total = 0;
>     my $in = new Bio::SearchIO ( -file => $filename,
>                                  -format => $format,
>                                  #-verbose => -1,
>                                )
>         or die "No $filename BLAST file with $format found";
>     while( my $result = $in->next_result) {
>         $total++;
>         while (my $hit = $result->next_hit) {
>             my $query = $result->query_name();
>             my $hitname = $hit->name();
>             my $bits = $hit->bits();
>             my $evalue = $hit->significance();
>             if ($query ne $hitname and $bits >= $minimumbitsscore and $evalue <= $thresholdevalue ) {
>                 $besthits{ $query } = $hitname;
>             }
>             elsif ( $bits <= $minimumbitsscore and $evalue >= $thresholdevalue){
>                 $diverged { $query } = $hitname;
>             }
>
> #            while( my $hsp = $hit->next_hsp ) {
> #                my $querylen = $hsp->length( 'query' );
> #                my $hitlen = $hsp->length( 'hit' );
> #                my $alnlen = $hsp->length( 'total' );
> #                my $identity = $hsp->percent_identity();
> #                if ( $identity > $maximumredundancy ) {
> #                    $redundant{ $query } = $hitname;
> #                }
> #                elsif ( ($querylen <= ($alnlen / 2) ) || ($hitlen <= ($alnlen / 2) ) || ($alnlen <= $minimumalnlength) ) {
> #                    $genefusion{ $query } = $hitname;
> #                }
> #                elsif ( ($evalue >= $thresholdevalue) || ($identity <= $minimumidentity) || ($bits <= $minimumbitsscore) ) {
> #                    $diverged{ $query } = $hitname;
> #                }
> #                else {
> #                    $besthits{ $query } = $hitname;
> #                }
> #            }
>         }
>     }
>     $refbh = \%besthits;
>     $refdiv = \%diverged;
>     $refred = \%redundant;
>     $reffus = \%genefusion;
>     $cb = keys ( %besthits );
>     $cd = keys( %diverged );
>     $cr = keys( %redundant );
>     $cf = keys( %genefusion );
>

From David.Messina at sbc.su.se Sat May 14 12:15:08 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Sat, 14 May 2011 18:15:08 +0200
Subject: [Bioperl-l] Problems parsing blast reports
In-Reply-To:
References:
Message-ID:

Hi Lorenzo,

Hmm, I tried your code, and it worked for me.

I set your input variables as follows:

my ( $filename, $format, $minimumbitsscore, $thresholdevalue, $minimumidentity, $maximumredundancy, $minimumalnlength )
    = ('2008.blasttable', 'blasttable', 300, '1e-100', 10, 0, 10);

where the file is t/data/2008.blasttable that comes with the BioPerl distro. With those settings, the top hit goes into %besthits and the third hit goes into %diverged.

I also added at the top:

use strict;
use warnings;
use Bio::SearchIO;

You *are* using strict and warnings, aren't you? :)

One thing that may be an issue is that you're doing this at the hit level, and remember that in -m8 format each line represents an HSP.

I'd need your input (really, a neat little test case and the corresponding erroneous output) to help further.

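As an aside on the hit-versus-HSP point, here is a minimal sketch that is not from the thread. It reads '-m 8' style rows directly in plain Perl, without BioPerl, to show that several rows can belong to one query/subject pair and that keeping only the first non-self row per query yields the best non-self hit. It assumes the rows are already ordered best-first within each query, as BLAST normally writes them; the gene names and scores are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up -m 8 rows (tab-separated, 12 columns: query, subject, %identity,
# alignment length, mismatches, gap opens, q.start, q.end, s.start, s.end,
# e-value, bit score). geneA has two HSP rows against geneB.
my @rows = (
    join( "\t", qw(geneA geneA 100.00 300 0  0 1   300 1   300 1e-170 600) ),
    join( "\t", qw(geneA geneB 85.00  200 30 0 1   200 1   200 1e-90  320) ),
    join( "\t", qw(geneA geneB 80.00  90  18 0 210 299 205 294 1e-30  120) ),
    join( "\t", qw(geneA geneC 60.00  150 60 0 10  159 12  161 1e-20  90) ),
    join( "\t", qw(geneD geneD 100.00 250 0  0 1   250 1   250 1e-140 500) ),
);

my ( %best_nonself, %hsp_count );
for my $row (@rows) {
    my ( $query, $subject, @rest ) = split /\t/, $row;
    my $bits = $rest[-1];    # column 12, the bit score

    # Every row is one HSP; count them per query/subject pair.
    $hsp_count{$query}{$subject}++;

    next if $query eq $subject;              # skip self-hits
    next if exists $best_nonself{$query};    # keep only the first (best) one
    $best_nonself{$query} = { subject => $subject, bits => $bits };
}

# Queries that only ever matched themselves.
my @singletons = grep { !exists $best_nonself{$_} } sort keys %hsp_count;

for my $query ( sort keys %hsp_count ) {
    if ( my $best = $best_nonself{$query} ) {
        my $n = $hsp_count{$query}{ $best->{subject} };
        print "$query\tbest non-self: $best->{subject}\tbits: $best->{bits}\tHSP rows: $n\n";
    }
    else {
        print "$query\tsingleton (self-hit only)\n";
    }
}
```

The `next if exists` guard is the design choice that matters here: without it, each later row for the same query would replace the earlier, better one.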
Dave On Sat, May 14, 2011 at 17:29, Lorenzo Carretero wrote: > Hi all, > I'm trying to parse blasttable '-m 8 format' reports from whole > intra-genome > comparisons of all vs all to get the best non-self hit and diverged best > non-self hit as separate hashes (key=query=>value=non_self_hit). The > following script seems to run ok but it returns unexpected results (i.e. it > doesn't catch the best hit but a random hit apparently). I assume hits are > iterated over the while loop (while (my $hit = $result->next_hit) as > returned in the blast results (i.e. sorted by hits bit scores). > Any help would be much appreciated. > Cheers, > Lorenzo > > > my ( $filename, $format, $minimumbitsscore, $thresholdevalue, > > $minimumidentity, $maximumredundancy, $minimumalnlength ) = @_; > > my ( %besthits, %diverged, %redundant ,%genefusion ) = (); > > my ( $refbh, $refdiv, $cb, $cd, $refred, $reffus, $cr, $cf); > > my $total = 0; > > my $in = new Bio::SearchIO ( -file => $filename, > > -format => $format, > > #-verbose => -1, > > ) > > or die "No $filename BLAST file with > > $format found"; > > while( my $result = $in->next_result) { > > $total++; > > while (my $hit = $result->next_hit) { > > my $query = $result->query_name(); > > my $hitname = $hit->name(); > > my $bits = $hit->bits(); > > my $evalue = $hit->significance(); > > if ($query ne $hitname and $bits >= $minimumbitsscore and > > $evalue <= $thresholdevalue ) { > > $besthits{ $query } = $hitname; > > } > > elsif ( $bits <= $minimumbitsscore and $evalue >= > > $thresholdevalue){ > > $diverged { $query } = $hitname; > > } > > > > # while( my $hsp = $hit->next_hsp ) { > > # my $querylen = $hsp->length( 'query' ); > > # my $hitlen = $hsp->length( 'hit' ); > > # my $alnlen = $hsp->length( 'total' ); > > # my $identity = $hsp->percent_identity(); > > # if ( $identity > $maximumredundancy ) { > > # $redundant{ $query } = $hitname; > > # } > > # elsif ( ($querylen <= ($alnlen / 2) ) || ($hitlen <= > > ($alnlen / 2) ) || 
($alnlen <= $minimumalnlength) ) {
> > #                    $genefusion{ $query } = $hitname;
> > #                }
> > #                elsif ( ($evalue >= $thresholdevalue) || ($identity <= $minimumidentity) || ($bits <= $minimumbitsscore) ) {
> > #                    $diverged{ $query } = $hitname;
> > #                }
> > #                else {
> > #                    $besthits{ $query } = $hitname;
> > #                }
> > #            }
> >         }
> >     }
> >     $refbh = \%besthits;
> >     $refdiv = \%diverged;
> >     $refred = \%redundant;
> >     $reffus = \%genefusion;
> >     $cb = keys ( %besthits );
> >     $cd = keys( %diverged );
> >     $cr = keys( %redundant );
> >     $cf = keys( %genefusion );
> >
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From David.Messina at sbc.su.se Mon May 16 08:40:57 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 16 May 2011 14:40:57 +0200
Subject: [Bioperl-l] Problems parsing blast reports
In-Reply-To:
References:
Message-ID:

Hi Lorenzo,

Please remember to reply all so the mailing list sees the whole discussion.

I looked at your code and used your data, and the reason you're not always getting the best hit in your output is that subsequent hits which also pass your filter are writing over the best hit in your hash. So it's not a random hit that's getting saved, it's the last one that matched your criteria.

To fix this, you need to short-circuit out of the loop once you've found a hit matching your criteria.

Label the next_result loop like:

RESULT: while ( my $result = $in->next_result ) {

and then have the last line inside your filtering if block be

next RESULT;

Assuming "best hit" to you means best e-value, the above works since hits are sorted by best to worst e-value.

> I'd also like to get the queries only returning itself as hits (singletons)
> but I must make run besthits properly before (any suggestion?).

For this, I'd suggest that you actually save the best hit regardless of whether it's a self-match, and then if there's a hit passing your criteria that isn't a self-match, you test for that and replace the previously saved self-match in %besthits. (Again, short-circuiting out of the loop once you've got that hit.)

> For the moment I'm only testing bitscore and thresholdevalue so I don't need
> to loop over hsp (do I?).

No, remember each line in the -m8 output is an hsp, not a hit. So if you only loop over the hits, you'll only see the first hsp. In your example dataset, there weren't any with multiple hsps, so that's why you didn't run into this problem.

Dave

On Sat, May 14, 2011 at 22:17, Lorenzo Carretero wrote:

> Hi David,
> Thanks for your reply.
> My scripts is within a subroutine within a .pm file called from a .pl
> script. I checked the arguments are being passed correctly. For the moment
> I'm only testing bitscore and thresholdevalue so I don't need to loop over
> hsp (do I?). Of course, I'm using the strict and warning pragmas and use
> Bio::SearchIO; I tried with a reduced version of my BLASTP output (attached
> as a file). This is what I get in %besthits:
>
>> gnl|Alyrata|AL2G21220,gnl|Alyrata|AL6G21100
>> gnl|Alyrata|AL0G11620,gnl|Alyrata|AL1G37580
>> gnl|Alyrata|AL6G05070,gnl|Alyrata|AL3G15690
>> gnl|Alyrata|AL2G12090,gnl|Alyrata|AL4G18800
>> gnl|Alyrata|AL1G15460,gnl|Alyrata|AL0G01870
>>
>
> My script only returns the correct hits for AL6G05070 and AL0G11260, while
> the 3d for AL2G21220, the 17th for AL2G12090 and the 13th for AL1G15460.
> Note that the best non-self hit for AL2G12090 is not the second one but the
> first as it shows better score than the self-hit. I'd also like to get the
> queries only returning itself as hits (singletons) but I must make run
> besthits properly before (any suggestion?).
> Thanks again, > Lorenzo > > On Sat, May 14, 2011 at 6:15 PM, Dave Messina wrote: > >> Hi Lorenzo, >> >> Hmm, I tried your code, and it worked for me. >> >> I set your input variables set as follows: >> my ( $filename, $format, $minimumbitsscore, $thresholdevalue, >> $minimumidentity, >> $maximumredundancy, $minimumalnlength ) >> = ('2008.blasttable', 'blasttable', 300, '1e-100', 10, 0, 10); >> >> where the file is t/data/2008.blasttable that comes with the BioPerl >> distro. With those settings, the top hit goes into %besthits and the third >> hit goes into %diverged. >> >> I also added at the top: >> use strict; >> use warnings; >> use Bio::SearchIO; >> >> You *are* using strict and warnings, aren't you? :) >> >> One thing that may be an issue is that you're doing this at the hit level, >> and remember that in -m8 format each line represents an HSP. >> >> I'd need your input ? really, a neat little test case and the >> corresponding erroneous output ? to help further. >> >> Dave >> >> >> >> On Sat, May 14, 2011 at 17:29, Lorenzo Carretero > > wrote: >> >>> Hi all, >>> I'm trying to parse blasttable '-m 8 format' reports from whole >>> intra-genome >>> comparisons of all vs all to get the best non-self hit and diverged best >>> non-self hit as separate hashes (key=query=>value=non_self_hit). The >>> following script seems to run ok but it returns unexpected results (i.e. >>> it >>> doesn't catch the best hit but a random hit apparently). I assume hits >>> are >>> iterated over the while loop (while (my $hit = $result->next_hit) as >>> returned in the blast results (i.e. sorted by hits bit scores). >>> Any help would be much appreciated. 
>>> Cheers, >>> Lorenzo >>> >>> >>> my ( $filename, $format, $minimumbitsscore, $thresholdevalue, >>> > $minimumidentity, $maximumredundancy, $minimumalnlength ) = @_; >>> > my ( %besthits, %diverged, %redundant ,%genefusion ) = (); >>> > my ( $refbh, $refdiv, $cb, $cd, $refred, $reffus, $cr, $cf); >>> > my $total = 0; >>> > my $in = new Bio::SearchIO ( -file => $filename, >>> > -format => $format, >>> > #-verbose => -1, >>> > ) >>> > or die "No $filename BLAST file with >>> > $format found"; >>> > while( my $result = $in->next_result) { >>> > $total++; >>> > while (my $hit = $result->next_hit) { >>> > my $query = $result->query_name(); >>> > my $hitname = $hit->name(); >>> > my $bits = $hit->bits(); >>> > my $evalue = $hit->significance(); >>> > if ($query ne $hitname and $bits >= $minimumbitsscore and >>> > $evalue <= $thresholdevalue ) { >>> > $besthits{ $query } = $hitname; >>> > } >>> > elsif ( $bits <= $minimumbitsscore and $evalue >= >>> > $thresholdevalue){ >>> > $diverged { $query } = $hitname; >>> > } >>> > >>> > # while( my $hsp = $hit->next_hsp ) { >>> > # my $querylen = $hsp->length( 'query' ); >>> > # my $hitlen = $hsp->length( 'hit' ); >>> > # my $alnlen = $hsp->length( 'total' ); >>> > # my $identity = $hsp->percent_identity(); >>> > # if ( $identity > $maximumredundancy ) { >>> > # $redundant{ $query } = $hitname; >>> > # } >>> > # elsif ( ($querylen <= ($alnlen / 2) ) || ($hitlen >>> <= >>> > ($alnlen / 2) ) || ($alnlen <= $minimumalnlength) ) { >>> > # $genefusion{ $query } = $hitname; >>> > # } >>> > # elsif ( ($evalue >= $thresholdevalue) || >>> ($identity <= >>> > $minimumidentity) || ($bits <= $minimumbitsscore) ) { >>> > # $diverged{ $query } = $hitname; >>> > # } >>> > # else { >>> > # $besthits{ $query } = $hitname; >>> > # } >>> > # } >>> > } >>> > } >>> > $refbh = \%besthits; >>> > $refdiv = \%diverged; >>> > $refred = \%redundant; >>> > $reffus = \%genefusion; >>> > $cb = keys ( %besthits ); >>> > $cd = keys( %diverged ); >>> > $cr = 
keys( %redundant ); >>> > $cf = keys( %genefusion ); >>> > >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> > From lcpaulet at googlemail.com Mon May 16 15:56:29 2011 From: lcpaulet at googlemail.com (Lorenzo Carretero) Date: Mon, 16 May 2011 21:56:29 +0200 Subject: [Bioperl-l] Problems parsing blast reports In-Reply-To: References: Message-ID: Hi Dave, The following code do the job as far as I don't loop through hsps. I tested it using some results showing several hsps and gives the expected results. Hsps for a given hit are also ordered by score so is it really necessary to loop?. Best, Lorenzo PS: Sorry, I forgot to reply all in my last message while( my $result = $in->next_result) > { > $total++; > my $query = $result->query_name(); > while (my $hit = $result->next_hit) > { > my $hitname = $hit->name(); > my $bits = $hit->bits(); > my $evalue = $hit->significance(); > if ( $result->num_hits == 1 and $query eq $hitname ) > { > $cs++; > $singletons{ $cs } = "$query;$hitname"; > last; > } > if ( $query ne $hitname and $bits >= $minimumbitsscore and > $evalue <= $thresholdevalue ) > { > $cb++; > $besthits{ $cb } = "$query;$hitname"; > last; > } > elsif ( $query ne $hitname and $bits <= > $minimumbitsscore and $evalue >= $thresholdevalue) > { > $cd++; > $diverged { $cd } = "$query;$hitname"; > last; > } > # while( my $hsp = $hit->next_hsp ) { > # #last; > # } > } > } > > On Mon, May 16, 2011 at 2:40 PM, Dave Messina wrote: > Hi Lorenzo, > > Please remember to reply all so the mailing list sees the whole discussion. > > I looked at your code and used your data, and the reason you're not always > getting the best hit in your output is that subsequent hits which also pass > your filter are writing over the best hit in your hash. So it's not a random > hit that's getting saved, it's the last one that matched your criteria. 
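The overwriting described above can be reproduced without BioPerl. What follows is only a minimal sketch: the hits are plain array refs standing in for hit objects, and the names geneA/geneB/geneC, the passes() helper, and the thresholds are all made up for illustration. It contrasts a loop that keeps assigning on every passing hit with one that leaves the inner loop at the first passing hit.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for BioPerl hit objects: [name, bit score, e-value],
# listed best-first, the way a BLAST report lists them.
my %hits_for = (
    geneA => [
        [ 'geneA', 900, 0 ],         # self-hit
        [ 'geneB', 500, 1e-150 ],    # best non-self hit
        [ 'geneC', 350, 1e-110 ],    # weaker hit that still passes the filter
    ],
);
my ( $min_bits, $max_evalue ) = ( 300, 1e-100 );

# Illustrative filter: non-self, bit score and e-value within the thresholds.
sub passes {
    my ( $query, $hit ) = @_;
    my ( $name, $bits, $evalue ) = @$hit;
    return $name ne $query && $bits >= $min_bits && $evalue <= $max_evalue;
}

# Without an early exit, every passing hit overwrites the previous one.
my %last_wins;
for my $query ( keys %hits_for ) {
    for my $hit ( @{ $hits_for{$query} } ) {
        $last_wins{$query} = $hit->[0] if passes( $query, $hit );
    }
}

# Leaving the inner loop at the first passing hit keeps the best one.
my %first_wins;
for my $query ( keys %hits_for ) {
    for my $hit ( @{ $hits_for{$query} } ) {
        next unless passes( $query, $hit );
        $first_wins{$query} = $hit->[0];
        last;
    }
}

print "without early exit: $last_wins{geneA}\n";     # geneC
print "with early exit:    $first_wins{geneA}\n";    # geneB
```

With real parser objects the same pattern would use $result->next_hit in place of the inner for loop.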
> > To fix this, you need to short-circuit out of the loop once you've found a > hit matching your criteria. > > Label the next_result loop like: > > RESULT: while ( my $result = $in->next_result ) { > > and then have the last line inside your filtering if block be > > next RESULT; > > Assuming "best hit" to you means best e-value, the above works since hits > are sorted by best to worst e-value. > > > I'd also like to get the queries only returning itself as hits (singletons) >> but I must make run besthits properly before (any suggestion?). > > > For this, I'd suggest that you actually save the best hit regardless of > whether it's a self-match, and then if there's a hit passing your criteria > that isn't a self-match, you test for that and replace the previously saved > self-match in %besthits. (Again, short-circuiting out of the loop once > you've got that hit.) > > > For the moment I'm only testing bitscore and thresholdevalue so I don't >> need to loop over hsp (do I?). > > > No, remember each line in the -m8 output is an hsp, not a hit. So if you > only loop over the hits, you'll only see the first hsp. In your example > dataset, there weren't any with multiple hsps, so that's why you didn't run > into this problem. > > > Dave > > > > > On Sat, May 14, 2011 at 22:17, Lorenzo Carretero wrote: > >> Hi David, >> Thanks for your reply. >> My scripts is within a subroutine within a .pm file called from a .pl >> script. I checked the arguments are being passed correctly. For the moment >> I'm only testing bitscore and thresholdevalue so I don't need to loop over >> hsp (do I?). Of course, I'm using the strict and warning pragmas and use >> Bio::SearchIO; I tried with a reduced version of my BLASTP output (attached >> as a file). 
This is what I get in %besthits: >> >>> gnl|Alyrata|AL2G21220,gnl|Alyrata|AL6G21100 >>> gnl|Alyrata|AL0G11620,gnl|Alyrata|AL1G37580 >>> gnl|Alyrata|AL6G05070,gnl|Alyrata|AL3G15690 >>> gnl|Alyrata|AL2G12090,gnl|Alyrata|AL4G18800 >>> gnl|Alyrata|AL1G15460,gnl|Alyrata|AL0G01870 >>> >> >> My script only returns the correct hits for AL6G05070 and AL0G11260, while >> the 3d for AL2G21220, the 17th for AL2G12090 and the 13th for AL1G15460. >> Note that the best non-self hit for AL2G12090 is not the second one but the >> first as it shows better score than the self-hit. I'd also like to get the >> queries only returning itself as hits (singletons) but I must make run >> besthits properly before (any suggestion?). >> Thanks again, >> Lorenzo >> >> On Sat, May 14, 2011 at 6:15 PM, Dave Messina wrote: >> >>> Hi Lorenzo, >>> >>> Hmm, I tried your code, and it worked for me. >>> >>> I set your input variables set as follows: >>> my ( $filename, $format, $minimumbitsscore, $thresholdevalue, >>> $minimumidentity, >>> $maximumredundancy, $minimumalnlength ) >>> = ('2008.blasttable', 'blasttable', 300, '1e-100', 10, 0, 10); >>> >>> where the file is t/data/2008.blasttable that comes with the BioPerl >>> distro. With those settings, the top hit goes into %besthits and the third >>> hit goes into %diverged. >>> >>> I also added at the top: >>> use strict; >>> use warnings; >>> use Bio::SearchIO; >>> >>> You *are* using strict and warnings, aren't you? :) >>> >>> One thing that may be an issue is that you're doing this at the hit >>> level, and remember that in -m8 format each line represents an HSP. >>> >>> I'd need your input ? really, a neat little test case and the >>> corresponding erroneous output ? to help further. 
>>> >>> Dave >>> >>> >>> >>> On Sat, May 14, 2011 at 17:29, Lorenzo Carretero < >>> lcpaulet at googlemail.com> wrote: >>> >>>> Hi all, >>>> I'm trying to parse blasttable '-m 8 format' reports from whole >>>> intra-genome >>>> comparisons of all vs all to get the best non-self hit and diverged best >>>> non-self hit as separate hashes (key=query=>value=non_self_hit). The >>>> following script seems to run ok but it returns unexpected results (i.e. >>>> it >>>> doesn't catch the best hit but a random hit apparently). I assume hits >>>> are >>>> iterated over the while loop (while (my $hit = $result->next_hit) as >>>> returned in the blast results (i.e. sorted by hits bit scores). >>>> Any help would be much appreciated. >>>> Cheers, >>>> Lorenzo >>>> >>>> >>>> my ( $filename, $format, $minimumbitsscore, $thresholdevalue, >>>> > $minimumidentity, $maximumredundancy, $minimumalnlength ) = @_; >>>> > my ( %besthits, %diverged, %redundant ,%genefusion ) = (); >>>> > my ( $refbh, $refdiv, $cb, $cd, $refred, $reffus, $cr, $cf); >>>> > my $total = 0; >>>> > my $in = new Bio::SearchIO ( -file => $filename, >>>> > -format => $format, >>>> > #-verbose => -1, >>>> > ) >>>> > or die "No $filename BLAST file with >>>> > $format found"; >>>> > while( my $result = $in->next_result) { >>>> > $total++; >>>> > while (my $hit = $result->next_hit) { >>>> > my $query = $result->query_name(); >>>> > my $hitname = $hit->name(); >>>> > my $bits = $hit->bits(); >>>> > my $evalue = $hit->significance(); >>>> > if ($query ne $hitname and $bits >= $minimumbitsscore and >>>> > $evalue <= $thresholdevalue ) { >>>> > $besthits{ $query } = $hitname; >>>> > } >>>> > elsif ( $bits <= $minimumbitsscore and $evalue >= >>>> > $thresholdevalue){ >>>> > $diverged { $query } = $hitname; >>>> > } >>>> > >>>> > # while( my $hsp = $hit->next_hsp ) { >>>> > # my $querylen = $hsp->length( 'query' ); >>>> > # my $hitlen = $hsp->length( 'hit' ); >>>> > # my $alnlen = $hsp->length( 'total' ); >>>> > # my 
$identity = $hsp->percent_identity(); >>>> > # if ( $identity > $maximumredundancy ) { >>>> > # $redundant{ $query } = $hitname; >>>> > # } >>>> > # elsif ( ($querylen <= ($alnlen / 2) ) || ($hitlen >>>> <= >>>> > ($alnlen / 2) ) || ($alnlen <= $minimumalnlength) ) { >>>> > # $genefusion{ $query } = $hitname; >>>> > # } >>>> > # elsif ( ($evalue >= $thresholdevalue) || >>>> ($identity <= >>>> > $minimumidentity) || ($bits <= $minimumbitsscore) ) { >>>> > # $diverged{ $query } = $hitname; >>>> > # } >>>> > # else { >>>> > # $besthits{ $query } = $hitname; >>>> > # } >>>> > # } >>>> > } >>>> > } >>>> > $refbh = \%besthits; >>>> > $refdiv = \%diverged; >>>> > $refred = \%redundant; >>>> > $reffus = \%genefusion; >>>> > $cb = keys ( %besthits ); >>>> > $cd = keys( %diverged ); >>>> > $cr = keys( %redundant ); >>>> > $cf = keys( %genefusion ); >>>> > >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> >>> >> > From David.Messina at sbc.su.se Mon May 16 16:14:04 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Mon, 16 May 2011 22:14:04 +0200 Subject: [Bioperl-l] Problems parsing blast reports In-Reply-To: References: Message-ID: Hi Lorenzo, Hsps for a given hit are also ordered by score so is it really necessary to > loop?. > If you know you'll always be interested in the highest-scoring hsp for a hit, and if you're not using anything in the hsp object. then yes, it should always come first and I don't think you'll need the hsp loop. Glad to hear you've got it working! 
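If relying on HSP order feels too fragile, the best HSP per query/hit pair can also be picked explicitly. This is only a sketch on raw -m8 text rather than on BioPerl objects: it assumes the standard 12 tab-separated columns (query, subject, % identity, alignment length, mismatches, gap opens, q. start, q. end, s. start, s. end, e-value, bit score), and the rows below are invented sample data.

```perl
use strict;
use warnings;

# Invented -m8 rows (12 tab-separated columns); q1 vs s1 has two HSPs.
my @rows = (
    join( "\t", qw(q1 s1 98.0 200 4 0 1 200 1 200 1e-110 380) ),
    join( "\t", qw(q1 s1 90.0 80 8 0 210 290 220 300 1e-30 120) ),
    join( "\t", qw(q1 s2 85.0 150 22 1 1 150 5 155 1e-60 210) ),
);

my %best_hsp;    # "query\tsubject" => [ bit score, e-value ]
for my $row (@rows) {
    my @col = split /\t/, $row;
    my ( $query, $subject, $evalue, $bits ) = @col[ 0, 1, 10, 11 ];
    my $key = "$query\t$subject";

    # Compare bit scores explicitly instead of trusting the row order.
    if ( !exists $best_hsp{$key} || $bits > $best_hsp{$key}[0] ) {
        $best_hsp{$key} = [ $bits, $evalue ];
    }
}

for my $key ( sort keys %best_hsp ) {
    my ( $query, $subject ) = split /\t/, $key;
    print "$query vs $subject: best HSP bits=$best_hsp{$key}[0] evalue=$best_hsp{$key}[1]\n";
}
```

The e-value is kept as the string read from the file, so it prints exactly as BLAST wrote it.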
Best, Dave > Best, > Lorenzo > PS: Sorry, I forgot to reply all in my last message > > while( my $result = $in->next_result) >> { >> $total++; >> my $query = $result->query_name(); >> while (my $hit = $result->next_hit) >> { >> my $hitname = $hit->name(); >> my $bits = $hit->bits(); >> my $evalue = $hit->significance(); >> if ( $result->num_hits == 1 and $query eq $hitname ) >> { >> $cs++; >> $singletons{ $cs } = "$query;$hitname"; >> last; >> >> } >> if ( $query ne $hitname and $bits >= $minimumbitsscore and >> $evalue <= $thresholdevalue ) >> { >> $cb++; >> $besthits{ $cb } = "$query;$hitname"; >> last; >> } >> elsif ( $query ne $hitname and $bits <= >> $minimumbitsscore and $evalue >= $thresholdevalue) >> { >> $cd++; >> $diverged { $cd } = "$query;$hitname"; >> last; >> } >> # while( my $hsp = $hit->next_hsp ) { >> # #last; >> # } >> } >> } >> >> On Mon, May 16, 2011 at 2:40 PM, Dave Messina wrote: > >> Hi Lorenzo, >> >> Please remember to reply all so the mailing list sees the whole >> discussion. >> >> I looked at your code and used your data, and the reason you're not always >> getting the best hit in your output is that subsequent hits which also pass >> your filter are writing over the best hit in your hash. So it's not a random >> hit that's getting saved, it's the last one that matched your criteria. >> >> To fix this, you need to short-circuit out of the loop once you've found a >> hit matching your criteria. >> >> Label the next_result loop like: >> >> RESULT: while ( my $result = $in->next_result ) { >> >> and then have the last line inside your filtering if block be >> >> next RESULT; >> >> Assuming "best hit" to you means best e-value, the above works since hits >> are sorted by best to worst e-value. >> >> >> I'd also like to get the queries only returning itself as hits >>> (singletons) but I must make run besthits properly before (any suggestion?). 
>> >> >> For this, I'd suggest that you actually save the best hit regardless of >> whether it's a self-match, and then if there's a hit passing your criteria >> that isn't a self-match, you test for that and replace the previously saved >> self-match in %besthits. (Again, short-circuiting out of the loop once >> you've got that hit.) >> >> >> For the moment I'm only testing bitscore and thresholdevalue so I don't >>> need to loop over hsp (do I?). >> >> >> No, remember each line in the -m8 output is an hsp, not a hit. So if you >> only loop over the hits, you'll only see the first hsp. In your example >> dataset, there weren't any with multiple hsps, so that's why you didn't run >> into this problem. >> >> >> Dave >> >> >> >> >> On Sat, May 14, 2011 at 22:17, Lorenzo Carretero > > wrote: >> >>> Hi David, >>> Thanks for your reply. >>> My scripts is within a subroutine within a .pm file called from a .pl >>> script. I checked the arguments are being passed correctly. For the moment >>> I'm only testing bitscore and thresholdevalue so I don't need to loop over >>> hsp (do I?). Of course, I'm using the strict and warning pragmas and use >>> Bio::SearchIO; I tried with a reduced version of my BLASTP output (attached >>> as a file). This is what I get in %besthits: >>> >>>> gnl|Alyrata|AL2G21220,gnl|Alyrata|AL6G21100 >>>> gnl|Alyrata|AL0G11620,gnl|Alyrata|AL1G37580 >>>> gnl|Alyrata|AL6G05070,gnl|Alyrata|AL3G15690 >>>> gnl|Alyrata|AL2G12090,gnl|Alyrata|AL4G18800 >>>> gnl|Alyrata|AL1G15460,gnl|Alyrata|AL0G01870 >>>> >>> >>> My script only returns the correct hits for AL6G05070 and AL0G11260, >>> while the 3d for AL2G21220, the 17th for AL2G12090 and the 13th for >>> AL1G15460. Note that the best non-self hit for AL2G12090 is not the second >>> one but the first as it shows better score than the self-hit. I'd also like >>> to get the queries only returning itself as hits (singletons) but I must >>> make run besthits properly before (any suggestion?). 
>>> Thanks again, >>> Lorenzo >>> >>> On Sat, May 14, 2011 at 6:15 PM, Dave Messina wrote: >>> >>>> Hi Lorenzo, >>>> >>>> Hmm, I tried your code, and it worked for me. >>>> >>>> I set your input variables set as follows: >>>> my ( $filename, $format, $minimumbitsscore, $thresholdevalue, >>>> $minimumidentity, >>>> $maximumredundancy, $minimumalnlength ) >>>> = ('2008.blasttable', 'blasttable', 300, '1e-100', 10, 0, 10); >>>> >>>> where the file is t/data/2008.blasttable that comes with the BioPerl >>>> distro. With those settings, the top hit goes into %besthits and the third >>>> hit goes into %diverged. >>>> >>>> I also added at the top: >>>> use strict; >>>> use warnings; >>>> use Bio::SearchIO; >>>> >>>> You *are* using strict and warnings, aren't you? :) >>>> >>>> One thing that may be an issue is that you're doing this at the hit >>>> level, and remember that in -m8 format each line represents an HSP. >>>> >>>> I'd need your input ? really, a neat little test case and the >>>> corresponding erroneous output ? to help further. >>>> >>>> Dave >>>> >>>> >>>> >>>> On Sat, May 14, 2011 at 17:29, Lorenzo Carretero < >>>> lcpaulet at googlemail.com> wrote: >>>> >>>>> Hi all, >>>>> I'm trying to parse blasttable '-m 8 format' reports from whole >>>>> intra-genome >>>>> comparisons of all vs all to get the best non-self hit and diverged >>>>> best >>>>> non-self hit as separate hashes (key=query=>value=non_self_hit). The >>>>> following script seems to run ok but it returns unexpected results >>>>> (i.e. it >>>>> doesn't catch the best hit but a random hit apparently). I assume hits >>>>> are >>>>> iterated over the while loop (while (my $hit = $result->next_hit) as >>>>> returned in the blast results (i.e. sorted by hits bit scores). >>>>> Any help would be much appreciated. 
>>>>> Cheers, >>>>> Lorenzo >>>>> >>>>> >>>>> my ( $filename, $format, $minimumbitsscore, $thresholdevalue, >>>>> > $minimumidentity, $maximumredundancy, $minimumalnlength ) = @_; >>>>> > my ( %besthits, %diverged, %redundant ,%genefusion ) = (); >>>>> > my ( $refbh, $refdiv, $cb, $cd, $refred, $reffus, $cr, $cf); >>>>> > my $total = 0; >>>>> > my $in = new Bio::SearchIO ( -file => $filename, >>>>> > -format => $format, >>>>> > #-verbose => -1, >>>>> > ) >>>>> > or die "No $filename BLAST file with >>>>> > $format found"; >>>>> > while( my $result = $in->next_result) { >>>>> > $total++; >>>>> > while (my $hit = $result->next_hit) { >>>>> > my $query = $result->query_name(); >>>>> > my $hitname = $hit->name(); >>>>> > my $bits = $hit->bits(); >>>>> > my $evalue = $hit->significance(); >>>>> > if ($query ne $hitname and $bits >= $minimumbitsscore and >>>>> > $evalue <= $thresholdevalue ) { >>>>> > $besthits{ $query } = $hitname; >>>>> > } >>>>> > elsif ( $bits <= $minimumbitsscore and $evalue >= >>>>> > $thresholdevalue){ >>>>> > $diverged { $query } = $hitname; >>>>> > } >>>>> > >>>>> > # while( my $hsp = $hit->next_hsp ) { >>>>> > # my $querylen = $hsp->length( 'query' ); >>>>> > # my $hitlen = $hsp->length( 'hit' ); >>>>> > # my $alnlen = $hsp->length( 'total' ); >>>>> > # my $identity = $hsp->percent_identity(); >>>>> > # if ( $identity > $maximumredundancy ) { >>>>> > # $redundant{ $query } = $hitname; >>>>> > # } >>>>> > # elsif ( ($querylen <= ($alnlen / 2) ) || >>>>> ($hitlen <= >>>>> > ($alnlen / 2) ) || ($alnlen <= $minimumalnlength) ) { >>>>> > # $genefusion{ $query } = $hitname; >>>>> > # } >>>>> > # elsif ( ($evalue >= $thresholdevalue) || >>>>> ($identity <= >>>>> > $minimumidentity) || ($bits <= $minimumbitsscore) ) { >>>>> > # $diverged{ $query } = $hitname; >>>>> > # } >>>>> > # else { >>>>> > # $besthits{ $query } = $hitname; >>>>> > # } >>>>> > # } >>>>> > } >>>>> > } >>>>> > $refbh = \%besthits; >>>>> > $refdiv = \%diverged; >>>>> > $refred = 
\%redundant; >>>>> > $reffus = \%genefusion; >>>>> > $cb = keys ( %besthits ); >>>>> > $cd = keys( %diverged ); >>>>> > $cr = keys( %redundant ); >>>>> > $cf = keys( %genefusion ); >>>>> > >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>> >>>> >>>> >>> >> > From cjfields at illinois.edu Mon May 16 16:46:26 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 16 May 2011 15:46:26 -0500 Subject: [Bioperl-l] Bio::Tools::Primer3Redux released to CPAN Message-ID: <091955B1-AFDF-4DC2-89ED-F04D3F251EC7@illinois.edu> All, Forgot to mention this on the list: I have released some code I had lying around (used in a local project) as Bio::Tools::Primer3Redux. Grab it from CPAN if ya want it: http://search.cpan.org/dist/Bio-Tools-Primer3Redux/ The main repo on github: https://github.com/cjfields/bio-tools-primer3redux == Features == * Support for both primer3 v1 and v2. * The distribution contains both the wrapper (Bio::Tools::Run::Primer3Redux), the parser, and the related modules for storing data. In other words, it's self-contained and not tied to a specific BioPerl version beyond some basic classes (Bio::Root::Root and Bio::SeqFeature::Generic). * Some Tests! * Some Documentation! == Why Primer3Redux? == This is a rewrite of the BioPerl code for Primer3. The key reason for a new name: the API differs significantly enough from the older Primer3 code to pretty much require a different namespace. For some of the functionality I wanted at the time, the code pretty much required it (such as using hierarchal features). == Why not include this within BioPerl? == Well, truthfully, I thought it would be a good idea to demonstrate that one can both (1) release BioPerl-reliant code on CPAN as a separate focused bundle and still contribute to BioPerl, and (2) use modern perl tools to do so (e.g. Dist::Zilla). 
For more (and likely better) examples of the former, see: Bio::Graphics, Bio::DB::Sam, Bio::Chado::Schema, and numerous other modules.

== To Do ==

* Would probably be a good idea to genericize the base feature class being used so one could use any Bio::SeqFeatureI (e.g. Bio::DB::SeqFeature, for instance).
* Separate the parser out; primer3 is still using Boulder format!

== Thanks ==

* Frank Schwach and Cass Johnston, for their input and bug reports.
* UIUC

chris

From cjfields at illinois.edu Mon May 16 16:58:10 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 16 May 2011 15:58:10 -0500
Subject: [Bioperl-l] Problems parsing blast reports
In-Reply-To:
References:
Message-ID: <0E2C1C84-0D30-4618-9A07-2E24D5813687@illinois.edu>

Agreed, though I would rather screen based on the actual statistic I wanted than assume the first one will always give the best of anything (I'm a bit of a devil's advocate in that regard). Following is some completely untested and probably incorrect code (batteries not included, offer void in all 50 states), but you get the idea:

=================================
use List::Util qw(reduce);
...
while ( my $result = $in->next_result ) {
    # hit with the lowest e-value; reduce returns undef when there are no hits
    my $best_hit = reduce { $a->significance < $b->significance ? $a : $b } $result->hits;
    if ( defined($best_hit) ) {
        # HSP with the lowest e-value within that hit
        my $best_hsp = reduce { $a->evalue < $b->evalue ? $a : $b } $best_hit->hsps;
        # do earth-shattering stuff here
    }
}
=================================

chris

On May 16, 2011, at 3:14 PM, Dave Messina wrote:

> Hi Lorenzo,
>
> Hsps for a given hit are also ordered by score so is it really necessary to
>> loop?.
>
> If you know you'll always be interested in the highest-scoring hsp for a
> hit, and if you're not using anything in the hsp object. then yes, it should
> always come first and I don't think you'll need the hsp loop.
>
> Glad to hear you've got it working!
> > Best, > Dave > > > > > > >> Best, >> Lorenzo >> PS: Sorry, I forgot to reply all in my last message >> >> while( my $result = $in->next_result) >>> { >>> $total++; >>> my $query = $result->query_name(); >>> while (my $hit = $result->next_hit) >>> { >>> my $hitname = $hit->name(); >>> my $bits = $hit->bits(); >>> my $evalue = $hit->significance(); >>> if ( $result->num_hits == 1 and $query eq $hitname ) >>> { >>> $cs++; >>> $singletons{ $cs } = "$query;$hitname"; >>> last; >>> >>> } >>> if ( $query ne $hitname and $bits >= $minimumbitsscore and >>> $evalue <= $thresholdevalue ) >>> { >>> $cb++; >>> $besthits{ $cb } = "$query;$hitname"; >>> last; >>> } >>> elsif ( $query ne $hitname and $bits <= >>> $minimumbitsscore and $evalue >= $thresholdevalue) >>> { >>> $cd++; >>> $diverged { $cd } = "$query;$hitname"; >>> last; >>> } >>> # while( my $hsp = $hit->next_hsp ) { >>> # #last; >>> # } >>> } >>> } >>> >>> On Mon, May 16, 2011 at 2:40 PM, Dave Messina wrote: >> >>> Hi Lorenzo, >>> >>> Please remember to reply all so the mailing list sees the whole >>> discussion. >>> >>> I looked at your code and used your data, and the reason you're not always >>> getting the best hit in your output is that subsequent hits which also pass >>> your filter are writing over the best hit in your hash. So it's not a random >>> hit that's getting saved, it's the last one that matched your criteria. >>> >>> To fix this, you need to short-circuit out of the loop once you've found a >>> hit matching your criteria. >>> >>> Label the next_result loop like: >>> >>> RESULT: while ( my $result = $in->next_result ) { >>> >>> and then have the last line inside your filtering if block be >>> >>> next RESULT; >>> >>> Assuming "best hit" to you means best e-value, the above works since hits >>> are sorted by best to worst e-value. >>> >>> >>> I'd also like to get the queries only returning itself as hits >>>> (singletons) but I must make run besthits properly before (any suggestion?). 
>>> >>> >>> For this, I'd suggest that you actually save the best hit regardless of >>> whether it's a self-match, and then if there's a hit passing your criteria >>> that isn't a self-match, you test for that and replace the previously saved >>> self-match in %besthits. (Again, short-circuiting out of the loop once >>> you've got that hit.) >>> >>> >>> For the moment I'm only testing bitscore and thresholdevalue so I don't >>>> need to loop over hsp (do I?). >>> >>> >>> No, remember each line in the -m8 output is an hsp, not a hit. So if you >>> only loop over the hits, you'll only see the first hsp. In your example >>> dataset, there weren't any with multiple hsps, so that's why you didn't run >>> into this problem. >>> >>> >>> Dave >>> >>> >>> >>> >>> On Sat, May 14, 2011 at 22:17, Lorenzo Carretero >>> wrote: >>> >>>> Hi David, >>>> Thanks for your reply. >>>> My scripts is within a subroutine within a .pm file called from a .pl >>>> script. I checked the arguments are being passed correctly. For the moment >>>> I'm only testing bitscore and thresholdevalue so I don't need to loop over >>>> hsp (do I?). Of course, I'm using the strict and warning pragmas and use >>>> Bio::SearchIO; I tried with a reduced version of my BLASTP output (attached >>>> as a file). This is what I get in %besthits: >>>> >>>>> gnl|Alyrata|AL2G21220,gnl|Alyrata|AL6G21100 >>>>> gnl|Alyrata|AL0G11620,gnl|Alyrata|AL1G37580 >>>>> gnl|Alyrata|AL6G05070,gnl|Alyrata|AL3G15690 >>>>> gnl|Alyrata|AL2G12090,gnl|Alyrata|AL4G18800 >>>>> gnl|Alyrata|AL1G15460,gnl|Alyrata|AL0G01870 >>>>> >>>> >>>> My script only returns the correct hits for AL6G05070 and AL0G11260, >>>> while the 3d for AL2G21220, the 17th for AL2G12090 and the 13th for >>>> AL1G15460. Note that the best non-self hit for AL2G12090 is not the second >>>> one but the first as it shows better score than the self-hit. 
I'd also like >>>> to get the queries only returning itself as hits (singletons) but I must >>>> make run besthits properly before (any suggestion?). >>>> Thanks again, >>>> Lorenzo >>>> >>>> On Sat, May 14, 2011 at 6:15 PM, Dave Messina wrote: >>>> >>>>> Hi Lorenzo, >>>>> >>>>> Hmm, I tried your code, and it worked for me. >>>>> >>>>> I set your input variables set as follows: >>>>> my ( $filename, $format, $minimumbitsscore, $thresholdevalue, >>>>> $minimumidentity, >>>>> $maximumredundancy, $minimumalnlength ) >>>>> = ('2008.blasttable', 'blasttable', 300, '1e-100', 10, 0, 10); >>>>> >>>>> where the file is t/data/2008.blasttable that comes with the BioPerl >>>>> distro. With those settings, the top hit goes into %besthits and the third >>>>> hit goes into %diverged. >>>>> >>>>> I also added at the top: >>>>> use strict; >>>>> use warnings; >>>>> use Bio::SearchIO; >>>>> >>>>> You *are* using strict and warnings, aren't you? :) >>>>> >>>>> One thing that may be an issue is that you're doing this at the hit >>>>> level, and remember that in -m8 format each line represents an HSP. >>>>> >>>>> I'd need your input ? really, a neat little test case and the >>>>> corresponding erroneous output ? to help further. >>>>> >>>>> Dave >>>>> >>>>> >>>>> >>>>> On Sat, May 14, 2011 at 17:29, Lorenzo Carretero < >>>>> lcpaulet at googlemail.com> wrote: >>>>> >>>>>> Hi all, >>>>>> I'm trying to parse blasttable '-m 8 format' reports from whole >>>>>> intra-genome >>>>>> comparisons of all vs all to get the best non-self hit and diverged >>>>>> best >>>>>> non-self hit as separate hashes (key=query=>value=non_self_hit). The >>>>>> following script seems to run ok but it returns unexpected results >>>>>> (i.e. it >>>>>> doesn't catch the best hit but a random hit apparently). I assume hits >>>>>> are >>>>>> iterated over the while loop (while (my $hit = $result->next_hit) as >>>>>> returned in the blast results (i.e. sorted by hits bit scores). 
>>>>>> Any help would be much appreciated. >>>>>> Cheers, >>>>>> Lorenzo >>>>>> >>>>>> >>>>>> my ( $filename, $format, $minimumbitsscore, $thresholdevalue, >>>>>>> $minimumidentity, $maximumredundancy, $minimumalnlength ) = @_; >>>>>>> my ( %besthits, %diverged, %redundant ,%genefusion ) = (); >>>>>>> my ( $refbh, $refdiv, $cb, $cd, $refred, $reffus, $cr, $cf); >>>>>>> my $total = 0; >>>>>>> my $in = new Bio::SearchIO ( -file => $filename, >>>>>>> -format => $format, >>>>>>> #-verbose => -1, >>>>>>> ) >>>>>>> or die "No $filename BLAST file with >>>>>>> $format found"; >>>>>>> while( my $result = $in->next_result) { >>>>>>> $total++; >>>>>>> while (my $hit = $result->next_hit) { >>>>>>> my $query = $result->query_name(); >>>>>>> my $hitname = $hit->name(); >>>>>>> my $bits = $hit->bits(); >>>>>>> my $evalue = $hit->significance(); >>>>>>> if ($query ne $hitname and $bits >= $minimumbitsscore and >>>>>>> $evalue <= $thresholdevalue ) { >>>>>>> $besthits{ $query } = $hitname; >>>>>>> } >>>>>>> elsif ( $bits <= $minimumbitsscore and $evalue >= >>>>>>> $thresholdevalue){ >>>>>>> $diverged { $query } = $hitname; >>>>>>> } >>>>>>> >>>>>>> # while( my $hsp = $hit->next_hsp ) { >>>>>>> # my $querylen = $hsp->length( 'query' ); >>>>>>> # my $hitlen = $hsp->length( 'hit' ); >>>>>>> # my $alnlen = $hsp->length( 'total' ); >>>>>>> # my $identity = $hsp->percent_identity(); >>>>>>> # if ( $identity > $maximumredundancy ) { >>>>>>> # $redundant{ $query } = $hitname; >>>>>>> # } >>>>>>> # elsif ( ($querylen <= ($alnlen / 2) ) || >>>>>> ($hitlen <= >>>>>>> ($alnlen / 2) ) || ($alnlen <= $minimumalnlength) ) { >>>>>>> # $genefusion{ $query } = $hitname; >>>>>>> # } >>>>>>> # elsif ( ($evalue >= $thresholdevalue) || >>>>>> ($identity <= >>>>>>> $minimumidentity) || ($bits <= $minimumbitsscore) ) { >>>>>>> # $diverged{ $query } = $hitname; >>>>>>> # } >>>>>>> # else { >>>>>>> # $besthits{ $query } = $hitname; >>>>>>> # } >>>>>>> # } >>>>>>> } >>>>>>> } >>>>>>> $refbh = \%besthits; 
>>>>>>> $refdiv = \%diverged; >>>>>>> $refred = \%redundant; >>>>>>> $reffus = \%genefusion; >>>>>>> $cb = keys ( %besthits ); >>>>>>> $cd = keys( %diverged ); >>>>>>> $cr = keys( %redundant ); >>>>>>> $cf = keys( %genefusion ); >>>>>>> >>>>>> _______________________________________________ >>>>>> Bioperl-l mailing list >>>>>> Bioperl-l at lists.open-bio.org >>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>>> >>>>> >>>>> >>>> >>> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From David.Messina at sbc.su.se Tue May 17 03:53:05 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Tue, 17 May 2011 09:53:05 +0200 Subject: [Bioperl-l] Bio::Tools::Primer3Redux released to CPAN In-Reply-To: <091955B1-AFDF-4DC2-89ED-F04D3F251EC7@illinois.edu> References: <091955B1-AFDF-4DC2-89ED-F04D3F251EC7@illinois.edu> Message-ID: Very cool, Chris! This will be a nice template for BioPerl Modules of the Future. Dave On Mon, May 16, 2011 at 22:46, Chris Fields wrote: > All, > > Forgot to mention this on the list: I have released some code I had lying > around (used in a local project) as Bio::Tools::Primer3Redux. Grab it from > CPAN if ya want it: > > http://search.cpan.org/dist/Bio-Tools-Primer3Redux/ > > The main repo on github: > > https://github.com/cjfields/bio-tools-primer3redux > > == Features == > > * Support for both primer3 v1 and v2. > * The distribution contains both the wrapper > (Bio::Tools::Run::Primer3Redux), the parser, and the related modules for > storing data. In other words, it's self-contained and not tied to a > specific BioPerl version beyond some basic classes (Bio::Root::Root and > Bio::SeqFeature::Generic). > * Some Tests! > * Some Documentation! > > == Why Primer3Redux? == > > This is a rewrite of the BioPerl code for Primer3. 
The key reason for a > new name: the API differs significantly enough from the older Primer3 code > to pretty much require a different namespace. For some of the functionality > I wanted at the time, the code pretty much required it (such as using > hierarchal features). > > == Why not include this within BioPerl? == > > Well, truthfully, I thought it would be a good idea to demonstrate that one > can both (1) release BioPerl-reliant code on CPAN as a separate focused > bundle and still contribute to BioPerl, and (2) use modern perl tools to do > so (e.g. Dist::Zilla). For more (and likely better) examples of the former, > see: Bio::Graphics, Bio::DB::Sam, Bio::Chado::Schema, and numerous other > modules. > > == To Do == > > * Would probably be a good idea to genericize the base feature class being > used so one could use any Bio::SeqFeatureI (e.g. Bio::DB::SeqFeature, for > instance). > * Separate the parser out; primer3 is still using Boulder format! > > == Thanks == > > * Frank Schwach and Cass Johnston, for their input and bug reports. > * UIUC > > chris > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From abualiga at gmail.com Wed May 18 17:37:40 2011 From: abualiga at gmail.com (Galeb Abu-Ali) Date: Wed, 18 May 2011 17:37:40 -0400 Subject: [Bioperl-l] failed to install Bio::Tools::Run::StandAloneBlastPlus Message-ID: Hi, I'm having trouble installing Bio::Tools::Run::StandAloneBlastPlus, which I'd like to try since I use standalone BLAST+. I realize the obstacle has to do with CJFIELDS/BioPerl-Run-1.006900, but have no idea how to fix it. Below is the Test Summary Report and cpan-tester results for CJFIELDS/BioPerl-Run-1.006900.tar.gz. I use a RHEL 5.5 machine. Many thanks for your suggestions. galeb Test Summary Report ------------------- t/BWA.t (Wstat: 512 Tests: 0 Failed: 0) Non-zero exit status: 2 Parse errors: Bad plan. 
You planned 36 tests but ran 0. t/Blat.t (Wstat: 65280 Tests: 20 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 33 tests but ran 20. t/Samtools.t (Wstat: 512 Tests: 24 Failed: 0) Non-zero exit status: 2 Parse errors: Bad plan. You planned 40 tests but ran 24. Files=80, Tests=2799, 34 wallclock secs ( 0.37 usr 0.06 sys + 30.90 cusr 3.07 csys = 34.40 CPU) Result: FAIL Failed 3/80 test programs. 0/2799 subtests failed. CJFIELDS/BioPerl-Run-1.006900.tar.gz ./Build test -- NOT OK //hint// to see the cpan-testers results for installing this module, try: reports CJFIELDS/BioPerl-Run-1.006900.tar.gz Running Build install make test had returned bad status, won't install without force Failed during this command: CJFIELDS/BioPerl-Run-1.006900.tar.gz : make_test NO cpan[2]> reports CJFIELDS/BioPerl-Run-1.006900.tar.gz Distribution: C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz Fetching 'http://www.cpantesters.org/show/BioPerl-Run.yaml'...DONE Catching error: "CPAN::Exception::yaml_process_error=HASH(0x1964d450)" at /usr/local/lib/perl5/5.12.2/CPAN.pm line 391 CPAN::shell() called at /usr/local/lib/perl5/5.12.2/App/Cpan.pm line 295 App::Cpan::_process_options('App::Cpan') called at /usr/local/lib/perl5/5.12.2/App/Cpan.pm line 364 App::Cpan::run('App::Cpan') called at /usr/local/bin/cpan line 11 From rmb32 at cornell.edu Wed May 18 20:00:08 2011 From: rmb32 at cornell.edu (Robert Buels) Date: Wed, 18 May 2011 17:00:08 -0700 Subject: [Bioperl-l] failed to install Bio::Tools::Run::StandAloneBlastPlus In-Reply-To: References: Message-ID: <4DD45D88.6010207@cornell.edu> Hi Galeb, In your email, could you include the *full* test output? Just the summary isn't enough for us to diagnose what is happening. Rob On 05/18/2011 02:37 PM, Galeb Abu-Ali wrote: > Hi, > > I'm having trouble installing Bio::Tools::Run::StandAloneBlastPlus, which > I'd like to try since I use standalone BLAST+. 
> I realize the obstacle has to do with CJFIELDS/BioPerl-Run-1.006900, but > have no idea how to fix it. Below is the Test Summary Report and cpan-tester > results for CJFIELDS/BioPerl-Run-1.006900.tar.gz. I use a RHEL 5.5 machine. > Many thanks for your suggestions. > > galeb > > > > Test Summary Report > ------------------- > t/BWA.t (Wstat: 512 Tests: 0 Failed: 0) > Non-zero exit status: 2 > Parse errors: Bad plan. You planned 36 tests but ran 0. > t/Blat.t (Wstat: 65280 Tests: 20 Failed: 0) > Non-zero exit status: 255 > Parse errors: Bad plan. You planned 33 tests but ran 20. > t/Samtools.t (Wstat: 512 Tests: 24 Failed: 0) > Non-zero exit status: 2 > Parse errors: Bad plan. You planned 40 tests but ran 24. > Files=80, Tests=2799, 34 wallclock secs ( 0.37 usr 0.06 sys + 30.90 cusr > 3.07 csys = 34.40 CPU) > Result: FAIL > Failed 3/80 test programs. 0/2799 subtests failed. > CJFIELDS/BioPerl-Run-1.006900.tar.gz > ./Build test -- NOT OK > //hint// to see the cpan-testers results for installing this module, try: > reports CJFIELDS/BioPerl-Run-1.006900.tar.gz > Running Build install > make test had returned bad status, won't install without force > Failed during this command: > CJFIELDS/BioPerl-Run-1.006900.tar.gz : make_test NO > > cpan[2]> reports CJFIELDS/BioPerl-Run-1.006900.tar.gz > Distribution: C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz > Fetching 'http://www.cpantesters.org/show/BioPerl-Run.yaml'...DONE > > Catching error: "CPAN::Exception::yaml_process_error=HASH(0x1964d450)" at > /usr/local/lib/perl5/5.12.2/CPAN.pm line 391 > CPAN::shell() called at /usr/local/lib/perl5/5.12.2/App/Cpan.pm line > 295 > App::Cpan::_process_options('App::Cpan') called at > /usr/local/lib/perl5/5.12.2/App/Cpan.pm line 364 > App::Cpan::run('App::Cpan') called at /usr/local/bin/cpan line 11 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From rmb32 at cornell.edu 
Wed May 18 19:57:07 2011
From: rmb32 at cornell.edu (Robert Buels)
Date: Wed, 18 May 2011 16:57:07 -0700
Subject: [Bioperl-l] cpan indexing
Message-ID: <4DD45CD3.40303@cornell.edu>

So,

rob at nightshade ~$ cpanm Bio::PrimarySeq
--> Working on Bio::PrimarySeq
Fetching http://search.cpan.org/CPAN/authors/id/C/CJ/CJFIELDS/BioPerl-1.6.1.tar.gz ...

What? 1.6.1? That's not right.

Looking in http://search.cpan.org/CPAN/modules/02packages.details.txt.gz, we see things like:

Bio::Align::AlignI              1.006001  BioPerl-1.6.1.tar.gz
Bio::Align::DNAStatistics       1.006001  BioPerl-1.6.1.tar.gz
Bio::Align::Graphics            0         BioPerl-1.6.900.tar.gz
Bio::Align::PairwiseStatistics  1.006001  BioPerl-1.6.1.tar.gz
Bio::Align::ProteinStatistics   1.006001  BioPerl-1.6.1.tar.gz
Bio::Align::StatisticsI         1.006001  BioPerl-1.6.1.tar.gz

It looks like 1.6.1 was uploaded with $VERSION set in every module, while 1.6.9 was not (thus the 0 version numbers). Thus, for all but the modules that were added in 1.6.9, 1.6.1 is being treated by the CPAN indexer as the most recent distribution.

This probably calls for another release, 1.6.910 or something, with $VERSION in each file. :-(

Rob

From rmb32 at cornell.edu Wed May 18 19:58:53 2011
From: rmb32 at cornell.edu (Robert Buels)
Date: Wed, 18 May 2011 16:58:53 -0700
Subject: [Bioperl-l] cpan indexing
In-Reply-To: <4DD45CD3.40303@cornell.edu>
References: <4DD45CD3.40303@cornell.edu>
Message-ID: <4DD45D3D.9050702@cornell.edu>

Forgot to include the related reading about $VERSION:

http://www.cpan.org/modules/04pause.html#conventions

Rob

On 05/18/2011 04:57 PM, Robert Buels wrote:
> So,
>
> rob at nightshade ~$ cpanm Bio::PrimarySeq
> --> Working on Bio::PrimarySeq
> Fetching
> http://search.cpan.org/CPAN/authors/id/C/CJ/CJFIELDS/BioPerl-1.6.1.tar.gz ...
>
>
> What? 1.6.1? That's not right.
> > Looking in > http://search.cpan.org/CPAN/modules/02packages.details.txt.gz, we see > things like: > > Bio::Align::AlignI 1.006001 BioPerl-1.6.1.tar.gz > Bio::Align::DNAStatistics 1.006001 BioPerl-1.6.1.tar.gz > Bio::Align::Graphics 0 BioPerl-1.6.900.tar.gz > Bio::Align::PairwiseStatistics 1.006001 BioPerl-1.6.1.tar.gz > Bio::Align::ProteinStatistics 1.006001 BioPerl-1.6.1.tar.gz > Bio::Align::StatisticsI 1.006001 BioPerl-1.6.1.tar.gz > > It looks like 1.6.1 was uploaded with $VERSION of every module, while > 1.6.9 was not (thus the 0 version numbers). Thus, for all but the > omodules that were added in 1.6.9, 1.6.1 is being treated by the CPAN > indexer as the most recent distribution. > > This probably calls for another release, 1.6.910 or something, with > $VERSION in each file. :-( > > Rob From lcpaulet at googlemail.com Tue May 17 19:25:24 2011 From: lcpaulet at googlemail.com (Lorenzo Carretero) Date: Wed, 18 May 2011 01:25:24 +0200 Subject: [Bioperl-l] questions on Bio::Tools::Run::Alignment::Clustalw Message-ID: Hi all, I have a few question regarding the package Bio::Tools::Run::Alignment::Clustalw. 
The following script: #!/usr/local/bin/perl -w > use 5.010; > use strict; > > use lib "/Library/Perl/"; > use Bio::Perl; > use Bio::Seq; > use Bio::SeqIO; > # definition of the environmental variable CLUSTALDIR > BEGIN {$ENV{CLUSTALDIR} = > '/Applications/Bioinformatics/clustalw-2.0.10-macosx/ '} > use Bio::Tools::Run::Alignment::Clustalw; > > my $sequencesfilename = > "/Users/Lorenzo/Documents/SequencesDatabase/plaza_public_02_Apr27/plaza_public_02/BLAST_Parsed_results/PerSpecies/test_vs_test.besth.pep1.fas > "; > my $format = 'fasta'; > #my $inseq = Bio::SeqIO->new(-file => "<$sequencesfilename", > # -format => $format ); > > my $factory = Bio::Tools::Run::Alignment::Clustalw->new (); #use default > parameters > #my @seq_object_array = read_all_sequences( -file => > "<$sequencesfilename", > # -format => $format ); > #my $seq_array_ref = \@seq_object_array; > #my $aln = $factory->align($seq_array_ref); > my $aln = $factory->align($sequencesfilename); > my $avgpercentid = $aln->percentage_identity; > my $alnlength = $aln->length(); > my $numberalnresidues = $aln->no_residues; > print "$avgpercentid and $alnlength and $numberalnresidues\n"; > is returning the following error message: Use of uninitialized value in concatenation (.) or string at > /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm line 753. > Use of uninitialized value in concatenation (.) or string at > /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm line 754. 
> sh: align: command not found > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: ClustalW call ( align > -infile="/Users/Lorenzo/Desktop/test_vs_test.besth.pep1.fas" -output=gcg > -outfile="/var/folders/rA/rApd7cXoFyWK-Yhn66cxZk+++TI/-Tmp-/O3Was62L0X/exicCvJnrF" > 2>&1) crashed: 32512 > STACK: Error::throw > STACK: Bio::Root::Root::throw /Library/Perl//5.10.0/Bio/Root/Root.pm:368 > STACK: Bio::Tools::Run::Alignment::Clustalw::_run > /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm:768 > STACK: Bio::Tools::Run::Alignment::Clustalw::align > /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm:515 > STACK: /Users/Lorenzo/Documents/workspace/PlantEvolGen/test.pl:22 > ----------------------------------------------------------- > What would be more efficient in term of memory usage: i.-performing the alignment directly over a fasta sequences file or ii.-performing the alignment over a ref to an array of seq objects: my @seq_object_array = read_all_sequences( -file => > "<$sequencesfilename", > -format => $format ); > my $seq_array_ref = \@seq_object_array; > my $aln = $factory->align($seq_array_ref); > Unfortunately my script is not running neither in this form. I checked and custalw is properly installed in the given dir It appears as the script is not reading properly my file (see attached document). Should I move the seqs files to the clustalw dir? FInally, is there any way of geting the number of aminoacids in the aligned region in eg. the longer or the shorter sequence implemented or should I loop over the sequences in the $aln Bio::SimpleAlign object etc?. Greetings from Spain, Lorenzo -------------- next part -------------- A non-text attachment was scrubbed... 
Name: test_vs_test.besth.pep1.fas Type: application/octet-stream Size: 1323 bytes Desc: not available URL: From lsbrath at gmail.com Wed May 18 14:16:49 2011 From: lsbrath at gmail.com (Mgavi Brathwaite) Date: Wed, 18 May 2011 14:16:49 -0400 Subject: [Bioperl-l] Poly Bioinformatics Event Message-ID:

Science Alliance

*Careers in Bioinformatics: From the Lab to the Clinic and Beyond*

MAY 21, 1:00 PM - 4:00 PM

*Lunch will be provided. Reception to follow.*

*Registration*
This is a free event.
Register at www.nyas.org/Bioinformatics

*Location*
Bern Dibner Library at NYU-POLY
5 Metrotech Center
Brooklyn, NY 11201

For additional information, e-mail nymeetings at nyas.org or call 212.298.3725

The 21st century has been named the century of biology and at the core of this revolution is the field of bioinformatics and computational biology. As we are well into the post genomic era and dealing with ever growing amounts of data the means to process, learn, and discover what we are producing from our high content experiments is key. This event will focus on the opportunities that exist in the fast growing field of bioinformatics and its applications to healthcare, energy, and agriculture. Also, we will provide information on local resources for graduate education at the masters level and up.

*Speakers*
*Richard Bonneau*, PhD, New York University
*Mgavi E. Brathwaite*, MS, Polytechnic Institute of New York University
*Tamer Chowdbury*, MS, Merck
*Kalle Levon*, PhD, Polytechnic Institute of New York University
*Bud Mishra*, NYU Courant Institute
*Usman W.
Roshan*, PhD, New Jersey Insitute of Technology Presented by *Science Alliance *and *NYU-POLY Graduate Center- Bioinformatics* * * From locarpau at upvnet.upv.es Wed May 18 18:41:26 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero) Date: Thu, 19 May 2011 00:41:26 +0200 Subject: [Bioperl-l] questions on Bio::Tools::Run::Alignment::Clustalw Message-ID: <4DD44B16.7020108@upvnet.upv.es> Hi all, I have a few questions regarding the package Bio::Tools::Run::Alignment::Clustalw. The following script: #!/usr/local/bin/perl -w use 5.010; use strict; use lib "/Library/Perl/"; use Bio::Perl; use Bio::Seq; use Bio::SeqIO; # definition of the environmental variable CLUSTALDIR BEGIN {$ENV{CLUSTALDIR} = '/Applications/Bioinformatics/clustalw-2.0.10-macosx/ '} use Bio::Tools::Run::Alignment::Clustalw; my $sequencesfilename = "/Users/Lorenzo/Documents/SequencesDatabase/plaza_public_02_Apr27/plaza_public_02/BLAST_Parsed_results/PerSpecies/test_vs_test.besth.pep1.fas "; my $format = 'fasta'; #my $inseq = Bio::SeqIO->new(-file => "<$sequencesfilename", # -format => $format ); my $factory = Bio::Tools::Run::Alignment::Clustalw->new (); #use default parameters #my @seq_object_array = read_all_sequences( -file => "<$sequencesfilename", # -format => $format ); #my $seq_array_ref = \@seq_object_array; #my $aln = $factory->align($seq_array_ref); my $aln = $factory->align($sequencesfilename); my $avgpercentid = $aln->percentage_identity; my $alnlength = $aln->length(); my $numberalnresidues = $aln->no_residues; print "$avgpercentid and $alnlength and $numberalnresidues\n"; is returning the following error message: Use of uninitialized value in concatenation (.) or string at /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm line 753. Use of uninitialized value in concatenation (.) or string at /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm line 754.
sh: align: command not found ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: ClustalW call ( align -infile="/Users/Lorenzo/Desktop/test_vs_test.besth.pep1.fas" -output=gcg -outfile="/var/folders/rA/rApd7cXoFyWK-Yhn66cxZk+++TI/-Tmp-/O3Was62L0X/exicCvJnrF" 2>&1) crashed: 32512 STACK: Error::throw STACK: Bio::Root::Root::throw /Library/Perl//5.10.0/Bio/Root/Root.pm:368 STACK: Bio::Tools::Run::Alignment::Clustalw::_run /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm:768 STACK: Bio::Tools::Run::Alignment::Clustalw::align /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm:515 STACK: /Users/Lorenzo/Documents/workspace/PlantEvolGen/test.pl:22 ----------------------------------------------------------- What would be more efficient in term of memory usage: i.-performing the alignment directly over a fasta sequences file or ii.-performing the alignment over a ref to an array of seq objects: my @seq_object_array = read_all_sequences( -file => "<$sequencesfilename", -format => $format ); my $seq_array_ref = \@seq_object_array; my $aln = $factory->align($seq_array_ref); Unfortunately my script is not running neither in this form. I checked and custalw is properly installed in the given dir It appears as the script is not reading properly my file (see attached document). Should I move the seqs files to the clustalw dir? FInally, is there any way of geting the number of aminoacids in the aligned region in eg. the longer or the shorter sequence implemented or should I loop over the sequences in the $aln Bio::SimpleAlign object etc?. Thanks for your help- Greetings from Spain, Lorenzo -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 
46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail: locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test_vs_test.besth.pep1.fas URL: From cjfields at illinois.edu Wed May 18 21:04:54 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 18 May 2011 20:04:54 -0500 Subject: [Bioperl-l] cpan indexing In-Reply-To: <4DD45D3D.9050702@cornell.edu> References: <4DD45CD3.40303@cornell.edu> <4DD45D3D.9050702@cornell.edu> Message-ID: <0E5A4CF7-8469-4AEA-8896-13BDC7796658@illinois.edu> We can do that (release a new version). This likely has to do with the simplification of Bio::Root::Build that went in with v 1.6.9; I can check the code to see if there was some munging that simplified this. chris On May 18, 2011, at 6:58 PM, Robert Buels wrote: > Forgot to include the related reading about $VERSION: > > http://www.cpan.org/modules/04pause.html#conventions > > Rob > > On 05/18/2011 04:57 PM, Robert Buels wrote: >> So, >> >> rob at nightshade ~$ cpanm Bio::PrimarySeq >> --> Working on Bio::PrimarySeq >> Fetching >> http://search.cpan.org/CPAN/authors/id/C/CJ/CJFIELDS/BioPerl-1.6.1.tar.gz ... >> >> >> What? 1.6.1? That's not right. >> >> Looking in >> http://search.cpan.org/CPAN/modules/02packages.details.txt.gz, we see >> things like: >> >> Bio::Align::AlignI 1.006001 BioPerl-1.6.1.tar.gz >> Bio::Align::DNAStatistics 1.006001 BioPerl-1.6.1.tar.gz >> Bio::Align::Graphics 0 BioPerl-1.6.900.tar.gz >> Bio::Align::PairwiseStatistics 1.006001 BioPerl-1.6.1.tar.gz >> Bio::Align::ProteinStatistics 1.006001 BioPerl-1.6.1.tar.gz >> Bio::Align::StatisticsI 1.006001 BioPerl-1.6.1.tar.gz >> >> It looks like 1.6.1 was uploaded with $VERSION of every module, while >> 1.6.9 was not (thus the 0 version numbers). 
Thus, for all but the >> omodules that were added in 1.6.9, 1.6.1 is being treated by the CPAN >> indexer as the most recent distribution. >> >> This probably calls for another release, 1.6.910 or something, with >> $VERSION in each file. :-( >> >> Rob > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed May 18 22:55:05 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 18 May 2011 21:55:05 -0500 Subject: [Bioperl-l] cpan indexing In-Reply-To: <0E5A4CF7-8469-4AEA-8896-13BDC7796658@illinois.edu> References: <4DD45CD3.40303@cornell.edu> <4DD45D3D.9050702@cornell.edu> <0E5A4CF7-8469-4AEA-8896-13BDC7796658@illinois.edu> Message-ID: <91AF9D19-D2CD-41E3-9649-D5A9459573AE@illinois.edu> New BioPerl uploaded, version 1.6.901. The PAUSE indexer has all modules with a 1.006901 version, even if $VERSION isn't directly set. Appears that META.json/yml are parsed for the version and not each module (it's what the dist_version option is supposed to be used for); just uncommented some Bio::Root::Build code that was apparently used for this reason and it seems to work now. chris On May 18, 2011, at 8:04 PM, Chris Fields wrote: > We can do that (release a new version). This likely has to do with the simplification of Bio::Root::Build that went in with v 1.6.9; I can check the code to see if there was some munging that simplified this. > > chris > > On May 18, 2011, at 6:58 PM, Robert Buels wrote: > >> Forgot to include the related reading about $VERSION: >> >> http://www.cpan.org/modules/04pause.html#conventions >> >> Rob >> >> On 05/18/2011 04:57 PM, Robert Buels wrote: >>> So, >>> >>> rob at nightshade ~$ cpanm Bio::PrimarySeq >>> --> Working on Bio::PrimarySeq >>> Fetching >>> http://search.cpan.org/CPAN/authors/id/C/CJ/CJFIELDS/BioPerl-1.6.1.tar.gz ... >>> >>> >>> What? 1.6.1? That's not right. 
>>> >>> Looking in >>> http://search.cpan.org/CPAN/modules/02packages.details.txt.gz, we see >>> things like: >>> >>> Bio::Align::AlignI 1.006001 BioPerl-1.6.1.tar.gz >>> Bio::Align::DNAStatistics 1.006001 BioPerl-1.6.1.tar.gz >>> Bio::Align::Graphics 0 BioPerl-1.6.900.tar.gz >>> Bio::Align::PairwiseStatistics 1.006001 BioPerl-1.6.1.tar.gz >>> Bio::Align::ProteinStatistics 1.006001 BioPerl-1.6.1.tar.gz >>> Bio::Align::StatisticsI 1.006001 BioPerl-1.6.1.tar.gz >>> >>> It looks like 1.6.1 was uploaded with $VERSION of every module, while >>> 1.6.9 was not (thus the 0 version numbers). Thus, for all but the >>> omodules that were added in 1.6.9, 1.6.1 is being treated by the CPAN >>> indexer as the most recent distribution. >>> >>> This probably calls for another release, 1.6.910 or something, with >>> $VERSION in each file. :-( >>> >>> Rob >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From David.Messina at sbc.su.se Thu May 19 05:54:18 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Thu, 19 May 2011 11:54:18 +0200 Subject: [Bioperl-l] questions on Bio::Tools::Run::Alignment::Clustalw In-Reply-To: <4DD44B16.7020108@upvnet.upv.es> References: <4DD44B16.7020108@upvnet.upv.es> Message-ID: Hi Lorenzo, Your code and data works for me with both clustalw v1.83 and 2.1. However, I did have to change the name of the clustalw 2.1 executable from clustalw2 to clustalw. $ perl lorenzo.pl test_vs_test.besth.pep1.fas CLUSTAL 2.1 Multiple Sequence Alignments Sequence format is Pearson Sequence 1: gnl|Alyrata|AL6G05070 602 aa Sequence 2: gnl|Alyrata|AL3G15690 611 aa Start of Pairwise alignments Aligning... Sequences (1:2) Aligned. 
Score: 33
Guide tree file created: [test_vs_test.besth.pep1.dnd]

There are 1 groups
Start of Multiple Alignment

Aligning...
Group 1: Sequences: 2 Score:6856
Alignment Score 1214

GCG-Alignment file created [/var/folders/Na/NagaNXNhHHm1GDx6seD-ME+++TI/-Tmp-/sniIE2msWJ/fGoixJVoUf]

--------------------- WARNING ---------------------
MSG: Use of method no_residues() is deprecated, use num_residues() instead
To be removed in 1.0075
---------------------------------------------------
34.8639455782313 and 625 and 1213


> What would be more efficient in term of memory usage:
> i.-performing the alignment directly over a fasta sequences file or
> ii.-performing the alignment over a ref to an array of seq objects:

Option i. But unless you're doing a ton, you probably won't notice either way, so I would do whichever is more convenient.

> Should I move the seqs files to the clustalw dir?

No, this isn't the problem. In the error message:

MSG: ClustalW call ( align -infile="/Users/Lorenzo/Desktop/test_vs_test.besth.pep1.fas" -output=gcg -outfile="/var/folders/rA/rApd7cXoFyWK-Yhn66cxZk+++TI/-Tmp-/O3Was62L0X/exicCvJnrF" 2>&1) crashed: 32512

I notice that the path to your input file in that error is different than the path in your code; perhaps this is the issue?

> FInally, is there any way of geting the number of aminoacids in the aligned
> region in eg. the longer or the shorter sequence implemented or should I
> loop over the sequences in the $aln Bio::SimpleAlign object etc?.

I'm not sure I understand your question: do you want something different than $aln->length()?
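
If what you are after is the number of residues in each aligned sequence, excluding gaps, here is a minimal sketch of the counting step. It assumes you already have each aligned sequence as a plain string (for instance from looping over each_seq() and calling seq() on each entry) and that gaps are written as '-' or '.'. The count_residues helper and the two sample strings are hypothetical, just for illustration; they are not part of BioPerl.

```perl
use strict;
use warnings;

# Count residues in one aligned sequence string, ignoring gap characters.
# Assumption: gaps are '-' or '.'; adjust the tr/// list if your
# alignment uses other symbols.
sub count_residues {
    my ($aligned) = @_;
    # tr/// with an empty replacement list only counts matching
    # characters; it does not modify $aligned.
    my $gaps = ( $aligned =~ tr/-.// );
    return length($aligned) - $gaps;
}

# Hypothetical aligned strings, standing in for what $seq->seq()
# would return for each sequence in the alignment.
my %aligned = (
    seqA => 'MKT--LLVA.',
    seqB => 'MKTAALL-AG',
);

for my $id ( sort keys %aligned ) {
    printf "%s: %d residues\n", $id, count_residues( $aligned{$id} );
}
```

Comparing the per-sequence counts then tells you which sequence contributes more residues, while $aln->length() gives the number of alignment columns including gaps.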
Dave

From R.A.Vos at reading.ac.uk Thu May 19 07:22:43 2011
From: R.A.Vos at reading.ac.uk (Rutger Vos)
Date: Thu, 19 May 2011 12:22:43 +0100
Subject: [Bioperl-l] 10 days left to register: workshop phylogenetic pipelines, August 1-11
Message-ID:

COMPUTATIONAL PHYLOINFORMATICS
August 1 2011 through August 11 2011
Bioinformatics Center of Kyoto University
Application Deadline: May 31, 2011
http://academy.nescent.org/wiki/Computational_phyloinformatics

Computational Phyloinformatics is an 11-day international course (August 1-11, 2011) co-organized by the Computational Biology Research Center (CBRC/AIST), the Bioinformatics Center of Kyoto University, the Database Center for Life Science (DBCLS/JST), and the National Evolutionary Synthesis Center (NESCent). This course, which will take place at Kyoto University directly following the SMBE Meeting (http://smbe2011.com/), aims to give participants practical knowledge and hands-on skills in phyloinformatics.
The venue in Kyoto is completely unaffected by the unfortunate events in Fukushima and the power shortages in Tokyo. We encourage biologists from other countries to participate in the SMBE meeting and/or this special international course, in solidarity with the scientific community of Japan in their effort to return to normalcy and to help minimize any negative impacts that the earthquake may have on scientific activities in Japan. SYNOPSIS Biologists are faced with ever-larger datasets, more complex evolutionary models, and increasingly elaborate analytical methods. Seldom is it sufficient to run a dataset with an off-the-shelf program on a desktop PC; increasingly, biologists need to write scripts to interface with internet services and databases, build analytical pipelines, customize analyses, and distribute computation over multiple processors. This course is designed for graduate students, postdocs, faculty, and researchers in phylogenetics interested in receiving practical, hands-on training in the use of Perl and SQL for workflows and applications in phyloinformatics. The course is divided into four parts: PART I: A tutorial review of Perl, including object oriented programming and building packages. PART II: Introduction and practical use of BioPerl and Bio::Phylo, (e.g. scripting for large tree inference engines, automating model testing, genomic-scale data mining and acquisition, supertree assembly, rate smoothing and branch calibration, tree traversal, etc). PART III: Introduction and practical use of BioRuby for molecular evolution and functional genomics (e.g. scripting multiple sequence alignment, gene duplication inference, tree inference, etc.). PART IV: Introduction to SQL and database design; computing and querying nested sets and transitive closure; querying both large trees (e.g. NCBI) and large collections of trees (e.g. TreeBASE). 
Participants will learn how to write basic phylogenetic or comparative analysis scripts, parse NEXUS files, traverse and compute over trees, and make practical use of phylogenetic software libraries. These skills will be learned in a biological context, touching on a diverse array of topics such as analysis of large datasets, automation of supertree assembly, querying for topological patterns in large collections of trees, etc. Participants will leave the course with a full set of installations and libraries on their computer ready to build phyloinformatic workflows for their own research projects, as well as continued access to a 50+ page wiki "textbook" containing step-by-step instructions, problem sets, and examples.

INSTRUCTORS AND COURSE ORGANIZERS
Christian Zmasek, Karen Cranston, Rutger A. Vos, Susumu Goto, Toshiaki Katayama, William H. Piel

APPLICATION DEADLINE
May 31, 2011

TUITION
¥40,000 (~$500)
Participants are responsible for their own travel costs, including transportation and accommodation -- see the website for more information. International participants will benefit by combining attendance with the 2011 SMBE meeting.

SUBSIDIES AND SCHOLARSHIPS
A limited number of travel scholarships from NESCent are available for US-based students. Preference will be given to students from under-represented minorities.

The Asia-Pacific Bioinformatics Network (APBioNet) is happy to provide travel assistance for a limited number of students/early career researchers from the Asia-Pacific region. Applicants are requested to contact Dr Asif Khan, APBioNet Secretariat: asif -$- bic.nus.edu.sg (replace -$- with @) for details.

PREREQUISITES
BIOLOGY: A good understanding of phylogenetics --
for example, having already taken the Workshop on Molecular Evolution (http://www.molecularevolution.org/) or equivalent coursework or experience. COMPUTING: Prior experience with Perl or careful study of the suggested reading materials in advance of the class (see web site). Participants should have some experience with basic Unix shell commands. EQUIPMENT: Participants are expected to bring their own Mac OSX computer or a LINUX computer, else they will be provided with an iMac. Participants who cannot bring their own computer and will be using a supplied iMac, should consider bringing their own portable firewire/usb drive so that they can also leave the course with a full suite of phyloinformatic software tools. -- Dr. Rutger A. Vos School of Biological Sciences Philip Lyle Building, Level 4 University of Reading Reading, RG6 6BX, United Kingdom Tel: +44 (0) 118 378 7535 http://rutgervos.blogspot.com From locarpau at upvnet.upv.es Thu May 19 07:42:34 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero) Date: Thu, 19 May 2011 13:42:34 +0200 Subject: [Bioperl-l] questions on Bio::Tools::Run::Alignment::Clustalw In-Reply-To: References: <4DD44B16.7020108@upvnet.upv.es> Message-ID: <4DD5022A.5030708@upvnet.upv.es> On 5/19/11 11:54 AM, Dave Messina wrote: > Hi Lorenzo, > > Your code and data works for me with both clustalw v1.83 and 2.1. > However, I did have to change the name of the clustalw 2.1 executable > from clustalw2 to clustalw. > > $ perl lorenzo.pl test_vs_test.besth.pep1.fas > > CLUSTAL 2.1 Multiple Sequence Alignments > > Sequence format is Pearson > Sequence 1: gnl|Alyrata|AL6G05070 602 aa > Sequence 2: gnl|Alyrata|AL3G15690 611 aa > Start of Pairwise alignments > Aligning... > > Sequences (1:2) Aligned. Score: 33 > Guide tree file created: [test_vs_test.besth.pep1.dnd] > > There are 1 groups > Start of Multiple Alignment > > Aligning... 
> Group 1: Sequences: 2 Score:6856 > Alignment Score 1214 > > GCG-Alignment file created > [/var/folders/Na/NagaNXNhHHm1GDx6seD-ME+++TI/-Tmp-/sniIE2msWJ/fGoixJVoUf] > > > --------------------- WARNING --------------------- > MSG: Use of method no_residues() is deprecated, use num_residues() instead > To be removed in 1.0075 > --------------------------------------------------- > 34.8639455782313 and 625 and 1213 > > > > > > What would be more efficient in term of memory usage: > i.-performing the alignment directly over a fasta sequences file or > ii.-performing the alignment over a ref to an array of seq objects: > > > Option i. But unless you're doing a ton, you probably won't notice > either way, so I would do whichever is more convenient. > > > Should I move the seqs files to the clustalw dir? > > > No, this isn't the problem. In the error message: > MSG: ClustalW call ( align > -infile="/Users/Lorenzo/Desktop/test_vs_test.besth.pep1.fas" > -output=gcg > -outfile="/var/folders/rA/rApd7cXoFyWK-Yhn66cxZk+++TI/-Tmp-/O3Was62L0X/exicCvJnrF" > 2>&1) crashed: 32512 > > I notice that the path to your input file in that error is different > than the path in your code ? perhaps this is the issue? > > > FInally, is there any way of geting the number of aminoacids in > the aligned region in eg. the longer or the shorter sequence > implemented or should I loop over the sequences in the $aln > Bio::SimpleAlign object etc?. > > > I'm not sure I understand your question: do you want something > different than $aln->length() ? > > > > Dave > Thanks for the answer (and sorry for the multiple messages). I'll take a look again but my script still doesn't run, even after changing the name of the executable to clustalw. I ran the program loading files from different locations, and I posted a version of the script with attached fasta file from different a location. Anyway, what the error message means?: Use of uninitialized value in concatenation (.) 
or string at /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm line 753.
Use of uninitialized value in concatenation (.) or string at
/Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm line 754

I checked lines 753 and 754 of Clustalw.pm and found:

$self->debug( "Program ".$self->executable."\n");
my $commandstring = $self->executable." $command"." $instring"." -output=$output". " $param_string";

Similarly, I found

STACK: Bio::Tools::Run::Alignment::Clustalw::_run /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm:768
STACK: Bio::Tools::Run::Alignment::Clustalw::align /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm:515

and I found again:

my $aln = $self->_run('align', $infilename, $param_string);
close($pipe) || ($self->throw("ClustalW call ($commandstring) crashed: $?"));

so I guess the problem should refer to $self->executable, which must be solved after changing the executable name to clustalw (is it right?). However, I don't understand the rest of the error message:

sh: align: command not found
. . .
STACK: Error::throw
STACK: Bio::Root::Root::throw /Library/Perl//5.10.0/Bio/Root/Root.pm:368

Regarding my last question, what I want is to align the sequences, using clustalw preferably, to get the total number of aligned aas for both the longest and the shortest sequence in the alignment. I need these data to apply the following formula:

I' = I * Min(n1/L1, n2/L2)

where I is the percentage of identical aas in the aligned region, Li is the length of sequence i and ni is the number of aas in the aligned regions in sequence i

Lorenzo

--
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
Lorenzo Carretero Paulet
Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV)
Integrative Systems Biology Group
C/ Ingeniero Fausto Elio s/n.
46022 Valencia, Spain
Phone: +34 963879934
Fax: +34 963877859
e-mail: locarpau at upvnet.upv.es
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

From David.Messina at sbc.su.se Thu May 19 08:28:37 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 19 May 2011 14:28:37 +0200
Subject: [Bioperl-l] questions on Bio::Tools::Run::Alignment::Clustalw
In-Reply-To: <4DD5022A.5030708@upvnet.upv.es>
References: <4DD44B16.7020108@upvnet.upv.es> <4DD5022A.5030708@upvnet.upv.es>
Message-ID:

Hi Lorenzo,

> Anyway, what the error message means?:

"Uninitialized value" means that the variable doesn't have a value; it's not set to anything.

> so I guess the problem should refer to $self->executable,

Yes, I think that's probably right.

> which must be solved after changing the executable name to clustalw (is it
> right?).

Well, yes, that's part of it. But I think the real problem must be that the directory where you have the clustalw executable is not being found by BioPerl.

I notice that there's a space after the directory name when you set $ENV{CLUSTALDIR}. Get rid of the space, so it looks like:

BEGIN {$ENV{CLUSTALDIR} = '/Applications/Bioinformatics/clustalw-2.0.10-macosx/'}

BioPerl is not properly catching the trailing space. I'll look into why and see if I can get it to be a little more defensive against this kind of thing.

> However, I don't understand the rest of the error message:
>
> sh: align: command not found

Warning: long-winded explanation follows.

So, align is a parameter that the bioperl code is trying to pass to the clustalw executable (I bet $command back on line 754 is set to 'align'). Since $self->executable has no value, the first thing passed to the shell to execute is the word 'align'. And since there's no program called align in your PATH, you get a 'command not found' from the shell. The same exact thing would happen if you opened your terminal and typed align and hit return.
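
The explanation above can be reproduced without BioPerl. What follows is only a minimal shell sketch, not the module's actual code: the variable names `executable` and `command` simply mirror the Perl names quoted in this thread, the input file name is made up, and the sketch assumes that no program called `align` is installed. It also hints at where the number in "crashed: 32512" may come from: POSIX shells use exit status 127 for a command that cannot be found, and Perl's `$?` keeps the exit status in the high byte, so 127 * 256 = 32512.

```shell
#!/bin/sh
# Sketch only: mimic a command string built from an empty executable name.
executable=""      # stands in for the uninitialized $self->executable
command="align"    # stands in for $command

# With nothing in front of it, "align" becomes the program the shell tries
# to run. Assumes no program named "align" exists on this system.
# The shell's own "not found" message is discarded here.
sh -c "$executable $command -infile=example.fas" 2>/dev/null
status=$?
echo "exit status: $status"

# Perl's $? stores the exit status shifted left by 8 bits.
echo "as seen in Perl's \$?: $((status << 8))"
```

On a machine without an `align` program, the two lines printed should be 127 and 32512, the latter matching the value in the ClustalW error message quoted earlier in the thread.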
tl;dr it's a side effect of $self->executable being uninitialized.

> Regarding my last question, what I want is to align the sequences, using
> clustalw preferably, to get the total number of aligned aas for both the
> longest and the shortest sequence in the alignment. I need these data to
> apply the following formula:
>
> I' = I * Min(n1/L1, n2/L2)
>
> where I is the percentage of identical aas in the aligned region,
> Li is the length of sequence i and ni is the number of aas in the
> aligned regions in sequence i

Yes, I think you'll have to loop through the seqs in the alignment one by one to get how many aas each one has in the aligned region, but (if you haven't already) do look through the Bio::SimpleAlign docs and see if something there will be of use. There might also be some scripts in the scripts directory that do something like this.

Dave

From abualiga at gmail.com Thu May 19 09:29:08 2011
From: abualiga at gmail.com (Galeb Abu-Ali)
Date: Thu, 19 May 2011 09:29:08 -0400
Subject: [Bioperl-l] failed to install Bio::Tools::Run::StandAloneBlastPlus - full test output
Message-ID:

Hi,

I am not able to install Bio::Tools::Run::StandAloneBlastPlus. Pasted below is the full test output. At your convenience, I'd much appreciate your instruction.
thanks galeb cpan[1]> install Bio::Tools::Run::StandAloneBlastPlus Going to read '/root/.cpan/Metadata' Database was generated on Wed, 18 May 2011 11:32:45 GMT Running install for module 'Bio::Tools::Run::StandAloneBlastPlus' Running make for C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz Checksum for /root/.cpan/sources/authors/id/C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz ok Scanning cache /root/.cpan/build for sizes .......................................................---------------------DONE DEL(1/12): /root/.cpan/build/BioPerl-1.6.900-gVKrU6 DEL(2/12): /root/.cpan/build/BioPerl-1.6.900-gVKrU6.yml DEL(3/12): /root/.cpan/build/DB_File-1.822-Ys3EkN DEL(4/12): /root/.cpan/build/DB_File-1.822-Ys3EkN.yml DEL(5/12): /root/.cpan/build/Algorithm-Diff-1.1902-RIMxED DEL(6/12): /root/.cpan/build/Algorithm-Diff-1.1902-RIMxED.yml DEL(7/12): /root/.cpan/build/File-Sort-1.01-5wOvOP.yml DEL(8/12): /root/.cpan/build/File-Sort-1.01-5wOvOP DEL(9/12): /root/.cpan/build/IPC-Run-0.89-79rMje DEL(10/12): /root/.cpan/build/IPC-Run-0.89-79rMje.yml DEL(11/12): /root/.cpan/build/CPAN-DistnameInfo-0.12-4Etdsy DEL(12/12): /root/.cpan/build/CPAN-DistnameInfo-0.12-4Etdsy.yml CPAN.pm: Going to build C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz Install scripts? y/n [n ]y Do you want to run tests that require connection to servers across the internet (likely to cause some failures)? y/n [n ]n - will not run internet-requiring tests Created MYMETA.yml and MYMETA.json Creating new 'Build' script for 'BioPerl-Run' version '1.006900' Building BioPerl-Run CJFIELDS/BioPerl-Run-1.006900.tar.gz ./Build -- OK Running Build test t/Amap.t ...................... 1/18 # Required executable for Bio::Tools::Run::Alignment::Amap is not present t/Amap.t ...................... ok t/AnalysisFactory_soap.t ...... skipped: Network tests have not been requested t/Analysis_soap.t ............. skipped: Network tests have not been requested t/BEDTools.t .................. 
1/423 # Required executable for Bio::Tools::Run::BEDTools is not present t/BEDTools.t .................. ok t/BWA.t ....................... ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: __PACKAGE__ requires installation of samtools (libbam) and Bio::DB::Sam (available on CPAN; not part of BioPerl) STACK: Error::throw STACK: Bio::Root::Root::throw /usr/local/lib/perl5/site_perl/5.12.2/Bio/Root/Root.pm:472 STACK: Bio::Assembly::IO::sam::BEGIN /usr/local/lib/perl5/site_perl/5.12.2/Bio/Assembly/IO/sam.pm:189 STACK: main::BEGIN /usr/local/lib/perl5/site_perl/5.12.2/Bio/Assembly/IO/ sam.pm:195 STACK: /usr/local/lib/perl5/site_perl/5.12.2/Bio/Assembly/IO/sam.pm:195 ----------------------------------------------------------- BEGIN failed--compilation aborted at /usr/local/lib/perl5/site_perl/5.12.2/Bio/Assembly/IO/sam.pm line 195. Compilation failed in require at t/BWA.t line 21. BEGIN failed--compilation aborted at t/BWA.t line 21. # Looks like your test exited with 2 before it could output anything. t/BWA.t ....................... Dubious, test returned 2 (wstat 512, 0x200) Failed 36/36 subtests t/Blat.t ...................... 1/33 # Required executable for Bio::Tools::Run::Alignment::Blat is not present # Looks like you planned 33 tests but ran 20. t/Blat.t ...................... Dubious, test returned 255 (wstat 65280, 0xff00) Failed 13/33 subtests (less 15 skipped subtests: 5 okay) t/Bowtie.t .................... 1/73 # Required executable for Bio::Tools::Run::Bowtie is not present t/Bowtie.t .................... ok t/Cap3.t ...................... 1/91 # Required executable for Bio::Tools::Run::Cap3 is not present t/Cap3.t ...................... ok t/Clustalw.t .................. 1/45 # Required executable for Bio::Tools::Run::Alignment::Clustalw is not present t/Clustalw.t .................. ok t/Coil.t ...................... 1/6 # Required executable for Bio::Tools::Run::Coil is not present t/Coil.t ...................... 
ok t/Consense.t .................. 1/9 # Required executable for Bio::Tools::Run::Phylo::Phylip::Consense is not present t/Consense.t .................. ok t/DBA.t ....................... 1/18 # Required executable for Bio::Tools::Run::Alignment::DBA is not present t/DBA.t ....................... ok t/DrawGram.t .................. 1/6 # Required executable for Bio::Tools::Run::Phylo::Phylip::DrawGram is not present t/DrawGram.t .................. ok t/DrawTree.t .................. 2/6 # Required executable for Bio::Tools::Run::Phylo::Phylip::DrawTree is not present t/DrawTree.t .................. ok t/EMBOSS.t .................... skipped: The optional module XML::Twig (or dependencies thereof) was not installed t/Ensembl.t ................... skipped: Network tests have not been requested t/Eponine.t ................... 1/7 # Required environment variable $EPONINEDIR is not set t/Eponine.t ................... ok t/Exonerate.t ................. 1/89 # Required executable for Bio::Tools::Run::Alignment::Exonerate is not present t/Exonerate.t ................. ok t/FootPrinter.t ............... 1/24 # Required executable for Bio::Tools::Run::FootPrinter is not present t/FootPrinter.t ............... ok t/Genemark.hmm.prokaryotic.t .. 1/99 # Required environment variable $GENEMARK_MODELS is not set t/Genemark.hmm.prokaryotic.t .. ok t/Genewise.t .................. 1/20 # Required executable for Bio::Tools::Run::Genewise is not present t/Genewise.t .................. ok t/Genscan.t ................... 1/6 # Required environment variable $GENSCANDIR is not set t/Genscan.t ................... ok t/Gerp.t ...................... 1/33 # Required executable for Bio::Tools::Run::Phylo::Gerp is not present t/Gerp.t ...................... ok t/Glimmer2.t .................. 1/217 # Required executable for Bio::Tools::Run::Glimmer is not present t/Glimmer2.t .................. ok t/Glimmer3.t .................. 
1/111 # Required executable for Bio::Tools::Run::Glimmer is not present t/Glimmer3.t .................. ok t/Gumby.t ..................... 1/124 # Required executable for Bio::Tools::Run::Phylo::Gumby is not present t/Gumby.t ..................... ok t/Hmmer.t ..................... 1/27 # Required executable for Bio::Tools::Run::Hmmer is not present t/Hmmer.t ..................... ok t/Hyphy.t ..................... 1/15 # Required executable for Bio::Tools::Run::Phylo::Hyphy::SLAC is not present t/Hyphy.t ..................... ok t/Infernal.t .................. 1/43 # Required executable for Bio::Tools::Run::Infernal is not present t/Infernal.t .................. ok t/Kalign.t .................... 1/8 # Required executable for Bio::Tools::Run::Alignment::Kalign is not present t/Kalign.t .................... ok t/LVB.t ....................... 1/19 # Required executable for Bio::Tools::Run::Phylo::LVB is not present t/LVB.t ....................... ok t/Lagan.t ..................... 1/12 # Required executable for Bio::Tools::Run::Alignment::Lagan is not present t/Lagan.t ..................... ok t/MAFFT.t ..................... 1/17 # Required executable for Bio::Tools::Run::Alignment::MAFFT is not present t/MAFFT.t ..................... ok t/MCS.t ....................... 1/24 # Required executable for Bio::Tools::Run::MCS is not present t/MCS.t ....................... ok t/Maq.t ....................... 1/51 # Required executable for Bio::Tools::Run::Maq is not present t/Maq.t ....................... ok t/Match.t ..................... 1/7 # Required executable for Bio::Tools::Run::Match is not present t/Match.t ..................... ok t/Mdust.t ..................... 3/5 # Required executable for Bio::Tools::Run::Mdust is not present t/Mdust.t ..................... ok t/Meme.t ...................... 1/25 # Required executable for Bio::Tools::Run::Meme is not present t/Meme.t ...................... ok t/Minimo.t .................... 
1/72 # Required executable for Bio::Tools::Run::Minimo is not present t/Minimo.t .................... ok t/Molphy.t .................... 1/10 # Required executable for Bio::Tools::Run::Phylo::Molphy::ProtML is not present t/Molphy.t .................... ok t/Muscle.t .................... 1/16 # Required executable for Bio::Tools::Run::Alignment::Muscle is not present t/Muscle.t .................... ok t/Neighbor.t .................. 1/17 # Required executable for Bio::Tools::Run::Phylo::Phylip::Neighbor is not present t/Neighbor.t .................. ok t/Newbler.t ................... 1/98 # Required executable for Bio::Tools::Run::Newbler is not present t/Newbler.t ................... ok t/Njtree.t .................... 1/6 # Required executable for Bio::Tools::Run::Phylo::Njtree::Best is not present t/Njtree.t .................... ok t/PAML.t ...................... 1/28 # Required executable for Bio::Tools::Run::Phylo::PAML::Codeml is not present t/PAML.t ...................... ok t/Pal2Nal.t ................... 1/9 # Required executable for Bio::Tools::Run::Alignment::Pal2Nal is not present t/Pal2Nal.t ................... ok t/PhastCons.t ................. 1/181 # Required executable for Bio::Tools::Run::Phylo::Phast::PhastCons is not present t/PhastCons.t ................. ok t/Phrap.t ..................... 1/127 # Required executable for Bio::Tools::Run::Phrap is not present t/Phrap.t ..................... ok t/Phyml.t ..................... 1/47 # Required executable for Bio::Tools::Run::Phylo::Phyml is not present t/Phyml.t ..................... ok t/Primate.t ................... 1/8 # Required executable for Bio::Tools::Run::Primate is not present t/Primate.t ................... ok t/Primer3.t ................... 1/9 # Required executable for Bio::Tools::Run::Primer3 is not present t/Primer3.t ................... ok t/Prints.t .................... 1/7 # Required executable for Bio::Tools::Run::Prints is not present t/Prints.t .................... 
ok t/Probalign.t ................. 1/13 # Required executable for Bio::Tools::Run::Alignment::Probalign is not present t/Probalign.t ................. ok t/Probcons.t .................. 1/11 # Required executable for Bio::Tools::Run::Alignment::Probcons is not present t/Probcons.t .................. ok t/Profile.t ................... 1/7 # Required executable for Bio::Tools::Run::Profile is not present t/Profile.t ................... ok t/Promoterwise.t .............. 1/9 # Required executable for Bio::Tools::Run::Promoterwise is not present t/Promoterwise.t .............. ok t/ProtDist.t .................. 1/14 # Required executable for Bio::Tools::Run::Phylo::Phylip::ProtDist is not present t/ProtDist.t .................. ok t/ProtPars.t .................. 1/11 # Required executable for Bio::Tools::Run::Phylo::Phylip::ProtPars is not present t/ProtPars.t .................. ok t/Pseudowise.t ................ 1/18 # Required executable for Bio::Tools::Run::Pseudowise is not present t/Pseudowise.t ................ ok t/QuickTree.t ................. 1/13 # Required executable for Bio::Tools::Run::Phylo::QuickTree is not present t/QuickTree.t ................. ok t/RepeatMasker.t .............. 1/12 RepeatMasker program not found as or not executable. # Required executable for Bio::Tools::Run::RepeatMasker is not present t/RepeatMasker.t .............. ok t/SABlastPlus.t ............... 1/65 # DB and mask make tests t/SABlastPlus.t ............... 29/65 # run BLAST methods t/SABlastPlus.t ............... 62/65 Use of uninitialized value $hit_signif in numeric le (<=) at /usr/local/lib/perl5/site_perl/5.12.2/Bio/SearchIO/IteratedSearchResultEventBuilder.pm line 367, line 17. Use of uninitialized value $hit_signif in numeric le (<=) at /usr/local/lib/perl5/site_perl/5.12.2/Bio/SearchIO/IteratedSearchResultEventBuilder.pm line 315, line 17. t/SABlastPlus.t ............... ok t/SLR.t ....................... 
1/7 # Required executable for Bio::Tools::Run::Phylo::SLR is not present t/SLR.t ....................... ok t/Samtools.t .................. 1/40 ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Can't find executable for 'samtools'; can't continue STACK: Error::throw STACK: Bio::Root::Root::throw /usr/local/lib/perl5/site_perl/5.12.2/Bio/Root/Root.pm:472 STACK: Bio::Tools::Run::WrapperBase::_run /usr/local/lib/perl5/site_perl/5.12.2/Bio/Tools/Run/WrapperBase/CommandExts.pm:974 STACK: Bio::Tools::Run::Samtools::run /root/.cpan/build/BioPerl-Run-1.006900-8toBBs/blib/lib/Bio/Tools/Run/Samtools.pm:176 STACK: t/Samtools.t:71 ----------------------------------------------------------- # Looks like you planned 40 tests but ran 24. # Looks like your test exited with 2 just after 24. t/Samtools.t .................. Dubious, test returned 2 (wstat 512, 0x200) Failed 16/40 subtests t/Seg.t ....................... 1/8 # Required executable for Bio::Tools::Run::Seg is not present t/Seg.t ....................... ok t/Semphy.t .................... 1/19 # Required executable for Bio::Tools::Run::Phylo::Semphy is not present t/Semphy.t .................... ok t/SeqBoot.t ................... 1/9 # Required executable for Bio::Tools::Run::Phylo::Phylip::SeqBoot is not present t/SeqBoot.t ................... ok t/Signalp.t ................... 1/7 # Required executable for Bio::Tools::Run::Signalp is not present t/Signalp.t ................... ok t/Sim4.t ...................... 1/23 # Required executable for Bio::Tools::Run::Alignment::Sim4 is not present t/Sim4.t ...................... ok t/Simprot.t ................... 1/6 # Required executable for Bio::Tools::Run::Simprot is not present t/Simprot.t ................... ok t/SoapEU-function.t ........... skipped: The optional module Bio::DB::ESoap (or dependencies thereof) was not installed t/SoapEU-unit.t ............... 
skipped: The optional module Bio::DB::ESoap (or dependencies thereof) was not installed t/StandAloneFasta.t ........... 1/15 # Required executable for Bio::Tools::Run::Alignment::StandAloneFasta is not present t/StandAloneFasta.t ........... ok t/TCoffee.t ................... 1/27 # Required executable for Bio::Tools::Run::Alignment::TCoffee is not present t/TCoffee.t ................... ok t/TigrAssembler.t ............. 1/88 # Required executable for Bio::Tools::Run::TigrAssembler is not present # Required executable for Bio::Tools::Run::TigrAssembler is not present t/TigrAssembler.t ............. ok t/Tmhmm.t ..................... 1/9 # Required executable for Bio::Tools::Run::Tmhmm is not present t/Tmhmm.t ..................... ok t/TribeMCL.t .................. ok t/Vista.t ..................... 1/7 # Vista.jar is not in your class path:Exception in thread "main" java.lang.NoClassDefFoundError: Vista t/Vista.t ..................... ok t/gmap-run.t .................. 1/8 # Required executable for Bio::Tools::Run::Alignment::Gmap is not present t/gmap-run.t .................. ok t/tRNAscanSE.t ................ 1/12 # Required executable for Bio::Tools::Run::tRNAscanSE is not present t/tRNAscanSE.t ................ ok Test Summary Report ------------------- t/BWA.t (Wstat: 512 Tests: 0 Failed: 0) Non-zero exit status: 2 Parse errors: Bad plan. You planned 36 tests but ran 0. t/Blat.t (Wstat: 65280 Tests: 20 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 33 tests but ran 20. t/Samtools.t (Wstat: 512 Tests: 24 Failed: 0) Non-zero exit status: 2 Parse errors: Bad plan. You planned 40 tests but ran 24. Files=80, Tests=2799, 95 wallclock secs ( 0.39 usr 0.04 sys + 30.85 cusr 4.08 csys = 35.36 CPU) Result: FAIL Failed 3/80 test programs. 0/2799 subtests failed. 
CJFIELDS/BioPerl-Run-1.006900.tar.gz
./Build test -- NOT OK
//hint// to see the cpan-testers results for installing this module, try:
reports CJFIELDS/BioPerl-Run-1.006900.tar.gz
Running Build install
make test had returned bad status, won't install without force
Failed during this command:
CJFIELDS/BioPerl-Run-1.006900.tar.gz : make_test NO

cpan[2]>

Hi Galeb,

In your email, could you include the *full* test output? Just the summary isn't enough for us to diagnose what is happening.

Rob

From cjfields at illinois.edu Thu May 19 09:40:23 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 19 May 2011 08:40:23 -0500
Subject: [Bioperl-l] failed to install Bio::Tools::Run::StandAloneBlastPlus - full test output
In-Reply-To:
References:
Message-ID: <5B570D75-5361-4C88-9B49-0481CCDB2B5B@illinois.edu>

Thanks for the full report. Looks as if the test suites for bwa and samtools aren't catching the lack of required executables or modules; I'll look into that. The tests for your module did pass (SABlastPlus.t), however, so you could feasibly use 'force install' in this case if you don't plan on using the other modules.

chris

On May 19, 2011, at 8:29 AM, Galeb Abu-Ali wrote:

> Hi,
>
> I am not able to install Bio::Tools::Run::StandAloneBlastPlus. Pasted below
> is the full test output. At your convenience, I'd much appreciate your
> instruction.
> > thanks > > galeb > > > > > cpan[1]> install Bio::Tools::Run::StandAloneBlastPlus > Going to read '/root/.cpan/Metadata' > Database was generated on Wed, 18 May 2011 11:32:45 GMT > Running install for module 'Bio::Tools::Run::StandAloneBlastPlus' > Running make for C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz > Checksum for > /root/.cpan/sources/authors/id/C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz ok > Scanning cache /root/.cpan/build for sizes > .......................................................---------------------DONE > DEL(1/12): /root/.cpan/build/BioPerl-1.6.900-gVKrU6 > DEL(2/12): /root/.cpan/build/BioPerl-1.6.900-gVKrU6.yml > DEL(3/12): /root/.cpan/build/DB_File-1.822-Ys3EkN > DEL(4/12): /root/.cpan/build/DB_File-1.822-Ys3EkN.yml > DEL(5/12): /root/.cpan/build/Algorithm-Diff-1.1902-RIMxED > DEL(6/12): /root/.cpan/build/Algorithm-Diff-1.1902-RIMxED.yml > DEL(7/12): /root/.cpan/build/File-Sort-1.01-5wOvOP.yml > DEL(8/12): /root/.cpan/build/File-Sort-1.01-5wOvOP > DEL(9/12): /root/.cpan/build/IPC-Run-0.89-79rMje > DEL(10/12): /root/.cpan/build/IPC-Run-0.89-79rMje.yml > DEL(11/12): /root/.cpan/build/CPAN-DistnameInfo-0.12-4Etdsy > DEL(12/12): /root/.cpan/build/CPAN-DistnameInfo-0.12-4Etdsy.yml > > CPAN.pm: Going to build C/CJ/CJFIELDS/BioPerl-Run-1.006900.tar.gz > > Install scripts? y/n [n ]y > Do you want to run tests that require connection to servers across the > internet > (likely to cause some failures)? y/n [n ]n > - will not run internet-requiring tests > Created MYMETA.yml and MYMETA.json > Creating new 'Build' script for 'BioPerl-Run' version '1.006900' > Building BioPerl-Run > CJFIELDS/BioPerl-Run-1.006900.tar.gz > ./Build -- OK > Running Build test > t/Amap.t ...................... 1/18 # Required executable for > Bio::Tools::Run::Alignment::Amap is not present > t/Amap.t ...................... ok > t/AnalysisFactory_soap.t ...... skipped: Network tests have not been > requested > t/Analysis_soap.t ............. 
skipped: Network tests have not been > requested > t/BEDTools.t .................. 1/423 # Required executable for > Bio::Tools::Run::BEDTools is not present > t/BEDTools.t .................. ok > t/BWA.t ....................... > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: __PACKAGE__ requires installation of samtools (libbam) and Bio::DB::Sam > (available on CPAN; not part of BioPerl) > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/local/lib/perl5/site_perl/5.12.2/Bio/Root/Root.pm:472 > STACK: Bio::Assembly::IO::sam::BEGIN > /usr/local/lib/perl5/site_perl/5.12.2/Bio/Assembly/IO/sam.pm:189 > STACK: main::BEGIN /usr/local/lib/perl5/site_perl/5.12.2/Bio/Assembly/IO/ > sam.pm:195 > STACK: /usr/local/lib/perl5/site_perl/5.12.2/Bio/Assembly/IO/sam.pm:195 > ----------------------------------------------------------- > BEGIN failed--compilation aborted at > /usr/local/lib/perl5/site_perl/5.12.2/Bio/Assembly/IO/sam.pm line 195. > Compilation failed in require at t/BWA.t line 21. > BEGIN failed--compilation aborted at t/BWA.t line 21. > # Looks like your test exited with 2 before it could output anything. > t/BWA.t ....................... Dubious, test returned 2 (wstat 512, 0x200) > Failed 36/36 subtests > t/Blat.t ...................... 1/33 # Required executable for > Bio::Tools::Run::Alignment::Blat is not present > # Looks like you planned 33 tests but ran 20. > t/Blat.t ...................... Dubious, test returned 255 (wstat 65280, > 0xff00) > Failed 13/33 subtests > (less 15 skipped subtests: 5 okay) > t/Bowtie.t .................... 1/73 # Required executable for > Bio::Tools::Run::Bowtie is not present > t/Bowtie.t .................... ok > t/Cap3.t ...................... 1/91 # Required executable for > Bio::Tools::Run::Cap3 is not present > t/Cap3.t ...................... ok > t/Clustalw.t .................. 1/45 # Required executable for > Bio::Tools::Run::Alignment::Clustalw is not present > t/Clustalw.t .................. 
ok > t/Coil.t ...................... 1/6 # Required executable for > Bio::Tools::Run::Coil is not present > t/Coil.t ...................... ok > t/Consense.t .................. 1/9 # Required executable for > Bio::Tools::Run::Phylo::Phylip::Consense is not present > t/Consense.t .................. ok > t/DBA.t ....................... 1/18 # Required executable for > Bio::Tools::Run::Alignment::DBA is not present > t/DBA.t ....................... ok > t/DrawGram.t .................. 1/6 # Required executable for > Bio::Tools::Run::Phylo::Phylip::DrawGram is not present > t/DrawGram.t .................. ok > t/DrawTree.t .................. 2/6 # Required executable for > Bio::Tools::Run::Phylo::Phylip::DrawTree is not present > t/DrawTree.t .................. ok > t/EMBOSS.t .................... skipped: The optional module XML::Twig (or > dependencies thereof) was not installed > t/Ensembl.t ................... skipped: Network tests have not been > requested > t/Eponine.t ................... 1/7 # Required environment variable > $EPONINEDIR is not set > t/Eponine.t ................... ok > t/Exonerate.t ................. 1/89 # Required executable for > Bio::Tools::Run::Alignment::Exonerate is not present > t/Exonerate.t ................. ok > t/FootPrinter.t ............... 1/24 # Required executable for > Bio::Tools::Run::FootPrinter is not present > t/FootPrinter.t ............... ok > t/Genemark.hmm.prokaryotic.t .. 1/99 # Required environment variable > $GENEMARK_MODELS is not set > t/Genemark.hmm.prokaryotic.t .. ok > t/Genewise.t .................. 1/20 # Required executable for > Bio::Tools::Run::Genewise is not present > t/Genewise.t .................. ok > t/Genscan.t ................... 1/6 # Required environment variable > $GENSCANDIR is not set > t/Genscan.t ................... ok > t/Gerp.t ...................... 1/33 # Required executable for > Bio::Tools::Run::Phylo::Gerp is not present > t/Gerp.t ...................... 
ok > t/Glimmer2.t .................. 1/217 # Required executable for > Bio::Tools::Run::Glimmer is not present > t/Glimmer2.t .................. ok > t/Glimmer3.t .................. 1/111 # Required executable for > Bio::Tools::Run::Glimmer is not present > t/Glimmer3.t .................. ok > t/Gumby.t ..................... 1/124 # Required executable for > Bio::Tools::Run::Phylo::Gumby is not present > t/Gumby.t ..................... ok > t/Hmmer.t ..................... 1/27 # Required executable for > Bio::Tools::Run::Hmmer is not present > t/Hmmer.t ..................... ok > t/Hyphy.t ..................... 1/15 # Required executable for > Bio::Tools::Run::Phylo::Hyphy::SLAC is not present > t/Hyphy.t ..................... ok > t/Infernal.t .................. 1/43 # Required executable for > Bio::Tools::Run::Infernal is not present > t/Infernal.t .................. ok > t/Kalign.t .................... 1/8 # Required executable for > Bio::Tools::Run::Alignment::Kalign is not present > t/Kalign.t .................... ok > t/LVB.t ....................... 1/19 # Required executable for > Bio::Tools::Run::Phylo::LVB is not present > t/LVB.t ....................... ok > t/Lagan.t ..................... 1/12 # Required executable for > Bio::Tools::Run::Alignment::Lagan is not present > t/Lagan.t ..................... ok > t/MAFFT.t ..................... 1/17 # Required executable for > Bio::Tools::Run::Alignment::MAFFT is not present > t/MAFFT.t ..................... ok > t/MCS.t ....................... 1/24 # Required executable for > Bio::Tools::Run::MCS is not present > t/MCS.t ....................... ok > t/Maq.t ....................... 1/51 # Required executable for > Bio::Tools::Run::Maq is not present > t/Maq.t ....................... ok > t/Match.t ..................... 1/7 # Required executable for > Bio::Tools::Run::Match is not present > t/Match.t ..................... ok > t/Mdust.t ..................... 
3/5 # Required executable for > Bio::Tools::Run::Mdust is not present > t/Mdust.t ..................... ok > t/Meme.t ...................... 1/25 # Required executable for > Bio::Tools::Run::Meme is not present > t/Meme.t ...................... ok > t/Minimo.t .................... 1/72 # Required executable for > Bio::Tools::Run::Minimo is not present > t/Minimo.t .................... ok > t/Molphy.t .................... 1/10 # Required executable for > Bio::Tools::Run::Phylo::Molphy::ProtML is not present > t/Molphy.t .................... ok > t/Muscle.t .................... 1/16 # Required executable for > Bio::Tools::Run::Alignment::Muscle is not present > t/Muscle.t .................... ok > t/Neighbor.t .................. 1/17 # Required executable for > Bio::Tools::Run::Phylo::Phylip::Neighbor is not present > t/Neighbor.t .................. ok > t/Newbler.t ................... 1/98 # Required executable for > Bio::Tools::Run::Newbler is not present > t/Newbler.t ................... ok > t/Njtree.t .................... 1/6 # Required executable for > Bio::Tools::Run::Phylo::Njtree::Best is not present > t/Njtree.t .................... ok > t/PAML.t ...................... 1/28 # Required executable for > Bio::Tools::Run::Phylo::PAML::Codeml is not present > t/PAML.t ...................... ok > t/Pal2Nal.t ................... 1/9 # Required executable for > Bio::Tools::Run::Alignment::Pal2Nal is not present > t/Pal2Nal.t ................... ok > t/PhastCons.t ................. 1/181 # Required executable for > Bio::Tools::Run::Phylo::Phast::PhastCons is not present > t/PhastCons.t ................. ok > t/Phrap.t ..................... 1/127 # Required executable for > Bio::Tools::Run::Phrap is not present > t/Phrap.t ..................... ok > t/Phyml.t ..................... 1/47 # Required executable for > Bio::Tools::Run::Phylo::Phyml is not present > t/Phyml.t ..................... ok > t/Primate.t ................... 
1/8 # Required executable for > Bio::Tools::Run::Primate is not present > t/Primate.t ................... ok > t/Primer3.t ................... 1/9 # Required executable for > Bio::Tools::Run::Primer3 is not present > t/Primer3.t ................... ok > t/Prints.t .................... 1/7 # Required executable for > Bio::Tools::Run::Prints is not present > t/Prints.t .................... ok > t/Probalign.t ................. 1/13 # Required executable for > Bio::Tools::Run::Alignment::Probalign is not present > t/Probalign.t ................. ok > t/Probcons.t .................. 1/11 # Required executable for > Bio::Tools::Run::Alignment::Probcons is not present > t/Probcons.t .................. ok > t/Profile.t ................... 1/7 # Required executable for > Bio::Tools::Run::Profile is not present > t/Profile.t ................... ok > t/Promoterwise.t .............. 1/9 # Required executable for > Bio::Tools::Run::Promoterwise is not present > t/Promoterwise.t .............. ok > t/ProtDist.t .................. 1/14 # Required executable for > Bio::Tools::Run::Phylo::Phylip::ProtDist is not present > t/ProtDist.t .................. ok > t/ProtPars.t .................. 1/11 # Required executable for > Bio::Tools::Run::Phylo::Phylip::ProtPars is not present > t/ProtPars.t .................. ok > t/Pseudowise.t ................ 1/18 # Required executable for > Bio::Tools::Run::Pseudowise is not present > t/Pseudowise.t ................ ok > t/QuickTree.t ................. 1/13 # Required executable for > Bio::Tools::Run::Phylo::QuickTree is not present > t/QuickTree.t ................. ok > t/RepeatMasker.t .............. 1/12 RepeatMasker program not found as or > not executable. > # Required executable for Bio::Tools::Run::RepeatMasker is not present > t/RepeatMasker.t .............. ok > t/SABlastPlus.t ............... 1/65 # DB and mask make tests > t/SABlastPlus.t ............... 29/65 # run BLAST methods > t/SABlastPlus.t ............... 
62/65 Use of uninitialized value $hit_signif > in numeric le (<=) at > /usr/local/lib/perl5/site_perl/5.12.2/Bio/SearchIO/IteratedSearchResultEventBuilder.pm > line 367, line 17. > Use of uninitialized value $hit_signif in numeric le (<=) at > /usr/local/lib/perl5/site_perl/5.12.2/Bio/SearchIO/IteratedSearchResultEventBuilder.pm > line 315, line 17. > t/SABlastPlus.t ............... ok > t/SLR.t ....................... 1/7 # Required executable for > Bio::Tools::Run::Phylo::SLR is not present > t/SLR.t ....................... ok > t/Samtools.t .................. 1/40 > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Can't find executable for 'samtools'; can't continue > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/local/lib/perl5/site_perl/5.12.2/Bio/Root/Root.pm:472 > STACK: Bio::Tools::Run::WrapperBase::_run > /usr/local/lib/perl5/site_perl/5.12.2/Bio/Tools/Run/WrapperBase/CommandExts.pm:974 > STACK: Bio::Tools::Run::Samtools::run > /root/.cpan/build/BioPerl-Run-1.006900-8toBBs/blib/lib/Bio/Tools/Run/Samtools.pm:176 > STACK: t/Samtools.t:71 > ----------------------------------------------------------- > # Looks like you planned 40 tests but ran 24. > # Looks like your test exited with 2 just after 24. > t/Samtools.t .................. Dubious, test returned 2 (wstat 512, 0x200) > Failed 16/40 subtests > t/Seg.t ....................... 1/8 # Required executable for > Bio::Tools::Run::Seg is not present > t/Seg.t ....................... ok > t/Semphy.t .................... 1/19 # Required executable for > Bio::Tools::Run::Phylo::Semphy is not present > t/Semphy.t .................... ok > t/SeqBoot.t ................... 1/9 # Required executable for > Bio::Tools::Run::Phylo::Phylip::SeqBoot is not present > t/SeqBoot.t ................... ok > t/Signalp.t ................... 1/7 # Required executable for > Bio::Tools::Run::Signalp is not present > t/Signalp.t ................... ok > t/Sim4.t ...................... 
1/23 # Required executable for > Bio::Tools::Run::Alignment::Sim4 is not present > t/Sim4.t ...................... ok > t/Simprot.t ................... 1/6 # Required executable for > Bio::Tools::Run::Simprot is not present > t/Simprot.t ................... ok > t/SoapEU-function.t ........... skipped: The optional module Bio::DB::ESoap > (or dependencies thereof) was not installed > t/SoapEU-unit.t ............... skipped: The optional module Bio::DB::ESoap > (or dependencies thereof) was not installed > t/StandAloneFasta.t ........... 1/15 # Required executable for > Bio::Tools::Run::Alignment::StandAloneFasta is not present > t/StandAloneFasta.t ........... ok > t/TCoffee.t ................... 1/27 # Required executable for > Bio::Tools::Run::Alignment::TCoffee is not present > t/TCoffee.t ................... ok > t/TigrAssembler.t ............. 1/88 # Required executable for > Bio::Tools::Run::TigrAssembler is not present > # Required executable for Bio::Tools::Run::TigrAssembler is not present > t/TigrAssembler.t ............. ok > t/Tmhmm.t ..................... 1/9 # Required executable for > Bio::Tools::Run::Tmhmm is not present > t/Tmhmm.t ..................... ok > t/TribeMCL.t .................. ok > t/Vista.t ..................... 1/7 # Vista.jar is not in your class > path:Exception in thread "main" java.lang.NoClassDefFoundError: Vista > t/Vista.t ..................... ok > t/gmap-run.t .................. 1/8 # Required executable for > Bio::Tools::Run::Alignment::Gmap is not present > t/gmap-run.t .................. ok > t/tRNAscanSE.t ................ 1/12 # Required executable for > Bio::Tools::Run::tRNAscanSE is not present > t/tRNAscanSE.t ................ ok > > Test Summary Report > ------------------- > t/BWA.t (Wstat: 512 Tests: 0 Failed: 0) > Non-zero exit status: 2 > Parse errors: Bad plan. You planned 36 tests but ran 0. > t/Blat.t (Wstat: 65280 Tests: 20 Failed: 0) > Non-zero exit status: 255 > Parse errors: Bad plan. 
You planned 33 tests but ran 20. > t/Samtools.t (Wstat: 512 Tests: 24 Failed: 0) > Non-zero exit status: 2 > Parse errors: Bad plan. You planned 40 tests but ran 24. > Files=80, Tests=2799, 95 wallclock secs ( 0.39 usr 0.04 sys + 30.85 cusr > 4.08 csys = 35.36 CPU) > Result: FAIL > Failed 3/80 test programs. 0/2799 subtests failed. > CJFIELDS/BioPerl-Run-1.006900.tar.gz > ./Build test -- NOT OK > //hint// to see the cpan-testers results for installing this module, try: > reports CJFIELDS/BioPerl-Run-1.006900.tar.gz > Running Build install > make test had returned bad status, won't install without force > Failed during this command: > CJFIELDS/BioPerl-Run-1.006900.tar.gz : make_test NO > cpan[2]> > > > Hi Galeb, > > In your email, could you include the *full* test output? Just the > summary isn't enough for us to diagnose what is happening. > > Rob > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From abualiga at gmail.com Thu May 19 10:03:25 2011 From: abualiga at gmail.com (Galeb Abu-Ali) Date: Thu, 19 May 2011 10:03:25 -0400 Subject: [Bioperl-l] failed to install Bio::Tools::Run::StandAloneBlastPlus - full test output In-Reply-To: <5B570D75-5361-4C88-9B49-0481CCDB2B5B@illinois.edu> References: <5B570D75-5361-4C88-9B49-0481CCDB2B5B@illinois.edu> Message-ID: On Thu, May 19, 2011 at 9:40 AM, Chris Fields wrote: > Thanks for the full report. Looks as if the test suites for bwa and > samtools aren't catching the lack of required executables or modules, I'll > look into that. The tests for your module did pass (SABlast.t), however, so > you could feasibly use 'force install' in this case if you don't plan on > using the other modules. > > chris > > thanks Chris, I'll try that. 
galeb

From rmb32 at cornell.edu Thu May 19 11:57:03 2011
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 19 May 2011 08:57:03 -0700
Subject: [Bioperl-l] cpan indexing
In-Reply-To: <91AF9D19-D2CD-41E3-9649-D5A9459573AE@illinois.edu>
References: <4DD45CD3.40303@cornell.edu> <4DD45D3D.9050702@cornell.edu> <0E5A4CF7-8469-4AEA-8896-13BDC7796658@illinois.edu> <91AF9D19-D2CD-41E3-9649-D5A9459573AE@illinois.edu>
Message-ID: <4DD53DCF.3090608@cornell.edu>

Ah, dist_version. Good catch. That's a lot better than having a $VERSION in each module.

R

On 05/18/2011 07:55 PM, Chris Fields wrote:
> New BioPerl uploaded, version 1.6.901. The PAUSE indexer has all modules with a 1.006901 version, even if $VERSION isn't directly set. Appears that META.json/yml are parsed for the version and not each module (it's what the dist_version option is supposed to be used for); just uncommented some Bio::Root::Build code that was apparently used for this reason and it seems to work now.

From joyeux2000 at hotmail.fr Thu May 19 14:00:27 2011
From: joyeux2000 at hotmail.fr (debutant.bioperl)
Date: Thu, 19 May 2011 11:00:27 -0700 (PDT)
Subject: [Bioperl-l] While running without stop
Message-ID: <31658279.post@talk.nabble.com>

Hello all,
The following Perl script looks for the pattern in each sequence of a FASTA file (each record starts with a line beginning with ">").

......................................................................
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;

my $file = 'eee.txt';
my $in = Bio::SeqIO->new(-file => $file , '-format' => 'fasta');
while ( my $seq = $in->next_seq() )
{
    while ($seq->seq =~ m/(ATCGA)/){
        print "ATCGA commence à la position $-[1]\nse termine juste avant la position $+[1]\n";
    }
}
......................................................................

The problem is that this while loop:
......................................................................
while ($seq->seq =~ m/(ATCGA)/){
    print "ATCGA commence à la position $-[1]\nse termine juste avant la position $+[1]\n";
}
......................................................................
runs continuously and always gives the same result:

...
ATCG commence à la position 172
se termine juste avant la position 176
ATCG commence à la position 172
se termine juste avant la position 176
ATCG commence à la position 172
se termine juste avant la position 176
...

If I use "if" instead of "while", it finds only the first occurrence, even when the same sequence contains more than one.
Please, do you have any idea how to fix this code so that:
the "while" version finds every occurrence and then stops,
or
the "if" version, after finding a pattern, moves on to the next letter rather than to the next sequence, and so finds several motifs per sequence?
Cordially

--
View this message in context: http://old.nabble.com/While-running-without-stop-tp31658279p31658279.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From Kevin.M.Brown at asu.edu Thu May 19 14:48:28 2011
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Thu, 19 May 2011 11:48:28 -0700
Subject: [Bioperl-l] While running without stop
In-Reply-To: <31658279.post@talk.nabble.com>
References: <31658279.post@talk.nabble.com>
Message-ID: <1A4207F8295607498283FE9E93B775B4079C432A@EX02.asurite.ad.asu.edu>

This is a problem with accessing objects, IIRC.

Try doing something like:

my $in = Bio::SeqIO->new(-file => $file , '-format' => 'fasta');
while ( my $seq = $in->next_seq() )
{
    my $sequence = $seq->seq;
    while ($sequence =~ m/(ATCGA)/){
        print "ATCGA commence à
la position $-[1]\nse termine juste avant la position $+[1]\n";
    }
}

Kevin Brown
Center for Innovations in Medicine
Biodesign Institute
Arizona State University

From joyeux2000 at hotmail.fr Thu May 19 15:07:47 2011
From: joyeux2000 at hotmail.fr (debutant.bioperl)
Date: Thu, 19 May 2011 12:07:47 -0700 (PDT)
Subject: [Bioperl-l] While running without stop
In-Reply-To: <1A4207F8295607498283FE9E93B775B4079C432A@EX02.asurite.ad.asu.edu>
References: <31658279.post@talk.nabble.com> <1A4207F8295607498283FE9E93B775B4079C432A@EX02.asurite.ad.asu.edu>
Message-ID: <31658776.post@talk.nabble.com>

hello, thank you very much Kevin for your reply.
but it gives the same result, while running without stop.

From David.Messina at sbc.su.se Thu May 19 15:17:01 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 19 May 2011 21:17:01 +0200
Subject: [Bioperl-l] While running without stop
In-Reply-To: <31658279.post@talk.nabble.com>
References: <31658279.post@talk.nabble.com>
Message-ID:

I'm not familiar with these newfangled @-s and @+s, so I can't help with that.

But the traditional way to do this is with pos():

while ( my $seq = $in->next_seq() ) {
    my $seqstring = $seq->seq;
    while ( $seqstring =~ m/(ATCGA)/g ) {
        my $end = pos($seqstring);
        my $start = $end - length($&) + 1;
        print "ATCGA commence à
la position $start\nse termine juste avant la position $end\n";
    }
}

Dave

From cjfields at illinois.edu Thu May 19 15:26:29 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 19 May 2011 14:26:29 -0500
Subject: [Bioperl-l] While running without stop
In-Reply-To: <1A4207F8295607498283FE9E93B775B4079C432A@EX02.asurite.ad.asu.edu>
References: <31658279.post@talk.nabble.com> <1A4207F8295607498283FE9E93B775B4079C432A@EX02.asurite.ad.asu.edu>
Message-ID: <0696B61A-7890-4803-A6C4-06EAA633514F@illinois.edu>

The match needs to be made iterative so subsequent matches can be made, otherwise the match starts at the beginning of the string during each iteration of the while loop. To do that you need the /g modifier:

while ($sequence =~ m/(ATCGA)/g ){
    ...
}

You could also use pos() and the match length in the loop to get the match start and end position, it might be slightly faster. Not sure whether $seq->seq would work in the place of $sequence above, but I think it would work fine.

chris
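To make the /g behaviour discussed in this thread concrete, here is a minimal, self-contained sketch. It deliberately does not use BioPerl: find_motif is a hypothetical helper (not a BioPerl function) and the test string is made up; in the script from this thread the string would come from $seq->seq. It reports 1-based, inclusive start and end positions using @- and @+.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: returns a list of
# [start, end] pairs (1-based, inclusive) for each non-overlapping
# occurrence of $motif in $string.
sub find_motif {
    my ($string, $motif) = @_;
    my @hits;
    # With /g in scalar context, each pass resumes where the previous
    # match ended; without /g the match restarts at offset 0 every time
    # and the loop never terminates.
    while ( $string =~ /\Q$motif\E/g ) {
        # $-[0] is the 0-based offset of the match start,
        # $+[0] is the 0-based offset just past the match end.
        push @hits, [ $-[0] + 1, $+[0] ];
    }
    return @hits;
}

# Made-up sequence for illustration only.
my $sequence = 'GGATCGATTTATCGAC';
for my $hit ( find_motif( $sequence, 'ATCGA' ) ) {
    printf "ATCGA starts at position %d and ends at position %d\n", @$hit;
}
# prints positions 3-7 and 11-15
```

Copying the sequence into a scalar first (as Kevin and Dave do) looks like the safer choice: Perl tracks pos() per variable, so matching directly against a fresh return value of $seq->seq on every iteration would presumably start from the beginning each time, even with /g.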
From joyeux2000 at hotmail.fr Thu May 19 16:55:26 2011
From: joyeux2000 at hotmail.fr (debutant.bioperl)
Date: Thu, 19 May 2011 13:55:26 -0700 (PDT)
Subject: [Bioperl-l] While running without stop
In-Reply-To:
References: <31658279.post@talk.nabble.com>
Message-ID: <31659539.post@talk.nabble.com>

thank you very very much for your reply
the script works well :-)

Dave Messina-3 wrote:
>
> I'm not familiar with these newfangled @-s and @+s, so I can't help with
> that.
>
> But the traditional way to do this is with pos():
>
>
> while ( my $seq = $in->next_seq() ) {
>     my $seqstring = $seq->seq;
>     while ( $seqstring =~ m/(ATCGA)/g ) {
>         my $end = pos($seqstring);
>         my $start = $end - length($&) + 1;
>         print "ATCGA commence à la position $start\nse termine juste avant
> la position $end\n";
>     }
> }
>
>
> Dave
>

From joyeux2000 at hotmail.fr Thu May 19 17:48:41 2011
From: joyeux2000 at hotmail.fr (debutant.bioperl)
Date: Thu, 19 May 2011 14:48:41 -0700 (PDT)
Subject: [Bioperl-l] display ID fasta format
Message-ID: <31659833.post@talk.nabble.com>

Hello all,
I want to display the sequence ID in which case there is an occurrence of the motif. Please, is there a regular expression to display the ID for the fasta format, if not how can I do to display the first word line.
excuse me for these questions, but really I'm still a beginner.
cordially

................................................................................................................................

#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;

my $file = 'eee.txt';
my $in = Bio::SeqIO->new(-file => $file , '-format' => 'fasta');
while ( my $seq = $in->next_seq() ) {
    my $seqstring = $seq->seq;
    while ( $seqstring =~ m/(ATCGA)/g ) {
        my $end = pos($seqstring);
        my $start = $end - length($&) + 1;
        print "ATCGA commence à la position $start\nse termine juste avant la position $end\n";
    }
}

--
View this message in context: http://old.nabble.com/display-ID-fasta-format-tp31659833p31659833.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From hrh at fmi.ch Fri May 20 03:45:57 2011
From: hrh at fmi.ch (Hans-Rudolf Hotz)
Date: Fri, 20 May 2011 09:45:57 +0200
Subject: [Bioperl-l] display ID fasta format
In-Reply-To: <31659833.post@talk.nabble.com>
References: <31659833.post@talk.nabble.com>
Message-ID: <4DD61C35.1090905@fmi.ch>

Hi 'joyeux2000'

Have you read:

http://www.bioperl.org/wiki/HOWTO:Beginners
http://www.bioperl.org/wiki/HOWTO:SeqIO

Regards, Hans

On 05/19/2011 11:48 PM, debutant.bioperl wrote:
>
> Hello all,
> I want to display the sequence ID in which case there is an occurrence of
> the motif. Please, is there a regular expression to display the ID for the
> fasta format, if not how can I do to display the first word line.
> excuse me for these questions, but really I'm still a beginner.
> cordially
>
> ................................................................................................................................
>
> #!/usr/bin/perl
> use strict;
> use warnings;
> use Bio::SeqIO;
>
> my $file = 'eee.txt';
> my $in = Bio::SeqIO->new(-file => $file , '-format' => 'fasta');
> while ( my $seq = $in->next_seq() ) {
>     my $seqstring = $seq->seq;
>     while ( $seqstring =~ m/(ATCGA)/g ) {
>         my $end = pos($seqstring);
>         my $start = $end - length($&) + 1;
>         print "ATCGA commence à
la position $start\nse termine juste avant > la position $end\n"; > } > } > > From joyeux2000 at hotmail.fr Fri May 20 10:27:28 2011 From: joyeux2000 at hotmail.fr (debutant.bioperl) Date: Fri, 20 May 2011 07:27:28 -0700 (PDT) Subject: [Bioperl-l] display ID fasta format In-Reply-To: <4DD61C35.1090905@fmi.ch> References: <31659833.post@talk.nabble.com> <4DD61C35.1090905@fmi.ch> Message-ID: <31664579.post@talk.nabble.com> Hi, Thanks a lot Hotz, these links are very useful for me ^^. Hotz, Hans-Rudolf-2 wrote: > > Hi 'joyeux2000' > > Have you read: > > http://www.bioperl.org/wiki/HOWTO:Beginners > http://www.bioperl.org/wiki/HOWTO:SeqIO > > > > Regards, Hans > > > On 05/19/2011 11:48 PM, debutant.bioperl wrote: >> >> Hello all, >> I want to display the sequence ID in which case there is an occurrence of >> the motif. Please, is there a regular expression to display the ID for >> the >> fasta format, if not how can I do to display the first word line. >> excuse me for these questions, but really I'm still a beginner. >> cordially >> >> ................................................................................................................................ >> >> #!/usr/bin/perl >> use strict; >> use warnings; >> use Bio::SeqIO; >> >> my $file = 'eee.txt'; >> my $in = Bio::SeqIO->new(-file => $file , '-format' => 'fasta'); >> while ( my $seq = $in->next_seq() ) { >> my $seqstring = $seq->seq; >> while ( $seqstring =~ m/(ATCGA)/g ) { >> my $end = pos($seqstring); >> my $start = $end - length($&) + 1; >> print "ATCGA commence ? la position $start\nse termine juste >> avant >> la position $end\n"; >> } >> } >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- View this message in context: http://old.nabble.com/display-ID-fasta-format-tp31659833p31664579.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. 
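
A minimal sketch of how the two points in this thread fit together: a /g match loop that terminates, with the sequence ID printed next to each hit. It works on plain strings so that it runs without BioPerl; the helper name motif_positions and the sample records are made up for illustration. With Bio::SeqIO, the ID would come from $seq->display_id and the string from $seq->seq.

```perl
use strict;
use warnings;

# Return one [start, end] pair (1-based, inclusive) per non-overlapping
# occurrence of $motif in $seqstring.
sub motif_positions {
    my ($seqstring, $motif) = @_;   # lexical copy, so pos() is tracked on it
    my @hits;
    # With /g in scalar context each test resumes after the previous match,
    # so the loop stops once no further occurrence is found.
    while ( $seqstring =~ /\Q$motif\E/g ) {
        # $-[0] is the 0-based offset where the match starts,
        # $+[0] is the offset just past its last character.
        push @hits, [ $-[0] + 1, $+[0] ];
    }
    return @hits;
}

# Hypothetical records standing in for what Bio::SeqIO would return.
my %records = (
    seq1 => 'GGATCGATTATCGA',
    seq2 => 'CCCCCC',
);

for my $id ( sort keys %records ) {
    for my $hit ( motif_positions( $records{$id}, 'ATCGA' ) ) {
        print "$id: ATCGA from $hit->[0] to $hit->[1]\n";
    }
}
```

The original loop never ended because it matched without /g against a fresh string returned by $seq->seq on every test, so the search restarted from the beginning each time.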
From shachigahoimbi at gmail.com Sat May 21 05:14:10 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Sat, 21 May 2011 14:44:10 +0530 Subject: [Bioperl-l] Retrieve Domain sequence from Domain database Message-ID: Hello I want to retrieve a domain signature sequence for particular text like i want to retrieve domain signature sequence for 'MADS-BOX'. Is there any module in bioperl which can connect a particular domain database and retrieve a sequence using keyword search. if anybody knows please tell me Thanks in advance -- Regards, Shachi From shachigahoimbi at gmail.com Sat May 21 06:34:25 2011 From: shachigahoimbi at gmail.com (Shachi Gahoi) Date: Sat, 21 May 2011 16:04:25 +0530 Subject: [Bioperl-l] Run Prositescan using Bioperl Message-ID: Is there any module in bioperl to run PrositeScan remotely. if anyone know please help me thanks in advance -- Regards, Shachi From sheena.scroggins at gmail.com Sun May 22 19:31:37 2011 From: sheena.scroggins at gmail.com (Sheena Scroggins) Date: Sun, 22 May 2011 16:31:37 -0700 Subject: [Bioperl-l] GSoC BioPerl Project Page In-Reply-To: References: Message-ID: Hey everyone, The BioPerl project can be followed at www.techomics.com. The relevant github links will be posted there soon. Sheena From cjfields at illinois.edu Mon May 23 10:07:48 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 23 May 2011 09:07:48 -0500 Subject: [Bioperl-l] GSoC BioPerl Project Page In-Reply-To: References: Message-ID: Very nice! It will be interesting to see how the project progresses; I'm hoping the bioperl community will chip in with suggestions and comments along the way. Also, it might be a good idea to post something here when updates are made, just in case. Thanks Sheena, and welcome to BioPerl! chris On May 22, 2011, at 6:31 PM, Sheena Scroggins wrote: > Hey everyone, > > > The BioPerl project can be followed at www.techomics.com. > > The relevant github links will be posted there soon. 
> > Sheena > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From R.A.Vos at reading.ac.uk Mon May 23 10:16:00 2011 From: R.A.Vos at reading.ac.uk (Rutger Vos) Date: Mon, 23 May 2011 15:16:00 +0100 Subject: [Bioperl-l] GSoC BioPerl Project Page In-Reply-To: References: Message-ID: Great project, good luck Sheena. Perhaps the way BioRuby organizes this (with its plugin system) might be a source of inspiration? Rutger On Mon, May 23, 2011 at 12:31 AM, Sheena Scroggins wrote: > Hey everyone, > > > The BioPerl project can be followed at www.techomics.com. > > The relevant github links will be posted there soon. > > Sheena > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Dr. Rutger A. Vos School of Biological Sciences Philip Lyle Building, Level 4 University of Reading Reading, RG6 6BX, United Kingdom Tel: +44 (0) 118 378 7535 http://rutgervos.blogspot.com From cjfields at illinois.edu Mon May 23 11:15:26 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 23 May 2011 10:15:26 -0500 Subject: [Bioperl-l] GSoC BioPerl Project Page In-Reply-To: References: Message-ID: <7F93ECED-6D56-4A21-9060-0D6943C737FC@illinois.edu> Well, BioPerl was initially designed (with the various pluggable IO systems and interfaces) to accommodate this, so one could have focused packages (for instance, on RNA structural analysis, NGS, etc). That has always been in place, but the standard practice has been to drop everything into the main repo, which isn't tenable long-term. In fact, what has occurred in a number of cases is circular dependencies that will have to be teased apart, and some (possibly bad/unoptimized) code can become entrenched and hard to remove. 
I'm not sure why this never became a standard practice; I have a feeling it was just easier on the developers to have everything in one place. I'm just as guilty as everyone else though. Anyway, I just conversed with Pjotr Prins re: BioRuby's plugin work, and judging on the BioRuby plugin design we're probably going to go even further, in that some previously core components (Map, Coordinate, Cluster) may be pulled into their own repos as they are pretty self-contained and mainly rely on simpler core components. We'll see how Sheena approaches it :) chris On May 23, 2011, at 9:16 AM, Rutger Vos wrote: > Great project, good luck Sheena. Perhaps the way BioRuby organizes > this (with its plugin system) might be a source of inspiration? > > Rutger > > On Mon, May 23, 2011 at 12:31 AM, Sheena Scroggins > wrote: >> Hey everyone, >> >> >> The BioPerl project can be followed at www.techomics.com. >> >> The relevant github links will be posted there soon. >> >> Sheena >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > > -- > Dr. Rutger A. 
Vos > School of Biological Sciences > Philip Lyle Building, Level 4 > University of Reading > Reading, RG6 6BX, United Kingdom > Tel: +44 (0) 118 378 7535 > http://rutgervos.blogspot.com > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From locarpau at upvnet.upv.es Mon May 23 12:44:16 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet) Date: Mon, 23 May 2011 18:44:16 +0200 Subject: [Bioperl-l] GSoC BioPerl Project Page In-Reply-To: <7F93ECED-6D56-4A21-9060-0D6943C737FC@illinois.edu> References: <7F93ECED-6D56-4A21-9060-0D6943C737FC@illinois.edu> Message-ID: <4DDA8EE0.90602@upvnet.upv.es> Hi all, I have created a MSA SimpleAlign object and I'd like to get the number of aligned residues of a given sequences (e.g. the sequence with the maximum or the minimum length). I have been playing with $aln->num_residues() (which returns the total number of residues in the alignment) and $aln->length() (which returns the maximum length of the alignment) but didn't found a way. Anyone knows how to to do this?. Thanks, Lorenzo -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail: locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* From sdavis2 at mail.nih.gov Mon May 23 14:09:47 2011 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Mon, 23 May 2011 14:09:47 -0400 Subject: [Bioperl-l] [OT] Bioconductor-2011 conference Message-ID: All, Sorry for the slightly off-topic post, but I know there are some overlaps between Bioconductor and Bioperl user groups. 
The Bioconductor-2011 conference will be held July 28-29, 2011 (optional: July 27 - Developer Day) at the Fred Hutchinson Cancer Research Center in Seattle, WA. This conference highlights current developments within and beyond Bioconductor, an international open source and open development software project for the analysis and comprehension of high-throughput genomic data. The conference provides a forum in which to discuss the use and design of software for analyzing data arising in biology with a focus on Bioconductor and genomic data.

If interested, see the website:

https://secure.bioconductor.org/BioC2011/

Thanks,
Sean

From jun.yin at ucd.ie Tue May 24 05:49:22 2011
From: jun.yin at ucd.ie (Jun Yin)
Date: Tue, 24 May 2011 10:49:22 +0100
Subject: [Bioperl-l] GSoC BioPerl Project Page
In-Reply-To: <4DDA8EE0.90602@upvnet.upv.es>
References: <7F93ECED-6D56-4A21-9060-0D6943C737FC@illinois.edu> <4DDA8EE0.90602@upvnet.upv.es>
Message-ID: <019001cc19f7$dc291c40$947b54c0$%yin@ucd.ie>

Hi, Lorenzo,

I don't think there is such function in SimpleAlign to get the minimal and maximal length for the alignment at the moment.

You can try:

my @lengths;
foreach my $seq ($aln->each_seq) {
    push @lengths,$seq->_ungapped_len; #Get the ungapped length of a sequence
}

my $minlen=min(@lengths);
my $maxlen=max(@lengths);

Of course, you need to write your own subroutines for min and max.

Cheers,
Jun Yin
Ph.D. student in U.C.D.

Bioinformatics Laboratory
Conway Institute
University College Dublin

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Lorenzo Carretero Paulet
Sent: Monday, May 23, 2011 5:44 PM
To: bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] GSoC BioPerl Project Page

Hi all,
I have created a MSA SimpleAlign object and I'd like to get the number of aligned residues of a given sequences (e.g. the sequence with the maximum or the minimum length).
I have been playing with $aln->num_residues() (which returns the total number of residues in the alignment) and $aln->length() (which returns the maximum length of the alignment) but didn't found a way. Anyone knows how to to do this?. Thanks, Lorenzo -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail: locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From reece at harts.net Sat May 21 11:55:41 2011 From: reece at harts.net (Reece Hart) Date: Sat, 21 May 2011 08:55:41 -0700 Subject: [Bioperl-l] fetching exon locations in genomic coordinates Message-ID: Hi- I'd like to fetch exons for a given RefSeq in genomic coordinates (GRCh37) directly from NCBI. The only path to this data that I'm aware of is a bit painful: esearch for the gene based on RefSeq accession, efetch the gene record as XML, then parse the XML for Gene-commentary_genomic-coords. That's doable, but unattractive. Anyone got a better way? Normally I'd fetch these with blissful ease from Ensembl, but the point of this exercise is to compare data for a particular transcript against Ensembl. Thanks, Reece From justinchu1989 at gmail.com Fri May 20 15:51:25 2011 From: justinchu1989 at gmail.com (Justin Chu) Date: Fri, 20 May 2011 13:51:25 -0600 Subject: [Bioperl-l] Problems with Bio::DB::Fasta Message-ID: Hello: I'm having trouble with Bio::DB::Fasta. It sometimes occurs when I use large fasta files and retrieve sequence from a bit past the start of the file. 
I think some characters are being ignored or a rounding error is occurring or something when using the offset to retrieve entries from the index file. I have attached the Fasta files I have been using, just incase my problem is due to improper formatting of my files. For example: my $refDB = Bio::DB::Fasta->new('Test2.Fasta'); my $queryDB = Bio::DB::Fasta->new('Test1.Fasta'); print $refDB->subseq( "gi|294675557|ref|NC_014034.1|", 161067, 161788 )."\n"; print $queryDB->subseq( "gi|169245903|gb|EU376363.1|", 1, 722 )."\n"; output: GGTAGTCCACGCCGTAAACGATGAATGCCAGTCGTCGGCAG... GTAGTCCCGGCCGTAAACGATGGATGCTAGCCGTCGGATAG... my $refDB2 = InMemoryFastaAccess->new('Test2.Fasta'); my $queryDB2 = InMemoryFastaAccess->new('Test1.Fasta'); print $refDB2->subseq( "gi|294675557|ref|NC_014034.1|", 161067, 161788 )."\n"; print $queryDB2->subseq( "gi|169245903|gb|EU376363.1|", 1, 722 )."\n"; I get: output: GTAGTCCACGCCGTAAACGATGAATGCCAGTCGTCGGCA... GTAGTCCCGGCCGTAAACGATGGATGCTAGCCGTCGGAT... Basically, sometimes the sequences retrieved are correct but other times it is offset slightly by a few base pairs. Interestingly it seems that the offset problem gets worse as you retrieve sequence chunks further and further down the sequence. print $refDB->subseq( "gi|294675557|ref|NC_014034.1|", 1514858, 1515579)."\n"; output: CCCTGGTAGTCCACGCCGTAAACGATGAATGCCAGTCGT... when it should be: print $refDB2->subseq( "gi|294675557|ref|NC_014034.1|", 1514858, 1515579)."\n"; output: GTAGTCCACGCCGTAAACGATGAATGCCAGTCGTCGGCA... This module is still way faster than what I have, so I want to keep using it. Do you think there something I'm overlooking that could be the problem or do you see a way to fix this? I am currently running: Bioperl-live from the BioPerl GitHub master branch from 19/5/11 Perl 5.10.1 Debian 6.0.1 If you need any other information please let me know. Thanks, Justin Chu -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Test2.Fasta Type: application/octet-stream Size: 3798623 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Test1.Fasta Type: application/octet-stream Size: 839 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: InMemoryFastaAccess.pm Type: application/x-perl Size: 1111 bytes Desc: not available URL: From reecehart at gmail.com Sat May 21 12:37:34 2011 From: reecehart at gmail.com (Reece Hart) Date: Sat, 21 May 2011 09:37:34 -0700 Subject: [Bioperl-l] fetching exons in genomic coordinates from NCBI Message-ID: Hi- I'd like to fetch exons for a given RefSeq in genomic coordinates (GRCh37) directly from NCBI. The only path to this data that I'm aware of is a bit painful: esearch for the gene based on RefSeq accession, efetch the gene record as XML, then parse the XML for Gene-commentary_genomic-coords. That's doable, but unattractive. Anyone got a better way? Normally I'd fetch these with blissful ease from Ensembl, but the point of this exercise is to compare data for a particular transcript against Ensembl. Thanks, Reece From donati.claudio at gmail.com Tue May 24 09:04:40 2011 From: donati.claudio at gmail.com (cdonati) Date: Tue, 24 May 2011 06:04:40 -0700 (PDT) Subject: [Bioperl-l] remove gap columns in alignment Message-ID: Dear All, does anybody know how to remove those column in an alignment that have a gap in one sequence that I specify? I know that you can remove columns that have a gap in any sequence or gap only columns, but I don't know how to remove gaps in a reference sequence. 
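
A minimal pure-Perl sketch of the operation being asked about here, assuming the alignment is available as equal-length gapped strings keyed by sequence ID. The helper name drop_reference_gap_columns and the toy alignment are hypothetical and only illustrate the column logic; they are not a BioPerl API.

```perl
use strict;
use warnings;

# Drop every alignment column in which the reference row has a gap.
# Takes the reference ID and a hash of ID => gapped string (equal lengths).
sub drop_reference_gap_columns {
    my ($ref_id, %aligned) = @_;
    my $ref = $aligned{$ref_id};

    # 0-based column indices where the reference holds a residue, not a gap.
    my @keep = grep { substr($ref, $_, 1) !~ /[-.]/ } 0 .. length($ref) - 1;

    my %trimmed;
    for my $id (keys %aligned) {
        # Rebuild each row from the kept columns only.
        $trimmed{$id} = join '', map { substr($aligned{$id}, $_, 1) } @keep;
    }
    return %trimmed;
}

# Toy alignment: the reference has gaps in columns 3 and 6.
my %aln = (
    ref  => 'AC-GT-A',
    seqA => 'ACTGTTA',
    seqB => 'A--GTCA',
);

my %out = drop_reference_gap_columns('ref', %aln);
print "$_\t$out{$_}\n" for sort keys %out;
```

Gaps in the other rows are kept wherever the reference has a residue, so the result stays a valid alignment in reference coordinates.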
Thanks a lot, Claudio Donati From roy.chaudhuri at gmail.com Tue May 24 10:23:15 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Tue, 24 May 2011 15:23:15 +0100 Subject: [Bioperl-l] remove gap columns in alignment In-Reply-To: References: Message-ID: <4DDBBF53.2090106@gmail.com> http://search.cpan.org/~cjfields/BioPerl-1.6.901/Bio/SimpleAlign.pm#splice_by_seq_pos On 24/05/2011 14:04, cdonati wrote: > Dear All, > > does anybody know how to remove those column in an alignment that have > a gap in one sequence that I specify? I know that you can remove > columns that have a gap in any sequence or gap only columns, but I > don't know how to remove gaps in a reference sequence. > > Thanks a lot, > > Claudio Donati > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From Kevin.M.Brown at asu.edu Tue May 24 11:17:15 2011 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Tue, 24 May 2011 08:17:15 -0700 Subject: [Bioperl-l] GSoC BioPerl Project Page In-Reply-To: <019001cc19f7$dc291c40$947b54c0$%yin@ucd.ie> References: <7F93ECED-6D56-4A21-9060-0D6943C737FC@illinois.edu><4DDA8EE0.90602@upvnet.upv.es> <019001cc19f7$dc291c40$947b54c0$%yin@ucd.ie> Message-ID: <1A4207F8295607498283FE9E93B775B407A5B4D0@EX02.asurite.ad.asu.edu> use List::Util qw(min max); Now you don't need to code them up. Kevin Brown Center for Innovations in Medicine Biodesign Institute Arizona State University > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Jun Yin > Sent: Tuesday, May 24, 2011 2:49 AM > To: 'Lorenzo Carretero Paulet'; bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] GSoC BioPerl Project Page > > Hi, Lorenzo, > > I don't think there is such function in SimpleAlign to get the minimal > and > maximal length for the alignment at the moment. 
>
> You can try:
>
> my @lengths;
> foreach my $seq ($aln->each_seq) {
>     push @lengths,$seq->_ungapped_len; #Get the ungapped length of a sequence
> }
>
> my $minlen=min(@lengths);
> my $maxlen=max(@lengths);
>
> Of course, you need to write your own subroutines for min and max.
>
> Cheers,
> Jun Yin
> Ph.D. student in U.C.D.
>
> Bioinformatics Laboratory
> Conway Institute
> University College Dublin
>
>
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Lorenzo Carretero Paulet
> Sent: Monday, May 23, 2011 5:44 PM
> To: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] GSoC BioPerl Project Page
>
> Hi all,
> I have created a MSA SimpleAlign object and I'd like to get the number
> of aligned residues of a given sequences (e.g. the sequence with the
> maximum or the minimum length). I have been playing with
> $aln->num_residues() (which returns the total number of residues in the
> alignment) and $aln->length() (which returns the maximum length of the
> alignment) but didn't found a way. Anyone knows how to to do this?.
> Thanks,
> Lorenzo
>
> --
> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
> Lorenzo Carretero Paulet
> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV)
> Integrative Systems Biology Group
> C/ Ingeniero Fausto Elio s/n.
> 46022 Valencia, Spain > > Phone: +34 963879934 > Fax: +34 963877859 > e-mail: locarpau at upvnet.upv.es > *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- > *-*-* > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From David.Messina at sbc.su.se Wed May 25 08:13:13 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Wed, 25 May 2011 14:13:13 +0200 Subject: [Bioperl-l] fetching exons in genomic coordinates from NCBI In-Reply-To: References: Message-ID: Hi Reece, As far as I know, you're doing it the NCBI recommended way, byzantine though it may be. Of course I too would be keen to hear of a better approach if anyone's got one. Dave On Sat, May 21, 2011 at 18:37, Reece Hart wrote: > Hi- > > I'd like to fetch exons for a given RefSeq in genomic coordinates (GRCh37) > directly from NCBI. The only path to this data that I'm aware of is a bit > painful: esearch for the gene based on RefSeq accession, efetch the gene > record as XML, then parse the XML for Gene-commentary_genomic-coords. > That's > doable, but unattractive. Anyone got a better way? > > Normally I'd fetch these with blissful ease from Ensembl, but the point of > this exercise is to compare data for a particular transcript against > Ensembl. 
> > Thanks, > Reece > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From p.j.a.cock at googlemail.com Wed May 25 08:24:59 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 25 May 2011 13:24:59 +0100 Subject: [Bioperl-l] fetching exons in genomic coordinates from NCBI In-Reply-To: References: Message-ID: On Wed, May 25, 2011 at 1:13 PM, Dave Messina wrote: > Hi Reece, > > As far as I know, you're doing it the NCBI recommended way, byzantine though > it may be. Of course I too would be keen to hear of a better approach if > anyone's got one. > > Dave You can probably do step one (RefSeq accession to NCBI accession) with Entrez Link, but getting the right elink arguments is quite fiddly. Peter From cjfields at illinois.edu Wed May 25 08:32:30 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 25 May 2011 14:32:30 +0200 Subject: [Bioperl-l] fetching exons in genomic coordinates from NCBI In-Reply-To: References: Message-ID: <4E312A6A-9CEB-4C55-BB0E-862D133B4AFD@illinois.edu> Bio::SeqIO::entrezgene? This creates a Bio::Seq::RichSeq from Gene XML (though I don't use it myself). chris On May 25, 2011, at 2:13 PM, Dave Messina wrote: > Hi Reece, > > As far as I know, you're doing it the NCBI recommended way, byzantine though > it may be. Of course I too would be keen to hear of a better approach if > anyone's got one. > > > Dave > > > > > On Sat, May 21, 2011 at 18:37, Reece Hart wrote: > >> Hi- >> >> I'd like to fetch exons for a given RefSeq in genomic coordinates (GRCh37) >> directly from NCBI. The only path to this data that I'm aware of is a bit >> painful: esearch for the gene based on RefSeq accession, efetch the gene >> record as XML, then parse the XML for Gene-commentary_genomic-coords. >> That's >> doable, but unattractive. Anyone got a better way? 
>> >> Normally I'd fetch these with blissful ease from Ensembl, but the point of >> this exercise is to compare data for a particular transcript against >> Ensembl. >> >> Thanks, >> Reece >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From locarpau at upvnet.upv.es Wed May 25 13:58:15 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet) Date: Wed, 25 May 2011 19:58:15 +0200 Subject: [Bioperl-l] Error calling alignment method Message-ID: <4DDD4337.3030107@upvnet.upv.es> Hi all, I'm trying to create the following subroutine but I get the error message: "Can't call method "alignment" on an undefined value at /Users/marioafares/Documents/workspace/PlantEvolGen/test.pl line 80, line 2." which indicates $dna_aln is undefined. However, I manage to print it using Dumper, so I guess it is actually defined. Can anyone see where the problem is? I attach the two files I'm using in the test. 
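
One likely reading of the error above, sketched without BioPerl: in the script that follows, the Codeml object is stored in $ks_factory, but the next statement is written as "my $kaks_factory->alignment($dna_aln);". The "my" declares a brand-new, undefined $kaks_factory and the method is then called on it, so the undefined value would be the factory rather than $dna_aln. The Factory class below is a hypothetical stand-in for the real Codeml wrapper.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the Codeml factory: it only stores an alignment.
package Factory;
sub new { return bless { aln => undef }, shift }
sub alignment {
    my ($self, $aln) = @_;
    $self->{aln} = $aln if defined $aln;
    return $self->{aln};
}

package main;

my $ks_factory = Factory->new;

# Bug pattern: a stray "my" in front of a method call introduces a new,
# undefined variable, and the call then dies at run time.
my $error = '';
eval {
    my $kaks_factory;                 # what the stray "my" effectively creates
    $kaks_factory->alignment('dna');  # dies: method call on undef
    1;
} or $error = $@;
print "caught: $error";

# Fix: call the method on the variable that actually holds the object.
$ks_factory->alignment('dna');
print "stored: ", $ks_factory->alignment, "\n";
```

Using one variable name throughout (either $ks_factory or $kaks_factory) and dropping the "my" on the method call should remove the error. Separately, the 'dA' hash key used later in the script may be worth checking against the keys the parser actually returns (it may be 'dN'); that is an assumption, not something confirmed in this thread.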
Thanks in advance, Lorenzo #!/usr/local/bin/perl -w use 5.010; use strict; use Data::Dumper; # definition of the environmental variable CLUSTALDIR BEGIN {$ENV{CLUSTALDIR} = '/Applications/Bioinformatics/clustalw-2.1-macosx/'} use Bio::Tools::Run::Alignment::Clustalw; use Bio::Align::Utilities qw(aa_to_dna_aln); BEGIN {$ENV{PAMLDIR} = '/Applications/Bioinformatics/paml44/bin/'} use Bio::Tools::Run::Phylo::PAML::Codeml; my $sequencesfilenameAA = "/Users/marioafares/Documents/SequenceDatabase/plaza_public_02_Apr27/plaza_public_02/BLAST_Parsed_results/PerSpecies/test_vs_test.par.aa.1.fas"; my $sequencesfilenameNT = "/Users/marioafares/Documents/SequenceDatabase/plaza_public_02_Apr27/plaza_public_02/BLAST_Parsed_results/PerSpecies/test_vs_test.par.nt.1.fas"; my $format = 'fasta'; GettingBioperlAlignmentAAtoDNAplusPAMLcalculation ($sequencesfilenameAA, $sequencesfilenameNT, $format); sub GettingBioperlAlignmentAAtoDNAplusPAMLcalculation { my ( $sequencesfilenameAA, $sequencesfilenameNT, $format, $ktuple, $matrix ) = @_; my %hashNTseqs = (); my $likelihood; my $Ks; my $Ka; my $omega; my $factory = Bio::Tools::Run::Alignment::Clustalw->new (); #use default parameters in here my $pep_aln = $factory->align($sequencesfilenameAA); my $inseq = Bio::SeqIO->new(-file => "<$sequencesfilenameNT", -format => $format ); while (my $seq = $inseq->next_seq) { my $seq_id = $seq->display_id(); $hashNTseqs{$seq_id} = $seq; } my $dna_aln = aa_to_dna_aln($pep_aln, \%hashNTseqs); say Dumper \%hashNTseqs; say Dumper $dna_aln; ########################################################### #my $codeml_runs = 5; my $ks_factory = new Bio::Tools::Run::Phylo::PAML::Codeml ( -params => { #'verbose' => 0, #'noisy' => 0, 'runmode' => -2, 'seqtype' => 1, #'model' => 0, #'NSsites' => 0, #'icode' => 0, #'fix_alpha' => 0, #'fix_kappa' => 0, #'RateAncestor' => 0, 'CodonFreq' => 2, 'cleandata' => 1, 'ndata' => 1 }, ); my $kaks_factory->alignment($dna_aln); say "\nCalculating Ks-values of current cluster ..."; # 
The estimation of the Ks-values is repeated $codeml_runs times ... # for(my $k=1;$k <= $codeml_runs;$k++) # { # print "\nCodeml-Run $k:\n\n"; # Ka and Ks-vlaues are calculated using codeml my ($rc,$parser) = $kaks_factory->run(); #$kaks_factory->cleanup(); # If the calculation was succsessful ... # if($rc != 0) # { my $result = $parser->next_result;#not sure what it does here #my $NGmatrix = $result->get_MLmatrix(); my $MLmatrix = $result->get_MLmatrix(); $likelihood = $MLmatrix->[0]->[1]->{'lnL'}; $Ks = $MLmatrix->[0]->[1]->{'dS'}; $Ka = $MLmatrix->[0]->[1]->{'dA'}; $omega = $MLmatrix->[0]->[1]->{'omega'}; print " likelihood = $likelihood, Ka = $Ka, Ks = $Ks, Ka/Ks = $omega\n"; # } # else # If an error occured during the Ks-calculation ... # { # return (-1); # } # } return ( $likelihood, $Ks, $Ka, $omega ); } -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail:locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test_vs_test.par.nt.1.fas URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: test_vs_test.par.aa.1.fas
URL:

From locarpau at upvnet.upv.es Wed May 25 14:41:22 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Wed, 25 May 2011 20:41:22 +0200
Subject: [Bioperl-l] Error calling alignment method
In-Reply-To: <4DDD4337.3030107@upvnet.upv.es>
References: <4DDD4337.3030107@upvnet.upv.es>
Message-ID: <4DDD4D52.606@upvnet.upv.es>

On 25/05/11 19:58, Lorenzo Carretero Paulet wrote:
>
>
> Hi all,
> I'm trying to create the following subroutine but I get the error
> message:
> "Can't call method "alignment" on an undefined value at
> /Users/marioafares/Documents/workspace/PlantEvolGen/test.pl line 80,
> line 2."
> which indicates $dna_aln is undefined. However, I manage to print it
> using Dumper, so I guess it is actually defined.
> Can anyone see where the problem is?
> I attach the two files I'm using in the test.
> Thanks in advance,
> Lorenzo
>
> #!/usr/local/bin/perl -w
> use 5.010;
> use strict;
> use Data::Dumper;
> # definition of the environmental variable CLUSTALDIR
> BEGIN {$ENV{CLUSTALDIR} =
> '/Applications/Bioinformatics/clustalw-2.1-macosx/'}
> use Bio::Tools::Run::Alignment::Clustalw;
> use Bio::Align::Utilities qw(aa_to_dna_aln);
>
> BEGIN {$ENV{PAMLDIR} = '/Applications/Bioinformatics/paml44/bin/'}
> use Bio::Tools::Run::Phylo::PAML::Codeml;
>
> my $sequencesfilenameAA =
>
> "/Users/marioafares/Documents/SequenceDatabase/plaza_public_02_Apr27/plaza_public_02/BLAST_Parsed_results/PerSpecies/test_vs_test.par.aa.1.fas";
>
> my $sequencesfilenameNT =
>
> "/Users/marioafares/Documents/SequenceDatabase/plaza_public_02_Apr27/plaza_public_02/BLAST_Parsed_results/PerSpecies/test_vs_test.par.nt.1.fas";
>
> my $format = 'fasta';
>
> GettingBioperlAlignmentAAtoDNAplusPAMLcalculation
> ($sequencesfilenameAA, $sequencesfilenameNT, $format);
>
>
> sub GettingBioperlAlignmentAAtoDNAplusPAMLcalculation
> {
> my ( $sequencesfilenameAA, $sequencesfilenameNT, $format, $ktuple,
> $matrix ) = @_;
> my %hashNTseqs = ();
> my
$likelihood; > my $Ks; > my $Ka; > my $omega; > my $factory = Bio::Tools::Run::Alignment::Clustalw->new (); #use > default parameters in here > my $pep_aln = $factory->align($sequencesfilenameAA); > > > my $inseq = Bio::SeqIO->new(-file => "<$sequencesfilenameNT", > -format => $format ); > while (my $seq = $inseq->next_seq) > { > my $seq_id = $seq->display_id(); > $hashNTseqs{$seq_id} = $seq; > } > my $dna_aln = aa_to_dna_aln($pep_aln, \%hashNTseqs); > say Dumper \%hashNTseqs; > say Dumper $dna_aln; > ########################################################### > #my $codeml_runs = 5; > my $ks_factory = new Bio::Tools::Run::Phylo::PAML::Codeml > ( -params => { > #'verbose' => 0, > #'noisy' => 0, > 'runmode' => -2, > 'seqtype' => 1, > #'model' => 0, > #'NSsites' => 0, > #'icode' => 0, > #'fix_alpha' => 0, > #'fix_kappa' => 0, > #'RateAncestor' => 0, > 'CodonFreq' => 2, > 'cleandata' => 1, > 'ndata' => 1 > }, > > ); > > my $kaks_factory->alignment($dna_aln); > > say "\nCalculating Ks-values of current cluster ..."; > > # The estimation of the Ks-values is repeated $codeml_runs > times ... > # for(my $k=1;$k <= $codeml_runs;$k++) > # { > # print "\nCodeml-Run $k:\n\n"; > > # Ka and Ks-vlaues are calculated using codeml > my ($rc,$parser) = $kaks_factory->run(); > #$kaks_factory->cleanup(); > # If the calculation was succsessful ... > # if($rc != 0) > # { > my $result = $parser->next_result;#not sure what it > does here > #my $NGmatrix = $result->get_MLmatrix(); > my $MLmatrix = $result->get_MLmatrix(); > $likelihood = $MLmatrix->[0]->[1]->{'lnL'}; > $Ks = $MLmatrix->[0]->[1]->{'dS'}; > $Ka = $MLmatrix->[0]->[1]->{'dA'}; > $omega = $MLmatrix->[0]->[1]->{'omega'}; > print " likelihood = $likelihood, Ka = $Ka, Ks = > $Ks, Ka/Ks = $omega\n"; > # } > # else # If an error occured during the Ks-calculation ... 
> # { > # return (-1); > # } > # } > return ( $likelihood, $Ks, $Ka, $omega ); > } > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Similarly, when I try to run the subroutine from a manually created codon alignment as: my $alignment = "/Users/marioafares/Documents/SequenceDatabase/plaza_public_02_Apr27/plaza_public_02/BLAST_Parsed_results/PerSpecies/Alyrata_vs_Alyrata.par.cds.aln.fas"; my $format = 'fasta'; #GettingBioperlAlignmentAAtoDNAplusPAMLcalculation ($sequencesfilenameAA, $sequencesfilenameNT, $format); GetPAMLcalculation($alignment, $format); sub GetPAMLcalculation { my ( $alignment, $format ) = @_; my $alignio = Bio::AlignIO->new(-format => $format, -file => $alignment); my $dna_aln = $alignio->next_aln; my $kaks_factory = new Bio::Tools::Run::Phylo::PAML::Codeml ( -params => { #'verbose' => 0, #'noisy' => 0, 'runmode' => -2, 'seqtype' => 1, #'model' => 0, #'NSsites' => 0, #'icode' => 0, #'fix_alpha' => 0, #'fix_kappa' => 0, #'RateAncestor' => 0, 'CodonFreq' => 2, 'cleandata' => 1, 'ndata' => 1 }, ); $kaks_factory->alignment($dna_aln); say "\nCalculating Ks-values of current cluster ..."; # The estimation of the Ks-values is repeated $codeml_runs times ... # for(my $k=1;$k <= $codeml_runs;$k++) # { # print "\nCodeml-Run $k:\n\n"; # Ka and Ks-vlaues are calculated using codeml my ($rc,$parser) = $kaks_factory->run(); #$kaks_factory->cleanup(); # If the calculation was succsessful ... 
# if($rc != 0) # { my $result = $parser->next_result;#not sure what it does here #my $NGmatrix = $result->get_MLmatrix(); my $MLmatrix = $result->get_MLmatrix(); my @otus = $result->get_seqs(); my $likelihood = $MLmatrix->[0]->[1]->{'lnL'}; my $Ks = $MLmatrix->[0]->[1]->{'dS'}; my $Ka = $MLmatrix->[0]->[1]->{'dA'}; my $omega = $MLmatrix->[0]->[1]->{'omega'}; print " likelihood = $likelihood, Ka = $Ka, Ks = $Ks, Ka/Ks = $omega\n"; # } # else # If an error occured during the Ks-calculation ... # { # return (-1); # } # } return ( $likelihood, $Ks, $Ka, $omega ); } I also get the following error message, which seems to indicate that the alignment is not recognized as seqtype 1 by codeml (i.e. codon alignment). --------------------- WARNING --------------------- MSG: There was an error - see error_string for the program output --------------------------------------------------- ------------- EXCEPTION Bio::Root::NotImplemented ------------- MSG: Unknown format of PAML output did not see seqtype STACK Bio::Tools::Phylo::PAML::_parse_summary /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:461 STACK Bio::Tools::Phylo::PAML::next_result /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:270 STACK main::GetPAMLcalculation /Users/marioafares/Documents/workspace/PlantEvolGen/test.pl:78 STACK toplevel /Users/marioafares/Documents/workspace/PlantEvolGen/test.pl:36 --------------------------------------------------------------- -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 
46022 Valencia, Spain

Phone: +34 963879934
Fax: +34 963877859
e-mail: locarpau at upvnet.upv.es
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

From David.Messina at sbc.su.se Wed May 25 14:44:57 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 25 May 2011 20:44:57 +0200
Subject: [Bioperl-l] Error calling alignment method
In-Reply-To: <4DDD4337.3030107@upvnet.upv.es>
References: <4DDD4337.3030107@upvnet.upv.es>
Message-ID:

Hi Lorenzo,

You've got a typo in your variable names.

my $ks_factory

my $kaks_factory->alignment($dna_aln);

So I think it's the $kaks_factory that's undefined. And furthermore, there should be no "my" in front of the method call.

I'm not sure how this even compiles; I guess the check for method calls happens at runtime, so that's why the stray "my" is missed.

Also, a request: I really appreciate you posting your code and including your data files. However, email tends to wrap lines funny, which breaks code. In the future could you try attaching your code also, or using gists (https://gist.github.com/)?

Thanks,
Dave

From locarpau at upvnet.upv.es Wed May 25 15:00:29 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Wed, 25 May 2011 21:00:29 +0200
Subject: [Bioperl-l] Error calling alignment method
In-Reply-To:
References: <4DDD4337.3030107@upvnet.upv.es>
Message-ID: <4DDD51CD.90208@upvnet.upv.es>

On 25/05/11 20:44, Dave Messina wrote:
> Hi Lorenzo,
>
> You've got a typo in your variable names.
>
> my $ks_factory
>
>
> my $kaks_factory->alignment($dna_aln);
>
>
> So I think it's the $kaks_factory that's undefined. And furthermore,
> no my in front of the method call.
>
> I'm not sure how this even compiles; I guess the check for method
> calls happens at runtime, so that's why the stray "my" is missed.
>
>
> Also, a request: I really appreciate you posting your code and
> including your data files. However, email tends to wrap lines funny,
> which breaks code.
In the future could you try attaching your code > also, or using gists (https://gist.github.com/) ? > > Thanks, > Dave > > Thanks Dave. I attach a corrected version of the code and both nt and aa files. It's still not working, returning the message: ------------- EXCEPTION Bio::Root::NotImplemented ------------- MSG: Unknown format of PAML output did not see seqtype STACK Bio::Tools::Phylo::PAML::_parse_summary /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:461 STACK Bio::Tools::Phylo::PAML::next_result /Library/Perl//5.10.0/Bio/Tools/Phylo/PAML.pm:270 STACK main::GettingBioperlAlignmentAAtoDNAplusPAMLcalculation /Users/marioafares/Documents/workspace/PlantEvolGen/test.pl:154 STACK toplevel /Users/marioafares/Documents/workspace/PlantEvolGen/test.pl:35 --------------------------------------------------------------- I think the codon alignment is being proberly constructed by the method aa_to_dna_aln, as I can do a Dumper printing of it. So the problem must be in the PAML codeml wrapper not properly recognizing the codon alignment. Could it be related to the alignment format (PAML runs on PHYLIP formatted files)? Cheers, Lorenzo -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail: locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test_vs_test.par.nt.1.fas URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test_vs_test.par.aa.1.fas URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: code.pl
Type: text/x-perl-script
Size: 3438 bytes
Desc: not available
URL:

From David.Messina at sbc.su.se Wed May 25 15:05:52 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 25 May 2011 21:05:52 +0200
Subject: [Bioperl-l] Error calling alignment method
In-Reply-To: <4DDD4D52.606@upvnet.upv.es>
References: <4DDD4337.3030107@upvnet.upv.es> <4DDD4D52.606@upvnet.upv.es>
Message-ID:

When I run into problems getting my code to work, I always check that everything works properly with the program directly, outside my code.

What happens when you run PAML on your alignment directly, away from your code and BioPerl?

From jason.stajich at gmail.com Wed May 25 16:24:48 2011
From: jason.stajich at gmail.com (Jason Stajich)
Date: Wed, 25 May 2011 13:24:48 -0700
Subject: [Bioperl-l] Error calling alignment method
In-Reply-To: <4DDD51CD.90208@upvnet.upv.es>
References: <4DDD4337.3030107@upvnet.upv.es> <4DDD51CD.90208@upvnet.upv.es>
Message-ID:

> ------------------------------------------
>
> I think the codon alignment is being properly constructed by the method aa_to_dna_aln, as I can do a Dumper printing of it. So the problem must be in the PAML codeml wrapper not properly recognizing the codon alignment. Could it be related to the alignment format (PAML runs on PHYLIP formatted files)?

The writing out in phylip format is taken care of by the factory - you are passing in an alignment object, so that is not typically the problem.

I would repeat Dave's idea: just dump the codon alignment out to a file and run PAML manually with it. The parsing error sounds like there are problems when running PAML, and you may want to check that you don't have stop codons in your alignment. It looks like your CDS file has stops as the last codon, so if you drop those last 3 bases, how does it work?
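For anyone following along, here is a minimal sketch of the "drop those last 3 bases" check. It assumes the sequences are ungapped, in-frame CDS strings using the standard genetic code. The helper name strip_terminal_stop is made up for illustration and is not part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): remove a single trailing
# stop codon (TAA, TAG or TGA) from an ungapped, in-frame CDS string.
sub strip_terminal_stop {
    my ($cds) = @_;
    # Only trim when the length is a multiple of 3, so that the last
    # three bases really are the final codon.
    if (length($cds) % 3 == 0) {
        $cds =~ s/(?:TAA|TAG|TGA)\z//i;
    }
    return $cds;
}

# Example: the final TAA is removed, the rest is untouched.
print strip_terminal_stop('ATGGCCAAATAA'), "\n";   # prints ATGGCCAAA
```

Stop codons inside the sequence are left alone by this sketch and would need separate handling.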
> Cheers, > Lorenzo > > > > -- > *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* > Lorenzo Carretero Paulet > Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) > Integrative Systems Biology Group > C/ Ingeniero Fausto Elio s/n. > 46022 Valencia, Spain > > Phone: +34 963879934 > Fax: +34 963877859 > e-mail: locarpau at upvnet.upv.es > *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From nadel at nabsys.com Thu May 26 16:36:06 2011 From: nadel at nabsys.com (Mark Nadel) Date: Thu, 26 May 2011 16:36:06 -0400 Subject: [Bioperl-l] Restriction Enzyme Analysis Message-ID: Let me preface this question by saying that, after following various online sets of instructions, I was unable to "install" bioperl on my Mac. However, by simply copying Bio into a Perl directory, I have some limited functionality. I don't know if this is an installation issue, or if there is a problem with the module, or most probably, with my understanding of the module. 
Here is a short script to illustrate the issue:

use strict;
use Bio::Restriction::EnzymeCollection;

my $re=Bio::Restriction::Enzyme->new
      (-enzyme=>'EcoRI', -seq=>'G^AATTC');
my $a = $re->name();
print "The name is $a\n";

my $all_collection = Bio::Restriction::EnzymeCollection->new();
my $x = $all_collection->available_list();
print "the available list is $x\n";

my $six_cutter_collection = $all_collection->cutters(6);
my $y = $six_cutter_collection->available_list();
print "the available list is $y\n";

for my $enz ($six_cutter_collection){
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
    # prints name, recognition site, overhang
}

The output is

The name is EcoRI
the available list is 532
the available list is 266
Can't locate object method "name" via package "Bio::Restriction::EnzymeCollection" at /Users/marknadel/Documents/workspace/adHoc/bioperl_restriction_digest.pl line 22, line 532.

It would seem that some of this is working, but the last part, taken directly from the HOWTO, is not.

If someone reading this list can help me with this, I would be most grateful.

Thanks in advance,

Mark Nadel

From David.Messina at sbc.su.se Thu May 26 17:15:53 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 26 May 2011 23:15:53 +0200
Subject: [Bioperl-l] Restriction Enzyme Analysis
In-Reply-To:
References:
Message-ID:

>
> Let me preface this question by saying that, after following various online
> sets of instructions, I was unable to "install" bioperl on my Mac.
> However, by simply copying Bio into a Perl directory, I have some limited
> functionality.
>

Sorry you've had trouble installing BioPerl, Mark. For nearly all of the functionality, your method of installation will work just fine. That's exactly how I and many others use BioPerl. (For those curious about this, there are instructions on the BioPerl install page, or my own version here: http://seqxml.org/xml/BioPerl.html).

Sorry also about the trouble with the HOWTO code - it's important that the example code works! (My guess is that this did in fact work at some point but is now outdated.)

In any case, the solution is to change this line

> for my $enz ($six_cutter_collection){
>

to

for my $enz ($six_cutter_collection->each_enzyme() ) {

Thanks for pointing this out. I'll also change it in the HOWTO.

Dave

From David.Messina at sbc.su.se Fri May 27 05:22:14 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 27 May 2011 11:22:14 +0200
Subject: [Bioperl-l] fetching exons in genomic coordinates from NCBI
In-Reply-To:
References:
Message-ID:

On Fri, May 27, 2011 at 10:20, Reece Hart wrote:

> On Wed, May 25, 2011 at 5:13 AM, Dave Messina wrote:
>
>> As far as I know, you're doing it the NCBI recommended way, byzantine
>> though it may be. Of course I too would be keen to hear of a better approach
>> if anyone's got one.
>>
>
> Is that really a "recommended" way? Aside from the NCBI eutils pages which
> describe how to submit queries, I didn't see anything about how to process
> the results.
>

When I said that, I was thinking about the esearch and efetch part, but now that I look around, I believe that yes, the NCBI expects us to parse the XML using XML libraries such as libXML.

Or XmlWrapp. See this relatively current page which states that "the NCBI C++ Toolkit has incorporated and enhanced the open source XmlWrapp package, which provides a simplified way for developers to work with XML.":

http://www.ncbi.nlm.nih.gov/books/NBK8829/

There is also Genome Workbench, which I have no experience with, but which apparently does read NCBI's XML:

http://www.ncbi.nlm.nih.gov/projects/gbench/

> So, I ended up reverse engineering the XML by comparing several efetch
> results with web pages.
If you haven't already, you might take a look at the dtd and schema: http://www.ncbi.nlm.nih.gov/data_specs/dtd/ http://www.ncbi.nlm.nih.gov/data_specs/schema/ In particular, I think the ones you want are these: http://www.ncbi.nlm.nih.gov/dtd/NCBI_Entrezgene.mod.dtd http://www.ncbi.nlm.nih.gov/data_specs/schema/NCBI_Entrezgene.mod.xsd I am certainly not an expert in this area, but yeah, it sure seems like there should be some more human-readable guide to their XML formats than just the above. Dave From cjfields at illinois.edu Fri May 27 06:13:59 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 27 May 2011 12:13:59 +0200 Subject: [Bioperl-l] fetching exons in genomic coordinates from NCBI In-Reply-To: References: Message-ID: <9A466EDB-87C6-41A0-B01E-4C8FB9A5F1CB@illinois.edu> On May 27, 2011, at 11:22 AM, Dave Messina wrote: > On Fri, May 27, 2011 at 10:20, Reece Hart wrote: > >> On Wed, May 25, 2011 at 5:13 AM, Dave Messina wrote: >> >>> As far as I know, you're doing it the NCBI recommended way, byzantine >>> though it may be. Of course I too would be keen to hear of a better approach >>> if anyone's got one. >>> >> >> Is that really a "recommended" way? Aside from the NCBI eutils pages which >> describe how to submit queries, I didn't see anything about how to process >> the results. > > When I said that, I was thinking about the esearch and efetch part, but now > that I look around, I believe that yes, the NCBI expects us to parse the XML > using XML libraries such as libXML. > >> ... > > ... > I am certainly not an expert in this area, but yeah, it sure seems like > there should be some more human-readable guide to their XML formats than > just the above. > > Dave Brian Osborne and I set up a page to answer some of these questions (Brian's answer is for EntrezGene XML, and there is a EUtilities example). 
It's here:

http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences

If you were going a pure eutils-based route, it's possible you could kludge something together from the various examples to get at what you want. Note that esummary gives you coords as well, is shorter, and has some OO-based ways to get at the data generically:

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook

chris

From locarpau at upvnet.upv.es Wed May 25 18:23:17 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero)
Date: Thu, 26 May 2011 00:23:17 +0200
Subject: [Bioperl-l] Error calling alignment method
In-Reply-To:
References: <4DDD4337.3030107@upvnet.upv.es> <4DDD51CD.90208@upvnet.upv.es>
Message-ID: <4DDD8155.5060002@upvnet.upv.es>

Dave, Jason:

I had already tried running PAML manually with the alignment (I always do this to confirm that the software is properly installed and set up), and I ran it again with an edited version of the alignment with the stop codons removed (I didn't know that stop codons at the ends of the alignment could affect PAML, only in-frame stop codons). It worked properly in both cases.

I ran my script again (see attached testa.pl) using two different methods, one constructing the codon alignment using aa_to_dna_aln and another one passing the aligned sequences (in both cases after removing the stop codons).
I had again the message: ------------- EXCEPTION: Bio::Root::NotImplemented ------------- MSG: Unknown format of PAML output did not see seqtype STACK: Error::throw STACK: Bio::Root::Root::throw /Library/Perl/5.10.0/Bio/Root/Root.pm:368 STACK: Bio::Tools::Phylo::PAML::_parse_summary /Library/Perl/5.10.0/Bio/Tools/Phylo/PAML.pm:461 STACK: Bio::Tools::Phylo::PAML::next_result /Library/Perl/5.10.0/Bio/Tools/Phylo/PAML.pm:270 STACK: main::GettingBioperlAlignmentAAtoDNAplusPAMLcalculation /Users/Lorenzo/Documents/workspace/PlantEvolGen/testa.pl:83 STACK: /Users/Lorenzo/Documents/workspace/PlantEvolGen/testa.pl:23 ---------------------------------------------------------------- Thanks, Lorenzo On 5/25/11 10:24 PM, Jason Stajich wrote: >> ------------------------------------------ >> >> I think the codon alignment is being proberly constructed by the method aa_to_dna_aln, as I can do a Dumper printing of it. So the problem must be in the PAML codeml wrapper not properly recognizing the codon alignment. Could it be related to the alignment format (PAML runs on PHYLIP formatted files)? > The writing out in phylip format is taking care of by the factory - you are passing in an alignment object so that is not typically the problem. > > I would repeat Dave's idea that you just dump the codon alignment file out and you run PAML manually with it. The parsing error sounds like there are problems when running PAML and you may want to check that you don't have stop codons in your alignment. It looks like your CDS file has stops as the last codon so if you drop those last 3 bases, how does it work? > >> Cheers, >> Lorenzo >> >> >> >> -- >> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >> Lorenzo Carretero Paulet >> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) >> Integrative Systems Biology Group >> C/ Ingeniero Fausto Elio s/n. 
>> 46022 Valencia, Spain >> >> Phone: +34 963879934 >> Fax: +34 963877859 >> e-mail: locarpau at upvnet.upv.es >> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero Fausto Elio s/n. 46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail: locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* -------------- next part -------------- A non-text attachment was scrubbed... Name: testa.pl Type: text/x-perl-script Size: 5414 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test_vs_test.par.aa.1.fas URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test_vs_test.par.nt.1.aln.fas URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test_vs_test.par.nt.1.fas URL: From sumithanallu at gmail.com Wed May 25 21:13:34 2011 From: sumithanallu at gmail.com (subiotech) Date: Wed, 25 May 2011 18:13:34 -0700 (PDT) Subject: [Bioperl-l] re covering sequences using SNP data Message-ID: <31704101.post@talk.nabble.com> I have the reference sequence of a gene and its variants (SNPs+ indels) from 101 accessions. Now I need to recover the sequences of the gene for all the 101 accessions using the variant information. 
I'm new to Perl; so far I have only been able to get the script to identify the fasta sequence given the id. The templates of the files are:

1) variant file:

acc_name  position  variant
abc       10        C
xyz       15        G

2) fasta file

>abc
ATTTGGGGGCCCCGGGGGGG
>xyz
TTTGGGGGCCCCAAAAAAA

My unfinished script:

#!/usr/bin/perl
use Bio::SeqIO;
use IO::String;
use Bio::SearchIO;

#usage variantfile seqfile outfile
my $variantfile = shift or die;

# open idfile and store the info in the array.
open (IDFILE, $variantfile) || die "Can't open $variantfile: $!\n";
while (<IDFILE>) {  # loop through IDFILE one line at a time
    chomp;          # remove any newlines if they exist;
    push(@id, $_);
}

my $seqfile= shift or die;
my $outfile= shift or die;
my $gb = Bio::SeqIO->new(-file => "<$seqfile", -format => "fasta");
my $seq_out = Bio::SeqIO->new('-file' => ">$outfile", '-format' => 'FASTA');
while($seq = $gb->next_seq) {
    $seq_out->write_seq($seq) if (grep {$_ eq $seq->id} @id);
}
exit;

Thanks for your help!!!

--
View this message in context: http://old.nabble.com/recovering-sequences-using-SNP-data-tp31704101p31704101.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From florent.angly at gmail.com Thu May 26 22:55:35 2011
From: florent.angly at gmail.com (Florent Angly)
Date: Fri, 27 May 2011 12:55:35 +1000
Subject: [Bioperl-l] Problems with Bio::DB::Fasta
In-Reply-To:
References:
Message-ID: <4DDF12A7.1010402@gmail.com>

Hi Justin,

I've been trying to reproduce your issue. A problem I ran into was that there were some extra empty lines in your FASTA files. Then I made a test script that gets the subsequences you mentioned using three different methods: Bio::SeqIO+Bio::Seq, Bio::DB::Fasta, and your InMemoryFastaAccess. These three methods return the same answer, so I see no problem there.
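A minimal sketch of removing the empty lines before indexing, assuming plain-text FASTA files in which whitespace-only lines carry no information. The function name strip_blank_lines is made up for illustration; nothing here is part of Bio::DB::Fasta.

```perl
use strict;
use warnings;

# Hypothetical helper: copy a FASTA file, skipping lines that are empty
# or contain only whitespace. Returns the number of lines written.
sub strip_blank_lines {
    my ($in_path, $out_path) = @_;
    open my $in,  '<', $in_path  or die "Can't read $in_path: $!\n";
    open my $out, '>', $out_path or die "Can't write $out_path: $!\n";
    my $kept = 0;
    while (my $line = <$in>) {
        next if $line =~ /^\s*$/;    # skip empty / whitespace-only lines
        print {$out} $line;
        $kept++;
    }
    close $in;
    close $out or die "Can't close $out_path: $!\n";
    return $kept;
}

# Command-line use: perl strip_blank.pl Test1.Fasta Test1.clean.Fasta
strip_blank_lines(@ARGV) if @ARGV == 2;
```

The cleaned copy can then be passed to Bio::DB::Fasta->new() in place of the original file.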
My system is pretty similar to yours: Bioperl-live from the BioPerl GitHub master branch from 27/5/11 Perl 5.12.3 Linux 2.6.38-2-amd64 (Linux Mint Debian Edition) Can you run the attached script on the attached FASTA files and see if all tests pass? Thanks, Florent On 21/05/11 05:51, Justin Chu wrote: > Hello: > > I'm having trouble with Bio::DB::Fasta. It sometimes occurs when I use large > fasta files and retrieve sequence from a bit past the start of the file. I > think some characters are being ignored or a rounding error is occurring or > something when using the offset to retrieve entries from the index file. I > have attached the Fasta files I have been using, just incase my problem is > due to improper formatting of my files. > > For example: > > my $refDB = Bio::DB::Fasta->new('Test2.Fasta'); > my $queryDB = Bio::DB::Fasta->new('Test1.Fasta'); > > print $refDB->subseq( "gi|294675557|ref|NC_014034.1|", 161067, 161788 > )."\n"; > print $queryDB->subseq( "gi|169245903|gb|EU376363.1|", 1, 722 )."\n"; > > output: > GGTAGTCCACGCCGTAAACGATGAATGCCAGTCGTCGGCAG... > GTAGTCCCGGCCGTAAACGATGGATGCTAGCCGTCGGATAG... > > my $refDB2 = InMemoryFastaAccess->new('Test2.Fasta'); > my $queryDB2 = InMemoryFastaAccess->new('Test1.Fasta'); > > print $refDB2->subseq( "gi|294675557|ref|NC_014034.1|", 161067, 161788 > )."\n"; > print $queryDB2->subseq( "gi|169245903|gb|EU376363.1|", 1, 722 )."\n"; > > I get: > > output: > GTAGTCCACGCCGTAAACGATGAATGCCAGTCGTCGGCA... > GTAGTCCCGGCCGTAAACGATGGATGCTAGCCGTCGGAT... > > Basically, sometimes the sequences retrieved are correct but other times it > is offset slightly by a few base pairs. Interestingly it seems that the > offset problem gets worse as you retrieve sequence chunks further and > further down the sequence. > > print $refDB->subseq( "gi|294675557|ref|NC_014034.1|", 1514858, > 1515579)."\n"; > > output: > CCCTGGTAGTCCACGCCGTAAACGATGAATGCCAGTCGT... 
> > when it should be: > > print $refDB2->subseq( "gi|294675557|ref|NC_014034.1|", 1514858, > 1515579)."\n"; > > output: > GTAGTCCACGCCGTAAACGATGAATGCCAGTCGTCGGCA... > > This module is still way faster than what I have, so I want to keep using > it. Do you think there something I'm overlooking that could be the problem > or do you see a way to fix this? > > I am currently running: > Bioperl-live from the BioPerl GitHub master branch from 19/5/11 > Perl 5.10.1 > Debian 6.0.1 > > If you need any other information please let me know. > > Thanks, > > Justin Chu > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -------------- next part -------------- A non-text attachment was scrubbed... Name: test.pl Type: application/x-perl Size: 1409 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Test1.Fasta URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Test2.Fasta URL: From reecehart at gmail.com Fri May 27 04:20:24 2011 From: reecehart at gmail.com (Reece Hart) Date: Fri, 27 May 2011 01:20:24 -0700 Subject: [Bioperl-l] fetching exons in genomic coordinates from NCBI In-Reply-To: References: Message-ID: On Wed, May 25, 2011 at 5:13 AM, Dave Messina wrote: > As far as I know, you're doing it the NCBI recommended way, byzantine > though it may be. Of course I too would be keen to hear of a better approach > if anyone's got one. > Is that really a "recommended" way? Aside from the NCBI eutils pages which describe how to submit queries, I didn't see anything about how to process the results. So, I ended up reverse engineering the XML by comparing at several efetch results with web pages. Or, are you saying that reverse engineering is the recommended NCBI way? 
-Reece From reecehart at gmail.com Fri May 27 04:27:34 2011 From: reecehart at gmail.com (Reece Hart) Date: Fri, 27 May 2011 01:27:34 -0700 Subject: [Bioperl-l] fetching exons in genomic coordinates from NCBI In-Reply-To: <4E312A6A-9CEB-4C55-BB0E-862D133B4AFD@illinois.edu> References: <4E312A6A-9CEB-4C55-BB0E-862D133B4AFD@illinois.edu> Message-ID: Dave, Chris, Peter- Thanks for the tips. I managed to solve my immediate need by parsing the efetch results using XPath queries. To future travelers who wander this way, there's a bit of code here: https://bitbucket.org/reece/bio-hgvs-perl/src/84181f38d092/sandbox/ -Reece From justinchu1989 at gmail.com Fri May 27 15:07:39 2011 From: justinchu1989 at gmail.com (Justin Chu) Date: Fri, 27 May 2011 13:07:39 -0600 Subject: [Bioperl-l] Problems with Bio::DB::Fasta In-Reply-To: <4DDF12A7.1010402@gmail.com> References: <4DDF12A7.1010402@gmail.com> Message-ID: Hi Florent: Thanks for your reply, I think something is wrong with my installation because I keep getting an error when running your script. I have had already tried reinstalling with a version on cpan to make sure my problem is not due to missing dependencies but I still get the following error: Can't locate Test/Exception.pm in @INC (@INC contains: t/lib /home/justin/workspace/.metadata/.plugins/org.epic.debug /home/justin/workspace/LocalTools/Testing /etc/perl /usr/local/lib/perl/5.10.1 /usr/local/share/perl/5.10.1 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.10 /usr/share/perl/5.10 /usr/local/lib/site_perl .) at (eval 46) line 2. BEGIN failed--compilation aborted at (eval 46) line 2. BEGIN failed--compilation aborted at /usr/local/share/perl/5.10.1/Bio/Root/Test.pm line 152. Compilation failed in require at /home/justin/workspace/LocalTools/Testing/ test.pl line 6. BEGIN failed--compilation aborted at /home/justin/workspace/LocalTools/Testing/test.pl line 6. 
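The error above says that Test/Exception.pm could not be found in @INC, which usually just means the Test::Exception module is not installed. A minimal sketch for checking which modules can be loaded, using only core Perl; the helper name module_available is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 if the named module can be loaded, else 0.
sub module_available {
    my ($module) = @_;
    # Turn Some::Module into Some/Module.pm, the form require() looks up in @INC.
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(Test::More Test::Exception)) {
    printf "%-16s %s\n", $module, module_available($module) ? 'found' : 'MISSING';
}
```

If Test::Exception is reported as MISSING, installing it from CPAN should get the test script past line 6.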
However, I did post my problem somewhere else and found that other people did get errors when trying to make an index with my files. The weird thing is that I could make index files, but lines without sequence would cause my sequence retrieval to be offset by one sequence position for each empty line. I found that removing all the spaces fixed the retrieval, but this still does not explain the lack of error messages.

Thanks for your help,

Justin

On Thu, May 26, 2011 at 8:55 PM, Florent Angly wrote:

> Hi Justin,
>
> I've been trying to reproduce your issue. A problem I ran into was that there
> were some extra empty lines in your FASTA files. Then I made a test script
> that gets the subsequences you mentioned using three different methods:
> Bio::SeqIO+Bio::Seq, Bio::DB::Fasta, and your InMemoryFastaAccess. These
> three methods return the same answer, so I see no problem there.
>
> My system is pretty similar to yours:
> Bioperl-live from the BioPerl GitHub master branch from 27/5/11
> Perl 5.12.3
> Linux 2.6.38-2-amd64 (Linux Mint Debian Edition)
>
> Can you run the attached script on the attached FASTA files and see if all
> tests pass?
>
> Thanks,
>
> Florent
>
>
>
>
> On 21/05/11 05:51, Justin Chu wrote:
>
> Hello:
>
> I'm having trouble with Bio::DB::Fasta. It sometimes occurs when I use large
> fasta files and retrieve sequence from a bit past the start of the file. I
> think some characters are being ignored or a rounding error is occurring or
> something when using the offset to retrieve entries from the index file. I
> have attached the Fasta files I have been using, just in case my problem is
> due to improper formatting of my files.
>
> For example:
>
> my $refDB = Bio::DB::Fasta->new('Test2.Fasta');
> my $queryDB = Bio::DB::Fasta->new('Test1.Fasta');
>
> print $refDB->subseq( "gi|294675557|ref|NC_014034.1|", 161067, 161788
> )."\n";
> print $queryDB->subseq( "gi|169245903|gb|EU376363.1|", 1, 722 )."\n";
>
> output:
> GGTAGTCCACGCCGTAAACGATGAATGCCAGTCGTCGGCAG...
> GTAGTCCCGGCCGTAAACGATGGATGCTAGCCGTCGGATAG... > > my $refDB2 = InMemoryFastaAccess->new('Test2.Fasta'); > my $queryDB2 = InMemoryFastaAccess->new('Test1.Fasta'); > > print $refDB2->subseq( "gi|294675557|ref|NC_014034.1|", 161067, 161788 > )."\n"; > print $queryDB2->subseq( "gi|169245903|gb|EU376363.1|", 1, 722 )."\n"; > > I get: > > output: > GTAGTCCACGCCGTAAACGATGAATGCCAGTCGTCGGCA... > GTAGTCCCGGCCGTAAACGATGGATGCTAGCCGTCGGAT... > > Basically, sometimes the sequences retrieved are correct but other times it > is offset slightly by a few base pairs. Interestingly it seems that the > offset problem gets worse as you retrieve sequence chunks further and > further down the sequence. > > print $refDB->subseq( "gi|294675557|ref|NC_014034.1|", 1514858, > 1515579)."\n"; > > output: > CCCTGGTAGTCCACGCCGTAAACGATGAATGCCAGTCGT... > > when it should be: > > print $refDB2->subseq( "gi|294675557|ref|NC_014034.1|", 1514858, > 1515579)."\n"; > > output: > GTAGTCCACGCCGTAAACGATGAATGCCAGTCGTCGGCA... > > This module is still way faster than what I have, so I want to keep using > it. Do you think there something I'm overlooking that could be the problem > or do you see a way to fix this? > > I am currently running: > Bioperl-live from the BioPerl GitHub master branch from 19/5/11 > Perl 5.10.1 > Debian 6.0.1 > > If you need any other information please let me know. > > Thanks, > > Justin Chu > > > > _______________________________________________ > Bioperl-l mailing listBioperl-l at lists.open-bio.orghttp://lists.open-bio.org/mailman/listinfo/bioperl-l > > > From jason.stajich at gmail.com Fri May 27 15:45:41 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Fri, 27 May 2011 12:45:41 -0700 (PDT) Subject: [Bioperl-l] Error calling alignment method In-Reply-To: <4DDD4337.3030107@upvnet.upv.es> References: <4DDD4337.3030107@upvnet.upv.es> Message-ID: Lorenzo - The problem is not that $dna_alignment is not defined - that is not what the error message is saying. 
It is saying you can't call 'alignment' on an undefined value.

This is the line:
  my $kaks_factory->alignment($dna_aln);
Should be
  $ks_factory->alignment($dna_aln);

If you start with the simple code in the perldoc you can see how the steps are intended to be run.
  my $codeml = Bio::Tools::Run::Phylo::PAML::Codeml->new();
  $codeml->alignment($aln);
  my ($rc,$parser) = $codeml->run();
  my $result = $parser->next_result;
  my $MLmatrix = $result->get_MLmatrix();
  print "Ka = ", $MLmatrix->[0]->[1]->{'dN'},"\n";

You can also initialize the Codeml object with the alignment directly:
  my $ks_factory = new Bio::Tools::Run::Phylo::PAML::Codeml(-alignment => $dna_aln,
                                                            -params => .... ); # as you did before

On May 25, 10:58 am, Lorenzo Carretero Paulet wrote:
> Hi all,
> I'm trying to create the following subroutine but I get the error message:
> "Can't call method "alignment" on an undefined value at
> /Users/marioafares/Documents/workspace/PlantEvolGen/test.pl line 80,
> line 2."
> which indicates $dna_aln is undefined. However, I manage to print it
> using Dumper, so I guess it is actually defined.
> Can anyone see where the problem is?
> I attach the two files I'm using in the test.
> Thanks in advance,
> Lorenzo
>
> #!/usr/local/bin/perl -w
> use 5.010;
> use strict;
> use Data::Dumper;
> # definition of the environmental variable CLUSTALDIR
> BEGIN {$ENV{CLUSTALDIR} =
> '/Applications/Bioinformatics/clustalw-2.1-macosx/'}
> use Bio::Tools::Run::Alignment::Clustalw;
> use Bio::Align::Utilities qw(aa_to_dna_aln);
>
> BEGIN {$ENV{PAMLDIR} = '/Applications/Bioinformatics/paml44/bin/'}
> use Bio::Tools::Run::Phylo::PAML::Codeml;
>
> my $sequencesfilenameAA =
> "/Users/marioafares/Documents/SequenceDatabase/plaza_public_02_Apr27/plaza_ public_02/BLAST_Parsed_results/PerSpecies/test_vs_test.par.aa.1.fas";
>
> my $sequencesfilenameNT =
> "/Users/marioafares/Documents/SequenceDatabase/plaza_public_02_Apr27/plaza_ public_02/BLAST_Parsed_results/PerSpecies/test_vs_test.par.nt.1.fas";
>
> my $format = 'fasta';
>
> GettingBioperlAlignmentAAtoDNAplusPAMLcalculation
> ($sequencesfilenameAA, $sequencesfilenameNT, $format);
>
> sub GettingBioperlAlignmentAAtoDNAplusPAMLcalculation
> {
>     my ( $sequencesfilenameAA, $sequencesfilenameNT, $format, $ktuple,
>          $matrix ) = @_;
>     my %hashNTseqs = ();
>     my $likelihood;
>     my $Ks;
>     my $Ka;
>     my $omega;
>     my $factory = Bio::Tools::Run::Alignment::Clustalw->new (); #use default parameters in here
>     my $pep_aln = $factory->align($sequencesfilenameAA);
>
>     my $inseq = Bio::SeqIO->new(-file => "<$sequencesfilenameNT",
>                                 -format => $format );
>     while (my $seq = $inseq->next_seq)
>     {
>         my $seq_id = $seq->display_id();
>         $hashNTseqs{$seq_id} = $seq;
>     }
>     my $dna_aln = aa_to_dna_aln($pep_aln, \%hashNTseqs);
>     say Dumper \%hashNTseqs;
>     say Dumper $dna_aln;
>     ###########################################################
>     #my $codeml_runs = 5;
>     my $ks_factory = new Bio::Tools::Run::Phylo::PAML::Codeml
>         ( -params => {
>               #'verbose' => 0,
>               #'noisy' => 0,
>               'runmode' => -2,
>               'seqtype' => 1,
>               #'model' => 0,
>               #'NSsites' => 0,
>               #'icode' => 0,
>               #'fix_alpha' => 0,
>               #'fix_kappa' => 0,
>               #'RateAncestor' => 0,
>               'CodonFreq' => 2,
>               'cleandata' => 1,
>               'ndata' => 1
>           },
>         );
>
>     my $kaks_factory->alignment($dna_aln);
>
>     say "\nCalculating Ks-values of current cluster ...";
>
>     # The estimation of the Ks-values is repeated $codeml_runs times ...
>     # for(my $k=1;$k <= $codeml_runs;$k++)
>     # {
>     #     print "\nCodeml-Run $k:\n\n";
>
>         # Ka and Ks-values are calculated using codeml
>         my ($rc,$parser) = $kaks_factory->run();
>         #$kaks_factory->cleanup();
>         # If the calculation was successful ...
>     #     if($rc != 0)
>     #     {
>             my $result = $parser->next_result; #not sure what it does here
>             #my $NGmatrix = $result->get_MLmatrix();
>             my $MLmatrix = $result->get_MLmatrix();
>             $likelihood = $MLmatrix->[0]->[1]->{'lnL'};
>             $Ks = $MLmatrix->[0]->[1]->{'dS'};
>             $Ka = $MLmatrix->[0]->[1]->{'dA'};
>             $omega = $MLmatrix->[0]->[1]->{'omega'};
>             print " likelihood = $likelihood, Ka = $Ka, Ks = $Ks, Ka/Ks = $omega\n";
>     #     }
>     #     else # If an error occurred during the Ks-calculation ...
>     #     {
>     #         return (-1);
>     #     }
>     # }
>     return ( $likelihood, $Ks, $Ka, $omega );
> }

From florent.angly at gmail.com Fri May 27 19:33:36 2011
From: florent.angly at gmail.com (Florent Angly)
Date: Sat, 28 May 2011 09:33:36 +1000
Subject: [Bioperl-l] Problems with Bio::DB::Fasta
In-Reply-To:
References: <4DDF12A7.1010402@gmail.com>
Message-ID: <4DE034D0.3080800@gmail.com>

On 28/05/11 05:07, Justin Chu wrote:
> Thanks for your reply. I think something is wrong with my installation
> because I keep getting an error when running your script. I had
> already tried reinstalling with a version on CPAN to make sure my
> problem is not due to missing dependencies, but I still get the
> following error:
>
> Can't locate Test/Exception.pm in @INC (@INC contains: t/lib
> /home/justin/workspace/.metadata/.plugins/org.epic.debug
> /home/justin/workspace/LocalTools/Testing /etc/perl
> /usr/local/lib/perl/5.10.1 /usr/local/share/perl/5.10.1 /usr/lib/perl5
> /usr/share/perl5 /usr/lib/perl/5.10 /usr/share/perl/5.10
> /usr/local/lib/site_perl .) at (eval 46) line 2.
> BEGIN failed--compilation aborted at (eval 46) line 2.
>
> BEGIN failed--compilation aborted at
> /usr/local/share/perl/5.10.1/Bio/Root/Test.pm line 152.
> Compilation failed in require at
> /home/justin/workspace/LocalTools/Testing/test.pl line 6.
> BEGIN failed--compilation aborted at
> /home/justin/workspace/LocalTools/Testing/test.pl line 6.

Hi Justin,
Install the Test::Exception module this way (for Debian-like systems):
  sudo apt-get install libtest-exception-perl
Once it is installed, you should get the error messages on the white lines of your FASTA file when running the script.
If you don't get errors on the white lines, and the script continues happily, then that's very likely the reason why you get the wrong subsequences.
Florent

From florent.angly at gmail.com Mon May 30 17:53:04 2011
From: florent.angly at gmail.com (Florent Angly)
Date: Tue, 31 May 2011 07:53:04 +1000
Subject: [Bioperl-l] Problems with Bio::DB::Fasta
In-Reply-To:
References: <4DDF12A7.1010402@gmail.com> <4DE034D0.3080800@gmail.com>
Message-ID: <4DE411C0.2030508@gmail.com>

Hi Justin,

Please "reply all" so that our emails stay on the BioPerl mailing list.

Weirdness regarding new lines is often indicative of a file that has traveled between different operating systems (which have different ways of representing new lines). You may try to follow these instructions if that's the case:
http://www.cyberciti.biz/faq/howto-unix-linux-convert-dos-newlines-cr-lf-unix-text-format/

Florent

On 31/05/11 04:28, Justin Chu wrote:
> Hi Florent:
>
> It seems that it does not detect the spaces in my files at times for
> some reason and will proceed to run the script with no problem.
> Strangely, empty lines I insert myself seem to be detected in
> Test1.Fasta, but not in Test2.Fasta.
>
> Justin

From locarpau at upvnet.upv.es Tue May 31 07:26:59 2011
From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet)
Date: Tue, 31 May 2011 13:26:59 +0200
Subject: [Bioperl-l] Error calling alignment method
In-Reply-To:
References: <4DDD4337.3030107@upvnet.upv.es>
Message-ID: <4DE4D083.8010808@upvnet.upv.es>

Hi,

The latest version of my script has the typo corrected:

  my $kaks_factory = new Bio::Tools::Run::Phylo::PAML::Codeml
      ( -params => {
            #'verbose' => 0,
            #'noisy' => 0,
            'runmode' => -2,
            'seqtype' => 1,
            #'model' => 0,
            #'NSsites' => 0,
            #'icode' => 0,
            #'fix_alpha' => 0,
            #'fix_kappa' => 0,
            #'RateAncestor' => 0,
            'CodonFreq' => 2,
            'cleandata' => 1,
            'ndata' => 1
        },
      );

  $kaks_factory->alignment($dna_aln);

but it still returns the same error.

Best,
Lorenzo

--
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
Lorenzo Carretero Paulet
Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV)
Integrative Systems Biology Group
C/ Ingeniero Fausto Elio s/n.
46022 Valencia, Spain

Phone: +34 963879934
Fax: +34 963877859
e-mail: locarpau at upvnet.upv.es
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

From lmrodriguezr at gmail.com Tue May 31 12:20:24 2011
From: lmrodriguezr at gmail.com (Luis-Miguel Rodríguez Rojas)
Date: Tue, 31 May 2011 18:20:24 +0200
Subject: [Bioperl-l] Polloc: Naming question
Message-ID:

Dear all,

I am currently working on a Perl library aimed at simplifying the development of applications in Molecular Typing. It's basically a library to handle polymorphic loci (identify, group and predict typing results). I think it's not generic enough to be part of BioPerl (and doesn't follow the BioPerl guidelines), although it makes heavy use of the BioPerl libraries. However, I would be glad to hear some comments from the BioPerl community about the namespace it should take.

I already developed a script using this library, and it's certainly useful, so I would like to make it available from CPAN. It's currently called Polloc (for *Pol*ymorphic *loc*i), but a top-level name is probably a bad idea. Would it be ok if I use a Bio::* namespace? I was thinking of Bio::PolLoc or Bio::Typing::Polloc.

I would like to preserve the 'Polloc' name, because I feel a more generic one would give the wrong idea about how generic it is.
For example, I would expect Bio::Polymorphism or Bio::MolecularTyping to be far more abstract than my library (which is aimed to the usage of specific tools, and the analysis of specific loci). Thanks! LRR. -- Luis M. Rodriguez-R [ http://thebio.me/lrr ] --------------------------------- UMR R?sistance des Plantes aux Bioagresseurs - Group effecteur/cible Institut de Recherche pour le D?veloppement, Montpellier, France [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ] +33 (0) 6.29.74.55.93 Unidad de Bioinform?tica del Laboratorio de Micolog?a y Fitopatolog?a Universidad de Los Andes, Bogot?, Colombia [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ] +57 (1) 3.39.49.49 ext 2777 From jason.stajich at gmail.com Tue May 31 12:31:26 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Tue, 31 May 2011 09:31:26 -0700 Subject: [Bioperl-l] Error calling alignment method In-Reply-To: <4DE4D083.8010808@upvnet.upv.es> References: <4DDD4337.3030107@upvnet.upv.es> <4DE4D083.8010808@upvnet.upv.es> Message-ID: Please provide exactly the script you are using or an exact simplified script that causes the error - we can't reproduce your problem without. On May 31, 2011, at 4:26 AM, Lorenzo Carretero Paulet wrote: > Hi, > > The last versions of my script had the typo corrected: > > my $kaks_factory = new Bio::Tools::Run::Phylo::PAML::Codeml > > ( -params => { > > #'verbose' => 0, > > #'noisy' => 0, > > 'runmode' => -2, > > 'seqtype' => 1, > > #'model' => 0, > > #'NSsites' => 0, > > #'icode' => 0, > > #'fix_alpha' => 0, > > #'fix_kappa' => 0, > > #'RateAncestor' => 0, > > 'CodonFreq' => 2, > > 'cleandata' => 1, > > 'ndata' => 1 > > }, > > > ); > > > $kaks_factory->alignment($dna_aln); > > > but still returns the same error. > Best, > Lorenzo > > > > > El 27/05/11 21:45, Jason Stajich escribi?: > >> Lorenzo - >> >> The problem is not that $dna_alignment is not defined - that is not >> what the error message is saying. 
>> It is saying you can't call 'alignment' on an undefined value >> >> This is the line: >> my $kaks_factory->alignment($dna_aln); >> Should be >> $ks_factory->alignment($dna_aln); >> >> If you star with the simple code in the perldoc you can see how the >> steps are intended to be run. >> my $codeml = Bio::Tools::Run::Phylo::PAML::Codeml->new(); >> $codeml->alignment($aln); >> my ($rc,$parser) = $codeml->run(); >> my $result = $parser->next_result; >> my $MLmatrix = $result->get_MLmatrix(); >> print "Ka = ", $MLmatrix->[0]->[1]->{'dN'},"\n"; >> >> You can also initialize the Codeml object with the alignment directly >> my $ks_factory = new Bio::Tools::Run::Phylo::PAML::Codeml(-alignment >> => $dna_aln, >> -params => .... ); # as you did before >> >> >> On May 25, 10:58 am, Lorenzo Carretero Paulet >> wrote: >>> Hi all, >>> I'm trying to create the following subroutine but I get the error message: >>> "Can't call method "alignment" on an undefined value at >>> /Users/marioafares/Documents/workspace/PlantEvolGen/test.pl line 80, >>> line 2." >>> which indicates $dna_aln is undefined. However, I manage to print it >>> using Dumper, so I guess it is actually defined. >>> Can anyone see where the problem is? >>> I attach the two files I'm using in the test. 
>>> Thanks in advance, >>> Lorenzo >>> >>> #!/usr/local/bin/perl -w >>> use 5.010; >>> use strict; >>> use Data::Dumper; >>> # definition of the environmental variable CLUSTALDIR >>> BEGIN {$ENV{CLUSTALDIR} = >>> '/Applications/Bioinformatics/clustalw-2.1-macosx/'} >>> use Bio::Tools::Run::Alignment::Clustalw; >>> use Bio::Align::Utilities qw(aa_to_dna_aln); >>> >>> BEGIN {$ENV{PAMLDIR} = '/Applications/Bioinformatics/paml44/bin/'} >>> use Bio::Tools::Run::Phylo::PAML::Codeml; >>> >>> my $sequencesfilenameAA = >>> "/Users/marioafares/Documents/SequenceDatabase/plaza_public_02_Apr27/plaza_ public_02/BLAST_Parsed_results/PerSpecies/test_vs_test.par.aa.1.fas"; >>> >>> my $sequencesfilenameNT = >>> "/Users/marioafares/Documents/SequenceDatabase/plaza_public_02_Apr27/plaza_ public_02/BLAST_Parsed_results/PerSpecies/test_vs_test.par.nt.1.fas"; >>> >>> my $format = 'fasta'; >>> >>> GettingBioperlAlignmentAAtoDNAplusPAMLcalculation >>> ($sequencesfilenameAA, $sequencesfilenameNT, $format); >>> >>> sub GettingBioperlAlignmentAAtoDNAplusPAMLcalculation >>> { >>> my ( $sequencesfilenameAA, $sequencesfilenameNT, $format, $ktuple, >>> $matrix ) = @_; >>> my %hashNTseqs = (); >>> my $likelihood; >>> my $Ks; >>> my $Ka; >>> my $omega; >>> my $factory = Bio::Tools::Run::Alignment::Clustalw->new (); #use >>> default parameters in here >>> my $pep_aln = $factory->align($sequencesfilenameAA); >>> >>> my $inseq = Bio::SeqIO->new(-file => "<$sequencesfilenameNT", >>> -format => $format ); >>> while (my $seq = $inseq->next_seq) >>> { >>> my $seq_id = $seq->display_id(); >>> $hashNTseqs{$seq_id} = $seq; >>> } >>> my $dna_aln = aa_to_dna_aln($pep_aln, \%hashNTseqs); >>> say Dumper \%hashNTseqs; >>> say Dumper $dna_aln; >>> ########################################################### >>> #my $codeml_runs = 5; >>> my $ks_factory = new Bio::Tools::Run::Phylo::PAML::Codeml >>> ( -params => { >>> #'verbose' => 0, >>> #'noisy' => 0, >>> 'runmode' => -2, >>> 'seqtype' => 1, >>> #'model' => 0, >>> 
#'NSsites' => 0, >>> #'icode' => 0, >>> #'fix_alpha' => 0, >>> #'fix_kappa' => 0, >>> #'RateAncestor' => 0, >>> 'CodonFreq' => 2, >>> 'cleandata' => 1, >>> 'ndata' => 1 >>> }, >>> >>> ); >>> >>> my $kaks_factory->alignment($dna_aln); >>> >>> say "\nCalculating Ks-values of current cluster ..."; >>> >>> # The estimation of the Ks-values is repeated $codeml_runs >>> times ... >>> # for(my $k=1;$k<= $codeml_runs;$k++) >>> # { >>> # print "\nCodeml-Run $k:\n\n"; >>> >>> # Ka and Ks-vlaues are calculated using codeml >>> my ($rc,$parser) = $kaks_factory->run(); >>> #$kaks_factory->cleanup(); >>> # If the calculation was succsessful ... >>> # if($rc != 0) >>> # { >>> my $result = $parser->next_result;#not sure what it >>> does here >>> #my $NGmatrix = $result->get_MLmatrix(); >>> my $MLmatrix = $result->get_MLmatrix(); >>> $likelihood = $MLmatrix->[0]->[1]->{'lnL'}; >>> $Ks = $MLmatrix->[0]->[1]->{'dS'}; >>> $Ka = $MLmatrix->[0]->[1]->{'dA'}; >>> $omega = $MLmatrix->[0]->[1]->{'omega'}; >>> print " likelihood = $likelihood, Ka = $Ka, Ks = >>> $Ks, Ka/Ks = $omega\n"; >>> # } >>> # else # If an error occured during the Ks-calculation ... >>> # { >>> # return (-1); >>> # } >>> # } >>> return ( $likelihood, $Ks, $Ka, $omega ); >>> } >>> >>> -- >>> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >>> Lorenzo Carretero Paulet >>> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) >>> Integrative Systems Biology Group >>> C/ Ingeniero Fausto Elio s/n. >>> 46022 Valencia, Spain >>> >>> Phone: +34 963879934 >>> Fax: +34 963877859 >>> e-mail:locar... at upvnet.upv.es >>> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >>> >>> test_vs_test.par.nt.1.fas >>> 3KViewDownload >>> >>> test_vs_test.par.aa.1.fas >>> 1KViewDownload >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioper... 
at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > -- > *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* > Lorenzo Carretero Paulet > Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) > Integrative Systems Biology Group > C/ Ingeniero Fausto Elio s/n. > 46022 Valencia, Spain > > Phone: +34 963879934 > Fax: +34 963877859 > e-mail: locarpau at upvnet.upv.es > *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From roy.chaudhuri at gmail.com Tue May 31 13:28:09 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Tue, 31 May 2011 18:28:09 +0100 Subject: [Bioperl-l] Error calling alignment method In-Reply-To: <4DDD8155.5060002@upvnet.upv.es> References: <4DDD4337.3030107@upvnet.upv.es> <4DDD51CD.90208@upvnet.upv.es> <4DDD8155.5060002@upvnet.upv.es> Message-ID: <4DE52529.2020403@gmail.com> Hi Lorenzo, I tried your code (the one you attached as testa.pl), and the only errors that were reported were uninitialized values $Ka at lines 90 and 150 when you print the output. This is because of typos in your script, you have "dA" instead of "dN" (PAML uses the terms "dN" and "dS" for Ka and Ks, respectively). I can only think that the problem you are experiencing is because of some change to the PAML output format (although it worked fine for me with a just-downloaded PAML4.4 and an older PAML4). From what I recall, PAML always did have quite volatile output formats. Older versions of PAML are archived, so you could try downgrading: http://abacus.gene.ucl.ac.uk/software/pamlOld.html Cheers, Roy.
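To make the key-name point concrete, here is a minimal sketch that does not need PAML or BioPerl. It assumes the structure returned by get_MLmatrix() is an array of arrays of hashes, as the script in this thread uses it; the matrix below is a hand-built stand-in with invented numbers, and the helper name extract_kaks is hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical stand-in for what $result->get_MLmatrix() hands back:
# an array of arrays of hashes, one hash per sequence pair.
# All values are invented purely for illustration.
my $MLmatrix = [
    [ undef, { lnL => -1234.5, dN => 0.12, dS => 0.48, omega => 0.25 } ],
    [ undef, undef ],
];

# Hypothetical helper: return the pairwise estimates for sequences $i and $j.
sub extract_kaks {
    my ( $matrix, $i, $j ) = @_;
    my $cell = $matrix->[$i]->[$j]
        or die "No estimate stored for pair ($i, $j)\n";
    # Ka is stored under 'dN' and Ks under 'dS'. A key such as 'dA' is not
    # present, so looking it up yields undef and printing it triggers an
    # "uninitialized value" warning.
    return @{$cell}{qw(lnL dN dS omega)};
}

my ( $likelihood, $Ka, $Ks, $omega ) = extract_kaks( $MLmatrix, 0, 1 );
print "likelihood = $likelihood, Ka = $Ka, Ks = $Ks, Ka/Ks = $omega\n";
```

In the real script the only change needed for this particular warning should be replacing {'dA'} with {'dN'} on the line that fills $Ka.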
On 25/05/2011 23:23, Lorenzo Carretero wrote: > Dave, Jason: > > I had already tried running PAML manually with the alignment (I always > do this to confirm software is properly installed and set up), and ran > again with an edited version of the alignment removing the stop codons > (I didn't know stop codons at the ends of the alignmente could affect > PAML, but inframe stop codons). It worked properly in both cases. I ran > again my script (see attached testa.pl) using two different methods, one > constructing the codon alignment using aa_to_dna_aln and another one > passing the aligned sequences (in both cases after removing the stop > codons). I had again the message: > > ------------- EXCEPTION: Bio::Root::NotImplemented ------------- > MSG: Unknown format of PAML output did not see seqtype > STACK: Error::throw > STACK: Bio::Root::Root::throw /Library/Perl/5.10.0/Bio/Root/Root.pm:368 > STACK: Bio::Tools::Phylo::PAML::_parse_summary > /Library/Perl/5.10.0/Bio/Tools/Phylo/PAML.pm:461 > STACK: Bio::Tools::Phylo::PAML::next_result > /Library/Perl/5.10.0/Bio/Tools/Phylo/PAML.pm:270 > STACK: main::GettingBioperlAlignmentAAtoDNAplusPAMLcalculation > /Users/Lorenzo/Documents/workspace/PlantEvolGen/testa.pl:83 > STACK: /Users/Lorenzo/Documents/workspace/PlantEvolGen/testa.pl:23 > ---------------------------------------------------------------- > > Thanks, > Lorenzo > > On 5/25/11 10:24 PM, Jason Stajich wrote: >>> ------------------------------------------ >>> >>> I think the codon alignment is being proberly constructed by the method aa_to_dna_aln, as I can do a Dumper printing of it. So the problem must be in the PAML codeml wrapper not properly recognizing the codon alignment. Could it be related to the alignment format (PAML runs on PHYLIP formatted files)? >> The writing out in phylip format is taking care of by the factory - you are passing in an alignment object so that is not typically the problem. 
>> >> I would repeat Dave's idea that you just dump the codon alignment file out and you run PAML manually with it. The parsing error sounds like there are problems when running PAML and you may want to check that you don't have stop codons in your alignment. It looks like your CDS file has stops as the last codon so if you drop those last 3 bases, how does it work? >> >>> Cheers, >>> Lorenzo >>> >>> >>> >>> -- >>> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >>> Lorenzo Carretero Paulet >>> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) >>> Integrative Systems Biology Group >>> C/ Ingeniero Fausto Elio s/n. >>> 46022 Valencia, Spain >>> >>> Phone: +34 963879934 >>> Fax: +34 963877859 >>> e-mail: locarpau at upvnet.upv.es >>> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From rmb32 at cornell.edu Tue May 31 14:19:35 2011 From: rmb32 at cornell.edu (Robert Buels) Date: Tue, 31 May 2011 11:19:35 -0700 Subject: [Bioperl-l] Polloc: Naming question In-Reply-To: References: Message-ID: <4DE53137.9040800@cornell.edu> Hi Luis, Yes, I think either Bio::Polloc or Bio::Typing::Polloc would be fine. BioPerl doesn't have exclusive rights to the Bio::* namespace. :-) Rob On 05/31/2011 09:20 AM, Luis-Miguel Rodríguez Rojas wrote: > Dear all, > > I am currently working in a Perl library aimed to simplify the development > of applications in Molecular Typing.
It's basically a library to handle > polymorphic loci (identify, group and predict typing results). I think it's > not generic enough to be part of bioperl (and doesn't follow the directives > of bioperl) despite it largely uses the bioperl libraries. However, I would > be glad to hear some comments about the namespace it should take from the > bioperl community. > > I already developed a script using this library, and it's certainly useful, > so I would like to make it available from CPAN. It's currently called > Polloc (for *Pol*ymorphic *loc*i), but a top-level name is probably a bad > idea. Would it be ok if I use a Bio::* namespace? I was thinking in > Bio::PolLoc or Bio::Typing::Polloc. I would like to preserve the 'Polloc' > name, because I feel a more generic one would give the wrong idea about how > generic it is. For example, I would expect Bio::Polymorphism or > Bio::MolecularTyping to be far more abstract than my library (which is aimed > to the usage of specific tools, and the analysis of specific loci). > > > Thanks! > LRR. > > -- > Luis M. 
Rodriguez-R > [ http://thebio.me/lrr ] > --------------------------------- > UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible > Institut de Recherche pour le Développement, Montpellier, France > [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ] > +33 (0) 6.29.74.55.93 > > Unidad de Bioinformática del Laboratorio de Micología y Fitopatología > Universidad de Los Andes, Bogotá, Colombia > [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ] > +57 (1) 3.39.49.49 ext 2777 > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From lmrodriguezr at gmail.com Tue May 31 14:52:28 2011 From: lmrodriguezr at gmail.com (Luis-Miguel Rodríguez Rojas) Date: Tue, 31 May 2011 20:52:28 +0200 Subject: [Bioperl-l] Polloc: Naming question In-Reply-To: <4DE53137.9040800@cornell.edu> References: <4DE53137.9040800@cornell.edu> Message-ID: Hello Rob, Thanks. I think it's much better to ask the most experienced before these things. On the other hand, I feel much better having a blessing from bioperl (even if it's not strictly necessary) ;) -- Luis M. Rodriguez-R [ http://thebio.me/lrr ] --------------------------------- UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible Institut de Recherche pour le Développement, Montpellier, France [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ] +33 (0) 6.29.74.55.93 Unidad de Bioinformática del Laboratorio de Micología y Fitopatología Universidad de Los Andes, Bogotá, Colombia [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ] +57 (1) 3.39.49.49 ext 2777 2011/5/31 Robert Buels > Hi Luis, > > Yes, I think either Bio::Polloc or Bio::Typing::Polloc would be fine. > > BioPerl doesn't have exclusive rights to the Bio::* namespace.
:-) > > Rob > > > On 05/31/2011 09:20 AM, Luis-Miguel Rodríguez Rojas wrote: > >> Dear all, >> >> I am currently working in a Perl library aimed to simplify the development >> of applications in Molecular Typing. It's basically a library to handle >> polymorphic loci (identify, group and predict typing results). I think >> it's >> not generic enough to be part of bioperl (and doesn't follow the >> directives >> of bioperl) despite it largely uses the bioperl libraries. However, I >> would >> be glad to hear some comments about the namespace it should take from the >> bioperl community. >> >> I already developed a script using this library, and it's certainly >> useful, >> so I would like to make it available from CPAN. It's currently called >> Polloc (for *Pol*ymorphic *loc*i), but a top-level name is probably a bad >> >> idea. Would it be ok if I use a Bio::* namespace? I was thinking in >> Bio::PolLoc or Bio::Typing::Polloc. I would like to preserve the 'Polloc' >> name, because I feel a more generic one would give the wrong idea about >> how >> generic it is. For example, I would expect Bio::Polymorphism or >> Bio::MolecularTyping to be far more abstract than my library (which is >> aimed >> to the usage of specific tools, and the analysis of specific loci). >> >> >> Thanks! >> LRR. >> >> -- >> Luis M.
Rodriguez-R >> [ http://thebio.me/lrr ] >> --------------------------------- >> UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible >> Institut de Recherche pour le Développement, Montpellier, France >> [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ] >> +33 (0) 6.29.74.55.93 >> >> Unidad de Bioinformática del Laboratorio de Micología y Fitopatología >> Universidad de Los Andes, Bogotá, Colombia >> [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ] >> +57 (1) 3.39.49.49 ext 2777 >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > From cjfields at illinois.edu Tue May 31 15:03:09 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 31 May 2011 14:03:09 -0500 Subject: [Bioperl-l] Polloc: Naming question In-Reply-To: References: <4DE53137.9040800@cornell.edu> Message-ID: <1CB02258-84A4-44B9-A132-A1C5E146708C@illinois.edu> Agreed with Rob on: Bio::Polloc (and the genericity of the Bio::* namespace). Planning on submitting it to CPAN? chris On May 31, 2011, at 1:52 PM, Luis-Miguel Rodríguez Rojas wrote: > Hello Rob, > > Thanks. I think it's much better to ask the most experienced before these > things. On the other hand, I feel much better having a blessing from > bioperl (even if it's not strictly necessary) ;) > > > -- > Luis M.
Rodriguez-R > [ http://thebio.me/lrr ] > --------------------------------- > UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible > Institut de Recherche pour le Développement, Montpellier, France > [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ] > +33 (0) 6.29.74.55.93 > > Unidad de Bioinformática del Laboratorio de Micología y Fitopatología > Universidad de Los Andes, Bogotá, Colombia > [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ] > +57 (1) 3.39.49.49 ext 2777 > > > > 2011/5/31 Robert Buels > >> Hi Luis, >> >> Yes, I think either Bio::Polloc or Bio::Typing::Polloc would be fine. >> >> BioPerl doesn't have exclusive rights to the Bio::* namespace. :-) >> >> Rob >> >> >> On 05/31/2011 09:20 AM, Luis-Miguel Rodríguez Rojas wrote: >> >>> Dear all, >>> >>> I am currently working in a Perl library aimed to simplify the development >>> of applications in Molecular Typing. It's basically a library to handle >>> polymorphic loci (identify, group and predict typing results). I think >>> it's >>> not generic enough to be part of bioperl (and doesn't follow the >>> directives >>> of bioperl) despite it largely uses the bioperl libraries. However, I >>> would >>> be glad to hear some comments about the namespace it should take from the >>> bioperl community. >>> >>> I already developed a script using this library, and it's certainly >>> useful, >>> so I would like to make it available from CPAN. It's currently called >>> Polloc (for *Pol*ymorphic *loc*i), but a top-level name is probably a bad >>> >>> idea. Would it be ok if I use a Bio::* namespace? I was thinking in >>> Bio::PolLoc or Bio::Typing::Polloc. I would like to preserve the 'Polloc' >>> name, because I feel a more generic one would give the wrong idea about >>> how >>> generic it is.
For example, I would expect Bio::Polymorphism or >>> Bio::MolecularTyping to be far more abstract than my library (which is >>> aimed >>> to the usage of specific tools, and the analysis of specific loci). >>> >>> >>> Thanks! >>> LRR. >>> >>> -- >>> Luis M. Rodriguez-R >>> [ http://thebio.me/lrr ] >>> --------------------------------- >>> UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible >>> Institut de Recherche pour le Développement, Montpellier, France >>> [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ] >>> +33 (0) 6.29.74.55.93 >>> >>> Unidad de Bioinformática del Laboratorio de Micología y Fitopatología >>> Universidad de Los Andes, Bogotá, Colombia >>> [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ] >>> +57 (1) 3.39.49.49 ext 2777 >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From tejaminnu at gmail.com Sun May 29 23:12:12 2011 From: tejaminnu at gmail.com (sukeerthi teja Rallapalli) Date: Sun, 29 May 2011 23:12:12 -0400 Subject: [Bioperl-l] Reading protein from file - error Message-ID: Hi, I have been trying to execute this program and i am getting an error. Kindly help me. #!usr/bin/perl -w # The filename of the file which contains the protein sequence $proteinfilename = '/users/sukeerthiteja/desktop/perl/NM_021964fragment.pep'; # First open the file and associate a filehandle with it. for readability purpose lets use the filehandle PROTEINFILE. open(PROTEINFILE, $proteinfilename); # Now we do the actual reading of the protein sequence data from the file,by using < > to get # input from the filehandle.
We store the data into variable $protein $protein = <PROTEINFILE>; #now that we got our data we can close the file close PROTEINFILE; #Print the protein onto the screen print "Here is the protein:\n\n"; print $protein; exit; ERROR: sukeerthi-tejas-macbook-pro:desktop sukeerthiteja$ perl 1.pl Here is the protein: {\rtf1\ansi\ansicpg1252\cocoartf1038\cocoasubrtf350 I am not able to print the protein content on the screen. Thank you Regards Teja From locarpau at upvnet.upv.es Tue May 31 15:04:05 2011 From: locarpau at upvnet.upv.es (Lorenzo Carretero Paulet) Date: Tue, 31 May 2011 21:04:05 +0200 Subject: [Bioperl-l] Error calling alignment method In-Reply-To: <4DE52529.2020403@gmail.com> References: <4DDD4337.3030107@upvnet.upv.es> <4DDD51CD.90208@upvnet.upv.es> <4DDD8155.5060002@upvnet.upv.es> <4DE52529.2020403@gmail.com> Message-ID: <4DE53BA5.1020803@upvnet.upv.es> Thanks for the reply. I am attaching again the revised version of the script I'm working with, as well as the sequences. I still get the same error. The funny thing is that the yn00 subroutine runs properly when the alignment is passed directly, but the codeml one does not (using either PAML 4.2 or 4.4). However, both PAML versions and programs work fine with my data when I run them manually. Any suggestions will be much appreciated. Cheers, Lorenzo On 31/05/11 19:28, Roy Chaudhuri wrote: > Hi Lorenzo, > > I tried your code (the one you attached as testa.pl), and the only > errors that were reported were uninitialized values $Ka at lines 90 > and 150 when you print the output. This is because of typos in your > script, you have "dA" instead of "dN" (PAML uses the terms "dN" and > "dS" for Ka and Ks, respectively). > > I can only think that the problem you are experiencing is because of > some change to the PAML output format (although it worked fine for me > with a just-downloaded PAML4.4 and an older PAML4). From what I > recall, PAML always did have quite volatile output formats.
Older > versions of PAML are archived, so you could try downgrading: > http://abacus.gene.ucl.ac.uk/software/pamlOld.html > > Cheers, > Roy. > > On 25/05/2011 23:23, Lorenzo Carretero wrote: >> Dave, Jason: >> >> I had already tried running PAML manually with the alignment (I always >> do this to confirm software is properly installed and set up), and ran >> again with an edited version of the alignment removing the stop codons >> (I didn't know stop codons at the ends of the alignmente could affect >> PAML, but inframe stop codons). It worked properly in both cases. I ran >> again my script (see attached testa.pl) using two different methods, one >> constructing the codon alignment using aa_to_dna_aln and another one >> passing the aligned sequences (in both cases after removing the stop >> codons). I had again the message: >> >> ------------- EXCEPTION: Bio::Root::NotImplemented ------------- >> MSG: Unknown format of PAML output did not see seqtype >> STACK: Error::throw >> STACK: Bio::Root::Root::throw /Library/Perl/5.10.0/Bio/Root/Root.pm:368 >> STACK: Bio::Tools::Phylo::PAML::_parse_summary >> /Library/Perl/5.10.0/Bio/Tools/Phylo/PAML.pm:461 >> STACK: Bio::Tools::Phylo::PAML::next_result >> /Library/Perl/5.10.0/Bio/Tools/Phylo/PAML.pm:270 >> STACK: main::GettingBioperlAlignmentAAtoDNAplusPAMLcalculation >> /Users/Lorenzo/Documents/workspace/PlantEvolGen/testa.pl:83 >> STACK: /Users/Lorenzo/Documents/workspace/PlantEvolGen/testa.pl:23 >> ---------------------------------------------------------------- >> >> Thanks, >> Lorenzo >> >> On 5/25/11 10:24 PM, Jason Stajich wrote: >>>> ------------------------------------------ >>>> >>>> I think the codon alignment is being proberly constructed by the >>>> method aa_to_dna_aln, as I can do a Dumper printing of it. So the >>>> problem must be in the PAML codeml wrapper not properly recognizing >>>> the codon alignment. Could it be related to the alignment format >>>> (PAML runs on PHYLIP formatted files)? 
>>> The writing out in phylip format is taking care of by the factory - >>> you are passing in an alignment object so that is not typically the >>> problem. >>> >>> I would repeat Dave's idea that you just dump the codon alignment >>> file out and you run PAML manually with it. The parsing error >>> sounds like there are problems when running PAML and you may want to >>> check that you don't have stop codons in your alignment. It looks >>> like your CDS file has stops as the last codon so if you drop those >>> last 3 bases, how does it work? >>> >>>> Cheers, >>>> Lorenzo >>>> >>>> >>>> >>>> -- >>>> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >>>> >>>> Lorenzo Carretero Paulet >>>> Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) >>>> Integrative Systems Biology Group >>>> C/ Ingeniero Fausto Elio s/n. >>>> 46022 Valencia, Spain >>>> >>>> Phone: +34 963879934 >>>> Fax: +34 963877859 >>>> e-mail: locarpau at upvnet.upv.es >>>> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* >>>> >>>> >>>> _______________________________________________ >>>> >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* Lorenzo Carretero Paulet Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV) Integrative Systems Biology Group C/ Ingeniero 
Fausto Elio s/n. 46022 Valencia, Spain Phone: +34 963879934 Fax: +34 963877859 e-mail: locarpau at upvnet.upv.es *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* -------------- next part -------------- A non-text attachment was scrubbed... Name: testa.pl Type: text/x-perl-script Size: 5725 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Alyrata_vs_Alyrata.par.cds.aln.fas URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test_vs_test.par.nt.1.fas URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test_vs_test.par.aa.1.fas URL: From bosborne11 at verizon.net Tue May 31 16:13:39 2011 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 31 May 2011 16:13:39 -0400 Subject: [Bioperl-l] Reading protein from file - error In-Reply-To: References: Message-ID: <03F8C092-4955-4BE0-B435-3BB787BA31EA@verizon.net> Teja, I will be frank: this is not the right way to read sequence data from a file, if you intend to use Perl. Please take a look at this documentation, it will get you started. http://www.bioperl.org/w/index.php?title=HOWTO:Beginners You also want to take a look at your NM_021964fragment.pep file - is it in RTF format? Brian O. On May 29, 2011, at 11:12 PM, sukeerthi teja Rallapalli wrote: > Hi, > > I have been trying to execute this program and i am getting an error. Kindly > help me. > > #!usr/bin/perl -w > > # The filename of the file which contains the protein sequence > $proteinfilename = > '/users/sukeerthiteja/desktop/perl/NM_021964fragment.pep'; > > # First open the file and associate a filehandle with it. for readability > purpose lets use the filehandle PROTEINFILE. 
> open(PROTEINFILE, $proteinfilename); > > # Now we do the actual reading of the protein sequence data from the file,by > using < > to get > # input from the filehandle. We store the data into variable $protein > $protein = <PROTEINFILE>; > > #now that we got our data we can close the file > close PROTEINFILE; > > #Print the protein onto the screen > print "Here is the protein:\n\n"; > > print $protein; > > exit; > > > > ERROR: > > sukeerthi-tejas-macbook-pro:desktop sukeerthiteja$ perl 1.pl > Here is the protein: > > {\rtf1\ansi\ansicpg1252\cocoartf1038\cocoasubrtf350 > > > I am not able to print the protein content on the screen. > > Thank you > Regards > Teja > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From lmrodriguezr at gmail.com Tue May 31 17:13:50 2011 From: lmrodriguezr at gmail.com (Luis-Miguel Rodríguez Rojas) Date: Tue, 31 May 2011 23:13:50 +0200 Subject: [Bioperl-l] Polloc: Naming question In-Reply-To: <1CB02258-84A4-44B9-A132-A1C5E146708C@illinois.edu> References: <4DE53137.9040800@cornell.edu> <1CB02258-84A4-44B9-A132-A1C5E146708C@illinois.edu> Message-ID: Great. Yes, that's the plan. Methods are documented, tests written and packaging with MakeMaker finished, but I still need some code cleaning (for a more stable API) and additional package-level documentation. Probably to be released within a month or so. -- Luis M.
Rodriguez-R [ http://thebio.me/lrr ] --------------------------------- UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible Institut de Recherche pour le Développement, Montpellier, France [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ] +33 (0) 6.29.74.55.93 Unidad de Bioinformática del Laboratorio de Micología y Fitopatología Universidad de Los Andes, Bogotá, Colombia [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ] +57 (1) 3.39.49.49 ext 2777 2011/5/31 Chris Fields > Agreed with Rob on: Bio::Polloc (and the genericity of the Bio::* > namespace). Planning on submitting it to CPAN? > > chris > > On May 31, 2011, at 1:52 PM, Luis-Miguel Rodríguez Rojas wrote: > > > Hello Rob, > > > > Thanks. I think it's much better to ask the most experienced before > these > > things. On the other hand, I feel much better having a blessing from > > bioperl (even if it's not strictly necessary) ;) > > > > > > -- > > Luis M. Rodriguez-R > > [ http://thebio.me/lrr ] > > --------------------------------- > > UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible > > Institut de Recherche pour le Développement, Montpellier, France > > [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ] > > +33 (0) 6.29.74.55.93 > > > > Unidad de Bioinformática del Laboratorio de Micología y Fitopatología > > Universidad de Los Andes, Bogotá, Colombia > > [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ] > > +57 (1) 3.39.49.49 ext 2777 > > > > > > > > 2011/5/31 Robert Buels > > > >> Hi Luis, > >> > >> Yes, I think either Bio::Polloc or Bio::Typing::Polloc would be fine. > >> > >> BioPerl doesn't have exclusive rights to the Bio::* namespace. :-) > >> > >> Rob > >> > >> > >> On 05/31/2011 09:20 AM, Luis-Miguel Rodríguez Rojas wrote: > >> > >>> Dear all, > >>> > >>> I am currently working in a Perl library aimed to simplify the > development > >>> of applications in Molecular Typing.
It's basically a library to > handle > >>> polymorphic loci (identify, group and predict typing results). I think > >>> it's > >>> not generic enough to be part of bioperl (and doesn't follow the > >>> directives > >>> of bioperl) despite it largely uses the bioperl libraries. However, I > >>> would > >>> be glad to hear some comments about the namespace it should take from > the > >>> bioperl community. > >>> > >>> I already developed a script using this library, and it's certainly > >>> useful, > >>> so I would like to make it available from CPAN. It's currently called > >>> Polloc (for *Pol*ymorphic *loc*i), but a top-level name is probably a > bad > >>> > >>> idea. Would it be ok if I use a Bio::* namespace? I was thinking in > >>> Bio::PolLoc or Bio::Typing::Polloc. I would like to preserve the > 'Polloc' > >>> name, because I feel a more generic one would give the wrong idea about > >>> how > >>> generic it is. For example, I would expect Bio::Polymorphism or > >>> Bio::MolecularTyping to be far more abstract than my library (which is > >>> aimed > >>> to the usage of specific tools, and the analysis of specific loci). > >>> > >>> > >>> Thanks! > >>> LRR. > >>> > >>> -- > >>> Luis M. 
Rodriguez-R > >>> [ http://thebio.me/lrr ] > >>> --------------------------------- > >>> UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible > >>> Institut de Recherche pour le Développement, Montpellier, France > >>> [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr] > >>> +33 (0) 6.29.74.55.93 > >>> > >>> Unidad de Bioinformática del Laboratorio de Micología y Fitopatología > >>> Universidad de Los Andes, Bogotá, Colombia > >>> [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ] > >>> +57 (1) 3.39.49.49 ext 2777 > >>> > >>> _______________________________________________ > >>> Bioperl-l mailing list > >>> Bioperl-l at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>> > >>> > >> > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From florent.angly at gmail.com Tue May 31 18:03:39 2011 From: florent.angly at gmail.com (Florent Angly) Date: Wed, 01 Jun 2011 08:03:39 +1000 Subject: [Bioperl-l] Reading protein from file - error In-Reply-To: References: Message-ID: <4DE565BB.1010607@gmail.com> Hi Teja, The code you wrote does not use Bioperl. If you want to use Bioperl, have a look here: http://www.bioperl.org/wiki/HOWTOs As far as your script is concerned, I included some changes that should make it work: > #! /usr/bin/env perl > > use strict; > use warnings; > > # The filename of the file which contains the protein sequence > my $proteinfilename = > '/users/sukeerthiteja/desktop/perl/NM_021964fragment.pep'; > > # First open the file and associate a filehandle with it. for readability > purpose lets use the filehandle PROTEINFILE. > open(PROTEINFILE, $proteinfilename) or die "Error: could not open file '$proteinfilename'\n$!\n"; > > # Now we do the actual reading of the protein sequence data from the file,by > using < > to get > # input from the filehandle.
We store the data into variable @protein. > # This method is very inefficient for large files. > my @protein = <PROTEINFILE>; > > #now that we got our data we can close the file > close PROTEINFILE; > > #Print the protein onto the screen > print "Here is the protein:\n\n"; > > print "@protein\n"; > > exit; I encourage you to research what all these changes mean and to read a book or two to get more familiar with Perl: http://oreilly.com/catalog/9780596001322 or http://oreilly.com/catalog/9780596000806 Florent On 30/05/11 13:12, sukeerthi teja Rallapalli wrote: > Hi, > > I have been trying to execute this program and i am getting an error. Kindly > help me. > > #!usr/bin/perl -w > > # The filename of the file which contains the protein sequence > $proteinfilename = > '/users/sukeerthiteja/desktop/perl/NM_021964fragment.pep'; > > # First open the file and associate a filehandle with it. for readability > purpose lets use the filehandle PROTEINFILE. > open(PROTEINFILE, $proteinfilename); > > # Now we do the actual reading of the protein sequence data from the file,by > using < > to get > # input from the filehandle. We store the data into variable $protein > $protein = <PROTEINFILE>; > > #now that we got our data we can close the file > close PROTEINFILE; > > #Print the protein onto the screen > print "Here is the protein:\n\n"; > > print $protein; > > exit; > > > > ERROR: > > sukeerthi-tejas-macbook-pro:desktop sukeerthiteja$ perl 1.pl > Here is the protein: > > {\rtf1\ansi\ansicpg1252\cocoartf1038\cocoasubrtf350 > > > I am not able to print the protein content on the screen.
> > Thank you > Regards > Teja > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From florent.angly at gmail.com Tue May 31 21:42:37 2011 From: florent.angly at gmail.com (Florent Angly) Date: Wed, 01 Jun 2011 11:42:37 +1000 Subject: [Bioperl-l] Reading protein from file - error In-Reply-To: References: <4DE565BB.1010607@gmail.com> Message-ID: <4DE5990D.80203@gmail.com> Teja, Please use "reply all" so that everyone can follow the discussion. In your script, you are not using any of the Bioperl modules, so, it does not matter if Bioperl is installed. Have a good look around the Bioperl wiki (http://www.bioperl.org/wiki/Main_Page), it covers many of the questions you may have. Regards, Florent On 01/06/11 10:15, sukeerthi teja Rallapalli wrote: > Hi Florent, > > Thanks for the reply. I had already installed Bioperl first. But then > you are right i still am not able to get everything. Whats the command > that you use to check if we have Bioperl already installed. > > Thanks > Teja
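On the quoted question of how to check whether Bioperl is already installed: one common approach is simply to try loading a Bioperl module and see whether Perl can find it. The following is a minimal sketch rather than an official Bioperl tool; the helper name module_status is hypothetical, and Bio::Perl and Bio::SeqIO are used only as examples of modules that ship with Bioperl.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: try to load a module at run time and report
# whether it is available, plus its version if it declares one.
sub module_status {
    my ($module) = @_;
    # String-eval the 'require' so a missing module does not abort the script.
    my $ok = eval "require $module; 1";
    return "$module: not found" unless $ok;
    my $version = $module->VERSION;
    return "$module: found" . ( defined $version ? " (version $version)" : '' );
}

print module_status($_), "\n" for qw(Bio::Perl Bio::SeqIO);
```

The same check can be done from the shell with a one-liner such as perl -MBio::Perl -e 1, which prints nothing when the module loads and an error about @INC when it cannot be found.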