From lelbourn at cbms.mq.edu.au Wed Nov 5 04:04:38 2008
From: lelbourn at cbms.mq.edu.au (Liam Elbourne)
Date: Wed, 5 Nov 2008 20:04:38 +1100
Subject: [Bioperl-l] searchio tags
In-Reply-To: <3D0D8BC0-FEB6-486F-A43E-B787D414C896@illinois.edu>
References: <6B62B296-7FD5-4DC7-913E-AE29DAF98858@cbms.mq.edu.au>
	<027C580B-BFCB-4410-B2FA-3C3205167ED3@gmail.com>
	<3D0D8BC0-FEB6-486F-A43E-B787D414C896@illinois.edu>
Message-ID:

Thanks Jason and Chris,

I ended up iterating through the features as I needed the locus_tag
anyway - bit slow but gets there in the end...

I've cc'd this to the bioperl list as suggested, and will subscribe soon.

Regards,
Liam.

On 05/11/2008, at 6:00 AM, Chris Fields wrote:

> I've already looked into that using TBLASTN output from the web
> interface, but I found no relevant lines in HSP sections in XML
> formatted reports (see below). It is possible the URLAPI interface
> is returning different XML data (I've seen stranger things from
> NCBI). I'll check into that when I can. BTW, the default read
> format for RemoteBlast is still text IIRC (we never changed it to
> XML), so is it possible Liam is referring to the default (text)
> output, and not XML?
>
> Also, I have managed to get this working in GenericHSP, will
> probably commit momentarily with added tests. Liam, if you have XML
> output that contains the relevant data could you attach it to an
> email so I can try parsing it out?
>
> Liam, it might be best to ask these questions on the mail list in
> case Jason or I can't get to them immediately. For SeqIO features,
> in order to retrieve a particular feature region you would need to
> either iterate through or grep the relevant features and retrieve
> the sequence using $feature->seq. If I want surrounding sequence I
> grab start/end from the feature, then +/- the extra bp and use
> $seq->subseq.
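[Editorial sketch] The "grep the relevant features, then +/- the extra bp"
approach described above can be illustrated in plain Perl. This is only a
sketch under stated assumptions: features are simple hashes with 1-based,
inclusive start/end values standing in for BioPerl feature objects, substr
stands in for $seq->subseq, and the helper names (features_overlapping,
flank_window, subseq_1based) are hypothetical, not BioPerl calls. Hit
coordinates printed in a text report can run high to low on the minus
strand (as in the Sbjct 494130..494106 alignment in this thread), so the
sketch normalises them first.

```perl
use strict;
use warnings;

# Hypothetical helper: return the features whose range overlaps [start, end].
# Features are plain hashes here, standing in for BioPerl feature objects.
sub features_overlapping {
    my ($features, $start, $end) = @_;
    return grep { $_->{start} <= $end && $_->{end} >= $start } @$features;
}

# Hypothetical helper: widen [start, end] by $flank bp on each side,
# clamped to the sequence bounds 1..$seq_len.
sub flank_window {
    my ($start, $end, $flank, $seq_len) = @_;
    my $from = $start - $flank;
    my $to   = $end + $flank;
    $from = 1        if $from < 1;
    $to   = $seq_len if $to > $seq_len;
    return ($from, $to);
}

# 1-based inclusive substring, the coordinate convention assumed for subseq.
sub subseq_1based {
    my ($seq, $from, $to) = @_;
    return substr($seq, $from - 1, $to - $from + 1);
}

my $genome   = 'ACGTACGTACGTACGTACGT';    # toy 20 bp sequence
my @features = (
    { locus_tag => 'geneA', start => 2,  end => 6  },
    { locus_tag => 'geneB', start => 10, end => 14 },
);

# Normalise so that start <= end before testing for overlap.
my ($hit_start, $hit_end) = sort { $a <=> $b } (12, 9);

for my $f (features_overlapping(\@features, $hit_start, $hit_end)) {
    my ($from, $to) = flank_window($f->{start}, $f->{end}, 3, length $genome);
    print join("\t", $f->{locus_tag}, $from, $to,
               subseq_1based($genome, $from, $to)), "\n";
}
```

With real BioPerl objects the same logic would presumably read the
coordinates from each feature and pass the clamped window to
$seq->subseq; the clamping matters for features near either end of the
sequence.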
> > -c > > Example Text Hit/HSP block: > > >emb|AM408590.1| Mycobacterium bovis BCG Pasteur 1173P2, complete > genome > Length=4374522 > > Features in this part of subject sequence: > Conserved hypothetical protein > Probable pyrimidine operon regulatory protein pyrR > > Score = 372 bits (955), Expect = 3e-101, Method: Compositional > matrix adjust. > Identities = 193/193 (100%), Positives = 193/193 (100%), Gaps = > 0/193 (0%) > Frame = +3 > > Query 1 > MGAAGDAAIGRESRELMSAADVGRTISRIAHQIIEKTALDDPVGPDAPRVVLLGIPTRGV 60 > > MGAAGDAAIGRESRELMSAADVGRTISRIAHQIIEKTALDDPVGPDAPRVVLLGIPTRGV > Sbjct 1577814 > MGAAGDAAIGRESRELMSAADVGRTISRIAHQIIEKTALDDPVGPDAPRVVLLGIPTRGV 1577993 > > Query 61 > TLANRLAGNITEYSGIHVGHGALDITLYRDDLMIKPPRPLASTSIPAGGIDDALVILVDD 120 > > TLANRLAGNITEYSGIHVGHGALDITLYRDDLMIKPPRPLASTSIPAGGIDDALVILVDD > Sbjct 1577994 > TLANRLAGNITEYSGIHVGHGALDITLYRDDLMIKPPRPLASTSIPAGGIDDALVILVDD 1578173 > > Query 121 > VLYSGRSVRSALDALRDVGRPRAVQLAVLVDRGHRELPLRADYVGKNVPTSRSESVHVRL 180 > > VLYSGRSVRSALDALRDVGRPRAVQLAVLVDRGHRELPLRADYVGKNVPTSRSESVHVRL > Sbjct 1578174 > VLYSGRSVRSALDALRDVGRPRAVQLAVLVDRGHRELPLRADYVGKNVPTSRSESVHVRL 1578353 > > Query 181 REHDGRDGVVISR 193 > REHDGRDGVVISR > Sbjct 1578354 REHDGRDGVVISR 1578392 > > Matching XML Hit/HSP block: > > > 2 > gi|121491530|emb|AM408590.1| > Mycobacterium bovis BCG Pasteur 1173P2, complete > genome > AM408590 > 4374522 > > > 1 > 372.474 > 955 > 3.27024e-101 > 1 > 193 > 1577814 > 1578392 > 0 > 3 > 193 > 193 > 0 > 193 > > < > Hsp_qseq > > > MGAAGDAAIGRESRELMSAADVGRTISRIAHQIIEKTALDDPVGPDAPRVVLLGIPTRGVTLANRLAGNITEYSGIHVGHGALDITLYRDDLMIKPPRPLASTSIPAGGIDDALVILVDDVLYSGRSVRSALDALRDVGRPRAVQLAVLVDRGHRELPLRADYVGKNVPTSRSESVHVRLREHDGRDGVVISR > > > < > Hsp_hseq > > > MGAAGDAAIGRESRELMSAADVGRTISRIAHQIIEKTALDDPVGPDAPRVVLLGIPTRGVTLANRLAGNITEYSGIHVGHGALDITLYRDDLMIKPPRPLASTSIPAGGIDDALVILVDDVLYSGRSVRSALDALRDVGRPRAVQLAVLVDRGHRELPLRADYVGKNVPTSRSESVHVRLREHDGRDGVVISR > > > < > Hsp_midline > > > 
MGAAGDAAIGRESRELMSAADVGRTISRIAHQIIEKTALDDPVGPDAPRVVLLGIPTRGVTLANRLAGNITEYSGIHVGHGALDITLYRDDLMIKPPRPLASTSIPAGGIDDALVILVDDVLYSGRSVRSALDALRDVGRPRAVQLAVLVDRGHRELPLRADYVGKNVPTSRSESVHVRLREHDGRDGVVISR > > > > > > On Nov 4, 2008, at 11:55 AM, Jason Stajich wrote: > >> FYI >> >> - >> Jason Stajich >> Sent from my iPod >> >> Begin forwarded message: >> >>> From: Liam Elbourne >>> Date: November 3, 2008 10:50:35 PM PST >>> To: Jason Stajich >>> Subject: Re: searchio tags >>> >>> Jason, >>> >>> Thanks for the very rapid response! Excuse the circularity, but >>> NCBI must now be including this info in the XML format because >>> RemoteBlast had it in my blast results, or is there a fatal >>> antipodean past pub time flaw in my logic? >>> >>> In any event I'll adopt your suggestion to check back against >>> SeqIO (and check out DB Cache, that's a new one for me...). >>> >>> Is there a method in SeqIO that allows one to go straight to a >>> particular region of the sequence (ie something like "$seq- >>> >features_after($curr_hsp->start('hit')") or do I have to iterate >>> through all the features until $seq->current-feature->start >= >>> $curr_hsp->start('hit') if you'll excuse (and understand) the >>> pseudo-bioperl code? >>> >>> Regards, >>> Liam. >>> >>> On 04/11/2008, at 5:07 PM, Jason Stajich wrote: >>> >>>> Liam - >>>> >>>> Sorry - we don't parse this out of the report -- that information >>>> is only something that is included in the online-only BLAST >>>> report and is not a standard part of the format. The RemoteBlast >>>> will typically get the XML-only version of the report which to my >>>> knowledge does not contain this information. If it is now being >>>> included the parser could be updated to also parse this out, but >>>> I am not sure that it is. >>>> Sorry - this is just a limit of how the data has been >>>> traditionally available and NCBI makes this kind of thing >>>> available only in an HTML form that is not-standardized. 
>>>> >>>> Your best bet is to get the sequence accession for your hit, then >>>> obtain the sequence as a genbank record from remote genbank (and/ >>>> or also cache this sequence in your local DB, there is a DB Cache >>>> in BioPerl for just this sort of thing) or a local genbank - then >>>> check and see if your HIT location is close to any of the >>>> features you are interested in from the annotation of the record. >>>> >>>> Hope that gets you pointed in the right direction, may be worth >>>> sending a note to the bioperl list as well to see if other people >>>> have similar needs and maybe a better custom solution can be >>>> cooked up. >>>> >>>> Best wishes, >>>> -jason >>>> On Nov 3, 2008, at 9:00 PM, Liam Elbourne wrote: >>>> >>>>> Hi Jason, >>>>> >>>>> I'm trying to parse a bunch of blast reports produced by >>>>> Bio::Tools::Run::RemoteBlast using SearchIO. Everything is fine >>>>> except that the: >>>>> >>>>> "Features in this part of subject sequence:" >>>>> >>>>> information doesn't seem to included in any of the objects >>>>> produced by SearchIO - the only method I could find that looked >>>>> vaguely promising was the locus() method for >>>>> Bio::Search::Hit::HitI, but this isn't initialised. >>>>> >>>>> I'm trying to match a set of primers against the genome they >>>>> were designed to target to check that they are associated with >>>>> the loci they were intended to targeted (as it appears in >>>>> retrospect that they have been 'mislabeled'), and if not, which >>>>> loci they are near or in. I've included an example blast report >>>>> at the end of the email, where the information I'm still trying >>>>> to extract (everything else being extractable) is "hypothetical >>>>> protein". >>>>> >>>>> Sorry to trouble you with such a trivial issue, if you are not >>>>> the right person to contact (and I appreciate that this is not a >>>>> bug as such, except so far as it is a 'bug' in my knowledge of >>>>> bioperl!) please tell me who would be. 
I've attached the code >>>>> I'm using too for what that is worth. >>>>> >>>>> >>>>> >>>>> >>>>> Regards, >>>>> Liam Elbourne. >>>>> Research Fellow >>>>> Paulsen Lab >>>>> Macquarie University >>>>> Sydney Australia. >>>>> >>>>> blast report below >>>>> **************************** >>>>> BLASTN 2.2.18+ >>>>> Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro >>>>> A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and >>>>> David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new >>>>> generation of protein database search programs", Nucleic >>>>> Acids Res. 25:3389-3402. >>>>> >>>>> >>>>> RID: GZ1WTPWW016 >>>>> >>>>> >>>>> Database: NCBI Genomic Reference Sequences >>>>> 11,340 sequences; 45,795,413,781 total letters >>>>> >>>>> Query= BR0042_r >>>>> Length=25 >>>>> >>>>> >>>>> >>>>> Score E >>>>> Sequences producing significant >>>>> alignments: (Bits) Value >>>>> >>>>> ref|NC_004310.3| Brucella suis 1330 chromosome I, complete >>>>> se... 46.4 3e-07 >>>>> >>>>> ALIGNMENTS >>>>> >ref|NC_004310.3| Brucella suis 1330 chromosome I, complete >>>>> sequence >>>>> gb|AE014291.4| Brucella suis 1330 chromosome I, complete sequence >>>>> Length=2107794 >>>>> >>>>> Features in this part of subject sequence: >>>>> hypothetical protein >>>>> >>>>> Score = 46.4 bits (50), Expect = 3e-07 >>>>> Identities = 25/25 (100%), Gaps = 0/25 (0%) >>>>> Strand=Plus/Minus >>>>> >>>>> Query 1 GTTTTTCGAGCCGCCCTTTTTGCCC 25 >>>>> ||||||||||||||||||||||||| >>>>> Sbjct 494130 GTTTTTCGAGCCGCCCTTTTTGCCC 494106 >>>>> >>>>> >>>>> Database: NCBI Genomic Reference Sequences >>>>> Posted date: Nov 2, 2008 5:47 PM >>>>> Number of letters in database: 3,315,177 >>>>> Number of sequences in database: 2 >>>>> >>>>> Lambda K H >>>>> 0.634 0.408 0.912 >>>>> Gapped >>>>> Lambda K H >>>>> 0.634 0.408 0.912 >>>>> Matrix: blastn matrix:2 -3 >>>>> Gap Penalties: Existence: 5, Extension: 2 >>>>> Number of Sequences: 2 >>>>> Number of Hits to DB: 0 >>>>> Number of extensions: 0 >>>>> Number of 
successful extensions: 0 >>>>> Number of sequences better than 1e-05: 0 >>>>> Number of HSP's better than 1e-05 without gapping: 0 >>>>> Number of HSP's gapped: 0 >>>>> Number of HSP's successfully gapped: 0 >>>>> Length of query: 25 >>>>> Length of database: 3315177 >>>>> Length adjustment: 18 >>>>> Effective length of query: 7 >>>>> Effective length of database: 3315141 >>>>> Effective search space: 23205987 >>>>> Effective search space used: 23205987 >>>>> A: 0 >>>>> X1: 22 (20.1 bits) >>>>> X2: 33 (29.8 bits) >>>>> X3: 110 (99.2 bits) >>>>> S1: 33 (31.0 bits) >>>>> S2: 45 (41.9 bits) >>>>> >>>>> >>>>> >>>>> >>>> >>>> Jason Stajich >>>> jason at bioperl.org >>>> >>>> >>>> >>> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Marie-Claude Hofmann > College of Veterinary Medicine > University of Illinois Urbana-Champaign > > > > From mark.blaxter at ed.ac.uk Wed Nov 5 10:05:49 2008 From: mark.blaxter at ed.ac.uk (Mark Blaxter) Date: Wed, 5 Nov 2008 15:05:49 +0000 Subject: [Bioperl-l] bioperl on osX 10.5 Message-ID: <625E2910-7462-4D37-B16B-F61D57E1CF1E@ed.ac.uk> Hi BioPerl team in installing BioPerl with cpan on macOSX 10.5 ( "cpan>force install S/ SE/SENDU/bioperl-1.5.2_102.tar.gz" ) I got these errors that looked more worrying than the remainder.... FYI, as I think bioperl works (or at least the bits I am using...) thanks for all the work! 
mark ------------- EXCEPTION ------------- MSG: No flatfile fileid files in config - check the index has been made correctly STACK Bio::DB::Flat::BinarySearch::read_config_file /Users/mblaxter/ Library/Application Support/.cpan/build/bioperl-1.5.2_102-vnU6b6/blib/ lib/Bio/DB/Flat/BinarySearch.pm:1288 STACK Bio::DB::Flat::BinarySearch::new /Users/mblaxter/Library/ Application Support/.cpan/build/bioperl-1.5.2_102-vnU6b6/blib/lib/Bio/ DB/Flat/BinarySearch.pm:276 STACK Bio::DB::Flat::new /Users/mblaxter/Library/Application Support/.cpan/build/bioperl-1.5.2_102-vnU6b6/blib/lib/Bio/DB/Flat.pm:177 STACK Bio::DB::Flat::new_from_registry /Users/mblaxter/Library/ Application Support/.cpan/build/bioperl-1.5.2_102-vnU6b6/blib/lib/Bio/ DB/Flat.pm:252 STACK (eval) /Users/mblaxter/Library/Application Support/.cpan/build/ bioperl-1.5.2_102-vnU6b6/blib/lib/Bio/DB/Registry.pm:164 STACK Bio::DB::Registry::_load_registry /Users/mblaxter/Library/ Application Support/.cpan/build/bioperl-1.5.2_102-vnU6b6/blib/lib/Bio/ DB/Registry.pm:163 STACK Bio::DB::Registry::new /Users/mblaxter/Library/Application Support/.cpan/build/bioperl-1.5.2_102-vnU6b6/blib/lib/Bio/DB/ Registry.pm:95 STACK toplevel t/Registry.t:72 STACK toplevel t/Registry.t:72 --------------------------------------------------- Can't call method "seq" on an undefined value at t/Registry.t line 81. t/Registry................... 
Dubious, test returned 255 (wstat 65280, 0xff00) Failed 6/13 subtests t/Relationship...............ok t/RelationshipType...........ok t/RemoteBlast................ok t/RepeatMasker...............ok t/RestrictionAnalysis........ok t/RestrictionIO..............ok t/RootI......................ok t/RootIO.....................ok t/RootStorable...............ok t/SNP........................ok t/Scansite...................ok t/SearchDist.................ok t/SearchIO...................ok t/Seg........................ok t/Seq........................ok t/SeqAnalysisParser..........ok t/SeqBuilder.................ok t/SeqDiff....................ok t/SeqFeatCollection..........ok t/SeqFeature.................92/211 -------------------- WARNING --------------------- MSG: [1/5] tried to fetch http://umn.dl.sourceforge.net/sourceforge/song/sofa.definition , but server threw 500. retrying... --------------------------------------------------- [and on for 5 retrys] at the end cpan reports: Failed during this command: MKUTTER/SOAP-Lite-0.710.08.tar.gz : make_test FAILED but failure ignored because 'force' in effect MINGYILIU/Bio-ASN1-EntrezGene-1.091.tgz : make_test FAILED but failure ignored because 'force' in effect MHX/Convert-Binary-C-0.71.tar.gz : make_test FAILED but failure ignored because 'force' in effect MIROD/XML-XPathEngine-0.11.tar.gz : make_test FAILED but failure ignored because 'force' in effect SENDU/bioperl-1.5.2_102.tar.gz : make_test FAILED but failure ignored because 'force' in effect Mark Blaxter mark.blaxter at ed.ac.uk ~ may all beings be happy ~ -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
From dan.bolser at gmail.com Fri Nov 7 09:27:20 2008
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 7 Nov 2008 14:27:20 +0000
Subject: [Bioperl-l] Problems with Bio::SearchIO
Message-ID: <2c8757af0811070627t57a96965s4a8c8bdae5fbd095@mail.gmail.com>

Hi,

I'm new to BioPerl, so apologies if I'm doing something wrong. Please
just let me know what I'm doing wrong or what you need.


I downloaded the latest BioPerl via SVN (revision 14980), and did the
following:

    perl Build.PL
    ./Build test
    ./Build install

(note that I have local::lib installed, pointing things at ~/perl5)

I ignored the failed tests, as per the "everyone who is l88t does
this" instructions ;-)


I am trying to parse the output of a "blastall -p blastn" job
(blastall 2.2.18) using either the default format or tabular format. I
copied the "search_overview.PLS" script found here:

http://code.open-bio.org/svnweb/index.cgi/bioperl/view/bioperl-live/trunk/scripts/graphics/search_overview.PLS

My code looks like this:

    my $parser = new Bio::SearchIO( -format => 'blasttable',
                                    -file   => 'my.blast.m8.out' );

    warn "parsed (", $parser->result_count(), ")\n";


But what I see (using either default or tabular format) is:

    ./visualizeBlastHits.plx
    Use of uninitialized value in warn at ./visualizeBlastHits.plx line 34, line 192.
    parsed ()


Looking closer I found that $parser->result_count() only gets set
after calling $parser->next_result. Any way to force this? In some
Perl objects I've seen a 'parse' method that kicks the object into
(silently) calling all its get methods. Is there an equivalent (but
apparently undocumented) method? Actually, I think it should kick
itself when called... or not? Certainly the docs do not suggest that
it won't return the number of results ("Function: Gets the number of
Blast results that have been parsed.") So I think this is a bug.
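[Editorial sketch] Since result_count() as described above only reflects
results that have already been parsed, one workaround is to drain the
parser and count the results yourself. The sketch below is a minimal
illustration, not a BioPerl feature: count_results is a hypothetical
helper, and ToyParser is a stand-in class so the example runs without
BioPerl installed. The assumption is that a Bio::SearchIO object could be
passed in the same way, since it also provides next_result().

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): drain any parser that offers
# next_result() and return how many results it produced.
sub count_results {
    my ($parser) = @_;
    my $count = 0;
    $count++ while defined $parser->next_result;
    return $count;
}

# Toy stand-in for a Bio::SearchIO parser so the sketch runs without BioPerl.
# It hands out one "result" per call and undef once the queue is empty.
{
    package ToyParser;

    sub new {
        my ($class, @results) = @_;
        return bless { queue => [@results] }, $class;
    }

    sub next_result {
        my ($self) = @_;
        return shift @{ $self->{queue} };
    }
}

my $parser = ToyParser->new(qw(query1 query2 query3));
print "parsed (", count_results($parser), ")\n";    # prints "parsed (3)"
```

Note that counting this way consumes the stream, so a real script would
need to reopen the report (or keep the results in memory) before
processing the hits.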
Actually I just checked against version 1.4 and "result_count()" fails
with the default, the XML or the tabular format results (all three
representing the same query against the same database with the same
blast version). Note that trying with the XML format in the latest
version of BioPerl results in the outright failure:

------------- EXCEPTION: Bio::Root::NotImplemented -------------
MSG: Abstract method "Bio::SearchIO::result_count" is not implemented
by package Bio::SearchIO::blastxml.
This is not your fault - author of Bio::SearchIO::blastxml should be blamed!

STACK: Error::throw
STACK: Bio::Root::Root::throw /homes/dbolser/perl5/lib/perl5/Bio/Root/Root.pm:357
STACK: Bio::Root::RootI::throw_not_implemented /homes/dbolser/perl5/lib/perl5/Bio/Root/RootI.pm:680
STACK: Bio::SearchIO::result_count /homes/dbolser/perl5/lib/perl5/Bio/SearchIO.pm:410
STACK: ./visualizeBlastHits.plx:35
----------------------------------------------------------------


While parsing and comparing the different formats (and versions) I
noticed a couple of other problems that I could not find reported
anywhere, so I'll report them here.

The appropriate parts of the code are:

    while( my $r = $parser->next_result ) {
      print $r->query_length, "\n";
      while(my $h = $r->next_hit ) {
        print $h->length, "\n";
        my $p = $h->hsp('best');
        print $h->significance, "\t", $h->score, "\t", $h->bits, "\n";
      }
    }

It seems that both the "$r->query_length" and the "$h->length" are
missing when using the tabular format blast results (using either
BioPerl version). I get confused here because there is also some
undocumented behavior used in the above script that means a 'hit'
(apparently) returns the start and end point of the two most extreme
HSPs (and the length of that range).

Secondly, I found in all but the default format (using either BioPerl
version), the "$p->score" is missing (set to an empty string). It
shows up fine using the default format. The significance and the
bit-score show up fine...
or at least they show up... The values look wrong now I come to check.
e.g. the score is equal to the bit-score when using the default format
with version 1.4, and is often smaller than the bit-score using the
latest version (or is $h->bits not the bit score?)

The closest hits in the mailing list that I could find to these
problems were:

http://lists.open-bio.org/pipermail/bioperl-l/2002-May/007936.html
http://lists.open-bio.org/pipermail/bioperl-l/2002-September/009586.html

but I don't think that they are relevant.

Since it comes up here, how is the 'best' HSP defined? It isn't
documented as far as I can tell.


About the documentation... looking here:

http://search.cpan.org/~birney/bioperl-1.4/Bio/SearchIO.pm


Several of the structured methods 'blocks' are followed by a "See
Bio::..." link to other pages in CPAN. However the 'next_result'
method is followed by a link to
http://search.cpan.org/~birney/bioperl-1.4/Bio/Root/RootI.pm - I think
it should be a link to
http://search.cpan.org/~birney/bioperl-1.4/Bio/Search/Result/ResultI.pm

Also, it would be nice (especially for noobs) if the full list of
accepted format codes were given on that page. The current text "#
format can be 'fasta', 'blast', 'exonerate', ..." is extremely
frustrating for a beginner "... what?!". I now realize that each
format code is matched by a "Bio::SearchIO::formatcode" module, but I
didn't know that from reading the above.

While I'm at it, on page
http://search.cpan.org/~birney/bioperl-1.4/Bio/Search/Hit/HitI.pm -
the phrase "Equivalent to raw_score()" appearing under the heading
"score" is a broken link. In fact every "See also : $this->method()"
type link on that page is broken (there are about 25). Also the link
to "See also : BUGS" is broken.

> User feedback is an integral part of the evolution of this and other
> Bioperl modules. Send your comments and suggestions preferably to one
> of the Bioperl mailing lists. Your participation is much appreciated.

Thank you for your participation!
I hope the above can help in some way, and I hope its not down to me making trivial mistakes! If these look like genuine bugs, should I report them on RT? Out of interest, I did get some fail while testing, specifically (or perhaps coincidentally) some related to SearchIO... ./Build test verbose=1 > test.results.dump &> test.results.dump ... ok 83 - HSP num_conserved not implemented ok 84 - HSP num_identical not implemented # Failed test 'HSP seq_inds' # at t/RNA_SearchIO.t line 152. # got: '77' # expected: '67' ok 85 - HSP percent_identity not implemented ok 86 - HSP cigar_string not implemented ... ok 114 - HSP rank not ok 115 - HSP seq_inds ok 116 - HSP significance ok 117 - HSP end ... ok 147 - HSP frame ok 148 - HSP gaps # Failed test 'HSP seq_inds' # at t/RNA_SearchIO.t line 218. # got: '79' # expected: '69' ok 149 - The object isa Bio::Align::AlignI ok 150 - HSP hit isa Bio::SeqFeature::Similarity ... ok 161 - HSP rank not ok 162 - HSP seq_inds ok 163 - HSP significance ok 164 - HSP end ... ok 170 - HSP meta ok 171 - HSP strand # Failed test 'HSP seq_inds' # at t/RNA_SearchIO.t line 285. # got: '76' # expected: '64' ok 172 - HSP meta gap bug ok 173 - HSP meta ... ok 250 - HSP rank not ok 251 - HSP seq_inds # Failed test 'HSP seq_inds' # at t/RNA_SearchIO.t line 475. # got: '76' # expected: '64' ok 252 - HSP significance ok 253 - HSP end ... ok 288 - HSP range ok 289 - HSP rank not ok 290 - HSP seq_inds # Failed test 'HSP seq_inds' # at t/RNA_SearchIO.t line 541. # got: '55' # expected: '31' ok 291 - HSP significance ok 292 - HSP end ... ok 336 - HSP rank not ok 337 - HSP seq_inds # Failed test 'HSP seq_inds' # at t/RNA_SearchIO.t line 608. # got: '79' # expected: '69' ok 338 - HSP significance ok 339 - HSP end ... ok 492 - HSP custom_score ok 493 ok 494 ok 495 - HSP strand # Looks like you failed 6 tests of 495. Dubious, test returned 6 (wstat 1536, 0x600) Failed 6/495 subtests t/RandDistFunctions............. 
1..5 ok 1 - use Bio::Tools::RandomDistFunctions; ok 2 ok 3 ... t/SearchDist.................... 1..0 # Skip The optional module Bio::Ext::Align (or dependencies thereof) was not installed skipped: The optional module Bio::Ext::Align (or dependencies thereof) was not installed t/SearchIO...................... 1..1812 ok 1 - use Bio::SearchIO; ok 2 - use Bio::SearchIO::Writer::HitTableWriter; ok 3 - use Bio::SearchIO::Writer::HTMLResultWriter; ok 4 - The object isa Bio::Search::Result::ResultI ok 5 - database_name() ok 6 - query_name() ok 7 ok 8 ok 9 ok 10 ... ok 564 ok 565 ok 566 not ok 567 # Failed test at t/SearchIO.t line 775. # got: '0.5918' # expected: '0.5955' not ok 568 # Failed test at t/SearchIO.t line 776. # got: '0.6100' # expected: '0.6139' not ok 569 # Failed test at t/SearchIO.t line 777. # got: '0.5940' # expected: '0.5977' ok 570 ok 571 not ok 572 # Failed test at t/SearchIO.t line 780. # got: '158' # expected: '159' ok 573 ok 574 ok 575 ... ok 737 not ok 738 # Failed test at t/SearchIO.t line 996. # got: '51.59' # expected: '51.67' not ok 739 # Failed test at t/SearchIO.t line 997. # got: '0.5235' # expected: '0.5244' not ok 740 # Failed test at t/SearchIO.t line 998. # got: '0.5718' # expected: '0.5728' ok 741 ok 742 not ok 743 # Failed test at t/SearchIO.t line 1001. # got: '677' # expected: '678' ok 744 ok 745 ok 746 --------------------- WARNING --------------------- MSG: In sequence 5X_1895.fa residue count gives end value 5611. Overriding value [5623] with value 5611 for Bio::LocatableSeq::end(). ... etc. 
Test Summary Report ------------------- t/RNA_SearchIO (Wstat: 1536 Tests: 495 Failed: 6) Failed tests: 76, 115, 162, 251, 290, 337 Non-zero exit status: 6 t/SearchIO (Wstat: 2048 Tests: 1812 Failed: 8) Failed tests: 567-569, 572, 738-740, 743 Non-zero exit status: 8 t/singlet (Wstat: 512 Tests: 4 Failed: 2) Failed tests: 3-4 Non-zero exit status: 2 Files=259, Tests=14572, 131 wallclock secs ( 3.47 usr 0.90 sys + 94.53 cusr 8.24 csys = 107.14 CPU) Result: FAIL Failed 3/259 test programs. 16/14572 subtests failed. Dan. P.S. I've also been attacking the wiki, so please undo any mess that I may have made there. -- http://network.nature.com/profile/dan From tristan.lefebure at gmail.com Fri Nov 7 15:59:07 2008 From: tristan.lefebure at gmail.com (Tristan Lefebure) Date: Fri, 07 Nov 2008 15:59:07 -0500 Subject: [Bioperl-l] SeqIO & multi-line fastq Message-ID: <1226091547.6915.29.camel@trudy> Hi there, I'm parsing with SeqIO a FastQ file made by MAQ. SeqIO complains because this is a multiline fastq file. By looking at the Bio::SeqIO::fastq, it's pretty obvious that it can't handle multilines. Who is wrong? MAQ, SeqIO, or am I missing something? Some more details below: ### [tristan at trudy maq_easyrun] seq2seq.pl cns.fq fastq cns.fna fasta ------------- EXCEPTION ------------- MSG: AACTATTTATCAAATTTAAAATTCAACGAAAAACAAAGCAAAGCAGATCTTTTAGTTTTT doesn't match fastq descriptor line type STACK Bio::SeqIO::fastq::next_seq /usr/local/share/perl/5.10.0/Bio/SeqIO/fastq.pm:113 STACK toplevel /home/tristan/bin/seq2seq.pl:25 ------------------------------------- ### The fastq file looks like that: ----------- @nctc11168 atgAATCCAAGCCAAATACTTGAAAATTTAAAAAAAGAATTAAGTGAAAACGAATACGAA AACTATTTATCAAATTTAAAATTCAACGAAAAACAAAGCAAAGCAGATCTTTTAGTTTTT AATGCTCCAAATGAACTCATGGCTAAATTCATACAAACAAAATACGGCAAAAAAATCGCG CATTTTTATGAAGTGCAAAGCGGAAATAAAGCCATCATAAATATACAAGCACAAAGTGCT AAACAAAGCAACAAAAGCACAAAAATCGACATAGCTCATATAAAAGCACAAAGCACGATT TTAAATC[...] 
[some 20000 lines later]
AACCTTTTTTTATAAAATTTAAGATAAAATTTATACATTATGCAAAATTTAAAGAGAgat
n
+
EQWWZ`cffilmu~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~[...]
---------

Thanks!

-Tristan

From cjfields at illinois.edu Fri Nov 7 16:31:06 2008
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 7 Nov 2008 15:31:06 -0600
Subject: [Bioperl-l] Problems with Bio::SearchIO
In-Reply-To: <2c8757af0811070627t57a96965s4a8c8bdae5fbd095@mail.gmail.com>
References: <2c8757af0811070627t57a96965s4a8c8bdae5fbd095@mail.gmail.com>
Message-ID:

On Nov 7, 2008, at 8:27 AM, Dan Bolser wrote:

> Hi,
>
> I'm new to BioPerl, so apologies if I'm doing something wrong. Please
> just let me know what I'm doing wrong or what you need.
>
>
> I downloaded the latest BioPerl via SVN (revision 14980), and did the
> following:
>
>     perl Build.PL
>     ./Build test
>     ./Build install
>
> (note that I have local::lib installed, pointing things at ~/perl5)
>
> I ignored the failed tests, as per the "everyone who is l88t does
> this" instructions ;-)

SearchIO (specifically, the HSP methods used to calculate seq start/end
correctly) is being revised in svn, so those tests are failing at the
moment. My local (not-yet-committed) revisions are failing much more,
but that's b/c the tests are wrong; these should be added in relatively
soon once I work out more kinks in the code.

> I am trying to parse the output of a "blastall -p blastn" job
> (blastall 2.2.18) using either the default format or tabular format. I
> copied the "search_overview.PLS" script found here:
>
> ...
>
> Looking closer I found that $parser->result_count() only gets set
> after calling $parser->next_result. Any way to force this? In some
> Perl objects I've seen a 'parse' method that kicks the object into
> (silently) calling all its get methods.
> Is there an equivalent (but
> apparently undocumented) method? Actually, I think it should kick
> itself when called... or not? Certainly the docs do not suggest that
> it won't return the number of results ("Function: Gets the number of
> Blast results that have been parsed.") So I think this is a bug.

We could make it so that the result_count() is eager (parses the
results and reports the total back). Not sure, but we could optionally
cache the already-parsed Result objects (that could run into memory
issues if one is parsing a ton of reports, so it needs to be off by
default).

> Actually I just checked against version 1.4 and "result_count()" fails
> with the default, the XML or the tabular format results (all three
> representing the same query against the same database with the same
> blast version). Note that trying with the XML format in the latest
> version of BioPerl results in the outright failure:
>
> ------------- EXCEPTION: Bio::Root::NotImplemented -------------
> MSG: Abstract method "Bio::SearchIO::result_count" is not implemented
> by package Bio::SearchIO::blastxml.
> This is not your fault - author of Bio::SearchIO::blastxml should be
> blamed!
>
> ...

That's a bug. Could you file this so we can track it?

http://bugzilla.open-bio.org/

> While parsing and comparing the different formats (and versions) I
> noticed a couple of other problems that I could not find reported
> anywhere, so I'll report them here.
>
> The appropriate parts of the code are:
>
>     while( my $r = $parser->next_result ) {
>       print $r->query_length, "\n";
>       while(my $h = $r->next_hit ) {
>         print $h->length, "\n";
>         my $p = $h->hsp('best');
>         print $h->significance, "\t", $h->score, "\t", $h->bits, "\n";
>       }
>     }
>
> It seems that both the "$r->query_length" and the "$h->length" are
> missing when using the tabular format blast results (using either
> BioPerl version).
> I get confused here because there is also some
> undocumented behavior used in the above script that means a 'hit'
> (apparently) returns the start and end point of the two most extreme
> HSPs (and the length of that range).

The HSPs are tiled prior to returning values for these methods. The
tiling algorithm works for the most part but still has a few odd
issues (one is reported in bugzilla). I am not familiar with that bit
of code so can't comment, but Sendu or Steve might answer. I try to
stay away from using the various hit methods unless necessary and
usually go with simple HSP coordinates.

> Secondly, I found in all but the default format (using either BioPerl
> version), the "$p->score" is missing (set to an empty string). It
> shows up fine using the default format. The significance and the
> bit-score show up fine... or at least they show up... The values look
> wrong now I come to check. e.g. the score is equal to the bit-score
> when using the default format with version 1.4, and is often smaller
> than the bit-score using the latest version (or is $h->bits not the
> bit score?)

There is some discussion about this in the mail list archives:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/16273/focus=16296

NCBI BLAST has the hit summary table reporting the raw score(),
whereas in WU_BLAST the hit table reports bit_score(); in the HSPs I
think everything is defined. If you have a minimal hit constructed
(data only reported in the hit table, no HSP data is reported) some
may be undefined.

The hit should be calculating the best overall score values from the
contained HSPs and falling back to directly-set hit table data (set as
above). It is possible this may not be happening when parsing other
formats so it may be a bug (and would be worth filing so we can look
into it).
> The closest hits in the mailing list that I could find to these
> problems were:
>
> http://lists.open-bio.org/pipermail/bioperl-l/2002-May/007936.html
> http://lists.open-bio.org/pipermail/bioperl-l/2002-September/009586.html
>
> but I don't think that they are relevant.
>
> Since it comes up here, how is the 'best' HSP defined? It isn't
> documented as far as I can tell.

'best' - when comparing HSP data to the summary hit table (in text
output only), the highest scoring HSP represents the hit (highest
score/raw_score, lowest evalue).

> About the documentation... looking here:
>
> http://search.cpan.org/~birney/bioperl-1.4/Bio/SearchIO.pm
>
>
> Several of the structured methods 'blocks' are followed by a "See
> Bio::..." link to other pages in CPAN. However the 'next_result'
> method is followed by a link to
> http://search.cpan.org/~birney/bioperl-1.4/Bio/Root/RootI.pm - I think
> it should be a link to
> http://search.cpan.org/~birney/bioperl-1.4/Bio/Search/Result/ResultI.pm
>
> Also, it would be nice (especially for noobs) if the full list of
> accepted format codes were given on that page. The current text "#
> format can be 'fasta', 'blast', 'exonerate', ..." is extremely
> frustrating for a beginner "... what?!". I now realize that each
> format code is matched by a "Bio::SearchIO::formatcode" module, but I
> didn't know that from reading the above.
>
> While I'm at it, on page
> http://search.cpan.org/~birney/bioperl-1.4/Bio/Search/Hit/HitI.pm -
> the phrase "Equivalent to raw_score()" appearing under the heading
> "score" is a broken link. In fact every "See also : $this->method()"
> type link on that page is broken (there are about 25). Also the link
> to "See also : BUGS" is broken.

The pdoc documentation is better and more up-to-date (unfortunately
the bioperl-1.4 CPAN docs are out-of-date but always come up first, I
think b/c of the stable release status).

>> User feedback is an integral part of the evolution of this and
>> other Bioperl modules.
>> Send your comments and suggestions
>> preferably to one of the Bioperl mailing lists. Your participation
>> is much appreciated.
>
> Thank you for your participation! I hope the above can help in some
> way, and I hope it's not down to me making trivial mistakes! If these
> look like genuine bugs, should I report them on RT?

No, use the bugzilla set up. We do not use CPAN's RT and generally
redirect any bugs to bugzilla.

> Out of interest, I did get some fail while testing, specifically (or
> perhaps coincidentally) some related to SearchIO...
>
> ./Build test verbose=1 > test.results.dump &> test.results.dump
>
> ...
> Dan.

Those are due to the changes I have been making (using svn code is
bleeding edge!).

> P.S. I've also been attacking the wiki, so please undo any mess that I
> may have made there.
>
>
> --
> http://network.nature.com/profile/dan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Marie-Claude Hofmann
College of Veterinary Medicine
University of Illinois Urbana-Champaign

From j_martin at lbl.gov Fri Nov 7 17:45:34 2008
From: j_martin at lbl.gov (Joel Martin)
Date: Fri, 7 Nov 2008 14:45:34 -0800
Subject: [Bioperl-l] SeqIO & multi-line fastq
In-Reply-To: <1226091547.6915.29.camel@trudy>
References: <1226091547.6915.29.camel@trudy>
Message-ID: <20081107224534.GA10246@eniac.jgi-psf.org>

Hello,

Multiline fastq seems broken by design: @ is a quality score and also
the id delimiter. The script accompanying maq for converting fastq to
fasta can't parse the multiline fastq output by maq, so I'd say it's maq
that's wrong. I did this to parse them, but wasn't sure enough about
/^\+/ to suggest it for bioperl.

    while (<$fh>) {
        if (/^@(\S+)/) { # read name
            print ">$1\n";
            my $lines = 0;
            while ( <$fh> ) { # read sequence
                if ( ! (/^\+/) ) { # stop at '+' line
                    print;
                    $lines++;
                } else {
                    last;
                }
            }
            while ( $lines-- ) { # skip quals
                <$fh>;
            }
        }
    }

Joel

On Fri, Nov 07, 2008 at 03:59:07PM -0500, Tristan Lefebure wrote:
> Hi there,
>
> I'm parsing with SeqIO a FastQ file made by MAQ. SeqIO complains because
> this is a multiline fastq file. By looking at the Bio::SeqIO::fastq,
> it's pretty obvious that it can't handle multilines. Who is wrong? MAQ,
> SeqIO, or am I missing something?
>
> Some more details below:
>
> ###
> [tristan at trudy maq_easyrun] seq2seq.pl cns.fq fastq cns.fna fasta
>
> ------------- EXCEPTION -------------
> MSG: AACTATTTATCAAATTTAAAATTCAACGAAAAACAAAGCAAAGCAGATCTTTTAGTTTTT
> doesn't match fastq descriptor line type
> STACK
> Bio::SeqIO::fastq::next_seq /usr/local/share/perl/5.10.0/Bio/SeqIO/fastq.pm:113
> STACK toplevel /home/tristan/bin/seq2seq.pl:25
> -------------------------------------
> ###
>
> The fastq file looks like that:
> -----------
> @nctc11168
> atgAATCCAAGCCAAATACTTGAAAATTTAAAAAAAGAATTAAGTGAAAACGAATACGAA
> AACTATTTATCAAATTTAAAATTCAACGAAAAACAAAGCAAAGCAGATCTTTTAGTTTTT
> AATGCTCCAAATGAACTCATGGCTAAATTCATACAAACAAAATACGGCAAAAAAATCGCG
> CATTTTTATGAAGTGCAAAGCGGAAATAAAGCCATCATAAATATACAAGCACAAAGTGCT
> AAACAAAGCAACAAAAGCACAAAAATCGACATAGCTCATATAAAAGCACAAAGCACGATT
> TTAAATC[...]
> [some 20000 lines later]
> AACCTTTTTTTATAAAATTTAAGATAAAATTTATACATTATGCAAAATTTAAAGAGAgat
> n
> +
> EQWWZ`cffilmu~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ~~~~~[...]
> ---------
>
> Thanks!
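On the multi-line FASTQ question in this thread, one way around the '@' ambiguity may be to stop relying on line prefixes inside the quality block and instead read quality lines until as many symbols as bases have been collected. The following is only a sketch under that assumption: read_multiline_fastq is a hypothetical helper written for illustration, it is not part of Bio::SeqIO::fastq, and it assumes the '+' separator line never appears inside the sequence itself.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl API): read multi-line FASTQ records.
# Quality lines are consumed until their total length equals the sequence
# length, so a quality line that starts with '@' is not mistaken for the
# header of the next record.
sub read_multiline_fastq {
    my ($fh) = @_;
    my @records;
    while ( defined( my $line = <$fh> ) ) {
        chomp $line;
        next if $line eq '';
        $line =~ /^\@(\S+)/
            or die "expected '\@id' header, got: $line\n";
        my ( $id, $seq, $qual ) = ( $1, '', '' );

        # Sequence lines run until the '+' separator line.
        while ( defined( $line = <$fh> ) ) {
            chomp $line;
            last if $line =~ /^\+/;
            $seq .= $line;
        }

        # Quality lines run until one symbol per base has been read.
        while ( length($qual) < length($seq) ) {
            $line = <$fh>;
            defined $line or die "truncated quality block for $id\n";
            chomp $line;
            $qual .= $line;
        }
        length($qual) == length($seq)
            or die "quality/sequence length mismatch for $id\n";

        push @records, { id => $id, seq => $seq, qual => $qual };
    }
    return \@records;
}

# Usage: convert an in-memory multi-line FASTQ string to FASTA on STDOUT.
# The first quality line deliberately starts with '@'.
my $fastq = "\@read1\nACGT\nAC\n+\n\@III\nII\n";
open my $fh, '<', \$fastq or die $!;
my $records = read_multiline_fastq($fh);
printf ">%s\n%s\n", $_->{id}, $_->{seq} for @$records;
```

Counting symbols rather than lines also copes with quality blocks that are wrapped at a different width than the sequence, which the line-counting loop above would not.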
> > -Tristan > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From lincoln.stein at gmail.com Mon Nov 10 15:25:24 2008 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Mon, 10 Nov 2008 15:25:24 -0500 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? Message-ID: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> Hi All, The glacial pace of official bioperl releases is interfering with my ability to package GBrowse 2.00 into debian and rpm packages. Is there any objection if I withdraw Bio::Graphics and Bio::DB::SeqFeature from the bioperl distribution and turn them into independent CPAN modules? Thanks, Lincoln -- Lincoln D. Stein Ontario Institute for Cancer Research 101 College St., Suite 800 Toronto, ON, Canada M5G0A3 416 673-8514 Assistant: Stacey Quinn Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 USA (516) 367-8380 Assistant: Sandra Michelsen From cain.cshl at gmail.com Mon Nov 10 15:50:26 2008 From: cain.cshl at gmail.com (Scott Cain) Date: Mon, 10 Nov 2008 15:50:26 -0500 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> Message-ID: <536f21b00811101250y619025c7pd7d6850d318a6979@mail.gmail.com> Hi Lincoln, Interesting idea; I'm sure you've already spent some time thinking it through, but presumably there are a fair amount of dependencies from Bio::Graphics and SeqFeature::Store to other parts of BioPerl. How much of a hassle do you think will be introduced by this bifurcation? 
Scott On Mon, Nov 10, 2008 at 3:25 PM, Lincoln Stein wrote: > Hi All, > > The glacial pace of official bioperl releases is interfering with my ability > to package GBrowse 2.00 into debian and rpm packages. Is there any objection > if I withdraw Bio::Graphics and Bio::DB::SeqFeature from the bioperl > distribution and turn them into independent CPAN modules? > > Thanks, > > Lincoln > > -- > Lincoln D. Stein > > Ontario Institute for Cancer Research > 101 College St., Suite 800 > Toronto, ON, Canada M5G0A3 > 416 673-8514 > Assistant: Stacey Quinn > > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 USA > (516) 367-8380 > Assistant: Sandra Michelsen > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research From cjfields at illinois.edu Mon Nov 10 16:46:03 2008 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 10 Nov 2008 15:46:03 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> Message-ID: Lincoln, I agree about the glacial pace. It's also feeling more and more like only a couple of active developers are working on it (so the more that chip in the better). Furthermore, the code base is so large now at this point it feels like steering an aircraft carrier with an oar and has become very hard to work on. I don't have any objections personally if you want to withdraw Bio::Graphics/Bio::DB::SeqFeature, but how much work would that be (scott mentions a few issues I see)? 
Personally, I think if Bio::Graphics remains in bioperl we have to do two
things. We should release the full bioperl-live as-is to CPAN as an
official release (TODO any bugs) ASAP. No RCs; we'll post point releases
along the way for bug fixes (I like the 'release early/release often'
mantra). I can work on this over the next couple of weeks, aiming for
Thanksgiving for a 1.6, but I probably won't get rolling until this
weekend (too much going on this week). We can aim for more regular point
releases then.

Following that, I think a more stable long-term solution is to split off
some of the non-core-like modules so that we can speed up releases (this
has been discussed in the past,
http://www.bioperl.org/wiki/Proposed_1.6_core_modules). Basically, make a
'bare-bones' well-tested core containing the base classes and interfaces
that remain stable long-term, such as Bio::Root, Bio::Seq/PrimarySeq,
Bio::SeqFeature::*, with as few dependencies as possible.

Everything else requiring constant maintenance, not actively supported, or
under development would go into a separate monolithic distribution listing
the new core as a dependency; this could feasibly have its own release
schedule. If we go this route, Bio::Graphics and related could also be in
a second distribution (and thus also on a distinct release schedule). This
could be worked out in a separate subversion directory, so bioperl-live
wouldn't be affected until we switch over. Does that seem feasible?

chris

On Nov 10, 2008, at 2:25 PM, Lincoln Stein wrote:

> Hi All,
>
> The glacial pace of official bioperl releases is interfering with my ability
> to package GBrowse 2.00 into debian and rpm packages. Is there any objection
> if I withdraw Bio::Graphics and Bio::DB::SeqFeature from the bioperl
> distribution and turn them into independent CPAN modules?
>
> Thanks,
>
> Lincoln
>
> --
> Lincoln D.
Stein > > Ontario Institute for Cancer Research > 101 College St., Suite 800 > Toronto, ON, Canada M5G0A3 > 416 673-8514 > Assistant: Stacey Quinn > > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 USA > (516) 367-8380 > Assistant: Sandra Michelsen > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From dan.bolser at gmail.com Mon Nov 10 17:01:17 2008 From: dan.bolser at gmail.com (Dan Bolser) Date: Mon, 10 Nov 2008 22:01:17 +0000 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> Message-ID: <2c8757af0811101401y4b9c3945m3c4dec6812e9ef5c@mail.gmail.com> 2008/11/10 Chris Fields : > Lincoln, > > I agree about the glacial pace. It's also feeling more and more like only a > couple of active developers are working on it (so the more that chip in the > better). Furthermore, the code base is so large now at this point it feels > like steering an aircraft carrier with an oar and has become very hard to > work on. > > I don't have any objections personally if you want to withdraw > Bio::Graphics/Bio::DB::SeqFeature, but how much work would that be (scott > mentions a few issues I see)? > > Personally, I think if Bio::Graphics remains in bioperl we have to do two > things. We should release the full bioperl-live as-is to CPAN as an > official release (TODO any bugs) ASAP. No RCs; we'll post point releases > along the way for bug fixes (I like the 'release early/release often' > mantra). I can work on this over the next couple of weeks, aiming for > Thanksgiving for a 1.6, but I probably won't get rolling until this weekend > (too much going on this week). 
We can aim for more regular point releases > then. > > Following that, I think a more stable long-term solution is to split off > some of the non-core-like modules so that we can speed up releases (this has > been discussed in the past, > http://www.bioperl.org/wiki/Proposed_1.6_core_modules). Basically, make a > 'bare-bones' well-tested core containing the base classes and interfaces > that remain stable long-term, such as Bio::Root, Bio::Seq/PrimarySeq, > Bio::SeqFeature::*, with as few dependencies as possible. > > Everything else requiring constant maintenance, not actively supported, or > under development would go into a separate monolithic distribution listing > the new core as a dependency; this could feasibly have it's own release Sorry for the potentially dumb question, but why have an overall release cycle at all? BioPerl is so large and diverse, asking for a new 'release' is almost like asking for 'the next release of CPAN'. Can't inter-module dependencies just be handled in the same way as in the rest of CPAN, with everything released asynchronously? Is BioPerl just too tightly woven for that to happen? Dan. > schedule. If we go this route, Bio::Graphics and related could also be in a > second distribution (and thus also on a distinct release schedule). This > could be worked out in a separate subversion directory, so bioperl-live > wouldn't be affected until we switch over. Does that seem feasible? > > chris > > On Nov 10, 2008, at 2:25 PM, Lincoln Stein wrote: > >> Hi All, >> >> The glacial pace of official bioperl releases is interfering with my >> ability >> to package GBrowse 2.00 into debian and rpm packages. Is there any >> objection >> if I withdraw Bio::Graphics and Bio::DB::SeqFeature from the bioperl >> distribution and turn them into independent CPAN modules? >> >> Thanks, >> >> Lincoln >> >> -- >> Lincoln D. 
Stein >> >> Ontario Institute for Cancer Research >> 101 College St., Suite 800 >> Toronto, ON, Canada M5G0A3 >> 416 673-8514 >> Assistant: Stacey Quinn >> >> Cold Spring Harbor Laboratory >> 1 Bungtown Road >> Cold Spring Harbor, NY 11724 USA >> (516) 367-8380 >> Assistant: Sandra Michelsen >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Marie-Claude Hofmann > College of Veterinary Medicine > University of Illinois Urbana-Champaign > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- http://network.nature.com/profile/dan From cjfields at illinois.edu Mon Nov 10 16:58:32 2008 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 10 Nov 2008 15:58:32 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> Message-ID: On Nov 10, 2008, at 3:46 PM, Chris Fields wrote: > Lincoln, > > ....If we go this route, Bio::Graphics and related could also be in > a second distribution (and thus also on a distinct release schedule). My oops. That should read 'Bio::Graphics and related could also be in a separate distribution' (i.e. distinct from core and the rest). 
-c

From dan.bolser at gmail.com Mon Nov 10 17:29:08 2008
From: dan.bolser at gmail.com (Dan Bolser)
Date: Mon, 10 Nov 2008 22:29:08 +0000
Subject: [Bioperl-l] Problems with Bio::SearchIO
In-Reply-To:
References: <2c8757af0811070627t57a96965s4a8c8bdae5fbd095@mail.gmail.com>
Message-ID: <2c8757af0811101429q5abe0a0bqf3670c8178b8003d@mail.gmail.com>

2008/11/7 Chris Fields :
> On Nov 7, 2008, at 8:27 AM, Dan Bolser wrote:
>
>> Hi,
>>
>> I'm new to BioPerl, so apologies if I'm doing something wrong. Please
>> just let me know what I'm doing wrong or what you need.
>>
>>
>> I downloaded the latest BioPerl via SVN (revision 14980), and did the,
>>
>> perl Build.PL
>> ./Build test
>> ./Build install
>>
>> (note that I have local::lib installed, pointing things at ~/perl5)
>>
>> I ignored the failed tests, as-per the "everyone who is l88t does
>> this" instructions ;-)
>
> SearchIO (specifically, HSP methods used to calculate seq start/end
> correctly) are being revised in svn, so they are failing at the moment. My
> local (not-yet-committed) revisions are failing much more, but that's b/c
> the tests are wrong; these should be added in relatively soon once I work
> out more kinks in the code.
>
>> I am trying to parse the output of a "blastall -p blastn" job
>> (blastall 2.2.18) using either the default format or tabular format. I
>> copied the "search_overview.PLS" script found here:
>>
>> ...
>>
>> Looking closer I found that $parser->result_count() only gets set
>> after calling $parser->next_result. Any way to force this? In some
>> Perl objects I've seen a 'parse' method that kicks the object into
>> (silently) calling all its get methods. Is there an equivalent (but
>> apparently undocumented) method? Actually, I think it should kick
>> itself when called... or not? Certainly the docs do not suggest that
>> it won't return the number of results ("Function: Gets the number of
>> Blast results that have been parsed.") So I think this is a bug.
> > We could make it so that the result_count() is eager (parses the results and > reports the total back). Not sure, but we could optionally cache the > already-parsed Result objects (that could run into memory issues if one is > parsing a ton of reports, so it needs to be off by default). I see (I think). Anyone first calling result_count() and *then* iterating over the results is getting a performance hit by effectively parsing the results twice? I would suggest that you make this function eager, but document the potential performance issue so that people can choose not to call it first. However, I don't think I can have understood correctly. How can its value be set correctly after calling next() only once? >> Actually I just checked against version 1.4 and "result_count()" fails >> with the default, the XML or the tabular format results (all three >> representing the same query against the same database with the same >> blast version). Note that trying with the XML format in the latest >> version of BioPerl results in the outright failure: >> >> ------------- EXCEPTION: Bio::Root::NotImplemented ------------- >> MSG: Abstract method "Bio::SearchIO::result_count" is not implemented >> by package Bio::SearchIO::blastxml. >> This is not your fault - author of Bio::SearchIO::blastxml should be >> blamed! >> >> ... > > That's a bug. Could you file this so we can track it? > > http://bugzilla.open-bio.org/ http://bugzilla.open-bio.org/show_bug.cgi?id=2650 >> While parsing and comparing the different formats (and versions) I >> noticed a couple of other problems that I could not find reported >> anywhere, so I'll report them here. 
>> >> The appropriate parts of the code are: >> >> while( my $r = $parser->next_result ) { >> print $r->query_length, "\n"; >> while(my $h = $r->next_hit ) { >> print $h->length, "\n"; >> my $p = $h->hsp('best'); >> print $h->significance, "\t", $h->score, "\t", $h->bits, "\n"; >> } >> } >> >> It seems that both the "$r->query_length" and the "$h->length" are >> missing when using the tabular format blast results (using either >> BioPerl version). I get confused here because there is also some >> undocumented behavior used in the above script that means a 'hit' >> (apparently) returns the start and end point of the two most extreme >> HSPs (and the length of that range). > > The HSPs are tiled prior to returning values for these methods. The tiling > algorithm works for the most part but still has a few odd issues (one is > reported in bugzilla). I am not familiar with that bit of code so can't > comment, but Sendu or Steve might answer. I'll also log a bug. > I try to stay away from using the various hit methods unless necessary and > usually go with simple HSP coordinates. OK. >> Secondly, I found in all but the default format (using either BioPerl >> version), the "$p->score" is missing (set to an empty string). It >> shows up fine using the default format. The significance and the >> bit-score show up fine... or at least they show up... The values look >> wrong now I come to check. e.g. the score is equal to the bit-score >> when using the default format with version 1.4, and is often smaller >> than the bit-score using the latest version (or is $h->bits not the >> bit score?) > > There is some discussion about this in the mail list archives: > > http://thread.gmane.org/gmane.comp.lang.perl.bio.general/16273/focus=16296 > > NCBI BLAST has the hit summary table reporting the raw score(), whereas in > WU_BLAST the hit table reports bit_score(); in the HSPs I think everything > is defined. 
If you have a minimal hit constructed (data only reported in > the hit table, no HSP data is reported) some may be undefined. The hit > should be calculating the best overall score values from the contained HSPs > and falling back to directly-set hit table data (set as above). It is > possible this may not be happening when parsing other formats so it may be a > bug (and would be worth filing so we can look into it). OK. Also I'll have to double check the actual query report! It could be the problem is there. >> The closest hits in the mailing list that I could find to these probemes >> were: >> >> http://lists.open-bio.org/pipermail/bioperl-l/2002-May/007936.html >> http://lists.open-bio.org/pipermail/bioperl-l/2002-September/009586.html >> >> but I don't think that they are relevant. >> >> Since it comes up here, how is the 'best' HSP defined? it isn't >> documented as far as I can tell. > > 'best' - when comparing HSP data to the summary hit table (in text output > only), the highest scoring HSP represent the hit (highest score/raw_score, > lowest evalue). Which? >> About the documentation... looking here: >> >> http://search.cpan.org/~birney/bioperl-1.4/Bio/SearchIO.pm >> >> >> Several of the structured methods 'blocks' are followed by a "See >> Bio::..." link to other pages in CPAN. However the 'next_result' >> method is followed by a link to >> http://search.cpan.org/~birney/bioperl-1.4/Bio/Root/RootI.pm - I think >> it should be a link to >> http://search.cpan.org/~birney/bioperl-1.4/Bio/Search/Result/ResultI.pm >> >> Also, it would be nice (especially for noobs) if the full list of >> accepted format codes were given on that page. The current text "# >> format can be 'fasta', 'blast', 'exonerate', ..." is extremely >> frustrating for a beginner "... what?!". I now realize that each >> format code is matched by a "Bio::SearchIO::formatcode" module, but I >> didn't know that from reading the above. 
>> >> While I'm at it, on page >> http://search.cpan.org/~birney/bioperl-1.4/Bio/Search/Hit/HitI.pm - >> the phrase "Equivalent to raw_score()" appearing under the heading >> "score" is a broken link. In fact every "See also : $this->method()" >> type link on that page is broken (there are about 25). Also the link >> to "See also : BUGS" is broken. > > The pdoc documentation is better and more up-to-date (unfortunately the > bioperl-1.4 CPAN docs are out-of-date but always come up first, I think b/c > of the stable release status). > >>> User feedback is an integral part of the evolution of this and other >>> Bioperl modules. Send your comments and suggestions preferably to one of the >>> Bioperl mailing lists. Your participation is much appreciated. >> >> Thank you for your participation! I hope the above can help in some >> way, and I hope its not down to me making trivial mistakes! If these >> look like genuine bugs, should I report them on RT? > > No, use the bugzilla set up. We do not use CPAN's RT and generally redirect > any bugs to bugzilla. > >> Out of interest, I did get some fail while testing, specifically (or >> perhaps coincidentally) some related to SearchIO... >> >> ./Build test verbose=1 > test.results.dump &> test.results.dump >> >> ... >> Dan. > > Those are due to the changes I have been making (using svn code is bleeding > edge!). > >> P.S. I've also been attacking the wiki, so please undo any mess that I >> may have made there. Thanks very much for the detailed reply. Overall, would you recommend that I use SVN or 1.4 or 1.5.2? All the best, Dan. >> -- >> http://network.nature.com/profile/dan >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. 
Marie-Claude Hofmann > College of Veterinary Medicine > University of Illinois Urbana-Champaign > > > > > -- http://network.nature.com/profile/dan From johnsonm at gmail.com Mon Nov 10 17:45:59 2008 From: johnsonm at gmail.com (Mark Johnson) Date: Mon, 10 Nov 2008 16:45:59 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> Message-ID: On Mon, Nov 10, 2008 at 3:46 PM, Chris Fields wrote: > Personally, I think if Bio::Graphics remains in bioperl we have to do two > things. We should release the full bioperl-live as-is to CPAN as an > official release (TODO any bugs) ASAP. No RCs; we'll post point releases > along the way for bug fixes (I like the 'release early/release often' > mantra). I can work on this over the next couple of weeks, aiming for > Thanksgiving for a 1.6, but I probably won't get rolling until this weekend > (too much going on this week). We can aim for more regular point releases > then. Agreed. We (the Genome Center at Washington University in St. Louis) switched to a nightly build about a month ago. Prior to that, we had been on 1.5.2. What's in the trunk at the moment may not be perfect, but it's the best to be had anytime soon. Slap the 1.6 sticker on it, tag it, bag it and ship it. > Following that, I think a more stable long-term solution is to split off > some of the non-core-like modules so that we can speed up releases (this has > been discussed in the past, > http://www.bioperl.org/wiki/Proposed_1.6_core_modules). Basically, make a > 'bare-bones' well-tested core containing the base classes and interfaces > that remain stable long-term, such as Bio::Root, Bio::Seq/PrimarySeq, > Bio::SeqFeature::*, with as few dependencies as possible. 
> > Everything else requiring constant maintenance, not actively supported, or > under development would go into a separate monolithic distribution listing > the new core as a dependency; this could feasibly have it's own release > schedule. If we go this route, Bio::Graphics and related could also be in a > second distribution (and thus also on a distinct release schedule). This > could be worked out in a separate subversion directory, so bioperl-live > wouldn't be affected until we switch over. Does that seem feasible? I don't disagree with anything you've said. However, I wonder if maybe there isn't something to be learned from the way the Linux Kernel Development process changed with 1.6 (such that there is no 1.7 development branch)? You're the closest thing we've got to Linus, so you've got my vote for 'benevolent dictator'. From johnsonm at gmail.com Mon Nov 10 17:51:10 2008 From: johnsonm at gmail.com (Mark Johnson) Date: Mon, 10 Nov 2008 16:51:10 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> Message-ID: On Mon, Nov 10, 2008 at 2:25 PM, Lincoln Stein wrote: > Hi All, > > The glacial pace of official bioperl releases is interfering with my ability > to package GBrowse 2.00 into debian and rpm packages. Is there any objection > if I withdraw Bio::Graphics and Bio::DB::SeqFeature from the bioperl > distribution and turn them into independent CPAN modules? How so? I don't disagree with your characterization of the pace of releases, but if 1.5.3 or 1.6 was released tomorrow, would that really solve all your problems? From lincoln.stein at gmail.com Mon Nov 10 17:57:49 2008 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Mon, 10 Nov 2008 17:57:49 -0500 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? 
In-Reply-To:
References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com>
Message-ID: <6dce9a0b0811101457g34fb26b2ib02d98326578d578@mail.gmail.com>

Hi Folks,

Sorry, I sent my letter out while multitasking, which was inappropriate for
such a nuanced subject. Here's my feeling about the issue, which is very
similar to Chris's approach:

Bioperl should be split up into a core API containing Bio::Root, the
interface modules, and some of the really basic modules such as Bio::SeqIO.
There are then a series of separately maintained and released modules that
build on top of BioPerl and use it as a dependency.

I am not in favor of having a "separate monolithic distribution" however.
I'd prefer it to be more like this:

- Bio::Perl -- the core distribution, containing Bio::Root, Bio::Seq,
  Bio::SeqFeature, Bio::SeqIO and Bio::Annotation
- Bio::Align -- alignment support
- Bio::Ontology -- ontology support
- Bio::Microarray -- microarray support
- Bio::PopGen -- population genetics
- Bio::SeqEvolution -- evolutionary biology
- Bio::Structure -- structures
- Bio::Tree -- trees

I wonder how many interdependencies there are to disentangle?

In the immediate future, it'd be great to get the regression tests working
100% and do a CPAN release, but this may be easier said than done.

Lincoln

On Mon, Nov 10, 2008 at 4:46 PM, Chris Fields wrote:

> Lincoln,
>
> I agree about the glacial pace. It's also feeling more and more like only
> a couple of active developers are working on it (so the more that chip in
> the better). Furthermore, the code base is so large now at this point it
> feels like steering an aircraft carrier with an oar and has become very hard
> to work on.
>
> I don't have any objections personally if you want to withdraw
> Bio::Graphics/Bio::DB::SeqFeature, but how much work would that be (scott
> mentions a few issues I see)?
>
> Personally, I think if Bio::Graphics remains in bioperl we have to do two
> things.
We should release the full bioperl-live as-is to CPAN as an > official release (TODO any bugs) ASAP. No RCs; we'll post point releases > along the way for bug fixes (I like the 'release early/release often' > mantra). I can work on this over the next couple of weeks, aiming for > Thanksgiving for a 1.6, but I probably won't get rolling until this weekend > (too much going on this week). We can aim for more regular point releases > then. > > Following that, I think a more stable long-term solution is to split off > some of the non-core-like modules so that we can speed up releases (this has > been discussed in the past, > http://www.bioperl.org/wiki/Proposed_1.6_core_modules). Basically, make a > 'bare-bones' well-tested core containing the base classes and interfaces > that remain stable long-term, such as Bio::Root, Bio::Seq/PrimarySeq, > Bio::SeqFeature::*, with as few dependencies as possible. > > Everything else requiring constant maintenance, not actively supported, or > under development would go into a separate monolithic distribution listing > the new core as a dependency; this could feasibly have it's own release > schedule. If we go this route, Bio::Graphics and related could also be in a > second distribution (and thus also on a distinct release schedule). This > could be worked out in a separate subversion directory, so bioperl-live > wouldn't be affected until we switch over. Does that seem feasible? > > chris > > > On Nov 10, 2008, at 2:25 PM, Lincoln Stein wrote: > > Hi All, >> >> The glacial pace of official bioperl releases is interfering with my >> ability >> to package GBrowse 2.00 into debian and rpm packages. Is there any >> objection >> if I withdraw Bio::Graphics and Bio::DB::SeqFeature from the bioperl >> distribution and turn them into independent CPAN modules? >> >> Thanks, >> >> Lincoln >> >> -- >> Lincoln D. 
Stein >> >> Ontario Institute for Cancer Research >> 101 College St., Suite 800 >> Toronto, ON, Canada M5G0A3 >> 416 673-8514 >> Assistant: Stacey Quinn >> >> Cold Spring Harbor Laboratory >> 1 Bungtown Road >> Cold Spring Harbor, NY 11724 USA >> (516) 367-8380 >> Assistant: Sandra Michelsen >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Marie-Claude Hofmann > College of Veterinary Medicine > University of Illinois Urbana-Champaign > > > > > -- Lincoln D. Stein Ontario Institute for Cancer Research 101 College St., Suite 800 Toronto, ON, Canada M5G0A3 416 673-8514 Assistant: Stacey Quinn Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 USA (516) 367-8380 Assistant: Sandra Michelsen From lincoln.stein at gmail.com Mon Nov 10 18:05:27 2008 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Mon, 10 Nov 2008 18:05:27 -0500 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> Message-ID: <6dce9a0b0811101505t401b9f45h42c0b2bd3dc45453@mail.gmail.com> Hi, I think the problem is that gbrowse is moving very rapidly due to the demand for a large number of new feature types, primarily in population and functional genetics. Every time there is a new feature type, I add a new glyph to Bio::Graphics, and this goes into bioperl-live. So it really makes sense for Bio::Graphics to have its own release schedule -- it depends only on Bio::Root and Bio::SeqFeatureI, so the dependencies are minimum. Alternatively, I could publish a separate package of Bio::Graphics::glyphs as "add ons", but it would still be hard to make bug fixes to core glyphs. 
The issue with Bio::DB::SeqFeature is a little different -- there were a series of bug fixes last summer, and I'd like them to be in an official release. This, and Bio::DB::GFF, are pretty stable, I think. Lincoln On Mon, Nov 10, 2008 at 5:51 PM, Mark Johnson wrote: > On Mon, Nov 10, 2008 at 2:25 PM, Lincoln Stein > wrote: > > Hi All, > > > > The glacial pace of official bioperl releases is interfering with my > ability > > to package GBrowse 2.00 into debian and rpm packages. Is there any > objection > > if I withdraw Bio::Graphics and Bio::DB::SeqFeature from the bioperl > > distribution and turn them into independent CPAN modules? > > How so? I don't disagree with your characterization of the pace of > releases, but if 1.5.3 or 1.6 was released tomorrow, would that really > solve all your problems? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Ontario Institute for Cancer Research 101 College St., Suite 800 Toronto, ON, Canada M5G0A3 416 673-8514 Assistant: Stacey Quinn Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 USA (516) 367-8380 Assistant: Sandra Michelsen From lincoln.stein at gmail.com Mon Nov 10 18:10:09 2008 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Mon, 10 Nov 2008 18:10:09 -0500 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <4918BCEE.80905@sendu.me.uk> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <2c8757af0811101401y4b9c3945m3c4dec6812e9ef5c@mail.gmail.com> <4918BCEE.80905@sendu.me.uk> Message-ID: <6dce9a0b0811101510q78f384e4t46d41419334d15d5@mail.gmail.com> Hi Sendu, Thanks for pointing me to the thread; I'm sorry I missed it at the time. The idea of splitting bioperl into 900 individual modules is a terrible idea but I'm not advocating that. 
I'm just advocating splitting it into a half dozen modules that are divided according to the* de facto *maintainer. For example, Gabriel becomes the maintainer of Bio::PhyloNetwork. Lincoln On Mon, Nov 10, 2008 at 5:59 PM, Sendu Bala wrote: > Dan Bolser wrote: > >> Sorry for the potentially dumb question, but why have an overall >> release cycle at all? >> >> BioPerl is so large and diverse, asking for a new 'release' is almost >> like asking for 'the next release of CPAN'. Can't inter-module >> dependencies just be handled in the same way as in the rest of CPAN, >> with everything released asynchronously? Is BioPerl just too tightly >> woven for that to happen? >> > > Doing that is considered a Bad Idea (TM): > http://www.nntp.perl.org/group/perl.modules/2007/07/msg55160.html > (especially Adam Kennedy's postings of 4/07) > > (It just so happens that we discuss Bio::Graphics as an example of the > problem) > > > As for Lincoln's desire to pull Bio::Graphics out, as Chris described that > was more or less the kind of plan we had in mind. We'd have probably ended > up splitting it out into its own little package so that it could be updated > and released on its own cycle, with the latest version of it always > compatible and tested with the latest core package and any other split-off > Bio package it might depend on. In our plan Bio::Graphics would still have > belonged to Bioperl-ML on CPAN, but I can understand the desire for Lincoln > to have full direct control over those modules again. > -- Lincoln D. 
Stein Ontario Institute for Cancer Research 101 College St., Suite 800 Toronto, ON, Canada M5G0A3 416 673-8514 Assistant: Stacey Quinn Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 USA (516) 367-8380 Assistant: Sandra Michelsen From bix at sendu.me.uk Mon Nov 10 17:59:58 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 10 Nov 2008 22:59:58 +0000 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <2c8757af0811101401y4b9c3945m3c4dec6812e9ef5c@mail.gmail.com> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <2c8757af0811101401y4b9c3945m3c4dec6812e9ef5c@mail.gmail.com> Message-ID: <4918BCEE.80905@sendu.me.uk> Dan Bolser wrote: > Sorry for the potentially dumb question, but why have an overall > release cycle at all? > > BioPerl is so large and diverse, asking for a new 'release' is almost > like asking for 'the next release of CPAN'. Can't inter-module > dependencies just be handled in the same way as in the rest of CPAN, > with everything released asynchronously? Is BioPerl just too tightly > woven for that to happen? Doing that is considered a Bad Idea (TM): http://www.nntp.perl.org/group/perl.modules/2007/07/msg55160.html (especially Adam Kennedy's postings of 4/07) (It just so happens that we discuss Bio::Graphics as an example of the problem) As for Lincoln's desire to pull Bio::Graphics out, as Chris described that was more or less the kind of plan we had in mind. We'd have probably ended up splitting it out into its own little package so that it could be updated and released on its own cycle, with the latest version of it always compatible and tested with the latest core package and any other split-off Bio package it might depend on. In our plan Bio::Graphics would still have belonged to Bioperl-ML on CPAN, but I can understand the desire for Lincoln to have full direct control over those modules again. 
From bix at sendu.me.uk Mon Nov 10 18:46:52 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 10 Nov 2008 23:46:52 +0000 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <6dce9a0b0811101457g34fb26b2ib02d98326578d578@mail.gmail.com> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <6dce9a0b0811101457g34fb26b2ib02d98326578d578@mail.gmail.com> Message-ID: <4918C7EC.7020905@sendu.me.uk> Lincoln Stein wrote: > - Bio::Perl -- the core distribution, containing Bio::Root, Bio::Seq, > BioSeqFeature, Bio::SeqIO and Bio::Annotation > - Bio::Align -- alignment support > - Bio::Ontology -- ontology support > - Bio::Microarray -- microarray support > - Bio::PopGen -- population genetics > - Bio::SeqEvolution -- evolutionary biology > - Bio::Structure -- structures > - Bio::Tree -- trees > > I wonder how many interdependencies there are to disentangle? See discussion here: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/16631/focus=16816 (Once again I use Bio:Graphics as an example during the discussion - popular modules! ;) ) I never made an attempt to disentangle it myself, but hey - pretty picture! :) From charles-listes+bioperl at plessy.org Mon Nov 10 20:19:23 2008 From: charles-listes+bioperl at plessy.org (Charles Plessy) Date: Tue, 11 Nov 2008 10:19:23 +0900 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> Message-ID: <20081111011923.GC25981@kunpuu.plessy.org> Le Mon, Nov 10, 2008 at 03:25:24PM -0500, Lincoln Stein a ?crit : > Hi All, > > The glacial pace of official bioperl releases is interfering with my ability > to package GBrowse 2.00 into debian and rpm packages. 
Is there any objection > if I withdraw Bio::Graphics and Bio::DB::SeqFeature from the bioperl > distribution and turn them into independent CPAN modules? Dear all, Just a word to say that when BioPerl 1.6 is released, the package in Debian unstable/testing can be quickly updated, and a backport can be made for the stable distribution. However, it would make my task easier if the following file conflict were resolved. GBrowse and BioPerl are shipping copies of the same modules: Bio::Graphics::Glyph::ideogram Bio::Graphics::Glyph::heat_map Bio::Graphics::Glyph::heat_map_ideogram This is an obstacle to proper packaging, as on a Debian system two packages are not supposed to deliver the same file. Have a nice day, -- Charles Plessy Debian Med packaging team http://www.debian.org/devel/debian-med Tsurumi, Kanagawa, Japan From cjfields at illinois.edu Mon Nov 10 19:32:23 2008 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 10 Nov 2008 18:32:23 -0600 Subject: [Bioperl-l] Problems with Bio::SearchIO In-Reply-To: <2c8757af0811101429q5abe0a0bqf3670c8178b8003d@mail.gmail.com> References: <2c8757af0811070627t57a96965s4a8c8bdae5fbd095@mail.gmail.com> <2c8757af0811101429q5abe0a0bqf3670c8178b8003d@mail.gmail.com> Message-ID: <341D01B6-8AC6-4D14-9A71-A01F280486E4@illinois.edu> On Nov 10, 2008, at 4:29 PM, Dan Bolser wrote: > 2008/11/7 Chris Fields : >> On Nov 7, 2008, at 8:27 AM, Dan Bolser wrote: >>> ... >>> Looking closer I found that $parser->result_count() only gets set >>> after calling $parser->next_result. Any way to force this? In some >>> Perl objects I've seen a 'parse' method that kicks the object into >>> (silently) calling all its get methods. Is there an equivalent (but >>> apparently undocumented) method? Actually, I think it should kick >>> itself when called... or not? Certainly the docs do not suggest that >>> it won't return the number of results ("Function: Gets the >>> number of >>> Blast results that have been parsed.") So I think this is a bug.
>> >> We could make it so that the result_count() is eager (parses the >> results and >> reports the total back). Not sure, but we could optionally cache the >> already-parsed Result objects (that could run into memory issues if >> one is >> parsing a ton of reports, so it needs to be off by default). > > I see (I think). Anyone first calling result_count() and *then* > iterating over the results is getting a performance hit by effectively > parsing the results twice? I would suggest that you make this function > eager, but document the potential performance issue so that people can > choose not to call it first. However, I don't think I can have > understood correctly. How can its value be set correctly after calling > next() only once? It's highly possible that result_count() is meant to indicate total ResultI iteration parsed up to the point of being called (as opposed to the total number of ResultI), but that isn't made exactly clear. However, judging by the naming of the other Bio::Search methods for total objects (num_hits, num_hsps) I think that's the case. However, if it's meant to be the total number of ResultI then result_count() should be eagerly called. It must essentially run out the iterator and return the total number of results whether we implement caching or not, otherwise it isn't returning the correct value. BTW, resetting the iterator also relies on the input being seekable (which it easily may not), so caching ResultI probably should be made optionally available. >>> ... >>> The closest hits in the mailing list that I could find to these >>> probemes >>> were: >>> >>> http://lists.open-bio.org/pipermail/bioperl-l/2002-May/007936.html >>> http://lists.open-bio.org/pipermail/bioperl-l/2002-September/009586.html >>> >>> but I don't think that they are relevant. >>> >>> Since it comes up here, how is the 'best' HSP defined? it isn't >>> documented as far as I can tell. 
>> >> 'best' - when comparing HSP data to the summary hit table (in text >> output >> only), the highest scoring HSP represent the hit (highest score/ >> raw_score, >> lowest evalue). > > Which? Right now I think it's going by evalue/pvalue, but this is dependent on the BLAST report. >>> About the documentation... looking here: >>> >>> http://search.cpan.org/~birney/bioperl-1.4/Bio/SearchIO.pm >>> >>> >>> Several of the structured methods 'blocks' are followed by a "See >>> Bio::..." link to other pages in CPAN. However the 'next_result' >>> method is followed by a link to >>> http://search.cpan.org/~birney/bioperl-1.4/Bio/Root/RootI.pm - I >>> think >>> it should be a link to >>> http://search.cpan.org/~birney/bioperl-1.4/Bio/Search/Result/ResultI.pm >>> >>> Also, it would be nice (especially for noobs) if the full list of >>> accepted format codes were given on that page. The current text "# >>> format can be 'fasta', 'blast', 'exonerate', ..." is extremely >>> frustrating for a beginner "... what?!". I now realize that each >>> format code is matched by a "Bio::SearchIO::formatcode" module, >>> but I >>> didn't know that from reading the above. >>> >>> While I'm at it, on page >>> http://search.cpan.org/~birney/bioperl-1.4/Bio/Search/Hit/HitI.pm - >>> the phrase "Equivalent to raw_score()" appearing under the heading >>> "score" is a broken link. In fact every "See also : $this->method()" >>> type link on that page is broken (there are about 25). Also the link >>> to "See also : BUGS" is broken. >> >> The pdoc documentation is better and more up-to-date (unfortunately >> the >> bioperl-1.4 CPAN docs are out-of-date but always come up first, I >> think b/c >> of the stable release status). 1.5.2 is also in CPAN and is more up-to-date, but is labeled a dev release so doesn't pop up immediately. >>>> User feedback is an integral part of the evolution of this and >>>> other >>>> Bioperl modules. 
Send your comments and suggestions preferably to >>>> one of the >>>> Bioperl mailing lists. Your participation is much appreciated. >>> >>> Thank you for your participation! I hope the above can help in some >>> way, and I hope its not down to me making trivial mistakes! If these >>> look like genuine bugs, should I report them on RT? >> >> No, use the bugzilla set up. We do not use CPAN's RT and generally >> redirect >> any bugs to bugzilla. >> >>> Out of interest, I did get some fail while testing, specifically (or >>> perhaps coincidentally) some related to SearchIO... >>> >>> ./Build test verbose=1 > test.results.dump &> test.results.dump >>> >>> ... >>> Dan. >> >> Those are due to the changes I have been making (using svn code is >> bleeding >> edge!). >> >>> P.S. I've also been attacking the wiki, so please undo any mess >>> that I >>> may have made there. > > > Thanks very much for the detailed reply. Overall, would you recommend > that I use SVN or 1.4 or 1.5.2? > > All the best, > > Dan. np. Feel free to update the wiki (the more clear the docs are the better). -c From cjfields at illinois.edu Mon Nov 10 22:58:44 2008 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 10 Nov 2008 21:58:44 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <6dce9a0b0811101457g34fb26b2ib02d98326578d578@mail.gmail.com> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <6dce9a0b0811101457g34fb26b2ib02d98326578d578@mail.gmail.com> Message-ID: <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> On Nov 10, 2008, at 4:57 PM, Lincoln Stein wrote: > Hi Folks, > > Sorry, I sent my letter out while multitasking, which was > inappropriate for such a nuanced subject. 
> > Here's my feeling about the issue, which is very similar to Chris's > approach: Bioperl should be split up into a core API containing > Bio::Root, the interface modules, and some of the really basic > modules such as Bio::SeqIO. There are then a series of separately > maintained and released modules that build on top of BioPerl and > using it as a dependency. I am not in favor of having a "separate > monolithic distribution" however. I'd prefer it to be more like this: > > - Bio::Perl -- the core distribution, containing Bio::Root, > Bio::Seq, BioSeqFeature, Bio::SeqIO and Bio::Annotation > - Bio::Align -- alignment support > - Bio::Ontology -- ontology support > - Bio::Microarray -- microarray support > - Bio::PopGen -- population genetics > - Bio::SeqEvolution -- evolutionary biology > - Bio::Structure -- structures > - Bio::Tree -- trees > I wonder how many interdependencies there are to disentangle? > > In the immediate future, it'd be great to get the regression tests > working 100% and do a CPAN release, but this may be easier said than > done. > > Lincoln BTW, I would like to add to that list Bio::Tools (Bio::Tools* related changes suggested recently by Heikki) and Bio::Dev (for in development or untested modules, or experimental modules with an unstable API). I'll work on a few bugs towards getting 1.6 released with a rough timeline for end of Nov, maybe even Thanksgiving. We can work on splitting things up after that.
-c From bix at sendu.me.uk Tue Nov 11 02:59:13 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 11 Nov 2008 07:59:13 +0000 Subject: [Bioperl-l] Problems with Bio::SearchIO In-Reply-To: <341D01B6-8AC6-4D14-9A71-A01F280486E4@illinois.edu> References: <2c8757af0811070627t57a96965s4a8c8bdae5fbd095@mail.gmail.com> <2c8757af0811101429q5abe0a0bqf3670c8178b8003d@mail.gmail.com> <341D01B6-8AC6-4D14-9A71-A01F280486E4@illinois.edu> Message-ID: <49193B51.1050100@sendu.me.uk> Chris Fields wrote: > > On Nov 10, 2008, at 4:29 PM, Dan Bolser wrote: > >> 2008/11/7 Chris Fields : >>> On Nov 7, 2008, at 8:27 AM, Dan Bolser wrote: >>>> ... >>>> Looking closer I found that $parser->result_count() only gets set >>>> after calling $parser->next_result. Any way to force this? In some >>>> Perl objects I've seen a 'parse' method that kicks the object into >>>> (silently) calling all its get methods. Is there an equivalent (but >>>> apparently undocumented) method? Actually, I think it should kick >>>> itself when called... or not? Certainly the docs do not suggest that >>>> is won't return a the number of results ("Function: Gets the number of >>>> Blast results that have been parsed.") So I think this is a bug. >>> >>> We could make it so that the result_count() is eager (parses the >>> results and >>> reports the total back). Not sure, but we could optionally cache the >>> already-parsed Result objects (that could run into memory issues if >>> one is >>> parsing a ton of reports, so it needs to be off by default). >> >> I see (I think). Anyone first calling result_count() and *then* >> iterating over the results is getting a performance hit by effectively >> parsing the results twice? I would suggest that you make this function >> eager, but document the potential performance issue so that people can >> choose not to call it first. However, I don't think I can have >> understood correctly. How can its value be set correctly after calling >> next() only once? 
> > It's highly possible that result_count() is meant to indicate total > ResultI iteration parsed up to the point of being called (as opposed to > the total number of ResultI), but that isn't made exactly clear. Yes, this is the case. I always thought that was pretty unambiguous from the function description. "the number of Blast results that have been parsed". Not "the number of Blast results". From bix at sendu.me.uk Tue Nov 11 03:05:18 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 11 Nov 2008 08:05:18 +0000 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <6dce9a0b0811101457g34fb26b2ib02d98326578d578@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> Message-ID: <49193CBE.4000800@sendu.me.uk> Chris Fields wrote: > I'll work on a few bugs towards getting 1.6 released with a rough > timeline for end of Nov, maybe even Thanksgiving. We can work on > splitting things up after that. I don't think there's any need to put it out as 1.6 if it doesn't satisfy the requirements we planned for 1.6. Just make it another dev point release: 1.5.3. If the issue is that you want it showing up during a CPAN search, just don't give it a dev version number (1.5.3_001). From heikki at sanbi.ac.za Tue Nov 11 03:38:36 2008 From: heikki at sanbi.ac.za (Heikki Lehvaslaiho) Date: Tue, 11 Nov 2008 10:38:36 +0200 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <49193CBE.4000800@sendu.me.uk> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> Message-ID: <200811111038.37572.heikki@sanbi.ac.za> Either way (1.6 or 1.5.3), the main thing is to get a release out soon. 
I agree with the sentiments expressed here to split bioperl into smallish logical packages after the next release. Chris, looks like you have volunteered to be the release manager. As such you have quite a lot of responsibility ( :) ) as well as power. Please start posting to the list at least weekly updates what needs to be done and assigning tasks to people who you know can do part of the job. Also, a clear published time line helps getting things done. Let us know what you want us to do. I am waiting instructions. -Heikki On Tuesday 11 November 2008 10:05:18 Sendu Bala wrote: > Chris Fields wrote: > > I'll work on a few bugs towards getting 1.6 released with a rough > > timeline for end of Nov, maybe even Thanksgiving. We can work on > > splitting things up after that. > > I don't think there's any need to put it out as 1.6 if it doesn't > satisfy the requirements we planned for 1.6. Just make it another dev > point release: 1.5.3. If the issue is that you want it showing up during > a CPAN search, just don't give it a dev version number (1.5.3_001). > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho _/ _/ _/ SANBI, South African National Bioinformatics Institute _/ _/ _/ University of Western Cape, South Africa _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 ___ _/_/_/_/_/________________________________________________________ From jay at jays.net Tue Nov 11 08:33:40 2008 From: jay at jays.net (Jay Hannah) Date: Tue, 11 Nov 2008 07:33:40 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? 
In-Reply-To: <200811111038.37572.heikki@sanbi.ac.za> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> Message-ID: <4F1555BB-41B6-47FE-8A39-2857252A20F4@jays.net> On Nov 11, 2008, at 2:38 AM, Heikki Lehvaslaiho wrote: > Chris, looks like you have volunteered to be the release manager. > As such you > have quite a lot of responsibility ( :) ) as well as power. Please > start > posting to the list at least weekly updates what needs to be done and > assigning tasks to people who you know can do part of the job. > > Also, a clear published time line helps getting things done. > > Let us know what you want us to do. I am waiting instructions. I hereby volunteer for the strike team. Saluting, j http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah http://www.bioperl.org/wiki/User:Jhannah Aspiring BioPerl tactical demolitions expert From cain.cshl at gmail.com Tue Nov 11 09:29:07 2008 From: cain.cshl at gmail.com (Scott Cain) Date: Tue, 11 Nov 2008 09:29:07 -0500 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <20081111011923.GC25981@kunpuu.plessy.org> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <20081111011923.GC25981@kunpuu.plessy.org> Message-ID: <536f21b00811110629w34f03fbey4d7c09a90ba141b3@mail.gmail.com> Hi Charles, Those glyphs have already been removed from the main development branch (the 2.0 branch) and Lincoln has agreed that it should be removed from the stable (1.70) branch as well. I'll do it today. Scott On Mon, Nov 10, 2008 at 8:19 PM, Charles Plessy wrote: > Le Mon, Nov 10, 2008 at 03:25:24PM -0500, Lincoln Stein a ?crit : >> Hi All, >> >> The glacial pace of official bioperl releases is interfering with my ability >> to package GBrowse 2.00 into debian and rpm packages. 
Is there any objection >> if I withdraw Bio::Graphics and Bio::DB::SeqFeature from the bioperl >> distribution and turn them into independent CPAN modules? > > Dear all, > > Just a word to say that when BioPerl 1.6 is released, the package in Debian > unstable/testing can be quickly updated, and a backport can be made for the > stable distribution. > > However, it would make my task easier if the following file conflict would be > resloved. GBrowse and BioPerl are shiping copies of the same modules: > > Bio::Graphics::Glyph::ideogram > Bio::Graphics::Glyph::heat_map > Bio::Graphics::Glyph::heat_map_ideogram > > This is an obstacle to proper packaging, as on Debian system two packages are > not supposed to deliver the same file. > > Have a nice day, > > -- > Charles Plessy > Debian Med packaging team > http://www.debian.org/devel/debian-med > Tsurumi, Kanagawa, Japan > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research From cjfields at illinois.edu Tue Nov 11 09:47:15 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 11 Nov 2008 08:47:15 -0600 Subject: [Bioperl-l] Problems with Bio::SearchIO In-Reply-To: <49193B51.1050100@sendu.me.uk> References: <2c8757af0811070627t57a96965s4a8c8bdae5fbd095@mail.gmail.com> <2c8757af0811101429q5abe0a0bqf3670c8178b8003d@mail.gmail.com> <341D01B6-8AC6-4D14-9A71-A01F280486E4@illinois.edu> <49193B51.1050100@sendu.me.uk> Message-ID: <19492F91-306D-49DB-8320-EA52F3BCC36D@illinois.edu> On Nov 11, 2008, at 1:59 AM, Sendu Bala wrote: > Chris Fields wrote: >> On Nov 10, 2008, at 4:29 PM, Dan Bolser wrote: >>> 2008/11/7 Chris Fields : >>>> On Nov 7, 2008, at 8:27 AM, Dan Bolser wrote: >>>>> ... 
>>>>> Looking closer I found that $parser->result_count() only gets set >>>>> after calling $parser->next_result. Any way to force this? In some >>>>> Perl objects I've seen a 'parse' method that kicks the object into >>>>> (silently) calling all its get methods. Is there an equivalent >>>>> (but >>>>> apparently undocumented) method? Actually, I think it should kick >>>>> itself when called... or not? Certainly the docs do not suggest >>>>> that >>>>> is won't return a the number of results ("Function: Gets the >>>>> number of >>>>> Blast results that have been parsed.") So I think this is a bug. >>>> >>>> We could make it so that the result_count() is eager (parses the >>>> results and >>>> reports the total back). Not sure, but we could optionally cache >>>> the >>>> already-parsed Result objects (that could run into memory issues >>>> if one is >>>> parsing a ton of reports, so it needs to be off by default). >>> >>> I see (I think). Anyone first calling result_count() and *then* >>> iterating over the results is getting a performance hit by >>> effectively >>> parsing the results twice? I would suggest that you make this >>> function >>> eager, but document the potential performance issue so that people >>> can >>> choose not to call it first. However, I don't think I can have >>> understood correctly. How can its value be set correctly after >>> calling >>> next() only once? >> It's highly possible that result_count() is meant to indicate total >> ResultI iteration parsed up to the point of being called (as >> opposed to the total number of ResultI), but that isn't made >> exactly clear. > > Yes, this is the case. I always thought that was pretty unambiguous > from the function description. "the number of Blast results that > have been parsed". Not "the number of Blast results". We'll leave as is then and try implementing it blastxml. 
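For anyone following the result_count() thread, the "parsed so far" reading can be shown with a small language-neutral sketch. This is a toy in Python, not the Bio::SearchIO code; the class name LazyReportParser and the sample records are made up for illustration, and only the method names next_result and result_count are borrowed from the discussion above.

```python
class LazyReportParser:
    """Toy stand-in for a streaming report parser (NOT the Bio::SearchIO API)."""

    def __init__(self, records):
        # Records are consumed lazily, one per next_result() call.
        self._records = iter(records)
        self._parsed = 0

    def next_result(self):
        """Return the next record, or None once the stream is exhausted."""
        try:
            record = next(self._records)
        except StopIteration:
            return None
        self._parsed += 1
        return record

    def result_count(self):
        """Number of records parsed so far, not the total in the stream."""
        return self._parsed


# Hypothetical three-record report.
parser = LazyReportParser(["result_a", "result_b", "result_c"])

count_before = parser.result_count()      # nothing parsed yet
first = parser.next_result()
count_after_one = parser.result_count()   # one record parsed

# Run out the iterator; only now does the count equal the total.
while parser.next_result() is not None:
    pass
count_at_end = parser.result_count()

print(count_before, count_after_one, count_at_end)
```

Under this reading the count only matches the total once the iterator has been run out, which is why an eager version would have to parse everything up front (and either cache the results or rely on seekable input to start over).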
chris From lincoln.stein at gmail.com Tue Nov 11 10:24:30 2008 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 11 Nov 2008 10:24:30 -0500 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <20081111011923.GC25981@kunpuu.plessy.org> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <20081111011923.GC25981@kunpuu.plessy.org> Message-ID: <6dce9a0b0811110724n3c74670fsd7dc1fdac5779b2d@mail.gmail.com> Actually, it was while I was fixing these (and other) file overlap problems that I became concerned about the difficulty in updating bioperl. Lincoln On Mon, Nov 10, 2008 at 8:19 PM, Charles Plessy < charles-listes+bioperl at plessy.org >wrote: > Le Mon, Nov 10, 2008 at 03:25:24PM -0500, Lincoln Stein a ?crit : > > Hi All, > > > > The glacial pace of official bioperl releases is interfering with my > ability > > to package GBrowse 2.00 into debian and rpm packages. Is there any > objection > > if I withdraw Bio::Graphics and Bio::DB::SeqFeature from the bioperl > > distribution and turn them into independent CPAN modules? > > Dear all, > > Just a word to say that when BioPerl 1.6 is released, the package in Debian > unstable/testing can be quickly updated, and a backport can be made for the > stable distribution. > > However, it would make my task easier if the following file conflict would > be > resloved. GBrowse and BioPerl are shiping copies of the same modules: > > Bio::Graphics::Glyph::ideogram > Bio::Graphics::Glyph::heat_map > Bio::Graphics::Glyph::heat_map_ideogram > > This is an obstacle to proper packaging, as on Debian system two packages > are > not supposed to deliver the same file. 
> > Have a nice day, > > -- > Charles Plessy > Debian Med packaging team > http://www.debian.org/devel/debian-med > Tsurumi, Kanagawa, Japan > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Ontario Institute for Cancer Research 101 College St., Suite 800 Toronto, ON, Canada M5G0A3 416 673-8514 Assistant: Stacey Quinn Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 USA (516) 367-8380 Assistant: Sandra Michelsen From cjfields at illinois.edu Tue Nov 11 10:47:32 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 11 Nov 2008 09:47:32 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <200811111038.37572.heikki@sanbi.ac.za> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> Message-ID: <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> Heikki, I'll volunteer to do this. I think this should be a 1.6 release. Users have been screaming for a 'stable' release for years now, and everything on trunk is definitely more stable than 1.4, so I say we should (as Mark Johnson says) tag it, bag it and ship it. However, it should be the final 'full' bioperl release prior to splitting things up. Much of the stuff proposed on the release schedule for 1.6 can wait until we can decide what to split off and who can maintain everything. Makes sense to put this off until then anyway if you read the list, at least to me: http://www.bioperl.org/wiki/Release_Schedule Simple bug fixes after the split can be made on the 1.6 branch and submitted to CPAN as point releases on a regular basis. If needed I can work on maintaining the 1.6 releases until we switch over to the split distribution structure. 
We can discuss other issues (how to tie in separate release versions to 'core', etc) along the way. I'll try to come up with something by this weekend. chris On Nov 11, 2008, at 2:38 AM, Heikki Lehvaslaiho wrote: > Either way (1.6 or 1.5.3), the main thing is to get a release out > soon. I > agree with the sentiments expressed here to split bioperl into > smallish > logical packages after the next release. > > Chris, looks like you have volunteered to be the release manager. As > such you > have quite a lot of responsibility ( :) ) as well as power. Please > start > posting to the list at least weekly updates what needs to be done and > assigning tasks to people who you know can do part of the job. > > Also, a clear published time line helps getting things done. > > Let us know what you want us to do. I am waiting instructions. > > -Heikki > > On Tuesday 11 November 2008 10:05:18 Sendu Bala wrote: >> Chris Fields wrote: >>> I'll work on a few bugs towards getting 1.6 released with a rough >>> timeline for end of Nov, maybe even Thanksgiving. We can work on >>> splitting things up after that. >> >> I don't think there's any need to put it out as 1.6 if it doesn't >> satisfy the requirements we planned for 1.6. Just make it another dev >> point release: 1.5.3. If the issue is that you want it showing up >> during >> a CPAN search, just don't give it a dev version number (1.5.3_001). 
>> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > ______ _/ _/_____________________________________________________ > _/ _/ > _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za > _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho > _/ _/ _/ SANBI, South African National Bioinformatics Institute > _/ _/ _/ University of Western Cape, South Africa > _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 > ___ _/_/_/_/_/________________________________________________________ > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From lincoln.stein at gmail.com Tue Nov 11 11:00:56 2008 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 11 Nov 2008 11:00:56 -0500 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> Message-ID: <6dce9a0b0811110800r386b90fcv3a0ab7f83de45fad@mail.gmail.com> Hi Chris, Thanks so much for volunteering to take this on!!!! Lincoln On Tue, Nov 11, 2008 at 10:47 AM, Chris Fields wrote: > Heikki, > > I'll volunteer to do this. I think this should be a 1.6 release. Users > have been screaming for a 'stable' release for years now, and everything on > trunk is definitely more stable than 1.4, so I say we should (as Mark > Johnson says) tag it, bag it and ship it. 
However, it should be the final > 'full' bioperl release prior to splitting things up. Much of the stuff > proposed on the release schedule for 1.6 can wait until we can decide what > to split off and who can maintain everything. Makes sense to put this off > until then anyway if you read the list, at least to me: > > http://www.bioperl.org/wiki/Release_Schedule > > Simple bug fixes after the split can be made on the 1.6 branch and > submitted to CPAN as point releases on a regular basis. If needed I can > work on maintaining the 1.6 releases until we switch over to the split > distribution structure. We can discuss other issues (how to tie in separate > release versions to 'core', etc) along the way. > > I'll try to come up with something by this weekend. > > chris > > On Nov 11, 2008, at 2:38 AM, Heikki Lehvaslaiho wrote: > > Either way (1.6 or 1.5.3), the main thing is to get a release out soon. I >> agree with the sentiments expressed here to split bioperl into smallish >> logical packages after the next release. >> >> Chris, looks like you have volunteered to be the release manager. As such >> you >> have quite a lot of responsibility ( :) ) as well as power. Please start >> posting to the list at least weekly updates what needs to be done and >> assigning tasks to people who you know can do part of the job. >> >> Also, a clear published time line helps getting things done. >> >> Let us know what you want us to do. I am waiting instructions. >> >> -Heikki >> >> On Tuesday 11 November 2008 10:05:18 Sendu Bala wrote: >> >>> Chris Fields wrote: >>> >>>> I'll work on a few bugs towards getting 1.6 released with a rough >>>> timeline for end of Nov, maybe even Thanksgiving. We can work on >>>> splitting things up after that. >>>> >>> >>> I don't think there's any need to put it out as 1.6 if it doesn't >>> satisfy the requirements we planned for 1.6. Just make it another dev >>> point release: 1.5.3. 
If the issue is that you want it showing up during >>> a CPAN search, just don't give it a dev version number (1.5.3_001). >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> -- >> ______ _/ _/_____________________________________________________ >> _/ _/ >> _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za >> _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho >> _/ _/ _/ SANBI, South African National Bioinformatics Institute >> _/ _/ _/ University of Western Cape, South Africa >> _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 >> ___ _/_/_/_/_/________________________________________________________ >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Marie-Claude Hofmann > College of Veterinary Medicine > University of Illinois Urbana-Champaign > > > > > -- Lincoln D. Stein Ontario Institute for Cancer Research 101 College St., Suite 800 Toronto, ON, Canada M5G0A3 416 673-8514 Assistant: Stacey Quinn Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 USA (516) 367-8380 Assistant: Sandra Michelsen From cjfields at illinois.edu Tue Nov 11 11:20:12 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 11 Nov 2008 10:20:12 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? 
In-Reply-To: <6dce9a0b0811110800r386b90fcv3a0ab7f83de45fad@mail.gmail.com> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> <6dce9a0b0811110800r386b90fcv3a0ab7f83de45fad@mail.gmail.com> Message-ID: <229CDDF2-F0D0-421F-BB54-CF4B15C8B82D@illinois.edu> No problem. I'm still aiming for sooner rather than later (end of november), with the intent of fixing bugs on the 1.6 branch and having regular point releases. chris On Nov 11, 2008, at 10:00 AM, Lincoln Stein wrote: > Hi Chris, > > Thanks so much for volunteering to take this on!!!! > > Lincoln > > On Tue, Nov 11, 2008 at 10:47 AM, Chris Fields > wrote: > >> Heikki, >> >> I'll volunteer to do this. I think this should be a 1.6 release. >> Users >> have been screaming for a 'stable' release for years now, and >> everything on >> trunk is definitely more stable than 1.4, so I say we should (as Mark >> Johnson says) tag it, bag it and ship it. However, it should be >> the final >> 'full' bioperl release prior to splitting things up. Much of the >> stuff >> proposed on the release schedule for 1.6 can wait until we can >> decide what >> to split off and who can maintain everything. Makes sense to put >> this off >> until then anyway if you read the list, at least to me: >> >> http://www.bioperl.org/wiki/Release_Schedule >> >> Simple bug fixes after the split can be made on the 1.6 branch and >> submitted to CPAN as point releases on a regular basis. If needed >> I can >> work on maintaining the 1.6 releases until we switch over to the >> split >> distribution structure. We can discuss other issues (how to tie in >> separate >> release versions to 'core', etc) along the way. >> >> I'll try to come up with something by this weekend. 
>> >> chris >> >> On Nov 11, 2008, at 2:38 AM, Heikki Lehvaslaiho wrote: >> >> Either way (1.6 or 1.5.3), the main thing is to get a release out >> soon. I >>> agree with the sentiments expressed here to split bioperl into >>> smallish >>> logical packages after the next release. >>> >>> Chris, looks like you have volunteered to be the release manager. >>> As such >>> you >>> have quite a lot of responsibility ( :) ) as well as power. Please >>> start >>> posting to the list at least weekly updates what needs to be done >>> and >>> assigning tasks to people who you know can do part of the job. >>> >>> Also, a clear published time line helps getting things done. >>> >>> Let us know what you want us to do. I am waiting instructions. >>> >>> -Heikki >>> >>> On Tuesday 11 November 2008 10:05:18 Sendu Bala wrote: >>> >>>> Chris Fields wrote: >>>> >>>>> I'll work on a few bugs towards getting 1.6 released with a rough >>>>> timeline for end of Nov, maybe even Thanksgiving. We can work on >>>>> splitting things up after that. >>>>> >>>> >>>> I don't think there's any need to put it out as 1.6 if it doesn't >>>> satisfy the requirements we planned for 1.6. Just make it another >>>> dev >>>> point release: 1.5.3. If the issue is that you want it showing up >>>> during >>>> a CPAN search, just don't give it a dev version number (1.5.3_001). 
>>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> >>> -- >>> ______ _/ _/ >>> _____________________________________________________ >>> _/ _/ >>> _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za >>> _/_/_/_/_/ Senior Scientist skype: heikki_lehvaslaiho >>> _/ _/ _/ SANBI, South African National Bioinformatics Institute >>> _/ _/ _/ University of Western Cape, South Africa >>> _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 >>> ___ _/_/_/_/_/ >>> ________________________________________________________ >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Marie-Claude Hofmann >> College of Veterinary Medicine >> University of Illinois Urbana-Champaign >> >> >> >> >> > > > -- > Lincoln D. Stein > > Ontario Institute for Cancer Research > 101 College St., Suite 800 > Toronto, ON, Canada M5G0A3 > 416 673-8514 > Assistant: Stacey Quinn > > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 USA > (516) 367-8380 > Assistant: Sandra Michelsen > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From lincoln.stein at gmail.com Tue Nov 11 11:52:06 2008 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 11 Nov 2008 11:52:06 -0500 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? 
In-Reply-To: <4919B4B8.5050307@sendu.me.uk> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> <4919B4B8.5050307@sendu.me.uk> Message-ID: <6dce9a0b0811110852y4ab5a5c2pe463012333d1804c@mail.gmail.com> Oh, let's not get stuck in a war of semantics. The point is that the 1.5 series works very well and that all of us would rather use it than the "stable" 1.4 release (correct me if I'm wrong!). "Stable" implies that we are supporting the release, but in fact I suspect that most of us respond to bug reports on 1.4 by asking people to try 1.5.2 or even bioperl-live. Lincoln On Tue, Nov 11, 2008 at 11:37 AM, Sendu Bala wrote: > Chris Fields wrote: > >> I'll volunteer to do this. I think this should be a 1.6 release. Users >> have been screaming for a 'stable' release for years now, and everything on >> trunk is definitely more stable than 1.4, >> > > Well, again, I don't see the value in calling it 1.6. Yes people want a > stable release, but calling it 1.6 doesn't make it stable. Doing the things > in the plan for 1.6 makes it stable. What you're proposing is to just lie to > everyone - "You want 'stable'? Here, have this thing I decided to label as > 'stable'!" It's very wrong-headed in my view. > > Do we really want all those half-tested, half-thought-out APIs that may be > hanging around to become official and therefore need to support them and > make their proper replacements backwards compatible come 1.7? > > But ultimately it's just semantics so I won't bring it up again. I suppose > any issues that arise can be solved with a wiki update explaining that > 'stable' doesn't really mean stable, or that 1.6 wasn't a stable release, or > that our numbering scheme no longer has any particular meaning (it doesn't > have to, after all). > -- Lincoln D. 
Stein Ontario Institute for Cancer Research 101 College St., Suite 800 Toronto, ON, Canada M5G0A3 416 673-8514 Assistant: Stacey Quinn Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 USA (516) 367-8380 Assistant: Sandra Michelsen From bix at sendu.me.uk Tue Nov 11 11:37:12 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 11 Nov 2008 16:37:12 +0000 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> Message-ID: <4919B4B8.5050307@sendu.me.uk> Chris Fields wrote: > I'll volunteer to do this. I think this should be a 1.6 release. Users > have been screaming for a 'stable' release for years now, and everything > on trunk is definitely more stable than 1.4, Well, again, I don't see the value in calling it 1.6. Yes people want a stable release, but calling it 1.6 doesn't make it stable. Doing the things in the plan for 1.6 makes it stable. What you're proposing is to just lie to everyone - "You want 'stable'? Here, have this thing I decided to label as 'stable'!" It's very wrong-headed in my view. Do we really want all those half-tested, half-thought-out APIs that may be hanging around to become official and therefore need to support them and make their proper replacements backwards compatible come 1.7? But ultimately it's just semantics so I won't bring it up again. I suppose any issues that arise can be solved with a wiki update explaining that 'stable' doesn't really mean stable, or that 1.6 wasn't a stable release, or that our numbering scheme no longer has any particular meaning (it doesn't have to, after all). 
From cjfields at illinois.edu Tue Nov 11 13:05:47 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 11 Nov 2008 12:05:47 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <4919B4B8.5050307@sendu.me.uk> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> <4919B4B8.5050307@sendu.me.uk> Message-ID: <4CA2CB1B-8DDC-4DB7-8B80-E490C1F6D4EB@illinois.edu> On Nov 11, 2008, at 10:37 AM, Sendu Bala wrote: > Chris Fields wrote: >> I'll volunteer to do this. I think this should be a 1.6 release. >> Users have been screaming for a 'stable' release for years now, and >> everything on trunk is definitely more stable than 1.4, > > Well, again, I don't see the value in calling it 1.6. Yes people > want a stable release, but calling it 1.6 doesn't make it stable. > Doing the things in the plan for 1.6 makes it stable. What you're > proposing is to just lie to everyone - "You want 'stable'? Here, > have this thing I decided to label as 'stable'!" It's very wrong- > headed in my view. Maybe it's just me, but I don't think having a 1.5 point release solves the perception issue we have been facing over the years after the 1.4 release, i.e. bioperl is now in a constant endless 'beta' release state, even though code on the main trunk is considerably more stable than past releases. It doesn't help that perception when the period of time in between simple point releases (1.5.1 to 1.5.2, for instance) has now extended way beyond what used to be the release period between major releases. For instance, some sysadmins see 1.5.x as a developer series (understandably so as it is in our own documentation and FAQ). Ergo they consider it implicitly 'unstable', so most refuse to install it (even though we know better).
Changing the wiki documentation won't immediately help that perception; we have all indicated that 1.5 was a dev release at one time or another, and old documentation floating around on CPAN or the web doesn't help. Overall I think we're essentially on the same page. I think the perception that bioperl is 'dev' is what really needs to be changed, but I also think the best short-term solution for Lincoln and the bioperl community is to release something a consensus of users (us, world at large) consider stable, and according to current documentation that would be 1.6; we can make regular point releases for that one on a branch until the next major release. No, it isn't perfect (we don't accomplish everything we set out to do), but it works. We can then work on a better long-term solution, which is to change the perception that the code is 'unstable' or 'beta.' That will come down the road and will take more time and effort. > Do we really want all those half-tested, half-thought-out APIs that > may be hanging around to become official and therefore need to > support them and make their proper replacements backwards compatible > come 1.7? I'm not following you. There are a couple of exceptions but overall the core API (PrimarySeqI, SeqI, SeqFeatureI, AlignI, etc) has remained fairly stable for quite a while now, even with some significant behind-the-scenes changes. Is there something you don't like about any particular API? Also, there is nothing stopping developers from trying out a new and possibly better ways of doing things; you have demonstrated that yourself with PullParserI. As for backward compatibility, I don't have a problem breaking it if it needs to be broken (i.e. if it makes sense, such as renaming methods for consistency). A simple deprecation cycle for old APIs or methods is par for the course with any software project, and we have the deprecation tools in place in Bio::Root::* for helping accomplish that. 
> But ultimately it's just semantics so I won't bring it up again. I > suppose any issues that arise can be solved with a wiki update > explaining that 'stable' doesn't really mean stable, or that 1.6 > wasn't a stable release, or that our numbering scheme no longer has > any particular meaning (it doesn't have to, after all). I agree, but until the word gets out we should move forward with what most sysadmins would consider 'stable', which to me is 1.6. chris From bix at sendu.me.uk Tue Nov 11 13:17:15 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 11 Nov 2008 18:17:15 +0000 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <4CA2CB1B-8DDC-4DB7-8B80-E490C1F6D4EB@illinois.edu> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> <4919B4B8.5050307@sendu.me.uk> <4CA2CB1B-8DDC-4DB7-8B80-E490C1F6D4EB@illinois.edu> Message-ID: <4919CC2B.1070503@sendu.me.uk> Chris Fields wrote: > but until the word gets out we should move forward with what most > sysadmins would consider 'stable', which to me is 1.6. OK, that makes sense. I'll chime in with the others and thank you for spearheading this, and of course consider me on the list of people willing to help out. Provide instructions/demands! :) From cjfields at illinois.edu Tue Nov 11 13:23:27 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 11 Nov 2008 12:23:27 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution?
In-Reply-To: <6dce9a0b0811110852y4ab5a5c2pe463012333d1804c@mail.gmail.com> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> <4919B4B8.5050307@sendu.me.uk> <6dce9a0b0811110852y4ab5a5c2pe463012333d1804c@mail.gmail.com> Message-ID: Lincoln, Regardless of what we call it (1.5.3, 1.6, 'biofoo'), I'll try to hammer something out soonish, hopefully by Sunday. chris On Nov 11, 2008, at 10:52 AM, Lincoln Stein wrote: > Oh, let's not get stuck in a war of semantics. The point is that the > 1.5 > series works very well and that all of us would rather use it than the > "stable" 1.4 release (correct me if I'm wrong!). "Stable" implies > that we > are supporting the release, but in fact I suspect that most of us > respond to > bug reports on 1.4 by asking people to try 1.5.2 or even bioperl-live. > > Lincoln > > On Tue, Nov 11, 2008 at 11:37 AM, Sendu Bala wrote: > >> Chris Fields wrote: >> >>> I'll volunteer to do this. I think this should be a 1.6 release. >>> Users >>> have been screaming for a 'stable' release for years now, and >>> everything on >>> trunk is definitely more stable than 1.4, >>> >> >> Well, again, I don't see the value in calling it 1.6. Yes people >> want a >> stable release, but calling it 1.6 doesn't make it stable. Doing >> the things >> in the plan for 1.6 makes it stable. What you're proposing is to >> just lie to >> everyone - "You want 'stable'? Here, have this thing I decided to >> label as >> 'stable'!" It's very wrong-headed in my view. >> >> Do we really want all those half-tested, half-thought-out APIs that >> may be >> hanging around to become official and therefore need to support >> them and >> make their proper replacements backwards compatible come 1.7? >> >> But ultimately it's just semantics so I won't bring it up again. 
I >> suppose >> any issues that arise can be solved with a wiki update explaining >> that >> 'stable' doesn't really mean stable, or that 1.6 wasn't a stable >> release, or >> that our numbering scheme no longer has any particular meaning (it >> doesn't >> have to, after all). >> > > > > -- > Lincoln D. Stein > > Ontario Institute for Cancer Research > 101 College St., Suite 800 > Toronto, ON, Canada M5G0A3 > 416 673-8514 > Assistant: Stacey Quinn > > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 USA > (516) 367-8380 > Assistant: Sandra Michelsen > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From johnsonm at gmail.com Tue Nov 11 13:31:33 2008 From: johnsonm at gmail.com (Mark Johnson) Date: Tue, 11 Nov 2008 12:31:33 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <4919B4B8.5050307@sendu.me.uk> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> <4919B4B8.5050307@sendu.me.uk> Message-ID: On Tue, Nov 11, 2008 at 10:37 AM, Sendu Bala wrote: > Well, again, I don't see the value in calling it 1.6. Yes people want a > stable release, but calling it 1.6 doesn't make it stable. Doing the things > in the plan for 1.6 makes it stable. What you're proposing is to just lie to > everyone - "You want 'stable'? Here, have this thing I decided to label as > 'stable'!" It's very wrong-headed in my view. I don't see why it has to be advertised as 'stable'. 
Calling it 1.6 is more of an acceptance of the reality of the present situation than any kind of statement of quality. 1.4 is an antique. An unsupported antique. People have been told to use 1.5.X for years, even though it's been advertised as an 'unstable' or 'developer' release. Slapping a 1.6 sticker on the current trunk signals that it is ok to use it, it's the best we've got. It's also a promise about the future. If you ask for help with this branch, we won't tell you to use something else. Well, we might ask you to try the latest point release, but we won't tell you to go pick up this 'developer' release and use that. > Do we really want all those half-tested, half-thought-out APIs that may be > hanging around to become official and therefore need to support them and > make their proper replacements backwards compatible come 1.7? Any 'open source' or 'free software' project run by unpaid volunteers that isn't making regular releases is either a dead project, or rapidly on the way to becoming one. I think we're down to run with what we've got or close up shop. I recommend the former. If we build up some momentum, get the release pace back up, that might actually attract more developers and more interest. Maybe we'll be able to make 1.8 what everybody hoped 1.6 would be. > But ultimately it's just semantics so I won't bring it up again. I suppose > any issues that arise can be solved with a wiki update explaining that > 'stable' doesn't really mean stable, or that 1.6 wasn't a stable release, or > that our numbering scheme no longer has any particular meaning (it doesn't > have to, after all). I don't really want to debate this endlessly, either. It's a waste of time and energy that could be better spent elsewhere. Probably the right word for 1.6 is 'supported'.
From David.Messina at sbc.su.se Tue Nov 11 13:59:04 2008 From: David.Messina at sbc.su.se (Dave Messina) Date: Tue, 11 Nov 2008 19:59:04 +0100 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> <4919B4B8.5050307@sendu.me.uk> <6dce9a0b0811110852y4ab5a5c2pe463012333d1804c@mail.gmail.com> Message-ID: <628aabb70811111059g4508128fkb31d11c14894c00e@mail.gmail.com> > > Regardless of what we call it (1.5.3, 1.6, 'biofoo'), > Please, for the love of all that is holy, call it 1.6. And count me at your disposal for prepping the release. Dave From cjfields at illinois.edu Tue Nov 11 14:14:56 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 11 Nov 2008 13:14:56 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <628aabb70811111059g4508128fkb31d11c14894c00e@mail.gmail.com> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> <4919B4B8.5050307@sendu.me.uk> <6dce9a0b0811110852y4ab5a5c2pe463012333d1804c@mail.gmail.com> <628aabb70811111059g4508128fkb31d11c14894c00e@mail.gmail.com> Message-ID: <26609740-54F4-4003-B623-4053B0B85A36@illinois.edu> On Nov 11, 2008, at 12:59 PM, Dave Messina wrote: >> >> Regardless of what we call it (1.5.3, 1.6, 'biofoo'), >> > > Please, for the love of all that is holy, call it 1.6. > > And count me at your disposal for prepping the release. > > Dave I quite like 'biofoo'! 
;> 1.6 it is, then -c From hlapp at gmx.net Tue Nov 11 14:52:09 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 11 Nov 2008 14:52:09 -0500 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: <4919B4B8.5050307@sendu.me.uk> References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> <4919B4B8.5050307@sendu.me.uk> Message-ID: On Nov 11, 2008, at 11:37 AM, Sendu Bala wrote: > Yes people want a stable release, but calling it 1.6 doesn't make it > stable. Doing the things in the plan for 1.6 makes it stable. I think the main danger with a haphazard 1.6 stable release is to have APIs in there that we aren't ready yet to commit to supporting beyond 1.6, or even beyond 1.6.0. If there are hidden (or known) bugs, these can be declared, and fixed in point releases. (I do feel pretty strongly that a much greater frequency of point releases is necessary and healthy. There was talk earlier this year that a 1.5.3 release is not worth the effort and that we should aim for 1.6 right away. I think such consideration is nearly always mistaken, and backfires - in this case the result of it was that we have had neither 1.5.3 nor 1.6 for 8 months: those feeling capable of shepherding 1.5.3 were told their time isn't needed, and those wanting to take on 1.6 were intimidated by the necessary effort. Had there been 3 point releases since then, we may have gotten to 1.6 in smaller steps at a time but eventually faster.) As for sanctioning APIs that aren't ready to receive official blessing by releasing 1.6, what about individual developers taking responsibility and clearly labeling their module and interface declarations as experimental if that's what they are.
I don't think it's unreasonable either to just declare all new (since 1.4) APIs that haven't been vetted or approved by the core as experimental. They can always be de-experimentalised later. And finally - awesome Chris that you are volunteering to carry the 1.6 torch! Much thanks, and may the Force be with you :-) -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From lincoln.stein at gmail.com Tue Nov 11 15:11:24 2008 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 11 Nov 2008 15:11:24 -0500 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> <4919B4B8.5050307@sendu.me.uk> Message-ID: <6dce9a0b0811111211x17941a3cn2d4c6fa6af6331df@mail.gmail.com> I for one will take The Pledge not to change the APIs of the modules I have written. Lincoln On Tue, Nov 11, 2008 at 2:52 PM, Hilmar Lapp wrote: > > On Nov 11, 2008, at 11:37 AM, Sendu Bala wrote: > > Yes people want a stable release, but calling it 1.6 doesn't make it >> stable. Doing the things in the plan for 1.6 makes it stable. >> > > > I think the main danger with a haphazard 1.6 stable release is to have APIs > in there that we aren't ready yet to commit to supporting beyond 1.6, or > even beyond 1.6.0. > > If there are hidden (or known) bugs, these can be declared, and fixed in > point releases. (I do feel pretty strongly that a much greater frequency of > point releases is necessary and healthy. There was talk earlier this year > that a 1.5.3 release is not worth the effort and that we should aim for 1.6 > right away. 
I think such consideration is nearly always mistaken, and > backfires - in this case the result of it was that we have had neither 1.5.3 > nor 1.6 for 8 months: those feeling capable of shepherding 1.5.3 were told > their time isn't needed, and those wanting to take on 1.6 were intimidated > by the necessary effort. Had there been 3 point releases since then, we may > have gotten to 1.6 in smaller steps at a time but eventually faster.) > > As for sanctioning APIs that aren't ready to receive official blessing by > releasing 1.6, what about individual developers taking responsibility and > clearly label their module and interface declarations as experimental if > that's what they are. I don't think it's unreasonable either to just declare > all new (since 1.4) APIs that haven't been vetted or approved by the core as > experimental. They can always be de-experimentalised later. > > And finally - awesome Chris that you are volunteering to carry the 1.6 > torch! Much thanks, and may the Force be with you :-) > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > -- Lincoln D. 
Stein Ontario Institute for Cancer Research 101 College St., Suite 800 Toronto, ON, Canada M5G0A3 416 673-8514 Assistant: Stacey Quinn Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 USA (516) 367-8380 Assistant: Sandra Michelsen From johnsonm at gmail.com Tue Nov 11 15:19:19 2008 From: johnsonm at gmail.com (Mark Johnson) Date: Tue, 11 Nov 2008 14:19:19 -0600 Subject: [Bioperl-l] help on Bio-Perl Installation In-Reply-To: <431725.22025.qm@web31007.mail.mud.yahoo.com> References: <431725.22025.qm@web31007.mail.mud.yahoo.com> Message-ID: On Fri, Oct 24, 2008 at 12:06 PM, Jie Zhang wrote: > HI, > > I'm new to BioPerl and just finished installing BioPerl on Windows XP by dowloading and unpack the file bioperl-1.5.2_102.tar.gz from the Bioperl.org website, then strictly followed the >manual installation instruction. All the Build and Test steps were fine although there were some unimportant modules failed to install. I was able to view the documentation by typing >perldoc Bio::Perl in the command window. However, when I tested if it is installed properly, I encountered problem. I wrote a two-line script file called bp.pl > > #!/bin/perl -w > use Bio::Perl; > > The compilation step failed and gave me this message"use not allowed in the expression at bp.pl line 3, syntax error at bp.pl line 3, near"use Bio::Perl"...." > > That warning appeared no matter the script is "use Bio::Seq" or other modules. It seems use is not allowed here. What could be wrong during installation? Could you please help me? > > Thank you very much > > Jie What distribution of Perl are you using? ActiveState? Strawberry? How are you invoking your script? A Unix style shebang (#!/path/to/interpreter) probably won't work without some deep magic, and perhaps not even then. If you have perl in your path, try 'perl bp.pl', if you haven't already. 
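A minimal sketch of the check suggested above, assuming perl is on the PATH: save it as bp.pl and run it with 'perl bp.pl' so that the shebang line is never consulted. The helper name can_load is made up for this illustration, and the script only reports whether the modules can be found; it does not by itself explain the "use not allowed in expression" message.

```perl
use strict;
use warnings;

# Hypothetical helper: true if this perl can load the named module.
sub can_load {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Bio::Perl -> Bio/Perl.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Report on the two modules mentioned in the original question.
for my $module (qw(Bio::Perl Bio::Seq)) {
    printf "%-10s %s\n", $module,
        can_load($module) ? 'ok' : 'not found in @INC';
}
```

If both modules are reported as not found, the installation directory is probably not in @INC for the perl that actually ran the script.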
From johnsonm at gmail.com Tue Nov 11 16:02:05 2008 From: johnsonm at gmail.com (Mark Johnson) Date: Tue, 11 Nov 2008 15:02:05 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution?
In-Reply-To: References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> <4919B4B8.5050307@sendu.me.uk> Message-ID: On Tue, Nov 11, 2008 at 1:52 PM, Hilmar Lapp wrote: > I think the main danger with a haphazard 1.6 stable release is to have APIs > in there that we aren't ready yet to commit to supporting beyond 1.6, or > even beyond 1.6.0. I think not having a 1.6 release at all is a bigger danger than possibly shipping some half-baked APIs. I'd rather ship now, deprecate later. But I'd say it's all up to Chris. He's the one with the time and energy, so I think we get what he's willing to do. 8) > If there are hidden (or known) bugs, these can be declared, and fixed in > point releases. (I do feel pretty strongly that a much greater frequency of > point releases is necessary and healthy. There was talk earlier this year > that a 1.5.3 release is not worth the effort and that we should aim for 1.6 > right away. I think such consideration is nearly always mistaken, and > backfires - in this case the result of it was that we have had neither 1.5.3 > nor 1.6 for 8 months: those feeling capable of shepherding 1.5.3 were told > their time isn't needed, and those wanting to take on 1.6 were intimidated > by the necessary effort. Had there been 3 point releases since then, we may > have gotten to 1.6 in smaller steps at a time but eventually faster.) I think perhaps the previous goals for 1.6 exceed the available developer manpower. It's fine to say "we'll ship 1.6 when X, Y and Z are done", but I think the situation we find ourselves in now is such that we can ship the trunk or not ship anything. The latter course of action could quite possibly see the splintering or dissolution of the project. 
> As for sanctioning APIs that aren't ready to receive official blessing by > releasing 1.6, what about individual developers taking responsibility and > clearly label their module and interface declarations as experimental if > that's what they are. I don't think it's unreasonable either to just declare > all new (since 1.4) APIs that haven't been vetted or approved by the core as > experimental. They can always be de-experimentalised later. Works for me. > And finally - awesome Chris that you are volunteering to carry the 1.6 > torch! Much thanks, and may the Force be with you :-) I think we all owe him mass quantities of his preferred beverage at the earliest opportunity. From cjfields at illinois.edu Tue Nov 11 16:57:51 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 11 Nov 2008 15:57:51 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> <4919B4B8.5050307@sendu.me.uk> Message-ID: On Nov 11, 2008, at 3:02 PM, Mark Johnson wrote: > On Tue, Nov 11, 2008 at 1:52 PM, Hilmar Lapp wrote: >> I think the main danger with a haphazard 1.6 stable release is to >> have APIs >> in there that we aren't ready yet to commit to supporting beyond >> 1.6, or >> even beyond 1.6.0. > > I think not having a 1.6 release at all is a bigger danger than > possibly shipping some half-baked APIs. I'd rather ship now, > deprecate later. But I'd say it's all up to Chris. He's the one with > the time and energy, so I think we get what he's willing to do. 8) Energy maybe, but time, eh, not so much anymore (job is taking up much more time now). We could definitely use more hands, but hopefully setting a plan will pull more folks aboard. 
>> If there are hidden (or known) bugs, these can be declared, and >> fixed in >> point releases. (I do feel pretty strongly that a much greater >> frequency of >> point releases is necessary and healthy. There was talk earlier >> this year >> that a 1.5.3 release is not worth the effort and that we should aim >> for 1.6 >> right away. I think such consideration is nearly always mistaken, and >> backfires - in this case the result of it was that we have had >> neither 1.5.3 >> nor 1.6 for 8 months: those feeling capable of shepherding 1.5.3 >> were told >> their time isn't needed, and those wanting to take on 1.6 were >> intimidated >> by the necessary effort. Had there been 3 point releases since >> then, we may >> have gotten to 1.6 in smaller steps at a time but eventually faster.) > > I think perhaps the previous goals for 1.6 exceed the available > developer manpower. It's fine to say "we'll ship 1.6 when X, Y and Z > are done", but I think the situation we find ourselves in now is such > that we can ship the trunk or not ship anything. The latter course of > action could quite possibly see the splintering or dissolution of the > project. I'm not sure it would be that extreme, but that's possible, yes. >> As for sanctioning APIs that aren't ready to receive official >> blessing by >> releasing 1.6, what about individual developers taking >> responsibility and >> clearly label their module and interface declarations as >> experimental if >> that's what they are. I don't think it's unreasonable either to >> just declare >> all new (since 1.4) APIs that haven't been vetted or approved by >> the core as >> experimental. They can always be de-experimentalised later. > > Works for me. > >> And finally - awesome Chris that you are volunteering to carry the >> 1.6 >> torch! Much thanks, and may the Force be with you :-) > > I think we all owe him mass quantities of his preferred beverage > at the earliest opportunity. 
Let's see how things turn out before handing out free beverages. chris From mauricio at open-bio.org Tue Nov 11 18:02:39 2008 From: mauricio at open-bio.org (Mauricio Herrera Cuadra) Date: Tue, 11 Nov 2008 17:02:39 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> <4919B4B8.5050307@sendu.me.uk> Message-ID: <491A0F0F.6000006@open-bio.org> I also vote for making it a 1.6 release, stable or not. Maybe it is time to get rid of the even/odd versioning scheme altogether? The scheme might have worked well in the past but now it's becoming more of a roadblock/legacy thing which is stopping the project from moving forward and evolving. Just take a look at what other Bio* projects are doing. BioPython, for instance, is releasing a new version every couple of months with the best of both worlds: fixes to bugs in the previous versions as well as new beta/experimental features which let them be more in sync with whatever external dependencies they require (i.e. they always try to support the latest version of Python, NumPy, etc.), and they're evolving fast! This could be an opportunity to switch to a more dynamic release cycle in which "stable" refers to the latest tagged/packaged/released code put into CPAN and "bleeding edge" refers to the latest code in the repository trunk. No more headaches about differences between major releases, and no more "you should try using 1.5.2 or bioperl-live instead of 1.4, which is our 'stable' release but quite old and unsupported" speech. It seems to me like making a HUGE plan for every release is eventually stopping it from happening at all; the bar has been raised far too high.
I believe that we need fewer, short-term fixes/changes/enhancements as goals for making minor releases (1.6.0, 1.6.1, etc.); that should be the way to go from now on. We could be making minor releases a lot more often, so the split of the APIs into the proposed sub-packages can be gradually achieved and, once that happens, we could start thinking of a new 1.7 release series. Why don't we categorize the items listed in http://www.bioperl.org/wiki/Release_Schedule#Bioperl_1.6 and http://www.bioperl.org/wiki/Project_priority_list and group them into minor releases along with setting a regular release schedule? We all know it is impossible for everything to be working at 100% all the time, and even if only a few patches have been made to the code base between release dates, we should simply make another minor release so things keep moving. Just like others here, I'm happy to help in whatever way I can to make this happen. Cheers, Mauricio. Chris Fields wrote: > > On Nov 11, 2008, at 3:02 PM, Mark Johnson wrote: > >> On Tue, Nov 11, 2008 at 1:52 PM, Hilmar Lapp wrote: >>> I think the main danger with a haphazard 1.6 stable release is to >>> have APIs >>> in there that we aren't ready yet to commit to supporting beyond 1.6, or >>> even beyond 1.6.0. >> >> I think not having a 1.6 release at all is a bigger danger than >> possibly shipping some half-baked APIs. I'd rather ship now, >> deprecate later. But I'd say it's all up to Chris. He's the one with >> the time and energy, so I think we get what he's willing to do. 8) > > Energy maybe, but time, eh, not so much anymore (job is taking up much > more time now). We could definitely use more hands, but hopefully > setting a plan will pull more folks aboard. > >>> If there are hidden (or known) bugs, these can be declared, and fixed in >>> point releases. (I do feel pretty strongly that a much greater >>> frequency of >>> point releases is necessary and healthy.
There was talk earlier this >>> year >>> that a 1.5.3 release is not worth the effort and that we should aim >>> for 1.6 >>> right away. I think such consideration is nearly always mistaken, and >>> backfires - in this case the result of it was that we have had >>> neither 1.5.3 >>> nor 1.6 for 8 months: those feeling capable of shepherding 1.5.3 were >>> told >>> their time isn't needed, and those wanting to take on 1.6 were >>> intimidated >>> by the necessary effort. Had there been 3 point releases since then, >>> we may >>> have gotten to 1.6 in smaller steps at a time but eventually faster.) >> >> I think perhaps the previous goals for 1.6 exceed the available >> developer manpower. It's fine to say "we'll ship 1.6 when X, Y and Z >> are done", but I think the situation we find ourselves in now is such >> that we can ship the trunk or not ship anything. The latter course of >> action could quite possibly see the splintering or dissolution of the >> project. > > I'm not sure it would be that extreme, but that's possible, yes. > >>> As for sanctioning APIs that aren't ready to receive official >>> blessing by >>> releasing 1.6, what about individual developers taking responsibility >>> and >>> clearly label their module and interface declarations as experimental if >>> that's what they are. I don't think it's unreasonable either to just >>> declare >>> all new (since 1.4) APIs that haven't been vetted or approved by the >>> core as >>> experimental. They can always be de-experimentalised later. >> >> Works for me. >> >>> And finally - awesome Chris that you are volunteering to carry the 1.6 >>> torch! Much thanks, and may the Force be with you :-) >> >> I think we all owe him mass quantities of his preferred beverage >> at the earliest opportunity. > > Let's see how things turn out before handing out free beverages. 
> > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From johnsonm at gmail.com Tue Nov 11 18:17:27 2008 From: johnsonm at gmail.com (Mark Johnson) Date: Tue, 11 Nov 2008 17:17:27 -0600 Subject: [Bioperl-l] Withdraw Bio::Graphics and Bio::DB::SeqFeature from bioperl distribution? In-Reply-To: References: <6dce9a0b0811101225o6005250ev2b91a9cf779a8491@mail.gmail.com> <5F516434-EDAD-476B-A2FB-2C6D66C839B9@illinois.edu> <49193CBE.4000800@sendu.me.uk> <200811111038.37572.heikki@sanbi.ac.za> <19455C2C-A86A-40BD-8218-D7E397E563E1@illinois.edu> <4919B4B8.5050307@sendu.me.uk> Message-ID: On Tue, Nov 11, 2008 at 3:57 PM, Chris Fields wrote: > Energy maybe, but time, eh, not so much anymore (job is taking up much more > time now). We could definitely use more hands, but hopefully setting a plan > will pull more folks aboard. I can only justify very limited amounts of time from $day_job working on BioPerl. I might be able to noodle around with helping clean up the test suite while waiting on batch jobs, but that's probably it. I've got vacation aplenty, though, and would not be opposed to taking a day or three to help during a general push towards a release. I might be able to talk another person into doing the same. We do seem to have volunteers coming out of the woodwork seeking direction, so that plan thing is probably a good idea. From florent.angly at gmail.com Wed Nov 12 01:47:49 2008 From: florent.angly at gmail.com (Florent Angly) Date: Tue, 11 Nov 2008 22:47:49 -0800 Subject: [Bioperl-l] Draw phylogenetic trees with bar chart Message-ID: <491A7C15.10809@gmail.com> Dear Bioperl users, I need to represent graphically phylogenetic trees as cladograms or phylograms using Perl. 
An additional requirement is that I would like to add graph data as bar chart on top of it as in this example: http://scums.sdsu.edu/Mapper/images/b.gif Now, I have read the BioPerl howto and know about the Bio::TreeIO::svggraph and Bio::Tree::Draw::Cladogram modules. The first one uses the SVG Perl module whereas the other one uses the PostScript module to create EPS images. I do not know why there are 2 very similar modules to draw trees (in different locations), but none of them seem to able to plot additional information on the tree. Is there any other Perl code that you know of that would facilitate doing what I want to do? If not, how should I modify one of the 2 mentioned BioPerl modules to plot bar charts, and which one?? Thanks, Florent From shameer at ncbs.res.in Wed Nov 12 03:16:35 2008 From: shameer at ncbs.res.in (K. Shameer) Date: Wed, 12 Nov 2008 13:46:35 +0530 (IST) Subject: [Bioperl-l] Draw phylogenetic trees with bar chart In-Reply-To: <491A7C15.10809@gmail.com> References: <491A7C15.10809@gmail.com> Message-ID: <36163.192.168.1.1.1226477795.squirrel@mail.ncbs.res.in> Hi Florent, I recently used the Bio::Tree::Draw::Cladogram method to generate trees using input files in newick format. I used the eps format output and used the `convert` (part of ImageMagick) tool to convert it in to png format. The example in the HOWTO worked quite well for me. I am not sure if you can add graph data as bar chart using BioPerl. Cheers, K. Shameer > Dear Bioperl users, > > I need to represent graphically phylogenetic trees as cladograms or > phylograms using Perl. An additional requirement is that I would like to > add graph data as bar chart on top of it as in this example: > http://scums.sdsu.edu/Mapper/images/b.gif > > Now, I have read the BioPerl howto and know about the > Bio::TreeIO::svggraph and Bio::Tree::Draw::Cladogram modules. The first > one uses the SVG Perl module whereas the other one uses the PostScript > module to create EPS images. 
I do not know why there are 2 very similar > modules to draw trees (in different locations), but none of them seem to > able to plot additional information on the tree. > > Is there any other Perl code that you know of that would facilitate > doing what I want to do? If not, how should I modify one of the 2 > mentioned BioPerl modules to plot bar charts, and which one?? > > Thanks, > > Florent > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From c.bailey at bham.ac.uk Wed Nov 12 04:15:36 2008 From: c.bailey at bham.ac.uk (Chris Bailey) Date: Wed, 12 Nov 2008 09:15:36 +0000 Subject: [Bioperl-l] Draw phylogenetic trees with bar chart In-Reply-To: <491A7C15.10809@gmail.com> References: <491A7C15.10809@gmail.com> Message-ID: <491A9EB8.6020706@bham.ac.uk> Florent Angly wrote: > Dear Bioperl users, > > I need to represent graphically phylogenetic trees as cladograms or > phylograms using Perl. An additional requirement is that I would like > to add graph data as bar chart on top of it as in this example: > http://scums.sdsu.edu/Mapper/images/b.gif > > Now, I have read the BioPerl howto and know about the > Bio::TreeIO::svggraph and Bio::Tree::Draw::Cladogram modules. The > first one uses the SVG Perl module whereas the other one uses the > PostScript module to create EPS images. I do not know why there are 2 > very similar modules to draw trees (in different locations), but none > of them seem to able to plot additional information on the tree. I'm not aware of any inbuilt code in any of these modules to do what you're asking. In the past, the way I've added extra information into trees is as follows: 1) Embed the data you want to graph into the alignment/tree file, such that when the tree is drawn, for example, the length of the bar you want drawn is part of the text on the tree node in question 2) Draw the tree as an SVG image using Bio::TreeIO. 
3) Parse the SVG document using your XML parser of choice (I've used XML::Simple for this in the past) 4) extract all the <text> elements and their CDATA. 5) the text element will contain an x and y co-ordinate and the CDATA will contain the text string itself 6) pull out the data you need to draw the graph (the information you added in 1) using regexps etc. 7) add an element to the SVG document (e.g. <rect x="..." y="..." height="10" width="125" style="fill:blue" />) simply change the x and y to match the coordinates of the text element (+ an offset to get everything looking right), and change width to the value you extracted in (6) 8) repeat steps 5 to 7 for all the text elements in your SVG document 9) output new SVG document to a file 10) ???? 11) Profit This pattern will also work for more complex data, since you can always add more data to each text node, and it's more than possible to generate any type of chart data you like with the right combination of shape primitives. Cheers, Chris > > Is there any other Perl code that you know of that would facilitate > doing what I want to do? If not, how should I modify one of the 2 > mentioned BioPerl modules to plot bar charts, and which one?? > > Thanks, > > Florent -- ----------------------- Chris Bailey Centre for Systems Biology School of Biosciences University of Birmingham Edgbaston, Birmingham, B15 2TT United Kingdom T: +44(0)121 414 8849 E: c.bailey at bham.ac.uk From s_oheigeartaigh at yahoo.co.uk Wed Nov 12 11:33:10 2008 From: s_oheigeartaigh at yahoo.co.uk (Sean ohEigeartaigh) Date: Wed, 12 Nov 2008 16:33:10 +0000 (GMT) Subject: [Bioperl-l] How to get more results from Blast Message-ID: <508888.30788.qm@web27405.mail.ukl.yahoo.com> Hi, I'm using the module Bio::Tools::Run::StandAloneBlast. The standard cutoff for Blast to display a sequence comparison appears to be 250 results.
How do I adjust this parameter to increase the cutoff to, say, 1000 results? I need to be able to look at Blast results in a lot of depth. Thanks very much for your help, Sean O hEigeartaigh. From lincoln.stein at gmail.com Wed Nov 12 12:29:38 2008 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Wed, 12 Nov 2008 12:29:38 -0500 Subject: [Bioperl-l] [Gmod-gbrowse] example pictures of all the glyphs? In-Reply-To: <10C44E08-29B8-4CD5-A6C5-1667477EA833@sgul.ac.uk> References: <10C44E08-29B8-4CD5-A6C5-1667477EA833@sgul.ac.uk> Message-ID: <6dce9a0b0811120929h13593b48h1d2bdd045950289a@mail.gmail.com> There are a pair of scripts in bioperl which generate png images of some of the more esoteric glyphs, but the list of glyphs is not complete, as their names are hard-coded. Perhaps these scripts can be used as the basis for a more general script that traverses the Bio/Graphics/Glyph subdirectory, loads each glyph it finds, and draws it. For what it's worth, the scripts are located here: $BIOPERL/scripts/biographics/bp_glyphs1-demo.PLS $BIOPERL/scripts/biographics/bp_glyphs2-demo.PLS On the todo list is a way for glyphs to self-document their parameters. To do this, glyphs will need to support two new methods: sub parameters() return a hashref consisting of all the options they recognize as keys, each of which in turn points to a hashref containing a human readable description of the option, and a machine-readable description of the type of data that can be passed. Here's the concept:
  {
    height  => { description => 'height of the glyph in pixels',     range => 'integer(1..100)' },
    fgcolor => { description => 'color of the outline of the glyph', range => 'color' },
    bump    => { description => 'true if features should not overlap', range => 'boolean' },
    sort    => { description => 'sort order',                        range => '{by_name,by_score,by_position}' },
  }
This is all conceptual. In fact the range should use some sort of Perl prototyping, such as the one used by Class::Struct.
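To make the idea a little more concrete, here is a rough sketch of how such a parameters() method could be consumed. Everything in it is hypothetical: My::Glyph::Demo is a stand-in class, not a Bio::Graphics glyph, and describe_parameters is an invented helper; only the shape of the hashref comes from the concept above.

```perl
use strict;
use warnings;

# Hypothetical glyph class used only for illustration.
package My::Glyph::Demo;

# Return the options this glyph recognizes, in the proposed self-documenting form.
sub parameters {
    return {
        height  => { description => 'height of the glyph in pixels',
                     range       => 'integer(1..100)' },
        fgcolor => { description => 'color of the outline of the glyph',
                     range       => 'color' },
        bump    => { description => 'true if features should not overlap',
                     range       => 'boolean' },
        sort    => { description => 'sort order',
                     range       => '{by_name,by_score,by_position}' },
    };
}

package main;

# Build a plain-text option table for any class that implements parameters().
sub describe_parameters {
    my ($class) = @_;
    my $params = $class->parameters;
    my @lines;
    for my $option (sort keys %$params) {
        push @lines, sprintf("%-8s %-32s %s",
            $option, $params->{$option}{range}, $params->{$option}{description});
    }
    return join("\n", @lines) . "\n";
}

print describe_parameters('My::Glyph::Demo');
```

A documentation page or a track-configuration editor could walk the same structure instead of printing it.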
In any case, this would let us achieve two things. One is to generate a page illustrating glyph types, as you originally asked about. The other is to enable sophisticated editing of gbrowse track configurations by the user. Lincoln On Wed, Nov 12, 2008 at 11:42 AM, Adam Witney wrote: > > I was just wondering if there are any example pictures of all the > available glyphs somewhere? > > thanks > > adam > > ------------------------------------------------------------------------- > This SF.Net email is sponsored by the Moblin Your Move Developer's > challenge > Build the coolest Linux based applications with Moblin SDK & win great > prizes > Grand prize is a trip for two to an Open Source event anywhere in the world > http://moblin-contest.org/redirect.php?banner_id=100&url=/ > _______________________________________________ > Gmod-gbrowse mailing list > Gmod-gbrowse at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse > -- Lincoln D. Stein Ontario Institute for Cancer Research 101 College St., Suite 800 Toronto, ON, Canada M5G0A3 416 673-8514 Assistant: Stacey Quinn Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 USA (516) 367-8380 Assistant: Sandra Michelsen From mbasu at mail.nih.gov Wed Nov 12 12:19:53 2008 From: mbasu at mail.nih.gov (Malay) Date: Wed, 12 Nov 2008 12:19:53 -0500 Subject: [Bioperl-l] Draw phylogenetic trees with bar chart In-Reply-To: <491A9EB8.6020706@bham.ac.uk> References: <491A7C15.10809@gmail.com> <491A9EB8.6020706@bham.ac.uk> Message-ID: <491B1039.7030106@mail.nih.gov> Chris Bailey wrote: > Florent Angly wrote: >> Dear Bioperl users, >> >> I need to represent graphically phylogenetic trees as cladograms or >> phylograms using Perl. 
An additional requirement is that I would like >> to add graph data as bar chart on top of it as in this example: >> http://scums.sdsu.edu/Mapper/images/b.gif >> >> Now, I have read the BioPerl howto and know about the >> Bio::TreeIO::svggraph and Bio::Tree::Draw::Cladogram modules. The >> first one uses the SVG Perl module whereas the other one uses the >> PostScript module to create EPS images. I do not know why there are 2 >> very similar modules to draw trees (in different locations), but none >> of them seem to able to plot additional information on the tree. > I'm not aware of any inbuilt code in any of these modules to do what > you're asking. In the past, the way I've added extra information into > trees is as follows: > 1) Embed the data you want to graph into the alignment/tree file, such > that when the tree is drawn, for example, the length of the bar you want > drawn is part of the text on the tree node in question > 2) Draw the tree as an SVG image using Bio::TreeIO. > 3) Parse the SVG document using your XML parser of choice (I've used > XML::Simple for this in the past) > 4) extract all the <text> elements and their CDATA. > 5) the text element will contain an x and y co-ordinate and the CDATA > will contain the text string itself > 6) pull out the data you need to draw the graph (the information you > added in 1) using regexps etc. > 7) add an element to the SVG document (e.g. <rect x="..." y="..." > height="10" width="125" style="fill:blue" />) simply change the x and y > to match the coordinates of the text element (+ an offset to get > everything looking right), and change width to the value you extracted > in (6) > 8) repeat steps 5 to 7 for all the text elements in your SVG document > 9) output new SVG document to a file > 10) ????
> 11) Profit > > This pattern will also work for more complex data, since you can always > add more data to each text node, and it's more than possible to generate > any type of chart data you like with the right combination of shape > primitives. > FYI a very standard way to generate images like these is to use R's "ape" package. -Malay -- Malay K Basu Post-doctoral Fellow NCBI/NLM/NIH 8600 Rockville Pike Building 38A, Room 5S514C Bethesda, MD 20894. Phone: 240-421-2460 (mobile) 301-496-5599 (office) Email: mbasu at mail.nih.gov From shalabh.sharma7 at gmail.com Wed Nov 12 13:35:27 2008 From: shalabh.sharma7 at gmail.com (shalabh sharma) Date: Wed, 12 Nov 2008 13:35:27 -0500 Subject: [Bioperl-l] [Gmod-gbrowse] example pictures of all the glyphs? In-Reply-To: <6dce9a0b0811120929h13593b48h1d2bdd045950289a@mail.gmail.com> References: <10C44E08-29B8-4CD5-A6C5-1667477EA833@sgul.ac.uk> <6dce9a0b0811120929h13593b48h1d2bdd045950289a@mail.gmail.com> Message-ID: <9fcc48c70811121035s3d3710c2kb06407304df8f838@mail.gmail.com> Hey Adam, Here are some example pictures of basic glyphs: http://www.agcol.arizona.edu/software/java_gbrowse/Java_GBrowse/GBrowseConfiguration/gbrowse_configuration_content.htm Shalabh On Wed, Nov 12, 2008 at 12:29 PM, Lincoln Stein wrote: > There are a pair of scripts in bioperl which generate png images of some of > the more esoteric glyphs, but the list of glyphs is not complete, as their > names are hard-coded. Perhaps these scripts can be used as the basis for a > more general script that traverses the Bio/Graphics/Glyph subdirectory, > loads each glyph it finds, and draws it. > For what it's worth, the scripts are located here: > > $BIOPERL/scripts/biographics/bp_glyphs1-demo.PLS > $BIOPERL/scripts/biographics/bp_glyphs2-demo.PLS > > On the todo list is a way for glyphs to self-document their parameters. 
To > do this, glyphs will need to support two new methods: > > sub parameters() > return a hashref consisting of all the options they recognize as keys, > which in turn points to a hashref containing a human readable description of > the option, and a machine-readable description of the type of data that can > be passed. Here's the concept: > > { > height => {description => 'height of the glyph in pixels', > range => 'integer(1..100)' > }, > fgcolor => {description => 'color of the outline of the glyph', > range => 'color' > }, > bump => {description => 'true if features should not overlap', > range => 'boolean' > }, > sort => {description => 'sort order', > range => '{by_name,by_score,by_position}' > } > > > This is all conceptual. In fact the range should use some sort of Perl > prototyping, such as the one used by Class::Struct. > > In any case, this would let us achieve two things. One is to generate a > page illustrating glyph types, as you originally asked about. The other is > to enable sophisticated editing of gbrowse track configurations by the user. > > Lincoln > > On Wed, Nov 12, 2008 at 11:42 AM, Adam Witney wrote: > >> >> I was just wondering if there are any example pictures of all the >> available glyphs somewhere? >> >> thanks >> >> adam >> >> ------------------------------------------------------------------------- >> This SF.Net email is sponsored by the Moblin Your Move Developer's >> challenge >> Build the coolest Linux based applications with Moblin SDK & win great >> prizes >> Grand prize is a trip for two to an Open Source event anywhere in the >> world >> http://moblin-contest.org/redirect.php?banner_id=100&url=/ >> _______________________________________________ >> Gmod-gbrowse mailing list >> Gmod-gbrowse at lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse >> > > > > -- > Lincoln D. 
Stein > > Ontario Institute for Cancer Research > 101 College St., Suite 800 > Toronto, ON, Canada M5G0A3 > 416 673-8514 > Assistant: Stacey Quinn > > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 USA > (516) 367-8380 > Assistant: Sandra Michelsen > > ------------------------------------------------------------------------- > This SF.Net email is sponsored by the Moblin Your Move Developer's > challenge > Build the coolest Linux based applications with Moblin SDK & win great > prizes > Grand prize is a trip for two to an Open Source event anywhere in the world > http://moblin-contest.org/redirect.php?banner_id=100&url=/ > _______________________________________________ > Gmod-gbrowse mailing list > Gmod-gbrowse at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse > > From florent.angly at gmail.com Wed Nov 12 13:56:38 2008 From: florent.angly at gmail.com (Florent Angly) Date: Wed, 12 Nov 2008 10:56:38 -0800 Subject: [Bioperl-l] Draw phylogenetic trees with bar chart In-Reply-To: <491B1039.7030106@mail.nih.gov> References: <491A7C15.10809@gmail.com> <491A9EB8.6020706@bham.ac.uk> <491B1039.7030106@mail.nih.gov> Message-ID: <491B26E6.5070106@gmail.com> Thank you all for your replies, they are very useful! Since I am looking more for some non-interactive code that integrates well in my existing Perl program, I will look at Chris' method that involves modifying the SVG tree. Cheers, Florent Malay wrote: > Chris Bailey wrote: >> Florent Angly wrote: >>> Dear Bioperl users, >>> >>> I need to represent graphically phylogenetic trees as cladograms or >>> phylograms using Perl. An additional requirement is that I would >>> like to add graph data as bar chart on top of it as in this example: >>> http://scums.sdsu.edu/Mapper/images/b.gif >>> >>> Now, I have read the BioPerl howto and know about the >>> Bio::TreeIO::svggraph and Bio::Tree::Draw::Cladogram modules. 
The >>> first one uses the SVG Perl module whereas the other one uses the >>> PostScript module to create EPS images. I do not know why there are >>> 2 very similar modules to draw trees (in different locations), but >>> none of them seem to able to plot additional information on the tree. >> I'm not aware of any inbuilt code in any of these modules to do what >> you're asking. In the past, the way I've added extra information into >> trees is as follows: >> 1) Embed the data you want to graph into the alignment/tree file, >> such that when the tree is drawn, for example, the length of the bar >> you want drawn is part of the text on the tree node in question >> 2) Draw the tree as an SVG image using Bio::TreeIO. >> 3) Parse the SVG document using your XML parser of choice (I've used >> XML::Simple for this in the past) >> 4) extract all the elements and their CDATA. >> 5) the text element will contain an x and y co-ordinate and the CDATA >> will contain the text string itself >> 6) pull out the data you need to draw the graph (the information you >> added in 1) using regexps etc. >> 7) add an element to the SVG document (e.g. > height="10" width="125" style="fill:blue" />) simply change the x and >> y to match the coordinates of the text element (+ an offset to get >> everything looking right), and change width to the value you >> extracted in (6) >> 8) repeat steps 5 to 7 for all the text elements in your SVG document >> 9) output new SVG document to a file >> 10) ???? >> 11) Profit >> >> This pattern will also work for more complex data, since you can >> always add more data to each text node, and it's more than possible >> to generate any type of chart data you like with the right >> combination of shape primitives. >> > > FYI a very standard way to generate images like these is to use R's > "ape" package. 
> > -Malay > From SMarkel at accelrys.com Wed Nov 12 16:45:52 2008 From: SMarkel at accelrys.com (Scott Markel) Date: Wed, 12 Nov 2008 16:45:52 -0500 Subject: [Bioperl-l] How to get more results from Blast In-Reply-To: <508888.30788.qm@web27405.mail.ukl.yahoo.com> References: <508888.30788.qm@web27405.mail.ukl.yahoo.com> Message-ID: <1F1240778FB0AF46B4E5A72C44D2C7471440D2AE@exch1-hi.accelrys.net> Sean, The BLAST command line options are "-b" (number of alignments) and "-v" (number of hits). The defaults are 250 and 500, respectively. You probably want to set them to the same value, otherwise you'll get hits in the one-line summary with no corresponding pairwise alignment. Scott Scott Markel, Ph.D. Principal Bioinformatics Architect email: smarkel at accelrys.com Accelrys (SciTegic R&D) mobile: +1 858 205 3653 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 San Diego, CA 92121 fax: +1 858 799 5222 USA web: http://www.accelrys.com http://www.linkedin.com/in/smarkel Board of Directors: International Society for Computational Biology Co-chair: ISCB Publications Committee Associate Editor: PLoS Computational Biology Editorial Board: Briefings in Bioinformatics > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Sean ohEigeartaigh > Sent: Wednesday, 12 November 2008 8:33 AM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] How to get more results from Blast > > Hi, I'm using the module Bio::Tools::Run::StandAloneBlast. > > The standard cutoff for Blast to display a sequence comparison appears to > be 250 results. How do I adjust this parameter to increase the cutoff to, > say, 1000 results? I need to be able to look at Blast results in a lot of > depth. > > Thanks very much for your help, > Sean O hEigeartaigh. 
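Scott's suggestion can be sketched in a few lines of Perl. This is only an illustration, not code from the thread: the helper name blastall_args and its option keys are hypothetical, and it assumes the legacy blastall executable, which is where the -b and -v flags come from. It just builds an argument list in which -v and -b are given the same value, so every one-line summary has a matching pairwise alignment.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): build a legacy blastall
# argument list in which the number of one-line descriptions (-v) and
# the number of pairwise alignments (-b) are set to the same value.
sub blastall_args {
    my (%opt) = @_;

    # Default of 1000 is only the figure used in the question above.
    my $n = defined $opt{max_hits} ? $opt{max_hits} : 1000;

    return (
        '-p', $opt{program},
        '-d', $opt{database},
        '-i', $opt{query},
        '-v', $n,    # one-line descriptions
        '-b', $n,    # pairwise alignments
    );
}

my @args = blastall_args(
    program  => 'blastp',
    database => 'swissprot',
    query    => 'query.fa',
    max_hits => 1000,
);

# prints: blastall -p blastp -d swissprot -i query.fa -v 1000 -b 1000
print join(' ', 'blastall', @args), "\n";

# Assumption (please check the module POD): Bio::Tools::Run::StandAloneBlast
# should accept the same single-letter blastall options when the factory is
# created, e.g. -b => 1000, -v => 1000 alongside -program and -database.
```

The file and database names are placeholders; substitute your own.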
> > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From apapanicolaou at ice.mpg.de Fri Nov 14 09:55:55 2008 From: apapanicolaou at ice.mpg.de (Alexie Papanicolaou) Date: Fri, 14 Nov 2008 14:55:55 +0000 Subject: [Bioperl-l] undefined sub-sequence with a single base Message-ID: <1226674555.6451.37.camel@alexie-laptop> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From apapanicolaou at ice.mpg.de Fri Nov 14 09:59:30 2008 From: apapanicolaou at ice.mpg.de (Alexie Papanicolaou) Date: Fri, 14 Nov 2008 14:59:30 +0000 Subject: [Bioperl-l] undefined sub-sequence with a single base In-Reply-To: <1226674555.6451.37.camel@alexie-laptop> References: <1226674555.6451.37.camel@alexie-laptop> Message-ID: <1226674770.6451.41.camel@alexie-laptop> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dan.bolser at gmail.com Fri Nov 14 11:26:00 2008 From: dan.bolser at gmail.com (Dan Bolser) Date: Fri, 14 Nov 2008 16:26:00 +0000 Subject: [Bioperl-l] Mapping 'mate pairs' in a Sanger sequencing assembly Message-ID: <2c8757af0811140826p613e82f3wd6691c066bfc05ff@mail.gmail.com> Hi, I have written some code using BioPerl to do what I want, however, because I am new to BioPerl, I have the 'am I doing it right' fear. I wonder if perhaps there is a much better way to do what I want. I am working with some Sanger sequencing data from a 'double barreled shotgun' sequencing experiment. What this means is that the genomic inserts are sequenced separately from both ends. For this reason, each read has a natural 'pair' or 'mate pair' that has a typical sequence separation, depending on the average insert size that results from a given sequencing protocol. 
The idea of a 'paired end' read is very important in the NextGen sequencing data, as the fact that two short reads are separated by a known distance is very informative when it comes to assembly. (For example see: http://www.sciencemag.org/cgi/content/abstract/1149504 )

I searched the mailing list archives and the wiki, but I didn't find any modules specifically dealing with this data type. I wrote the following script to simply measure the average insert size between mate pairs in the assembly. Any comments on the code are very welcome (e.g. "BioPerl - ur doing it wrong!").

If anyone is working on code to deal with paired end data directly (or if I just missed those modules), please let me know.

#!/usr/bin/perl -w

## This script is based loosely on "contig_draw.PLS" from somewhere in
## BioPerl. (e.g. scripts/graphics/contig_draw.PLS or
## http://code.open-bio.org/svnweb/index.cgi/bioperl/view/bioperl-live/
## trunk/scripts/graphics/contig_draw.PLS)

## The purpose of this script is to read a .ace file (produced by
## phrap) and identify the 'mate pairs' or 'read pairs' using the St
## Louis naming convention. Once the 'paired end' reads have been
## identified, the mean insert length is calculated.

use strict;

use Bio::Assembly::IO;

use Statistics::Descriptive;

## Some things just don't seem like they should be objects...
my $stat = Statistics::Descriptive::Full->new();

## OPTIONS

die "\nusage: parse.plx \n\n"
  unless @ARGV == 1;

my $aceFile = $ARGV[0];

unless(-s $aceFile){
  die "\nempty or missing file : $aceFile\n\n";
}

warn "parsing $aceFile\n";

my $parser = Bio::Assembly::IO->new( -file   => $aceFile,
                                     -format => 'ace' );

## Typically there is only one assembly parsed by 'the parser' at a
## time. I'm not sure when this is not the case (there is only one
## 'assembly' per ace file).

my $ass = $parser->next_assembly();
## The assembly is a 'Bio::Assembly::Scaffold' object

## Store the forward and reverse reads.
my %readP; # Hash of linked forward and reverse read pairs

my $for = 0;
my $rev = 0;

my @reads = $ass->get_all_seq_ids;
## The seq IDs are strings

for my $read (@reads){
  ## Check that we can parse the read name according to the St Louis
  ## naming convention (basically '.b' identifies one half of a 'read
  ## pair' and '.g' the other half).
  unless ($read =~ /^(\d{3}[A-P]\d{2}X\d{5})\.([bg])\.abi$/){
    ## See also Bug 2648
    ## http://bugzilla.open-bio.org/show_bug.cgi?id=2648
    warn "is this a valid read : $read \n";
    next;
  }
  #warn "$read\n";

  ## From the definition of an assembly (specifically
  ## Bio::Assembly::Contig), the following test is impossible to
  ## fail...
  die "duplicate read : $read\n"
    if $readP{$1}{$2}++;

  $2 eq 'b' ? $for++ : $rev ++;
}

## Store the "read to contig" mapping
my %readC;

my @contigs = $ass->all_contigs;
## The contigs are 'Bio::Assembly::Contig' objects

foreach my $contig (@contigs){
  my $cid = $contig->id;
  #warn $cid, "\n";

  ## Loop through the sequences in this contig
  for my $read ($contig->get_seq_ids){
    #warn "\t$read\n";
    $readC{$read} = $cid;
  }
}

## Now we can do something useful

## Check the 'read pairs' in detail...

my @insertLengths;

my ($paired, $unPaired) = (0,0);

for my $read (keys %readP){
  #warn "$read\n";

  my @readP = keys %{$readP{$read}};

  if (@readP==1){
    #warn "no mate pair $read ($readP[0])\n";
    $unPaired++;
  }
  elsif (@readP==2){
    #warn "mate pair $read\n";
    $paired++;

    ## Do the read pairs come from the same contig?
    if ($readC{"$read.b.abi"} eq $readC{"$read.g.abi"}){
      #print "read $read hits to contig ", $readC{"$read.b.abi"}, "\n";

      ## Get the sequence objects
      my $bId = $ass->get_seq_by_id( "$read.b.abi" );
      my $gId = $ass->get_seq_by_id( "$read.g.abi" );

      ## Get the contig object
      my $contig = $ass->get_contig_by_id( $readC{"$read.b.abi"} );

      ## Get the sequence to contig alignment information
      my $bSeq = $contig->get_seq_coord( $bId );
      my $gSeq = $contig->get_seq_coord( $gId );

      ## Check the sequence pair mapping sanity (?) and measure the
      ## insert length.
      my $insertLength;

#      printf(("%7d\t%7d\t%+3d\t%5d\t - \t" x 2),
#             ##
#             $bSeq->start,
#             $bSeq->end,
#             $bSeq->strand,
#             $bSeq->length,
#             ##
#             $gSeq->start,
#             $gSeq->end,
#             $gSeq->strand,
#             $gSeq->length,
#             ##
#            );

      if(+$bSeq->strand == -$gSeq->strand){
        if($bSeq->strand == +1){
          if($bSeq->start > $gSeq->end){
#            print "mate pair mis-match\n";
            next;
          }
          $insertLength = $gSeq->end - $bSeq->start;
        }
        if($bSeq->strand == -1){
          if($gSeq->start > $bSeq->end){
#            print "mate pair mis-match\n";
            next;
          }
          $insertLength = $bSeq->end - $gSeq->start;
        }
      }
      else{
#        print "mate pair mis-match\n";
        next;
      }

#      printf("%6d\n", $insertLength);
      push @insertLengths, $insertLength;
    }
    else{
      ## Reads map to different contigs, we cannot get an insert
      ## length.
      print
        "read $read spans ",
        $readC{"$read.b.abi"}, " <-> ",
        $readC{"$read.g.abi"}, "\n";
    }
  }
  else{
    ## Reads should be paired or not, nothing else.
    die "WTF?\n";
  }
}

print "$paired paired reads and $unPaired unpaired\n";

$stat->add_data(@insertLengths);

print "mean: ", $stat->mean, "\n";
print "stdv: ", $stat->standard_deviation, "\n";

Thanks very much for spending the time to consider the above script.

All the best,
Dan.

--
http://network.nature.com/profile/dan

From cjfields at illinois.edu Fri Nov 14 12:08:38 2008
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 14 Nov 2008 11:08:38 -0600
Subject: [Bioperl-l] undefined sub-sequence with a single base
In-Reply-To: <1226674770.6451.41.camel@alexie-laptop>
References: <1226674555.6451.37.camel@alexie-laptop> <1226674770.6451.41.camel@alexie-laptop>
Message-ID: <19F8A0BA-799D-4C4A-8713-6129543C30E1@illinois.edu>

We switched to Subversion a while ago. Could you try updating from there, or using one of our nightly builds?

http://www.bioperl.org/DIST/nightly_builds/

chris

On Nov 14, 2008, at 8:59 AM, Alexie Papanicolaou wrote:

> Ah, I should add this is a BLASTX.
> > > On Fri, 2008-11-14 at 14:56 +0000, Alexie Papanicolaou wrote: > >> Dear all, >> >> Would anyone have an insight into this error while parsing a BLAST >> report (bioperl from CVS on 28th of August)? >> >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: Undefined sub-sequence (548,548). Valid range = 51 - 548 >> STACK: Error::throw >> STACK: >> Bio::Root::Root::throw /usr/local/share/perl/5.8.8/Bio/Root/Root.pm: >> 357 >> STACK: >> Bio::Search::HSP::HSPI::matches /usr/local/share/perl/5.8.8/Bio/ >> Search/HSP/HSPI.pm:691 >> STACK: >> Bio::Search::SearchUtils::_adjust_contigs /usr/local/share/perl/ >> 5.8.8/Bio/Search/SearchUtils.pm:421 >> STACK: >> Bio::Search::SearchUtils::tile_hsps /usr/local/share/perl/5.8.8/Bio/ >> Search/SearchUtils.pm:200 >> STACK: >> Bio::Search::Hit::GenericHit::strand /usr/local/share/perl/5.8.8/ >> Bio/Search/Hit/GenericHit.pm:1455 >> >> This is using HSPI.pm with the patch starting >> ## ML: START fix for substr out of range error >> >> many thanks for any pointers >> a >> >> Alexie Papanicolaou >> Entomology >> Max Planck Institute for Chemical Ecology >> Hans Knoell Str 8 >> Jena 07745 >> Germany >> Email apapanicolaou at ice.mpg.de >> Tel +493641571561 >> >> >> > > -- > -- > "Eppur si evolve" ("And yet it evolves") > -Galileo Jr (ca 21st century) > > "One Galileo in two thousand years is enough." 
-Pope Pius XII > -- > Alexie Papanicolaou > Entomology > Max Planck Institute for Chemical Ecology > Hans Knoell Str 8 > Jena 07745 > Germany > Email apapanicolaou at ice.mpg.de > Tel +493641571561 > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bartomas at gmail.com Sun Nov 16 10:58:09 2008 From: bartomas at gmail.com (tomas_bar) Date: Sun, 16 Nov 2008 07:58:09 -0800 (PST) Subject: [Bioperl-l] Re trieving all genes of a species from DBBJ/GENBANK using Perl SOAP Message-ID: <20520856.post@talk.nabble.com> Hi, I'm using Perl to query DDBJ using SOAP. I want to find all genes of a given species in the database. However, the different SOAP services available like GetEntry only allow you to retrieve records by accession number. Do you know how I could find all genes of a species? Thank you very much. -- View this message in context: http://www.nabble.com/Retrieving-all-genes-of-a-species-from-DBBJ-GENBANK-using-Perl-SOAP-tp20520856p20520856.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From cjfields at illinois.edu Sun Nov 16 11:38:39 2008 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 16 Nov 2008 10:38:39 -0600 Subject: [Bioperl-l] Thoughts on some test reorganization Message-ID: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> All, I'm still working on an overall plan for a 1.6 release and fixing some bugs. In the meantime, for my sanity I'm planning on doing some test reorganization in subversion, starting with splitting up SearchIO.t (which has become very long and unwieldy) into separate format-specific test files. This may be something to think about for other sets of modules as well which are plugin-able or parser-specific (SeqIO, AlignIO, Bio::Tools*, etc). Though it will lead to quite a few more files, I think it will be easier in the long term to identify and fix format-specific bugs.
It also may be helpful in the long term with splitting up bioperl into subdistributions, identifying holes in test coverage, deprecating unsupported modules, etc. The details (of course subject to debate!): 1) Tests which are parser-specific will be moved to test files in the form SearchIO_*.t, where the '*' represents the specific parser being tested. 2) Tests for methods implemented in SearchIO.pm (such as _guess_format) will remain in SearchIO.t. 3) I'll also move other SearchIO-related tests (hmmer, the pull parsers) to their related SearchIO_* counterparts. 4) The utility method in SearchIO.t will probably be moved to Bio::Search::SearchUtils and imported to prevent code duplication. Comments? Thoughts? chris From cjfields at illinois.edu Sun Nov 16 11:41:58 2008 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 16 Nov 2008 10:41:58 -0600 Subject: [Bioperl-l] Re trieving all genes of a species from DBBJ/GENBANK using Perl SOAP In-Reply-To: <20520856.post@talk.nabble.com> References: <20520856.post@talk.nabble.com> Message-ID: Tomas, Not sure if this is related to bioperl specifically. Is there any particular reason you are using DDBJ over EMBL or GenBank? I would probably go about this using Bio::DB::GenBank, Bio::DB::EntrezGene, or similar in combination with a GenBank query (Bio::DB::Query::GenBank); see the relevant module POD for details. chris On Nov 16, 2008, at 9:58 AM, tomas_bar wrote: > > Hi, > I'm using Perl to query DDBJ using SOAP. > I want to find all genes of a given species in the database. > However, the different SOAP services available like GetEntry only > allow you > to retrieve records by accession number. > Do you know how I could find all genes of a species? > Thank you very much. > > -- > View this message in context: http://www.nabble.com/Retrieving-all-genes-of-a-species-from-DBBJ-GENBANK-using-Perl-SOAP-tp20520856p20520856.html > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From David.Messina at sbc.su.se Sun Nov 16 12:59:15 2008 From: David.Messina at sbc.su.se (Dave Messina) Date: Sun, 16 Nov 2008 18:59:15 +0100 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> Message-ID: <628aabb70811160959l13afa7b6t7c3631fb2346d81b@mail.gmail.com> Sounds great, Chris. Makes sense to me. Dave From charles-listes+bioperl at plessy.org Sun Nov 16 19:31:47 2008 From: charles-listes+bioperl at plessy.org (Charles Plessy) Date: Mon, 17 Nov 2008 09:31:47 +0900 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> Message-ID: <20081117003147.GB23090@kunpuu.plessy.org> Hello Chris, if I can make a wish for the test suite, could we have some sort of --no-internet option that would disable the tests requiring internet access? That would allow Debian to run the tests at build time and have the results in our build logs, which are publicly readable. Have a nice day, -- Charles Plessy Debian Med packaging team, http://www.debian.org/devel/debian-med Tsurumi, Kanagawa, Japan From cjfields at illinois.edu Sun Nov 16 21:24:44 2008 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 16 Nov 2008 20:24:44 -0600 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: <20081117003147.GB23090@kunpuu.plessy.org> References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> <20081117003147.GB23090@kunpuu.plessy.org> Message-ID: Could you post that as an enhancement request to bugzilla for the 1.6 release? We have had similar requests for default options.
I'm not sure we can add these in for the first 1.6 release but if someone has tuits... chris On Nov 16, 2008, at 6:31 PM, Charles Plessy wrote: > Hello Chris, > > if I can make a wish for the test suite, could we have some sort of > --no-internet option that would disable the tests requiring internet > acces? > > That would allow Debian to run the tests at build time and have the > results in > our build logs, which are publically readable. > > Have a nice day, > > -- > Charles Plessy > Debian Med packaging team, > http://www.debian.org/devel/debian-med > Tsurumi, Kanagawa, Japan From cjfields at illinois.edu Sun Nov 16 21:34:00 2008 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 16 Nov 2008 20:34:00 -0600 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: <27776686627546EA9269EC05C8A6B987@nexus.csiro.au> References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> <27776686627546EA9269EC05C8A6B987@nexus.csiro.au> Message-ID: <074BA856-BBA2-43CE-8BAB-8B20BA892704@illinois.edu> Nathan, I would support adding support for Test::Class, but it isn't part of perl 5.6.1 or 5.8.x core. So, would we need to add it as another dependency for testing (which, unless there is a workaround, appears to be 'yes')? Speaking of older perl releases, I would like to get an idea of how many users are still using (over 8-yr old) perl 5.6.1. I would really like to require 5.8 or above for bioperl 1.6. perl 5.6.x stopped development many moons ago now, and 5.8.x is about to be 'end-of-lifed': http://www.perlmonks.org/?node_id=723008 chris On Nov 16, 2008, at 4:57 PM, Nathan S. Watson-Haigh wrote: > Hi Chris, > > I don't know if using Test::Class might be an option: > http://safari.oreilly.com/0596100922/perltestingadn-CHP-8 > http://perlandmac.blogspot.com/2007/08/using-perl-testclass-to-organized-uni > t.html > http://search.cpan.org/dist/Test-Class/ > > Anyway, just threw that out there to see if there were any thoughts > on this. 
> > Nath > > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris > Fields > Sent: Monday, 17 November 2008 2:39 AM > To: BioPerl List > Subject: [Bioperl-l] Thoughts on some test reorganization > > All, > > I'm still working on an overall plan for a 1.6 release and fixing some > bugs. In the meantime, for my sanity I'm planning on doing some test > reorganization in subversion, starting with splitting up SearchIO.t > (which has become very long and unwieldy) into separate format- > specific test files. > > This may be something to think about for other sets of modules as well > which are plugin-able or parser-specific (SeqIO, AlignIO, Bio::Tools*, > etc). Though it will lead to quite a few more files, I think it will > be easier in the long term to identify and fix format-specific bugs. > It also may be helpful in the long term with splitting up bioperl into > subdistributions, identifying holes in test coverage, deprecating > unsupported modules, etc. > > The details (of course subject to debate!): > > 1) Tests which are parser-specific will be moved to test files in the > form SearchIO_*.t, where the '*' represents the specific parser being > tested. > 2) Tests for methods implemented in SearchIO.pm (such as > _guess_format) will remain in SearchIO.t. > 3) I'll also move other SearchIO-related tests (hmmer, the pull > parsers) to their related SearchIO_* counterparts. > 4) The utility method in SearchIO.t will probably be moved to > Bio::Search::SearchUtils and imported in to prevent code dups. > > Comments? Thoughts? 
> > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From charles-listes+bioperl at plessy.org Sun Nov 16 21:42:07 2008 From: charles-listes+bioperl at plessy.org (Charles Plessy) Date: Mon, 17 Nov 2008 11:42:07 +0900 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> <20081117003147.GB23090@kunpuu.plessy.org> Message-ID: <20081117024207.GA27142@kunpuu.plessy.org> On Sun, Nov 16, 2008 at 08:24:44PM -0600, Chris Fields wrote: > Could you post that as an enhancement request to bugzilla for the 1.6 > release? Done: http://bugzilla.open-bio.org/show_bug.cgi?id=2665 Have a nice day, -- Charles From cjfields at illinois.edu Sun Nov 16 21:42:17 2008 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 16 Nov 2008 20:42:17 -0600 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: <628aabb70811160959l13afa7b6t7c3631fb2346d81b@mail.gmail.com> References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> <628aabb70811160959l13afa7b6t7c3631fb2346d81b@mail.gmail.com> Message-ID: I'll commit in the next day or two unless anyone indicates to the contrary. Hopefully that's enough time to gather some more comments. chris On Nov 16, 2008, at 11:59 AM, Dave Messina wrote: > Sounds great, Chris. Makes sense to me. > > > Dave > > > From hlapp at gmx.net Sun Nov 16 22:43:57 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Sun, 16 Nov 2008 22:43:57 -0500 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: <074BA856-BBA2-43CE-8BAB-8B20BA892704@illinois.edu> References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> <27776686627546EA9269EC05C8A6B987@nexus.csiro.au> <074BA856-BBA2-43CE-8BAB-8B20BA892704@illinois.edu> Message-ID: On Nov 16, 2008, at 9:34 PM, Chris Fields wrote: > 5.8.x is about to be 'end-of-lifed' Interesting.
5.8.x is the version of Perl for Mac OSX Leopard (and Tiger, too) - I don't see the need to support it going away any time soon. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at illinois.edu Mon Nov 17 00:30:30 2008 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 16 Nov 2008 23:30:30 -0600 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> <27776686627546EA9269EC05C8A6B987@nexus.csiro.au> <074BA856-BBA2-43CE-8BAB-8B20BA892704@illinois.edu> Message-ID: Hilmar, I think we should support 5.8.1 as a minimum but drop support for perl 5.6 and below. The perl devs are pushing on 5.10 pretty heavily, with 5.10.1 due within 6-12 months. Parrot development (and perl6 along with it) is actually progressing rapidly enough I wouldn't be surprised to see a decently working perl 6 alpha implementation within a year's time (see latest timeline here: http://use.perl.org/~chromatic/journal/37889). bioperl6 anyone? Note: dropping support for 5.6 may not mean that bioperl wouldn't work with 5.6.x; it has been actively tested on it over the years. It can easily mean 'we will no longer make fixes to deal with backporting issues to 5.6'. However, I suggest we make the min 5.8 requirement explicit. The latest perl 5.6 release is just over 5 years old. Completely OT, but I wouldn't be surprised if OS X 10.6 (Snow Leopard) ships with 5.10, just haven't seen anything to confirm that yet. chris On Nov 16, 2008, at 9:43 PM, Hilmar Lapp wrote: > > On Nov 16, 2008, at 9:34 PM, Chris Fields wrote: > >> 5.8.x is about to be 'end-of-lifed' > > > Interesting. 5.8.x is the version of Perl for Mac OSX Leopard (and > Tiger, too) - I don't see the need to support it going away any time > soon. 
> > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== From bix at sendu.me.uk Mon Nov 17 04:45:04 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 17 Nov 2008 09:45:04 +0000 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: <20081117003147.GB23090@kunpuu.plessy.org> References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> <20081117003147.GB23090@kunpuu.plessy.org> Message-ID: <49213D20.7020304@sendu.me.uk> Charles Plessy wrote: > if I can make a wish for the test suite, could we have some sort of > --no-internet option that would disable the tests requiring internet acces? By default internet requiring tests are not run. You have to turn them on with --network or say yes during an interactive Build.pl run. I'm marking your enhancement request on bugzilla as invalid, but if you are running into some issue that gives you internet tests when you don't want them, please update the bug report with details. > That would allow Debian to run the tests at build time and have the results in > our build logs, which are publically readable. Though in any case, I'm not sure how having the internet tests turned on prevents the above? From spiros at lokku.com Mon Nov 17 04:55:35 2008 From: spiros at lokku.com (Spiros Denaxas) Date: Mon, 17 Nov 2008 09:55:35 +0000 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> <27776686627546EA9269EC05C8A6B987@nexus.csiro.au> <074BA856-BBA2-43CE-8BAB-8B20BA892704@illinois.edu> Message-ID: I agree, 5.10 is the new cool kid on the block yet I have not come across many people that can firmly say they use it in a production / commercial environment. I will chime in and say that, ideally, support for 5.8.* should be a few years at least from going away. 
Spiros On Mon, Nov 17, 2008 at 3:43 AM, Hilmar Lapp wrote: > > On Nov 16, 2008, at 9:34 PM, Chris Fields wrote: > > 5.8.x is about to be 'end-of-lifed' >> > > > Interesting. 5.8.x is the version of Perl for Mac OSX Leopard (and Tiger, > too) - I don't see the need to support it going away any time soon. > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at illinois.edu Mon Nov 17 07:34:34 2008 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 17 Nov 2008 06:34:34 -0600 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> <27776686627546EA9269EC05C8A6B987@nexus.csiro.au> <074BA856-BBA2-43CE-8BAB-8B20BA892704@illinois.edu> Message-ID: <3B619C86-60C4-4B77-9A9E-D763C2BF2A64@illinois.edu> 5.8 will be supported for a while. Personally I use 5.10 with bioperl and only found a few issues that were fairly easy to deal with. But then again, I am also messing around with Rakudo Perl (6). chris On Nov 17, 2008, at 3:55 AM, Spiros Denaxas wrote: > I agree, 5.10 is the new cool kid on the block yet I have not come > across > many people that can firmly say they use it in a production / > commercial > environment. > I will chime in and say that, ideally, support for 5.8.* should be a > few > years at least from going away. > > Spiros > > On Mon, Nov 17, 2008 at 3:43 AM, Hilmar Lapp wrote: > >> >> On Nov 16, 2008, at 9:34 PM, Chris Fields wrote: >> >> 5.8.x is about to be 'end-of-lifed' >>> >> >> >> Interesting. 5.8.x is the version of Perl for Mac OSX Leopard (and >> Tiger, >> too) - I don't see the need to support it going away any time soon. 
>> >> -hilmar >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Mon Nov 17 07:40:02 2008 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 17 Nov 2008 06:40:02 -0600 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: <49213D20.7020304@sendu.me.uk> References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> <20081117003147.GB23090@kunpuu.plessy.org> <49213D20.7020304@sendu.me.uk> Message-ID: On Nov 17, 2008, at 3:45 AM, Sendu Bala wrote: > Charles Plessy wrote: >> if I can make a wish for the test suite, could we have some sort of >> --no-internet option that would disable the tests requiring >> internet acces? > > By default internet requiring tests are not run. You have to turn > them on with --network or say yes during an interactive Build.pl run. > > I'm marking your enhancement request on bugzilla as invalid, but if > you are running into some issue that gives you internet tests when > you don't want them, please update the bug report with details. Okay, works for me. 
chris From rymarquis at dbi.udel.edu Mon Nov 17 07:42:28 2008 From: rymarquis at dbi.udel.edu (larymarquis) Date: Mon, 17 Nov 2008 04:42:28 -0800 (PST) Subject: [Bioperl-l] Re trieving all genes of a species from DBBJ/GENBANK using Perl SOAP In-Reply-To: References: <20520856.post@talk.nabble.com> Message-ID: <20538720.post@talk.nabble.com> Chris Fields-5 wrote: > > > I would probably go about this using Bio::DB::GenBank, > Bio::DB::EntrezGene, or similar in combination with a GenBank query > (Bio::DB::Query::GenBank); see the relevant module POD for details. > > If you are using NCBI, the Taxonomy browser in NCBI is a good way to identify the appropriate query term to obtain all the sequences for a given species. You can navigate to your species in the Taxonomy tree and select what databases you want to query from the check boxes at the top. After hitting the go button, it will display the number of sequences meeting the criteria. Clicking on the number will send you to the normal results page except a keyword has been filled in for you (something like txid3052[Organism:exp] ). You can then use this keyword to access the entire set of sequences using EFETCH from NCBI and maybe the bioperl modules as well. Linda -- View this message in context: http://www.nabble.com/Retrieving-all-genes-of-a-species-from-DBBJ-GENBANK-using-Perl-SOAP-tp20520856p20538720.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. 
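A minimal sketch of the approach described above, assuming BioPerl is installed and NCBI is reachable. It builds the txid query term in the form the Taxonomy browser fills in, then streams the matching records through Bio::DB::Query::GenBank and Bio::DB::GenBank. The helper names taxon_query and print_genes_for_taxon are made up for illustration and are not part of BioPerl; the 'nucleotide' database is also just an assumption about what one would want to search.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the Entrez query term for a taxon, in the form shown by the
# NCBI Taxonomy browser (e.g. txid3052[Organism:exp]).
# taxon_query() is a hypothetical helper, not part of BioPerl.
sub taxon_query {
    my ($taxid) = @_;
    die "taxon id must be a positive integer\n"
        unless defined $taxid && $taxid =~ /^[1-9]\d*$/;
    return "txid${taxid}[Organism:exp]";
}

# Stream every nucleotide record matching the query and print the gene
# names annotated on it. Needs BioPerl and network access.
sub print_genes_for_taxon {
    my ($taxid) = @_;

    # Loaded lazily so taxon_query() can be used without BioPerl.
    require Bio::DB::GenBank;
    require Bio::DB::Query::GenBank;

    my $query = Bio::DB::Query::GenBank->new(
        -db    => 'nucleotide',
        -query => taxon_query($taxid),
    );
    print "Records found: ", $query->count, "\n";

    my $gb     = Bio::DB::GenBank->new;
    my $stream = $gb->get_Stream_by_query($query);
    while ( my $seq = $stream->next_seq ) {
        # One record can carry several gene features, or none at all.
        for my $feat ( $seq->get_SeqFeatures ) {
            next unless $feat->primary_tag eq 'gene';
            next unless $feat->has_tag('gene');
            my ($name) = $feat->get_tag_values('gene');
            print join( "\t", $seq->accession_number, $name ), "\n";
        }
    }
    return;
}

# Only touch the network when a taxon id is given on the command line.
print_genes_for_taxon( $ARGV[0] ) if @ARGV;
```

Run with a taxon id as the only argument (3052 is the id from the example keyword above). For a large species the result set can be very big, so it is probably worth checking the reported count before letting the loop run.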
From charles-listes+bioperl at plessy.org Mon Nov 17 08:53:41 2008
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Mon, 17 Nov 2008 22:53:41 +0900
Subject: [Bioperl-l] Thoughts on some test reorganization
In-Reply-To: <49213D20.7020304@sendu.me.uk>
References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu>
	<20081117003147.GB23090@kunpuu.plessy.org>
	<49213D20.7020304@sendu.me.uk>
Message-ID: <20081117135341.GF7545@kunpuu.plessy.org>

On Mon, Nov 17, 2008 at 09:45:04AM +0000, Sendu Bala wrote:
> Charles Plessy wrote:
>> if I can make a wish for the test suite, could we have some sort of
>> --no-internet option that would disable the tests requiring internet
>> acces?
>
> By default internet requiring tests are not run. You have to turn them
> on with --network or say yes during an interactive Build.pl run.

Oops, I confused myself with another Perl package that had the problem.

Sorry for the noise,

--
Charles Plessy
Debian Med packaging team,
Tsurumi, Kanagawa, Japan

From cjfields at illinois.edu Mon Nov 17 09:13:02 2008
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 17 Nov 2008 08:13:02 -0600
Subject: [Bioperl-l] Release 1.6
Message-ID: <827932B6-FCBD-4B98-B307-883A93F0DDFF@illinois.edu>

All,

I have added some general ideas on the 1.6 release here:

http://www.bioperl.org/wiki/Release_Schedule

It's still incomplete (I need to add a bug priority list, including the latest one Sendu just posted); I'll be updating it over the next few days.
Feel free to add comments on the discussion page (this would be the spot to take on tasks for those with the tuits): http://www.bioperl.org/wiki/Talk:Release_Schedule chris From hlapp at gmx.net Mon Nov 17 12:03:28 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 17 Nov 2008 12:03:28 -0500 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> <27776686627546EA9269EC05C8A6B987@nexus.csiro.au> <074BA856-BBA2-43CE-8BAB-8B20BA892704@illinois.edu> Message-ID: <33B53DD7-D362-453D-A21F-67C76E4B7ECD@gmx.net> On Nov 17, 2008, at 12:30 AM, Chris Fields wrote: > Completely OT, but I wouldn't be surprised if OS X 10.6 (Snow > Leopard) ships with 5.10, just haven't seen anything to confirm that > yet. True, but I've just recently been dumbfounded when a student in a course we were running showed up with a Mac OSX Jaguar laptop (and was quite happy with it). In other words, not everyone out there upgrades the OS eagerly. I agree it's reasonable not to put a lot of energy into fixing bugs that only show up under Perl prior to 5.8.x. But if BioPerl refuses to even work (or spit out ugly warnings) under 5.6, isn't that a bit too much of forcing upgrades on people who may not necessarily need it? 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From apapanicolaou at ice.mpg.de Mon Nov 17 12:43:07 2008 From: apapanicolaou at ice.mpg.de (Alexie Papanicolaou) Date: Mon, 17 Nov 2008 17:43:07 +0000 Subject: [Bioperl-l] undefined sub-sequence with a single base In-Reply-To: <19F8A0BA-799D-4C4A-8713-6129543C30E1@illinois.edu> References: <1226674555.6451.37.camel@alexie-laptop> <1226674770.6451.41.camel@alexie-laptop> <19F8A0BA-799D-4C4A-8713-6129543C30E1@illinois.edu> Message-ID: <1226943787.17996.26.camel@alexie-laptop> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bix at sendu.me.uk Mon Nov 17 12:30:57 2008 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 17 Nov 2008 17:30:57 +0000 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: <33B53DD7-D362-453D-A21F-67C76E4B7ECD@gmx.net> References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> <27776686627546EA9269EC05C8A6B987@nexus.csiro.au> <074BA856-BBA2-43CE-8BAB-8B20BA892704@illinois.edu> <33B53DD7-D362-453D-A21F-67C76E4B7ECD@gmx.net> Message-ID: <4921AA51.5020503@sendu.me.uk> Hilmar Lapp wrote: > In other words, not everyone out there upgrades the OS eagerly. > > I agree it's reasonable not to put a lot of energy into fixing bugs that > only show up under Perl prior to 5.8.x. But if BioPerl refuses to even > work (or spit out ugly warnings) under 5.6, isn't that a bit too much of > forcing upgrades on people who may not necessarily need it? My thoughts as well. Chris, did you see something specific to justify a change? Like, for 1.5.2 there were specific modules/pragmas only first included in 5.6 that motivated the change. I don't think requiring people upgrade their perl just so we can enjoy some entirely /theoretical/ benefit really makes much sense. 
From jason at bioperl.org Mon Nov 17 13:16:17 2008 From: jason at bioperl.org (Jason Stajich) Date: Mon, 17 Nov 2008 10:16:17 -0800 Subject: [Bioperl-l] undefined sub-sequence with a single base In-Reply-To: <1226943787.17996.26.camel@alexie-laptop> References: <1226674555.6451.37.camel@alexie-laptop> <1226674770.6451.41.camel@alexie-laptop> <19F8A0BA-799D-4C4A-8713-6129543C30E1@illinois.edu> <1226943787.17996.26.camel@alexie-laptop> Message-ID: <2752CD02-90AE-40AA-8456-56B38CAC6C3B@bioperl.org> Personally - I'm not sure I trust tile_hsps on a translated search - or at all - really - you may want to compute the "dominant" strand yourself by iterating through the HSPs or using WU-BLAST to get logical groups of HSPs which is a better tiling HSP algorithm (the -- links option in WU-BLAST). -jason On Nov 17, 2008, at 9:43 AM, Alexie Papanicolaou wrote: > Hi Chris > > Sorry, I got the new SVN build today and still get the same error... > > Could it be because the subseq is not divisible by 3 (due to blastx)? > > a > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Undefined sub-sequence (2,2). Valid range = 2 - 190 > STACK: Error::throw > STACK: > Bio::Root::Root::throw /usr/local/share/perl/5.8.8/Bio/Root/Root.pm: > 357 > STACK: > Bio::Search::HSP::HSPI::matches /usr/local/share/perl/5.8.8/Bio/ > Search/HSP/HSPI.pm:691 > STACK: > Bio::Search::SearchUtils::_adjust_contigs /usr/local/share/perl/ > 5.8.8/Bio/Search/SearchUtils.pm:460 > STACK: > Bio::Search::SearchUtils::tile_hsps /usr/local/share/perl/5.8.8/Bio/ > Search/SearchUtils.pm:200 > STACK: > Bio::Search::Hit::GenericHit::strand /usr/local/share/perl/5.8.8/Bio/ > Search/Hit/GenericHit.pm:1455 > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Undefined sub-sequence (3,4). 
Valid range = 3 - 44 > STACK: Error::throw > STACK: > Bio::Root::Root::throw /usr/local/share/perl/5.8.8/Bio/Root/Root.pm: > 357 > STACK: > Bio::Search::HSP::HSPI::matches /usr/local/share/perl/5.8.8/Bio/ > Search/HSP/HSPI.pm:691 > STACK: > Bio::Search::SearchUtils::_adjust_contigs /usr/local/share/perl/ > 5.8.8/Bio/Search/SearchUtils.pm:404 > STACK: > Bio::Search::SearchUtils::tile_hsps /usr/local/share/perl/5.8.8/Bio/ > Search/SearchUtils.pm:200 > STACK: > Bio::Search::Hit::GenericHit::strand /usr/local/share/perl/5.8.8/Bio/ > Search/Hit/GenericHit.pm:1455 > > > > > > > On Fri, 2008-11-14 at 11:08 -0600, Chris Fields wrote: > >> We've switched to subversion a while ago. Could you try updating >> from >> there, or using one of our nightly builds? >> >> http://www.bioperl.org/DIST/nightly_builds/ >> >> chris >> >> On Nov 14, 2008, at 8:59 AM, Alexie Papanicolaou wrote: > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Jason Stajich jason at bioperl.org From cjfields at illinois.edu Mon Nov 17 13:36:32 2008 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 17 Nov 2008 12:36:32 -0600 Subject: [Bioperl-l] undefined sub-sequence with a single base In-Reply-To: <2752CD02-90AE-40AA-8456-56B38CAC6C3B@bioperl.org> References: <1226674555.6451.37.camel@alexie-laptop> <1226674770.6451.41.camel@alexie-laptop> <19F8A0BA-799D-4C4A-8713-6129543C30E1@illinois.edu> <1226943787.17996.26.camel@alexie-laptop> <2752CD02-90AE-40AA-8456-56B38CAC6C3B@bioperl.org> Message-ID: I agree completely with everything Jason says. 
Not to mention this is also very similar to a filed bug, which I have no clue on how to fix: http://bugzilla.open-bio.org/show_bug.cgi?id=2476 chris On Nov 17, 2008, at 12:16 PM, Jason Stajich wrote: > Personally - I'm not sure I trust tile_hsps on a translated search - > or at all - really - you may want to compute the "dominant" strand > yourself by iterating through the HSPs or using WU-BLAST to get > logical groups of HSPs which is a better tiling HSP algorithm (the -- > links option in WU-BLAST). > > -jason > On Nov 17, 2008, at 9:43 AM, Alexie Papanicolaou wrote: > >> Hi Chris >> >> Sorry, I got the new SVN build today and still get the same error... >> >> Could it be because the subseq is not divisible by 3 (due to blastx)? >> >> a >> >> >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: Undefined sub-sequence (2,2). Valid range = 2 - 190 >> STACK: Error::throw >> STACK: >> Bio::Root::Root::throw /usr/local/share/perl/5.8.8/Bio/Root/Root.pm: >> 357 >> STACK: >> Bio::Search::HSP::HSPI::matches /usr/local/share/perl/5.8.8/Bio/ >> Search/HSP/HSPI.pm:691 >> STACK: >> Bio::Search::SearchUtils::_adjust_contigs /usr/local/share/perl/ >> 5.8.8/Bio/Search/SearchUtils.pm:460 >> STACK: >> Bio::Search::SearchUtils::tile_hsps /usr/local/share/perl/5.8.8/Bio/ >> Search/SearchUtils.pm:200 >> STACK: >> Bio::Search::Hit::GenericHit::strand /usr/local/share/perl/5.8.8/ >> Bio/Search/Hit/GenericHit.pm:1455 >> >> >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: Undefined sub-sequence (3,4). 
Valid range = 3 - 44 >> STACK: Error::throw >> STACK: >> Bio::Root::Root::throw /usr/local/share/perl/5.8.8/Bio/Root/Root.pm: >> 357 >> STACK: >> Bio::Search::HSP::HSPI::matches /usr/local/share/perl/5.8.8/Bio/ >> Search/HSP/HSPI.pm:691 >> STACK: >> Bio::Search::SearchUtils::_adjust_contigs /usr/local/share/perl/ >> 5.8.8/Bio/Search/SearchUtils.pm:404 >> STACK: >> Bio::Search::SearchUtils::tile_hsps /usr/local/share/perl/5.8.8/Bio/ >> Search/SearchUtils.pm:200 >> STACK: >> Bio::Search::Hit::GenericHit::strand /usr/local/share/perl/5.8.8/ >> Bio/Search/Hit/GenericHit.pm:1455 >> >> >> >> >> >> >> On Fri, 2008-11-14 at 11:08 -0600, Chris Fields wrote: >> >>> We've switched to subversion a while ago. Could you try updating >>> from >>> there, or using one of our nightly builds? >>> >>> http://www.bioperl.org/DIST/nightly_builds/ >>> >>> chris >>> >>> On Nov 14, 2008, at 8:59 AM, Alexie Papanicolaou wrote: >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Jason Stajich > jason at bioperl.org > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From cjfields at illinois.edu Mon Nov 17 14:04:46 2008 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 17 Nov 2008 13:04:46 -0600 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: <4921AA51.5020503@sendu.me.uk> References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> <27776686627546EA9269EC05C8A6B987@nexus.csiro.au> <074BA856-BBA2-43CE-8BAB-8B20BA892704@illinois.edu> <33B53DD7-D362-453D-A21F-67C76E4B7ECD@gmx.net> <4921AA51.5020503@sendu.me.uk> Message-ID: <6D936C4E-FD5D-45FE-AF17-6FD186C6078F@illinois.edu> On Nov 17, 2008, at 11:30 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> In other words, not everyone out there upgrades the OS eagerly. >> I agree it's reasonable not to put a lot of energy into fixing bugs >> that only show up under Perl prior to 5.8.x. But if BioPerl refuses >> to even work (or spit out ugly warnings) under 5.6, isn't that a >> bit too much of forcing upgrades on people who may not necessarily >> need it? > > My thoughts as well. > > Chris, did you see something specific to justify a change? Like, for > 1.5.2 there were specific modules/pragmas only first included in 5.6 > that motivated the change. No, hence my bit indicating that 5.6 should work, and whether or not we want to make the 5.8 requirement explicit. I also don't think we should be fixing bugs or making changes to deal with a 5-yr-old perl release when upgrading one's local perl is a much better option (not to mention the benefit of bug fixes, more cohesive core, better security, etc). As I mentioned, even 5.8 has effectively been 'end-of- lifed', so why actively support a version that is even older than that? > I don't think requiring people upgrade their perl just so we can > enjoy some entirely /theoretical/ benefit really makes much sense. 
We can leave the indication that we require 5.6.1 and up but recommend 5.8.x (already in place) and will only support fixes for perl 5.8 (not made explicit). 99.9% of the time when a bug is reported the perl version will not make a difference, but I don't want to be shoehorned into supporting an old version of perl when push comes to shove. chris From cjfields at illinois.edu Mon Nov 17 19:08:10 2008 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 17 Nov 2008 18:08:10 -0600 Subject: [Bioperl-l] Thoughts on some test reorganization In-Reply-To: References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu> <628aabb70811160959l13afa7b6t7c3631fb2346d81b@mail.gmail.com> Message-ID: Committed. I'll start doing the same for the other IO modules where possible. chris On Nov 16, 2008, at 8:42 PM, Chris Fields wrote: > I'll commit in the next day or two unless anyone indicates to the > contrary. Hopefully that's enough time to gather some more comments. > > chris > > On Nov 16, 2008, at 11:59 AM, Dave Messina wrote: > >> Sounds great, Chris. Makes sense to me. >> >> >> Dave From bartomas at gmail.com Tue Nov 18 05:45:46 2008 From: bartomas at gmail.com (bar tomas) Date: Tue, 18 Nov 2008 10:45:46 +0000 Subject: [Bioperl-l] Re trieving all genes of a species from DBBJ/GENBANK using Perl SOAP In-Reply-To: <20538720.post@talk.nabble.com> References: <20520856.post@talk.nabble.com> <20538720.post@talk.nabble.com> Message-ID: Hi Thank you very much for your replies, and sorry that my post was misplaced in this forum. I'm pretty clueless on this subject. My difficulty is that my aim is to retrieve all gene names of a given species and that the when performing a query on a species name, I get a list of accession numbers. Can a record identified by an accession number span several genes? Can a record identified by an accession number refer to a sequence that is not a gene? My choice of using DDBJ is pretty uninformed too. 
I had understood that the information contained in DDBJ includes GenBank and I was interested in the DDBJ facility of performing XPath queries on GenBank entries. But do you recommend using NCBI Entrez instead?

Thank you very much and sorry about my clueless questions.

tomas

On Mon, Nov 17, 2008 at 12:42 PM, larymarquis wrote:

>
>
> Chris Fields-5 wrote:
> >
> >
> > I would probably go about this using Bio::DB::GenBank,
> > Bio::DB::EntrezGene, or similar in combination with a GenBank query
> > (Bio::DB::Query::GenBank); see the relevant module POD for details.
> >
> >
>
> If you are using NCBI, the Taxonomy browser in NCBI is a good way to
> identify the appropriate query term to obtain all the sequences for a given
> species.
>
> You can navigate to your species in the Taxonomy tree and select what
> databases you want to query from the check boxes at the top. After hitting
> the go button, it will display the number of sequences meeting the
> criteria.
> Clicking on the number will send you to the normal results page except a
> keyword has been filled in for you (something like txid3052[Organism:exp]
> ).
> You can then use this keyword to access the entire set of sequences using
> EFETCH from NCBI and maybe the bioperl modules as well.
> Linda
>
> --
> View this message in context:
> http://www.nabble.com/Retrieving-all-genes-of-a-species-from-DBBJ-GENBANK-using-Perl-SOAP-tp20520856p20538720.html
> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From bartomas at gmail.com Sat Nov 15 19:01:47 2008
From: bartomas at gmail.com (tomas_bar)
Date: Sat, 15 Nov 2008 16:01:47 -0800 (PST)
Subject: [Bioperl-l] Re trieving all genes of a species from DBBJ/GENBANK using Perl SOAP
Message-ID: <20520856.post@talk.nabble.com>

Hi,

I'm using Perl to query DDBJ using SOAP.
I want to find all genes of a given species in the database. However, the different SOAP services available like GetEntry only allow you to retrieve records by accession number.
Do you know how I could find all genes of a species?

Thank you very much.
--
View this message in context: http://www.nabble.com/Retrieving-all-genes-of-a-species-from-DBBJ-GENBANK-using-Perl-SOAP-tp20520856p20520856.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From nhaigh at sheffield.ac.uk Sun Nov 16 17:57:19 2008
From: nhaigh at sheffield.ac.uk (Nathan S. Watson-Haigh)
Date: Mon, 17 Nov 2008 08:57:19 +1000
Subject: [Bioperl-l] Thoughts on some test reorganization
In-Reply-To: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu>
References: <33ED8964-5333-4F76-BD3C-0B89671ECCFB@illinois.edu>
Message-ID: <27776686627546EA9269EC05C8A6B987@nexus.csiro.au>

Hi Chris,

I don't know if using Test::Class might be an option:

http://safari.oreilly.com/0596100922/perltestingadn-CHP-8
http://perlandmac.blogspot.com/2007/08/using-perl-testclass-to-organized-unit.html
http://search.cpan.org/dist/Test-Class/

Anyway, just threw that out there to see if there were any thoughts on this.
Nath -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields Sent: Monday, 17 November 2008 2:39 AM To: BioPerl List Subject: [Bioperl-l] Thoughts on some test reorganization All, I'm still working on an overall plan for a 1.6 release and fixing some bugs. In the meantime, for my sanity I'm planning on doing some test reorganization in subversion, starting with splitting up SearchIO.t (which has become very long and unwieldy) into separate format- specific test files. This may be something to think about for other sets of modules as well which are plugin-able or parser-specific (SeqIO, AlignIO, Bio::Tools*, etc). Though it will lead to quite a few more files, I think it will be easier in the long term to identify and fix format-specific bugs. It also may be helpful in the long term with splitting up bioperl into subdistributions, identifying holes in test coverage, deprecating unsupported modules, etc. The details (of course subject to debate!): 1) Tests which are parser-specific will be moved to test files in the form SearchIO_*.t, where the '*' represents the specific parser being tested. 2) Tests for methods implemented in SearchIO.pm (such as _guess_format) will remain in SearchIO.t. 3) I'll also move other SearchIO-related tests (hmmer, the pull parsers) to their related SearchIO_* counterparts. 4) The utility method in SearchIO.t will probably be moved to Bio::Search::SearchUtils and imported in to prevent code dups. Comments? Thoughts? chris _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From maj at fortinbras.us Tue Nov 18 22:05:34 2008 From: maj at fortinbras.us (Mark A. 
Jensen) Date: Tue, 18 Nov 2008 22:05:34 -0500 Subject: [Bioperl-l] PrimarySeq properties in LocatableSeqs Message-ID: Gurus- I ran into a interesting 'bug' while taking slices of a Bio::SimpleAlign. I had set the primary_id of the original LocatableSeqs while constructing the aln. When the slice is delivered using Bio::SimpleAlign::slice(), the primary_id's didn't travel with the subseqs constructed for the subalignment, and this hammered subsequent manipulations with the subalignment. Inspecting slice(), I saw that the new objects created for the subseqs get the id (display_id) from the old, but that other properties with valid accessors in the base class are not passed along, which seemed a bit arbitrary. Course, I can set the other properties after the slice is delivered, but that seems kludgy, and the bug was strange and led to 'time spent in deeper understanding of BioPerl'. What is the philosophy: Could/should all fields/properties from the base classes be generally inherited when constructing an new derived object from an old one? cheers, Mark From cjfields at illinois.edu Tue Nov 18 23:48:20 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 18 Nov 2008 22:48:20 -0600 Subject: [Bioperl-l] PrimarySeq properties in LocatableSeqs In-Reply-To: References: Message-ID: On Nov 18, 2008, at 9:05 PM, Mark A. Jensen wrote: > Gurus- > I ran into a interesting 'bug' while taking slices of a > Bio::SimpleAlign. I had set the primary_id of the original > LocatableSeqs while constructing the aln. When the slice is > delivered using Bio::SimpleAlign::slice(), the primary_id's didn't > travel with the subseqs constructed for the subalignment, and this > hammered subsequent manipulations with the subalignment. > Inspecting slice(), I saw that the new objects created for the subseqs > get the id (display_id) from the old, but that other properties > with valid accessors in the base class are not passed along, which > seemed a bit arbitrary. 
Course, I can set the other properties after > the slice is delivered, but that seems kludgy, and the bug was strange > and led to 'time spent in deeper understanding of BioPerl'. > What is the philosophy: Could/should all fields/properties from the > base classes be generally inherited when constructing an new derived > object from an old one? > cheers, Mark In general, yes they should when appropriate. However, the problem is that the API may change slightly over time to deal with additional problems (add new attributes/methods), but the method making the slices is incapable of automatically dealing with these and must be updated as well. I think trunc() was supposed to do this but was never implemented. If you can could you add this as a bug? chris From apapanicolaou at ice.mpg.de Wed Nov 19 05:37:04 2008 From: apapanicolaou at ice.mpg.de (Alexie Papanicolaou) Date: Wed, 19 Nov 2008 10:37:04 +0000 Subject: [Bioperl-l] undefined sub-sequence with a single base In-Reply-To: References: <1226674555.6451.37.camel@alexie-laptop> <1226674770.6451.41.camel@alexie-laptop> <19F8A0BA-799D-4C4A-8713-6129543C30E1@illinois.edu> <1226943787.17996.26.camel@alexie-laptop> <2752CD02-90AE-40AA-8456-56B38CAC6C3B@bioperl.org> Message-ID: <1227091024.6703.8.camel@alexie-laptop> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From cjfields at illinois.edu Fri Nov 21 14:12:59 2008 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 21 Nov 2008 13:12:59 -0600 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl 1.6 release agenda Message-ID: All, Apologies in advance for the very long post. This will be placed and organized on the wiki (links below) for those who don't want to wade through the entire email. I'm putting together the general priority list for the BioPerl 1.6 release. Following are a number of items or proposals I think need to be addressed. 
This list is neither meant to be comprehensive nor does it represent everything that must be completed for the next immediate release, but indicates issues I believe need to be addressed prior to the release. The end game is to start setting and prioritizing goals for 1.6 and future releases. I'm sure that I've missed or glossed over a few things (and possibly missed the mark completely on others), so feel free to speak up! All comments are welcome. If there are any additional concerns or items now is the time to mention them so they can be addressed. Comments can be made here or be added to the wiki release schedule discussion page (relevant comments on the mail list thread will be copied over). http://www.bioperl.org/wiki/Talk:Release_1.6 Also, if anyone wants to work on certain items, please indicate so here or on the wiki. Items 'locked down' for the release series (along with relevant devs working on them) will be posted to the main release 1.6 page (accessible only to wiki sysops): http://www.bioperl.org/wiki/Release_1.6 I would like to continue discussion for the next week or two, allowing some time for the Thanksgiving holiday, then lock down release priorities soon after with the goal of releasing 1.6 within a reasonable time (2-4 weeks if possible). Bio::Graphics and Splitting BioPerl: 1a) We are planning on splitting up bioperl into smaller (more manageable) subdistributions; this will mainly occur after 1.6. However, GBrowse-related modules (Bio::Graphics, Bio::DB::SeqFeature, Bio::DB::GFF) are undergoing rapid development, much more than the rest of the bioperl modules. Therefore I suggest we test the waters (so to speak) and split those off prior to the 1.6 release into their own distribution. This will allow Lincoln and the GBrowse crew to proceed with development on those bits of bioperl independently; the only thing required would be a compatible bioperl release, preferably in CPAN. 
If needed this can be tested in a release candidate prior to a full release. 1b) If we split off Bio::Graphics and related, we should sort out versioning schemes issues for these subdistributions prior to splitting them off. Should they be completely independent versions or be tied in to core (i.e. be 1.6.x, which would be expected to have a requirement for bioperl 1.6)? 1c) At this point I am considering the other bioperl distributions (db, network, pedigree) as independent subdistributions, similar to what may happen with Bio::Graphics etc. Therefore, these will also have their own independent release schedule. Once 1.6 is released, we can work on getting other bioperl-* released to CPAN in fairly rapid manner so they all correspond to core 1.6, but from that point on they can follow a separate release schedule. 1d) Should we add an option to install subdistributions in the core build script? Bugs and bugzilla: 2a) Identify outstanding bugs which need to be addressed prior to release (I have accepted a couple myself which I've been working on), or should be addressed sometime in the 1.6 release series (i.e. fixed in a point release if there are no API changes). I will be adding a table to the wiki page noted above grouping related bugs with notes. 2b) Add any outstanding issues to bugzilla for tracking. 2c) We need to consider merging the enhancement requests and the project priority list in some meaningful way (either moving everything to the wiki, or organizing the enhancement requests for various Bio::* modules in bugzilla). Test suite: 3a) Though it isn't critical for the release, we should attempt some simple reorganization of the test suite to deal with the distribution split mentioned in (1) above. A significant amount can easily be accomplished by the 1.6 release, but we need to make sure no tests are left out or repeated. 
I have already done some simplification of the SearchIO tests by splitting them up by parser; we can do something similar for SeqIO, AlignIO, etc. Dave Messina has already taken up splitting and consolidating SeqIO tests, so any other volunteers for AlignIO and other relevant bits would be appreciated. Following is a simple list so we have some to work with (again, this may change depending on comments): 3b) Tests specific for parsers/plugins: rename Parent_plugin.t (ex: SeqIO_genbank.t, AlignIO_stockholm.t). I left the plugin as lowercase here, which is the naming convention we are using for the module names. 3c) Similarly, if modules belong into one collective group, they can follow a similar Group_class.t (Tools_Genewise.t). Here the module being tested is uppercase (not a plugin). 3d) How far do we want to take this? Should Location.t be split up by the various Bio::Location objects into Location_*.t (and similarly, Bio::Annotation, Bio::SeqFeature, etc)? Again, not a blocker for 1.6, should be fairly easy to do, but this will increase the number of test files quite significantly. 3e) We have a set of BioPerl-specific test functions that Sendu graciously set up in the bioperl 1.5.2 release. We should actually wrap these into core (Bio::Root::Test or similar) and have them become generally available for any Bio::* modules, either as exported methods in a package or as a full-on class. Similarly, we should add a test file for the BioPerl-specific test functions using Test::More functions only. This can possibly wait until 1.7, but it shouldn't be too hard to implement. 3f) Test coverage is already in place (thanks Mauricio, Nathan, Sendu, and everyone else involved)! Developer vs. Stable releases: 4a) After 1.6, no more alternating of 'developer' and 'stable' releases. The designation is highly misleading, particularly when one considers how stable current subversion code is compared to the 'stable' 1.4 release in CPAN. 
4b) As a consequence of losing 'dev' releases, we will need to carve out a place where developers can test new code, place untested/unsupported code which may still be useful, etc. bioperl-experimental is currently designated as the sandbox for Perl6 development, so I suggest bioperl-dev (bioperl-here-thar-be-dragoons is too long).

4c) Significant changes to modules already present (API changes, for instance) should be made on a branch of the relevant distribution.

Point releases vs minor releases:

5a) Release regular bug fixes (no API changes) as point releases.

5b) Small API changes require minor releases. Related: what will the module split version be (1.7? 2.0?)

5c) Deprecation of modules is decided by the core devs. If the module is widely used (e.g. Bio::Species) we should go through a routine deprecation cycle prior to removing the module from a distribution. Some modules (for instance, DB modules which no longer work due to changes in remote access, such as XEMBL) can be immediately deprecated and removed in the next release.

Again, comments welcome. I will be posting this to the wiki page shortly:

http://www.bioperl.org/wiki/Release_1.6
http://www.bioperl.org/wiki/Talk:Release_1.6

chris

From hartzell at alerce.com Sat Nov 22 19:21:19 2008
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 22 Nov 2008 16:21:19 -0800
Subject: [Bioperl-l] A better fix for Species.pm memory leak, please REVIEW.
Message-ID: <18728.41471.910143.682950@almost.alerce.com>

I dug back into the Species.pm memory leak and my less-than-ideal fix for it.

First, in my defense, my previous fix only causes failures in t/Species.t if network tests are enabled. It passes the generic ones. I *thought* I'd run the tests before I committed :).

The basic cause of the problem is that when species() calls Bio::Tree::Tree->new(-node=> $species_taxon), $species_taxon is just a copy of the $self reference, i.e. the same object.
The tree ends up adding that node onto the classification list and it gets linked into the descendant links, which results in a cycle.

If that lousy English description left you scratching your head, here's a hacked-to-pieces version of R Voss's original test case. You'll need Devel::Cycle (YAY LINCOLN!) to run it. Find yourself a GenBank file (I used gbpri21.seq, from the original bug report), put it into a directory somewhere and do something like

    bug.pl GBDIR /tmp

(yeah, why do a while loop w/ an exit in the middle? Cut-n-paste. Why else...?)

################################################################
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;
use Devel::Cycle;

my $dir = shift @ARGV; # the directory with *.gz files
my $out = shift @ARGV; # the directory to write to...
mkdir $out if not -d $out; # ...which may need to be created
opendir my $dirhandle, $dir or die $!;
for my $file ( readdir $dirhandle ) {
    next if $file !~ /^gb.*/;

    # object that parses genbank files,
    # returns Bio::Seq objects
    my $reader = Bio::SeqIO->new(
        '-format' => 'genbank',
        '-file'   => "${dir}/${file}"
    );
    while ( my $seq = $reader->next_seq ) {
        my $name = $seq->species->binomial;
        find_cycle($seq);
        exit 1;
    }

    # delete the extracted, unfiltered file
    unlink "${dir}/${file}";
}
################################################################

Anyway. Here's a fix. Instead of adding $self to the tree, make a new species node w/ $self's important bits and add that instead. There's a patch below that gets rid of my weaken and Sendu's fix-for-the-fix and then Does The Right Thing (I hope).

It seems to pass all of the Species.t tests.

The take home exam questions are: Are there any other missing important bits? Thoughts?

I'll commit it if someone'll review it.

g.
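
For anyone who wants to see the cycle without a GenBank file, here is a minimal sketch in plain Perl. Toy::Species and Toy::Tree are hypothetical stand-ins, not the real Bio::Species and Bio::Tree::Tree, and attach_self/attach_copy are made-up method names; the only point is that handing $self to the tree leaks, while handing it a fresh copy does not.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our %FREED;    # label => 1 once that object's DESTROY has run

{
    package Toy::Tree;    # hypothetical stand-in for a tree class
    # the tree keeps a strong reference to whatever node it is given
    sub new {
        my ($class, %args) = @_;
        return bless { root => $args{node} }, $class;
    }
}

{
    package Toy::Species;    # hypothetical stand-in for a species class
    sub new {
        my ($class, %args) = @_;
        return bless { label => $args{label} }, $class;
    }
    # like the old code: the tree's root is $self, so $self -> tree -> $self
    sub attach_self {
        my $self = shift;
        $self->{tree} = Toy::Tree->new(node => $self);
        return $self;
    }
    # like the patch: the tree's root is a new object built from $self's data
    sub attach_copy {
        my $self = shift;
        my $copy = Toy::Species->new(label => "$self->{label}-copy");
        $self->{tree} = Toy::Tree->new(node => $copy);
        return $self;
    }
    sub DESTROY { $main::FREED{ $_[0]{label} } = 1 }
}

# build an object, attach a tree, drop the last outside reference,
# then report whether the object was actually freed
sub freed_after_scope {
    my ($label, $method) = @_;
    {
        my $species = Toy::Species->new(label => $label);
        $species->$method;
    }
    return $FREED{$label} ? 1 : 0;
}

printf "self as root: %s\n",
    freed_after_scope('cyclic', 'attach_self') ? 'freed' : 'leaked';
printf "copy as root: %s\n",
    freed_after_scope('copied', 'attach_copy') ? 'freed' : 'leaked';
```

The first line should report "leaked" and the second "freed". The leaked object is only reclaimed at interpreter exit, which is the same behaviour Devel::Cycle points at in the real test case.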
Index: Bio/Species.pm
===================================================================
--- Bio/Species.pm (revision 15008)
+++ Bio/Species.pm (working copy)
@@ -261,8 +261,13 @@
         # work it out from our nodes
         my $species_taxon = $self->{tree}->find_node(-rank => 'species');
         unless ($species_taxon) {
-            # just assume we are rank species
-            $species_taxon = $self;
+            # whip up a new species object so that we don't
+            # end up with a cycle in the tree.
+            # initialize it with self's important bits.
+            # NOTE TO ALL: any missing important bits?
+            $species_taxon =
+                Bio::Species->new(-classification =>
+                                  [$self->classification]);
         }
 
         $species = $species_taxon->scientific_name;
@@ -278,7 +283,6 @@
         $self->{tree} = Bio::Tree::Tree->new(-node => $species_taxon);
         delete $self->{tree}->{_root_cleanup_methods};
         $root = $self->{tree}->get_root_node;
-        weaken($self->{tree}->{'_rootnode'}) unless isweak($self->{tree}->{'_rootnode'});
     }
 
     my @spflds = split(' ', $species);
@@ -395,15 +399,6 @@
     if ($ss_taxon) {
         if ($sub) {
             $ss_taxon->scientific_name($sub);
-
-            # *** weakening ref to our root node in species() to solve a
-            # memory leak means that we have a subspecies taxon to set
-            # during the first call to species(), but it has vanished by
-            # the time a user subsequently calls sub_species() to get the
-            # value. So we 'cheat' and just store the subspecies name in
-            # our self hash, instead of the tree. Is this a problem for
-            # a Species object? Can't decide --sendu
-            $self->{'_sub_species'} = $sub;
         }
         return $ss_taxon->scientific_name;
     }

From cjfields at illinois.edu Sat Nov 22 23:22:25 2008
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 22 Nov 2008 22:22:25 -0600
Subject: [Bioperl-l] A better fix for Species.pm memory leak, please REVIEW.
In-Reply-To: <18728.41471.910143.682950@almost.alerce.com> References: <18728.41471.910143.682950@almost.alerce.com> Message-ID: On Nov 22, 2008, at 6:21 PM, George Hartzell wrote: > I dug back into the Species.pm memory leak and my less-than-ideal fix > for it. > > First, in my defense, my previous fix only causes failures in > t/Species.t if network tests are enabled. It passes the generic > ones. I *thought* I'd run the tests before I committed :). > > The basic cause of the problem is that when species() calls > Bio::Tree::Tree->new(-node=> $species_taxon), $species_taxon is a copy > of $self. The tree ends up adding that node onto the classification > list and it gets linked it into the descendent links, which results in > a cycle. ... > If that lousy english description left you scratching your head, > here's a hacked-to-pieces version of R Voss's original test case. > You'll need Devel::Cycle (YAY LINCOLN!) to run it. Find yourself a > genbank file (I used gbpri21.seq, from the original bug report), put > it into a directory somewhere and do something like > > ... > Anyway. Here's a fix. Instead of adding $self to the tree, make a > new species node w/ $self's important bits and add that instead. > There's a patch below that gets rid of my weaken and Sendu's > fix-for-the-fix and then Does The Right Thing (I hope). > > It seems to pass all of the Species.t tests. > > The take home exam questions are: Are there any other missing > important bits? The only thing I can suggest is to make sure it deals with the original problem (memory leaks, per bug 2594): http://bugzilla.open-bio.org/show_bug.cgi?id=2594 > Thoughts? I'm not quite sure why we are attaching a Bio::Tree::Tree to the Bio::Species, so I need to go back and review the changes from the 1.5.2 release (I'm sure there is a good reason, just can't recall and haven't had time to look). 
However, we aren't planning on retaining Bio::Species too much longer anyway (we'll be moving towards a simpler Bio::Taxon-based system after the 1.6 release and will be deprecating Bio::Species eventually). Any fixes should probably take this into consideration.

> I'll commit it if someone'll review it.
>
> g.

I vote to go ahead and commit this (the solution seems sound), but Sendu is the one who would be best to vet as he refactored this. If there isn't word by the middle of the week you can go ahead and make the commit.

Thanks George!

chris

From hartzell at alerce.com Sat Nov 22 23:33:59 2008
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 22 Nov 2008 20:33:59 -0800
Subject: [Bioperl-l] A better fix for Species.pm memory leak, please REVIEW.
In-Reply-To: References: <18728.41471.910143.682950@almost.alerce.com>
Message-ID: <18728.56631.850998.763433@almost.alerce.com>

Chris Fields writes:
> [...]
> The only thing I can suggest is to make sure it deals with the
> original problem (memory leaks, per bug 2594):
>
> http://bugzilla.open-bio.org/show_bug.cgi?id=2594
> [...]

Yes, this change also eliminates the original memory leak.

g.

From patola at gmail.com Sun Nov 23 00:19:24 2008
From: patola at gmail.com (Cláudio Sampaio)
Date: Sun, 23 Nov 2008 03:19:24 -0200
Subject: [Bioperl-l] How to get the entropy of each nucleotide of an aligment?
Message-ID:

Hi all,

I am still a newbie to bioperl, and while searching for a way to calculate the entropy score of an alignment I came to Matrix scoring - http://doc.bioperl.org/releases/bioperl-1.4/Bio/Matrix/Scoring.html - but couldn't figure out how it relates to the Bio::Align class and objects. Can someone more knowledgeable give me a clue on how to start?

Best regards,

Cláudio "Patola"

From bix at sendu.me.uk Sun Nov 23 07:19:57 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Sun, 23 Nov 2008 12:19:57 +0000
Subject: [Bioperl-l] A better fix for Species.pm memory leak, please REVIEW.
In-Reply-To: References: <18728.41471.910143.682950@almost.alerce.com> Message-ID: <49294A6D.4030107@sendu.me.uk> Chris Fields wrote: > On Nov 22, 2008, at 6:21 PM, George Hartzell wrote: > I'm not quite sure why we are attaching a Bio::Tree::Tree to the > Bio::Species, so I need to go back and review the changes from the 1.5.2 > release (I'm sure there is a good reason, just can't recall and haven't > had time to look). The point of the refactor was that a Species object no longer be a dumb (list of) scalar, but actually 'know' what its nodes are, so you can do things like compare Species and find the LCA and such. You need a Bio::Tree::Tree for that. > However, we aren't planning on retaining Bio::Species too much longer > anyway (we'll be moving towards a simpler Bio::Taxon-based system after > the 1.6 release and will be deprecating Bio::Species eventually). Any > fixes should probably take this into consideration. > >> I'll commit it if someone'll review it. > > I vote to go ahead and commit this (the solution seems sound), but Sendu > is the one who would be best to vet as he refactored this. If there > isn't word by the middle of the week you can go ahead and make the > commit. I doubt I'll have time to really look at it properly in that time-frame... my initial reaction is that it is a bit bizarre (we are a species object intended to hold information about the species that we represent, and in order to do that... we create and store another species object in ourselves?!)... but if it works, it works? Is there a test in Species.t that does a Species object comparison and finds the LCA? Does it trigger the problem, and does it still get the correct LCA after this patch? But anyway, thanks George, go ahead and commit if it all seems functional. If you could summarise your solution in the bug report(s?) as well, that would be great. From maj at fortinbras.us Sun Nov 23 09:30:24 2008 From: maj at fortinbras.us (Mark A. 
Jensen)
Date: Sun, 23 Nov 2008 09:30:24 -0500
Subject: [Bioperl-l] How to get the entropy of each nucleotide of analigment?
Message-ID:

Cláudio -

If you have a Bio::SimpleAlign object prepared (maybe from

    $alnio = Bio::AlignIO->new(-format=>'fasta', -file=>'your.fas');
    $aln = $alnio->next_aln;

), try the following function, as

    $entropies = entropy_by_column( $aln )

(which also gives an example of how you (or I, anyway) might manipulate alignments on a per-column basis)

cheers, Mark

=head2 entropy_by_column

 Title   : entropy_by_column
 Usage   : entropy_by_column( $aln )
 Function: returns the Shannon entropy for each column in an alignment
 Example :
 Returns : hashref of the form { $column_number => $entropy, ... }
 Args    : a Bio::SimpleAlign object

=cut

sub entropy_by_column {
    my ($aln) = @_;
    my %ent;
    foreach my $col (1..$aln->length) {
        my %res;    # residue => count for this column
        foreach my $seq ($aln->each_seq) {
            my $loc = $seq->location_from_column($col);
            # skip gap positions; also guard against an undefined location
            next if !defined $loc or $loc->location_type eq 'IN-BETWEEN';
            $res{$seq->subseq($loc)}++;
        }
        $ent{$col} = entropy(values %res);
    }
    return \%ent;    # a hashref, as documented above
}

=head2 entropy

 Title   : entropy
 Usage   : entropy( @numbers )
 Function: returns the Shannon entropy (natural log) of an array of
           numbers, each number represents the count of a unique
           category in a collection of items
 Example : entropy ( 1, 1, 1 ) # returns 1.09861228866811 = log(3)
 Returns : Shannon entropy or undef if entropy undefined;
 Args    : an array

=cut

sub entropy {
    my @a = grep { $_ } @_;                # drop zero (and undefined) counts
    return undef unless @a;                # nothing to count
    return undef if grep { $_ < 0 } @a;    # negative counts make no sense
    my $t = 0;
    $t += $_ for @a;                       # total count
    my $h = 0;
    foreach my $n (@a) {
        my $p = $n / $t;                   # frequency of this category
        $h -= $p * log($p);
    }
    return $h;
}

> ----- Original Message -----
> From: "Cláudio Sampaio"
> To:
> Sent: Sunday, November 23, 2008 12:19 AM
> Subject: [Bioperl-l] How to get the entropy of each nucleotide of
> analigment?
>
>
> Hi all,
>
> I am still a newbie to bioperl, and while searching for a way to
> calculate
> the entropy score of an alignment I came to Matrix scoring -
> http://doc.bioperl.org/releases/bioperl-1.4/Bio/Matrix/Scoring.html
> - but
> couldn't figure out how it relate to the Bio::Align class and
> objects. Can
> someone more knowledgeable give me a clue on how to start?
>
> Best regards,
>
> Cláudio "Patola"
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

From spiros at lokku.com Sun Nov 23 10:39:14 2008
From: spiros at lokku.com (Spiros Denaxas)
Date: Sun, 23 Nov 2008 15:39:14 +0000
Subject: [Bioperl-l] TODO task: Ontology page
Message-ID:

Hello,

any objection if I take on this and add text on http://www.bioperl.org/wiki/Ontology about ontologies?

Spiros

From cjfields at illinois.edu Sun Nov 23 10:56:29 2008
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 23 Nov 2008 09:56:29 -0600
Subject: [Bioperl-l] TODO task: Ontology page
In-Reply-To: References:
Message-ID: <412ADE96-4EEF-4A2A-B659-B25E91747E10@illinois.edu>

Go ahead, that would be great!

chris

On Nov 23, 2008, at 9:39 AM, Spiros Denaxas wrote:

> Hello,
>
> any objection if I take on this and add text on
> http://www.bioperl.org/wiki/Ontology
> about ontologies?
>
> Spiros

From cjfields at illinois.edu Sun Nov 23 14:13:00 2008
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 23 Nov 2008 13:13:00 -0600
Subject: [Bioperl-l] A better fix for Species.pm memory leak, please REVIEW.
In-Reply-To: <49294A6D.4030107@sendu.me.uk> References: <18728.41471.910143.682950@almost.alerce.com> <49294A6D.4030107@sendu.me.uk> Message-ID: <2B801E79-43A2-4213-A075-47AE38C915ED@illinois.edu> On Nov 23, 2008, at 6:19 AM, Sendu Bala wrote: > Chris Fields wrote: >> On Nov 22, 2008, at 6:21 PM, George Hartzell wrote: >> I'm not quite sure why we are attaching a Bio::Tree::Tree to the >> Bio::Species, so I need to go back and review the changes from the >> 1.5.2 release (I'm sure there is a good reason, just can't recall >> and haven't had time to look). > > The point of the refactor was that a Species object no longer be a > dumb (list of) scalar, but actually 'know' what its nodes are, so > you can do things like compare Species and find the LCA and such. > You need a Bio::Tree::Tree for that. > ... >> I vote to go ahead and commit this (the solution seems sound), but >> Sendu is the one who would be best to vet as he refactored this. >> If there isn't word by the middle of the week you can go ahead and >> make the commit. > > I doubt I'll have time to really look at it properly in that time- > frame... my initial reaction is that it is a bit bizarre (we are a > species object intended to hold information about the species that > we represent, and in order to do that... we create and store another > species object in ourselves?!)... but if it works, it works? Is > there a test in Species.t that does a Species object comparison and > finds the LCA? Does it trigger the problem, and does it still get > the correct LCA after this patch? ... > But anyway, thanks George, go ahead and commit if it all seems > functional. > > If you could summarise your solution in the bug report(s?) as well, > that would be great. Agreed, though I think we need to ensure that memory leaks don't permeate into Bio::Taxon when we make the switch (it shouldn't, I think). 
BTW, checking Blame/Annotate it looks like some of this stems from r10974:

http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=10974

chris

From cjfields at illinois.edu Sun Nov 23 19:31:17 2008
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 23 Nov 2008 18:31:17 -0600
Subject: [Bioperl-l] LocatableSeq::subseq(): bug or not?
Message-ID: <53450E04-8729-4066-84C4-9C2FCC7C7275@illinois.edu>

Currently, we have Bio::LocatableSeq use the default (Bio::PrimarySeq) implementation of subseq(). However the returned data apparently clashes with the actual PrimarySeq documentation:

 Function: returns the subseq from start to end, where the first base
           is 1 and the number is inclusive, ie 1-2 are the first two
           bases of the sequence

So, should the following actually return the indicated range of bases (no gaps)? Or should we clarify the above documentation to indicate subseq() returns the first x positions/columns (anything) instead of 'bases' (no gaps)?

    my $seq = Bio::LocatableSeq->new(
        -seq      => '--atg---gta--',
        -strand   => 1,
        -start    => 1,
        -end      => 6,
        -alphabet => 'dna'
    );

    # comments indicate current returned val
    $seq->subseq(1,3);  # returns '--a'
    $seq->subseq(3,6);  # returns 'atg-'
    $seq->subseq(1,10); # returns '--atg---gt'

chris

From hartzell at alerce.com Sun Nov 23 21:31:09 2008
From: hartzell at alerce.com (George Hartzell)
Date: Sun, 23 Nov 2008 18:31:09 -0800
Subject: [Bioperl-l] A better fix for Species.pm memory leak, please REVIEW.
In-Reply-To: <2B801E79-43A2-4213-A075-47AE38C915ED@illinois.edu> References: <18728.41471.910143.682950@almost.alerce.com> <49294A6D.4030107@sendu.me.uk> <2B801E79-43A2-4213-A075-47AE38C915ED@illinois.edu> Message-ID: <18730.4589.804276.536603@almost.alerce.com> Chris Fields writes: > On Nov 23, 2008, at 6:19 AM, Sendu Bala wrote: > > > Chris Fields wrote: > >> On Nov 22, 2008, at 6:21 PM, George Hartzell wrote: > >> I'm not quite sure why we are attaching a Bio::Tree::Tree to the > >> Bio::Species, so I need to go back and review the changes from the > >> 1.5.2 release (I'm sure there is a good reason, just can't recall > >> and haven't had time to look). > > > > The point of the refactor was that a Species object no longer be a > > dumb (list of) scalar, but actually 'know' what its nodes are, so > > you can do things like compare Species and find the LCA and such. > > You need a Bio::Tree::Tree for that. > > ... > >> I vote to go ahead and commit this (the solution seems sound), but > >> Sendu is the one who would be best to vet as he refactored this. > >> If there isn't word by the middle of the week you can go ahead and > >> make the commit. > > > > I doubt I'll have time to really look at it properly in that time- > > frame... my initial reaction is that it is a bit bizarre (we are a > > species object intended to hold information about the species that > > we represent, and in order to do that... we create and store another > > species object in ourselves?!)... but if it works, it works? Is > > there a test in Species.t that does a Species object comparison and > > finds the LCA? Does it trigger the problem, and does it still get > > the correct LCA after this patch? > ... > > But anyway, thanks George, go ahead and commit if it all seems > > functional. > > > > If you could summarise your solution in the bug report(s?) as well, > > that would be great. 
>
>
> Agreed, though I think we need to ensure that memory leaks don't
> permeate into Bio::Taxon when we make the switch (it shouldn't, I
> think). BTW, checking Blame/Annotate it looks like some of this stems
> from r10974:
>
> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=10974
>

The change that I proposed wouldn't cause anything to permeate into Bio::Taxon.

It's really a design problem with Bio::Species. A Species "has-a" reference to a tree *and* when the Species creates that tree it asks the tree constructor to initialize itself using the Species itself, which results in the tree including a reference to that Species object. Et voila, circular reference.

It's not helped by the fact that Bio::Species explicitly disconnects all of the tree cleanup code that was supposed to handle those circular references. On the other hand there are a bunch of comments that suggest that the cleanup code never worked right.

The other places that use the tree seem to only care about it right-then-and-there and they can get away with weaken-ing the root and letting it just fall apart. Bio::Species::sub_species seems to be the only code that depends on a tree built by some other call (Bio::Species::species) and it's unhappy when its assumptions don't work out.

If you think of the species object as having a tree that describes its place in the world then it seems semantically valid to initialize the tree from a copy of the Species object's self, thereby avoiding the mess.

Alternate fixes would be Sendu's caching of the info that sub_species needs or fixing sub_species to build the tree itself.

I'm not in front of that machine at the moment, but I think that there are a couple of other places in the src tree that use the -node or -root args to Bio::Tree::Tree::new and should probably be checked for leaks.

I don't really feel hot or bothered about this fix vs. the combination of weaken and stashing the subspecies.
It depends on what the author intended for Species and its tree, neither are particularly sexy and if Species is really on its way out the door it may not matter..... g. From maj at fortinbras.us Sun Nov 23 21:40:06 2008 From: maj at fortinbras.us (Mark A. Jensen) Date: Sun, 23 Nov 2008 21:40:06 -0500 Subject: [Bioperl-l] LocatableSeq::subseq(): bug or not? In-Reply-To: <53450E04-8729-4066-84C4-9C2FCC7C7275@illinois.edu> References: <53450E04-8729-4066-84C4-9C2FCC7C7275@illinois.edu> Message-ID: <819C37DC45784AF694EAF707F4092722@NewLife> Since subseq() returns a string, I would (and do) expect a 1-origin substring of the actual character data. It would be nice to continue to have a thoughtless data grab without using substr directly. I find I want to deal with both gapped sequence and the gap-stripped sequence at various points in an app, and have the goodies that Locatable and Align provide as well. It might be convenient to have another method for dealing specifically with the gap-stripped sequence, say subseq_nogap() or subseq_residues(), so that the expected (and regressed) subseq() behavior is preserved. [I prefer 'residues' to 'bases' to highlight the generality of the representation.] MAJ ----- Original Message ----- From: "Chris Fields" To: "BioPerl List" Sent: Sunday, November 23, 2008 7:31 PM Subject: [Bioperl-l] LocatableSeq::subseq(): bug or not? > Currently, we have Bio::LocatableSeq use the default > (Bio::PrimarySeq) implementation of subseq(). However the returned > data apparently clashes with the actual PrimarySeq documentation: > > Function: returns the subseq from start to end, where the first > base > is 1 and the number is inclusive, ie 1-2 are the first > two > bases of the sequence > > So, should the following actually return the indicated range of > bases (no gaps)? Or should we clarify the above documentation to > indicate subseq() returns the first x positions/columns (anything) > instead of 'bases' (no gaps)? 
>
> my $seq = Bio::LocatableSeq->new(
>     -seq => '--atg---gta--',
>     -strand => 1,
>     -start => 1,
>     -end => 6,
>     -alphabet => 'dna'
> );
>
> # comments indicate current returned val
> $seq->subseq(1,3); # returns '--a'
> $seq->subseq(3,6); # returns 'atg-'
> $seq->subseq(1,10); # returns '--atg---gt'
>
> chris

From maj at fortinbras.us Mon Nov 24 09:04:38 2008
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 24 Nov 2008 09:04:38 -0500
Subject: [Bioperl-l] LocatableSeq::subseq(): bug or not?
In-Reply-To: <53450E04-8729-4066-84C4-9C2FCC7C7275@illinois.edu>
References: <53450E04-8729-4066-84C4-9C2FCC7C7275@illinois.edu>
Message-ID: <6DFFF4639BB240AA9D0BC190949BCE96@NewLife>

Bug #2682 contains a patch that modifies subseq() to strip gaps if desired. It also tries to fix the $replace weirdness.

perldb transcript:

  DB<11> $seq = new Bio::PrimarySeq(-seq=>'--atg---gta--')
  DB<12> x $seq->subseq(1,3)
0  '--a'
  DB<13> x $seq->subseq(1,3,NOGAP)
0  'a'
  DB<15> x $seq->seq
0  '--atg---gta--'
  DB<16> x $seq->subseq(-START=>1, -END=>3, -REPLACE_WITH=>'tga')
0  '--a'
  DB<18> x $seq->seq
0  'tgatg---gta--'
## silly gap-stripper:
  DB<21> x $seq->subseq(-START=>1, -END=>$seq->length, -REPLACE_WITH=>$seq->subseq(-START=>1, -END=>$seq->length, -NOGAP=>1))
0  'tgatg---gta--'
  DB<22> x $seq->seq
0  'tgatggta'

----- Original Message -----
From: "Chris Fields"
To: "BioPerl List"
Sent: Sunday, November 23, 2008 7:31 PM
Subject: [Bioperl-l] LocatableSeq::subseq(): bug or not?

> Currently, we have Bio::LocatableSeq use the default
> (Bio::PrimarySeq) implementation of subseq().
However the returned > data apparently clashes with the actual PrimarySeq documentation: > > Function: returns the subseq from start to end, where the first > base > is 1 and the number is inclusive, ie 1-2 are the first > two > bases of the sequence > > So, should the following actually return the indicated range of > bases (no gaps)? Or should we clarify the above documentation to > indicate subseq() returns the first x positions/columns (anything) > instead of 'bases' (no gaps)? > > my $seq = Bio::LocatableSeq->new( > -seq => '--atg---gta--', > -strand => 1, > -start => 1, > -end => 6, > -alphabet => 'dna' > ); > > # comments indicate current returned val > $seq->subseq(1,3); # returns '--a' > $seq->subseq(3,6); # returns 'atg-' > $seq->subseq(1,10); # returns '--atg---gt' > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Mon Nov 24 16:49:21 2008 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 24 Nov 2008 15:49:21 -0600 Subject: [Bioperl-l] A better fix for Species.pm memory leak, please REVIEW. In-Reply-To: <18730.4589.804276.536603@almost.alerce.com> References: <18728.41471.910143.682950@almost.alerce.com> <49294A6D.4030107@sendu.me.uk> <2B801E79-43A2-4213-A075-47AE38C915ED@illinois.edu> <18730.4589.804276.536603@almost.alerce.com> Message-ID: <18D474CE-73AC-4888-8909-2E66C73FEB6A@illinois.edu> On Nov 23, 2008, at 8:31 PM, George Hartzell wrote: > Chris Fields writes: >> On Nov 23, 2008, at 6:19 AM, Sendu Bala wrote: >> >>> But anyway, thanks George, go ahead and commit if it all seems >>> functional. >>> >>> If you could summarise your solution in the bug report(s?) as well, >>> that would be great. >> >> >> Agreed, though I think we need to ensure that memory leaks don't >> permeate into Bio::Taxon when we make the switch (it shouldn't, I >> think). 
BTW, checking Blame/Annotate it looks like some of this >> stems >> from r10974: >> >> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=10974 >> > > The change that I proposed wouldn't cause anything to permeate into > Bio::Taxon. > > It's really a design problem with Bio::Species. A Species "has-a" > reference to a tree *and* when the Species creates that tree it asks > the tree constructor to initialize itself using the Species itself > which results in the tree including a reference to that Species > object. Et voila, circular reference. My thoughts as well. > It's not helped by the fact that Bio::Species explicitly disconnects > all of the tree cleanup code that was supposed to handle those > circular references. On the other hand there are a bunch of comments > that suggest that the cleanup code never worked right. The cleanup code relies on the object being destroyed (either via garbage collection or explicitly). Since there is a circular dependency and DESTROY isn't explicitly called, the root cleanup method won't work. > ... > I'm not in front of that machine at the moment, but I think that there > are a couple of other places in the src tree that use the -node or > -root arg's to Bio::Tree::Tree::new and should probably be checked for > leaks. A possible solution would be to make Bio::Species a shell or proxy object that decorates a Bio::Taxon instance (and a Bio::Tree::Tree where needed) and just delegates the appropriate methods to them. It could still inherit from Bio::Taxon; it would just override the Bio::Taxon methods for delegation. The Bio::Tree::Tree can refer to the Bio::Taxon object (and vice versa), but neither refer back to the Species object directly. In this case, when the last ref to the Bio::Species is released it should be garbage collected and trigger DESTROY (which is set up in Bio::Root::Root). 
This in turn triggers any cleanup methods, such as calling DESTROY on the Tree and Taxon, and gets rid of the memory leaks (and wouldn't require weaken/isweak).

I'm perfectly willing to test that out if no one else is (I am working on a few other bugs at the moment but I can probably get to it in the next day or two). Not to tread on any toes, but I would like to get this one taken care of prior to 1.6 and these are fairly nasty memory issues. If there is any consolation it shouldn't be too hard to re-rig the methods to delegate appropriately.

> I don't really feel hot or bothered about this fix vs. the combination
> of weaken and stashing the subspecies. It depends on what the author
> intended for Species and its tree, neither are particularly sexy and
> if Species is really on its way out the door it may not matter.....
>
> g.

I don't think it will matter post-1.6, but we'll have to support Bio::Species for the 1.6 release series. So it will have to be maintained for the short term, until 1.7 comes out.
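
A rough sketch of the proxy idea, in plain Perl and assuming nothing about the real classes: Toy::Taxon, Toy::Tree and Toy::SpeciesProxy are hypothetical stand-ins, and cleanup() is made up for illustration rather than being Bio::Root::Root's actual cleanup mechanism. The tree and taxon refer to each other, nothing refers back to the proxy, and the proxy's DESTROY breaks the inner cycle when the user lets go of it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our %FREED;    # counts DESTROY calls per toy class

{
    package Toy::Taxon;    # hypothetical stand-in for a taxon node
    sub new {
        my ($class, %args) = @_;
        return bless { name => $args{name} }, $class;
    }
    sub scientific_name { $_[0]{name} }
    sub DESTROY { $main::FREED{taxon}++ }
}

{
    package Toy::Tree;    # hypothetical stand-in for a tree
    # tree and taxon point at each other: a deliberate cycle
    sub new {
        my ($class, %args) = @_;
        my $self = bless { root => $args{node} }, $class;
        $args{node}{tree} = $self;
        return $self;
    }
    sub get_root_node { $_[0]{root} }
    # break the tree <-> taxon cycle by hand
    sub cleanup {
        my $self = shift;
        if (my $root = delete $self->{root}) {
            delete $root->{tree};
        }
    }
    sub DESTROY { $main::FREED{tree}++ }
}

{
    package Toy::SpeciesProxy;    # hypothetical shell object
    # the proxy owns both objects; neither of them refers back to it
    sub new {
        my ($class, %args) = @_;
        my $taxon = Toy::Taxon->new(name => $args{name});
        my $tree  = Toy::Tree->new(node => $taxon);
        return bless { taxon => $taxon, tree => $tree }, $class;
    }
    # delegate instead of storing the data ourselves
    sub scientific_name { $_[0]{taxon}->scientific_name }
    sub root_name       { $_[0]{tree}->get_root_node->scientific_name }
    # nothing points at the proxy, so this fires when the user lets go
    sub DESTROY {
        my $self = shift;
        $self->{tree}->cleanup if $self->{tree};
    }
}

# create a proxy, use it, drop it, and report what was freed
sub demo {
    %FREED = ();
    my $name;
    {
        my $species = Toy::SpeciesProxy->new(name => 'Homo sapiens');
        $name = $species->root_name;
    }
    return ($name, $FREED{taxon} || 0, $FREED{tree} || 0);
}

my ($name, $taxa_freed, $trees_freed) = demo();
print "$name: taxon freed $taxa_freed, tree freed $trees_freed\n";
```

Both counters should come back as 1 once the proxy goes out of scope, without any weaken calls. Whether the real Bio::Species methods can all be delegated this cleanly is a separate question.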
chris From clements at nescent.org Mon Nov 24 17:35:56 2008 From: clements at nescent.org (Dave Clements) Date: Mon, 24 Nov 2008 14:35:56 -0800 Subject: [Bioperl-l] GMOD Meeting, January 15-16, 2009 In-Reply-To: <71ee57c70811241433k62d416abpbdfafbbca61a4848@mail.gmail.com> References: <71ee57c70811241353rb26d61fi1c244964dd418aa2@mail.gmail.com> <71ee57c70811241424w520bcb77pc1b5d81ea299814c@mail.gmail.com> <71ee57c70811241425pe33ab06p675c2cc7f2273de1@mail.gmail.com> <71ee57c70811241427i5b85fd7bx9131488e09bd7c97@mail.gmail.com> <71ee57c70811241427l6d1062fdkab05258f0d1369d2@mail.gmail.com> <71ee57c70811241429m7913b71bj69e66cee837ef2dc@mail.gmail.com> <71ee57c70811241430x4f11f5b1n6d133f4857791301@mail.gmail.com> <71ee57c70811241431t1601d91yf86710bea514c307@mail.gmail.com> <71ee57c70811241432l3a2c28c0t18159a70cfe62628@mail.gmail.com> <71ee57c70811241433k62d416abpbdfafbbca61a4848@mail.gmail.com> Message-ID: Hello all, The next GMOD Community Meeting ( http://gmod.org/wiki/January_2009_GMOD_Meeting) will be held January 15-16, 2009, in San Diego, California, immediately following the Plant and Animal Genome (PAG 2009) conference. If you are a GMOD user and/or developer, or just want to learn more about GMOD then you are encouraged to attend. See the July 2008 GMOD Meeting page (http://gmod.org/wiki/July_2008_GMOD_Meeting) for an idea of what goes on at a GMOD meeting. You can register for the meeting at http://gmod.org/wiki/January_2009_GMOD_Meeting_Registration. Thanks to the generous support of Doreen Ware and USDA-ARS, registration is *free*. Space is limited, so please register early (and it helps us plan). Details on lodging and other logistics will be forthcoming as the meeting gets closer. If you have topics that you want covered at the meeting please add them to the meeting page. Please let the Help Desk know if you have any questions. 
Thanks, Dave C GMOD Help Desk Useful URLs: http://gmod.org/wiki/January_2009_GMOD_Meeting http://gmod.org/wiki/January_2009_GMOD_Meeting_Registration http://gmod.org/wiki/PAG_2009 - GMOD @ PAG 2009 http://www.intl-pag.org/ - Plant and Animal Genome Meeting http://gmod.org/wiki/July_2008_GMOD_Meeting - notes from prior meeting http://gmod.org/wiki/2008_GMOD_Community_Survey#GMOD_Meetings From hartzell at alerce.com Mon Nov 24 23:59:00 2008 From: hartzell at alerce.com (George Hartzell) Date: Mon, 24 Nov 2008 20:59:00 -0800 Subject: [Bioperl-l] A better fix for Species.pm memory leak, please REVIEW. In-Reply-To: <18D474CE-73AC-4888-8909-2E66C73FEB6A@illinois.edu> References: <18728.41471.910143.682950@almost.alerce.com> <49294A6D.4030107@sendu.me.uk> <2B801E79-43A2-4213-A075-47AE38C915ED@illinois.edu> <18730.4589.804276.536603@almost.alerce.com> <18D474CE-73AC-4888-8909-2E66C73FEB6A@illinois.edu> Message-ID: <18731.34324.610535.907621@almost.alerce.com> Chris Fields writes: > > On Nov 23, 2008, at 8:31 PM, George Hartzell wrote: > > > Chris Fields writes: > >> On Nov 23, 2008, at 6:19 AM, Sendu Bala wrote: > >> > >>> But anyway, thanks George, go ahead and commit if it all seems > >>> functional. > >>> > >>> If you could summarise your solution in the bug report(s?) as well, > >>> that would be great. > >> > >> > >> Agreed, though I think we need to ensure that memory leaks don't > >> permeate into Bio::Taxon when we make the switch (it shouldn't, I > >> think). BTW, checking Blame/Annotate it looks like some of this > >> stems > >> from r10974: > >> > >> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=10974 > >> > > > > The change that I proposed wouldn't cause anything to permeate into > > Bio::Taxon. > > > > It's really a design problem with Bio::Species. 
A Species "has-a" > > reference to a tree *and* when the Species creates that tree it asks > > the tree constructor to initialize itself using the Species itself > > which results in the tree including a reference to that Species > > object. Et voila, circular reference. > > My thoughts as well. > > > It's not helped by the fact that Bio::Species explicitly disconnects > > all of the tree cleanup code that was supposed to handle those > > circular references. On the other hand there are a bunch of comments > > that suggest that the cleanup code never worked right. > > The cleanup code relies on the object being destroyed (either via > garbage collection or explicitly). Since there is a circular > dependency and DESTROY isn't explicitly called, the root cleanup > method won't work. > > > ... > > I'm not in front of that machine at the moment, but I think that there > > are a couple of other places in the src tree that use the -node or > > -root arg's to Bio::Tree::Tree::new and should probably be checked for > > leaks. > > A possible solution would be to make Bio::Species a shell or proxy > object that decorates a Bio::Taxon instance (and a Bio::Tree::Tree > where needed) and just delegates the appropriate methods to them. It > could still inherit from Bio::Taxon; it would just override the > Bio::Taxon methods for delegation. The Bio::Tree::Tree can refer to > the Bio::Taxon object (and vice versa), but neither refer back to the > Species object directly. > > In this case, when the last ref to the Bio::Species is released it > should be garbage collected and trigger DESTROY (which is set up in > Bio::Root::Root). This in turn triggers any cleanup methods, such as > calling DESTROY on the Tree and Taxon and get rid of memory leaks (and > wouldn't require weaken/is_weak). > [...] How will this interact with the places that Bio::Species unhooks the various cleanup mechanisms that Tree sets up. 
There are comments that talk about the sub ref screwing with someone's use of Storable. It seems like if the new Species has cleanup code registered then there's a problem with whom-ever had a problem with it before. > I'm perfectly willing to test that out if no one else is (I am working > on a few other bugs at the moment but I can probably get to it in the > next day or two). Not to tread on any toes, but I would like to get > this one taken care of prior to 1.6 and these are fairly nasty memory > issues. If there is any consolation it shouldn't be too hard to re- > rig the methods to delegate appropriately. > > > I don't really feel hot or bothered about this fix vs. the combination > > of weaken and stashing the subspecies. It depends on what the author > > intended for Species and its tree, neither are particularly sexy and > > if Species is really on its way out the door it may not matter..... > > > > g. > > I don't think it will matter post-1.6, but we'll have to support > Bio::Species for the 1.6 release series. So it will have to be > maintained for the short term, until 1.7 comes out. I can't poke at this stuff much until the weekend. If you can kick it, go for it! g. From cjfields at illinois.edu Tue Nov 25 00:57:31 2008 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 24 Nov 2008 23:57:31 -0600 Subject: [Bioperl-l] A better fix for Species.pm memory leak, please REVIEW. In-Reply-To: <18731.34324.610535.907621@almost.alerce.com> References: <18728.41471.910143.682950@almost.alerce.com> <49294A6D.4030107@sendu.me.uk> <2B801E79-43A2-4213-A075-47AE38C915ED@illinois.edu> <18730.4589.804276.536603@almost.alerce.com> <18D474CE-73AC-4888-8909-2E66C73FEB6A@illinois.edu> <18731.34324.610535.907621@almost.alerce.com> Message-ID: On Nov 24, 2008, at 10:59 PM, George Hartzell wrote: >> ... >> >> In this case, when the last ref to the Bio::Species is released it >> should be garbage collected and trigger DESTROY (which is set up in >> Bio::Root::Root). 
This in turn triggers any cleanup methods, such as >> calling DESTROY on the Tree and Taxon and get rid of memory leaks >> (and >> wouldn't require weaken/is_weak). >> [...] > > How will this interact with the places that Bio::Species unhooks the > various cleanup mechanisms that Tree sets up. There are comments that > talk about the sub ref screwing with someone's use of Storable. It > seems like if the new Species has cleanup code registered then there's > a problem with whom-ever had a problem with it before. http://bugzilla.open-bio.org/show_bug.cgi?id=2149#c1 The deleted root cleanup methods are for the internal Bio::Tree::Tree. Apparently code refs don't persist using Storable or BioSQL (though I thought they could using B::Deparse?). Anyway, for now I have everything set up in Bio::Species::DESTROY, which calls SUPER::DESTROY(), then the Tree and Taxon cleanup methods prior to explicitly deleting them. I also removed all weaken/is_weak and the Scalar::Util requirement from Bio::Species. It passes all tests in the test suite. Also, I didn't see any significant memory issues using Rutger's original file and the test file in the above bug report, but it's worth another run through with more data. >> I'm perfectly willing to test that out if no one else is (I am >> working >> on a few other bugs at the moment but I can probably get to it in the >> next day or two). Not to tread on any toes, but I would like to get >> this one taken care of prior to 1.6 and these are fairly nasty memory >> issues. If there is any consolation it shouldn't be too hard to re- >> rig the methods to delegate appropriately. >> >>> I don't really feel hot or bothered about this fix vs. the >>> combination >>> of weaken and stashing the subspecies. It depends on what the >>> author >>> intended for Species and its tree, neither are particularly sexy and >>> if Species is really on its way out the door it may not matter..... >>> >>> g. 
>> >> I don't think it will matter post-1.6, but we'll have to support >> Bio::Species for the 1.6 release series. So it will have to be >> maintained for the short term, until 1.7 comes out. > > I can't poke at this stuff much until the weekend. If you can kick > it, go for it! > > g. As you might have guessed I already have this set up. If there aren't objections I'll commit the code tomorrow for testing. We can roll back if there are significant issues. -c From cjfields at illinois.edu Tue Nov 25 11:06:57 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 25 Nov 2008 10:06:57 -0600 Subject: [Bioperl-l] A better fix for Species.pm memory leak, please REVIEW. In-Reply-To: <18731.34324.610535.907621@almost.alerce.com> References: <18728.41471.910143.682950@almost.alerce.com> <49294A6D.4030107@sendu.me.uk> <2B801E79-43A2-4213-A075-47AE38C915ED@illinois.edu> <18730.4589.804276.536603@almost.alerce.com> <18D474CE-73AC-4888-8909-2E66C73FEB6A@illinois.edu> <18731.34324.610535.907621@almost.alerce.com> Message-ID: <35951CFE-7672-4D4F-9D40-AF19E7C7A76D@illinois.edu> George (and anyone else paying attention), I have attached a Bio::Species patch to bug 2594: http://bugzilla.open-bio.org/show_bug.cgi?id=2594 I would like comments back on that patch as soon as anyone can test it out. If there aren't any objections I'll probably commit it in a week's time and close the bug out. For the record, I don't see memory issues and the API is intact (still a Bio::Taxon), but it's a fairly drastic change as Bio::Species is now just a proxy. No idea on how this affects parsing speed either and it hasn't been tested with Devel::Cycle, so benchmarks are welcome; I would guess that it's slightly slower (extra object being generated), something I can deal with as long as it doesn't suck memory away on long parses. Cheers! 
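For anyone who wants to sanity-check the "no memory issues" claim without extra modules, one rough approach is to count live objects through their constructor and DESTROY. The sketch below is only an illustration under that assumption: ToyNode and make_pair() are hypothetical names, not part of BioPerl or of the patch. It contrasts a strong circular reference with one weakened via the core Scalar::Util module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $live = 0;    # number of ToyNode objects currently alive

package ToyNode;    # hypothetical class, for illustration only
sub new {
    my $class = shift;
    $live++;
    return bless {}, $class;
}
sub DESTROY { $live-- }

package main;

# Build a parent/child pair that refer to each other, then drop both lexicals.
sub make_pair {
    my ($use_weaken) = @_;
    my $parent = ToyNode->new;
    my $child  = ToyNode->new;
    $parent->{child} = $child;
    $child->{parent} = $parent;                  # closes the cycle
    weaken($child->{parent}) if $use_weaken;     # back-reference no longer counted
    return;
}

make_pair(0) for 1 .. 1000;
my $leaked_strong = $live;

make_pair(1) for 1 .. 1000;
my $leaked_weak = $live - $leaked_strong;

print "strong cycle: $leaked_strong objects still alive\n";
print "weakened:     $leaked_weak objects still alive\n";
```

With the strong cycle every pair stays alive after make_pair() returns (2000 objects), while the weakened pairs are reclaimed immediately (0 objects). The same counting trick could be pointed at a long parse loop to see whether the number of live objects keeps growing.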
chris On Nov 24, 2008, at 10:59 PM, George Hartzell wrote: > Chris Fields writes: >> >> On Nov 23, 2008, at 8:31 PM, George Hartzell wrote: >> >>> Chris Fields writes: >>>> On Nov 23, 2008, at 6:19 AM, Sendu Bala wrote: >>>> >>>>> But anyway, thanks George, go ahead and commit if it all seems >>>>> functional. >>>>> >>>>> If you could summarise your solution in the bug report(s?) as >>>>> well, >>>>> that would be great. >>>> >>>> >>>> Agreed, though I think we need to ensure that memory leaks don't >>>> permeate into Bio::Taxon when we make the switch (it shouldn't, I >>>> think). BTW, checking Blame/Annotate it looks like some of this >>>> stems >>>> from r10974: >>>> >>>> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=10974 >>>> >>> >>> The change that I proposed wouldn't cause anything to permeate into >>> Bio::Taxon. >>> >>> It's really a design problem with Bio::Species. A Species "has-a" >>> reference to a tree *and* when the Species creates that tree it asks >>> the tree constructor to initialize itself using the Species itself >>> which results in the tree including a reference to that Species >>> object. Et voila, circular reference. >> >> My thoughts as well. >> >>> It's not helped by the fact that Bio::Species explicitly disconnects >>> all of the tree cleanup code that was supposed to handle those >>> circular references. On the other hand there are a bunch of >>> comments >>> that suggest that the cleanup code never worked right. >> >> The cleanup code relies on the object being destroyed (either via >> garbage collection or explicitly). Since there is a circular >> dependency and DESTROY isn't explicitly called, the root cleanup >> method won't work. >> >>> ... >>> I'm not in front of that machine at the moment, but I think that >>> there >>> are a couple of other places in the src tree that use the -node or >>> -root arg's to Bio::Tree::Tree::new and should probably be checked >>> for >>> leaks. 
>> >> A possible solution would be to make Bio::Species a shell or proxy >> object that decorates a Bio::Taxon instance (and a Bio::Tree::Tree >> where needed) and just delegates the appropriate methods to them. It >> could still inherit from Bio::Taxon; it would just override the >> Bio::Taxon methods for delegation. The Bio::Tree::Tree can refer to >> the Bio::Taxon object (and vice versa), but neither refer back to the >> Species object directly. >> >> In this case, when the last ref to the Bio::Species is released it >> should be garbage collected and trigger DESTROY (which is set up in >> Bio::Root::Root). This in turn triggers any cleanup methods, such as >> calling DESTROY on the Tree and Taxon and get rid of memory leaks >> (and >> wouldn't require weaken/is_weak). >> [...] > > How will this interact with the places that Bio::Species unhooks the > various cleanup mechanisms that Tree sets up. There are comments that > talk about the sub ref screwing with someone's use of Storable. It > seems like if the new Species has cleanup code registered then there's > a problem with whom-ever had a problem with it before. > >> I'm perfectly willing to test that out if no one else is (I am >> working >> on a few other bugs at the moment but I can probably get to it in the >> next day or two). Not to tread on any toes, but I would like to get >> this one taken care of prior to 1.6 and these are fairly nasty memory >> issues. If there is any consolation it shouldn't be too hard to re- >> rig the methods to delegate appropriately. >> >>> I don't really feel hot or bothered about this fix vs. the >>> combination >>> of weaken and stashing the subspecies. It depends on what the >>> author >>> intended for Species and its tree, neither are particularly sexy and >>> if Species is really on its way out the door it may not matter..... >>> >>> g. >> >> I don't think it will matter post-1.6, but we'll have to support >> Bio::Species for the 1.6 release series. 
So it will have to be >> maintained for the short term, until 1.7 comes out. > > I can't poke at this stuff much until the weekend. If you can kick > it, go for it! > > g. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From cjfields at illinois.edu Tue Nov 25 12:31:27 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 25 Nov 2008 11:31:27 -0600 Subject: [Bioperl-l] Bio::Assembly tests failing Message-ID: <62D72B16-B9E1-4097-931C-CB05934005A0@illinois.edu> I am getting tests failing on svn trunk (bioperl-live) that appear related to Bio::Assembly changes. Florent, can you take a look at these? chris 1..4 ok 1 - use Bio::Assembly::IO; ok 2 - Testing to see if the first contig is a Contig isa Bio::Assembly::Contig not ok 3 - Testing to see if the first singlet is a Singlet isa Bio::Assembly::Singlet not ok 4 - Testing to see if the Singlet ISA Contig isa Bio::Assembly::Contig # Failed test 'Testing to see if the first singlet is a Singlet isa Bio::Assembly::Singlet' # at t/singlet.t line 25. # Testing to see if the first singlet is a Singlet isn't defined # Failed test 'Testing to see if the Singlet ISA Contig isa Bio::Assembly::Contig' # at t/singlet.t line 27. # Testing to see if the Singlet ISA Contig isn't defined # Looks like you failed 2 tests of 4. Dubious, test returned 2 (wstat 512, 0x200) Failed 2/4 subtests Test Summary Report ------------------- t/singlet.t (Wstat: 512 Tests: 4 Failed: 2) Failed tests: 3-4 Non-zero exit status: 2 Files=1, Tests=4, 1 wallclock secs ( 0.01 usr 0.01 sys + 0.26 cusr 0.04 csys = 0.32 CPU) Result: FAIL Failed 1/1 test programs. 2/4 subtests failed. 
From cjfields at illinois.edu Tue Nov 25 12:34:37 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 25 Nov 2008 11:34:37 -0600 Subject: [Bioperl-l] LocatableSeq::subseq(): bug or not? In-Reply-To: <6DFFF4639BB240AA9D0BC190949BCE96@NewLife> References: <53450E04-8729-4066-84C4-9C2FCC7C7275@illinois.edu> <6DFFF4639BB240AA9D0BC190949BCE96@NewLife> Message-ID: <0740FC00-B535-4904-9DA1-365EC587E6BA@illinois.edu> Mark, Your subseq() patch appears to work just fine; no apparent tests are failing, API doesn't change, so that will be added for the release. We may need to define a new subseq()-like method to work properly with start/end coordinates that match only residues and are consistent with different coordinate systems (i.e. mapping), or we can add that in as a flag. Related to this, I have made a few commits defining groups of symbols for LocatableSeq ($GAP_SYMBOLS, $RESIDUE_SYMBOLS, $FRAMESHIFT_SYMBOLS, and the catchall $OTHER_SYMBOLS). I had already started down this path anyway, so might as well finish it. A remaining problem: they are currently set as class global variables, so there are some odd scoping issues when using them globally or locally (detailed in the test suite as a TODO), and they do not reset the $MATCHPATTERN. I'll set them up to be object-scoped attributes in a future release. chris On Nov 24, 2008, at 8:04 AM, Mark A. Jensen wrote: > Bug #2682 contains a patch that modifies subseq() to strip gaps if > desired. It also tries to fix the $replace weirdness. 
> > perldb transcript: > DB<11> $seq = new Bio::PrimarySeq(-seq=>'--atg---gta--') > > DB<12> x $seq->subseq(1,3) > 0 '--a' > DB<13> x $seq->subseq(1,3,NOGAP) > 0 'a' > DB<15> x $seq->seq > 0 '--atg---gta--' > DB<16> x $seq->subseq(-START=>1, -END=>3, -REPLACE_WITH=>'tga') > 0 '--a' > DB<18> x $seq->seq > 0 'tgatg---gta--' > ## silly gap-stripper: > DB<21> x $seq->subseq(-START=>1, -END=>$seq->length, > -REPLACE_WITH=>$seq->subseq(- > START=>1, > -END > =>$seq->length, > -NOGAP > =>1)) > 0 'tgatg---gta--' > DB<22> x $seq->seq > 0 'tgatggta' > > ----- Original Message ----- From: "Chris Fields" > > To: "BioPerl List" > Sent: Sunday, November 23, 2008 7:31 PM > Subject: [Bioperl-l] LocatableSeq::subseq(): bug or not? > > >> Currently, we have Bio::LocatableSeq use the default >> (Bio::PrimarySeq) implementation of subseq(). However the >> returned data apparently clashes with the actual PrimarySeq >> documentation: >> >> Function: returns the subseq from start to end, where the first base >> is 1 and the number is inclusive, ie 1-2 are the first two >> bases of the sequence >> >> So, should the following actually return the indicated range of >> bases (no gaps)? Or should we clarify the above documentation to >> indicate subseq() returns the first x positions/columns (anything) >> instead of 'bases' (no gaps)? 
>> >> my $seq = Bio::LocatableSeq->new( >> -seq => '--atg---gta--', >> -strand => 1, >> -start => 1, >> -end => 6, >> -alphabet => 'dna' >> ); >> >> # comments indicate current returned val >> $seq->subseq(1,3); # returns '--a' >> $seq->subseq(3,6); # returns 'atg-' >> $seq->subseq(1,10); # returns '--atg---gt' >> >> chris >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From cjfields at illinois.edu Tue Nov 25 12:51:49 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 25 Nov 2008 11:51:49 -0600 Subject: [Bioperl-l] bioperl.lisp bugs Message-ID: I'm not a user of this file, so if any users of the bioperl.lisp file in svn can comment on the following two bugs it would be of tremendous help. I'm sure they're fine but I would like to know if they should be accepted for the next release. http://bugzilla.open-bio.org/show_bug.cgi?id=2641 http://bugzilla.open-bio.org/show_bug.cgi?id=2642 chris From alperyilmaz at gmail.com Tue Nov 25 13:15:56 2008 From: alperyilmaz at gmail.com (Alper Yilmaz) Date: Tue, 25 Nov 2008 13:15:56 -0500 Subject: [Bioperl-l] quick pairwise alignment Message-ID: Hi, I am getting two nucleotide sequences from a database and I want to quickly align and view the result online(I'll wrap it in HTML). The examples I saw online require saving files (either output file or input file). 
I tried StandAloneBlast and tried to pass the sequences as Bio::Seq elements, as shown below.

First I tried the Bioperl tutorial, section "IV.2.2 Aligning 2 sequences with Blast using bl2seq and AlignIO":

$factory = Bio::Tools::Run::StandAloneBlast->new('outfile' => 'bl2seq.out');
my $seq1='AGCTACGATCAGCACTACGACTACGACTACGACTACACTAGCTAC' ;
my $seq2='AGCTACGATCACCACTACGACTACGGCTACGACTACACGAGCTAC' ;
$bl2seq_report = $factory->bl2seq($seq1, $seq2);

The result I got was the following error:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: AGCTACGATCAGCACTACGACTACGACTACGACTACACTAGCTAC not Seq Object or file name!

Then I used Bio::Seq objects as shown:

my $factory = Bio::Tools::Run::StandAloneBlast->new('outfile' => '/tmp/bl2seq.out');
my $seq1 = Bio::Seq->new(-id=>$seqname1,-seq=>$ntseq1);
my $seq2 = Bio::Seq->new(-id=>$seqname2,-seq=>$ntseq2);
my $alignment = $factory->bl2seq($seq1, $seq2);

The result I got is a "bl2seq call crashed" error. As far as I can tell StandAloneBlast cannot write to temporary files that are defined by "_rootio_tempfiles" in Bio::Root::IO. The detail is below:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: bl2seq call crashed: 256 | No such file or directory | /usr/bin/bl2seq -j /tmp/sNpLUbS6jJ -i /tmp/aNWLAj_ZDN -o /tmp/bl2seq.out

So, is there a way to do alignments without reading/writing files? Is there a quick way to align two sequences that are kept in the $seq1 and $seq2 variables? I had trouble installing the Bioperl-Ext package; if there's a tool in that package I can try harder to install it.

thanks,
alper

From David.Messina at sbc.su.se Tue Nov 25 13:46:21 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 25 Nov 2008 19:46:21 +0100
Subject: [Bioperl-l] quick pairwise alignment
In-Reply-To: References: Message-ID: <628aabb70811251046u3e9f2711pdb6c90afdcb6f94c@mail.gmail.com>

Hi Alper,

It looks like StandAloneBlast is having trouble locating bl2seq.
Do you have bl2seq installed on your machine at /usr/bin/bl2seq? If it is installed, you might try running a test outside of bioperl, just to verify that your input, output, and bl2seq are working correctly. Dave From maj at fortinbras.us Tue Nov 25 14:00:10 2008 From: maj at fortinbras.us (Mark A. Jensen) Date: Tue, 25 Nov 2008 14:00:10 -0500 Subject: [Bioperl-l] LocatableSeq::subseq(): bug or not? In-Reply-To: <0740FC00-B535-4904-9DA1-365EC587E6BA@illinois.edu> References: <53450E04-8729-4066-84C4-9C2FCC7C7275@illinois.edu> <6DFFF4639BB240AA9D0BC190949BCE96@NewLife> <0740FC00-B535-4904-9DA1-365EC587E6BA@illinois.edu> Message-ID: ----- Original Message ----- From: "Chris Fields" To: "Mark A. Jensen" Cc: "BioPerl List" Sent: Tuesday, November 25, 2008 12:34 PM Subject: Re: [Bioperl-l] LocatableSeq::subseq(): bug or not? > Mark, > > Your subseq() patch appears to work just fine; no apparent tests are > failing, API doesn't change, so that will be added for the release. > We may need to define a new subseq()-like method to work properly > with start/end coordinates that match only residues and are > consistent with different coordinate systems (i.e. mapping), or we > can add that in as a flag. I'm willing to try my hand at this, if desired-- can you point me to the modules involved off the top of yer head? MAJ > > Related to this, I have made a few commits defining groups of > symbols for LocatableSeq ($GAP_SYMBOLS, $RESIDUE_SYMBOLS, > $FRAMESHIFT_SYMBOLS, and the catchall $OTHER_SYMBOLS). I had > already started down this path anyway, so might as well finish it. > A remaining problem: they are currently set as class global > variables, so there are some odd scoping issues when using them > globally or locally (detailed in the test suite as a TODO), and > they do not reset the $MATCHPATTERN. I'll set them up to be > object-scoped attributes in a future release. > > chris > > On Nov 24, 2008, at 8:04 AM, Mark A. 
Jensen wrote: > >> Bug #2682 contains a patch that modifies subseq() to strip gaps if >> desired. It also tries to fix the $replace weirdness. >> >> perldb transcript: >> DB<11> $seq = new Bio::PrimarySeq(-seq=>'--atg---gta--') >> >> DB<12> x $seq->subseq(1,3) >> 0 '--a' >> DB<13> x $seq->subseq(1,3,NOGAP) >> 0 'a' >> DB<15> x $seq->seq >> 0 '--atg---gta--' >> DB<16> x $seq->subseq(-START=>1, -END=>3, -REPLACE_WITH=>'tga') >> 0 '--a' >> DB<18> x $seq->seq >> 0 'tgatg---gta--' >> ## silly gap-stripper: >> DB<21> x $seq->subseq(-START=>1, -END=>$seq->length, >> -REPLACE_WITH=>$seq->subseq(- >> START=>1, >> >> -END =>$seq->length, >> >> -NOGAP =>1)) >> 0 'tgatg---gta--' >> DB<22> x $seq->seq >> 0 'tgatggta' >> >> ----- Original Message ----- From: "Chris Fields" >> > > >> To: "BioPerl List" >> Sent: Sunday, November 23, 2008 7:31 PM >> Subject: [Bioperl-l] LocatableSeq::subseq(): bug or not? >> >> >>> Currently, we have Bio::LocatableSeq use the default >>> (Bio::PrimarySeq) implementation of subseq(). However the >>> returned data apparently clashes with the actual PrimarySeq >>> documentation: >>> >>> Function: returns the subseq from start to end, where the first >>> base >>> is 1 and the number is inclusive, ie 1-2 are the first >>> two >>> bases of the sequence >>> >>> So, should the following actually return the indicated range of >>> bases (no gaps)? Or should we clarify the above documentation to >>> indicate subseq() returns the first x positions/columns >>> (anything) instead of 'bases' (no gaps)? 
>>> >>> my $seq = Bio::LocatableSeq->new( >>> -seq => '--atg---gta--', >>> -strand => 1, >>> -start => 1, >>> -end => 6, >>> -alphabet => 'dna' >>> ); >>> >>> # comments indicate current returned val >>> $seq->subseq(1,3); # returns '--a' >>> $seq->subseq(3,6); # returns 'atg-' >>> $seq->subseq(1,10); # returns '--atg---gt' >>> >>> chris >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Marie-Claude Hofmann > College of Veterinary Medicine > University of Illinois Urbana-Champaign > > > > > > From cjfields at illinois.edu Tue Nov 25 15:03:23 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 25 Nov 2008 14:03:23 -0600 Subject: [Bioperl-l] LocatableSeq::subseq(): bug or not? In-Reply-To: References: <53450E04-8729-4066-84C4-9C2FCC7C7275@illinois.edu> <6DFFF4639BB240AA9D0BC190949BCE96@NewLife> <0740FC00-B535-4904-9DA1-365EC587E6BA@illinois.edu> Message-ID: On Nov 25, 2008, at 1:00 PM, Mark A. Jensen wrote: > > ----- Original Message ----- From: "Chris Fields" > > To: "Mark A. Jensen" > Cc: "BioPerl List" > Sent: Tuesday, November 25, 2008 12:34 PM > Subject: Re: [Bioperl-l] LocatableSeq::subseq(): bug or not? > > >> Mark, >> >> Your subseq() patch appears to work just fine; no apparent tests >> are failing, API doesn't change, so that will be added for the >> release. We may need to define a new subseq()-like method to work >> properly with start/end coordinates that match only residues and >> are consistent with different coordinate systems (i.e. mapping), >> or we can add that in as a flag. 
> > I'm willing to try my hand at this, if desired-- can you point me to > the modules involved off the top of yer head? > MAJ The test suite has the beginnings for mapping() and frameshifts() methods; the former is a simple two element array of # residues mapping to # positions. This is primarily so endpoint calculations are correct. frameshifts() accepts/returns a simple hash indicating position of frameshifts and their shift (-+, integer). Both are necessary for HSPs. I haven't integrated checking of frameshift symbol positions yet (they are mainly used for sequence validation). That will probably need to be done first before working this into subseq() as frameshifting will affect what the start/end position is in the sequence. chris From hlapp at gmx.net Tue Nov 25 15:25:31 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 25 Nov 2008 15:25:31 -0500 Subject: [Bioperl-l] bioperl.lisp bugs In-Reply-To: References: Message-ID: <3D391976-B0F0-41C1-BB91-03545539E615@gmx.net> On Nov 25, 2008, at 12:51 PM, Chris Fields wrote: > I'm not a user of this file, so if any users of the bioperl.lisp > file in svn can comment on the following two bugs it would be of > tremendous help. I'm sure they're fine but I would like to know if > they should be accepted for the next release. > > http://bugzilla.open-bio.org/show_bug.cgi?id=2641 This looks perfectly fine. > > http://bugzilla.open-bio.org/show_bug.cgi?id=2642 This is probably not an issue for any other Perl than ActiveState's, but the fix shouldn't hurt either. There may just be a lot of files already lying around with an extra empty line. 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at illinois.edu Tue Nov 25 15:40:33 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 25 Nov 2008 14:40:33 -0600 Subject: [Bioperl-l] bioperl.lisp bugs In-Reply-To: <3D391976-B0F0-41C1-BB91-03545539E615@gmx.net> References: <3D391976-B0F0-41C1-BB91-03545539E615@gmx.net> Message-ID: Hilmar, Thanks for looking into it. Did you want to commit them, or should I? chris On Nov 25, 2008, at 2:25 PM, Hilmar Lapp wrote: > > On Nov 25, 2008, at 12:51 PM, Chris Fields wrote: > >> I'm not a user of this file, so if any users of the bioperl.lisp >> file in svn can comment on the following two bugs it would be of >> tremendous help. I'm sure they're fine but I would like to know if >> they should be accepted for the next release. >> >> http://bugzilla.open-bio.org/show_bug.cgi?id=2641 > > This looks perfectly fine. > >> >> http://bugzilla.open-bio.org/show_bug.cgi?id=2642 > > This is probably not an issue for any other Perl than ActiveState's, > but the fix shouldn't hurt either. There may just be a lot of files > already lying around with an extra empty line. > > -hilmar > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== From florent.angly at gmail.com Tue Nov 25 15:53:27 2008 From: florent.angly at gmail.com (Florent Angly) Date: Tue, 25 Nov 2008 12:53:27 -0800 Subject: [Bioperl-l] Bio::Assembly tests failing In-Reply-To: <62D72B16-B9E1-4097-931C-CB05934005A0@illinois.edu> References: <62D72B16-B9E1-4097-931C-CB05934005A0@illinois.edu> Message-ID: <492C65C7.2080609@gmail.com> Hi Chris, I didn't realize that there were Bio::Assembly related tests outside of the t/Assembly.t file. 
The story is that I changed the ACE parser so that it only gets its information from the ACE file (including singlets). This was not the case before, so it broke the old, not updated t/singlet.t tests. So what I did is to update and move the tests that t/singlet.t did to t/Assembly.t and removed t/singlet.t. All the Bio::Assembly tests should pass now. Cheers, Florent Chris Fields wrote: > I am getting tests failing on svn trunk (bioperl-live) that appear > related to Bio::Assembly changes. Florent, can you take a look at these? > > chris > > 1..4 > ok 1 - use Bio::Assembly::IO; > ok 2 - Testing to see if the first contig is a Contig isa > Bio::Assembly::Contig > not ok 3 - Testing to see if the first singlet is a Singlet isa > Bio::Assembly::Singlet > not ok 4 - Testing to see if the Singlet ISA Contig isa > Bio::Assembly::Contig > > # Failed test 'Testing to see if the first singlet is a Singlet isa > Bio::Assembly::Singlet' > # at t/singlet.t line 25. > # Testing to see if the first singlet is a Singlet isn't defined > > # Failed test 'Testing to see if the Singlet ISA Contig isa > Bio::Assembly::Contig' > # at t/singlet.t line 27. > # Testing to see if the Singlet ISA Contig isn't defined > # Looks like you failed 2 tests of 4. > Dubious, test returned 2 (wstat 512, 0x200) > Failed 2/4 subtests > > Test Summary Report > ------------------- > t/singlet.t (Wstat: 512 Tests: 4 Failed: 2) > Failed tests: 3-4 > Non-zero exit status: 2 > Files=1, Tests=4, 1 wallclock secs ( 0.01 usr 0.01 sys + 0.26 cusr > 0.04 csys = 0.32 CPU) > Result: FAIL > Failed 1/1 test programs. 2/4 subtests failed. 
> > From hlapp at gmx.net Tue Nov 25 16:37:42 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 25 Nov 2008 16:37:42 -0500 Subject: [Bioperl-l] bioperl.lisp bugs In-Reply-To: References: <3D391976-B0F0-41C1-BB91-03545539E615@gmx.net> Message-ID: <081BDFBD-163F-45B5-B723-1A453789C24F@gmx.net> Go ahead while you're at it, don't wait for me :-) -hilmar On Nov 25, 2008, at 3:40 PM, Chris Fields wrote: > Hilmar, > > Thanks for looking into it. Did you want to commit them, or should I? > > chris > > On Nov 25, 2008, at 2:25 PM, Hilmar Lapp wrote: > >> >> On Nov 25, 2008, at 12:51 PM, Chris Fields wrote: >> >>> I'm not a user of this file, so if any users of the bioperl.lisp >>> file in svn can comment on the following two bugs it would be of >>> tremendous help. I'm sure they're fine but I would like to know >>> if they should be accepted for the next release. >>> >>> http://bugzilla.open-bio.org/show_bug.cgi?id=2641 >> >> This looks perfectly fine. >> >>> >>> http://bugzilla.open-bio.org/show_bug.cgi?id=2642 >> >> This is probably not an issue for any other Perl than >> ActiveState's, but the fix shouldn't hurt either. There may just be >> a lot of files already lying around with an extra empty line. >> >> -hilmar >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From freshpetals03 at yahoo.com Mon Nov 24 12:54:28 2008 From: freshpetals03 at yahoo.com (bioperl_inquisitive) Date: Mon, 24 Nov 2008 09:54:28 -0800 (PST) Subject: [Bioperl-l] unable to open swissprot.nin Message-ID: <20666106.post@talk.nabble.com> hey ppl ... 
I'm having a problem at the last step of the BLAST installation on Windows. :-(

blastall -p blastn -d swissprot -i fasta_query.txt -o output.txt

It's showing an error like this:

[NULL_Caption] WARNING: Unable to open swissprot.nin
[NULL_Caption] WARNING: gi 157836126 pdb 2QOX Z: Unable to open swissprot.nin

Any help will be appreciated!! Thanks in advance.

--
View this message in context: http://www.nabble.com/unable-to-open-swissprot.nin-tp20666106p20666106.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From Thomas.Jahns at gmx.net Thu Nov 20 08:04:51 2008
From: Thomas.Jahns at gmx.net (Thomas Jahns)
Date: Thu, 20 Nov 2008 14:04:51 +0100
Subject: [Bioperl-l] Update of bioperl-ext for modern environment
Message-ID: <20081120130451.307240@gmx.net>

Hello everyone,

I hope I didn't duplicate anyone's work, but I couldn't find anything on this in the archives, so I patched bioperl-ext-1.5.1 to work with

- bioperl-1.5.2_102
- staden io_lib 1.11.4

and not crash perl. Please see the attached patch; I hope someone reading here can integrate it with the repository.

There is one necessary externally visible change: instead of specifying /foo/include/io_lib for the headers of the staden package, one now needs to specify /foo/include, because read.h includes other files with the io_lib prefix.

I hope I removed the double-free bug in the right place. If freeing a pointer passed into the function pgreen was intentional, another strategy will be needed. I also found make clean to be dysfunctional, but I don't know enough about MakeMaker to fix that, so for recompiles I used a cleanup script (also attached).

Greetings,
Thomas Jahns

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bioperl-ext-update.patch Type: application/octet-stream Size: 4788 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: clean-bioperl-ext.sh Type: application/octet-stream Size: 441 bytes Desc: not available URL: From cjfields at illinois.edu Tue Nov 25 19:05:30 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 25 Nov 2008 18:05:30 -0600 Subject: [Bioperl-l] bioperl.lisp bugs In-Reply-To: <081BDFBD-163F-45B5-B723-1A453789C24F@gmx.net> References: <3D391976-B0F0-41C1-BB91-03545539E615@gmx.net> <081BDFBD-163F-45B5-B723-1A453789C24F@gmx.net> Message-ID: <20432E5A-DC20-4D0F-AB3C-0B8CCE573D15@illinois.edu> Done! -c On Nov 25, 2008, at 3:37 PM, Hilmar Lapp wrote: > Go ahead while you're at it, don't wait for me :-) > > -hilmar > > On Nov 25, 2008, at 3:40 PM, Chris Fields wrote: > >> Hilmar, >> >> Thanks for looking into it. Did you want to commit them, or should >> I? >> >> chris >> >> On Nov 25, 2008, at 2:25 PM, Hilmar Lapp wrote: >> >>> >>> On Nov 25, 2008, at 12:51 PM, Chris Fields wrote: >>> >>>> I'm not a user of this file, so if any users of the bioperl.lisp >>>> file in svn can comment on the following two bugs it would be of >>>> tremendous help. I'm sure they're fine but I would like to know >>>> if they should be accepted for the next release. >>>> >>>> http://bugzilla.open-bio.org/show_bug.cgi?id=2641 >>> >>> This looks perfectly fine. >>> >>>> >>>> http://bugzilla.open-bio.org/show_bug.cgi?id=2642 >>> >>> This is probably not an issue for any other Perl than >>> ActiveState's, but the fix shouldn't hurt either. There may just >>> be a lot of files already lying around with an extra empty line. 
>>> >>> -hilmar >>> >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From David.Messina at sbc.su.se Wed Nov 26 05:31:34 2008 From: David.Messina at sbc.su.se (Dave Messina) Date: Wed, 26 Nov 2008 11:31:34 +0100 Subject: [Bioperl-l] quick pairwise alignment In-Reply-To: References: <628aabb70811251046u3e9f2711pdb6c90afdcb6f94c@mail.gmail.com> Message-ID: <628aabb70811260231p24281dfcq9903b46d84b3c8c2@mail.gmail.com> Hi Alper, Please remember to 'reply all' to keep this conversation on the bioperl list. It's hard to be sure with just fragments of your code, but it seems to be working fine for me. Below I've attached a sample script showing how you can run bl2seq. If you aren't already, you probably will want to download a nightly build of bioperl-live and bioperl-run to eliminate any problems stemming from outdated code. http://www.bioperl.org/DIST/nightly_builds/ Also, for future reference, instead of the BioPerl tutorial, I recommend the HOWTOs on the website; for this question, this one should be helpful: http://www.bioperl.org/wiki/HOWTO:Beginners You tell StandAloneBlast which program to use (blastp, blastn, etc) using the -program parameter. You'll see it in the example script below. 
Dave

---------------example code---------------
#!/usr/bin/perl

use strict;
use warnings;
use Bio::Seq;
use Bio::Tools::Run::StandAloneBlast;

# lots of params can be set here, basically anything that you would normally
# be able to pass to blast on the command line. Type
#     perldoc Bio::Tools::Run::StandAloneBlast
# on the command line to see details.
my @params  = (program => 'blastp');
my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);

# make some fake data
my $input1 = Bio::Seq->new(-id  => "testquery1",
                           -seq => "ACTADDEEQQPPTCADEEQQQVVGG");
my $input2 = Bio::Seq->new(-id  => "testquery2",
                           -seq => "ACTADDEMMMMMMMDEEQQQVVGG");

# execute the blast command with this line
my $blast_report = $factory->bl2seq($input1, $input2);

# just one result in a bl2seq report
my $result_obj = $blast_report->next_result;

# likewise just one hit
my $hit_obj = $result_obj->next_hit;

# there may be >1 hsp, but I'm only looking at the first one in this example
my $hsp_obj = $hit_obj->next_hsp;

# take a quick look at the alignment
print $hsp_obj->query_string,    "\n",
      $hsp_obj->homology_string, "\n",
      $hsp_obj->hit_string,      "\n";
---------------end example code-------------

On Tue, Nov 25, 2008 at 20:00, Alper Yilmaz wrote:
> Hi Dave,
> bl2seq works okay. I tried the following outside of bioperl and it
> successfully generates the output.
> Btw, how do I tell StandAloneBlast, bl2seq function which program to use
> (blastn, blastp, etc)?
>
> bl2seq -p blastn -j seq1.fa -i seq2.fa -o bl2seq.out
>
> thanks,
> alper
>
>

From timmcilveen at talktalk.net Wed Nov 26 06:30:18 2008
From: timmcilveen at talktalk.net (Tim)
Date: Wed, 26 Nov 2008 11:30:18 -0000
Subject: [Bioperl-l] writing bioperl modules
Message-ID: <006001c94fba$625a5b90$0301a8c0@your83dafb4529>

Hi,

I'm just curious to know whether BioPerl is considered complete, or whether there are still projects that programmers can become involved in.
I notice that for other bio-languages there are ongoing hackathons where programmers can meet and work on a problem. Do these exist for BioPerl, or is the code base now mature?

thanks,
tim

From biobrain at gmail.com Wed Nov 26 08:52:39 2008
From: biobrain at gmail.com (Imtiaz M.)
Date: Wed, 26 Nov 2008 13:52:39 +0000
Subject: [Bioperl-l] BioPerl Installation on Suse 11.0 Failed ?
Message-ID: <59840f5f0811260552o33d6f586qbf7fbb671770a09@mail.gmail.com>

I am trying to install BioPerl 1.4, both from CPAN and from a local directory after downloading, but I am unable to get it running. Here is part of the installation log. Please give me some suggestions on how to proceed.

perl Makefile.PL
Generated sub tests. go make show_tests to see available subtests

*** Script Install Section ****

Bioperl comes with a number of useful scripts which you may wish to install.
Install [a]ll Bioperl scripts, [n]one, or choose groups [i]nteractively? [a]
cannot unlink file for scripts_temp/bp_pg_bulk_load_gff.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_pg_bulk_load_gff.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_process_gadfly.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_process_gadfly.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_composite_LD.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_composite_LD.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_index.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_index.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_dbsplit.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_dbsplit.pl: Permission denied at Makefile.PL
line 118
cannot unlink file for scripts_temp/bp_filter_search.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_filter_search.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_bioflat_index.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_bioflat_index.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_pairwise_kaks.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_pairwise_kaks.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_mutate.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_mutate.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_taxid4species.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_taxid4species.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_nrdb.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_nrdb.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_heterogeneity_test.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_heterogeneity_test.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_chaos_plot.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_chaos_plot.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_oligo_count.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_oligo_count.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_blast2tree.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_blast2tree.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_biofetch_genbank_proxy.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_biofetch_genbank_proxy.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_search2alnblocks.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_search2alnblocks.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_search2tribe.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_search2tribe.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_seqconvert.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_seqconvert.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_mask_by_search.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_mask_by_search.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_sreformat.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_sreformat.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_seq_length.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_seq_length.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_extract_feature_seq.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_extract_feature_seq.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_split_seq.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_split_seq.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_mrtrans.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_mrtrans.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_aacomp.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_aacomp.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_remote_blast.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_remote_blast.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_flanks.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_flanks.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_genbank2gff.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_genbank2gff.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_process_sgd.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_process_sgd.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_process_wormbase.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_process_wormbase.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_gccalc.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_gccalc.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_generate_histogram.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_generate_histogram.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_fast_load_gff.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_fast_load_gff.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_local_taxonomydb_query.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_local_taxonomydb_query.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_process_ncbi_human.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_process_ncbi_human.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_load_gff.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_load_gff.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_biogetseq.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_biogetseq.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_feature_draw.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_feature_draw.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_search2gff.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_search2gff.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_search_overview.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_search_overview.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_bulk_load_gff.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_bulk_load_gff.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_frend.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_frend.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_translate_seq.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_translate_seq.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_fetch.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_fetch.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_search2BSML.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_search2BSML.pl: Permission denied at Makefile.PL line 118
cannot unlink file for scripts_temp/bp_biblio.pl: Permission denied at Makefile.PL line 118
cannot restore permissions to 0100555 for scripts_temp/bp_biblio.pl: Permission denied at Makefile.PL line 118
cannot remove directory for ./scripts_temp: Directory not empty at Makefile.PL line 118
cannot restore permissions to 0755 for ./scripts_temp: Operation not permitted at Makefile.PL line 118

External Module GD::SVG, Generate optional SVG output, is not installed on this computer.
The in Bioperl needs it for Bio::Graphics

External Module Ace, Aceperl, is not installed on this computer.
The Bio::DB::Ace in Bioperl needs it for access of ACeDB database

External Module SOAP::Lite, SOAP protocol, is not installed on this computer.
The Bio::DB::XEMBLService in Bioperl needs it for XEMBL Services (also Bibliographic queries in Biblio::)

External Module GD, Graphical Drawing Toolkit, is not installed on this computer.
The Bio::Graphics in Bioperl needs it for Rendering Sequences and Features

External Module XML::Twig, Available on CPAN, is not installed on this computer.
The Module Bio::Variation::IO::xml.pm in Bioperl needs it for parsing of XML documents

External Module SVG, Generate optional SVG output, is not installed on this computer.
The in Bioperl needs it for Bio::Graphics

External Module Text::Shellwords, Execute shell commands, is not installed on this computer.
The Bio::Graphics in Bioperl needs it for test scripts

External Module XML::Parser::PerlSAX, Parsing of XML documents, is not installed on this computer.
The Bio::SeqIO::game,Bio::Variation::* in Bioperl needs it for Bio::Variation code, GAME parser

External Module DBD::mysql, Mysql driver, is not installed on this computer.
The Bio::DB::GFF in Bioperl needs it for loading and querying of Mysql-based GFF feature databases

External Module Graph::Directed, Generic Graph data stucture and algorithms, is not installed on this computer.
The Bio::Ontology::SimpleOntologyEngine in Bioperl needs it for Ontology Engine implementation for the GO parser

Information:

There are some external packages and perl modules, listed above, which bioperl uses. This only effects the functionality which is listed above: the rest of bioperl will work fine, which includes nearly all of the core packages.

The installation of these external packages is very simple.
You can read more about bioperl external dependencies in the INSTALL file or at: http://bioperl.org/Core/Latest/INSTALL Enjoy the rest of bioperl, which you can use after going 'make install' Writing Makefile for Bio shafiq at linux-22nr:~/source/bio> make shafiq at linux-22nr:~/source/bio> make test PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t t/AAChange...................ok t/AAReverseMutate............ok t/AlignIO....................ok t/AlignStats.................ok t/Allele.....................ok t/Alphabet...................ok t/Annotation.................Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. t/Annotation.................ok t/AnnotationAdaptor..........ok t/Assembly...................ok t/Biblio.....................SOAP::Lite not installed. Skipping some tests. t/Biblio.....................ok 1/24 skipped: various reasons t/Biblio_biofetch............ok t/BiblioReferences...........ok t/BioDBGFF...................ok t/BioFetch_DB................FAILED tests 8, 20-21, 27 Failed 4/27 tests, 85.19% okay t/BioGraphics................GD or Text::Shellwords modules are not installed. T his means that Bio::Graphics module is unusable. Skipping tests. t/BioGraphics................ok t/BlastIndex.................ok t/BPbl2seq...................ok t/BPlite.....................ok t/BPpsilite..................ok t/Chain......................ok t/cigarstring................ok t/ClusterIO..................XML::Parser::PerlSAX not loaded. This means Cluster IO::dbsnp test cannot be executed. Skipping t/ClusterIO..................ok t/Coalescent.................ok t/CodonTable.................ok t/consed.....................ok t/CoordinateGraph............ok t/CoordinateMapper...........ok t/Correlate..................Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. 
t/Correlate..................ok t/CytoMap....................ok t/DB.........................FAILED tests 30-31 Failed 2/78 tests, 97.44% okay (less 1 skipped test: 75 okay, 96.15%) t/DBCUTG.....................ok t/DBFasta....................ok t/DNAMutation................ok t/Domcut.....................ok t/ECnumber...................Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. t/ECnumber...................ok t/ELM........................ok 1/14 -------------------- WARNING --------------------- MSG: Bio::Tools::Analysis::Protein::ELM Request Error: 400 URL must be absolute Content-Type: text/plain Client-Date: Wed, 26 Nov 2008 13:50:44 GMT Client-Warning: Internal response 400 URL must be absolute --------------------------------------------------- t/ELM........................ok t/EMBL_DB....................FAILED tests 6, 13-14 Failed 3/15 tests, 80.00% okay t/EMBOSS_Tools...............ok t/EncodedSeq.................ok t/ePCR.......................ok t/ESEfinder..................ok t/est2genome.................ok t/Exception..................ok t/Exonerate..................ok t/flat.......................ok t/FootPrinter................ok t/game.......................XML::Parser::PerlSAX not loaded. This means game te st cannot be executed. Skipping t/game.......................ok t/GDB........................ok t/GeneCoordinateMapper.......ok 1/113 -------------------- WARNING --------------------- MSG: sorted sublocation array requested but root location doesn't define seq_id (at least one sublocation does!) --------------------------------------------------- -------------------- WARNING --------------------- MSG: sorted sublocation array requested but root location doesn't define seq_id (at least one sublocation does!) --------------------------------------------------- t/GeneCoordinateMapper.......ok 83/113Use of uninitialized value in concatenatio n (.) 
or string at /home/shafiq/source/bio/blib/lib/Bio/Coordinate/GeneMapper.pm line 814. t/GeneCoordinateMapper.......ok t/Geneid.....................ok t/Genewise...................ok 2/51 skipped: various reasons t/Genomewise.................ok t/Genpred....................ok t/GFF........................ok 1/32Filehandle GEN0 opened only for output at /h ome/shafiq/source/bio/blib/lib/Bio/Root/IO.pm line 440. Filehandle GEN1 opened only for output at /home/shafiq/source/bio/blib/lib/Bio/R oot/IO.pm line 440. t/GFF........................ok t/GOR4.......................ok t/GOterm.....................Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. t/GOterm.....................ok t/GuessSeqFormat.............ok 1/46Bio::SeqIO: game cannot be found Exception ------------- EXCEPTION ------------- MSG: Failed to load module Bio::SeqIO::game. Can't locate XML/Parser/PerlSAX.pm in @INC (@INC contains: t .. /home/shafiq/source/bio/blib/lib /home/shafiq/sourc e/bio/blib/arch /usr/lib/perl5/5.10.0/i586-linux-thread-multi /usr/lib/perl5/5.1 0.0 /usr/lib/perl5/site_perl/5.10.0/i586-linux-thread-multi /usr/lib/perl5/site_ perl/5.10.0 /usr/lib/perl5/vendor_perl/5.10.0/i586-linux-thread-multi /usr/lib/p erl5/vendor_perl/5.10.0 /usr/lib/perl5/vendor_perl .) at /home/shafiq/source/bio /blib/lib/Bio/SeqIO/game/gameSubs.pm line 70. BEGIN failed--compilation aborted at /home/shafiq/source/bio/blib/lib/Bio/SeqIO/ game/gameSubs.pm line 70. Compilation failed in require at /home/shafiq/source/bio/blib/lib/Bio/SeqIO/game /gameHandler.pm line 62. BEGIN failed--compilation aborted at /home/shafiq/source/bio/blib/lib/Bio/SeqIO/ game/gameHandler.pm line 62. Compilation failed in require at /home/shafiq/source/bio/blib/lib/Bio/SeqIO/game .pm line 76. BEGIN failed--compilation aborted at /home/shafiq/source/bio/blib/lib/Bio/SeqIO/ game.pm line 76. Compilation failed in require at /home/shafiq/source/bio/blib/lib/Bio/Root/Root. 
pm line 394. STACK Bio::Root::Root::_load_module /home/shafiq/source/bio/blib/lib/Bio/Root/Ro ot.pm:396 STACK (eval) /home/shafiq/source/bio/blib/lib/Bio/SeqIO.pm:549 STACK Bio::SeqIO::_load_format_module /home/shafiq/source/bio/blib/lib/Bio/SeqIO .pm:548 STACK Bio::SeqIO::new /home/shafiq/source/bio/blib/lib/Bio/SeqIO.pm:377 STACK (eval) t/GuessSeqFormat.t:61 STACK toplevel t/GuessSeqFormat.t:60 -------------------------------------- For more information about the SeqIO system please see the SeqIO docs. This includes ways of checking for formats at compile time, not run time t/GuessSeqFormat.............FAILED test 11 Failed 1/46 tests, 97.83% okay t/hmmer......................ok t/HNN........................ok t/Index......................ok t/InstanceSite...............ok t/InterProParser.............ok 1/47Bio::OntologyIO: InterProParser cannot be fo und Exception ------------- EXCEPTION ------------- MSG: Failed to load module Bio::OntologyIO::InterProParser. Can't locate XML/Par ser/PerlSAX.pm in @INC (@INC contains: t .. . ./blib/lib /home/shafiq/source/bio /blib/lib /home/shafiq/source/bio/blib/arch /usr/lib/perl5/5.10.0/i586-linux-thr ead-multi /usr/lib/perl5/5.10.0 /usr/lib/perl5/site_perl/5.10.0/i586-linux-threa d-multi /usr/lib/perl5/site_perl/5.10.0 /usr/lib/perl5/vendor_perl/5.10.0/i586-l inux-thread-multi /usr/lib/perl5/vendor_perl/5.10.0 /usr/lib/perl5/vendor_perl) at Bio/OntologyIO/InterProParser.pm line 84. BEGIN failed--compilation aborted at Bio/OntologyIO/InterProParser.pm line 84. Compilation failed in require at Bio/Root/Root.pm line 394. STACK Bio::Root::Root::_load_module Bio/Root/Root.pm:396 STACK (eval) Bio/OntologyIO.pm:255 STACK Bio::OntologyIO::_load_format_module Bio/OntologyIO.pm:254 STACK Bio::OntologyIO::new Bio/OntologyIO.pm:165 STACK toplevel t/InterProParser.t:52 -------------------------------------- For more information about the OntologyIO system please see the docs. 
This includes ways of checking for formats at compile time, not run time t/InterProParser.............NOK 2/47Can't call method "next_ontology" on an und efined value at t/InterProParser.t line 59. t/InterProParser.............dubious Test returned status 2 (wstat 512, 0x200) DIED. FAILED test 2 Failed 1/47 tests, 97.87% okay t/IUPAC......................ok t/largefasta.................ok t/largepseq..................ok t/LinkageMap.................ok t/LiveSeq....................ok t/LocatableSeq...............ok t/Location...................ok t/LocationFactory............ok t/LocusLink..................ok 1/23Useless localization of scalar assignment at /home/shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. t/LocusLink..................ok t/lucy.......................ok t/Map........................ok t/MapIO......................ok t/Matrix.....................ok t/Measure....................Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. t/Measure....................ok t/MeSH.......................ok t/MetaSeq....................ok t/MicrosatelliteMarker.......ok t/MiniMIMentry...............Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. t/MiniMIMentry...............ok t/MitoProt...................ok t/Molphy.....................ok t/multiple_fasta.............ok t/Mutation...................ok t/Mutator....................ok t/NetPhos....................ok t/Node.......................ok t/OddCodes...................ok t/OMIMentry..................Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. t/OMIMentry..................ok t/OMIMentryAllelicVariant....Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. 
t/OMIMentryAllelicVariant....ok t/OMIMparser.................Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. t/OMIMparser.................ok t/Ontology................... Graph.pm doesn't seem to be installed on this system -- the GO Parser needs it.. . t/Ontology...................ok t/OntologyEngine.............Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. t/OntologyEngine.............ok t/PAML.......................ok t/Perl.......................ok 1/14Use of uninitialized value $au in substituti on (s///) at /home/shafiq/source/bio/blib/lib/Bio/SeqIO/swiss.pm line 855, line 45. t/Perl.......................ok t/phd........................ok t/Phenotype..................Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. t/Phenotype..................ok t/PhylipDist.................ok t/pICalculator...............ok t/Pictogram..................SVG not installed, skipping tests at t/Pictogram.t line 29. t/Pictogram..................ok t/PopGen.....................ok t/PopGenSims.................ok t/primaryqual................ok t/PrimarySeq.................ok t/primedseq..................ok t/Primer.....................ok t/primer3....................ok t/Promoterwise...............ok t/ProtDist...................ok t/psm........................ok t/QRNA.......................ok t/qual.......................ok t/RandDistFunctions..........ok t/RandomTreeFactory..........ok t/Range......................ok t/RangeI.....................ok t/RefSeq.....................ok t/Registry...................DB_File and BerkeleyDB not found. Skipping DB_File tests t/Registry...................ok t/Relationship...............Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. 
t/Relationship...............ok t/RelationshipType...........Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. t/RelationshipType...........ok t/RemoteBlast................ok 4/6 skipped: various reasons t/RepeatMasker...............ok t/RestrictionAnalysis........ok t/RestrictionEnzyme..........ok t/RestrictionIO..............ok 1/14 ------------- EXCEPTION ------------- MSG: Could not open >/tmp/r: Permission denied STACK Bio::Root::IO::_initialize_io /home/shafiq/source/bio/blib/lib/Bio/Root/IO .pm:273 STACK Bio::SeqIO::_initialize /home/shafiq/source/bio/blib/lib/Bio/SeqIO.pm:447 STACK Bio::Restriction::IO::base::_initialize /home/shafiq/source/bio/blib/lib/B io/Restriction/IO/base.pm:87 STACK Bio::Restriction::IO::base::new /home/shafiq/source/bio/blib/lib/Bio/Restr iction/IO/base.pm:81 STACK Bio::Restriction::IO::new /home/shafiq/source/bio/blib/lib/Bio/Restriction /IO.pm:152 STACK toplevel t/RestrictionIO.t:49 -------------------------------------- t/RestrictionIO..............dubious Test returned status 13 (wstat 3328, 0xd00) DIED. FAILED tests 7-14 Failed 8/14 tests, 42.86% okay t/RNAChange..................ok t/RootI......................ok t/RootIO.....................ok t/RootStorable...............ok t/Scansite...................ok t/scf........................ok t/SearchDist.................ok t/SearchIO...................XML::Parser::PerlSAX or HTML::Entities not loaded. This means SearchIO::blastxml test cannot be executed. Skipping t/SearchIO...................ok t/Seq........................ok t/SeqAnalysisParser..........ok t/SeqBuilder.................ok t/SeqDiff....................ok t/SeqFeatCollection..........ok t/SeqFeature.................Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. 
t/SeqFeature.................ok t/seqfeaturePrimer...........ok t/SeqIO......................ok 3/235 skipped: various reasons t/SeqPattern.................ok t/SeqStats...................ok t/SequenceFamily.............ok t/sequencetrace..............ok t/SeqUtils...................ok t/seqwithquality.............ok t/SeqWords...................ok t/Sigcleave..................ok t/Sim4.......................ok t/SimilarityPair.............ok t/SimpleAlign................ok t/simpleGOparser............. Graph.pm doesn't seem to be installed on this system -- the GO Parser needs it.. . t/simpleGOparser.............ok t/sirna......................ok t/SiteMatrix.................ok t/SNP........................ok t/Sopma......................ok t/Species....................ok t/splicedseq.................ok t/StandAloneBlast............ok t/StructIO...................ok t/Structure..................ok t/Swiss......................ok 1/5 ------------- EXCEPTION ------------- MSG: Could not open >test.swiss: Permission denied STACK Bio::Root::IO::_initialize_io /home/shafiq/source/bio/blib/lib/Bio/Root/IO .pm:273 STACK Bio::SeqIO::_initialize /home/shafiq/source/bio/blib/lib/Bio/SeqIO.pm:447 STACK Bio::SeqIO::swiss::_initialize /home/shafiq/source/bio/blib/lib/Bio/SeqIO/ swiss.pm:131 STACK Bio::SeqIO::new /home/shafiq/source/bio/blib/lib/Bio/SeqIO.pm:358 STACK Bio::SeqIO::new /home/shafiq/source/bio/blib/lib/Bio/SeqIO.pm:378 STACK toplevel t/Swiss.t:49 -------------------------------------- t/Swiss......................dubious Test returned status 13 (wstat 3328, 0xd00) DIED. FAILED tests 3-5 Failed 3/5 tests, 40.00% okay t/Symbol.....................ok t/Taxonomy...................ok t/Tempfile...................ok t/Term.......................Useless localization of scalar assignment at /home/ shafiq/source/bio/blib/lib/Bio/Root/Object.pm line 699. 
t/Term.......................ok t/Tools......................ok t/Tree.......................ok t/TreeIO.....................ok t/trim.......................ok t/tutorial...................ok 3/21Use of uninitialized value $au in substituti on (s///) at Bio/SeqIO/swiss.pm line 855, line 45. t/tutorial...................ok 17/21Bio::SeqIO: game cannot be found Exception ------------- EXCEPTION ------------- MSG: Failed to load module Bio::SeqIO::game. Can't locate XML/Parser/PerlSAX.pm in @INC (@INC contains: . t /home/shafiq/source/bio/blib/lib /home/shafiq/source /bio/blib/arch /usr/lib/perl5/5.10.0/i586-linux-thread-multi /usr/lib/perl5/5.10 .0 /usr/lib/perl5/site_perl/5.10.0/i586-linux-thread-multi /usr/lib/perl5/site_p erl/5.10.0 /usr/lib/perl5/vendor_perl/5.10.0/i586-linux-thread-multi /usr/lib/pe rl5/vendor_perl/5.10.0 /usr/lib/perl5/vendor_perl) at Bio/SeqIO/game/gameSubs.pm line 70. BEGIN failed--compilation aborted at Bio/SeqIO/game/gameSubs.pm line 70. Compilation failed in require at Bio/SeqIO/game/gameHandler.pm line 62. BEGIN failed--compilation aborted at Bio/SeqIO/game/gameHandler.pm line 62. Compilation failed in require at Bio/SeqIO/game.pm line 76. BEGIN failed--compilation aborted at Bio/SeqIO/game.pm line 76. Compilation failed in require at /home/shafiq/source/bio/blib/lib/Bio/Root/Root. pm line 394. 
STACK Bio::Root::Root::_load_module /home/shafiq/source/bio/blib/lib/Bio/Root/Ro ot.pm:396 STACK (eval) /home/shafiq/source/bio/blib/lib/Bio/SeqIO.pm:549 STACK Bio::SeqIO::_load_format_module /home/shafiq/source/bio/blib/lib/Bio/SeqIO .pm:548 STACK Bio::SeqIO::new /home/shafiq/source/bio/blib/lib/Bio/SeqIO.pm:377 STACK (eval) /home/shafiq/source/bio/blib/lib/bptutorial.pl:4027 STACK main::__ANON__ /home/shafiq/source/bio/blib/lib/bptutorial.pl:4025 STACK main::run_examples /home/shafiq/source/bio/blib/lib/bptutorial.pl:4152 STACK toplevel t/tutorial.t:23 -------------------------------------- For more information about the SeqIO system please see the SeqIO docs. This includes ways of checking for formats at compile time, not run time Can't call method "next_seq" on an undefined value at /home/shafiq/source/bio/bl ib/lib/bptutorial.pl line 4035. t/tutorial...................dubious Test returned status 2 (wstat 512, 0x200) DIED. FAILED tests 19-21 Failed 3/21 tests, 85.71% okay t/UCSCParsers................ok t/Unflattener................ok t/Unflattener2...............ok t/UniGene....................ok t/Variation_IO...............ok 3/25 The XML-format conversion requires the CPAN modules XML::Twig, XML::Writer, and IO::String to be installed on your system, which they probably aren't. Skipping these tests. t/Variation_IO...............ok t/WABA.......................ok t/XEMBL_DB...................SOAP::Lite and/or XML::DOM not installed. This mean s that Bio::DB::XEMBL module is not usable. Skipping tests. t/XEMBL_DB...................ok Failed Test Stat Wstat Total Fail List of Failed ------------------------------------------------------------------------------- t/BioFetch_DB.t 27 4 8 20-21 27 t/DB.t 78 2 30-31 t/EMBL_DB.t 15 3 6 13-14 t/GuessSeqFormat.t 46 1 11 t/InterProParser.t 2 512 47 1 2 t/RestrictionIO.t 13 3328 14 16 7-14 t/Swiss.t 13 3328 5 6 3-5 t/tutorial.t 2 512 21 6 19-21 11 subtests skipped. Failed 8/179 test scripts. 25/8122 subtests failed. 
Files=179, Tests=8122, 119 wallclock secs (67.10 cusr + 3.36 csys = 70.46 CPU)
Failed 8/179 test programs. 25/8122 subtests failed.
make: *** [test_dynamic] Error 255

From jay at jays.net Wed Nov 26 09:28:47 2008
From: jay at jays.net (Jay Hannah)
Date: Wed, 26 Nov 2008 08:28:47 -0600
Subject: [Bioperl-l] MailMan delay
Message-ID: <1882DF64-0DCA-4956-8E84-B207D3845B4F@jays.net>

Hmm... It's been 1 hour, 23 minutes and counting since I sent my
last post and I haven't received my copy yet, nor is it in the
MailMan archive.

What is the typical MailMan lag for this server nowadays?

j
IRC rules! Email drools! -grin-
http://www.bioperl.org/wiki/Irc

From cjfields at illinois.edu Wed Nov 26 09:29:38 2008
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 26 Nov 2008 08:29:38 -0600
Subject: [Bioperl-l] Update of bioperl-ext for modern environment
In-Reply-To: <20081120130451.307240@gmx.net>
References: <20081120130451.307240@gmx.net>
Message-ID:

Thomas,

Just to keep you updated, I'm checking the attached patch and script
against main trunk code and will post back how it works out.

chris

On Nov 20, 2008, at 7:04 AM, Thomas Jahns wrote:

> Hello everyone,
>
> I hope I didn't duplicate anyone's work, but I couldn't find
> anything on this in the archives and so I patched bioperl-ext-1.5.1
> to work with
>
> - bioperl-1.5.2_102
> - staden io_lib 1.11.4
>
> and not crash perl.
>
> Please see attached patch, I hope someone reading here can integrate
> it with the repository.
>
> There is one necessary externally visible change: instead of
> specifying /foo/include/io_lib for the headers of the staden
> package, one now needs to specify /foo/include, because read.h
> includes other files with io_lib prefix.
>
> I hope I removed the double-free bug in the right place, if the
> free'ing of a pointer passed into the function pgreen was
> intentional, another strategy will be needed.
>
> Also I found make clean to be dysfunctional, but I don't know enough
> about MakeMaker to fix that, thus for recompiles I used a script for
> cleanup (also attached).
>
>
> Greetings,
> Thomas Jahns
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at illinois.edu Wed Nov 26 09:31:13 2008
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 26 Nov 2008 08:31:13 -0600
Subject: [Bioperl-l] Bio::Assembly tests failing
In-Reply-To: <492C65C7.2080609@gmail.com>
References: <62D72B16-B9E1-4097-931C-CB05934005A0@illinois.edu> <492C65C7.2080609@gmail.com>
Message-ID: <95C6E6EF-69A9-47A2-89F1-B53FCFB5DAFA@illinois.edu>

Florent,

In the long run we're probably going to split some tests up so they're
more module-specific (or parser-specific). This is really a preemptive
measure for the eventual bioperl split post-1.6, but we might as well
tackle some of this now.

Thanks for taking care of that!

chris

On Nov 25, 2008, at 2:53 PM, Florent Angly wrote:

> Hi Chris,
>
> I didn't realize that there were Bio::Assembly related tests outside
> of the t/Assembly.t file. The story is that I changed the ACE parser
> so that it only gets its information from the ACE file (including
> singlets). This was not the case before, so it broke the old, not
> updated t/singlet.t tests. So what I did is to update and move the
> tests that t/singlet.t did to t/Assembly.t and removed t/singlet.t.
> All the Bio::Assembly tests should pass now.
> Cheers,
>
> Florent
>
>
> Chris Fields wrote:
>> I am getting tests failing on svn trunk (bioperl-live) that appear
>> related to Bio::Assembly changes. Florent, can you take a look at
>> these?
>> >> chris >> >> 1..4 >> ok 1 - use Bio::Assembly::IO; >> ok 2 - Testing to see if the first contig is a Contig isa >> Bio::Assembly::Contig >> not ok 3 - Testing to see if the first singlet is a Singlet isa >> Bio::Assembly::Singlet >> not ok 4 - Testing to see if the Singlet ISA Contig isa >> Bio::Assembly::Contig >> >> # Failed test 'Testing to see if the first singlet is a Singlet >> isa Bio::Assembly::Singlet' >> # at t/singlet.t line 25. >> # Testing to see if the first singlet is a Singlet isn't defined >> >> # Failed test 'Testing to see if the Singlet ISA Contig isa >> Bio::Assembly::Contig' >> # at t/singlet.t line 27. >> # Testing to see if the Singlet ISA Contig isn't defined >> # Looks like you failed 2 tests of 4. >> Dubious, test returned 2 (wstat 512, 0x200) >> Failed 2/4 subtests >> >> Test Summary Report >> ------------------- >> t/singlet.t (Wstat: 512 Tests: 4 Failed: 2) >> Failed tests: 3-4 >> Non-zero exit status: 2 >> Files=1, Tests=4, 1 wallclock secs ( 0.01 usr 0.01 sys + 0.26 >> cusr 0.04 csys = 0.32 CPU) >> Result: FAIL >> Failed 1/1 test programs. 2/4 subtests failed. >> >> > Christopher Fields Postdoctoral Researcher Lab of Dr. Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From jay at jays.net Wed Nov 26 09:32:35 2008 From: jay at jays.net (Jay Hannah) Date: Wed, 26 Nov 2008 08:32:35 -0600 Subject: [Bioperl-l] MailMan delay In-Reply-To: <1882DF64-0DCA-4956-8E84-B207D3845B4F@jays.net> References: <1882DF64-0DCA-4956-8E84-B207D3845B4F@jays.net> Message-ID: Hmm... Nevermind. That one came right through. Only my constructive email disappeared into the void. Go figure. :) j On Nov 26, 2008, at 8:28 AM, Jay Hannah wrote: > Hmm... It's been 1 hour, 23 minutes and counting since I sent my > last post and I haven't received my copy yet, nor is it in the > MailMan archive. > > What is the typical MailMan lag for this server nowadays? > > j > IRC rules! Email drools! 
-grin- > http://www.bioperl.org/wiki/Irc > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From David.Messina at sbc.su.se Wed Nov 26 09:39:11 2008 From: David.Messina at sbc.su.se (Dave Messina) Date: Wed, 26 Nov 2008 15:39:11 +0100 Subject: [Bioperl-l] BioPerl Installation on Suse 11.0 Failed ? In-Reply-To: <59840f5f0811260552o33d6f586qbf7fbb671770a09@mail.gmail.com> References: <59840f5f0811260552o33d6f586qbf7fbb671770a09@mail.gmail.com> Message-ID: <628aabb70811260639q6dd65ebq899bdbdeb10c8000@mail.gmail.com> Hi, 1.4 is way out of date -- unfortunately that's the version that comes up by default in CPAN. We recommend that you use the latest nightly build instead, which you can get here: http://www.bioperl.org/DIST/nightly_builds/ Don't worry, the Build script will still use CPAN to fetch any dependencies it needs. Dave From David.Messina at sbc.su.se Wed Nov 26 09:46:16 2008 From: David.Messina at sbc.su.se (Dave Messina) Date: Wed, 26 Nov 2008 15:46:16 +0100 Subject: [Bioperl-l] writing bioperl modules In-Reply-To: <006001c94fba$625a5b90$0301a8c0@your83dafb4529> References: <006001c94fba$625a5b90$0301a8c0@your83dafb4529> Message-ID: <628aabb70811260646m4f06595evf2921ea06db2e1dc@mail.gmail.com> Hey Tim, Thanks for your email. Absolutely we're interested in any contributions you're willing to make. While there is a lot of functionality in BioPerl, by no means would I consider it complete. Take a look at: http://www.bioperl.org/wiki/Project_priority_list and http://www.bioperl.org/wiki/Becoming_a_developer for starters. Also, we're in the middle of trying to push out version 1.6 of BioPerl, and we could certainly use your help with that. 
See this post for details: http://bioperl.org/pipermail/bioperl-l/2008-November/028513.html Finally, if there's something you are personally interested in adding to BioPerl, maybe something that's relevant to your work, please feel free to propose it here on the list. Dave From cjfields at illinois.edu Wed Nov 26 09:53:39 2008 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 26 Nov 2008 08:53:39 -0600 Subject: [Bioperl-l] MailMan delay In-Reply-To: <1882DF64-0DCA-4956-8E84-B207D3845B4F@jays.net> References: <1882DF64-0DCA-4956-8E84-B207D3845B4F@jays.net> Message-ID: <75740E94-316D-495D-9046-180D6501E63D@illinois.edu> my theory: time warp, probably from the LHC. Oh wait, that's offline, my bad. chris On Nov 26, 2008, at 8:28 AM, Jay Hannah wrote: > Hmm... It's been 1 hour, 23 minutes and counting since I sent my > last post and I haven't received my copy yet, nor is it in the > MailMan archive. > > What is the typical MailMan lag for this server nowadays? > > j > IRC rules! Email drools! -grin- > http://www.bioperl.org/wiki/Irc > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jay at jays.net Wed Nov 26 08:03:34 2008 From: jay at jays.net (Jay Hannah) Date: Wed, 26 Nov 2008 07:03:34 -0600 Subject: [Bioperl-l] writing bioperl modules In-Reply-To: <006001c94fba$625a5b90$0301a8c0@your83dafb4529> References: <006001c94fba$625a5b90$0301a8c0@your83dafb4529> Message-ID: <876F0DAA-5A28-4925-B607-9D4A006D2F3E@jays.net> On Nov 26, 2008, at 5:30 AM, Tim wrote: > I'm just curious to know if bioperl is considered complete or are > there still projects that programmers can become involved in. I think everything is open for enhancement. "Patches welcome." :) > I notice that for other bio-languages there are on-going hackathons > were programmers can meet and work on a problem. Do these exist in > bio-perl or is the code base now mature? 
"Mature" is probably a good label for most of the code base. I'm not aware of any face-to-face hackathon events. Joining us in our quiet IRC channel http://www.bioperl.org/wiki/Irc and discussing your ideas on this mailing list should be pretty interactive. The 1.6 release push is underway. You could help with that: http://www.bioperl.org/wiki/Talk:Release_1.6 HTH, Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at illinois.edu Wed Nov 26 10:10:32 2008 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 26 Nov 2008 09:10:32 -0600 Subject: [Bioperl-l] writing bioperl modules In-Reply-To: <628aabb70811260646m4f06595evf2921ea06db2e1dc@mail.gmail.com> References: <006001c94fba$625a5b90$0301a8c0@your83dafb4529> <628aabb70811260646m4f06595evf2921ea06db2e1dc@mail.gmail.com> Message-ID: Tim, I would suggest there are several areas which also could use a new set of eyes to check things over. The more devs the better. For instance, Sendu Bala led the team for the last release and largely refactored Taxonomy/Species, which has worked out wonderfully. He also introduced a generic lazy parsing interface. Personally, I would like to see an R/BioConductor-BioPerl bridging system (which would be itself a separate project). I'm working on lotsa microarray and will probably be looking at Solexa or 454 data soon. If only we could get a halfway decent interface to R (and a buggy RSPerl does not count, as it's not on the CPAN and is poorly supported). chris On Nov 26, 2008, at 8:46 AM, Dave Messina wrote: > Hey Tim, > Thanks for your email. > > Absolutely we're interested in any contributions you're willing to > make. > While there is a lot of functionality in BioPerl, by no means would I > consider it complete. > > Take a look at: > http://www.bioperl.org/wiki/Project_priority_list > > and > > http://www.bioperl.org/wiki/Becoming_a_developer > > for starters. 
> > Also, we're in the middle of trying to push out version 1.6 of > BioPerl, and > we could certainly use your help with that. See this post for details: > http://bioperl.org/pipermail/bioperl-l/2008-November/028513.html > > Finally, if there's something you are personally interested in > adding to > BioPerl, maybe something that's relevant to your work, please feel > free to > propose it here on the list. > > > Dave > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From cjfields at illinois.edu Wed Nov 26 10:34:37 2008 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 26 Nov 2008 09:34:37 -0600 Subject: [Bioperl-l] Update of bioperl-ext for modern environment In-Reply-To: <20081120130451.307240@gmx.net> References: <20081120130451.307240@gmx.net> Message-ID: <169DE429-A1F7-4EF4-B150-8500DE2B2AFB@illinois.edu> Thomas, We have partially accepted your patch (those specific for Ext::Align). The Bio::SeqIO::staden::read patches didn't work, primarily b/c Bio::SeqIO::staden::read in subversion has already switched over a while ago to io_lib 1.11 compatibility and is now XS (no longer Inline::C). However, since io_lib itself no longer carries abi/ztf by default (you need the full staden package) those modules no longer appear to work. There are still some significant bugs that need to be worked out within bioperl-ext, particularly re: AlignStats, the alignment algorithm, etc. The problem is the code has suffered some bit-rot over the years and isn't particularly well-supported; most bioperl-ext related bugs in Bugzilla have been around for a very long time. 
What might be an alternative avenue to pursue is the BioLib initiative, which aims to make swig-based libraries available for all Bio* languages (and will likely have better support). http://biolib.open-bio.org/wiki/Main_Page chris On Nov 20, 2008, at 7:04 AM, Thomas Jahns wrote: > Hello everyone, > > I hope I didn't duplicate anyone's work, but I couldn't find > anything on this in the archives and so I patched bioperl-ext-1.5.1 > to work with > > - bioperl-1.5.2_102 > - staden io_lib 1.11.4 > > and not crash perl. > > Please see attached patch, I hope someone reading here can integrate > it with the repository. > > There is one necessary externally visible change: instead of > specifiyng /foo/include/io_lib for the headers of the staden > package, one now needs to specify /foo/include, because read.h > includes other files with io_lib prefix. > > I hope I removed the double-free bug in the right place, if the > free'ing of a pointer passed into the function pgreen was > intentional, another strategy will be needed. > > Also I found make clean to be dysfunctional, but I don't know enough > about MakeMaker to fix that, thus for recompiles I used a script for > cleanup (also attached). > > > Greetings, > Thomas Jahns > -- > Sensationsangebot nur bis 30.11: GMX FreeDSL - Telefonanschluss + DSL > f?r nur 16,37 Euro/mtl.!* http://dsl.gmx.de/? > ac=OM.AD.PD003K11308T4569a > ext.sh>_______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Marie-Claude Hofmann College of Veterinary Medicine University of Illinois Urbana-Champaign From cjfields at illinois.edu Wed Nov 26 11:39:53 2008 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 26 Nov 2008 10:39:53 -0600 Subject: [Bioperl-l] writing bioperl modules In-Reply-To: <876F0DAA-5A28-4925-B607-9D4A006D2F3E@jays.net> References: <006001c94fba$625a5b90$0301a8c0@your83dafb4529> <876F0DAA-5A28-4925-B607-9D4A006D2F3E@jays.net> Message-ID: <9F4BC1D5-D0C5-4A03-AF96-744B2E2E5061@illinois.edu> On Nov 26, 2008, at 7:03 AM, Jay Hannah wrote: > On Nov 26, 2008, at 5:30 AM, Tim wrote: >> I'm just curious to know if bioperl is considered complete or are >> there still projects that programmers can become involved in. > > I think everything is open for enhancement. "Patches welcome." :) > >> I notice that for other bio-languages there are on-going hackathons >> were programmers can meet and work on a problem. Do these exist in >> bio-perl or is the code base now mature? > > "Mature" is probably a good label for most of the code base. I'm not > aware of any face-to-face hackathon events. Joining us in our quiet > IRC channel > > http://www.bioperl.org/wiki/Irc > > and discussing your ideas on this mailing list should be pretty > interactive. > > The 1.6 release push is underway. You could help with that: > > http://www.bioperl.org/wiki/Talk:Release_1.6 The more comments the better! Haven't heard much, but I did have a ton of items (so it may take time to parse through). 
> HTH, > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah -c From alperyilmaz at gmail.com Wed Nov 26 12:15:44 2008 From: alperyilmaz at gmail.com (Alper Yilmaz) Date: Wed, 26 Nov 2008 12:15:44 -0500 Subject: [Bioperl-l] quick pairwise alignment In-Reply-To: <628aabb70811260231p24281dfcq9903b46d84b3c8c2@mail.gmail.com> References: <628aabb70811251046u3e9f2711pdb6c90afdcb6f94c@mail.gmail.com> <628aabb70811260231p24281dfcq9903b46d84b3c8c2@mail.gmail.com> Message-ID: Hi Dave, Thank you so much. It works perfect. I was working on Bio::Tools::Run::Alignment::Clustalw , I succeeded to make clustalw run but even though I select -quiet option, the output of the program was directed to my HTML code, instead of being assigned to $aln variable. I was losing hope but your solution saved the day. StandAloneBlast works very nicely and I'm glad viewing the alignment is possible. Out of curiosity, is it possible to see the "original" sequence that is used for BLAST in the alignment? I tried using $hsp_obj->str_seq('query') or $hsp_obj->str_seq('sbjct') but they print out the matched sequence only. I am interested in seeing if the match is at N-terminus or C-terminus (visually thru alignment, not by numbers). In other words, is it possible to get the blast alignment look like clustalw alignment? thanks, alper On Wed, Nov 26, 2008 at 5:31 AM, Dave Messina wrote: > Hi Alper, > Please remember to 'reply all' to keep this conversation on the bioperl > list. > > It's hard to be sure with just fragments of your code, but it seems to be > working fine for me. Below I've attached a sample script showing how you can > run bl2seq. > > If you aren't already, you probably will want to download a nightly build > of bioperl-live and bioperl-run to eliminate any problems stemming from > outdated code. 
> http://www.bioperl.org/DIST/nightly_builds/
>
> Also, for future reference, instead of the BioPerl tutorial, I recommend
> the HOWTOs on the website; for this question, this one should be helpful:
> http://www.bioperl.org/wiki/HOWTO:Beginners
>
> You tell StandAloneBlast which program to use (blastp, blastn, etc) using
> the -program parameter. You'll see it in the example script below.
>
> Dave
>
>
> ---------------example code---------------
> #!/usr/bin/perl
>
> use strict;
> use warnings;
> use Bio::Seq;
> use Bio::Tools::Run::StandAloneBlast;
>
> # lots of params can be set here, basically anything that you would
> # normally be able to pass to blast on the command line. Type
> #   perldoc Bio::Tools::Run::StandAloneBlast
> # on the command line to see details.
> my @params = (program => 'blastp');
>
> my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
>
> # make some fake data
> my $input1 = Bio::Seq->new(-id  => "testquery1",
>                            -seq => "ACTADDEEQQPPTCADEEQQQVVGG");
> my $input2 = Bio::Seq->new(-id  => "testquery2",
>                            -seq => "ACTADDEMMMMMMMDEEQQQVVGG");
>
> # execute the blast command with this line
> my $blast_report = $factory->bl2seq ($input1, $input2);
>
> # just one result in a bl2seq report
> my $result_obj = $blast_report->next_result;
>
> # likewise just one hit
> my $hit_obj = $result_obj->next_hit;
>
> # there may be >1 hsp, but I'm only looking at the first one in this
> # example
> my $hsp_obj = $hit_obj->next_hsp;
>
> # take a quick look at the alignment
> print $hsp_obj->query_string, "\n",
>       $hsp_obj->homology_string, "\n",
>       $hsp_obj->hit_string, "\n";
>
>
> ---------------end example code-------------
>
>
>
> On Tue, Nov 25, 2008 at 20:00, Alper Yilmaz wrote:
>
>> Hi Dave,
>> bl2seq works okay. I tried the following outside of bioperl and it
>> successfully generates the output.
>> Btw, how do I tell StandAloneBlast, bl2seq function which program to use
>> (blastn, blastp, etc)?
>>
>> bl2seq -p blastn -j seq1.fa -i seq2.fa -o bl2seq.out
>>
>> thanks,
>> alper
>>
>>

From cjfields at illinois.edu Wed Nov 26 12:29:58 2008
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 26 Nov 2008 11:29:58 -0600
Subject: [Bioperl-l] Bioperl 1.6 and Bio::Graphics
Message-ID: <2E7089AA-50C0-45E9-A9E6-F80BD9876B99@illinois.edu>

Lincoln, Scott, (and anyone else interested),

One of the items on the list for the next bioperl release is whether
to split off Bio::Graphics before 1.6 rolls out:

http://www.bioperl.org/wiki/Talk:Release_1.6#Bio::Graphics_and_Splitting_BioPerl

My general reasoning is that the Gbrowse devs can start releasing to
CPAN on their own (bio-graphics specific) schedule and not be tied to
the core release schedule. bio-graphics would just be tied in to a
minimally compatible CPAN bioperl-core release. The link above
elaborates more on the idea as well as potential issues (versioning,
etc).

We don't have to do it for this release, but my concern is if it
doesn't happen prior to 1.6 it will have to wait until 1.7 (date
unknown), thus further impeding a Gbrowse 2.0. Therefore I think it's
worth delaying 1.6 a couple weeks to see how this goes. Any comments
over the next week or two would be appreciated (gotta factor in the
holiday break!).

chris

(I've set the reply-to for the bioperl list, so hopefully we won't get
double posting to both lists for responses)

From cain.cshl at gmail.com Wed Nov 26 13:37:46 2008
From: cain.cshl at gmail.com (Scott Cain)
Date: Wed, 26 Nov 2008 13:37:46 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] Bioperl 1.6 and Bio::Graphics
In-Reply-To: <2E7089AA-50C0-45E9-A9E6-F80BD9876B99@illinois.edu>
References: <2E7089AA-50C0-45E9-A9E6-F80BD9876B99@illinois.edu>
Message-ID: <536f21b00811261037s4cc34377u6c45c9aef89159b2@mail.gmail.com>

Hi Chris,

While the decision ultimately rests with Lincoln (as, presumably, will
most of the work :-) I'm not sure I agree that the split needs to
happen before the 1.6 release.
Consider the case where GBrowse 1.70
requires a newer version of Bio::Graphics than is available in
BioPerl. We would put a version requirement in the Makefile.PL for
Bio::Graphics::Panel to be some number greater than the current
BioPerl release and put it on cpan. Users would then do the cpan
shell shuffle, and get the right version of Bio::Graphics, and if they
don't already have BioPerl, they'll get that too (with an older
version of Bio::Graphics that will be immediately overwritten). Is
there a flaw (or horrible user experience) in my thinking?

Scott

On Wed, Nov 26, 2008 at 12:29 PM, Chris Fields wrote:
> Lincoln, Scott, (and anyone else interested),
>
> One of the items on the list for the next bioperl release is whether
> to split off Bio::Graphics before 1.6 rolls out:
>
> http://www.bioperl.org/wiki/Talk:Release_1.6#Bio::Graphics_and_Splitting_BioPerl
>
> My general reasoning is that the Gbrowse devs can start releasing to
> CPAN on their own (bio-graphics specific) schedule and not be tied to
> the core release schedule. bio-graphics would just be tied in to a
> minimally compatible CPAN bioperl-core release. The link above
> elaborates more on the idea as well as potential issues (versioning,
> etc).
>
> We don't have to do it for this release, but my concern is if it
> doesn't happen prior to 1.6 it will have to wait until 1.7 (date
> unknown), thus further impeding a Gbrowse 2.0. Therefore I think it's
> worth delaying 1.6 a couple weeks to see how this goes. Any comments
> over the next week or two would be appreciated (gotta factor in the
> holiday break!).
> > chris > > (I've set the reply-to for the bioperl list, so hopefully we won't get > double posting to both lists for responses) > > -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research From cjfields at illinois.edu Wed Nov 26 15:25:43 2008 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 26 Nov 2008 14:25:43 -0600 Subject: [Bioperl-l] [Gmod-gbrowse] Bioperl 1.6 and Bio::Graphics In-Reply-To: <536f21b00811261037s4cc34377u6c45c9aef89159b2@mail.gmail.com> References: <2E7089AA-50C0-45E9-A9E6-F80BD9876B99@illinois.edu> <536f21b00811261037s4cc34377u6c45c9aef89159b2@mail.gmail.com> Message-ID: <510199F3-D209-45E9-A03C-5F49E849F44A@illinois.edu> On Nov 26, 2008, at 12:37 PM, Scott Cain wrote: > Hi Chris, > > While the decision ultimately rests with Lincoln (as, presumably, will > most of the work :-) I'm not sure I agree that the split needs to > happen before the 1.6 release. Consider the case where GBrowse 1.70 > requires a newer version of Bio::Graphics than is available in > BioPerl. We would put a version requirement in the Makefile.PL for > Bio::Graphics::Panel to be some number greater than the current > BioPerl release and put it on cpan.
Users would then do the cpan > shell shuffle, and get the right version of Bio::Graphics, and if they > don't already have BioPerl, they'll get that too (with an older > version of Bio::Graphics that will be immediately overwritten). Is > there a flaw (or horrible user experience) in my thinking? > > Scott Sorry Scott, forgot to mention it wouldn't just be Bio::Graphics. Lincoln also mentioned Bio::DB::GFF and Bio::DB::SeqFeature::Store (basically anything Gbrowse-related) would be included, so maybe bio- gbrowse or similar would be a better name. If you like we could postpone a split (less work for the release). A split bio-gbrowse release would just overwrite the older modules as you mention. However, I plan on having regular point releases to CPAN; how do we want to handle Gbrowse-related bug fixes in a point release down the road, after a Bio::Graphics split? Do we stop Bio::Graphics fixes at a certain point after a post-1.6 split so an installation script always finds the latest Bio::Graphics::Panel? Or do we want to merge those fixes into the point release regardless, just in case? chris From David.Messina at sbc.su.se Wed Nov 26 18:53:41 2008 From: David.Messina at sbc.su.se (Dave Messina) Date: Thu, 27 Nov 2008 00:53:41 +0100 Subject: [Bioperl-l] quick pairwise alignment In-Reply-To: References: <628aabb70811251046u3e9f2711pdb6c90afdcb6f94c@mail.gmail.com> <628aabb70811260231p24281dfcq9903b46d84b3c8c2@mail.gmail.com> Message-ID: <628aabb70811261553w2a74f066t946f06df7456e5eb@mail.gmail.com> Hi Alper, Great -- glad I was able to help. is it possible to see the "original" sequence that is used for BLAST in the > alignment? Sorry, I can't remember what ClustalW's output looks like. Would you like to see something like this? 
AGCGTGAGTAGTAGATGAGTAGTAGTGAGATGTGAGTAGTAAAAAGTGATGATGATGAGTAAAAAAAAA
|||||||||
AGCGTGAGT

If I were to do it, I think, since you get the start and stop positions of the hit, I might try to place the hit string in relation to the query string. So in the code from before, instead of printing $hsp_obj->query_string, you could print $input1->seq.

Or, I think if you run blastall outside of BioPerl, there are options for query-anchored output, but you'd have to parse that out of the blast report yourself.

But those are just off the top of my head. Anyone else reading, please chime in if you've got better ideas.

Dave

From pengyu.ut at gmail.com Wed Nov 26 22:03:54 2008
From: pengyu.ut at gmail.com (Peng Yu)
Date: Wed, 26 Nov 2008 21:03:54 -0600
Subject: [Bioperl-l] How to write a Bio::DB:Fasta sequence into a file?
Message-ID: <366c6f340811261903s1788c646n87fd5038bf8363b1@mail.gmail.com>

Hi,

The tutorial shows that I can use the following commands to write a gene to a file.

  $seq_object = get_sequence('swiss',"ROA1_HUMAN");
  write_sequence(">roa1.fasta",'fasta',$seq_object);

I'm wondering how to write a gene (id is available) from a Bio::DB::Fasta database to a file.

Thanks,
Peng

From SMarkel at accelrys.com Thu Nov 27 08:03:38 2008
From: SMarkel at accelrys.com (Scott Markel)
Date: Thu, 27 Nov 2008 08:03:38 -0500
Subject: [Bioperl-l] quick pairwise alignment
In-Reply-To: <628aabb70811261553w2a74f066t946f06df7456e5eb@mail.gmail.com>
References: <628aabb70811251046u3e9f2711pdb6c90afdcb6f94c@mail.gmail.com> <628aabb70811260231p24281dfcq9903b46d84b3c8c2@mail.gmail.com> <628aabb70811261553w2a74f066t946f06df7456e5eb@mail.gmail.com>
Message-ID: <1F1240778FB0AF46B4E5A72C44D2C74716909001@exch1-hi.accelrys.net>

Alper,

I don't think the Bio::SearchIO data objects provide functions allowing the retrieval of the query sequence. You should have either the sequence object or a file containing it that you used in the BioPerl BLAST call.
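
A minimal sketch of the query-anchored display Dave suggests earlier in this thread, in plain Perl. It assumes the full query string is already in hand (it has to come from your own sequence object or input file) and that the 1-based query start of the HSP has been read from the report. The sequences, the start value, and the helper name anchor_to_query are illustrative only, and an alignment with gaps in the query would need extra handling.

```perl
use strict;
use warnings;

# Pad the midline and the hit string so that they line up under the
# full-length query at the position where the HSP starts.
# $query_start is 1-based. This only lines up correctly when the
# aligned region has no gaps in the query.
sub anchor_to_query {
    my ($query, $midline, $hit_string, $query_start) = @_;
    my $pad = ' ' x ($query_start - 1);
    return join "\n", $query, $pad . $midline, $pad . $hit_string;
}

# Hypothetical values standing in for what a BLAST report would give.
my $query       = 'AGCGTGAGTAGTAGATGAGTAGTAGTGAGATGTGAGTAGTAAAAAGTGATGATGATGAGTAAAAAAAAA';
my $hit_string  = 'AGCGTGAGT';
my $midline     = '|||||||||';
my $query_start = 1;

print anchor_to_query($query, $midline, $hit_string, $query_start), "\n";
```

With a start of 1 the hit sits flush left under the query; a larger start simply shifts the midline and hit to the right by that many columns minus one.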
If by "original" sequence you mean the database hit, then you'll have to use NCBI's fastacmd (part of the BLAST suite of programs) to retrieve the full-length sequence. Only the portion(s) needed for the pairwise alignment(s) are contained in the BLAST output. Of course, using the sequence identifiers you could retrieve the full-length database hit a number of other ways.

Scott

Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel at accelrys.com
Accelrys (SciTegic R&D)             mobile: +1 858 205 3653
10188 Telesis Court, Suite 100      voice:  +1 858 799 5603
San Diego, CA 92121                 fax:    +1 858 799 5222
USA                                 web:    http://www.accelrys.com

http://www.linkedin.com/in/smarkel
Board of Directors: International Society for Computational Biology
Co-chair: ISCB Publications Committee
Associate Editor: PLoS Computational Biology
Editorial Board: Briefings in Bioinformatics

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Dave Messina
> Sent: Wednesday, 26 November 2008 3:54 PM
> To: Alper Yilmaz
> Cc: BioPerl List
> Subject: Re: [Bioperl-l] quick pairwise alignment
>
> Hi Alper,
> Great -- glad I was able to help.
>
> > is it possible to see the "original" sequence that is used for BLAST in the
> > alignment?
>
> Sorry, I can't remember what ClustalW's output looks like. Would you like to
> see something like this?
>
> AGCGTGAGTAGTAGATGAGTAGTAGTGAGATGTGAGTAGTAAAAAGTGATGATGATGAGTAAAAAAAAA
> |||||||||
> AGCGTGAGT
>
> If I were to do it, I think, since you get the start and stop positions of
> the hit, I might try to place the hit string in relation to the query
> string. So in the code from before, instead of printing
> $hsp_obj->query_string, you could print $input1->seq.
>
> Or, I think if you run blastall outside of BioPerl, there are options for
> query-anchored output, but you'd have to parse that out of the blast report
> yourself.
>
> But those are just off the top of my head.
Anyone else reading, please > chime > in if you've got better ideas. > > Dave > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From David.Messina at sbc.su.se Thu Nov 27 12:05:34 2008 From: David.Messina at sbc.su.se (Dave Messina) Date: Thu, 27 Nov 2008 18:05:34 +0100 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl 1.6 release agenda In-Reply-To: References: Message-ID: <628aabb70811270905n1814dfd6ia9150dd5bb739da4@mail.gmail.com> Chris et al, A few quick notes on the SeqIO test reorg I just committed: - each SeqIO driver now has its own .t file. - six modules have all tests skipped (except for use_ok) because there isn't a sample input file against which to test. Those are: agave alf chadoxml chaos flybase_chadoxml strider - where possible, I renamed the sample file in data/ to be test. (e.g. test.fasta, test.embl). There were several already named this way, and future refactoring will benefit from the consistency. - a few of the modules throw off some warnings (which don't cause any test failures). An example is Bio::Location::Fuzzy, called by Bio::SeqIO::genbank, complaining about not finding valid fuzzy encodings. The modules throwing warnings are: SeqIO::chadoxml SeqIO::genbank (really Location::Fuzzy) SeqIO::phd I don't think it's serious. - I opted to organize all of the .t in a subdirectory, SeqIO. So it's t/SeqIO.t t/SeqIO/abi.t t/SeqIO/ace.t ...etc... Chris, I know you've been doing SearchIO_blast.t. I propose this alternative, hierarchical structure so that the t/ directory would be more manageable, but I will happily rename the SeqIO test files to conform to your standard if you prefer. 
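
As a rough illustration of the one-file-per-driver layout described above, a per-format test file such as t/SeqIO/fasta.t might look something like the sketch below. It uses plain Test::More rather than whatever harness the BioPerl test suite actually uses, and the sample path t/data/test.fasta is an assumption. The parsing tests are skipped when no sample input is found, mirroring the drivers listed above that currently have nothing to test against.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Assumed location of the sample input; adjust to the real data directory.
my $sample = 't/data/test.fasta';

SKIP: {
    # Skip everything if BioPerl itself is not available.
    skip 'Bio::SeqIO is not installed', 3
        unless eval { require Bio::SeqIO; 1 };
    pass('Bio::SeqIO loaded');

    # Without a sample file only the load check above is meaningful.
    skip "no sample input file at $sample", 2 unless -e $sample;

    my $in  = Bio::SeqIO->new(-file => $sample, -format => 'fasta');
    my $seq = $in->next_seq;
    ok(defined $seq, 'parsed a first record');
    ok($seq && length($seq->seq), 'first record has sequence data');
}
```

Keeping one such file per driver means a single format can be exercised on its own, for example by pointing the test runner at t/SeqIO/fasta.t.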
Dave From hlapp at gmx.net Thu Nov 27 12:20:47 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 27 Nov 2008 12:20:47 -0500 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl 1.6 release agenda In-Reply-To: <628aabb70811270905n1814dfd6ia9150dd5bb739da4@mail.gmail.com> References: <628aabb70811270905n1814dfd6ia9150dd5bb739da4@mail.gmail.com> Message-ID: <4FB9E09D-124D-40D2-A8A9-5D84B1A6A3DA@gmx.net> Looks like a great structure! (Sorry I can't help much at this time, so I'm resigning myself to cheering from the sidelines :) -hilmar On Nov 27, 2008, at 12:05 PM, Dave Messina wrote: > Chris et al, > A few quick notes on the SeqIO test reorg I just committed: > > - each SeqIO driver now has its own .t file. > > > > - six modules have all tests skipped (except for use_ok) because > there isn't > a sample input file against which to test. Those are: > > agave > alf > chadoxml > chaos > flybase_chadoxml > strider > > > - where possible, I renamed the sample file in data/ to be > test. > (e.g. test.fasta, test.embl). There were several already named this > way, and > future refactoring will benefit from the consistency. > - a few of the modules throw off some warnings (which don't cause > any test > failures). > > An example is Bio::Location::Fuzzy, called by Bio::SeqIO::genbank, > complaining about not finding valid fuzzy encodings. > > The modules throwing warnings are: > > SeqIO::chadoxml > SeqIO::genbank (really Location::Fuzzy) > SeqIO::phd > > > I don't think it's serious. > > > - I opted to organize all of the .t in a subdirectory, SeqIO. So it's > > t/SeqIO.t > t/SeqIO/abi.t > t/SeqIO/ace.t > ...etc... > > > Chris, I know you've been doing SearchIO_blast.t. > > I propose this alternative, hierarchical structure so that the t/ > directory > would be more manageable, but I will happily rename the SeqIO test > files to > conform to your standard if you prefer. 
> > > Dave > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at illinois.edu Thu Nov 27 13:58:30 2008 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 27 Nov 2008 12:58:30 -0600 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl 1.6 release agenda In-Reply-To: <628aabb70811270905n1814dfd6ia9150dd5bb739da4@mail.gmail.com> References: <628aabb70811270905n1814dfd6ia9150dd5bb739da4@mail.gmail.com> Message-ID: <063C8C1E-54C6-4F46-9EFD-90D80396B88E@illinois.edu> On Nov 27, 2008, at 11:05 AM, Dave Messina wrote: > Chris et al, > > A few quick notes on the SeqIO test reorg I just committed: > > - each SeqIO driver now has its own .t file. > > - six modules have all tests skipped (except for use_ok) because > there isn't a sample input file against which to test. Those are: > > agave > alf > chadoxml > chaos > flybase_chadoxml > strider Possible candidates to be moved out of core into a dev (unless we come up with some tests!). > - where possible, I renamed the sample file in data/ to be > test. (e.g. test.fasta, test.embl). There were several > already named this way, and future refactoring will benefit from the > consistency. I agree. > - a few of the modules throw off some warnings (which don't cause > any test failures). > > An example is Bio::Location::Fuzzy, called by Bio::SeqIO::genbank, > complaining about not finding valid fuzzy encodings. > > The modules throwing warnings are: > > SeqIO::chadoxml > SeqIO::genbank (really Location::Fuzzy) > SeqIO::phd > > I don't think it's serious. Still warrants some investigation, but I agree. We can turn off warning if needed, but I would rather try to find what is triggering them (it's almost always something). 
> - I opted to organize all of the .t in a subdirectory, SeqIO. So it's
>
> t/SeqIO.t
> t/SeqIO/abi.t
> t/SeqIO/ace.t
> ...etc...
>
> Chris, I know you've been doing SearchIO_blast.t.
>
> I propose this alternative, hierarchical structure so that the t/
> directory would be more manageable, but I will happily rename the
> SeqIO test files to conform to your standard if you prefer.

I like! I'll change AlignIO and SearchIO over to this and work on cleaning up test file names. We'll need to update the wiki accordingly.

> Dave

Thanks Dave! Feel free to do more restructuring along these lines; we have the other IO's (Assembly, Restriction, Tree, Feature, and a few others I think), and I could see collecting various modules into a collective structure, such as Ontology, Tools, Graphics, etc.

If you do, just post a note here or on the wiki (I'll do likewise) so we don't work on the same thing. Anyone else wanting to join in please let us know likewise.

chris

From David.Messina at sbc.su.se Thu Nov 27 16:53:18 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 27 Nov 2008 22:53:18 +0100
Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl 1.6 release agenda
In-Reply-To: <063C8C1E-54C6-4F46-9EFD-90D80396B88E@illinois.edu>
References: <628aabb70811270905n1814dfd6ia9150dd5bb739da4@mail.gmail.com> <063C8C1E-54C6-4F46-9EFD-90D80396B88E@illinois.edu>
Message-ID: <628aabb70811271353h66826a91m5dcc0e6b597450ec@mail.gmail.com>

>> agave
>> alf
>> chadoxml
>> chaos
>> flybase_chadoxml
>> strider
>
> Possible candidates to be moved out of core into a dev (unless we come up
> with some tests!).

Agreed.

>> - a few of the modules throw off some warnings (which don't cause any test
>> failures).
>
> Still warrants some investigation, but I agree. We can turn off warning
> if needed, but I would rather try to find what is triggering them (it's
> almost always something).

Sounds good. I'll add those to the TODO list on the wiki.
> I'll change AlignIO and SearchIO over to this and work on cleaning up test > file names. We'll need to update the wiki accordingly. > I've updated the 1.6 release talk page to describe the hierarchical t/ structure I used for SeqIO. > Feel free to do more restructuring along these lines; we have the other > IO's (Assembly, Resstriction, Tree, Feature, and a few others I think), and > I could see collecting various modules into a collective structure, such as > Ontology, Tools, Graphics, etc. > If you do just post a note here or on the wiki (I'll do likewise) so we > don't work on the same thing. Will do. > Anyone else wanting to join in please let us know likewise. > Yes, anyone looking to lend a hand, please speak up! Restructuring these test files does not require much expertise in BioPerl, and it's a good way to get your feet wet. I've gone through the list of outstanding bugs on the release page and tried to do some triage on which release they seemed appropriate for -- see the big TODO table on: http://www.bioperl.org/wiki/Talk:Release_1.6 Chris, I know you've been working hard on going through a lot of bugs in the last few days. Hope I'm not stepping on your toes there; just trying to get something written down on the list for each one of them so we can all discuss and see what's left to be done. 
Dave From cjfields at illinois.edu Thu Nov 27 17:45:12 2008 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 27 Nov 2008 16:45:12 -0600 Subject: [Bioperl-l] [ANNOUNCEMENT] BioPerl 1.6 release agenda In-Reply-To: <628aabb70811271353h66826a91m5dcc0e6b597450ec@mail.gmail.com> References: <628aabb70811270905n1814dfd6ia9150dd5bb739da4@mail.gmail.com> <063C8C1E-54C6-4F46-9EFD-90D80396B88E@illinois.edu> <628aabb70811271353h66826a91m5dcc0e6b597450ec@mail.gmail.com> Message-ID: <9A99DCFA-E7B7-45A8-834A-A0D22761520E@illinois.edu> On Nov 27, 2008, at 3:53 PM, Dave Messina wrote: > ...I've gone through the list of outstanding bugs on the release > page and tried to do some triage on which release they seemed > appropriate for -- see the big TODO table on: > > http://www.bioperl.org/wiki/Talk:Release_1.6 > > Chris, I know you've been working hard on going through a lot of > bugs in the last few days. Hope I'm not stepping on your toes there; > just trying to get something written down on the list for each one > of them so we can all discuss and see what's left to be done. No problem at all! As long as we know what everyone is working on (noted on the wiki or here), shouldn't be a problem, so the more involved the better. Turkey day here so I'll be intermittent, but I'll work on updating the wiki over the next day or two. I'm working on a few more bugs (including that stockholm one) and will update the SearchIO/AlignIO tests along the way. > Dave Thanks again for the help! chris From maj at fortinbras.us Thu Nov 27 21:20:30 2008 From: maj at fortinbras.us (Mark A. 
Jensen)
Date: Thu, 27 Nov 2008 21:20:30 -0500
Subject: [Bioperl-l] general coordinate transformation in LocatableSeq
Message-ID: <317A2349A7BA41FE89016A21E52C0730@NewLife>

Chris and others interested-

Under bug #2689 I've taken a stab at moving in and out of sequence coordinate systems in a general way (aa to/from nt, but also say repeats to/from mnemonics), under the guise of a subseq() method for LocatableSeq. Probably not for this release, but may be helpful for other components as things become (even) more generalized in future. This subseq also handles frameshifts. Would be grateful for comments (after the rush)-

cheers
MAJ

From David.Messina at sbc.su.se Fri Nov 28 03:33:39 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 28 Nov 2008 09:33:39 +0100
Subject: [Bioperl-l] general coordinate transformation in LocatableSeq
In-Reply-To: <317A2349A7BA41FE89016A21E52C0730@NewLife>
References: <317A2349A7BA41FE89016A21E52C0730@NewLife>
Message-ID: <628aabb70811280033l5a5cee4dib0ad6566694353ad@mail.gmail.com>

I wonder if this would be useful for sorting out the subsequence issues that we've seen in Bio::Search::SearchUtils::tile_hsps, Bio::Search::Hit::GenericHit::length_aln, et al. (e.g. Bug 2476).

Not for release 1.6, but a 1.6.x point release perhaps.

Dave

From David.Messina at sbc.su.se Fri Nov 28 07:11:13 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 28 Nov 2008 13:11:13 +0100
Subject: [Bioperl-l] verbose turned on by PrimarySeq->new()
Message-ID: <628aabb70811280411l62c4416v74ce8a0f19948842@mail.gmail.com>

Hey everyone,

In trying to figure out why the SeqIO::largefasta tests were being verbose about tempfile cleanup, I discovered that the PrimarySeq constructor turns on verbose, which gets inherited by LargePrimarySeq, which in turn causes RootIO to overshare about the temp files (see PrimarySeq.pm line 164).

The verbose flag was put in when Mira Han added -nowarnonempty.
Turning verbose off doesn't seem to cause any tests to fail, so it looks like it might have been turned on for initial debugging purposes and never removed. Mira (or anyone else in the know), could you comment on this? Dave From heikki.lehvaslaiho at gmail.com Fri Nov 28 07:49:46 2008 From: heikki.lehvaslaiho at gmail.com (Heikki Lehvaslaiho) Date: Fri, 28 Nov 2008 14:49:46 +0200 Subject: [Bioperl-l] verbose turned on by PrimarySeq->new() In-Reply-To: <628aabb70811280411l62c4416v74ce8a0f19948842@mail.gmail.com> References: <628aabb70811280411l62c4416v74ce8a0f19948842@mail.gmail.com> Message-ID: Dave, Could this be the same reason LargeLocatableSeq was printing out these same warnings? I just hacked it silent in its DESTROY method a few days ago. Maybe that can be revoked now? I can not see any reasons why module code itself should increase verbosity level, so go and fix it. -Heikki bioperl-live/trunk/Bio/Seq/LargeLocatableSeq.pm 2008/11/28 Dave Messina : > Hey everyone, > In trying to figure out why the SeqIO::largefasta tests were being verbose > about tempfile cleanup, I discovered that the PrimarySeq constructor turns > on verbose, which gets inherited by LargePrimarySeq, which in turn cause > RootIO to overshare about the temp files. (see PrimarySeq.pm line 164). > > The verbose flag was put in when Mira Han added -nowarnonempty. Turning > verbose off doesn't seem to cause any tests to fail, so it looks like it > might have been turned on for initial debugging purposes and never removed. > > Mira (or anyone else in the know), could you comment on this? 
> > > Dave > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- -Heikki Heikki Lehvaslaiho - heikki lehvaslaiho gmail com http://kapkaupunki.blogspot.com/ From David.Messina at sbc.su.se Fri Nov 28 08:24:43 2008 From: David.Messina at sbc.su.se (Dave Messina) Date: Fri, 28 Nov 2008 14:24:43 +0100 Subject: [Bioperl-l] verbose turned on by PrimarySeq->new() In-Reply-To: References: <628aabb70811280411l62c4416v74ce8a0f19948842@mail.gmail.com> Message-ID: <628aabb70811280524h59cab0f4vfd33c4b910728e58@mail.gmail.com> > > Could this be the same reason LargeLocatableSeq was printing out these > same warnings? I just hacked it silent in its DESTROY method a few days ago. Maybe > that can be revoked now? > I think so; I reverted your changes, reran t/AlignIO/largemultifasta.t and saw no warnings. > I can not see any reasons why module code itself should increase > verbosity level, so go and fix it. > Thanks, done. Dave From maj at fortinbras.us Fri Nov 28 10:34:26 2008 From: maj at fortinbras.us (Mark A. Jensen) Date: Fri, 28 Nov 2008 10:34:26 -0500 Subject: [Bioperl-l] general coordinate transformation in LocatableSeq In-Reply-To: <628aabb70811280033l5a5cee4dib0ad6566694353ad@mail.gmail.com> References: <317A2349A7BA41FE89016A21E52C0730@NewLife> <628aabb70811280033l5a5cee4dib0ad6566694353ad@mail.gmail.com> Message-ID: It's likely it could help, since one of Bert's tweaks was to fix a wrong transformation. I will dig into it. ----- Original Message ----- From: Dave Messina To: Mark A. Jensen Cc: BioPerl List ; Chris Fields Sent: Friday, November 28, 2008 3:33 AM Subject: Re: [Bioperl-l] general coordinate transformation in LocatableSeq I wonder if this would be useful for sorting out the subsequence issues that we've seen in Bio::Search::SearchUtils::tile_hsps, Bio::Search::Hit::GenericHit::length_aln, et al. (e.g. Bug 2476 ). 
Not for release 1.6, but a 1.6.x point release perhaps. Dave From maj at fortinbras.us Fri Nov 28 16:54:56 2008 From: maj at fortinbras.us (Mark A. Jensen) Date: Fri, 28 Nov 2008 16:54:56 -0500 Subject: [Bioperl-l] general coordinate transformation in LocatableSeq In-Reply-To: <628aabb70811280033l5a5cee4dib0ad6566694353ad@mail.gmail.com> References: <317A2349A7BA41FE89016A21E52C0730@NewLife> <628aabb70811280033l5a5cee4dib0ad6566694353ad@mail.gmail.com> Message-ID: <4C99E476D3A84EE9ADC256BF092A9BCA@NewLife> Dave- this was an excellent trial run. Have uploaded patches to #2476 that seem to eliminate the problem there. Will need to explore other issues (length_aln, particularly) further. MAJ ----- Original Message ----- From: "Dave Messina" To: "Mark A. Jensen" Cc: "BioPerl List" ; "Chris Fields" Sent: Friday, November 28, 2008 3:33 AM Subject: Re: [Bioperl-l] general coordinate transformation in LocatableSeq >I wonder if this would be useful for sorting out the subsequence >issues that > we've seen > in Bio::Search::SearchUtils::tile_hsps, > Bio::Search::Hit::GenericHit::length_aln, > et al. (e.g. Bug 2476 > ). > > Not for release 1.6, but a 1.6.x point release perhaps. > > > Dave > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From David.Messina at sbc.su.se Fri Nov 28 18:28:13 2008 From: David.Messina at sbc.su.se (Dave Messina) Date: Sat, 29 Nov 2008 00:28:13 +0100 Subject: [Bioperl-l] general coordinate transformation in LocatableSeq In-Reply-To: <4C99E476D3A84EE9ADC256BF092A9BCA@NewLife> References: <317A2349A7BA41FE89016A21E52C0730@NewLife> <628aabb70811280033l5a5cee4dib0ad6566694353ad@mail.gmail.com> <4C99E476D3A84EE9ADC256BF092A9BCA@NewLife> Message-ID: <628aabb70811281528p1e563c34tbe2dd4cee721fd4b@mail.gmail.com> Mark, you rock! I'll take a look at your patches in detail tomorrow and check them in (unless Chris beats me to it). 
Thanks so much for taking the time.

Dave

From maj at fortinbras.us Sat Nov 29 00:06:09 2008
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 29 Nov 2008 00:06:09 -0500
Subject: [Bioperl-l] undefined sub-sequence with a single base
Message-ID: <8ABE9C6DF0A34EEFB090D82DCE5A28B4@NewLife>

All-

Thought I would throw this up to this thread. Patch to

http://bugzilla.open-bio.org/show_bug.cgi?id=2476

using mapping-aware subseq() seems to solve the problem. It appears to translate cleanly between aas and nts, and gives single-residue subseqs naturally (even at the boundaries). Comments please...

cheers
Mark

From maj at fortinbras.us Sat Nov 29 00:57:56 2008
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 29 Nov 2008 00:57:56 -0500
Subject: [Bioperl-l] How to write a Bio::DB:Fasta sequence into a file?
Message-ID: <45C996B6EC9A4FB3A11FCD066FB21594@NewLife>

What you are doing is a little unclear. If you have obtained a sequence object, from Bio::DB::Fasta or anywhere, you can write it to a FASTA file by using the write_sequence call. Maybe you have done

  $seq = read_sequence("theirseqs.fas", 'fasta');

or

  ($seq, @more_seqs) = read_all_sequences("theirseqs.fas", 'fasta');

or even

  $db = new Bio::DB::Fasta("theirseqs.fas");
  $seq = $db->get_Seq_by_id('THX1138');

then write it with

  write_sequence(">myseqs.fas", 'fasta', $seq);

or append it to your file with

  write_sequence(">>myseqs.fas", 'fasta', $seq);

If your problem is, you have a sequence string, and you want to write it to a file in FASTA format, then you create a sequence object directly:

  $seq = new_sequence( 'atcgtgcaat', 'THX1138' );

and write it using write_sequence() as above.

Hope this helps-
Mark

From maj at fortinbras.us Sat Nov 29 09:34:33 2008
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 29 Nov 2008 09:34:33 -0500
Subject: [Bioperl-l] How to write a Bio::DB:Fasta sequence into a file?
In-Reply-To: <366c6f340811261903s1788c646n87fd5038bf8363b1@mail.gmail.com>
Message-ID: <8CB2B1AA4F28487694BAB88F10281A80@NewLife>

## sorry about the multiple posts

What you are doing is a little unclear. If you have obtained a sequence object, from Bio::DB::Fasta or anywhere, you can write it to a FASTA file by using the write_sequence call. Maybe you have done

  $seq = read_sequence("theirseqs.fas", 'fasta');

or

  ($seq, @more_seqs) = read_all_sequences("theirseqs.fas", 'fasta');

or even

  $db = new Bio::DB::Fasta("theirseqs.fas");
  $seq = $db->get_Seq_by_id('THX1138');

then write it with

  write_sequence(">myseqs.fas", 'fasta', $seq);

or append it to your file with

  write_sequence(">>myseqs.fas", 'fasta', $seq);

If your problem is, you have a sequence string, and you want to write it to a file in FASTA format, then you create a sequence object directly:

  $seq = new_sequence( 'atcgtgcaat', 'THX1138' );

and write it using write_sequence() as above.

Hope this helps-
Mark

From maj at fortinbras.us Sat Nov 29 09:38:06 2008
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 29 Nov 2008 09:38:06 -0500
Subject: [Bioperl-l] undefined sub-sequence with a single base
In-Reply-To: <19F8A0BA-799D-4C4A-8713-6129543C30E1@illinois.edu>
Message-ID: <3F8FAACCC3C14ACAA911ECFC7F463988@NewLife>

## sorry about the multi-post

All-

Thought I would throw this up to this thread. Patch to

http://bugzilla.open-bio.org/show_bug.cgi?id=2476

using mapping-aware subseq() seems to solve the problem. It appears to translate cleanly between aas and nts, and gives single-residue subseqs naturally (even at the boundaries). Comments please...
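
To make the amino-acid/nucleotide mapping concrete, here is a small sketch of the coordinate arithmetic involved. It is not the patch attached to bug 2476; the function names aa_to_nt and nt_to_aa and the $nt_start parameter are made up for illustration, and it assumes a single ungapped reading frame with no frameshifts. The point it illustrates is that a single-residue range such as (2,2) is a valid range that maps to exactly three nucleotides.

```perl
use strict;
use warnings;

# Map a 1-based, inclusive amino-acid range onto nucleotide coordinates.
# $nt_start is the nucleotide position of the first base of codon 1.
sub aa_to_nt {
    my ($aa_start, $aa_end, $nt_start) = @_;
    die "aa_start must not exceed aa_end\n" if $aa_start > $aa_end;
    my $from = $nt_start + 3 * ($aa_start - 1);   # first base of first codon
    my $to   = $nt_start + 3 * $aa_end - 1;       # last base of last codon
    return ($from, $to);
}

# Inverse direction: which residue's codon contains a nucleotide position?
sub nt_to_aa {
    my ($nt_pos, $nt_start) = @_;
    die "position lies before the reading frame\n" if $nt_pos < $nt_start;
    return int(($nt_pos - $nt_start) / 3) + 1;
}

# A single-residue range (start == end) still covers one whole codon.
my ($from, $to) = aa_to_nt(2, 2, 1);
print "aa 2..2 -> nt $from..$to\n";                # aa 2..2 -> nt 4..6
print "nt 4 falls in aa ", nt_to_aa(4, 1), "\n";   # nt 4 falls in aa 2
```

Going from nucleotides back to residues loses the position within the codon, which is one reason a real implementation has to track frame explicitly.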
cheers
Mark

From sac at bioperl.org Sun Nov 30 12:50:26 2008
From: sac at bioperl.org (Steve Chervitz)
Date: Sun, 30 Nov 2008 09:50:26 -0800
Subject: [Bioperl-l] undefined sub-sequence with a single base
In-Reply-To: <2752CD02-90AE-40AA-8456-56B38CAC6C3B@bioperl.org>
References: <1226674555.6451.37.camel@alexie-laptop> <1226674770.6451.41.camel@alexie-laptop> <19F8A0BA-799D-4C4A-8713-6129543C30E1@illinois.edu> <1226943787.17996.26.camel@alexie-laptop> <2752CD02-90AE-40AA-8456-56B38CAC6C3B@bioperl.org>
Message-ID: <8f200b4c0811300950v5f75e1d6mb4ceee38fd4d5888@mail.gmail.com>

Apologies for the trouble my tile_hsps method has caused. My original intent for it was to provide approximate answers at the level of the Hit object, providing a means of summing over all HSPs in a reasonable way. I don't think it has ever gone through extensive testing with different blast flavors and edge cases.

What to do with it? Maybe provide warnings about its limitations with advice on other ways to proceed. Or bite the bullet and make it more robust.

As for the specific exception reported in this thread, it looks like there's a bug causing it to not properly handle zero-length ranges:

"Undefined sub-sequence (548,548). Valid range = 51 - 548"

Patching this might be fairly straightforward and could help in the short term. I'll take a look.

Steve

On Mon, Nov 17, 2008 at 10:16 AM, Jason Stajich wrote:
> Personally - I'm not sure I trust tile_hsps on a translated search - or at
> all - really - you may want to compute the "dominant" strand yourself by
> iterating through the HSPs or using WU-BLAST to get logical groups of HSPs
> which is a better tiling HSP algorithm (the --links option in WU-BLAST).
>
> -jason
> On Nov 17, 2008, at 9:43 AM, Alexie Papanicolaou wrote:
>
>> Hi Chris
>>
>> Sorry, I got the new SVN build today and still get the same error...
>>
>> Could it be because the subseq is not divisible by 3 (due to blastx)?
>> >> a >> >> >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: Undefined sub-sequence (2,2). Valid range = 2 - 190 >> STACK: Error::throw >> STACK: >> Bio::Root::Root::throw /usr/local/share/perl/5.8.8/Bio/Root/Root.pm:357 >> STACK: >> Bio::Search::HSP::HSPI::matches >> /usr/local/share/perl/5.8.8/Bio/Search/HSP/HSPI.pm:691 >> STACK: >> Bio::Search::SearchUtils::_adjust_contigs >> /usr/local/share/perl/5.8.8/Bio/Search/SearchUtils.pm:460 >> STACK: >> Bio::Search::SearchUtils::tile_hsps >> /usr/local/share/perl/5.8.8/Bio/Search/SearchUtils.pm:200 >> STACK: >> Bio::Search::Hit::GenericHit::strand >> /usr/local/share/perl/5.8.8/Bio/Search/Hit/GenericHit.pm:1455 >> >> >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: Undefined sub-sequence (3,4). Valid range = 3 - 44 >> STACK: Error::throw >> STACK: >> Bio::Root::Root::throw /usr/local/share/perl/5.8.8/Bio/Root/Root.pm:357 >> STACK: >> Bio::Search::HSP::HSPI::matches >> /usr/local/share/perl/5.8.8/Bio/Search/HSP/HSPI.pm:691 >> STACK: >> Bio::Search::SearchUtils::_adjust_contigs >> /usr/local/share/perl/5.8.8/Bio/Search/SearchUtils.pm:404 >> STACK: >> Bio::Search::SearchUtils::tile_hsps >> /usr/local/share/perl/5.8.8/Bio/Search/SearchUtils.pm:200 >> STACK: >> Bio::Search::Hit::GenericHit::strand >> /usr/local/share/perl/5.8.8/Bio/Search/Hit/GenericHit.pm:1455 >> >> >> >> >> >> >> On Fri, 2008-11-14 at 11:08 -0600, Chris Fields wrote: >> >>> We've switched to subversion a while ago. Could you try updating from >>> there, or using one of our nightly builds? 
>>>
>>> http://www.bioperl.org/DIST/nightly_builds/
>>>
>>> chris
>>>
>>> On Nov 14, 2008, at 8:59 AM, Alexie Papanicolaou wrote:
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Jason Stajich
> jason at bioperl.org
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From maj at fortinbras.us Sun Nov 30 12:58:08 2008
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 30 Nov 2008 12:58:08 -0500
Subject: [Bioperl-l] undefined sub-sequence with a single base
In-Reply-To: <8f200b4c0811300950v5f75e1d6mb4ceee38fd4d5888@mail.gmail.com>
References: <1226674555.6451.37.camel@alexie-laptop>
 <1226674770.6451.41.camel@alexie-laptop>
 <19F8A0BA-799D-4C4A-8713-6129543C30E1@illinois.edu>
 <1226943787.17996.26.camel@alexie-laptop>
 <2752CD02-90AE-40AA-8456-56B38CAC6C3B@bioperl.org>
 <8f200b4c0811300950v5f75e1d6mb4ceee38fd4d5888@mail.gmail.com>
Message-ID: <1E0F040108EA4B6182E0C6D0CC60526A@NewLife>

I think we (Dave, Chris, and I) are making progress on the subsequence
boundary issues. The fix should propagate through everything having
this issue...we hope--stay tuned...

From cjfields at illinois.edu Sun Nov 30 16:07:58 2008
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 30 Nov 2008 15:07:58 -0600
Subject: [Bioperl-l] undefined sub-sequence with a single base
In-Reply-To: <1E0F040108EA4B6182E0C6D0CC60526A@NewLife>
References: <1226674555.6451.37.camel@alexie-laptop>
 <1226674770.6451.41.camel@alexie-laptop>
 <19F8A0BA-799D-4C4A-8713-6129543C30E1@illinois.edu>
 <1226943787.17996.26.camel@alexie-laptop>
 <2752CD02-90AE-40AA-8456-56B38CAC6C3B@bioperl.org>
 <8f200b4c0811300950v5f75e1d6mb4ceee38fd4d5888@mail.gmail.com>
 <1E0F040108EA4B6182E0C6D0CC60526A@NewLife>
Message-ID: <78CCBDAC-8ABA-41FC-BE59-BD0496AB830D@illinois.edu>

Steve,

If you can look through it that would be great; if it gets fixed for
1.6, even better. But it's not a blocker by any means, and I would
rather have it fixed robustly than get a problematic quickie fix 'for
now'. I think a documentation note indicating potential issues would
suffice for me for the time being. Sendu also made some changes on this
for 1.5.2 if memory serves (made it a bit more robust), so it might be
worth looking over the revision history.

Somewhat related: I noticed that Bio::Search::BlastUtils and
Bio::Search::SearchUtils are redundant (both have the same methods,
including tile_hsps, but the latter module seems to be more
up-to-date). Should we deprecate one for the other? They are both used
(but I'm sure SearchUtils could drop-in replace BlastUtils).

chris
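
Jason's suggestion in this thread, computing the "dominant" strand
yourself by iterating through the HSPs instead of relying on
tile_hsps, could look roughly like the sketch below. It is only an
illustration under stated assumptions: dominant_strand is a
hypothetical helper, not part of BioPerl, and it takes plain
[strand, length] pairs so that it runs without a BLAST report. With a
real Bio::SearchIO hit, the pairs would presumably be collected from
$hsp->strand('hit') and $hsp->length('hit') while looping over
$hit->next_hsp. No tiling is attempted, so overlapping HSPs are counted
more than once.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): return the strand (-1, 0 or 1)
# that accounts for the largest summed HSP length, or undef for no HSPs.
# Each element of @$hsps is an array ref: [ strand, aligned_length ].
sub dominant_strand {
    my ($hsps) = @_;
    my %covered;                       # strand => summed aligned length
    for my $hsp (@$hsps) {
        my ($strand, $length) = @$hsp;
        $covered{$strand} += $length;
    }
    return unless %covered;            # empty list: no strand to report
    # Largest total first; on a tie, prefer the numerically higher strand.
    my ($best) = sort { $covered{$b} <=> $covered{$a} || $b <=> $a }
                 keys %covered;
    return $best;
}

# Made-up example data: two plus-strand HSPs and one minus-strand HSP.
# With BioPerl (assumed usage) the list could be built like this:
#   while (my $hsp = $hit->next_hsp) {
#       push @hsps, [ $hsp->strand('hit'), $hsp->length('hit') ];
#   }
my @hsps = ( [ 1, 193 ], [ -1, 40 ], [ 1, 25 ] );
print "dominant strand: ", dominant_strand(\@hsps), "\n";
```

Summing raw HSP lengths is a deliberately crude choice; it avoids the
sub-sequence arithmetic that raises the exceptions quoted above, at the
cost of over-counting regions covered by more than one HSP.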