From pblaiklo at 110.net Sun Nov 2 21:52:37 2003
From: pblaiklo at 110.net (pblaiklo@110.net)
Date: Sun Nov 2 21:50:40 2003
Subject: [Bioperl-l] Restriction Enzyme cuts on Circular plasmids
In-Reply-To: <1E0CC447E59C974CA5C7160D2A2854ECC51A76@SJMEMXMB04.stjude.sjcrh.local>
References: <1E0CC447E59C974CA5C7160D2A2854ECC51A76@SJMEMXMB04.stjude.sjcrh.local>
Message-ID: <200311022152.37487."pblaiklo@110.net">

I am rewriting Bio::Restriction::Analysis to fix the circularity bug (and
others), using similar algorithms to the ones you describe. Right now it
doesn't handle overlapping sites well, but this should be pretty easy to
fix.

Using an array of cut positions also lets you keep the start and end points
of each restriction fragment. This will make it easier for users to turn
restriction fragments into features of the target sequence.

Peter Blaiklock

On Friday 31 October 2003 09:36, Gray, John wrote:
> After reading some of your comments about how the site recognition is
> functioning, I am concerned that there may be another problem. It commonly
> occurs that restriction enzyme recognition sites will overlap, and I think
> this may cause your method to miss some sites. I am wondering whether it
> may be necessary to separate the process of site mapping and cleavage.
>
> For example, BssH II cuts at G^CGCGC, and the sequence of GCGCGCGC
> theoretically has two cut sites within it. Of course, your algorithm is
> similar to reality in that once the enzyme cuts the sequence once, it
> probably won't be able to recognize the other site. However, in the test
> tube what you will actually get is a random distribution of cutting at the
> two sites. Traditionally (at least in the software I have used), the site
> mapping algorithms have returned all possible cut sites.
>
> I am thinking the only way around this would be to first map the sites into
> an array, and then use that array to either calculate fragment sizes or
> sequences.
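The two-step approach described above (map every site into an array first, then derive fragments from that array) can be sketched in plain Perl. This is only an illustration, not the code in Bio::Restriction::Analysis: the sub names map_cut_sites and circular_fragment_sizes and the 12 bp test sequence are made up here, the site is treated as a literal string with no ambiguity codes, and the sequence is assumed to be longer than the site.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the sorted, 1-based positions after which the enzyme cuts a
# circular sequence. $cut_offset is the number of bases of the
# recognition site that precede the cut (1 for G^CGCGC).
# (Hypothetical helper, not part of BioPerl.)
sub map_cut_sites {
    my ($seq, $site, $cut_offset) = @_;
    my $len = length $seq;

    # Simulate circularity: append the first (site length - 1) bases so a
    # site that spans the origin is still found.
    my $extended = $seq . substr($seq, 0, length($site) - 1);

    my %cuts;
    # The zero-width lookahead consumes nothing, so overlapping sites are
    # all reported; pos() is the 0-based start of each site.
    while ($extended =~ /(?=\Q$site\E)/gi) {
        my $cut = (pos($extended) + $cut_offset) % $len;
        $cut = $len if $cut == 0;    # a cut exactly at the origin
        $cuts{$cut} = 1;
    }
    return sort { $a <=> $b } keys %cuts;
}

# Fragment sizes on the circle, derived only from the cut-position array.
# (Hypothetical helper, not part of BioPerl.)
sub circular_fragment_sizes {
    my ($len, @cuts) = @_;
    return ($len) if @cuts < 2;    # uncut circle, or one cut: full length
    my @sizes;
    for my $i (0 .. $#cuts) {
        my $next = $cuts[ ($i + 1) % @cuts ];
        push @sizes, ($next - $cuts[$i]) % $len;    # last one wraps the origin
    }
    return @sizes;
}

my $plasmid = 'AAGCGCGCGCTT';    # made-up 12 bp circle, two overlapping sites
my @cuts    = map_cut_sites($plasmid, 'GCGCGC', 1);
my @sizes   = circular_fragment_sizes(length($plasmid), @cuts);
print "cuts after bases: @cuts\n";
print "fragment sizes:   @sizes\n";
```

For the made-up sequence this reports cuts after bases 3 and 5 and fragments of 2 and 10 bp. Keeping the positions in an array, rather than splitting the string at the first match, is what lets both overlapping sites show up.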
> With the possibility of overlapping sites in mind, I still
> can't think of any way to circumvent the problem of the origin on circular
> sequences without concatenating the sequence to simulate circularity.
>
> John
>
> -----Original Message-----
> From: Rob Edwards [mailto:redwards@utmem.edu]
> Sent: Thursday, October 30, 2003 7:53 PM
> To: bioperl-l@portal.open-bio.org
> Subject: Re: [Bioperl-l] Restriction Enzyme cuts on Circular plasmids
>
> The following is a quick patch for Bio/Restriction/Analysis.pm so that
> it handles circular sequences correctly if there is another cut site in
> the region that has been linearized. At the moment it won't handle a
> single cut site at that point (e.g. pBR322 has a single EcoRI site at
> the point it is circularized). I am not sure how to deal with this and
> need to think about it (the fragments are right but the cut sites are
> not).
>
> Can someone submit it for me?
>
> I have submitted a Bugzilla report as #1548
>
> 120c120,121
> < for further analysis. However, this will change the start of the
> ---
>
> > for further analysis. This fragment will also be checked for cuts
> > by the enzyme(s).
> > However, this will change the start of the
>
> 737c738,749
> < unshift (@re_frags, $last.$first);
> ---
>
> > my $newfrag=$last.$first;
> > my @cuts = split /($beforeseq)($afterseq)/i, $newfrag;
> > my @newfrags;
> > if ($#cuts) {
> > # there is another cut
> > for (my $i=0; $i<=$#cuts; $i+=2) {push (@newfrags,
> > $cuts[$i].$cuts[$i+1])}
>
> > }
> > else {
> > # there isn't another cut
> > push (@newfrags, $newfrag);
> > }
> > push @re_frags, @newfrags;
>
> Thanks
>
> Rob
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

From hlapp at gnf.org Mon Nov 3 02:20:17 2003
From: hlapp at gnf.org (Hilmar Lapp)
Date: Mon Nov 3 02:17:08 2003
Subject: [Bioperl-l] OMIM tests failing
In-Reply-To:
Message-ID: <2CC97099-0DCE-11D8-ACBD-000A959EB4C4@gnf.org>

Juguang is modifying the parser, yes. I've noticed it failing too.

Juguang, if you can, try to avoid committing intermediate versions that
aren't completed or tested and cause test failures. If for some reason or
another you can't avoid that, *first* send an email to the list saying that
what you need to commit is going to temporarily break tests blah and foo.

-hilmar

On Friday, October 31, 2003, at 05:29 PM, Allen Day wrote:

> any idea what's going on here? i see some recent commits (yesterday,
> first in 7 months) by juguang. it worked two days ago...
>
> -allen
>
> [5:27pm]allenday@sumo:/raid5a/allenday/cvsroot/bioperl-live> make
> test_OMIMentry
> PERL_DL_NONLAZY=1 /usr/bin/perl -Iblib/arch -Iblib/lib
> -I/usr/lib/perl5/5.8.0/i386-linux-thread-multi -I/usr/lib/perl5/5.8.0
> -e
> 'use Test::Harness qw(&runtests $verbose); $verbose=0; runtests @ARGV;'
> t/OMIMentry.t
> t/OMIMentry....ok 5/145
> ------------- EXCEPTION -------------
> MSG: a hash referenced needed
> STACK Bio::Phenotype::OMIM::OMIMentry::clinical_symptoms
> blib/lib/Bio/Phenotype/OMIM/OMIMentry.pm:537
> STACK toplevel t/OMIMentry.t:60
>
> --------------------------------------
> t/OMIMentry....dubious
> Test returned status 255 (wstat 65280, 0xff00)
> DIED. FAILED tests 17-145
> Failed 129/145 tests, 11.03% okay
> Failed Test    Stat Wstat Total Fail  Failed  List of Failed
> -----------------------------------------------------------------------
> --------
> t/OMIMentry.t   255 65280   145  129  88.97%  17-145
> Failed 1/1 test scripts, 0.00% okay. 129/145 subtests failed, 11.03%
> okay.
> make: *** [test_OMIMentry] Error 2
> [5:27pm]allenday@sumo:/raid5a/allenday/cvsroot/bioperl-live> make
> test_OMIMentryAllelicVariant
> PERL_DL_NONLAZY=1 /usr/bin/perl -Iblib/arch -Iblib/lib
> -I/usr/lib/perl5/5.8.0/i386-linux-thread-multi -I/usr/lib/perl5/5.8.0
> -e
> 'use Test::Harness qw(&runtests $verbose); $verbose=0; runtests @ARGV;'
> t/OMIMentryAllelicVariant.t
> t/OMIMentryAllelicVariant....ok
> All tests successful.
> Files=1, Tests=26, 0 wallclock secs ( 0.14 cusr + 0.01 csys = 0.15
> CPU)
> [5:28pm]allenday@sumo:/raid5a/allenday/cvsroot/bioperl-live> make
> test_OMIMparser
> PERL_DL_NONLAZY=1 /usr/bin/perl -Iblib/arch -Iblib/lib
> -I/usr/lib/perl5/5.8.0/i386-linux-thread-multi -I/usr/lib/perl5/5.8.0
> -e
> 'use Test::Harness qw(&runtests $verbose); $verbose=0; runtests @ARGV;'
> t/OMIMparser.t
> t/OMIMparser....ok 1/173
> ------------- EXCEPTION -------------
> MSG: a part/organism must be assigned
> STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> blib/lib/Bio/Phenotype/OMIM/OMIMentry.pm:567
> STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> blib/lib/Bio/Phenotype/OMIM/OMIMparser.pm:550
> STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> blib/lib/Bio/Phenotype/OMIM/OMIMparser.pm:531
> STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> blib/lib/Bio/Phenotype/OMIM/OMIMparser.pm:272
> STACK toplevel t/OMIMparser.t:31
>
> --------------------------------------
> t/OMIMparser....dubious
> Test returned status 25 (wstat 6400, 0x1900)
> DIED. FAILED tests 2-173
> Failed 172/173 tests, 0.58% okay
> Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> -----------------------------------------------------------------------
> --------
> t/OMIMparser.t    25  6400   173  172  99.42%  2-173
> Failed 1/1 test scripts, 0.00% okay. 172/173 subtests failed, 0.58%
> okay.
> make: *** [test_OMIMparser] Error 2
> [5:28pm]allenday@sumo:/raid5a/allenday/cvsroot/bioperl-live>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca.
92121 phone: +1-858-812-1757
-------------------------------------------------------------

From fangl at genomics.org.cn Mon Nov 3 04:57:27 2003
From: fangl at genomics.org.cn (Magic Fang)
Date: Mon Nov 3 04:52:08 2003
Subject: [Bioperl-l] how to set first line of genbank file
Message-ID: <200311031754218.SM01160@magicpc>

The standard first line of a genbank file is:

LOCUS       OSA277468              17385 bp    DNA     linear   PLN 23-OCT-2003

How do I set it when using bioperl to create a genbank file?

Thank you.

From rc91 at leicester.ac.uk Mon Nov 3 06:14:58 2003
From: rc91 at leicester.ac.uk (Crook, R.)
Date: Mon Nov 3 06:11:42 2003
Subject: [Bioperl-l] help with BLAST
Message-ID:

Hi,
I'm a newbie to this site so please be gentle!!
I'm using blastn and remoteblast to find human LINEs and it's not working
as I'd expect. I'm using the standard code to get the acc and start of each
match, but the matches aren't as I expect. They have been fragmented, so
even when the subject matches with itself it divides into three parts. I
could write a long complex prog to find the real start, but I think this
would be difficult as many of the sequences will have two LINEs in them.
Does anyone know a way round the matching?

I hope that's clear to you.

Becca
Leicester Uni.

From michael.watson at bbsrc.ac.uk Mon Nov 3 06:34:24 2003
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Mon Nov 3 06:36:09 2003
Subject: [Bioperl-l] Glimmer
Message-ID: <20B7EB075F2D4542AFFAF813E98ACD93028223A3@cl-exsrv1.irad.bbsrc.ac.uk>

I have Bioperl 1.2.2, but this doesn't include Bio::Tools::Glimmer

There is conflicting info on bioperl.org, what is the current stable
release, Bioperl 1.2.2 or 1.2.3?

Thanks
Mick

-----Original Message-----
From: Jason Stajich [mailto:jason@cgt.duhs.duke.edu]
Sent: 31 October 2003 18:41
To: michael watson (IAH-C)
Cc: Bioperl-l@bioperl.org
Subject: Re: [Bioperl-l] Glimmer

Bio::Tools::Glimmer

Only in CVS or 1.3.x releases NOT in 1.2.x series.
-jason

On Fri, 31 Oct 2003, michael watson (IAH-C) wrote:

> Hi
>
> Does anyone have, or does there exist, any perl modules to parse and handle the output from glimmer?
>
> Thanks
>
> Michael Watson
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From jason at cgt.duhs.duke.edu Mon Nov 3 06:57:00 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Mon Nov 3 06:53:40 2003
Subject: [Bioperl-l] Glimmer
In-Reply-To: <20B7EB075F2D4542AFFAF813E98ACD93028223A3@cl-exsrv1.irad.bbsrc.ac.uk>
References: <20B7EB075F2D4542AFFAF813E98ACD93028223A3@cl-exsrv1.irad.bbsrc.ac.uk>
Message-ID:

As I said in my message, it is NOT in the 1.2.x series so it won't be in
1.2.2 or in 1.2.3. The latest stable release is 1.2.3 which the website
correctly reports. If you are confused about the distinction between
development and stable releases this is covered in the FAQ I hope.

The Glimmer parser is only available in the development series releases
1.3.x OR directly from CVS.

The code is pretty generic so I think you could probably download it from
1.3.x or from CVS and drop it into the 1.2.x distribution and it should
just work.

-jason

On Mon, 3 Nov 2003, michael watson (IAH-C) wrote:

> I have Bioperl 1.2.2, but this doesn't include Bio::Tools::Glimmer
>
> There is conflicting info on bioperl.org, what is the current stable release, Bioperl 1.2.2 or 1.2.3?
>
> Thanks
> Mick
>
> -----Original Message-----
> From: Jason Stajich [mailto:jason@cgt.duhs.duke.edu]
> Sent: 31 October 2003 18:41
> To: michael watson (IAH-C)
> Cc: Bioperl-l@bioperl.org
> Subject: Re: [Bioperl-l] Glimmer
>
>
> Bio::Tools::Glimmer
>
> Only in CVS or 1.3.x releases NOT in 1.2.x series.
>
> -jason
>
> On Fri, 31 Oct 2003, michael watson (IAH-C) wrote:
>
> > Hi
> >
> > Does anyone have, or does there exist, any perl modules to parse and handle the output from glimmer?
> >
> > Thanks
> >
> > Michael Watson
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From michael.watson at bbsrc.ac.uk Mon Nov 3 07:00:57 2003
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Mon Nov 3 07:02:29 2003
Subject: [Bioperl-l] Glimmer
Message-ID: <20B7EB075F2D4542AFFAF813E98ACD93028223A4@cl-exsrv1.irad.bbsrc.ac.uk>

Thanks Jason :-)

The website confusion comes from the presence of the file:

CURRENT_STABLE_IS_1_2_2

under http://www.bioperl.org/DIST/

Thanks again,

Mick

-----Original Message-----
From: Jason Stajich [mailto:jason@cgt.duhs.duke.edu]
Sent: 03 November 2003 11:57
To: michael watson (IAH-C)
Cc: Bioperl-l@bioperl.org
Subject: RE: [Bioperl-l] Glimmer

As I said in my message, it is NOT in the 1.2.x series so it won't be in
1.2.2 or in 1.2.3. The latest stable release is 1.2.3 which the website
correctly reports. If you are confused about the distinction between
development and stable releases this is covered in the FAQ I hope.

The Glimmer parser is only available in the development series releases
1.3.x OR directly from CVS.

The code is pretty generic so I think you could probably download it from
1.3.x or from CVS and drop it into the 1.2.x distribution and it should
just work.

-jason

On Mon, 3 Nov 2003, michael watson (IAH-C) wrote:

> I have Bioperl 1.2.2, but this doesn't include Bio::Tools::Glimmer
>
> There is conflicting info on bioperl.org, what is the current stable release, Bioperl 1.2.2 or 1.2.3?
>
> Thanks
> Mick
>
> -----Original Message-----
> From: Jason Stajich [mailto:jason@cgt.duhs.duke.edu]
> Sent: 31 October 2003 18:41
> To: michael watson (IAH-C)
> Cc: Bioperl-l@bioperl.org
> Subject: Re: [Bioperl-l] Glimmer
>
>
> Bio::Tools::Glimmer
>
> Only in CVS or 1.3.x releases NOT in 1.2.x series.
>
> -jason
>
> On Fri, 31 Oct 2003, michael watson (IAH-C) wrote:
>
> > Hi
> >
> > Does anyone have, or does there exist, any perl modules to parse and handle the output from glimmer?
> >
> > Thanks
> >
> > Michael Watson
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From sanges at biogem.it Mon Nov 3 07:16:01 2003
From: sanges at biogem.it (Remo Sanges)
Date: Mon Nov 3 07:17:12 2003
Subject: [Bioperl-l] help with BLAST
References:
Message-ID: <00b501c3a204$4a867300$7a3ca48c@Remo>

Hi Becca,

I think it is wrong to use BLAST to search for a particular element.
The BLAST algorithm performs local alignment, so it is normal to get
multiple HSPs, particularly if you search for repetitive elements.

To search for LINEs you can use RepeatMasker.
Its output looks like this:

   SW  perc perc perc  query     position in query     matching  repeat        position in repeat
score  div. del. ins.  sequence  begin  end  (left)    repeat    class/family  begin  end  (left)  ID

  226  23.7  0.0  1.7  162          96  154   (372) +  L1ME      LINE/L1        5544 5601   (569)   1
  483  11.1  1.2  0.0  162         441  521     (5) +  B1_MM     SINE/Alu         31  112    (35)   2
 1869   7.5  0.7  0.7  175          11  291   (286) +  RLTR4_MM  LTR/ERV1        462  742     (0)   3
.............................................

And in BioPerl there is a module to parse this output:
Bio::Tools::RepeatMasker

Remo

_____________________________________
Remo Sanges - Ph.D.
Student
BioGeM - IGB
Gene Expression & Sequencing Core Lab
Via Pietro Castellino 111
80131 Naples - Italy
Tel:+390816132303 - Fax:+390816132262
sanges@biogem.it - sanges@iigb.na.cnr.it
_____________________________________

----- Original Message -----
From: "Crook, R."
To:
Sent: Monday, November 03, 2003 12:14 PM
Subject: [Bioperl-l] help with BLAST

> Hi,
> I'm a newbie to this site so please be gentle!!
> I'm using blastn and remoteblast to find human LINEs and it's not working as I'd expect. I'm using the standard code to get the acc and start of each match but the matches aren't as I expect. They have been fragmented so even when the subject matches with itself it divides into three parts. I could write a long complex prog to find the real start but I think this would be difficult as many of the sequences will have two LINEs in them. Does anyone know a way round the matching?
>
> I hope that's clear to you.
>
> Becca
> Leicester Uni.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

From jason at cgt.duhs.duke.edu Mon Nov 3 08:21:21 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Mon Nov 3 08:18:03 2003
Subject: [Bioperl-l] Glimmer
In-Reply-To: <20B7EB075F2D4542AFFAF813E98ACD93028223A4@cl-exsrv1.irad.bbsrc.ac.uk>
References: <20B7EB075F2D4542AFFAF813E98ACD93028223A4@cl-exsrv1.irad.bbsrc.ac.uk>
Message-ID:

Fixed.

I also re-synched (and set up the cronjob to work again)
http://bioperl.org/SRC for those who like to browse.
-jason

On Mon, 3 Nov 2003, michael watson (IAH-C) wrote:

> Thanks Jason :-)
>
> The website confusion comes from the presence of the file:
>
> CURRENT_STABLE_IS_1_2_2
>
> under http://www.bioperl.org/DIST/
>
> Thanks again,
>
> Mick
>
> -----Original Message-----
> From: Jason Stajich [mailto:jason@cgt.duhs.duke.edu]
> Sent: 03 November 2003 11:57
> To: michael watson (IAH-C)
> Cc: Bioperl-l@bioperl.org
> Subject: RE: [Bioperl-l] Glimmer
>
>
> As I said in my message, it is NOT in the 1.2.x series so it won't be in
> 1.2.2 or in 1.2.3. The latest stable release is 1.2.3 which the website
> correctly reports. If you are confused about the distinction between
> development and stable releases this is covered in the FAQ I hope.
>
> The Glimmer parser is only available in the development series releases
> 1.3.x OR directly from CVS.
>
> The code is pretty generic so I think you could probably download it from
> 1.3.x or from CVS and drop it into the 1.2.x distribution and it should
> just work.
>
> -jason
> On Mon, 3 Nov 2003, michael watson (IAH-C) wrote:
>
> > I have Bioperl 1.2.2, but this doesn't include Bio::Tools::Glimmer
> >
> > There is conflicting info on bioperl.org, what is the current stable release, Bioperl 1.2.2 or 1.2.3?
> >
> > Thanks
> > Mick
> >
> > -----Original Message-----
> > From: Jason Stajich [mailto:jason@cgt.duhs.duke.edu]
> > Sent: 31 October 2003 18:41
> > To: michael watson (IAH-C)
> > Cc: Bioperl-l@bioperl.org
> > Subject: Re: [Bioperl-l] Glimmer
> >
> >
> > Bio::Tools::Glimmer
> >
> > Only in CVS or 1.3.x releases NOT in 1.2.x series.
> >
> > -jason
> >
> > On Fri, 31 Oct 2003, michael watson (IAH-C) wrote:
> >
> > > Hi
> > >
> > > Does anyone have, or does there exist, any perl modules to parse and handle the output from glimmer?
> > >
> > > Thanks
> > >
> > > Michael Watson
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l@portal.open-bio.org
> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > >
> >
> > --
> > Jason Stajich
> > Duke University
> > jason at cgt.mc.duke.edu
> >
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From iain.wallace at ucd.ie Mon Nov 3 08:29:31 2003
From: iain.wallace at ucd.ie (Iain Wallace)
Date: Mon Nov 3 08:26:33 2003
Subject: [Bioperl-l] Help with testing of parallel Smith-Waterman code on x86 workstations
Message-ID:

Hi

I think this is very interesting! I have a P4 to use. Are you using SSE2
and 16 integer calculations at once?

I was wondering where I could download the code.

thanks

Iain

From perlmatrix2003 at yahoo.com Mon Nov 3 04:24:33 2003
From: perlmatrix2003 at yahoo.com (hank wong)
Date: Mon Nov 3 08:37:03 2003
Subject: [Bioperl-l] how to tun bioperl and parsing blast report
Message-ID: <20031103092433.21898.qmail@web21506.mail.yahoo.com>

Dear Sir,

I tried to install Bioperl and make it work on my unix system. I do not
have administrator rights, so I put everything in a directory I can write
to. I followed these steps from the INSTALL file:

INSTALLING BIOPERL IN A PERSONAL OR PRIVATE MODULE AREA

If you lack permission to install perl modules into the standard
site_perl/ system area you can configure bioperl to install itself
anywhere you choose. Ideally this would be a personal perl directory or
standard place where you plan to put all your 'local' or personal perl
modules.

Note: you _must_ have write permission to this area.

Simply pass a parameter to perl as it builds your system specific makefile.
Example:

perl Makefile.PL PREFIX=/home/dag/My_Local_Perl_Modules
make
make test
make install

---------------------------------------------------------
and got everything. Here is the directory listing under my bioperl-1.2.3:

AUTHORS     INSTALL.WIN     README            bioperl.lisp    examples
BUGS        LICENSE         biodatabases.PL   bioperl.pod     lib
Bio         LocalConfig.pm  biodatabases.pod  bioscripts.PL   models
Changes     MANIFEST.SKIP   biodesign.PL      bioscripts.pod  pm_to_blib
DEPRECATED  Makefile        biodesign.pod     blib            scripts
FAQ         Makefile.PL     bioperl.PL        bptutorial.pl   t
INSTALL     PLATFORMS       bioperl.conf      doc

-------------------
It seems everything is OK so far.
------------------------------------------------------
Here is my question: how do I run Bioperl and bptutorial.pl?
I tried this:

>./bptutorial.pl
Can't locate IO/String.pm in @INC (@INC contains: .
/usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0
/usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl) at
Bio/DB/WebDBSeqI.pm line 90.
BEGIN failed--compilation aborted at Bio/DB/WebDBSeqI.pm line 90.
Compilation failed in require at Bio/DB/DBFetch.pm line 75.
BEGIN failed--compilation aborted at Bio/DB/DBFetch.pm line 75.
Compilation failed in require at Bio/DB/EMBL.pm line 103.
BEGIN failed--compilation aborted at Bio/DB/EMBL.pm line 103.
Compilation failed in require at Bio/LiveSeq/IO/BioPerl.pm line 110.
BEGIN failed--compilation aborted at Bio/LiveSeq/IO/BioPerl.pm line 110.
Compilation failed in require at ./bptutorial.pl line 3730.
BEGIN failed--compilation aborted at ./bptutorial.pl line 3730.
----------------
And I got the output above. How can I run bptutorial.pl successfully?

2) I am particularly interested in parsing BLAST reports, which could use
the SearchIO modules...
I saw a Bio::SearchIO directory, but it sounds like I need some guidance to
get it running. Could you please point me to a FAQ page for guidance?
I tried these pages:
http://www.bioperl.org/Core/Latest/bptutorial.html#i.1_overview
http://bioperl.org/HOWTOs/html/SearchIO.html
Are there any more?

I would appreciate your reply...

thanks,

-hank

From donald.jackson at bms.com Mon Nov 3 08:53:57 2003
From: donald.jackson at bms.com (Donald G. Jackson)
Date: Mon Nov 3 08:50:18 2003
Subject: [Bioperl-l] RPSblast and existing BLAST packages (WAS: RemoteBlast)
Message-ID: <3FA65DF5.8080700@bms.com>

Paul and Richard and the rest of bioperl-l,

as I volunteered to maintain StandAloneBlast.pm, I've been thinking about
including RPSblast since the topic came up on the list a few days back. I
can see the argument for doing this - since RPSblast is part of the NCBI
blast kit - but conceptually I think of RPSblast as more similar to
HMMsearch and such, and a lot of how I'd access the results is also
similar. Above all, RemoteBlast and StandAloneBlast should be consistent -
I like the idea of a BlastI standard. I wonder if we should be supporting
all the Blast programs in a monolithic package or breaking them out more,
at least wrt new capabilities like RPSblast.

What are your thoughts?

Don Jackson

From tobias.straub at lmu.de Mon Nov 3 09:02:59 2003
From: tobias.straub at lmu.de (Tobias)
Date: Mon Nov 3 08:59:59 2003
Subject: [Bioperl-l] bioperl and Bio::Factory::EMBOSS via cgi... permission problems?
Message-ID: <6E5229FA-0E06-11D8-B0E5-0003935A86C6@lmu.de>

Hi,

I have to run a restriction analysis through EMBOSS restrict (as
Bio::Tools::RestrictionEnzyme has some serious problems with certain
custom enzymes). Now everything runs fine when invoked from the command
line, but when I run the script through a browser and apache I get an
error:
Can't call method "run" on an undefined value at ..
when telling my EMBOSS program to run

It seems that I don't get a proper connection to the EMBOSS programs (I
also can't get version information, nor program descriptions...)

Could that be a permission problem of the apache user (my EMBOSS
executables are world read- and executable)?
Does anyone have a similar setup (bioperl and EMBOSS via cgi) and know
where to look?

best regards
Tobias

Dr. Tobias Straub
Molecular Biology
Adolf Butenandt Institut, LMU
Schillerstr. 44
80336 München, Germany

Tel: +49-89-5996439
Fax: +49-89-5996425

From brian_osborne at cognia.com Mon Nov 3 09:02:34 2003
From: brian_osborne at cognia.com (Brian Osborne)
Date: Mon Nov 3 09:01:25 2003
Subject: [Bioperl-l] how to tun bioperl and parsing blast report
In-Reply-To: <20031103092433.21898.qmail@web21506.mail.yahoo.com>
Message-ID:

Hank,

To run bptutorial.pl cd to the bioperl-1.2.3 directory in your home
directory, then try it. You should also set your PERL5LIB environment
variable to that same directory you specified in the make, so in csh or
tcsh:

setenv PERL5LIB /home/dag/My_Local_Perl_Modules

I'm not sure how to address your SearchIO questions. The SearchIO HOWTO
wasn't clear?

Brian O.

-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of hank wong
Sent: Monday, November 03, 2003 4:25 AM
To: bioperl-l@bioperl.org; bioperl-guts-l@bioperl.org
Subject: [Bioperl-l] how to tun bioperl and parsing blast report

Dear Sir,

I tried to install Bioperl and make it work on my unix system. I do not
have administrator rights, so I put everything in a directory I can write
to. I followed these steps from the INSTALL file:

INSTALLING BIOPERL IN A PERSONAL OR PRIVATE MODULE AREA

If you lack permission to install perl modules into the standard
site_perl/ system area you can configure bioperl to install itself
anywhere you choose.
Ideally this would be a personal perl directory or standard place where
you plan to put all your 'local' or personal perl modules.

Note: you _must_ have write permission to this area.

Simply pass a parameter to perl as it builds your system specific makefile.

Example:

perl Makefile.PL PREFIX=/home/dag/My_Local_Perl_Modules
make
make test
make install

---------------------------------------------------------
and got everything. Here is the directory listing under my bioperl-1.2.3:

AUTHORS     INSTALL.WIN     README            bioperl.lisp    examples
BUGS        LICENSE         biodatabases.PL   bioperl.pod     lib
Bio         LocalConfig.pm  biodatabases.pod  bioscripts.PL   models
Changes     MANIFEST.SKIP   biodesign.PL      bioscripts.pod  pm_to_blib
DEPRECATED  Makefile        biodesign.pod     blib            scripts
FAQ         Makefile.PL     bioperl.PL        bptutorial.pl   t
INSTALL     PLATFORMS       bioperl.conf      doc

-------------------
It seems everything is OK so far.
------------------------------------------------------
Here is my question: how do I run Bioperl and bptutorial.pl?
I tried this:

>./bptutorial.pl
Can't locate IO/String.pm in @INC (@INC contains: .
/usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0
/usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl) at
Bio/DB/WebDBSeqI.pm line 90.
BEGIN failed--compilation aborted at Bio/DB/WebDBSeqI.pm line 90.
Compilation failed in require at Bio/DB/DBFetch.pm line 75.
BEGIN failed--compilation aborted at Bio/DB/DBFetch.pm line 75.
Compilation failed in require at Bio/DB/EMBL.pm line 103.
BEGIN failed--compilation aborted at Bio/DB/EMBL.pm line 103.
Compilation failed in require at Bio/LiveSeq/IO/BioPerl.pm line 110.
BEGIN failed--compilation aborted at Bio/LiveSeq/IO/BioPerl.pm line 110.
Compilation failed in require at ./bptutorial.pl line 3730.
BEGIN failed--compilation aborted at ./bptutorial.pl line 3730.
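The trace above means perl walked every directory listed in @INC and found no IO/String.pm. IO::String is a separate CPAN module rather than a file shipped in the bioperl tarball, so it has to be installed somewhere perl looks; directories named in PERL5LIB (or in a `use lib` line) are placed at the front of @INC. A minimal sketch for checking where a module resolves follows; the helper name locate_module is made up for this illustration and is not a BioPerl or core function.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first directory in @dirs that holds the file for $module,
# or undef if none does. (Hypothetical helper for illustration only.)
sub locate_module {
    my ($module, @dirs) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # IO::String -> IO/String.pm
    for my $dir (@dirs) {
        next if ref $dir;                      # skip @INC hooks (code refs)
        return $dir if -e "$dir/$file";
    }
    return;
}

# Report where (or whether) each module would be found with the current
# @INC, which already includes anything set through PERL5LIB.
for my $module (qw(strict IO::String Bio::SeqIO)) {
    my $dir = locate_module($module, @INC);
    printf "%-12s %s\n", $module,
        defined $dir ? "found in $dir" : "NOT in \@INC";
}
```

Running this before and after setting PERL5LIB shows whether the new directory is being searched and which modules are still missing.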
----------------
And I got the output above. How can I run bptutorial.pl successfully?

2) I am particularly interested in parsing BLAST reports, which could use
the SearchIO modules... I saw a Bio::SearchIO directory, but it sounds like
I need some guidance to get it running. Could you please point me to a FAQ
page for guidance?
I tried these pages:
http://www.bioperl.org/Core/Latest/bptutorial.html#i.1_overview
http://bioperl.org/HOWTOs/html/SearchIO.html
Are there any more?

I would appreciate your reply...

thanks,

-hank

From brian_osborne at cognia.com Mon Nov 3 09:21:37 2003
From: brian_osborne at cognia.com (Brian Osborne)
Date: Mon Nov 3 09:20:27 2003
Subject: [Bioperl-l] bioperl and Bio::Factory::EMBOSS via cgi... permissionproblems?
In-Reply-To: <6E5229FA-0E06-11D8-B0E5-0003935A86C6@lmu.de>
Message-ID:

Tobias,

>I have to run a restriction analysis through EMBOSS restrict (as
>Bio::Tools::RestrictionEnzyme has some serious problems with certain
>custom enzymes).

Did you try the new Bio::Restriction classes?

Brian O.

-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Tobias
Sent: Monday, November 03, 2003 9:03 AM
To: Bioperl
Subject: [Bioperl-l] bioperl and Bio::Factory::EMBOSS via cgi... permissionproblems?

Hi,

I have to run a restriction analysis through EMBOSS restrict (as
Bio::Tools::RestrictionEnzyme has some serious problems with certain
custom enzymes). Now everything runs fine when invoked from the command
line, but when I run the script through a browser and apache I get an
error:
Can't call method "run" on an undefined value at ..
when telling my EMBOSS program to run

It seems that I don't get a proper connection to the EMBOSS programs (I
also can't get version information, nor program descriptions...)
Could that be a permission problem of the apache user (my EMBOSS
executables are world read- and executable)?
Does anyone have a similar setup (bioperl and EMBOSS via cgi) and know
where to look?

best regards
Tobias

Dr. Tobias Straub
Molecular Biology
Adolf Butenandt Institut, LMU
Schillerstr. 44
80336 München, Germany

Tel: +49-89-5996439
Fax: +49-89-5996425

_______________________________________________
Bioperl-l mailing list
Bioperl-l@portal.open-bio.org
http://portal.open-bio.org/mailman/listinfo/bioperl-l

From d.gatherer at vir.gla.ac.uk Mon Nov 3 09:45:07 2003
From: d.gatherer at vir.gla.ac.uk (Derek Gatherer)
Date: Mon Nov 3 09:40:45 2003
Subject: [Bioperl-l] gcg.pm, another comment
In-Reply-To:
References: <5.2.1.1.1.20031031120319.00af7328@udcf.gla.ac.uk>
Message-ID: <5.2.1.1.1.20031103144005.00aa5d08@udcf.gla.ac.uk>

Hello again

Thanks for the replies, I'll get down to some work on this soon I hope.

Meanwhile, I find that the following error is also thrown unpredictably,
at gcg.pm line 137:

    if(defined $chksum) {
        unless(_validate_checksum($sequence,$chksum)) {
            $self->throw("Checksum failure on parsed sequence.");
        }

By unpredictably, I mean that the same sequence may or may not throw this
error. Some sequences seem to be a little more prone to it, but with
enough tries you can always get a sequence past it just by rerunning the
code until it works...

Can I safely comment this out? Or is it an indication of something more
deeply wrong at the heart of my OS (which is a Tru64 5.1B)?

thanks
Derek

At 11:11 31/10/2003 -0800, Hilmar Lapp wrote:
>On 10/31/03 4:13 AM, "Derek Gatherer" wrote:
>
> > Otherwise, you get:
> >
> > Use of uninitialized value in concatenation (.) or string at
> > /usr/local/lib/site_perl/5.8.0//Bio/SeqIO/gcg.pm line 197,
> >
> > The offending line in gcg.pm is:
>
>This is just a warning, and the line has no detrimental effects. So as for
>your code, you can safely ignore it.
>
>I still agree with Jason to file a bug report.
Or better yet, submit a patch >along with it if you have one ... > > -hilmar >-- >------------------------------------------------------------- >Hilmar Lapp email: lapp at gnf.org >GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 >------------------------------------------------------------- From redwards at utmem.edu Mon Nov 3 10:03:21 2003 From: redwards at utmem.edu (Rob Edwards) Date: Mon Nov 3 10:00:06 2003 Subject: [Bioperl-l] bioperl and Bio::Factory::EMBOSS via cgi... permission problems? In-Reply-To: <6E5229FA-0E06-11D8-B0E5-0003935A86C6@lmu.de> Message-ID: Tobias, You could try the newer Bio::Restriction modules available in bioperl 1.3. It should be able to handle custom enzymes a lot better. Rob On Monday, November 3, 2003, at 08:02 AM, Tobias wrote: > Hi, > > I have to run a restriction analysis through EMBOSS restrict (as > Bio::Tools::RestrictionEnzyme has some serious problems with certain > custom enzymes). Now everything runs fine when invoked from > commandline, but when I run the script through browser and apache I > get an error: > Can't call method "run" on an undefined value at .. > when telling my EMBOSS program to run > > seems that I don't get a proper connection to the EMBOSS programs (i > also can't get version information, nor program descriptions...) > > could that be permission problems of the apache user (my EMBOSS > executables are world read- and executable)? > someone has similar setup (bioperl and EMBOSS via cgi) and knows where > to look at? > > best regards > Tobias > > Dr. Tobias Straub > Molecular Biology > Adolf Butenandt Institut, LMU > Schillerstr. 
44 > 80336 M?nchen, Germany > > Tel: +49-89-5996439 > Fax: +49-89-5996425 > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l From jason at cgt.duhs.duke.edu Mon Nov 3 10:35:35 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Mon Nov 3 10:32:46 2003 Subject: [Bioperl-l] Bio::SeqIO::tigr In-Reply-To: <20031031175411.GB18009@bioinfo.ucr.edu> References: <20031031175411.GB18009@bioinfo.ucr.edu> Message-ID: Josh - would love to see it one way or another - I assume XML::Twig doesn't give anything faster/less memory? I have asked in the past if the XML - Perl gurus could give us some hard and fast rules as to what is the best set of tools to use. I think we are okay with ugly and uncommented code iff you are willing to contribute it and then work on cleaning it up. Since you don't show the code I'm not clear what is so DIFFERENT about your coding style and whether or not that this truly incompatible. The requirements we have for something that would be a SeqIO module is they have to follow the structure of SeqIO drivers, mainly they implement next_seq and write_seq and inherit from Bio::SeqIO and use the inherited _readline or _print for IO rather than <$fh> and print $fh. You can contribute it be posting it to the list, asking nicely for CVS r/w account, or submitting it as an enhancement to bugzilla.open-bio.org. Looking forward to it. -jason On Fri, 31 Oct 2003, Josh Lauricha wrote: > I've written a SeqIO parser for the tigr xml data format, and would like > to contribute it to BioPerl. However, there are a couple things I don't > really like about it but don't have the time to fix right now. Could I > get some feedback from the list regaurding each? > > First, some background. 
Since each XML file is roughly 60MB, using the > XML parsers provided by TIGR (using XML::Simple and XML::Sax, IIRC) > takes around 7-10 minutes to parse (no including BioPerl object > creation) and occationally used more than ~2.5GB of memory, which an x86 > can't handle. > > To get around this, I took advantage of the fact that these are machine > generated and parsed the entire file using regexp, only storing what is > "relavent" to retrieve a sequence. This means, the ~75 lines of code > TIGR used is around 1280. However, it uses around 250MB of memory and > (converting from TIGR to GenBank) runs in around two to three and a half > minutes, 30-60% slower than GenBank -> GenBank convertion. > > 1) The code is pretty ugly. It was one of my first "large" perl projects > and reflects that. The uglyness is partially due to my inexperiance > at the time, and partially do to the ugliness of the problem. > > 2) Its not very well commented, ok its not commented. This isn't too big > a problem, as everything acts basically the same way, and once > someone understands that the rest is easy. (Its really just the same > thing over and over). Its just fairly bad form. > > 3) The memory usage (and runtime) could be improved by one or more of: > a) Storing everything directly into objects rather than a tree > b) Using arrays to store everything rather than hashes > c) Ignoring any tags that aren't actually used. > > 4) The coding style is nothing like the rest of BioPerl's. Mainly > because, I prefer this style (PERSONAL preference, no flames, > everyone gets their own oppinion). This is bad for a project, > but in all honesty if I need to drastically change my coding > style I will probably never get around to fixing up this code. > > 5) There is quite a long delay before anything is actually accessible > because the nucleotide data is given at the end of the files > (actually, at the end of an ASSEMBLY tag) so everything before it > needs to be parsed. 
This leads to the first ->next_seq() call taking > a significant time. > > Since I can't show you what the object looks like, I'll show you what > the GenBank file looks like. An example of the genbank file is at: > > http://bioinfo.ucr.edu/cgi-bin/seqfetch.pl?database=all&accession=At1g03870 > > Thanks for your time, > > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From amackey at pcbi.upenn.edu Mon Nov 3 11:35:53 2003 From: amackey at pcbi.upenn.edu (Aaron J.Mackey) Date: Mon Nov 3 11:32:39 2003 Subject: [Bioperl-l] Bio::SeqIO::tigr In-Reply-To: References: <20031031175411.GB18009@bioinfo.ucr.edu> Message-ID: IANAXPG[1], but: Leanest and meanest: XML::SAX event-based "push" parsing Leaner and potentially very friendly: XML::SAX::PullParser (unfortunately, not yet implemented) Potentially lean and slightly friendlier to non-event-based programming: XML::Twig (sub-DOM) Fat and friendly: XML::Simple (full-DOM) -Aaron [1] I Am Not An XML-Perl Guru On Nov 3, 2003, at 10:35 AM, Jason Stajich wrote: > Josh - would love to see it one way or another - I assume XML::Twig > doesn't give anything faster/less memory? I have asked in the past if > the > XML - Perl gurus could give us some hard and fast rules as to what is > the > best set of tools to use. > > I think we are okay with ugly and uncommented code iff you are willing > to > contribute it and then work on cleaning it up. Since you don't show > the > code I'm not clear what is so DIFFERENT about your coding style and > whether or not that this truly incompatible. The requirements we have > for > something that would be a SeqIO module is they have to follow the > structure of SeqIO drivers, mainly they implement next_seq and > write_seq > and inherit from Bio::SeqIO and use the inherited _readline or _print > for > IO rather than <$fh> and print $fh. > > You can contribute it be posting it to the list, asking nicely for CVS > r/w account, or submitting it as an enhancement to > bugzilla.open-bio.org. 
> Looking forward to it. > > -jason > > On Fri, 31 Oct 2003, Josh Lauricha wrote: > >> I've written a SeqIO parser for the tigr xml data format, and would >> like >> to contribute it to BioPerl. However, there are a couple things I >> don't >> really like about it but don't have the time to fix right now. Could I >> get some feedback from the list regaurding each? >> >> First, some background. Since each XML file is roughly 60MB, using the >> XML parsers provided by TIGR (using XML::Simple and XML::Sax, IIRC) >> takes around 7-10 minutes to parse (no including BioPerl object >> creation) and occationally used more than ~2.5GB of memory, which an >> x86 >> can't handle. >> >> To get around this, I took advantage of the fact that these are >> machine >> generated and parsed the entire file using regexp, only storing what >> is >> "relavent" to retrieve a sequence. This means, the ~75 lines of code >> TIGR used is around 1280. However, it uses around 250MB of memory and >> (converting from TIGR to GenBank) runs in around two to three and a >> half >> minutes, 30-60% slower than GenBank -> GenBank convertion. >> >> 1) The code is pretty ugly. It was one of my first "large" perl >> projects >> and reflects that. The uglyness is partially due to my inexperiance >> at the time, and partially do to the ugliness of the problem. >> >> 2) Its not very well commented, ok its not commented. This isn't too >> big >> a problem, as everything acts basically the same way, and once >> someone understands that the rest is easy. (Its really just the >> same >> thing over and over). Its just fairly bad form. >> >> 3) The memory usage (and runtime) could be improved by one or more of: >> a) Storing everything directly into objects rather than a tree >> b) Using arrays to store everything rather than hashes >> c) Ignoring any tags that aren't actually used. >> >> 4) The coding style is nothing like the rest of BioPerl's. 
Mainly >> because, I prefer this style (PERSONAL preference, no flames, >> everyone gets their own oppinion). This is bad for a project, >> but in all honesty if I need to drastically change my coding >> style I will probably never get around to fixing up this code. >> >> 5) There is quite a long delay before anything is actually accessible >> because the nucleotide data is given at the end of the files >> (actually, at the end of an ASSEMBLY tag) so everything before it >> needs to be parsed. This leads to the first ->next_seq() call >> taking >> a significant time. >> >> Since I can't show you what the object looks like, I'll show you what >> the GenBank file looks like. An example of the genbank file is at: >> >> http://bioinfo.ucr.edu/cgi-bin/seqfetch.pl? >> database=all&accession=At1g03870 >> >> Thanks for your time, >> >> > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > From rnokes at bearingpoint.net Mon Nov 3 11:50:00 2003 From: rnokes at bearingpoint.net (Nokes, Rebecca (BearingPoint)) Date: Mon Nov 3 11:46:58 2003 Subject: [Bioperl-l] (no subject) Message-ID: Please add me to your list. thx ****************************************************************************** The information in this email is confidential and may be legally privileged. Access to this email by anyone other than the intended addressee is unauthorized. If you are not the intended recipient of this message, any review, disclosure, copying, distribution, retention, or any action taken or omitted to be taken in reliance on it is prohibited and may be unlawful. If you are not the intended recipient, please reply to or forward a copy of this message to the sender and delete the message, any attachments, and any copies thereof from your system. 
****************************************************************************** From brian_osborne at cognia.com Mon Nov 3 12:14:19 2003 From: brian_osborne at cognia.com (Brian Osborne) Date: Mon Nov 3 12:13:29 2003 Subject: [Bioperl-l] (no subject) In-Reply-To: Message-ID: Rebecca, You need to do this yourself since you may want to supply your own password. http://bioperl.org/mailman/listinfo/bioperl-l Brian O. -----Original Message----- From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Nokes, Rebecca (BearingPoint) Sent: Monday, November 03, 2003 11:50 AM To: 'bioperl-l@bioperl.org' Subject: [Bioperl-l] (no subject) Please add me to your list. thx **************************************************************************** ** The information in this email is confidential and may be legally privileged. Access to this email by anyone other than the intended addressee is unauthorized. If you are not the intended recipient of this message, any review, disclosure, copying, distribution, retention, or any action taken or omitted to be taken in reliance on it is prohibited and may be unlawful. If you are not the intended recipient, please reply to or forward a copy of this message to the sender and delete the message, any attachments, and any copies thereof from your system. **************************************************************************** ** _______________________________________________ Bioperl-l mailing list Bioperl-l@portal.open-bio.org http://portal.open-bio.org/mailman/listinfo/bioperl-l From cain at cshl.org Mon Nov 3 12:41:02 2003 From: cain at cshl.org (Scott Cain) Date: Mon Nov 3 12:37:49 2003 Subject: [Bioperl-l] problems with Bio::Tools::GFF Message-ID: <1067881262.1436.47.camel@localhost.localdomain> Hi Jason and Lincoln, I have a few concerns with Bio::Tools::GFF. The first is with the method _from_gff3_string, which does a split on \t to separate columns. 
I think the GFF3 spec says it can be space delimited, so that should probably be \s+. Additionally, to split the groups column, it uses \s*;\s*, but I think that spaces have to be escaped, therefore, it should only split on ; and spaces would indicate a problem (especially if one splits on spaces as indicated above). Finally, it doesn't provide a method of accessing the sequence that is optionally at the bottom of the file. I am not exactly sure how to implement that (or I would), but I suspect it will have to be handled in the next_feature method. Of course, the problem with handling it there is that it is not a feature. Scott -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain@cshl.org GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory From Matthew.Betts at ii.uib.no Mon Nov 3 13:06:24 2003 From: Matthew.Betts at ii.uib.no (Matthew Betts) Date: Mon Nov 3 13:03:14 2003 Subject: [Bioperl-l] SeqFeatureI->spliced_translation Message-ID: Hi, I've written a method 'spliced_translation' for Bio::SeqFeatureI that translates a spliced sequence and deals with any codon exceptions. It is really just glue between the existing SeqFeatureI->spliced_seq and PrimarySeqI->translate, but can deal with codons that are non-standard across the whole sequence ('/codon' in GenBank feature tables) and codons that are non-standard at specific locations ('/transl_except'). I mainly use it to check the conceptual translation against that given in the feature tables. I could do a bit (a lot...) of polishing (suggestions welcome) if it's useful to anyone else? Matt -- Matthew Betts, mailto:matthew.betts@ii.uib.no Phone: (+47) 55 58 40 22, Fax: (+47) 55 58 42 95 CBU, BCCS, UNIFOB / Universitetet i Bergen Thormøhlensgt.
55, N-5008 Bergen, Norway
--------------------

sub spliced_translation {
    my $self = shift;
    my $db = shift;
    my $not_concept = shift; # if $not_concept is defined, will return the sequence
                             # given by the /translation qualifier rather than the
                             # conceptual translation. All the checks are still done.

    my $complete5;
    my $complete3;
    my $frame;
    my $table;
    my $loc_factory;
    my @exceptions;
    my $except;
    my $cds;
    my $trans;
    my @locs;
    my $loc;
    my $fstrand;
    my $mixed;
    my $ft_trans;
    my $trans_aa;
    my $cdna_start;
    my $na_e_start;
    my $aa_e_pos;

    # FIXME - improve the warnings. Also allow 'throw' if requested

    if($self->primary_tag ne 'CDS') {
        $self->warn("Calling spliced_seq on a feature which is not a CDS");
    }

    # is the whole sequence of the CDS known?
    if(defined($self->location->strand) and ($self->location->strand == -1)) {
        $complete5 = ($self->location->to_FTstring =~ />/) ? 0 : 1;
        $complete3 = ($self->location->to_FTstring =~ /</) ? 0 : 1;
    }
    else {
        $complete5 = ($self->location->to_FTstring =~ /</) ? 0 : 1;
        $complete3 = ($self->location->to_FTstring =~ />/) ? 0 : 1;
    }

    # find the reading frame before translating...
    $frame = 0;
    if($self->has_tag('codon_start')) {
        $frame = join '', $self->get_tag_values('codon_start');
        # '/codon_start' tags are 1, 2, or 3, but bioperl
        # uses 0, 1, or 2 to indicate reading frame, so...
        $frame--;
    }

    # ...and the codon table too
    $table = 1;
    if($self->has_tag('transl_table')) {
        $table = join '', $self->get_tag_values('transl_table');
    }

    $cds = $self->spliced_seq($db);
    $trans = $cds->translate(undef, undef, $frame, $table, undef, undef, $complete5, $complete3);

    # the following exceptions should ideally be dealt with
    # by translate, except that the single codon exceptions
    # need to know about locations on the genomic sequence

    $trans_aa = $trans->seq;

    # deal with codons that differ from the reference genetic
    # code ('/codon' qualifiers in gb feature table)
    @exceptions = ();
    if($self->has_tag('codon')) {
        foreach $except ($self->get_tag_values('codon')) {
            $except =~ s/\s+//g;  # spaces are meaningless here
            $except =~ s/["']//g; # don't need quotes either
            if($except =~ /seq:(...),aa:(.*)\)/) {
                my $codon = $1;
                my $aa_temp = substr($2, 0, 3);
                $aa_temp =~ s/(.)(..)/\u$1\L$2/; # seq3in() expects first letter as capital, rest lower-case
                my $aa = Bio::Seq->new('-alphabet' => 'protein');
                Bio::SeqUtils->seq3in($aa, $aa_temp);
                push @exceptions, {
                    'codon' => $codon,
                    'aa' => $aa->seq,
                };
            }
        }
    }

    my @codons = grep(!/\A\Z/, split(/(...)/, substr($cds->seq, $frame)));
    $aa_e_pos = 0;
    foreach my $codon (@codons) {
        foreach $except (@exceptions) {
            ($except->{'codon'} =~ /$codon/i) and (substr($trans_aa, $aa_e_pos, 1) = $except->{'aa'});
        }
        $aa_e_pos++;
    }

    # deal with single non standard codons
    # ('/transl_except' qualifiers in gb feature table)
    $loc_factory = Bio::Factory::FTLocationFactory->new();
    @exceptions = ();
    if($self->has_tag('transl_except')) {
        foreach $except ($self->get_tag_values('transl_except')) {
            $except =~ s/\s+//g; # spaces are meaningless here
            if($except =~ /\(pos:(.*?),aa:(.*)\)/) {
                my $loc_str = $1;
                my $aa_temp = substr($2, 0, 3);
                $aa_temp =~ s/(.)(..)/\u$1\L$2/; # seq3in() expects first letter as capital, rest lower-case
                my $aa = Bio::Seq->new('-alphabet' => 'protein');
                Bio::SeqUtils->seq3in($aa, $aa_temp);
                push @exceptions, {
                    'loc' => $loc_factory->from_string($loc_str),
                    'aa' => $aa->seq,
                };
            }
            else {
                $self->warn("Cannot parse translation exception '$except'");
            }
        }
    }

    # order the locations in the same way that spliced_seq does
    @locs = $self->location->each_Location;
    foreach $loc (@locs) {
        defined($fstrand) or ($fstrand = $loc->strand);
        if(defined($loc->strand) and ($fstrand != $loc->strand)) {
            $mixed = 1;
            last;
        }
    }

    if($mixed) {
        $self->warn("Mixed strand locations, spliced seq using the input order rather than trying to sort");
    }
    elsif($fstrand == -1) {
        @locs = reverse $self->location->each_Location;
    }

    # pair up any translation exceptions with
    # their corresponding sub locations, and
    # calculate the position in the amino acid
    # sequence that is exceptional

    $cdna_start = 1 - $frame; # start position of the current segment in the na seq of the cds
    $na_e_start = 0;          # position of the exception on the na seq of the cds
    $aa_e_pos = 0;            # position of the exception on the aa seq of the cds

    # there might be a clever way to avoid the following if-else...
    if(!$mixed and ($fstrand == -1)) {
        foreach $loc (@locs) {
            foreach $except (@exceptions) {
                if($loc->overlaps($except->{'loc'})) {
                    $na_e_start = $loc->end - $except->{'loc'}->end + $cdna_start;
                    $aa_e_pos = ($na_e_start + 2) / 3;

                    # Ignore this position if it is off the end of
                    # the sequence. This can happen when the
                    # exception is for a non-standard stop codon.
                    ($aa_e_pos > $trans->length) and next;

                    # Otherwise, replace the aa in the translation
                    $aa_e_pos--; # positions above start at one
                    substr($trans_aa, $aa_e_pos, 1) = $except->{'aa'};
                }
            }
            $cdna_start += ($loc->end - $loc->start + 1);
        }
        $trans->seq($trans_aa);
    }
    else {
        foreach $loc (@locs) {
            foreach $except (@exceptions) {
                if($loc->overlaps($except->{'loc'})) {
                    $na_e_start = $except->{'loc'}->start - $loc->start + $cdna_start;
                    $aa_e_pos = ($na_e_start + 2) / 3;

                    # Ignore this position if it is off the end of
                    # the sequence. This can happen when the
                    # exception is for a non-standard stop codon.
                    ($aa_e_pos > $trans->length) and next;

                    # Otherwise, replace the aa in the translation
                    $aa_e_pos--; # positions above start at one
                    substr($trans_aa, $aa_e_pos, 1) = $except->{'aa'};
                }
            }
            $cdna_start += ($loc->end - $loc->start + 1);
        }
        $trans->seq($trans_aa);
    }

    # check that the translation matches that given
    # by the feature table. This is better than just
    # using the feature table translation, since it
    # is also an indirect check that the DNA has been
    # spliced together correctly
    $ft_trans = undef;
    if($self->has_tag('translation')) {
        $ft_trans = join '', $self->get_tag_values('translation');
    }

    if(defined($ft_trans)) {
        if($trans->seq !~ /^$ft_trans/) {
            my $display_id = $self->seq->display_id;
            $self->warn("Translated sequence '$display_id' does not match '/translation' tag");
        }
    }
    else {
        $self->warn("Warning: no translation tag so can't check");
    }

    $not_concept and $ft_trans and $trans->seq($ft_trans);

    return $trans, $complete5, $complete3, $frame, $table;
}

From tobias.straub at lmu.de Mon Nov 3 13:22:09 2003 From: tobias.straub at lmu.de (Tobias) Date: Mon Nov 3 13:19:25 2003 Subject: [Bioperl-l] bioperl and Bio::Factory::EMBOSS via cgi...permissionproblems? In-Reply-To: Message-ID: Brian, Rob, thanks. That was indeed what I was waiting for! BUT: unfortunately it seems that with the new classes it is not possible to get the cut sites of a single enzyme, only the fragments (which are of course internally calculated from the cut sites). Second: my test digest with a non-palindromic cutter (BsaI) gave a wrong result. Seems I have to digest "manually" ... or wait for the stable release? thanks anyway Tobias On Monday, 03.11.03 at 15:21, Brian Osborne wrote: > Tobias, > >> I have to run a restriction analysis through EMBOSS restrict (as >> Bio::Tools::RestrictionEnzyme has some serious problems with certain >> custom enzymes). > > Did you try the new Bio::Restriction classes? > > > Brian O.
> > > -----Original Message----- > From: bioperl-l-bounces@portal.open-bio.org > [mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Tobias > Sent: Monday, November 03, 2003 9:03 AM > To: Bioperl > Subject: [Bioperl-l] bioperl and Bio::Factory::EMBOSS via cgi... > permissionproblems? > > Hi, > > I have to run a restriction analysis through EMBOSS restrict (as > Bio::Tools::RestrictionEnzyme has some serious problems with certain > custom enzymes). Now everything runs fine when invoked from > commandline, but when I run the script through browser and apache I > get an error: > Can't call method "run" on an undefined value at .. > when telling my EMBOSS program to run > > seems that I don't get a proper connection to the EMBOSS programs (i > also can't get version information, nor program descriptions...) > > could that be permission problems of the apache user (my EMBOSS > executables are world read- and executable)? > someone has similar setup (bioperl and EMBOSS via cgi) and knows where > to look at? > > best regards > Tobias > > Dr. Tobias Straub > Molecular Biology > Adolf Butenandt Institut, LMU > Schillerstr. 44 > 80336 M?nchen, Germany > > Tel: +49-89-5996439 > Fax: +49-89-5996425 > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > From jason at cgt.duhs.duke.edu Mon Nov 3 14:13:36 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Mon Nov 3 14:10:26 2003 Subject: [Bioperl-l] Re: problems with Bio::Tools::GFF In-Reply-To: <1067881262.1436.47.camel@localhost.localdomain> References: <1067881262.1436.47.camel@localhost.localdomain> Message-ID: Feel free to fix it to spec Scott. 
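As a rough illustration of the splitting under discussion, here is a minimal sketch. It is not the Bio::Tools::GFF code: parse_gff3_line is a hypothetical helper, and it assumes tab-separated columns, ';'-separated tag=value pairs, comma-separated multiple values, and %XX escapes in values. Whether that matches the spec is exactly what needs checking against test cases.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl. Assumptions: nine tab-separated
# columns; attributes are ';'-separated tag=value pairs; multiple values are
# comma-separated; reserved characters in values are %XX-escaped.
sub parse_gff3_line {
    my ($line) = @_;
    chomp $line;

    # -1 keeps trailing empty fields so the column count is reliable
    my @col = split /\t/, $line, -1;
    return unless @col == 9;

    my %attr;
    foreach my $pair (split /;/, $col[8]) {
        next unless length $pair;
        my ($tag, $value) = split /=/, $pair, 2;
        next unless defined $value;

        # split on ',' first, then unescape each value
        my @values = split /,/, $value;
        s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge for @values;
        $attr{$tag} = \@values;
    }

    my %feature;
    @feature{qw(seqid source type start end score strand phase)} = @col[0..7];
    $feature{attributes} = \%attr;
    return \%feature;
}

my $f = parse_gff3_line("ctg123\t.\texon\t1300\t1500\t.\t+\t.\tID=exon00001;Note=first%20exon\n");
print "$f->{type} $f->{start}-$f->{end} Note=$f->{attributes}{Note}[0]\n";
# prints "exon 1300-1500 Note=first exon"
```

Splitting the last column only on ';' means an unescaped space stays inside the value instead of silently breaking it, which makes malformed input easier to spot in tests.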
Note that I have also made no attempt to parse/write the Gap or Alignment stuff in any sort of special way - I basically made it so it supports what GFF2 currently looks like only in GFF3 flavor. Perhaps it makes sense to do all of that work on Chris's Unflattner though rather than in Tools::GFF. A SeqFeature::Tools::Flattner is probably in order as well to turn HSPs and other paired sequences into GFF3 Alignments. As for the seq stuff - will likely need a Bio::SeqIO::gff3 for that. Anyone is welcome to add these changes - I don't think I'll be able to make many contributions until December so it would be best if someone else took it on. -jason On Mon, 3 Nov 2003, Scott Cain wrote: > Hi Jason and Lincoln, > > I have a few concerns with Bio::Tools::GFF. The first is with the method > _from_gff3_string, which does a split on \t to separate columns. I > think the GFF3 spec says it can be space delimited, so that should > probably be \s+. Additionally, to split the groups column, it uses > \s*;\s*, but I think that spaces have to be escaped, therefore, it > should only split on ; and spaces would indicate a problem (especially > if one splits on spaces as indicated above). > > Finally, it doesn't provide a method of accessing the sequence that is > optionally at the bottom of the file. I am not exactly sure how to > implement that (or I would), but I suspect it will have to be handled in > the next_feature method. Of course, the problem with handling it there > is that it is not a feature. > > Scott > > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From cain at cshl.org Mon Nov 3 14:33:52 2003 From: cain at cshl.org (Scott Cain) Date: Mon Nov 3 14:30:35 2003 Subject: [Bioperl-l] Re: problems with Bio::Tools::GFF In-Reply-To: References: <1067881262.1436.47.camel@localhost.localdomain> Message-ID: <1067888032.3457.81.camel@localhost.localdomain> On Mon, 2003-11-03 at 14:13, Jason Stajich wrote: > Feel free to fix it to spec Scott. 
Will do--I mentioned it because I am always concerned that I am misinterpreting the spec; if I codify my misinterpretations, that would kind of shoot the idea of standard out the window. > > Note that I have also made no attempt to parse/write the Gap or Alignment > stuff in any sort of special way - I basically made it so it supports what > GFF2 currently looks like only in GFF3 flavor. Perhaps it makes sense to > do all of that work on Chris's Unflattner though rather than in > Tools::GFF. A SeqFeature::Tools::Flattner is probably in order as well to > turn HSPs and other paired sequences into GFF3 Alignments. I'm not sure it's necessary to move to Unflattener. Since the format is fairly simple, it is only really necessary to split the information in the groups column to tag value pairs and let the user decide what to do with the information. The only thing that I am somewhat at a loss to deal with is cigar line info, but I don't think that is being parse by Bio::DB::GFF yet either. > > As for the seq stuff - will likely need a Bio::SeqIO::gff3 for that. > Ouch--I was afraid you were going to suggest that. I suppose if we make it a read-only module, I guess that should be ok. The thought of making it write makes my head hurt. > Anyone is welcome to add these changes - I don't think I'll be able to > make many contributions until December so it would be best if someone else > took it on. > > -jason > > On Mon, 3 Nov 2003, Scott Cain wrote: > > > Hi Jason and Lincoln, > > > > I have a few concerns with Bio::Tools::GFF. The first is with the method > > _from_gff3_string, which does a split on \t to separate columns. I > > think the GFF3 spec says it can be space delimited, so that should > > probably be \s+. Additionally, to split the groups column, it uses > > \s*;\s*, but I think that spaces have to be escaped, therefore, it > > should only split on ; and spaces would indicate a problem (especially > > if one splits on spaces as indicated above). 
> > > > Finally, it doesn't provide a method of accessing the sequence that is > > optionally at the bottom of the file. I am not exactly sure how to > > implement that (or I would), but I suspect it will have to be handled in > > the next_feature method. Of course, the problem with handling it there > > is that it is not a feature. > > > > Scott > > > > > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain@cshl.org GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory From jason at cgt.duhs.duke.edu Mon Nov 3 15:47:13 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Mon Nov 3 15:43:51 2003 Subject: [Bioperl-l] Re: problems with Bio::Tools::GFF In-Reply-To: <1067888032.3457.81.camel@localhost.localdomain> References: <1067881262.1436.47.camel@localhost.localdomain> <1067888032.3457.81.camel@localhost.localdomain> Message-ID: On Mon, 3 Nov 2003, Scott Cain wrote: > On Mon, 2003-11-03 at 14:13, Jason Stajich wrote: > > Feel free to fix it to spec Scott. > > Will do--I mentioned it because I am always concerned that I am > misinterpreting the spec; if I codify my misinterpretations, that would > kind of shoot the idea of standard out the window. Well given that I just found the published spec online http://song.sourceforge.net/gff3.shtml I had been basing things off of Lincoln's earlier emails so I really didn't pay much attention to all of that. I am a bit wary of splitting on space wrt the last column but so we'll have to cook up some test cases to make sure it goes through okay. > > > > Note that I have also made no attempt to parse/write the Gap or Alignment > > stuff in any sort of special way - I basically made it so it supports what > > GFF2 currently looks like only in GFF3 flavor. Perhaps it makes sense to > > do all of that work on Chris's Unflattner though rather than in > > Tools::GFF. 
A SeqFeature::Tools::Flattner is probably in order as well to > > turn HSPs and other paired sequences into GFF3 Alignments. > > I'm not sure it's necessary to move to Unflattener. Since the format is > fairly simple, it is only really necessary to split the information in > the groups column to tag value pairs and let the user decide what to do > with the information. The only thing that I am somewhat at a loss to > deal with is cigar line info, but I don't think that is being parse by > Bio::DB::GFF yet either. One day I could imagine us building Gene/Transcript objects from the GFF3. Actually I was thinking we'd need a Flattner to turn the Gene object back into flattened features. Likewise with HSP objects and alignments. I can't produce CIGAR lines currently from HSPs - I'm still a little confused about how to construct them but it means I need to read the spec a little more probably. > > > > As for the seq stuff - will likely need a Bio::SeqIO::gff3 for that. > > > Ouch--I was afraid you were going to suggest that. I suppose if we make > it a read-only module, I guess that should be ok. The thought of making > it write makes my head hurt. For writing multiple sequences, could be pretty ugly. Either some caching OR a special write_seq which takes an arrayref. Maybe not a SeqIO after all.... unless GFF3 lets a new set start with # gff-version 3 so you could interleave them? # gff-version 3 ... ##FASTA >oneseq.1 CAGT # gff-version 3 ... ## FASTA >oneseq.2 GATC For reading sequences next_seq will have to parse in the entire GFF file at once and next_seq will have to iterate through an internal array I guess. Not that hard I hope... > > > Anyone is welcome to add these changes - I don't think I'll be able to > > make many contributions until December so it would be best if someone else > > took it on. > > > > -jason > > > > On Mon, 3 Nov 2003, Scott Cain wrote: > > > > > Hi Jason and Lincoln, > > > > > > I have a few concerns with Bio::Tools::GFF. 
The first is with the method > > > _from_gff3_string, which does a split on \t to separate columns. I > > > think the GFF3 spec says it can be space delimited, so that should > > > probably be \s+. Additionally, to split the groups column, it uses > > > \s*;\s*, but I think that spaces have to be escaped, therefore, it > > > should only split on ; and spaces would indicate a problem (especially > > > if one splits on spaces as indicated above). > > > > > > Finally, it doesn't provide a method of accessing the sequence that is > > > optionally at the bottom of the file. I am not exactly sure how to > > > implement that (or I would), but I suspect it will have to be handled in > > > the next_feature method. Of course, the problem with handling it there > > > is that it is not a feature. > > > > > > Scott > > > > > > > > > > -- > > Jason Stajich > > Duke University > > jason at cgt.mc.duke.edu > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From laurichj at bioinfo.ucr.edu Mon Nov 3 15:57:23 2003 From: laurichj at bioinfo.ucr.edu (Josh Lauricha) Date: Mon Nov 3 15:54:12 2003 Subject: [Bioperl-l] Bio::SeqIO::tigr In-Reply-To: References: <20031031175411.GB18009@bioinfo.ucr.edu> Message-ID: <20031103205723.GA26644@bioinfo.ucr.edu> On Mon 11/03/03 10:35, Jason Stajich wrote: > Josh - would love to see it one way or another - I assume XML::Twig > doesn't give anything faster/less memory? I have asked in the past if the > XML - Perl gurus could give us some hard and fast rules as to what is the > best set of tools to use. Not an XML Guru, but Twig seems to work really well in basically anycase I've come acrossed. Twig would work... if TIGR wasn't stupid. The XML file is roughly: ... ... .. The assembly tag is the basically the entire file and, unfortunatly, has needed information. So, twig doesn't quite work. I guess if there is a way to tell it to mix the SAX way and TWIG way that might work... But, if there is I don't know about it. 
Actually, as of writing this email, I just dug up a Twig based XML parser for Tigr... however all it does is spit out the IDs, descriptions and TU coords (no sequences), this takes ~110MB of RAM and almost 4 Minutes. On the same file, mine is ~246MB RAM and an extra minute, however its a full GenBank dump. > I think we are okay with ugly and uncommented code iff you are willing to > contribute it and then work on cleaning it up. Since you don't show the > code I'm not clear what is so DIFFERENT about your coding style and > whether or not that this truly incompatible. The reason its wasn't attached is because I've had issues with posting attached files before. The different style is more or less just with idents, so stuff I've seen like: if() { foreach () { if() { do something } } } is: if() { foreach () { if() { do something } } } Basically, just because I can't read it. > The requirements we have for > something that would be a SeqIO module is they have to follow the > structure of SeqIO drivers, mainly they implement next_seq and write_seq > and inherit from Bio::SeqIO and use the inherited _readline or _print for > IO rather than <$fh> and print $fh. My module uses the _readline interface. write_seq isn't implemented because it doesn't make any sense to do so. > You can contribute it be posting it to the list, asking nicely for CVS > r/w account, or submitting it as an enhancement to bugzilla.open-bio.org. > Looking forward to it. I'll post another e-mail will it attached. > > -jason > > On Fri, 31 Oct 2003, Josh Lauricha wrote: > > > I've written a SeqIO parser for the tigr xml data format, and would like > > to contribute it to BioPerl. However, there are a couple things I don't > > really like about it but don't have the time to fix right now. Could I > > get some feedback from the list regaurding each? > > > > First, some background. 
Since each XML file is roughly 60MB, using the > > XML parsers provided by TIGR (using XML::Simple and XML::Sax, IIRC) > > takes around 7-10 minutes to parse (no including BioPerl object > > creation) and occationally used more than ~2.5GB of memory, which an x86 > > can't handle. > > > > To get around this, I took advantage of the fact that these are machine > > generated and parsed the entire file using regexp, only storing what is > > "relavent" to retrieve a sequence. This means, the ~75 lines of code > > TIGR used is around 1280. However, it uses around 250MB of memory and > > (converting from TIGR to GenBank) runs in around two to three and a half > > minutes, 30-60% slower than GenBank -> GenBank convertion. > > > > 1) The code is pretty ugly. It was one of my first "large" perl projects > > and reflects that. The uglyness is partially due to my inexperiance > > at the time, and partially do to the ugliness of the problem. > > > > 2) Its not very well commented, ok its not commented. This isn't too big > > a problem, as everything acts basically the same way, and once > > someone understands that the rest is easy. (Its really just the same > > thing over and over). Its just fairly bad form. > > > > 3) The memory usage (and runtime) could be improved by one or more of: > > a) Storing everything directly into objects rather than a tree > > b) Using arrays to store everything rather than hashes > > c) Ignoring any tags that aren't actually used. > > > > 4) The coding style is nothing like the rest of BioPerl's. Mainly > > because, I prefer this style (PERSONAL preference, no flames, > > everyone gets their own oppinion). This is bad for a project, > > but in all honesty if I need to drastically change my coding > > style I will probably never get around to fixing up this code. 
> > > > 5) There is quite a long delay before anything is actually accessible > > because the nucleotide data is given at the end of the files > > (actually, at the end of an ASSEMBLY tag) so everything before it > > needs to be parsed. This leads to the first ->next_seq() call taking > > a significant time. > > > > Since I can't show you what the object looks like, I'll show you what > > the GenBank file looks like. An example of the genbank file is at: > > > > http://bioinfo.ucr.edu/cgi-bin/seqfetch.pl?database=all&accession=At1g03870 > > > > Thanks for your time, > > > > > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- ---------------------------- | Josh Lauricha | | laurichj@bioinfo.ucr.edu | | Bioinformatics, UCR | |--------------------------| From jason at cgt.duhs.duke.edu Mon Nov 3 16:00:54 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Mon Nov 3 15:57:44 2003 Subject: [Bioperl-l] how to set first line of genbank file In-Reply-To: <200311031754218.SM01160@magicpc> References: <200311031754218.SM01160@magicpc> Message-ID: You'll want to read up on Bio::Seq::RichSeq as that is where these 'extra' fields come from for GenBank/EMBL writing LOCUS name part comes from $seq->display_id Length is based on the length of the sequence. $seq->molecule defines what goes in as 'DNA' $seq->is_circular() is how 'linear' gets set. $seq->division() sets the division (PLN) $seq->get_dates() gets the dates (the 1st one in the list is what is put here). See Bio::Seq::RichSeq for add_dates() method to add your own. It is a little bit of work to remove a date, ask if you need to do this. Barring this sort of information (above) you may have to read Bio::SeqIO::genbank I did add a table to the documentation which attempts to map these fields into the Bioperl data structures for you. 
[from Bio::SeqIO::genbank]

Items listed as Annotation 'NAME' tell you the data is stored in the
associated Bio::Annotation::Collection object which is associated with
Bio::Seq objects. If it is explicitly requested that no annotations
should be stored when parsing a record, of course they won't be
available when you try to get them. If you are having this problem,
look at the type of SeqBuilder that is being used to construct your
sequence object.

 Comments               Annotation 'comment'
 References             Annotation 'reference'
 Segment                Annotation 'segment'
 Origin                 Annotation 'origin'

 Accessions             PrimarySeq accession_number()
 Secondary accessions   RichSeq    get_secondary_accessions()
 Keywords               RichSeq    keywords()
 Dates                  RichSeq    get_dates()
 Molecule               RichSeq    molecule()
 Seq Version            RichSeq    seq_version()
 PID                    RichSeq    pid()
 Division               RichSeq    division()
 Features               Seq        get_SeqFeatures()
 Alphabet               PrimarySeq alphabet()
 Definition             PrimarySeq description() or desc()
 Version                PrimarySeq version()

 Sequence               PrimarySeq seq()

-jason

On Mon, 3 Nov 2003, Magic Fang wrote:

> the standard first line of genbank file is:
> LOCUS OSA277468 17385 bp DNA linear PLN 23-OCT-2003
> how to set the it when use bioperl to create genbank file.
> thank u.
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From jason at cgt.duhs.duke.edu Mon Nov 3 16:04:40 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Mon Nov 3 16:01:42 2003
Subject: [Bioperl-l] Bio::SeqIO::tigr
In-Reply-To: <20031103205723.GA26644@bioinfo.ucr.edu>
References: <20031031175411.GB18009@bioinfo.ucr.edu> <20031103205723.GA26644@bioinfo.ucr.edu>
Message-ID:

> attached files before.
The different style is more or less just with > idents, so stuff I've seen like: > > if() { > foreach () { > if() { > do something > } > } > } > > is: > > if() { > foreach () { > if() { > do something > } > } > } > So it SHOULD be like the second part ^^^^^^^. We did get called fascist by Chad (in jest) for asking for this... We actually prefer 4 spaces and no tabs. I just let my emacs take over and frequently reformat people's code when I can't read it... That probably makes me a tyrant, but such is life... =) > Basically, just because I can't read it. > > > > The requirements we have for > > something that would be a SeqIO module is they have to follow the > > structure of SeqIO drivers, mainly they implement next_seq and write_seq > > and inherit from Bio::SeqIO and use the inherited _readline or _print for > > IO rather than <$fh> and print $fh. > > My module uses the _readline interface. write_seq isn't implemented > because it doesn't make any sense to do so. > sure - sounds good. > > You can contribute it be posting it to the list, asking nicely for CVS > > r/w account, or submitting it as an enhancement to bugzilla.open-bio.org. > > Looking forward to it. > > I'll post another e-mail will it attached. > > > > > -jason > > > > On Fri, 31 Oct 2003, Josh Lauricha wrote: > > > > > I've written a SeqIO parser for the tigr xml data format, and would like > > > to contribute it to BioPerl. However, there are a couple things I don't > > > really like about it but don't have the time to fix right now. Could I > > > get some feedback from the list regaurding each? > > > > > > First, some background. Since each XML file is roughly 60MB, using the > > > XML parsers provided by TIGR (using XML::Simple and XML::Sax, IIRC) > > > takes around 7-10 minutes to parse (no including BioPerl object > > > creation) and occationally used more than ~2.5GB of memory, which an x86 > > > can't handle. 
> > > > > > To get around this, I took advantage of the fact that these are machine > > > generated and parsed the entire file using regexp, only storing what is > > > "relavent" to retrieve a sequence. This means, the ~75 lines of code > > > TIGR used is around 1280. However, it uses around 250MB of memory and > > > (converting from TIGR to GenBank) runs in around two to three and a half > > > minutes, 30-60% slower than GenBank -> GenBank convertion. > > > > > > 1) The code is pretty ugly. It was one of my first "large" perl projects > > > and reflects that. The uglyness is partially due to my inexperiance > > > at the time, and partially do to the ugliness of the problem. > > > > > > 2) Its not very well commented, ok its not commented. This isn't too big > > > a problem, as everything acts basically the same way, and once > > > someone understands that the rest is easy. (Its really just the same > > > thing over and over). Its just fairly bad form. > > > > > > 3) The memory usage (and runtime) could be improved by one or more of: > > > a) Storing everything directly into objects rather than a tree > > > b) Using arrays to store everything rather than hashes > > > c) Ignoring any tags that aren't actually used. > > > > > > 4) The coding style is nothing like the rest of BioPerl's. Mainly > > > because, I prefer this style (PERSONAL preference, no flames, > > > everyone gets their own oppinion). This is bad for a project, > > > but in all honesty if I need to drastically change my coding > > > style I will probably never get around to fixing up this code. > > > > > > 5) There is quite a long delay before anything is actually accessible > > > because the nucleotide data is given at the end of the files > > > (actually, at the end of an ASSEMBLY tag) so everything before it > > > needs to be parsed. This leads to the first ->next_seq() call taking > > > a significant time. 
> > > > > > Since I can't show you what the object looks like, I'll show you what > > > the GenBank file looks like. An example of the genbank file is at: > > > > > > http://bioinfo.ucr.edu/cgi-bin/seqfetch.pl?database=all&accession=At1g03870 > > > > > > Thanks for your time, > > > > > > > > > > -- > > Jason Stajich > > Duke University > > jason at cgt.mc.duke.edu > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From laurichj at bioinfo.ucr.edu Mon Nov 3 16:02:29 2003 From: laurichj at bioinfo.ucr.edu (Josh Lauricha) Date: Mon Nov 3 16:02:13 2003 Subject: [Bioperl-l] Bio::SeqIO::tigr In-Reply-To: References: <20031031175411.GB18009@bioinfo.ucr.edu> Message-ID: <20031103210229.GB26644@bioinfo.ucr.edu> Skipped content of type multipart/mixed-------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://portal.open-bio.org/pipermail/bioperl-l/attachments/20031103/008f2cb8/attachment-0001.bin From wakibbe at northwestern.edu Mon Nov 3 13:38:19 2003 From: wakibbe at northwestern.edu (Warren Alden Kibbe) Date: Mon Nov 3 16:02:20 2003 Subject: [Bioperl-l] get_sequence error retrieving protein sequences from GenBank In-Reply-To: <200311031803.hA3I3mdb011455@portal.open-bio.org> References: <200311031803.hA3I3mdb011455@portal.open-bio.org> Message-ID: This is an odd problem that I have seen in the get_sequence call for genbank accession numbers that are for protein rather than DNA entries. One of my colleagues, Julie Zhu, has reported the bug at http://bugzilla.bioperl.org with bug# 1545 a week or two ago and still waiting for response. 
The detailed bug information is as follows:

The following two protein accession numbers are valid when searched for
in the NCBI protein database. However, bioperl 1.2.3 get_sequence
complains that the accession number does not exist. The bioperl 1.2.3
get_sequence call for the corresponding DNA accession numbers works
fine. Please note that one is for genbank and the other is for embl
(both swissprot and refseq entries work fine).

$seq_object = get_sequence('genbank', "AAQ10714"); # protein accession number does not work
$seq_object = get_sequence('genbank', "AF536179"); # DNA accession number works

$seq_object = get_sequence('embl', "CAD32973");    # protein accession number does not work
$seq_object = get_sequence('embl', "AJ489231");    # DNA accession number works

Does anyone else see this behavior? We see it under Perl 5.6.1 and
Perl 5.8 running bioperl 1.2.3 on both Mac OS X and Windows 2000.
Thank you very much in advance for your time and help.

Warren

From jason at cgt.duhs.duke.edu Mon Nov 3 16:13:01 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Mon Nov 3 16:09:32 2003
Subject: [Bioperl-l] get_sequence error retrieving protein sequences from GenBank
In-Reply-To:
References: <200311031803.hA3I3mdb011455@portal.open-bio.org>
Message-ID:

On Mon, 3 Nov 2003, Warren Alden Kibbe wrote:

> This is an odd problem that I have seen in the get_sequence call for
> genbank accession numbers that are for protein rather than DNA
> entries. One of my colleagues, Julie Zhu, has reported the bug at
> http://bugzilla.bioperl.org with bug# 1545 a week or two ago and
> still waiting for response.
>
> The detailed bug information is as follows:
>
> The following two protein accession numbers are valid by searching
> through the ncbi protein database. However, bioperl 1.2.3
> get_sequence complains that the accession number does not exist. The
> bioperl 1.2.3 get_sequence for the corresponding DNA accession
> numbers works fine.
Please note that one is for genbank and the other
> is for embl (Both swissprot and refseq entries work fine).

That is because you cannot request protein accession numbers from
'genbank'; you have to request them from 'genpept'.

>
> $seq_object = get_sequence('genbank', "AAQ10714");#protein accession
> number does not work
> $seq_object = get_sequence('genbank', "AF536179"); #DNA accession number works
>
> $seq_object = get_sequence('embl',"CAD32973"); #protein accession
> number does not work
> $seq_object = get_sequence('embl',"AJ489231"); #DNA accession number works
>
> Does anyone else see this behavior? We see it under Perl 5.6.1 and
> Perl 5.8 running bioperl 1.2.3 on both Mac OS X and Windows 2000.
> Thank you very much for your time and help in advance.
>
> Warren

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From laurichj at bioinfo.ucr.edu Mon Nov 3 17:01:12 2003
From: laurichj at bioinfo.ucr.edu (Josh Lauricha)
Date: Mon Nov 3 16:57:55 2003
Subject: [Bioperl-l] Bio::Tools::Run::Alignment::Clustalw
Message-ID: <20031103220112.GD26644@bioinfo.ucr.edu>

I have a small patch for Clustalw.pm to allow the creation of trees
without multiple alignments. But, of course, I've got a few questions ;)

One thing that bothers me about Clustalw.pm is that it's not quiet by
default. I'd rather not see the garbage from clustalw, and having it
print out a bunch of stuff is not what I expected. Is it OK if I change
it to be quiet by default?

There's a chmod in _run. Why is that there, and why is it setting the
file group and world readable? Shouldn't permissions be delegated to
the OS?

Since the tree is generated along with a multiple alignment (and I
wanted one of those too), I'd like to add an option to save the tree.
However, align() returns the alignment... Should I add
"{save,get}_tree", or have it return the tree as well if wantarray is
true? I think the latter is the most natural, and it lets the module
clean up after itself easily.
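A minimal sketch of the wantarray option, in plain Perl with no BioPerl dependency. The sub name align_sketch and its placeholder return values are hypothetical; this is not the actual Clustalw.pm code, only an illustration of returning the alignment alone in scalar context and the alignment plus the tree in list context.

```perl
use strict;
use warnings;

# Hypothetical stand-in for align(): returns only the alignment in scalar
# context, and (alignment, tree) in list context.
sub align_sketch {
    my ($input) = @_;
    my $aln  = "alignment of $input";   # placeholder for an alignment object
    my $tree = "tree of $input";        # placeholder for a tree object
    return wantarray ? ($aln, $tree) : $aln;
}

my $aln_only     = align_sketch('seqs.fa');   # scalar context: alignment only
my ($aln, $tree) = align_sketch('seqs.fa');   # list context: alignment and tree

print "$aln_only\n$aln\n$tree\n";
```

One design caveat: any caller that happens to use the sub in list context, for example as an argument to print, receives both values, so existing code that relies on a single return value would need checking.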
Thanks,

--
----------------------------
| Josh Lauricha            |
| laurichj@bioinfo.ucr.edu |
| Bioinformatics, UCR      |
|--------------------------|

From laurichj at bioinfo.ucr.edu Mon Nov 3 17:06:14 2003
From: laurichj at bioinfo.ucr.edu (Josh Lauricha)
Date: Mon Nov 3 17:02:58 2003
Subject: [Bioperl-l] Bio::Tools::Run::Alignment::Clustalw
In-Reply-To: <20031103220112.GD26644@bioinfo.ucr.edu>
References: <20031103220112.GD26644@bioinfo.ucr.edu>
Message-ID: <20031103220614.GE26644@bioinfo.ucr.edu>

On Mon 11/03/03 14:01, Josh Lauricha wrote:
> There's a chmod in _run, why is that there? and why is it setting it
> group and world readable? Shouldn't permissions be delegated to the OS?

Excuse me, group and world writeable (777).

--
----------------------------
| Josh Lauricha            |
| laurichj@bioinfo.ucr.edu |
| Bioinformatics, UCR      |
|--------------------------|

From wes.barris at csiro.au Mon Nov 3 18:10:26 2003
From: wes.barris at csiro.au (Wes Barris)
Date: Mon Nov 3 18:07:25 2003
Subject: [Bioperl-l] Understanding LocatableSeq
Message-ID: <3FA6E062.7010102@csiro.au>

Hi,

I am trying to create an msf alignment of several LocatableSeq objects.
I have tried setting the "start" and "end" attributes of each LocatableSeq
object before adding it to the alignment but the resulting sequences are
still not aligned in the SimpleAlign object. What am I doing wrong?

#!/usr/local/bin/perl -w
#
#
use strict;
use Bio::AlignIO;
#
my $aln = new Bio::SimpleAlign();
my $lseq;
$lseq = new Bio::LocatableSeq();
$lseq->seq('GATCGATC');
$lseq->id('this');
$lseq->start(1);
$lseq->end(8);
$aln->add_seq($lseq);

$lseq = new Bio::LocatableSeq();
$lseq->seq('ATCGAT');
$lseq->id('that');
$lseq->start(2);
$lseq->end(7);
$aln->add_seq($lseq);

my $outstream = new Bio::AlignIO(-format=>'msf', -file=>">junk.msf");
$outstream->write_aln($aln);
undef $outstream;

The output looks like this:

NoName MSF: 2 Type: N Tue Nov 4 09:01:58 2003 Check: 00 ..
Name: this/1-8 Len: 8 Check: 2590 Weight: 1.00 Name: that/2-7 Len: 6 Check: 1547 Weight: 1.00 // this/1-8 GATCGATC that/2-7 ATCGAT <--- Should't this be shifted one position to the right? I am using bioperl-1.2.3. -- Wes Barris E-Mail: Wes.Barris@csiro.au From birney at ebi.ac.uk Mon Nov 3 18:28:04 2003 From: birney at ebi.ac.uk (Ewan Birney) Date: Mon Nov 3 18:24:32 2003 Subject: [Bioperl-l] Understanding LocatableSeq In-Reply-To: <3FA6E062.7010102@csiro.au> Message-ID: On Tue, 4 Nov 2003, Wes Barris wrote: > Hi, > > I am trying to create an msf alignment of several LocatableSeq objects. > I have tried setting the "start" and "end" attributes of each LocatableSeq > object before adding it to the alignment but the resulting sequences are > still not aligned in the SimpleAlign object. What am I doing wrong? > SimpleAlign does not automagically do the alignment - you need to call out to Bio::Tools::Clustalw or TCoffee or something else. SimpleAlign *represents* alignments, doesn't make them. (it would be cute if simple align did this magically but probably would lead to some fascinating bug-hunts as simple align booted up a full progressive alignment engine when a sequence was added. Hmmm. Wistful thinking...) > #!/usr/local/bin/perl -w > # > # > use strict; > use Bio::AlignIO; > # > my $aln = new Bio::SimpleAlign(); > my $lseq; > $lseq = new Bio::LocatableSeq(); > $lseq->seq('GATCGATC'); > $lseq->id('this'); > $lseq->start(1); > $lseq->end(8); > $aln->add_seq($lseq); > > $lseq = new Bio::LocatableSeq(); > $lseq->seq('ATCGAT'); > $lseq->id('that'); > $lseq->start(2); > $lseq->end(7); > $aln->add_seq($lseq); > > my $outstream = new Bio::AlignIO(-format=>'msf', -file=>">junk.msf"); > $outstream->write_aln($aln); > undef $outstream; > > > The output looks like this: > > > NoName MSF: 2 Type: N Tue Nov 4 09:01:58 2003 Check: 00 .. 
> > Name: this/1-8 Len: 8 Check: 2590 Weight: 1.00 > Name: that/2-7 Len: 6 Check: 1547 Weight: 1.00 > > // > > this/1-8 GATCGATC > that/2-7 ATCGAT <--- Should't this be shifted one position to the right? > > > I am using bioperl-1.2.3. > -- > Wes Barris > E-Mail: Wes.Barris@csiro.au > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > From kvddrift at earthlink.net Mon Nov 3 20:18:38 2003 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon Nov 3 20:15:28 2003 Subject: [Bioperl-l] post translational modifications Message-ID: Hi, Is it possible to add post translational modifications to an amino acid sequence? For instance, in the following example, I would like to put a phosphate-group on the 2nd serine: my $seq = Bio::Seq->new('-seq' => 'QWERTHGSSTYNMKGEDDLKR', '-alphabet' => 'protein' ); thanks, - Koen. From wes.barris at csiro.au Mon Nov 3 20:19:25 2003 From: wes.barris at csiro.au (Wes Barris) Date: Mon Nov 3 20:16:19 2003 Subject: [Bioperl-l] Understanding LocatableSeq In-Reply-To: References: Message-ID: <3FA6FE9D.2000007@csiro.au> Ewan Birney wrote: > > On Tue, 4 Nov 2003, Wes Barris wrote: > > >>Hi, >> >>I am trying to create an msf alignment of several LocatableSeq objects. >>I have tried setting the "start" and "end" attributes of each LocatableSeq >>object before adding it to the alignment but the resulting sequences are >>still not aligned in the SimpleAlign object. What am I doing wrong? >> > > > SimpleAlign does not automagically do the alignment - you need to call out > to Bio::Tools::Clustalw or TCoffee or something else. SimpleAlign > *represents* alignments, doesn't make them. I know that SimpleAlign does not automagically do the alignment. That is why I am setting the "start" and "end" attributes. I have an ACE file from clustal. I am trying to write an msf file. I have the sequences and the alignment information. 
Now I want to write this into a SimpleAlign object. > > > (it would be cute if simple align did this magically but probably would > lead to some fascinating bug-hunts as simple align booted up a full > progressive alignment engine when a sequence was added. Hmmm. Wistful > thinking...) > > > > > >>#!/usr/local/bin/perl -w >># >># >>use strict; >>use Bio::AlignIO; >># >>my $aln = new Bio::SimpleAlign(); >>my $lseq; >>$lseq = new Bio::LocatableSeq(); >>$lseq->seq('GATCGATC'); >>$lseq->id('this'); >>$lseq->start(1); >>$lseq->end(8); >>$aln->add_seq($lseq); >> >>$lseq = new Bio::LocatableSeq(); >>$lseq->seq('ATCGAT'); >>$lseq->id('that'); >>$lseq->start(2); >>$lseq->end(7); >>$aln->add_seq($lseq); >> >>my $outstream = new Bio::AlignIO(-format=>'msf', -file=>">junk.msf"); >>$outstream->write_aln($aln); >>undef $outstream; >> >> >>The output looks like this: >> >> >>NoName MSF: 2 Type: N Tue Nov 4 09:01:58 2003 Check: 00 .. >> >> Name: this/1-8 Len: 8 Check: 2590 Weight: 1.00 >> Name: that/2-7 Len: 6 Check: 1547 Weight: 1.00 >> >>// >> >>this/1-8 GATCGATC >>that/2-7 ATCGAT <--- Should't this be shifted one position to the right? >> >> >>I am using bioperl-1.2.3. >>-- >>Wes Barris >>E-Mail: Wes.Barris@csiro.au >> >> >>_______________________________________________ >>Bioperl-l mailing list >>Bioperl-l@portal.open-bio.org >>http://portal.open-bio.org/mailman/listinfo/bioperl-l >> -- Wes Barris E-Mail: Wes.Barris@csiro.au From jason at cgt.duhs.duke.edu Mon Nov 3 20:26:21 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Mon Nov 3 20:22:55 2003 Subject: [Bioperl-l] post translational modifications In-Reply-To: References: Message-ID: Bio::Seq::Meta should do what you want. We need someone to write up a tutorial which would show exactly what you are asking for - I would also like to see an example for RNA sequence + 2ndary structure encoding. 
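As a rough illustration of the idea only, and not of the Bio::Seq::Meta API itself, residue-based meta information can be pictured as a second string that runs parallel to the sequence. The sub name mark_nth_residue and the choice of 'p' as a phosphorylation marker are assumptions made for this sketch.

```perl
use strict;
use warnings;

# Return a meta string the same length as $seq, with $mark at the position
# of the $n-th occurrence of $residue and '.' everywhere else.
sub mark_nth_residue {
    my ($seq, $residue, $n, $mark) = @_;
    my $meta  = '.' x length($seq);
    my $count = 0;
    while ($seq =~ /\Q$residue\E/g) {
        next unless ++$count == $n;
        # pos() is the 0-based offset just past the match, which equals
        # the 1-based position of the matched residue.
        substr($meta, pos($seq) - 1, 1) = $mark;
        last;
    }
    return $meta;
}

my $seq  = 'QWERTHGSSTYNMKGEDDLKR';
my $meta = mark_nth_residue($seq, 'S', 2, 'p');   # mark the 2nd serine

print "$seq\n$meta\n";
```

Printing the two strings one above the other puts the marker directly under the second serine (position 9).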
-jason

On Mon, 3 Nov 2003, Koen van der Drift wrote:

> Hi,
>
> Is it possible to add post translational modifications to an amino acid
> sequence? For instance, in the following example, I would like to put a
> phosphate-group on the 2nd serine:
>
> my $seq = Bio::Seq->new('-seq' => 'QWERTHGSSTYNMKGEDDLKR', '-alphabet'
> => 'protein' );
>
>
> thanks,
>
>
> - Koen.
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From jason at cgt.duhs.duke.edu Mon Nov 3 20:30:27 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Mon Nov 3 20:27:01 2003
Subject: [Bioperl-l] Understanding LocatableSeq
In-Reply-To: <3FA6FE9D.2000007@csiro.au>
References: <3FA6FE9D.2000007@csiro.au>
Message-ID:

On Tue, 4 Nov 2003, Wes Barris wrote:

> Ewan Birney wrote:
> >
> > On Tue, 4 Nov 2003, Wes Barris wrote:
> >
> >
> >>Hi,
> >>
> >>I am trying to create an msf alignment of several LocatableSeq objects.
> >>I have tried setting the "start" and "end" attributes of each LocatableSeq
> >>object before adding it to the alignment but the resulting sequences are
> >>still not aligned in the SimpleAlign object. What am I doing wrong?
> >>
> >
> >
> > SimpleAlign does not automagically do the alignment - you need to call out
> > to Bio::Tools::Clustalw or TCoffee or something else. SimpleAlign
> > *represents* alignments, doesn't make them.
>
> I know that SimpleAlign does not automagically do the alignment. That
> is why I am setting the "start" and "end" attributes. I have an ACE
> file from clustal. I am trying to write an msf file. I have the
> sequences and the alignment information. Now I want to write this
> into a SimpleAlign object.
>

You still need to pad the sequence strings, at the front and/or the
back, with the requisite number of gap characters.
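A minimal sketch of that padding step, in plain Perl. The helper name pad_to_column is hypothetical and is not part of BioPerl; it assumes you already know the 1-based alignment column at which each sequence begins and the total alignment width.

```perl
use strict;
use warnings;

# Pad $seq with leading and trailing gap characters so that it starts at
# 1-based alignment column $col in an alignment $aln_len columns wide.
sub pad_to_column {
    my ($seq, $col, $aln_len, $gap) = @_;
    $gap = '-' unless defined $gap;
    my $lead  = $col - 1;
    my $trail = $aln_len - $lead - length($seq);
    die "sequence does not fit in the alignment\n" if $lead < 0 || $trail < 0;
    return ($gap x $lead) . $seq . ($gap x $trail);
}

print pad_to_column('GATCGATC', 1, 8), "\n";   # already full width
print pad_to_column('ATCGAT',   2, 8), "\n";   # one gap on each side
```

The padded string is what would be handed to the seq() call of each LocatableSeq before add_seq(). In this toy case the alignment column happens to coincide with the sequence's start coordinate; in general the two are unrelated.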
The start/end describe where the sequence participating in the
alignment COMES FROM, not where it sits in the alignment, so you have
to encode the alignment explicitly by placing the right number of gaps.

> >
> >
> > (it would be cute if simple align did this magically but probably would
> > lead to some fascinating bug-hunts as simple align booted up a full
> > progressive alignment engine when a sequence was added. Hmmm. Wistful
> > thinking...)
> >
> >
> >
> >
> >
> >>#!/usr/local/bin/perl -w
> >>#
> >>#
> >>use strict;
> >>use Bio::AlignIO;
> >>#
> >>my $aln = new Bio::SimpleAlign();
> >>my $lseq;
> >>$lseq = new Bio::LocatableSeq();
> >>$lseq->seq('GATCGATC');
> >>$lseq->id('this');
> >>$lseq->start(1);
> >>$lseq->end(8);
> >>$aln->add_seq($lseq);
> >>
> >>$lseq = new Bio::LocatableSeq();
> >>$lseq->seq('ATCGAT');
> >>$lseq->id('that');
> >>$lseq->start(2);
> >>$lseq->end(7);
> >>$aln->add_seq($lseq);
> >>
> >>my $outstream = new Bio::AlignIO(-format=>'msf', -file=>">junk.msf");
> >>$outstream->write_aln($aln);
> >>undef $outstream;
> >>
> >>
> >>The output looks like this:
> >>
> >>
> >>NoName MSF: 2 Type: N Tue Nov 4 09:01:58 2003 Check: 00 ..
> >>
> >> Name: this/1-8 Len: 8 Check: 2590 Weight: 1.00
> >> Name: that/2-7 Len: 6 Check: 1547 Weight: 1.00
> >>
> >>//
> >>
> >>this/1-8 GATCGATC
> >>that/2-7 ATCGAT <--- Should't this be shifted one position to the right?
> >>
> >>
> >>I am using bioperl-1.2.3.
> >>--
> >>Wes Barris
> >>E-Mail: Wes.Barris@csiro.au
> >>
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From juguang at tll.org.sg Mon Nov 3 21:58:59 2003
From: juguang at tll.org.sg (Juguang Xiao)
Date: Mon Nov 3 21:55:04 2003
Subject: [Bioperl-l] OMIM tests failing
In-Reply-To: <2CC97099-0DCE-11D8-ACBD-000A959EB4C4@gnf.org>
Message-ID:

Sorry, I made the same stupid mistake again. I tested it locally with
my own data, which includes the clinical symptom records that I am
trying to parse, while the original test data in the repository does
not. I have now made the tests pass and committed the changes.

I noticed one thing that I am not sure whether to call a bug. With the
previous parser, $entry->desc returns the string 'desc' if there is no
desc field in the data file. I would expect it to return undef or an
empty string. There are still a few cases like this. If no one
disagrees, I will make them return an empty string when the fields are
not present in the file.

Juguang

On Monday, November 3, 2003, at 03:20 PM, Hilmar Lapp wrote:

> Juguang is modifying the parser, yes. I've noticed it failing too.
>
> Juguang, if you can, try to avoid committing intermediate versions
> that aren't completed or tested and cause test failures. If for some
> reason or another you can't avoid that, *first* send an email to the
> list saying that what you need to commit is going to temporarily break
> tests blah and foo.
>
> -hilmar
>
> On Friday, October 31, 2003, at 05:29 PM, Allen Day wrote:
>
>> any idea what's going on here? i see some recent commits (yesterday,
>> first in 7 months) by juguang. it worked two days ago...
>> >> -allen From hlapp at gnf.org Mon Nov 3 23:02:40 2003 From: hlapp at gnf.org (Hilmar Lapp) Date: Mon Nov 3 22:59:23 2003 Subject: [Bioperl-l] OMIM tests failing In-Reply-To: Message-ID: On Monday, November 3, 2003, at 06:58 PM, Juguang Xiao wrote: > I realized one thing that I am not sure to call it a bug or what. The > previous parser returns the record that $entry->desc returns the > string 'desc' if the there is no desc field in the data file. I > actually expect it returns undef or empty string. There are some such > things still left. If no one disagrees, I will make them return empty > string if the fields are not available in files. > They should return undef if the information is not present. -hilmar -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From hlapp at gmx.net Tue Nov 4 03:18:31 2003 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue Nov 4 03:15:20 2003 Subject: [Bioperl-l] how to set first line of genbank file In-Reply-To: Message-ID: <79A0D328-0E9F-11D8-B1F6-000A959EB4C4@gmx.net> On Monday, November 3, 2003, at 01:00 PM, Jason Stajich wrote: > Keywords RichSeq keywords() > This should be get_keywords(). -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From birney at ebi.ac.uk Tue Nov 4 03:24:06 2003 From: birney at ebi.ac.uk (Ewan Birney) Date: Tue Nov 4 03:20:37 2003 Subject: [Bioperl-l] Understanding LocatableSeq In-Reply-To: <3FA6FE9D.2000007@csiro.au> Message-ID: On Tue, 4 Nov 2003, Wes Barris wrote: > Ewan Birney wrote: > > > > > On Tue, 4 Nov 2003, Wes Barris wrote: > > > > > >>Hi, > >> > >>I am trying to create an msf alignment of several LocatableSeq objects. 
> >>I have tried setting the "start" and "end" attributes of each LocatableSeq > >>object before adding it to the alignment but the resulting sequences are > >>still not aligned in the SimpleAlign object. What am I doing wrong? > >> > > > > > > SimpleAlign does not automagically do the alignment - you need to call out > > to Bio::Tools::Clustalw or TCoffee or something else. SimpleAlign > > *represents* alignments, doesn't make them. > > I know that SimpleAlign does not automagically do the alignment. That > is why I am setting the "start" and "end" attributes. I have an ACE > file from clustal. I am trying to write an msf file. I have the > sequences and the alignment information. Now I want to write this > into a SimpleAlign object. > Then you need to put pad characters (- or .) in the appropiate places in the string of the LocatableSeq (in your case, it looks like the end). From Richard.Adams at ed.ac.uk Tue Nov 4 06:02:31 2003 From: Richard.Adams at ed.ac.uk (Richard Adams) Date: Tue Nov 4 05:59:14 2003 Subject: [Bioperl-l] RPSblast and existing BLAST packages (WAS: RemoteBlast) Message-ID: <3FA78747.F6B90365@ed.ac.uk> I'm not sure - here are some random musings It seems easier to organize the modules by program package rather than by program function - for example, Smith-Waterman modules are distinct from Blast modules even though the programs have similar aims. If we're going to have a uniform access to Remote Blast and standalone blast then one way might be to have BlastQuery class with common parameter setting methods, and methods such as run_remote_blast run_local_blast which access the implementing code as appropriate. But this might be a pain to implement without breaking everyone's existing code. Or, since standaloneblast uses autoload we could just add alternative allowable names for methods so that $factory->p('blastn') and $factory->program('blastn')are treated the same. 
Having method names the same as the header names in the blast URI documentation might be best as I would suspect that everyone has used the web interface but not everyone uses standalone blast. Richard -- Dr Richard Adams Bioinformatician, Psychiatric Genetics Group, Medical Genetics, Molecular Medicine Centre, Western General Hospital, Crewe Rd West, Edinburgh UK EH4 2XU Tel: 44 131 651 1084 richard.adams@ed.ac.uk From rc91 at leicester.ac.uk Tue Nov 4 07:31:34 2003 From: rc91 at leicester.ac.uk (Crook, R.) Date: Tue Nov 4 07:28:17 2003 Subject: [Bioperl-l] RepeatMasker Message-ID: I've read about Repeatmasker but I'm unclear about some things. I need only the RC L1s how do I get the postions of these using repeatmasker. From whs at sanger.ac.uk Tue Nov 4 07:34:01 2003 From: whs at sanger.ac.uk (Will Spooner) Date: Tue Nov 4 07:30:47 2003 Subject: [Bioperl-l] RPSblast and existing BLAST packages (WAS: RemoteBlast) In-Reply-To: <3FA78747.F6B90365@ed.ac.uk> References: <3FA78747.F6B90365@ed.ac.uk> Message-ID: Hi Richard, I recently implemented a BioPerl-based generic sequence search API for Ensembl (http://www.ensembl.org/Multi/blastview). This seems very similar to what you propose below. The approach I used, however, was to abstract the differences between search methods (wu-blast, ncbi-blast, fasta etc) into different perl modules. This is similar to the way that SeqIO and SearchIO handle different formats. 
For example:

  # This lazy-loads Bio/Tools/Run/Search/wu_blastn.pm
  my $search = Bio::Tools::Run::Search->new( -method=>'wu_blastn' );

  # This lazy-loads Bio/Tools/Run/Search/fasta.pm
  my $search = Bio::Tools::Run::Search->new( -method=>'fasta' );

Bio::Tools::Run::Search has the following methods:

  'seq'                   - adds the query sequence
  'database'              - configures the database location
  'command'               - generates the command to run
  'option'                - configures command options
  'dispatch'              - launches the command
  'environment_variables' - configures environment variables
  'run'                   - combines 'command' + 'dispatch'
  'status'                - reports job status (PENDING, RUNNING, COMPLETED etc)
  'report'                - returns the raw search report
  'next_result'           - returns a Bio::Search::Result object
  (N.b. Bio::Tools::Run::Search ISA Bio::Tools::Run::WrapperBase)

This approach is pretty nice because you can easily subclass the '-method'
modules to change the search behaviour. For example,
-method=>'wu_blastn_bsub' is the same as -method=>'wu_blastn', except
that the 'dispatch' method has been overridden to use the bsub job
submission system. In addition, new '-methods' can be added without
editing existing code.

Whilst I'm still developing the core of the system, I have functioning
modules for wu-blast (inline, offline, bsub), ncbi-blast (inline,
offline, bsub), blat (gfClient) and ssaha (ssahaClient).

A lot more detail can be found at:
  http://www.ensembl.org/Docs/wiki/html/EnsemblDocs/EnsemblBlastView.html

If this style of approach is of interest to you moving forward, then I
would be very interested in contributing.

Kind regards,

Will


On Tue, 4 Nov 2003, Richard Adams wrote:

> I'm not sure - here are some random musings
> It seems easier to organize the modules by program package rather than
> by program function - for example,
> Smith-Waterman modules are distinct from Blast modules even though the
> programs have similar aims.
If we're going to have > a uniform access to Remote Blast and standalone blast then one way might > be to have BlastQuery class with common parameter > setting methods, and methods such as > run_remote_blast > run_local_blast > which access the implementing code as appropriate. But this might be > a pain to implement without breaking everyone's existing code. > > Or, since standaloneblast uses autoload > we could just add alternative allowable names for methods so that > $factory->p('blastn') and $factory->program('blastn')are treated the > same. Having method names the same as the header names in the blast URI > documentation might be best as I would suspect that everyone has used > the web interface but not everyone uses standalone blast. > > Richard > > > > -- > Dr Richard Adams > Bioinformatician, > Psychiatric Genetics Group, > Medical Genetics, > Molecular Medicine Centre, > Western General Hospital, > Crewe Rd West, > Edinburgh UK > EH4 2XU > > Tel: 44 131 651 1084 > richard.adams@ed.ac.uk > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > --- Dr William Spooner whs@sanger.ac.uk Ensembl Web Developer http://www.ensembl.org From chauser at duke.edu Tue Nov 4 09:40:12 2003 From: chauser at duke.edu (Charles Hauser) Date: Tue Nov 4 09:40:13 2003 Subject: [Bioperl-l] gbrowse: creating link based on 'Note' feature Message-ID: <1067956998.17369.23.camel@pandorina.biology.duke.edu> All, I'd like to create a link for a tract using an element in the 'Note' field (gff-2) which contains a genbank accn. $feat->add_tag_value('Note', $blat_feat->seq_id); where $blat_feat->seq_id = genbank accn this returns the link lacking a value for $notes[0]. 
link = sub {
    my $feature = shift;
    my @notes = $feature->get_tag_values('note');
    my $link = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=search&db=Nucleotide&term=$notes[0]";
    return $link;
}

Charles

From Marc.Logghe at devgen.com Tue Nov 4 09:47:29 2003
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Tue Nov 4 09:44:56 2003
Subject: [Bioperl-l] RE: [Gmod-gbrowse] gbrowse: creating link based on 'Note' feature
Message-ID:

> -----Original Message-----
> From: Charles Hauser [mailto:chauser@duke.edu]
> Sent: Tuesday, November 04, 2003 3:43 PM
> To: BioPerl-List; GBrowse
> Subject: [Gmod-gbrowse] gbrowse: creating link based on 'Note' feature
>
>
> All,
>
> I'd like to create a link for a tract using an element in the 'Note'
> field (gff-2) which contains a genbank accn.
>
> $feat->add_tag_value('Note', $blat_feat->seq_id);
> where $blat_feat->seq_id = genbank accn
>
> this returns the link lacking a value for $notes[0].
>
> link = sub {
> my $feature = shift;
> my @notes = $feature->get_tag_values('note');

tags are case sensitive, so you should call:
my @notes = $feature->get_tag_values('Note');
                                      ~~~~
HTH,
Marc

From jmanning at broad.mit.edu Tue Nov 4 13:06:09 2003
From: jmanning at broad.mit.edu (Jonathan M. Manning)
Date: Tue Nov 4 13:02:52 2003
Subject: [Bioperl-l] Bio::AlignIO::bl2seq doesn't know when to stop...
Message-ID: <3FA7EA91.9020107@broad.mit.edu>

Hi,

If I try to parse a bl2seq alignment using align (using "while(my $aln =
$str->next_aln())" ), when it runs out of alignments, I get:

Can't call method "querySeq" on an undefined value at
~/perllib/Bio/AlignIO/bl2seq.pm line 134, line 6002.

If I look at the file I'm trying to parse, line 6002 is the end of the
alignments. There is a "Lambda" line, and some summary information
following it. I would expect next_aln to return false here.

The following patch to CVS head fixes this - and properly returns false
when there is no next alignment. This fix can also be applied to 1.2.3.
All tests in AlignIO.t pass.

I'm using blast tools 2.2.1, btw. It may only be a problem with this
version of bl2seq. However, the fix below is a good safety check
regardless of what blast version is used - but someone needs to test it
against the latest version, just in case.

~Jonathan

Index: bl2seq.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/AlignIO/bl2seq.pm,v
retrieving revision 1.15
diff -c -r1.15 bl2seq.pm
*** bl2seq.pm 2003/10/28 13:52:03 1.15
--- bl2seq.pm 2003/11/04 15:54:16
***************
*** 131,136 ****
--- 131,137 ----
                                      -report_type => $self->report_type);
      my $bl2seqobj = $self->{'bl2seqobj'};
      my $hsp = $bl2seqobj->next_feature;
+     unless($hsp) { return 0 };
      $seqchar = $hsp->querySeq;
      $start = $hsp->query->start;
      $end = $hsp->query->end;

From Richard.Adams at ed.ac.uk Tue Nov 4 13:16:07 2003
From: Richard.Adams at ed.ac.uk (Richard Adams)
Date: Tue Nov 4 13:12:55 2003
Subject: [Bioperl-l] RPSblast and existing BLAST packages (WAS: RemoteBlast)
References: <3FA78747.F6B90365@ed.ac.uk>
Message-ID: <3FA7ECE7.8080409@ed.ac.uk>

Will,

That's great, just looking at your docs it definitely sounds the best
option to have separate modules for the specific parts of each blast
program. You obviously spent a lot of time working on this and so I'd
imagine that most of your methods for setting up the query could be put
straight in with little change, with the base class containing the
remote_blast() and local_blast() methods instead of your run() method,
which send the query to RemoteBlast / StandAloneBlast for actually
running the query.

Maybe to answer Donald's point the module for running hmmer could also
be included?

If you'd be willing to send your code for the ncbi-blast modules, I'll
try and put together a draft plan of module organisation and methods for
discussion.
Cheers Richard Will Spooner wrote: >Hi Richard, > >I recently implemented a BioPerl-based generic sequence search API >for Ensembl (http://www.ensembl.org/Multi/blastview). This seems very >similar to what you propose below. The approach I used, however, was to >abstract the differences between search methods (wu-blast, ncbi-blast, >fasta etc) into different perl modules. This is similar to the way that >SeqIO and SearchIO handle different formats. For example: > > # This lazy-loads Bio/Tools/Run/Search/wu_blastn.pm > my $search = Bio::Tools::Run::Search->new( -method=>'wu_blastn' ); > > # This lazy-loads Bio/Tools/Run/Search/fasta.pm > my $search = Bio::Tools::Run::Search->new( -method=>'fasta' ); > >Bio::Tools::Run::Search has the following methods: > > 'seq' - adds the query sequance > 'database' - configures the database location > 'command' - generates the command to run > 'option' - configures command options > 'dispatch' - launches the command > 'environment_variables' - configures environment variables > 'run' - combines 'command' + 'dispatch' > 'status' - reports job status (PENDING, RUNNING, COMPLETED etc) > 'report' - returns the raw search report > 'next_result' - returns a Bio::Search::Result object > (N.b. Bio::Tools::Run::Search ISA Bio::Tools::Run::WrapperBase) > >This approach is pretty nice because you can easily subclass the '-method' >modules to change the search behaviour. For example, >-method=>'wu_blastn_bsub' is the same as -method=>'wu_blastn', except >that the 'dispatch' method has been overridden to use the bsub job >submission system. In addition, new '-methods' can be added without >editing existing code. > >Whilst I'm still developing the core of the system, I have functioning >modules for wu-blast (inline, offline, bsub), ncbi-blast (inline, >offline, bsub), blat (gfClient) and ssaha (ssahaClient). 
> >A lot more detail can be found at: > http://www.ensembl.org/Docs/wiki/html/EnsemblDocs/EnsemblBlastView.html > >If this style of approach is of interest to you moving forward, then I >would be very interested in contributing. > >Kind regards, > >Will > > >On Tue, 4 Nov 2003, Richard Adams wrote: > > > >>I'm not sure - here are some random musings >>It seems easier to organize the modules by program package rather than >>by program function - for example, >>Smith-Waterman modules are distinct from Blast modules even though the >>programs have similar aims. If we're going to have >>a uniform access to Remote Blast and standalone blast then one way might >>be to have BlastQuery class with common parameter >>setting methods, and methods such as >> run_remote_blast >> run_local_blast >> which access the implementing code as appropriate. But this might be >>a pain to implement without breaking everyone's existing code. >> >>Or, since standaloneblast uses autoload >>we could just add alternative allowable names for methods so that >>$factory->p('blastn') and $factory->program('blastn')are treated the >>same. Having method names the same as the header names in the blast URI >>documentation might be best as I would suspect that everyone has used >>the web interface but not everyone uses standalone blast. 
>> >>Richard >> >> >> >>-- >>Dr Richard Adams >>Bioinformatician, >>Psychiatric Genetics Group, >>Medical Genetics, >>Molecular Medicine Centre, >>Western General Hospital, >>Crewe Rd West, >>Edinburgh UK >>EH4 2XU >> >>Tel: 44 131 651 1084 >>richard.adams@ed.ac.uk >> >> >> >>_______________________________________________ >>Bioperl-l mailing list >>Bioperl-l@portal.open-bio.org >>http://portal.open-bio.org/mailman/listinfo/bioperl-l >> >> >> > >--- >Dr William Spooner whs@sanger.ac.uk >Ensembl Web Developer http://www.ensembl.org > > > -- Dr Richard Adams Bioinformatician, Psychiatric Genetics Group, Medical Genetics, Molecular Medicine Centre, Western General Hospital, Crewe Rd West, Edinburgh UK EH4 2XU Tel: 44 131 651 1084 richard.adams@ed.ac.uk From ypeng at sfu.ca Tue Nov 4 19:07:22 2003 From: ypeng at sfu.ca (ypeng@sfu.ca) Date: Tue Nov 4 19:04:05 2003 Subject: [Bioperl-l] Bio::Tools::FootPrinter Message-ID: <200311050007.hA507MkR021251@rm-rstar.sfu.ca> Hi all, Did anybody use this module to parse FootPrinter output? I am trying to use it but encountered two problems: 1. Some shorter motifs also show up. Say, I specified the motif size=14, "motifs" like "accg" also reported as features. 2. The parsed output is not the same as the FootPrinter HTML output. Any help will be appreciated. Fred Peng From jason at cgt.duhs.duke.edu Tue Nov 4 21:38:10 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Tue Nov 4 21:34:46 2003 Subject: [Bioperl-l] Bio::Graphics::Feature Message-ID: I brought Bio::Graphics::Feature up to spec in that it now implements the primary_seq, get_SeqFeatures, get_all_SeqFeatures, species, annotation methods. We probably need to check that Bio::DB::GFF::Segment is also doing the same. I know that Hilmar instigated a number of changes on the main trunk outside of 1.2.x to try and standardize the method names in modules so that they look like get_XXX. 
-jason

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From pedro.fabre at gen.gu.se Wed Nov 5 05:51:07 2003
From: pedro.fabre at gen.gu.se (Pedro)
Date: Wed Nov 5 05:47:53 2003
Subject: [Bioperl-l] haplotype
Message-ID:

Dear developers,

I have done some work on SNP haplotypes. I have created two modules
one to find the ht_SNP set and another one to tag the minimal set of
a haplotype.

I would like to contribute to bioperl these two modules.

The modules could be under:

Bio::Haplotype::Select
Bio::Haplotype::Tag

I also have stand alone examples about how the code is working and
about what to input and what you get from the module.

I would like also to know if there is any perl style to follow. Any
documentation about that?

If you think this can be interesting for bioperl, please let me know.

Cheers
Pedro

From birney at ebi.ac.uk Wed Nov 5 06:25:27 2003
From: birney at ebi.ac.uk (Ewan Birney)
Date: Wed Nov 5 06:22:09 2003
Subject: [Bioperl-l] haplotype
In-Reply-To:
Message-ID:

On Wed, 5 Nov 2003, Pedro wrote:

> Dear developers,
>
> I have done some work on SNP haplotypes. I have created two modules
> one to find the ht_SNP set and another one to tag the minimal set of
> a haplotype.
>
> I would like to contribute to bioperl these two modules.
>
> The modules could be under:
>
> Bio::Haplotype::Select
> Bio::Haplotype::Tag
>
> I also have stand alone examples about how the code is working and
> about what to input and what you get from the module.
>

It would be great if you could show a little synopsis of how a client
would use objects

  # entirely making this up

  $hap = Bio::Haplotype::Select->new(xxxxxx);

  @tags = $hap->get_tag_snps();

  foreach $tag ( @tags ) {
    # do something interesting, like writing out the allele.
  }

> I would like also to know if there is any perl style to follow. Any
> documentation about that?

In bioperl release there is biodesign.pod which gives you some hints.

>
> If you think this can be interesting for bioperl, please let me know.
> I am sure this would be and I would be happy to help you do conversion into bioperl style... > Cheers > Pedro > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > ----------------------------------------------------------------- Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420 . ----------------------------------------------------------------- From michael.watson at bbsrc.ac.uk Wed Nov 5 06:50:50 2003 From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C)) Date: Wed Nov 5 06:49:45 2003 Subject: [Bioperl-l] Concatenating Bacterial Genome Sequence Message-ID: <20B7EB075F2D4542AFFAF813E98ACD93028223C4@cl-exsrv1.irad.bbsrc.ac.uk> Hi First of all, apologies to posting to both lists at once, I realise a lot of people will get this e-mail twice, but I believe this question is of relevance to both lists. Those of you on the ensembl list will be familiar with my (successful!) attempts to put the Salmonella genome into an ensembl (well, actually, an otter) database - the parse_pathogen script, by and large, worked very well and I have a (mostly) functional website. The problem comes from the fact that the EMBL entries for the bacterial genomes I am interested in consist of many different sequences which represent segments of the genome. So parse_pathogen handles this by creating a new ensembl "chromosome" for each segment. 
Of course these bacterial genomes are circular and constant, so splitting them up into chromosomes doesn't make too much sense, but I can get away with it most of the time with typhi CT18, which is in 20 pieces, and typhi Ty2, which is in 16 pieces, but when I come to typhimurium LT2, this is in 220 pieces; If I want to pose the question "Are these two gene's adjacent on the genome?", normally a very simple task using ensembl, I will have to do some jumping through hoops figuring out if the genes are at the end of segments, and if so, what are the adjacent segments and are the gene's adjacent on the genome but on two different segments... So what would be realy great, and this is where bioperl (maybe) comes in, is something that takes the EMBL entry for the S.typhimurium genome, which is actually 220 EMBL sequences, and creates a single EMBL sequence entry for the whole genome, with all the feature's updated so that their location is relative to the start of the whole genome, and not just of the segment they are on. Has anyone done this and care to share? If not, any comments on how difficult/easy this might be using Bioperl would be welcome. Regards Mick From pedro.fabre at gen.gu.se Wed Nov 5 07:14:41 2003 From: pedro.fabre at gen.gu.se (Pedro) Date: Wed Nov 5 07:11:29 2003 Subject: Fwd: Re: [Bioperl-l] haplotype Message-ID: I forgot cc to the bioperl-list. Here is the code example. Cheers Pedro >Date: Wed, 5 Nov 2003 12:09:27 +0000 >To: Ewan Birney >From: Pedro >Subject: Re: [Bioperl-l] haplotype >Cc: >Bcc: >X-Attachments: > >Hi Ewan, > >Thanks for your email. >>On Wed, 5 Nov 2003, Pedro wrote: >> >>> Dear developers, >>> >>> I have done some work on SNP haplotypes. I have created two modules >>> one to find the ht_SNP set and another one to tag the minimal set of >>> an haplotype. >>> >>> I would like to contibute to bioperl these two modules. 
>>> >>> The modules could be under: >>> >>> Bio::Haplotype::Select >>> Bio::Haplotype::Tag >>> >>> I also have stand alone examples about how the code is working and >>> about what to input and what you get from the module. >>> >> >>It would be great if you could show a little synopsis of how a client >>would use objects > > >Select.pm >my $hap = Select->new($hap,$snp,$pop); > >methods: > >$hap->input_block; # returns the input block >$hap->snp_ids; # returns the snp_ids set >$hap->pop_freq; # returns the population id and their frequency > >$hap->deg-snp; # list of degenerated snps >$hap->snp_type; # all snp and their types >$hap->no_snp; # false snps >$hap->useful_snp; # list of snps that can be used for the analysis >$hap->ht_type; # selection of snps which create not redundante set >$hap->ht_set; # minimal set of SNP to tag >$hap->snp_type_code # every snp is converted to a numeric code. > # this is the result. > >Tag.pm > >my $tag = Tag -> new($hap); > >methods: > >$tag->input_block; # input block >$tag->tag_list; # returns list of SNP's combination. 
>$tag->tag_length; # return the minimal number of SNPs you have to tag > # to define the block > > > >Here come the examples > >############## Call to Select.pm ########################## > >#!/usr/local/bin/perl > >use warnings; >use strict; >use Select; >use Util; >use Data::Dumper; > >my $hap = [ > 'acgt?cact', > 'acgt?ca-t', > 'cg?tag?gc', > 'cactcgtgc', > 'cgctcgtgc', > 'cggtag?gc', > 'ac?t?cact' > ]; > >my $snp = [qw/s1 s2 s3 s4 s5 s6 s7 s8 s9/]; > >my $pop = [ > [qw/ uno 1/], > [qw/ dos 2/], > [qw/ tres 3/], > [qw/ cuatro 4/], > [qw/ cinco 5/], > [qw/ seis 6/], > [qw/ siete 7/] > ]; > ># create the object >my $obj = Select->new($hap,$snp,$pop); > >print Dumper $obj; > > >#print input block >print "this is the input block (Haplotype) [input_block]\n"; >foreach (@{$obj->input_block}){ > print "$_\n"; >} > >print "\n",'[snp_id] return the SNP IDs list',"\n"; >print "@{$obj->snp_ids}\n"; > >print "\n",'[pop_freq] return the population frequency',"\n"; > > >############## result for this set ########################## > >this is the input block (Haplotype) [input_block] >acgt?cact >acgt?ca-t >cg?tag?gc >cactcgtgc >cgctcgtgc >cggtag?gc >ac?t?cact > >[snp_id] return the SNP IDs list >s1 s2 s3 s4 s5 s6 s7 s8 s9 > >[pop_freq] return the population frequency >uno 1 >dos 2 >tres 3 >cuatro 4 >cinco 5 >seis 6 >siete 7 > >[deg_snp] if there is any degenerate SNP must be here >s7 s5 s3 > >[ht_set] working haplotype >0 0 0 >1 1 1 >0 0 2 >1 2 1 > >[ht_type] return the list of the snp type used >30 57 48 > >[useful_snp] >s1 s2 s6 s8 s9 > >[snp_type_code] >30 57 30 48 30 > >[ht_type] >30 57 48 >################################################################# > >once you have the set this is the second example > >####################### taggin module ########################### > >#!/usr/local/bin/perl > >use warnings; >use strict; >use Tag; >use Util; >use Data::Dumper; > >my $hap = [ > [qw/0 0 0/], > [qw/1 1 1/], > [qw/0 0 2/], > [qw/1 2 1/] > ]; > >my $obj = Tag -> 
new($hap); > >my $input = $obj->input_block; >my $tag_list = $obj->tag_list; >my $tag_length = $obj->tag_length; > >print "This was the input\n"; >foreach (@$input){ > print "@$_\n"; >} > >print "This is the tag list. You will need to tag the snp's on the >list (start from 0)\n"; >foreach (@$tag_list){ > print "@$_\n"; >} > >print "This is the tag length\n"; > >print "$tag_length\n"; > >###################### this is the result for this set ################# >This was the input >0 0 0 >1 1 1 >0 0 2 >1 2 1 >This is the tag list. You will need to tag the snp's on the list >(start from 0) >1 2 >This is the tag length >2 > >######################################################################## > >Do you think this information if enough? > >I know that the biological implication of multiallelic variance is >rare but I have taken that into account. > > > >> > I would like also to know if there is any perl style to follow. Any >>> documentation about that? >> >>In bioperl release there is biodesign.pod which gives you some hints. > >Thanks for the reference. > >> > >>> If you think this can be insteresting for bioperl, please let me know. >>> >> >>I am sure this would be and I would be happy to help you do conversion >>into bioperl style... > >Thanks once again. > >Pedro > >> >> >>> Cheers >>> Pedro >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l@portal.open-bio.org >>> http://portal.open-bio.org/mailman/listinfo/bioperl-l >>> >> >>----------------------------------------------------------------- >>Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420 >>. >>----------------------------------------------------------------- From jason at cgt.duhs.duke.edu Wed Nov 5 07:53:19 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Wed Nov 5 07:49:58 2003 Subject: Fwd: Re: [Bioperl-l] haplotype In-Reply-To: References: Message-ID: Sounds good. 
I would prefer the namespace to be Bio::PopGen::Haplotype as we already have Bio::PopGen::Genotype and this seems to fit more properly into that subclass rather than a top-level Bio:: class. I have some unfinished code which tries to do some similar things and builds Individuals and Genotypes from a DNA alignment. At some point will want to try and merge these things if there is some overlap, but happy to see your code prior to all of this. -jason On Wed, 5 Nov 2003, Pedro wrote: > I forgot cc to the bioperl-list. > > Here is the code example. > > Cheers > Pedro > > >Date: Wed, 5 Nov 2003 12:09:27 +0000 > >To: Ewan Birney > >From: Pedro > >Subject: Re: [Bioperl-l] haplotype > >Cc: > >Bcc: > >X-Attachments: > > > >Hi Ewan, > > > >Thanks for your email. > >>On Wed, 5 Nov 2003, Pedro wrote: > >> > >>> Dear developers, > >>> > >>> I have done some work on SNP haplotypes. I have created two modules > >>> one to find the ht_SNP set and another one to tag the minimal set of > >>> an haplotype. > >>> > >>> I would like to contibute to bioperl these two modules. > >>> > >>> The modules could be under: > >>> > >>> Bio::Haplotype::Select > >>> Bio::Haplotype::Tag > >>> > >>> I also have stand alone examples about how the code is working and > >>> about what to input and what you get from the module. 
> >>> > >> > >>It would be great if you could show a little synopsis of how a client > >>would use objects > > > > > >Select.pm > >my $hap = Select->new($hap,$snp,$pop); > > > >methods: > > > >$hap->input_block; # returns the input block > >$hap->snp_ids; # returns the snp_ids set > >$hap->pop_freq; # returns the population id and their frequency > > > >$hap->deg-snp; # list of degenerated snps > >$hap->snp_type; # all snp and their types > >$hap->no_snp; # false snps > >$hap->useful_snp; # list of snps that can be used for the analysis > >$hap->ht_type; # selection of snps which create not redundante set > >$hap->ht_set; # minimal set of SNP to tag > >$hap->snp_type_code # every snp is converted to a numeric code. > > # this is the result. > > > >Tag.pm > > > >my $tag = Tag -> new($hap); > > > >methods: > > > >$tag->input_block; # input block > >$tag->tag_list; # returns list of SNP's combination. > >$tag->tag_length; # return the minimal number of SNPs you have to tag > > # to define the block > > > > > > > >Here come the examples > > > >############## Call to Select.pm ########################## > > > >#!/usr/local/bin/perl > > > >use warnings; > >use strict; > >use Select; > >use Util; > >use Data::Dumper; > > > >my $hap = [ > > 'acgt?cact', > > 'acgt?ca-t', > > 'cg?tag?gc', > > 'cactcgtgc', > > 'cgctcgtgc', > > 'cggtag?gc', > > 'ac?t?cact' > > ]; > > > >my $snp = [qw/s1 s2 s3 s4 s5 s6 s7 s8 s9/]; > > > >my $pop = [ > > [qw/ uno 1/], > > [qw/ dos 2/], > > [qw/ tres 3/], > > [qw/ cuatro 4/], > > [qw/ cinco 5/], > > [qw/ seis 6/], > > [qw/ siete 7/] > > ]; > > > ># create the object > >my $obj = Select->new($hap,$snp,$pop); > > > >print Dumper $obj; > > > > > >#print input block > >print "this is the input block (Haplotype) [input_block]\n"; > >foreach (@{$obj->input_block}){ > > print "$_\n"; > >} > > > >print "\n",'[snp_id] return the SNP IDs list',"\n"; > >print "@{$obj->snp_ids}\n"; > > > >print "\n",'[pop_freq] return the population frequency',"\n"; > > > > 
> >############## result for this set ########################## > > > >this is the input block (Haplotype) [input_block] > >acgt?cact > >acgt?ca-t > >cg?tag?gc > >cactcgtgc > >cgctcgtgc > >cggtag?gc > >ac?t?cact > > > >[snp_id] return the SNP IDs list > >s1 s2 s3 s4 s5 s6 s7 s8 s9 > > > >[pop_freq] return the population frequency > >uno 1 > >dos 2 > >tres 3 > >cuatro 4 > >cinco 5 > >seis 6 > >siete 7 > > > >[deg_snp] if there is any degenerate SNP must be here > >s7 s5 s3 > > > >[ht_set] working haplotype > >0 0 0 > >1 1 1 > >0 0 2 > >1 2 1 > > > >[ht_type] return the list of the snp type used > >30 57 48 > > > >[useful_snp] > >s1 s2 s6 s8 s9 > > > >[snp_type_code] > >30 57 30 48 30 > > > >[ht_type] > >30 57 48 > >################################################################# > > > >once you have the set this is the second example > > > >####################### taggin module ########################### > > > >#!/usr/local/bin/perl > > > >use warnings; > >use strict; > >use Tag; > >use Util; > >use Data::Dumper; > > > >my $hap = [ > > [qw/0 0 0/], > > [qw/1 1 1/], > > [qw/0 0 2/], > > [qw/1 2 1/] > > ]; > > > >my $obj = Tag -> new($hap); > > > >my $input = $obj->input_block; > >my $tag_list = $obj->tag_list; > >my $tag_length = $obj->tag_length; > > > >print "This was the input\n"; > >foreach (@$input){ > > print "@$_\n"; > >} > > > >print "This is the tag list. You will need to tag the snp's on the > >list (start from 0)\n"; > >foreach (@$tag_list){ > > print "@$_\n"; > >} > > > >print "This is the tag length\n"; > > > >print "$tag_length\n"; > > > >###################### this is the result for this set ################# > >This was the input > >0 0 0 > >1 1 1 > >0 0 2 > >1 2 1 > >This is the tag list. You will need to tag the snp's on the list > >(start from 0) > >1 2 > >This is the tag length > >2 > > > >######################################################################## > > > >Do you think this information if enough? 
> > > >I know that the biological implication of multiallelic variance is > >rare but I have taken that into account. > > > > > > > >> > I would like also to know if there is any perl style to follow. Any > >>> documentation about that? > >> > >>In bioperl release there is biodesign.pod which gives you some hints. > > > >Thanks for the reference. > > > >> > > >>> If you think this can be insteresting for bioperl, please let me know. > >>> > >> > >>I am sure this would be and I would be happy to help you do conversion > >>into bioperl style... > > > >Thanks once again. > > > >Pedro > > > >> > >> > >>> Cheers > >>> Pedro > >>> > >>> > >>> _______________________________________________ > >>> Bioperl-l mailing list > >>> Bioperl-l@portal.open-bio.org > >>> http://portal.open-bio.org/mailman/listinfo/bioperl-l > >>> > >> > >>----------------------------------------------------------------- > >>Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420 > >>. > >>----------------------------------------------------------------- > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From lstein at cshl.edu Wed Nov 5 09:39:20 2003 From: lstein at cshl.edu (Lincoln Stein) Date: Wed Nov 5 09:36:12 2003 Subject: [Bioperl-l] Re: problems with Bio::Tools::GFF In-Reply-To: References: <1067881262.1436.47.camel@localhost.localdomain> <1067888032.3457.81.camel@localhost.localdomain> Message-ID: <200311050939.20901.lstein@cshl.edu> > I am a bit wary of splitting on space wrt the last column but so we'll > have to cook up some test cases to make sure it goes through okay. Split on space? You shouldn't need to do that. 
> > > Note that I have also made no attempt to parse/write the Gap or > > > Alignment stuff in any sort of special way - I basically made it so it > > > supports what GFF2 currently looks like only in GFF3 flavor. Perhaps > > > it makes sense to do all of that work on Chris's Unflattner though > > > rather than in Tools::GFF. A SeqFeature::Tools::Flattner is probably > > > in order as well to turn HSPs and other paired sequences into GFF3 > > > Alignments. > > > > I'm not sure it's necessary to move to Unflattener. Since the format is > > fairly simple, it is only really necessary to split the information in > > the groups column to tag value pairs and let the user decide what to do > > with the information. The only thing that I am somewhat at a loss to > > deal with is cigar line info, but I don't think that is being parse by > > Bio::DB::GFF yet either. Every CIGAR line can be turned into a set of nongapped HSPs and vice versa. The main issue is that if you represent a gapped alignment as a CIGAR, you can't give each of its HSPs a separate score! I think this is a big problem. Therefore you can either ignore CIGARs completely, or mix CIGARs with HSPs. Lincoln > > One day I could imagine us building Gene/Transcript objects from the GFF3. > Actually I was thinking we'd need a Flattner to turn the Gene object back > into flattened features. Likewise with HSP objects and alignments. I > can't produce CIGAR lines currently from HSPs - I'm still a little > confused about how to construct them but it means I need to read the spec > a little more probably. > > > > As for the seq stuff - will likely need a Bio::SeqIO::gff3 for that. > > > > Ouch--I was afraid you were going to suggest that. I suppose if we make > > it a read-only module, I guess that should be ok. The thought of making > > it write makes my head hurt. > > For writing multiple sequences, could be pretty ugly. Either some > caching OR a special write_seq which takes an arrayref. 
Maybe not a SeqIO > after all.... unless GFF3 lets a new set start with > # gff-version 3 > so you could interleave them? > # gff-version 3 > .. > ##FASTA > > >oneseq.1 > > CAGT > # gff-version 3 > .. > ## FASTA > > >oneseq.2 > > GATC > > > For reading sequences next_seq will have to parse in the entire GFF file > at once and next_seq will have to iterate through an internal array I > guess. Not that hard I hope... > > > > Anyone is welcome to add these changes - I don't think I'll be able to > > > make many contributions until December so it would be best if someone > > > else took it on. > > > > > > -jason > > > > > > On Mon, 3 Nov 2003, Scott Cain wrote: > > > > Hi Jason and Lincoln, > > > > > > > > I have a few concerns with Bio::Tools::GFF. The first is with the > > > > method _from_gff3_string, which does a split on \t to separate > > > > columns. I think the GFF3 spec says it can be space delimited, so > > > > that should probably be \s+. Additionally, to split the groups > > > > column, it uses \s*;\s*, but I think that spaces have to be escaped, > > > > therefore, it should only split on ; and spaces would indicate a > > > > problem (especially if one splits on spaces as indicated above). > > > > > > > > Finally, it doesn't provide a method of accessing the sequence that > > > > is optionally at the bottom of the file. I am not exactly sure how > > > > to implement that (or I would), but I suspect it will have to be > > > > handled in the next_feature method. Of course, the problem with > > > > handling it there is that it is not a feature. 
> > > > > > > > Scott > > > > > > -- > > > Jason Stajich > > > Duke University > > > jason at cgt.mc.duke.edu > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- Lincoln Stein lstein@cshl.edu Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) From Xiaoying.Lin at celera.com Wed Nov 5 09:47:57 2003 From: Xiaoying.Lin at celera.com (Lin, Xiaoying) Date: Wed Nov 5 09:44:42 2003 Subject: [Bioperl-l] Concatenating Bacterial Genome Sequence Message-ID: I believe NCBI's 'ref' genomes are presented as 1 seq/chromosome. If the genome you are interested in is published, it should be in that collection. They are in the GenBank format, you can convert with SeqIO into EMBL if that is the only format your script take. If the genome is not in NCBI genomes yet, I found it is easier to load things into the database first, and do the merging afterwards, sequence first, and feature coordinates. If there are overlaps between pieces, you have to watch out for features that fall into the overlap, or spanning the boundary of a overlap, and which sequence to use if the overlapping sequences differ. This is often an issue when dealing with BACs, but those pieces in your case should have been artificially cut from the chromosome, and the overlapping regions should be identical, so it may not be a problem at all. Depends how your database is set up, this script should be 1 day worth of work. I did not use bioperl (at version 0.7) then. 
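
For what it's worth, the coordinate bookkeeping for that kind of merge can be sketched in plain Perl. This is only a minimal sketch, assuming the segments are already in genome order and do not overlap; the data layout and the merge_segments name are made up for illustration and are not part of bioperl or ensembl.

```perl
use strict;
use warnings;

# Hypothetical input: segments in genome order, each with a length and a
# list of features whose start/end are 1-based and relative to the segment.
my @segments = (
    { id => 'seg1', length => 1000,
      features => [ { name => 'geneA', start => 950, end => 990 } ] },
    { id => 'seg2', length => 500,
      features => [ { name => 'geneB', start => 10, end => 120 } ] },
);

# Shift every feature by the summed length of the segments before it.
# Returns the total length followed by the list of shifted features.
sub merge_segments {
    my @segs   = @_;
    my $offset = 0;
    my @merged;
    for my $seg (@segs) {
        for my $f (@{ $seg->{features} }) {
            push @merged, {
                name   => $f->{name},
                start  => $f->{start} + $offset,
                end    => $f->{end} + $offset,
                source => $seg->{id},      # remember the original segment
            };
        }
        $offset += $seg->{length};
    }
    return ($offset, @merged);
}

my ($total, @features) = merge_segments(@segments);
print "total length: $total\n";
printf "%s\t%d\t%d\t(from %s)\n", @{$_}{qw(name start end source)} for @features;
```

With everything on one coordinate system, asking whether two genes are adjacent becomes a simple sort on start. Overlapping segments would need the overlap length subtracted from the offset, which this sketch does not attempt.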
Regards, Xiaoying ----------- Xiaoying Lin, PhD Senior Manager Celera Genomics 45 West Gude Drive, Rockville, MD 20850 240-453-3695, 240-453-3768 (FAX), Xiaoying.Lin@celera.com > -----Original Message----- > From: michael watson (IAH-C) [mailto:michael.watson@bbsrc.ac.uk] > Sent: Wednesday, November 05, 2003 6:51 AM > To: 'ensembl-dev@ebi.ac.uk' > Cc: Bioperl > Subject: [Bioperl-l] Concatenating Bacterial Genome Sequence > > > Hi > > First of all, apologies to posting to both lists at once, I > realise a lot of people will get this e-mail twice, but I > believe this question is of relevance to both lists. > > Those of you on the ensembl list will be familiar with my > (successful!) attempts to put the Salmonella genome into an > ensembl (well, actually, an otter) database - the > parse_pathogen script, by and large, worked very well and I > have a (mostly) functional website. > > The problem comes from the fact that the EMBL entries for the > bacterial genomes I am interested in consist of many > different sequences which represent segments of the genome. > So parse_pathogen handles this by creating a new ensembl > "chromosome" for each segment. Of course these bacterial > genomes are circular and constant, so splitting them up into > chromosomes doesn't make too much sense, but I can get away > with it most of the time with typhi CT18, which is in 20 > pieces, and typhi Ty2, which is in 16 pieces, but when I come > to typhimurium LT2, this is in 220 pieces; If I want to pose > the question "Are these two gene's adjacent on the genome?", > normally a very simple task using ensembl, I will have to do > some jumping through hoops figuring out if the genes are at > the end of segments, and if so, what are the adjacent > segments and are the gene's adjacent on the genome but on two > different segments... 
> > So what would be realy great, and this is where bioperl > (maybe) comes in, is something that takes the EMBL entry for > the S.typhimurium genome, which is actually 220 EMBL > sequences, and creates a single EMBL sequence entry for the > whole genome, with all the feature's updated so that their > location is relative to the start of the whole genome, and > not just of the segment they are on. Has anyone done this > and care to share? If not, any comments on how > difficult/easy this might be using Bioperl would be welcome. > > Regards > > Mick > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-> bio.org/mailman/listinfo/bioperl-l > From lstein at cshl.edu Wed Nov 5 10:51:49 2003 From: lstein at cshl.edu (Lincoln Stein) Date: Wed Nov 5 10:48:30 2003 Subject: [Bioperl-l] Bio::Graphics::Feature In-Reply-To: References: Message-ID: <200311051051.49465.lstein@cshl.edu> Thanks! Lincoln On Tuesday 04 November 2003 09:38 pm, Jason Stajich wrote: > I brought Bio::Graphics::Feature up to spec in that it now implements > the primary_seq, get_SeqFeatures, get_all_SeqFeatures, species, annotation > methods. > > We probably need to check that Bio::DB::GFF::Segment is also doing the > same. I know that Hilmar instigated a number of changes on the main trunk > outside of 1.2.x to try and standardize the method names in modules so > that they look like get_XXX. 
> > -jason > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- Lincoln Stein lstein@cshl.edu Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) From shawnh at stanford.edu Wed Nov 5 11:25:50 2003 From: shawnh at stanford.edu (Shawn Hoon) Date: Wed Nov 5 11:20:24 2003 Subject: [Bioperl-l] Bio::Tools::FootPrinter In-Reply-To: <200311050007.hA507MkR021251@rm-rstar.sfu.ca> References: <200311050007.hA507MkR021251@rm-rstar.sfu.ca> Message-ID: Uhm yeah, I haven't played with this in a while. The parser is not bug free as the output was not terribly amenable to parsing. For example, the same motif represented by the same character 111111 for example has different lengths for different instances. So what I took to be the 'motif' was the longest one with that string of characters. Maybe if you have suggestions for fixing this or sending an example output for this I can take a look. cheers, shawn On Tuesday, November 4, 2003, at 4:07PM, ypeng@sfu.ca wrote: > Hi all, > > Did anybody use this module to parse FootPrinter output? I am trying > to use > it but encountered two problems: > > 1. Some shorter motifs also show up. Say, I specified the motif > size=14, > "motifs" like "accg" also reported as features. > > 2. The parsed output is not the same as the FootPrinter HTML output. > > Any help will be appreciated. 
>
>
> Fred Peng
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

From harris at cshl.org Wed Nov 5 12:34:01 2003
From: harris at cshl.org (Todd Harris)
Date: Wed Nov 5 12:30:42 2003
Subject: [Bioperl-l] Enabling SVG output from Bio::Graphics
Message-ID:

Hi all -

Formalities: for those of you who don't know me, I'm a postdoc in
Lincoln's lab. I've been lurking on the list for a while. It's good to
be here!

Due to popular demand (that is, I needed it myself, hehe), I'm
implementing SVG output from Bio::Graphics. To accomplish this, I've
written a module that overrides GD methods and translates them into SVG
output. In general, this approach works fairly well, although some
methods do not translate well.

In order to make this happen, some fairly substantive changes need to be
made to the Bio::Graphics core. In particular, this requires removing
direct calls to GD methods from all subclasses, instead using the gd()
accessor method to fetch an image object appropriate for the type of
image being generated.

I haven't worked out all the details yet, but it should work something
like this.

1. Bio::Graphics::Panel should accept an optional parameter for the
   image class

      Bio::Graphics::Panel->new(-image_class=>'SVG');

2. The gd() method of Panel.pm creates (and acts as an accessor to)
   the image object:

      my $image_class = $self->{image_class};
      my $gd = $existing_gd || $image_class->new($width,$height);
      $self->{gd} = $gd;

3. Glyphs and such should NOT use exported GD methods (and not 'use GD')

      gdMediumBoldFont             # no!

      my $img = $self->gd;
      $img->gdMediumBoldFont;      # yes, thank you!

4. Finally, the svg() method will dump the SVG

      $img->svg();

Comments and suggestions would be most appreciated!
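
A rough, self-contained sketch of points 1, 2 and 4 follows. The Sketch::* classes are hypothetical stand-ins, not the real Bio::Graphics::Panel or GD code, and drawing is reduced to a single line() call; it only shows the panel remembering an image class and building the image object lazily through gd().

```perl
use strict;
use warnings;

# Hypothetical stand-in for an image backend; it only records draw calls.
package Sketch::Image;
sub new {
    my ($class, $width, $height) = @_;
    return bless { width => $width, height => $height, ops => [] }, $class;
}
sub line { my $self = shift; push @{ $self->{ops} }, [@_]; return }
sub ops  { return @{ $_[0]{ops} } }

# Hypothetical SVG-flavoured backend: same drawing API plus an svg() dump.
package Sketch::Image::SVG;
our @ISA = ('Sketch::Image');
sub svg {
    my $self = shift;
    my $body = '';
    for my $op ($self->ops) {
        my ($x1, $y1, $x2, $y2) = @$op;
        $body .= qq{<line x1="$x1" y1="$y1" x2="$x2" y2="$y2"/>};
    }
    return qq{<svg width="$self->{width}" height="$self->{height}">$body</svg>};
}

# Hypothetical panel: stores the image class and creates the image on demand.
package Sketch::Panel;
sub new {
    my ($class, %args) = @_;
    my $self = bless {
        width       => $args{-width}       || 100,
        height      => $args{-height}      || 100,
        image_class => $args{-image_class} || 'Sketch::Image',
    }, $class;
    return $self;
}
sub gd {
    my ($self, $existing_gd) = @_;
    return $self->{gd} if $self->{gd};           # plain accessor after first call
    my $image_class = $self->{image_class};
    $self->{gd} = $existing_gd
        || $image_class->new($self->{width}, $self->{height});
    return $self->{gd};
}

package main;

my $panel = Sketch::Panel->new(
    -image_class => 'Sketch::Image::SVG',
    -width       => 200,
    -height      => 50,
);
my $img = $panel->gd;            # glyph code would only ever go through gd()
$img->line(0, 25, 200, 25);
print $img->svg, "\n";
```

The point of routing everything through gd() is that glyph code never names the backend class, so swapping the image class is a constructor argument rather than a code change.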
Thanks,

Todd Harris

From m_conte at hotmail.com Thu Nov 6 10:09:46 2003
From: m_conte at hotmail.com (matthieu CONTE)
Date: Thu Nov 6 10:06:30 2003
Subject: [Bioperl-l] BioSQL
Message-ID:

Hello,

We just transferred Arabidopsis thaliana's EMBL sequences, via SwissProt
format, into BioSQL.
We are able to retrieve sequences and annotations (using a
Bio::DB::PersistenceAdaptorI and the 'find_by_unique_key' method) from a
BioentryID.
But from an entry, we are unable to retrieve the accession numbers for
Pfam, Prosite and others from the cross-references table (BioentryDbxref).

Could you please tell us which adaptor we can use and how to do it?

In short, how can we use the Bio::DB::...adaptor classes to access all
the tables in a BioSQL format database?!!

Many thanks in advance,

M

Matthieu CONTE
23 route d'EUS
66500 CATLLAR
Tel 0468962854
m_conte@hotmail.com

_________________________________________________________________
MSN Messenger: chat live with your friends!
http://www.msn.fr/msger/default.asp

From ndr at sanger.ac.uk Thu Nov 6 11:49:55 2003
From: ndr at sanger.ac.uk (Neil Rawlings)
Date: Thu Nov 6 11:46:33 2003
Subject: [Bioperl-l] embl.pm and virus names
Message-ID: <131DF14ECE564A4D8F35376FD346517201C37C4B@EXCHSRV1.internal.sanger.ac.uk>

I am trying to use Bio::SeqIO::embl to parse EMBL database entries, but
I have problems retrieving the organism name whenever the EMBL entry is
for a viral sequence. I am using the embl.pm module and a line such as:

   my ($spec, $genus) = $entry->species->classification();

But for a virus (which doesn't have a species name - for example "apple
chlorotic leaf spot virus") I get "Apple chlorotic" as the organism
name. I'm not just interested in viruses, so I'm happy when the name
comes back as "Drosophila melanogaster". The problem is also apparent
for some bacteria, especially something like Synechocystis sp. PCC6803
in which PCC6803 is lost (probably assumed to be a subspecies name).
If a solution exists to this problem, please let me know.

============================================================================
Neil D. Rawlings
Sanger Institute
Wellcome Trust Genome Campus
Hinxton, Cambs CB10 1SA, UK
Tel: +1223 495330
Fax: +1223 494919
E-mail: ndr@sanger.ac.uk
============================================================================
Please visit the MEROPS database for peptidase classification.
The URL is: MEROPS.SANGER.AC.UK

From hlapp at gnf.org Thu Nov 6 17:48:44 2003
From: hlapp at gnf.org (Hilmar Lapp)
Date: Thu Nov 6 17:45:20 2003
Subject: [Bioperl-l] Re: [BioSQL-l] (no subject)
In-Reply-To:
Message-ID:

On 11/6/03 1:26 AM, "matthieu CONTE" wrote:

> Hello,
>
> We just transferred Arabidopsis thaliana's EMBL sequences via SwissProt in
> BioSQL format.
> We are able to take out sequences and annotations (using a
> Bio::DB::PersistenceAdaptorI and the 'find_by_unique_key' method) from a
> BioentryID.
> But from an entry, we are unable to take out the access number Pfam, Prosite
> and others... from the crossreferences table (BioentryDbxref).
>
> Could you please tell us which adaptor can we use and how to do it ?
>
> In short, how can we use the Bio::DB:: ...adaptor to access to all the
> tables in a BioSQL format?!!
>

Generally speaking, you use adaptors to pull objects out of or get
objects into the database, the idea being that you don't have to care a
lot about which tables are involved.

Since you don't exactly explain what you're trying to do, there are
multiple answers and I'll give you three, hoping to hit a match with at
least one.

A) If all you know is the database and accession of the dbxref, you can
pull it out by a unique key query:

    $dbxref = Bio::Annotation::DBLink->new(-dbname => 'Genbank',
                                           -primary_id => 'BC6256426');
    $adp = $db->get_persistence_adaptor($dbxref);
    $found = $adp->find_by_unique_key($dbxref);

B) If you want to pull out all dbxrefs of a sequence entry, and you have
the seq object in hand, all annotation will have been loaded. Hence,

    $seq = < ... e.g., find by unique key ...>
    @dblinks = $seq->annotation->get_Annotations('dblink');

Will do the job.

C) Same as B) but you don't have the seq object and you don't want it
either.

    $query = Bio::DB::Query::BioQuery->new(
        -datacollections => ["Bio::SeqI s",
                             "Bio::Annotation::DBLink dbx",
                             "Bio::SeqI<=>Bio::Annotation::DBLink"],
        -where => ["s.accession_number = 'BC236452'"]);
    $adp = $db->get_persistence_adaptor("Bio::Annotation::DBLink");
    $result = $adp->find_by_query($query);
    while(my $dbx = $result->next_object()) {
        # do something with $dbx
    }

I may have forgotten a parameter or so, especially for C), check out the
documentation in Bio::DB::PersistenceAdaptorI (and possibly also
Bio::DB::BioSQL::BasePersistenceAdaptor, the base class for all
implementors).

Hth,

    -hilmar

> Many thanks in advance,
>
> M
>
>
>
>
> Matthieu CONTE
> 23 route d'EUS
> 66500 CATLLAR
> Tel
> 0468962854
> m_conte@hotmail.com
>
> _________________________________________________________________
> MSN Search, le moteur de recherche qui pense comme vous !
> http://search.msn.fr/worldwide.asp
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l@open-bio.org
> http://open-bio.org/mailman/listinfo/biosql-l
>

--
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca.
92121 phone: +1-858-812-1757 ------------------------------------------------------------- From liam at mmb.usyd.edu.au Thu Nov 6 20:50:08 2003 From: liam at mmb.usyd.edu.au (Liam Elbourne) Date: Thu Nov 6 20:46:51 2003 Subject: [Bioperl-l] macperl Message-ID: I was just wondering, preparatory to installing bioperl, how relevant the documentation for bioperl 0.7/macperl 5.004 ( http://bioperl.org/Core/mac-bioperl.html ) would be for the current situation of bioperl 1.2.3/macperl 5.6 would be, and curious whether a more current version exists. Naturally, if there isn't a more current version, I am not going to look gift documentation in the particulars, and will take this opportunity to say "Thanks!" to Todd, if he is still out there. Regards, Liam Elbourne. From hlapp at gnf.org Thu Nov 6 22:33:20 2003 From: hlapp at gnf.org (Hilmar Lapp) Date: Thu Nov 6 22:29:55 2003 Subject: [Bioperl-l] macperl In-Reply-To: Message-ID: Bioperl should build like a charm on a MacOSX 10.2.+ platform, without any magic having to be undertaken. In fact, MacOSX is the main development platform for at least I believe 3 people on the core team (Jason, Ewan, I hope I don't grossly misquote you ...). You may have some trouble with dependencies though, but really the only one that may require some acrobatics is GD and its own dependencies, which I've heard can be a hassle on any platform. Just be careful that you don't install an older version of LWP as it will overwrite /usr/bin/head with /usr/bin/HEAD (OSX being case-insensitive). -hilmar On 11/6/03 5:50 PM, "Liam Elbourne" wrote: > > I was just wondering, preparatory to installing bioperl, how relevant > the documentation for bioperl 0.7/macperl 5.004 ( > http://bioperl.org/Core/mac-bioperl.html ) would be for the current > situation of bioperl 1.2.3/macperl 5.6 would be, and curious whether > a more current version exists. 
Naturally, if there isn't a more > current version, I am not going to look gift documentation in the > particulars, and will take this opportunity to say "Thanks!" to Todd, > if he is still out there. > > Regards, > Liam Elbourne. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From wes.barris at csiro.au Thu Nov 6 23:31:41 2003 From: wes.barris at csiro.au (Wes Barris) Date: Thu Nov 6 23:29:07 2003 Subject: [Bioperl-l] Understanding LocatableSeq In-Reply-To: References: <3FA6FE9D.2000007@csiro.au> Message-ID: <3FAB202D.6090409@csiro.au> Jason Stajich wrote: > On Tue, 4 Nov 2003, Wes Barris wrote: > > >>Ewan Birney wrote: >> >> >>>On Tue, 4 Nov 2003, Wes Barris wrote: >>> >>> >>> >>>>Hi, >>>> >>>>I am trying to create an msf alignment of several LocatableSeq objects. >>>>I have tried setting the "start" and "end" attributes of each LocatableSeq >>>>object before adding it to the alignment but the resulting sequences are >>>>still not aligned in the SimpleAlign object. What am I doing wrong? >>>> >>> >>> >>>SimpleAlign does not automagically do the alignment - you need to call out >>>to Bio::Tools::Clustalw or TCoffee or something else. SimpleAlign >>>*represents* alignments, doesn't make them. >> >>I know that SimpleAlign does not automagically do the alignment. That >>is why I am setting the "start" and "end" attributes. I have an ACE >>file from clustal. I am trying to write an msf file. I have the >>sequences and the alignment information. Now I want to write this >>into a SimpleAlign object. >> > > > You still need to prefix/postfix with the requisite number of gaps. 
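
As a rough illustration of the padding step just mentioned, here is a plain-Perl sketch. It assumes you already know the 0-based alignment column at which each ungapped sequence begins; the 'offset' key and the pad_rows name are invented for this example and are not bioperl API. The padded strings could then be used as the seq() of each Bio::LocatableSeq before it is added to the alignment.

```perl
use strict;
use warnings;

# Pad each row with leading and trailing '-' so that all rows end up with
# the same length. Each input row is a hash with id, seq and offset, where
# offset is the 0-based alignment column of the first residue.
sub pad_rows {
    my @rows  = @_;
    my $width = 0;
    for my $r (@rows) {
        my $right = $r->{offset} + length($r->{seq});
        $width = $right if $right > $width;
    }
    my @padded;
    for my $r (@rows) {
        my $lead  = '-' x $r->{offset};
        my $trail = '-' x ($width - $r->{offset} - length($r->{seq}));
        push @padded, { id => $r->{id}, seq => $lead . $r->{seq} . $trail };
    }
    return @padded;
}

my @aligned = pad_rows(
    { id => 'this', seq => 'GATCGATC', offset => 0 },
    { id => 'that', seq => 'ATCGAT',   offset => 1 },
);
printf "%-6s %s\n", $_->{id}, $_->{seq} for @aligned;
```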
> > The start/end describe where the sequence participating the alignment > COMES FROM not where they are in the alignment, so you have to > explicitly code their alignment by placing the right number of gaps. Ok. I understand that I have to add the gap characters on either end of each aligned sequence. Sorry for being so dense but I still don't understand the use of the "start" and "end" attributes. They don't appear to do anything. If I have two sequences: GATCGATC and ATCGAT what would be the start and end for each sequence or doesn't it matter? When you say that they represent where the sequence COMES FROM, what does that mean? > > > >>> >>>(it would be cute if simple align did this magically but probably would >>>lead to some fascinating bug-hunts as simple align booted up a full >>>progressive alignment engine when a sequence was added. Hmmm. Wistful >>>thinking...) >>> >>> >>> >>> >>> >>> >>>>#!/usr/local/bin/perl -w >>>># >>>># >>>>use strict; >>>>use Bio::AlignIO; >>>># >>>>my $aln = new Bio::SimpleAlign(); >>>>my $lseq; >>>>$lseq = new Bio::LocatableSeq(); >>>>$lseq->seq('GATCGATC'); >>>>$lseq->id('this'); >>>>$lseq->start(1); >>>>$lseq->end(8); >>>>$aln->add_seq($lseq); >>>> >>>>$lseq = new Bio::LocatableSeq(); >>>>$lseq->seq('ATCGAT'); >>>>$lseq->id('that'); >>>>$lseq->start(2); >>>>$lseq->end(7); >>>>$aln->add_seq($lseq); >>>> >>>>my $outstream = new Bio::AlignIO(-format=>'msf', -file=>">junk.msf"); >>>>$outstream->write_aln($aln); >>>>undef $outstream; >>>> >>>> >>>>The output looks like this: >>>> >>>> >>>>NoName MSF: 2 Type: N Tue Nov 4 09:01:58 2003 Check: 00 .. >>>> >>>> Name: this/1-8 Len: 8 Check: 2590 Weight: 1.00 >>>> Name: that/2-7 Len: 6 Check: 1547 Weight: 1.00 >>>> >>>>// >>>> >>>>this/1-8 GATCGATC >>>>that/2-7 ATCGAT <--- Should't this be shifted one position to the right? >>>> >>>> >>>>I am using bioperl-1.2.3. 
>>>>-- >>>>Wes Barris >>>>E-Mail: Wes.Barris@csiro.au >>>> >>>> >>>>_______________________________________________ >>>>Bioperl-l mailing list >>>>Bioperl-l@portal.open-bio.org >>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l >>>> >> >> >> > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu -- Wes Barris E-Mail: Wes.Barris@csiro.au From wes.barris at csiro.au Fri Nov 7 00:16:55 2003 From: wes.barris at csiro.au (Wes Barris) Date: Fri Nov 7 00:13:39 2003 Subject: [Bioperl-l] psl to gff? Message-ID: <3FAB2AC7.6090808@csiro.au> Hi, Does bioperl provide a psl parser and a gff writer? If so, where can I find information on them? I am using bioperl-1.2.3. -- Wes Barris E-Mail: Wes.Barris@csiro.au From todd at verdant.stanford.edu Fri Nov 7 00:10:45 2003 From: todd at verdant.stanford.edu (Todd Richmond) Date: Fri Nov 7 00:14:22 2003 Subject: [Bioperl-l] macperl In-Reply-To: References: Message-ID: <3FAB2955.8080603@verdant.stanford.edu> Liam Elbourne wrote: > > I was just wondering, preparatory to installing bioperl, how relevant > the documentation for bioperl 0.7/macperl 5.004 ( > http://bioperl.org/Core/mac-bioperl.html ) would be for the current > situation of bioperl 1.2.3/macperl 5.6 would be, and curious whether a > more current version exists. Naturally, if there isn't a more current > version, I am not going to look gift documentation in the particulars, > and will take this opportunity to say "Thanks!" to Todd, if he is > still out there. > I'm still around - though I made the switch to Mac OS X as soon as I could so there won't be any updated documentation from me. I would imagine a few of the issues caused by macperl 5.004 have gone away, but I'm sure the ever advancing nature of bioperl (and its dependencies) has introduced some new ones. If at all possible, I'd suggest using OS X - the ability to type "perl Makefile.pl, make, make test, make install" and have it all just work is worth it. 
Todd -- Todd Richmond todd@verdant.stanford.edu From jason at cgt.duhs.duke.edu Fri Nov 7 10:42:32 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Fri Nov 7 10:39:09 2003 Subject: [Bioperl-l] psl to gff? In-Reply-To: <3FAB2AC7.6090808@csiro.au> References: <3FAB2AC7.6090808@csiro.au> Message-ID: Only in bioperl-live or 1.3.x Bio::SearchIO::psl or Bio::Tools::Blat -jason On Fri, 7 Nov 2003, Wes Barris wrote: > Hi, > > Does bioperl provide a psl parser and a gff writer? If so, where can > I find information on them? > > I am using bioperl-1.2.3. > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From umayamla at mail.nih.gov Fri Nov 7 14:17:27 2003 From: umayamla at mail.nih.gov (Lowell Umayam) Date: Fri Nov 7 14:14:04 2003 Subject: [Bioperl-l] suggestion for drawing pedigrees Message-ID: <3FABEFC7.6060404@mail.nih.gov> It would be nice if there was a function that returned an array or hash of an imagemapping of a all the individuals in the pedigree. This can allow people to display these pedigrees in a cgi script and have the image link to other pages. Lowell From allenday at ucla.edu Fri Nov 7 20:51:00 2003 From: allenday at ucla.edu (Allen Day) Date: Fri Nov 7 20:47:36 2003 Subject: [Bioperl-l] load_gff3 handles ##sequence-region Message-ID: >From chado's load_gff3.pl POD: " Also, in order for the load to be successful, the reference sequences (eg, chromosomes or contigs) must be defined in the GFF file before any features on them are listed. This can be done either by the reference-sequence meta data specification, which would be lines that look like this: ##sequence-region chr1 1 246127941 ----except that this isn't supported yet--can I get Bio::Tools::GFF to give me this info? " this is now fixed on bioperl-live HEAD Bio::Tools::GFF. I've added parsing capability for the ##sequence region header tag, and stubbed out handling of other header tags from the GFF3 spec such as "##attribute ontology". 
segments are created as Bio::LocatableSeq objects and available via Bio::Tools::GFF::next_segment(). -allen From damien at rael.org Sat Nov 8 00:02:52 2003 From: damien at rael.org (Damien Marsic) Date: Fri Nov 7 23:59:27 2003 Subject: [Bioperl-l] How to check the validity of an accession number ? Message-ID: <01cc01c3a5b5$93b5ba90$a75aec18@damjancek> Hello, I hope someone will be able to help. I wrote a program that does the following: - opens a file containing sequences named with their Genbank accession number if it exists (and an arbitrary name if the sequence is not in Genbank) - reads each sequence name, looks in Genbank if it is an accession number, and if it is, retrieves the organism name and other information. If the name is not recognized as an accession number, the program says it and proposes to continue with the next sequences. My programs worked perfectly during the last 2 years or so. But for the last few months it does not work anymore, although I did not make any change to it. What is happening now is that when there is a sequence name that is not an accession number, the program crashes. The problem lies with this line: $seq = $stream -> get_Seq_by_acc($an); When $an was an invalid accession number, $seq used to be "undefined" and the program could go on (the program checks whether &seq in defined or undefined and then goes to the appropriate step). But now, when $an is an invalid accession number, the program just crashes with the following message: ------------- EXCEPTION ------------- MSG: acc does not exist STACK Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/lib/perl5/site_perl/5.6.1/Bio/DB/WebDBSeqI.pm:177 STACK Bio::DB::GenBank::get_Seq_by_acc /usr/lib/perl5/site_perl/5.6.1/Bio/DB/GenBank.pm:216 STACK toplevel phylplus.pl:90 -------------------------------------- Can someone helps me understand what is happening ? Were there some changes at Genbank that could explain why my program behaves differently ? 
Is there anything I can do to make it behave like it did before?

Is there any other way than "get_Seq_by_acc" to check if an accession
number exists or not?

Thanks in advance for the replies.

Damien

From heikki at nildram.co.uk Sat Nov 8 04:10:53 2003
From: heikki at nildram.co.uk (Heikki Lehvaslaiho)
Date: Sat Nov 8 04:07:34 2003
Subject: [Bioperl-l] How to check the validity of an accession number ?
In-Reply-To: <01cc01c3a5b5$93b5ba90$a75aec18@damjancek>
References: <01cc01c3a5b5$93b5ba90$a75aec18@damjancek>
Message-ID: <1068282652.2452.10.camel@localhost>

Damien,

The change is in the bioperl code. It was put in almost 11 months ago
for the 1.2 release. As the error message indicates, line 177 in
Bio::DB::WebDBSeqI now throws an error.

You can catch it with an eval statement, e.g. (inside your loop over
sequence names, and note the semicolon after the eval block):

  eval {
      $seq = $stream -> get_Seq_by_acc($an);
  };
  if ($@) {
      print STDERR "Not a valid accession [$an]\n";
      next;
  }

The Web interface to the bioperl CVS repository can help you track
whys, whens and hows of code changes:

http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/Bio/DB/WebDBSeqI.pm?cvsroot=bioperl

Yours,
    -Heikki

On Sat, 2003-11-08 at 05:02, Damien Marsic wrote:
> Hello,
>
> I hope someone will be able to help. I wrote a program that does the
> following:
>
> - opens a file containing sequences named with their Genbank accession
> number if it exists (and an arbitrary name if the sequence is not in
> Genbank)
>
> - reads each sequence name, looks in Genbank if it is an accession number,
> and if it is, retrieves the organism name and other information. If the name
> is not recognized as an accession number, the program says it and proposes
> to continue with the next sequences.
>
> My programs worked perfectly during the last 2 years or so. But for the last
> few months it does not work anymore, although I did not make any change to
> it. What is happening now is that when there is a sequence name that is not
> an accession number, the program crashes.
> > The problem lies with this line: > > $seq = $stream -> get_Seq_by_acc($an); > > When $an was an invalid accession number, $seq used to be "undefined" and > the program could go on (the program checks whether &seq in defined or > undefined and then goes to the appropriate step). > > But now, when $an is an invalid accession number, the program just crashes > with the following message: > > ------------- EXCEPTION ------------- > MSG: acc does not exist > STACK Bio::DB::WebDBSeqI::get_Seq_by_acc > /usr/lib/perl5/site_perl/5.6.1/Bio/DB/WebDBSeqI.pm:177 > STACK Bio::DB::GenBank::get_Seq_by_acc > /usr/lib/perl5/site_perl/5.6.1/Bio/DB/GenBank.pm:216 > STACK toplevel phylplus.pl:90 > -------------------------------------- > > Can someone helps me understand what is happening ? Were there some changes > at Genbank that could explain why my program behaves differently ? Is there > anything I can do to make it behave like it did before ? > > Is there any other way than "get_Seq_by_acc" to check if an accession number > exists or not ? > > Thanks in advance for the replies. > > Damien > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l From heikki at nildram.co.uk Sat Nov 8 04:28:30 2003 From: heikki at nildram.co.uk (Heikki Lehvaslaiho) Date: Sat Nov 8 04:25:05 2003 Subject: [Bioperl-l] Understanding LocatableSeq In-Reply-To: <3FAB202D.6090409@csiro.au> References: <3FA6FE9D.2000007@csiro.au> <3FAB202D.6090409@csiro.au> Message-ID: <1068283709.2449.26.camel@localhost> On Fri, 2003-11-07 at 04:31, Wes Barris wrote: > > You still need to prefix/postfix with the requisite number of gaps. > > > > The start/end describe where the sequence participating the alignment > > COMES FROM not where they are in the alignment, so you have to > > explicitly code their alignment by placing the right number of gaps. > > Ok. 
I understand that I have to add the gap characters on either end
> of each aligned sequence. Sorry for being so dense but I still don't
> understand the use of the "start" and "end" attributes. They don't
> appear to do anything. If I have two sequences:
>
> GATCGATC
> and
> ATCGAT
>
> what would be the start and end for each sequence or doesn't it matter?
> When you say that they represent where the sequence COMES FROM, what does
> that mean?

Wes,

Sequence alignments quite often represent local rather than global
alignments between sequences. Local means that only a small portion of
the compared sequences match each other. This is the approach used by,
for example, the Blast and Fasta programs.

Now, when you see a high-scoring alignment from a Fasta run, you want to
know which part of your query sequence matches which part of the
database sequence, so that you can, e.g., check the feature table:

300 ATGCGA 305
  3 ATGC--   6

If you are building your own alignment from scratch and you do not care
or know where the sequences came from, you assign '1' to the start and
the length of the sequence to the end.

If you then later manipulate your alignment, e.g. take a slice, the new
object knows where in your original alignment that slice came from
(i.e., what the original start and end columns were).

I hope this helped.

Yours,
-Heikki

From markw at illuminae.com Sat Nov 8 07:46:33 2003
From: markw at illuminae.com (Mark Wilkinson)
Date: Sat Nov 8 07:43:10 2003
Subject: [BioPerl] Re: [Bioperl-l] How to check the validity of an accession number ?
In-Reply-To: <1068282652.2452.10.camel@localhost>
References: <01cc01c3a5b5$93b5ba90$a75aec18@damjancek> <1068282652.2452.10.camel@localhost>
Message-ID: <1068295591.1710.1.camel@localhost.localdomain>

On Sat, 2003-11-08 at 03:10, Heikki Lehvaslaiho wrote:

> The change is in the bioperl code. It was put in almost 11 months ago
> for the 1.2 release. As the error message indicates, line 177 in
> Bio::DB::WebDBSeqI now throws an error.
Although I am now "getting used to it", I have to say that I found the earlier behaviour much more sensible and easier to deal with... why was this change made in this way? Mark From lstein at cshl.edu Sat Nov 8 08:48:09 2003 From: lstein at cshl.edu (Lincoln Stein) Date: Sat Nov 8 08:44:48 2003 Subject: [Bioperl-l] problems with Bio::Tools::GFF In-Reply-To: <1067881262.1436.47.camel@localhost.localdomain> References: <1067881262.1436.47.camel@localhost.localdomain> Message-ID: <200311080848.09325.lstein@cshl.edu> No, it is still tab delimited, but people consistently screw it up and it might be better to split on space, since spaces are no longer allowed in the columns. Lincoln On Monday 03 November 2003 12:41 pm, Scott Cain wrote: > Hi Jason and Lincoln, > > I have a few concerns with Bio::Tools::GFF. The first is with the method > _from_gff3_string, which does a split on \t to separate columns. I > think the GFF3 spec says it can be space delimited, so that should > probably be \s+. Additionally, to split the groups column, it uses > \s*;\s*, but I think that spaces have to be escaped, therefore, it > should only split on ; and spaces would indicate a problem (especially > if one splits on spaces as indicated above). > > Finally, it doesn't provide a method of accessing the sequence that is > optionally at the bottom of the file. I am not exactly sure how to > implement that (or I would), but I suspect it will have to be handled in > the next_feature method. Of course, the problem with handling it there > is that it is not a feature. 
>
> Scott

--
Lincoln Stein
lstein@cshl.edu
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)

From heikki at nildram.co.uk Sat Nov 8 09:05:00 2003
From: heikki at nildram.co.uk (Heikki Lehvaslaiho)
Date: Sat Nov 8 09:05:04 2003
Subject: [Bioperl-l] embl.pm and virus names
In-Reply-To: <131DF14ECE564A4D8F35376FD346517201C37C4B@EXCHSRV1.internal.sanger.ac.uk>
References: <131DF14ECE564A4D8F35376FD346517201C37C4B@EXCHSRV1.internal.sanger.ac.uk>
Message-ID: <1068300299.3434.19.camel@localhost>

Neil,

The Bio::Species class is relatively new in Bioperl and has not been
extensively tested. The EMBL parser simply expects to find a normal
binomial scientific name in every entry. I'll try to fix this for
viruses.

The current EMBL parser generates this kind of structure:

$VAR1 = bless( {
                 '_sub_species' => 'virus',
                 '_classification' => [
                                        'immunodeficiency',
                                        'Human',
                                        'Primate lentivirus group',
                                        'Lentivirus',
                                        'Retroviridae',
                                        'Retroid viruses',
                                        'Viruses'
                                      ]
               }, 'Bio::Species' );

I've now changed my copy of the parser to produce:

$VAR1 = bless( {
                 '_classification' => [
                                        'Human immunodeficiency virus',
                                        'Primate lentivirus group',
                                        'Lentivirus',
                                        'Retroviridae',
                                        'Retroid viruses',
                                        'Viruses'
                                      ]
               }, 'Bio::Species' );

There is no subspecies, and the whole OS line is in the first item of the
array. I am reasonably happy with this. My only gripe is that if you
call binomial() on this object, you get:

    'Primate lentivirus group Human immunodeficiency virus'

while genus() gives:

    'Primate lentivirus group'

Is this good enough, or can anyone suggest a better solution?

In addition to EMBL, I'll try to make sure that the GenBank and
SWISS-PROT parsers treat viruses similarly.

-Heikki

P.S. I could not find any EMBL entries with PCC6803 in the OS line, but
given an OS line like 'Synechocystis sp. PCC6803', 'PCC6803' should end
up in subspecies().
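The binomial() result described above falls out of how the classification
list is read. A plain-Perl sketch, assuming (as the output quoted above
suggests) that the first array element is taken as the species and the
second as the genus; the subs below are illustrative stand-ins, not the
Bio::Species source:

```perl
use strict;
use warnings;

# Illustrative stand-ins only (not the Bio::Species implementation).
# The lineage array runs from most specific to most general; slot 0 is
# read as the species and slot 1 as the genus.
sub species  { my ($lineage) = @_; return $lineage->[0] }
sub genus    { my ($lineage) = @_; return $lineage->[1] }
sub binomial { my ($lineage) = @_; return join ' ', genus($lineage), species($lineage) }

# An ordinary binomial name behaves as expected...
my @fly = ('melanogaster', 'Drosophila', 'Drosophilidae');

# ...but with the whole OS line in slot 0, the next rank up is read as
# the "genus", which gives the odd string mentioned in the message.
my @hiv = ('Human immunodeficiency virus', 'Primate lentivirus group',
           'Lentivirus', 'Retroviridae', 'Retroid viruses', 'Viruses');

print binomial(\@fly), "\n";
print binomial(\@hiv), "\n";
print genus(\@hiv), "\n";
```

For display purposes, a caller that knows it is dealing with a virus
could simply use the first element on its own rather than binomial().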
-H On Thu, 2003-11-06 at 16:49, Neil Rawlings wrote: > I am trying to use the Bio::SeqIO::EMBL to parse EMBL database entries, > but am having problems whenever I try to retrieve the organism name > whenever the EMBL entry is for a viral sequence. I am using the embl.pm > module and a line such as: > > My ($spec, $genus) = $entry->species->classification(); > > But for a virus (which doesn't have a species name - for example "apple > chlorotic leaf spot virus") I get "Apple chlorotic" as the organism > name. I'm not just interested in viruses, so I'm happy when the name > comes back as "Drosophila melanogaster". The problem is also apparent > for some bacteria, especially something like Synechocystis sp. PCC6803 > in which PCC6803 is lost (probably assumed to be a subspecies name). > > If a solution exists to this problem, please let me know. > > ======================================================================== > ==== > Neil D. Rawlings > Sanger Institute > Wellcome Trust Genome Campus > Hinxton, Cambs CB10 1SA, UK > > Tel: +1223 495330 > Fax: +1223 494919 > E-mail: ndr@sanger.ac.uk > ======================================================================== > ====== > Please visit the MEROPS database for peptidase classification. The URL > is: > href="merops.sanger.ac.uk">MEROPS.SANGER.AC.UK > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l From heikki at nildram.co.uk Sat Nov 8 09:08:25 2003 From: heikki at nildram.co.uk (Heikki Lehvaslaiho) Date: Sat Nov 8 09:05:23 2003 Subject: [BioPerl] Re: [Bioperl-l] How to check the validity of an accession number ? 
In-Reply-To: <1068295591.1710.1.camel@localhost.localdomain> References: <01cc01c3a5b5$93b5ba90$a75aec18@damjancek> <1068282652.2452.10.camel@localhost> <1068295591.1710.1.camel@localhost.localdomain> Message-ID: <1068300505.3440.23.camel@localhost> Mark, Jason's cvs commit note: "properly throw an error when no sequences are retrieved for a query -- cannot distinguish between errors and non-connections though at this point, but this makes t/Perl.t now properly run when network is disconnected [during an ice storm]" -Heikki On Sat, 2003-11-08 at 12:46, Mark Wilkinson wrote: > On Sat, 2003-11-08 at 03:10, Heikki Lehvaslaiho wrote: > > > The change is in the bioperl code. It was put in almost 11 months ago > > for the 1.2 release. As the error message indicates, line 177 in > > Bio::DB::WebDBSeqI now throws an error. > > Although I am now "getting used to it", I have to say that I found the > earlier behaviour much more sensible and easier to deal with... why was > this change made in this way? > > Mark > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. CB10 1SD, United Kingdom _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468 ___ _/_/_/_/_/________________________________________________________ From markw at illuminae.com Sat Nov 8 09:44:59 2003 From: markw at illuminae.com (Mark Wilkinson) Date: Sat Nov 8 09:41:36 2003 Subject: [BioPerl] Re: [Bioperl-l] How to check the validity of an accession number ? 
In-Reply-To: <1068300505.3440.23.camel@localhost> References: <01cc01c3a5b5$93b5ba90$a75aec18@damjancek> <1068282652.2452.10.camel@localhost> <1068295591.1710.1.camel@localhost.localdomain> <1068300505.3440.23.camel@localhost> Message-ID: <1068302699.1710.21.camel@localhost.localdomain> Well... I'm not convinced that this is the "polite" thing to do :-) We have to be much more forgiving in the MOBY world. The fact that THIS service provider does not understand the ID number, does *not* mean that the ID number is invalid (as you are asserting by the fact that you throw an error). In MOBY we would then simply pass it off to another service provider who claimed to know something about it, and so on, until in the end we just claim "nobody knew anything about it... but that STILL doesn't mean it is invalid!". We certainly would never "break" due to service providers ignorance, or we would be broken all the time :-) Mark On Sat, 2003-11-08 at 08:08, Heikki Lehvaslaiho wrote: > Mark, > > Jason's cvs commit note: > > "properly throw an error when no sequences are retrieved for a query -- > cannot distinguish between errors and non-connections though at this > point, but this makes t/Perl.t now properly run when network is > disconnected [during an ice storm]" > > -Heikki > > On Sat, 2003-11-08 at 12:46, Mark Wilkinson wrote: > > On Sat, 2003-11-08 at 03:10, Heikki Lehvaslaiho wrote: > > > > > The change is in the bioperl code. It was put in almost 11 months ago > > > for the 1.2 release. As the error message indicates, line 177 in > > > Bio::DB::WebDBSeqI now throws an error. > > > > Although I am now "getting used to it", I have to say that I found the > > earlier behaviour much more sensible and easier to deal with... why was > > this change made in this way? 
> > > > Mark > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- Mark Wilkinson Illuminae From damien at rael.org Sat Nov 8 19:05:04 2003 From: damien at rael.org (Damien Marsic) Date: Sat Nov 8 18:59:27 2003 Subject: [Bioperl-l] How to check the validity of an accession number ? References: <01cc01c3a5b5$93b5ba90$a75aec18@damjancek> <1068282652.2452.10.camel@localhost> Message-ID: <00a401c3a655$27371c60$41d8e592@XTAL5> Thank you for your help. However, when I try it says "Syntax error". Sorry I am not very good at perl and I have never seen "$@" before so I don't know how to use it. But it really seems that there is a spelling error in your example. PLease help. Damien ----- Original Message ----- From: "Heikki Lehvaslaiho" To: "Damien Marsic" Cc: "Bioperl" Sent: Saturday, November 08, 2003 3:10 AM Subject: Re: [Bioperl-l] How to check the validity of an accession number ? > Damien, > > The change is in the bioperl code. It was put in almost 11 months ago > for the 1.2 release. As the error message indicates, line 177 in > Bio::DB::WebDBSeqI now throws an error. You can catch it with eval > statement, e.g: > > eval { > $seq = $stream -> get_Seq_by_acc($an); > } > if ($@) { > print STDERR "Not a valid accession [$an]\n"; > next; > } > > > The Web interface to the bioperl CVS repository can help you track > whys, whens and hows of code changes: > > http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/Bio/DB/WebDBSeqI.pm?cvsroot=bioperl > > Yours, > > -Heikki > > > On Sat, 2003-11-08 at 05:02, Damien Marsic wrote: > > Hello, > > > > I hope someone will be able to help. 
I wrote a program that does the > > following: > > > > - opens a file containing sequences named with their Genbank accession > > number if it exists (and an arbitrary name if the sequence is not in > > Genbank) > > > > - reads each sequence name, looks in Genbank if it is an accession number, > > and if it is, retrieves the organism name and other information. If the name > > is not recognized as an accession number, the program says it and proposes > > to continue with the next sequences. > > > > My programs worked perfectly during the last 2 years or so. But for the last > > few months it does not work anymore, although I did not make any change to > > it. What is happening now is that when there is a sequence name that is not > > an accession number, the program crashes. > > > > The problem lies with this line: > > > > $seq = $stream -> get_Seq_by_acc($an); > > > > When $an was an invalid accession number, $seq used to be "undefined" and > > the program could go on (the program checks whether &seq in defined or > > undefined and then goes to the appropriate step). > > > > But now, when $an is an invalid accession number, the program just crashes > > with the following message: > > > > ------------- EXCEPTION ------------- > > MSG: acc does not exist > > STACK Bio::DB::WebDBSeqI::get_Seq_by_acc > > /usr/lib/perl5/site_perl/5.6.1/Bio/DB/WebDBSeqI.pm:177 > > STACK Bio::DB::GenBank::get_Seq_by_acc > > /usr/lib/perl5/site_perl/5.6.1/Bio/DB/GenBank.pm:216 > > STACK toplevel phylplus.pl:90 > > -------------------------------------- > > > > Can someone helps me understand what is happening ? Were there some changes > > at Genbank that could explain why my program behaves differently ? Is there > > anything I can do to make it behave like it did before ? > > > > Is there any other way than "get_Seq_by_acc" to check if an accession number > > exists or not ? > > > > Thanks in advance for the replies. 
> >
> > Damien
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

From jason at cgt.duhs.duke.edu Sat Nov 8 19:05:29 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Sat Nov 8 19:02:00 2003
Subject: [BioPerl] Re: [Bioperl-l] How to check the validity of an accession number ?
In-Reply-To: <1068302699.1710.21.camel@localhost.localdomain>
References: <01cc01c3a5b5$93b5ba90$a75aec18@damjancek> <1068282652.2452.10.camel@localhost> <1068295591.1710.1.camel@localhost.localdomain> <1068300505.3440.23.camel@localhost> <1068302699.1710.21.camel@localhost.localdomain>
Message-ID:

I have repeatedly asked for input on these modules and for someone to
really put them through their paces with all the different types of
errors one would get. If you can think of a good way to distinguish
between a temporary server error, a disconnected network, and an invalid
accession, please propose it, especially since you must be thinking
about these types of situations in the MOBY world. A simple spec as to
what you would expect from the client lib would go a long way to making
it behave as you like.

On Sat, 8 Nov 2003, Mark Wilkinson wrote:

> Well... I'm not convinced that this is the "polite" thing to do :-)
>
> We have to be much more forgiving in the MOBY world. The fact that THIS
> service provider does not understand the ID number, does *not* mean that
> the ID number is invalid (as you are asserting by the fact that you
> throw an error). In MOBY we would then simply pass it off to another
> service provider who claimed to know something about it, and so on,
> until in the end we just claim "nobody knew anything about it... but
> that STILL doesn't mean it is invalid!".
>
> We certainly would never "break" due to service providers ignorance, or
> we would be broken all the time :-)
>

I don't see why trapping it with an eval doesn't work anyway.
get_Seq_by_acc seems to have the implicit behavior that you know you are
asking for something valid. What should it return when the different
error states arise? You're going to pass on to the next service
provider whether the error is because the acc is unknown or because the
provider is down. I would assume that the MOBY layer is providing this
type of insulation to clients using the code?

-jason

> Mark
>
>
> On Sat, 2003-11-08 at 08:08, Heikki Lehvaslaiho wrote:
> > Mark,
> >
> > Jason's cvs commit note:
> >
> > "properly throw an error when no sequences are retrieved for a query --
> > cannot distinguish between errors and non-connections though at this
> > point, but this makes t/Perl.t now properly run when network is
> > disconnected [during an ice storm]"
> >
> > -Heikki
> >
> > On Sat, 2003-11-08 at 12:46, Mark Wilkinson wrote:
> > > On Sat, 2003-11-08 at 03:10, Heikki Lehvaslaiho wrote:
> > >
> > > > The change is in the bioperl code. It was put in almost 11 months ago
> > > > for the 1.2 release. As the error message indicates, line 177 in
> > > > Bio::DB::WebDBSeqI now throws an error.
> > >
> > > Although I am now "getting used to it", I have to say that I found the
> > > earlier behaviour much more sensible and easier to deal with... why was
> > > this change made in this way?
> > >
> > > Mark
> > >
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l@portal.open-bio.org
> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From jason at cgt.duhs.duke.edu Sat Nov 8 19:15:21 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Sat Nov 8 19:11:42 2003
Subject: [Bioperl-l] suggestion for drawing pedigrees
In-Reply-To: <3FABEFC7.6060404@mail.nih.gov>
References: <3FABEFC7.6060404@mail.nih.gov>
Message-ID:

Assuming this code works (I've not really touched it in quite a while,
so other changes might have broken things):

my $draw = new Bio::Pedigree::Draw();
$draw->draw(-pedigree   => $pedigree,
            -rendertype => 'pedplot',
            -file       => ">$outfile",
            -format     => 'png');

Then the following code will put the image data into a scalar instead of
a new file.

my $fh = new IO::String;
$draw->draw(-pedigree   => $pedigree,
            -rendertype => 'pedplot',
            -fh         => $fh,
            -format     => 'png');

The image is now in $fh->string_ref.

Although if you are doing this for CGI you will probably just be writing
to STDOUT, so -fh => \*STDOUT would work.

As for the imagemapping stuff, that is well beyond the current
capabilities, mostly because I don't know anything about how to generate
imagemaps with GD. But since this is open-source code you are
encouraged to work on that and add it back to the project.

And as for the general behavior of those modules, it probably needs some
time and energy. I designed and wrote that code when I was a perl OO
neophyte and tried to clean it up later, so it has some cruft that can
be refactored. My idea would be to see the graphics layer further
abstracted so that the postscript or GD level could also be replaced
with Todd's new SVG modules.

Work on this is pretty low on my priority list, however, so if you
really want these types of improvements it would be good to try to
contribute some work yourself.
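If IO::String is not installed, core Perl (5.8 and later) can also open a
filehandle directly onto a scalar. A minimal sketch, assuming the drawing
code only needs something it can print to; render_to() is a made-up
stand-in for the draw call, and whether Bio::Pedigree::Draw accepts such
a handle via -fh has not been checked here:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a renderer that prints image bytes to
# whatever filehandle it is given (NOT the Bio::Pedigree::Draw API).
sub render_to {
    my ($fh) = @_;
    binmode $fh;                          # image data is binary
    print {$fh} "\x89PNG fake image bytes";
    return;
}

# Open a write handle backed by the scalar $image instead of a file.
my $image = '';
open my $fh, '>', \$image or die "cannot open in-memory handle: $!";
render_to($fh);
close $fh;

# $image now holds everything that was printed to the handle.
print length($image), " bytes captured in memory\n";
```

From there the scalar can be printed to STDOUT in a CGI script after the
appropriate Content-type header, or passed on to other code.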
-jason On Fri, 7 Nov 2003, Lowell Umayam wrote: > It would be nice if there was a function that returned an array or hash > of an imagemapping of a all the individuals in the pedigree. This can > allow people to display these pedigrees in a cgi script and have the > image link to other pages. > > Lowell > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From damien at rael.org Sat Nov 8 19:47:33 2003 From: damien at rael.org (Damien Marsic) Date: Sat Nov 8 19:41:47 2003 Subject: [Bioperl-l] How to check the validity of an accession number ? References: <01cc01c3a5b5$93b5ba90$a75aec18@damjancek><1068282652.2452.10.camel@localhost> <00a401c3a655$27371c60$41d8e592@XTAL5> Message-ID: <000e01c3a65b$11760d90$41d8e592@XTAL5> Disregard my previous message. I found out how to correctly write the code and now it's working fine. Damien ----- Original Message ----- From: "Damien Marsic" To: Cc: "Bioperl" Sent: Saturday, November 08, 2003 6:05 PM Subject: Re: [Bioperl-l] How to check the validity of an accession number ? > Thank you for your help. However, when I try it says "Syntax error". Sorry I > am not very good at perl and I have never seen "$@" before so I don't know > how to use it. But it really seems that there is a spelling error in your > example. PLease help. > > Damien > > ----- Original Message ----- > From: "Heikki Lehvaslaiho" > To: "Damien Marsic" > Cc: "Bioperl" > Sent: Saturday, November 08, 2003 3:10 AM > Subject: Re: [Bioperl-l] How to check the validity of an accession number ? > > > > Damien, > > > > The change is in the bioperl code. It was put in almost 11 months ago > > for the 1.2 release. As the error message indicates, line 177 in > > Bio::DB::WebDBSeqI now throws an error. 
> > > > > > But now, when $an is an invalid accession number, the program just > crashes > > > with the following message: > > > > > > ------------- EXCEPTION ------------- > > > MSG: acc does not exist > > > STACK Bio::DB::WebDBSeqI::get_Seq_by_acc > > > /usr/lib/perl5/site_perl/5.6.1/Bio/DB/WebDBSeqI.pm:177 > > > STACK Bio::DB::GenBank::get_Seq_by_acc > > > /usr/lib/perl5/site_perl/5.6.1/Bio/DB/GenBank.pm:216 > > > STACK toplevel phylplus.pl:90 > > > -------------------------------------- > > > > > > Can someone helps me understand what is happening ? Were there some > changes > > > at Genbank that could explain why my program behaves differently ? Is > there > > > anything I can do to make it behave like it did before ? > > > > > > Is there any other way than "get_Seq_by_acc" to check if an accession > number > > > exists or not ? > > > > > > Thanks in advance for the replies. > > > > > > Damien > > > > > > > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l@portal.open-bio.org > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > From heikki at nildram.co.uk Sun Nov 9 10:24:43 2003 From: heikki at nildram.co.uk (Heikki Lehvaslaiho) Date: Sun Nov 9 10:21:17 2003 Subject: [Bioperl-l] How to check the validity of an accession number ? In-Reply-To: <000e01c3a65b$11760d90$41d8e592@XTAL5> References: <01cc01c3a5b5$93b5ba90$a75aec18@damjancek> <1068282652.2452.10.camel@localhost> <00a401c3a655$27371c60$41d8e592@XTAL5> <000e01c3a65b$11760d90$41d8e592@XTAL5> Message-ID: <1068391482.2459.3.camel@localhost> On Sun, 2003-11-09 at 00:47, Damien Marsic wrote: > Disregard my previous message. I found out how to correctly write the code > and now it's working fine. Yes, looking at the code, I seem to have left out semicolon (;) after the eval block. 
-Heikki > Damien > > ----- Original Message ----- > From: "Damien Marsic" > To: > Cc: "Bioperl" > Sent: Saturday, November 08, 2003 6:05 PM > Subject: Re: [Bioperl-l] How to check the validity of an accession number ? > > > > Thank you for your help. However, when I try it says "Syntax error". Sorry > I > > am not very good at perl and I have never seen "$@" before so I don't know > > how to use it. But it really seems that there is a spelling error in your > > example. PLease help. > > > > Damien > > > > ----- Original Message ----- > > From: "Heikki Lehvaslaiho" > > To: "Damien Marsic" > > Cc: "Bioperl" > > Sent: Saturday, November 08, 2003 3:10 AM > > Subject: Re: [Bioperl-l] How to check the validity of an accession number > ? > > > > > > > Damien, > > > > > > The change is in the bioperl code. It was put in almost 11 months ago > > > for the 1.2 release. As the error message indicates, line 177 in > > > Bio::DB::WebDBSeqI now throws an error. You can catch it with eval > > > statement, e.g: > > > > > > eval { > > > $seq = $stream -> get_Seq_by_acc($an); > > > } > > > if ($@) { > > > print STDERR "Not a valid accession [$an]\n"; > > > next; > > > } > > > > > > > > > The Web interface to the bioperl CVS repository can help you track > > > whys, whens and hows of code changes: > > > > > > > > > http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/Bio/DB/WebDBSeqI.pm?cvsroot=bioperl > > > > > > Yours, > > > > > > -Heikki > > > > > > > > > On Sat, 2003-11-08 at 05:02, Damien Marsic wrote: > > > > Hello, > > > > > > > > I hope someone will be able to help. I wrote a program that does the > > > > following: > > > > > > > > - opens a file containing sequences named with their Genbank accession > > > > number if it exists (and an arbitrary name if the sequence is not in > > > > Genbank) > > > > > > > > - reads each sequence name, looks in Genbank if it is an accession > > number, > > > > and if it is, retrieves the organism name and other information. 
> > > > > > > > Damien > > > > > > > > > > > > _______________________________________________ > > > > Bioperl-l mailing list > > > > Bioperl-l@portal.open-bio.org > > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l From sanjib at uchicago.edu Sat Nov 8 20:37:44 2003 From: sanjib at uchicago.edu (Sanjib Dutta) Date: Sun Nov 9 12:29:30 2003 Subject: [Bioperl-l] (no subject) Message-ID: <001001c3a662$12fe7e20$f223fea9@Sanjib> Hi, I am trying to do some psiblast search locally using the nr database which I have downloaded and compiled. But I am noticing that the search converges much earlier after two or three rounds compared to when I do the same search over the web remotely. I checked all the parameters and they seemed to be the same for the two searches. Also with one set of e and h values I see that the search gives a segmentation fault after one round. Could you please help me regarding this? Thanks Sanjib From alok at caltech.edu Sun Nov 9 17:00:34 2003 From: alok at caltech.edu (Alok Saldanha) Date: Sun Nov 9 16:57:06 2003 Subject: [Bioperl-l] bioperl meme/mast parser? Message-ID: <248A1311-1300-11D8-BB30-000A95A524DA@caltech.edu> Hello, is there any interest/effort going into a meme/mast parser? -Alok From skirov at utk.edu Sun Nov 9 17:56:21 2003 From: skirov at utk.edu (Stefan A Kirov) Date: Sun Nov 9 17:52:53 2003 Subject: [Bioperl-l] bioperl meme/mast parser? In-Reply-To: <248A1311-1300-11D8-BB30-000A95A524DA@caltech.edu> References: <248A1311-1300-11D8-BB30-000A95A524DA@caltech.edu> Message-ID: Look in bioperl-live, Bio::Matrix::PSM::IO (meme, mast and transfac parsers). 
mast parser parses only section I and II, maybe I will add more stuff to it in the next few months. Any suggestions are appreciated. Stefan Kirov On Sun, 9 Nov 2003, Alok Saldanha wrote: >Hello, > >is there any interest/effort going into a meme/mast parser? > > -Alok > > >_______________________________________________ >Bioperl-l mailing list >Bioperl-l@portal.open-bio.org >http://portal.open-bio.org/mailman/listinfo/bioperl-l > From wes.barris at csiro.au Sun Nov 9 19:03:31 2003 From: wes.barris at csiro.au (Wes Barris) Date: Sun Nov 9 19:00:18 2003 Subject: [Bioperl-l] Score computation in psl.pm Message-ID: <3FAED5D3.5030903@csiro.au> Hi, I found this line in psl.pm: my $score = sprintf "%.2f", ( 100 * ( $matches + $mismatches + $rep_matches ) / $q_length ); However, from the BLAT program documentation: -minScore=N sets minimum score. This is twice the matches minus the mismatches minus some sort of gap penalty. Default is 30 It seems to me that the score computed in psl.pm should be based on something more like this: 100 * (2* $matches - $mismatches + $rep_matches) / $q_length -- Wes Barris E-Mail: Wes.Barris@csiro.au From hlapp at gnf.org Sun Nov 9 19:12:08 2003 From: hlapp at gnf.org (Hilmar Lapp) Date: Sun Nov 9 19:08:55 2003 Subject: [Bioperl-l] Re: SeqFeatureI::display_name In-Reply-To: Message-ID: <86084C29-1312-11D8-A309-000A959EB4C4@gnf.org> On Saturday, November 8, 2003, at 04:18 PM, Marc Logghe wrote: > The returned persistent feature objects seem not to know about their > parent; I mean, the display_name is undef. What should I change to the > query in order to fill that slot ? > Display_name is not populated from some property of the parent bioentry; it is a property of the seqfeature (see also Bio::SeqFeatureI::display_name). Bioperl itself doesn't use the display_name property when you create those features from databank files via the SeqIO path. Only Bio::Graphics/Bio::DB::GFF uses it I think. 
Maybe the GFF parser will populate it too, possibly only when reading GFF3? Does anybody know offhand?

So, for all features that were loaded into the database using a SeqIO parser, the display_name property will be undefined.

Now, the other thing you're noticing is that the seqfeature adaptor will not automatically load the corresponding sequence and attach it to the feature. This is so that sequence object serialization does not enter a circular loop, because when storing sequences their features will be serialized too. Your way out is either to retrieve sequences from the query, not features, and then filter out those features you didn't want, or to change the code so that the sequence can optionally be retrieved for each feature.

	-hilmar
--
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------

From fangl at genomics.org.cn Mon Nov 10 03:11:32 2003
From: fangl at genomics.org.cn (Magic Fang)
Date: Mon Nov 10 03:08:20 2003
Subject: [Bioperl-l] about the FgeneSH parser
Message-ID: <3FAF4834.8060307@genomics.org.cn>

I have developed an FGENESH parser over the last few days. It is based on the genscan package, and the current version is 0.1. I think FGENESH is a powerful gene-finding program. Can I add it to the bioperl code tree?

From michael.watson at bbsrc.ac.uk Mon Nov 10 04:26:45 2003
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Mon Nov 10 04:27:13 2003
Subject: [Bioperl-l] How to check the validity of an accession number ?
Message-ID: <20B7EB075F2D4542AFFAF813E98ACD93028223EB@cl-exsrv1.irad.bbsrc.ac.uk>

There should be a ";" after the eval statement:

eval {
    $seq = $stream -> get_Seq_by_acc($an);
};
if ($@) {
    print STDERR "Not a valid accession [$an]\n";
    next;
}

The $@ variable is where perl puts any error message that comes from the eval statement. If there were no errors, $@ is set to the empty string (it is false, but not undefined).
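
For anyone who wants to try the eval pattern without touching GenBank, here is a minimal sketch. Note the assumptions: fetch_by_acc, fetch_or_undef and the %known table are hypothetical stand-ins invented for this example (they are not BioPerl functions); fetch_by_acc simply dies on an unknown name, much as get_Seq_by_acc now throws.

```perl
use strict;
use warnings;

# Invented data for illustration only; not real GenBank content.
my %known = ( AB000001 => 'ACGT' );

# Hypothetical stand-in for $stream->get_Seq_by_acc($an):
# it dies when the accession is unknown.
sub fetch_by_acc {
    my ($acc) = @_;
    die "acc $acc does not exist\n" unless exists $known{$acc};
    return $known{$acc};
}

# Returns the sequence, or undef if the lookup threw an exception.
sub fetch_or_undef {
    my ($acc) = @_;
    # eval BLOCK traps the die; note the ";" that ends the statement.
    my $seq = eval { fetch_by_acc($acc) };
    if ($@) {
        # $@ holds the error message; it is the empty string on success.
        print STDERR "Not a valid accession [$acc]\n";
        return;
    }
    return $seq;
}

for my $an (qw(AB000001 my_clone_7)) {
    my $seq = fetch_or_undef($an);
    next unless defined $seq;    # skip names that are not accessions
    print "$an: $seq\n";
}
```

Wrapping the call like this gives back the old behaviour (an undefined $seq for a bad accession), so a script that tests defined($seq) should be able to carry on unchanged.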
-----Original Message----- From: Damien Marsic [mailto:damien@rael.org] Sent: 09 November 2003 00:05 To: heikki@ebi.ac.uk Cc: Bioperl Subject: Re: [Bioperl-l] How to check the validity of an accession number ? Thank you for your help. However, when I try it says "Syntax error". Sorry I am not very good at perl and I have never seen "$@" before so I don't know how to use it. But it really seems that there is a spelling error in your example. PLease help. Damien ----- Original Message ----- From: "Heikki Lehvaslaiho" To: "Damien Marsic" Cc: "Bioperl" Sent: Saturday, November 08, 2003 3:10 AM Subject: Re: [Bioperl-l] How to check the validity of an accession number ? > Damien, > > The change is in the bioperl code. It was put in almost 11 months ago > for the 1.2 release. As the error message indicates, line 177 in > Bio::DB::WebDBSeqI now throws an error. You can catch it with eval > statement, e.g: > > eval { > $seq = $stream -> get_Seq_by_acc($an); > } > if ($@) { > print STDERR "Not a valid accession [$an]\n"; > next; > } > > > The Web interface to the bioperl CVS repository can help you track > whys, whens and hows of code changes: > > http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/Bio/DB/WebDBSeqI.pm?cvsroot=bioperl > > Yours, > > -Heikki > > > On Sat, 2003-11-08 at 05:02, Damien Marsic wrote: > > Hello, > > > > I hope someone will be able to help. I wrote a program that does the > > following: > > > > - opens a file containing sequences named with their Genbank accession > > number if it exists (and an arbitrary name if the sequence is not in > > Genbank) > > > > - reads each sequence name, looks in Genbank if it is an accession number, > > and if it is, retrieves the organism name and other information. If the name > > is not recognized as an accession number, the program says it and proposes > > to continue with the next sequences. > > > > My programs worked perfectly during the last 2 years or so. 
But for the last > > few months it does not work anymore, although I did not make any change to > > it. What is happening now is that when there is a sequence name that is not > > an accession number, the program crashes. > > > > The problem lies with this line: > > > > $seq = $stream -> get_Seq_by_acc($an); > > > > When $an was an invalid accession number, $seq used to be "undefined" and > > the program could go on (the program checks whether &seq in defined or > > undefined and then goes to the appropriate step). > > > > But now, when $an is an invalid accession number, the program just crashes > > with the following message: > > > > ------------- EXCEPTION ------------- > > MSG: acc does not exist > > STACK Bio::DB::WebDBSeqI::get_Seq_by_acc > > /usr/lib/perl5/site_perl/5.6.1/Bio/DB/WebDBSeqI.pm:177 > > STACK Bio::DB::GenBank::get_Seq_by_acc > > /usr/lib/perl5/site_perl/5.6.1/Bio/DB/GenBank.pm:216 > > STACK toplevel phylplus.pl:90 > > -------------------------------------- > > > > Can someone helps me understand what is happening ? Were there some changes > > at Genbank that could explain why my program behaves differently ? Is there > > anything I can do to make it behave like it did before ? > > > > Is there any other way than "get_Seq_by_acc" to check if an accession number > > exists or not ? > > > > Thanks in advance for the replies. 
> > > > Damien > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > _______________________________________________ Bioperl-l mailing list Bioperl-l@portal.open-bio.org http://portal.open-bio.org/mailman/listinfo/bioperl-l From heikki at ebi.ac.uk Mon Nov 10 04:47:04 2003 From: heikki at ebi.ac.uk (Heikki Lehvaslaiho) Date: Mon Nov 10 04:43:44 2003 Subject: [Bioperl-l] about the FgeneSH parser In-Reply-To: <3FAF4834.8060307@genomics.org.cn> References: <3FAF4834.8060307@genomics.org.cn> Message-ID: <1068457624.9907.6.camel@localhost> Hi, Sure, but... How does it compare with Bio::EnsEMBL::Pipeline::Runnable module? http://www.ensembl.org/Docs/Pdoc/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/Runnable/Fgenesh.html or this: http://www2.toddot.net:8081/research/scripts/parsers/fgenesh/ -Heikki On Mon, 2003-11-10 at 08:11, Magic Fang wrote: > i have developed a fgenesh parser these days, it is base on genscan > package, current version is 0.1. i think fgenesh is the powerful gene > finding program. if i can add it to the bioperl code tree? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. 
CB10 1SD, United Kingdom _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468 ___ _/_/_/_/_/________________________________________________________ From fangl at genomics.org.cn Mon Nov 10 04:58:38 2003 From: fangl at genomics.org.cn (Magic Fang) Date: Mon Nov 10 04:55:25 2003 Subject: [Bioperl-l] about the FgeneSH parser In-Reply-To: <1068457624.9907.6.camel@localhost> References: <3FAF4834.8060307@genomics.org.cn> <1068457624.9907.6.camel@localhost> Message-ID: <3FAF614E.4030904@genomics.org.cn> Heikki Lehvaslaiho wrote: >Hi, > >Sure, but... > >How does it compare with Bio::EnsEMBL::Pipeline::Runnable module? > >http://www.ensembl.org/Docs/Pdoc/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/Runnable/Fgenesh.html > >or this: > >http://www2.toddot.net:8081/research/scripts/parsers/fgenesh/ > > -Heikki > >On Mon, 2003-11-10 at 08:11, Magic Fang wrote: > > >>i have developed a fgenesh parser these days, it is base on genscan >>package, current version is 0.1. i think fgenesh is the powerful gene >>finding program. if i can add it to the bioperl code tree? >> >>_______________________________________________ >>Bioperl-l mailing list >>Bioperl-l@portal.open-bio.org >>http://portal.open-bio.org/mailman/listinfo/bioperl-l >> >> i use it like genscan module From jmanning at broad.mit.edu Mon Nov 10 10:23:03 2003 From: jmanning at broad.mit.edu (Jonathan Manning) Date: Mon Nov 10 10:19:35 2003 Subject: [Bioperl-l] Bio::AlignIO::bl2seq doesn't know when to stop... In-Reply-To: <3FA7EA91.9020107@broad.mit.edu> References: <3FA7EA91.9020107@broad.mit.edu> Message-ID: <3FAFAD57.10806@broad.mit.edu> I hate replying to my own post, but is anyone interested in reviewing/applying this patch? It won't successfully parse bl2seq alignments without it. If someone will confirm this is a problem, and that this looks like a correct solution, then I'll go through the trouble of testing it against blast tools 2.2.1-2.2.6 and even writing a test for it. 
Even a "works for me (without the patch) with blast tools version 2.x.x" reply would be helpful. ~J Jonathan Manning wrote: > Hi, > If I try to parse a bl2seq alignment using align (using "while(my $aln = > $str->next_aln())" ), when it runs out of alignments, I get: > > Can't call method "querySeq" on an undefined value at > ~/perllib/Bio/AlignIO/bl2seq.pm line 134, line 6002. > > If I look at the file I'm trying to parse, line 6002 is the end of the > alignments. There is a "Lambda" line, and some summary information > following it. I would expect next_aln to return false here. > > The following patch to CVS head fixes this - and properly returns false > when there is no next alignment. This fix can also be applied to 1.2.3. > All tests in AlignIO.t pass. > > I'm using blast tools 2.2.1, btw. It may only be a problem with this > version of bl2seq. However, the fix below is a good safety check > regardless of what blast version is used - but someone needs to test it > against the latest version, just in case. 
>
> ~Jonathan
>
>
> Index: bl2seq.pm
> ===================================================================
> RCS file: /home/repository/bioperl/bioperl-live/Bio/AlignIO/bl2seq.pm,v
> retrieving revision 1.15
> diff -c -r1.15 bl2seq.pm
> *** bl2seq.pm 2003/10/28 13:52:03 1.15
> --- bl2seq.pm 2003/11/04 15:54:16
> ***************
> *** 131,136 ****
> --- 131,137 ----
>         -report_type => $self->report_type);
>       my $bl2seqobj = $self->{'bl2seqobj'};
>       my $hsp = $bl2seqobj->next_feature;
> +     unless($hsp) { return 0 };
>       $seqchar = $hsp->querySeq;
>       $start = $hsp->query->start;
>       $end = $hsp->query->end;
>
>

From d.gatherer at vir.gla.ac.uk Mon Nov 10 11:25:36 2003
From: d.gatherer at vir.gla.ac.uk (Derek Gatherer)
Date: Mon Nov 10 11:20:51 2003
Subject: [Bioperl-l] SeqWords.pm
In-Reply-To: <200311090013.hA90CAcm027838@portal.open-bio.org>
Message-ID: <5.2.1.1.1.20031110161920.00b08650@udcf.gla.ac.uk>

Hi

Bio::Tools::Seqwords.pm should have two methods, one for counting overlapping words and one for counting non-overlapping words. The present count_words could/should be renamed count_nonoverlap_words, and an additional count_overlap_words could be created, same code as count_words, but...

after line 214, you would need

$seqlen = $seqobj->length();  # measure length
for ($frame = 1; $frame<=$word_length; $frame++)# run through frames
{
    my $seqstring = uc($seqobj->subseq($frame,$seqlen));# take the relevant substring

    while($seqstring =~ /((\w){$word_length})/gim)
    {
        $codon{uc($1)}++;  # keep adding to hash
    }
}
return \%codon;

Should I send this to bugzilla? (it isn't really a bug, just a gap in the functionality of the object.)

thanks
Derek

From harris at cshl.org Mon Nov 10 11:26:05 2003
From: harris at cshl.org (Todd Harris)
Date: Mon Nov 10 11:22:44 2003
Subject: [Bioperl-l] about the FgeneSH parser
In-Reply-To: <1068457624.9907.6.camel@localhost>
Message-ID:

> Hi,
>
> Sure, but...
>
> How does it compare with Bio::EnsEMBL::Pipeline::Runnable module?
> > http://www.ensembl.org/Docs/Pdoc/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline > /Runnable/Fgenesh.html > > or this: > > http://www2.toddot.net:8081/research/scripts/parsers/fgenesh/ Ha! That's my total FGENESH parser hack! The power of google never ceases to amaze. todd > > -Heikki > > On Mon, 2003-11-10 at 08:11, Magic Fang wrote: >> i have developed a fgenesh parser these days, it is base on genscan >> package, current version is 0.1. i think fgenesh is the powerful gene >> finding program. if i can add it to the bioperl code tree? >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l@portal.open-bio.org >> http://portal.open-bio.org/mailman/listinfo/bioperl-l From d.gatherer at vir.gla.ac.uk Mon Nov 10 12:01:01 2003 From: d.gatherer at vir.gla.ac.uk (Derek Gatherer) Date: Mon Nov 10 11:56:16 2003 Subject: [Bioperl-l] Bio::Tools::Run::Phylo::PAML::Codeml In-Reply-To: <200311090013.hA90CAcm027838@portal.open-bio.org> Message-ID: <5.2.1.1.1.20031110164743.03514008@udcf.gla.ac.uk> Hi try this: _________________________ #!/usr/bin/perl -w use lib "/usr/local/lib/site_perl/5.8.0/"; use strict; use Bio::Tools::Run::Phylo::PAML::Codeml; ______________________________________ I get the following: BEGIN failed--compilation aborted at /usr/local/lib/site_perl/5.8.0//Bio/Tools/Phylo/PAML.pm line 168. The offending line is: use IO::String; find /usr/local/lib/site_perl/5.8.0/ -name String.pm gives nothing. Am I missing a module? Commenting this line out allows the script above to run, but I am unsure of its effect in other places. Cheers Derek From heikki at ebi.ac.uk Mon Nov 10 12:04:33 2003 From: heikki at ebi.ac.uk (Heikki Lehvaslaiho) Date: Mon Nov 10 12:01:19 2003 Subject: [Bioperl-l] SeqWords.pm In-Reply-To: <5.2.1.1.1.20031110161920.00b08650@udcf.gla.ac.uk> References: <5.2.1.1.1.20031110161920.00b08650@udcf.gla.ac.uk> Message-ID: <1068483873.2453.10.camel@localhost> Derek, That looks like a good extension. 
Please submit it to bugzilla. Do you think you could write a few tests into t/Tools.t. That would be grand. For backward compatibility I think we will keep count_words() (there could be an alias count_nonoverlaping_words() but I do not think it is needed. Too verbose and long.) and add count_overlapping_words(). Internally they should call the same method with different parameters. I can do this if you make sure that the code works (and write the tests to prove it). Cheers, -Heikki On Mon, 2003-11-10 at 16:25, Derek Gatherer wrote: > Hi > > Bio::Tools::Seqwords.pm should have two methods, one for counting > overlapping words and one for counting non-overlapping words. The present > count_words could/should be renamed count_nonoverlap_words, and an > additional count_overlap_words could be created, same code as count_words, > but... > > after line 214, you would need > > $seqlen = $seqobj->length(); # measure length > for ($frame = 1; $frame<=$word_length; $frame++)# run through frames > { > my $seqstring = uc($seqobj->subseq($frame,$seqlen));# take the relevant > substring > > while($seqstring =~ /((\w){$word_length})/gim) > { > $codon{uc($1)}++; # keep adding to hash > } > } > return \%codon; > > Should I send this to bugzilla? (it isn't really a bug, just a gap in the > functionality of the object.) > > thanks > Derek > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. 
CB10 1SD, United Kingdom _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468 ___ _/_/_/_/_/________________________________________________________ From heikki at ebi.ac.uk Mon Nov 10 12:45:21 2003 From: heikki at ebi.ac.uk (Heikki Lehvaslaiho) Date: Mon Nov 10 12:42:04 2003 Subject: [Bioperl-l] Bio::Tools::Run::Phylo::PAML::Codeml In-Reply-To: <5.2.1.1.1.20031110164743.03514008@udcf.gla.ac.uk> References: <5.2.1.1.1.20031110164743.03514008@udcf.gla.ac.uk> Message-ID: <1068486320.2453.26.camel@localhost> Yep. IO::String is one of the few CPAN modules that are really needed to run a large number of bioperl modules (most others are needed only by one or two modules). You probably have never installed bioperl or ran its tests using the provided makefile. The makefile would have detected and listed missing dependencies in your system. Do yourself a favour and install IO::String. -Heikki On Mon, 2003-11-10 at 17:01, Derek Gatherer wrote: > Hi > > try this: > _________________________ > #!/usr/bin/perl -w > > use lib "/usr/local/lib/site_perl/5.8.0/"; > use strict; > > use Bio::Tools::Run::Phylo::PAML::Codeml; > ______________________________________ > > I get the following: > > BEGIN failed--compilation aborted at > /usr/local/lib/site_perl/5.8.0//Bio/Tools/Phylo/PAML.pm line 168. > > The offending line is: > > use IO::String; > > find /usr/local/lib/site_perl/5.8.0/ -name String.pm > > gives nothing. Am I missing a module? Commenting this line out allows the > script above to run, but I am unsure of its effect in other places. 
> > Cheers > Derek > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. CB10 1SD, United Kingdom _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468 ___ _/_/_/_/_/________________________________________________________ From jason at cgt.duhs.duke.edu Mon Nov 10 12:49:15 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Mon Nov 10 12:45:51 2003 Subject: [Bioperl-l] Bio::Tools::Run::Phylo::PAML::Codeml In-Reply-To: <5.2.1.1.1.20031110164743.03514008@udcf.gla.ac.uk> References: <5.2.1.1.1.20031110164743.03514008@udcf.gla.ac.uk> Message-ID: On Mon, 10 Nov 2003, Derek Gatherer wrote: > Hi > > try this: > _________________________ > #!/usr/bin/perl -w > > use lib "/usr/local/lib/site_perl/5.8.0/"; > use strict; > > use Bio::Tools::Run::Phylo::PAML::Codeml; > ______________________________________ > > I get the following: > > BEGIN failed--compilation aborted at > /usr/local/lib/site_perl/5.8.0//Bio/Tools/Phylo/PAML.pm line 168. > > The offending line is: > > use IO::String; > > find /usr/local/lib/site_perl/5.8.0/ -name String.pm > > gives nothing. Am I missing a module? Commenting this line out allows the How about installing the module IO::String? You need this to parse the trees out of Codeml output I think. > script above to run, but I am unsure of its effect in other places. 
> > Cheers > Derek > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From cain at cshl.org Mon Nov 10 12:52:26 2003 From: cain at cshl.org (Scott Cain) Date: Mon Nov 10 12:48:59 2003 Subject: [Bioperl-l] Re: [Gmod-schema] load_gff3 handles ##sequence-region In-Reply-To: References: Message-ID: <1068486746.1446.49.camel@localhost.localdomain> Allen, Thanks for doing this, but it seems I hadn't thought this all the way through--specifying a name, a start and an end is not sufficent, therefore, the '##sequence-region' line is not going to work in it's current form. The problem lies in assigning a SO type to the region. Is it a chromosome, contig, assembly, band, arm, etc? So I think we will have to go back to requiring that the reference sequence be a full GFF line that occurs in the file before any features referring to it. Thanks, Scott On Fri, 2003-11-07 at 20:51, Allen Day wrote: > >From chado's load_gff3.pl POD: > > " > Also, in order for the load to be successful, the reference sequences (eg, > chromosomes or contigs) must be defined in the GFF file before any > features on them are listed. This can be done either by the > reference-sequence meta data specification, which would be lines that look > like this: > > ##sequence-region chr1 1 246127941 ----except that this isn't supported > yet--can I get Bio::Tools::GFF to give me this info? > " > > this is now fixed on bioperl-live HEAD Bio::Tools::GFF. I've added > parsing capability for the ##sequence region header tag, and stubbed out > handling of other header tags from the GFF3 spec such as "##attribute > ontology". > > segments are created as Bio::LocatableSeq objects and available via > Bio::Tools::GFF::next_segment(). 
> > -allen > > > > ------------------------------------------------------- > This SF.Net email sponsored by: ApacheCon 2003, > 16-19 November in Las Vegas. Learn firsthand the latest > developments in Apache, PHP, Perl, XML, Java, MySQL, > WebDAV, and more! http://www.apachecon.com/ > _______________________________________________ > Gmod-schema mailing list > Gmod-schema@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/gmod-schema -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain@cshl.org GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory From heikki at ebi.ac.uk Tue Nov 11 06:54:39 2003 From: heikki at ebi.ac.uk (Heikki Lehvaslaiho) Date: Tue Nov 11 06:51:19 2003 Subject: [Bioperl-l] Re: [Bioperl-guts-l] bioperl-live SeqFeature test hanging In-Reply-To: References: Message-ID: <1068551678.5936.7.camel@localhost> Charles, I just noticed the same thing. Hitting the return key released it. I have not had time to investigate more closely. I glad to hear that at least that one is not caused by upgrade to perl 5.8.2 this morning. I am seeing all kinds of warnings which I'll try to work through today. -Heikki P.S. Lets keep discussions in the main bioperl list. Guts is only for generated reports. On Mon, 2003-11-10 at 22:12, Charles Hauser wrote: > Hi, > > I updated bioperl to the current cvs version this am (cvs udate -d). > I ran perl Makefile.PL & make. > Now, when I run make test, it hangs at SeqFeature.............5/74? > > Never had this problem previously????? 
> Ideas > > Charles > > _______________________________________________ > Bioperl-guts-l mailing list > Bioperl-guts-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-guts-l -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. CB10 1SD, United Kingdom _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468 ___ _/_/_/_/_/________________________________________________________ From heikki at ebi.ac.uk Tue Nov 11 10:45:06 2003 From: heikki at ebi.ac.uk (Heikki Lehvaslaiho) Date: Tue Nov 11 10:41:39 2003 Subject: [Bioperl-l] SiteMatrix Message-ID: <1068565505.5940.21.camel@localhost> Stefan, t/psm.t was printing out warnings originating from Bio::Matrix::PSM::SiteMatrix (under perl 5.8.2). I commented out some lines in the constructor that were trying to split non-existing hash values. All tests pass. Could you check that I did not do anything stupid. Cheers, -Heikki -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. 
CB10 1SD, United Kingdom
      _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
     ___ _/_/_/_/_/________________________________________________________

From skirov at utk.edu Tue Nov 11 11:26:23 2003
From: skirov at utk.edu (Stefan Kirov)
Date: Tue Nov 11 11:22:58 2003
Subject: [Bioperl-l] SiteMatrix
Message-ID: <3FB10DAF.1070502@utk.edu>

Yup:-)), actually you can supply two types of input data - arrays or strings:
-pA=>\@pA, where @pA=(0.7,0.3,0.5,0.1,0.9)
or
-pA=>$pA, where $pA='73519'
So the lines you commented out are necessary to parse the string input. However, I guess it is my mistake somewhere if the input type is not recognized correctly and SiteMatrix is trying to parse arrays as strings. Could you send me the warnings you are getting? I am running 5.8.0 and did not get any.
Thanks!
Stefan

Heikki Lehvaslaiho wrote:

> Stefan,
>
> t/psm.t was printing out warnings originating from
> Bio::Matrix::PSM::SiteMatrix (under perl 5.8.2). I commented out some
> lines in the constructor that were trying to split non-existing hash
> values. All tests pass. Could you check that I did not do anything
> stupid.
>
> Cheers,
> -Heikki
>
>
>
>

From skirov at utk.edu Tue Nov 11 12:02:39 2003
From: skirov at utk.edu (Stefan Kirov)
Date: Tue Nov 11 11:59:13 2003
Subject: [Bioperl-l] Re: SiteMatrix
In-Reply-To: <1068568832.5941.30.camel@localhost>
References: <1068565505.5940.21.camel@localhost> <3FB10BEE.3000004@utk.edu> <1068568832.5941.30.camel@localhost>
Message-ID: <3FB1162F.4060707@utk.edu>

Got it! It is actually a design issue in SiteMatrix: the constructor should call _initialize instead of doing everything by itself. Right now, since Psm is a SiteMatrix, SiteMatrix tries to create the 4 vectors (A,G,C,T) even when there is no need for them (when parsing mast, for example). I will apply a quick fix that checks the input more carefully, and later I will split the constructor into two subroutines, which should be a permanent solution.
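
To illustrate the kind of input check meant here, a minimal sketch follows. It is not the actual SiteMatrix code: to_vector is a hypothetical helper name, the %input hash is made up for the example, and how the digits of a string such as '73519' are turned into probabilities is deliberately left out.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Matrix::PSM::SiteMatrix:
# turn one per-base argument (-pA, -pC, ...) into an array reference.
sub to_vector {
    my ($value) = @_;
    return []          unless defined $value;       # nothing supplied: never call split on undef
    return [ @$value ] if ref($value) eq 'ARRAY';   # array input: take a copy as-is
    return [ split //, $value ];                    # string input: one element per character
}

# Made-up input: pA as a string, pC as an array, pG and pT absent
# (the situation that triggered the split warnings).
my %input = ( pA => '73519', pC => [ 0.1, 0.2, 0.3, 0.4, 0.5 ] );

for my $base (qw(pA pC pG pT)) {
    my $vec = to_vector( $input{$base} );
    printf "%s: %d values\n", $base, scalar @$vec;
}
```

Checking defined() and ref() before splitting means a missing argument simply yields an empty vector instead of a warning.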
Thanks for pointing out the problem, Heikki.
Stefan

Heikki Lehvaslaiho wrote:

>These are the error messages:
>
>t/psm........................ok 39/42Use of uninitialized value in split
>at /home/heikki/src/bioperl/core/blib/lib/Bio/Matrix/PSM/SiteMatrix.pm
>line 177, line 172.
>Use of uninitialized value in split at
>/home/heikki/src/bioperl/core/blib/lib/Bio/Matrix/PSM/SiteMatrix.pm line
>178, line 172.
>Use of uninitialized value in split at
>/home/heikki/src/bioperl/core/blib/lib/Bio/Matrix/PSM/SiteMatrix.pm line
>179, line 172.
>Use of uninitialized value in split at
>/home/heikki/src/bioperl/core/blib/lib/Bio/Matrix/PSM/SiteMatrix.pm line
>180, line 172.
>
>I checked the variables, and $input{pA} and the others were empty.
>
>Thanks for looking into this,
>
> -Heikki
>
>
>On Tue, 2003-11-11 at 16:18, Stefan Kirov wrote:
>
>
>>Yup:-)), actually you can supply two types of input data - arrays or
>>strings:
>>-pA=>\@pA, where @pA=(0.7,0.3,0.5,0.1,0.9)
>>or
>>-pA=>$pA, where $pA='73519'
>>So the lines you commented out are necessary to parse the string input,
>>however I guess it is my mistake somewhere if the input type is not
>>recognized correctly and SiteMatrix is trying to parse arrays as
>>strings. Could you send me the warnings you are getting? I am running
>>5.8.0 and did not get any.
>>Thanks!
>>Stefan
>>
>>Heikki Lehvaslaiho wrote:
>>
>>
>>
>>>Stefan,
>>>
>>>t/psm.t was printing out warnings originating from
>>>Bio::Matrix::PSM::SiteMatrix (under perl 5.8.2). I commented out some
>>>lines in the constructor that were trying to split non-existing hash
>>>values. All tests pass. Could you check that I did not do anything
>>>stupid.
>>>
>>>Cheers,
>>> -Heikki
>>>
>>>
>>>
>>>
>>>
>>>

--
Stefan Kirov, Ph.D.
University of Tennessee/Oak Ridge National Laboratory
1060 Commerce Park, Oak Ridge TN 37830-8026 USA
tel +865 576 5120
fax +865 241 1965
e-mail: skirov@utk.edu
sao@ornl.gov

From chauser at duke.edu Tue Nov 11 14:37:43 2003
From: chauser at duke.edu (Charles Hauser)
Date: Tue Nov 11 14:37:44 2003
Subject: [Bioperl-l] which GD.pm for libgd1.so.1.8.4
Message-ID: <1068579665.18699.2.camel@pandorina>

What is the most recent version of GD.pm that is compatible w/ libgd1.so.1.8.4? We've been forced to upgrade our system to rh9, brrrrrrrrr

Charles

From fangl at genomics.org.cn Tue Nov 11 20:34:48 2003
From: fangl at genomics.org.cn (Magic Fang)
Date: Tue Nov 11 20:31:27 2003
Subject: [Bioperl-l] about the GD problem
Message-ID: <3FB18E38.3010607@genomics.org.cn>

In Bio::Graphics::Panel, if I set the width smaller than pad_left + pad_right, perl dumps core on my system (FreeBSD 5.1). Does this bug need to be fixed?

From fangl at genomics.org.cn Tue Nov 11 20:32:32 2003
From: fangl at genomics.org.cn (Magic Fang)
Date: Wed Nov 12 08:12:51 2003
Subject: [Bioperl-l] about the FgeneSH parser
In-Reply-To:
References:
Message-ID: <3FB18DB0.8070707@genomics.org.cn>

Hi,
attached is the fgenesh package derived from the genscan module. I fixed some problems with the GFF output, including adding the seq_id etc. I have marked some parts of the code; please check them.

Magic Fang

Todd Harris wrote:

>>Hi,
>>
>>Sure, but...
>>
>>How does it compare with Bio::EnsEMBL::Pipeline::Runnable module?
>>
>>http://www.ensembl.org/Docs/Pdoc/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline
>>/Runnable/Fgenesh.html
>>
>>or this:
>>
>>http://www2.toddot.net:8081/research/scripts/parsers/fgenesh/
>>
>>
>
>Ha! That's my total FGENESH parser hack! The power of google never ceases
>to amaze.
>
>todd
>
>
>
>>-Heikki
>>
>>On Mon, 2003-11-10 at 08:11, Magic Fang wrote:
>>
>>
>>>i have developed a fgenesh parser these days, it is base on genscan
>>>package, current version is 0.1.
i think fgenesh is the powerful gene
>>>finding program. if i can add it to the bioperl code tree?
>>>
>>>_______________________________________________
>>>Bioperl-l mailing list
>>>Bioperl-l@portal.open-bio.org
>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>
>
>
>
>
-------------- next part --------------
# $Id: FgeneSH.pm,v 1.21 2002/10/08 08:38:32 lapp Exp $
#
# BioPerl module for Bio::Tools::FgeneSH
#
# Cared for by Magic Fang
#
# Copyright Magic Fang
#
# You may distribute this module under the same terms as perl itself

# POD documentation - main docs before the code

=head1 NAME

Bio::Tools::FgeneSH - Results of one FgeneSH run

=head1 SYNOPSIS

   $fgenesh = Bio::Tools::FgeneSH->new(-file => 'result.fgenesh');
   # filehandle:
   $fgenesh = Bio::Tools::FgeneSH->new( -fh => \*INPUT );

   # parse the results
   # note: this class is-a Bio::Tools::AnalysisResult which implements
   # Bio::SeqAnalysisParserI, i.e., $fgenesh->next_feature() is the same
   while($gene = $fgenesh->next_prediction()) {
       # $gene is an instance of Bio::Tools::Prediction::Gene, which inherits
       # off Bio::SeqFeature::Gene::Transcript.
       #
       # $gene->exons() returns an array of
       # Bio::Tools::Prediction::Exon objects
       # all exons:
       @exon_arr = $gene->exons();

       # initial exons only
       @init_exons = $gene->exons('Initial');
       # internal exons only
       @intrl_exons = $gene->exons('Internal');
       # terminal exons only
       @term_exons = $gene->exons('Terminal');
       # singleton exons:
       ($single_exon) = $gene->exons();
   }

   # essential if you gave a filename at initialization (otherwise the file
   # will stay open)
   $fgenesh->close();

=head1 DESCRIPTION

The FgeneSH module provides a parser for FgeneSH gene structure prediction
output. It parses one gene prediction into a
Bio::SeqFeature::Gene::Transcript-derived object.

This module also implements the Bio::SeqAnalysisParserI interface, and thus
can be used wherever such an object fits. See L.

=head1 FEEDBACK

=head2 Mailing Lists

User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation is much appreciated.

  bioperl-l@bioperl.org              - General discussion
  http://bio.perl.org/MailList.html  - About the mailing lists

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track
the bugs and their resolution. Bug reports can be submitted via email
or the web:

  bioperl-bugs@bio.perl.org
  http://bio.perl.org/bioperl-bugs/

=head1 AUTHOR - Hilmar Lapp

Email hlapp@gmx.net

Describe contact details here

=head1 APPENDIX

The rest of the documentation details each of the object methods.
Internal methods are usually preceded with a _

=cut

# Let the code begin...

package Bio::Tools::FgeneSH;
use vars qw(@ISA);
use strict;
use Symbol;

use Bio::Root::Root;
use Bio::Tools::AnalysisResult;
use Bio::Tools::Prediction::Gene;
use Bio::Tools::Prediction::Exon;

@ISA = qw(Bio::Tools::AnalysisResult);

sub _initialize_state {
    my ($self,@args) = @_;

    # first call the inherited method!
    $self->SUPER::_initialize_state(@args);

    # our private state variables
    $self->{'_preds_parsed'} = 0;
    $self->{'_has_cds'} = 0;
    # array of pre-parsed predictions
    $self->{'_preds'} = [];
    # seq stack
    $self->{'_seqstack'} = [];
}

=head2 analysis_method

 Usage     : $fgenesh->analysis_method();
 Purpose   : Inherited method. Overridden to ensure that the name matches
             /fgenesh/i.
 Returns   : String
 Argument  : n/a

=cut

#-------------
sub analysis_method {
#-------------
    my ($self, $method) = @_;
    if($method && ($method !~ /fgenesh/i)) {
        $self->throw("method $method not supported in " . ref($self));
    }
    return $self->SUPER::analysis_method($method);
}

=head2 next_feature

 Title   : next_feature
 Usage   : while($gene = $fgenesh->next_feature()) {
                  # do something
           }
 Function: Returns the next gene structure prediction of the FgeneSH result file.
           Call this method repeatedly until FALSE is returned.

           The returned object is actually a SeqFeatureI implementing object.
           This method is required for classes implementing the
           SeqAnalysisParserI interface, and is merely an alias for
           next_prediction() at present.

 Example :
 Returns : A Bio::Tools::Prediction::Gene object.
 Args    :

=cut

sub next_feature {
    my ($self,@args) = @_;
    # even though next_prediction doesn't expect any args (and this method
    # does neither), we pass on args in order to be prepared if this changes
    # ever
    return $self->next_prediction(@args);
}

=head2 next_prediction

 Title   : next_prediction
 Usage   : while($gene = $fgenesh->next_prediction()) {
                  # do something
           }
 Function: Returns the next gene structure prediction of the FgeneSH result
           file. Call this method repeatedly until FALSE is returned.

 Example :
 Returns : A Bio::Tools::Prediction::Gene object.
 Args    :

=cut

sub next_prediction {
    my ($self) = @_;
    my $gene;

    # if the prediction section hasn't been parsed yet, we do this now
    $self->_parse_predictions() unless $self->_predictions_parsed();

    # get next gene structure
    $gene = $self->_prediction();

    # here i patch the GenScan package bug when we meet a result just contain
    # promoter or polyA records
    # Magic Fang(fangl@genomics.org.cn) BGI, 2003-11-10
    if($gene && $gene->exons>0) {
        # fill in predicted protein, and if available the predicted CDS
        #
        my ($id, $seq);
        # use the seq stack if there's a seq on it
        my $seqobj = pop(@{$self->{'_seqstack'}});
        if(!
           $seqobj) {
            # otherwise read from input stream
            ($id, $seq) = $self->_read_fasta_seq();
            # there may be no sequence at all, or none any more
            if($id && $seq) {
                $seqobj = Bio::PrimarySeq->new('-seq' => $seq,
                                               '-display_id' => $id,
                                               '-alphabet' => "protein");
            }
        }
        if($seqobj) {
            # check that prediction number matches the prediction number
            # indicated in the sequence id (there may be incomplete gene
            # predictions that contain only signals with no associated protein
            # and CDS, like promoters, poly-A sites etc)
            $gene->primary_tag() =~ /[^0-9]([0-9]+)$/;
            my $prednr = $1;
            if($seqobj->display_id() !~ /FGENESH\:\s+$prednr\s+/) {
                # this is not our sequence, so push back for next prediction
                push(@{$self->{'_seqstack'}}, $seqobj);
            } else {
                $gene->predicted_protein($seqobj);
                # CDS prediction, too?
                if($self->_has_cds()) {
                    ($id, $seq) = $self->_read_fasta_seq();
                    $seqobj = Bio::PrimarySeq->new('-seq' => $seq,
                                                   '-display_id' => $id,
                                                   '-alphabet' => "dna");
                    $gene->predicted_cds($seqobj);
                }
            }
        }
    }
    return $gene;
}

=head2 _parse_predictions

 Title   : _parse_predictions()
 Usage   : $obj->_parse_predictions()
 Function: Parses the prediction section. Automatically called by
           next_prediction() if not yet done.
 Example :
 Returns :

=cut

sub _parse_predictions {
    my ($self) = @_;
    my %exontags = ('CDSf' => 'Initial',
                    'CDSi' => 'Internal',
                    'CDSl' => 'Terminal',
                    'CDSo' => '');
    my $gene;
    my $seqname;

    while(defined($_ = $self->_readline())) {
        if(/^\s*(\d+)\s*[\+|\-]/) { # catch the gene region line
            # exon or signal
            my $prednr = $1;
            # my $signalnr = $2; # not used presently
            if(!
               defined($gene)) {
                # for GFF parser i modified the original GenScan module
                # Magic Fang(fangl@genomics.org.cn) BGI, 2003-11-10
                $gene = Bio::Tools::Prediction::Gene->new(-primary=>"gene",
                            '-display_name' => "GenePrediction$prednr",
                            '-source' => 'FgeneSH',
                            '-tag'=>{ 'gene_id'=>"GenePrediction$prednr"});
            }
            # split into fields
            chomp();
            $_=~s/^\s+//g;
            my @flds = split(/[\s|\t]+/, $_);
            # print "@flds\n";
            # create the feature object depending on the type of signal
            my $predobj;
            my $is_exon = grep {$_ eq $flds[3];} (keys(%exontags));
            if($is_exon) {
                $predobj = Bio::Tools::Prediction::Exon->new();
                $predobj->score($flds[7]);
                $predobj->start($flds[4]);
                $predobj->end($flds[6]);
                $predobj->start_signal_score($flds[7]);
                $predobj->end_signal_score($flds[7]);
                $predobj->coding_signal_score($flds[7]);
                $predobj->significance($flds[7]);
                # for correct GFF parser, i change the primary tag of the exon
                # to the exon, and set the display name to the original exon
                # primary tag
                # Magic Fang(fangl@genomics.org.cn) BGI, 2003-11-10
                $predobj->display_name($exontags{$flds[3]});
                $predobj->primary_tag('exon');
                $predobj->is_coding(1);
                $predobj->seq_id($seqname);
                # add 2 lines for artemis, for gene clustering and record the exon id
                # in the last column of GFF
                # Magic Fang(fangl@genomics.org.cn) BGI, 2003-11-10
                $predobj->add_tag_value('exon_id', $exontags{$flds[3]}.'Exon');
                $predobj->add_tag_value('gene', "GenePrediction$prednr");
                # first, set fields unique to exons
                # Figure out the frame of this exon. This is NOT the frame
                # given by FgeneSH, which is the absolute frame of the base
                # starting the first predicted complete codon. By comparing
                # to the absolute frame of the first base we can compute the
                # offset of the first complete codon to the first base of the
                # exon, which determines the frame of the exon.
                if($predobj->strand() == 1) {
                    $predobj->frame($flds[8]-$flds[4]);
                } else {
                    $predobj->frame($flds[6]-$flds[10]);
                }
                # then add to gene structure object
                # i modified the order of the follow 3 lines in GenScan module
                # Magic Fang(fangl@genomics.org.cn) BGI, 2003-11-10
                $predobj->source_tag('FgeneSH');
                $predobj->strand((($flds[1] eq '+') ? 1 : -1));
                $gene->add_exon($predobj, 'exon');
            } else {
                # PolyA site, or Promoter
                $predobj = Bio::SeqFeature::Generic->new();
                $predobj->score($flds[4]);
                $predobj->start($flds[3]);
                $predobj->end($flds[3]);
                $predobj->source_tag('FgeneSH');
                $predobj->strand((($flds[1] eq '+') ? 1 : -1));
                $predobj->seq_id($seqname);
                # add to gene structure (should be done only when start and end
                # are set, in order to allow for proper expansion of the range)
                if($flds[2] eq 'PolA') {
                    # for correct GFF and Feature Table format
                    # Magic Fang(fangl@genomics.org.cn) BGI, 2003-11-10
                    $predobj->primary_tag("PolyA");
                    $gene->poly_A_site($predobj);
                } elsif($flds[2] eq 'TSS') {
                    $predobj->primary_tag("Promoter");
                    $gene->add_promoter($predobj);
                }
                next;
            }
            # set common fields
        }
        if(/^\s*$/ && defined($gene)) {
            # current gene is completed
            $gene->seq_id($seqname);
            $self->_add_prediction($gene);
            $gene = undef;
            next;
        }
        if(/^(FGENESH)\s+(.+)\s+Prediction.+in\s+(\S+)\s+genomic/) {
            $self->analysis_method($1);
            $self->analysis_method_version($2);
            $self->analysis_subject($3);
            next;
        }
        if(/^\s*Seq name\:\s+(\S+)[\s|\r|\n]*/) {
            $seqname = $1;
            next;
        }
        $self->_has_cds(0);
        /^>/ && do {
            # section of predicted sequences
            $self->_pushback($_);
            last;
        };
    }
    $self->_predictions_parsed(1);
}

=head2 _prediction

 Title   : _prediction()
 Usage   : $gene = $obj->_prediction()
 Function: internal
 Example :
 Returns :

=cut

sub _prediction {
    my ($self) = @_;

    return undef unless(exists($self->{'_preds'}) && @{$self->{'_preds'}});
    return shift(@{$self->{'_preds'}});
}

=head2 _add_prediction

 Title   : _add_prediction()
 Usage   : $obj->_add_prediction($gene)
 Function: internal
 Example :
 Returns :

=cut

sub _add_prediction
{
    my ($self, $gene) = @_;

    if(! exists($self->{'_preds'})) {
        $self->{'_preds'} = [];
    }
    push(@{$self->{'_preds'}}, $gene);
}

=head2 _predictions_parsed

 Title   : _predictions_parsed
 Usage   : $obj->_predictions_parsed
 Function: internal
 Example :
 Returns : TRUE or FALSE

=cut

sub _predictions_parsed {
    my ($self, $val) = @_;

    $self->{'_preds_parsed'} = $val if $val;
    if(! exists($self->{'_preds_parsed'})) {
        $self->{'_preds_parsed'} = 0;
    }
    return $self->{'_preds_parsed'};
}

=head2 _has_cds

 Title   : _has_cds()
 Usage   : $obj->_has_cds()
 Function: Whether or not the result contains the predicted CDSs, too.
 Example :
 Returns : TRUE or FALSE

=cut

sub _has_cds {
    my ($self, $val) = @_;

    $self->{'_has_cds'} = $val if $val;
    if(! exists($self->{'_has_cds'})) {
        $self->{'_has_cds'} = 0;
    }
    return $self->{'_has_cds'};
}

=head2 _read_fasta_seq

 Title   : _read_fasta_seq()
 Usage   : ($id,$seqstr) = $obj->_read_fasta_seq();
 Function: Simple but specialised FASTA format sequence reader. Uses
           $self->_readline() to retrieve input, and is able to strip off
           the trailing description lines.
 Example :
 Returns : An array of two elements.
=cut

sub _read_fasta_seq {
    my ($self) = @_;
    my ($id, $seq);
    local $/ = ">";

    my $entry = $self->_readline();
    if($entry) {
        $entry =~ s/^>//;
        # complete the entry if the first line came from a pushback buffer
        while($entry !~ />$/) {
            last unless $_ = $self->_readline();
            $entry .= $_;
        }
        # delete everything onwards from an intervening empty line (at the
        # end there might be statistics stuff)
        $entry =~ s/[\n|\r]+>$//s;
        # id and sequence
        if($entry =~ /^(.+)\n([^>]+)/) {
            $id = $1;
            $seq = $2;
        } else {
            # I have to repair some fgenesh err here, for when the exon length is 3,
            # fgenesh will not print the protein sequence, but just the fasta head,
            # so i get rid of the exception handler here
            # $self->throw("Can't parse FgeneSH predicted sequence entry");
            # Magic Fang(fangl@genomics.org.cn) BGI, 2003-11-10
            $seq=undef;
        }
        $seq =~ s/\s//g; # Remove whitespace
    }
    return ($id, $seq);
}

1;

From jaudall at iastate.edu Wed Nov 12 09:50:57 2003
From: jaudall at iastate.edu (Joshua A Udall)
Date: Wed Nov 12 09:47:24 2003
Subject: [Bioperl-l] passing $result
Message-ID: <6.0.0.22.2.20031112084230.01cbbc18@jaudall.mail.iastate.edu>

Bioperl -

I'm having trouble passing an object and I seem only able to pass part of
it, the first 'layer'. I'm trying to pass $result (from blast parsing)
into a function and access the hits but they seem to have
disappeared. What is the correct syntax to pass $result as a whole object?

while(my $result = $searchio->next_result() ) {
   while (my $hit=$result->next_hit){
         print_graph($result);
   }
}

sub print_graph {
   my $result_sub = $_[0];
   while( my $hit = $result_sub->next_hit ) {

     ...
     and do groovy stuff here with the hits

   }
}


Joshua Udall
Department of Ecology, Evolution, and Organismal Biology
Iowa State University
Ames, IA 50011
Ph: (515) 294-7098
Fax: (515) 294-1337

From ak at ebi.ac.uk Wed Nov 12 10:12:55 2003
From: ak at ebi.ac.uk (Andreas Kahari)
Date: Wed Nov 12 10:09:26 2003
Subject: [Bioperl-l] passing $result
In-Reply-To: <6.0.0.22.2.20031112084230.01cbbc18@jaudall.mail.iastate.edu>
References: <6.0.0.22.2.20031112084230.01cbbc18@jaudall.mail.iastate.edu>
Message-ID: <20031112151255.GA7022@ebi.ac.uk>

On Wed, Nov 12, 2003 at 08:50:57AM -0600, Joshua A Udall wrote:
> Bioperl -
>
> I'm having trouble passing an object and I seem only able to pass part of
> it, the first 'layer'. I'm trying to pass $result (from blast parsing)
> into a function and access the hits but they seem to have
> disappeared. What is the correct syntax to pass $result as a whole object?
>
> while(my $result = $searchio->next_result() ) {
>    while (my $hit=$result->next_hit){
>          print_graph($result);
>    }
> }
>
> sub print_graph {
>    my $result_sub = $_[0];
>    while( my $hit = $result_sub->next_hit ) {
>
>      ... and do groovy stuff here with the hits
>
>    }
> }

Are you sure you want to call print_graph() inside the inner
loop, looping on next_hit(), and the loop over next_hit() within
print_graph() as well?

--
| )( | Andreas Kähäri                            |< >|
|( )| EMBL, European Bioinformatics Institute   | >< |
| )( | Wellcome Trust Genome Campus, Hinxton     |< >|
|( )| Cambridge, CB10 1SD                       | >< |
| )( | United Kingdom                            |< >|

From quickster333 at hotmail.com Wed Nov 12 12:15:57 2003
From: quickster333 at hotmail.com (Johnny Amos)
Date: Wed Nov 12 12:12:29 2003
Subject: [Bioperl-l] Drawing Chromosomes
Message-ID:

Hello,

I would like to draw chromosomes dynamically, colour cytogenic bands (e.g.
q13) according to the quantity of something (e.g. SNPs) in that band. The
idea is to high-light chromosomal patterns in these sorts of things in a
really display-friendly format.

I would use Bio::Graphics, but it seems the only way to do that is to load
the entire chromosome in, and then manually add each SNP. Does anyone have
any other suggestions on how to do this? I'm open to any ideas, and I'll
happily contribute anything of value I develop.

-J

_________________________________________________________________
Frustrated with dial-up? Get high-speed for as low as $26.95.
https://broadband.msn.com (Prices may vary by service area.)

From jason at cgt.duhs.duke.edu Wed Nov 12 12:41:17 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Nov 12 12:37:37 2003
Subject: [Bioperl-l] passing $result
In-Reply-To: <6.0.0.22.2.20031112084230.01cbbc18@jaudall.mail.iastate.edu>
References: <6.0.0.22.2.20031112084230.01cbbc18@jaudall.mail.iastate.edu>
Message-ID:

Two things - you don't want to call print_graph with $result - remove the
inner while( my $hit ...) loop if you want to operate on $result objs.

Second - remember that the next_XXX methods indicate an iterator. So if
all the hits have already been processed (i.e. you have called next_hit
until it returns undef) then calling it again will also return undef.

You have 2 options.

Get the list of all the hits -- my @hits = $result->hits;

Or call $result->_rewind to reset the iterator to the beginning of the
list of hits.

Examples of code that does this sort of thing are in
SearchIO::Writer::XXX, HTMLResultWriter for example.

Note that some future implementations may not support _rewind for various
reasons as it requires us to keep all the hits and hsps in memory for a
Result, but all the current implementations do support this.

-jason

On Wed, 12 Nov 2003, Joshua A Udall wrote:

> Bioperl -
>
> I'm having trouble passing an object and I seem only able to pass part of
> it, the first 'layer'. I'm trying to pass $result (from blast parsing)
> into a function and access the hits but they seem to have
> disappeared. What is the correct syntax to pass $result as a whole object?
>
> while(my $result = $searchio->next_result() ) {
>    while (my $hit=$result->next_hit){
>          print_graph($result);
>    }
> }
>
> sub print_graph {
>    my $result_sub = $_[0];
>    while( my $hit = $result_sub->next_hit ) {
>
>      ... and do groovy stuff here with the hits
>
>    }
> }
>
>
>
>
>
> Joshua Udall
> Department of Ecology, Evolution, and Organismal Biology
> Iowa State University
> Ames, IA 50011
> Ph: (515) 294-7098
> Fax: (515) 294-1337
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From lstein at cshl.edu Wed Nov 12 13:47:07 2003
From: lstein at cshl.edu (Lincoln Stein)
Date: Wed Nov 12 13:43:38 2003
Subject: [Bioperl-l] about the GD problem
In-Reply-To: <3FB18E38.3010607@genomics.org.cn>
References: <3FB18E38.3010607@genomics.org.cn>
Message-ID: <200311121347.07930.lstein@cshl.edu>

It's probably trying to create an image with negative width. Don't do that!

Lincoln

On Tuesday 11 November 2003 08:34 pm, Magic Fang wrote:
> in the Bio::Graphics::Panel, if i set width smaller then the
> pad_left+pad_right, it will cause perl core dump in my system(freebsd
> 5.1), need to fix this bug?
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

--
========================================================================
Lincoln D.
Stein                           Cold Spring Harbor Laboratory
lstein@cshl.org                 Cold Spring Harbor, NY
========================================================================

From lstein at cshl.edu Wed Nov 12 13:51:41 2003
From: lstein at cshl.edu (Lincoln Stein)
Date: Wed Nov 12 13:48:10 2003
Subject: [Bioperl-l] which GD.pm for libgd1.so.1.8.4
In-Reply-To: <1068579665.18699.2.camel@pandorina>
References: <1068579665.18699.2.camel@pandorina>
Message-ID: <200311121351.41894.lstein@cshl.edu>

I think 1.40.

Lincoln

On Tuesday 11 November 2003 02:41 pm, Charles Hauser wrote:
> What is the most recent version of GD.pm that is compatible w/
> libgd1.so.1.8.4?
>
> We've been forced to upgrade our system to rh9, brrrrrrrrr
>
> Charles
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

--
========================================================================
Lincoln D. Stein                           Cold Spring Harbor Laboratory
lstein@cshl.org                            Cold Spring Harbor, NY
========================================================================

From lstein at cshl.edu Wed Nov 12 13:31:51 2003
From: lstein at cshl.edu (Lincoln Stein)
Date: Wed Nov 12 15:05:35 2003
Subject: [Bioperl-l] Drawing Chromosomes
In-Reply-To:
References:
Message-ID: <200311121331.51893.lstein@cshl.edu>

There is an idiogram glyph in the library that comes with Generic Genome
Browser. It is used to draw human chromosomes for the hapmap project
(www.hapmap.org) You might be able to modify it in order to color-code the
cytogenetic bands. Currently the information on coloring comes out of the
feature tags.

Lincoln

On Wednesday 12 November 2003 12:15 pm, Johnny Amos wrote:
> Hello,
>
> I would like to draw chromosomes dynamically, colour cytogenic bands (e.g.
> q13) according to the quantity of something (e.g. SNPs) in that band. The
> idea is to high-light chromosomal patterns in these sorts of things in a
> really display-friendly format.
> > I would use Bio::Graphics, but it seems the only way to do that is to load > the entire chromosome in, and then manually add each SNP. Does anyone have > any other suggestions on how to do this? I'm open to any ideas, and I'll > happily contribute anything of value I develop. > > -J > > _________________________________________________________________ > Frustrated with dial-up? Get high-speed for as low as $26.95. > https://broadband.msn.com (Prices may vary by service area.) > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- ======================================================================== Lincoln D. Stein Cold Spring Harbor Laboratory lstein@cshl.org Cold Spring Harbor, NY ======================================================================== -------------- next part -------------- A non-text attachment was scrubbed... Name: ideogram.pm Type: text/x-perl Size: 9570 bytes Desc: not available Url : http://portal.open-bio.org/pipermail/bioperl-l/attachments/20031112/5bea252b/ideogram-0001.bin
From Perdeep.Mehta at stjude.org Wed Nov 12 18:40:42 2003 From: Perdeep.Mehta at stjude.org (Mehta, Perdeep) Date: Wed Nov 12 18:37:10 2003 Subject: [Bioperl-l] Bio::Tools::Run::StandAloneBlast Message-ID: Hi, I'm struggling to find the reason why my following test code to parse Blast output is not functioning. Just beginning to learn Bioperl and couldn't figure out what's missing.

#!/usr/bin/perl -w
#
use strict;

use Bio::Perl;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SearchIO;

# this script will only work with an internet connection
# on the computer it is run on

# Get the protein sequence
my $seq_object = get_sequence('swissprot',"ROA1_HUMAN");

# Set database to search against
my $DB = "/refseqdb/complete/rs_rel2";

# Create factory for stand alone Blast and search
my @params = ('program' => 'blastp','database' => $DB);
my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
my $blast_result = $factory->blastall($seq_object);

# Parse Blast results
my $in = new Bio::SearchIO(-format => 'blast',
                           -file   => $blast_result);
while( my $result = $in->next_result ) {
  while( my $hit = $result->next_hit ) {
    while( my $hsp = $hit->next_hsp ) {
      if( $hsp->length('total') > 100 ) {
        if ( $hsp->percent_identity >= 75 ) {
          print "Hit= ", $hit->name,
                ",Length=", $hsp->length('total'),
                ",Percent_id=", $hsp->percent_identity, "\n";
        }
      }
    }
  }
}

exit;

Blastall runs fine though. Error appears to be Blast result parsing related. Here is the error message that I get;
>test.pl
Can't locate object method "algorithm" via package "Bio::SearchIO::blast" (perhaps you forgot to load "Bio::SearchIO::blast"?) at /usr/local/lib/perl5/site_perl/5.6.1/Bio/SearchIO/Writer/TextResultWriter.pm line 146, line 197.

Thank you in advance for any help.
perdeep

Perdeep K. Mehta, PhD
Hartwell Center for Bioinformatics & Biotechnology
St. Jude Children's Research Hospital
Memphis, TN 38105-2794
Tel: 901-495 3774
http://www.hartwellcenter.org

From jason at cgt.duhs.duke.edu Wed Nov 12 19:18:13 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Nov 12 19:14:26 2003
Subject: [Bioperl-l] Bio::Tools::Run::StandAloneBlast
In-Reply-To:
References:
Message-ID:

Are you SURE you're running the same script which is giving that error - you
must be missing the write_blast() call in your code.
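
For anyone reading along: the error quoted above is what Perl reports when a method is called on an object whose class does not define it. A minimal sketch of that failure follows, assuming two toy classes. ToyParser, ToyResult and the describe() helper are hypothetical names invented for this illustration; they are not BioPerl classes or functions, and the sketch does not claim to show how TextResultWriter is implemented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for illustration only; these are NOT BioPerl classes.
package ToyResult;
sub new       { my ($class, %args) = @_; return bless { %args }, $class; }
sub algorithm { return $_[0]->{algorithm}; }

package ToyParser;
sub new { my ($class, @results) = @_; return bless { queue => [@results] }, $class; }
# Iterator: hands back one result per call, undef once the queue is empty.
sub next_result { return shift @{ $_[0]->{queue} }; }

package main;

# A made-up writer helper that expects a *result* object, not a parser.
sub describe {
    my ($result) = @_;
    return "algorithm: " . $result->algorithm;
}

my $parser = ToyParser->new( ToyResult->new(algorithm => 'BLASTP') );

# Wrong: handing over the parser itself. ToyParser has no algorithm()
# method, so the call dies at run time; eval traps the message in $@.
my $wrong = eval { describe($parser) };
my $error = $@;
print "error: $error" if $error;

# Right: pull each result out of the parser and hand that over instead.
my @lines;
while ( my $result = $parser->next_result ) {
    push @lines, describe($result);
}
print "$_\n" for @lines;
```

The same pattern should hold for any writer that expects a result object: iterate over the parser with next_result and pass on what it returns.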
The problem seems to be whereever you are calling write_blast, you are passing in $in (A Bio::SearchIO::blast object) NOT $result (A Bio::Search::Result::ResultI object). -jason On Wed, 12 Nov 2003, Mehta, Perdeep wrote: > Hi, > > I'm struggling to find the reason why my following test code to parse Blast output is not functioning. Just beginning to learn Bioperl and couldn't figure out what's missing. > > #!/usr/bin/perl -w > # > use strict; > > use Bio::Perl; > use Bio::Tools::Run::StandAloneBlast; > use Bio::SearchIO; > > # this script will only work with an internet connection > # on the computer it is run on > > # Get the protein sequence > my $seq_object = get_sequence('swissprot',"ROA1_HUMAN"); > > # Set database to search against > my $DB = "/refseqdb/complete/rs_rel2"; > > # Create factory for stand alone Blast and search > my @params = ('program' => 'blastp','database' => $DB); > my $factory = Bio::Tools::Run::StandAloneBlast->new(@params); > my $blast_result = $factory->blastall($seq_object); > > # Parse Blast results > my $in = new Bio::SearchIO(-format => 'blast', > -file => $blast_result); > while( my $result = $in->next_result ) { > while( my $hit = $result->next_hit ) { > while( my $hsp = $hit->next_hsp ) { > if( $hsp->length('total') > 100 ) { > if ( $hsp->percent_identity >= 75 ) { > print "Hit= ", $hit->name, > ",Length=", $hsp->length('total'), > ",Percent_id=", $hsp->percent_identity, "\n"; > } > } > } > } > } > > exit; > > Blastall runs fine though. Error appears to be Blast result parsing related. Here is the error message that I get; > >test.pl > Can't locate object method "algorithm" via package "Bio::SearchIO::blast" (perhaps you for > got to load "Bio::SearchIO::blast"?) at /usr/local/lib/perl5/site_perl/5.6.1/Bio/SearchIO/ > Writer/TextResultWriter.pm line 146, line 197. > > Thank you in advance for any help. > perdeep > > Perdeep K. Mehta, PhD > Hartwell Center for Bioinformatics & Biotechnology > St. 
Jude Children's Research Hospital > Memphis, TN 38105-2794 > Tel: 901-495 3774 > http://www.hartwellcenter.org > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From Perdeep.Mehta at stjude.org Wed Nov 12 22:06:23 2003 From: Perdeep.Mehta at stjude.org (Mehta, Perdeep) Date: Wed Nov 12 22:03:02 2003 Subject: [Bioperl-l] Bio::Tools::Run::StandAloneBlast Message-ID: You are right. I mixed up the things I'm not sure of. I had tried write_blast() also and that also threw errors. Then I thought to try the method with $in ... and drop the write_blast() call altogether. I thought $in... gives more control. However, it didn't as well. Here is the right error message that I got with the code included earlier. I'm sorry for the confusion. >test2.pl ------------- EXCEPTION ------------- MSG: Could not open Bio::SearchIO::blast=HASH(0x140b4c910) for reading: No such file or directory STACK Bio::Root::IO::_initialize_io /usr/local/lib/perl5/site_perl/5.6.1/Bio/Root/IO.pm:260 STACK Bio::Root::IO::new /usr/local/lib/perl5/site_perl/5.6.1/Bio/Root/IO.pm:206 STACK Bio::SearchIO::new /usr/local/lib/perl5/site_perl/5.6.1/Bio/SearchIO.pm:123 STACK Bio::SearchIO::new /usr/local/lib/perl5/site_perl/5.6.1/Bio/SearchIO.pm:155 STACK toplevel test2.pl:21 -------------------------------------- Also isn't that $blast_result contains the result from blastall search. Thanks, perdeep -----Original Message----- From: Jason Stajich [mailto:jason@cgt.duhs.duke.edu] Sent: Wed 11/12/2003 6:18 PM To: Mehta, Perdeep Cc: bioperl-l@bioperl.org Subject: Re: [Bioperl-l] Bio::Tools::Run::StandAloneBlast Are you SURE you're running the same script which giving that error - you must be missing the write_blast() call in your code. 
The problem seems to be whereever you are calling write_blast, you are passing in $in (A Bio::SearchIO::blast object) NOT $result (A Bio::Search::Result::ResultI object). -jason On Wed, 12 Nov 2003, Mehta, Perdeep wrote: > Hi, > > I'm struggling to find the reason why my following test code to parse Blast output is not functioning. Just beginning to learn Bioperl and couldn't figure out what's missing. > > #!/usr/bin/perl -w > # > use strict; > > use Bio::Perl; > use Bio::Tools::Run::StandAloneBlast; > use Bio::SearchIO; > > # this script will only work with an internet connection > # on the computer it is run on > > # Get the protein sequence > my $seq_object = get_sequence('swissprot',"ROA1_HUMAN"); > > # Set database to search against > my $DB = "/refseqdb/complete/rs_rel2"; > > # Create factory for stand alone Blast and search > my @params = ('program' => 'blastp','database' => $DB); > my $factory = Bio::Tools::Run::StandAloneBlast->new(@params); > my $blast_result = $factory->blastall($seq_object); > > # Parse Blast results > my $in = new Bio::SearchIO(-format => 'blast', > -file => $blast_result); > while( my $result = $in->next_result ) { > while( my $hit = $result->next_hit ) { > while( my $hsp = $hit->next_hsp ) { > if( $hsp->length('total') > 100 ) { > if ( $hsp->percent_identity >= 75 ) { > print "Hit= ", $hit->name, > ",Length=", $hsp->length('total'), > ",Percent_id=", $hsp->percent_identity, "\n"; > } > } > } > } > } > > exit; > > Blastall runs fine though. Error appears to be Blast result parsing related. Here is the error message that I get; > >test.pl > Can't locate object method "algorithm" via package "Bio::SearchIO::blast" (perhaps you for > got to load "Bio::SearchIO::blast"?) at /usr/local/lib/perl5/site_perl/5.6.1/Bio/SearchIO/ > Writer/TextResultWriter.pm line 146, line 197. > > Thank you in advance for any help. > perdeep > > Perdeep K. Mehta, PhD > Hartwell Center for Bioinformatics & Biotechnology > St. 
Jude Children's Research Hospital
> Memphis, TN 38105-2794
> Tel: 901-495 3774
> http://www.hartwellcenter.org
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From wes.barris at csiro.au Thu Nov 13 00:26:55 2003
From: wes.barris at csiro.au (Wes Barris)
Date: Thu Nov 13 00:23:43 2003
Subject: [Bioperl-l] Bio::Tools::Run::StandAloneBlast
In-Reply-To:
References:
Message-ID: <3FB3161F.5010106@csiro.au>

Mehta, Perdeep wrote:
> You are right. I mixed up the things I'm not sure of. I had tried write_blast() also and that also threw errors. Then I thought to try the method with $in ... and drop the write_blast() call altogether. I thought $in... gives more control. However, it didn't as well. Here is the right error message that I got with the code included earlier. I'm sorry for the confusion.

Change these lines:

my $blast_result = $factory->blastall($seq_object);

# Parse Blast results
my $in = new Bio::SearchIO(-format => 'blast',
                           -file   => $blast_result);
while( my $result = $in->next_result ) {

to this:

my $blast_result = $factory->blastall($seq_object);

# Parse Blast results
while( my $result = $blast_result->next_result ) {

>>test2.pl
>
> ------------- EXCEPTION -------------
> MSG: Could not open Bio::SearchIO::blast=HASH(0x140b4c910) for reading: No such file or directory
> STACK Bio::Root::IO::_initialize_io /usr/local/lib/perl5/site_perl/5.6.1/Bio/Root/IO.pm:260
> STACK Bio::Root::IO::new /usr/local/lib/perl5/site_perl/5.6.1/Bio/Root/IO.pm:206
> STACK Bio::SearchIO::new /usr/local/lib/perl5/site_perl/5.6.1/Bio/SearchIO.pm:123
> STACK Bio::SearchIO::new /usr/local/lib/perl5/site_perl/5.6.1/Bio/SearchIO.pm:155
> STACK toplevel test2.pl:21
> --------------------------------------
>
> Also isn't that $blast_result contains the result from blastall search.
> > Thanks, > perdeep > > -----Original Message----- > From: Jason Stajich [mailto:jason@cgt.duhs.duke.edu] > Sent: Wed 11/12/2003 6:18 PM > To: Mehta, Perdeep > Cc: bioperl-l@bioperl.org > Subject: Re: [Bioperl-l] Bio::Tools::Run::StandAloneBlast > > > > Are you SURE you're running the same script which giving that error - you > must be missing the write_blast() call in your code. > > The problem seems to be whereever you are calling write_blast, you are > passing in $in (A Bio::SearchIO::blast object) NOT $result (A > Bio::Search::Result::ResultI object). > > > -jason > > On Wed, 12 Nov 2003, Mehta, Perdeep wrote: > > > Hi, > > > > I'm struggling to find the reason why my following test code to parse Blast output is not functioning. Just beginning to learn Bioperl and couldn't figure out what's missing. > > > > #!/usr/bin/perl -w > > # > > use strict; > > > > use Bio::Perl; > > use Bio::Tools::Run::StandAloneBlast; > > use Bio::SearchIO; > > > > # this script will only work with an internet connection > > # on the computer it is run on > > > > # Get the protein sequence > > my $seq_object = get_sequence('swissprot',"ROA1_HUMAN"); > > > > # Set database to search against > > my $DB = "/refseqdb/complete/rs_rel2"; > > > > # Create factory for stand alone Blast and search > > my @params = ('program' => 'blastp','database' => $DB); > > my $factory = Bio::Tools::Run::StandAloneBlast->new(@params); > > my $blast_result = $factory->blastall($seq_object); > > > > # Parse Blast results > > my $in = new Bio::SearchIO(-format => 'blast', > > -file => $blast_result); > > while( my $result = $in->next_result ) { > > while( my $hit = $result->next_hit ) { > > while( my $hsp = $hit->next_hsp ) { > > if( $hsp->length('total') > 100 ) { > > if ( $hsp->percent_identity >= 75 ) { > > print "Hit= ", $hit->name, > > ",Length=", $hsp->length('total'), > > ",Percent_id=", $hsp->percent_identity, "\n"; > > } > > } > > } > > } > > } > > > > exit; > > > > Blastall runs fine though. 
Error appears to be Blast result parsing related. Here is the error message that I get;
> > >test.pl
> > Can't locate object method "algorithm" via package "Bio::SearchIO::blast" (perhaps you for
> > got to load "Bio::SearchIO::blast"?) at /usr/local/lib/perl5/site_perl/5.6.1/Bio/SearchIO/
> > Writer/TextResultWriter.pm line 146, line 197.
> >
> > Thank you in advance for any help.
> > perdeep
> >
> > Perdeep K. Mehta, PhD
> > Hartwell Center for Bioinformatics & Biotechnology
> > St. Jude Children's Research Hospital
> > Memphis, TN 38105-2794
> > Tel: 901-495 3774
> > http://www.hartwellcenter.org
> >
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

--
Wes Barris
E-Mail: Wes.Barris@csiro.au

From heikki at ebi.ac.uk Thu Nov 13 09:26:14 2003
From: heikki at ebi.ac.uk (Heikki Lehvaslaiho)
Date: Thu Nov 13 09:22:51 2003
Subject: [Bioperl-l] tests failing, please help
Message-ID: <1068733573.2373.14.camel@localhost>

I've now gone through all warnings and failures in the bioperl-live test
suite. I was running a brand new perl 5.8.2.

Some of the errors I saw are due to perl not talking to the right CPAN
module.

The most serious and widespread problems were caused by
Bio::LocatableSeq::trunc() not handling sequences with strand=-1
correctly (my code!). In the process of fixing it, I've touched start()
and end(), too. You can now initialise a LocatableSeq with sequence only
and start() and end() will return correct values.

The remaining problems are:

* Registry tests failing
  -- but we knew that already

* EncodedSeq needs fixing,
  -- this might be related to LocatableSeq or not
  -- Aaron?
* t/SeqFeature.t hangs on $feat->gff_string() call
  -- pressing return releases it
  & GFF.t has IO handle problems
  -- these two might be related
  -- Lincoln?

If some other tests fail for you, especially due to my "fixes", let me know.

-Heikki

--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambs. CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________

From Perdeep.Mehta at stjude.org Thu Nov 13 09:50:28 2003
From: Perdeep.Mehta at stjude.org (Mehta, Perdeep)
Date: Thu Nov 13 09:47:00 2003
Subject: [Bioperl-l] Bio::Tools::Run::StandAloneBlast
Message-ID:

Thank you. It has worked, but what this statement is for

my $in = new Bio::SearchIO(-format => 'blast',
                           -file   => $blast_result);

and why the following didn't work;
write_blast(">roa1.blast",$blast_result);

Thanks to all for responding.
perdeep

-----Original Message-----
From: Wes Barris [mailto:wes.barris@csiro.au]
Sent: Wednesday, November 12, 2003 11:27 PM
To: Mehta, Perdeep
Cc: bioperl-l@bioperl.org
Subject: Re: [Bioperl-l] Bio::Tools::Run::StandAloneBlast

Mehta, Perdeep wrote:

> You are right. I mixed up the things I'm not sure of. I had tried write_blast() also and that also threw errors. Then I thought to try the method with $in ... and drop the write_blast() call altogether. I thought $in... gives more control. However, it didn't as well. Here is the right error message that I got with the code included earlier. I'm sorry for the confusion.
Change these lines:

my $blast_result = $factory->blastall($seq_object);
# Parse Blast results
my $in = new Bio::SearchIO(-format => 'blast',
                           -file   => $blast_result);
while( my $result = $in->next_result ) {

to this:

my $blast_result = $factory->blastall($seq_object);
# Parse Blast results
while( my $result = $blast_result->next_result ) {

>>test2.pl
>
> ------------- EXCEPTION -------------
> MSG: Could not open Bio::SearchIO::blast=HASH(0x140b4c910) for reading: No such file or directory
> STACK Bio::Root::IO::_initialize_io /usr/local/lib/perl5/site_perl/5.6.1/Bio/Root/IO.pm:260
> STACK Bio::Root::IO::new /usr/local/lib/perl5/site_perl/5.6.1/Bio/Root/IO.pm:206
> STACK Bio::SearchIO::new /usr/local/lib/perl5/site_perl/5.6.1/Bio/SearchIO.pm:123
> STACK Bio::SearchIO::new /usr/local/lib/perl5/site_perl/5.6.1/Bio/SearchIO.pm:155
> STACK toplevel test2.pl:21
> --------------------------------------
>
> Also isn't that $blast_result contains the result from blastall search.
>
> Thanks,
> perdeep
>
> -----Original Message-----
> From: Jason Stajich [mailto:jason@cgt.duhs.duke.edu]
> Sent: Wed 11/12/2003 6:18 PM
> To: Mehta, Perdeep
> Cc: bioperl-l@bioperl.org
> Subject: Re: [Bioperl-l] Bio::Tools::Run::StandAloneBlast
>
> Are you SURE you're running the same script which giving that error - you
> must be missing the write_blast() call in your code.
>
> The problem seems to be whereever you are calling write_blast, you are
> passing in $in (A Bio::SearchIO::blast object) NOT $result (A
> Bio::Search::Result::ResultI object).
>
> -jason
>
> On Wed, 12 Nov 2003, Mehta, Perdeep wrote:
>
> > Hi,
> >
> > I'm struggling to find the reason why my following test code to parse Blast output is not functioning. Just beginning to learn Bioperl and couldn't figure out what's missing.
--
Wes Barris
E-Mail: Wes.Barris@csiro.au

From tobias.straub at lmu.de Thu Nov 13 10:07:36 2003
From: tobias.straub at lmu.de (Tobias)
Date: Thu Nov 13 10:04:02 2003
Subject: [Bioperl-l] xyplot weirdness
Message-ID: <1D6E4A70-15EB-11D8-9E9B-0003935A86C6@lmu.de>

hi,

when trying to use xy-plot (boxes) in gbrowse I get unpredictable rendering of the scores. if I set a set the scores of all subfeatures that I aggregate to the same value (i.e. 5) the boxes are not rendered at all, means they have a height of zero. when I change the value of one subfeature to a different value (i.e. 3), all but this subfeature are rendered properly. the changed subfeature gets a box of zero height, while the other ones now have a clear 5. furthermore this behaviour changes (i did not say improve) when changing zoom levels. sometimes, not always.
I'm using bioperl v. 1.302 and gbrowse v. 1.54

looks like a bug, a known-one?

cheers
Tobias

From Matthew.Betts at ii.uib.no Thu Nov 13 10:14:26 2003
From: Matthew.Betts at ii.uib.no (Matthew Betts)
Date: Thu Nov 13 10:10:55 2003
Subject: [Bioperl-l] Bio::AlignIO::po
Message-ID:

Hei,

Does anyone have a PO format reader for AlignIO? This is the native output format for POA. It can also output in pir or clustalw, so it's not a big problem. But I'd quite like to try to try my hand at writing the parser if one doesn't exist already...
Thanks,

Matthew

--
Matthew Betts, mailto:matthew.betts@ii.uib.no
Phone: (+47) 55 58 40 22, Fax: (+47) 55 58 42 95
CBU, BCCS, UNIFOB / Universitetet i Bergen
Thormøhlensgt. 55, N-5008 Bergen, Norway

From jason at cgt.duhs.duke.edu Thu Nov 13 10:41:08 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Thu Nov 13 10:37:31 2003
Subject: [Bioperl-l] Bio::Tools::Run::StandAloneBlast
In-Reply-To:
References:
Message-ID:

$blast_result ISA Bio::SearchIO object not a filename.

If you do

print ref($blast_result)

You'll see that StandAloneBlast is returning you a SearchIO object.

-jason

On Thu, 13 Nov 2003, Mehta, Perdeep wrote:

> Thank you. It has worked, but what this statement is for
>
> my $in = new Bio::SearchIO(-format => 'blast',
>                            -file   => $blast_result);
>
> and why the following didn't work;
> write_blast(">roa1.blast",$blast_result);
>
> Thanks to all for responding.
> perdeep
>
> -----Original Message-----
> From: Wes Barris [mailto:wes.barris@csiro.au]
> Sent: Wednesday, November 12, 2003 11:27 PM
> To: Mehta, Perdeep
> Cc: bioperl-l@bioperl.org
> Subject: Re: [Bioperl-l] Bio::Tools::Run::StandAloneBlast
>
> Mehta, Perdeep wrote:
>
> > You are right. I mixed up the things I'm not sure of. I had tried write_blast() also and that also threw errors. Then I thought to try the method with $in ... and drop the write_blast() call altogether. I thought $in... gives more control. However, it didn't as well. Here is the right error message that I got with the code included earlier. I'm sorry for the confusion.
--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From jason at cgt.duhs.duke.edu Thu Nov 13 10:42:34 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Thu Nov 13 10:38:57 2003
Subject: [Bioperl-l] Bio::AlignIO::po
In-Reply-To:
References:
Message-ID:

One does not exist - would be happy for you to write one. AlignIO::clustalw or fasta would be good templates. An AlignIO pir parser wouldn't be too bad either if you get into it.

-jason

On Thu, 13 Nov 2003, Matthew Betts wrote:

>
> Hei,
>
> Does anyone have a PO format reader for AlignIO? This is the native output
> format for POA. It can also output in pir or clustalw, so it's not a big
> problem. But I'd quite like to try to try my hand at writing the parser if
> one doesn't exist already...
>
> Thanks,
>
> Matthew
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From brian_osborne at cognia.com Thu Nov 13 11:12:51 2003
From: brian_osborne at cognia.com (Brian Osborne)
Date: Thu Nov 13 11:11:56 2003
Subject: [Bioperl-l] tests failing, please help
In-Reply-To: <1068733573.2373.14.camel@localhost>
Message-ID:

Heikki et al.

>The remaining problems are:
>* Registry tests failing
>  -- but we knew that already

I would like a bit of help with this, I'd like to know if the behavior I see with Cygwin is the same elsewhere. Could someone please do the following?
Comment out the END line in Registry.t and run the test twice. When I do this the first run fails and the second succeeds. All the END line is doing is cleaning up the Flat databases that the test creates. To me it appears as if I make a Flat database and then call the Registry in the same script I'll see an error, but if the databases already exist and I call the Registry everything, at least as tested in Registry.t, is fine.

Can you someone try this? Thanks again,

Brian O.

From lstein at cshl.edu Thu Nov 13 13:40:47 2003
From: lstein at cshl.edu (Lincoln Stein)
Date: Thu Nov 13 13:39:37 2003
Subject: [Bioperl-l] xyplot weirdness
In-Reply-To: <1D6E4A70-15EB-11D8-9E9B-0003935A86C6@lmu.de>
References: <1D6E4A70-15EB-11D8-9E9B-0003935A86C6@lmu.de>
Message-ID: <200311131340.47481.lstein@cshl.edu>

Hi Tobias,

You might try updating to the version of bioperl-live I checked in about three weeks ago. There were quite a few bugs in the xyplot glyph that I quashed. However, you might as well send along a dataset and config file that reproduces the problem and I'll work on it here too.

Lincoln

On Thursday 13 November 2003 10:07 am, Tobias wrote:
> hi,
>
> when trying to use xy-plot (boxes) in gbrowse I get unpredictable
> rendering of the scores. if I set a set the scores of all subfeatures
> that I aggregate to the same value (i.e. 5) the boxes are not rendered
> at all, means they have a height of zero. when I change the value of
> one subfeature to a different value (i.e. 3), all but this subfeature
> are rendered properly. the changed subfeature gets a box of zero
> height, while the other ones now have a clear 5. furthermore this
> behaviour changes (i did not say improve) when changing zoom levels.
> sometimes, not always.
> I'm using bioperl v. 1.302 and gbrowse v. 1.54
>
> looks like a bug, a known-one?
>
> cheers
> Tobias
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

--
Lincoln Stein
lstein@cshl.edu
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)

From skirov at utk.edu Thu Nov 13 14:58:43 2003
From: skirov at utk.edu (Stefan Kirov)
Date: Thu Nov 13 14:55:09 2003
Subject: [Bioperl-l] SeqFeature test
Message-ID: <3FB3E273.1000508@utk.edu>

I have run into a problem I cannot quite track. Seqfeature test waits for the STDIN on my system (Linux SuSE), which I suppose is not meant to happen. This happens when gff_string is called:

$str = $feat->gff_string() || ""; # placate -w

Isn't the gff_string supposed to be called with an argument (because it isn't). I am hesitant to submit this as a bug since I don't quite get the logic behind this.

Stefan

From lstein at cshl.edu Thu Nov 13 15:44:10 2003
From: lstein at cshl.edu (Lincoln Stein)
Date: Thu Nov 13 15:40:38 2003
Subject: [Bioperl-l] tests failing, please help
In-Reply-To: <1068733573.2373.14.camel@localhost>
References: <1068733573.2373.14.camel@localhost>
Message-ID: <200311131544.10694.lstein@cshl.edu>

> * t/SeqFeature.t hangs on $feat->gff_string() call
>   -- pressing return releases it
>   & GFF.t has IO handle problems
>   -- these two might be related
>   -- Lincoln?

I just love IO handle problems. Give me a couple days to get 5.8.2 installed and I'll track it down.

Lincoln

--
Lincoln Stein
lstein@cshl.edu
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)

From skirov at utk.edu Thu Nov 13 16:07:10 2003
From: skirov at utk.edu (Stefan Kirov)
Date: Thu Nov 13 16:03:38 2003
Subject: [Bioperl-l] tests failing, please help
Message-ID: <3FB3F27E.2050801@utk.edu>

I guess it is the same problem I had. I forgot to mention I am running 5.8.0.
And it's not hanging, it just waits for end-of-line from STDIN (press ctrl-j or whatever your system needs).

Stefan

Lincoln Stein wrote:

>> * t/SeqFeature.t hangs on $feat->gff_string() call
>>   -- pressing return releases it
>>   & GFF.t has IO handle problems
>>   -- these two might be related
>>   -- Lincoln?
>>
>
> I just love IO handle problems. Give me a couple days to get 5.8.2
> installed and I'll track it down.
>
> Lincoln
>

From ronan at roasp.com Thu Nov 13 16:22:26 2003
From: ronan at roasp.com (Ronan Oger)
Date: Thu Nov 13 16:18:32 2003
Subject: [Bioperl-l] new to the group. And a quick demo of SVG::GD
Message-ID: <200311132122.26300.ronan@roasp.com>

Hi,

My name is Ronan Oger, I am the lead developer of the SVG module. One of the focuses of my current work is SVG::GD, a wrapper for the GD module to provide SVG (vector) output instead of raster.

http://www.w3.org/Graphics/SVG/
http://www.w3.org/TR/SVG

I've been doing some tests with GD and GD derivatives, and since bioperl is a fairly heavy user of GD, I have been testing around some bioperl code to see how it works.

There are some real issues in the SVG::GD at this point, but several people are working on it and progress is being made. Clearly this is not production code at this stage. In particular, font support is still very poor, and font positions are still broken.

However, here is a bioperl-specific sample (you need an SVG-compliant browser, such as IE with Adobe or Corel's SVG viewers installed).

A png and its svg friend taken from a bio-related example on the net
------------------------------------
http://www.roasp.com/2003/11/13/

More prolific example comparisons
http://www.roasp.com/2003/11/11/

The SVG::GD module (version 0.07):
http://www.roasp.com/2003/11/11/SVG-GD-0.07.tar.gz
(This module has a dependency on SVG, which is on CPAN)

When it ripens, the module will live on CPAN.

I'd appreciate some feedback, issues, etc. In particular, relating to the module's usability.
All the best,

Ronan

--
Ronan Oger
http://www.roasp.com
Serverside SVG Portal

From wes.barris at csiro.au Thu Nov 13 18:02:24 2003
From: wes.barris at csiro.au (Wes Barris)
Date: Thu Nov 13 17:58:58 2003
Subject: [Bioperl-l] standaloneblast executable location?
Message-ID: <3FB40D80.8050001@csiro.au>

Hi,

I am using Bio::Tools::Run::StandAloneBlast. The only thing that I have been having trouble with is telling it where my "blastall" executable is located. If I set this environment variable outside of the bioperl script it works:

setenv BLASTDIR /usr/local/blast2.0

However, I want this script to be self contained and have everything set from inside the script. I have tried the following inside the script, but none of them work:

BEGIN {$ENV{'BLASTDIR'} = '/usr/local/blast2.0';}

$ENV{'BLASTDIR'} = '/usr/local/blast2.0';

my $PROGRAMDIR = '/usr/local/blast2.0';

In all cases (unless I set the BLASTDIR outside of the script), I get this error:

-------------------- WARNING ---------------------
MSG: cannot find path to blastall
---------------------------------------------------

Here is a portion of my code:

#!/usr/local/bin/perl -w
#
use strict;
use Bio::Tools::Run::StandAloneBlast;
#BEGIN {$ENV{'BLASTDIR'} = '/usr/local/blast2.0';}
#my $PROGRAMDIR = '/usr/local/blast2.0';
my $usage = "Usage: $0 \n";
my $infile = shift or die $usage;
my $seq_in = Bio::SeqIO->new(-format=>'fasta', -file=>$infile);
my $seq = $seq_in->next_seq;
my @params = (
   'program' => 'blastn',
   'database' => '/htdocs/db/blast/refseq',
   );
my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
my $blast_result = $factory->blastall($seq);

--
Wes Barris
E-Mail: Wes.Barris@csiro.au

From jason at cgt.duhs.duke.edu Thu Nov 13 18:25:08 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Thu Nov 13 18:21:22 2003
Subject: [Bioperl-l] standaloneblast executable location?
In-Reply-To: <3FB40D80.8050001@csiro.au>
References: <3FB40D80.8050001@csiro.au>
Message-ID:

On Fri, 14 Nov 2003, Wes Barris wrote:

> Hi,
>
> I am using Bio::Tools::Run::StandAloneBlast. The only thing that I have
> been having trouble with is telling it where my "blastall" executable
> is located. If I set this environment variable outside of the bioperl
> script it works:
>
> setenv BLASTDIR /usr/local/blast2.0
>
> However, I want this script to be self contained and have everything set
> from inside the script. I have tried the following inside the script,
> but none of them work:
>
> BEGIN {$ENV{'BLASTDIR'} = '/usr/local/blast2.0';}
>
> $ENV{'BLASTDIR'} = '/usr/local/blast2.0';
>
> my $PROGRAMDIR = '/usr/local/blast2.0';
>
> In all cases (unless I set the BLASTDIR outside of the script), I get
> this error:

So you need to put the env setting BEFORE the 'use StandAloneBlast'

use strict;
BEGIN {$ENV{'BLASTDIR'} = '/usr/local/blast2.0';}
use Bio::Tools::Run::StandAloneBlast;

This is because at compile time BLASTDIR envvariable is used to set the $PROGRAMDIR in StandAloneBlast

Doing this
$Bio::Tools::Run::StandAloneBlast::PROGRAMDIR='/usr/local/blast2.0';
will also work I think.
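The ordering point above can be tried out without BLAST or BioPerl installed. What follows is only a minimal sketch: it assumes nothing about StandAloneBlast beyond what the message says (that BLASTDIR is read when the module is loaded). The pretend_load sub and the DEMO_EARLY and DEMO_LATE variables are hypothetical stand-ins, not BioPerl API, and the BEGIN blocks stand in for the compile-time work that 'use' performs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our %SEEN;    # what each pretend module saw in %ENV when it was "loaded"

# Hypothetical stand-in for a module's load-time code. It is not BioPerl;
# it only records an environment variable at the moment it runs.
sub pretend_load {
    my ($label, $envvar) = @_;
    $SEEN{$label} = $ENV{$envvar};
}

# Start from a clean slate so the demo does not depend on the caller's shell.
BEGIN { delete @ENV{qw(DEMO_EARLY DEMO_LATE)}; }

# Case 1: set the variable in a BEGIN block, then "load" the module.
# Both steps happen at compile time, in the order they appear in the file.
BEGIN { $ENV{DEMO_EARLY} = '/usr/local/blast2.0'; }
BEGIN { pretend_load('early', 'DEMO_EARLY'); }

# Case 2: "load" at compile time, assign at run time. The assignment runs
# only after every BEGIN block in the file has already finished.
BEGIN { pretend_load('late', 'DEMO_LATE'); }
$ENV{DEMO_LATE} = '/usr/local/blast2.0';

for my $label (qw(early late)) {
    my $value = defined $SEEN{$label} ? $SEEN{$label} : '(not set)';
    print "$label: $value\n";
}
```

Run as written, the 'early' case should report the path and the 'late' case should report '(not set)'. Separately, a `my $PROGRAMDIR` in the calling script declares a new lexical variable of its own, so it cannot change a package variable that lives inside another module.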
-jason

> ---------------------------------------------------
>
> Here is a portion of my code:
>
> #!/usr/local/bin/perl -w
> #
> use strict;
> use Bio::Tools::Run::StandAloneBlast;
> #BEGIN {$ENV{'BLASTDIR'} = '/usr/local/blast2.0';}
> #my $PROGRAMDIR = '/usr/local/blast2.0';
> my $usage = "Usage: $0 \n";
> my $infile = shift or die $usage;
> my $seq_in = Bio::SeqIO->new(-format=>'fasta', -file=>$infile);
> my $seq = $seq_in->next_seq;
> my @params = (
>    'program' => 'blastn',
>    'database' => '/htdocs/db/blast/refseq',
>    );
> my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
> my $blast_result = $factory->blastall($seq);
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From zayed at sanbi.ac.za Fri Nov 14 04:45:16 2003
From: zayed at sanbi.ac.za (Zayed Albertyn)
Date: Fri Nov 14 04:41:45 2003
Subject: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files
Message-ID: <20031114111251.D96833@fling.sanbi.ac.za>

Hi

I am trying to process the genbank flatfile release by opening the file from a gunzip pipe. The problem is that I keep on receiving this weird error about guess_format internal methods of Bio::SeqIO when I have specified the format type.

Can't call method "_guess_format" without a package or object reference at /cip0/research/zayed/gmod/bioperl-live/Bio/SeqIO.pm line 363, line 4.

Could the fact that I am trying to open it from a pipe be the problem, because I don't get this error if I use a decompressed file.

Here is my code

foreach $file (@file) {
    my $seq_in = Bio::SeqIO::new(
        '-file' => "/bin/gunzip $path/$file |",
        '-format' => 'genbank'
    );
}

Also, I would like to pass an array reference to Bio::SeqIO::MultiFile (so I don't have to use the foreach loop above) but how would I do that if I wanted to open multiple gzipped files?

I have updated my bioperl-live distribution on Fri Nov 14 2003 11:15 am CAT

Finally, there is an error in the SeqIO HowTo page that might confuse some people.
In the section where you describe working with gzipped files the following code has

my $outformat = shift or die $usage;

Shouldn't this be $outfile, because $outformat is not used anywhere else in the code as you have already specified 'Fasta' as the $seqout output format?

Thanks
Zayed

-----------------------------------------------
From: Zayed Albertyn
Electric Genetics PTY Ltd
Tel: +27 21 959 3645; Mobile: +2782 480 6097
www.egenetics.com

From ak at ebi.ac.uk Fri Nov 14 05:06:12 2003
From: ak at ebi.ac.uk (Andreas Kahari)
Date: Fri Nov 14 05:02:39 2003
Subject: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files
In-Reply-To: <20031114111251.D96833@fling.sanbi.ac.za>
References: <20031114111251.D96833@fling.sanbi.ac.za>
Message-ID: <20031114100612.GA30180@ebi.ac.uk>

On Fri, Nov 14, 2003 at 11:45:16AM +0200, Zayed Albertyn wrote:
[cut]
> my $seq_in = Bio::SeqIO::new(
>     '-file' => "/bin/gunzip $path/$file |",
>     '-format' => 'genbank'
> );
> }

You might find that you now have a lot of uncompressed files lying around.

I believe you're missing the "-c" switch which makes gunzip send its output to standard output.

my $seq_in = Bio::SeqIO::new(
    '-file' => "/bin/gunzip -c $path/$file|",
    '-format' => 'genbank'
);

--
|()()| Andreas Kähäri                          |(==)|
|)()(| EMBL, European Bioinformatics Institute |=)(=|
|()()| Wellcome Trust Genome Campus, Hinxton   |(==)|
|)()(| Cambridge, CB10 1SD                     |=)(=|
|()()| United Kingdom                          |(==)|

From zayed at sanbi.ac.za Fri Nov 14 05:13:32 2003
From: zayed at sanbi.ac.za (Zayed Albertyn)
Date: Fri Nov 14 05:10:02 2003
Subject: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files
In-Reply-To: <20031114100612.GA30180@ebi.ac.uk>
References: <20031114111251.D96833@fling.sanbi.ac.za> <20031114100612.GA30180@ebi.ac.uk>
Message-ID: <20031114121149.B96833@fling.sanbi.ac.za>

Thanks a lot guys. Could you please include that in further editions of the HowTo? You all are doing a great job!
Zayed

On Fri, 14 Nov 2003, Andreas Kahari wrote:

> On Fri, Nov 14, 2003 at 11:45:16AM +0200, Zayed Albertyn wrote:
> [cut]
> > my $seq_in = Bio::SeqIO::new(
> >     '-file' => "/bin/gunzip $path/$file |",
> >     '-format' => 'genbank'
> > );
> > }
>
> You might find that you now have a lot of uncompressed files
> lying around.
>
> I believe you're missing the "-c" switch which makes gunzip send
> its output to standard output.
>
> my $seq_in = Bio::SeqIO::new(
>     '-file' => "/bin/gunzip -c $path/$file|",
>     '-format' => 'genbank'
> );
>
> --
> |()()| Andreas Kähäri                          |(==)|
> |)()(| EMBL, European Bioinformatics Institute |=)(=|
> |()()| Wellcome Trust Genome Campus, Hinxton   |(==)|
> |)()(| Cambridge, CB10 1SD                     |=)(=|
> |()()| United Kingdom                          |(==)|
>

-----------------------------------------------
From: Zayed Albertyn
Electric Genetics PTY Ltd
Tel: +27 21 959 3645; Mobile: +2782 480 6097
www.egenetics.com

From zayed at sanbi.ac.za Fri Nov 14 05:34:53 2003
From: zayed at sanbi.ac.za (Zayed Albertyn)
Date: Fri Nov 14 05:31:23 2003
Subject: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files
In-Reply-To: <20031114100612.GA30180@ebi.ac.uk>
References: <20031114111251.D96833@fling.sanbi.ac.za> <20031114100612.GA30180@ebi.ac.uk>
Message-ID: <20031114121837.F96833@fling.sanbi.ac.za>

Hi Andreas

Adding the -c switch still doesnt work. I still get the same error message. Input is the full path to the file e.g.
/cip0/db/GENBANK/RELEASE137/gbest13.seq.gz == $path/$file

I've written another script that does the normal open(FILE,"/bin/gunzip -c file1 |") and it works fine

Z

>
> my $seq_in = Bio::SeqIO::new(
>     '-file' => "/bin/gunzip -c $path/$file|",
>     '-format' => 'genbank'
> );
>
> --
> |()()| Andreas Kähäri                          |(==)|
> |)()(| EMBL, European Bioinformatics Institute |=)(=|
> |()()| Wellcome Trust Genome Campus, Hinxton   |(==)|
> |)()(| Cambridge, CB10 1SD                     |=)(=|
> |()()| United Kingdom                          |(==)|
>

-----------------------------------------------
From: Zayed Albertyn
Electric Genetics PTY Ltd
Tel: +27 21 959 3645; Mobile: +2782 480 6097
www.egenetics.com

From ak at ebi.ac.uk Fri Nov 14 05:38:33 2003
From: ak at ebi.ac.uk (Andreas Kahari)
Date: Fri Nov 14 05:34:58 2003
Subject: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files
In-Reply-To: <20031114121837.F96833@fling.sanbi.ac.za>
References: <20031114111251.D96833@fling.sanbi.ac.za> <20031114100612.GA30180@ebi.ac.uk> <20031114121837.F96833@fling.sanbi.ac.za>
Message-ID: <20031114103833.GA25889@ebi.ac.uk>

On Fri, Nov 14, 2003 at 12:34:53PM +0200, Zayed Albertyn wrote:
> Hi Andreas
>
> Adding the -c switch still doesnt work. I still get the same error
> message. Input is the full path to the file e.g.
>
> /cip0/db/GENBANK/RELEASE137/gbest13.seq.gz == $path/$file

Does the file "/cip0/db/GENBANK/RELEASE137/gbest13.seq.gz" exist or was it uncompressed into "/cip0/db/GENBANK/RELEASE137/gbest13.seq" when you ran the program which didn't use '-c' with gunzip?
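For comparison, the plain-Perl pipe read mentioned above can be sketched as follows; running something like it on the same file is one way to check gunzip separately from SeqIO. This is an illustration only: count_gz_lines is a hypothetical helper, it assumes a gunzip executable on the PATH, and it uses the list form of a pipe open rather than the two-argument form quoted in the message.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the lines of a gzip-compressed file by reading the output of
# "gunzip -c" through a pipe. The compressed file itself is left untouched.
sub count_gz_lines {
    my ($path) = @_;

    # List-form pipe open: no shell is involved, so unusual characters
    # in $path are passed to gunzip as-is.
    open(my $fh, '-|', 'gunzip', '-c', $path)
        or die "Cannot start gunzip for $path: $!";

    my $count = 0;
    $count++ while <$fh>;

    # close() on a pipe reports the exit status of the child process.
    close($fh) or die "gunzip failed on $path (exit status $?)";
    return $count;
}

# Only do something when file names are given on the command line.
if (@ARGV) {
    printf "%s: %d lines\n", $_, count_gz_lines($_) for @ARGV;
}
```

If this reads the file cleanly while the SeqIO version still fails, the pipe is probably not the culprit.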
Andreas

--
|)()(| Andreas Kähäri                          |{}{}|
|()()| EMBL, European Bioinformatics Institute |}{}{|
|)()(| Wellcome Trust Genome Campus, Hinxton   |{}{}|
|()()| Cambridge, CB10 1SD                     |}{}{|
|)()(| United Kingdom                          |{}{}|

From zayed at sanbi.ac.za Fri Nov 14 05:52:25 2003
From: zayed at sanbi.ac.za (Zayed Albertyn)
Date: Fri Nov 14 05:49:01 2003
Subject: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files
In-Reply-To: <20031114103833.GA25889@ebi.ac.uk>
References: <20031114111251.D96833@fling.sanbi.ac.za> <20031114100612.GA30180@ebi.ac.uk> <20031114121837.F96833@fling.sanbi.ac.za> <20031114103833.GA25889@ebi.ac.uk>
Message-ID: <20031114124403.G96833@fling.sanbi.ac.za>

The file does exist as

-rwxrwxrwx 1 dbases sanbi 21510739 Sep 27 23:27 /cip0/db/GENBANK/RELEASE137/gbest13.seq.gz

I always supplied a gzipped file and printed it's name to check what files are being worked on.

I even added an "if (-e "$path/$file") " to the loop and it still went in to try and open the file.

But I cannot write to the directory it is in. So without the -c switch I wont be able to decompress it anyway unless I make a copy in my own directory. I have tried that with and without the -c switch. gunzip does the job but I cant get around that internal method error.

Z

On Fri, 14 Nov 2003, Andreas Kahari wrote:

> On Fri, Nov 14, 2003 at 12:34:53PM +0200, Zayed Albertyn wrote:
> > Hi Andreas
> >
> > Adding the -c switch still doesnt work. I still get the same error
> > message. Input is the full path to the file e.g.
> >
> > /cip0/db/GENBANK/RELEASE137/gbest13.seq.gz == $path/$file
>
> Does the file "/cip0/db/GENBANK/RELEASE137/gbest13.seq.gz"
> exist or was it uncompressed into
> "/cip0/db/GENBANK/RELEASE137/gbest13.seq" when you ran the
> program which didn't use '-c' with gunzip?
> > > > Andreas > > -- > |)()(| Andreas K?h?ri |{}{}| > |()()| EMBL, European Bioinformatics Institute |}{}{| > |)()(| Wellcome Trust Genome Campus, Hinxton |{}{}| > |()()| Cambridge, CB10 1SD |}{}{| > |)()(| United Kingdom |{}{}| > ----------------------------------------------- From: Zayed Albertyn Electric Genetics PTY Ltd Tel: +27 21 959 3645; Mobile: +2782 480 6097 www.egenetics.com From ak at ebi.ac.uk Fri Nov 14 07:00:03 2003 From: ak at ebi.ac.uk (Andreas Kahari) Date: Fri Nov 14 06:56:28 2003 Subject: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files In-Reply-To: <20031114124403.G96833@fling.sanbi.ac.za> References: <20031114111251.D96833@fling.sanbi.ac.za> <20031114100612.GA30180@ebi.ac.uk> <20031114121837.F96833@fling.sanbi.ac.za> <20031114103833.GA25889@ebi.ac.uk> <20031114124403.G96833@fling.sanbi.ac.za> Message-ID: <20031114120003.GA25691@ebi.ac.uk> On Fri, Nov 14, 2003 at 12:52:25PM +0200, Zayed Albertyn wrote: > The file does exist as > > -rwxrwxrwx 1 dbases sanbi 21510739 Sep 27 23:27 /cip0/db/GENBANK/RELEASE137/gbest13.seq.gz > > I always supplied a gzipped file and printed it's name to check what files > are being worked on. > > I even added an "if (-e "$path/$file") " to the loop and it still went in > to try and open the file. > > > But I cannot write to the directory it is in. So without the -c switch I > wont be able to decompress it anyway unless I make a copy in my own > directory. I have tried that with and without the -c switch. gunzip does > the job but I cant get around that internal method error. [cut] I think this is where you let us know what version of bioperl you're using and what system you're on. Using fresh bioperl 1.2.3 installation on OpenBSD, I have no problem reading genbank data off a gunzip pipe. It would be useful to see a minimal, yet complete, program that exhibits the behaviour you're mentioning. 
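A possible lead, offered as a guess rather than a confirmed diagnosis: the snippet quoted earlier in this thread builds the reader with Bio::SeqIO::new(...), a plain function call, whereas BioPerl constructors are normally invoked as class methods, Bio::SeqIO->new(...). Called as a function, the first argument ('-file') lands where the class name is expected, the remaining arguments shift by one so '-format' is never seen as a key, and the fallback call to a class method then fails. The sketch below reproduces that pattern in core Perl only. Demo::Reader and its _guess_format are hypothetical stand-ins written for this illustration, not BioPerl code, and the exact wording of the error depends on the Perl version in use.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a class whose constructor calls a class method.
package Demo::Reader;

sub new {
    my ($class, %args) = @_;
    # When no -format is given, guess it by calling a helper as a class method.
    my $format = $args{'-format'} || $class->_guess_format($args{'-file'});
    return bless { file => $args{'-file'}, format => $format }, $class;
}

sub _guess_format {
    my ($class, $file) = @_;
    return (defined $file && $file =~ /\.gb/) ? 'genbank' : 'fasta';
}

package main;

# Method call: Perl passes 'Demo::Reader' as the first argument.
my $ok = Demo::Reader->new('-file' => 'x.gb', '-format' => 'genbank');
print "method call:   format = $ok->{format}\n";

# Function call: '-file' is taken as the class name and the rest of the
# list becomes a lopsided hash (Perl also warns about the odd element count),
# so the constructor falls through to '-file'->_guess_format and dies.
my $bad   = eval { Demo::Reader::new('-file' => 'x.gb', '-format' => 'genbank') };
my $error = $@;
print "function call: $error";
```

If this is indeed what is happening, changing the :: before new to -> would be the first thing to try, independently of whether -file or -fh is used for the gunzip pipe.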
Andreas

--
|[--]| Andreas Kähäri                                |-}{-|
|-][-| EMBL, European Bioinformatics Institute      |{--}|
|[--]| Wellcome Trust Genome Campus, Hinxton        |-}{-|
|-][-| Cambridge, CB10 1SD                          |{--}|
|[--]| United Kingdom                               |-}{-|

From zayed at sanbi.ac.za Fri Nov 14 09:25:54 2003
From: zayed at sanbi.ac.za (Zayed Albertyn)
Date: Fri Nov 14 09:23:23 2003
Subject: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files
In-Reply-To: <20031114120003.GA25691@ebi.ac.uk>
References: <20031114111251.D96833@fling.sanbi.ac.za>
	<20031114100612.GA30180@ebi.ac.uk>
	<20031114121837.F96833@fling.sanbi.ac.za>
	<20031114103833.GA25889@ebi.ac.uk>
	<20031114124403.G96833@fling.sanbi.ac.za>
	<20031114120003.GA25691@ebi.ac.uk>
Message-ID: <20031114161930.R96833@fling.sanbi.ac.za>

I am using bioperl-live checked out this morning from bioperl.org CVS. My OS
is Red Hat 7.3. I tried it on an RH9 machine with stable bioperl 1.2.3 and I
get the same error message:

zayedi@jive4:26pm~/research/GBEST_tmp% perl bin/plant_org.pl plant_org.list
Working on /cip0/db/GENBANK/RELEASE137/gbest13.seq.gz 1/269
Can't call method "_guess_format" without a package or object reference at /cip0/research/zayed/gmod/bioperl-live/Bio/SeqIO.pm line 363, line 4.

Here is my minimal code:

$files is an array reference to a list of files in $path directory

foreach $file (@{$files}) {
    if (-e "$path/$file") {
        $c++;
        my $seq_in = Bio::SeqIO::new(
            '-file'   => "/bin/gunzip -c $path/$file|",
            '-format' => 'genbank'
        );
        while ( my $seqobj = $seq_in->next_seq ) {
            next unless $ORG->{$species};
            print OUT $file, "\n";
        }
    } else {
        print "$file not in existence\n";
    }
}

>On Fri, 14 Nov 2003, Andreas Kahari wrote:
> On Fri, Nov 14, 2003 at 12:52:25PM +0200, Zayed Albertyn wrote:
> > The file does exist as
> >
> > -rwxrwxrwx 1 dbases sanbi 21510739 Sep 27 23:27 /cip0/db/GENBANK/RELEASE137/gbest13.seq.gz
> >
> > I always supplied a gzipped file and printed it's name to check what files
> > are being worked on.
> >
> > I even added an "if (-e "$path/$file") " to the loop and it still went in
> > to try and open the file.
> >
> >
> > But I cannot write to the directory it is in. So without the -c switch I
> > wont be able to decompress it anyway unless I make a copy in my own
> > directory. I have tried that with and without the -c switch. gunzip does
> > the job but I cant get around that internal method error.
> [cut]
>
> I think this is where you let us know what version of bioperl
> you're using and what system you're on.
>
> Using fresh bioperl 1.2.3 installation on OpenBSD, I have no
> problem reading genbank data off a gunzip pipe.
>
> It would be useful to see a minimal, yet complete, program that
> exhibits the behaviour you're mentioning.
>
>
> Andreas
>
> --
> |[--]| Andreas Kähäri                                |-}{-|
> |-][-| EMBL, European Bioinformatics Institute      |{--}|
> |[--]| Wellcome Trust Genome Campus, Hinxton        |-}{-|
> |-][-| Cambridge, CB10 1SD                          |{--}|
> |[--]| United Kingdom                               |-}{-|
>

-----------------------------------------------
From: Zayed Albertyn
Electric Genetics PTY Ltd
Tel: +27 21 959 3645; Mobile: +2782 480 6097
www.egenetics.com

From heikki at ebi.ac.uk Fri Nov 14 10:14:37 2003
From: heikki at ebi.ac.uk (Heikki Lehvaslaiho)
Date: Fri Nov 14 10:11:07 2003
Subject: [Bioperl-l] new HOWTO: SimpleWebAnalysis
Message-ID: <1068822876.6731.30.camel@localhost>

In the bioperl-live CVS (and 1.3 snapshots) we have several new modules
that try to make life easier when you want to automate the use of sequence
analysis algorithms from web forms.

Richard Adams has now written a HOWTO document about his
Bio::Tools::Analysis modules. Look in /doc/howto/{sgml|html|pdf|txt} for the
SimpleWebAnalysis document in your favourite format (it is CVS only for
now).

Summary:

Richard has written a superclass Bio::Tools::Analysis::SimpleAnalysisBase,
which implements Bio::SimpleAnalysisI and inherits from Bio::WebAgent.
Adding a new form-based sequence service is easily done by sub-classing,
specifying some parameters and overriding a few methods. Seven different
services have been added to date, with more to come.

Enjoy,

	-Heikki

--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambs. CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________

From jason at cgt.duhs.duke.edu Fri Nov 14 11:14:32 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Fri Nov 14 11:11:02 2003
Subject: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files
In-Reply-To: <20031114121837.F96833@fling.sanbi.ac.za>
References: <20031114111251.D96833@fling.sanbi.ac.za>
	<20031114100612.GA30180@ebi.ac.uk>
	<20031114121837.F96833@fling.sanbi.ac.za>
Message-ID:

When you pass in -file there is an implicit assumption that it is a
filename you are passing in, NOT a stream.

If you want to make this work, do this (you can replace 'zcat' with
'gunzip -c' if you prefer):

    open($fh, "zcat $filename.gz |");
    my $seqio = new Bio::SeqIO(-fh => $fh, -format => 'genbank');

You can also provide multiple files in that zcat:

    open($fh, "zcat $file1 $file2 ... |");

-jason

On Fri, 14 Nov 2003, Zayed Albertyn wrote:

> Hi Andreas
>
> Adding the -c switch still doesnt work. I still get the same error
> message. Input is the full path to the file e.g.
>
> /cip0/db/GENBANK/RELEASE137/gbest13.seq.gz == $path/$file
>
> I've written another script that does the normal
> open(FILE,"/bin/gunzip -c file1 |")
>
> and it works fine
>
> Z
>
> >
> > my $seq_in = Bio::SeqIO::new(
> >     '-file' => "/bin/gunzip -c $path/$file|",
> >     '-format' => 'genbank'
> > );
> >
> >
> >
> > --
> > |()()| Andreas Kähäri                                |(==)|
> > |)()(| EMBL, European Bioinformatics Institute      |=)(=|
> > |()()| Wellcome Trust Genome Campus, Hinxton        |(==)|
> > |)()(| Cambridge, CB10 1SD                          |=)(=|
> > |()()| United Kingdom                               |(==)|
> >
>
> -----------------------------------------------
> From: Zayed Albertyn
> Electric Genetics PTY Ltd
> Tel: +27 21 959 3645; Mobile: +2782 480 6097
> www.egenetics.com
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From brian_osborne at cognia.com Fri Nov 14 11:43:54 2003
From: brian_osborne at cognia.com (Brian Osborne)
Date: Fri Nov 14 11:43:31 2003
Subject: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files
In-Reply-To:
Message-ID:

Jason,

This is odd because the SeqIO HOWTO says you can do the trick that Zayed is
trying. From the HOWTO:

    use Bio::SeqIO;
    # get command-line arguments, or die with a usage statement
    my $usage = "gzip2fasta.pl infile informat outfile\n";
    my $infile = shift or die $usage;
    my $informat = shift or die $usage;
    my $outformat = shift or die $usage;

    # create one SeqIO object to read in, and another to write out
    my $seqin = Bio::SeqIO->new('-file' => "/usr/local/bin/gunzip $infile |",
                                '-format' => $informat);

    my $seqout = Bio::SeqIO->new('-file' => ">$outfile",
                                 '-format' => 'Fasta');

    # write each entry in the input to the output file
    while (my $inseq = $seqin->next_seq) {
        $outseq->write_seq($inseq);
    }
    exit;

I should correct the HOWTO?

Brian O.
From
d.gatherer at vir.gla.ac.uk Fri Nov 14 11:54:15 2003
From: d.gatherer at vir.gla.ac.uk (Derek Gatherer)
Date: Fri Nov 14 11:49:18 2003
Subject: [Bioperl-l] Codeml.pm/PAML.pm
In-Reply-To: <1068483873.2453.10.camel@localhost>
References: <5.2.1.1.1.20031110161920.00b08650@udcf.gla.ac.uk>
	<5.2.1.1.1.20031110161920.00b08650@udcf.gla.ac.uk>
Message-ID: <5.2.1.1.1.20031114163856.00ad4dd0@udcf.gla.ac.uk>

Hi

I'm running the script in scripts/utilities/pairwise_kaks.PLS , but it
chokes when it gets to the line:

    my $result = $parser->next_result;

the output is:
____________________
CLUSTAL W (1.83) Multiple Sequence Alignments

Sequence format is Pearson
Sequence 1: MERLIN_UL150.CDS.EXP 639 aa
Sequence 2: TOLEDO_UL150.CDS.EXP 639 aa
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score: 94
Guide tree file created: [/tmp/9TloZHefb1/Xk3b0BFDEn.dnd]
Start of Multiple Alignment
There are 1 groups
Aligning...
Group 1: Sequences: 2 Score:13481
Alignment Score 3765
GCG-Alignment file created [/tmp/9TloZHefb1/aGgun5VY9E]
Use of uninitialized value in pattern match (m//) at /usr/lib/perl-5.8.0/lib/site_perl/5.8.0/Bio/Tools/Phylo/PAML.pm line 524, line 89.
_______________________

so I seem to be getting as far as actually making the parser with
$kaks_factory->run(), but then that parser won't run next_result(). The
line in PAML.pm that causes the problem is:

    while( $rest =~ /(\-?\d+(\.\d+)?)\s*\(\-?(\d+(\.\d+)?)\s+(\-?\d+(\.\d+)?)\)/g )

Is this a PAML configuration problem? or is it my input format (it looks
like okay FASTA to me)

Cheers
Derek

From jason at cgt.duhs.duke.edu Fri Nov 14 12:07:12 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Fri Nov 14 12:03:48 2003
Subject: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files
In-Reply-To:
References:
Message-ID:

Ah right - I guess that should work - the "-c" is really needed to correct
the howto.

I just make a separate filehandle myself to make sure things work.
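
The separate-filehandle approach just mentioned can be sketched as follows, as an illustration rather than the list's official recipe. It assumes a gunzip binary on the PATH, and the helper names open_gunzip_pipe and count_lines_gz are made up for this example. The list form of open starts gunzip without going through a shell, so unusual characters in the file name are not interpreted, and the resulting handle is what would be passed as -fh instead of -file.

```perl
use strict;
use warnings;

# Hypothetical helper: start "gunzip -c FILE" and return a handle on its output.
sub open_gunzip_pipe {
    my ($path) = @_;
    die "No such file: $path\n" unless -e $path;
    # List-form pipe open: no shell is involved in running gunzip.
    open(my $fh, '-|', 'gunzip', '-c', $path)
        or die "Cannot run gunzip on $path: $!\n";
    return $fh;
}

# Count the lines of a gzipped file by reading from the pipe.
sub count_lines_gz {
    my ($path) = @_;
    my $fh = open_gunzip_pipe($path);
    my $n  = 0;
    while (my $line = <$fh>) {
        $n++;
    }
    # close() on a pipe returns false when gunzip exited with a non-zero status.
    close($fh) or die "gunzip failed on $path (exit status $?)\n";
    return $n;
}

# With BioPerl installed, the same handle would go to the parser:
#   my $fh    = open_gunzip_pipe("$path/$file");
#   my $seqio = Bio::SeqIO->new(-fh => $fh, -format => 'genbank');
#   while (my $seq = $seqio->next_seq) { ... }
```

Checking the return value of close is worth keeping in any variant of this: a truncated or corrupt .gz file otherwise just looks like a short input.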
-jason On Fri, 14 Nov 2003, Brian Osborne wrote: > Jason, > > This is odd because the SeqIO HOWTO says you can do the trick that Zayed is > trying. From the HOWTO: > > use Bio::SeqIO; > # get command-line arguments, or die with a usage statement > my $usage = "gzip2fasta.pl infile informat outfile\n"; > my $infile = shift or die $usage; > my $informat = shift or die $usage; > my $outformat = shift or die $usage; > > # create one SeqIO object to read in, and another to write out > my $seqin = Bio::SeqIO->new('-file' => "/usr/local/bin/gunzip $infile > |", > '-format' => $informat); > > my $seqout = Bio::SeqIO->new('-file' => ">$outfile", > '-format' => 'Fasta'); > > # write each entry in the input to the output file > while (my $inseq = $seqin->next_seq) { > $outseq->write_seq($inseq); > } > exit; > > I should correct the HOWTO? > > Brian O. > > > -----Original Message----- > From: bioperl-l-bounces@portal.open-bio.org > [mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Jason Stajich > Sent: Friday, November 14, 2003 11:15 AM > To: Zayed Albertyn > Cc: bioperl-l@bioperl.org; Andreas Kahari > Subject: Re: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files > > When you pass in -file there is an implicit assumption that it is a > filename you are passing in, NOT a stream. > > If you want to make this work, do this (you can replace 'zcat' with > 'gunzip -c' if you prefer ) > open($fh, "zcat $filename.gz |"); > my $seqio = new Bio::SeqIO(-fh => $fh, -format => 'genbank'); > > You can also provide multiple files in that zcat > open($fh, "zcat $file1 $file2 ... |"); > > -jason > On Fri, 14 Nov 2003, Zayed Albertyn wrote: > > > Hi Andreas > > > > Adding the -c switch still doesnt work. I still get the same error > > message. Input is the full path to the file e.g. 
> > > > /cip0/db/GENBANK/RELEASE137/gbest13.seq.gz == $path/$file > > > > I've written another script that does the normal > > open(FILE,"/bin/gunzip -c file1 |") > > > > and it works fine > > > > Z > > > > > > > > my $seq_in = Bio::SeqIO::new( > > > '-file' => "/bin/gunzip -c $path/$file|", > > > '-format' => 'genbank' > > > ); > > > > > > > > > > > > -- > > > |()()| Andreas K?h?ri |(==)| > > > |)()(| EMBL, European Bioinformatics Institute |=)(=| > > > |()()| Wellcome Trust Genome Campus, Hinxton |(==)| > > > |)()(| Cambridge, CB10 1SD |=)(=| > > > |()()| United Kingdom |(==)| > > > > > > > ----------------------------------------------- > > From: Zayed Albertyn > > Electric Genetics PTY Ltd > > Tel: +27 21 959 3645; Mobile: +2782 480 6097 > > www.egenetics.com > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From hyupaik at indiana.edu Fri Nov 14 13:48:44 2003 From: hyupaik at indiana.edu (hyupaik@indiana.edu) Date: Fri Nov 14 13:45:00 2003 Subject: [Bioperl-l] Problem or Bug with Bio:: In-Reply-To: References: Message-ID: <1068835724.3fb5238c6deef@webmail.iu.edu> #! 
/local/bin/perl -w

use strict;
use lib '/home/hy1001/bin';

use Bio::SimpleAlign;
use Bio::AlignIO;


my $in = Bio::AlignIO->new('-file' => 'fastaResult',
                           '-format' => 'fasta');
my $aln = $in->next_aln();

my $out = new Bio::AlignIO('-file' => '>testout.fasta',
                           '-format' => 'fasta');
$out->write_aln($aln);


>>crab_chick/1-462
initninitoptZ-score.bits.E.Smith-Watermanscore.identity.unga
ppedinaaoverlap--crabaMDITIHNPLIRRPLFSWLAPSRIFDQIFGEHLQESELL
PASPSLSPFLMRSPIFRMPSWL....crabcMDITIHNPLVRRPLFSWLTPSRIFDQIFG
EHLQESELLPTSPSLSPFLMRSPFFRMPSWLcrabaETGLSEMRLEKDKFSVNLDVKHFS
PEELKVKVLGDMVEIHGKHEERQDEHGFIAREFNRK..crabcETGLSEMRLEKDKFSVN
LDVKHFSPEELKVKVLGDMIEIHGKHEERQDEHGFIAREFSRKcrabaYRIPADVDPLTI
TSSLSLDGVLTVSAPRKQSDVPERSIPITREEKPAIAGAQRK.crabcYRIPADVDPLTI
TSSLSLDGVLTVSAPRKQSDVPERSIPITREEKPAIAGSQRK
>>crab_bovin/1-628
initninitoptZ-score.bits.E.Smith-Watermanscore.identity.unga
ppedinaaoverlap--crabaMDITIHNPLIRRPLFSWLAPSRIFDQIFGEHLQESELL
PASPSLSPFLMRSPIF-RMPSW..........crabbMDIAIHHPWIRRPFFPFHSPSRL
FDQFFGEHLLESDLFPASTSLSPFYLRPPSFLRAPSWcrabaLETGLSEMRLEKDKFSVN
LDVKHFSPEELKVKVLGDMVEIHGKHEERQDEHGFIAREFNR........crabbIDTGL
SEMRLEKDRFSVNLDVKHFSPEELKVKVLGDVIEVHGKHEERQDEHGFISREFHRcraba
KYRIPADVDPLTITSSLSLDGVLTVSAPRKQSDVPERSIPITREEKPAIAGAQRK.....
....crabbKYRIPADVDPLAITSSLSSDGVLTVNGPRKQASGPERTIPITREEKPAVT
AAPKKresiduesinquerysequencesresiduesinlibrarysequencesScomp
libtstartFriNovdoneFriNovTotalScantime.TotalDisplaytime.Func
tionusedwasFASTAversion.tNov

From hyupaik at indiana.edu Fri Nov 14 14:03:17 2003
From: hyupaik at indiana.edu (hyupaik@indiana.edu)
Date: Fri Nov 14 13:59:38 2003
Subject: [Bioperl-l] Problem or Bug with Bio::LocatableSeq
In-Reply-To:
References:
Message-ID: <1068836597.3fb526f5e5905@webmail.iu.edu>

Hello list,

I am sorry for the previous email. I sent it accidentally.

I found something strange when I used AlignIO.
I wanted to look at a sequence through LocatableSeq,
but I got more characters than the sequence itself contains.
I believe that those characters are from the index of the fasta result file.
Here's my code and the result sequence.
Is this a bug or did I do something wrong?

Thank you.

- Henry.

----------------------------------------------------------------
#! /local/bin/perl -w

use strict;
use lib '/home/hy1001/bin';

use Bio::SimpleAlign;
use Bio::AlignIO;

my $in = Bio::AlignIO->new('-file' => 'fastaResult',
                           '-format' => 'fasta');
my $aln = $in->next_aln();

print $aln->get_seq_by_pos(1)->seq(),"\n";
------------------------------------------------------------------
[hy1001@biokdd fastaAlign]$ perl alignfasta.pl
initninitoptZ-score.bits.E.Smith-Watermanscore.identity.ungappedinaaoverlap--crabaMDITIHNPLIRRPLFSWLAPSRIFDQIFGEHLQESELLPASPSLSPFLMRSPIFRMPSWL....crabcMDITIHNPLVRRPLFSWLTPSRIFDQIFGEHLQESELLPTSPSLSPFLMRSPFFRMPSWLcrabaETGLSEMRLEKDKFSVNLDVKHFSPEELKVKVLGDMVEIHGKHEERQDEHGFIAREFNRK..crabcETGLSEMRLEKDKFSVNLDVKHFSPEELKVKVLGDMIEIHGKHEERQDEHGFIAREFSRKcrabaYRIPADVDPLTITSSLSLDGVLTVSAPRKQSDVPERSIPITREEKPAIAGAQRK.crabcYRIPADVDPLTITSSLSLDGVLTVSAPRKQSDVPERSIPITREEKPAIAGSQRK

From jason at cgt.duhs.duke.edu Fri Nov 14 14:06:30 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Fri Nov 14 14:02:48 2003
Subject: [Bioperl-l] Problem or Bug with Bio::
In-Reply-To: <1068835724.3fb5238c6deef@webmail.iu.edu>
References: <1068835724.3fb5238c6deef@webmail.iu.edu>
Message-ID:

FASTA format has different meanings.

AlignIO 'fasta' means multiple sequence alignment FASTA, like this:

>seq1
AGATAG-GATGA
>seq2
AGATAGTGA-GA

If you want to parse FASTA (pairwise) search output, use Bio::SearchIO.

On Fri, 14 Nov 2003 hyupaik@indiana.edu wrote:

>
>
> #!
/local/bin/perl -w > > use strict; > use lib '/home/hy1001/bin'; > > use Bio::SimpleAlign; > use Bio::AlignIO; > > > my $in = Bio::AlignIO->new('-file' => 'fastaResult', > '-format' => 'fasta'); > my $aln = $in->next_aln(); > > my $out = new Bio::AlignIO('-file' => '>testout.fasta', > '-format' => 'fasta'); > $out->write_aln($aln); > > > > > > > >>crab_chick/1-462 > initninitoptZ-score.bits.E.Smith-Watermanscore.identity.unga > ppedinaaoverlap--crabaMDITIHNPLIRRPLFSWLAPSRIFDQIFGEHLQESELL > PASPSLSPFLMRSPIFRMPSWL....crabcMDITIHNPLVRRPLFSWLTPSRIFDQIFG > EHLQESELLPTSPSLSPFLMRSPFFRMPSWLcrabaETGLSEMRLEKDKFSVNLDVKHFS > PEELKVKVLGDMVEIHGKHEERQDEHGFIAREFNRK..crabcETGLSEMRLEKDKFSVN > LDVKHFSPEELKVKVLGDMIEIHGKHEERQDEHGFIAREFSRKcrabaYRIPADVDPLTI > TSSLSLDGVLTVSAPRKQSDVPERSIPITREEKPAIAGAQRK.crabcYRIPADVDPLTI > TSSLSLDGVLTVSAPRKQSDVPERSIPITREEKPAIAGSQRK > >>crab_bovin/1-628 > initninitoptZ-score.bits.E.Smith-Watermanscore.identity.unga > ppedinaaoverlap--crabaMDITIHNPLIRRPLFSWLAPSRIFDQIFGEHLQESELL > PASPSLSPFLMRSPIF-RMPSW..........crabbMDIAIHHPWIRRPFFPFHSPSRL > FDQFFGEHLLESDLFPASTSLSPFYLRPPSFLRAPSWcrabaLETGLSEMRLEKDKFSVN > LDVKHFSPEELKVKVLGDMVEIHGKHEERQDEHGFIAREFNR........crabbIDTGL > SEMRLEKDRFSVNLDVKHFSPEELKVKVLGDVIEVHGKHEERQDEHGFISREFHRcraba > KYRIPADVDPLTITSSLSLDGVLTVSAPRKQSDVPERSIPITREEKPAIAGAQRK..... > ....crabbKYRIPADVDPLAITSSLSSDGVLTVNGPRKQASGPERTIPITREEKPAVT > AAPKKresiduesinquerysequencesresiduesinlibrarysequencesScomp > libtstartFriNovdoneFriNovTotalScantime.TotalDisplaytime.Func > tionusedwasFASTAversion.tNov > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From lstein at cshl.edu Fri Nov 14 17:20:40 2003 From: lstein at cshl.edu (Lincoln Stein) Date: Fri Nov 14 17:17:09 2003 Subject: [Bioperl-l] new to the group. 
And a quick demo of SVG::GD In-Reply-To: <200311132122.26300.ronan@roasp.com> References: <200311132122.26300.ronan@roasp.com> Message-ID: <200311141720.40286.lstein@cshl.edu> Hi Ronan, You should coordinate with Todd Harris, who has gotten very far along the identical path, up to the point where most of Bio::Graphics displays correctly in SVG. Lincoln On Thursday 13 November 2003 04:22 pm, Ronan Oger wrote: > Hi, > > My name is Ronan Oger, I am the lead developer of the SVG module. > > One of the focuses of my current work is SVG::GD, a wrapper for the GD > module to provide SVG (vector) output instead of raster. > > http://www.w3.org/Graphics/SVG/ > http://www.w3.org/TR/SVG > > I've been doing some tests with GD and GD derivatives, and since bioperl is > a fairly heavy user of GD, I have been testing around some bioperl code to > see how it works. > > There are some real issues in the SVG::GD at this point, but several people > are working on it and progress is being made. > > Clearly this is not production code at this stage. In particular, font > support is still very poor, and font positions are still broken. > > However, here is a bioperl-specific sample (you need an SVG-compliant > browser, such as IE with Adobe or Corel's SVG viewers installed. > > A png and its svg friend taken from a bio-related example on the net > ------------------------------------ > http://www.roasp.com/2003/11/13/ > > More prolific example comparisons > http://www.roasp.com/2003/11/11/ > > The SVG::GD module (version 0.07): > http://www.roasp.com/2003/11/11/SVG-GD-0.07.tar.gz > (This module has a dependency on SVG, which is on CPAN) > When it ripens, the module will live on CPAN. > > I'd appreciate some feedback, issues, etc. In particular, relating to the > module's usability. > > All the best, > > Ronan -- ======================================================================== Lincoln D. 
Stein                                 Cold Spring Harbor Laboratory
lstein@cshl.org                       Cold Spring Harbor, NY
========================================================================

From vesko_baev at abv.bg Sun Nov 16 06:07:21 2003
From: vesko_baev at abv.bg (Vesko Baev)
Date: Sun Nov 16 06:03:51 2003
Subject: [Bioperl-l] PCR
Message-ID: <2046208294.1068980841531.JavaMail.nobody@app1.ni.bg>

Hi All,
Does anyone have a module or algorithm for simulating a PCR reaction?

Thanks to ALL!
You can mail or send to: baev@pu.acad.bg

From heikki at nildram.co.uk Sun Nov 16 06:59:44 2003
From: heikki at nildram.co.uk (Heikki Lehvaslaiho)
Date: Sun Nov 16 06:56:14 2003
Subject: [Bioperl-l] PCR
In-Reply-To: <2046208294.1068980841531.JavaMail.nobody@app1.ni.bg>
References: <2046208294.1068980841531.JavaMail.nobody@app1.ni.bg>
Message-ID: <1068983984.2554.12.camel@localhost>

Could electronic-PCR simulate the aspect of PCR you have in mind?

http://www.ncbi.nlm.nih.gov/genome/sts/epcr.cgi

More sites, FTP, Ensembl wrappers etc. can be found using Google.

	-Heikki

On Sun, 2003-11-16 at 11:07, Vesko Baev wrote:
> Hi All,
> Did anyone have a module or algorithm for simulating a PCR reaction?
>
> Thanks to ALL!
> You can mail or send to: baev@pu.acad.bg

--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambs.
CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________

From redwards at utmem.edu Sun Nov 16 07:24:35 2003
From: redwards at utmem.edu (Rob Edwards)
Date: Sun Nov 16 07:21:02 2003
Subject: [Bioperl-l] PCR
In-Reply-To: <2046208294.1068980841531.JavaMail.nobody@app1.ni.bg>
Message-ID:

Yes. There is a PCRSimulation module that I wrote a while back, though it
doesn't appear to be in 1.3. I can add it if people want.

It's available from http://www.salmonella.org/bioperl/

Rob

On Sunday, November 16, 2003, at 11:07 AM, Vesko Baev wrote:

> Hi All,
> Did anyone have a module or algorithm for simulating a PCR reaction?
>
> Thanks to ALL!
> You can mail or send to: baev@pu.acad.bg

From Guido.Dieterich at gbf.de Mon Nov 17 02:56:27 2003
From: Guido.Dieterich at gbf.de (Guido Dieterich)
Date: Mon Nov 17 02:56:29 2003
Subject: [Bioperl-l] Blast
Message-ID: <1069055988.2623.1397.camel@SB289.gbf-braunschweig.de>

Dear BioPerl users,

I have a problem with BLAST.

A year ago I wrote a Perl script to perform a remote BLAST with the help of
the BioPerl module. Everything worked well.

Now this script does not work: instead of the BLAST result, I get only a
short string, part of the URL "post ... ", as the return value.

I also wrote a program in C to send a BLAST query to the NCBI remote
server. This program sometimes produces a BLAST result and sometimes does
not!

Any ideas?

Thanks in advance,

Guido

--
Dr.
Guido Dieterich, Dipl.-Biologe
BioComputing, SB - Strukturbiologie
GBF - Gesellschaft fuer Biotechnologische Forschung
German Research Centre for Biotechnology
Mascheroder Weg 1, D-38124 Braunschweig
WWW: http://www.gbf.de
EMAIL: gdi@gbf.de
Tel: +(49) 531 6181 745
FAX: +(49) 531 2612 388
http://struktur.gbf.de/

It is not enough to know, one must also apply.
It is not enough to want, one must also do.
JOHANN WOLFGANG VON GOETHE, German poet (1749 - 1832)

From Daniel.Lang at biologie.uni-freiburg.de Mon Nov 17 05:06:40 2003
From: Daniel.Lang at biologie.uni-freiburg.de (Daniel Lang)
Date: Mon Nov 17 05:03:00 2003
Subject: [Bioperl-l] Graphics:Panel /SeqFeature::Generic
Message-ID: <3FB89DB0.5070303@biologie.uni-freiburg.de>

Hi,

I want to generate overview graphics from BLAST reports in which the hits
are sorted and colored according to their E-values (>1e-10 --> green, ...).

I thought I could do this with a callback function for -bgcolor together
with the 'low_score' sort_order, but when applied to a BLAST report the hits
come out sorted yet all red.

I also tried storing the E-values as additional tags, as is done with
'bits' and 'range', but when I test for this tag in the callback (has_tag)
it is not available.

So I wonder whether the function is invoked for each hit in the while loop?
Here is the code snippet:

my $track = $panel->add_track(-glyph       => 'generic',
                              -label       => 1,
                              -connector   => 'dashed',
                              -height      => 5,
                              -bgcolor     => sub {
                                  my $feature = shift;
                                  my $evalue = $feature->score;
                                  if ($evalue < 1e-10) {return 'green';}
                                  else {return 'red';}} ,
                              -fontcolor   => 'green',
                              -font2color  => 'red',
                              -sort_order  => 'low_score',
                              -min_score   => '1e-1000',
                              -max_score   => '10000',
                              -description => sub {
                                  my $feature = shift;
                                  return unless $feature->has_tag('bits');
                                  my ($description) = $feature->each_tag_value('bits');
                                  my $score = $feature->score;
                                  my ($range) = $feature->each_tag_value('range');
                                  "Score=$description bits, E-value=$score, $range";
                              });

while( my $hit = $result->next_hit ) {
    my $evalue = $hit->significance;
    my $feature = Bio::SeqFeature::Generic->new(-score        => $evalue,
                                                -display_name => $hit->name,
                                                -tag          => {
                                                    'bits'  => $hit->bits,
                                                    'range' => "from ". $hit->start('query') . " to " . $hit->end('query'),
                                                },
                                               );
    while( my $hsp = $hit->next_hsp ) {
        $feature->add_sub_SeqFeature($hsp,'EXPAND');
    }

    $track->add_feature($feature);
}

Thanks in advance,
Daniel

From rkh at gene.com Mon Nov 17 19:19:26 2003
From: rkh at gene.com (Reece Hart)
Date: Mon Nov 17 19:17:13 2003
Subject: [Bioperl-l] location of non-bioperl code in the Bio:: tree
Message-ID: <1069114766.3454.71.camel@tallac>

Cheers, bioperl community. Thanks for your contributions to bioperl.

I'm about to package a set of perl modules which facilitate execution,
manipulation, and visualization of Prospect protein threading alignments.
Is there consensus (or, lacking that, rational opinion) about where bioperl
developers and community would like to stash non-bioperl code in the Bio::
tree?

The package uses bioperl, but is not bioperl compliant (i.e., there're no
pure virtual interfaces, I use a home-grown exception class, hash keys
don't have hyphen prefixes, etc).
For this reason, I'm strongly inclined to keep these modules together in one
tree rather than scattering them among bioperl's Bio::Tools, Bio::Align, and
perhaps other Bio:: subtrees where the dissimilarity of the API would be
confusing to users.

I've tentatively chosen to name these Bio::Prospect::{ Thread,
ThreadSummary, SOAPClient, Options, ... }. This is akin to Bio::MAGE:: and
is a tradeoff between isolation of related modules under one subtree but not
wanting to bury them under a lengthy name like
Bio::Interfaces::Prospect::Blah.pm.

Before I release these modules, I'm open to comments about where these
should go. If there's no strong dissent, I'll go with Bio::Prospect:: and
upload to CPAN in the very near future.

Comments?

Thanks,
Reece

--
Reece Hart, Ph.D.                       rkh@gene.com, http://www.gene.com/
Genentech, Inc.                         650/225-6133 (voice), -5389 (fax)
Bioinformatics and Protein Engineering
1 DNA Way, MS-93                        http://www.in-machina.com/~reece/
South San Francisco, CA 94080-4990      reece@in-machina.com, GPG: 0x25EC91A0

From jurgen.pletinckx at algonomics.com Tue Nov 18 06:08:57 2003
From: jurgen.pletinckx at algonomics.com (Jurgen Pletinckx)
Date: Tue Nov 18 05:57:15 2003
Subject: [Bioperl-l] RE: about pdb parsing
In-Reply-To: <3FAF5FE5.2050109@bioinfo.cu>
Message-ID:

| Hi Kris:
|
| I've known about you because of the examples scripts of PDB parsing
| that comes with the bioperl distribution. I'm interested on the research
| in this field and I've been trying to get some information about it. I
| would like to know if you could send me more examples of PDB file parsing
| using bioperl. Actually I've been trying to get the chain_id from pdb
| files but I just couldn't. I would really appreciate any help you can
| give me.

Dear Vladimir,

I'm sorry to say Kris has left AlgoNomics. As it happens, I can be of
assistance in this matter.

In attachment, you can find a sample script together with sample output.
The script runs through a directory of Brookhaven files, and prints some
info for each chain it finds.

I hope it comes close to what you require. If not, do ask again. I rather
like the module myself, and I'm glad to help. (Although it might be even
better to ask questions on the actual bioperl mailing list - I'm CC'ing
this reply).

--
Jurgen Pletinckx
AlgoNomics NV

From jurgen.pletinckx at algonomics.com Tue Nov 18 06:22:16 2003
From: jurgen.pletinckx at algonomics.com (Jurgen Pletinckx)
Date: Tue Nov 18 08:16:17 2003
Subject: [Bioperl-l] RE: about pdb parsing
In-Reply-To:
Message-ID:

# In attachment, you can find a sample script together with
# sample output.

And, as usual, the actual attachments come later. Sorry about that.

--
Jurgen Pletinckx
AlgoNomics NV
-------------- next part --------------
pdb1sw6.ent 1SW6 A       253 GLY-212 GLY-512
pdb1sw6.ent 1SW6 B       253 GLY-212 GLY-512
pdb1sw6.ent 1SW6 default 214 HOH-1   HOH-216
pdb1swa.ent 1SWA A       119 GLY-16  PRO-135
pdb1swa.ent 1SWA B       114 GLY-16  VAL-133
pdb1swa.ent 1SWA C       114 GLY-16  VAL-133
pdb1swa.ent 1SWA D       114 GLY-16  VAL-133
pdb1swa.ent 1SWA default 210 HOH-201 HOH-411
pdb1swb.ent 1SWB A       119 GLY-16  PRO-135
pdb1swb.ent 1SWB B       114 GLY-16  VAL-133
pdb1swb.ent 1SWB C       114 GLY-16  VAL-133
pdb1swb.ent 1SWB D       114 GLY-16  VAL-133
pdb1swb.ent 1SWB default 208 HOH-201 HOH-409
pdb1swc.ent 1SWC A       113 GLY-16  LYS-132
pdb1swc.ent 1SWC B       116 GLY-16  LYS-132
pdb1swc.ent 1SWC C       112 GLY-16  LYS-132
pdb1swc.ent 1SWC D       116 GLY-16  LYS-132
pdb1swc.ent 1SWC default 225 HOH-201 HOH-426
pdb1swd.ent 1SWD A       116 GLY-16  LYS-132
pdb1swd.ent 1SWD B       113 GLY-16  VAL-133
pdb1swd.ent 1SWD C       114 GLY-16  VAL-133
pdb1swd.ent 1SWD D       117 GLY-16  VAL-133
pdb1swd.ent 1SWD default 235 BTN-300 HOH-434
pdb1swe.ent 1SWE A       116 GLY-16  LYS-132
pdb1swe.ent 1SWE B       117 GLY-16  VAL-133
pdb1swe.ent 1SWE C       117 GLY-16  VAL-133
pdb1swe.ent 1SWE D       117 GLY-16  VAL-133
pdb1swe.ent 1SWE default 305 BTN-300 HOH-502
pdb1swf.ent 1SWF A       115 SER-52  SER-45
pdb1swf.ent 1SWF B       124 SER-52  SER-45
pdb1swf.ent 1SWF C       124 SER-52  SER-45
pdb1swf.ent 1SWF D       115 SER-52  SER-45
pdb1swf.ent 1SWF default 213 HOH-201 HOH-414
pdb1swg.ent 1SWG A       111 SER-52  ALA-46
pdb1swg.ent 1SWG B       111 SER-52  ALA-46
pdb1swg.ent 1SWG C       125 SER-52  ALA-46
pdb1swg.ent 1SWG D       112 GLU-51  SER-45
pdb1swg.ent 1SWG default 338 BTN-600 HOH-535
pdb1swh.ent 1SWH A       113 GLY-16  PRO-135
pdb1swh.ent 1SWH B       114 GLY-16  VAL-133
pdb1swh.ent 1SWH C       114 GLY-16  VAL-133
pdb1swh.ent 1SWH D       114 GLY-16  VAL-133
pdb1swh.ent 1SWH default 198 HOH-201 HOH-399
pdb1swi.ent 1SWI A       30  ARG-1   GLY-31
pdb1swi.ent 1SWI B       29  ARG-1   VAL-30
pdb1swi.ent 1SWI C       29  ARG-1   VAL-30
pdb1swi.ent 1SWI default 29  BNZ-100 HOH-78
pdb1swj.ent 1SWJ A       111 GLY-16  LYS-132
pdb1swj.ent 1SWJ B       117 GLY-16  VAL-133
pdb1swj.ent 1SWJ C       116 GLY-16  LYS-132
pdb1swj.ent 1SWJ D       112 GLY-16  VAL-133
pdb1swj.ent 1SWJ default 206 HOH-201 HOH-407
pdb1swk.ent 1SWK A       116 GLY-16  LYS-132
pdb1swk.ent 1SWK B       116 GLY-16  LYS-132
pdb1swk.ent 1SWK C       116 GLY-16  LYS-132
pdb1swk.ent 1SWK D       116 GLY-16  LYS-132
pdb1swk.ent 1SWK default 214 BTN-5100 HOH-6211
pdb1swl.ent 1SWL A       118 GLY-16  LYS-134
pdb1swl.ent 1SWL B       113 GLY-16  VAL-133
pdb1swl.ent 1SWL C       110 GLY-16  VAL-133
pdb1swl.ent 1SWL D       110 GLY-16  VAL-133
pdb1swl.ent 1SWL default 176 HOH-201 HOH-377
pdb1swm.ent 1SWM default 278 VAL-1   HOH-123
pdb1swn.ent 1SWN A       119 GLY-16  PRO-135
pdb1swn.ent 1SWN B       113 GLY-16  VAL-133
pdb1swn.ent 1SWN C       117 GLY-16  VAL-133
pdb1swn.ent 1SWN D       117 GLY-16  VAL-133
pdb1swn.ent 1SWN default 194 BTN-300 HOH-392
pdb1swo.ent 1SWO A       119 GLY-16  PRO-135
pdb1swo.ent 1SWO B       114 GLY-16  VAL-133
pdb1swo.ent 1SWO C       114 GLY-16  VAL-133
pdb1swo.ent 1SWO D       114 GLY-16  VAL-133
pdb1swo.ent 1SWO default 227 HOH-201 HOH-6228
pdb1swp.ent 1SWP A       116 GLY-16  LYS-132
pdb1swp.ent 1SWP B       116 GLY-16  LYS-132
pdb1swp.ent 1SWP C       116 GLY-16  LYS-132
pdb1swp.ent 1SWP D       116 GLY-16  LYS-132
pdb1swp.ent 1SWP default 188 BTN-300 HOH-385
pdb1swq.ent 1SWQ A       112 GLY-16  PRO-135
pdb1swq.ent 1SWQ B       113 GLY-16  VAL-133
pdb1swq.ent 1SWQ C       111 GLY-16  VAL-133
pdb1swq.ent 1SWQ D       113 GLY-16  VAL-133
pdb1swq.ent 1SWQ default 239 HOH-201 HOH-440
pdb1swr.ent 1SWR A       116 GLY-16  LYS-132
pdb1swr.ent 1SWR B       117 GLY-16  VAL-133
pdb1swr.ent 1SWR C       117 GLY-16  VAL-133
pdb1swr.ent 1SWR D       117 GLY-16  VAL-133
pdb1swr.ent 1SWR default 210 BTN-300 HOH-407
pdb1sws.ent 1SWS A       113 GLY-16  LYS-132
pdb1sws.ent 1SWS B       116 GLY-16  LYS-132
pdb1sws.ent 1SWS C       112 GLY-16  LYS-132
pdb1sws.ent 1SWS D       111 GLY-16  LYS-132
pdb1sws.ent 1SWS default 177 HOH-201 HOH-378
pdb1swt.ent 1SWT A       117 GLY-16  VAL-133
pdb1swt.ent 1SWT B       116 GLY-16  LYS-132
pdb1swt.ent 1SWT default 105 BTN-500 HOH-304
Exception found for input /xlv1/db/pdb/uncompressed_files/sw/pdb1swu.ent

------------- EXCEPTION -------------
MSG: A ANISOU record should have the same O0 as the previous record O2
STACK Bio::Structure::IO::pdb::_read_PDB_coordinate_section /usr/local/lib/perl5/site_perl/5.6.1/Bio/Structure/IO/pdb.pm:1277
STACK Bio::Structure::IO::pdb::next_structure /usr/local/lib/perl5/site_perl/5.6.1/Bio/Structure/IO/pdb.pm:459
STACK (eval) read.pl:16
STACK toplevel read.pl:13
--------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: read.pl
Type: application/octet-stream
Size: 655 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/bioperl-l/attachments/20031118/a4a3a70c/read.obj

From m_conte at hotmail.com Tue Nov 18 10:11:26 2003
From: m_conte at hotmail.com (matthieu CONTE)
Date: Tue Nov 18 10:07:45 2003
Subject: [Bioperl-l] need help with biopipe
Message-ID:

I'm trying to work with biopipe, using the example pipeline
Blast_file_pipeline.xml.

I run the command

"perl PipelineManager -xml /home/conte/xml/test_blast_pipe.xml -dbname biopipe -dbpass biopipe -dbuser biopipe"

in bioperl-pipeline/scripts and I get:

"Retrying........
Fetched 0 completed jobs
Going to snooze for 3 seconds...
Waking up and run again!
Fetching Jobs...
Fetched 1 incomplete jobs
"

Does anybody know what this means?
Matthieu CONTE
23 route d'EUS
66500 CATLLAR
Tel 0468962854
m_conte@hotmail.com

From harris at cshl.org Tue Nov 18 12:40:40 2003
From: harris at cshl.org (Todd Harris)
Date: Tue Nov 18 12:37:01 2003
Subject: [Bioperl-l] new to the group. And a quick demo of SVG::GD
In-Reply-To: <200311132122.26300.ronan@roasp.com>
Message-ID:

Hi Ronan -

Ha, great minds, right? This looks pretty good. I've been working on a
similar module that works exactly the same and maps almost all functions
into SVG output (using your SVG module). I've placed this in the GD
namespace as GD::SVG since that seems to more closely represent the intent
of the module.

You can check out a preliminary version of my module at
http://toddot.net/GD-SVG/GD-SVG.0.01.tgz

Docs:
http://toddot.net/GD-SVG/gd-svg.html

And some very preliminary test images based on Bio::Graphics and some
simple test scripts:
http://toddot.net/GD-SVG/test.png
http://toddot.net/GD-SVG/test.svg
http://toddot.net/GD-SVG/biographics-dynamic_glyphs.png
http://toddot.net/GD-SVG/biographics-dynamic_glyphs.svg
http://toddot.net/GD-SVG/biographics-lots.png
http://toddot.net/GD-SVG/biographics-lots.svg

These images are a little out-of-date. I've fixed many of the formatting
discrepancies already.

I've already added support for GD::SVG into bioperl, so perhaps we should
coordinate our efforts on the GD::SVG (or SVG::GD) module. In particular,
there are a number of kludges that need to be implemented (regarding font
sizes, positions, etc) to correctly map GD<->SVG output (particularly in
regards to Bio::Graphics).

Thanks,

todd

On Thu, 13 Nov 2003, Ronan Oger wrote:

> Hi,
>
> My name is Ronan Oger, I am the lead developer of the SVG module.
>
> One of the focuses of my current work is SVG::GD, a wrapper for the GD module
> to provide SVG (vector) output instead of raster.
>
> http://www.w3.org/Graphics/SVG/
> http://www.w3.org/TR/SVG
>
> I've been doing some tests with GD and GD derivatives, and since bioperl is a
> fairly heavy user of GD, I have been testing around some bioperl code to see
> how it works.
>
> There are some real issues in the SVG::GD at this point, but several people
> are working on it and progress is being made.
>
> Clearly this is not production code at this stage. In particular, font support
> is still very poor, and font positions are still broken.
>
> However, here is a bioperl-specific sample (you need an SVG-compliant browser,
> such as IE with Adobe or Corel's SVG viewers installed).
>
> A png and its svg friend taken from a bio-related example on the net
> ------------------------------------
> http://www.roasp.com/2003/11/13/
>
> More prolific example comparisons
> http://www.roasp.com/2003/11/11/
>
> The SVG::GD module (version 0.07):
> http://www.roasp.com/2003/11/11/SVG-GD-0.07.tar.gz
> (This module has a dependency on SVG, which is on CPAN)
> When it ripens, the module will live on CPAN.
>
> I'd appreciate some feedback, issues, etc. In particular, relating to the
> module's usability.
>
> All the best,
>
> Ronan
>
> --
> Ronan Oger
> http://www.roasp.com
> Serverside SVG Portal
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

From lstein at cshl.edu Tue Nov 18 13:53:34 2003
From: lstein at cshl.edu (Lincoln Stein)
Date: Tue Nov 18 13:50:28 2003
Subject: [Bioperl-l] Graphics:Panel /SeqFeature::Generic
In-Reply-To: <3FB89DB0.5070303@biologie.uni-freiburg.de>
References: <3FB89DB0.5070303@biologie.uni-freiburg.de>
Message-ID: <200311181353.34763.lstein@cshl.edu>

Hi Dan,

Try changing the "generic" glyph to "segments." The first glyph doesn't
know how to deal with subparts (such as HSPs), the second does.
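As a minimal sketch of the suggested change (not a tested fix for the original script), the track options from the quoted code can be written with the glyph switched to 'segments'. The names `evalue_color` and `@track_options` are hypothetical and chosen only for this illustration; the option names themselves are the ones used in the quoted script, and the sketch assumes each feature's score holds the hit E-value.

```perl
use strict;
use warnings;

# Colour callback: green for strong hits, red otherwise.
# Assumes the feature's score holds the hit E-value, as in the quoted script.
sub evalue_color {
    my $feature = shift;
    my $evalue  = $feature->score;
    return $evalue < 1e-10 ? 'green' : 'red';
}

# Track options with the glyph switched from 'generic' to 'segments',
# so that sub-features (the HSPs) are drawn as parts of each hit.
my @track_options = (
    -glyph      => 'segments',
    -label      => 1,
    -connector  => 'dashed',
    -height     => 5,
    -bgcolor    => \&evalue_color,
    -sort_order => 'low_score',
);

# With Bio::Graphics installed and a $panel already created, the options
# would be passed on unchanged:
#   my $track = $panel->add_track(@track_options);
```

Keeping the callback as a named sub makes it easy to check on its own, away from the panel, that it really returns 'green' for small E-values.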
Lincoln

On Monday 17 November 2003 05:06 am, Daniel Lang wrote:
> Hi,
> I want to generate overview graphics from BLAST reports, where the hits
> are sorted and colored (>1e-10 -->green, ...) according to their evalues...
>
> So I thought, I could solve this using a callback function for the
> bgcolor and using the 'low_score' sort_order, but when applied to a
> BLAST report, it results in sorted but only red hits?
> I also tried introducing the evalues as additional tags like done with
> 'bits' or 'range', but when testing for this tag in the callback
> (has_tag) it's not available?
> So I wonder if the function is invoked for each hit in the while loop?
>
> Here is the code snippet:
>
> my $track = $panel->add_track(-glyph => 'generic',
>                               -label => 1,
>                               -connector => 'dashed',
>                               -height => 5,
>                               -bgcolor => sub {
>                                   my $feature = shift;
>                                   my $evalue = $feature->score;
>                                   if ($evalue < 1e-10) {return 'green';}
>                                   else {return 'red';}}
>                               ,
>                               -fontcolor => 'green',
>                               -font2color => 'red',
>                               -sort_order => 'low_score',
>                               -min_score => '1e-1000',
>                               -max_score => '10000',
>                               -description => sub {
>                                   my $feature = shift;
>                                   return unless $feature->has_tag('bits');
>                                   my ($description) =
>                                       $feature->each_tag_value('bits');
>                                   my $score = $feature->score;
>                                   my ($range) =
>                                       $feature->each_tag_value('range');
>                                   "Score=$description bits, E-value=$score, $range";
>                               });
>
> while( my $hit = $result->next_hit ) {
>     my $evalue = $hit->significance;
>     my $feature = Bio::SeqFeature::Generic->new(-score => $evalue,
>                                                 -display_name => $hit->name,
>                                                 -tag => { 'bits' => $hit->bits,
>                                                           'range' => "from ". $hit->start('query') . " to " .
> $hit->end('query'),
>                                                         },
>                                                );
>     while( my $hsp = $hit->next_hsp ) {
>         $feature->add_sub_SeqFeature($hsp,'EXPAND');
>     }
>     $track->add_feature($feature);
> }
>
> Thanks in advance,
> Daniel

--
========================================================================
Lincoln D. Stein                           Cold Spring Harbor Laboratory
lstein@cshl.org                                   Cold Spring Harbor, NY
========================================================================

From shawnh at stanford.edu Wed Nov 19 00:31:46 2003
From: shawnh at stanford.edu (Shawn Hoon)
Date: Wed Nov 19 00:25:30 2003
Subject: [Bioperl-l] Re: [Bioperl-pipeline] need help with biopipe
In-Reply-To:
References:
Message-ID:

On Tuesday, November 18, 2003, at 7:11AM, matthieu CONTE wrote:

> I'm trying to work with biopipe , i'm using the example program:
> Blast_file_pipeline.xml
> I use the command
> "perl PipelineManager -xml /home/conte/xml/test_blast_pipe.xml -dbname
> biopipe -dbpass biopipe -dbuser biopipe"
> in bioperl-pipeline/scripts
> and I have :
> "Retrying........
> Fetched 0 completed jobs
> Going to snooze for 3 seconds...
> Waking up and run again!
> Fetching Jobs...
> Fetched 1 incomplete jobs
> "
> so ???Does anybody know what it means?
>

This is the output from PipelineManager that describes whether it is
fetching or running jobs, and so on.

PipelineManager cycles through a series of steps:

1) Fetch failed or new jobs in batches (you can specify the batch size in
   PipeConf.pm)
2) Run these jobs
3) Fetch completed jobs and move them from the job table into the
   completed_jobs table

To figure out whether the individual jobs are failing, you can look up the
job table for its status and look up the stderr log file to see any other
error messages.
shawn

>
>
>
> Matthieu CONTE
> 23 route d'EUS
> 66500 CATLLAR
> Tel
> 0468962854
> m_conte@hotmail.com
>
> _______________________________________________
> bioperl-pipeline mailing list
> bioperl-pipeline@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-pipeline

From chenn at cshl.edu Wed Nov 19 01:40:24 2003
From: chenn at cshl.edu (Jack Chen)
Date: Wed Nov 19 01:36:41 2003
Subject: [Bioperl-l] Bio::Tools::Phylo::Phylip::ProtDist module: warning message for bootstrapped data file
In-Reply-To:
Message-ID:

Hi Shawn,

While reading a data file from protdist for bootstrapped data, I
always get the warning message:

-------------------- WARNING ---------------------
MSG: The number of entries 28 is not the same 0
---------------------------------------------------

It looks like the $size variable is never set, for some reason. I know
that I can just ignore this and go ahead. But does this mean anything?

Thanks,

Jack

++++++++++++++++++++++++++++++++++++++++++++
 o-o     Jack Chen, Stein Laboratory
o---o    Cold Spring Harbor Laboratory
o----o   #5 Williams, 1 Bungtown Road
O----O   Cold Spring Harbor, NY, 11724
 0--o    Tel: 1 516 367 8394
  O      e-mail: chenn@cshl.org
 o-o     Website: http://www.wormbase.org
+++++++++++++++++++++++++++++++++++++++++++++

From ronan at roasp.com Wed Nov 19 13:51:26 2003
From: ronan at roasp.com (Ronan Oger)
Date: Wed Nov 19 13:47:32 2003
Subject: [Bioperl-l] new to the group. And a quick demo of SVG::GD
In-Reply-To:
References:
Message-ID: <200311191851.26014.ronan@roasp.com>

Hi Todd,

I see we're both running into the same problems. In particular the font
issue (this seems to be due to the way perl handles xs objects? It returns
an object that sometimes seems to be treated like a string).
Maybe someone in bioperl has advice about how to grab the GD namespace
without forsaking the use of the GD methods?

Please see my private mail regarding collaboration. I'm all for it.

I've installed your module and will run around in it a bit to get a better
world view of how you use it and of how much similarity there is between
SVG::GD and GD::SVG.

For now, feel free to take anything out of SVG::GD for your own use.

Ronan

On Tuesday 18 November 2003 17:40, Todd Harris wrote:
> Hi Ronan -
>
> Ha, great minds, right? This looks pretty good. I've been working on a
> similar module that works exactly the same and maps almost all functions
> into SVG output (using your SVG module). I've placed this in the GD
> namespace as GD::SVG since that seems to more closely represent the intent
> of the module.
>
> You can check out a preliminary version of my module at
> http://toddot.net/GD-SVG/GD-SVG.0.01.tgz
>
> Docs:
> http://toddot.net/GD-SVG/gd-svg.html
>
> And some very preliminary test images based on Bio::Graphics and some
> simple test scripts:
> http://toddot.net/GD-SVG/test.png
> http://toddot.net/GD-SVG/test.svg
> http://toddot.net/GD-SVG/biographics-dynamic_glyphs.png
> http://toddot.net/GD-SVG/biographics-dynamic_glyphs.svg
> http://toddot.net/GD-SVG/biographics-lots.png
> http://toddot.net/GD-SVG/biographics-lots.svg
>
> These images are a little out-of-date. I've fixed many of the formatting
> discrepancies already.
>
> I've already added support for GD::SVG into bioperl, so perhaps we should
> coordinate our efforts on the GD::SVG (or SVG::GD) module. In particular,
> there are a number of kludges that need to be implemented (regarding font
> sizes, positions, etc) to correctly map GD<->SVG output (particularly in
> regards to Bio::Graphics).
>
> Thanks,
>
> todd
>
> On Thu, 13 Nov 2003, Ronan Oger wrote:
> > Hi,
> >
> > My name is Ronan Oger, I am the lead developer of the SVG module.
> >
> > One of the focuses of my current work is SVG::GD, a wrapper for the GD
> > module to provide SVG (vector) output instead of raster.
> >
> > http://www.w3.org/Graphics/SVG/
> > http://www.w3.org/TR/SVG
> >
> > I've been doing some tests with GD and GD derivatives, and since bioperl
> > is a fairly heavy user of GD, I have been testing around some bioperl
> > code to see how it works.
> >
> > There are some real issues in the SVG::GD at this point, but several
> > people are working on it and progress is being made.
> >
> > Clearly this is not production code at this stage. In particular, font
> > support is still very poor, and font positions are still broken.
> >
> > However, here is a bioperl-specific sample (you need an SVG-compliant
> > browser, such as IE with Adobe or Corel's SVG viewers installed).
> >
> > A png and its svg friend taken from a bio-related example on the net
> > ------------------------------------
> > http://www.roasp.com/2003/11/13/
> >
> > More prolific example comparisons
> > http://www.roasp.com/2003/11/11/
> >
> > The SVG::GD module (version 0.07):
> > http://www.roasp.com/2003/11/11/SVG-GD-0.07.tar.gz
> > (This module has a dependency on SVG, which is on CPAN)
> > When it ripens, the module will live on CPAN.
> >
> > I'd appreciate some feedback, issues, etc. In particular, relating to the
> > module's usability.
> >
> > All the best,
> >
> > Ronan
> >
> > --
> > Ronan Oger
> > http://www.roasp.com
> > Serverside SVG Portal

--
Ronan Oger
http://www.roasp.com
Serverside SVG Portal

From Yue.Ke at usask.ca Wed Nov 19 15:28:44 2003
From: Yue.Ke at usask.ca (Yue Ke)
Date: Wed Nov 19 15:25:02 2003
Subject: [Bioperl-l] usage of Bio::SeqIO::tigr
Message-ID: <3FBBD27C.4010201@mail.usask.ca>

Greetings,

I am very interested in Bio::SeqIO::tigr.pm and would like to try it out
on a TIGR XML file. Could anyone tell me how to use it? Could I still
use the following:

my $SIO = Bio::SeqIO->new(-file=> $fn, '-format' => 'Tigr');

If so, SeqIO.pm may need to be modified. Where could I get the new version
of SeqIO.pm?

Thanks in advance!
Yue

From Yue.Ke at usask.ca Wed Nov 19 15:34:36 2003
From: Yue.Ke at usask.ca (Yue Ke)
Date: Wed Nov 19 15:30:57 2003
Subject: [Bioperl-l] (no subject)
Message-ID: <3FBBD3DC.2030404@mail.usask.ca>

From Yue.Ke at usask.ca Wed Nov 19 16:04:49 2003
From: Yue.Ke at usask.ca (Yue Ke)
Date: Wed Nov 19 16:01:03 2003
Subject: [Bioperl-l] usage of Bio::SeqIO::tigr
Message-ID: <3FBBDAF1.3080808@mail.usask.ca>

Greetings,

I am very interested in Bio::SeqIO::tigr.pm and would like to try it out
on a TIGR XML file. Could anyone tell me how to use it? Could I still
use the following:

my $SIO = Bio::SeqIO->new(-file=> $fn, '-format' => 'Tigr');

If so, SeqIO.pm must be modified. Where could I get the new version of
SeqIO.pm?

Thanks in advance!

Yue

From jason at cgt.duhs.duke.edu Wed Nov 19 20:41:02 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Nov 19 20:37:09 2003
Subject: [Bioperl-l] usage of Bio::SeqIO::tigr
In-Reply-To: <3FBBDAF1.3080808@mail.usask.ca>
References: <3FBBDAF1.3080808@mail.usask.ca>
Message-ID:

CVS:
http://cvs.open-bio.org
Or bioperl developer releases (1.3.X)

On Wed, 19 Nov 2003, Yue Ke wrote:

> Greetings,
> I am very interested in Bio::SeqIO::tigr.pm and would like to try it out
> on a TIGR XML file. Could anyone tell me how to use it? Could I still
> use the following:
> my $SIO = Bio::SeqIO->new(-file=> $fn, '-format' => 'Tigr');
> If so, SeqIO.pm must be modified. Where could I get the new version of
> SeqIO.pm?
>
> Thanks in advance!
>
> Yue

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From cjm at fruitfly.org Wed Nov 19 21:47:13 2003
From: cjm at fruitfly.org (Chris Mungall)
Date: Wed Nov 19 22:09:09 2003
Subject: [Bioperl-l] proposed additions to SeqFeatureI, RangeI and FeatureHolderI
Message-ID:

I have some proposed changes I would like to commit to bioperl, mostly for
using GFF3.

In both SeqFeatureI and SeqFeature::Generic I would like to add some
accessor methods. They would all map to tag-values.

  ID         - synonym for tag_value('ID')[0]
  ParentIDs  - synonym for tag_value('Parent')

and also

  add_ParentID
  remove_ParentID
  remove_ParentIDs

Question - should the method be Parent or ParentID? In GFF3, the tag is
"Parent". But an accessor method called "Parents()" feels like it should
return objects, so I think ParentIDs() is better.

Also, I realise it's contrary to bioperl convention to have method names
in caps, but it's nice to be consistent with the GFF3 tags.

I also notice that in SeqFeatureI we have an accessor definition and
implementation for "primary_id". There is no definition for this. I
propose either eliminating this, or making it a synonym of ID().

I think we need clearly defined semantics for these fields. I think the
semantics should be such that the ID should uniquely identify the feature.
This is problematic, as most sources don't issue a unique accession or
identifier for features. For example, genbank files provide a /gene for a
lot of features, but this isn't even unique, e.g. with multicopy genes.

In cases where the data source does not provide a unique ID, we may want a
way to generate them. So I think there should also be a method:

  generateID()

which sets the ID field to something that's guaranteed unique. I'm not
sure how. Perhaps a combination of the timestamp and the object memory
reference?
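One way to read that suggestion, purely as a sketch: combine the current time with the object's memory address, using only core Perl. `generate_id` is a hypothetical helper, not an existing or proposed bioperl method, and the per-process counter is an addition made here because a memory address can be reused once an object has been freed. Uniqueness across separate processes or machines is not addressed.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical helper, not part of bioperl: build an ID from the current
# time, the object's memory address, and a per-process counter.
# The counter is an extra safeguard added for this sketch, since an
# address may be reused after the object that held it is destroyed.
my $counter = 0;

sub generate_id {
    my ($feature) = @_;
    return join '-', 'feat', time(), refaddr($feature), ++$counter;
}

# Plain hash references stand in for feature objects here.
my $f1 = {};
my $f2 = {};
print generate_id($f1), "\n";
print generate_id($f2), "\n";
```

In a real implementation the generated string would presumably be stored in the feature's 'ID' tag; that step is left out because it depends on how the accessors above end up being defined.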
Because I'm lazy I'd rather do all this in SeqFeatureI - it all delegates
to existing methods. But I am unsure as to bioperl conventions regarding
when an 'interface' has implementation code.

----

I also want to add some code to FeatureHolderI, for dealing with the
"nesting hierarchy" in bioperl, i.e. features that contain other features.
The methods are:

  nest_features()

creates a feature nesting hierarchy based on the "ID" and "Parent" tags.
This is useful when parsing GFF3.

Also:

  flatten_features()

for flattening the nesting hierarchy (so top_SeqFeatures and
get_SeqFeatures return the same thing)

Also:

  set_ParentIDs_from_hierarchy()

This will go through the FeatureHolder hierarchy; any time it sees a
feature with subfeatures, it will set the children's "Parent" tag
according to the "ID" tag of the parent. If the parent does not have an
ID, one will be generated.

I particularly want this so I can take genbank files, feed them through
Bio::SeqFeature::Tools::Unflattener, call this method, then dump the
results as GFF3.

The one reservation I have about this is that there are two (easily
interchangeable) ways of dealing with hierarchies in bioperl. The
alternative is to do this conversion on the fly in the GFF3 adapter. But
this messes with people who want to get/set ID and Parent tags explicitly.

----

And nothing to do with the above code, I would like to add methods to
RangeI for interbase coordinates. Love em or hate em, these methods will
make some people's code easier at no cost to bioperl.

First the interbase equivalent of start/end:

  istart
  iend

Of course, iend is just a synonym for end, but it's nice for completeness.

This is the equivalent of chado fmin/fmax.

I would also like:

  ifrom
  ito

for interbase directional coordinates. This is equivalent to istart,iend
in the + strand, and the reverse of this in the - strand.

Let me know if there are any objections, otherwise I'll commit sometime
next week.
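The interbase conversions described above can be sketched in plain Perl. The function names `to_interbase` and `to_interbase_directional` are hypothetical and are not the proposed RangeI methods; the sketch assumes the usual 1-based, inclusive start/end on input and takes istart to be start minus one, which is how chado's fmin relates to a 1-based start.

```perl
use strict;
use warnings;

# Convert a 1-based, inclusive (start, end) range to interbase
# coordinates (istart, iend). iend is the same number as end.
sub to_interbase {
    my ($start, $end) = @_;
    return ($start - 1, $end);
}

# Directional interbase coordinates (ifrom, ito): the same as
# (istart, iend) on the + strand, and reversed on the - strand.
sub to_interbase_directional {
    my ($start, $end, $strand) = @_;
    my ($istart, $iend) = to_interbase($start, $end);
    return $strand < 0 ? ($iend, $istart) : ($istart, $iend);
}

my ($istart, $iend) = to_interbase(11, 20);
print "istart=$istart iend=$iend\n";    # istart=10 iend=20

my ($ifrom, $ito) = to_interbase_directional(11, 20, -1);
print "ifrom=$ifrom ito=$ito\n";        # ifrom=20 ito=10
```

A convenient property of the interbase form is that the length of a range is simply iend minus istart, with no off-by-one adjustment.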
From laurichj at bioinfo.ucr.edu Thu Nov 20 02:26:07 2003
From: laurichj at bioinfo.ucr.edu (Josh Lauricha)
Date: Thu Nov 20 02:22:21 2003
Subject: [Bioperl-l] usage of Bio::SeqIO::tigr
In-Reply-To:
References: <3FBBDAF1.3080808@mail.usask.ca>
Message-ID: <20031120072607.GA11223@bioinfo.ucr.edu>

To be annoyingly precise, SeqIO.pm doesn't need to be modified; tigr.pm
from CVS needs to be dumped into the SeqIO directory. It should work with
1.2.3, unless I've installed CVS unwittingly.

Yue, if you run into any bugs, please let me know; it's only been tested
locally on ath and osa. Oh, and the 1.0 release of osa is massively messed
up: somehow several exons are outside the gene, by something like 8000bp.
The module should skip these.

On Wed 11/19/03 20:41, Jason Stajich wrote:
> CVS:
> http://cvs.open-bio.org
> Or bioperl developer releases (1.3.X)
>
> On Wed, 19 Nov 2003, Yue Ke wrote:
>
> > Greetings,
> > I am very interested in Bio::SeqIO::tigr.pm and would like to try it out
> > on a TIGR XML file. Could anyone tell me how to use it? Could I still
> > use the following:
> > my $SIO = Bio::SeqIO->new(-file=> $fn, '-format' => 'Tigr');
> > If so, SeqIO.pm must be modified. Where could I get the new version of
> > SeqIO.pm?
> >
> > Thanks in advance!
> >
> > Yue
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu

--
----------------------------
| Josh Lauricha            |
| laurichj@bioinfo.ucr.edu |
| Bioinformatics, UCR      |
|--------------------------|

From Richard.Adams at ed.ac.uk Thu Nov 20 02:26:20 2003
From: Richard.Adams at ed.ac.uk (Richard Adams)
Date: Thu Nov 20 02:22:34 2003
Subject: [Bioperl-l] Blast
Message-ID: <3FBC6C9C.4060703@ed.ac.uk>

Hi,

If you post your script that no longer works someone might be able to help.

Richard

--
Dr Richard Adams
Bioinformatician,
Psychiatric Genetics Group,
Medical Genetics,
Molecular Medicine Centre,
Western General Hospital,
Crewe Rd West,
Edinburgh UK
EH4 2XU

Tel: 44 131 651 1084
richard.adams@ed.ac.uk

From iain.wallace at ucd.ie Thu Nov 20 06:21:53 2003
From: iain.wallace at ucd.ie (Iain Wallace)
Date: Thu Nov 20 06:17:51 2003
Subject: [Bioperl-l] Newbie question: Removing sequences from alignment
Message-ID:

Hi all,

I am quite new to all this and was wondering if anyone could help me. I am
trying to create a new alignment file from an old one excluding some
sequences and am not sure how to do it?

This is the script I am trying to modify; it writes from one alignment
file to another. I just don't know how to add new sequences to the new
alignment.
Thanks for any help
Iain

use Bio::AlignIO;

my $alignio = Bio::AlignIO->new(-format => 'clustalw',
                                -file   => '/home/iain/alignment1.aln');
my $alignnew = Bio::AlignIO->new(-format => 'clustalw',
                                 -file   => '>/home/iain/alignment2.aln');

while( my $aln = $alignio->next_aln ) {
    print "len=", $aln->length, " # residues=", $aln->no_residues, " ",
          " percent id=", $aln->percentage_identity, "\n";
    print "seqs are :\n";
    foreach my $seq ($aln->each_seq) {
        print "\t '", $seq->display_id(), "'\n";
    }
    print "---\n";
    $alignnew->write_aln($aln);
}

From brian_osborne at cognia.com Thu Nov 20 07:42:48 2003
From: brian_osborne at cognia.com (Brian Osborne)
Date: Thu Nov 20 07:41:44 2003
Subject: [Bioperl-l] Newbie question: Removing sequences from alignment
In-Reply-To:
Message-ID:

Iain,

That $aln that you're getting from AlignIO is a SimpleAlign object, so what
you want to do is take a look at the SimpleAlign documentation:

http://doc.bioperl.org/releases/bioperl-1.2/Bio/SimpleAlign.html

You mentioned both adding and subtracting sequences from the alignment. You
probably want to do the latter, since adding sequences simply adds them; it
doesn't align them. See that page for more detail.

Brian O.

-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Iain Wallace
Sent: Thursday, November 20, 2003 6:22 AM
To: bioperl-l@bioperl.org
Subject: [Bioperl-l] Newbie question: Removing sequences from alignment

Hi all,

I am quite new to all this and was wondering if anyone could help me. I am
trying to create a new alignment file from an old one excluding some
sequences and am not sure how to do it?

This is the script I am trying to modify; it writes from one alignment
file to another. I just don't know how to add new sequences to the new
alignment.
Thanks for any help
Iain

use Bio::AlignIO;

my $alignio = Bio::AlignIO->new(-format => 'clustalw',
                                -file   => '/home/iain/alignment1.aln');
my $alignnew = Bio::AlignIO->new(-format => 'clustalw',
                                 -file   => '>/home/iain/alignment2.aln');

while( my $aln = $alignio->next_aln ) {
    print "len=", $aln->length, " # residues=", $aln->no_residues, " ",
          " percent id=", $aln->percentage_identity, "\n";
    print "seqs are :\n";
    foreach my $seq ($aln->each_seq) {
        print "\t '", $seq->display_id(), "'\n";
    }
    print "---\n";
    $alignnew->write_aln($aln);
}

From basm101 at york.ac.uk Thu Nov 20 10:23:50 2003
From: basm101 at york.ac.uk (Bryony Mackenzie)
Date: Thu Nov 20 10:20:18 2003
Subject: [Bioperl-l] Capabilities of Bio::Tree
Message-ID: <3FBCDC86.8080303@york.ac.uk>

Hello,

I am interested in using the Bio::Tree module and would like to read in
Newick format and convert to a web-friendly graphical format. Ideally I
want to use colours and other options to prettify the tree. I have looked
at using the Treeplot program but it does not seem to distinguish properly
between my consensus trees with bootstrap values and trees with branch
lengths. It tries to interpret my bootstraps as branch lengths :(

I have read the docs but am still confused as to the abilities of
Bio::Tree. I see that handling phylogenetic trees has been put down as a
problem area for bioperl and I was wondering what progress is being made?
Thanks,
Bryony

From gaetan.droc at cirad.fr Tue Nov 18 09:47:03 2003
From: gaetan.droc at cirad.fr (Gaetan)
Date: Thu Nov 20 10:38:54 2003
Subject: [Bioperl-l] Drawing chromosomes in Generic Genome Browser
Message-ID: <3FBA30E7.9090806@cirad.fr>

Hello,

I am currently working with gbrowse to integrate/display various rice
annotations on TIGR pseudo-chromosomes. Everything is working well, and I am
trying to use the ideogram.pm module to draw chromosomes (from an email
previously posted by Lincoln Stein).

When I use this module, the centromeres and cytobands are not aligned.

Any help will be welcome!

Here are the picture and the GFF source I worked with.

Thanks,

Gaetan Droc

GFF source

1  TIGR  cytoband  1         16000000  .  +  .  CytoBand B3; Stain gpos25
1  TIGR  cytoband  16000001  18000000  .  +  .  Centromere Chr1
1  TIGR  cytoband  18000001  42904173  .  +  .  CytoBand B2; Stain gpos25

Picture

Chromosome
-------------- next part --------------
Skipped content of type multipart/related

From gaetan.droc at cirad.fr Tue Nov 18 11:16:34 2003
From: gaetan.droc at cirad.fr (Gaetan)
Date: Thu Nov 20 10:39:01 2003
Subject: [Bioperl-l] Drawing Chromosomes
Message-ID: <3FBA45E2.6040205@cirad.fr>

Hello,

I am currently working with gbrowse to integrate/display various rice
annotations on TIGR pseudo-chromosomes. Everything is working well, and I am
trying to use the ideogram.pm module to draw chromosomes (from an email
previously posted by Lincoln Stein).

When I use this module, the centromeres and cytobands are not aligned.

Any help will be welcome!

Here are the picture and the GFF source I worked with.

Thanks,

Gaetan Droc

GFF source

1  TIGR  cytoband  1         16000000  .  +  .  CytoBand B3; Stain gpos25
1  TIGR  cytoband  16000001  18000000  .  +  .  Centromere Chr1
1  TIGR  cytoband  18000001  42904173  .  +  .  CytoBand B2; Stain gpos25

Picture

Chromosome
-------------- next part --------------
Skipped content of type multipart/related

From dag at sonsorol.org Thu Nov 20 10:38:23 2003
From: dag at sonsorol.org (Chris Dagdigian)
Date: Thu Nov 20 10:39:07 2003
Subject: [Bioperl-l] Total OBF server shutdown Saturday November 22nd (all day EDT timezone)
Message-ID: <3FBCDFEF.1010807@sonsorol.org>

Hi folks,

Apologies for the massive cross-posting.

Our CVS, mailing list and web servers are located in a Cambridge, MA USA
datacenter belonging to Wyeth Research. Genetics Institute (which became part
of Wyeth) has supported our significant internet bandwidth and hosting needs
for many years, since the earliest versions of our open source efforts. Since
I have to do this massive cross-post anyway, I figured it was a good time to
thank them again in public.

The real reason for this message is to announce a 1-day period of significant
server downtime. The office floor & datacenter in the building where our
servers are hosted is going to have a planned electrical shutdown (including
emergency and backup power circuits) from 10am - 6pm on Saturday November
22nd.

I'll be manually bringing down our servers sometime before the 10am deadline.
The time estimate is conservative. In the event that the facility work takes
less time than expected, I'll probably take advantage of the window to
perform some server upgrades and failed disk replacements.

For any questions/concerns, or if you notice a server or service that is
still not available after the 22nd, please contact me directly at
'chris@bioteam.net' or 1-617-877-5498.
Regards,
Chris

From jason at cgt.duhs.duke.edu Thu Nov 20 12:13:40 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Thu Nov 20 12:09:51 2003
Subject: [Bioperl-l] Capabilities of Bio::Tree
In-Reply-To: <3FBCDC86.8080303@york.ac.uk>
References: <3FBCDC86.8080303@york.ac.uk>
Message-ID:

On Thu, 20 Nov 2003, Bryony Mackenzie wrote:

> Hello,
>
> I am interested in using the Bio::Tree module and would like to read in
> Newick format and convert to a web-friendly
> graphical format. Ideally I want to use colours and other options to
> prettify the tree.
>
> I have looked at using the Treeplot program but it does not seem to
> distinguish properly between my consensus trees
> with bootstrap values and trees with branch lengths. It tries to
> interpret my bootstraps as branch lengths :(

I have had no problems with treeplot; are you sure you are putting the labels
on correctly? My tree looks like this

((a:1,b:1)75:2,c:3);

and I get a bootstrap value of 75 labeled for the a,b clade.

Are you having trouble with trees that are converted to and from the bioperl
tree reading and then passed to treeplot?

As for colors, etc., that is also possible with treeplot
http://www.cnrs-gif.fr/pge/bioinfo/treeplot/index.php?lang=en#ancre_exemples
does it not work for you?

If someone wanted to work on a wrapper for Treeplot in Bioperl this would be
very helpful.

> I have read the docs but am still confused as to the abilities of
> Bio::Tree. I see that handling phylogenetic trees has been
> put down as a problem area for bioperl and I was wondering what progress
> is being made ?

We currently don't support tree drawing natively in bioperl - you must use a
3rd party application. Seems like an interesting problem for someone to work
on. Allen Day has some sort of SVG tree plotting output as well, but I am not
sure how primetime that is.

As for the current capabilities - do you want a list of all the supported
functionality or are you angling for something in particular other than the
drawing part?
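To make the label placement in the example tree concrete, here is a small sketch in plain Perl (no BioPerl or Treeplot involved) that pulls out what is written directly after each closing parenthesis. The function name internal_node_labels and the second test tree are assumptions for illustration, and quoted Newick labels are not handled.

```perl
use strict;
use warnings;

# Return one hash per labelled internal node: the label written after ')'
# (read here as bootstrap support) and the optional branch length after ':'.
sub internal_node_labels {
    my ($newick) = @_;
    my @nodes;
    while ($newick =~ /\)([^:;,()\s]+)(?::([0-9.eE+-]+))?/g) {
        push @nodes, { support => $1, length => $2 };
    }
    return @nodes;
}

# First tree: 75 is a node label. Second tree: 75 sits in the branch-length
# slot, so no internal node carries a label at all.
for my $tree ('((a:1,b:1)75:2,c:3);', '((a:1,b:1):75,c:3);') {
    my @nodes = internal_node_labels($tree);
    printf "%s -> %d labelled internal node(s)\n", $tree, scalar @nodes;
    for my $node (@nodes) {
        my $len = defined($node->{length}) ? $node->{length} : 'none';
        print "  support=$node->{support} length=$len\n";
    }
}
```

If a consensus tree file looks like the second form, a downstream program has no way to tell the bootstrap values from branch lengths, which may be what is happening here.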
I'm currently working on implementing some more statistics, random tree
generation, distance calculations, and some topology comparison stuff.

So in short - not much on the automated pretty output front, so we definitely
need people helping out there.

> Thanks,
> Bryony

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From pm66 at nyu.edu Thu Nov 20 13:50:05 2003
From: pm66 at nyu.edu (Philip MacMenamin)
Date: Thu Nov 20 13:46:21 2003
Subject: [Bioperl-l] AcePerl Makefile problem w perl 5.8.0
Message-ID: <200311201850.hAKIo7hE015202@mx3.nyu.edu>

Hi,

There seems to be some sort of a problem creating the Makefile for AcePerl
1.87 and 1.83 under perl 5.8.0 (w Linux, RH9). This does not occur with perl
5.6.1, 5.6.0 or 5.8.1 (from fedora).

Sample snippet from Makefile with perl 5.8.0:

//
installvendorlib='/usr/lib/perl5/vendor_perl/5.8.
INSTALLVENDORLIB = ib/perl5' installusrbinperl='def
INSTALLARCHLIB = /usr/lib/perl5/5.8.0/i386-linux-thread-multi
INSTALLSITEARCH = /usr/lib/pe
INSTALLVENDORARCH = /usr/lib/perl5/vendor_perl/5.8.0/i38
//

So, in this snippet there seem to be errors on each line, and there are other
errors throughout the file.

This can be solved by getting an AcePerl 1.87 RPM.

--
Philip MacMenamin

From lstein at cshl.edu Thu Nov 20 15:00:41 2003
From: lstein at cshl.edu (Lincoln Stein)
Date: Thu Nov 20 14:57:00 2003
Subject: [Bioperl-l] Drawing chromosomes in Generic Genome Browser
In-Reply-To: <3FBA30E7.9090806@cirad.fr>
References: <3FBA30E7.9090806@cirad.fr>
Message-ID: <200311201500.41108.lstein@cshl.edu>

Could you help with this?

Lincoln

On Tuesday 18 November 2003 09:47 am, Gaetan wrote:
> Hello,
> I am currently working with gbrowse to integrate/display various rice
> annotations on TIGR pseudo-chromosomes.
Everything is working well and I > try to play with > the ideogram.pm module to draw chromosomes (from a previously email > posted from Lincoln Stein) > > When I use this module centromeres and cytobands were not aligned. > > Any help will be welcome !!!! > > Here the picture and the GFF source I worked with > > Thanks, > > Gaetan Droc > > > GFF source > > 1 TIGR cytoband 1 16000000 . + . CytoBand B3; > Stain gpos25 > 1 TIGR cytoband 16000001 18000000 . + . > Centromere Chr1 > 1 TIGR cytoband 18000001 42904173 . + . CytoBand > B2; Stain gpos25 > > Picture > > Chromosome -- ======================================================================== Lincoln D. Stein Cold Spring Harbor Laboratory lstein@cshl.org Cold Spring Harbor, NY ======================================================================== From lstein at cshl.edu Thu Nov 20 20:24:36 2003 From: lstein at cshl.edu (Lincoln Stein) Date: Thu Nov 20 20:21:48 2003 Subject: [Bioperl-l] proposed additions to SeqFeatureI, RangeI and FeatureHolderI In-Reply-To: References: Message-ID: <200311202024.36691.lstein@cshl.edu> On Wednesday 19 November 2003 09:47 pm, Chris Mungall wrote: > I have some proposed changes I would like to commit to bioperl, mostly > for using GFF3. > > In both SeqFeatureI and SeqFeature::Generic I would like to add some > accessor methods. They would all map to tag-values. > > ID - synonym for tag_value('ID')[0] > ParentIDs - synonym for tag_value('Parent') I like this. > add_ParentID > remove_ParentID > remove_ParentIDs > > Question - should the method be Parent or ParentID? In GFF3, the tag > is "Parent". But an accessor method called "Parents()" feels like it > should return objects, so I think ParentIDs() is better. Do the methods return IDs or objects? If they're returning IDs, then the ParentID() name sounds right. > Also, I realise it's contrary to bioperl convention to have method > names in caps, but it's nice to be consistent with the GFF3 tags. 
If you want to be completely consistent with convention, how about get_ID() and get_ParentIDs()? I have a private convention that initial capitalized methods are autoloaded/autogenerated, but this is just me. > I also notice that in SeqFeatureI we have an accessor definition and > implementation for "primary_id". There is no definition for this. > > I propose either eliminating this, or making it a synonym of ID() Good with me. > I think we need clearly defined semantics for these fields. I think > the semantics should be such that the ID should uniquely identify the > feature. This is problemmatic, as most sources don't issue a unique > accession or identifier for features. For example, genbank files > provide a /gene for a lot of features, but this isn't even unique > e.g. with multicopy genes. In cases where the data source does not > provide a unique ID, we may want a way to generate them. So I think > there should also be a method: > > generateID() > > which sets the ID field to something that's guaranteed unique. I'm not > sure how. Perhaps a combination of the timestamp and the object memory > reference? I think there was a proposal for globally_unique_ID() at some point. Perhaps time to resurrect that thread? > Because I'm lazy I'd rather do all this in SeqFeatureI - it all > delegates to existing methods. But I am unsure as to bioperl > conventions regarding when an 'interface' has implementation code. Happy to see it. > > ---- > > I also want to add some code to FeatureHolderI, for dealing with the > "nesting hierarchy" in bioperl, i.e. features that contain other > features. > > The methods are: > > nest_features() > > creates a feature nesting hierarchy based on the "ID" and "Parent" > tags. This is useful when parsing GFF3. Yes, I like this. > > Also: > > flatten_features() > > for flattening the nesting hierarchy (so top_SeqFeatures and > get_SeqFeatures return the same thing) I like this too. 
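As a rough illustration of the nest_features() and flatten_features() ideas discussed above, here is a sketch over plain hashes. It is not the proposed FeatureHolderI code: the hash layout (id, parents, children) and the feature names gene1, mRNA1, exon1 and exon2 are assumptions made for this example.

```perl
use strict;
use warnings;

# Each feature is a plain hash with an 'id', an optional list of 'parents'
# (parent IDs, as in the GFF3 Parent tag) and a 'children' list that
# nest_features() fills in. Returns the features that have no known parent.
sub nest_features {
    my (@features) = @_;
    my %by_id = map { $_->{id} => $_ } grep { defined $_->{id} } @features;
    my @top;
    for my $f (@features) {
        my @parents = grep { exists $by_id{$_} } @{ $f->{parents} || [] };
        if (@parents) {
            push @{ $by_id{$_}{children} }, $f for @parents;
        }
        else {
            push @top, $f;
        }
    }
    return @top;
}

# Walk the hierarchy breadth-first and return every feature exactly once.
sub flatten_features {
    my (@top) = @_;
    my (@flat, %seen);
    my @queue = @top;
    while (my $f = shift @queue) {
        next if $seen{$f}++;    # a feature with two parents is listed once
        push @flat, $f;
        push @queue, @{ $f->{children} || [] };
    }
    return @flat;
}

my @features = (
    { id => 'gene1' },
    { id => 'mRNA1', parents => ['gene1'] },
    { id => 'exon1', parents => ['mRNA1'] },
    { id => 'exon2', parents => ['mRNA1'] },
);
my @top = nest_features(@features);
print "top-level: ", join(', ', map { $_->{id} } @top), "\n";
print "flattened: ", join(', ', map { $_->{id} } flatten_features(@top)), "\n";
```

Looking features up by ID first means the input order does not matter, which is convenient when a GFF3 file lists children before their parents.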
> > Also: > > set_ParentIDs_from_hierarchy() > > This will go through the FeatureHolder hierarchy; any time it sees a > feature with subfeatures, it will set the children's "Parent" tag > according to the "ID" tag of the parent. If the parent does not have > an ID, one will be generated. This sounds like an internal method that nobody should ever see in the API! > And nothing to do with the above code, I would like to add methods to > RangeI for interbase coordinates. Love em or hate em, these methods > will make some people's code easier at no cost to bioperl. > > First the interbase equivalent of start/end: > > istart > iend > > Of course, iend is just a synonym for end, but it's nice for > completion > > This is the equivalent of chado fmin/fmax. > > I would also like: > > ifrom > ito > > For interbase directional coordinates. This is equivalent to > istart,iend in the + strand, and the reverse of this in the - strand. I have no objection to these guys going into the Interface as the appropriate implemented methods. That way they'd be available everywhere. Lincoln > > Let me know if there's any objections, otherwise I'll commit sometime > next week. > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- Lincoln Stein lstein@cshl.edu Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) From lstein at cshl.edu Thu Nov 20 20:25:12 2003 From: lstein at cshl.edu (Lincoln Stein) Date: Thu Nov 20 20:22:54 2003 Subject: [Bioperl-l] Drawing Chromosomes In-Reply-To: <3FBA45E2.6040205@cirad.fr> References: <3FBA45E2.6040205@cirad.fr> Message-ID: <200311202025.12371.lstein@cshl.edu> Hi Gaetan, If you set bump to false (0), the centromere will probably be drawn in the proper place. 
Lincoln

On Tuesday 18 November 2003 11:16 am, Gaetan wrote:
> Hello,
> I am currently working with gbrowse to integrate/display various rice
> annotations on TIGR pseudo-chromosomes. Everything is working well and I
> try to play with
> the ideogram.pm module to draw chromosomes (from a previously email
> posted from Lincoln Stein)
>
> When I use this module centromeres and cytobands were not aligned.
>
> Any help will be welcome !!!!
>
> Here the picture and the GFF source I worked with
>
> Thanks,
>
> Gaetan Droc
>
>
> GFF source
>
> 1 TIGR cytoband 1 16000000 . + . CytoBand B3;
> Stain gpos25
> 1 TIGR cytoband 16000001 18000000 . + .
> Centromere Chr1
> 1 TIGR cytoband 18000001 42904173 . + . CytoBand
> B2; Stain gpos25
>
> Picture
> Chromosome

--
Lincoln Stein
lstein@cshl.edu
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)

From facemann at yahoo.com Thu Nov 20 21:22:08 2003
From: facemann at yahoo.com (Andy Hammer)
Date: Thu Nov 20 21:18:23 2003
Subject: [Bioperl-l] Searching the bioperl archive
Message-ID: <20031121022208.59330.qmail@web13405.mail.yahoo.com>

Is there a web interface to search the bioperl archive?

From lstein at cshl.edu Thu Nov 20 22:01:05 2003
From: lstein at cshl.edu (Lincoln Stein)
Date: Thu Nov 20 21:59:58 2003
Subject: [Bioperl-l] Re: bioperl Registry
In-Reply-To: <200311202116.24042.lstein@cshl.edu>
References: <200311201319.45681.heikki@ebi.ac.uk> <200311202116.24042.lstein@cshl.edu>
Message-ID: <200311202201.05526.lstein@cshl.edu>

I've fixed issues in the Registry, SeqFeature and GFF modules. The corresponding regression tests all pass with Perl 5.8.2 (built from source).
I've had to change Bio::Tools::GFF IO slightly. The module used to default to reading from ARGV if no filehandle or filename was passed to it in new(). However, this is no good when you don't want to read from a file but just write GFF output, and this was causing the SeqFeature regression test to hang. So I removed this default. In LocatableSeq.pm, there was the following odd bit of code in the start() method: $self->seq ? (return $self->{'start'} || 1) : undef; This is weird, because it means if there is no sequence attached to the LocatableSeq, it returns undef as the start coordinate. This was causing the GFF regression tests to fail. I replaced the line with a simple return $self->{'start'} || 1. Lincoln On Thursday 20 November 2003 09:16 pm, Lincoln Stein wrote: > I've got perl 5.8.2 installed now and can partially reproduce the problem. > I am working on it now. > > Lincoln > > On Thursday 20 November 2003 08:19 am, Heikki Lehvaslaiho wrote: > > Michele & Lincoln, > > > > Could I bother either of you two to have a look at the bioperl live > > Registry/ Bio::DB::Flat modules. There is something wrong in the > > BerkeleyDB version of flatdb and I can not get my head around it. > > > > Brian has been working on it but everything works under his cygwin > > system so he is stuck. I isolated the BDB tests from t/Registry.t into an > > other file and put it into bugzilla: > > http://bugzilla.bioperl.org/show_bug.cgi?id=1562 > > > > The tests fail in both Mandrake Linux 9.2+cooker (Perl 5.8.2) and RedHat > > Linux 9 (Perl 5.8.1), so it is definitely not just my system's problem. 
> >
> > Cheers,
> > -Heikki

--
Lincoln Stein
lstein@cshl.edu
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)

From wes.barris at csiro.au Thu Nov 20 23:19:32 2003
From: wes.barris at csiro.au (Wes Barris)
Date: Thu Nov 20 23:15:59 2003
Subject: [Bioperl-l] GFF file output missing semicolon
Message-ID: <3FBD9254.8080008@csiro.au>

Hi,

I have written a bioperl program that parses blast files and generates a gff
file. I have everything working except there is one small detail that I have
not been able to figure out. When generating each line of gff output, the
semicolon is left off at the end of the Accession name. Here is a sample line
from a gff file that I generated:

AF354168 mirseeker pred_miRNA 188152 188251 198 - . Note "mirseeker score 17.58" ; Accession "s-h_19_r_99330000-99363000"

Notice that:

1) There are three space characters after the note and the semicolon that
   occurs before "Accession".

2) At the end of the line, after the Accession, there are three space
   characters and no semicolon. Without that semicolon, the genome browser
   doesn't display the "rollover" information properly.

3) The "Note" field is written before the "Accession" field. I thought that
   the Accession should come first.

Here is the relevant portion of my code:

while( my $hsp = $hit->next_hsp ) {
    my $strand = 1;
    $strand = -1 if ($hsp->strand('query') == -1 || $hsp->strand('hit') == -1);
    my $feature = new Bio::SeqFeature::Generic(
        -source_tag=>$source,
        -primary_tag=>$feature_type,
        -start=>$hsp->start('hit'),
        -end=>$hsp->end('hit'),
        -score=>$hit->raw_score,
        -strand=>$strand,
        -tag=>{
            Accession=>$result->query_name,
            Note=>$result->query_description,
        }
    );
    $feature->seq_id($hit->accession);
    $gffio->write_feature($feature); #Bio::SeqFeatureI
}

Perhaps I am not adding the "Accession" and "Note" fields properly???
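One way to see what the attribute column would look like with a fixed tag order and a semicolon after every pair is to build it by hand. The sketch below is plain Perl and does not describe what Bio::Tools::GFF does internally; the function name gff2_attributes is hypothetical, and whether the genome browser really needs the trailing semicolon is taken from the report above, not verified. Values containing double quotes are not handled.

```perl
use strict;
use warnings;

# Build a GFF2-style attribute column from a flat list of tag/value pairs,
# keeping the order given and ending every pair (including the last) with ';'.
sub gff2_attributes {
    my (@pairs) = @_;
    my @parts;
    while (my ($tag, $value) = splice(@pairs, 0, 2)) {
        $value = '' unless defined $value;
        push @parts, qq{$tag "$value";};
    }
    return join ' ', @parts;
}

# Values taken from the sample GFF line in the message above.
print gff2_attributes(
    Accession => 's-h_19_r_99330000-99363000',
    Note      => 'mirseeker score 17.58',
), "\n";
```

Passing the pairs as a list rather than a hash is deliberate: Perl hashes have no defined key order, which would explain "Note" appearing before "Accession" when the tags are stored in a hash.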
--
Wes Barris
E-Mail: Wes.Barris@csiro.au

From jason at cgt.duhs.duke.edu Fri Nov 21 02:23:48 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Fri Nov 21 02:19:58 2003
Subject: [Bioperl-l] Searching the bioperl archive
In-Reply-To: <20031121022208.59330.qmail@web13405.mail.yahoo.com>
References: <20031121022208.59330.qmail@web13405.mail.yahoo.com>
Message-ID:

http://search.open-bio.org/

On Thu, 20 Nov 2003, Andy Hammer wrote:

> Is there a web interface to search the bioperl archive?

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From heikki at ebi.ac.uk Fri Nov 21 05:56:32 2003
From: heikki at ebi.ac.uk (Heikki Lehvaslaiho)
Date: Fri Nov 21 05:53:09 2003
Subject: [Bioperl-l] Re: bioperl Registry
In-Reply-To: <200311202201.05526.lstein@cshl.edu>
References: <200311201319.45681.heikki@ebi.ac.uk> <200311202116.24042.lstein@cshl.edu> <200311202201.05526.lstein@cshl.edu>
Message-ID: <200311211056.33935.heikki@ebi.ac.uk>

On Friday 21 Nov 2003 TT:01, Lincoln Stein wrote:
> I've fixed issues in the Registry, SeqFeature and GFF modules. The
> corresponding regression tests all pass with Perl 5.8.2 (built from
> source).

Great stuff! Thanks a million.

> I've had to change Bio::Tools::GFF IO slightly. The module used to default
> to reading from ARGV if no filehandle or filename was passed to it in
> new(). However, this is no good when you don't want to read from a file but
> just write GFF output, and this was causing the SeqFeature regression test
> to hang. So I removed this default.
>
> In LocatableSeq.pm, there was the following odd bit of code in the start()
> method:
>
> $self->seq ? (return $self->{'start'} || 1) : undef;
>
> This is weird, because it means if there is no sequence attached to the
> LocatableSeq, it returns undef as the start coordinate. This was causing
> the GFF regression tests to fail. I replaced the line with a simple return
> $self->{'start'} || 1.

That was my recent modification. I thought that would make sense. end()
returns undef. If there is no sequence and start() has not been set, why is
it necessary to assume that start equals 1?

I am happy to accept this but would like to understand why before I change
the t/Locatable.t.

-Heikki

> Lincoln
>
> On Thursday 20 November 2003 09:16 pm, Lincoln Stein wrote:
> > I've got perl 5.8.2 installed now and can partially reproduce the
> > problem. I am working on it now.
> >
> > Lincoln
> >
> > On Thursday 20 November 2003 08:19 am, Heikki Lehvaslaiho wrote:
> > > Michele & Lincoln,
> > >
> > > Could I bother either of you two to have a look at the bioperl live
> > > Registry/ Bio::DB::Flat modules. There is something wrong in the
> > > BerkeleyDB version of flatdb and I can not get my head around it.
> > >
> > > Brian has been working on it but everything works under his cygwin
> > > system so he is stuck. I isolated the BDB tests from t/Registry.t into
> > > an other file and put it into bugzilla:
> > > http://bugzilla.bioperl.org/show_bug.cgi?id=1562
> > >
> > > The tests fail in both Mandrake Linux 9.2+cooker (Perl 5.8.2) and
> > > RedHat Linux 9 (Perl 5.8.1), so it is definitely not just my system's
> > > problem.
> > >
> > > Cheers,
> > > -Heikki

--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambs. CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________

From lstein at cshl.edu Fri Nov 21 09:02:01 2003
From: lstein at cshl.edu (Lincoln Stein)
Date: Fri Nov 21 08:58:21 2003
Subject: [Bioperl-l] Re: bioperl Registry
In-Reply-To: <200311211056.33935.heikki@ebi.ac.uk>
References: <200311201319.45681.heikki@ebi.ac.uk> <200311202201.05526.lstein@cshl.edu> <200311211056.33935.heikki@ebi.ac.uk>
Message-ID: <200311210902.01086.lstein@cshl.edu>

> > In LocatableSeq.pm, there was the following odd bit of code in the
> > start() method:
> >
> > $self->seq ? (return $self->{'start'} || 1) : undef;
> >
> > This is weird, because it means if there is no sequence attached to the
> > LocatableSeq, it returns undef as the start coordinate. This was causing
> > the GFF regression tests to fail. I replaced the line with a simple
> > return $self->{'start'} || 1.
>
> That was my recent modification. I thought that would make sense. end()
> returns undef. If there is no sequence and start() has not been set, why is
> it necessary to assume that start equals 1?

In fact I would be happy with this:

    return $self->{'start'};

The GFF code is correctly setting the 'start' key in this case. The || 1 was
residual because I didn't understand what the intent of the code was. Perhaps
the logic you are looking for is:

    return $self->{'start'} if defined $self->{'start'};
    return 1 if $self->seq;
    return;

???

Lincoln

>
> I am happy to accept this but would like to understand why before I change
> the t/Locatable.t.
>
> -Heikki
>
> > Lincoln
> >
> > On Thursday 20 November 2003 09:16 pm, Lincoln Stein wrote:
> > > I've got perl 5.8.2 installed now and can partially reproduce the
> > > problem. I am working on it now.
> > >
> > > Lincoln
> > >
> > > On Thursday 20 November 2003 08:19 am, Heikki Lehvaslaiho wrote:
> > > > Michele & Lincoln,
> > > >
> > > > Could I bother either of you two to have a look at the bioperl live
> > > > Registry/ Bio::DB::Flat modules. There is something wrong in the
> > > > BerkeleyDB version of flatdb and I can not get my head around it.
> > > >
> > > > Brian has been working on it but everything works under his cygwin
> > > > system so he is stuck. I isolated the BDB tests from t/Registry.t
> > > > into an other file and put it into bugzilla:
> > > > http://bugzilla.bioperl.org/show_bug.cgi?id=1562
> > > >
> > > > The tests fail in both Mandrake Linux 9.2+cooker (Perl 5.8.2) and
> > > > RedHat Linux 9 (Perl 5.8.1), so it is definitely not just my system's
> > > > problem.
> > > >
> > > > Cheers,
> > > > -Heikki

--
Lincoln Stein
lstein@cshl.edu
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)

From heikki at ebi.ac.uk Fri Nov 21 10:03:59 2003
From: heikki at ebi.ac.uk (Heikki Lehvaslaiho)
Date: Fri Nov 21 10:00:36 2003
Subject: [Bioperl-l] Re: bioperl Registry
In-Reply-To: <200311210902.01086.lstein@cshl.edu>
References: <200311201319.45681.heikki@ebi.ac.uk> <200311211056.33935.heikki@ebi.ac.uk> <200311210902.01086.lstein@cshl.edu>
Message-ID: <200311211504.00333.heikki@ebi.ac.uk>

Lincoln,

The code you wrote below gives the same functionality as my previous version
but is easier to understand. Let's use it. I'll commit it.

-Heikki

On Friday 21 Nov 2003 TT:02, Lincoln Stein wrote:
> In fact I would be happy with this:
>
> return $self->{'start'};
>
> The GFF code is correctly setting the 'start' key in this case. The || 1
> was residual because I didn't understand what the intent of the code was.
> Perhaps the logic you are looking for is:
>
> return $self->{'start'} if defined $self->{'start'};
> return 1 if $self->seq;
> return;
>
> ???
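The three-line start() logic quoted above can be tried out in isolation. The sketch below uses a hypothetical stand-in class, ToyLocatable, not Bio::LocatableSeq itself, and only models the getter side of start().

```perl
use strict;
use warnings;

package ToyLocatable;    # hypothetical stand-in, not Bio::LocatableSeq

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

sub seq { return $_[0]{seq} }

# The three-line logic from the thread: an explicit start wins, a sequence
# without an explicit start begins at 1, and otherwise nothing is returned.
sub start {
    my ($self) = @_;
    return $self->{start} if defined $self->{start};
    return 1 if $self->seq;
    return;
}

package main;

my @cases = (
    [ 'start set, no sequence' => ToyLocatable->new(start => 42) ],
    [ 'sequence, no start'     => ToyLocatable->new(seq   => 'ACGT') ],
    [ 'neither'                => ToyLocatable->new() ],
);
for my $case (@cases) {
    my ($label, $obj) = @$case;
    my $start = $obj->start;
    printf "%-24s start = %s\n", $label, defined($start) ? $start : 'undef';
}
```

The first case is the one the thread says the GFF code relies on: the 'start' key is set even though no sequence is attached.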
> > Lincoln > > > I happy to accept this but would like to understand why before I change > > the t/Locatable.t. > > > > -Heikki > > > > > Lincoln > > > > > > On Thursday 20 November 2003 09:16 pm, Lincoln Stein wrote: > > > > I've got perl 5.8.2 installed now and can partially reproduce the > > > > problem. I am working on it now. > > > > > > > > Lincoln > > > > > > > > On Thursday 20 November 2003 08:19 am, Heikki Lehvaslaiho wrote: > > > > > Michele & Lincoln, > > > > > > > > > > Could I bother either of you two to have a look at the bioperl live > > > > > Registry/ Bio::DB::Flat modules. There is something wrong in the > > > > > BerkeleyDB version of flatdb and I can not get my head around it. > > > > > > > > > > Brian has been working on it but everything works under his cygwin > > > > > system so he is stuck. I isolated the BDB tests from t/Registry.t > > > > > into an other file and put it into bugzilla: > > > > > http://bugzilla.bioperl.org/show_bug.cgi?id=1562 > > > > > > > > > > The tests fail in both Mandrake Linux 9.2+cooker (Perl 5.8.2) and > > > > > RedHat Linux 9 (Perl 5.8.1), so it is definitely not just my > > > > > system's problem. > > > > > > > > > > Cheers, > > > > > -Heikki -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. 
CB10 1SD, United Kingdom _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________

From andreas.boehm at virchow.uni-wuerzburg.de Fri Nov 21 10:15:41 2003
From: andreas.boehm at virchow.uni-wuerzburg.de (Andreas Boehm)
Date: Fri Nov 21 10:19:08 2003
Subject: [Bioperl-l] MS: Calculation of theoretical spectra
Message-ID: <3FBE2C1D.2000900@virchow.uni-wuerzburg.de>

Hello List,

does there already exist a module for calculating a theoretical spectrum
from a given protein sequence that can be used for mass spectrometry?

regards,
Andreas M. Boehm

From iain.wallace at ucd.ie Fri Nov 21 10:31:09 2003
From: iain.wallace at ucd.ie (Iain Wallace)
Date: Fri Nov 21 10:27:55 2003
Subject: [Bioperl-l] [Bioperl-] Alignment Score for ClustalW using the factory
Message-ID:

Hi All,

I was wondering if anyone knows how to capture/record the alignment score
from a clustalw run when you use the Bio::Tools::Run::Alignment::Clustalw
module. It seems to me that the results only get sent to the screen.

Thanks for all your help

Iain

From m_conte at hotmail.com Fri Nov 21 10:44:28 2003
From: m_conte at hotmail.com (matthieu CONTE)
Date: Fri Nov 21 10:40:44 2003
Subject: [Bioperl-l] Still working with biopipe....
Message-ID:

Hi,

Still working with biopipe....
I'm now trying to create a pipeline de novo to find orthologues between
Oryza sativa (Os) and Arabidopsis thaliana (At) by BBMH (best BLAST mutual
hit), before developing something more efficient and more complicated.
So I started with a simple BLAST of one Os protein against the At protein
multi-FASTA file, using Bio::DB::Fasta and all the BioPerl methods needed
(looping over all the Os proteins instead of running one massive BLAST with
a chunk of Os proteins).
I would like to take a sequence from oriz_mfasta.txt (using the
get_Seq_by_id function) and BLAST it against arabido_mfasta.txt, and so on
for all the Oryza sequences. This is the first step.
But it's not working! Probably because the function of all the XML code I
am working with is not really clear to me (especially the tag !).
You will find the code and the biopipe output below.
Thanks in advance

Bio::Pipeline::Dumper Bio::DB::Fasta 2 STREAM INPUT new 1 $inputfile get_Seq_by_id INPUT 2 2 STREAM INPUT new 1 $inputfile get_all_ids 2 1 STREAM OUTPUT new 1 -dir $resultdir1 SCALAR 1 -module generic SCALAR 1 -prefix SCALAR INPUT 2 -format SCALAR gff 3 -file_suffix SCALAR gff 4 dump 2 OUTPUT ARRAY 1 protein_ids 1 setup_initial 1 protein_ids 2 Blast Bio::Pipeline::Runnable::Blast family $blastdb1 blastall $blastpath $blast_param1 -formatdb 1 -result_dir $resultdir1 1 2 NOTHING

And I obtain:

Creating biopipe
Loading Schema...
Reading Data_setup xml : /home/conte/xml/newhope.xml
Doing DBAdaptor and IOHandler setup
Doing Pipeline Flow Setup
Doing Analysis..

------------- EXCEPTION -------------
MSG: Need to store analysis first
STACK Bio::Pipeline::SQL::JobAdaptor::store /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/Pipeline/SQL/JobAdaptor.pm:459
STACK Bio::Pipeline::XMLImporter::_create_initial_input_and_job /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/Pipeline/XMLImporter.pm:837
STACK Bio::Pipeline::XMLImporter::run /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/Pipeline/XMLImporter.pm:484
STACK toplevel PipelineManager:120

Matthieu CONTE
M. Sc. in Bioinformatics from SIB
CIRAD
00 33 06.68.90.28.70
m_conte@hotmail.com
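
The best-mutual-hit step described in the message above can be sketched independently of Biopipe. This is only a minimal sketch, assuming the best hit for every query has already been extracted from the two BLAST runs (Os against At, and At against Os) into plain hashes. The function name mutual_best_hits, the hash names, and the protein identifiers are hypothetical and are not part of Biopipe or BioPerl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return [id_a, id_b] pairs in which each sequence is the other's best hit.
# Both arguments are hash references mapping a query ID to its best-hit ID.
sub mutual_best_hits {
    my ($a_to_b, $b_to_a) = @_;
    my @pairs;
    for my $a_id (sort keys %$a_to_b) {
        my $b_id = $a_to_b->{$a_id};
        next unless defined $b_id;
        # Look up the reverse direction; keep the pair only if it points back.
        my $back = $b_to_a->{$b_id};
        push @pairs, [$a_id, $b_id] if defined $back && $back eq $a_id;
    }
    return @pairs;
}

# Hypothetical best hits: rice (Os) queries against Arabidopsis (At), and back.
my %os_best = (Os_p1 => 'At_p7', Os_p2 => 'At_p3', Os_p3 => 'At_p3');
my %at_best = (At_p7 => 'Os_p1', At_p3 => 'Os_p3');

# Print one tab-separated pair per line.
print join("\t", @$_), "\n" for mutual_best_hits(\%os_best, \%at_best);
```

With the toy data, Os_p2 is dropped because its best hit At_p3 points back to Os_p3 rather than to Os_p2. Keeping this step separate from the BLAST runs means it can be tested on small hand-made hashes before the pipeline itself works.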
From lstein at cshl.edu Fri Nov 21 11:18:20 2003
From: lstein at cshl.edu (Lincoln Stein)
Date: Fri Nov 21 11:14:38 2003
Subject: [Bioperl-l] Re: bioperl Registry
In-Reply-To: <200311211504.00333.heikki@ebi.ac.uk>
References: <200311201319.45681.heikki@ebi.ac.uk> <200311210902.01086.lstein@cshl.edu> <200311211504.00333.heikki@ebi.ac.uk>
Message-ID: <200311211118.20821.lstein@cshl.edu>

Hi Heikki,

It's not the same functionality, though. Your version returned undef if
the sequence is undef even though "start" might be defined.

Lincoln

On Friday 21 November 2003 10:03 am, Heikki Lehvaslaiho wrote:
> Lincoln,
>
> The code you wrote below gives the same functionality than my previous
> version but is easier to understand. Let's use it. I'll commit it.
>
> -Heikki
>
> On Friday 21 Nov 2003 TT:02, Lincoln Stein wrote:
> > In fact I would be happy with this:
> >
> > return $self->{'start'};
> >
> > The GFF code is correctly setting the 'start' key in this case. The || 1
> > was residual because I didn't understand what the intent of the code was.
> > Perhaps the logic you are looking for is:
> >
> > return $self->{'start'} if defined $self->{'start'};
> > return 1 if $self->seq;
> > return;
> >
> > ???
> >
> > Lincoln
> >
> > > I happy to accept this but would like to understand why before I change
> > > the t/Locatable.t.
> > >
> > > -Heikki
> > >
> > > > Lincoln
> > > >
> > > > On Thursday 20 November 2003 09:16 pm, Lincoln Stein wrote:
> > > > > I've got perl 5.8.2 installed now and can partially reproduce the
> > > > > problem. I am working on it now.
> > > > >
> > > > > Lincoln
> > > > >
> > > > > On Thursday 20 November 2003 08:19 am, Heikki Lehvaslaiho wrote:
> > > > > > Michele & Lincoln,
> > > > > >
> > > > > > Could I bother either of you two to have a look at the bioperl
> > > > > > live Registry/ Bio::DB::Flat modules.
There is something wrong in
> > > > > > the BerkeleyDB version of flatdb and I can not get my head around
> > > > > > it.
> > > > > >
> > > > > > Brian has been working on it but everything works under his
> > > > > > cygwin system so he is stuck. I isolated the BDB tests from
> > > > > > t/Registry.t into an other file and put it into bugzilla:
> > > > > > http://bugzilla.bioperl.org/show_bug.cgi?id=1562
> > > > > >
> > > > > > The tests fail in both Mandrake Linux 9.2+cooker (Perl 5.8.2) and
> > > > > > RedHat Linux 9 (Perl 5.8.1), so it is definitely not just my
> > > > > > system's problem.
> > > > > >
> > > > > > Cheers,
> > > > > > -Heikki

--
========================================================================
Lincoln D. Stein                           Cold Spring Harbor Laboratory
lstein@cshl.org                            Cold Spring Harbor, NY
========================================================================

From heikki at ebi.ac.uk Fri Nov 21 11:20:25 2003
From: heikki at ebi.ac.uk (Heikki Lehvaslaiho)
Date: Fri Nov 21 11:16:44 2003
Subject: [Bioperl-l] Bioperl Developer snapshot 1.3.03
Message-ID: <200311211620.25306.heikki@ebi.ac.uk>

Bioperl developer snapshot 1.3.03
---------------------------------

This is the third developer snapshot from the BioPerl CVS head that will
eventually lead to release 1.4.

http://bioperl.org/DIST/current_core_unstable.tar.gz
http://bioperl.org/DIST/bioperl-1.3.03.tar.gz

Changes since 1.3.02
--------------------

A month is far too long a time between snapshots, but I found it difficult
to find time to write an overview of what has happened. Waiting made it
harder, of course, so I'll be able to just skim the top of the changes
made. See the latter part of the message for emails.

Bio::LocatableSeq now gives reasonable values to start() and end() without
manually setting them if the values can be derived from the sequence only.

Sequence database parsers now treat virus Bio::Species entries differently
from other taxons.
Since virus nomenclature does not follow the standard genus + species
format, calling binomial() on viruses is not advisable. The output will
merge group name and species name, which is usually not what you want.
This might need more work in the future.

Bio::SimpleAlign has new methods. Help appreciated there too. (see below)

If you really want, you can now add custom translation tables into
Bio::Tools::CodonTable and create Martian proteins.

Stefan has continued fine-tuning his Bio::Matrix::PSM modules.

A number of fixes have been added to the Bio::Graphics modules. Work is
under way to add SVG support.

Bio::Tools::SeqWords has a new method: count_overlap_words()

Remember: BPlite is getting superseded by SearchIO.

On behalf of the bioperl core team,

-Heikki

NEW DIRECTORIES and FILES
=========================

* AlignIO now supports MAF format
* SeqIO knows about KEGG and TIGR formats
* Bio/Tools/Analysis/Protein::ELM

for documentation
* two texts converted into SGML: Flat_Databases.sgml
* new HOWTO: SimpleWebAnalysis.sgml
* bioperl-live/doc/howto/txt - New directory for text-only versions of howtos

examples
* sirna/rnai_finder.cgi
* db/bioflat_index.pl

models
* popgen.dia

CHANGES
=======

+ Lots of fixes to tests
* tests now fail cleanly when run without network access

------------------------- details ---------------------------

Bio::Align::DNAStatistics
  code alignment formatting

Bio::AlignIO::bl2seq
  Johnathan Segal's fixes for bug #1541 - problem with reverse
  complement alignments in bl2seq

Bio::DB::Flat::BinarySearch
  More detail on secondary namespaces

Bio::DB::Flat
  Some -index value has to be passed, it's required

Bio::DB::GFF::Adaptor::biofetch
  changes making genbank2gff.pl use SOFA terms for type names in
  generated GFF3

Bio::DB::GFF::Aggregator
  fixed errors in the high-mag sequence alignments shown by the
  segments glyph

Bio::DB::GFF::Feature
  Reworked the following methods to more closely resemble the
  corresponding Bio::SeqFeatureI methods:
  - all_tags (alias get_all_tags)
  - gff_string
  - get_tag_values
  - aliased sub_SeqFeature to get_SeqFeatures

Bio::DB::GFF::Feature
  silence the uninitialized value error

Bio::DB::Registry
  The HOWTO says that one should be able to use 1 or more
  seqdatabase.ini files. This is right, since the administrator could
  put one in /etc/bioinformatics and I might want my own in
  /home/bosborne/.bioinformatics. The old code was reading 1 *ini file
  and skipping the rest in OBDA_SEARCH_PATH; now it reads all the files
  specified in OBDA_SEARCH_PATH, as well as the standard locations.
  ActiveState has no getpwuid() so AS users can use /home/bosborne

Bio::Graphics::FeatureFile
  - adding a symbol to access a feature's primary ID (eg, database PK)
  - remove uninitialized variable warning when calling features()
    without arguments
  - fixed frend web-based feature renderer to accommodate recent
    changes in FeatureFile API

Bio::Graphics::Glyph::diamond
  converted line-based outline to polygon calls

Bio::Graphics::Glyph::Factory
  preliminary support for SVG output using GD::SVG

Bio::Graphics::Panel
  preliminary support for SVG output using GD::SVG

Bio::Graphics::Glyph
  fixed errors in the high-mag sequence alignments shown by the
  segments glyph

Bio::Graphics::Glyph
  - preliminary support for SVG output using GD::SVG
  - polygon-based approach in filled_arrow to support SVG

Bio::Graphics::Glyph::generic
  - generalized some code to support SVG output

Bio::Graphics::Glyph::segments
  - added additional documentation for displaying multiple alignments
    with the segments glyph
  - fixed errors in the high-mag sequence alignments shown by the
    segments glyph
  - added a new "canonical_strand" option to the segments glyph

Bio::Graphics::Glyph::graded_segments
  Fixed Bio::SeqFeature::Generic so that it will accept a score of 0;
  modified Bio::Graphics::Glyph::graded_segment so that it draws a fg
  box around each segment by default (can restore default behavior
  with -vary_fg=>1)
Bio::Graphics::Glyph::triangle
  - more range checking on triangle glyph before fillToBorder call
  - try to fix GD buffer overrun in triangle glyph

Bio::Graphics::Glyph::xyplot
  removed function-oriented GD calls for compatibility with SVG output

Bio::Graphics::Pictogram
  support lowercase

Bio::LocatableSeq
  - start() and end() now return undef if there is no sequence string
  - silence a spurious warning arising from unset strand
  - fixed trunc() when strand is -1. Also made end() calculate its
    value based on the length of the sequence and start. No need to
    set end() explicitly any more.
  - Johnathan Segal's fixes for bug #1541 - problem with reverse
    complement alignments in bl2seq

Bio::SimpleAlign
  adding a parser and tests for UCSC maf (multiple alignment format)
  format. added a method SimpleAlign::splice_by_seq_pos to allow
  splicing of all sequences based on the gap locations of one sequence
  within the alignment. this could in principle be called repeatedly
  to remove all gaps from the MSA.

Bio::Matrix::PSM::InstanceSite PsmHeader
  synopsis and doc fixes

Bio::Matrix::PSM::IO::mast
  doc formatting fixes

Bio::Matrix::PSM::SiteMatrix SiteMatrixI
  get/set method added to access accession_number

Bio::Matrix::PSM::SiteMatrix
  Fixed bug Heikki pointed with the constructor when no input data for
  the vectors (A,G,C,T) is supplied. This is still a temp solution

Bio::Matrix::PSM::IO::mast
  sequence is unknown, but width is, so we supply it as 'NNN..'
  Accession number should be supplied as -accession_number

Bio::Matrix::PSM::InstanceSite
  Bug fix: start method was overriding LocatableSeq method, and it
  shouldn't, fixed.
Bio::Matrix::PSM::IO::transfac
  Throw exception if a position is not defined

Bio::Matrix::PSM::IO::mast meme transfac
  Capitalization fixed when rearranging in new

Bio::OntologyIO::dagflat
  - fixes to ontology regex to parse a greater subset of DAG-Edit
    files. i have tracked down the files where DAG-Edit IDs are
    validated: GOFlatFileAdapter.java
    the regex still only matches a subset of the allowed characters in
    an identifier. identifiers can be any non-whitespace, non ;$,:!\?
    characters > length 1 on either side of a : separator. i've opted
    to match \w+:\w+, hopefully we don't need to go beyond this.
    adding escape of SGML and newlines/tabs. is there a generic SGML
    escape module we want to add as a dependency?

Bio::OntologyIO
  adding escape of SGML and newlines/tabs. is there a generic SGML
  escape module we want to add as a dependency?

Bio::Ontology::Term
Bio::Phenotype::OMIM::OMIMentry OMIMparser
  finer parse the symptoms

Bio::PopGen Statistics
  update LD so that it will a) return a pair of values, LD and chiSQ.
  Also fix it so that composite_LD will calculate correctly with
  missing data

Bio::PrimarySeqI
  translate() can take in a custom codon table

Bio::RangeI
  Make it so 'disconnected_ranges' sub doesn't cause warnings

Bio::Restriction::Analysis
  Apply fix for bug #1548

Bio::Root::IO
  - cleanup of debugging a little for uniformity
  - In order for rmtree() to work in cygwin

Bio::SearchIO::blastxml
  blastxml expected and on the same line. my version of blastall puts
  them on different lines, which caused the parse to fail (from
  internal refactoring of and tags). this change fixes the bug. tests
  added to SearchIO.t and a test blastxml file added.

Bio::SearchIO::Writer::GbrowseGFF
  Gbrowse now allows tstart and tend tags for alignment features to
  make it more like normal GFF.
Bio::Seq::EncodedSeq
  fixed strandedness issues

Bio::SeqFeature::Generic
  It will accept a score of 0; modified
  Bio::Graphics::Glyph::graded_segment so that it draws a fg box
  around each segment by default (can restore default behavior with
  -vary_fg=>1)

Bio::SeqFeature::Tools::Unflattener
  reuses exons (eg containment graph not a tree)
  improved algorithm for matching mRNAs with CDSs

Bio::SeqIO
  alternate ABI extension for newer versions of software (requested by
  Jan Aerts)

Bio::SeqIO::swiss Bio::SeqIO::genbank Bio::SeqIO::embl
  resolving bugzilla #1519
  1. fixed sprintf bug sometimes leading to extra space after ID tag
  2. OS line output for viruses now contains all the information after
     species name. The complex strain/abbreviation/common name list is
     stored in sub_species() which was previously not in use for
     viruses. This is a hack but the (first) OS line now makes a
     perfect round trip.

Bio::SeqUtils
  translate_6frames() failed on sequences where bioperl would guess
  that the sequence string is protein. Streamlined coding of the
  method to avoid guessing.

Bio::SimpleAlign
  - offset location of new seq with features by location of original
    seq requested to build from.
  - added rudimentary key/value parsing for maf 'a' lines
  - run clean with -w on
  - cleaned up unit test spurious warnings.
  - bugfix in maf parser for detecting last record in file.
  - added functionality to trim gaps from a MSA for a given sequence
    to SimpleAlign. trimming allowed implementation of exporting Seq
    and SeqFeatures from SimpleAlign. the api here is still rough,
    comments appreciated.
  - added a method SimpleAlign::splice_by_seq_pos to allow splicing of
    all sequences based on the gap locations of one sequence within
    the alignment. this could in principle be called repeatedly to
    remove all gaps from the MSA.
Bio::Species
  commented out internal calls to methods not doing anything

Bio::Taxonomy
  clean up the rank sets

Bio::Tools::BPlite::Iteration
  have be set to '' instead of undef - perhaps this is not entirely
  the best thing - are we screwing up in the parsing instead? use
  Bio::SearchIO instead I guess

Bio::Tools::BPlite
  bug #1542 - improper detection of end of Query regexp

Bio::Tools::CodonTable
  if you know what you are doing you can add custom codon table

Bio::Tools::GFF
  - needed to move header parsing outside of next_feature, as it may
    be useful to handle sequences before sequence features (think
    database inserts).
  - adding support for parsing GFF ##sequence-region header lines.
    these are transformed into featureless Bio::LocatableSeq objects,
    available via the next_segment method.

Bio::Tools::Phylo::PAML
  silenced a warning reported in bugzilla #1560

Bio::Tools::Run::StandAloneBlast
  Allow SearchIO to be used for all output format types now with
  _READMETHOD set

Bio::Tools::SeqWords
  new method: count_overlap_words(), feature enhancement from
  bugzilla #1554

Bio::Tools::Signalp
  add the SignalP-HMM result.
    $feat->score;                                # Signal peptide probability
    # get_tag_values() returns a list, so index the list rather than
    # dereferencing the return value
    ($feat->get_tag_values('peptideProb'))[0];   # signalp peptide probability
    ($feat->get_tag_values('anchorProb'))[0];    # signalp anchor probability

/examples/biblio
  more biblio examples

INSTALL.WIN
  Bug 1451, PPM3 documentation wrong

scripts/Bio-DB-GFF/bp_genbank2gff.PLS
  changes making genbank2gff.pl use SOFA terms for type names in
  generated GFF3

scripts/Bio-DB-GFF/bulk_load_gff.PLS fast_load_gff.PLS pg_bulk_load_gff.PLS
  fixed a minor gff3 bug

scripts/Bio-DB-GFF/bulk_load_gff.PLS
  added support for dsn strings in the form of
  "dbi:mysql:database=xxx;host=xxx"

scripts/Bio-DB-GFF/bulk_load_gff.PLS
  added support for bulk loading from a local gff source to a remote
  db server

scripts/Bio-DB-GFF/fast_load_gff.PLS
  added an option for setting MAX_BIN

scripts/Bio-DB-GFF/bulk_load_gff.PLS pg_bulk_load_gff.PLS
  added option to set MAX_BIN, and updated the postgres loader to deal
  with gff3 (note that the gff3 stuff is completely untested though)

scripts/graphics/frend.PLS
  Bio::Graphics::FeatureFile: remove uninitialized variable warning
  when calling features() without arguments; fixed frend web-based
  feature renderer to accommodate recent changes in FeatureFile API

scripts/popgen/composite_LD.PLS
  - print with new API
  - fix to deal with newer API

scripts/utilities/search2gff.PLS
  output 'match' and 'component' lines for GFF dumping

From Jonathan_Epstein at nih.gov Fri Nov 21 11:27:57 2003
From: Jonathan_Epstein at nih.gov (Jonathan Epstein)
Date: Fri Nov 21 11:24:09 2003
Subject: [Bioperl-l] MS: Calculation of theoretical spectra
In-Reply-To: <3FBE2C1D.2000900@virchow.uni-wuerzburg.de>
Message-ID: <5.1.1.6.0.20031121112653.042f4570@nihexchange4.nih.gov>

We have something here ... not sure yet whether it's in good enough shape
to be BioPerl-ized, but I'm interested in doing so if no one else has done
it yet.

Jonathan

At 04:15 p.m.
11/21/2003 +0100, Andreas Boehm wrote: >Hello List, > >does there already exist a module for calculating a theoretical spectrum from a given protein sequence that can be used for mass spectrometry? > >regards, >Andreas M. Boehm > >_______________________________________________ >Bioperl-l mailing list >Bioperl-l@portal.open-bio.org >http://portal.open-bio.org/mailman/listinfo/bioperl-l Jonathan Epstein Jonathan_Epstein@nih.gov Head, Unit on Biologic Computation (301)402-4563 Office of the Scientific Director Bldg 31, Room 2A47 Nat. Inst. of Child Health & Human Development 31 Center Drive National Institutes of Health Bethesda, MD 20892-2425 From heikki at ebi.ac.uk Fri Nov 21 11:37:08 2003 From: heikki at ebi.ac.uk (Heikki Lehvaslaiho) Date: Fri Nov 21 11:33:39 2003 Subject: [Bioperl-l] Re: bioperl Registry In-Reply-To: <200311211118.20821.lstein@cshl.edu> References: <200311201319.45681.heikki@ebi.ac.uk> <200311211504.00333.heikki@ebi.ac.uk> <200311211118.20821.lstein@cshl.edu> Message-ID: <200311211637.08613.heikki@ebi.ac.uk> On Friday 21 Nov 2003 TT:18, Lincoln Stein wrote: > Hi Heikki, > > It's not the same functionality, though. Your version returned undef if > the sequence is undef even though "start" might be defined. Right. I did not check for that. Thanks again, -Heikki > Lincoln > > On Friday 21 November 2003 10:03 am, Heikki Lehvaslaiho wrote: > > Lincoln, > > > > The code you wrote below gives the same functionality than my previous > > version but is easier to understand. Let's use it. I'll commit it. > > > > -Heikki > > > > On Friday 21 Nov 2003 TT:02, Lincoln Stein wrote: > > > In fact I would be happy with this: > > > > > > return $self->{'start'}; > > > > > > The GFF code is correctly setting the 'start' key in this case. The || > > > 1 was residual because I didn't understand what the intent of the code > > > was. 
Perhaps the logic you are looking for is: > > > > > > return $self->{'start'} if defined $self->{'start'}; > > > return 1 if $self->seq; > > > return; > > > > > > ??? > > > > > > Lincoln > > > > > > > I happy to accept this but would like to understand why before I > > > > change the t/Locatable.t. > > > > > > > > -Heikki > > > > > > > > > Lincoln > > > > > > > > > > On Thursday 20 November 2003 09:16 pm, Lincoln Stein wrote: > > > > > > I've got perl 5.8.2 installed now and can partially reproduce the > > > > > > problem. I am working on it now. > > > > > > > > > > > > Lincoln > > > > > > > > > > > > On Thursday 20 November 2003 08:19 am, Heikki Lehvaslaiho wrote: > > > > > > > Michele & Lincoln, > > > > > > > > > > > > > > Could I bother either of you two to have a look at the bioperl > > > > > > > live Registry/ Bio::DB::Flat modules. There is something wrong > > > > > > > in the BerkeleyDB version of flatdb and I can not get my head > > > > > > > around it. > > > > > > > > > > > > > > Brian has been working on it but everything works under his > > > > > > > cygwin system so he is stuck. I isolated the BDB tests from > > > > > > > t/Registry.t into an other file and put it into bugzilla: > > > > > > > http://bugzilla.bioperl.org/show_bug.cgi?id=1562 > > > > > > > > > > > > > > The tests fail in both Mandrake Linux 9.2+cooker (Perl 5.8.2) > > > > > > > and RedHat Linux 9 (Perl 5.8.1), so it is definitely not just > > > > > > > my system's problem. > > > > > > > > > > > > > > Cheers, > > > > > > > -Heikki -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. 
CB10 1SD, United Kingdom _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468 ___ _/_/_/_/_/________________________________________________________ From lstein at cshl.edu Fri Nov 21 12:41:25 2003 From: lstein at cshl.edu (Lincoln Stein) Date: Fri Nov 21 12:37:38 2003 Subject: [Bioperl-l] GFF file output missing semicolon In-Reply-To: <3FBD9254.8080008@csiro.au> References: <3FBD9254.8080008@csiro.au> Message-ID: <200311211241.25744.lstein@cshl.edu> Hi, The GFF2 spec specifies that the semicolon separates tag/value pairs. It does not say that the last tag/value should be terminated by a semicolon. It also specifies that any amount of whitespace can occur around the semicolon. Lincoln On Thursday 20 November 2003 11:19 pm, Wes Barris wrote: > Hi, > > I have written a bioperl program that parses blast files and generates > a gff file. I have everything working except there is one small detail > that I have not been able to figure out. When generating each line > of gff output, the semicolon is left off at the end of the Accession > name. Here is a sample line from a gff file that I generated: > > AF354168 mirseeker pred_miRNA 188152 188251 198 - > . Note "mirseeker score 17.58" ; Accession > "s-h_19_r_99330000-99363000" > > Notice that: > > 1) There are three space characters after the note and the semicolon > that occurs before "Accession". > > 2) At the end of the line, after the Accession, there are three space > characters and no semicolon. Without that semicolon, the genome > browser doesn't display the "rollover" information properly. > > 3) The "Note" field is written before the "Accession" field. I thought > that the Accession should come first. 
> > Here is the relevant portion of my code: > > while( my $hsp = $hit->next_hsp ) { > my $strand = 1; > $strand = -1 if ($hsp->strand('query') == -1 || > $hsp->strand('hit') == -1); my $feature = new Bio::SeqFeature::Generic( > -source_tag=>$source, > -primary_tag=>$feature_type, > -start=>$hsp->start('hit'), > -end=>$hsp->end('hit'), > -score=>$hit->raw_score, > -strand=>$strand, > -tag=>{ > Accession=>$result->query_name, > Note=>$result->query_description, > } > ); > $feature->seq_id($hit->accession); > $gffio->write_feature($feature); #Bio::SeqFeatureI > } > > Perhaps I am not adding the "Accession" and "Note" fields properly??? -- ======================================================================== Lincoln D. Stein Cold Spring Harbor Laboratory lstein@cshl.org Cold Spring Harbor, NY ======================================================================== From jason at cgt.duhs.duke.edu Fri Nov 21 12:54:54 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Fri Nov 21 12:51:07 2003 Subject: [Bioperl-l] GFF file output missing semicolon In-Reply-To: <200311211241.25744.lstein@cshl.edu> References: <3FBD9254.8080008@csiro.au> <200311211241.25744.lstein@cshl.edu> Message-ID: I think that the gff2 dumping was not particularly good - I think I made some fixes to clean it up on the main trunk in the last few months. I can certainly dump with Tools:GFF and load into Gbrowse just fine with the current code. Wes you might try with bioperl 1.3.x series Bio::Tools::GFF instead. -jason On Fri, 21 Nov 2003, Lincoln Stein wrote: > Hi, > > The GFF2 spec specifies that the semicolon separates tag/value pairs. It does > not say that the last tag/value should be terminated by a semicolon. It also > specifies that any amount of whitespace can occur around the semicolon. > > Lincoln > > On Thursday 20 November 2003 11:19 pm, Wes Barris wrote: > > Hi, > > > > I have written a bioperl program that parses blast files and generates > > a gff file. 
I have everything working except there is one small detail > > that I have not been able to figure out. When generating each line > > of gff output, the semicolon is left off at the end of the Accession > > name. Here is a sample line from a gff file that I generated: > > > > AF354168 mirseeker pred_miRNA 188152 188251 198 - > > . Note "mirseeker score 17.58" ; Accession > > "s-h_19_r_99330000-99363000" > > > > Notice that: > > > > 1) There are three space characters after the note and the semicolon > > that occurs before "Accession". > > > > 2) At the end of the line, after the Accession, there are three space > > characters and no semicolon. Without that semicolon, the genome > > browser doesn't display the "rollover" information properly. > > > > 3) The "Note" field is written before the "Accession" field. I thought > > that the Accession should come first. > > > > Here is the relevant portion of my code: > > > > while( my $hsp = $hit->next_hsp ) { > > my $strand = 1; > > $strand = -1 if ($hsp->strand('query') == -1 || > > $hsp->strand('hit') == -1); my $feature = new Bio::SeqFeature::Generic( > > -source_tag=>$source, > > -primary_tag=>$feature_type, > > -start=>$hsp->start('hit'), > > -end=>$hsp->end('hit'), > > -score=>$hit->raw_score, > > -strand=>$strand, > > -tag=>{ > > Accession=>$result->query_name, > > Note=>$result->query_description, > > } > > ); > > $feature->seq_id($hit->accession); > > $gffio->write_feature($feature); #Bio::SeqFeatureI > > } > > > > Perhaps I am not adding the "Accession" and "Note" fields properly??? > > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From fsanchez at cifn.unam.mx Fri Nov 21 12:45:05 2003 From: fsanchez at cifn.unam.mx (=?ISO-8859-1?Q?Fabiola_S=E1nchez?=) Date: Fri Nov 21 19:38:22 2003 Subject: [Bioperl-l] How can i get tag Comments from Genbank file Message-ID: <3FBE4F21.9010205@cifn.unam.mx> Hello! 
I need your help.
I'm reading files in GenBank format and
I want to parse the second paragraph of the COMMENT.
How can I get this?
For example:

COMMENT     ------------------------------------------------------------------
            This SWISS-PROT entry is copyright. It is produced through a
            collaboration between the Swiss Institute of Bioinformatics and
            the EMBL outstation - the European Bioinformatics Institute. The
            original entry is available from http://www.expasy.ch/sprot and
            http://www.ebi.ac.uk/sprot
            -----------------------------------------------------------------.
            [CATALYTIC ACTIVITY] An alcohol + NAD(+) = an aldehyde or ketone +
            NADH. [COFACTOR] Binds 1 zinc ion per subunit (By similarity).
            [SIMILARITY] Belongs to the zinc-containing alcohol dehydrogenase
            family.

I need to get:

            -----------------------------------------------------------------.
            [CATALYTIC ACTIVITY] An alcohol + NAD(+) = an aldehyde or ketone +
            NADH. [COFACTOR] Binds 1 zinc ion per subunit (By similarity).
            [SIMILARITY] Belongs to the zinc-containing alcohol dehydrogenase
            family.

Thank you.

Fabi

From jdonner at cs.nmsu.edu Sat Nov 22 01:23:43 2003
From: jdonner at cs.nmsu.edu (Jeff Donner)
Date: Sat Nov 22 01:19:29 2003
Subject: [Bioperl-l] How to distinguish pdb Helix from Sheet?
Message-ID: <3FBF00EF.4050106@cs.nmsu.edu>

Hi,

How can you tell which chains are HELIX and which SHEET
after you've read a pdb file with Bio::Structure::IO?
Is it possible even?
Thanks,
Jeff Donner

From jason at cgt.duhs.duke.edu Sat Nov 22 07:10:50 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Sat Nov 22 07:07:31 2003
Subject: [Bioperl-l] How can i get tag Comments from Genbank file
In-Reply-To: <3FBE4F21.9010205@cifn.unam.mx>
References: <3FBE4F21.9010205@cifn.unam.mx>
Message-ID:

    use Bio::SeqIO;

    my $seqio = Bio::SeqIO->new(-format => 'genbank',
                                -file   => 'yourfile.gb');
    my $seq = $seqio->next_seq;
    if( $seq ) {
        print "comments are:\n";
        # get the comment annotations as an array
        for my $comm ( $seq->annotation->get_Annotations('comment') ) {
            print $comm->text();
        }
    }

See Bio::Seq, Bio::Annotation::Collection, and Bio::Annotation::Comment
for more information.

-jason

On Fri, 21 Nov 2003, [ISO-8859-1] Fabiola Sánchez wrote:

> Hello!
> i need your help
> I'm reading files in genbank format
> i want to parser the second paragraph of the COMMENT
> how can i get this ?
> for example
> COMMENT
> ------------------------------------------------------------------
> This SWISS-PROT entry is copyright. It is produced through a
> collaboration between the Swiss Institute of Bioinformatics and
> the EMBL outstation - the European Bioinformatics Institute. The
> original entry is available from http://www.expasy.ch/sprot and
> http://www.ebi.ac.uk/sprot
>
> -----------------------------------------------------------------.
> [CATALYTIC ACTIVITY] An alcohol + NAD(+) = an aldehyde or
> ketone +
> NADH. [COFACTOR] Binds 1 zinc ion per subunit (By similarity).
> [SIMILARITY] Belongs to the zinc-containing alcohol
> dehydrogenase
> family.
>
> I need get :
> -----------------------------------------------------------------.
> [CATALYTIC ACTIVITY] An alcohol + NAD(+) = an aldehyde or
> ketone +
> NADH. [COFACTOR] Binds 1 zinc ion per subunit (By similarity).
> [SIMILARITY] Belongs to the zinc-containing alcohol
> dehydrogenase
> family.
>
>
> Thank you.
>
> Fabi
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From vesko_baev at abv.bg Sat Nov 22 17:06:38 2003
From: vesko_baev at abv.bg (Vesko Baev)
Date: Sun Nov 23 18:13:25 2003
Subject: [Bioperl-l] STDERR?
Message-ID: <542205707.1069538798363.JavaMail.nobody@storage.ni.bg>

Hi,
Does anyone know how to make my CGI display errors in the browser instead of in the error log? Because I do not have access to the error log, all I see is Internal Error 500, and I do not know where the error in my script is.

Thanks to all.
Vesko

From wes.barris at csiro.au Sun Nov 23 19:07:55 2003
From: wes.barris at csiro.au (Wes Barris)
Date: Sun Nov 23 19:14:48 2003
Subject: [Bioperl-l] STDERR?
In-Reply-To: <542205707.1069538798363.JavaMail.nobody@storage.ni.bg>
References: <542205707.1069538798363.JavaMail.nobody@storage.ni.bg>
Message-ID: <3FC14BDB.7010107@csiro.au>

Vesko Baev wrote:
> Hi,
> Does anyone know how to make my CGI display errors in the browser instead of in the error log? Because I do not have access to the error log, all I see is Internal Error 500, and I do not know where the error in my script is.
>
> Thanks to all.
> Vesko

use CGI::Carp 'fatalsToBrowser';

--
Wes Barris
E-Mail: Wes.Barris@csiro.au

From wes.barris at csiro.au Sun Nov 23 19:26:27 2003
From: wes.barris at csiro.au (Wes Barris)
Date: Sun Nov 23 19:33:13 2003
Subject: [Bioperl-l] GFF file output missing semicolon
In-Reply-To: <200311211241.25744.lstein@cshl.edu>
References: <3FBD9254.8080008@csiro.au> <200311211241.25744.lstein@cshl.edu>
Message-ID: <3FC15033.7030500@csiro.au>

Lincoln Stein wrote:
> Hi,
>
> The GFF2 spec specifies that the semicolon separates tag/value pairs. It does
> not say that the last tag/value should be terminated by a semicolon. It also
> specifies that any amount of whitespace can occur around the semicolon.

Ok, fair enough.
But then, gbrowse appears to not be able to handle this format properly. I know that I must be wrong about this but this is what I am seeing. Here is a gff line as created by Bio::Tools::GFF: AF354168 blast s-m-100-10 61437 61530 186 - . Note "QRNA Feature sheep vs. mouse RNA logoddspost=14.021" ; Accession "sheep_#25_61538..61445" Note that there is a lot of wrapping going on when displayed in this message. If I load this file (using fast_load_gff.pl) into a mysql database and view with gbrowse, there are two problems: 1) The accession is displayed above the item inside double quotes like this: "sheep_#25_61538..61445". 2) When mousing over the item, neither the accession nor the start and end are displayed. Instead all I see is the track key: QRNA Sheep-Mouse 100-10: If I manually add a semi-colon after the accession at the end of each line of the gff file and load that into the mysql database, gbrowse proplerly displays these two items like this: sheep_#25_61538..61445 (note no double quote marks any more) QRNA Sheep-Mouse 100-10: sheep_#25_61538..61445 AF354168: 61437..61530 > > Lincoln > > On Thursday 20 November 2003 11:19 pm, Wes Barris wrote: > >>Hi, >> >>I have written a bioperl program that parses blast files and generates >>a gff file. I have everything working except there is one small detail >>that I have not been able to figure out. When generating each line >>of gff output, the semicolon is left off at the end of the Accession >>name. Here is a sample line from a gff file that I generated: >> >>AF354168 mirseeker pred_miRNA 188152 188251 198 - >> . Note "mirseeker score 17.58" ; Accession >>"s-h_19_r_99330000-99363000" >> >>Notice that: >> >>1) There are three space characters after the note and the semicolon >> that occurs before "Accession". >> >>2) At the end of the line, after the Accession, there are three space >> characters and no semicolon. Without that semicolon, the genome >> browser doesn't display the "rollover" information properly. 
>> >>3) The "Note" field is written before the "Accession" field. I thought >> that the Accession should come first. >> >>Here is the relevant portion of my code: >> >> while( my $hsp = $hit->next_hsp ) { >> my $strand = 1; >> $strand = -1 if ($hsp->strand('query') == -1 || >>$hsp->strand('hit') == -1); my $feature = new Bio::SeqFeature::Generic( >> -source_tag=>$source, >> -primary_tag=>$feature_type, >> -start=>$hsp->start('hit'), >> -end=>$hsp->end('hit'), >> -score=>$hit->raw_score, >> -strand=>$strand, >> -tag=>{ >> Accession=>$result->query_name, >> Note=>$result->query_description, >> } >> ); >> $feature->seq_id($hit->accession); >> $gffio->write_feature($feature); #Bio::SeqFeatureI >> } >> >>Perhaps I am not adding the "Accession" and "Note" fields properly??? > > -- Wes Barris E-Mail: Wes.Barris@csiro.au From wes.barris at csiro.au Sun Nov 23 19:35:46 2003 From: wes.barris at csiro.au (Wes Barris) Date: Sun Nov 23 19:42:30 2003 Subject: [Bioperl-l] GFF file output missing semicolon In-Reply-To: References: <3FBD9254.8080008@csiro.au> <200311211241.25744.lstein@cshl.edu> Message-ID: <3FC15262.4090502@csiro.au> Jason Stajich wrote: > I think that the gff2 dumping was not particularly good - I think I made > some fixes to clean it up on the main trunk in the last few months. I can > certainly dump with Tools:GFF and load into Gbrowse just fine with > the current code. Wes you might try with bioperl 1.3.x series > Bio::Tools::GFF instead. I could try but this is running on a production server and I had a heck of a time trying to find a working combination of bioperl and gbrowse that would work together. I think that at the time, the only combo I could get to work was bioperl-1.2.2 and gbrowse-1.50. If I installed bioperl-1.2.3, what version of gbrowse is guaranteed to work with that? > > -jason > > On Fri, 21 Nov 2003, Lincoln Stein wrote: > > >>Hi, >> >>The GFF2 spec specifies that the semicolon separates tag/value pairs. 
It does >>not say that the last tag/value should be terminated by a semicolon. It also >>specifies that any amount of whitespace can occur around the semicolon. >> >>Lincoln >> >>On Thursday 20 November 2003 11:19 pm, Wes Barris wrote: >> >>>Hi, >>> >>>I have written a bioperl program that parses blast files and generates >>>a gff file. I have everything working except there is one small detail >>>that I have not been able to figure out. When generating each line >>>of gff output, the semicolon is left off at the end of the Accession >>>name. Here is a sample line from a gff file that I generated: >>> >>>AF354168 mirseeker pred_miRNA 188152 188251 198 - >>> . Note "mirseeker score 17.58" ; Accession >>>"s-h_19_r_99330000-99363000" >>> >>>Notice that: >>> >>>1) There are three space characters after the note and the semicolon >>> that occurs before "Accession". >>> >>>2) At the end of the line, after the Accession, there are three space >>> characters and no semicolon. Without that semicolon, the genome >>> browser doesn't display the "rollover" information properly. >>> >>>3) The "Note" field is written before the "Accession" field. I thought >>> that the Accession should come first. >>> >>>Here is the relevant portion of my code: >>> >>> while( my $hsp = $hit->next_hsp ) { >>> my $strand = 1; >>> $strand = -1 if ($hsp->strand('query') == -1 || >>>$hsp->strand('hit') == -1); my $feature = new Bio::SeqFeature::Generic( >>> -source_tag=>$source, >>> -primary_tag=>$feature_type, >>> -start=>$hsp->start('hit'), >>> -end=>$hsp->end('hit'), >>> -score=>$hit->raw_score, >>> -strand=>$strand, >>> -tag=>{ >>> Accession=>$result->query_name, >>> Note=>$result->query_description, >>> } >>> ); >>> $feature->seq_id($hit->accession); >>> $gffio->write_feature($feature); #Bio::SeqFeatureI >>> } >>> >>>Perhaps I am not adding the "Accession" and "Note" fields properly??? 
>> >> > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu -- Wes Barris E-Mail: Wes.Barris@csiro.au From harris at cshl.org Sun Nov 23 21:00:59 2003 From: harris at cshl.org (Todd Harris) Date: Sun Nov 23 21:07:38 2003 Subject: [Bioperl-l] GD/SVG (was: new to the group) In-Reply-To: <200311191851.26014.ronan@roasp.com> Message-ID: Hi Ronan - GD::SVG is now on CPAN! My first CPAN contrib - please have a look. I'll be releasing a new version today or tomorrow where I solved the setBrush and gdBrushed methods internally. > I see we're both running into the same problems. In particular the font issue > (this seems to be due to the way perl handles xs objects? it returns an > object that sometimes seems to bet treated like a string. I've worked around this by 1) exporting the same font methods as GD; Each of these methods contains hard-coded info on height and weight that mimic GDs fonts, then returns a generic font object (in my case GD::SVG::Font). 2) The GD::SVG::Font object This package creates generic formatting common to all the fonts, implements the height() and width() methods, and establishes GDs oo approach to fonts (ie GD::Font->Large). In the oo case, these methods simply call the exported method, a bizarre circularity which works fine. This looks to be quite similar to how you've handled it as well. > Maybe someone in bioperl has advice about how to grab the GD namespace without > forsaking the use of the GD methods? For the most part I've been forsaking GD's methods and I've found very few reasons to actually use GD's methods directly (sorry Lincoln, no offense!). This is best illustrated when considering the colorAllocate and rgb methods. GD's colorAllocate method returns a color index. The rgb($index) method returns the rgb triplet. Neither of these methods are really useful to generating SVG - colorAllocate just needs to return a stringified rgb triplet; the rgb method can just parse the index and return the triplet that way. 
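A minimal sketch of the colour handling described above, assuming a stand-alone package. The name Sketch::SVGColors and everything inside it are made up for illustration and are not taken from GD::SVG. Here colorAllocate hands back a stringified triplet in place of a palette index, and rgb recovers the numbers by parsing that string:

```perl
use strict;
use warnings;

# Hypothetical illustration only; this is not GD::SVG's actual code.
package Sketch::SVGColors;

sub new { return bless {}, shift }

# Stand-in for colorAllocate(): instead of a palette index, hand back a
# string that can be dropped straight into an SVG style attribute.
sub colorAllocate {
    my ($self, $red, $green, $blue) = @_;
    return "rgb($red,$green,$blue)";
}

# Stand-in for rgb($index): recover the triplet by parsing that string.
sub rgb {
    my ($self, $index) = @_;
    my @triplet = $index =~ /^rgb\((\d+),(\d+),(\d+)\)$/;
    return @triplet;
}

package main;

my $image = Sketch::SVGColors->new;
my $index = $image->colorAllocate(255, 128, 0);
print "index: $index\n";
print "triplet: ", join(', ', $image->rgb($index)), "\n";
```

The trade-off is that the returned "index" is no longer an integer, so calling code that does arithmetic on colour indices would need separate handling.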
I took a look at SVG::GD and see that you are maintaining an internal version of the GD image. I started doing the same thinking I could take advantage of some of GD's color manipulation methods but think it is overkill for my purposes. But with regards to the use of these GD wrapper modules, I'm not sure that maintaining both a GD::Image and the SVG image concurrently is the best idea. For example, many scripts (especially CGIs) will predominantly generate raster images via GD. And with Bio::Graphics and Gbrowse, the SVG images from these scripts can be *substantially* larger than their raster brethren. It's useless to generate these images in SVG when the intent is only to produce a raster image. Instead of enabling raster/svg decision to be made at the time of output, I ask that the user make a decision on what their output should be before "use". Coupled with an eval (ie eval "use $package"), this can also be dynamically established at runtime. No biggie, really. That said, I think we can easily maintain an internal representation of the GD::Image - just have to avoid all the exported functions and call them directly. todd > Pls. see private re. collaboration. I'm all for it. > > I've installed your module and will run around in it a bit to get a better > world view of how you use it and of how much similarity there is between > SVG::GD and GD::SVG. > > For now, feel free to take anything out of SVG::GD for your own use. > > Ronan > > On Tuesday 18 November 2003 17:40, Todd Harris wrote: >> Hi Ronan - >> >> Ha, great minds!, right This looks pretty good. I've been working on a >> similar module that works exactly the same and maps almost all functions >> into SVG output (using your SVG module). I've placed this in the GD >> namespace as GD::SVG since that seems to more closely represent the intent >> of the module. 
>> >> You can check out a preliminary version of my module at >> http://toddot.net/GD-SVG/GD-SVG.0.01.tgz >> >> Docs: >> http://toddot.net/GD-SVG/gd-svg.html >> >> And some very preliminary test images based on Bio::Graphics and some >> simple test scripts: >> http://toddot.net/GD-SVG/test.png >> http://toddot.net/GD-SVG/test.svg >> http://toddot.net/GD-SVG/biographics-dynamic_glyphs.png >> http://toddot.net/GD-SVG/biographics-dynamic_glyphs.svg >> http://toddot.net/GD-SVG/biographics-lots.png >> http://toddot.net/GD-SVG/biographics-lots.svg >> >> These images are a little out-of-date. I've fixed many of the formatting >> discrepancies already. >> >> I've already added support for GD::SVG into bioperl, so perhaps we should >> coordinate our efforts on the GD::SVG (or SVG::GD module). In particualr, >> there are a number of kludges that need to be implemented (regarding font >> sizes, positions, etc) to correctly map GD<->SVG output (particularly in >> regards to Bio::Graphics. >> >> Thanks, >> >> todd >> >> On Thu, 13 Nov 2003, Ronan Oger wrote: >>> Hi, >>> >>> My name is Ronan Oger, I am the lead developer of the SVG module. >>> >>> One of the focuses of my current work is SVG::GD, a wrapper for the GD >>> module to provide SVG (vector) output instead of raster. >>> >>> http://www.w3.org/Graphics/SVG/ >>> http://www.w3.org/TR/SVG >>> >>> I've been doing some tests with GD and GD derivatives, and since bioperl >>> is a fairly heavy user of GD, I have been testing around some bioperl >>> code to see how it works. >>> >>> There are some real issues in the SVG::GD at this point, but several >>> people are working on it and progress is being made. >>> >>> Clearly this is not production code at this stage. In particular, font >>> support is still very poor, and font positions are still broken. >>> >>> However, here is a bioperl-specific sample (you need an SVG-compliant >>> browser, such as IE with Adobe or Corel's SVG viewers installed. 
>>> >>> A png and its svg friend taken from a bio-related example on the net >>> ------------------------------------ >>> http://www.roasp.com/2003/11/13/ >>> >>> More prolific example comparisons >>> http://www.roasp.com/2003/11/11/ >>> >>> The SVG::GD module (version 0.07): >>> http://www.roasp.com/2003/11/11/SVG-GD-0.07.tar.gz >>> (This module has a dependency on SVG, which is on CPAN) >>> When it ripens, the module will live on CPAN. >>> >>> I'd appreciate some feedback, issues, etc. In particular, relating to the >>> module's usability. >>> >>> All the best, >>> >>> Ronan >>> >>> -- >>> Ronan Oger >>> http://www.roasp.com >>> Serverside SVG Portal >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l@portal.open-bio.org >>> http://portal.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l@portal.open-bio.org >> http://portal.open-bio.org/mailman/listinfo/bioperl-l From heikki at ebi.ac.uk Mon Nov 24 05:22:53 2003 From: heikki at ebi.ac.uk (Heikki Lehvaslaiho) Date: Mon Nov 24 05:29:28 2003 Subject: [Bioperl-l] Developers: what do you want to fix before 1.4 release? Message-ID: <200311241022.54362.heikki@ebi.ac.uk> With developer snapshot 1.3.03 out last Friday, I'd like to hear from bioperl developers what they'd like to accomplish before 1.4 comes out. At this stage, I'd rather not see major new features added; there will not be time to test them properly. I'd especially like to hear from * Lincoln Stein / Todd Harris about the SVG timetable * Stefan Kirov about Bio::Matrix::PSM modules * anyone else about their pet modules and bugs I'll give you three choices: 1. You've got a fix/enhancement under way and you can give me timetable for it to be in. 2. There is a bug that need to be fixed before the release but you do not know how to do it. Let's discuss that now! 3. 
Your modules are not ready for release and you do not have time to work on your code. That means that they will need to be excluded from the release.

I am pretty happy with the current status of bioperl CVS tree. Tests pass, but there still are bugs in bugzilla. I urge anyone and everyone to have a look at the bugzilla and see what they think of the bugs still there.

Realistically, we have couple of weeks time to get the fixes in. Then a week to wait to make sure everyone has time to make sure that everything works.

-Heikki "Let's push 1.4 out"

--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambs. CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________

From Daniel.Lang at biologie.uni-freiburg.de Mon Nov 24 07:25:12 2003
From: Daniel.Lang at biologie.uni-freiburg.de (Daniel Lang)
Date: Mon Nov 24 07:31:45 2003
Subject: [Bioperl-l] Graphics:Panel /SeqFeature::Generic
In-Reply-To: <200311181353.34763.lstein@cshl.edu>
References: <3FB89DB0.5070303@biologie.uni-freiburg.de> <200311181353.34763.lstein@cshl.edu>
Message-ID: <3FC1F8A8.3060306@biologie.uni-freiburg.de>

Hi Lincoln,

Thanks for your help, but this didn't improve the situation:(
I know now for sure, that the $feature->score is not the evalue, that is set in the while loop, but the normal score!!
Additionally, I tried again introducing it as an additional tag, but this tag isn't available in the callback with e.g. get_tag_values('evalue').
I'm using it in a mod_perl Handler, could this be part of the problem?

Thanks in advance,
Daniel

Lincoln Stein wrote:
> Hi Dan,
>
> Try changing the "generic" glyph to "segments."
The first glyph doesn't know > how to deal with subparts (such as HSPs), the second does. > > Lincoln > > On Monday 17 November 2003 05:06 am, Daniel Lang wrote: > >>Hi, >>I want to generate overview graphics from BLAST reports, where the hits >>are sorted and colored (>1e-10 -->green, ...)according their evalues... >> >>So I thought, I could solve this using a callback function for the >>bgcolor and using the 'low_score' sort_order, but when applied to a >>BLAST report, it results in sorted but only red hits? >>I also tried introducing the evalues as additional tags like done with >>'bits' or 'range', but when testing for this tag in the callback >>(has_tag) its not available? >>So I wander if the function is envoked for each hit in the while loop? >> >>Here the code sniplet: >> >>my $track = $panel->add_track(-glyph => 'generic', >> -label => 1, >> -connector => 'dashed', >> -height => 5, >> -bgcolor => sub { >> my $feature = shift; >> my $evalue = $feature->score; >> if ($evalue < 1e-10) {return 'green';} >> else {return 'red';}} >> , >> -fontcolor => 'green', >> -font2color => 'red', >> -sort_order => 'low_score', >> -min_score => '1e-1000', >> -max_score => '10000', >> -description => sub { >> my $feature = shift; >> return unless $feature->has_tag('bits'); >> my ($description) = >>$feature->each_tag_value('bits'); >> my $score = $feature->score; >> my ($range) = >>$feature->each_tag_value('range'); >> "Score=$description bits, E-value=$score, $range"; >> }); >> >> while( my $hit = $result->next_hit ) { >> my $evalue = $hit->significance; >> my $feature = Bio::SeqFeature::Generic->new(-score => $evalue, >> -display_name => $hit->name, >> -tag => { 'bits' => $hit->bits, >> 'range' => "from ". $hit->start('query') . " to " . 
>>$hit->end('query'), >> }, >> ); >> while( my $hsp = $hit->next_hsp ) { >> $feature->add_sub_SeqFeature($hsp,'EXPAND'); >> } >> $track->add_feature($feature); >> } >> >>Thanks in advance, >>Daniel >> >>_______________________________________________ >>Bioperl-l mailing list >>Bioperl-l@portal.open-bio.org >>http://portal.open-bio.org/mailman/listinfo/bioperl-l > > From Daniel.Lang at biologie.uni-freiburg.de Mon Nov 24 07:53:04 2003 From: Daniel.Lang at biologie.uni-freiburg.de (Daniel Lang) Date: Mon Nov 24 07:59:36 2003 Subject: [Bioperl-l] Graphics:Panel /SeqFeature::Generic In-Reply-To: <200311181353.34763.lstein@cshl.edu> References: <3FB89DB0.5070303@biologie.uni-freiburg.de> <200311181353.34763.lstein@cshl.edu> Message-ID: <3FC1FF30.2010101@biologie.uni-freiburg.de> Hi again, I tested it also on the command line using an additional tag called 'sig' but the problem is the same: ------------- EXCEPTION ------------- MSG: asking for tag value that does not exist sig STACK Bio::SeqFeature::Generic::get_tag_values /usr/lib/perl5/site_perl/5.6.1/Bio/SeqFeature/Generic.pm:504 STACK main::__ANON__ ./htmlresult1.pl:105 STACK (eval) /usr/lib/perl5/site_perl/5.6.1/Bio/Graphics/Glyph/Factory.pm:394 STACK Bio::Graphics::Glyph::Factory::option /usr/lib/perl5/site_perl/5.6.1/Bio/Graphics/Glyph/Factory.pm:394 STACK Bio::Graphics::Glyph::option /usr/lib/perl5/site_perl/5.6.1/Bio/Graphics/Glyph.pm:321 STACK Bio::Graphics::Glyph::bgcolor /usr/lib/perl5/site_perl/5.6.1/Bio/Graphics/Glyph.pm:386 STACK Bio::Graphics::Glyph::filled_box /usr/lib/perl5/site_perl/5.6.1/Bio/Graphics/Glyph.pm:830 STACK Bio::Graphics::Glyph::draw_component /usr/lib/perl5/site_perl/5.6.1/Bio/Graphics/Glyph.pm:959 STACK Bio::Graphics::Glyph::segments::draw_component /usr/lib/perl5/site_perl/5.6.1/Bio/Graphics/Glyph/segments.pm:63 STACK Bio::Graphics::Glyph::draw /usr/lib/perl5/site_perl/5.6.1/Bio/Graphics/Glyph.pm:650 STACK Bio::Graphics::Glyph::generic::draw 
/usr/lib/perl5/site_perl/5.6.1/Bio/Graphics/Glyph/generic.pm:107 STACK Bio::Graphics::Glyph::draw /usr/lib/perl5/site_perl/5.6.1/Bio/Graphics/Glyph.pm:642 STACK Bio::Graphics::Glyph::generic::draw /usr/lib/perl5/site_perl/5.6.1/Bio/Graphics/Glyph/generic.pm:107 STACK Bio::Graphics::Glyph::track::draw /usr/lib/perl5/site_perl/5.6.1/Bio/Graphics/Glyph/track.pm:21 STACK Bio::Graphics::Panel::gd /usr/lib/perl5/site_perl/5.6.1/Bio/Graphics/Panel.pm:461 STACK Bio::Graphics::Panel::png /usr/lib/perl5/site_perl/5.6.1/Bio/Graphics/Panel.pm:781 STACK main::create_overview ./htmlresult1.pl:147 STACK toplevel ./htmlresult1.pl:20 -------------------------------------- Any hints? Daniel Lincoln Stein wrote: > Hi Dan, > > Try changing the "generic" glyph to "segments." The first glyph doesn't know > how to deal with subparts (such as HSPs), the second does. > > Lincoln > > On Monday 17 November 2003 05:06 am, Daniel Lang wrote: > >>Hi, >>I want to generate overview graphics from BLAST reports, where the hits >>are sorted and colored (>1e-10 -->green, ...)according their evalues... >> >>So I thought, I could solve this using a callback function for the >>bgcolor and using the 'low_score' sort_order, but when applied to a >>BLAST report, it results in sorted but only red hits? >>I also tried introducing the evalues as additional tags like done with >>'bits' or 'range', but when testing for this tag in the callback >>(has_tag) its not available? >>So I wander if the function is envoked for each hit in the while loop? 
>> >>Here the code sniplet: >> >>my $track = $panel->add_track(-glyph => 'generic', >> -label => 1, >> -connector => 'dashed', >> -height => 5, >> -bgcolor => sub { >> my $feature = shift; >> my $evalue = $feature->score; >> if ($evalue < 1e-10) {return 'green';} >> else {return 'red';}} >> , >> -fontcolor => 'green', >> -font2color => 'red', >> -sort_order => 'low_score', >> -min_score => '1e-1000', >> -max_score => '10000', >> -description => sub { >> my $feature = shift; >> return unless $feature->has_tag('bits'); >> my ($description) = >>$feature->each_tag_value('bits'); >> my $score = $feature->score; >> my ($range) = >>$feature->each_tag_value('range'); >> "Score=$description bits, E-value=$score, $range"; >> }); >> >> while( my $hit = $result->next_hit ) { >> my $evalue = $hit->significance; >> my $feature = Bio::SeqFeature::Generic->new(-score => $evalue, >> -display_name => $hit->name, >> -tag => { 'bits' => $hit->bits, >> 'range' => "from ". $hit->start('query') . " to " . >>$hit->end('query'), >> }, >> ); >> while( my $hsp = $hit->next_hsp ) { >> $feature->add_sub_SeqFeature($hsp,'EXPAND'); >> } >> $track->add_feature($feature); >> } >> >>Thanks in advance, >>Daniel >> >>_______________________________________________ >>Bioperl-l mailing list >>Bioperl-l@portal.open-bio.org >>http://portal.open-bio.org/mailman/listinfo/bioperl-l > > From michael.watson at bbsrc.ac.uk Mon Nov 24 09:22:09 2003 From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C)) Date: Mon Nov 24 09:33:10 2003 Subject: [Bioperl-l] Problems with SearchIO and SearchIO::HTMLResultwriter Message-ID: <20B7EB075F2D4542AFFAF813E98ACD930282243A@cl-exsrv1.irad.bbsrc.ac.uk> Hi I am having trouble using Bio::SearchIO and Bio::SearchIO::HTMLResultWriter - was there a bug in 1.2.2? If so is it fixed? I get: Can't locate object method "algorithm" via package "Bio::SearchIO::blast" at /data/chicken_genome/bioperl-1.2.2//Bio/SearchIO/Writer/HTMLResultWriter.pm line 183, line 545. 
When executing the following:

#!/usr/bin/perl
use Bio::SearchIO;
use Bio::SearchIO::Writer::HTMLResultWriter;

my $searchio = new Bio::SearchIO (-format => 'blast',
                                  -file   => $blast_report);
my $writerhtml = new Bio::SearchIO::Writer::HTMLResultWriter();
my $outhtml = new Bio::SearchIO(-writer => $writerhtml,
                                -file   => ">searchio.html");

while(my($result) = $searchio->next_result()) {
    # get a result from Bio::SearchIO parsing or build it up in memory
    $outhtml->write_result($searchio);
}

Thanks
Mick

From ronan at roasp.com Mon Nov 24 09:36:43 2003
From: ronan at roasp.com (Ronan Oger)
Date: Mon Nov 24 09:42:50 2003
Subject: [Bioperl-l] GD/SVG (was: new to the group)
In-Reply-To:
References:
Message-ID: <200311241436.43747.ronan@roasp.com>

On Monday 24 November 2003 02:00, you wrote:
> Hi Ronan -
>
> GD::SVG is now on CPAN! My first CPAN contrib - please have a look. I'll
> be releasing a new version today or tomorrow where I solved the setBrush
> and gdBrushed methods internally.
>

Congratulations, and best of luck with it. Did you make sure to fill out the README? I noticed it was unread on the prev. version.

> > I see we're both running into the same problems. In particular the font
> > issue (this seems to be due to the way perl handles xs objects? it
> > returns an object that sometimes seems to bet treated like a string.
>
> I've worked around this by
>
> 1) exporting the same font methods as GD;
> Each of these methods contains hard-coded info on height and weight that
> mimic GDs fonts, then returns a generic font object (in my case
> GD::SVG::Font).

Right. I found that I could do this in *certain* cases, but that in other cases I seemed to be losing the blessing on the object when getting it back. Of course this might simply have been my error and may be a red herring.
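For what it is worth, here is a rough sketch of the font arrangement being discussed: function-style accessors that return a blessed object carrying hard-coded metrics, plus OO-style accessors that delegate to them. The package name Sketch::Font, the accessor names and the metric values are all placeholders invented for this example, not code from GD, GD::SVG or SVG::GD. Scalar::Util's blessed() is one way to check whether the object really comes back blessed:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a GD-like font class; names and sizes are
# illustrative placeholders, not GD's real font metrics.
package Sketch::Font;

my %METRICS = (
    small => { width => 6, height => 12 },
    large => { width => 8, height => 16 },
);

# Build a blessed font object from the hard-coded metrics table.
sub _build {
    my ($name) = @_;
    return bless { name => $name, %{ $METRICS{$name} } }, __PACKAGE__;
}

# Function-style accessors, in the spirit of GD's exported font functions.
sub sketchSmallFont { return _build('small') }
sub sketchLargeFont { return _build('large') }

# OO-style accessors that simply delegate to the function-style ones.
sub Small { return sketchSmallFont() }
sub Large { return sketchLargeFont() }

sub width  { return $_[0]{width} }
sub height { return $_[0]{height} }

package main;
use Scalar::Util qw(blessed);

my $font = Sketch::Font->Large;
printf "%s font: %d x %d, blessed into %s\n",
    $font->{name}, $font->width, $font->height, blessed($font) || '(unblessed)';
```

If blessed() returns undef at the point where the font is used, the object was replaced by a plain value somewhere along the way, which would narrow down where the blessing is lost.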
>
> 2) The GD::SVG::Font object
> This package creates generic formatting common to all the fonts, implements
> the height() and width() methods, and establishes GDs oo approach to fonts
> (ie GD::Font->Large). In the oo case, these methods simply call the
> exported method, a bizarre circularity which works fine.
>
> This looks to be quite similar to how you've handled it as well.
>

Yes, this is how I did it, but as mentioned previously I still haven't got it working.

> > Maybe someone in bioperl has advice about how to grab the GD namespace
> > without forsaking the use of the GD methods?
>

Never mind about the above, sorted this out.

> For the most part I've been forsaking GD's methods and I've found very few
> reasons to actually use GD's methods directly (sorry Lincoln, no offense!).
> This is best illustrated when considering the colorAllocate and rgb
> methods. GD's colorAllocate method returns a color index. The rgb($index)
> method returns the rgb triplet. Neither of these methods are really useful
> to generating SVG - colorAllocate just needs to return a stringified rgb
> triplet; the rgb method can just parse the index and return the triplet
> that way.
>

True. I've maintained the functionality of making sure the values sent back out of GD *look* like GD output values in order to make sure that there can be no underlying problem (for example with colorAllocate which wants an integer back).

> I took a look at SVG::GD and see that you are maintaining an internal
> version of the GD image. I started doing the same thinking I could take
> advantage of some of GD's color manipulation methods but think it is
> overkill for my purposes.
>

I'm not so sure it's overkill. You will have a tough time with the cut & paste functionality for GD, and with the background image functionality. If you don't fully support the GD API, then you will cause users issues with their inability to do a GD action on their graph that they expect to be able to do.
Think of GD as an API standard. If you don't support the entire standard, then you have to do a serious song and dance to justify and explain the holes. Of course, SVG::GD is as incomplete as GD::SVG, but it is important to keep in mind that users will want to implement standard GD calls. Granted, you are attempting to implement only a specific set of drawing routines, and hence you do not necessarily need to do complicated things like support bitmap-supplied brushes or draw on top of an existing image. > But with regards to the use of these GD wrapper modules, I'm not sure that > maintaining both a GD::Image and the SVG image concurrently is the best > idea. You're right that it's not the *fastest* way to do things. There is a real performance hit from maintaining the GD raster in memory >For example, many scripts (especially CGIs) will predominantly > generate raster images via GD. And with Bio::Graphics and Gbrowse, the SVG > images from these scripts can be *substantially* larger than their raster > brethren. It's useless to generate these images in SVG when the intent is > only to produce a raster image. The SVG image size is high for complex images, but since the SVG standard includes compression with gzip, there is a significant reduction in size. I rarely see png or gif images of any significance (say larger than 300 x 400) that are smaller than gzipped SVG output. > > Instead of enabling raster/svg decision to be made at the time of output, I > ask that the user make a decision on what their output should be before > "use". Coupled with an eval (ie eval "use $package"), this can also be > dynamically established at runtime. No biggie, really. > I think this is a good and pragmatic approach, and the approach I am currently limited to in release 0.07 of SVG::GD. But it prevents your users from enabling some important functionality such as fills from the colour at a given point. 
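The "decide before use" approach with eval "use $package" could look roughly like the sketch below. The helper name load_first_available is invented for this example, and the script runs whether or not GD or GD::SVG is installed; it simply reports which one, if any, could be loaded:

```perl
use strict;
use warnings;

# Try each candidate package in order and return the first that loads.
# The helper name is made up for this sketch.
sub load_first_available {
    my @candidates = @_;
    for my $package (@candidates) {
        # Only accept plain package names before handing them to string eval.
        next unless $package =~ /\A[A-Za-z_]\w*(?:::\w+)*\z/;
        return $package if eval "use $package; 1";
    }
    return;
}

# Prefer the vector backend when "svg" is given on the command line,
# otherwise prefer raster output.
my $want_svg = @ARGV && $ARGV[0] eq 'svg';
my @order    = $want_svg ? ('GD::SVG', 'GD') : ('GD', 'GD::SVG');
my $backend  = load_first_available(@order);

print defined $backend
    ? "Using $backend for drawing\n"
    : "Neither GD nor GD::SVG could be loaded\n";
```

From there the image class would presumably be derived from the chosen package name (for instance by appending "::Image"), but that depends on how the wrapper lays out its classes, so check the module's own documentation.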
> That said, I think we can easily maintain an internal representation of the > GD::Image - just have to avoid all the exported functions and call them > directly. > That's also a good idea. Until someone comes up with SVG::Canvas, then maintaining a GD-based canvas seems like a fairly good, pragmatic approach. Ronan > todd > > > Pls. see private re. collaboration. I'm all for it. > > > > I've installed your module and will run around in it a bit to get a > > better world view of how you use it and of how much similarity there is > > between SVG::GD and GD::SVG. > > > > For now, feel free to take anything out of SVG::GD for your own use. > > > > Ronan > > > > On Tuesday 18 November 2003 17:40, Todd Harris wrote: > >> Hi Ronan - > >> > >> Ha, great minds!, right This looks pretty good. I've been working on a > >> similar module that works exactly the same and maps almost all functions > >> into SVG output (using your SVG module). I've placed this in the GD > >> namespace as GD::SVG since that seems to more closely represent the > >> intent of the module. > >> > >> You can check out a preliminary version of my module at > >> http://toddot.net/GD-SVG/GD-SVG.0.01.tgz > >> > >> Docs: > >> http://toddot.net/GD-SVG/gd-svg.html > >> > >> And some very preliminary test images based on Bio::Graphics and some > >> simple test scripts: > >> http://toddot.net/GD-SVG/test.png > >> http://toddot.net/GD-SVG/test.svg > >> http://toddot.net/GD-SVG/biographics-dynamic_glyphs.png > >> http://toddot.net/GD-SVG/biographics-dynamic_glyphs.svg > >> http://toddot.net/GD-SVG/biographics-lots.png > >> http://toddot.net/GD-SVG/biographics-lots.svg > >> > >> These images are a little out-of-date. I've fixed many of the > >> formatting discrepancies already. > >> > >> I've already added support for GD::SVG into bioperl, so perhaps we > >> should coordinate our efforts on the GD::SVG (or SVG::GD module). 
In > >> particualr, there are a number of kludges that need to be implemented > >> (regarding font sizes, positions, etc) to correctly map GD<->SVG output > >> (particularly in regards to Bio::Graphics. > >> > >> Thanks, > >> > >> todd > >> > >> On Thu, 13 Nov 2003, Ronan Oger wrote: > >>> Hi, > >>> > >>> My name is Ronan Oger, I am the lead developer of the SVG module. > >>> > >>> One of the focuses of my current work is SVG::GD, a wrapper for the GD > >>> module to provide SVG (vector) output instead of raster. > >>> > >>> http://www.w3.org/Graphics/SVG/ > >>> http://www.w3.org/TR/SVG > >>> > >>> I've been doing some tests with GD and GD derivatives, and since > >>> bioperl is a fairly heavy user of GD, I have been testing around some > >>> bioperl code to see how it works. > >>> > >>> There are some real issues in the SVG::GD at this point, but several > >>> people are working on it and progress is being made. > >>> > >>> Clearly this is not production code at this stage. In particular, font > >>> support is still very poor, and font positions are still broken. > >>> > >>> However, here is a bioperl-specific sample (you need an SVG-compliant > >>> browser, such as IE with Adobe or Corel's SVG viewers installed. > >>> > >>> A png and its svg friend taken from a bio-related example on the net > >>> ------------------------------------ > >>> http://www.roasp.com/2003/11/13/ > >>> > >>> More prolific example comparisons > >>> http://www.roasp.com/2003/11/11/ > >>> > >>> The SVG::GD module (version 0.07): > >>> http://www.roasp.com/2003/11/11/SVG-GD-0.07.tar.gz > >>> (This module has a dependency on SVG, which is on CPAN) > >>> When it ripens, the module will live on CPAN. > >>> > >>> I'd appreciate some feedback, issues, etc. In particular, relating to > >>> the module's usability. 
> >>> > >>> All the best, > >>> > >>> Ronan > >>> > >>> -- > >>> Ronan Oger > >>> http://www.roasp.com > >>> Serverside SVG Portal > >>> _______________________________________________ > >>> Bioperl-l mailing list > >>> Bioperl-l@portal.open-bio.org > >>> http://portal.open-bio.org/mailman/listinfo/bioperl-l > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l@portal.open-bio.org > >> http://portal.open-bio.org/mailman/listinfo/bioperl-l -- Ronan Oger http://www.roasp.com Serverside SVG Portal From michael.watson at bbsrc.ac.uk Mon Nov 24 09:36:39 2003 From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C)) Date: Mon Nov 24 09:48:57 2003 Subject: [Bioperl-l] Problems with SearchIO and SearchIO::HTMLResultwr iter Message-ID: <20B7EB075F2D4542AFFAF813E98ACD930282243B@cl-exsrv1.irad.bbsrc.ac.uk> Ah, solved by the removal of rather large error in my code. Sorry. -----Original Message----- From: michael watson (IAH-C) [mailto:michael.watson@bbsrc.ac.uk] Sent: 24 November 2003 14:22 To: 'bioperl-l@bioperl.org' Subject: [Bioperl-l] Problems with SearchIO and SearchIO::HTMLResultwriter Hi I am having trouble using Bio::SearchIO and Bio::SearchIO::HTMLResultWriter - was there a bug in 1.2.2? If so is it fixed? I get: Can't locate object method "algorithm" via package "Bio::SearchIO::blast" at /data/chicken_genome/bioperl-1.2.2//Bio/SearchIO/Writer/HTMLResultWriter.pm line 183, line 545. 
When executing the following:

#!/usr/bin/perl

use Bio::SearchIO;
use Bio::SearchIO::Writer::HTMLResultWriter;

my $searchio = new Bio::SearchIO (-format => 'blast',
                                  -file   => $blast_report);

my $writerhtml = new Bio::SearchIO::Writer::HTMLResultWriter();
my $outhtml = new Bio::SearchIO(-writer => $writerhtml,
                                -file   => ">searchio.html");

while(my($result) = $searchio->next_result()) {
    # get a result from Bio::SearchIO parsing or build it up in memory
    $outhtml->write_result($searchio);
}

Thanks
Mick

From jason at cgt.duhs.duke.edu Mon Nov 24 11:43:34 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Mon Nov 24 11:49:57 2003
Subject: [Bioperl-l] Problems with SearchIO and SearchIO::HTMLResultwriter
In-Reply-To: <20B7EB075F2D4542AFFAF813E98ACD930282243A@cl-exsrv1.irad.bbsrc.ac.uk>
References: <20B7EB075F2D4542AFFAF813E98ACD930282243A@cl-exsrv1.irad.bbsrc.ac.uk>
Message-ID:

On Mon, 24 Nov 2003, michael watson (IAH-C) wrote:

> Hi
>
> I am having trouble using Bio::SearchIO and Bio::SearchIO::HTMLResultWriter - was there a bug in 1.2.2? If so is it fixed?
>
> I get:
>
> Can't locate object method "algorithm" via package "Bio::SearchIO::blast" at /data/chicken_genome/bioperl-1.2.2//Bio/SearchIO/Writer/HTMLResultWriter.pm line 183, line 545.
>
> When executing the following:
>
> #!/usr/bin/perl
>
> use Bio::SearchIO;
> use Bio::SearchIO::Writer::HTMLResultWriter;
>
> my $searchio = new Bio::SearchIO (-format => 'blast',
>                                   -file   => $blast_report);
>
> my $writerhtml = new Bio::SearchIO::Writer::HTMLResultWriter();
> my $outhtml = new Bio::SearchIO(-writer => $writerhtml,
>                                 -file   => ">searchio.html");
>
> while(my($result) = $searchio->next_result()) {
>     # get a result from Bio::SearchIO parsing or build it up in memory
>     $outhtml->write_result($searchio);

The method is called 'write_result', so pass in $result (the parsed result) rather than $searchio (the parser object).

> }
>
> Thanks
> Mick

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From harris at cshl.org Mon Nov 24 12:47:56 2003
From: harris at cshl.org (Todd Harris)
Date: Mon Nov 24 12:54:32 2003
Subject: [Bioperl-l] Re: Developers: what do you want to fix before 1.4 release?
Message-ID:

Hi Heikki -

Re: SVG support for Bio::Graphics

This is now essentially done. I've released the GD::SVG module to CPAN and implemented support for most features of GD that Bio::Graphics employs. I would very much like to see this a part of 1.4.

You can check out the output from the examples/biographics in SVG at
http://toddot.net/projects/GD-SVG/index.shtml#examples

What remains:
- detailed survey of glyphs to ensure that they render in png and svg identically.
- more extensive testing
- more supported features in GD::SVG (not required for bp 1.4 release).
- adding in SVG information to full documentation

It should be no problem to finish the testing within a week (or two, given thanksgiving gorging and recovery this week for some of us).

todd

>
> With developer snapshot 1.3.03 out last Friday, I'd like to hear from bioperl
> developers what they'd like to accomplish before 1.4 comes out.
> > At this stage, I'd rather not see major new features added; there will not be > time to test them properly. > > I'd especially like to hear from > * Lincoln Stein / Todd Harris about the SVG timetable > * Stefan Kirov about Bio::Matrix::PSM modules > * anyone else about their pet modules and bugs > > I'll give you three choices: > 1. You've got a fix/enhancement under way and you can give me timetable for it > to be in. > 2. There is a bug that need to be fixed before the release but you do not know > how to do it. Let's discuss that now! > 3. Your modules are not ready for release and you do not have time to work on > your code. That means that they will need to be excluded from the release. > > I am pretty happy with the current status of bioperl CVS tree. Tests pass, but > there still are bugs in bugzilla. I urge anyone and everyone to have a look > at the bugzilla and see what they think of the bugs still there. > > Realistically, we have couple of weeks time to get the fixes in. Then a week > to wait to make sure everyone has time to make sure that everything works. > > -Heikki "Let's push 1.4 out" From jason at cgt.duhs.duke.edu Mon Nov 24 14:05:53 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Mon Nov 24 14:12:14 2003 Subject: [Bioperl-l] How to distinguish pdb Helix from Sheet? In-Reply-To: <3FBF00EF.4050106@cs.nmsu.edu> References: <3FBF00EF.4050106@cs.nmsu.edu> Message-ID: I'm not entirely sure of the best way to do this - but here is something to get you started. If you work out a good strategy to the data I think we'd appreciate any useable scripts back in Bioperl. There seem to a scant few doing structural stuff and using Bioperl so we don't have a lot of good example applications written yet for the Structure objects. 
use Bio::Structure::IO;
use strict;

# path to the PDB file, taken from the command line
my $filename = shift @ARGV or die "usage: $0 file.pdb\n";

my $in = new Bio::Structure::IO(-format => 'pdb',
                                -file   => $filename);

# first 2 nums are from the rol definition
# that Kris sets up in the pdb parser
# the 3rd number is the index of the
# Chain ID for that type of feature
my %rol_length = ('sheet' => [8,70,5],
                  'helix' => [8,76,4]);

my $struc = $in->next_structure;
foreach my $type ( keys %rol_length ) {
    my $uct = uc($type);
    my @header = $struc->annotation->get_Annotations($type);
    my $h = shift @header;
    next unless defined $h;
    my ($rol_begin,$rol_end,$chain_index) = @{$rol_length{$type}};
    my $length = $rol_end - $rol_begin +1;
    my $k = $h->value;

    # change the whole concatenated string into \n delimited
    # string
    $k =~ s/(.{$length})/$uct$1\n/g;
    $k .= "\n";
    # print $k; # print if you want to see the whole $type record
    my @records = split(/\n/,$k);
    foreach my $r ( @records ) {
        # based on
        # http://www.rcsb.org/pdb/docs/format/pdbguide2.2/guide2.2_frame.html
        my @line = split(/\s+/,$r);
        print "chain $line[$chain_index] contains $uct\n";
    }
}

--jason

On Fri, 21 Nov 2003, Jeff Donner wrote:

> Hi,
>
> How can you tell which chains are HELIX and which SHEET
> after you've read a pdb file with Bio::Structure::IO?
>
> Is it possible even?
> > Thanks, > > Jeff Donner > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From steve_chervitz at affymetrix.com Mon Nov 24 15:01:22 2003 From: steve_chervitz at affymetrix.com (Steve Chervitz) Date: Mon Nov 24 15:06:24 2003 Subject: [Bioperl-l] Re: [Bioperl-announce-l] Bioperl Developer snapshot 1.3.03 In-Reply-To: <200311211620.25306.heikki@ebi.ac.uk> References: <200311211620.25306.heikki@ebi.ac.uk> Message-ID: On Nov 21, 2003, at 8:20 AM, Heikki Lehvaslaiho wrote: > Bioperl developer snap shot 1.3.03 > --------------------------------- > > > This is the third developer snap shot from the BioPerl CVS head > that will eventually lead to release 1.4. > > http://bioperl.org/DIST/current_core_unstable.tar.gz > http://bioperl.org/DIST/bioperl-1.3.03.tar.gz Correction on the second URL: http://bioperl.org/DIST/bioperl-devel-1.3.03.tar.gz Nice work Heikki. Steve From jason at cgt.duhs.duke.edu Mon Nov 24 17:04:06 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Mon Nov 24 17:10:32 2003 Subject: [Bioperl-l] Developers: what do you want to fix before 1.4 release? In-Reply-To: <200311241022.54362.heikki@ebi.ac.uk> References: <200311241022.54362.heikki@ebi.ac.uk> Message-ID: I am fixing up blast parsing to get the gapped, ungapped, and frame-specific lambda,kappa, entropy from the reports for those who want this. I will also try and make sure we are getting more of the parameters from the reports into the Result objects. Presumably we had talked about remove SearchIO/psiblast.pm from 1.4 release - not sure if that is still a good idea. I also have some more work on the PopGen objects to make sure all the numbers are correct for some of the implemented tests. 
-jason On Mon, 24 Nov 2003, Heikki Lehvaslaiho wrote: > > With developer snapshot 1.3.03 out last Friday, I'd like to hear from bioperl > developers what they'd like to accomplish before 1.4 comes out. > > At this stage, I'd rather not see major new features added; there will not be > time to test them properly. > > I'd especially like to hear from > * Lincoln Stein / Todd Harris about the SVG timetable > * Stefan Kirov about Bio::Matrix::PSM modules > * anyone else about their pet modules and bugs > > I'll give you three choices: > 1. You've got a fix/enhancement under way and you can give me timetable > for it to be in. > 2. There is a bug that need to be fixed before the release but you do > not know how to do it. Let's discuss that now! > 3. Your modules are not ready for release and you do not have time to > work on your code. That means that they will need to be excluded from > the release. > > I am pretty happy with the current status of bioperl CVS tree. Tests > pass, but there still are bugs in bugzilla. I urge anyone and everyone > to have a look at the bugzilla and see what they think of the bugs still > there. > > Realistically, we have couple of weeks time to get the fixes in. Then a > week to wait to make sure everyone has time to make sure that everything > works. > > -Heikki "Let's push 1.4 out" > > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From heikki at nildram.co.uk Mon Nov 24 17:26:15 2003 From: heikki at nildram.co.uk (Heikki Lehvaslaiho) Date: Mon Nov 24 17:35:57 2003 Subject: [Bioperl-l] Re: Developers: what do you want to fix before 1.4 release? In-Reply-To: References: Message-ID: <200311242226.15855.heikki@nildram.co.uk> Todd, That's good news. I was afraid that there are still big chunks of code unwritten. So what you are saying is that SVG in Bio::Graphics is ready for other people to start testing and using it in their code. Go for it! Bug reports to Todd. 
-Heikki On Monday 24 Nov 2003 TT:47, Todd Harris wrote: > Hi Heikki - > > Re: SVG support for Bio::Graphics > > This is now essentially done. I've released the GD::SVG module to CPAN and > implemented support for most features of GD that Bio::Graphics employs. I > would very much like to see this a part of 1.4. > > You can check out the output from the examples/biographics in SVG at > http://toddot.net/projects/GD-SVG/index.shtml#examples > > What remains: > - detailed survey of glyphs to ensure that they render in png and svg > identically. > - more extensive testing > - more supported features in GD::SVG (not required for bp 1.4 release). > - adding in SVG information to full documentation > > It should be no problem to finish the testing within a week (or two, given > thanksgiving gorging and recovery this week for some of us). > > todd > > > With developer snapshot 1.3.03 out last Friday, I'd like to hear from > > bioperl developers what they'd like to accomplish before 1.4 comes out. > > > > At this stage, I'd rather not see major new features added; there will > > not be time to test them properly. > > > > I'd especially like to hear from > > * Lincoln Stein / Todd Harris about the SVG timetable > > * Stefan Kirov about Bio::Matrix::PSM modules > > * anyone else about their pet modules and bugs > > > > I'll give you three choices: > > 1. You've got a fix/enhancement under way and you can give me timetable > > for it to be in. > > 2. There is a bug that need to be fixed before the release but you do not > > know how to do it. Let's discuss that now! > > 3. Your modules are not ready for release and you do not have time to > > work on your code. That means that they will need to be excluded from the > > release. > > > > I am pretty happy with the current status of bioperl CVS tree. Tests > > pass, but there still are bugs in bugzilla. I urge anyone and everyone to > > have a look at the bugzilla and see what they think of the bugs still > > there. 
> > > > Realistically, we have couple of weeks time to get the fixes in. Then a > > week to wait to make sure everyone has time to make sure that everything > > works. > > > > -Heikki "Let's push 1.4 out" > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. CB10 1SD, United Kingdom _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468 ___ _/_/_/_/_/________________________________________________________ From steve_chervitz at affymetrix.com Mon Nov 24 19:10:19 2003 From: steve_chervitz at affymetrix.com (Steve Chervitz) Date: Mon Nov 24 19:15:23 2003 Subject: [Bioperl-l] Developers: what do you want to fix before 1.4 release? In-Reply-To: References: <200311241022.54362.heikki@ebi.ac.uk> Message-ID: On Nov 24, 2003, at 2:04 PM, Jason Stajich wrote: > I am fixing up blast parsing to get the gapped, ungapped, and > frame-specific lambda,kappa, entropy from the reports for those who > want > this. I will also try and make sure we are getting more of the > parameters > from the reports into the Result objects. > > Presumably we had talked about remove SearchIO/psiblast.pm from 1.4 > release - not sure if that is still a good idea. I'm fine with this. There are a few example scripts that are pointing at psiblast.pm. We'll have to make sure they work with blast.pm. SearchIO/psiblast.pm depends on some modules in Bio::Tools that can probably also go away (Bio::Tools::StateMachine::*) since these modules aren't being used by any other Bioperl modules. There may be external code that depends on them, but not too likely. If there's interest, I could contribute them separately to CPAN. 
Steve > > I also have some more work on the PopGen objects to make sure all the > numbers are correct for some of the implemented tests. > > -jason > > On Mon, 24 Nov 2003, Heikki Lehvaslaiho wrote: > >> >> With developer snapshot 1.3.03 out last Friday, I'd like to hear from >> bioperl >> developers what they'd like to accomplish before 1.4 comes out. >> >> At this stage, I'd rather not see major new features added; there >> will not be >> time to test them properly. >> >> I'd especially like to hear from >> * Lincoln Stein / Todd Harris about the SVG timetable >> * Stefan Kirov about Bio::Matrix::PSM modules >> * anyone else about their pet modules and bugs >> >> I'll give you three choices: > >> 1. You've got a fix/enhancement under way and you can give me >> timetable >> for it to be in. > >> 2. There is a bug that need to be fixed before the release but you do >> not know how to do it. Let's discuss that now! > >> 3. Your modules are not ready for release and you do not have time to >> work on your code. That means that they will need to be excluded from >> the release. >> >> I am pretty happy with the current status of bioperl CVS tree. Tests >> pass, but there still are bugs in bugzilla. I urge anyone and everyone >> to have a look at the bugzilla and see what they think of the bugs >> still >> there. >> >> Realistically, we have couple of weeks time to get the fixes in. Then >> a >> week to wait to make sure everyone has time to make sure that >> everything >> works. >> >> -Heikki "Let's push 1.4 out" >> >> > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > From fangl at genomics.org.cn Tue Nov 25 05:00:02 2003 From: fangl at genomics.org.cn (Magic Fang) Date: Tue Nov 25 05:06:44 2003 Subject: [Bioperl-l] why i can not find Bio::SeqIO::staden::read in bioperl-ext packages Message-ID: <3FC32822.1010007@genomics.org.cn> i have installed bioperl-ext_06 but there is not this module at all, where can i get the module, thank u. 
From jason at cgt.duhs.duke.edu Tue Nov 25 08:03:04 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Tue Nov 25 08:09:57 2003 Subject: [Bioperl-l] why i can not find Bio::SeqIO::staden::read in bioperl-ext packages In-Reply-To: <3FC32822.1010007@genomics.org.cn> References: <3FC32822.1010007@genomics.org.cn> Message-ID: It is presumably not in 06 which I think is an old release before that module was added. I don't know that we have done an bioperl-ext release in quite a while. You can get it (and every bit of code that has yet to be officially released) from CVS - http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-ext and click on the download tarball link Or just grab it from this link: http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-ext/bioperl-ext.tar.gz?tarball=1 Click on the download tarball link On Tue, 25 Nov 2003, Magic Fang wrote: > i have installed bioperl-ext_06 but there is not this module at all, > where can i get the module, thank u. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From skirov at utk.edu Tue Nov 25 10:00:44 2003 From: skirov at utk.edu (Stefan Kirov) Date: Tue Nov 25 10:07:23 2003 Subject: [Bioperl-l] Re: Developers: what do you want to fix before 1.4 release? In-Reply-To: <200311242226.15855.heikki@nildram.co.uk> References: <200311242226.15855.heikki@nildram.co.uk> Message-ID: <3FC36E9C.5030205@utk.edu> Heikki, Sorry for the delay. I think I need a week (maybe 10 days) to add some minor stuff to SiteMatrix module, nothing critical. I plan to add new parsers, extend mast parser to read section III (which is a bit messy for parsing), allow to write the PSM in some generic format (swissprot I guess) and probably one or two scripts. 
However this is not going to happen within the next month, so I guess this can wait for the next release. I am using the modules heavily, so I hope it will be useful for other people too. Stefan Heikki Lehvaslaiho wrote: >Todd, > >That's good news. I was afraid that there are still big chunks of code >unwritten. > >So what you are saying is that SVG in Bio::Graphics is ready for other people >to start testing and using it in their code. Go for it! Bug reports to Todd. > > -Heikki > >On Monday 24 Nov 2003 TT:47, Todd Harris wrote: > > >>Hi Heikki - >> >>Re: SVG support for Bio::Graphics >> >>This is now essentially done. I've released the GD::SVG module to CPAN and >>implemented support for most features of GD that Bio::Graphics employs. I >>would very much like to see this a part of 1.4. >> >>You can check out the output from the examples/biographics in SVG at >>http://toddot.net/projects/GD-SVG/index.shtml#examples >> >>What remains: >> - detailed survey of glyphs to ensure that they render in png and svg >> identically. >> - more extensive testing >> - more supported features in GD::SVG (not required for bp 1.4 release). >> - adding in SVG information to full documentation >> >>It should be no problem to finish the testing within a week (or two, given >>thanksgiving gorging and recovery this week for some of us). >> >>todd >> >> >> >>>With developer snapshot 1.3.03 out last Friday, I'd like to hear from >>>bioperl developers what they'd like to accomplish before 1.4 comes out. >>> >>>At this stage, I'd rather not see major new features added; there will >>>not be time to test them properly. >>> >>>I'd especially like to hear from >>>* Lincoln Stein / Todd Harris about the SVG timetable >>>* Stefan Kirov about Bio::Matrix::PSM modules >>>* anyone else about their pet modules and bugs >>> >>>I'll give you three choices: >>>1. You've got a fix/enhancement under way and you can give me timetable >>>for it to be in. >>>2. 
There is a bug that need to be fixed before the release but you do not >>>know how to do it. Let's discuss that now! >>>3. Your modules are not ready for release and you do not have time to >>>work on your code. That means that they will need to be excluded from the >>>release. >>> >>>I am pretty happy with the current status of bioperl CVS tree. Tests >>>pass, but there still are bugs in bugzilla. I urge anyone and everyone to >>>have a look at the bugzilla and see what they think of the bugs still >>>there. >>> >>>Realistically, we have couple of weeks time to get the fixes in. Then a >>>week to wait to make sure everyone has time to make sure that everything >>>works. >>> >>>-Heikki "Let's push 1.4 out" >>> >>> >>_______________________________________________ >>Bioperl-l mailing list >>Bioperl-l@portal.open-bio.org >>http://portal.open-bio.org/mailman/listinfo/bioperl-l >> >> > > > -- Stefan Kirov, Ph.D. University of Tennessee/Oak Ridge National Laboratory 1060 Commerce Park, Oak Ridge TN 37830-8026 USA tel +865 576 5120 fax +865 241 1965 e-mail: skirov@utk.edu sao@ornl.gov From markw at illuminae.com Tue Nov 25 11:29:53 2003 From: markw at illuminae.com (Mark Wilkinson) Date: Tue Nov 25 11:36:26 2003 Subject: [Bioperl-l] Bio::Perl blast function not working? Message-ID: <1069777793.1713.22.camel@localhost.localdomain> Hi all, I'm teaching a course on BioPerl next week and I'm putting the "beginners" lesson together using the Bio::Perl module. I notice that the blast_sequence function is not working anymore... is this just a temporary glitch, or is this function not supported these days? Any advice appreciated, M -- Mark Wilkinson Illuminae From jason at cgt.duhs.duke.edu Tue Nov 25 11:49:55 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Tue Nov 25 11:56:19 2003 Subject: [Bioperl-l] Bio::Perl blast function not working? 
In-Reply-To: <1069777793.1713.22.camel@localhost.localdomain> References: <1069777793.1713.22.camel@localhost.localdomain> Message-ID: Error messages, your own debugging efforts are always helpful here too... -jason On Tue, 25 Nov 2003, Mark Wilkinson wrote: > Hi all, > > I'm teaching a course on BioPerl next week and I'm putting the > "beginners" lesson together using the Bio::Perl module. I notice that > the blast_sequence function is not working anymore... is this just a > temporary glitch, or is this function not supported these days? > > Any advice appreciated, > > M > > > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From heikki at ebi.ac.uk Tue Nov 25 12:22:23 2003 From: heikki at ebi.ac.uk (Heikki Lehvaslaiho) Date: Tue Nov 25 12:29:06 2003 Subject: [Bioperl-l] Developers: what do you want to fix before 1.4 release? In-Reply-To: References: <200311241022.54362.heikki@ebi.ac.uk> Message-ID: <200311251722.24333.heikki@ebi.ac.uk> Steve, Thanks for offering to clean the SearchIO modules. Could you also see that this bug: http://bugzilla.bioperl.org/show_bug.cgi?id=1419 gets closed and obsolete modules removed from cvs. Thanks, -Heikki P.S.1. Other open bugs related to SearchIO are: 1416, 1418, 1468, 1558 P.S.2. Also thanks for spotting the wrong URL. On Tuesday 25 Nov 2003 TT:10, Steve Chervitz wrote: > On Nov 24, 2003, at 2:04 PM, Jason Stajich wrote: > > I am fixing up blast parsing to get the gapped, ungapped, and > > frame-specific lambda,kappa, entropy from the reports for those who > > want > > this. I will also try and make sure we are getting more of the > > parameters > > from the reports into the Result objects. > > > > Presumably we had talked about remove SearchIO/psiblast.pm from 1.4 > > release - not sure if that is still a good idea. > > I'm fine with this. There are a few example scripts that are pointing > at psiblast.pm. We'll have to make sure they work with blast.pm. 
> > SearchIO/psiblast.pm depends on some modules in Bio::Tools that can > probably also go away (Bio::Tools::StateMachine::*) since these modules > aren't being used by any other Bioperl modules. There may be external > code that depends on them, but not too likely. If there's interest, I > could contribute them separately to CPAN. > > Steve > > > I also have some more work on the PopGen objects to make sure all the > > numbers are correct for some of the implemented tests. > > > > -jason > > > > On Mon, 24 Nov 2003, Heikki Lehvaslaiho wrote: > >> With developer snapshot 1.3.03 out last Friday, I'd like to hear from > >> bioperl > >> developers what they'd like to accomplish before 1.4 comes out. > >> > >> At this stage, I'd rather not see major new features added; there > >> will not be > >> time to test them properly. > >> > >> I'd especially like to hear from > >> * Lincoln Stein / Todd Harris about the SVG timetable > >> * Stefan Kirov about Bio::Matrix::PSM modules > >> * anyone else about their pet modules and bugs > >> > >> I'll give you three choices: > >> > >> 1. You've got a fix/enhancement under way and you can give me > >> timetable > >> for it to be in. > >> > >> 2. There is a bug that need to be fixed before the release but you do > >> not know how to do it. Let's discuss that now! > >> > >> 3. Your modules are not ready for release and you do not have time to > >> work on your code. That means that they will need to be excluded from > >> the release. > >> > >> I am pretty happy with the current status of bioperl CVS tree. Tests > >> pass, but there still are bugs in bugzilla. I urge anyone and everyone > >> to have a look at the bugzilla and see what they think of the bugs > >> still > >> there. > >> > >> Realistically, we have couple of weeks time to get the fixes in. Then > >> a > >> week to wait to make sure everyone has time to make sure that > >> everything > >> works. 
> >> > >> -Heikki "Let's push 1.4 out" > > > > -- > > Jason Stajich > > Duke University > > jason at cgt.mc.duke.edu > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. CB10 1SD, United Kingdom _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468 ___ _/_/_/_/_/________________________________________________________ From markw at illuminae.com Tue Nov 25 12:22:34 2003 From: markw at illuminae.com (Mark Wilkinson) Date: Tue Nov 25 12:29:14 2003 Subject: [BioPerl] Re: [Bioperl-l] Bio::Perl blast function not working? In-Reply-To: References: <1069777793.1713.22.camel@localhost.localdomain> Message-ID: <1069780954.1713.47.camel@localhost.localdomain> Yup, so long as i know that it is still a going concern I will have a look at it myself to see if i can track it down. The bug can be reproduced as follows: use Bio::Perl; $seq = get_sequence('swissprot', 'P35632'); $blast = blast_sequence($seq); write_blast(">mySequenceBlast", $blast); results in: Submitted Blast for [AP3_ARATH] -------------------- WARNING --------------------- MSG:


ERROR: Results for RID 1069780799-8428-65894659018 not found
---------------------------------------------------

I'll have a go at stepping through the code later today after I finish writing this lesson.

Cheers all!

M

On Tue, 2003-11-25 at 10:49, Jason Stajich wrote:
> Error messages, your own debugging efforts are always helpful here too...
>
> -jason
> On Tue, 25 Nov 2003, Mark Wilkinson wrote:
>
> > Hi all,
> >
> > I'm teaching a course on BioPerl next week and I'm putting the
> > "beginners" lesson together using the Bio::Perl module. I notice that
> > the blast_sequence function is not working anymore... is this just a
> > temporary glitch, or is this function not supported these days?
> >
> > Any advice appreciated,
> >
> > M
> >
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
--
Mark Wilkinson
Illuminae

From redwards at utmem.edu Tue Nov 25 13:08:06 2003
From: redwards at utmem.edu (Rob Edwards)
Date: Tue Nov 25 13:14:44 2003
Subject: [BioPerl] Re: [Bioperl-l] Bio::Perl blast function not working?
In-Reply-To: <1069780954.1713.47.camel@localhost.localdomain>
Message-ID: <51911A74-1F72-11D8-AF64-000A959E1622@utmem.edu>

This looks like the same problem that everyone had a month or so ago with parsing the RID line (NCBI added a .BLASTQ3 to the end). Lots of solutions were offered; basically, adding (\S+) to the end of the line that parses RID should fix it.

Rob

On Tuesday, November 25, 2003, at 11:22 AM, Mark Wilkinson wrote:

> Yup, so long as i know that it is still a going concern I will have a
> look at it myself to see if i can track it down.
> > The bug can be reproduced as follows: > > use Bio::Perl; > > $seq = get_sequence('swissprot', 'P35632'); > $blast = blast_sequence($seq); > write_blast(">mySequenceBlast", $blast); > > > results in: > > Submitted Blast for [AP3_ARATH] > -------------------- WARNING --------------------- > MSG: > >

>


ERROR: Results for RID > 1069780799-8428-65894659018 > not found
> > > --------------------------------------------------- > > > I'll have a go at stepping through the code later today after I finish > writing this lesson. > > Cheers all! > > M > > > On Tue, 2003-11-25 at 10:49, Jason Stajich wrote: >> Error messages, your own debugging efforts are always helpful here >> too... >> >> -jason >> On Tue, 25 Nov 2003, Mark Wilkinson wrote: >> >>> Hi all, >>> >>> I'm teaching a course on BioPerl next week and I'm putting the >>> "beginners" lesson together using the Bio::Perl module. I notice >>> that >>> the blast_sequence function is not working anymore... is this just a >>> temporary glitch, or is this function not supported these days? >>> >>> Any advice appreciated, >>> >>> M >>> >>> >>> >> >> -- >> Jason Stajich >> Duke University >> jason at cgt.mc.duke.edu >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l@portal.open-bio.org >> http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- > Mark Wilkinson > Illuminae > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l From paul.boutros at utoronto.ca Tue Nov 25 13:12:13 2003 From: paul.boutros at utoronto.ca (paul.boutros@utoronto.ca) Date: Tue Nov 25 13:19:04 2003 Subject: [BioPerl] Re: [Bioperl-l] Bio::Perl blast function not working In-Reply-To: <200311251729.hAPHTeg0025010@portal.open-bio.org> References: <200311251729.hAPHTeg0025010@portal.open-bio.org> Message-ID: <1069783933.3fc39b7d6424c@webmail.utoronto.ca> I think this bug is caused by a change in the RID format at NCBI. I think the format is basically the same, except that you must now append .QBLAST to the end. So you'd need to be checking results for the RID: 1069780799-8428-65894659018.QBLAST instead of 1069780799-8428-65894659018 Does that fix it? 
Paul > Yup, so long as i know that it is still a going concern I will have a > look at it myself to see if i can track it down. > > The bug can be reproduced as follows: > > use Bio::Perl; > > $seq = get_sequence('swissprot', 'P35632'); > $blast = blast_sequence($seq); > write_blast(">mySequenceBlast", $blast); > > > results in: > > Submitted Blast for [AP3_ARATH] > -------------------- WARNING --------------------- > MSG: > >

>


ERROR: Results for RID 1069780799-8428-65894659018 > not found
> > > --------------------------------------------------- > > > I'll have a go at stepping through the code later today after I finish > writing this lesson. > > Cheers all! > > M > > > On Tue, 2003-11-25 at 10:49, Jason Stajich wrote: > > Error messages, your own debugging efforts are always helpful here too... > > > > -jason > > On Tue, 25 Nov 2003, Mark Wilkinson wrote: > > > > > Hi all, > > > > > > I'm teaching a course on BioPerl next week and I'm putting the > > > "beginners" lesson together using the Bio::Perl module. I notice that > > > the blast_sequence function is not working anymore... is this just a > > > temporary glitch, or is this function not supported these days? > > > > > > Any advice appreciated, > > > > > > M > > > > > > > > > > > > > -- > > Jason Stajich > > Duke University > > jason at cgt.mc.duke.edu > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- > Mark Wilkinson > Illuminae > > > > ------------------------------ > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > End of Bioperl-l Digest, Vol 7, Issue 15 > **************************************** > From jason at cgt.duhs.duke.edu Tue Nov 25 13:24:21 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Tue Nov 25 13:30:40 2003 Subject: [BioPerl] Re: [Bioperl-l] Bio::Perl blast function not working? In-Reply-To: <1069780954.1713.47.camel@localhost.localdomain> References: <1069777793.1713.22.camel@localhost.localdomain> <1069780954.1713.47.camel@localhost.localdomain> Message-ID: You may need to use bioperl 1.3.x - as per the other posts which describe the problems - 1.2.x did not handle NCBI WebBlast changes which appended stuff to the RID line. Or just change the regexep in Tools::Run::RemoteBlast as others have described. 
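
A minimal sketch of the regexp change discussed in this thread. It assumes the submission response carries a line of the form "RID = <id>"; that layout, the sample strings, and the helper name extract_rid are assumptions for illustration and are not taken from the Bio::Tools::Run::RemoteBlast source. The idea is only that capturing with \S+ keeps whatever suffix NCBI appends to the RID:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): pull the request ID out of
# the text returned after submitting a search. The "RID = ..." layout
# is an assumption made for this example.
sub extract_rid {
    my ($response) = @_;
    # \S+ takes everything up to the next whitespace, so an appended
    # suffix such as ".BLASTQ3" stays attached to the numeric part.
    return $response =~ /^\s*RID\s*=\s*(\S+)/m ? $1 : undef;
}

# Sample responses, made up for illustration.
my $old_style = "RID = 1069780799-8428-65894659018\nRTOE = 10\n";
my $new_style = "RID = 1069780799-8428-65894659018.BLASTQ3\nRTOE = 10\n";

print extract_rid($old_style), "\n";
print extract_rid($new_style), "\n";
```

A pattern that only accepts digits and dashes would drop the suffix, and polling with the shortened RID is one plausible way to end up with the "Results for RID ... not found" message quoted above.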
-jason On Tue, 25 Nov 2003, Mark Wilkinson wrote: > Yup, so long as i know that it is still a going concern I will have a > look at it myself to see if i can track it down. > > The bug can be reproduced as follows: > > use Bio::Perl; > > $seq = get_sequence('swissprot', 'P35632'); > $blast = blast_sequence($seq); > write_blast(">mySequenceBlast", $blast); > > > results in: > > Submitted Blast for [AP3_ARATH] > -------------------- WARNING --------------------- > MSG: > >

>


ERROR: Results for RID 1069780799-8428-65894659018 > not found
> > > --------------------------------------------------- > > > I'll have a go at stepping through the code later today after I finish > writing this lesson. > > Cheers all! > > M > > > On Tue, 2003-11-25 at 10:49, Jason Stajich wrote: > > Error messages, your own debugging efforts are always helpful here too... > > > > -jason > > On Tue, 25 Nov 2003, Mark Wilkinson wrote: > > > > > Hi all, > > > > > > I'm teaching a course on BioPerl next week and I'm putting the > > > "beginners" lesson together using the Bio::Perl module. I notice that > > > the blast_sequence function is not working anymore... is this just a > > > temporary glitch, or is this function not supported these days? > > > > > > Any advice appreciated, > > > > > > M > > > > > > > > > > > > > -- > > Jason Stajich > > Duke University > > jason at cgt.mc.duke.edu > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l@portal.open-bio.org > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich Duke University jason at cgt.mc.duke.edu From jason at cgt.duhs.duke.edu Tue Nov 25 14:06:54 2003 From: jason at cgt.duhs.duke.edu (Jason Stajich) Date: Tue Nov 25 14:13:15 2003 Subject: [Bioperl-l] Developers: what do you want to fix before 1.4 release? In-Reply-To: <200311251722.24333.heikki@ebi.ac.uk> References: <200311241022.54362.heikki@ebi.ac.uk> <200311251722.24333.heikki@ebi.ac.uk> Message-ID: > P.S.1. Other open bugs related to SearchIO are: 1416, 1418, 1468, 1558 Happy pre-Thanksgiving .... 1558 - Fixed. 1416 - Fixed. good enough, I can't reproduce it 1418 - Fixed. I think it works, I have started just using Guy's GFF output but I think it all matches up with parse from his VULGAR strings Not quite there: 1468 - It seems to work okay now. Everything except printing the Statistics in the BLAST way. Something that can be done without too much work I think. > > P.S.2. Also thanks for spotting the wrong URL. 
> > > On Tuesday 25 Nov 2003 TT:10, Steve Chervitz wrote: > > On Nov 24, 2003, at 2:04 PM, Jason Stajich wrote: > > > I am fixing up blast parsing to get the gapped, ungapped, and > > > frame-specific lambda,kappa, entropy from the reports for those who > > > want > > > this. I will also try and make sure we are getting more of the > > > parameters > > > from the reports into the Result objects. > > > > > > Presumably we had talked about remove SearchIO/psiblast.pm from 1.4 > > > release - not sure if that is still a good idea. > > > > I'm fine with this. There are a few example scripts that are pointing > > at psiblast.pm. We'll have to make sure they work with blast.pm. > > > > SearchIO/psiblast.pm depends on some modules in Bio::Tools that can > > probably also go away (Bio::Tools::StateMachine::*) since these modules > > aren't being used by any other Bioperl modules. There may be external > > code that depends on them, but not too likely. If there's interest, I > > could contribute them separately to CPAN. > > > > Steve > > > > > I also have some more work on the PopGen objects to make sure all the > > > numbers are correct for some of the implemented tests. > > > > > > -jason > > > > > > On Mon, 24 Nov 2003, Heikki Lehvaslaiho wrote: > > >> With developer snapshot 1.3.03 out last Friday, I'd like to hear from > > >> bioperl > > >> developers what they'd like to accomplish before 1.4 comes out. > > >> > > >> At this stage, I'd rather not see major new features added; there > > >> will not be > > >> time to test them properly. > > >> > > >> I'd especially like to hear from > > >> * Lincoln Stein / Todd Harris about the SVG timetable > > >> * Stefan Kirov about Bio::Matrix::PSM modules > > >> * anyone else about their pet modules and bugs > > >> > > >> I'll give you three choices: > > >> > > >> 1. You've got a fix/enhancement under way and you can give me > > >> timetable > > >> for it to be in. > > >> > > >> 2. 
There is a bug that need to be fixed before the release but you do
> > >> not know how to do it. Let's discuss that now!
> > >>
> > >> 3. Your modules are not ready for release and you do not have time to
> > >> work on your code. That means that they will need to be excluded from
> > >> the release.
> > >>
> > >> I am pretty happy with the current status of bioperl CVS tree. Tests
> > >> pass, but there still are bugs in bugzilla. I urge anyone and everyone
> > >> to have a look at the bugzilla and see what they think of the bugs
> > >> still there.
> > >>
> > >> Realistically, we have couple of weeks time to get the fixes in. Then
> > >> a week to wait to make sure everyone has time to make sure that
> > >> everything works.
> > >>
> > >> -Heikki "Let's push 1.4 out"
> > >
> > > --
> > > Jason Stajich
> > > Duke University
> > > jason at cgt.mc.duke.edu
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From heikki at ebi.ac.uk Tue Nov 25 14:09:57 2003
From: heikki at ebi.ac.uk (Heikki Lehvaslaiho)
Date: Tue Nov 25 14:17:05 2003
Subject: [Bioperl-l] Bio::Restriction issues
Message-ID: <200311251909.57607.heikki@ebi.ac.uk>

Hi Rob,

Are you working on the Bio::Restriction bugs? It would be great to patch
the Analysis module up for the 1.4 release. I had a look, but I do not want
to duplicate what you are doing. Here are my current thoughts:

The main concerns seem to be:

1. MultiSite and MultiCut are not taken into account.
2. Sites around the start/end of circular molecules are cut.

Both problems have to do with _new_cuts(). (_cuts() is now obsolete
and could be removed?)

1. We have to change the IO modules so that only one Enzyme object is
   added into EnzymeCollection (the other versions can be accessed using
   others()). _new_cuts() then checks the enzyme type using isa() and
   loops over the sequence for each Enzyme object. Instead of producing an
   array of fragments, it creates an array of cut sites, and every Enzyme
   cuts the original sequence. In the end the cut site array is sorted and
   the fragments are created from the original sequence.

2. Circular molecules need to be tested before cutting. To allow cuts
   across the start/end boundary, the cut site needs to be analysed for
   each enzyme and the correct number of bases needs to be added to the
   end (?) of the string from the start. When cut site locations are
   converted into fragments, the locations are compared to the length of
   the original sequence and placed appropriately.

It looks like the _new_cuts() code needs to be split into smaller methods
to make its logic easier to follow.

What do you think?

-Heikki

--
Heikki Lehvaslaiho          heikki_at_ebi ac uk
http://www.ebi.ac.uk/mutations/
EMBL Outstation, European Bioinformatics Institute
Wellcome Trust Genome Campus, Hinxton
Cambs. CB10 1SD, United Kingdom
Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468

From heikki at ebi.ac.uk Tue Nov 25 14:29:21 2003
From: heikki at ebi.ac.uk (Heikki Lehvaslaiho)
Date: Tue Nov 25 14:36:02 2003
Subject: [Bioperl-l] Developers: what do you want to fix before 1.4 release?
In-Reply-To:
References: <200311241022.54362.heikki@ebi.ac.uk> <200311251722.24333.heikki@ebi.ac.uk>
Message-ID: <200311251929.21244.heikki@ebi.ac.uk>

Jason,

If everyone were to rise to a challenge like you...!

Thanks,

-Heikki

On Tuesday 25 Nov 2003 TT:06, Jason Stajich wrote:
> > P.S.1. Other open bugs related to SearchIO are: 1416, 1418, 1468, 1558
>
> Happy pre-Thanksgiving ....
>
> 1558 - Fixed.
> 1416 - Fixed. good enough, I can't reproduce it
> 1418 - Fixed.
I think it works, I have started just using Guy's GFF output > but I think it all matches up with parse from his VULGAR > strings > > Not quite there: > 1468 - It seems to work okay now. Everything except printing the > Statistics in the BLAST way. > Something that can be done without too much work I think. > > > P.S.2. Also thanks for spotting the wrong URL. > > > > On Tuesday 25 Nov 2003 TT:10, Steve Chervitz wrote: > > > On Nov 24, 2003, at 2:04 PM, Jason Stajich wrote: > > > > I am fixing up blast parsing to get the gapped, ungapped, and > > > > frame-specific lambda,kappa, entropy from the reports for those who > > > > want > > > > this. I will also try and make sure we are getting more of the > > > > parameters > > > > from the reports into the Result objects. > > > > > > > > Presumably we had talked about remove SearchIO/psiblast.pm from 1.4 > > > > release - not sure if that is still a good idea. > > > > > > I'm fine with this. There are a few example scripts that are pointing > > > at psiblast.pm. We'll have to make sure they work with blast.pm. > > > > > > SearchIO/psiblast.pm depends on some modules in Bio::Tools that can > > > probably also go away (Bio::Tools::StateMachine::*) since these modules > > > aren't being used by any other Bioperl modules. There may be external > > > code that depends on them, but not too likely. If there's interest, I > > > could contribute them separately to CPAN. > > > > > > Steve > > > > > > > I also have some more work on the PopGen objects to make sure all the > > > > numbers are correct for some of the implemented tests. > > > > > > > > -jason > > > > > > > > On Mon, 24 Nov 2003, Heikki Lehvaslaiho wrote: > > > >> With developer snapshot 1.3.03 out last Friday, I'd like to hear > > > >> from bioperl > > > >> developers what they'd like to accomplish before 1.4 comes out. > > > >> > > > >> At this stage, I'd rather not see major new features added; there > > > >> will not be > > > >> time to test them properly. 
> > > >> > > > >> I'd especially like to hear from > > > >> * Lincoln Stein / Todd Harris about the SVG timetable > > > >> * Stefan Kirov about Bio::Matrix::PSM modules > > > >> * anyone else about their pet modules and bugs > > > >> > > > >> I'll give you three choices: > > > >> > > > >> 1. You've got a fix/enhancement under way and you can give me > > > >> timetable > > > >> for it to be in. > > > >> > > > >> 2. There is a bug that need to be fixed before the release but you > > > >> do not know how to do it. Let's discuss that now! > > > >> > > > >> 3. Your modules are not ready for release and you do not have time > > > >> to work on your code. That means that they will need to be excluded > > > >> from the release. > > > >> > > > >> I am pretty happy with the current status of bioperl CVS tree. Tests > > > >> pass, but there still are bugs in bugzilla. I urge anyone and > > > >> everyone to have a look at the bugzilla and see what they think of > > > >> the bugs still > > > >> there. > > > >> > > > >> Realistically, we have couple of weeks time to get the fixes in. > > > >> Then a > > > >> week to wait to make sure everyone has time to make sure that > > > >> everything > > > >> works. 
> > > >> > > > >> -Heikki "Let's push 1.4 out" > > > > > > > > -- > > > > Jason Stajich > > > > Duke University > > > > jason at cgt.mc.duke.edu > > > > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l@portal.open-bio.org > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. CB10 1SD, United Kingdom _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468 ___ _/_/_/_/_/________________________________________________________ From markw at illuminae.com Tue Nov 25 14:54:37 2003 From: markw at illuminae.com (Mark Wilkinson) Date: Tue Nov 25 15:01:10 2003 Subject: [BioPerl] Re: [Bioperl-l] Bio::Perl blast function not working? In-Reply-To: References: <1069777793.1713.22.camel@localhost.localdomain> <1069780954.1713.47.camel@localhost.localdomain> Message-ID: <1069790077.1713.71.camel@localhost.localdomain> Yup - switching to the CVS version solved the problem. I'll just change my lesson to teach the students how to use CVS first, and then get them to install BioPerl from CVS instead of CPAN or downloading the latest dist. Cheers! Mark On Tue, 2003-11-25 at 12:24, Jason Stajich wrote: > You may need to use bioperl 1.3.x - as per the other posts which describe > the problems - 1.2.x did not handle NCBI WebBlast changes which appended > stuff to the RID line. Or just change the regexep in > Tools::Run::RemoteBlast as others have described. 
> > -jason > > On Tue, 25 Nov 2003, Mark Wilkinson wrote: > > > Yup, so long as i know that it is still a going concern I will have a > > look at it myself to see if i can track it down. > > > > The bug can be reproduced as follows: > > > > use Bio::Perl; > > > > $seq = get_sequence('swissprot', 'P35632'); > > $blast = blast_sequence($seq); > > write_blast(">mySequenceBlast", $blast); > > > > > > results in: > > > > Submitted Blast for [AP3_ARATH] > > -------------------- WARNING --------------------- > > MSG: > > > >

> >


ERROR: Results for RID 1069780799-8428-65894659018 > > not found
> > > > > > --------------------------------------------------- > > > > > > I'll have a go at stepping through the code later today after I finish > > writing this lesson. > > > > Cheers all! > > > > M > > > > > > On Tue, 2003-11-25 at 10:49, Jason Stajich wrote: > > > Error messages, your own debugging efforts are always helpful here too... > > > > > > -jason > > > On Tue, 25 Nov 2003, Mark Wilkinson wrote: > > > > > > > Hi all, > > > > > > > > I'm teaching a course on BioPerl next week and I'm putting the > > > > "beginners" lesson together using the Bio::Perl module. I notice that > > > > the blast_sequence function is not working anymore... is this just a > > > > temporary glitch, or is this function not supported these days? > > > > > > > > Any advice appreciated, > > > > > > > > M > > > > > > > > > > > > > > > > > > -- > > > Jason Stajich > > > Duke University > > > jason at cgt.mc.duke.edu > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l@portal.open-bio.org > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > Jason Stajich > Duke University > jason at cgt.mc.duke.edu -- Mark Wilkinson Illuminae From wes.barris at csiro.au Tue Nov 25 19:16:28 2003 From: wes.barris at csiro.au (Wes Barris) Date: Tue Nov 25 19:23:11 2003 Subject: [Bioperl-l] AUTHORS reference field in genbank file? Message-ID: <3FC3F0DC.1040200@csiro.au> Hi, A typical genbank entry contains this reference section: REFERENCE 1 (bases 1 to 561) AUTHORS DeSilva,U., Franklin,I.R., Maddox,J.F., van Hest,B. and Adelson,D.L. TITLE Gene Expression in Sheep Skin and Wool (Hair) JOURNAL Cytogenet. Genome Res. (2003) In press How does one go about getting at the AUTHORS section? For each sequence, I have tried this: my $ac = $seq->annotation; my @values = $ac->get_Annotations('reference'); foreach my $value (@values) { print($value->as_text,"\n"); } The problem is, this only displays the REFERENCE->TITLE portion. 
I want the REFERENCE->AUTHORS.

--
Wes Barris
E-Mail: Wes.Barris@csiro.au

From jason at cgt.duhs.duke.edu Tue Nov 25 19:48:05 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Tue Nov 25 19:54:23 2003
Subject: [Bioperl-l] AUTHORS reference field in genbank file?
In-Reply-To: <3FC3F0DC.1040200@csiro.au>
References: <3FC3F0DC.1040200@csiro.au>
Message-ID:

See the documentation for Bio::Annotation::Reference:

http://doc.bioperl.org/releases/bioperl-1.2.3/Bio/Annotation/Reference.html

Each $value in your loop is one of these objects, so you can call
$value->title, $value->authors, $value->location, ...

-jason

On Wed, 26 Nov 2003, Wes Barris wrote:

> Hi,
>
> A typical genbank entry contains this reference section:
>
> REFERENCE   1  (bases 1 to 561)
>   AUTHORS   DeSilva,U., Franklin,I.R., Maddox,J.F., van Hest,B. and
>             Adelson,D.L.
>   TITLE     Gene Expression in Sheep Skin and Wool (Hair)
>   JOURNAL   Cytogenet. Genome Res. (2003) In press
>
> How does one go about getting at the AUTHORS section? For each sequence,
> I have tried this:
>
>    my $ac = $seq->annotation;
>    my @values = $ac->get_Annotations('reference');
>    foreach my $value (@values) {
>       print($value->as_text,"\n");
>    }
>
> The problem is, this only displays the REFERENCE->TITLE portion.
> I want the REFERENCE->AUTHORS.
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From jaudall at iastate.edu Tue Nov 25 22:31:30 2003
From: jaudall at iastate.edu (Joshua A Udall)
Date: Tue Nov 25 22:40:37 2003
Subject: [Bioperl-l] AUTHORS reference field in genbank file?
In-Reply-To: <3FC3F0DC.1040200@csiro.au>
References: <3FC3F0DC.1040200@csiro.au>
Message-ID: <6.0.0.22.2.20031125213004.01cce918@jaudall.mail.iastate.edu>

Try:

   my ($ref1) = $seq->annotation->get_Annotations('reference');
   print $ref1->authors,"\n";

Josh

At 06:16 PM 11/25/2003, you wrote:
>Hi,
>
>A typical genbank entry contains this reference section:
>
>REFERENCE   1  (bases 1 to 561)
>  AUTHORS   DeSilva,U., Franklin,I.R., Maddox,J.F., van Hest,B. and
>            Adelson,D.L.
> TITLE Gene Expression in Sheep Skin and Wool (Hair) > JOURNAL Cytogenet. Genome Res. (2003) In press > >How does one go about getting at the AUTHORS section? For each sequence, >I have tried this: > > my $ac = $seq->annotation; > my @values = $ac->get_Annotations('reference'); > foreach my $value (@values) { > print($value->as_text,"\n"); > } > >The problem is, this only displays the REFERENCE->TITLE portion. >I want the REFERENCE->AUTHORS. >-- >Wes Barris >E-Mail: Wes.Barris@csiro.au > > >_______________________________________________ >Bioperl-l mailing list >Bioperl-l@portal.open-bio.org >http://portal.open-bio.org/mailman/listinfo/bioperl-l Joshua Udall Department of Ecology, Evolution, and Organismal Biology Iowa State University Ames, IA 50011 Ph: (515) 294-7098 Fax: (515) 294-1337 From fangl at genomics.org.cn Tue Nov 25 23:51:10 2003 From: fangl at genomics.org.cn (Magic Fang) Date: Tue Nov 25 23:57:50 2003 Subject: [Bioperl-l] about the Bio::SeqIO::staden::read Message-ID: <3FC4313E.4010905@genomics.org.cn> hi mackey, i want to install Bio::SeqIO::staden::read module, and got error messages when compiled it. ... /usr/local/genome/include/io_lib/os.h:9:1: warning: "INT_MAX" redefined In file included from /usr/include/sys/param.h:107, from /usr/local/lib/perl5/5.6.1/mach/CORE/perl.h:497, from read.xs:2: /usr/include/sys/limits.h:62:1: warning: this is the location of the previous definition In file included from /usr/local/genome/include/io_lib/Read.h:43, from read.xs:5: /usr/local/genome/include/io_lib/os.h:10:1: warning: "SHRT_MAX" redefined In file included from /usr/include/sys/param.h:107, from /usr/local/lib/perl5/5.6.1/mach/CORE/perl.h:497, from read.xs:2: /usr/include/sys/limits.h:58:1: warning: this is the location of the previous definition In file included from /usr/local/genome/include/io_lib/Read.h:43, from read.xs:5: /usr/local/genome/include/io_lib/os.h:24:4: #error No 2-byte integer type found. 
/usr/local/genome/include/io_lib/os.h:34:4: #error No 4-byte integer type found. In file included from /usr/local/genome/include/io_lib/Read.h:43, from read.xs:5: /usr/local/genome/include/io_lib/os.h:40: syntax error before "int_2" /usr/local/genome/include/io_lib/os.h:40: warning: data definition has no type or storage class /usr/local/genome/include/io_lib/os.h:41: syntax error before "uint_2" /usr/local/genome/include/io_lib/os.h:41: warning: data definition has no type or storage class /usr/local/genome/include/io_lib/os.h:42: syntax error before "int_4" /usr/local/genome/include/io_lib/os.h:42: warning: data definition has no type or storage class /usr/local/genome/include/io_lib/os.h:43: syntax error before "uint_4" /usr/local/genome/include/io_lib/os.h:43: warning: data definition has no type or storage class /usr/local/genome/include/io_lib/os.h:46: syntax error before "f_int" /usr/local/genome/include/io_lib/os.h:46: warning: data definition has no type or storage class /usr/local/genome/include/io_lib/os.h:47: syntax error before "f_implicit" /usr/local/genome/include/io_lib/os.h:47: warning: data definition has no type or storage class /usr/local/genome/include/io_lib/os.h:51: syntax error before "int_f" /usr/local/genome/include/io_lib/os.h:51: warning: data definition has no type or storage class /usr/local/genome/include/io_lib/os.h:52: syntax error before "int_fl" /usr/local/genome/include/io_lib/os.h:52: warning: data definition has no type or storage class In file included from /usr/local/genome/include/io_lib/Read.h:44, from read.xs:5: /usr/local/genome/include/io_lib/scf.h:79: syntax error before "uint_4" /usr/local/genome/include/io_lib/scf.h:89: syntax error before "uint_4" /usr/local/genome/include/io_lib/scf.h:110: syntax error before "uint_2" /usr/local/genome/include/io_lib/scf.h:120: syntax error before "uint_4" /usr/local/genome/include/io_lib/scf.h:432: syntax error before "samples" In file included from read.xs:5: 
/usr/local/genome/include/io_lib/Read.h:93: syntax error before "TRACE"
/usr/local/genome/include/io_lib/Read.h:93: warning: data definition has no type or storage class
/usr/local/genome/include/io_lib/Read.h:104: syntax error before "TRACE"
/usr/local/genome/include/io_lib/Read.h:112: syntax error before "uint_2"
In file included from /usr/local/genome/include/io_lib/translate.h:20,
                 from /usr/local/genome/include/io_lib/Read.h:227,
...

I can compile it once I fix os.h from io_lib, but when I then use the
module it reports that Bio::SeqIO::staden::read is not installed correctly.

My system is FreeBSD 5.1. Any advice?

From heikki at nildram.co.uk Wed Nov 26 02:38:21 2003
From: heikki at nildram.co.uk (Heikki Lehvaslaiho)
Date: Wed Nov 26 02:45:42 2003
Subject: [BioPerl] Re: [Bioperl-l] Bio::Perl blast function not working?
In-Reply-To: <1069790077.1713.71.camel@localhost.localdomain>
References: <1069777793.1713.22.camel@localhost.localdomain> <1069790077.1713.71.camel@localhost.localdomain>
Message-ID: <200311260736.58475.heikki@ebi.ac.uk>

Mark,

Did you remember this:
http://www.ebi.ac.uk/~lehvasla/bioperl/
There is quite a detailed page about using anonymous CVS:
http://www.ebi.ac.uk/~lehvasla/bioperl/InstallingBioperl.html

We should put that on the bioperl website somewhere...

Are you going to make your materials available?

-Heikki

On Tuesday 25 Nov 2003 7:54 pm, Mark Wilkinson wrote:
> Yup - switching to the CVS version solved the problem. I'll just change
> my lesson to teach the students how to use CVS first, and then get them
> to install BioPerl from CVS instead of CPAN or downloading the latest
> dist.
>
> Cheers!
>
> Mark
>
> On Tue, 2003-11-25 at 12:24, Jason Stajich wrote:
> > You may need to use bioperl 1.3.x - as per the other posts which describe
> > the problems - 1.2.x did not handle NCBI WebBlast changes which appended
> > stuff to the RID line. Or just change the regexep in
> > Tools::Run::RemoteBlast as others have described.
> > > > -jason > > > > On Tue, 25 Nov 2003, Mark Wilkinson wrote: > > > Yup, so long as i know that it is still a going concern I will have a > > > look at it myself to see if i can track it down. > > > > > > The bug can be reproduced as follows: > > > > > > use Bio::Perl; > > > > > > $seq = get_sequence('swissprot', 'P35632'); > > > $blast = blast_sequence($seq); > > > write_blast(">mySequenceBlast", $blast); > > > > > > > > > results in: > > > > > > Submitted Blast for [AP3_ARATH] > > > -------------------- WARNING --------------------- > > > MSG: > > > > > >

> > >


ERROR: Results for RID > > > 1069780799-8428-65894659018 not found
> > > > > > > > > --------------------------------------------------- > > > > > > > > > I'll have a go at stepping through the code later today after I finish > > > writing this lesson. > > > > > > Cheers all! > > > > > > M > > > > > > On Tue, 2003-11-25 at 10:49, Jason Stajich wrote: > > > > Error messages, your own debugging efforts are always helpful here > > > > too... > > > > > > > > -jason > > > > > > > > On Tue, 25 Nov 2003, Mark Wilkinson wrote: > > > > > Hi all, > > > > > > > > > > I'm teaching a course on BioPerl next week and I'm putting the > > > > > "beginners" lesson together using the Bio::Perl module. I notice > > > > > that the blast_sequence function is not working anymore... is this > > > > > just a temporary glitch, or is this function not supported these > > > > > days? > > > > > > > > > > Any advice appreciated, > > > > > > > > > > M > > > > > > > > -- > > > > Jason Stajich > > > > Duke University > > > > jason at cgt.mc.duke.edu > > > > _______________________________________________ > > > > Bioperl-l mailing list > > > > Bioperl-l@portal.open-bio.org > > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > > Jason Stajich > > Duke University > > jason at cgt.mc.duke.edu -- ______ _/ _/_____________________________________________________ _/ _/ http://www.ebi.ac.uk/mutations/ _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute _/ _/ _/ Wellcome Trust Genome Campus, Hinxton _/ _/ _/ Cambs. CB10 1SD, United Kingdom _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468 ___ _/_/_/_/_/________________________________________________________ From markw at illuminae.com Wed Nov 26 06:38:14 2003 From: markw at illuminae.com (Mark Wilkinson) Date: Wed Nov 26 06:44:52 2003 Subject: [BioPerl] Re: [Bioperl-l] Bio::Perl blast function not working? 
In-Reply-To: <200311260736.58475.heikki@ebi.ac.uk> References: <1069777793.1713.22.camel@localhost.localdomain> <1069790077.1713.71.camel@localhost.localdomain> <200311260736.58475.heikki@ebi.ac.uk> Message-ID: <1069846693.6281.1.camel@localhost.localdomain> Hey Heikki! Those are great links! Thanks for the heads-up! If I produce anything "useful" I will certainly make it public. We'll see how the lessons go. It is going to be a mixture of the SeqHound API and the BioPerl API, so it isn't 100% BP... M On Wed, 2003-11-26 at 01:38, Heikki Lehvaslaiho wrote: > Mark, > > Did you remember this: > http://www.ebi.ac.uk/~lehvasla/bioperl/ > there is a quite detailed page about using anonymous cvs: > http://www.ebi.ac.uk/~lehvasla/bioperl/InstallingBioperl.html > > We should put that into bioperl website somewhere... > > Are you going to make your materials available? > > -Heikki > > On Tuesday 25 Nov 2003 7:54 pm, Mark Wilkinson wrote: > > Yup - switching to the CVS version solved the problem. I'll just change > > my lesson to teach the students how to use CVS first, and then get them > > to install BioPerl from CVS instead of CPAN or downloading the latest > > dist. > > > > Cheers! > > > > Mark > > > > On Tue, 2003-11-25 at 12:24, Jason Stajich wrote: > > > You may need to use bioperl 1.3.x - as per the other posts which describe > > > the problems - 1.2.x did not handle NCBI WebBlast changes which appended > > > stuff to the RID line. Or just change the regexep in > > > Tools::Run::RemoteBlast as others have described. > > > > > > -jason > > > > > > On Tue, 25 Nov 2003, Mark Wilkinson wrote: > > > > Yup, so long as i know that it is still a going concern I will have a > > > > look at it myself to see if i can track it down. 
> > > >
> > > > The bug can be reproduced as follows:
> > > >
> > > > use Bio::Perl;
> > > >
> > > > $seq = get_sequence('swissprot', 'P35632');
> > > > $blast = blast_sequence($seq);
> > > > write_blast(">mySequenceBlast", $blast);
> > > >
> > > >
> > > > results in:
> > > >
> > > > Submitted Blast for [AP3_ARATH]
> > > > -------------------- WARNING ---------------------
> > > > MSG:
> > > >
> > > > ERROR: Results for RID 1069780799-8428-65894659018 not found
> > > >
> > > > ---------------------------------------------------
> > > >
> > > > I'll have a go at stepping through the code later today after I finish
> > > > writing this lesson.
> > > >
> > > > Cheers all!
> > > >
> > > > M
> > > >
> > > > On Tue, 2003-11-25 at 10:49, Jason Stajich wrote:
> > > > > Error messages, your own debugging efforts are always helpful here
> > > > > too...
> > > > >
> > > > > -jason
> > > > >
> > > > > On Tue, 25 Nov 2003, Mark Wilkinson wrote:
> > > > > > Hi all,
> > > > > >
> > > > > > I'm teaching a course on BioPerl next week and I'm putting the
> > > > > > "beginners" lesson together using the Bio::Perl module. I notice
> > > > > > that the blast_sequence function is not working anymore... is this
> > > > > > just a temporary glitch, or is this function not supported these
> > > > > > days?
> > > > > >
> > > > > > Any advice appreciated,
> > > > > >
> > > > > > M
> > > > >
> > > > > --
> > > > > Jason Stajich
> > > > > Duke University
> > > > > jason at cgt.mc.duke.edu
> > > > > _______________________________________________
> > > > > Bioperl-l mailing list
> > > > > Bioperl-l@portal.open-bio.org
> > > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > >
> > > --
> > > Jason Stajich
> > > Duke University
> > > jason at cgt.mc.duke.edu
--
Mark Wilkinson
Illuminae

From faruque at ebi.ac.uk Thu Nov 27 06:04:13 2003
From: faruque at ebi.ac.uk (Nadeem Faruque)
Date: Thu Nov 27 06:11:21 2003
Subject: [Bioperl-l] Bio::SeqIO Genbank + EMBL unquoted values
Message-ID: <3FC5DA2D.1060501@ebi.ac.uk>

Prompted by a genome submitter who had used BioPerl, I wondered why he
couldn't get BioPerl to write out unquoted evidence qualifier values.

Maybe I've got this wrong, but I think that the feature table writing
functions are oversimplified on this point:-

In sub _print_EMBL_FTHelper (and sub _print_GenBank_FTHelper)
it appears to assume that only qualifier values that are just numbers
are unquoted:-
    ...
    elsif( $always_quote == 1 || $value !~ /^\d+$/ ) {
        my $pat = $value =~ /\s/ ? '\s|$' : '.|$';
        $self->_write_line_EMBL_regex("FT ",
                                      "FT ",
                                      "/$tag=\"$value\"",$pat,80);
    }
    else {
        $self->_write_line_EMBL_regex("FT ",
                                      "FT ",
                                      "/$tag=$value",'.|$',80); #'

Each of the following qualifiers accepts a non-numeric single token that
should be unquoted:-
/direction=left, right, or both
/estimated_length=unknown   though an actual number will be accepted next year
/evidence=experimental or not_experimental
/label=***   single token used to permanently tag a feature
             (for use within EMBL, eg for joins that span entries.
             External use not advised)
/mod_base=m5c   for example, the abbreviation for a modified nucleotide base
/number=1e   for example, a single token used as an exon/intron number
             (should be a number but exon numbering is more chaotic than that)
/rpt_type=tandem, inverted, flanking, terminal, direct, dispersed, and other
/rpt_unit   can either accept quoted text (/rpt_unit="aagggc" )
            or a location value (/rpt_unit=202..245 )

NB The other qualifiers that are unusual are:-
/anticodon=(pos:***,aa:***)
/citation=[***]   - the number of the citation
/codon=(seq:"***", aa:***)
/cons_splice=(5'site:***, 3'site:***)
/transl_except=(pos:***,aa:***)
/usedin=***:***   - like /label, this shouldn't really be created externally.


Further details are available in the feature table document or at


Nadeem
--
S.M. Nadeem N. Faruque          EMBL Nucleotide Database Curation Team
EMBL Outstation                 Tel: +44 1799 494611 Fax: +44 1799 494472
The European Bioinformatics Institute       URL: http://www.ebi.ac.uk/
Email for data submissions: datasubs@ebi.ac.uk
Email for updates:          update@ebi.ac.uk
=============================================================================

From llukens at uoguelph.ca Thu Nov 27 11:39:51 2003
From: llukens at uoguelph.ca (Lewis Lukens)
Date: Thu Nov 27 11:45:45 2003
Subject: [Bioperl-l] newick to nexus
Message-ID:

Hi,

I am trying to convert tree files from newick to nexus format.
It seems that one can do this (using the example script from
bio::factory::treefactoryI) but I have been unable to.

My code is:

#!/usr/bin/perl
use Bio::TreeIO;
my $treeio= new Bio::TreeIO ('-format'=> 'newick', '-file'=>'bpoistree');
my $treeout= new Bio::TreeIO ('-format'=> 'nexus', '-file'=>'>outfile');

while(my $tree = $treeio->next_tree) {
  $treeout->write_tree($treeout);
}


The newick tree is fine, but the "nexus" module is missing, and I get
the following error:

Bio::TreeIO: nexus cannot be found
Exception
------------- EXCEPTION -------------
MSG: Failed to load module Bio::TreeIO::nexus. Can't locate
Bio/TreeIO/nexus.pm in @INC (@INC contains:
/System/Library/Perl/darwin /System/Library/Perl /Library/Perl/darwin
/Library/Perl /Library/Perl /Network/Library/Perl/darwin
/Network/Library/Perl /Network/Library/Perl .) at
/Library/Perl/Bio/Root/Root.pm line 407.

STACK Bio::Root::Root::_load_module /Library/Perl/Bio/Root/Root.pm:409
STACK (eval) /Library/Perl/Bio/TreeIO.pm:221
STACK Bio::TreeIO::_load_format_module /Library/Perl/Bio/TreeIO.pm:220
STACK Bio::TreeIO::new /Library/Perl/Bio/TreeIO.pm:121
STACK toplevel treeswitch.pl:4

Is there another way to do this in bioperl?

Thanks much,
Lewis

P.S. Please respond to the address above, as I am not a subscriber to
the mailing list.
--
Lewis Lukens
Assistant Professor
Department of Plant Agriculture
Univ. of Guelph, Guelph, Ontario. N1G 2W1

Tel: (519) 824- 4120 ext 52304

From redwards at utmem.edu Thu Nov 27 11:56:16 2003
From: redwards at utmem.edu (Rob Edwards)
Date: Thu Nov 27 12:02:42 2003
Subject: [Bioperl-l] newick to nexus
In-Reply-To:
Message-ID: <9D904E86-20FA-11D8-AF64-000A959E1622@utmem.edu>

You need to get a more recent version of bioperl. The nexus module was
not in earlier versions. As you don't have the file Bio/TreeIO/nexus.pm
you need to update bioperl.

Rob

On Thursday, November 27, 2003, at 10:39 AM, Lewis Lukens wrote:

> Hi,
>
> I am trying to convert tree files from newick to nexus format.
> It seems that one can do this (using the example script from
> bio::factory::treefactoryI) but I have been unable to.
>
> My code is:
>
> #!/usr/bin/perl
> use Bio::TreeIO;
> my $treeio= new Bio::TreeIO ('-format'=> 'newick',
> '-file'=>'bpoistree');
> my $treeout= new Bio::TreeIO ('-format'=> 'nexus',
> '-file'=>'>outfile');
>
> while(my $tree = $treeio->next_tree) {
>   $treeout->write_tree($treeout);
> }
>
>
> The newick tree is fine, but the "nexus" module is missing, and I get
> the following error:
>
> Bio::TreeIO: nexus cannot be found
> Exception
> ------------- EXCEPTION -------------
> MSG: Failed to load module Bio::TreeIO::nexus. Can't locate
> Bio/TreeIO/nexus.pm in @INC (@INC contains:
> /System/Library/Perl/darwin /System/Library/Perl /Library/Perl/darwin
> /Library/Perl /Library/Perl /Network/Library/Perl/darwin
> /Network/Library/Perl /Network/Library/Perl .) at
> /Library/Perl/Bio/Root/Root.pm line 407.
>
> STACK Bio::Root::Root::_load_module /Library/Perl/Bio/Root/Root.pm:409
> STACK (eval) /Library/Perl/Bio/TreeIO.pm:221
> STACK Bio::TreeIO::_load_format_module /Library/Perl/Bio/TreeIO.pm:220
> STACK Bio::TreeIO::new /Library/Perl/Bio/TreeIO.pm:121
> STACK toplevel treeswitch.pl:4
>
> Is there another way to do this in bioperl?
>
> Thanks much,
> Lewis
>
> P.S. Please respond to the address above, as I am not a subscriber to
> the mailing list.
> --
> Lewis Lukens
> Assistant Professor
> Department of Plant Agriculture
> Univ. of Guelph, Guelph, Ontario. N1G 2W1
>
> Tel: (519) 824- 4120 ext 52304
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

From shawnh at stanford.edu Thu Nov 27 12:01:02 2003
From: shawnh at stanford.edu (Shawn Hoon)
Date: Thu Nov 27 12:04:43 2003
Subject: [Bioperl-l] newick to nexus
In-Reply-To:
References:
Message-ID: <47D74F91-20FB-11D8-B12C-000A95783436@stanford.edu>

On Thursday, November 27, 2003, at 8:39AM, Lewis Lukens wrote:

> Hi,
>
> I am trying to convert tree files from newick to nexus format. It
> seems that one can do this (using the example script from
> bio::factory::treefactoryI) but I have been unable to.
>
> My code is:
>
> #!/usr/bin/perl
> use Bio::TreeIO;
> my $treeio= new Bio::TreeIO ('-format'=> 'newick',
> '-file'=>'bpoistree');
> my $treeout= new Bio::TreeIO ('-format'=> 'nexus',
> '-file'=>'>outfile');
>
> while(my $tree = $treeio->next_tree) {
>   $treeout->write_tree($treeout);
> }
>

I think you wanna do
$treeout->write_tree($tree)
but that's probably just a typo.

>
> The newick tree is fine, but the "nexus" module is missing, and I get
> the following error:
>
> Bio::TreeIO: nexus cannot be found
> Exception
> ------------- EXCEPTION -------------
> MSG: Failed to load module Bio::TreeIO::nexus. Can't locate
> Bio/TreeIO/nexus.pm in @INC (@INC contains:
> /System/Library/Perl/darwin /System/Library/Perl /Library/Perl/darwin
> /Library/Perl /Library/Perl /Network/Library/Perl/darwin
> /Network/Library/Perl /Network/Library/Perl .) at
> /Library/Perl/Bio/Root/Root.pm line 407.
>

Jason added nexus 2 months ago so probably you don't have it. Check
that you have Bio/TreeIO/nexus.pm; if not, do a cvs update or download
a newer bioperl package.
(I checked mine and I didn't have it)


cheers,

shawn

> STACK Bio::Root::Root::_load_module /Library/Perl/Bio/Root/Root.pm:409
> STACK (eval) /Library/Perl/Bio/TreeIO.pm:221
> STACK Bio::TreeIO::_load_format_module /Library/Perl/Bio/TreeIO.pm:220
> STACK Bio::TreeIO::new /Library/Perl/Bio/TreeIO.pm:121
> STACK toplevel treeswitch.pl:4
>
> Is there another way to do this in bioperl?
>
> Thanks much,
> Lewis
>
> P.S. Please respond to the address above, as I am not a subscriber to
> the mailing list.
> --
> Lewis Lukens
> Assistant Professor
> Department of Plant Agriculture
> Univ. of Guelph, Guelph, Ontario. N1G 2W1
>
> Tel: (519) 824- 4120 ext 52304
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

From jason at cgt.duhs.duke.edu Thu Nov 27 12:03:04 2003
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Thu Nov 27 12:09:22 2003
Subject: [Bioperl-l] newick to nexus
In-Reply-To:
References:
Message-ID:

The TreeIO::nexus parser is not in bioperl 1.2.x but only in bioperl 1.3.x
so you'd need to get the latest code from the dev releases to achieve
this.

However, all that said I don't think I implemented nexus tree writing,
just nexus tree parsing at this time. Will be able to check this weekend
I expect.

-jason

On Thu, 27 Nov 2003, Lewis Lukens wrote:

> Hi,
>
> I am trying to convert tree files from newick to nexus format. It
> seems that one can do this (using the example script from
> bio::factory::treefactoryI) but I have been unable to.
>
> My code is:
>
> #!/usr/bin/perl
> use Bio::TreeIO;
> my $treeio= new Bio::TreeIO ('-format'=> 'newick', '-file'=>'bpoistree');
> my $treeout= new Bio::TreeIO ('-format'=> 'nexus', '-file'=>'>outfile');
>
> while(my $tree = $treeio->next_tree) {
>   $treeout->write_tree($treeout);
> }
>
>
> The newick tree is fine, but the "nexus" module is missing, and I get
> the following error:
>
> Bio::TreeIO: nexus cannot be found
> Exception
> ------------- EXCEPTION -------------
> MSG: Failed to load module Bio::TreeIO::nexus. Can't locate
> Bio/TreeIO/nexus.pm in @INC (@INC contains:
> /System/Library/Perl/darwin /System/Library/Perl /Library/Perl/darwin
> /Library/Perl /Library/Perl /Network/Library/Perl/darwin
> /Network/Library/Perl /Network/Library/Perl .) at
> /Library/Perl/Bio/Root/Root.pm line 407.
>
> STACK Bio::Root::Root::_load_module /Library/Perl/Bio/Root/Root.pm:409
> STACK (eval) /Library/Perl/Bio/TreeIO.pm:221
> STACK Bio::TreeIO::_load_format_module /Library/Perl/Bio/TreeIO.pm:220
> STACK Bio::TreeIO::new /Library/Perl/Bio/TreeIO.pm:121
> STACK toplevel treeswitch.pl:4
>
> Is there another way to do this in bioperl?
>
> Thanks much,
> Lewis
>
> P.S. Please respond to the address above, as I am not a subscriber to
> the mailing list.
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From heikki at ebi.ac.uk Thu Nov 27 12:41:19 2003
From: heikki at ebi.ac.uk (Heikki Lehvaslaiho)
Date: Thu Nov 27 12:47:48 2003
Subject: [Bioperl-l] Bio::SeqIO Genbank + EMBL unquoted values
In-Reply-To: <3FC5DA2D.1060501@ebi.ac.uk>
References: <3FC5DA2D.1060501@ebi.ac.uk>
Message-ID: <200311271741.19593.heikki@ebi.ac.uk>

Nadeem,

I fixed that only yesterday! Look into cvs. It was bioperl bugzilla bug
# 1516.

	-Heikki

On Thursday 27 Nov 2003 11:04 am, Nadeem Faruque wrote:
> Prompted by a genome submitter who had used BioPerl, I wondered why he
> couldn't get BioPerl to write out unquoted evidence qualifier values.
>
> Maybe I've got this wrong, but I think that the feature table writing
> functions are oversimplified on this point:-
>
> In sub _print_EMBL_FTHelper (and sub _print_GenBank_FTHelper)
> it appears to assume that only qualifier values that are just numbers
> are unquoted:-
>     ...
>     elsif( $always_quote == 1 || $value !~ /^\d+$/ ) {
>         my $pat = $value =~ /\s/ ? '\s|$' : '.|$';
>         $self->_write_line_EMBL_regex("FT ",
>                                       "FT ",
>                                       "/$tag=\"$value\"",$pat,80);
>     }
>     else {
>         $self->_write_line_EMBL_regex("FT ",
>                                       "FT ",
>                                       "/$tag=$value",'.|$',80); #'
>
>
>
> Each of the following qualifiers accepts a non-numeric single token that
> should be unquoted:-
> /direction=left, right, or both
> /estimated_length=unknown   though an actual number will be accepted next year
> /evidence=experimental or not_experimental
> /label=***   single token used to permanently tag a feature
>              (for use within EMBL, eg for joins that span entries.
>              External use not advised)
> /mod_base=m5c   for example, the abbreviation for a modified nucleotide base
> /number=1e   for example, a single token used as an exon/intron number
>              (should be a number but exon numbering is more chaotic than that)
> /rpt_type=tandem, inverted, flanking, terminal, direct, dispersed, and other
> /rpt_unit   can either accept quoted text (/rpt_unit="aagggc" )
>             or a location value (/rpt_unit=202..245 )
>
> NB The other qualifiers that are unusual are:-
> /anticodon=(pos:***,aa:***)
> /citation=[***]   - the number of the citation
> /codon=(seq:"***", aa:***)
> /cons_splice=(5'site:***, 3'site:***)
> /transl_except=(pos:***,aa:***)
> /usedin=***:***   - like /label, this shouldn't really be created
> externally.
>
>
> Further details are available in the feature table document or at
>
>
> Nadeem

--
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho          heikki_at_ebi ac uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________

From ak at ebi.ac.uk Fri Nov 28 11:46:45 2003
From: ak at ebi.ac.uk (Andreas Kahari)
Date: Fri Nov 28 11:53:14 2003
Subject: [Bioperl-l] ProServer, a pluggable DAS server, Bio::SeqIO support added
Message-ID: <20031128164645.GA27626@ebi.ac.uk>

Hi lists (sorry for the cross-posting),

This is for those of you who are interested in DAS but not aware of
ProServer:

ProServer is a DAS server implementation written in Perl by
Roger Pettett at the Sanger Institute, here outside Cambridge in
the UK. It builds on top of ideas from Tony Cox, also at the
Sanger Institute.
The point with ProServer is that it is pluggable, so that
any data source may be used as a source to serve DAS features
from, as long as there is a source adaptor and a transport module
for it. There are source adaptors already written for a number of
types of sources, and they are fairly easy to extend to other
types of sources or transports (I recently wrote a toy "wgetz"
transport module from the already existing "getz" module which
is used by the Swissprot source adaptor). Other DAS servers
often require you to create a dedicated database of DAS data.

I thought it might be of interest to a couple of you to note
that you now also can serve features or sequence data from any
type of file that Bio::SeqIO can read. This, of course, is only
of interest to people with a smallish amount of data since queries
are looked up sequentially in the files (unless the Bio::DB::Flat
support in the code is used, which reduces the lookup time but
which doesn't support all formats).

ProServer is part of the Bio-Das2 module in the biodas CVS
repository:

http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/Bio-Das2/?cvsroot=biodas


Cheers,
Andreas

--
|(--)| Andreas Kähäri                                        |<><>|
|-)(-| EMBL, European Bioinformatics Institute               |><><|
|(--)| Wellcome Trust Genome Campus, Hinxton                 |<><>|
|-)(-| Cambridge, CB10 1SD                                   |><><|
|(--)| United Kingdom                                        |<><>|

From heikki at nildram.co.uk Fri Nov 28 13:17:59 2003
From: heikki at nildram.co.uk (Heikki Lehvaslaiho)
Date: Fri Nov 28 13:24:32 2003
Subject: [Bioperl-l] Re: Small Bio::Factory::EMBOSS typo
In-Reply-To:
References:
Message-ID: <200311281817.59735.heikki@nildram.co.uk>

Leon,

I meant that I'll first try to get the bioperl core 1.4 version out,
which is based on bioperl-live. Only after that will I look into the
bioperl-run cvs module and try to put a release out.

I can see two things that need to be done for EMBOSS:

1. During the Factory initialisation, it should check that it can find
   at least the wossname program and fail with a reasonable error
   message if needed.

2. Check the EMBOSS version and, if it is < 2.7.0, convert all
   'sequencea' parameters into 'asequence' in the internal memory
   representation of the ACD specs.

Thanks for looking into this.

	-Heikki

On Friday 28 Nov 2003 4:22 pm, you wrote:
> On Fri, 28 Nov 2003 10:20:51 +0000
>
> Heikki Lehvaslaiho wrote:
> > Yeah, I know. This will need fixing. See also:
> >
> > http://bugzilla.bioperl.org/show_bug.cgi?id=1481
> >
> > Do you want to do it? I am now concentrating on bioperl
> > core 1.4 release. The
> > run will follow after that.
> >
> > 	-Heikki
>
> Actually I would like to do it, I'm gonna see how my time
> looks and I'll start by figuring out what these ACD files
> are.. I can guess, and it sounds like the fix will be in
> one of those? Also I found another small parameter passing
> bug with prophecy from EMBOSS.
>
> And I'm wondering what you mean by "The run will follow
> after that."?
> -Leon

--
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho          heikki_at_ebi ac uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________
(408) 482-2102 rysio3@yahoo.com WIRING & INSTALLATION Hands on electrical installations perform fitting, mounting, laying cables on Commercial, Industrial, residential new & existing buildings. Electrical Power Supply for Lights, Plugs, Receptacles, Panels, & Fuse boxes, Emergency Generators wiring and testing, Transformers, Power Lines & conduit layout, bending and mounting, parking lighting, lamps, Signs, switches, posts and underground installations. Shopping Centers; grocery stories, hardware stories, restaurants & residential - housing areas, computer business & fast food units installation & buildings; Solar Panels, Sun Tracking, Flywheel Storage & cars electric systems modify, Natural Energy in Remote areas install. LOW VOLTAGE 12 / 24 Volt audio & video equipment, Computer & data network wiring, data backup and UPS; Monitoring Video Control & backup tapes set up and mounting, electro-optical assemblies & subsystems. DC Power Supply, Switch & Motion sensors Alarm. Fire & safety systems install. Fiber Optics systems, PLC setup, Master Control Center, cable modems & cable TV install. Network, UPS Battery Backup mounting and charging systems; Power supply testing, troubleshooting, and analyzing to a components level. Solar Signs, Electric Vehicles Design, Assembly & Installations. CC TV & Cameras, Security Systems & Sensors for Safety, Fire sprinklers and traffic Monitoring & Door Control. TECHNICIAN Use lab & shop equipment, mechanical, electrical & electronic tools, measurement & testing equipment, video cameras & microscopes. Support scientists & electronic engineers. Mechanical & Electro-Mechan. Design. OFFICE, ELECTRICAL AND MECHANICAL PROJECTS Electrical & Network Sketches, one line diagrams, and "as is" drawings update. Customizing Electronic and Electrical Components & Parts, Layouts electronic and electrical schematic, connectors and mechanical detailing. Quotes, supply, bids and job estimating. SOLAR PROJECTS. 
Customers contact, inspection, project mgmt & supervision of electricians & material handling; Use CAD, Windows and applications; ELECTRICAL & MAINTENANCE SERVICE US Citizen; open for travel . From proj_mgr8 at yahoo.com Fri Nov 28 20:28:48 2003 From: proj_mgr8 at yahoo.com (E l e c t r i c i a n) Date: Fri Nov 28 23:39:13 2003 Subject: [Bioperl-l] R E S U M E Message-ID: <200311290439.hAT4d2g0019865@portal.open-bio.org> Resume from: RICH' S for Job, Consulting, Installation or Service E L E C T R I C I A N Tel. (408) 482-2102 rysio3@yahoo.com WIRING & INSTALLATION Hands on electrical installations perform fitting, mounting, laying cables on Commercial, Industrial, residential new & existing buildings. Electrical Power Supply for Lights, Plugs, Receptacles, Panels, & Fuse boxes, Emergency Generators wiring and testing, Transformers, Power Lines & conduit layout, bending and mounting, parking lighting, lamps, Signs, switches, posts and underground installations. Shopping Centers; grocery stories, hardware stories, restaurants & residential - housing areas, computer business & fast food units installation & buildings; Solar Panels, Sun Tracking, Flywheel Storage & cars electric systems modify, Natural Energy in Remote areas install. LOW VOLTAGE 12 / 24 Volt audio & video equipment, Computer & data network wiring, data backup and UPS; Monitoring Video Control & backup tapes set up and mounting, electro-optical assemblies & subsystems. DC Power Supply, Switch & Motion sensors Alarm. Fire & safety systems install. Fiber Optics systems, PLC setup, Master Control Center, cable modems & cable TV install. Network, UPS Battery Backup mounting and charging systems; Power supply testing, troubleshooting, and analyzing to a components level. Solar Signs, Electric Vehicles Design, Assembly & Installations. CC TV & Cameras, Security Systems & Sensors for Safety, Fire sprinklers and traffic Monitoring & Door Control. 
TECHNICIAN Use lab & shop equipment, mechanical, electrical & electronic tools, measurement & testing equipment, video cameras & microscopes. Support scientists & electronic engineers. Mechanical & Electro-Mechan. Design. OFFICE, ELECTRICAL AND MECHANICAL PROJECTS Electrical & Network Sketches, one line diagrams, and "as is" drawings update. Customizing Electronic and Electrical Components & Parts, Layouts electronic and electrical schematic, connectors and mechanical detailing. Quotes, supply, bids and job estimating. SOLAR PROJECTS. Customers contact, inspection, project mgmt & supervision of electricians & material handling; Use CAD, Windows and applications; ELECTRICAL & MAINTENANCE SERVICE US Citizen; open for travel . From info at mobiform.com Sun Nov 30 13:31:48 2003 From: info at mobiform.com (info@mobiform.com) Date: Sun Nov 30 13:38:07 2003 Subject: [Bioperl-l] Mobiform SVG Browser and SVGViewPlus for .NET Message-ID: <002b01c3b770$36d8bda0$bd0351cf@m1p1z7> Mobiform Software Ltd. has now released Version 1.0 of the Mobiform SVG Browser and the Mobiform SVGViewPlus component for .NET. developers. SVG (Scalable Vector Graphics) is an XML based graphics format developed by the W3C for internet and desktop applications. SVG has all of the capabilities of Flash, PDF and HTML in one XML based open standard format. Screen Shot of the Mobiform SVG Browser: http://www.mobiform.com/images/MobiformSvgBrowser.JPG The MOBIFORM SVG Browser supports SVG and .NET as its script language. Together the convergence of documents, media and applications becomes a reality. For a download trial and limited time price break on Mobiform products visit http://www.mobiform.com/products.aspx For investment and partnering opportunities with Mobiform, contact rdeserranno@mobiform.com Mobiform Software Ltd. 
www.MOBIFORM.com From rconsultant3 at yahoo.com Sun Nov 30 17:51:05 2003 From: rconsultant3 at yahoo.com (Engineer - Mgr) Date: Sun Nov 30 21:01:18 2003 Subject: [Bioperl-l] Sending Resume Message-ID: <200312010201.hB121Eg0002060@portal.open-bio.org> Hello David I'm looking for new project as independent CONSULTANT Will call you tomorrow to schedule meeting as we were talking before, possible this week or in Monday; E-mail: rcondultant9@yahoo.com Tel. (408) 309-7006 Engineer or Manager or INDEPENDENT CONSULTING or Engineering Manager and/or Senior Mechanical & Design ENGINEER, Design Analyst Program Manager Project Manager, Product Manager Product Development Specialist Manufacturing Engineering Electro-Mechanical & Industrial Designer R&D, CADD Manager, SBIR Over 20 years combine experience in Technical Managerial Positions performing Engineering & Design Service Systems Design & System Integration, R&D, Product & Process Development, Projects & Products Engineering & Management, Manufacturing Operations & Preparation & Managing teams of Engineering Specialist; 08-1993 to present; MECH-TRONIC DESIGN & MFG; SANTA CLARA, CA Engineering Manager, Project & Product Manager, SR. Mechanical & Design Engineer, CADD Manager. Duties included: Preparing technical documentation, calculations, engineering, design, layouts, drawings, 3D and Solid Modeling, development & propositions. Hard Drive Designing, testing, balancing, recalculating and redesign. Solid Works, Mechanical Desktop. Manufacturing and Assembly Equipment design and build. Hard Drive Testing Equipment design & mfg. Tooling and Operations Development, Implementation & Automation for mass production. Pro-E, CADD Management, Automation and Operations, Analyzing, AutoCAD 2000 & M D, Win 95/98/2000/NT, Creating, revising & implementing improvements to existing and new manufacturing processes. Mfg & Assy Development & support. Cost study and cost reduction analyzing. 
Purchasing manufacturing equipment, Bill of Materials creation. Copy Machines & Color commercial Printers design & build. Paper path & Heads - dynamic orientation. Rapid Thermo Processing, Wafer 200 & 300 mm with automatic Door and manual inserting, Scanners, Sensors & Cassette opening and rotation, LASER & Equipment & Motion Control. Fiber Optics equipment, enclosures, connectors, tools & manufacturing; assembly equipment design & build. Wafer Handling & Processing Equipment, Stages & Sliders, Thickness Measurement, Robotics. Network & Electronic / Computer Testing Equipment. Plastic & Enclosures Design & Build; Hard Drive, modems, ICT Equipment Des, Mfg & Assy. Teaching Mechanical Engineering and Descriptive Geometry, CADD. Computer Graphics, Electro - Mechanical Engineering & Design, AutoCAD, Plotting, DOS, Lecture and practical assignment. Software and computer support. Electrical & Mechanical Projects, Production Equipment & Machinery. Designing Electric Cars, Analyzes & Calculations, Prototypes electronic and electrical schematic Layouts, mechanical detailing & body shell. CAD and Plotting Station Management, Lib, Analyzes & Calculations, Prototypes & Product Mfg. Development, design & redesigning mechanical components for manufacturing, Assembly draw, Customizing Electronic, Mechanical and Electrical Equipment, AutoCAD, Spreadsheet, Basic, dB; Lotus, Excel, Paradox, Quadro Project Management & Production, scheduling, quality control, controlling & production planning injection molding, plastic parts and elements design and manufacturing Manufacturing hydraulic & pneumatic equipment, machinery and control systems, mechanisms, robotics device, precision machine elements. Inspecting and control manufacturing standards, analyzing stresses and tolerances, selecting materials, engineering computations and technical improvements, documentation, projects development, supplying. 
Automation, Conveyors, Spiral Elevator, Fast Cannery Transportation, Electronic and Electrical Components, Master Control Center, Sheet Metal Oven Rebuilding Project. LASER MEDICAL DEVICE, Lasers, Optics, Fiber Optics 3 SETS OF OPTICAL LENSES MOVE-AUTOMATON COLIMMATOR Project Management, Design, Product Development Manufacturing, Production, Operation Manager Sales & Customer Service Transportation & Shop Manager Service Station Operating & Management CREATIVE THINKING AND SOLUTION DEVELOPMENT OPEN FOR TRAVEL (domestic /international) US Citizen Permanent preferred INDEPENDENT CONSULTING TEL. (408) 309-7006 E-mail: rconsultant9@yahoo.com