From clements at nescent.org Sun Jan 2 13:49:12 2011 From: clements at nescent.org (Dave Clements) Date: Sun, 2 Jan 2011 10:49:12 -0800 Subject: [Bioperl-l] GMOD Spring Training Application Deadline January 7 In-Reply-To: References: Message-ID: Hello all, The application deadline for the 2011 GMOD Spring Training (http://gmod.org/wiki/2011_GMOD_Spring_Training) is the end of this coming Friday, January 7. Admission is competitive, so please apply by Friday to avoid being automatically placed on the waiting list. Details are below. Thanks, and happy new year! Dave C ------- Applications are now being accepted for the 2011 GMOD Spring Training course (http://gmod.org/wiki/2011_GMOD_Spring_Training), a five-day hands-on school aimed at teaching new GMOD administrators how to install, configure and integrate popular GMOD components. The course will be held March 8-12 at the US National Evolutionary Synthesis Center (NESCent) in Durham, North Carolina, as part of GMOD Americas 2011. These components will be covered: * Apollo - genome annotation editor * Chado - biological database schema * Galaxy - analysis and data integration framework * GBrowse - genome viewer * GBrowse_syn - synteny viewer * GFF3 - genome annotation file format and tools * InterMine - biological data mining system * JBrowse - next generation genome browser * MAKER - genome annotation pipeline * Tripal - web front end to Chado databases The deadline for applying is the end of Friday, January 7, 2011. Admission is competitive and is based on the strength of the application, especially the statement of interest. The 2010 school had over 60 applicants for the 25 slots. Any application received after deadline will be automatically placed on the waiting list. The course requires some knowledge of Linux as a prerequisite. The registration fee will be $265 (only $53 per day!). There will be a limited number of scholarships available. This may be the only GMOD School offered in 2011. 
If you are interested, you are strongly encouraged to apply by January 7. Links: http://gmod.org/wiki/2011_GMOD_Spring_Training http://gmod.org/wiki/GMOD_Americas_2011 http://www.nescent.org/ From pmiguel at purdue.edu Mon Jan 3 11:48:48 2011 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Mon, 03 Jan 2011 11:48:48 -0500 Subject: [Bioperl-l] Default output format for Bio::SeqIO? Message-ID: <4D21FDF0.2040501@purdue.edu> The following: perl -e 'use Bio::SeqIO; $out = new Bio::SeqIO()' results in: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Unknown format given or could not determine it [] STACK: Error::throw STACK: Bio::Root::Root::throw /usr/local/lib/perl/5.10.0/Bio/Root/Root.pm:368 STACK: Bio::SeqIO::new /usr/local/lib/perl/5.10.0/Bio/SeqIO.pm:383 STACK: -e:1 ----------------------------------------------------------- suggesting there is no default SeqIO format. Is that correct? If so, should the useful example script, bp_extract_feature_seq.pl, be updated so it will not error out from relying on a default format assumption? -- Phillip From cjfields at illinois.edu Mon Jan 3 13:22:20 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 3 Jan 2011 12:22:20 -0600 Subject: [Bioperl-l] Default output format for Bio::SeqIO? In-Reply-To: <4D21FDF0.2040501@purdue.edu> References: <4D21FDF0.2040501@purdue.edu> Message-ID: <641B1E6E-3B3F-4E1A-874A-8E115B7DB595@illinois.edu> On Jan 3, 2011, at 10:48 AM, Phillip San Miguel wrote: > The following: > > perl -e 'use Bio::SeqIO; $out = new Bio::SeqIO()' > > results in: > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Unknown format given or could not determine it [] > STACK: Error::throw > STACK: Bio::Root::Root::throw /usr/local/lib/perl/5.10.0/Bio/Root/Root.pm:368 > STACK: Bio::SeqIO::new /usr/local/lib/perl/5.10.0/Bio/SeqIO.pm:383 > STACK: -e:1 > ----------------------------------------------------------- > > suggesting there is no default SeqIO format. 
Is that correct? If so, should the useful example script, bp_extract_feature_seq.pl, be updated so it will not error out from relying on a default format assumption? > > -- > Phillip Bio::SeqIO can guess the format via the file extension or by testing the file via Bio::Tools::GuessSeqFormat; if one doesn't provide a file or file handle of some sort then this may pop up, but it's not really informative. That can be taken care of with the script, but it may be simpler to deal with it in Bio::SeqIO itself (e.g. there should be some error if neither a file or file handle is defined). Let me see what I can come up with. chris From dichmann at berkeley.edu Tue Jan 4 19:56:08 2011 From: dichmann at berkeley.edu (Darwin Sorento Dichmann) Date: Tue, 4 Jan 2011 16:56:08 -0800 Subject: [Bioperl-l] CPAN/Bioperl can't find modules Message-ID: Wet biologist here trying to get into NGS/bioperl/gbrowse. Something funky is going on in my gbrowse2 installation and in an attempt to fix it I reinstalled bioperl as well as other perl modules through CPAN (default settings, OSX 10.6). 
When I run CPAN -O to test if all modules are up to date I get a lot of errors like this (excerpt): ------ Macintosh:~ darwin$ cpan -O CPAN: Storable loaded ok (v2.25) Going to read '/Users/darwin/Library/Application Support/.cpan/Metadata' Database was generated on Tue, 04 Jan 2011 07:05:20 GMT Module Name Local CPAN Bio::DB::SeqFeature 0.0000 1.0060 Bio::DB::SeqFeature::NormalizedFeature 0.0000 1.0060 Bio::DB::SeqFeature::NormalizedFeatureI 0.0000 1.0060 Bio::DB::SeqFeature::NormalizedTableFeatureI 0.0000 1.0060 Bio::DB::SeqFeature::Segment 0.0000 1.0060 Bio::DB::SeqFeature::Store 0.0000 1.0060 Bio::DB::SeqFeature::Store::DBI::Iterator 0.0000 1.0060 Bio::DB::SeqFeature::Store::DBI::Pg 0.0000 1.0060 Bio::DB::SeqFeature::Store::DBI::SQLite 0.0000 1.0060 Bio::DB::SeqFeature::Store::DBI::mysql 0.0000 1.0060 Bio::DB::SeqFeature::Store::FeatureFileLoader 0.0000 1.0060 Bio::DB::SeqFeature::Store::GFF2Loader 0.0000 1.0060 Bio::DB::SeqFeature::Store::GFF3Loader 0.0000 1.0060 Bio::DB::SeqFeature::Store::LoadHelper 0.0000 1.0060 Bio::DB::SeqFeature::Store::Loader 0.0000 1.0060 Bio::DB::SeqFeature::Store::bdb 0.0000 1.0060 Bio::DB::SeqFeature::Store::berkeleydb 0.0000 1.0060 Bio::DB::SeqFeature::Store::berkeleydb3 0.0000 1.0060 Bio::DB::SeqFeature::Store::memory 0.0000 1.0060 -------- However, when I test the module in perl debugger it finds this: --------- DB<11> use Bio::DB::SeqFeature::Store print $Bio::DB::SeqFeature::Store::VERSION 1.006001 --------- I assume that something in the environment is not set up right and in general I have had a good deal of problems that might relate to this. If anyone have some pointers to what it could be I'd be grateful. On a side note: Is there a way to search bioperl mailing list archives? I might be retarded and I couldn't figure out how. 
Happy New Year, Darwin From cjfields at illinois.edu Tue Jan 4 23:16:25 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 4 Jan 2011 22:16:25 -0600 Subject: [Bioperl-l] CPAN/Bioperl can't find modules In-Reply-To: References: Message-ID: <9CFF66B4-EF7F-42A0-9644-619594FB21FB@illinois.edu> On Jan 4, 2011, at 6:56 PM, Darwin Sorento Dichmann wrote: > Wet biologist here trying to get into NGS/bioperl/gbrowse. Something funky is going on in my gbrowse2 installation and in an attempt to fix it I reinstalled bioperl as well as other perl modules through CPAN (default settings, OSX 10.6). > > When I run CPAN -O to test if all modules are up to date I get a lot of errors like this (excerpt): > > ------ > Macintosh:~ darwin$ cpan -O > CPAN: Storable loaded ok (v2.25) > Going to read '/Users/darwin/Library/Application Support/.cpan/Metadata' > Database was generated on Tue, 04 Jan 2011 07:05:20 GMT > Module Name Local CPAN > > Bio::DB::SeqFeature 0.0000 1.0060 > Bio::DB::SeqFeature::NormalizedFeature 0.0000 1.0060 > Bio::DB::SeqFeature::NormalizedFeatureI 0.0000 1.0060 > Bio::DB::SeqFeature::NormalizedTableFeatureI 0.0000 1.0060 > Bio::DB::SeqFeature::Segment 0.0000 1.0060 > Bio::DB::SeqFeature::Store 0.0000 1.0060 > Bio::DB::SeqFeature::Store::DBI::Iterator 0.0000 1.0060 > Bio::DB::SeqFeature::Store::DBI::Pg 0.0000 1.0060 > Bio::DB::SeqFeature::Store::DBI::SQLite 0.0000 1.0060 > Bio::DB::SeqFeature::Store::DBI::mysql 0.0000 1.0060 > Bio::DB::SeqFeature::Store::FeatureFileLoader 0.0000 1.0060 > Bio::DB::SeqFeature::Store::GFF2Loader 0.0000 1.0060 > Bio::DB::SeqFeature::Store::GFF3Loader 0.0000 1.0060 > Bio::DB::SeqFeature::Store::LoadHelper 0.0000 1.0060 > Bio::DB::SeqFeature::Store::Loader 0.0000 1.0060 > Bio::DB::SeqFeature::Store::bdb 0.0000 1.0060 > Bio::DB::SeqFeature::Store::berkeleydb 0.0000 1.0060 > Bio::DB::SeqFeature::Store::berkeleydb3 0.0000 1.0060 > Bio::DB::SeqFeature::Store::memory 0.0000 1.0060 > -------- > However, when I test the module in perl 
debugger it finds this: > --------- > DB<11> use Bio::DB::SeqFeature::Store print $Bio::DB::SeqFeature::Store::VERSION > 1.006001 > --------- > I assume that something in the environment is not set up right and in general I have had a good deal of problems that might relate to this. If anyone have some pointers to what it could be I'd be grateful. Might be, or it might be that BioPerl has a funky way of assigning the module version that's causing the noise; it's defined in Bio::Root::Version and exported to every module. This isn't the only module that does this (I get the same problem for DateTime). I have seen this in some instances; the best way to check for the version is Bio::Root::Version. Also, the version output here is chopped via printf (only reports to four decimal places). Tell the truth, I wouldn't worry about it if the correct version is showing up via the debugger or command line. > On a side note: Is there a way to search bioperl mailing list archives? I might be retarded and I couldn't figure out how. > > Happy New Year, > Darwin http://www.bioperl.org/wiki/Mailing_lists Look under 'Search the Mailing Lists' (only the main one is searchable). chris From drummike at gmail.com Wed Jan 5 09:41:46 2011 From: drummike at gmail.com (Mike Williams) Date: Wed, 5 Jan 2011 09:41:46 -0500 Subject: [Bioperl-l] CPAN/Bioperl can't find modules In-Reply-To: <9CFF66B4-EF7F-42A0-9644-619594FB21FB@illinois.edu> References: <9CFF66B4-EF7F-42A0-9644-619594FB21FB@illinois.edu> Message-ID: On Tue, Jan 4, 2011 at 11:16 PM, Chris Fields wrote: > On Jan 4, 2011, at 6:56 PM, Darwin Sorento Dichmann wrote: > >> Wet biologist here trying to get into NGS/bioperl/gbrowse. Something funky is going on in my gbrowse2 installation and in an attempt to fix it I reinstalled bioperl as well as other perl modules through CPAN (default settings, OSX 10.6). 
>>
>> When I run CPAN -O to test if all modules are up to date I get a lot of errors like this (excerpt):
>>
>> ------
>> Macintosh:~ darwin$ cpan -O
>> CPAN: Storable loaded ok (v2.25)
>> Going to read '/Users/darwin/Library/Application Support/.cpan/Metadata'
>>   Database was generated on Tue, 04 Jan 2011 07:05:20 GMT
>> Module Name                                   Local    CPAN
>>
>> Bio::DB::SeqFeature                           0.0000  1.0060
>> Bio::DB::SeqFeature::NormalizedFeature        0.0000  1.0060
>> Bio::DB::SeqFeature::NormalizedFeatureI       0.0000  1.0060
>> Bio::DB::SeqFeature::NormalizedTableFeatureI  0.0000  1.0060
>> Bio::DB::SeqFeature::Segment                  0.0000  1.0060

> Might be, or it might be that BioPerl has a funky way of assigning the module version that's causing the noise; it's defined in Bio::Root::Version and exported to every module. This isn't the only module that does this (I get the same problem for DateTime). I have seen this in some instances; the best way to check for the version is Bio::Root::Version.
>
> Also, the version output here is chopped via printf (only reports to four decimal places). Tell the truth, I wouldn't worry about it if the correct version is showing up via the debugger or command line.

I've seen the same thing with version numbers and CPAN. This is a snippet from a Fedora 13 system that had Bio::Perl installed via CPAN. cpan -O reports:

Bio::Align::AlignI                0.0000  1.0060
Bio::Align::DNAStatistics         0.0000  1.0060
Bio::Align::PairwiseStatistics    0.0000  1.0060
Bio::Align::ProteinStatistics     0.0000  1.0060
Bio::Align::StatisticsI           0.0000  1.0060
Bio::Align::Utilities             0.0000  1.0060
Bio::AlignIO                      0.0000  1.0060

The same results are repeated for all Bio::Perl modules.

I usually use perl -MCPAN -e shell instead of using the cpan script.
With the CPAN module shell I get similar results: cpan[1]> r /Bio::Perl/ Going to read '/root/.cpan/Metadata' Database was generated on Tue, 04 Jan 2011 07:05:20 GMT Package namespace installed latest in CPAN file Bio::Perl undef 1.006001 CJFIELDS/BioPerl-1.6.1.tar.gz 1 installed module has no parsable version number I once tried to use the upgrade command from the CPAN shell and it re-installed all of Bio::Perl because of the version number issue. Mike From cjfields at illinois.edu Wed Jan 5 10:45:49 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 5 Jan 2011 09:45:49 -0600 Subject: [Bioperl-l] CPAN/Bioperl can't find modules In-Reply-To: References: <9CFF66B4-EF7F-42A0-9644-619594FB21FB@illinois.edu> Message-ID: On Jan 5, 2011, at 8:41 AM, Mike Williams wrote: > On Tue, Jan 4, 2011 at 11:16 PM, Chris Fields wrote: >> On Jan 4, 2011, at 6:56 PM, Darwin Sorento Dichmann wrote: >> >>> Wet biologist here trying to get into NGS/bioperl/gbrowse. Something funky is going on in my gbrowse2 installation and in an attempt to fix it I reinstalled bioperl as well as other perl modules through CPAN (default settings, OSX 10.6). >>> >>> When I run CPAN -O to test if all modules are up to date I get a lot of errors like this (excerpt): >>> >>> ------ >>> Macintosh:~ darwin$ cpan -O >>> CPAN: Storable loaded ok (v2.25) >>> Going to read '/Users/darwin/Library/Application Support/.cpan/Metadata' >>> Database was generated on Tue, 04 Jan 2011 07:05:20 GMT >>> Module Name Local CPAN >>> >>> Bio::DB::SeqFeature 0.0000 1.0060 >>> Bio::DB::SeqFeature::NormalizedFeature 0.0000 1.0060 >>> Bio::DB::SeqFeature::NormalizedFeatureI 0.0000 1.0060 >>> Bio::DB::SeqFeature::NormalizedTableFeatureI 0.0000 1.0060 >>> Bio::DB::SeqFeature::Segment 0.0000 1.0060 >> Might be, or it might be that BioPerl has a funky way of assigning the module version that's causing the noise; it's defined in Bio::Root::Version and exported to every module. 
This isn't the only module that does this (I get the same problem for DateTime). I have seen this in some instances; the best way to check for the version is Bio::Root::Version. >> >> Also, the version output here is chopped via printf (only reports to four decimal places). Tell the truth, I wouldn't worry about it if the correct version is showing up via the debugger or command line. > > I've seen the same thing with version numbers and CPAN. This is a > snippet from a fedora 13 system that had Bio::Perl installed via CPAN. > cpan -O reports: > > Bio::Align::AlignI 0.0000 1.0060 > Bio::Align::DNAStatistics 0.0000 1.0060 > Bio::Align::PairwiseStatistics 0.0000 1.0060 > Bio::Align::ProteinStatistics 0.0000 1.0060 > Bio::Align::StatisticsI 0.0000 1.0060 > Bio::Align::Utilities 0.0000 1.0060 > Bio::AlignIO 0.0000 1.0060 > > The same results are repeated for all Bio::Perl modules. > > I usually use perl -MCPAN -e shell instead of using the cpan script. > With the CPAN module shell I get similar results: > > cpan[1]> r /Bio::Perl/ > Going to read '/root/.cpan/Metadata' > Database was generated on Tue, 04 Jan 2011 07:05:20 GMT > > Package namespace installed latest in CPAN file > Bio::Perl undef 1.006001 CJFIELDS/BioPerl-1.6.1.tar.gz > 1 installed module has no parsable version number > > I once tried to use the upgrade command from the CPAN shell and it > re-installed all of Bio::Perl because of the version number issue. > > Mike I would view this as a bug, then. The best way to fix it, from my perspective, is to have a single reference point for the version number, either Bio::Perl or Bio::Root::Version, and not export versions (which I believe is causing the module version inconsistency), or to make sure the version gets exported properly into the module namespace when called. 
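A minimal sketch of the suspected mechanism, assuming the exported version really is what confuses the CPAN client. Demo::Version and Demo::Store are hypothetical stand-ins written to a temp directory, not BioPerl modules; MM->parse_version from ExtUtils::MakeMaker is used here as an approximation of the static scan behind the "Local" column, which is my assumption rather than something confirmed in this thread.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use ExtUtils::MakeMaker;    # provides MM->parse_version, a static $VERSION scanner

# Small helper to write a module file into the temp tree.
sub write_file {
    my ($path, $text) = @_;
    open my $fh, '>', $path or die "open $path: $!";
    print {$fh} $text;
    close $fh or die "close $path: $!";
}

my $dir = tempdir(CLEANUP => 1);
mkdir "$dir/Demo" or die "mkdir: $!";

# Hypothetical stand-in for Bio::Root::Version: pushes its version
# into whichever package imports it.
write_file("$dir/Demo/Version.pm", <<'EOT');
package Demo::Version;
our $VERSION = '1.006001';
sub import {
    my $caller = caller;
    no strict 'refs';
    ${"${caller}::VERSION"} = $VERSION;
}
1;
EOT

# Hypothetical stand-in for a BioPerl module: no version line of its own.
write_file("$dir/Demo/Store.pm", <<'EOT');
package Demo::Store;
use Demo::Version;
1;
EOT

# Static scan of the file, without running any of its code.
my $static = MM->parse_version("$dir/Demo/Store.pm");
$static = 'undef' unless defined $static;

# Runtime value, i.e. what the debugger sees once the module is loaded.
unshift @INC, $dir;
require Demo::Store;
my $runtime = Demo::Store->VERSION;

print "static:  $static\n";
print "runtime: $runtime\n";
```

If the static scan finds nothing while the loaded module reports a version, that would match the 0.0000 versus 1.006001 discrepancy reported earlier in the thread.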
Funny thing is, when looking at the versions listed in CPAN in the monolithic release, they all indicate 1.006001 (or 1.6.1):

http://search.cpan.org/~cjfields/BioPerl-1.6.1/

Anyway, at some future point this may become somewhat moot, as we have been talking about modularizing bioperl to allow faster bug fix releases (likely for 1.7), which would require each focused module to have an independent versioning scheme. This will require a bit of coordination with the CPAN folks, but it is on the slate; we just need the time to get it rolling.

chris

From biopython at maubp.freeserve.co.uk Thu Jan 6 07:36:26 2011
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 6 Jan 2011 12:36:26 +0000
Subject: [Bioperl-l] [BioSQL-l] BioSQL-l Digest, Vol 79, Issue 1
In-Reply-To: <408F810D-41D4-4569-BE16-9E4DD0B27FAC@illinois.edu>
References: <408F810D-41D4-4569-BE16-9E4DD0B27FAC@illinois.edu>
Message-ID:

Hi Chris & ??,

I've CC'd the BioPerl mailing list (this started on the BioSQL list).

2011/1/6 Chris Fields :
> See the BioPerl SeqIO HOWTO for this:
>
> http://www.bioperl.org/wiki/HOWTO:SeqIO
>
> Basically:
>
>    # create one SeqIO object to read in, and another to write out
>    my $seq_in = Bio::SeqIO->new('-file' => "<$infile",
>                                 '-format' => $infileformat);
>    my $seq_out = Bio::SeqIO->new('-file' => ">$outfile",
>                                  '-format' => $outfileformat);
>
>    # write each entry in the input file to the output file
>    while (my $inseq = $seq_in->next_seq) {
>       $seq_out->write_seq($inseq);
>    }
>
> You may have to configure the sequence display ID and description to suit your needs.
>
> chris

Hi Chris,

I think that just covers the easy case, getting one FASTA record per GenBank record (i.e. one FASTA sequence for the whole plasmid or chromosome), which is what the NCBI use *.fna for on their FTP site.
What about the second part of this request, getting the gene sequences in FASTA as nucleotides (NCBI use *.ffn) and proteins/amino acids (NCBI use *.faa)? This would require looking at the gene/CDS features in the GenBank file (and again, rebuilding the exact sequence name the NCBI use in their FASTA files is hard). Peter P.S. There is a Biopython example of this here: http://www.warwick.ac.uk/go/peter_cock/python/genbank2fasta/ From deeepersound at googlemail.com Fri Jan 7 16:46:15 2011 From: deeepersound at googlemail.com (Maxim) Date: Fri, 7 Jan 2011 22:46:15 +0100 Subject: [Bioperl-l] Can't figure out restriction analysis Message-ID: Hi, I'm desperately trying to annotate all HindIII restriction sites for different genomes. The only solution I was able to figure out (after complicated self-made approaches using awk and grep) is shown below. I'm not a programmer, so please excuse the bad coding (strict, warnings etc, this is dedicated to be a "standalone" script). Below code reads the 25 fasta files for human genome one after another. Then restriction analysis is performed, the only value I was able to get returned so far is the fragment length. Am I right, that the fragment length values returned by the @fragments array can be mapped onto the reference genome in a linear fashion, i.e. the fragment length values are ordered according to their location on the reference sequence? If so, I can of course get the positions (coordinates) by simply adding the fragment length values. More questions: I've seen the _make_cuts function, would this be more appropriate in order to directly retrieve coordinates of cutting positions? Unfortunately I can't get it work like my $analysis = _make_cuts($seq->seq,'HindIII',0) this returns:can't call method "_make_cuts" on an undefined value Or are both approaches just nonsense and I overlooked another obvious function? 
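The coordinate idea above can be sketched in plain Perl, without BioPerl. This is only a sketch under stated assumptions: both helper subs are hypothetical (not part of Bio::Restriction::Analysis), the running-sum version assumes the fragment lengths come back in order along a linear sequence (which this thread does not confirm), and the example lengths 3, 8, 7 were worked out by hand for the toy sequence. HindIII recognizes AAGCTT and cuts after the first A, so the cut position equals the 1-based start of the site.

```perl
use strict;
use warnings;

# Hypothetical helper: 1-based start coordinate of every AAGCTT match.
sub hindiii_site_starts {
    my ($dna) = @_;
    my @starts;
    while ($dna =~ /AAGCTT/gi) {
        push @starts, $-[0] + 1;    # $-[0] is the 0-based offset of the match
    }
    return @starts;
}

# Hypothetical helper: running sum of ordered fragment lengths.
# Each partial sum is the last base of a fragment, i.e. a cut position.
sub cut_positions_from_lengths {
    my @lengths = @_;
    pop @lengths;                   # the final fragment ends at the sequence end, not at a cut
    my $pos = 0;
    my @cuts;
    for my $len (@lengths) {
        $pos += $len;
        push @cuts, $pos;
    }
    return @cuts;
}

my $dna    = 'CCAAGCTTGGAAGCTTAA';                  # toy sequence with two HindIII sites
my @starts = hindiii_site_starts($dna);
my @cuts   = cut_positions_from_lengths(3, 8, 7);   # hand-computed fragment lengths

print "site starts:   @starts\n";
print "cut positions: @cuts\n";
```

For whole chromosomes the direct scan avoids building fragment strings at all, which may matter more than the choice of method.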
Maxim use Bio::Perl; use Bio::SeqIO; use Bio::Restriction::Analysis; use Bio::Restriction::EnzymeCollection; use Cwd; $dir = getcwd; $fasta_dir = "/data1/Genomes/HG18/"; opendir(DIR, $fasta_dir) or die "$!"; @chroms = grep {/chr/} readdir DIR; print "found the following files:", join ("\n", map {$_} @chroms), "\n"; #foreach $chrom (@chroms) {print "$chrom\n";} foreach $file (@chroms) { $outname = $file; @outname = split (/.fa/, $outname); $outname = $dir . "/" . @outname[0] . "_fragments"; print "Output filename: $outname\n"; #exit; open (OUT, ">$outname"); $filelink = $fasta_dir. "/" . $file; my $seqio = Bio::SeqIO->new(-file => $filelink, '-format' => 'Fasta'); while(my $seq = $seqio->next_seq) { my $string = $seq->seq; #print $string,"\n"; my $analysis = Bio::Restriction::Analysis->new(-seq => $seq); my @fragments = $analysis->fragments('HindIII'); print OUT join("\n", map {length $_} @fragments), "\n"; } close (OUT); print "chromosome file: $file is done!\n"; } From hrh at fmi.ch Sat Jan 8 07:14:24 2011 From: hrh at fmi.ch (Hans-Rudolf Hotz) Date: Sat, 08 Jan 2011 13:14:24 +0100 Subject: [Bioperl-l] Can't figure out restriction analysis In-Reply-To: References: Message-ID: <4D285520.2000203@fmi.ch> Hi Maxim It is not a Bioperl solution, but have you looked at the emboss tool 'restrict' to get the coordinates? 
-bash-3.2$ restrict chr1.fa -enzymes hindiii -sitelen 6 stdout Report restriction enzyme cleavage sites in a nucleotide sequence ######################################## # Program: restrict # Rundate: Sat 8 Jan 2011 13:08:10 # Commandline: restrict # [-sequence] /work/gbioinfo/DB/genomes/hg19/chr1.fa # -enzymes hindiii # -sitelen 6 # [-outfile] stdout # Report_format: table # Report_file: stdout ######################################## #======================================= # # Sequence: chr1 from: 1 to: 249250621 # HitCount: 64394 # # Minimum cuts per enzyme: 1 # Maximum cuts per enzyme: 2000000000 # Minimum length of recognition site: 6 # Blunt ends allowed # Sticky ends allowed # DNA is linear # Ambiguities allowed # #======================================= Start End Strand Enzyme_name Restriction_site 5prime 3prime 5primerev 3primerev 16007 16012 + HindIII AAGCTT 16007 16011 . . 24571 24576 + HindIII AAGCTT 24571 24575 . . /// 249224235 249224240 + HindIII AAGCTT 249224235 249224239 . . 249230987 249230992 + HindIII AAGCTT 249230987 249230991 . . 249231350 249231355 + HindIII AAGCTT 249231350 249231354 . . #--------------------------------------- #--------------------------------------- #--------------------------------------- # Total_sequences: 1 # Total_length: 249250621 # Reported_sequences: 1 # Reported_hitcount: 64394 #--------------------------------------- -bash-3.2$ for more details see: http://emboss.sourceforge.net/apps/cvs/emboss/apps/restrict.html Hope this helps. Regards, Hans On 01/07/2011 10:46 PM, Maxim wrote: > Hi, > > I'm desperately trying to annotate all HindIII restriction sites for > different genomes. The only solution I was able to figure out (after > complicated self-made approaches using awk and grep) is shown below. I'm not > a programmer, so please excuse the bad coding (strict, warnings etc, this is > dedicated to be a "standalone" script). Below code reads the 25 fasta files > for human genome one after another. 
Then restriction analysis is performed, > the only value I was able to get returned so far is the fragment length. Am > I right, that the fragment length values returned by the @fragments array > can be mapped onto the reference genome in a linear fashion, i.e. the > fragment length values are ordered according to their location on the > reference sequence? If so, I can of course get the positions (coordinates) > by simply adding the fragment length values. > > More questions: > I've seen the _make_cuts function, would this be more appropriate in order > to directly retrieve coordinates of cutting positions? Unfortunately I can't > get it work like > my $analysis = _make_cuts($seq->seq,'HindIII',0) > this returns:can't call method "_make_cuts" on an undefined value > > Or are both approaches just nonsense and I overlooked another obvious > function? > Maxim > > > > use Bio::Perl; > use Bio::SeqIO; > use Bio::Restriction::Analysis; > use Bio::Restriction::EnzymeCollection; > use Cwd; > > $dir = getcwd; > > $fasta_dir = "/data1/Genomes/HG18/"; > opendir(DIR, $fasta_dir) or die "$!"; > @chroms = grep {/chr/} readdir DIR; > print "found the following files:", join ("\n", map {$_} @chroms), "\n"; > #foreach $chrom (@chroms) {print "$chrom\n";} > > foreach $file (@chroms) > { > $outname = $file; > @outname = split (/.fa/, $outname); > $outname = $dir . "/" . @outname[0] . "_fragments"; > print "Output filename: $outname\n"; > #exit; > open (OUT, ">$outname"); > $filelink = $fasta_dir. "/" . 
$file;
> my $seqio = Bio::SeqIO->new(-file => $filelink, '-format' => 'Fasta');
> while(my $seq = $seqio->next_seq)
> {
> my $string = $seq->seq;
> #print $string,"\n";
>
> my $analysis = Bio::Restriction::Analysis->new(-seq => $seq);
> my @fragments = $analysis->fragments('HindIII');
>
> print OUT join("\n", map {length $_} @fragments), "\n";
> }
> close (OUT);
> print "chromosome file: $file is done!\n";
> }
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From deeepersound at googlemail.com Sat Jan 8 10:05:28 2011
From: deeepersound at googlemail.com (Marek)
Date: Sat, 8 Jan 2011 16:05:28 +0100
Subject: [Bioperl-l] Can't figure out restriction analysis
In-Reply-To: <4D285520.2000203@fmi.ch>
References: <4D285520.2000203@fmi.ch>
Message-ID: <79BF3A64-B9CA-4026-9EB5-973C6EF9CB37@googlemail.com>

Good idea! Meanwhile I managed to figure out the cut method, which yields coordinates, but I have now been waiting for an hour for the job to complete on one of the larger chromosomes; that is slow. On the other hand, this job will run only once. I think I have emboss on one of my machines; I will test whether it is faster (I guess so, as you most likely did not run such a time-consuming job to generate the above output).

Maxim

Sent from my iPod

On Jan 8, 2011, at 1:14 PM, Hans-Rudolf Hotz wrote:

> Hi Maxim
>
> It is not a Bioperl solution, but have you looked at the emboss tool 'restrict' to get the coordinates?
> > > -bash-3.2$ restrict chr1.fa -enzymes hindiii -sitelen 6 stdout > Report restriction enzyme cleavage sites in a nucleotide sequence > ######################################## > # Program: restrict > # Rundate: Sat 8 Jan 2011 13:08:10 > # Commandline: restrict > # [-sequence] /work/gbioinfo/DB/genomes/hg19/chr1.fa > # -enzymes hindiii > # -sitelen 6 > # [-outfile] stdout > # Report_format: table > # Report_file: stdout > ######################################## > > #======================================= > # > # Sequence: chr1 from: 1 to: 249250621 > # HitCount: 64394 > # > # Minimum cuts per enzyme: 1 > # Maximum cuts per enzyme: 2000000000 > # Minimum length of recognition site: 6 > # Blunt ends allowed > # Sticky ends allowed > # DNA is linear > # Ambiguities allowed > # > #======================================= > > Start End Strand Enzyme_name Restriction_site 5prime 3prime 5primerev 3primerev > 16007 16012 + HindIII AAGCTT 16007 16011 . . > 24571 24576 + HindIII AAGCTT 24571 24575 . . > /// > 249224235 249224240 + HindIII AAGCTT 249224235 249224239 . . > 249230987 249230992 + HindIII AAGCTT 249230987 249230991 . . > 249231350 249231355 + HindIII AAGCTT 249231350 249231354 . . > > #--------------------------------------- > #--------------------------------------- > > #--------------------------------------- > # Total_sequences: 1 > # Total_length: 249250621 > # Reported_sequences: 1 > # Reported_hitcount: 64394 > #--------------------------------------- > -bash-3.2$ > > > for more details see: > http://emboss.sourceforge.net/apps/cvs/emboss/apps/restrict.html > > > Hope this helps. > Regards, Hans > > > > On 01/07/2011 10:46 PM, Maxim wrote: >> Hi, >> >> I'm desperately trying to annotate all HindIII restriction sites for >> different genomes. The only solution I was able to figure out (after >> complicated self-made approaches using awk and grep) is shown below. 
I'm not >> a programmer, so please excuse the bad coding (strict, warnings etc, this is >> dedicated to be a "standalone" script). Below code reads the 25 fasta files >> for human genome one after another. Then restriction analysis is performed, >> the only value I was able to get returned so far is the fragment length. Am >> I right, that the fragment length values returned by the @fragments array >> can be mapped onto the reference genome in a linear fashion, i.e. the >> fragment length values are ordered according to their location on the >> reference sequence? If so, I can of course get the positions (coordinates) >> by simply adding the fragment length values. >> >> More questions: >> I've seen the _make_cuts function, would this be more appropriate in order >> to directly retrieve coordinates of cutting positions? Unfortunately I can't >> get it work like >> my $analysis = _make_cuts($seq->seq,'HindIII',0) >> this returns:can't call method "_make_cuts" on an undefined value >> >> Or are both approaches just nonsense and I overlooked another obvious >> function? >> Maxim >> >> >> >> use Bio::Perl; >> use Bio::SeqIO; >> use Bio::Restriction::Analysis; >> use Bio::Restriction::EnzymeCollection; >> use Cwd; >> >> $dir = getcwd; >> >> $fasta_dir = "/data1/Genomes/HG18/"; >> opendir(DIR, $fasta_dir) or die "$!"; >> @chroms = grep {/chr/} readdir DIR; >> print "found the following files:", join ("\n", map {$_} @chroms), "\n"; >> #foreach $chrom (@chroms) {print "$chrom\n";} >> >> foreach $file (@chroms) >> { >> $outname = $file; >> @outname = split (/.fa/, $outname); >> $outname = $dir . "/" . @outname[0] . "_fragments"; >> print "Output filename: $outname\n"; >> #exit; >> open (OUT, ">$outname"); >> $filelink = $fasta_dir. "/" . 
$file; >> my $seqio = Bio::SeqIO->new(-file => $filelink, '-format' => 'Fasta'); >> while(my $seq = $seqio->next_seq) >> { >> my $string = $seq->seq; >> #print $string,"\n"; >> >> my $analysis = Bio::Restriction::Analysis->new(-seq => $seq); >> my @fragments = $analysis->fragments('HindIII'); >> >> print OUT join("\n", map {length $_} @fragments), "\n"; >> } >> close (OUT); >> print "chromosome file: $file is done!\n"; >> } >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l From tzhu at mail.bnu.edu.cn Sat Jan 8 21:03:42 2011 From: tzhu at mail.bnu.edu.cn (Tao Zhu) Date: Sun, 09 Jan 2011 10:03:42 +0800 Subject: [Bioperl-l] How to find fuzzy locations? Message-ID: <1294538622.2280.0.camel@ubuntu> Hello, everyone! In BioPerl HOWTO:Feature-Annotation, Location_Objects http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Location_Objects , it says that we could fetch location objects from a SeqFeature::Generic object like this, ---------------------------------------- # polyA_signal 1811..1815 # /gene="NDP" my $start = $feat_object->location->start; my $end = $feat_object->location->end; ---------------------------------------- Location object is a Range object but it has additional capabilities designed to handle inexact or "fuzzy" locations, where the "start" and "end" of a particular sub-sequence themselves have start and end positions, or are not precisely defined. So in the following example we could still fetch location objects like this, ---------------------------------------- # polyA_signal <1811..>1815 # /gene="NDP" my $start = $feat_object->location->start; my $end = $feat_object->location->end; ---------------------------------------- Then we'll get $start=1811 and $end=1815 too. But how should I do if I just want to exclude such "fuzzy" locations? 
Is there any method in BioPerl that could detect such "fuzzy" locations and then I could exculde them? Thank you! -- Tao Zhu, College of Life Sciences, Beijing Normal University, Beijing 100875, China Email: tzhu at mail.bnu.edu.cn Website: http://bnuzt.org (mainly written in Chinese) From jason.stajich at gmail.com Sun Jan 9 02:39:25 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Sat, 08 Jan 2011 23:39:25 -0800 Subject: [Bioperl-l] How to find fuzzy locations? In-Reply-To: <1294538622.2280.0.camel@ubuntu> References: <1294538622.2280.0.camel@ubuntu> Message-ID: <4D29662D.1070207@gmail.com> You can test by virtue of inheritance: if( $location->isa('Bio::Location::FuzzyLocationI') ) { # location is 'fuzzy' } Or, if you want to know if the start or end coordinates are fuzzy you do this - testing if the start, end, and location joining is exact: if( $location->start_pos_type ne 'EXACT' || $location->end_pos_type ne 'EXACT' && $location->location_type ne 'EXACT' ) { # location is 'fuzzy' } You will want to look at the perldoc for Bio::LocationI and look at the documentation for location_type, start_pos_type, end_pos_type I cut and paste unformatted here: =head2 location_type Title : location_type Usage : my $location_type = $location->location_type(); Function: Get location type encoded as text Returns : string ('EXACT', 'WITHIN', 'IN-BETWEEN') Args : none =cut =head2 start_pos_type Title : start_pos_type Usage : my $start_pos_type = $location->start_pos_type(); Function: Get start position type encoded as text Known valid values are 'BEFORE' (<5..100), 'AFTER' (>5..100), 'EXACT' (5..100), 'WITHIN' ((5.10)..100), 'BETWEEN', (5^6), with their meaning best explained by their GenBank/EMBL location string encoding in brackets. Returns : string ('BEFORE', 'AFTER', 'EXACT','WITHIN', 'BETWEEN') Args : none =cut Tao Zhu wrote: > Hello, everyone! 
> In BioPerl HOWTO:Feature-Annotation, Location_Objects > http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Location_Objects , > it says that we could fetch location objects from a SeqFeature::Generic > object like this, > ---------------------------------------- > # polyA_signal 1811..1815 > # /gene="NDP" > > my $start = $feat_object->location->start; > my $end = $feat_object->location->end; > ---------------------------------------- > > Location object is a Range object but it has additional capabilities > designed to handle inexact or "fuzzy" locations, where the "start" and > "end" of a particular sub-sequence themselves have start and end > positions, or are not precisely defined. So in the following example we > could still fetch location objects like this, > ---------------------------------------- > # polyA_signal<1811..>1815 > > # /gene="NDP" > my $start = $feat_object->location->start; > my $end = $feat_object->location->end; > ---------------------------------------- > Then we'll get $start=1811 and $end=1815 too. > > But how should I do if I just want to exclude such "fuzzy" locations? Is > there any method in BioPerl that could detect such "fuzzy" locations and > then I could exculde them? Thank you! > > > -- Jason Stajich From chiragmatkarbioinfo at gmail.com Sun Jan 9 13:17:52 2011 From: chiragmatkarbioinfo at gmail.com (chirag matkar) Date: Mon, 10 Jan 2011 01:17:52 +0700 Subject: [Bioperl-l] Bio::Biblio Author Name Differentiation Message-ID: Is it possible to query Author search term in Pubmed and return first name and surname seperately? As i know there whole Author term is returned in an array in Bio::Biblio -- Regards, Chirag Matkar From jason.stajich at gmail.com Sun Jan 9 13:39:24 2011 From: jason.stajich at gmail.com (Jason Stajich) Date: Sun, 09 Jan 2011 10:39:24 -0800 Subject: [Bioperl-l] How to find fuzzy locations? 
In-Reply-To: <4D29662D.1070207@gmail.com>
References: <1294538622.2280.0.camel@ubuntu> <4D29662D.1070207@gmail.com>
Message-ID: <4D2A00DC.3010001@gmail.com>

Oops, should be 3 "or" statements there

Jason Stajich wrote:
> if( $location->start_pos_type ne 'EXACT' || $location->end_pos_type
> ne 'EXACT' || $location->location_type ne 'EXACT' ) {
> # location is 'fuzzy'
> }
>

From tzhu at mail.bnu.edu.cn Thu Jan 6 03:51:57 2011
From: tzhu at mail.bnu.edu.cn (Tao Zhu)
Date: Thu, 06 Jan 2011 16:51:57 +0800
Subject: [Bioperl-l] How to find fuzzy locations?
Message-ID: <1294303917.5628.19.camel@ubuntu>

Hello, everyone!

In BioPerl HOWTO:Feature-Annotation, Location_Objects
http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Location_Objects ,
it says that we can fetch location objects from a SeqFeature::Generic
object like this,
----------------------------------------
# polyA_signal 1811..1815
# /gene="NDP"
my $start = $feat_object->location->start;
my $end = $feat_object->location->end;
----------------------------------------

A Location object is a Range object, but it has additional capabilities
designed to handle inexact or "fuzzy" locations, where the "start" and
"end" of a particular sub-sequence themselves have start and end
positions, or are not precisely defined. So in the following example we
can still fetch location objects like this,
----------------------------------------
# polyA_signal <1811..>1815
# /gene="NDP"
my $start = $feat_object->location->start;
my $end = $feat_object->location->end;
----------------------------------------
Then we'll get $start=1811 and $end=1815 too.

But what should I do if I just want to exclude such "fuzzy" locations? Is
there any method in BioPerl that can detect such "fuzzy" locations so
that I can exclude them? Thank you!
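The corrected three-way "or" test from Jason Stajich's follow-up in this thread can be tried without BioPerl installed. What follows is only a sketch under stated assumptions: each location is a plain hash (a hypothetical stand-in, not a BioPerl object) holding the strings that start_pos_type, end_pos_type and location_type are documented to return, and both sub names are made up for this illustration. It shows why the first posted version, which mixes || and &&, can miss a location such as 1811..>1815.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for location objects: plain hashes carrying the
# strings that the three *_type methods are documented to return.
my @locations = (
    { text => '1811..1815',   start_pos_type => 'EXACT',
      end_pos_type => 'EXACT', location_type => 'EXACT' },
    { text => '<1811..>1815', start_pos_type => 'BEFORE',
      end_pos_type => 'AFTER', location_type => 'EXACT' },
    { text => '1811..>1815',  start_pos_type => 'EXACT',
      end_pos_type => 'AFTER', location_type => 'EXACT' },
);

# Corrected test: fuzzy if ANY of the three types is not 'EXACT'.
sub is_fuzzy {
    my ($loc) = @_;
    my $fuzzy = $loc->{start_pos_type} ne 'EXACT'
             || $loc->{end_pos_type}   ne 'EXACT'
             || $loc->{location_type}  ne 'EXACT';
    return $fuzzy ? 1 : 0;
}

# First posted version: && binds tighter than ||, so this reads as
# start_fuzzy || (end_fuzzy && join_fuzzy).
sub is_fuzzy_first_version {
    my ($loc) = @_;
    my $fuzzy = $loc->{start_pos_type} ne 'EXACT'
             || $loc->{end_pos_type}   ne 'EXACT'
             && $loc->{location_type}  ne 'EXACT';
    return $fuzzy ? 1 : 0;
}

for my $loc (@locations) {
    printf "%-14s corrected=%d first_version=%d\n",
        $loc->{text}, is_fuzzy($loc), is_fuzzy_first_version($loc);
}
```

With real features, the same three comparisons would be made on the values returned by the location's own methods, and features for which the test is true would simply be skipped.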
-- 
Tao Zhu, College of Life Sciences, Beijing Normal University, Beijing 100875, China
Email: tzhu at mail.bnu.edu.cn
Website: http://bnuzt.org (mainly written in Chinese)

From silavb at yahoo.com Fri Jan 7 23:38:25 2011
From: silavb at yahoo.com (Silav Bremos)
Date: Fri, 7 Jan 2011 20:38:25 -0800 (PST)
Subject: [Bioperl-l] EXCEPTION: Bio::Root::Exception
Message-ID: <105542.34834.qm@web36907.mail.mud.yahoo.com>

Hello

I recently installed BioPerl on Ubuntu 10.10 with the Synaptic package manager. Install was easy.

I tried to run the first tutorial script from:
http://www.bioperl.org/wiki/Bptutorial.pl#Quick_getting_started_scripts

$seq_object = get_sequence('swiss',"ROA1_HUMAN");
write_sequence(">roa1.fasta",'fasta',$seq_object);

I get this error:

------------ EXCEPTION: Bio::Root::Exception -------------
MSG: id does not exist
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:368
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id /usr/share/perl5/Bio/DB/WebDBSeqI.pm:168
STACK: Bio::Perl::get_sequence /usr/share/perl5/Bio/Perl.pm:523
STACK: tut1.pl:6

Please help me decipher the exception. Perl version is 5.10.

Thanks
---Silav

From wadim_kapulkin at yahoo.co.uk Sun Jan 9 12:04:00 2011
From: wadim_kapulkin at yahoo.co.uk (wadim kapulkin)
Date: Sun, 9 Jan 2011 17:04:00 +0000 (GMT)
Subject: [Bioperl-l] possible help with bioperl on macosx10.4 ...
Message-ID: <118904.55380.qm@web28516.mail.ukl.yahoo.com>

Hi There,

I learned from
I need

unfortunately ADS offers only a new version of xcode not supported on macosx10.4
--

where could I possibly download an obsolete xcode that works with osx10.4

best

WK

From Nicolas.Thierry-Mieg at imag.fr Thu Jan 13 10:39:05 2011
From: Nicolas.Thierry-Mieg at imag.fr (Nicolas Thierry-Mieg)
Date: Thu, 13 Jan 2011 16:39:05 +0100
Subject: [Bioperl-l] Bio::Biblio fails for "recent" publications
Message-ID: <4D2F1C99.40604@imag.fr>

Hello,

I have a script that uses Bio::Biblio to fetch publication data.
This has been working for some time, but users reported last month that it now fails for recent publications. For example: This works (a publication from 2009): perl -MBio::Biblio -e 'print new Bio::Biblio->get_by_id ("19147664")' But this fails: perl -MBio::Biblio -e 'print new Bio::Biblio->get_by_id ("21087995")' The error message has: Citation 21087995 was not found in MEDLINE/MEDLINENEW. However that PMID does exist in pubmed: http://www.ncbi.nlm.nih.gov/pubmed/21087995 I found that there have been recent changes at the NCBI, could my problem be related? http://eutils.ncbi.nlm.nih.gov/entrez/eutils/soap/v2.0/DOC/esoap_help.html Any hints would be very welcome! Regards, Nicolas From David.Messina at sbc.su.se Thu Jan 13 12:20:09 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Thu, 13 Jan 2011 18:20:09 +0100 Subject: [Bioperl-l] possible help with bioperl on macosx10.4 ... In-Reply-To: <118904.55380.qm@web28516.mail.ukl.yahoo.com> References: <118904.55380.qm@web28516.mail.ukl.yahoo.com> Message-ID: <5BF31F8F-C2A1-429E-B4DC-76ED04FB9E27@sbc.su.se> Hi, Xcode 2.x also came on the 10.4 install disks, so if you still have your copy of those, you can install it from there. Dave On Jan 9, 2011, at 18:04 , wadim kapulkin wrote: > Hi There , > > I learned from > I need > > unfortunatelly ADS offers only new version of xcode not supported on macosx10.4 > -- > > where could i possibly download obsolete xcode who works with osx10.4 > > best > > WK > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From florent.angly at gmail.com Thu Jan 13 19:21:27 2011 From: florent.angly at gmail.com (Florent Angly) Date: Fri, 14 Jan 2011 10:21:27 +1000 Subject: [Bioperl-l] possible help with bioperl on macosx10.4 ... 
In-Reply-To: <118904.55380.qm@web28516.mail.ukl.yahoo.com> References: <118904.55380.qm@web28516.mail.ukl.yahoo.com> Message-ID: <4D2F9707.5000304@gmail.com> Hi Wadim, Did you try this? http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#Mac_OS_X_using_fink Florent On 10/01/11 03:04, wadim kapulkin wrote: > Hi There , > > I learned from > I need > > unfortunatelly ADS offers only new version of xcode not supported on macosx10.4 > -- > > where could i possibly download obsolete xcode who works with osx10.4 > > best > > WK > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From tzhu at mail.bnu.edu.cn Thu Jan 13 20:53:26 2011 From: tzhu at mail.bnu.edu.cn (Tao Zhu) Date: Fri, 14 Jan 2011 09:53:26 +0800 Subject: [Bioperl-l] EXCEPTION: Bio::Root::Exception [From: Silav Bremos] Message-ID: <1294970006.1875.7.camel@ubuntu> Hello! In fact the script doesn't work on my computer either. Probably it has some problems with the swissprot database. You could try to run this script as follows: ----------------------------------------------- use Bio::Perl; # this script will only work if you have an internet connection on the # computer you're using, the databases you can get sequences from # are 'swiss', 'genbank', 'genpept', 'embl', and 'refseq' $seq_object = get_sequence('genbank',"ECORHO"); write_sequence(">echrho.fasta",'fasta',$seq_object); ------------------------------------------------ It should work if you've installed BioPerl correctly. By the way, if you just want to check whether you've installed BioPerl correctly, you could type such command like perldoc Bio::Seq Good luck! 
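Besides perldoc, the same installation check can be done from a script by trying to load the modules. The following is a minimal sketch using only core Perl; module_available is a helper name made up for this example and is not part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded, 0 otherwise.
sub module_available {
    my ($module) = @_;
    # Turn Bio::SeqIO into Bio/SeqIO.pm, the form require expects for a path.
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(Bio::Perl Bio::Seq Bio::SeqIO)) {
    printf "%-12s %s\n", $module,
        module_available($module) ? 'found' : 'NOT found';
}
```

If every module is reported as found, an error such as "id does not exist" more likely comes from the remote database lookup than from the local installation.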
-- Tao Zhu, College of Life Sciences, Beijing Normal University, Beijing 100875, China Email: tzhu at mail.bnu.edu.cn Website: http://bnuzt.org (mainly written in Chinese) From akarger at CGR.Harvard.edu Fri Jan 14 13:06:47 2011 From: akarger at CGR.Harvard.edu (Karger, Amir) Date: Fri, 14 Jan 2011 13:06:47 -0500 Subject: [Bioperl-l] Frame translation gets an extra aa? Message-ID: <1B12003244CE894E85B472602363788831A88123@FASXCH01.fasmail.priv> Apologies if this question has been asked before, or if it's so stupid that nobody was silly enough to ask it before. (Using Bioperl 1.6.1) perl -l -MBio::Seq -e '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGGG"); print $x->translate(-frame=>1)->seq' NPLG Um, why is GG being translated to G? Shouldn't you not translate if you only have 2 bp left? That is, even if you know that GGX translates to amino acid G for X in (A,C,G,T) you don't actually have that third bp right now. In real life, would an mRNA get translated even if it's missing the third base pair? -Amir Karger Team Lead for Scientific Applications Research Computing, Division of Science, FAS Harvard University From cjfields at illinois.edu Fri Jan 14 13:25:36 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 14 Jan 2011 12:25:36 -0600 Subject: [Bioperl-l] Frame translation gets an extra aa? In-Reply-To: <1B12003244CE894E85B472602363788831A88123@FASXCH01.fasmail.priv> References: <1B12003244CE894E85B472602363788831A88123@FASXCH01.fasmail.priv> Message-ID: <69597404-C36C-4F53-A0C1-C65D4F49D03C@illinois.edu> Amir, Um, the sequence you have has 4 codons: AAA CCC TTT GGG Taking the final 'G' gives the correct response: perl -l -MBio::Seq -e '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGG"); print $x->translate(-frame=>1)->seq' NPL chris On Jan 14, 2011, at 12:06 PM, Karger, Amir wrote: > Apologies if this question has been asked before, or if it's so stupid that nobody was silly enough to ask it before. 
> > (Using Bioperl 1.6.1) > > perl -l -MBio::Seq -e '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGGG"); print $x->translate(-frame=>1)->seq' > NPLG > > Um, why is GG being translated to G? Shouldn't you not translate if you only have 2 bp left? That is, even if you know that GGX translates to amino acid G for X in (A,C,G,T) you don't actually have that third bp right now. In real life, would an mRNA get translated even if it's missing the third base pair? > > -Amir Karger > Team Lead for Scientific Applications > Research Computing, Division of Science, FAS > Harvard University > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From gawbul at gmail.com Fri Jan 14 17:17:41 2011 From: gawbul at gmail.com (Steve Moss) Date: Fri, 14 Jan 2011 22:17:41 +0000 Subject: [Bioperl-l] possible help with bioperl on macosx10.4 ... (Dave Messina) In-Reply-To: References: Message-ID: <8407519560032573634@unknownmsgid> Hi, When you install the latest XCode releases, there is an option to install with 10.4 support, that is however if you are on a new system and require backward compatibility! You can download XCode 2.5 for Tiger from here however (http://connect.apple.com/cgi-bin/WebObjects/MemberSite.woa/wa/download?path=%2FDeveloper_Tools%2Fxcode_2.5_developer_tools%2Fxcode25_8m2558_developerdvd.dmg&wosid=kI4dcfprJgE73gcxd5u1zOCWmbN - 903MB) or other earlier versions from http://connect.apple.com - Developer Tools downloads section, although a login is required! Cheers, Steve Sent from my iPad On 14 Jan 2011, at 17:07, bioperl-l-request at lists.open-bio.org wrote: > Re: possible help with bioperl on macosx10.4 ... 
(Dave Messina) From silavb at yahoo.com Fri Jan 14 23:51:47 2011 From: silavb at yahoo.com (Silav) Date: Sat, 15 Jan 2011 04:51:47 +0000 (UTC) Subject: [Bioperl-l] EXCEPTION: Bio::Root::Exception [From: Silav Bremos] References: <1294970006.1875.7.camel@ubuntu> Message-ID: Tao Zhu mail.bnu.edu.cn> writes: > Hello! In fact the script doesn't work on my computer either. Probably > it has some problems with the swissprot database. > use Bio::Perl; > $seq_object = get_sequence('genbank',"ECORHO"); > write_sequence(">echrho.fasta",'fasta',$seq_object); Thanks for your response. The code with 'genbank' worked. Hopfully someone will fix the tutorial after reading this. From amackey at virginia.edu Sat Jan 15 18:34:30 2011 From: amackey at virginia.edu (Aaron Mackey) Date: Sat, 15 Jan 2011 18:34:30 -0500 Subject: [Bioperl-l] Frame translation gets an extra aa? In-Reply-To: <69597404-C36C-4F53-A0C1-C65D4F49D03C@illinois.edu> References: <1B12003244CE894E85B472602363788831A88123@FASXCH01.fasmail.priv> <69597404-C36C-4F53-A0C1-C65D4F49D03C@illinois.edu> Message-ID: I'm guessing the confusion might be the differences in terminology between reading frame (taking a value of 1, 2 or 3) and leading intron phase (a value of 0, 1 or 2, which corresponds to a reading frame of 1, 3 or 2, respectively) ... ? -Aaron On Fri, Jan 14, 2011 at 1:25 PM, Chris Fields wrote: > Amir, > > Um, the sequence you have has 4 codons: > > AAA CCC TTT GGG > > Taking the final 'G' gives the correct response: > > perl -l -MBio::Seq -e > '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGG"); print > $x->translate(-frame=>1)->seq' > NPL > > chris > > On Jan 14, 2011, at 12:06 PM, Karger, Amir wrote: > > > Apologies if this question has been asked before, or if it's so stupid > that nobody was silly enough to ask it before. 
> > > > (Using Bioperl 1.6.1) > > > > perl -l -MBio::Seq -e > '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGGG"); print > $x->translate(-frame=>1)->seq' > > NPLG > > > > Um, why is GG being translated to G? Shouldn't you not translate if you > only have 2 bp left? That is, even if you know that GGX translates to amino > acid G for X in (A,C,G,T) you don't actually have that third bp right now. > In real life, would an mRNA get translated even if it's missing the third > base pair? > > > > -Amir Karger > > Team Lead for Scientific Applications > > Research Computing, Division of Science, FAS > > Harvard University > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From akarger at CGR.Harvard.edu Sun Jan 16 02:00:15 2011 From: akarger at CGR.Harvard.edu (Karger, Amir) Date: Sun, 16 Jan 2011 02:00:15 -0500 Subject: [Bioperl-l] Frame translation gets an extra aa? In-Reply-To: References: <1B12003244CE894E85B472602363788831A88123@FASXCH01.fasmail.priv> <69597404-C36C-4F53-A0C1-C65D4F49D03C@illinois.edu>, Message-ID: <1B12003244CE894E85B47260236378882F948030@FASXCH01.fasmail.priv> Wait, what? Aaron, I'm not a biologist, so please give me a couple more sentences here. Also, the docs (and code) don't seem to support your numbers. From http://www.bioperl.org/wiki/BioPerl_Tutorial: You can also determine the frame of the translation. The default frame starts at the first nucleotide (frame 0). To get translation in the next frame we would write: $prot_obj = $my_seq_object->translate(-frame => 1); >From http://doc.bioperl.org/releases/bioperl-1.6.1/ PrimarySeqI documentation (and my 1.5 perldoc Bio::PrimarySeqI): Args:... 
-frame - frame default is 0

From the code linked to at the doc.bioperl link above:

## use frame, error if frame is not 0, 1 or 2
$self->throw("Valid values for frame are 0, 1, or 2, not $frame.")
unless ($frame == 0 or $frame == 1 or $frame == 2);
$seq = substr($seq,$frame);

What am I missing here? All the docs I see seem to use frame as "the number of bp we move to the right before we start translating codons 3 bp at a time". But if that code is being run when I do a translate() I should really be getting the answer I expect, and not four aas. And yet the Deobfuscator tells me that Bio::Seq::translate is inheriting from PrimarySeqI. And I get the same four-aa result if I create a PrimarySeq instead of a Seq.

Aha. Now I see that PrimarySeq::translate calls CodonTable::translate after taking the substr. CodonTable::translate() says:

if the codon is two nucleotides long and if by adding
an [sic] a third character 'N', it codes for a single amino
acid (with exceptions above), return that, otherwise
return empty string.

Are you sure that's what every user of PrimarySeq::translate wants? If so, please put something in the docs about it. Also, is there an option that will let me say "translate 11 bp to only 3 aa"? From looking at the code, it looks like no. I guess I can do this on my own if frame is 1.

Slightly less confused,

-Amir

________________________________________
From: ajmackey at gmail.com [ajmackey at gmail.com] On Behalf Of Aaron Mackey [amackey at virginia.edu]
Sent: Saturday, January 15, 2011 18:34
To: Chris Fields
Cc: Karger, Amir; bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Frame translation gets an extra aa?

I'm guessing the confusion might be the differences in terminology between reading frame (taking a value of 1, 2 or 3) and leading intron phase (a value of 0, 1 or 2, which corresponds to a reading frame of 1, 3 or 2, respectively) ... ?
-Aaron

On Fri, Jan 14, 2011 at 1:25 PM, Chris Fields wrote:

Amir,

Um, the sequence you have has 4 codons:

AAA CCC TTT GGG

Taking the final 'G' gives the correct response:

perl -l -MBio::Seq -e '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGG"); print $x->translate(-frame=>1)->seq'
NPL

chris

On Jan 14, 2011, at 12:06 PM, Karger, Amir wrote:

> Apologies if this question has been asked before, or if it's so stupid that nobody was silly enough to ask it before.
>
> (Using Bioperl 1.6.1)
>
> perl -l -MBio::Seq -e '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGGG"); print $x->translate(-frame=>1)->seq'
> NPLG
>
> Um, why is GG being translated to G? Shouldn't you not translate if you only have 2 bp left? That is, even if you know that GGX translates to amino acid G for X in (A,C,G,T) you don't actually have that third bp right now. In real life, would an mRNA get translated even if it's missing the third base pair?

From David.Messina at sbc.su.se Sun Jan 16 09:28:08 2011
From: David.Messina at sbc.su.se (Dave Messina)
Date: Sun, 16 Jan 2011 15:28:08 +0100
Subject: [Bioperl-l] EXCEPTION: Bio::Root::Exception [From: Silav Bremos]
In-Reply-To:
References: <1294970006.1875.7.camel@ubuntu>
Message-ID:

Hi Silav,

> Hopefully someone will fix the tutorial after reading this.

Thanks for pointing out the problem. I fixed it.

And please remember for the future that you can also make these kinds of changes yourself; the Bioperl docs are a wiki editable by anyone.

Dave

From amackey at virginia.edu Mon Jan 17 09:10:37 2011
From: amackey at virginia.edu (Aaron Mackey)
Date: Mon, 17 Jan 2011 09:10:37 -0500
Subject: [Bioperl-l] Frame translation gets an extra aa?
In-Reply-To: <1B12003244CE894E85B47260236378882F948030@FASXCH01.fasmail.priv> References: <1B12003244CE894E85B472602363788831A88123@FASXCH01.fasmail.priv> <69597404-C36C-4F53-A0C1-C65D4F49D03C@illinois.edu> <1B12003244CE894E85B47260236378882F948030@FASXCH01.fasmail.priv> Message-ID: I did say that I was guessing ... the fact that -frame ranges between 0 and 2 makes sense to a programmer, but not to so much to a biologist who has a numerical understanding of the concept "reading frame"; I was imagining that the BioPerl API had used the more "natural" frame range of 1..3. sorry for muddying the waters, -Aaron On Sun, Jan 16, 2011 at 2:00 AM, Karger, Amir wrote: > Wait, what? Aaron, I'm not a biologist, so please give me a couple more > sentences here. > > Also, the docs (and code) don't seem to support your numbers. From > http://www.bioperl.org/wiki/BioPerl_Tutorial: > > You can also determine the frame of the translation. The default frame > starts at the first nucleotide (frame 0). To get translation in the next > frame we would write: > $prot_obj = $my_seq_object->translate(-frame => 1); > > From http://doc.bioperl.org/releases/bioperl-1.6.1/ PrimarySeqI > documentation (and my 1.5 perldoc Bio::PrimarySeqI): > Args:... > -frame - frame default is 0 > > From the code linked to at the doc.bioperl link above: > > ## use frame, error if frame is not 0, 1 or 2 > $self->throw("Valid values for frame are 0, 1, or 2, not > $frame.") > unless ($frame == 0 or $frame == 1 or $frame == 2); > $seq = substr($seq,$frame); > > What am I missing here? All the docs I see seem to use frame as "the number > of bp we move to the right before we start translating codons 3 bp at a > time". But if that code is being run when I do a translate() I should really > be getting the answer I expect, and not four aas. And yet the Deobfuscator > tells me that Bio::Seq::translate is inheriting from PrimarySeqI. And I get > the same four-aa result if I create a PrimarySeq instead of a Seq. 
> > Aha. Now I see that PrimarySeq::translate calls CodonTable::translate after > taking the substr. CodonTable::translate() says: > > if the codon is two nucleotides long and if by adding > an [sic] a third character 'N', it codes for a single amino > acid (with exceptions above), return that, otherwise > return empty string. > > Are you sure that's what every user of PrimarySeq::translate wants? If so, > please put something in the docs about it. Also, is there an option that > will let me say "translate 11 bp to only 3 aa"? From looking at the code, it > looks like no. I guess I can do this on my own if frame is 1. > > Slightly less confused, > > -Amir > > ________________________________________ > From: ajmackey at gmail.com [ajmackey at gmail.com] On Behalf Of Aaron Mackey [ > amackey at virginia.edu] > Sent: Saturday, January 15, 2011 18:34 > To: Chris Fields > Cc: Karger, Amir; bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Frame translation gets an extra aa? > > I'm guessing the confusion might be the differences in terminology between > reading frame (taking a value of 1, 2 or 3) and leading intron phase (a > value of 0, 1 or 2, which corresponds to a reading frame of 1, 3 or 2, > respectively) ... ? > > -Aaron > > On Fri, Jan 14, 2011 at 1:25 PM, Chris Fields > wrote: > Amir, > > Um, the sequence you have has 4 codons: > > AAA CCC TTT GGG > > Taking the final 'G' gives the correct response: > > perl -l -MBio::Seq -e > '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGG"); print > $x->translate(-frame=>1)->seq' > NPL > > chris > > On Jan 14, 2011, at 12:06 PM, Karger, Amir wrote: > > > Apologies if this question has been asked before, or if it's so stupid > that nobody was silly enough to ask it before. > > > > (Using Bioperl 1.6.1) > > > > perl -l -MBio::Seq -e > '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGGG"); print > $x->translate(-frame=>1)->seq' > > NPLG > > > > Um, why is GG being translated to G? 
Shouldn't you not translate if you > only have 2 bp left? That is, even if you know that GGX translates to amino > acid G for X in (A,C,G,T) you don't actually have that third bp right now. > In real life, would an mRNA get translated even if it's missing the third > base pair? > From cjfields at illinois.edu Mon Jan 17 11:07:46 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 17 Jan 2011 10:07:46 -0600 Subject: [Bioperl-l] Frame translation gets an extra aa? In-Reply-To: <1B12003244CE894E85B47260236378882F948030@FASXCH01.fasmail.priv> References: <1B12003244CE894E85B472602363788831A88123@FASXCH01.fasmail.priv> <69597404-C36C-4F53-A0C1-C65D4F49D03C@illinois.edu>, <1B12003244CE894E85B47260236378882F948030@FASXCH01.fasmail.priv> Message-ID: Amir, Completely missed the frame argument you passed. Yes, the behavior between PrimarySeqI::translate and CodonTable::translate seems inconsistent here, particularly with the '-complete' parameter (implying a complete CDS) defaulting to false. If the default assumption by PrimarySeqI::translate() is any sequence to be translated isn't complete, why should CodonTable::translate() automatically 'complete' the translation for incomplete codons by default? I would consider this a bug. However, as '-complete' also assumes a complete CDS, using it doesn't quite fit either, so we probably need some argument that allows for more finitely defining this. '-strict' ? Anyway, that is easily fixed; just currying the flag to the call to CodonTable::translate, then bypassing translation of partial codons is present, corrects the problem. Would just need to decide on the above. chris On Jan 16, 2011, at 1:00 AM, Karger, Amir wrote: > Wait, what? Aaron, I'm not a biologist, so please give me a couple more sentences here. > > Also, the docs (and code) don't seem to support your numbers. From http://www.bioperl.org/wiki/BioPerl_Tutorial: > > You can also determine the frame of the translation. 
The default frame starts at the first nucleotide (frame 0). To get translation in the next frame we would write: > $prot_obj = $my_seq_object->translate(-frame => 1); > >> From http://doc.bioperl.org/releases/bioperl-1.6.1/ PrimarySeqI documentation (and my 1.5 perldoc Bio::PrimarySeqI): > Args:... > -frame - frame default is 0 > >> From the code linked to at the doc.bioperl link above: > > ## use frame, error if frame is not 0, 1 or 2 > $self->throw("Valid values for frame are 0, 1, or 2, not $frame.") > unless ($frame == 0 or $frame == 1 or $frame == 2); > $seq = substr($seq,$frame); > > What am I missing here? All the docs I see seem to use frame as "the number of bp we move to the right before we start translating codons 3 bp at a time". But if that code is being run when I do a translate() I should really be getting the answer I expect, and not four aas. And yet the Deobfuscator tells me that Bio::Seq::translate is inheriting from PrimarySeqI. And I get the same four-aa result if I create a PrimarySeq instead of a Seq. > > Aha. Now I see that PrimarySeq::translate calls CodonTable::translate after taking the substr. CodonTable::translate() says: > > if the codon is two nucleotides long and if by adding > an [sic] a third character 'N', it codes for a single amino > acid (with exceptions above), return that, otherwise > return empty string. > > Are you sure that's what every user of PrimarySeq::translate wants? If so, please put something in the docs about it. Also, is there an option that will let me say "translate 11 bp to only 3 aa"? From looking at the code, it looks like no. I guess I can do this on my own if frame is 1. 
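The "translate 11 bp to only 3 aa" behavior asked about here can be sketched in plain Perl. This is a standalone illustration, not BioPerl's implementation: translate_whole_codons and %codon_table are names made up for this example, only the standard genetic code is covered, and a trailing partial codon is dropped rather than completed.

```perl
use strict;
use warnings;

# Standard genetic code, codons enumerated in T, C, A, G order.
my @bases = qw(T C A G);
my $amino_acids =
    'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_table;
my $index = 0;
for my $first (@bases) {
    for my $second (@bases) {
        for my $third (@bases) {
            $codon_table{"$first$second$third"} = substr($amino_acids, $index++, 1);
        }
    }
}

# Hypothetical helper: skip $frame bases (0, 1 or 2), then translate
# complete codons only. One or two leftover bases at the end are ignored.
sub translate_whole_codons {
    my ($dna, $frame) = @_;
    die "frame must be 0, 1 or 2\n"
        unless defined $frame && $frame =~ /^[012]$/;
    my $seq = uc substr($dna, $frame);
    $seq =~ tr/U/T/;    # accept RNA input as well
    my $protein = '';
    for (my $pos = 0; $pos + 3 <= length($seq); $pos += 3) {
        my $codon = substr($seq, $pos, 3);
        # Codons with ambiguous bases are reported as X.
        $protein .= exists $codon_table{$codon} ? $codon_table{$codon} : 'X';
    }
    return $protein;
}

print translate_whole_codons('AAACCCTTTGGG', 0), "\n";    # KPFG
print translate_whole_codons('AAACCCTTTGGG', 1), "\n";    # NPL
```

In frame 1 the remaining 11 bases split into AAC CCT TTG plus a leftover GG, so the sketch returns three residues where the behavior reported for BioPerl 1.6.1 returns four.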
> > Slightly less confused, > > -Amir > > ________________________________________ > From: ajmackey at gmail.com [ajmackey at gmail.com] On Behalf Of Aaron Mackey [amackey at virginia.edu] > Sent: Saturday, January 15, 2011 18:34 > To: Chris Fields > Cc: Karger, Amir; bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Frame translation gets an extra aa? > > I'm guessing the confusion might be the differences in terminology between reading frame (taking a value of 1, 2 or 3) and leading intron phase (a value of 0, 1 or 2, which corresponds to a reading frame of 1, 3 or 2, respectively) ... ? > > -Aaron > > On Fri, Jan 14, 2011 at 1:25 PM, Chris Fields > wrote: > Amir, > > Um, the sequence you have has 4 codons: > > AAA CCC TTT GGG > > Taking the final 'G' gives the correct response: > > perl -l -MBio::Seq -e '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGG"); print $x->translate(-frame=>1)->seq' > NPL > > chris > > On Jan 14, 2011, at 12:06 PM, Karger, Amir wrote: > >> Apologies if this question has been asked before, or if it's so stupid that nobody was silly enough to ask it before. >> >> (Using Bioperl 1.6.1) >> >> perl -l -MBio::Seq -e '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGGG"); print $x->translate(-frame=>1)->seq' >> NPLG >> >> Um, why is GG being translated to G? Shouldn't you not translate if you only have 2 bp left? That is, even if you know that GGX translates to amino acid G for X in (A,C,G,T) you don't actually have that third bp right now. In real life, would an mRNA get translated even if it's missing the third base pair? 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Mon Jan 17 11:14:43 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 17 Jan 2011 10:14:43 -0600 Subject: [Bioperl-l] EXCEPTION: Bio::Root::Exception [From: Silav Bremos] In-Reply-To: References: <1294970006.1875.7.camel@ubuntu> Message-ID: <4A94A8F6-66FC-40BB-BDED-CBF088EB21F3@illinois.edu> On Jan 14, 2011, at 10:51 PM, Silav wrote: > > Tao Zhu mail.bnu.edu.cn> writes: > > >> Hello! In fact the script doesn't work on my computer either. Probably >> it has some problems with the swissprot database. >> use Bio::Perl; > >> $seq_object = get_sequence('genbank',"ECORHO"); >> write_sequence(">echrho.fasta",'fasta',$seq_object); > > > Thanks for your response. The code with 'genbank' worked. Hopfully someone will > fix the tutorial after reading this. This should be fixed in on github, but I can check on it. chris From akarger at CGR.Harvard.edu Mon Jan 17 13:02:05 2011 From: akarger at CGR.Harvard.edu (Karger, Amir) Date: Mon, 17 Jan 2011 13:02:05 -0500 Subject: [Bioperl-l] Frame translation gets an extra aa? In-Reply-To: References: <1B12003244CE894E85B472602363788831A88123@FASXCH01.fasmail.priv> <69597404-C36C-4F53-A0C1-C65D4F49D03C@illinois.edu>, <1B12003244CE894E85B47260236378882F948030@FASXCH01.fasmail.priv>, Message-ID: <1B12003244CE894E85B47260236378882F948035@FASXCH01.fasmail.priv> "strict" seems a bit vague to me. -add_third_bp defaulting to false? -assume_third_bp? You might also want to check whether anything else in the vast code base calls CT::translate and see whether the assumptions make sense there. In any case, as long as it's clearly documented, I don't think it matters too much what you do. Well, I take it back: there should be some way to do either assuming or not assuming without too much extra work. 
Writing the code to say "if we're in frame 1, which is the only frame where this can happen, and the length of the translated thing ends up being one too long, then truncate the object by one" was kind of annoying, and the extra object copying might be a problem if you were trying to write fast code, which luckily I'm not in this case. -Amir ________________________________________ From: Chris Fields [cjfields at illinois.edu] Sent: Monday, January 17, 2011 11:07 To: Karger, Amir Cc: Aaron Mackey; bioperl-l at lists.open-bio.org Subject: Re: [Bioperl-l] Frame translation gets an extra aa? Amir, Completely missed the frame argument you passed. Yes, the behavior between PrimarySeqI::translate and CodonTable::translate seems inconsistent here, particularly with the '-complete' parameter (implying a complete CDS) defaulting to false. If the default assumption by PrimarySeqI::translate() is any sequence to be translated isn't complete, why should CodonTable::translate() automatically 'complete' the translation for incomplete codons by default? I would consider this a bug. However, as '-complete' also assumes a complete CDS, using it doesn't quite fit either, so we probably need some argument that allows for more finitely defining this. '-strict' ? Anyway, that is easily fixed; just currying the flag to the call to CodonTable::translate, then bypassing translation of partial codons is present, corrects the problem. Would just need to decide on the above. chris From joshpk105 at gmail.com Thu Jan 13 16:47:53 2011 From: joshpk105 at gmail.com (josh katz) Date: Thu, 13 Jan 2011 16:47:53 -0500 Subject: [Bioperl-l] Clustalw Wrapper Message-ID: I was wondering if I was possible to pass my own scoring matrix into clustal through bioperl. Thx, Josh Katz -- ________________________________________________________ "Observations of the external, will explain self." Joshua P. 
Katz From thamanchand at yahoo.com Fri Jan 14 07:45:19 2011 From: thamanchand at yahoo.com (2BioInfo) Date: Fri, 14 Jan 2011 04:45:19 -0800 (PST) Subject: [Bioperl-l] Problem in using BioPerl module through Strawberry Perl Professional 5.10.1.3 alpha 2 Message-ID: <2d22e6bb-8153-4562-ae6d-3b83abd44066@fu15g2000vbb.googlegroups.com> Hi all, I have been trying to use Bioperl through Strawberry Perl Professional 5.10.1.3 alpha 2, which by default has the Bioperl modules. When I use use Bio::SeqIO; ## It doesn't complain but when I use use Bio::Perl; # this script will only work with an internet connection # on the computer it is run on $seq_object = get_sequence('swissprot',"ROA1_HUMAN"); write_sequence(">roa1.fasta",'fasta',$seq_object); then it shows the following errors: Global symbol "$seq_object" requires explicit package name at C:\...........\Perl\2.pl line 8. Global symbol "$seq_object" requires explicit package name at C:\...........\Perl\2.pl line 9. Execution of C:\............\Perl\2.pl aborted due to compilation errors. Can you point out what I am doing wrong? From thamanchand at yahoo.com Fri Jan 14 09:10:44 2011 From: thamanchand at yahoo.com (2BioInfo) Date: Fri, 14 Jan 2011 06:10:44 -0800 (PST) Subject: [Bioperl-l] Problem in using BioPerl module through Strawberry Perl Professional 5.10.1.3 alpha 2 Message-ID: Hi all, I am new to Perl and Bioperl, but I am trying to do something with Perl and Bioperl. I tried to install Perl and the BioPerl module separately, but that didn't work for me. Then I installed Strawberry Perl Professional 5.10.1.3 alpha 2, which has a built-in Bioperl module and the Padre IDE.
To check whether perl is working or not I ran this command: c:\perl -v It seems to be working. Then I wanted to check whether Bioperl is working. In Padre I put use Bio::Perl; ## It seems to be working because it didn't complain Then I wanted to check further according to http://etutorials.org/Programming/perl+bioinformatics/Part+II+Perl+and+Bioinformatics/Chapter+9.+Introduction+to+Bioperl/9.3+Testing+Bioperl/ I used the following example to test: #!C:\strawberry\perl\bin -w use strict; use warnings; use Bio::Perl; # this script will only work with an internet connection # on the computer it is run on $seq_object = get_sequence('swissprot',"ROA1_HUMAN"); write_sequence(">roa1.fasta",'fasta',$seq_object); But it doesn't seem to work. I got the following errors: Global symbol "$seq_object" requires explicit package name at C:\......\Perl\2.pl line 8. Global symbol "$seq_object" requires explicit package name at C:\.........\Perl\2.pl line 9. Execution of C:\.................\Perl\2.pl aborted due to compilation errors. Further I checked in command line perldoc::Seq It seems to be working too. Can anyone point out what stupid mistake I am making? Thank you From prateekshettys at gmail.com Fri Jan 14 12:34:15 2011 From: prateekshettys at gmail.com (Prateek Shetty) Date: Fri, 14 Jan 2011 23:04:15 +0530 Subject: [Bioperl-l] Genscan for multiple sequences in the same input file Message-ID: Hello sir, I have a query. Please do help me out with it. I downloaded the standalone Linux version of GENSCAN and ran it using Ubuntu. However, I met with a small hitch. The application treats all the multiple sequences present in the input file as just one sequence and returns a result. Is there any way to correct this? Also, does your script here http://search.cpan.org/~birney/bioperl-1.2.3/Bio/Tools/Genscan.pm do the same? Can you please help me out with this?
Regards, Prateek From thamanchand at yahoo.com Mon Jan 17 05:27:51 2011 From: thamanchand at yahoo.com (2BioInfo) Date: Mon, 17 Jan 2011 02:27:51 -0800 (PST) Subject: [Bioperl-l] EXCEPTION: Bio::Root::Exception In-Reply-To: <105542.34834.qm@web36907.mail.mud.yahoo.com> References: <105542.34834.qm@web36907.mail.mud.yahoo.com> Message-ID: <9f0d59c7-9d84-401b-b216-6dbc4bc25fb8@l22g2000vbp.googlegroups.com> I am having the same problem. On Jan 8, 6:38 am, Silav Bremos wrote: > Hello > I recently installed BioPerl on Ubuntu 10.10 with the Synaptic package manager. > Install was easy. I tried to run the first tutorial script from: http://www.bioperl.org/wiki/Bptutorial.pl#Quick_getting_started_scripts > $seq_object = get_sequence('swiss',"ROA1_HUMAN"); > write_sequence(">roa1.fasta",'fasta',$seq_object); > I get this error: > ------------ EXCEPTION: Bio::Root::Exception ------------- > MSG: id does not exist > STACK: Error::throw > STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:368 > STACK: Bio::DB::WebDBSeqI::get_Seq_by_id > /usr/share/perl5/Bio/DB/WebDBSeqI.pm:168 > STACK: Bio::Perl::get_sequence /usr/share/perl5/Bio/Perl.pm:523 > STACK: tut1.pl:6 > Please help me decipher the exception. Perl version is 5.10. Thanks > ---Silav > > _______________________________________________ > Bioperl-l mailing list > Bioper...
at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From amackey at virginia.edu Mon Jan 17 09:19:22 2011 From: amackey at virginia.edu (Aaron Mackey) Date: Mon, 17 Jan 2011 09:19:22 -0500 Subject: [Bioperl-l] Registration open for the 6th International Symposium on Health Informatics and Bioinformatics (HIBIT11) Message-ID: FYI: Dear Friends and Colleagues, We are happy to announce that the registration for the 6th International Symposium on Health Informatics and Bioinformatics (HIBIT11) is now open: http://hibit11.iyte.edu.tr/register.html In case you are Turkish and under the age of 35 we can apply for a partial sponsorship from TUBITAK for you. For that please fill the needed details in our form: http://hibit11.iyte.edu.tr/grantApplicationForm.php There will be only 2 persons accepted per institution and we will let chance decide if more than 2 persons apply for any one institution. We would like to invite you to join our venue which takes place from 2-5 May 2011 in Izmir, Turkey. You can also book your hotel with us; with different hotels available for every budget. From dormitories to low cost and luxury hotels everything is possible. Before booking a flight, please check our sponsor, Lufthansa (http://www.lufthansa.com/event-booking_en) using the promotional code: TRTJM and allowing popups. Workshops Also we are happy to announce that there will be several practical workshops and a mini conference at the onset of HIBIT11. Ralf Hofestaedt and Can Türker invite you to join their mini conference on Translational Bioinformatics (Health Informaticians are of course welcome). Michael Specht invites you to learn about Proteomatics, a general automation and pipelining application with examples from mass spectrometry based proteomics. Vilda Purutcuoglu invites you to join her workshop on networks and network analysis. Pricing Early registration fees (until 28.02.2011) are: 60€ for daily registration, 75€ for students (3 days), and 150€
for other participants (3 days). We can print your posters in A2 format for just 10€. Gala Dinner is a mere 15€ and the trip to Ephesus is only 50€ (http://en.wikipedia.org/wiki/Ephesus). Hotels can be booked in the registration process as well. Prices range from 45€ to 140€ per night. We hope to be able to welcome you in Izmir at our Symposium, Jens Allmer, Conference Chair, on behalf of the organizing and program committees http://hibit11.iyte.edu.tr Sponsors Lufthansa, http://www.lufthansa.com/event-booking_en TOP Yayıncılık, http://www.top.com.tr/ TÜBA, http://www.tuba.gov.tr/anasayfa/en/English TÜBITAK, http://www.tubitak.gov.tr/en/ot/10/ For more details please visit: http://hibit11.iyte.edu.tr To stay on top of new information please follow us: Linkedin: http://www.linkedin.com/groups?mostPopular=&gid=1532167 Facebook: http://www.facebook.com/home.php?sk=group_154594581249053 Twitter: http://twitter.com/#!/hibit11 If you don't want to receive further updates about HIBIT, you can remove your email here: http://hibit11.iyte.edu.tr/emailsignup.php From sheetu.piscean at gmail.com Mon Jan 17 18:00:07 2011 From: sheetu.piscean at gmail.com (sheetal gosrani) Date: Mon, 17 Jan 2011 15:00:07 -0800 Subject: [Bioperl-l] No hits/hsps are displayed while rendering image using Bio::Graphics (parsed blast+ tabular output (-m 6 equivalent to legacy blast -m 8) file using Bio::SearchIO) Message-ID: Hi, I am trying to parse (using Bio::SearchIO) and render (using Bio::Graphics) the blast output file in tabular format (-m 6 option in blast+ 2.2.23 version which is equivalent to -m 8 in legacy C based blast). I do not get anything displayed on the image for some reason that I am unable to debug.
I tried to just parse the file using Bio::SearchIO, format => 'blasttable' and I get this warning message: --------------------- WARNING --------------------- MSG: Did not define the number of conserved matches in the HSP; assuming conserved == identical (19) --------------------------------------------------- I cannot avoid this message as the information it needs is not in the file. Is this message causing the hits/hsps to be not displayed ? Some details on the OS and version: OS: $ uname -a Linux sheetal-ubuntu 2.6.32-27-generic #49-Ubuntu SMP Wed Dec 1 23:52:12 UTC 2010 i686 GNU/Linux Perl : v5.10.1 (*) built for i486-linux-gnu-thread-multi BioPerl Version : $ perl -MBio::Root::Version -e 'print $Bio::Root::Version::VERSION,"\n"' 1.006001 Bio::SearchIO version: $ perl -MBio::SearchIO -e 'printf "%vd\n", $Bio::SearchIO::VERSION' 49.46.48.48.54.48.48.49 Bio::Graphics version: $ perl -MBio::Graphics -e 'printf "%vd\n", $Bio::Graphics::VERSION' 50.46.49.56 Can you please let me know where the problem is ? Attaching the perl file render_blast.pl and sample blast output file. Also attaching the perl file which just parses the sample blast file using Bio::SearchIO as I get the warning message when I run that. Thanks, Sheetal Gosrani -------------- next part -------------- # BLASTX 2.2.23+ # Query: Contig_4 # Database: nr # Fields: query id, subject id, % identity, alignment length, mismatches, gap opens, q. start, q. end, s. start, s. 
end, evalue, bit score # 6 hits found Contig_4 gi|189485521|ref|YP_001956462.1| 47.50 40 20 1 271 387 405 444 4e-008 44.3 Contig_4 gi|189485521|ref|YP_001956462.1| 58.33 36 14 1 113 217 351 386 4e-008 38.9 Contig_4 gi|15605800|ref|NP_213177.1| 59.52 42 17 0 271 396 380 421 4e-008 58.5 Contig_4 gi|15605800|ref|NP_213177.1| 73.33 15 4 0 170 214 346 360 4e-008 24.6 Contig_4 gi|150021420|ref|YP_001306774.1| 54.76 42 18 1 271 396 392 432 7e-008 48.5 Contig_4 gi|150021420|ref|YP_001306774.1| 53.13 32 14 1 122 214 341 372 7e-008 33.9 -------------- next part -------------- A non-text attachment was scrubbed... Name: render_blast.pl Type: application/octet-stream Size: 2948 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: parse_using_bio_searchio.pl Type: application/octet-stream Size: 1654 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: render_blast_output_screenshot.png Type: image/png Size: 70244 bytes Desc: not available URL: From tzhu at mail.bnu.edu.cn Tue Jan 18 00:32:47 2011 From: tzhu at mail.bnu.edu.cn (Tao Zhu) Date: Tue, 18 Jan 2011 13:32:47 +0800 Subject: [Bioperl-l] Problem in using BioPerl module through Strawberry Perl Professional 5.10.1.3 alpha 2 Message-ID: <1295328767.3698.18.camel@ubuntu> Dear friend, In my opinion, there exist two major problems in your script: First, have you noticed that you've written "use strict;" at the beginning line? Delete it, please! If you write down "use strict;",you should explicitly declare all the variables in your script, and this could be very confusing to new learners. Second, the website http://etutorials.org/Programming/perl +bioinformatics/ has been outdated. I recommend the tutorial script on http://www.bioperl.org/wiki/Bptutorial.pl#Quick_getting_started_scripts. 
It should be like this: ----------------------------- use Bio::Perl; # this script will only work if you have an internet connection on the # computer you're using, the databases you can get sequences from # are 'swiss', 'genbank', 'genpept', 'embl', and 'refseq' $seq_object = get_sequence('genbank',"ECORHO"); write_sequence(">ecorho.fasta",'fasta',$seq_object); ----------------------------- Please run this script and I hope it should work. Good luck! > Hi all, > > I am new to Perl and Bioperl, but I am trying to do something with > Perl and Bioperl. I tried to install Perl and BioPerl module > separately but didn't worked for me. Then I installed Strawberry Perl > Professional 5.10.1.3 alpha 2 which has inbuilt Bioperl module and > Padre IDE. > > To check whether perl is working or not I put these command > > c:\perl -v > > It seems working > > Then I wanted to check whether Bioperl is working > > In Padre I put > > use Bio::Perl; ## It seems working because it didn't complain > > Then I wanted to check further more according to > > http://etutorials.org/Programming/perl+bioinformatics/Part+II+Perl+and > +Bioinformatics/Chapter+9.+Introduction+to+Bioperl/9.3+Testing > +Bioperl/ > > I used following examples to test > > #!C:\strawberry\perl\bin -w > use strict; > use warnings; > use Bio::Perl; > > # this script will only work with an internet connection > # on the computer it is run on > $seq_object = get_sequence('swissprot',"ROA1_HUMAN"); > write_sequence(">roa1.fasta",'fasta',$seq_object); > > But it doesn't seems working > > I got following erros > > Global symbol "$seq_object" requires explicit package name at C: > \...... > \Perl\2.pl line 8. > Global symbol "$seq_object" requires explicit package name at C: > \.........\Perl\2.pl line 9. > Execution of C:\.................\Perl\2.pl aborted due to compilation > errors. > > Further I checked in command line perldoc::Seq > > It seems working too > > Can anyone point out what stupid mistakes I am doing? 
> > Thank you -- Tao Zhu, College of Life Sciences, Beijing Normal University, Beijing 100875, China Email: tzhu at mail.bnu.edu.cn Website: http://bnuzt.org (mainly written in Chinese) From adsj at novozymes.com Tue Jan 18 03:10:39 2011 From: adsj at novozymes.com (Adam Sjøgren) Date: Tue, 18 Jan 2011 09:10:39 +0100 Subject: [Bioperl-l] Problem in using BioPerl module through Strawberry Perl Professional 5.10.1.3 alpha 2 In-Reply-To: <1295328767.3698.18.camel@ubuntu> (Tao Zhu's message of "Tue, 18 Jan 2011 13:32:47 +0800") References: <1295328767.3698.18.camel@ubuntu> Message-ID: <874o963a4g.fsf@topper.koldfront.dk> On Tue, 18 Jan 2011 13:32:47 +0800, Tao wrote: > In my opinion, there exist two major problems in your script: > First, have you noticed that you've written "use strict;" at the > beginning line? Delete it, please! If you write down "use strict;",you > should explicitly declare all the variables in your script, and this > could be very confusing to new learners. I think this isn't the best advice. Everybody, and especially beginners, should "use strict; use warnings;" and declare variables. Doing so makes it much, much easier to catch typos and other small mistakes - which is often what trips up beginners. The only exception I can think of is quick and dirty one-liners.
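A minimal sketch of what strict buys you, using only core Perl. The variable names are just for illustration, and the string eval is there only so that the compile-time error can be caught and printed instead of stopping the script:

```perl
use strict;
use warnings;

# Declared with "my": this compiles cleanly under strict.
my $seq_id = 'ROA1_HUMAN';
print "declared variable is fine: $seq_id\n";

# The same kind of assignment without "my". It is compiled separately
# through a string eval so the error ends up in $@ rather than aborting
# the whole script at compile time.
my $ok    = eval 'use strict; $seq_object = 1; 1;';
my $error = $@;
print "undeclared variable rejected: $error" unless $ok;
```

The second message is the same "Global symbol ... requires explicit package name" complaint reported earlier in this thread; declaring the variable with "my" makes it go away.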
Best regards, Adam -- Adam Sjøgren adsj at novozymes.com From cjfields at illinois.edu Tue Jan 18 08:49:54 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 18 Jan 2011 07:49:54 -0600 Subject: [Bioperl-l] Problem in using BioPerl module through Strawberry Perl Professional 5.10.1.3 alpha 2 In-Reply-To: <874o963a4g.fsf@topper.koldfront.dk> References: <1295328767.3698.18.camel@ubuntu> <874o963a4g.fsf@topper.koldfront.dk> Message-ID: <70901836-A22D-4E67-8049-A8A56C626B26@illinois.edu> On Jan 18, 2011, at 2:10 AM, Adam Sjøgren wrote: > On Tue, 18 Jan 2011 13:32:47 +0800, Tao wrote: > >> In my opinion, there exist two major problems in your script: > >> First, have you noticed that you've written "use strict;" at the >> beginning line? Delete it, please! If you write down "use strict;",you >> should explicitly declare all the variables in your script, and this >> could be very confusing to new learners. > > I think this isn't the best advice. Everybody, and especially beginners, > should "use strict; use warnings;" and declare variables. Doing so makes > it much, much easier to catch typos and other small mistakes - which is > often what trips up beginners. > > The only exception I can think of is quick and dirty one-liners. > > > Best regards, > > Adam Completely agree. In fact, many of the problems posted here can be resolved by 'use strict; use warnings;' (BTW, perl 5.14 will have 'use strict;' on by default). chris From j_martin at lbl.gov Tue Jan 18 11:54:08 2011 From: j_martin at lbl.gov (Joel Martin) Date: Tue, 18 Jan 2011 08:54:08 -0800 Subject: [Bioperl-l] Problem in using BioPerl module through Strawberry Perl Professional 5.10.1.3 alpha 2 In-Reply-To: <70901836-A22D-4E67-8049-A8A56C626B26@illinois.edu> References: <1295328767.3698.18.camel@ubuntu> <874o963a4g.fsf@topper.koldfront.dk> <70901836-A22D-4E67-8049-A8A56C626B26@illinois.edu> Message-ID: and given their good advice, the answer is...
change $seq_object = get_sequence('swissprot',"ROA1_HUMAN"); to my $seq_object = get_sequence('swissprot',"ROA1_HUMAN"); so that the variable $seq_object is declared. The first chapter in 'Learning Perl', or a similar introduction will get you pretty far along the road to using bioperl. Joel On Tue, Jan 18, 2011 at 5:49 AM, Chris Fields wrote: > On Jan 18, 2011, at 2:10 AM, Adam Sj?gren wrote: > > > On Tue, 18 Jan 2011 13:32:47 +0800, Tao wrote: > > > >> In my opinion, there exist two major problems in your script: > > > >> First, have you noticed that you've written "use strict;" at the > >> beginning line? Delete it, please! If you write down "use strict;",you > >> should explicitly declare all the variables in your script, and this > >> could be very confusing to new learners. > > > > I think this isn't the best advice. Everybody, and especially beginners, > > should "use strict; use warnings;" and declare variables. Doing so makes > > it much, much easier to catch typos and other small mistakes - which is > > often what trips up beginners. > > > > The only exception I can think of is quick and dirty one-liners. > > > > > > Best regards, > > > > Adam > > Completely agree. In fact, many of the problems posted here can be > resolved by 'use strict; use warnings;' (BTW, perl 5.14 will have 'use > strict;' on by default). 
> > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cseligman at earthlink.net Tue Jan 18 14:40:28 2011 From: cseligman at earthlink.net (Chet Seligman) Date: Tue, 18 Jan 2011 11:40:28 -0800 Subject: [Bioperl-l] BioPerl installation on Windows 7 Message-ID: <000f01cbb747$9050b340$b0f219c0$@earthlink.net> Please recommend which Perl version from ActiveState should be used for windows7 64bit and then whether Bioperl should be installed from PPM or via perl -MCPAN -e "install Bundle::BioPerl" Chet Seligman From cjfields at illinois.edu Tue Jan 18 15:39:41 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 18 Jan 2011 14:39:41 -0600 Subject: [Bioperl-l] BioPerl installation on Windows 7 In-Reply-To: <000f01cbb747$9050b340$b0f219c0$@earthlink.net> References: <000f01cbb747$9050b340$b0f219c0$@earthlink.net> Message-ID: <0B19D9B7-9D8D-4882-8944-CA3183580C31@illinois.edu> Well, that's a difficult question. I would always suggest the latest perl (which will soon be perl 5.14, but is currently perl 5.12). IIRC there are problems with DB_File using the latest ActiveState Perl, so maybe use Strawberry Perl? chris On Jan 18, 2011, at 1:40 PM, Chet Seligman wrote: > Please recommend which Perl version from ActiveState should be used for > windows7 64bit and then whether Bioperl should be installed from PPM or via > perl -MCPAN -e "install Bundle::BioPerl" > > > > Chet Seligman > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cseligman at earthlink.net Tue Jan 18 19:04:04 2011 From: cseligman at earthlink.net (Chet Seligman) Date: Tue, 18 Jan 2011 16:04:04 -0800 Subject: [Bioperl-l] How can I tell if I have certain BioPerl modules? 
Like Bio::SeqIO Message-ID: <002701cbb76c$631c4860$2954d920$@earthlink.net> I am running Strawberry Perl perl 5, version 12, subversion 2 (v5.12.2) built for MSWin32-x64-multi-thread I then installed Bioperl this way: INSTALLING BIOPERL THE EASY WAY USING CPAN You can use the CPAN shell to install Bioperl. For example: >perl -MCPAN -e shell Then find the name of the Bioperl version you want: cpan>d /bioperl/ CPAN: Storable loaded ok Going to read /home/bosborne/.cpan/Metadata Database was generated on Tue, 24 Feb 2004 23:55:23 GMT Distribution B/BI/BIRNEY/bioperl-1.2.tar.gz Distribution B/BI/BIRNEY/bioperl-1.4.tar.gz Now install: cpan>install B/BI/BIRNEY/bioperl-1.4.tar.gz And the installation went to completion Chet Seligman From wkretzsch at gmail.com Tue Jan 18 19:49:11 2011 From: wkretzsch at gmail.com (Warren W. Kretzschmar) Date: Wed, 19 Jan 2011 00:49:11 +0000 Subject: [Bioperl-l] How can I tell if I have certain BioPerl modules? Like Bio::SeqIO In-Reply-To: <002701cbb76c$631c4860$2954d920$@earthlink.net> References: <002701cbb76c$631c4860$2954d920$@earthlink.net> Message-ID: Hi Chet, This is sort of a hack way to do it, but running a one-liner that includes the library you'd like to know if you have installed will throw an error if the library is not installed. So this should run without error if Bio::SeqIO is installed on your system: perl -MBio::SeqIO -e 'print "Hello World\n"' While this will throw an error that the library called Bio::S could not be found: perl -MBio::S -e 'print "Hello World\n"' A nicer way might be to print the module's version number: perl -MBio::SeqIO -e 'print $Bio::SeqIO::VERSION."\n"' Because of inheritance, I think the version number might actually be the bioperl version number.
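If you want something reusable rather than a one-liner, the same check can be sketched as a small script. This is only a sketch under a couple of assumptions: module_version is a hypothetical helper name, and the two modules listed are just examples taken from this thread. It tries to load each module inside an eval, so a missing module is reported instead of aborting the script:

```perl
use strict;
use warnings;

# Return the module's $VERSION if it loads, the string 'unknown' if it
# loads but defines no $VERSION, or undef if it cannot be loaded.
# NOTE: module_version is a hypothetical helper for illustration only.
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Bio::SeqIO -> Bio/SeqIO.pm
    return undef unless eval { require $file; 1 };
    my $version = $module->VERSION;
    return defined $version ? $version : 'unknown';
}

for my $module (qw(Bio::SeqIO Bio::Root::Version)) {
    my $version = module_version($module);
    print "$module: ",
        (defined $version ? "installed, version $version" : "not installed"),
        "\n";
}
```

Put any other module names in the list to check them the same way.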
Cheers, Warren On Wed, Jan 19, 2011 at 12:04 AM, Chet Seligman wrote: > I am running Strawberry Perl > > perl 5, version 12, subversion 2 (v5.12.2) built for > MSWin32-x64-multi-thread > > > > I then installed Bioperl this way: > > > > INSTALLING BIOPERL THE EASY WAY USING CPAN > > > > You can use the CPAN shell to install Bioperl. For example: > > > > >perl -MCPAN -e shell > > > > Then find the name of the Bioperl version you want: > > > > cpan>d /bioperl/ > > CPAN: Storable loaded ok > > Going to read /home/bosborne/.cpan/Metadata > > Database was generated on Tue, 24 Feb 2004 23:55:23 GMT > > Distribution B/BI/BIRNEY/bioperl-1.2.tar.gz > > Distribution B/BI/BIRNEY/bioperl-1.4.tar.gz > > > > Now install: > > > > cpan>install B/BI/BIRNEY/bioperl-1.4.tar.gz > > And the installation went to completion > > > > Chet Seligman > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at illinois.edu Tue Jan 18 22:25:51 2011 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 18 Jan 2011 21:25:51 -0600 Subject: [Bioperl-l] How can I tell if I have certain BioPerl modules? Like Bio::SeqIO In-Reply-To: References: <002701cbb76c$631c4860$2954d920$@earthlink.net> Message-ID: <3F73AC0F-0A1D-4739-B2C6-F2A4A40EE6B2@illinois.edu> Chet, In addition to the below, you should be installing BioPerl v. 1.6.1. I have a branch created for the next point release, which I'm hoping to have released soon. chris On Jan 18, 2011, at 6:49 PM, Warren W. Kretzschmar wrote: > Hi Chet, > This is sort of a hack way to do it, but running a one-liner that > includes the library you'd like to know if you have installed will > throw an error if the library is not installed.
> > So this should run without error if Bio::SeqIO is installed on your system: > perl -MBio::SeqIO -e 'print "Hello World\n"' > > While this will throw an error that the library called Bio::S could > not be found: > perl -MBio::S -e 'print "Hello World\n"' > > A nicer way might be to print the module's version number: > perl -MBio::SeqIO -e 'print $Bio::SeqIO::VERSION."\n"' > > Because of inheritance, I thin the version number might actually be > the bioperl version number. > > Cheers, > Warren > > On Wed, Jan 19, 2011 at 12:04 AM, Chet Seligman wrote: >> I am running Strawberry Perl >> >> perl 5, version 12, subversion 2 (v5.12.2) built for >> MSWin32-x64-multi-thread >> >> >> >> I then installed Bioperl this way: >> >> >> >> INSTALLING BIOPERL THE EASY WAY USING CPAN >> >> >> >> You can use the CPAN shell to install Bioperl. For example: >> >> >> >> >perl -MCPAN -e shell >> >> >> >> Then find the name of the Bioperl version you want: >> >> >> >> cpan>d /bioperl/ >> >> CPAN: Storable loaded ok >> >> Going to read /home/bosborne/.cpan/Metadata >> >> Database was generated on Tue, 24 Feb 2004 23:55:23 GMT >> >> Distribution B/BI/BIRNEY/bioperl-1.2.tar.gz >> >> Distribution B/BI/BIRNEY/bioperl-1.4.tar.gz >> >> >> >> Now install: >> >> >> >> cpan>install B/BI/BIRNEY/bioperl-1.4.tar.gz >> >> And the installation went to completion >> >> >> >> Chet Seligman >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From kai.blin at biotech.uni-tuebingen.de Wed Jan 19 06:12:36 2011 From: kai.blin at biotech.uni-tuebingen.de (Kai Blin) Date: Wed, 19 Jan 2011 12:12:36 +0100 Subject: [Bioperl-l] BioPerl installation on Windows 7 In-Reply-To: <000f01cbb747$9050b340$b0f219c0$@earthlink.net> 
References: <000f01cbb747$9050b340$b0f219c0$@earthlink.net> Message-ID: <4D36C724.9060706@biotech.uni-tuebingen.de> On 2011-01-18 20:40, Chet Seligman wrote: Hi Chet, > Please recommend which Perl version from ActiveState should be used for > windows7 64bit and then whether Bioperl should be installed from PPM or via > perl -MCPAN -e "install Bundle::BioPerl" While I haven't actually worked much with BioPerl under Win7, I do have a test setup using Strawberry Perl and BioPerl from git. That works just fine. I expect BioPerl from CPAN to be as easy to install. I have tried to install ActiveState Perl, but the version I tried refused to install in 64bit Win7. Cheers, Kai -- Dipl.-Inform. Kai Blin kai.blin at biotech.uni-tuebingen.de Institute for Microbiology and Infection Medicine Division of Microbiology/Biotechnology Eberhard-Karls-Universität Tübingen Auf der Morgenstelle 28 Phone : ++49 7071 29-78841 D-72076 Tübingen Fax : ++49 7071 29-5979 Germany Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben From cjfields at illinois.edu Wed Jan 19 10:14:53 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 19 Jan 2011 09:14:53 -0600 Subject: [Bioperl-l] Frame translation gets an extra aa? In-Reply-To: <1B12003244CE894E85B47260236378882F948035@FASXCH01.fasmail.priv> References: <1B12003244CE894E85B472602363788831A88123@FASXCH01.fasmail.priv> <69597404-C36C-4F53-A0C1-C65D4F49D03C@illinois.edu>, <1B12003244CE894E85B47260236378882F948030@FASXCH01.fasmail.priv>, <1B12003244CE894E85B47260236378882F948035@FASXCH01.fasmail.priv> Message-ID: On Jan 17, 2011, at 12:02 PM, Karger, Amir wrote: > "strict" seems a bit vague to me. -add_third_bp defaulting to false? -assume_third_bp? > > You might also want to check whether anything else in the vast code base calls CT::translate and see whether the assumptions make sense there.
I go strictly by the test suite for API conformance, but from that last run it seems that nothing in the code base relies on automatically filling out incomplete codons. > In any case, as long as it's clearly documented, I don't think it matters too much what you do. Well, I take it back: there should be some way to do either assuming or not assuming without too much extra work. Writing the code to say "if we're in frame 1, which is the only frame where this can happen, and the length of the translated thing ends up being one too long, then truncate the object by one" was kind of annoying, and the extra object copying might be a problem if you were trying to write fast code, which luckily I'm not in this case. > > -Amir Fast code and BioPerl? We're trying to make things faster, but this may be more difficult in practice with the default OO system (and a complete switch to Moose will be problematic in the short term). I have committed a fix to github for the above. Basically, setting either '-complete' or '-complete_codons' will fill out partial codons and attempt to translate them, but this behavior is off by default. I think this makes sense, just from the perspective we don't want unintended side-effects. chris > ________________________________________ > From: Chris Fields [cjfields at illinois.edu] > Sent: Monday, January 17, 2011 11:07 > To: Karger, Amir > Cc: Aaron Mackey; bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Frame translation gets an extra aa? > > Amir, > > Completely missed the frame argument you passed. Yes, the behavior between PrimarySeqI::translate and CodonTable::translate seems inconsistent here, particularly with the '-complete' parameter (implying a complete CDS) defaulting to false. If the default assumption by PrimarySeqI::translate() is any sequence to be translated isn't complete, why should CodonTable::translate() automatically 'complete' the translation for incomplete codons by default? I would consider this a bug. 
> > However, as '-complete' also assumes a complete CDS, using it doesn't quite fit either, so we probably need some argument that allows for more finitely defining this. '-strict' ? > > Anyway, that is easily fixed; just currying the flag to the call to CodonTable::translate, then bypassing translation of partial codons is present, corrects the problem. Would just need to decide on the above. > > chris > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From akarger at CGR.Harvard.edu Wed Jan 19 12:36:02 2011 From: akarger at CGR.Harvard.edu (Karger, Amir) Date: Wed, 19 Jan 2011 12:36:02 -0500 Subject: [Bioperl-l] Frame translation gets an extra aa? In-Reply-To: References: <1B12003244CE894E85B472602363788831A88123@FASXCH01.fasmail.priv> <69597404-C36C-4F53-A0C1-C65D4F49D03C@illinois.edu>, <1B12003244CE894E85B47260236378882F948030@FASXCH01.fasmail.priv>, <1B12003244CE894E85B47260236378882F948035@FASXCH01.fasmail.priv> Message-ID: <1B12003244CE894E85B472602363788831A884DE@FASXCH01.fasmail.priv> > From: Chris Fields [mailto:cjfields at illinois.edu] > > I wrote: > > there should be > some way to do either assuming or not assuming without too much > extra work.... the extra object copying > might be a problem if you were trying to write fast code, which > luckily I'm not in this case. > > Fast code and BioPerl? We're trying to make things faster, but > this may be more difficult in practice with the default OO system > (and a complete switch to Moose will be problematic in the short > term). > > I have committed a fix to github for the above. Basically, setting > either '-complete' or '-complete_codons' will fill out partial > codons and attempt to translate them, but this behavior is off by > default. I think this makes sense, just from the perspective we > don't want unintended side-effects. That makes sense to me. 
The truth is, there shouldn't be that many real life applications, as far as I can tell, where this bites people meaningfully. But having the option is nice. -Amir From David.Messina at sbc.su.se Thu Jan 20 11:11:01 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Thu, 20 Jan 2011 17:11:01 +0100 Subject: [Bioperl-l] =?windows-1252?q?removing_BioPerl_1=2Ex_from_CPAN_=97?= =?windows-1252?q?_request_for_comment?= Message-ID: Hi everybody, As some of you have no doubt noticed, we have a persistent problem with users accidentally using BioPerl versions 1.4 and earlier. Although they've been placed on BackPAN, CPAN's archive for deprecated software, for some reason they still show up in CPAN search results. It's frustrating for users to be directed to the wrong version of the software, adding another gotcha to an already too-complicated installation process. And to this day it generates support emails on this list. As far as I know, the Ensembl Perl API still designates version 1.2.3 of BioPerl for use with it: http://www.ensembl.org/info/docs/api/api_installation.html (although in Chris Fields' testing, the Ensembl API worked just fine with the current BioPerl version, 1.6.x.) I intend to ask the CPAN admins to remove BioPerl 1.4 and earlier entirely. But before I do, I wanted to raise the issue here in case anyone objects. In particular, it'd be great if someone from Ensembl could weigh in on this. I should also point out that we have these earlier releases archived on github already, so they will still be available. Thanks, Dave From kellert at ohsu.edu Thu Jan 20 15:43:10 2011 From: kellert at ohsu.edu (Tom Keller) Date: Thu, 20 Jan 2011 12:43:10 -0800 Subject: [Bioperl-l] removing BioPerl 1.x from CPAN ? In-Reply-To: References: Message-ID: <9B867157-A168-4B33-9939-FDD720577AFA@ohsu.edu> Would the entire Bio directory disappear? How would one get modules that are not part of bioperl-live or the other main packages? 
(for example Bio::Trace::ABIF) thanks, Tom MMI DNA Services Core Facility 503-494-2442 kellert at ohsu.edu Office: 6588 RJH (CROET/BasicScience)
From cjfields at illinois.edu Thu Jan 20 16:13:30 2011 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 20 Jan 2011 15:13:30 -0600 Subject: [Bioperl-l] removing BioPerl 1.x from CPAN ? In-Reply-To: <9B867157-A168-4B33-9939-FDD720577AFA@ohsu.edu> References: <9B867157-A168-4B33-9939-FDD720577AFA@ohsu.edu> Message-ID: <3C398EFF-C60D-4FC2-A4BB-75B890F8E12A@illinois.edu> No, only the older (out-of-date) versions of BioPerl-related code itself would be removed. Any other Bio::* modules in any other distribution wouldn't be touched (actually, couldn't be touched, unless they were removed by the distribution author). Dave, we may need to contact Sendu as well, noticed he has the 1.5.2 developer releases on CPAN as well: http://search.cpan.org/~sendu/ chris On Jan 20, 2011, at 2:43 PM, Tom Keller wrote: > Would the entire Bio directory disappear? > How would one get modules that are not part of bioperl-live or the other main packages? > (for example Bio::Trace::ABIF) > > thanks, > Tom > MMI DNA Services Core Facility > 503-494-2442 > kellert at ohsu.edu > Office: 6588 RJH (CROET/BasicScience)
From David.Messina at sbc.su.se Thu Jan 20 16:23:00 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Thu, 20 Jan 2011 22:23:00 +0100 Subject: [Bioperl-l] removing old BioPerl versions from CPAN ?
In-Reply-To: <3C398EFF-C60D-4FC2-A4BB-75B890F8E12A@illinois.edu> References: <9B867157-A168-4B33-9939-FDD720577AFA@ohsu.edu> <3C398EFF-C60D-4FC2-A4BB-75B890F8E12A@illinois.edu> Message-ID: <6B9B57D1-0BE7-4C7B-A4E7-A5C384874A4A@sbc.su.se> > Dave, we may need to contact Sendu as well, noticed he has the 1.5.2 developer releases on CPAN as well: > > http://search.cpan.org/~sendu/ Agreed. Also, I noticed that my previous subject line "removing BioPerl 1.x from CPAN" would technically mean removing all the versions of BioPerl from CPAN, including the current one, which is of course not what I meant. So I've amended that. Dave
From whs at eaglegenomics.com Fri Jan 21 08:41:55 2011 From: whs at eaglegenomics.com (William Spooner) Date: Fri, 21 Jan 2011 13:41:55 +0000 Subject: [Bioperl-l] removing BioPerl 1.x from CPAN ? In-Reply-To: <9B867157-A168-4B33-9939-FDD720577AFA@ohsu.edu> References: <9B867157-A168-4B33-9939-FDD720577AFA@ohsu.edu> Message-ID: Hi Dave, As a long-time Ensembl user, I believe that all of their documentation recommends obtaining bioperl 1.2.3 from CVS rather than CPAN. I personally would have no problems with losing old bioperl versions from CPAN, although others may have a different view. Will -- William Spooner whs at eaglegenomics.com http://www.eaglegenomics.com
From dnasseh at googlemail.com Tue Jan 18 04:59:11 2011 From: dnasseh at googlemail.com (Daniel_N) Date: Tue, 18 Jan 2011 01:59:11 -0800 (PST) Subject: [Bioperl-l] Bioperl/Ensemble - getting 50.000 instead of 20.000 genes Message-ID: <30698664.post@talk.nabble.com> Hello community, I am a student working on a project. For this project i need the locations of all genes in human and other genomes. I posted my script below and it seems to work fine and do the things which i need.
I now get the following results for human:

[quote]
ENSEMBLE DATA IMPORTER

Amount of genes on chr 1: 5232
Amount of genes on chr 2: 3938
Amount of genes on chr 3: 2979
Amount of genes on chr 4: 2544
Amount of genes on chr 5: 2829
Amount of genes on chr 6: 2870
Amount of genes on chr 7: 2895
Amount of genes on chr 8: 2247
Amount of genes on chr 9: 2372
Amount of genes on chr 10: 2245
Amount of genes on chr 11: 2149
Amount of genes on chr 12: 1786
Amount of genes on chr 13: 1207
Amount of genes on chr 14: 1523
Amount of genes on chr 15: 1329
Amount of genes on chr 16: 1430
Amount of genes on chr 17: 2109
Amount of genes on chr 18: 581
Amount of genes on chr 19: 2082
Amount of genes on chr 20: 1288
Amount of genes on chr 21: 705
Amount of genes on chr 22: 1221
Amount of genes on chr X: 2357
Amount of genes on chr Y: 561
Amount of total genes: 50479
[/quote]

I am just wondering, because as far as I know human currently has 21,077 genes, which you can see here: http://www.ensembl.org/Homo_sapiens/Info/StatsTable?db=core
Basically, for each chromosome I get too many genes, but the gene counts seem to be in proportion. (For instance, chromosome 18 does not have many genes.)

Basically, this code sums up how I get the data:
[code]
my $species = 'Human';
my $slice_adaptor = $registry->get_adaptor( $species, 'Core', 'Slice' );
my $slice = $slice_adaptor->fetch_by_region( 'chromosome', $v);
my $genes = $slice->get_all_Genes();
[/code]
I found an option in the Ensembl API (http://www.ensembl.org/info/docs/Pdoc/ensembl/index.html) which additionally filters the results:
[code]
$geneisknown = $gene->is_known();
[/code]
With this option I still get more than 30000 genes. All the genes I get have different IDs, so they obviously exist and there are no duplicates; I checked this by using a hash.
So the big question is: why are there that many genes, are those really genes, and if not, how can I get those 21077 listed genes instead of my 50000+ genes?

I would appreciate any help or answer.
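One way to investigate a discrepancy like the one described above is to tally the returned genes by biotype, since get_all_Genes() called without arguments returns every annotated gene on the slice. What follows is only a minimal sketch under stated assumptions, not an answer given in this thread: it assumes each gene object responds to biotype() (the Ensembl Bio::EnsEMBL::Gene class provides such an accessor), and the MockGene package, the tally_by_biotype() helper and the example data are hypothetical stand-ins so that the code runs without a database connection.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a gene object; only the biotype() accessor is used.
package MockGene;
sub new     { my ( $class, %args ) = @_; return bless {%args}, $class; }
sub biotype { return $_[0]{biotype}; }

package main;

# Count genes per biotype. Works on any objects that provide biotype().
sub tally_by_biotype {
    my ($genes) = @_;
    my %count;
    $count{ $_->biotype() }++ for @{$genes};
    return \%count;
}

# Made-up example data, not real Ensembl results.
my @genes = map { MockGene->new( biotype => $_ ) }
    qw(protein_coding pseudogene protein_coding miRNA pseudogene protein_coding);

my $tally = tally_by_biotype( \@genes );
for my $type ( sort keys %{$tally} ) {
    printf "%-15s %d\n", $type, $tally->{$type};
}
```

In a script like the one in this message, the same helper could presumably be called on the array reference returned by $slice->get_all_Genes(), before the loop that shifts genes off that array, to see how the per-chromosome total splits across biotypes.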
Daniel

(The entire code of the script:)
[code]
#!/usr/bin/perl
use strict;

use lib "src/ensembl/modules";
use lib "src/bioperl-life";


print "ENSEMBLE DATA IMPORTER\n\n";

#---------------------------------------------------------
use Bio::EnsEMBL::Registry;

my $registry = 'Bio::EnsEMBL::Registry';

$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous'
);

my @db_adaptors = @{ $registry->get_all_DBAdaptors() };

#----------------------------------------------------------
#get_all_Genes()
#my %testhash = ();

my $genecounter=0;
my $species = 'Human';
my $slice_adaptor = $registry->get_adaptor( $species, 'Core', 'Slice' );
my $j=0;
for(my $v=1;$v<=24;$v++){
    if($v==23){$v='X';};
    if($v==24){$v='Y';};

    my $slice = $slice_adaptor->fetch_by_region( 'chromosome', $v);
    my $genes = $slice->get_all_Genes();

    my $i =0;

    while ( my $gene = shift @{$genes} ) {
        $i++;
        #PARAMETER WHICH HAVE TO BE INPORTED TO THE LOCAL DATABASE

        my $gene_id = $gene->stable_id();
        my $gene_start= $gene->start();
        my $gene_end = $gene->end();
        my $gene_length = $gene_end-$gene_start;
        my $strand = $gene->strand();
        # my @exons = @{ $gene->get_all_Exons() };
        # my $exon_amount = @exons;
        # my $total_exon_length=0;
        # my $chromosome = $v;
        # my @transcripts =@{$gene->get_all_Transcripts()};
        # $testhash{ $gene_id } = 1;
        #my $geneisknown = $gene->is_known();

        # if($geneisknown){
        #     print "known\n";
        #     $genecounter++;
        # }else{
        #     print "not known\n";
        # }

        # foreach my $exon (@exons){
        #     my $exon_start= $exon->start();
        #     my $exon_end= $exon->end();
        #     my $exon_length = $exon_end-$exon_start;
        #     $total_exon_length = $total_exon_length+$exon_length;
        # }

        #print "$gene_id: Position:$gene_start-$gene_end\tLength:$gene_length \tStrand: $strand\tExons:$exon_amount \t Total Exon Length: $total_exon_length\tChr:$chromosome\tSpecies:$species\n ";
        # TODO splicing variants of transcripts---------
        # foreach my $transcript (@transcripts){
        #     my $transcript_id= $transcript->stable_id();
        #     my $transcript_start= $transcript->start();
        #     my $transcript_end= $transcript->end();
        #     my $transcript_length = $transcript_end-$transcript_start;
        #     #print "Transcript: $transcript_id\t$transcript_length\n";
        # }
        # ----------------------------------------------
    }
    $j=$j+$i;
    print "Amount of genes on chr $v: $i\n";
    if($v eq 'X'){$v=23;};
    if($v eq 'Y'){$v=24;};
}
print "Amount of total genes: $j\n";
#print "size of hash: " . keys( %testhash ) . ".\n";


#print "Total amount of known genes: $genecounter\n."
[/code]
--
View this message in context: http://old.nabble.com/Bioperl-Ensemble---getting-50.000-instead-of-20.000-genes-tp30698664p30698664.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
From jw12 at sanger.ac.uk Tue Jan 18 05:29:45 2011 From: jw12 at sanger.ac.uk (Jonathan Warren) Date: Tue, 18 Jan 2011 10:29:45 +0000 Subject: [Bioperl-l] Registrations for DAS Workshop 2011 Message-ID: DAS is currently being used to share annotations on genomes, protein alignments, structural and interaction information. If you are interested in sharing biological information the DAS workshop below may be of interest to you. Registration is open for the 2011 DAS workshop (2,3,4th March) at the Genome Campus, Hinxton UK. If you are interested in attending, please find out more by going to http://www.ebi.ac.uk//training/onsite/110302DAS.html and register via the web link at the bottom of the page. This workshop will cater for novice to expert DAS users as each day is optional. Please register early as places will be limited. Registration closes 18 February 2011 (17:00). If you are interested in giving a 15 minute talk on the second day please email Jonathan Warren using jonathan.warren at sanger.ac.uk Many thanks The Sanger/EBI DAS team.
Jonathan Warren Senior Developer and DAS coordinator blog: http://biodasman.wordpress.com jw12 at sanger.ac.uk Ext: 2314 Telephone: 01223 492314 -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
From cjfields at illinois.edu Fri Jan 21 10:35:03 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 21 Jan 2011 09:35:03 -0600 Subject: [Bioperl-l] Bioperl/Ensemble - getting 50.000 instead of 20.000 genes In-Reply-To: <30698664.post@talk.nabble.com> References: <30698664.post@talk.nabble.com> Message-ID: Daniel, This question should be asked on the ensembl-dev mailing list, not here. http://uswest.ensembl.org/info/about/contact/mailing.html chris
From bbimber at gmail.com Fri Jan 21 10:41:49 2011 From: bbimber at gmail.com (Ben Bimber) Date: Fri, 21 Jan 2011 09:41:49 -0600 Subject: [Bioperl-l] removing BioPerl 1.x from CPAN ? In-Reply-To: References: <9B867157-A168-4B33-9939-FDD720577AFA@ohsu.edu> Message-ID: I also have no problem removing the old version. All the different BioPerl variants made it difficult to install the right version back when I first did it. I'd love even more if the version on CPAN was updated though. I'd been using a copy from GIT for a while b/c I need changes made to the bioperl-run wrapper code.
-Ben
From cjfields at illinois.edu Fri Jan 21 10:49:23 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 21 Jan 2011 09:49:23 -0600 Subject: [Bioperl-l] removing BioPerl 1.x from CPAN ? In-Reply-To: References: <9B867157-A168-4B33-9939-FDD720577AFA@ohsu.edu> Message-ID: <5EE0EF83-8CD8-4144-AF07-F18177E02751@illinois.edu> Do you need updates to bioperl, or to bioperl-run? The reason I ask is one of the things I really want to push towards is using Dist::Zilla for us developers who want an easier way to make releases, with a default Build.PL so one doesn't have to install Dist::Zilla to use the latest code. bioperl-run might be a good place to test this out. chris On Jan 21, 2011, at 9:41 AM, Ben Bimber wrote: > I also have no problem removing the old version. All the different > BioPerl variants made it difficult to install the right version back > when I first did it. > > I'd love even more if the version on CPAN was updated though. I'd > been using a copy from GIT for a while b/c I need changes made to the > bioperl-run wrapper code. > > -Ben
From bbimber at gmail.com Fri Jan 21 10:55:56 2011 From: bbimber at gmail.com (Ben Bimber) Date: Fri, 21 Jan 2011 09:55:56 -0600 Subject: [Bioperl-l] removing BioPerl 1.x from CPAN ? In-Reply-To: <5EE0EF83-8CD8-4144-AF07-F18177E02751@illinois.edu> References: <9B867157-A168-4B33-9939-FDD720577AFA@ohsu.edu> <5EE0EF83-8CD8-4144-AF07-F18177E02751@illinois.edu> Message-ID: Hi Chris, Perhaps there is some subtle difference I'm not catching. I thought that 'BioPerl' referred collectively to bioperl-live and bioperl-run. There are also a number of other 'Bio::XX' packages out there that are not BioPerl, like Bio::DB. The specific bugfix I need is in CommandExts.pm, which is part of the wrapper code in the bioperl-run repository. I have only 2 servers to keep track of, so downloading from GIT was honestly just simpler and faster than figuring out a proper solution. I will do some reading on Dist::Zilla. -Ben On Fri, Jan 21, 2011 at 9:49 AM, Chris Fields wrote: > Do you need updates to bioperl, or to bioperl-run?
The reason I ask is one of the things I really want to push towards is using Dist::Zilla for us developers who want an easier way to make releases, with a default Build.PL so one doesn't have to install Dist::Zilla to use the latest code. bioperl-run might be a good place to test this out.
>
> chris
>
> On Jan 21, 2011, at 9:41 AM, Ben Bimber wrote:
>
>> I also have no problem removing the old version. All the different
>> BioPerl variants made it difficult to install the right version back
>> when I first did it.
>>
>> I'd love even more if the version on CPAN was updated though. I'd
>> been using a copy from GIT for a while b/c I need changes made to the
>> bioperl-run wrapper code.
>>
>> -Ben
>>
>>
>>>> Message: 2
>>>> Date: Thu, 20 Jan 2011 17:11:01 +0100
>>>> From: Dave Messina
>>>> Subject: [Bioperl-l] removing BioPerl 1.x from CPAN ? request for
>>>>       comment
>>>> To: Dave Messina
>>>> Message-ID:
>>>> Content-Type: text/plain; charset=us-ascii
>>>>
>>>> Hi everybody,
>>>>
>>>> As some of you have no doubt noticed, we have a persistent problem with users accidentally using BioPerl versions 1.4 and earlier. Although they've been placed on BackPAN, CPAN's archive for deprecated software, for some reason they still show up in CPAN search results.
>>>>
>>>> It's frustrating for users to be directed to the wrong version of the software, adding another gotcha to an already too-complicated installation process. And to this day it generates support emails on this list.
>>>>
>>>> As far as I know, the Ensembl Perl API still designates version 1.2.3 of BioPerl for use with it:
>>>>
>>>>       http://www.ensembl.org/info/docs/api/api_installation.html
>>>>
>>>> (although in Chris Fields' testing, the Ensembl API worked just fine with the current BioPerl version, 1.6.x.)
>>>>
>>>>
>>>> I intend to ask the CPAN admins to remove BioPerl 1.4 and earlier entirely. But before I do, I wanted to raise the issue here in case anyone objects.
In particular, it'd be great if someone from Ensembl could weigh in on this. I should also point out that we have these earlier releases archived on github already, so they will still be available. >>>> >>>> >>>> Thanks, >>>> Dave >>>> >>>> >>>> >>>> >>>> ------------------------------ >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> End of Bioperl-l Digest, Vol 93, Issue 16 >>>> ***************************************** >>> >>> -- >>> William Spooner >>> whs at eaglegenomics.com >>> http://www.eaglegenomics.com >>> >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Fri Jan 21 13:27:12 2011 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 21 Jan 2011 12:27:12 -0600 Subject: [Bioperl-l] removing BioPerl 1.x from CPAN ? In-Reply-To: References: <9B867157-A168-4B33-9939-FDD720577AFA@ohsu.edu> <5EE0EF83-8CD8-4144-AF07-F18177E02751@illinois.edu> Message-ID: Ben, No, BioPerl and BioPerl-Run are two separate distributions or packages: http://www.bioperl.org/wiki/Category:BioPerl_Packages A good number of Bio::DB::* modules on CPAN are currently part of BioPerl (also known as the core modules; 'bioperl-live' are the bleeding edge modules found on github). A number, like Bio::DB::Sam, are found in other distributions (Lincoln's Bio-Samtools). The code you need is in BioPerl itself, and was added by Mark Jensen after the 1.6.1 release. If you use the CPAN shell and use 'm /Bio::DB/', you will see a list of the modules and the associated distributions they belong to. 
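A quick way to see which core BioPerl release a given perl picks up (useful when the code in question only exists after 1.6.1) is sketched below. It assumes the installed core distribution ships Bio::Root::Version; bioperl_version is a hypothetical helper name used only for illustration.

```sh
#!/bin/sh
# bioperl_version (hypothetical helper, not part of BioPerl):
# print the version of the BioPerl core modules visible to the default perl,
# or a short message if Bio::Root::Version cannot be loaded.
bioperl_version() {
    perl -MBio::Root::Version -e 'print "$Bio::Root::Version::VERSION\n"' 2>/dev/null \
        || echo "BioPerl core not found"
}

bioperl_version
```

If the reported version is 1.6.1 or older, the module in question would have to come from a newer release or from a checkout of the development code.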
Final note: Dist::Zilla isn't required reading for anyone using BioPerl code, only those who may want to act as release managers in the future (for BioPerl or any other Perl project). It's merely a nice tool that dramatically simplifies: 1) Updating changes 2) Changing the version 3) Testing the code 4) Packaging up the code for CPAN 5) Uploading to CPAN for others to enjoy I mention it here for those who use github, as many Dist::Zilla conversions lack a Makefile.PL/Build.PL. We would have to retain one for many reasons too involved to talk about here. chris On Jan 21, 2011, at 9:55 AM, Ben Bimber wrote: > Hi Chris, > > Perhaps there is some subtle difference I'm not catching. I thought > that 'BioPerl' referred collectively to bioperl-live and bioperl-run. > There are also a number of other 'Bio::XX' packages out there that are > not BioPerl, like Bio::DB. The specific bugfix I need is in > CommandExts.pm, which is part of the wrapper code in the bioperl-run > repository. > > I have only 2 servers to keep track of, so downloading from GIT was > honestly just simpler and faster than figuring out a proper solution. > I will do some reading on Dist::Zilla. > > -Ben > > > > On Fri, Jan 21, 2011 at 9:49 AM, Chris Fields wrote: >> Do you need updates to bioperl, or to bioperl-run? The reason I ask is one of the things I really want to push towards is using Dist::Zilla for us developers who want an easier way to make releases, with a default Build.PL so one doesn't have to install Dist::Zilla to use the latest code. bioperl-run might be a good place to test this out. >> >> chris >> >> On Jan 21, 2011, at 9:41 AM, Ben Bimber wrote: >> >>> I also have no problem removing the old version. All the different >>> BioPerl variants made it difficult to install the right version back >>> when I first did it. >>> >>> I'd love even more if the version on CPAN was updated though. 
I'd >>> been using a copy from GIT for a while b/c I need changes made to the >>> bioperl-run wrapper code. >>> >>> -Ben >>> >>> >>>>> Message: 2 >>>>> Date: Thu, 20 Jan 2011 17:11:01 +0100 >>>>> From: Dave Messina >>>>> Subject: [Bioperl-l] removing BioPerl 1.x from CPAN ? request for >>>>> comment >>>>> To: Dave Messina >>>>> Message-ID: >>>>> Content-Type: text/plain; charset=us-ascii >>>>> >>>>> Hi everybody, >>>>> >>>>> As some of you have no doubt noticed, we have a persistent problem with users accidentally using BioPerl versions 1.4 and earlier. Although they've been placed on BackPAN, CPAN's archive for deprecated software, for some reason they still show up in CPAN search results. >>>>> >>>>> It's frustrating for users to be directed to the wrong version of the software, adding another gotcha to an already too-complicated installation process. And to this day it generates support emails on this list. >>>>> >>>>> As far as I know, the Ensembl Perl API still designates version 1.2.3 of BioPerl for use with it: >>>>> >>>>> http://www.ensembl.org/info/docs/api/api_installation.html >>>>> >>>>> (although in Chris Fields' testing, the Ensembl API worked just fine with the current BioPerl version, 1.6.x.) >>>>> >>>>> >>>>> I intend to ask the CPAN admins to remove BioPerl 1.4 and earlier entirely. But before I do, I wanted to raise the issue here in case anyone objects. In particular, it'd be great if someone from Ensembl could weigh in on this. I should also point out that we have these earlier releases archived on github already, so they will still be available. 
>>>>> >>>>> >>>>> Thanks, >>>>> Dave >>>>> >>>>> >>>>> >>>>> >>>>> ------------------------------ >>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>> >>>>> End of Bioperl-l Digest, Vol 93, Issue 16 >>>>> ***************************************** >>>> >>>> -- >>>> William Spooner >>>> whs at eaglegenomics.com >>>> http://www.eaglegenomics.com >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cseligman at earthlink.net Fri Jan 21 12:57:08 2011 From: cseligman at earthlink.net (Chet Seligman) Date: Fri, 21 Jan 2011 09:57:08 -0800 Subject: [Bioperl-l] Module needed to install Bioperl IPC::Run for GraphViz Message-ID: <000001cbb994$aec81e40$0c585ac0$@earthlink.net> Any ideas where I can get this? It doesn't seem to be in any of the packages I've tried which I get from cpan> i /bioperl/ Chet From David.Messina at sbc.su.se Sat Jan 22 05:43:54 2011 From: David.Messina at sbc.su.se (Dave Messina) Date: Sat, 22 Jan 2011 11:43:54 +0100 Subject: [Bioperl-l] removing BioPerl 1.x from CPAN ? In-Reply-To: References: <9B867157-A168-4B33-9939-FDD720577AFA@ohsu.edu> Message-ID: On Jan 21, 2011, at 14:41 , William Spooner wrote: > I believe that all of their documentation recommends obtaining bioperl 1.2.3 from CVS rather than CPAN. Good to know. Thanks for your comments, Will. 
Dave

From adsj at novozymes.com Sat Jan 22 11:31:39 2011
From: adsj at novozymes.com (Adam Sjøgren)
Date: Sat, 22 Jan 2011 17:31:39 +0100
Subject: [Bioperl-l] Module needed to install Bioperl IPC::Run for GraphViz
In-Reply-To: <000001cbb994$aec81e40$0c585ac0$@earthlink.net> (Chet Seligman's message of "Fri, 21 Jan 2011 09:57:08 -0800")
References: <000001cbb994$aec81e40$0c585ac0$@earthlink.net>
Message-ID: <87lj2csxw4.fsf@topper.koldfront.dk>

On Fri, 21 Jan 2011 09:57:08 -0800, Chet wrote:

> Any ideas where I can get this?

IPC::Run?

 * http://search.cpan.org/search?query=IPC%3A%3ARun
   `--> http://search.cpan.org/~toddr/IPC-Run-0.89/lib/IPC/Run.pm

(It isn't BioPerl-specific.)

  Best regards,

    Adam

--
Adam Sjøgren
adsj at novozymes.com

From manchunjohn-ma at uiowa.edu Mon Jan 24 12:44:53 2011
From: manchunjohn-ma at uiowa.edu (Ma, Man Chun John)
Date: Mon, 24 Jan 2011 17:44:53 +0000
Subject: [Bioperl-l] Proposed improvement to to Bio::Tools::Run::Primer3Redux
Message-ID: <344D48F6FA61134A9B17AE445882A195010C77@HC-MAILBOXC1-N5.healthcare.uiowa.edu>

Hi,

Attached is my proposed diff with some changes to Bio::Tools::Run::Primer3Redux that more fully implement the new features of Primer3 version 2.x.x:

1. Adding support for the command-line argument p3_settings_file, which has been available in all 2.x.x versions, and
2. Adding support for the "Sequence" tag SEQUENCE_PRIMER_PAIR_OK_REGION_LIST, a new function in version 2.2.3

Although I have used this module quite heavily in my projects and it appears to run well, I'm not sure whether there are bugs, and I do not yet understand how to write /t scripts, so I wonder if someone would like to test this out.

Cheers,

John MC Ma
Graduate Assistant
Kwitek Lab
Department of Internal Medicine
3125E MERF
375 Newton Road
Iowa City IA 52242
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Primer3Redux.patch
Type: application/octet-stream
Size: 6779 bytes
Desc: Primer3Redux.patch
URL:

From hanbobio at 126.com Sat Jan 22 05:28:32 2011
From: hanbobio at 126.com (hanbobio)
Date: Sat, 22 Jan 2011 18:28:32 +0800 (CST)
Subject: [Bioperl-l] Bioperl problem
In-Reply-To:
References:
Message-ID: <5a8b8f.9819.12dad44a2aa.Coremail.hanbobio@126.com>

Hi all,

When I run a BioPerl script on Windows XP, I get some errors. I think the cause is related to the BioPerl tools, but I don't know how to solve the problem, so I have come here for help. The problem is shown in the attached file.

I downloaded ClustalW2.exe, and the program is at C:/work/ClustalW2/clustalw2.exe. The program works when I run it directly, but it does not work when I run the Perl script (the attached file: test-new.pl). I have installed Bio-Run through the Perl package manager.

Operating system: Windows XP
BioPerl version: 1.6.0

Best regards!
yusheng liao
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BioPerl Problem.doc
Type: application/msword
Size: 131584 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test-new.pl
Type: application/octet-stream
Size: 4795 bytes
Desc: not available
URL:

From cjfields at illinois.edu Mon Jan 24 13:29:45 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 24 Jan 2011 12:29:45 -0600
Subject: [Bioperl-l] Bioperl problem
In-Reply-To: <5a8b8f.9819.12dad44a2aa.Coremail.hanbobio@126.com>
References: <5a8b8f.9819.12dad44a2aa.Coremail.hanbobio@126.com>
Message-ID: <8CA274A3-EE61-48DC-841C-3D80852A17A0@illinois.edu>

It's possible this is being caused by UNIX-specific issues with the method call (note the output redirection). The problem is, I can't debug this myself w/o a Windows box to work off of. Anyone Windows-savvy who can help?
chris On Jan 22, 2011, at 4:28 AM, hanbobio wrote: > Hi,all > When I run the Bioperl script on Windows XP, there are some problem. > I think the cause relate to the Bioperl tools. But I don't know how to dissove the problem. > So I come here forhelp. > The problem is as the attached file > I had download the ClustalW2.exe, and the program is on C:/work/ClustalW2/clustalw2.exe. And the program worked when I used it directly. But it didn't work when I run the perl script(the attached file:test-new.pl). And I had installed the Bio-Run in the perl package management. > > Operation system: windows XP > Bioperl version: 1.6.0 > > Best regards! > yusheng liao_______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Mon Jan 24 13:41:12 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 24 Jan 2011 12:41:12 -0600 Subject: [Bioperl-l] Proposed improvement to to Bio::Tools::Run::Primer3Redux In-Reply-To: <344D48F6FA61134A9B17AE445882A195010C77@HC-MAILBOXC1-N5.healthcare.uiowa.edu> References: <344D48F6FA61134A9B17AE445882A195010C77@HC-MAILBOXC1-N5.healthcare.uiowa.edu> Message-ID: John, This patch is made off an older version of Bio-Tools-Primer3Redux, which is now hosted in a separate repo on GitHub: https://github.com/cjfields/Bio-Tools-Primer3Redux I get one patch failure against the latest code which is easily added (the SEQUENCE_PRIMER_PAIR_OK_REGION_LIST parameter), but tests now fail (see below). Can you resubmit this against the latest code? chris $ ./Build test --test-files t/Run/Primer3Redux.t --verbose t/Run/Primer3Redux.t .. Subroutine p3_settings_file redefined at /Users/cjfields/bioperl/Bio-Tools-Primer3Redux/blib/lib/Bio/Tools/Run/Primer3Redux.pm line 620. 
ok 1 - use Bio::Tools::Run::Primer3Redux; ok 2 ok 3 - program_name SEQUENCE_ID=Test1 SEQUENCE_TEMPLATE=AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACC PRIMER_EXPLAIN_FLAG=1 PRIMER_PRODUCT_SIZE_RANGE=100-250 PRIMER_SALT_CORRECTIONS=1 PRIMER_TASK=pick_pcr_primers PRIMER_TM_FORMULA=1 = Unknown open() mode '/Users/cjfields/bin/primer3_core Hi, > > Attached are my proposed diff for some changes for Bio::Tools::Run::Primer3Redux to more fully implement the new features of Primer3 version 2.x.x: > > 1. Adding support for the commond-line argument p3_settings_file that has been available for all 2.x.x versions, and > 2. Adding support for the "Sequence" tag SEQUENCE_PRIMER_PAIR_OK_REGION_LIST, a new function in version 2.2.3 > > Although I have used this module quite heavily in my projects and it appeared to run well, I'm not sure if there are bugs--not to say I have yet understand how to write /t scripts, so I wonder if someone would like to test this up. > > Cheers, > > John MC Ma > Graduate Assistant > Kwitek Lab > Department of Internal Medicine > 3125E MERF > 375 Newton Road > Iowa City IA 52242_______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jun.yin at ucd.ie Mon Jan 24 17:14:30 2011 From: jun.yin at ucd.ie (Jun Yin) Date: Mon, 24 Jan 2011 22:14:30 +0000 Subject: [Bioperl-l] Bioperl problem In-Reply-To: <8CA274A3-EE61-48DC-841C-3D80852A17A0@illinois.edu> References: <5a8b8f.9819.12dad44a2aa.Coremail.hanbobio@126.com> <8CA274A3-EE61-48DC-841C-3D80852A17A0@illinois.edu> Message-ID: <003801cbbc14$127ed1a0$377c74e0$%yin@ucd.ie> Hi, I haven't run the code, but I spotted a few problems in the code. 
Yusheng, can you try to set the environment as

BEGIN { $ENV{CLUSTALDIR} = "c:/work/ClustalW2" };

instead of

BEGIN { $ENV{CLUSTALDIR} = "c:/work/ClustalW2/clustalw2.exe" };

CLUSTALDIR should point to the directory that contains the program, not to the executable itself.

If that still doesn't work, try renaming your clustalw2.exe to clustalw as well.

Or, you can run clustalw2 with a system command and read in the alignment using Bio::AlignIO.

Cheers,
Jun Yin
Ph.D. student in U.C.D.

Bioinformatics Laboratory
Conway Institute
University College Dublin


-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields
Sent: Monday, January 24, 2011 6:30 PM
To: hanbobio
Cc: bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Bioperl problem

It's possible this is being caused by UNIX-specific issues with the method call (note the output redirection). The problem is, I can't debug this myself w/o a Windows box to work off of. Anyone Windows-savvy who can help?

chris

On Jan 22, 2011, at 4:28 AM, hanbobio wrote:

> Hi,all
> When I run the Bioperl script on Windows XP, there are some problem.
> I think the cause relate to the Bioperl tools. But I don't know how to dissove the problem.
> So I come here forhelp.
> The problem is as the attached file
> I had download the ClustalW2.exe, and the program is on C:/work/ClustalW2/clustalw2.exe. And the program worked when I used it directly. But it didn't work when I run the perl script(the attached file:test-new.pl). And I had installed the Bio-Run in the perl package management.
>
> Operation system: windows XP
> Bioperl version: 1.6.0
>
> Best regards!
> yusheng liao_______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From chiragmatkarbioinfo at gmail.com Tue Jan 25 01:29:21 2011 From: chiragmatkarbioinfo at gmail.com (chirag matkar) Date: Tue, 25 Jan 2011 13:29:21 +0700 Subject: [Bioperl-l] Bioperl problem In-Reply-To: <003801cbbc14$127ed1a0$377c74e0$%yin@ucd.ie> References: <5a8b8f.9819.12dad44a2aa.Coremail.hanbobio@126.com> <8CA274A3-EE61-48DC-841C-3D80852A17A0@illinois.edu> <003801cbbc14$127ed1a0$377c74e0$%yin@ucd.ie> Message-ID: Hello, I am getting an Exception while running this script on Windows XP. -------------------------'align' is not recognized as an internal or external co mmand, operable program or batch file. ------------- EXCEPTION ------------- MSG: ClustalW call ( align -infile=seq.txt -output=gcg -matrix=BLOSUM -ktuple =3 -outfile=C:\DOCUME~1\chirag\LOCALS~1\Temp\rIC8w2_4Ag\5JCfhbka_j 2>&1) crashed : 256 STACK Bio::Tools::Run::Alignment::Clustalw::_run C:/Perl/lib/Bio/Tools/Run/Align ment/Clustalw.pm:767 STACK Bio::Tools::Run::Alignment::Clustalw::align C:/Perl/lib/Bio/Tools/Run/Alig nment/Clustalw.pm:515 STACK toplevel C:\PROGRA~1\LUCKAS~1\ENGINS~1\data\CLUSTA~1.PL:33 ------------------------------------- Also the -outfile is taking any path which is not specified by me.is it because of this? On Tue, Jan 25, 2011 at 5:14 AM, Jun Yin wrote: > Hi, > > I haven't run the code, but I spotted a few problems in the code. > > Yusheng, can you try to set the environment as > > BEGIN { $ENV{CLUSTALDIR} = "c:/work/ClustalW2 "}; > Instead of > BEGIN { $ENV{CLUSTALDIR} = "c:/work/ClustalW2/clustalw2.exe"}; > > If that still doesn't work, try to rename your clustalw2.exe into clustalw > as well. 
> > Or, you can run your clustalw2 using system command, and read in the > alignment using Bio::AlignIO. > > Cheers, > Jun Yin > Ph.D. student in U.C.D. > > Bioinformatics Laboratory > Conway Institute > University College Dublin > > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields > Sent: Monday, January 24, 2011 6:30 PM > To: hanbobio > Cc: bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Bioperl problem > > It's possible this is being caused by UNIX-specific issues with the method > call (note the output redirection). The problem is, I can't debug this > myself w/o a Windows box to work off of. Anyone Windows-savvy who can > help? > > chris > > On Jan 22, 2011, at 4:28 AM, hanbobio wrote: > > > Hi,all > > When I run the Bioperl script on Windows XP, there are some problem. > > I think the cause relate to the Bioperl tools. But I don't know how to > dissove the problem. > > So I come here forhelp. > > The problem is as the attached file > > I had download the ClustalW2.exe, and the program is on > C:/work/ClustalW2/clustalw2.exe. And the program worked when I used it > directly. But it didn't work when I run the perl script(the attached > file:test-new.pl). And I had installed the Bio-Run in the perl package > management. > > > > Operation system: windows XP > > Bioperl version: 1.6.0 > > > > Best regards! 
> > yusheng liao Problem.doc>_______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Regards, Chirag Matkar From gawbul at gmail.com Tue Jan 25 16:43:35 2011 From: gawbul at gmail.com (Steve Moss) Date: Tue, 25 Jan 2011 21:43:35 +0000 Subject: [Bioperl-l] Bioperl problem (Chris Fields) Message-ID: Hi Chris, I have a windows box I can check this on. Could you forward the test-new.plscript, as I don't appear to have it attached to my digest. Cheers, Steve On 25 January 2011 17:00, wrote: > Date: Mon, 24 Jan 2011 12:29:45 -0600 > From: Chris Fields > Subject: Re: [Bioperl-l] Bioperl problem > To: hanbobio > Cc: bioperl-l at lists.open-bio.org > Message-ID: <8CA274A3-EE61-48DC-841C-3D80852A17A0 at illinois.edu> > Content-Type: text/plain; charset=us-ascii > > It's possible this is being caused by UNIX-specific issues with the method > call (note the output redirection). The problem is, I can't debug this > myself w/o a Windows box to work off of. Anyone Windows-savvy who can help? > > chris > > On Jan 22, 2011, at 4:28 AM, hanbobio wrote: > > > Hi,all > > When I run the Bioperl script on Windows XP, there are some problem. > > I think the cause relate to the Bioperl tools. But I don't know how to > dissove the problem. > > So I come here forhelp. > > The problem is as the attached file > > I had download the ClustalW2.exe, and the program is on > C:/work/ClustalW2/clustalw2.exe. And the program worked when I used it > directly. But it didn't work when I run the perl script(the attached file: > test-new.pl). 
And I had installed the Bio-Run in the perl package
> management.
> >
> > Operation system: windows XP
> > Bioperl version: 1.6.0
> >
> > Best regards!
> > yusheng liao
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
Kindest regards,

Steve Moss
http://stevemoss.ath.cx

From jason at bioperl.org Tue Jan 25 16:46:16 2011
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 25 Jan 2011 13:46:16 -0800
Subject: [Bioperl-l] Bio::TreeIO
In-Reply-To: <000801cbbcd6$ddd295e0$9977c1a0$@edu>
References: <000801cbbcd6$ddd295e0$9977c1a0$@edu>
Message-ID: <4D3F44A8.8020001@bioperl.org>

Hi Komal -

You would just iterate over all the nodes and unset the bootstrap value if it is less than your cutoff.

for my $node ( $tree->get_nodes ) {
    if ( $node->bootstrap < $CUTOFF ) {
        $node->bootstrap('');
    }
}

Whether the values are stored as bootstrap values or just as ids can be set when you read the tree in with Bio::TreeIO; you can also call $tree->move_id_to_bootstrap, which moves them over to the bootstrap slot.

Alternatively, you can rely on the fact that the ids of internal nodes are the bootstraps and use that field, e.g.

for my $node ( $tree->get_nodes ) {
    if ( ! $node->is_Leaf && $node->id < $CUTOFF ) {
        $node->id('');
    }
}

And write the tree back out. The HOWTO on Trees shows I/O for trees in more detail.

Best wishes.

Komal Jain wrote:
> Dear Jason,
>
>
>
> I went through the documentation of the Bio::Tree module and it is a very
> useful tool. I would like to know from you if it is possible to remove the
> bootstrap values smaller than a cutoff from the newick tree.
>
>
>
> Thanks in advance for your reply,
>
> Komal.
> -- Jason Stajich jason at bioperl.org http://bioperl.org/wiki From kj2230 at columbia.edu Tue Jan 25 16:50:39 2011 From: kj2230 at columbia.edu (Komal Jain) Date: Tue, 25 Jan 2011 16:50:39 -0500 Subject: [Bioperl-l] Bio::TreeIO In-Reply-To: <4D3F44A8.8020001@bioperl.org> References: <000801cbbcd6$ddd295e0$9977c1a0$@edu> <4D3F44A8.8020001@bioperl.org> Message-ID: <000d01cbbcd9$e83d7ce0$b8b876a0$@edu> Thanks very much for your prompt reply. I greatly appreciate it. Komal. -----Original Message----- From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason Stajich Sent: Tuesday, January 25, 2011 4:46 PM To: Komal Jain; BioPerl List Subject: Re: Bio::TreeIO Hi Komal - You would just iterate over all the nodes, unset the bootstrap value if it is < that your cutoff. for my $node ( $tree->get_nodes ) { if($node->bootstrap < $CUTOFF ){ $node->bootstrap(''); } } The subtly between whether values get moved to boostrap value or just as an id can be set when you do Bio::TreeIO reading in, you can also call $tree->move_id_to_bootstrap which moves them over to the bootstrap slot. Alternatively you can just know that ids for internal nodes are for boostraps and use that field, e.g. for my $node ( $tree->get_nodes ) { if(! $node->is_Leaf && $node->id < $CUTOFF ){ $node->id(''); } } And write the tree back out. The HOWTO on Trees shows I/O for trees in more detail. Best wishes. Komal Jain wrote: > Dear Jason, > > > > I went through the documentation of the Bio::Tree module and it is a > very useful tool. I would like to know from you if it is possible to > remove the bootstrap values smaller than a cutoff from the newick tree. > > > > Thanks in advance for your reply, > > Komal. 
> -- Jason Stajich jason at bioperl.org http://bioperl.org/wiki

From dichmann at berkeley.edu Tue Jan 25 16:59:28 2011
From: dichmann at berkeley.edu (Darwin Sorento Dichmann)
Date: Tue, 25 Jan 2011 13:59:28 -0800
Subject: [Bioperl-l] Landmark not recognized SeqFeature::Store database problem
Message-ID:

Hello,

I am trying to set up a SeqFeature::Store database. However, I consistently get "Landmark named scaffold_something is not recognized. See the help pages for suggestions." The Apache error log shows nothing. I am currently troubleshooting on a single scaffold and get the same error.

I've been banging my head against this for weeks and can't figure out what the problem is. Any help or pointers are greatly appreciated. Cc'ed to the bioperl list.

Best wishes,
Darwin

The database seems to load properly and I can find the scaffold name:
---
mysql> select seqname from locationlist;
+--------------+
| seqname      |
+--------------+
| scaffold_498 |
+--------------+
---
Command for loading the database and head of files:
-----
bp_seqfeature_load.pl -dsn frog3 -u darwin -p xxxxxxx -c -v fixed_scaffold_498.gff3 scaffold_498.fasta
loading scaffold_498.gff3...
Building object tree... 0.00s
load time: 0.02s
loading scaffold_498.fasta...
Building object tree... 0.00s
load time: 0.28s

Macintosh:scaffold_498 darwin$ head -25 fixed_scaffold_498.gff3
scaffold_498	pick	scaffold	1	875342	.	.	.	name=scaffold_498
scaffold_498	pick	gene	5249	38060	1000	+	.	ID=CUFF.398955;Name=
scaffold_498	pick	transcript	5249	38060	1000	+	.	ID=CUFF.398955.1;Name=;Parent=CUFF.398955
scaffold_498	pick	exon	5249	5359	1000	+	.	ID=CUFF.398955.1.1;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	5508	5661	1000	+	.	ID=CUFF.398955.1.2;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	7474	7647	1000	+	.	ID=CUFF.398955.1.3;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	7819	8005	1000	+	.	ID=CUFF.398955.1.4;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	8254	8419	1000	+	.	ID=CUFF.398955.1.5;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	9997	10111	1000	+	.	ID=CUFF.398955.1.6;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	12104	12302	1000	+	.	ID=CUFF.398955.1.7;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	13854	13954	1000	+	.	ID=CUFF.398955.1.8;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	14810	14989	1000	+	.	ID=CUFF.398955.1.9;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	15310	15394	1000	+	.	ID=CUFF.398955.1.10;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	15938	16044	1000	+	.	ID=CUFF.398955.1.11;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	17997	18072	1000	+	.	ID=CUFF.398955.1.12;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	18364	18472	1000	+	.	ID=CUFF.398955.1.13;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	19364	19488	1000	+	.	ID=CUFF.398955.1.14;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	21402	21549	1000	+	.	ID=CUFF.398955.1.15;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	27620	27726	1000	+	.	ID=CUFF.398955.1.16;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	29322	29436	1000	+	.	ID=CUFF.398955.1.17;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	29958	30129	1000	+	.	ID=CUFF.398955.1.18;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	30947	31055	1000	+	.	ID=CUFF.398955.1.19;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	31247	31309	1000	+	.	ID=CUFF.398955.1.20;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	31461	31526	1000	+	.	ID=CUFF.398955.1.21;Name=;Parent=CUFF.398955.1
scaffold_498	pick	exon	32478	32625	1000	+	.	ID=CUFF.398955.1.22;Name=;Parent=CUFF.398955.1

Macintosh:scaffold_498 darwin$ head -25 scaffold_498.fasta
>scaffold_498
TAATATTGTGATGGTCGGCCCTGTAGCTCCTGCTCTAATACAGAACCATTTTCCATTGTG
GTTTTCCAGTGAGATAAACCCTTAATCAAACTGCATAATAAACTGACGGTAGCACACTAG
CACTGGCTGTATCCCTTCTTGAGTAAAAAAAATAAATGGATCTGCCCTTGCTCTTTACCT
CATTAGGCCAGTGGAGTAGATTTGGCATTGGATCATAGCAACGTATGTTTCCAACACAAG
TGACACATTGCTGATTCAGTCCATCTTGACCAGGGCAAAGTTATTAGGTCATATATGGAG
TGAAAACAATGGATTTCCCTACTATTATCTAATCTGGGCTTGTTTGGATATGGATTGGGC
AAGCTGGAAATTGCCATTGGCTGAGGATCATATCAGGCTGTGGAGGCAGCCCATGAACAA
AAGGTCTTCATGAGCTTTTTAAATGATCATATTATTTCAGCCAAGCTTGACCCAACACTA
AGGTGGCTAAATGGGACGGAAGAGTTCTTGCGTGGCAGCAAAAATGAGACGGATCTTCTC
AAACTGCACAGCTTTAATTCTTAGAACATGTTTTCCTAGAGGTCAATACTTTGCATGTCT
GAGTTACCACATGATCGACTCAATCCCATGCTGTAGCCCTCAAGCTCACTGAGCCTTATT
ATCTTATTATTTATACTCTGTAGCCTCTTTGGTGCCAAATGTTCTATTGCATAGATGGCA
CAGGGTAACTTTCTGAGTGAGGTTCCTCAGAAATAAATTGAATATATGTCTGCGCTGCTG
ATGCTTAAAGTTTGGCTTGGGAACCCCACATTGGCACACTATTTATAGCCAGTGAGAGGT
AATTGCTAATATGAAGTGGATTGCCAAACTCATTCTATTTTGGATCACAGAGTGGTACat
acaggtatgggacccgttattcagaatgctcgggaccaagggtattctggataaggggtc
tttccgtaatttggatctccatacattaagtccactaaaaaatcaataaaacgttaataa
aacccagtaggactgttctgccccaataaagattaattatattttagttgggatcaagta
caggtactgttttattattacagagaaaagggaatcatttaaccattaaataaacccaat
aggactgttctgcccccaataaggggtaattatatcttagttgggatcaagtacaggtac
tgttttattattacagagaaaagggaatcatttaaccattaaataaacccaataggactg
ttctgcccccaataaggggtaattatatcttagttgggatcaagtacaggtactgtttta
ttattacAGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCCA
----

From scott at scottcain.net Tue Jan 25 17:36:07 2011
From: scott at scottcain.net (Scott Cain)
Date: Tue, 25 Jan 2011 17:36:07 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] Landmark not recognized SeqFeature::Store database problem
In-Reply-To:
References:
Message-ID:

Hi Darwin,

In column 9, case matters. That is, "name" != "Name". Try changing
that for the scaffold.
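The case problem described above can be checked mechanically before loading. What follows is only a minimal sketch in plain Perl, not part of BioPerl or of bp_seqfeature_load.pl: the function name miscased_tags is made up for illustration, and the tag list is the set of reserved attribute names from the GFF3 specification. It flags column 9 tags that match a reserved name except for letter case.

```perl
use strict;
use warnings;

# Reserved GFF3 attribute tags, keyed by their lowercase form.
my %RESERVED = map { lc($_) => $_ } qw(
    ID Name Alias Parent Target Gap Derives_from
    Note Dbxref Ontology_term Is_circular
);

# Hypothetical helper: for one GFF3 line, return a list of
# [tag_as_written, reserved_spelling] pairs for tags whose case is off.
sub miscased_tags {
    my ($line) = @_;
    chomp $line;
    return () if $line =~ /^#/;          # skip directives and comments
    my @cols = split /\t/, $line;
    return () unless @cols == 9;         # not a complete feature line
    my @problems;
    for my $pair (split /;/, $cols[8]) {
        my ($tag) = split /=/, $pair, 2;
        next unless defined $tag && length $tag;
        my $official = $RESERVED{ lc $tag };
        push @problems, [ $tag, $official ]
            if defined $official && $tag ne $official;
    }
    return @problems;
}

# Demo on the first two records of the file shown in this thread.
my @demo = (
    join("\t", qw(scaffold_498 pick scaffold 1 875342 . . .), 'name=scaffold_498'),
    join("\t", qw(scaffold_498 pick gene 5249 38060 1000 + .), 'ID=CUFF.398955;Name='),
);
for my $i (0 .. $#demo) {
    for my $p (miscased_tags($demo[$i])) {
        printf "record %d: tag '%s' should probably be '%s'\n", $i + 1, @$p;
    }
}
# prints: record 1: tag 'name' should probably be 'Name'
```

A lowercase tag is still legal GFF3 as a user-defined attribute, so a hit here is a hint to look at the line, not proof of an error.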
Scott

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

From dichmann at berkeley.edu Tue Jan 25 17:54:56 2011
From: dichmann at berkeley.edu (Darwin Sorento Dichmann)
Date: Tue, 25 Jan 2011 14:54:56 -0800
Subject: [Bioperl-l] [Gmod-gbrowse] Landmark not recognized SeqFeature::Store database problem
In-Reply-To:
References:
Message-ID:

Worked. Thanks Scott!

Now, when I expand to the whole genome, do the scaffold lines need to be in the same gff as the other features, and if so, do they have to be immediately before the features of the scaffold, similar to the test? Or can they be in a separate file or appended at the top/end of the feature file?

Darwin

On Jan 25, 2011, at 2:36 PM, Scott Cain wrote:

> Hi Darwin,
>
> In column 9, case matters. That is, "name" != "Name". Try changing
> that for the scaffold.
>
> Scott

From scott at scottcain.net Tue Jan 25 18:10:17 2011
From: scott at scottcain.net (Scott Cain)
Date: Tue, 25 Jan 2011 18:10:17 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] Landmark not recognized SeqFeature::Store database problem
In-Reply-To:
References:
Message-ID:

Hi Darwin,

Bio::DB::SeqFeature::Store does not care about the order in which entries appear in the GFF files, so the scaffold lines can be anywhere (many people do put them in a separate file).
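Since the scaffold lines can live in their own file, one way to produce that file for a whole genome is to derive it from the FASTA. The sketch below is plain Perl with no BioPerl dependency; the helper names fasta_lengths and landmark_line are hypothetical, and the source ("pick") and type ("scaffold") values are simply copied from the example file in this thread, so adjust them for your own data. Whether your setup wants further attributes beyond Name is not something this thread settles.

```perl
use strict;
use warnings;

# Hypothetical helper: read FASTA from a filehandle and return
# ([id, length], ...) in the order the sequences appear.
sub fasta_lengths {
    my ($fh) = @_;
    my (@order, %len);
    my $id;
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\r$//;                  # tolerate DOS line endings
        if ($line =~ /^>(\S+)/) {          # header: id is the first word
            $id = $1;
            push @order, $id unless exists $len{$id};
            $len{$id} //= 0;
        }
        elsif (defined $id) {
            $line =~ s/\s+//g;
            $len{$id} += length $line;
        }
    }
    return map { [ $_, $len{$_} ] } @order;
}

# Hypothetical helper: format one tab-separated GFF3 landmark line.
# Note the capital N in "Name", as pointed out earlier in the thread.
sub landmark_line {
    my ($id, $length, $source, $type) = @_;
    return join("\t", $id, $source, $type, 1, $length, '.', '.', '.', "Name=$id");
}

# Demo on a tiny in-memory FASTA (scaffold_498 here has 10 + 4 = 14 bases).
my $fasta = ">scaffold_498\nTAATATTGTG\nATGG\n>scaffold_499\nACGT\n";
open my $fh, '<', \$fasta or die "cannot open in-memory FASTA: $!";
print "##gff-version 3\n";
for my $rec (fasta_lengths($fh)) {
    my ($id, $length) = @$rec;
    print landmark_line($id, $length, 'pick', 'scaffold'), "\n";
}
close $fh;
```

To use it on real data, open the genome FASTA file instead of the in-memory string and redirect the output to a separate GFF3 file that is loaded alongside the feature file.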
Scott

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)
216-392-3087
Ontario Institute for Cancer Research

From s.denaxas at gmail.com Wed Jan 26 07:13:17 2011
From: s.denaxas at gmail.com (Spiros Denaxas)
Date: Wed, 26 Jan 2011 12:13:17 +0000
Subject: [Bioperl-l] medperl, something kinda like bioperl
Message-ID:

Hello,

I am sending this email here because I consider the people who contribute to and/or follow the bioperl project the best starting point for advice on a new project I am currently planning; my apologies if it's considered off-topic.

While the bioinformatics community has greatly benefitted from the Perl community, with bioperl as the shining example, the medical community is sadly a bit behind. I am currently employed in a public health / epidemiology environment and have on numerous occasions discovered opportunities to contribute code to CPAN that has made my life easier. I know I am not alone, but a very quick search on CPAN for related modules from the medical / biomedical domain does not return much for now.

I recently gave a presentation at the London Perl Workshop [1] and, while creating it, I thought: would it be useful to have something similar to bioperl for modules which largely contribute to the medical / epidemiological domain?

I was thinking of creating something like medperl, along the lines of bioperl but in a very, very simple form. It would serve as a reference point to the (albeit small) number of modules that are currently on CPAN and would also hopefully urge people to contribute some of their code along the way.

So I would like to request your advice on:

a) Can you think of any reasons for not doing this?
b) Does anybody know of something similar?
c) Does anybody feel like they could contribute?
Regards,
Spiros Denaxas

[1] http://www.slideshare.net/spirosd/perl-cures-coronary-heart-disease-lpw2010

From bernd.web at gmail.com Wed Jan 26 10:40:33 2011
From: bernd.web at gmail.com (Bernd Web)
Date: Wed, 26 Jan 2011 16:40:33 +0100
Subject: [Bioperl-l] Bio::SimpleAlign replace sequence
Message-ID:

Hi

Is it possible to replace the sequence itself for a Bio::LocatableSeq in SimpleAlign? I'd like to change the sequence string for a sequence in the alignment directly. I could remove this sequence and add it again, but this would change the order. (I remember a patch was made to insert a sequence with add_seq at a specific position, but still.)

I'd like to change a sequence like this. I realize the below cannot work, but is there an option to do this?

$aln->get_seq_by_pos(1)->seq = "MMM";

that is, via the methods, instead of doing something low level like

$aln->{_seq}->{myname}->{seq} = "MMM";

Regards,
Bernd

From roy.chaudhuri at gmail.com Wed Jan 26 10:55:33 2011
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Wed, 26 Jan 2011 15:55:33 +0000
Subject: [Bioperl-l] Bio::SimpleAlign replace sequence
In-Reply-To:
References:
Message-ID: <4D4043F5.90602@gmail.com>

Hi Bernd,

This works for me:

$aln->get_seq_by_pos(1)->seq("MMM");

Cheers,
Roy.

On 26/01/2011 15:40, Bernd Web wrote:
> I'd like to change a sequence like this. I realize the below cannot
> work, but is there an option to do this?
>
> $aln->get_seq_by_pos(1)->seq = "MMM";

From dan.bolser at gmail.com Wed Jan 26 11:57:04 2011
From: dan.bolser at gmail.com (Dan Bolser)
Date: Wed, 26 Jan 2011 16:57:04 +0000
Subject: [Bioperl-l] Undefined subroutine &Bio::DB::SeqFeature::Store::uncompress called
Message-ID:

Hi,

I just updated BioPerl from git / GBrowse from svn (to check if a particular bug was fixed), and I'm seeing the following message in large friendly red letters across the top of my GBrowse instance:

Undefined subroutine &Bio::DB::SeqFeature::Store::uncompress called.

Unfortunately, none of the usual sequence features are displayed (no overview or details panels).

Here is a snippet of the Apache error log:

[Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Math::BigInt: couldn't load specified math lib(s), fallback to Math::BigInt::Calc at /homes/www-potato/perl5/lib/perl5/Crypt/DH.pm line 6, referer: https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001
[Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Math::BigInt: couldn't load specified math lib(s), fallback to Math::BigInt::Calc at /homes/www-potato/perl5/lib/perl5/Crypt/DH.pm line 6, referer: https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001
[Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Undefined subroutine &Bio::DB::SeqFeature::Store::uncompress called at /homes/www-potato/perl5/lib/perl5/Bio/DB/SeqFeature/Store.pm line 2531., referer: https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001

Thanks for guidance,
Dan.
From plattj at cardiff.ac.uk Wed Jan 26 17:05:08 2011
From: plattj at cardiff.ac.uk (JayPea)
Date: Wed, 26 Jan 2011 14:05:08 -0800 (PST)
Subject: [Bioperl-l] Newbie question on Bio::SeqIO
Message-ID: <30768204.post@talk.nabble.com>

Hi all.

I recently installed bioperl on my mac (OSX 10.6.6) using fink, and have been playing around trying to get some really simple things to work. What I'm trying to do is just grab 20 bases of the fasta file and then print them out.

This is my script:

#!/usr/bin/perl
use Bio::Perl;
use Bio::SeqIO;


my $seqio_obj = Bio::SeqIO->new(-file => "dna.fa",
                                -format => 'Fasta');
my $output = $seqio_obj->subseq(1,20);
print "$output\n";

fasta file:

>chr1 D_discoideum_Ax4_May_2005 4923596 bp DDB0232428
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTATAGTTACTATTGTAAATC
GATAGATAACTTAATTTCATTAATATTATACATAGTAACATTATAAAAAACTTTTAATTT
TTATTTGGGAATTTCAAATTGCTCATTTGGGAAAATTTTTAACTAAGAAAAAATTCAAAA

I get this error:

Can't locate object method "subseq" via package "Bio::SeqIO::fasta" at ./biotester.pl line 16.

Thanks for any help!

James

--
View this message in context: http://old.nabble.com/Newbie-question-on-Bio%3A%3ASeqIO-tp30768204p30768204.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From florent.angly at gmail.com Wed Jan 26 17:24:59 2011
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 26 Jan 2011 17:24:59 -0500
Subject: [Bioperl-l] Newbie question on Bio::SeqIO
In-Reply-To: <30768204.post@talk.nabble.com>
References: <30768204.post@talk.nabble.com>
Message-ID: <4D409F3B.6090509@gmail.com>

Hi James,

This should get you started: http://www.bioperl.org/wiki/HOWTO:SeqIO

Best,
Florent

From asjo at koldfront.dk Wed Jan 26 17:30:10 2011
From: asjo at koldfront.dk (Adam Sjøgren)
Date: Wed, 26 Jan 2011 23:30:10 +0100
Subject: [Bioperl-l] Newbie question on Bio::SeqIO
In-Reply-To: <30768204.post@talk.nabble.com> (JayPea's message of "Wed, 26 Jan 2011 14:05:08 -0800 (PST)")
References: <30768204.post@talk.nabble.com>
Message-ID: <87aainxpql.fsf@topper.koldfront.dk>

On Wed, 26 Jan 2011 14:05:08 -0800 (PST), JayPea wrote:

> #!/usr/bin/perl
> use Bio::Perl;
> use Bio::SeqIO;

> my $seqio_obj = Bio::SeqIO->new(-file => "dna.fa",
>                                 -format => 'Fasta');
> my $output = $seqio_obj->subseq(1,20);

Try replacing that line by these two:

  my $seq=$seqio_obj->next_seq;
  my $output=$seq->subseq(1,20);

> print "$output\n";

See "perldoc Bio::SeqIO" :-)


Also: Never forget to start your scripts with "use strict; use
warnings;" - a great way to avoid typos and stuff.
Best regards, Adam -- "Här kommer rädslan, gamle vän Adam Sjøgren När alla fjärilar i magen vaknar upp asjo at koldfront.dk Viskar välkommen hem" From cjfields at illinois.edu Wed Jan 26 18:07:51 2011 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 26 Jan 2011 17:07:51 -0600 Subject: [Bioperl-l] Newbie question on Bio::SeqIO In-Reply-To: <87aainxpql.fsf@topper.koldfront.dk> References: <30768204.post@talk.nabble.com> <87aainxpql.fsf@topper.koldfront.dk> Message-ID: Also, FASTQ parsing only works correctly if you are using BioPerl 1.6.1. Not sure what version fink currently has. chris On Jan 26, 2011, at 4:30 PM, Adam Sjøgren wrote: > On Wed, 26 Jan 2011 14:05:08 -0800 (PST), JayPea wrote: > >> #!/usr/bin/perl >> use Bio::Perl; >> use Bio::SeqIO; > >> my $seqio_obj = Bio::SeqIO->new(-file => "dna.fa", >> -format => 'Fasta'); >> my $output = $seqio_obj->subseq(1,20); > > Try replacing that line by these two: > > my $seq=$seqio_obj->next_seq; > my $output=$seq->subseq(1,20); > >> print "$output\n"; > > See "perldoc Bio::SeqIO" :-) > > > Also: Never forget to start your scripts with "use strict; use > warnings;" - a great way to avoid typos and stuff. > > > Best regards, > > Adam > > -- > "Här kommer rädslan, gamle vän Adam Sjøgren > När alla fjärilar i magen vaknar upp asjo at koldfront.dk > Viskar välkommen hem" > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From plattj at cardiff.ac.uk Wed Jan 26 12:00:20 2011 From: plattj at cardiff.ac.uk (JayPea) Date: Wed, 26 Jan 2011 09:00:20 -0800 (PST) Subject: [Bioperl-l] Newbie question on Bio::SeqIO Message-ID: <30765870.post@talk.nabble.com> Hi all. I recently installed bioperl on my mac (OSX 10.6.6) using fink. And have been playing around trying to get some really simple things to work. SO what I'm trying to do is just grab 20bases of the fasta file then print them out.
This is my script: #!/usr/bin/perl use Bio::Perl; use Bio::SeqIO; my $seqio_obj = Bio::SeqIO->new(-file => "dna.fa", -format => 'Fasta'); my $output = $seqio_obj->subseq(1,20); print "$output\n"; fasta file: >chr1 D_discoideum_Ax4_May_2005 4923596 bp DDB0232428 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTATAGTTACTATTGTAAATC GATAGATAACTTAATTTCATTAATATTATACATAGTAACATTATAAAAAACTTTTAATTT TTATTTGGGAATTTCAAATTGCTCATTTGGGAAAATTTTTAACTAAGAAAAAATTCAAAA I get this error: Can't locate object method "subseq" via package "Bio::SeqIO::fasta" at ./biotester.pl line 16. Thanks for any help! James -- View this message in context: http://old.nabble.com/Newbie-question-on-Bio%3A%3ASeqIO-tp30765870p30765870.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From joshua_udall at byu.edu Wed Jan 26 17:11:46 2011 From: joshua_udall at byu.edu (Joshua Udall) Date: Wed, 26 Jan 2011 15:11:46 -0700 Subject: [Bioperl-l] Newbie question on Bio::SeqIO In-Reply-To: <30768204.post@talk.nabble.com> Message-ID: Use this bash command to see which libraries are loading in perl. Ideally, you would make sure the fink library is represented in the list. A quick fix would be to simply use a 'lib' statement at the beginning of your script to source your Bioperl path. perl -e 'for (@INC) { printf "%d %s\n", $i++, $_}' Josh On 1/26/11 3:05 PM, "JayPea" wrote: > >Hi all. > >I recently installed bioperl on my mac (OSX 10.6.6) using fink. And have >been playing around trying to get some really simple things to work. SO >what >I'm trying to do is just grab 20bases of the fasta file then print them >out. 
> >This is my script: > >#!/usr/bin/perl >use Bio::Perl; >use Bio::SeqIO; > > >my $seqio_obj = Bio::SeqIO->new(-file => "dna.fa", > -format => 'Fasta'); >my $output = $seqio_obj->subseq(1,20); >print "$output\n"; > >fasta file: > >>chr1 D_discoideum_Ax4_May_2005 4923596 bp DDB0232428 >NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN >NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTATAGTTACTATTGTAAATC >GATAGATAACTTAATTTCATTAATATTATACATAGTAACATTATAAAAAACTTTTAATTT >TTATTTGGGAATTTCAAATTGCTCATTTGGGAAAATTTTTAACTAAGAAAAAATTCAAAA > >I get this error: > >Can't locate object method "subseq" via package "Bio::SeqIO::fasta" at >./biotester.pl line 16. > >Thanks for any help! > >James >-- >View this message in context: >http://old.nabble.com/Newbie-question-on-Bio%3A%3ASeqIO-tp30768204p3076820 >4.html >Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > >_______________________________________________ >Bioperl-l mailing list >Bioperl-l at lists.open-bio.org >http://lists.open-bio.org/mailman/listinfo/bioperl-l From dan.bolser at gmail.com Thu Jan 27 05:13:42 2011 From: dan.bolser at gmail.com (Dan Bolser) Date: Thu, 27 Jan 2011 10:13:42 +0000 Subject: [Bioperl-l] Undefined subroutine &Bio::DB::SeqFeature::Store::uncompress called In-Reply-To: References: Message-ID: On 26 January 2011 16:57, Dan Bolser wrote: > Hi, > > I just updated BioPerl from git / GBrowse from svn (to check if a So is it true that 'latest' GBrowse has a maximum BioPerl version requirement? v1.6.1 ? The only only 'release' branch I could find was "origin/release-1-6-2"... After ./Build install I still see the same error. Can someone create a v1.6.1 branch? Or should I use a tag? Cheers, Dan. P.S. 
I have largely revised* and then followed the guide here: http://www.bioperl.org/wiki/Using_Git http://www.bioperl.org/wiki/Using_Git#Release_branches After perl Build.PL I see the following error (just for reference): Use of uninitialized value in split at Bio/Root/Build.pm line 769, line 3. After ./Build test: Test Summary Report ------------------- t/SeqIO/embl.t (Wstat: 512 Tests: 85 Failed: 0) Non-zero exit status: 2 Parse errors: Bad plan. You planned 95 tests but ran 85. Files=350, Tests=24189, 280 wallclock secs ( 8.85 usr 2.37 sys + 210.23 cusr 28.67 csys = 250.12 CPU) Result: FAIL Failed 1/350 test programs. 0/24189 subtests failed. * I moved the bulk of the content here (as it is only relevant for a small group of developers): http://www.bioperl.org/wiki/Using_Git/Advanced I figure splitting out the page gives each part a clearer focus, allowing them to be improved more easily. > particular bug was fixed), and I'm seeing the following message in > large friendly red letters across the top of my GBrowse instance: > > Undefined subroutine &Bio::DB::SeqFeature::Store::uncompress called. > > > Unfortunately, none of the usual sequence features are displayed (no > overview or details panels).
> > Here is a snippet of the Apache error log: > > [Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Math::BigInt: > couldn't load specified math lib(s), fallback to Math::BigInt::Calc at > /homes/www-potato/perl5/lib/perl5/Crypt/DH.pm line 6, referer: > https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001 > [Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Math::BigInt: > couldn't load specified math lib(s), fallback to Math::BigInt::Calc at > /homes/www-potato/perl5/lib/perl5/Crypt/DH.pm line 6, referer: > https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001 > [Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Undefined > subroutine &Bio::DB::SeqFeature::Store::uncompress called at > /homes/www-potato/perl5/lib/perl5/Bio/DB/SeqFeature/Store.pm line > 2531., referer: > https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001 > > > > Thanks for guidance, > Dan. > From dan.bolser at gmail.com Thu Jan 27 05:15:23 2011 From: dan.bolser at gmail.com (Dan Bolser) Date: Thu, 27 Jan 2011 10:15:23 +0000 Subject: [Bioperl-l] Undefined subroutine &Bio::DB::SeqFeature::Store::uncompress called In-Reply-To: References: Message-ID: On 27 January 2011 10:13, Dan Bolser wrote: > On 26 January 2011 16:57, Dan Bolser wrote: >> Hi, >> >> I just updated BioPerl from git / GBrowse from svn (to check if a > > So is it true that 'latest' GBrowse has a maximum BioPerl version requirement? Or should I simply not be using GBrowse from SVN? > v1.6.1 ? > > The only only 'release' branch I could find was "origin/release-1-6-2"... > > After ./Build install I still see the same error. Can someone create a > v1.6.1 branch? Or should I use a tag? > > > Cheers, > Dan. > > P.S. 
I have largely revised* and then followed the guide here: > > http://www.bioperl.org/wiki/Using_Git > http://www.bioperl.org/wiki/Using_Git#Release_branches > > > After perl Build.PL I see the following error (just for reference): > > Use of uninitialized value in split at Bio/Root/Build.pm line 769, > line 3. > > > After ./Build test: > > Test Summary Report > ------------------- > t/SeqIO/embl.t (Wstat: 512 Tests: 85 Failed: 0) > Non-zero exit status: 2 > Parse errors: Bad plan. You planned 95 tests but ran 85. > Files=350, Tests=24189, 280 wallclock secs ( 8.85 usr 2.37 sys + > 210.23 cusr 28.67 csys = 250.12 CPU) > Result: FAIL > Failed 1/350 test programs. 0/24189 subtests failed. > > > * I moved the bulk of the content here (as it is only relevant for a > small group of developers): > http://www.bioperl.org/wiki/Using_Git/Advanced > > I figure splitting out the page gives each part a clearer focus, > allowing them to be improved more easily. > >> particular bug was fixed), and I'm seeing the following message in >> large friendly red letters across the top of my GBrowse instance: >> >> Undefined subroutine &Bio::DB::SeqFeature::Store::uncompress called. >> >> >> Unfortunately, none of the usual sequence features are displayed (no >> overview or details panels).
>> >> Here is a snippet of the Apache error log: >> >> [Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Math::BigInt: >> couldn't load specified math lib(s), fallback to Math::BigInt::Calc at >> /homes/www-potato/perl5/lib/perl5/Crypt/DH.pm line 6, referer: >> https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001 >> [Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Math::BigInt: >> couldn't load specified math lib(s), fallback to Math::BigInt::Calc at >> /homes/www-potato/perl5/lib/perl5/Crypt/DH.pm line 6, referer: >> https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001 >> [Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Undefined >> subroutine &Bio::DB::SeqFeature::Store::uncompress called at >> /homes/www-potato/perl5/lib/perl5/Bio/DB/SeqFeature/Store.pm line >> 2531., referer: >> https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001 >> >> >> >> Thanks for guidance, >> Dan. >> > From dan.bolser at gmail.com Thu Jan 27 05:37:44 2011 From: dan.bolser at gmail.com (Dan Bolser) Date: Thu, 27 Jan 2011 10:37:44 +0000 Subject: [Bioperl-l] Undefined subroutine &Bio::DB::SeqFeature::Store::uncompress called In-Reply-To: References: Message-ID: On 27 January 2011 10:13, Dan Bolser wrote: > On 26 January 2011 16:57, Dan Bolser wrote: >> Hi, >> >> I just updated BioPerl from git / GBrowse from svn (to check if a > > So is it true that 'latest' GBrowse has a maximum BioPerl version requirement? > > v1.6.1 ? > > The only only 'release' branch I could find was "origin/release-1-6-2"... > > After ./Build install I still see the same error. Can someone create a > v1.6.1 branch? Or should I use a tag? OK, I used the bioperl-release-1-6-1 tag. perl Build.PL gave the following error (for reference): Can't find dist packages without a MANIFEST file - run 'manifest' action first at Bio/Root/Build.pm line 605, line 4. 
WARNING: Possible missing or corrupt 'MANIFEST' file. Nothing to enter for 'provides' field in META.yml Creating new 'Build' script for 'BioPerl' version '1.006001' Tested and installed fine, but didn't fix the GBrowse issue. I guess this requires rolling back GBrowse rather than BioPerl. > Cheers, > Dan. > > P.S. I have largely revised* and then followed the guide here: > > http://www.bioperl.org/wiki/Using_Git > http://www.bioperl.org/wiki/Using_Git#Release_branches > > > After perl Build.PL I see the following error (just for reference): > > Use of uninitialized value in split at Bio/Root/Build.pm line 769, > line 3. > > > After ./Build test: > > Test Summary Report > ------------------- > t/SeqIO/embl.t (Wstat: 512 Tests: 85 Failed: 0) > Non-zero exit status: 2 > Parse errors: Bad plan. You planned 95 tests but ran 85. > Files=350, Tests=24189, 280 wallclock secs ( 8.85 usr 2.37 sys + > 210.23 cusr 28.67 csys = 250.12 CPU) > Result: FAIL > Failed 1/350 test programs. 0/24189 subtests failed. > > > * I moved the bulk of the content here (as it is only relevant for a > small group of developers): > http://www.bioperl.org/wiki/Using_Git/Advanced > > I figure splitting out the page gives each part a clearer focus, > allowing them to be improved more easily. > >> particular bug was fixed), and I'm seeing the following message in >> large friendly red letters across the top of my GBrowse instance: >> >> Undefined subroutine &Bio::DB::SeqFeature::Store::uncompress called. >> >> >> Unfortunately, none of the usual sequence features are displayed (no >> overview or details panels).
>> >> Here is a snippet of the Apache error log: >> >> [Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Math::BigInt: >> couldn't load specified math lib(s), fallback to Math::BigInt::Calc at >> /homes/www-potato/perl5/lib/perl5/Crypt/DH.pm line 6, referer: >> https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001 >> [Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Math::BigInt: >> couldn't load specified math lib(s), fallback to Math::BigInt::Calc at >> /homes/www-potato/perl5/lib/perl5/Crypt/DH.pm line 6, referer: >> https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001 >> [Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Undefined >> subroutine &Bio::DB::SeqFeature::Store::uncompress called at >> /homes/www-potato/perl5/lib/perl5/Bio/DB/SeqFeature/Store.pm line >> 2531., referer: >> https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001 >> >> >> >> Thanks for guidance, >> Dan. >> > From plattj at Cardiff.ac.uk Thu Jan 27 08:02:13 2011 From: plattj at Cardiff.ac.uk (James Platt) Date: Thu, 27 Jan 2011 13:02:13 +0000 Subject: [Bioperl-l] Newbie question on Bio::SeqIO In-Reply-To: References: <30768204.post@talk.nabble.com> Message-ID: I tried this already I got a different error: Unrecognized character \xE2 in column 21 at ./biotester.pl line 18. I then copied your script in and it worked, then I had my script and that identical and mine still didn't work. Not sure why it was happening. 
So I have my sequence now and I'm trying to reverese complement it: $output1 = $seq_obj->subseq(160,180); print "$output1\n"; $rev_output1 = revcom( $output1 ); print "$rev_output1\n"; I get this output: ATTATAAAAAACTTTTAATTT Bio::PrimarySeq=HASH(0x9c8e20) I also try it in an object orientated manner: $rev_output1 = $output1->revcom; ATTATAAAAAACTTTTAATTT Can't locate object method "revcom" via package "ATTATAAAAAACTTTTAATTT" (perhaps you forgot to load "ATTATAAAAAACTTTTAATTT"?) at biotester.pl line 24, line 1. I've been following instructions at http://www.bioperl.org/wiki/HOWTO:Beginners Thanks again! James On 26 Jan 2011, at 22:15, Yifei Huang wrote: > subseq is not a member function in SeqIO class. You need to use next_seq function in SeqIO class to read a sequence and then fetch a subsequence, which might be something like this. > > > #!/usr/bin/perl > use Bio::Perl; > use Bio::SeqIO; > > > my $seqio_obj = Bio::SeqIO->new(-file => "dna.fa", > -format => 'Fasta'); > my $seq_obj = $seqio_obj->next_seq(); > my $output = $seq_obj->subseq(1,20); > print "$output\n"; > > On Wed, Jan 26, 2011 at 5:05 PM, JayPea wrote: > > Hi all. > > I recently installed bioperl on my mac (OSX 10.6.6) using fink. And have > been playing around trying to get some really simple things to work. SO what > I'm trying to do is just grab 20bases of the fasta file then print them out. 
> > This is my script: > > #!/usr/bin/perl > use Bio::Perl; > use Bio::SeqIO; > > > my $seqio_obj = Bio::SeqIO->new(-file => "dna.fa", > -format => 'Fasta'); > my $output = $seqio_obj->subseq(1,20); > print "$output\n"; > > fasta file: > > >chr1 D_discoideum_Ax4_May_2005 4923596 bp DDB0232428 > NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN > NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTATAGTTACTATTGTAAATC > GATAGATAACTTAATTTCATTAATATTATACATAGTAACATTATAAAAAACTTTTAATTT > TTATTTGGGAATTTCAAATTGCTCATTTGGGAAAATTTTTAACTAAGAAAAAATTCAAAA > > I get this error: > > Can't locate object method "subseq" via package "Bio::SeqIO::fasta" at > ./biotester.pl line 16. > > Thanks for any help! > > James > -- > View this message in context: http://old.nabble.com/Newbie-question-on-Bio%3A%3ASeqIO-tp30768204p30768204.html > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > Yifei Huang > Department of Biology > McMaster University From Shuai.Zhan at umassmed.edu Thu Jan 27 00:21:41 2011 From: Shuai.Zhan at umassmed.edu (Zhan, Shuai) Date: Thu, 27 Jan 2011 00:21:41 -0500 Subject: [Bioperl-l] question of seq() method of Bio::DB::GFF and Bio::PrimarySeq Message-ID: Dear BIOPERL developers, I'm a post doc in UMASS Medical School. Recently I carried out some genome-wide annotation work, but faced some difficulty with using Bio::DB::GFF to fetch DNA sequences, whose seq() method is required by another critical softwares. I found the key issue is that the method of seq() of Bio::DB::GFF (inherited from Bio::PrimarySeq?) can't exactly return the actual sequence string but something like reference. This question made me upset for several days and stoped my progress, but I have not found any related help from google or my friends. So I'm assuming you could give me some valuable information. 
My Perl is 5.8.8 and my system is CentOS 5.3. I loaded a GFF file with annotation for both the CDS features and the reference sequences using bp_load_gff.pl, together with the raw FASTA sequence file. I wrote a test script following the instructions on the website. [code] use Bio::DB::GFF; my $db = Bio::DB::GFF->new(-dsn => 'dbi:mysql:brenneri', -user => 'me', -password => '123',); my $contig0 = $db->segment(-class => 'Sequence', -name => 'Contig0'); my $subcontig = $contig0->subseq(1,100); print "returned by seq(): ", $subcontig->seq, "\n"; print "returned by dna(): ", $subcontig->dna, "\n\n"; my $transcripts = $db->segment(-class => 'Transcript', -name => 'fgp17221.t1'); my @exons = $transcripts->features('CDS'); print "returned by seq(): ", $exons[0]->seq, "\n"; print "returned by dna(): ", $exons[0]->dna, "\n\n"; my $upstream = $exons[0]->subseq(-20,0); print "returned by seq(): ", $upstream->seq, "\n"; print "returned by dna(): ", $upstream->dna, "\n\n"; [/code] The result is as follows: $perl test.pl returned by seq(): Bio::PrimarySeq=HASH(0x1afa40d0) returned by dna(): GTAGATCACTTTTTATTCTCGAAGAATAATTTTTGAGCTATGTAAAAGCGTGTATACCTCCCAGTTTTCCGGTTAAAAGTCCAGGATCGCATGTCTCATG returned by seq(): Bio::PrimarySeq=HASH(0x1afa3ab0) returned by dna(): AAGGAGCTCAACTTCAAACACCAAACACTTTGAACCAACATTTTTCTGGAGAAAACGTAAGAATTGAAAAGTCAAGAATGAATCCACATGTGAAAATTGAACAGAGAT CAATGGCAATGGGTGGAGGAGGAGGGGATGAACAAATGAAAAACTTTACGGAAATGACGAATGAAGAACTCCGTGAGCGGTTAATGAAAATGCAAATGGATATGCAGAATCTTC AAATGGCAATGG returned by seq(): Bio::PrimarySeq=HASH(0x1afa3fb0) returned by dna(): GGAGTTTGGACCGTGTTTCAG Because dna() does return the actual sequence, I think my database loaded correctly, but I have no idea why seq() does not return the sequence. I'd greatly appreciate any help. Sincerely, Shuai Jan 27 2010 364 Plantation Str.
MA 01605 UMASS MED From roy.chaudhuri at gmail.com Thu Jan 27 08:20:03 2011 From: roy.chaudhuri at gmail.com (Roy Chaudhuri) Date: Thu, 27 Jan 2011 13:20:03 +0000 Subject: [Bioperl-l] Newbie question on Bio::SeqIO In-Reply-To: References: <30768204.post@talk.nabble.com> Message-ID: <4D417103.5070107@gmail.com> Hi James, It's important to not confuse sequence objects with sequence strings. If you try to print out a sequence object, you will get something like Bio::PrimarySeq=HASH(0x9c8e20). But you can call methods like revcom on sequence objects - if you try to do that with a sequence string you get an error. The subseq method: $seq_obj->subseq(160,180) returns a plain text string of the sequence. If you want a sequence object that you can call additional methods on, you need to use a different method, called trunc, which returns a sequence object: $seq_obj->trunc(160,180) To get a sequence string from a sequence object, use the seq method: $seq_obj->seq So to get the reverse complement of your subsequence as a string you would do: my $trunc=$seq_obj->trunc(160,180); my $revcom=$trunc->revcom; my $rev_output=$revcom->seq; Or you can combine them all into one line: my $rev_output=$seq_obj->trunc(160,180)->revcom->seq; Hope this helps. Roy. On 27/01/2011 13:02, James Platt wrote: > I tried this already I got a different error: > > Unrecognized character \xE2 in column 21 at ./biotester.pl line 18. > > I then copied your script in and it worked, then I had my script and > that identical and mine still didn't work. Not sure why it was > happening. 
> > So I have my sequence now and I'm trying to reverese complement it: > > $output1 = $seq_obj->subseq(160,180); print "$output1\n"; > $rev_output1 = revcom( $output1 ); print "$rev_output1\n"; > > I get this output: > > ATTATAAAAAACTTTTAATTT Bio::PrimarySeq=HASH(0x9c8e20) > > I also try it in an object orientated manner: > > $rev_output1 = $output1->revcom; > > ATTATAAAAAACTTTTAATTT Can't locate object method "revcom" via package > "ATTATAAAAAACTTTTAATTT" (perhaps you forgot to load > "ATTATAAAAAACTTTTAATTT"?) at biotester.pl line 24, line 1. > > I've been following instructions at > http://www.bioperl.org/wiki/HOWTO:Beginners > > Thanks again! > > James > > > On 26 Jan 2011, at 22:15, Yifei Huang wrote: > >> subseq is not a member function in SeqIO class. You need to use >> next_seq function in SeqIO class to read a sequence and then fetch >> a subsequence, which might be something like this. >> >> >> #!/usr/bin/perl use Bio::Perl; use Bio::SeqIO; >> >> >> my $seqio_obj = Bio::SeqIO->new(-file => "dna.fa", -format => >> 'Fasta'); my $seq_obj = $seqio_obj->next_seq(); my $output = >> $seq_obj->subseq(1,20); print "$output\n"; >> >> On Wed, Jan 26, 2011 at 5:05 PM, JayPea >> wrote: >> >> Hi all. >> >> I recently installed bioperl on my mac (OSX 10.6.6) using fink. And >> have been playing around trying to get some really simple things to >> work. SO what I'm trying to do is just grab 20bases of the fasta >> file then print them out. 
>> >> This is my script: >> >> #!/usr/bin/perl use Bio::Perl; use Bio::SeqIO; >> >> >> my $seqio_obj = Bio::SeqIO->new(-file => "dna.fa", -format => >> 'Fasta'); my $output = $seqio_obj->subseq(1,20); print >> "$output\n"; >> >> fasta file: >> >>> chr1 D_discoideum_Ax4_May_2005 4923596 bp DDB0232428 >> NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN >> NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTATAGTTACTATTGTAAATC >> GATAGATAACTTAATTTCATTAATATTATACATAGTAACATTATAAAAAACTTTTAATTT >> TTATTTGGGAATTTCAAATTGCTCATTTGGGAAAATTTTTAACTAAGAAAAAATTCAAAA >> >> I get this error: >> >> Can't locate object method "subseq" via package "Bio::SeqIO::fasta" >> at ./biotester.pl line 16. >> >> Thanks for any help! >> >> James -- View this message in context: >> http://old.nabble.com/Newbie-question-on-Bio%3A%3ASeqIO-tp30768204p30768204.html >> >> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. >> >> _______________________________________________ Bioperl-l mailing >> list Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> >> -- Yifei Huang Department of Biology McMaster University > > > _______________________________________________ Bioperl-l mailing > list Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From dan.bolser at gmail.com Thu Jan 27 08:32:06 2011 From: dan.bolser at gmail.com (Dan Bolser) Date: Thu, 27 Jan 2011 13:32:06 +0000 Subject: [Bioperl-l] Undefined subroutine &Bio::DB::SeqFeature::Store::uncompress called In-Reply-To: References: Message-ID: On 27 January 2011 10:37, Dan Bolser wrote: > On 27 January 2011 10:13, Dan Bolser wrote: >> On 26 January 2011 16:57, Dan Bolser wrote: >>> Hi, >>> >>> I just updated BioPerl from git / GBrowse from svn (to check if a >> >> So is it true that 'latest' GBrowse has a maximum BioPerl version requirement? >> >> v1.6.1 ? >> >> The only only 'release' branch I could find was "origin/release-1-6-2"... 
>> >> After ./Build install I still see the same error. Can someone create a >> v1.6.1 branch? Or should I use a tag? > > OK, I used the bioperl-release-1-6-1 tag. > > perl Build.PL gave the following error (for reference): > > Can't find dist packages without a MANIFEST file - run 'manifest' > action first at Bio/Root/Build.pm line 605, line 4. > > WARNING: Possible missing or corrupt 'MANIFEST' file. > Nothing to enter for 'provides' field in META.yml > Creating new 'Build' script for 'BioPerl' version '1.006001' > > > Tested and installed fine, but didn't fix the GBrowse issue. I guess > this requires a GB roll back rather than a BP roll back. So... I'm rolling back GB release by release using "svn switch" here: https://gmod.svn.sourceforge.net/svnroot/gmod/Generic-Genome-Browser/tags/ I'm back to 2.17 and I finally see this problem clear up. As mentioned, this is with BioPerl 'tag' 1.6.1. Interestingly, I still don't see any features (I don't see the error, and I do see the overview and details panels showing correctly). Now I'm seeing an error in the error_log like this: [Thu Jan 27 12:10:32 2011] [error] [client x.x.x.x] Use of uninitialized value in string comparison (cmp) at /homes/www-potato/perl5/lib/perl5/x86_64-linux-thread-multi/Bio/Graphics/Browser2/Render/HTML.pm line 1909., referer: https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001 However, updating back to the latest bioperl-live fixes this issue (i.e. the bioperl version was a red-herring). Cheers, Dan. >> Cheers, >> Dan. >> >> P.S. I have largely revised* and then followed the the guide here: >> >> http://www.bioperl.org/wiki/Using_Git >> http://www.bioperl.org/wiki/Using_Git#Release_branches >> >> >> After perl Build.PL I see the following error (just for reference): >> >> Use of uninitialized value in split at Bio/Root/Build.pm line 769, >> line 3. >> >> >> After ./Build test: >> >> Test Summary Report >> ------------------- >> t/SeqIO/embl.t ? ? ? ? ? ? 
? ? ? ? ? ? ? ? (Wstat: 512 Tests: 85 Failed: 0) >> ?Non-zero exit status: 2 >> ?Parse errors: Bad plan. ?You planned 95 tests but ran 85. >> Files=350, Tests=24189, 280 wallclock secs ( 8.85 usr ?2.37 sys + >> 210.23 cusr 28.67 csys = 250.12 CPU) >> Result: FAIL >> Failed 1/350 test programs. 0/24189 subtests failed. >> >> >> * I moved the bulk of the content here (as it is only relevant for a >> small group of developers): >> http://www.bioperl.org/wiki/Using_Git/Advanced >> >> I figure splitting out the page gives each part a clearer focus, >> allowing them to be improved more easily. >> >>> particular bug was fixed), and I'm seeing the following message in >>> large friendly red letters across the top of my GBrowse instance: >>> >>> Undefined subroutine &Bio::DB::SeqFeature::Store::uncompress called. >>> >>> >>> Unfortunately, none of the usual sequence features are displayed (no >>> overview or details panels). >>> >>> Here is a snippet of the Apache error log: >>> >>> [Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Math::BigInt: >>> couldn't load specified math lib(s), fallback to Math::BigInt::Calc at >>> /homes/www-potato/perl5/lib/perl5/Crypt/DH.pm line 6, referer: >>> https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001 >>> [Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Math::BigInt: >>> couldn't load specified math lib(s), fallback to Math::BigInt::Calc at >>> /homes/www-potato/perl5/lib/perl5/Crypt/DH.pm line 6, referer: >>> https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001 >>> [Wed Jan 26 16:55:15 2011] [error] [client x.x.x.x] Undefined >>> subroutine &Bio::DB::SeqFeature::Store::uncompress called at >>> /homes/www-potato/perl5/lib/perl5/Bio/DB/SeqFeature/Store.pm line >>> 2531., referer: >>> https://x/potato/cgi-bin/gbrowse2/gbrowse/pot_qa/?name=PGSC0003DMB000000115%3A1..600%2C001 >>> >>> >>> >>> Thanks for guidance, >>> Dan. 
>>> >> > From scott at scottcain.net Thu Jan 27 08:37:19 2011 From: scott at scottcain.net (Scott Cain) Date: Thu, 27 Jan 2011 08:37:19 -0500 Subject: [Bioperl-l] question of seq() method of Bio::DB::GFF and Bio::PrimarySeq In-Reply-To: References: Message-ID: Hi Shuai, The seq method does not return a sequence string, but a sequence object (a Bio::PrimarySeq, going by your output), on which you can call many methods, including just getting the sequence string, but you can also do things like translating and getting the reverse complement. To get the DNA string, call the seq() method on that object: my $dna = $upstream->seq->seq; The first invocation of seq is calling the seq method in Bio::DB::GFF, which returns the sequence object, and the second call of seq calls the seq method of that object, which returns a string. The dna method in Bio::DB::GFF is a short cut for doing exactly this (I think--it's been a while since I've used that). Scott 2011/1/27 Zhan, Shuai : > Dear BIOPERL developers, > > I'm a post doc in UMASS Medical School. > > Recently I carried out some genome-wide annotation work, but faced some difficulty with using Bio::DB::GFF to fetch DNA sequences, whose seq() method is required by another critical softwares. > > I found the key issue is that the method of seq() of Bio::DB::GFF (inherited from Bio::PrimarySeq?) can't exactly return the actual sequence string but something like reference. This question made me upset for several days and stoped my progress, but I have not found any related help from google or my friends. So I'm assuming you could give me some valuable information. > > My perl is 5.8.8 and system is CentOS 5.3. > I loaded gff file with annotation information both of CDS and reference by bp_load_gff.pl. The raw fasta sequence file was also loaded together. > > I wrote a test script just like your instruction on website.
> [code] > use Bio::DB::GFF; > my $db = Bio::DB::GFF->new(-dsn => 'dbi:mysql:brenneri', -user => 'me', -password => '123',); > my $contig0 = $db->segment(-class => 'Sequence', -name => 'Contig0'); > my $subcontig = $contig0->subseq(1,100); > print "returned by seq(): ", $upstream->seq, "\n"; > print "returned by dna(): ", $upstream->dna, "\n\n"; > my $transcripts = $db->segment(-class => 'Transcript', -name => 'fgp17221.t1'); > my @exons = $transcripts->features('CDS'); > print "returned by seq(): ", $exons[0]->seq "\n"; > print "returned by seq(): ", $exons[0]->dna "\n\n"; > my $upstream = $exons[0]->subseq(-20,0); > print "returned by seq(): ", $upstream->seq "\n"; > print "returned by seq(): ", $upstream->dna "\n\n"; > [/code] > > The result as follows: > $perl test.pl > returned by seq(): Bio::PrimarySeq=HASH(0x1afa40d0) > returned by dna(): GTAGATCACTTTTTATTCTCGAAGAATAATTTTTGAGCTATGTAAAAGCGTGTATACCTCCCAGTTTTCCGGTTAAAAGTCCAGGATCGCATGTCTCATG > > returned by seq(): Bio::PrimarySeq=HASH(0x1afa3ab0) > returned by dna(): AAGGAGCTCAACTTCAAACACCAAACACTTTGAACCAACATTTTTCTGGAGAAAACGTAAGAATTGAAAAGTCAAGAATGAATCCACATGTGAAAATTGAACAGAGAT > CAATGGCAATGGGTGGAGGAGGAGGGGATGAACAAATGAAAAACTTTACGGAAATGACGAATGAAGAACTCCGTGAGCGGTTAATGAAAATGCAAATGGATATGCAGAATCTTC > AAATGGCAATGG > > returned by seq(): Bio::PrimarySeq=HASH(0x1afa3fb0) > returned by dna(): GGAGTTTGGACCGTGTTTCAG > > Because the method of dna() could exactly return the actual sequence, so I think my loaded database should work. But I have no idea of why the method seq() missed the sequence. > > I'd greatly appreciate any help. > > Sincerly, > Shuai > Jan 27 2010 > > 364 Plantation Str. > MA 01605 > UMASS MED > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- ------------------------------------------------------------------------ Scott Cain, Ph. D.? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?? 
scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research From plattj at Cardiff.ac.uk Thu Jan 27 12:06:20 2011 From: plattj at Cardiff.ac.uk (James Platt) Date: Thu, 27 Jan 2011 17:06:20 +0000 Subject: [Bioperl-l] Newbie question on Bio::SeqIO In-Reply-To: <4D417103.5070107@gmail.com> References: <30768204.post@talk.nabble.com> <4D417103.5070107@gmail.com> Message-ID: <795228C8-9E40-4DFB-BDA6-B4913F02DED3@Cardiff.ac.uk> Thanks Roy, I am definitely a newbie to bioperl. James On 27 Jan 2011, at 13:20, Roy Chaudhuri wrote: > Hi James, > > It's important to not confuse sequence objects with sequence strings. If you try to print out a sequence object, you will get something like Bio::PrimarySeq=HASH(0x9c8e20). But you can call methods like revcom on sequence objects - if you try to do that with a sequence string you get an error. > > The subseq method: > $seq_obj->subseq(160,180) > returns a plain text string of the sequence. If you want a sequence object that you can call additional methods on, you need to use a different method, called trunc, which returns a sequence object: > $seq_obj->trunc(160,180) > > To get a sequence string from a sequence object, use the seq method: > $seq_obj->seq > > So to get the reverse complement of your subsequence as a string you would do: > my $trunc=$seq_obj->trunc(160,180); > my $revcom=$trunc->revcom; > my $rev_output=$revcom->seq; > > Or you can combine them all into one line: > my $rev_output=$seq_obj->trunc(160,180)->revcom->seq; > > Hope this helps. > Roy. > > On 27/01/2011 13:02, James Platt wrote: >> I tried this already I got a different error: >> >> Unrecognized character \xE2 in column 21 at ./biotester.pl line 18. >> >> I then copied your script in and it worked, then I had my script and >> that identical and mine still didn't work. Not sure why it was >> happening. 
>> >> So I have my sequence now and I'm trying to reverese complement it: >> >> $output1 = $seq_obj->subseq(160,180); print "$output1\n"; >> $rev_output1 = revcom( $output1 ); print "$rev_output1\n"; >> >> I get this output: >> >> ATTATAAAAAACTTTTAATTT Bio::PrimarySeq=HASH(0x9c8e20) >> >> I also try it in an object orientated manner: >> >> $rev_output1 = $output1->revcom; >> >> ATTATAAAAAACTTTTAATTT Can't locate object method "revcom" via package >> "ATTATAAAAAACTTTTAATTT" (perhaps you forgot to load >> "ATTATAAAAAACTTTTAATTT"?) at biotester.pl line 24, line 1. >> >> I've been following instructions at >> http://www.bioperl.org/wiki/HOWTO:Beginners >> >> Thanks again! >> >> James >> >> >> On 26 Jan 2011, at 22:15, Yifei Huang wrote: >> >>> subseq is not a member function in SeqIO class. You need to use >>> next_seq function in SeqIO class to read a sequence and then fetch >>> a subsequence, which might be something like this. >>> >>> >>> #!/usr/bin/perl use Bio::Perl; use Bio::SeqIO; >>> >>> >>> my $seqio_obj = Bio::SeqIO->new(-file => "dna.fa", -format => >>> 'Fasta'); my $seq_obj = $seqio_obj->next_seq(); my $output = >>> $seq_obj->subseq(1,20); print "$output\n"; >>> >>> On Wed, Jan 26, 2011 at 5:05 PM, JayPea >>> wrote: >>> >>> Hi all. >>> >>> I recently installed bioperl on my mac (OSX 10.6.6) using fink. And >>> have been playing around trying to get some really simple things to >>> work. SO what I'm trying to do is just grab 20bases of the fasta >>> file then print them out. 
>>> >>> This is my script: >>> >>> #!/usr/bin/perl use Bio::Perl; use Bio::SeqIO; >>> >>> >>> my $seqio_obj = Bio::SeqIO->new(-file => "dna.fa", -format => >>> 'Fasta'); my $output = $seqio_obj->subseq(1,20); print >>> "$output\n"; >>> >>> fasta file: >>> >>>> chr1 D_discoideum_Ax4_May_2005 4923596 bp DDB0232428 >>> NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN >>> NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTATAGTTACTATTGTAAATC >>> GATAGATAACTTAATTTCATTAATATTATACATAGTAACATTATAAAAAACTTTTAATTT >>> TTATTTGGGAATTTCAAATTGCTCATTTGGGAAAATTTTTAACTAAGAAAAAATTCAAAA >>> >>> I get this error: >>> >>> Can't locate object method "subseq" via package "Bio::SeqIO::fasta" >>> at ./biotester.pl line 16. >>> >>> Thanks for any help! >>> >>> James -- View this message in context: >>> http://old.nabble.com/Newbie-question-on-Bio%3A%3ASeqIO-tp30768204p30768204.html >>> >>> > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. >>> >>> _______________________________________________ Bioperl-l mailing >>> list Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >>> >>> -- Yifei Huang Department of Biology McMaster University >> >> >> _______________________________________________ Bioperl-l mailing >> list Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From florent.angly at gmail.com Thu Jan 27 13:16:45 2011 From: florent.angly at gmail.com (Florent Angly) Date: Thu, 27 Jan 2011 13:16:45 -0500 Subject: [Bioperl-l] question of seq() method of Bio::DB::GFF and Bio::PrimarySeq In-Reply-To: References: Message-ID: <4D41B68D.4030003@gmail.com> Hi Shuai, Scott answered your question, but you can find more information pertinent to your question here: http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Getting_Sequences Best, Florent On 27/01/11 08:37, Scott Cain wrote: > Hi Shuai, > > The seq method does not return a sequence string, but a sequence > object (Bio::Seq), on which you 
can perform many methods, including > just getting the sequence string, but you can also doing things like > translating and getting the reverse complement. To get the DNA > string, call the seq() method on your Bio::Seq object: > > my $upstream->seq->seq; > > The first invocation of seq is calling the seq method in Bio::DB::GFF, > which returns the Bio:Seq object, and the second call of seq calls the > seq method in Bio::Seq, which returns a string. The dna method in > Bio::DB::GFF is a short cut for doing exactly this (I think--it's been > a while since I've used that). > > Scott > > > 2011/1/27 Zhan, Shuai: >> Dear BIOPERL developers, >> >> I'm a post doc in UMASS Medical School. >> >> Recently I carried out some genome-wide annotation work, but faced some difficulty with using Bio::DB::GFF to fetch DNA sequences, whose seq() method is required by another critical softwares. >> >> I found the key issue is that the method of seq() of Bio::DB::GFF (inherited from Bio::PrimarySeq?) can't exactly return the actual sequence string but something like reference. This question made me upset for several days and stoped my progress, but I have not found any related help from google or my friends. So I'm assuming you could give me some valuable information. >> >> My perl is 5.8.8 and system is CentOS 5.3. >> I loaded gff file with annotation information both of CDS and reference by bp_load_gff.pl. The raw fasta sequence file was also loaded together. >> >> I wrote a test script just like your instruction on website. 
>> [code] >> use Bio::DB::GFF; >> my $db = Bio::DB::GFF->new(-dsn => 'dbi:mysql:brenneri', -user => 'me', -password => '123',); >> my $contig0 = $db->segment(-class => 'Sequence', -name => 'Contig0'); >> my $subcontig = $contig0->subseq(1,100); >> print "returned by seq(): ", $upstream->seq, "\n"; >> print "returned by dna(): ", $upstream->dna, "\n\n"; >> my $transcripts = $db->segment(-class => 'Transcript', -name => 'fgp17221.t1'); >> my @exons = $transcripts->features('CDS'); >> print "returned by seq(): ", $exons[0]->seq "\n"; >> print "returned by seq(): ", $exons[0]->dna "\n\n"; >> my $upstream = $exons[0]->subseq(-20,0); >> print "returned by seq(): ", $upstream->seq "\n"; >> print "returned by seq(): ", $upstream->dna "\n\n"; >> [/code] >> >> The result as follows: >> $perl test.pl >> returned by seq(): Bio::PrimarySeq=HASH(0x1afa40d0) >> returned by dna(): GTAGATCACTTTTTATTCTCGAAGAATAATTTTTGAGCTATGTAAAAGCGTGTATACCTCCCAGTTTTCCGGTTAAAAGTCCAGGATCGCATGTCTCATG >> >> returned by seq(): Bio::PrimarySeq=HASH(0x1afa3ab0) >> returned by dna(): AAGGAGCTCAACTTCAAACACCAAACACTTTGAACCAACATTTTTCTGGAGAAAACGTAAGAATTGAAAAGTCAAGAATGAATCCACATGTGAAAATTGAACAGAGAT >> CAATGGCAATGGGTGGAGGAGGAGGGGATGAACAAATGAAAAACTTTACGGAAATGACGAATGAAGAACTCCGTGAGCGGTTAATGAAAATGCAAATGGATATGCAGAATCTTC >> AAATGGCAATGG >> >> returned by seq(): Bio::PrimarySeq=HASH(0x1afa3fb0) >> returned by dna(): GGAGTTTGGACCGTGTTTCAG >> >> Because the method of dna() could exactly return the actual sequence, so I think my loaded database should work. But I have no idea of why the method seq() missed the sequence. >> >> I'd greatly appreciate any help. >> >> Sincerly, >> Shuai >> Jan 27 2010 >> >> 364 Plantation Str. 
>> MA 01605 >> UMASS MED >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > From chiragmatkarbioinfo at gmail.com Fri Jan 28 01:57:25 2011 From: chiragmatkarbioinfo at gmail.com (chirag matkar) Date: Fri, 28 Jan 2011 13:57:25 +0700 Subject: [Bioperl-l] question of seq() method of Bio::DB::GFF and Bio::PrimarySeq In-Reply-To: <4D41B68D.4030003@gmail.com> References: <4D41B68D.4030003@gmail.com> Message-ID: Dear Shuai, Objects in Bioperl are References ,generally hash references,where keys are attributes like name and source of sequence So to get values , You need to Deference them to fetch values stored at the memory location. my $hash_value=$hash_ref->{'some_key'}; Hope it Helps. On Fri, Jan 28, 2011 at 1:16 AM, Florent Angly wrote: > Hi Shuai, > Scott answered your question, but you can find more information pertinent > to your question here: > http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Getting_Sequences > Best, > Florent > > > On 27/01/11 08:37, Scott Cain wrote: > >> Hi Shuai, >> >> The seq method does not return a sequence string, but a sequence >> object (Bio::Seq), on which you can perform many methods, including >> just getting the sequence string, but you can also doing things like >> translating and getting the reverse complement. To get the DNA >> string, call the seq() method on your Bio::Seq object: >> >> my $upstream->seq->seq; >> >> The first invocation of seq is calling the seq method in Bio::DB::GFF, >> which returns the Bio:Seq object, and the second call of seq calls the >> seq method in Bio::Seq, which returns a string. The dna method in >> Bio::DB::GFF is a short cut for doing exactly this (I think--it's been >> a while since I've used that). >> >> Scott >> >> >> 2011/1/27 Zhan, Shuai: >> >>> Dear BIOPERL developers, >>> >>> I'm a post doc in UMASS Medical School. 
>>> >>> Recently I carried out some genome-wide annotation work, but faced some >>> difficulty with using Bio::DB::GFF to fetch DNA sequences, whose seq() >>> method is required by another critical softwares. >>> >>> I found the key issue is that the method of seq() of Bio::DB::GFF >>> (inherited from Bio::PrimarySeq?) can't exactly return the actual sequence >>> string but something like reference. This question made me upset for several >>> days and stoped my progress, but I have not found any related help from >>> google or my friends. So I'm assuming you could give me some valuable >>> information. >>> >>> My perl is 5.8.8 and system is CentOS 5.3. >>> I loaded gff file with annotation information both of CDS and reference >>> by bp_load_gff.pl. The raw fasta sequence file was also loaded together. >>> >>> I wrote a test script just like your instruction on website. >>> [code] >>> use Bio::DB::GFF; >>> my $db = Bio::DB::GFF->new(-dsn => 'dbi:mysql:brenneri', -user => 'me', >>> -password => '123',); >>> my $contig0 = $db->segment(-class => 'Sequence', -name => 'Contig0'); >>> my $subcontig = $contig0->subseq(1,100); >>> print "returned by seq(): ", $upstream->seq, "\n"; >>> print "returned by dna(): ", $upstream->dna, "\n\n"; >>> my $transcripts = $db->segment(-class => 'Transcript', -name => >>> 'fgp17221.t1'); >>> my @exons = $transcripts->features('CDS'); >>> print "returned by seq(): ", $exons[0]->seq "\n"; >>> print "returned by seq(): ", $exons[0]->dna "\n\n"; >>> my $upstream = $exons[0]->subseq(-20,0); >>> print "returned by seq(): ", $upstream->seq "\n"; >>> print "returned by seq(): ", $upstream->dna "\n\n"; >>> [/code] >>> >>> The result as follows: >>> $perl test.pl >>> returned by seq(): Bio::PrimarySeq=HASH(0x1afa40d0) >>> returned by dna(): >>> GTAGATCACTTTTTATTCTCGAAGAATAATTTTTGAGCTATGTAAAAGCGTGTATACCTCCCAGTTTTCCGGTTAAAAGTCCAGGATCGCATGTCTCATG >>> >>> returned by seq(): Bio::PrimarySeq=HASH(0x1afa3ab0) >>> returned by dna(): >>> 
AAGGAGCTCAACTTCAAACACCAAACACTTTGAACCAACATTTTTCTGGAGAAAACGTAAGAATTGAAAAGTCAAGAATGAATCCACATGTGAAAATTGAACAGAGAT >>> >>> CAATGGCAATGGGTGGAGGAGGAGGGGATGAACAAATGAAAAACTTTACGGAAATGACGAATGAAGAACTCCGTGAGCGGTTAATGAAAATGCAAATGGATATGCAGAATCTTC >>> AAATGGCAATGG >>> >>> returned by seq(): Bio::PrimarySeq=HASH(0x1afa3fb0) >>> returned by dna(): GGAGTTTGGACCGTGTTTCAG >>> >>> Because the method of dna() could exactly return the actual sequence, so >>> I think my loaded database should work. But I have no idea of why the method >>> seq() missed the sequence. >>> >>> I'd greatly appreciate any help. >>> >>> Sincerly, >>> Shuai >>> Jan 27 2010 >>> >>> 364 Plantation Str. >>> MA 01605 >>> UMASS MED >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Regards, Chirag Matkar From dan.bolser at gmail.com Fri Jan 28 07:08:47 2011 From: dan.bolser at gmail.com (Dan Bolser) Date: Fri, 28 Jan 2011 12:08:47 +0000 Subject: [Bioperl-l] DBD::mysql::st execute failed: Column 'seqname' cannot be null Message-ID: Anyone seen this? Google didn't give me any clues: ## Load feature data use Bio::SeqFeature::Generic; use Bio::DB::SeqFeature::Store; ... my $feat = Bio::SeqFeature::Generic-> new( -seq_id => 'x' -source_tag => 'dund', -primary_tag => 'megascaffold', -start => 1, -end => 100, -score => 0, -strand => '+', -phase => ',', -attributes => { ID => 'x', Name => 'x' } ); $db->store($feat) or die "Couldn't store!"; which gives: DBD::mysql::st execute failed: Column 'seqname' cannot be null at ~/perl5/lib/perl5/Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1574, <> line 590. 
Transaction aborted because DBD::mysql::st execute failed: Column 'seqname' cannot be null at ~/perl5/lib/perl5/Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1574, <> line 590. Couldn't store! at ./for_gbrowse.plx line 90, <> line 590. From dan.bolser at gmail.com Fri Jan 28 08:01:28 2011 From: dan.bolser at gmail.com (Dan Bolser) Date: Fri, 28 Jan 2011 13:01:28 +0000 Subject: [Bioperl-l] DBD::mysql::st execute failed: Column 'seqname' cannot be null In-Reply-To: References: Message-ID: I think the bug may be due to lines like this, combined with the fact that my sequence id is '0' in this case (in Perl the string '0' is false, so || and ||= treat it as if no value had been set)! $types{$type}{seqname} ||= $seqname; On 28 January 2011 12:08, Dan Bolser wrote: > Anyone seen this? Google didn't give me any clues: > > ## Load feature data > use Bio::SeqFeature::Generic; > use Bio::DB::SeqFeature::Store; > > ... > > my $feat = Bio::SeqFeature::Generic-> > new( -seq_id => 'x' > -source_tag => 'dund', > 
-primary_tag => 'megascaffold', > -start => 1, > -end => 100, > -score => 0, > -strand => '+', > -phase => ',', > -attributes => { ID => 'x', Name => 'x' } > ); > > $db->store($feat) > or die "Couldn't store!"; > > > which gives: > > DBD::mysql::st execute failed: Column 'seqname' cannot be null at > ~/perl5/lib/perl5/Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1574, <> > line 590. 
> Transaction aborted because DBD::mysql::st execute failed: Column > 'seqname' cannot be null at > ~/perl5/lib/perl5/Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1574, <> > line 590. > Couldn't store! at ./for_gbrowse.plx line 90, <> line 590. > From Shuai.Zhan at umassmed.edu Fri Jan 28 13:03:17 2011 From: Shuai.Zhan at umassmed.edu (Zhan, Shuai) Date: Fri, 28 Jan 2011 13:03:17 -0500 Subject: [Bioperl-l] questions about generating gene model by GLEAN Message-ID: Hi, Has anyone used GLEAN to generate a consensus gene set? Would you mind sharing some experience with me? I'd greatly appreciate any help. I have been trying to run it for a couple of weeks. At first I found that it fetched an object from the database rather than the actual sequence, so I changed some code in fetchseq {} of GLEAN::Evidence::Base (->seq->seq instead of ->seq). So far, it has begun processing my input and correctly analyzes the candidate start, stop, donor, and acceptor sites for the first contig. But it still fails with something that looks like "MSG: asking for tag value that does not exist Evidence". $ glean-lca --database glean --user me --password 123 --param param.yaml > test.dat No reference provided; attempting to analyze entire genome Gathering evidence from 'FGENE' for scaff Contig0:1,1129007 ... 
The initial exon of CDS:FGENESH(fgp17144.t1) did not begin with a valid start codon Extending the terminal exon of CDS:FGENESH(fgp17240.t1) to next valid downstream stop codon The initial exon of CDS:FGENESH(fgp17165.t1) did not begin with a valid start codon Extending the initial exon of CDS:FGENESH(fgp17192.t1) to next valid upstream start codon The initial exon of CDS:FGENESH(fgp17178.t1) did not begin with a valid start codon Error providing evidence type: GeneModel The error was: ------------- EXCEPTION ------------- MSG: asking for tag value that does not exist Evidence STACK Bio::SeqFeature::Generic::get_tag_values /usr/lib/perl5/site_perl/5.8.8/Bio/SeqFeature/Generic.pm:517 STACK Glean::Site::dump /home/zhan/geneset/glean-gene/bin/../lib/Glean/Site.pm:52 STACK Glean::MLE::_add_evidence /home/zhan/geneset/glean-gene/bin/../lib/Glean/MLE.pm:167 STACK Glean::MLE::add_evidence /home/zhan/geneset/glean-gene/bin/../lib/Glean/MLE.pm:94 STACK (eval) /home/zhan/geneset/glean-gene/bin/../lib/Glean/MLE.pm:203 STACK Glean::MLE::estimate /home/zhan/geneset/glean-gene/bin/../lib/Glean/MLE.pm:202 STACK toplevel glean-gene/bin/glean-lca:172 I think it failed in the add_evidence method of Glean::MLE. Following the stack trace, I also printed some key scalars with print statements. For example, it failed while adding the first candidate site to the evidence. The current $site is Contig0:1011027:1011027:0:-1; its primary_tag is "start", its seq_id is "Contig0", and its strand is -1. First, _add_evidence {} of Glean/MLE.pm was invoked. The problem is that $site->dump needs the value of the "Evidence" tag for $site, but that tag does not exist for this site. Then I found that each $site in @sites is created by list_starts {} of Glean::Evidence::GeneModel, but its initial tags are only "NextStop" and "ReadingFrame". I can't find any hint of where the "Evidence" tag is added to $site. 
The related GLEAN source code is listed below: sub _add_evidence{ my @sites = $evid->list_sites($scaff); for my $site (@sites) { ... $stype->{$sloc}->{_site} ||= $site->dump; } ... } sub dump { # from Glean/Site.pm ... return join("\t", $site->seq_id, $site->primary_tag, defined $site->score ? sprintf("%g", $site->score == 0 ? $site->worst : -log($site->score)) : "NA", ... join(";", $site->get_tag_values("Evidence")) ); ... } sub list_sites { # from Evidence/GeneModel.pm ... push @sites, $self->list_starts(@cds); ... } sub list_starts { # from Evidence/GeneModel.pm ... push @sites, Glean::Site->new(-primary => "start", -start => $pos + $str * (pos($startseq) - 3), -end => $pos + $str * (pos($startseq) - 3), -strand => $str, -frame => 0, -source => "$self", -seq_id => $seq_id, -tag => { NextStop => $nextstop, ReadingFrame => $frame, }, ); ... } sub add_evidence{ ... eval { $self->add_evidence($scaff, $evtype->new($db, $self->{_log}, $algo, $params)); }; croak("Error providing evidence type: $type\nThe error was:\n$@") if $@; ... } Sincerely Shuai Zhan From bbimber at gmail.com Fri Jan 28 13:18:56 2011 From: bbimber at gmail.com (Ben Bimber) Date: Fri, 28 Jan 2011 12:18:56 -0600 Subject: [Bioperl-l] Bioperl-run Wrappers Message-ID: Hello, I'm using CommandExts to wrap a number of tools. In a pipeline, I wanted the tools to log their current version. I realized that instead of using run() in a unique way for each tool, perhaps there should be a consistent method that gets called and returns a version string. Because obtaining this version string is specific to the tool, perhaps each wrapper could provide a version() method that runs the appropriate command on the executable, parses the output, and returns a string. Has something like this been discussed? Have others already solved this? 
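As a rough sketch of the idea, a wrapper could run the executable with a tool-specific flag and pull the first dotted number out of whatever it prints. The names tool_version and parse_version are hypothetical and are not existing bioperl-run methods, and the default --version flag is an assumption that will not hold for every tool:

```perl
use strict;
use warnings;

# Hypothetical helper: extract the first dotted version number from text.
sub parse_version {
    my ($text) = @_;
    return unless defined $text;
    return $text =~ /(\d+(?:\.\d+)+)/ ? $1 : undef;
}

# Hypothetical helper: run an executable with a version flag and parse
# whatever it prints. STDERR is captured too, since many tools report
# their version there. The flag defaults to --version (an assumption).
sub tool_version {
    my ($exe, $flag) = @_;
    $flag = '--version' unless defined $flag;
    my $output = `"$exe" $flag 2>&1`;
    return parse_version($output);
}

# Example: ask the running perl interpreter for its own version.
my $version = tool_version($^X);
print "perl version: ", (defined $version ? $version : 'unknown'), "\n";
```

Each wrapper would then only need to supply its own flag (and, where the output is unusual, its own parsing rule), while callers log the result through one consistent method.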
Thanks, Ben From amackey at virginia.edu Fri Jan 28 14:47:54 2011 From: amackey at virginia.edu (Aaron Mackey) Date: Fri, 28 Jan 2011 14:47:54 -0500 Subject: [Bioperl-l] questions about generating gene model by GLEAN In-Reply-To: References: Message-ID: Hi Shuai, As I wrote yesterday, please send me your .yaml file, as there is probably something in there causing the problem. -Aaron -- Aaron J. Mackey, PhD Assistant Professor Center for Public Health Genomics University of Virginia amackey at virginia.edu http://www.cphg.virginia.edu/mackey On Fri, Jan 28, 2011 at 1:03 PM, Zhan, Shuai wrote: > Hi, > > Is there someone used GLEAN for generating consensus gene set? Would you > mind sharing some experience with me? I'd greatly appreciate any help. > > I have tried for run it for couple of weeks. At first I found it can't > fetch the real sequence from database but on object, then I change some some > codes of fetchseq {} of GLEAN::Evidence::Base. (->seq->seq instead of ->seq) > > By far, it has began processing my input and correctly analyzed the > candidate start, stop, donor, and acceptor for the first contig. > But it still failed with something looks like "MSG: asking for tag value > that does not exist Evidence". > > $glean-lca --database glean --user me --password 123 --param param.yaml > > test.dat > No reference provided; attempting to analyze entire genome > Gathering evidence from 'FGENE' for scaff Contig0:1,1129007 ... 
> The initial exon of CDS:FGENESH(fgp17144.t1) did not begin with a valid > start codon > Extending the terminal exon of CDS:FGENESH(fgp17240.t1) to next valid > downstream stop codon > The initial exon of CDS:FGENESH(fgp17165.t1) did not begin with a valid > start codon > Extending the initial exon of CDS:FGENESH(fgp17192.t1) to next valid > upstream start codon > The initial exon of CDS:FGENESH(fgp17178.t1) did not begin with a valid > start codon > Error providing evidence type: GeneModel > The error was: > ------------- EXCEPTION ------------- > MSG: asking for tag value that does not exist Evidence > STACK Bio::SeqFeature::Generic::get_tag_values > /usr/lib/perl5/site_perl/5.8.8/Bio/SeqFeature/Generic.pm:517 > STACK Glean::Site::dump > /home/zhan/geneset/glean-gene/bin/../lib/Glean/Site.pm:52 > STACK Glean::MLE::_add_evidence > /home/zhan/geneset/glean-gene/bin/../lib/Glean/MLE.pm:167 > STACK Glean::MLE::add_evidence > /home/zhan/geneset/glean-gene/bin/../lib/Glean/MLE.pm:94 > STACK (eval) /home/zhan/geneset/glean-gene/bin/../lib/Glean/MLE.pm:203 > STACK Glean::MLE::estimate > /home/zhan/geneset/glean-gene/bin/../lib/Glean/MLE.pm:202 > STACK toplevel glean-gene/bin/glean-lca:172 > > I think it failed in add_evidence method of Glean::MLE. > According to the track, I also report some key scalar by print sentense. > > For example, > It failed in adding the first candidate site to evidence. > The current $site is Contig0:1011027:1011027:0:-1, its primary_tag is > "start", its seq_id is "Contig0" its strand is -1. > At first it invoked _add_evidence {} of Glean/MLE.pm. > The problem is $site->dump need get the value of tag "Evidence" for $site, > but actually the "Evidence" tag didn't exist for this site. > THen I found $site of @sites was created by list_starts {} of > Glean::Evidence::GeneModel, but its inital tags only have "Next Stop" and > "Readingframe". I can't find any hint of when tag "Evidence" was added to > $site. 
> > The related source codes of glean are listed: > sub _add_evidence{ > my @sites = $evid->list_sites($scaff); > for my $site (@sites) { > ... > $stype->{$sloc}->{_site} ||= $site->dump; > } > ... > } > sub dump { # from Glean/Site.pm > ... > return join("\t", > $site->seq_id, > $site->primary_tag, > defined $site->score ? sprintf("%g", $site->score == 0 ? > $site->worst : -log($site->score)) : "NA", > ... > join(";", $site->get_tag_values("Evidence")) > ); > ... > } > sub list_sites { # from Evidence/GeneModel.pm > ... > push @sites, $self->list_starts(@cds); > ... > } > sub list_starts { # from Evidence/GeneModel.pm > ... > push @sites, Glean::Site->new(-primary => "start", > -start => $pos + $str * (pos($startseq) > - 3), > -end => $pos + $str * (pos($startseq) > - 3), > -strand => $str, > -frame => 0, > -source => "$self", > -seq_id => $seq_id, > -tag => { NextStop => $nextstop, > ReadingFrame => $frame, > }, > ); > ... > } > sub add_evidence{ > ... > eval { > $self->add_evidence($scaff, $evtype->new($db, $self->{_log}, $algo, > $params)); > }; > croak("Error providing evidence type: $type\nThe error was:\n$@") if $@; > ... 
> } > > Sincerely > Shuai Zhan > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From adsj at novozymes.com Sun Jan 30 07:10:57 2011 From: adsj at novozymes.com (Adam Sjøgren) Date: Sun, 30 Jan 2011 13:10:57 +0100 Subject: [Bioperl-l] question of seq() method of Bio::DB::GFF and Bio::PrimarySeq In-Reply-To: (chirag matkar's message of "Fri, 28 Jan 2011 13:57:25 +0700") References: <4D41B68D.4030003@gmail.com> Message-ID: <87pqre7fse.fsf@topper.koldfront.dk> On Fri, 28 Jan 2011 13:57:25 +0700, chirag wrote: > Objects in Bioperl are References ,generally hash references,where keys are > attributes like name and source of sequence > So to get values , You need to Deference them to fetch values stored at the > memory location. > my $hash_value=$hash_ref->{'some_key'}; Generally the object provides methods (accessors) for all the information that it wants to expose. Bypassing that and going directly into the internal data structure of the object ("breaking encapsulation") can be okay when you are debugging the code implementing the objects, but it is good practice to respect and use the interfaces otherwise (the internals of the object might change while the documented interface stays the same). So I would say: Look up the methods to call in the documentation and avoid accessing the internal representation of the objects directly. 
Best regards, Adam -- Adam Sjøgren adsj at novozymes.com From cjfields at illinois.edu Sun Jan 30 09:53:47 2011 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 30 Jan 2011 08:53:47 -0600 Subject: [Bioperl-l] question of seq() method of Bio::DB::GFF and Bio::PrimarySeq In-Reply-To: <87pqre7fse.fsf@topper.koldfront.dk> References: <4D41B68D.4030003@gmail.com> <87pqre7fse.fsf@topper.koldfront.dk> Message-ID: On Jan 30, 2011, at 6:10 AM, Adam Sjøgren wrote: > On Fri, 28 Jan 2011 13:57:25 +0700, chirag wrote: > >> Objects in Bioperl are References ,generally hash references,where keys are >> attributes like name and source of sequence > >> So to get values , You need to Deference them to fetch values stored at the >> memory location. > >> my $hash_value=$hash_ref->{'some_key'}; > > Generally the object provides methods (accessors) for all the > information that it wants to expose. > > Bypassing that and going directly into the internal datastructure of the > object ("breaking encapsulation") can be okay when you are debugging the > code implementing the objects, but it is good practise to respect and > use the interfaces otherwise (the internals of the object might change > while the documented interface stays the same). > > So I would say: Look up the methods to call in the documentation and > avoid accessing in the internal representation of the objects directly. > > > Best regards, > > Adam > > -- > Adam Sjøgren > adsj at novozymes.com The other aspect of this: we do not guarantee those hash keys will stay the same, or even be accessible. What happens if we decide to switch to Moose, which does not store attributes the same way? We only support the documented API; relying on anything else is a ticking time bomb. 
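To make that concrete, here is a minimal sketch in plain Perl (no BioPerl needed). The two Toy:: classes are made up for illustration and are not part of BioPerl; they expose the same seq() accessor but store their data differently, which is roughly what a change of internals looks like from the caller's side:

```perl
use strict;
use warnings;

# Hypothetical toy class: stores the sequence in a hash.
package Toy::HashSeq;
sub new { my ($class, %args) = @_; return bless { seq => $args{-seq} }, $class; }
sub seq { my ($self) = @_; return $self->{seq}; }

# Hypothetical toy class: same public API, but array-based storage.
package Toy::ArraySeq;
sub new { my ($class, %args) = @_; return bless [ $args{-seq} ], $class; }
sub seq { my ($self) = @_; return $self->[0]; }

package main;

for my $class ('Toy::HashSeq', 'Toy::ArraySeq') {
    my $obj = $class->new(-seq => 'ATGCGT');

    # The accessor works regardless of how the object stores its data.
    print "$class accessor:  ", $obj->seq, "\n";

    # Reaching into the internals only works for one of the two layouts;
    # eval traps the "Not a HASH reference" error raised by the other.
    my $peek = eval { $obj->{seq} };
    print "$class internals: ", (defined $peek ? $peek : "failed: $@"), "\n";
}
```

Code written against seq() keeps working with either class, while code that reads $obj->{seq} breaks as soon as the storage changes.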
chris From cjfields at illinois.edu Mon Jan 31 14:16:54 2011 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 31 Jan 2011 13:16:54 -0600 Subject: [Bioperl-l] fastq index In-Reply-To: <6CCEA607-FA44-4D5D-B49B-577816E7514B@illinois.edu> References: <9A90C056-FEEB-4755-B8C7-ABD90977DE8C@illinois.edu> <3FB6C3C1E6A0F44498ACC8AB9DD66CB88B05379151@EXCMSMBX03.ad.bcm.edu> <6CCEA607-FA44-4D5D-B49B-577816E7514B@illinois.edu> Message-ID: <0CE09C4D-B8A4-4F50-92FE-683973CDAB06@illinois.edu> Just a quick note that I made a small change to the Bio::Index::Fastq parser in bioperl-live which appears to fix this. We probably need a more robust way of finding the start of the FASTQ header (I think '@' can be a qual line symbol, so just searching for this may bite us down the road). chris On Dec 31, 2010, at 9:28 AM, Chris Fields wrote: > Caleb, > > Yes that would be a bug. I posted this to bugzilla for tracking: > > http://bugzilla.open-bio.org/show_bug.cgi?id=3165 > > chris > > On Dec 31, 2010, at 12:47 AM, Davis, Caleb F wrote: > >> Thank you for the rec! >> >> Here's what I get with 1.6.1: >> >> %perl make_fq_inx_test.pl test.inx test.fastq >> %perl fetch_fastq_test.pl test.inx FVBWUVC01D7SUB >> >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: No description line parsed >> STACK: Error::throw >> STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:368 >> STACK: Bio::SeqIO::fastq::next_dataset /usr/share/perl5/Bio/SeqIO/fastq.pm:71 >> STACK: Bio::SeqIO::fastq::next_seq /usr/share/perl5/Bio/SeqIO/fastq.pm:29 >> STACK: Bio::Index::AbstractSeq::fetch /usr/share/perl5/Bio/Index/AbstractSeq.pm:147 >> STACK: fetch_fastq_test.pl:11 >> ----------------------------------------------------------- >> >> Is it a bug? 
>> --Caleb >> >> These perl scripts are from http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Index/Fastq.html >> >> ########## make_fq_inx_test.pl ########### >> # Complete code for making an index for several >> # fastq files >> use Bio::Index::Fastq; >> use strict; >> >> my $Index_File_Name = shift; >> my $inx = Bio::Index::Fastq->new( >> '-filename' => $Index_File_Name, >> '-write_flag' => 1); >> $inx->make_index(@ARGV); >> >> >> ########## fetch_fastq_test.pl ########### >> # Print out several sequences present in the index >> # in Fastq format >> use Bio::Index::Fastq; >> use strict; >> >> my $Index_File_Name = shift; >> my $inx = Bio::Index::Fastq->new('-filename' => $Index_File_Name); >> my $out = Bio::SeqIO->new('-format' => 'Fastq','-fh' => \*STDOUT); >> >> foreach my $id (@ARGV) { >> my $seq = $inx->fetch($id); # Returns Bio::Seq::Quality object <------------------- THROW >> $out->write_seq($seq); >> } >> >> Example data-- >> >> ########## test.fastq ########### >> @FVBWUVC01BR7MP >> GCGACCCTAGGTAGCAACCGCCGGCTTCGGCGGTAAGGTATCACTCAG >> + >> 24<9000988:;<=<;=<44444<<=<<<>???@@@@?>=86662232 >> @FVBWUVC01D7NSE >> GAAGCAGACACAGAAAGACACGGTCTAGCAGATCG >> + >> IIIIIIIIIIIIIIIIIIIIIIIIIIIIIEEEE@< >> @FVBWUVC01D7SUB >> TTTATCGGCTAGGTCAAATAGAGTGCTTTGATATCAGCATGTCTAGCT >> + >> FFD===FFFFFHFFFFFFFFFFC888FFFFDDBAAA@@@840...757 >> @FVBWUVC01BFN75 >> TTAGAATTCAGTTTAGTGCGCTGATCTGAGTCGAGATAAAATCACCAGTACCCAAAACCAGGCGGGCTCGCCACGTTGGCTAATCCTGGTACATTTTGTAATCAATGTTCAGAAGA >> + >> IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFFFFFFFFDDBB:544448<<=>;899<=8889988894<<9955,,/4,,,,,811775512426766777;97668<<44944 >> @FVBWUVC01AYO0N >> AAATTTGTGTTAGAAGGACGAGTCACCACGTACCAATAGCAACAACGATCGGTCGGACTATTCATTGTGGTGGTGACGCTC >> + >> IIIIIIIIIIIIIHHFF@??DA???==<=766<<11,/,,,1,,,,733977--/4444722466<;;<<<82/,,--.12 >> @FVBWUVC01EYPM9 >> GGATTACACGGGAAAGGTGCTTGTGTCCCGACAGGCTAGGATA >> + >> FFFFDD<<:ABAA<988:9::BA===BBBBAA??<8623425/ >> @FVBWUVC01BWHY4 >> 
AGGTACTACTTCTTAGTGAGACAAGTCCTGGACAGGAGCAGGTAATATT >> + >> HGGGDDD:555:4449==>=<<555=BBAAAA@8888894224266;.. >> @FVBWUVC01ELH7H >> CATGAGAAGTCTTAATATTACCTCTCAGGTACCTCCTCTTAAGACACAATTACAGAAGGTGCT >> + >> IIIII@@??GIIIIG<<666:IFEIEIEED<==<;CE?3344IFIIIIIIIIIGC>==> @FVBWUVC01CTTAY >> CTCGAGATTCTGGATCCTCATGGACAAGATGTTCTCCGGCTTAGAGAT >> + >> FFFFFFFFFFFFDA:88@>>>44444898==<;<62444221775557 >> >> >> -----Original Message----- >> From: Chris Fields [mailto:cjfields at illinois.edu] >> Sent: Wednesday, December 29, 2010 9:35 PM >> To: Cook, Malcolm >> Cc: Davis, Caleb F; bioperl-l at lists.open-bio.org >> Subject: Re: [Bioperl-l] fastq index >> >> May just wrap this for the indexer. Thanks for the pointer Malcolm! >> >> chris >> >> On Dec 29, 2010, at 6:20 PM, Cook, Malcolm wrote: >> >>> If you're looking for alternatives, I recommend: http://sourceforge.net/projects/cdbfasta/ >>> >>> No bioperl wrapper, but, hey, that's what `system` is for >>> >>> Cheers, >>> >>> Malcolm >>> >>> >>> On 12/29/10 2:28 PM, "Chris Fields" wrote: >>> >>> On Dec 29, 2010, at 1:46 PM, Davis, Caleb F wrote: >>> >>>> Hi all, >>>> >>>> Retrieving fastq from an index with bio::index::fastq is not working for me.
I try using the index creation and retrieval code as given here: >>>> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Index/Fastq.html >>>> using the fastq sequence given here: >>>> http://www.bioperl.org/wiki/FASTQ_sequence_format >>>> but I get this error: >>>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>>> MSG: NCYC361-11a03.q1k bases 1 to 1576 doesn't match fastq descriptor line type >>>> STACK: Error::throw >>>> STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 >>>> STACK: Bio::SeqIO::fastq::next_seq /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/fastq.pm:113 >>>> STACK: Bio::Index::AbstractSeq::fetch /usr/lib/perl5/site_perl/5.8.8/Bio/Index/AbstractSeq.pm:134 >>>> STACK: fetch_fastq_test.pl:11 >>>> >>>> The only other report of this behavior I could find is here: >>>> http://permalink.gmane.org/gmane.comp.lang.perl.bio.general/17836 >>>> >>>> I get the same behavior when I use my own code and sequence. I hope I provided enough information. Sadly, I'm not sure what I'm doing wrong here. >>>> >>>> --Caleb >>> >>> Caleb, >>> >>> Make sure you are using the latest BioPerl release via CPAN, or via github; the line number and error message don't correspond to the latest version. If the problem persists, you may need to file a bug report for this with some example data and a script, or at least show some example data that is triggering the problem. >>> >>> I believe the current indexing scheme used for FASTQ isn't up-to-date with the current parser (which underwent a complete refactoring a while back), so this would help tremendously, but it should be fairly easy to add proper indexing to this. Jason and I briefly talked about FASTQ parsing a few months back in relation to speed of parsing, it could be much faster (my main concern initially was that it was correct). 
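On the header-detection concern raised at the top of this thread ('@' can also appear as the first character of a quality line), one way to avoid scanning for '@' is to walk the file record by record. The following is a rough sketch in plain Perl, not the Bio::Index::Fastq implementation: index_fastq_offsets and fetch_fastq_record are hypothetical helper names, the two records are made up, and the code assumes exactly four lines per record (no wrapped sequence or quality lines).

```perl
use strict;
use warnings;

# Build an ID => byte-offset index for a FASTQ filehandle, assuming exactly
# four lines per record (sequence and quality lines are not wrapped).
sub index_fastq_offsets {
    my ($fh) = @_;
    my %offset_of;
    while (1) {
        my $start  = tell($fh);          # byte offset of this record's header
        my $header = <$fh>;
        last unless defined $header;     # clean end of file
        my $seq  = <$fh>;
        my $plus = <$fh>;
        my $qual = <$fh>;
        die "Truncated record at byte $start\n" unless defined $qual;
        die "Expected '\@' header at byte $start\n" unless $header =~ /^\@(\S+)/;
        my $id = $1;
        die "Expected '+' separator for $id\n" unless $plus =~ /^\+/;
        $offset_of{$id} = $start;
    }
    return \%offset_of;
}

# Seek to a stored offset and return the four lines of that record.
sub fetch_fastq_record {
    my ($fh, $offset) = @_;
    seek($fh, $offset, 0) or die "seek failed: $!\n";
    my $record = '';
    $record .= <$fh> for 1 .. 4;
    return $record;
}

# Made-up data: the quality line of read1 begins with '@', so a scan for
# lines starting with '@' would wrongly treat it as a header.
my $data = <<'FASTQ';
@read1
ACGT
+
@III
@read2
GGCC
+
IIII
FASTQ

open my $fh, '<', \$data or die "open failed: $!\n";
my $index = index_fastq_offsets($fh);
print fetch_fastq_record($fh, $index->{read2});
```

Because position within the record, rather than the leading character, decides what counts as a header, the quality line of read1 is never mistaken for one. Multi-line FASTQ would need a length-based check instead (read quality characters until their count matches the sequence length).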
>>> >>> chris >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l From rmb32 at cornell.edu Mon Jan 31 18:46:54 2011 From: rmb32 at cornell.edu (Robert Buels) Date: Mon, 31 Jan 2011 15:46:54 -0800 Subject: [Bioperl-l] medperl, something kinda like bioperl In-Reply-To: References: Message-ID: <4D4749EE.3010901@cornell.edu> Hi Spiros, This is a fine idea. My most important piece of advice is to keep the code loosely coupled and flexible. Don't try to make big monolithic distributions like Bioperl. Keep the code as loosely-coupled as possible: think carefully before making something be a subclass of something else, or have some other kind of direct dependency upon it. Things change. Coding practices change. Technology changes too, both on the bio/med side, *and* on the code side. For the project to stay healthy for the long haul, it needs to stay easy for people to wrap their minds around the codebase, and then work on it: developers need to be able to focus their efforts on the code that they are interested in without having to worry about huge amounts of other code. For this to be possible, the various parts of the codebase need to stay organized and compartmentalized, with minimal, well-characterized dependency relationships between them. Good luck!
Rob Spiros Denaxas wrote: > Hello, > > I am sending this email here since I consider all people that contribute > and/or follow the bioperl project as the best starting point for advice on a > new project I am currently planning; my apologies if it is considered > off-topic. > > While the bioinformatics community has greatly benefitted from the Perl > community, with the shining example of bioperl, the medical community is > sadly a bit behind. I am currently employed in a public health / > epidemiology environment and have on numerous occasions discovered > opportunities to contribute code to CPAN that has made my life easier. I > know I am not alone, but a very quick search on CPAN for related modules > from the medical / biomedical domain does not return much for now. > > I recently gave a presentation at the London Perl Workshop [1] and while > creating it, I thought, would it be useful to have something similar to > bioperl for modules which largely contribute to the medical / > epidemiological domain? I was thinking of creating something like medperl, > similar to bioperl, but in a very simple form. It would serve as a > reference point to the (albeit small) number of modules that are currently > on CPAN and will also hopefully urge people to contribute some of their code > along the way. > > So I would like to request your advice on: > > a) Can you think of any reasons for not doing this? > b) Does anybody know of something similar? > c) Does anybody feel like they could contribute?
> > Regards, > Spiros Denaxas > > [1] > http://www.slideshare.net/spirosd/perl-cures-coronary-heart-disease-lpw2010 From price4890 at gmail.com Mon Jan 31 11:30:35 2011 From: price4890 at gmail.com (Nicholas Price) Date: Mon, 31 Jan 2011 08:30:35 -0800 (PST) Subject: [Bioperl-l] nucleotide changes along tree Message-ID: <9b7468ad-fe3a-4ace-8cd2-69b146fddd28@j11g2000yqh.googlegroups.com> Hi, I have three nucleotide sequences from human, chimp and orangutan and the corresponding tree. I want to align the sequences and, for each column in the alignment where there are substitutions, infer on which branches the changes occurred using a maximum likelihood method. Is there a way to do this in Bioperl? Thank you, Nicholas From ppurkayastha2010 at gmail.com Sat Jan 29 00:45:14 2011 From: ppurkayastha2010 at gmail.com (Elina) Date: Fri, 28 Jan 2011 21:45:14 -0800 (PST) Subject: [Bioperl-l] How to find the SNPs which are from frequent to rare using Perl programming? Message-ID: <30792714.post@talk.nabble.com> How can I find the changes (SNPs) in the sequence that go from a frequent codon to a rare codon, according to the human codon usage table? What would be the concept behind writing such a program? For example, for amino acid T, ACC=>0.36 is the frequent codon and ACG=>0.11 is the rare codon. http://old.nabble.com/file/p30792714/codon_usage_h_sapiens.gif http://old.nabble.com/file/p30792714/BARD1SNP.txt BARD1SNP.txt -- View this message in context: http://old.nabble.com/How-to-find-the-SNPs-which-are-from-frequent-to-rare-using-Perl-programming--tp30792714p30792714.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
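One possible concept for the frequent-to-rare question, sketched in plain Perl under stated assumptions: take the codon containing the SNP, substitute the variant base, and compare the usage values of the reference and variant codons. classify_snp is a hypothetical helper, the two cutoffs are arbitrary illustrative values, the coding sequence is made up, and the usage hash holds only the two threonine values quoted in the message (a real run would load the full human table).

```perl
use strict;
use warnings;

# Relative usage per codon. Only the two threonine values quoted in the
# message are filled in; a real run would load the complete table.
my %usage = (ACC => 0.36, ACG => 0.11);

# Arbitrary illustrative cutoffs (assumptions, not taken from the message).
my ($FREQUENT_MIN, $RARE_MAX) = (0.30, 0.15);

# Classify a single-base change at 0-based position $pos of an in-frame
# coding sequence. Returns a hash ref describing the codon change, or undef
# if the position is out of range or a codon is missing from the table.
sub classify_snp {
    my ($cds, $pos, $alt_base) = @_;
    my $codon_start = $pos - ($pos % 3);
    return unless $pos >= 0 && $codon_start + 3 <= length $cds;

    my $ref_codon = uc substr($cds, $codon_start, 3);
    my $alt_codon = $ref_codon;
    substr($alt_codon, $pos % 3, 1) = uc $alt_base;   # apply the SNP

    return unless exists $usage{$ref_codon} && exists $usage{$alt_codon};

    my %result = (
        ref_codon        => $ref_codon,
        alt_codon        => $alt_codon,
        frequent_to_rare => ($usage{$ref_codon} >= $FREQUENT_MIN
                          && $usage{$alt_codon} <= $RARE_MAX) ? 1 : 0,
    );
    return \%result;
}

# Made-up coding sequence ATG ACC TAA with a C->G change at position 5,
# which turns ACC into ACG.
my $hit = classify_snp('ATGACCTAA', 5, 'G');
printf "%s -> %s frequent_to_rare=%d\n",
    $hit->{ref_codon}, $hit->{alt_codon}, $hit->{frequent_to_rare};
```

A fuller version would also confirm that both codons encode the same amino acid (a synonymous change) before comparing their usage, and would read SNP positions relative to the coding frame from the SNP file.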