[Bioperl-l] Re: [Gmod-gbrowse] Adding human chromosomes as reference sequences

Lincoln Stein lstein at cshl.edu
Tue Jul 19 12:43:36 EDT 2005


The bug involving _maxbin() was fixed in the CVS version of bioperl some time 
ago. You also get the fix when you install the latest CVS version of GBrowse. 
I'm sorry that the ucsc_genes2gff.pl script isn't loading the chromosome 
extents; we just need a companion script called ucsc_chromosomes2gff.pl or 
something similar. Ilari, since you've already essentially done this, perhaps 
you'd be willing to contribute the script? I'll add it to bioperl.
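
As a rough illustration of what such a chromosome script might do, here is a 
minimal sketch in Python (not the eventual bioperl script). It assumes 
chromInfo.txt is tab-delimited with the chromosome name in the first column 
and its length in the second. The function name, the source ("UCSC"), the 
method ("chromosome") and the group class ("Sequence") are placeholder 
choices, and the file path in the sample row is only illustrative.

```python
import csv
import io


def chrom_info_to_gff(lines, source="UCSC", method="chromosome",
                      group_class="Sequence"):
    """Yield one GFF2 reference line per row of a chromInfo.txt-style table.

    The source, method and group_class defaults are hypothetical labels,
    not values required by bp_load_gff.pl.
    """
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 2 or row[0].startswith("#"):
            continue  # skip blank, short, or comment rows
        name, length = row[0], int(row[1])
        yield "\t".join([
            name, source, method,
            "1", str(length),      # the chromosome spans 1..length
            ".", ".", ".",         # no score, strand, or phase
            f'{group_class} "{name}"',
        ])


# Sample row using the chr1 length quoted in this thread.
sample = io.StringIO("chr1\t245522847\t/gbdb/hg17/nib/chr1.nib\n")
gff_lines = list(chrom_info_to_gff(sample))
print("\n".join(gff_lines))
```

The resulting lines could be saved as chromosomes.gff and loaded with 
bp_load_gff.pl, provided the installed bioperl includes the _maxbin() fix.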

Thanks for the information about load_ucsc.pl. Although I can't use it, due to 
not having the enum.pm module installed, I did see immediately where the 
problem has arisen and have fixed it in bioperl CVS (hope I didn't break it 
in so doing!)

As of about a week ago the xyplot.pm glyph has been enhanced to accept 
negative scores. You can also colorize the bars and points according to the 
score or other criteria.
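
For CGH data, the log ratio simply goes in the sixth (score) column of each 
GFF line, sign included. A minimal sketch, assuming one feature per probe; 
the function name, the source ("CGH"), the method ("log_ratio"), the group 
class ("Probe") and the coordinates are hypothetical:

```python
def cgh_gff_line(ref, start, end, log_ratio,
                 source="CGH", method="log_ratio", group_class="Probe"):
    """Format one GFF2 line whose score column holds a (possibly negative)
    log ratio. All label defaults are illustrative placeholders."""
    return "\t".join([
        ref, source, method,
        str(start), str(end),
        f"{log_ratio:.3f}",        # score column; negative values allowed
        ".", ".",                  # no strand or phase
        f'{group_class} "{ref}:{start}-{end}"',
    ])


line = cgh_gff_line("chr1", 1000, 2000, -0.42)
print(line)
```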


On Tuesday 19 July 2005 05:57 am, Ilari Scheinin wrote:
> Hello.
> I recently installed gbrowse for visualizing the human genome. By
> browsing this list, I found out that the easiest way to import the
> genome data is to get it from UCSC.
> So I downloaded these files from
> ftp://hgdownload.cse.ucsc.edu/goldenPath/hg17/database/:
> chromInfo.txt, kgXref.txt, knownGeneMrna.txt, knownGenePep.txt,
> knownGene.txt, knownToLocusLink.txt, knownToPfam.txt,
> knownToU133Plus2.txt, knownToU133.txt, knownToU95.txt, refLink.txt,
> refSeqSummary.txt
> and these from ftp://ftp.ncbi.nlm.nih.gov/refseq/LocusLink/ARCHIVE/:
> log2UG, loc2acc, loc2go
> and also /gene/DATA/gene2accession (renamed to genebank2accessions.txt)
> and then ran ucsc_genes2gff.pl (from gmod-0.003) and bp_load_gff.pl with
> % ./ucsc_genes2gff.pl -annotations hg17 | bp_load_gff.pl -c -d
> "dbi:mysql:database=gbrowse;host=<host>" --user <user> -p <pass> -f
> sequencedata/ -
> It works fine and loads the data to the database, but it doesn't add
> the reference entries for the chromosomes, so when I try to search for
> chr1 (or just 1) in gbrowse, I get "The landmark named chr1 is not
> recognized.". I tried adding an entry for chr1 directly in mysql and
> gbrowse worked fine with that.
> So next I took the file chromInfo.txt which contains the lengths of the
> chromosomes and edited that into a GFF file. I tried to load it with
> % bp_load_gff.pl -d "dbi:mysql:database=gbrowse;host=<host>" --user
> <user> -p <pass> chromosomes.gff
> I get:
> chromosomes.gff: loading...
> Can't locate object method "_maxbin" via package
> "Bio::DB::GFF::Adaptor::dbi::mysqlopt" at
> /usr/lib/perl5/site_perl/5.8.1/Bio/DB/GFF/Adaptor/dbi/mysql.pm line
> 687, <> line 2.
> DBI::db=HASH(0x11f8080)->disconnect invalidates 2 active statement
> handles (either destroy statement handles or call finish on them before
> disconnecting) at
> /usr/lib/perl5/site_perl/5.8.1/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> line 228, <> line 2.
> I noticed that this is a problem with long features. Chr1 is
> 245,522,847 bp. If I drop the 7 from the end, it works. The default for
> maxfeature is 100,000,000, but adding --maxfeature 1000000000 for
> bp_load_gff.pl doesn't have any effect. As you can see, this is with
> perl 5.8.1, and same thing happens on another machine with 5.8.3.
> Bioperl is 1.5.0. Is the script broken or am I doing something wrong?
> I then made a little script that goes through chromInfo.txt and adds
> the chromosomes directly to mysql. I ignored the column fbin, because I
> didn't know what it was for. This seems to work fine, gbrowse is able
> to find the chromosomes. But is there an "official" or better way to
> import the human genome data to gbrowse?
> I also tried load_ucsc.pl from bioperl-1.5.0, but it didn't add the
> chromosome entries either. By the way, the script produces an empty GFF
> file for each input file, but everything is written to stdout, so all
> the files remain empty.
> Also one other thing. Can the score values in GFF be negative? I'm
> using gbrowse to visualize CGH data, but the xyplot doesn't seem to
> work with negative log ratios.
> Regards,
> Ilari

Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
SANDRA MICHELSEN, AT michelse at cshl.edu
