[Bioperl-l] confused by Bio::Graphics

David Messina dmessina at wustl.edu
Mon Dec 4 11:46:16 EST 2006

Hi Richard,

> [richard]
> These are the problems:
> 1) As I understand it this:
> my $wholeseq = Bio::SeqFeature::Generic->new (
> 		-start => 1,
> 		-end => $refseq->length,
> 		-display_name =>$refseq->display_name
> 		);
> should display the name of the gene (CD133/Prominin1) near the top  
> of image.
> It doesn't, am I misunderstanding or is there an error in the code?

The contents of a sequence object's display_name vary depending on  
the type of sequence record; for a sequence object created from a  
Genbank record, it's the value of the LOCUS field on the first line  
of the record.

If you want the gene name, you'll have to dig it out of the feature  
table. If you look at the Genbank record for your first sequence,  
you'll see that under both the gene and CDS primary features, the  
HUGO gene abbreviation is stored under the "gene" secondary tag, and  
various synonyms are under the "note" and "product" secondary tags.

LOCUS       NM_006017               3794 bp    mRNA    linear   PRI  
DEFINITION  Homo sapiens prominin 1 (PROM1), mRNA.
VERSION     NM_006017.1  GI:5174386
[...skipping irrelevant part of the Genbank record...]
FEATURES             Location/Qualifiers
      source          1..3794
                      /organism="Homo sapiens"
      gene            1..3794
                      /note="prominin 1; synonyms: AC133, CD133, PROML1,
      CDS             38..2635
                      /go_component="integral to plasma membrane [pmid 9389720];
                      /go_process="response to stimulus; visual  
                      /note="hProminin; prominin (mouse)-like 1;  
                      stem cell antigen"
                      /product="prominin 1"

In your script, you grab the primary features between lines 34-60.  
You can grab the secondary feature you want with something like:

[cribbed from the Feature-Annotation HOWTO]
my @ids;
for my $feat_object ($seq_object->get_SeqFeatures) {
    push @ids, $feat_object->get_tag_values("gene") if ($feat_object->has_tag("gene"));
}
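
To tie this back to your $wholeseq feature, here is a minimal sketch of how the gene name could be pulled out and used as the display name. It builds a small stand-in sequence in memory so that it runs on its own; the placeholder bases, the feature coordinates, and the variable names @gene_names and $label are my assumptions, not taken from your script. In practice $seq_object would come from Bio::SeqIO reading your Genbank file.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;

# Stand-in for the sequence object parsed from the Genbank record.
# The bases are placeholders, not the real NM_006017 sequence.
my $seq_object = Bio::Seq->new(
    -display_id => 'NM_006017',    # what display_name reports for a Genbank record
    -seq        => 'ACGT' x 10,
);

# Two features carrying the "gene" tag, mimicking the gene and CDS entries
# of the feature table (coordinates are made up to fit the placeholder).
$seq_object->add_SeqFeature(
    Bio::SeqFeature::Generic->new(
        -start => 1, -end => 40, -primary => 'gene',
        -tag   => { gene => 'PROM1' },
    ),
    Bio::SeqFeature::Generic->new(
        -start => 5, -end => 30, -primary => 'CDS',
        -tag   => { gene => 'PROM1', product => 'prominin 1' },
    ),
);

# Collect the distinct values of the "gene" tag across all features.
my (%seen, @gene_names);
for my $feat_object ($seq_object->get_SeqFeatures) {
    next unless $feat_object->has_tag('gene');
    push @gene_names, grep { !$seen{$_}++ } $feat_object->get_tag_values('gene');
}

# Fall back to the LOCUS-derived name if no gene tag was found.
my $label = @gene_names ? $gene_names[0] : $seq_object->display_name;

my $wholeseq = Bio::SeqFeature::Generic->new(
    -start        => 1,
    -end          => $seq_object->length,
    -display_name => $label,
);

print "Track label: ", $wholeseq->display_name, "\n";
```

The has_tag check matters because get_tag_values throws an exception when the tag is absent, which is the case for the source feature in this record.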

> 2) In the quoted example the CDS is broken up into smaller regions  
> which are
> then linked together in example 6. This isn't happening in my code  
> and I
> think it should be, I get one solid block for the CDS. I don't  
> understand why
> this is because I'm not clear which parts of the feature table are  
> used to
> define where the CDS should be split. I think this is the relevant  
> bit of
> code:
> foreach my $alt_trans (keys %main) {
> 	foreach my $tag (keys %{ $main{$alt_trans}{'features'} }) {
> 		my $feature = $main{$alt_trans}{'features'}{$tag};
> 		$panel->add_track($feature,
> 				-glyph => 'generic',
> 				-bgcolor => $colors[$idx++ % @colors],
> 				-fgcolor => 'black',
> 				-font2color => 'black',
> 				-key => $alt_trans,
> 				-bump => +1,
> 				-height => 8,
> 				-label => 1,
> 				-description => 1,
> 				) if ($tag eq 'CDS');
> }
> }

The problem here is that RefSeq mRNA records don't contain intron-exon  
boundary information. I think you'll have to get that from an  
assembly record. From the Entrez gene page for PROM1, I obtained a  
Genbank record for the PROM1 genomic locus:


Saving that as 'PROM1.gb' (the suffix is important), and running the  
bp_embl2picture.pl script on it, I got an image similar to Figure 6  
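
To see the difference for yourself, you can ask each CDS feature how many pieces its location has. The sketch below is only an illustration under stated assumptions: the mRNA CDS range (38..2635) is from the record above, but the three exon ranges for the genomic CDS are hypothetical numbers I made up, and both features are built by hand rather than parsed from a file. As far as I understand it, Bio::Graphics draws connected sub-parts only when the feature has a split location or sub-features, so a single-range CDS comes out as one solid block.

```perl
use strict;
use warnings;
use Bio::Location::Simple;
use Bio::Location::Split;
use Bio::SeqFeature::Generic;

# CDS as given in the RefSeq mRNA record: one contiguous range.
my $mrna_cds = Bio::SeqFeature::Generic->new(
    -primary => 'CDS',
    -start   => 38,
    -end     => 2635,
);

# CDS as it might look in a genomic record, i.e. a join(...) of exon ranges.
# These coordinates are hypothetical and only serve the illustration.
my $split = Bio::Location::Split->new;
for my $range ([100, 250], [900, 1040], [2300, 2475]) {
    $split->add_sub_Location(
        Bio::Location::Simple->new(
            -start  => $range->[0],
            -end    => $range->[1],
            -strand => 1,
        )
    );
}
my $genomic_cds = Bio::SeqFeature::Generic->new(
    -primary  => 'CDS',
    -location => $split,
);

# each_Location returns the location itself for a simple range and the
# individual pieces for a split location.
for my $pair (['mRNA record', $mrna_cds], ['genomic record', $genomic_cds]) {
    my ($name, $feat) = @$pair;
    my @parts = $feat->location->each_Location;
    printf "%-15s CDS has %d segment(s): %s\n", $name, scalar @parts,
        join(', ', map { $_->start . '..' . $_->end } @parts);
}
```

Running the same each_Location check on the CDS features in your own script should tell you quickly whether the record you loaded carries any exon structure at all.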

Hope this helps,

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PROM1.png
Type: image/png
Size: 8646 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061204/4add2cbc/attachment.png 