[Bioperl-l] how to extract intron information from gff files.

Jiang Wenkai biology0046 at hotmail.com
Mon Nov 5 07:16:13 EST 2007


Dear all:

I have a poplar genome GFF file that looks like this:
LG_I	src	exon	2598	3280	.	-	.	name "fgenesh1_pg.C_LG_I000001"; transcriptId 62649
LG_I	src	CDS	2598	3280	.	-	0	name "fgenesh1_pg.C_LG_I000001"; proteinId 62649; exonNumber 4
LG_I	src	start_codon	3278	3280	.	-	0	name "fgenesh1_pg.C_LG_I000001"
LG_I	src	stop_codon	2598	2600	.	-	0	name "fgenesh1_pg.C_LG_I000001"
LG_I	src	exon	3544	3918	.	-	.	name "fgenesh1_pg.C_LG_I000001"; transcriptId 62649
LG_I	src	CDS	3544	3918	.	-	2	name "fgenesh1_pg.C_LG_I000001"; proteinId 62649; exonNumber 3
LG_I	src	exon	4258	4740	.	-	.	name "fgenesh1_pg.C_LG_I000001"; transcriptId 62649
LG_I	src	CDS	4258	4740	.	-	2	name "fgenesh1_pg.C_LG_I000001"; proteinId 62649; exonNumber 2
LG_I	src	exon	5344	6388	.	-	.	name "fgenesh1_pg.C_LG_I000001"; transcriptId 62649
LG_I	src	CDS	5344	6388	.	-	2	name "fgenesh1_pg.C_LG_I000001"; proteinId 62649; exonNumber 1
LG_I	src	exon	8259	8528	.	-	.	name "fgenesh1_pg.C_LG_I000002"; transcriptId 62650
LG_I	src	CDS	8259	8528	.	-	0	name "fgenesh1_pg.C_LG_I000002"; proteinId 62650; exonNumber 3
LG_I	src	stop_codon	8259	8261	.	-	0	name "fgenesh1_pg.C_LG_I000002"
LG_I	src	exon	8897	8987	.	-	.	name "fgenesh1_pg.C_LG_I000002"; transcriptId 62650
LG_I	src	CDS	8897	8987	.	-	0	name "fgenesh1_pg.C_LG_I000002"; proteinId 62650; exonNumber 2
LG_I	src	exon	9831	9892	.	-	.	name "fgenesh1_pg.C_LG_I000002"; transcriptId 62650
LG_I	src	CDS	9831	9892	.	-	1	name "fgenesh1_pg.C_LG_I000002"; proteinId 62650; exonNumber 1
LG_I	src	start_codon	9890	9892	.	-	0	name "fgenesh1_pg.C_LG_I000002"

I tried to use Bio::DB::GFF, but that module only retrieves features whose 
method (feature type) is already present in the GFF file.
What I want to get is "intron, 5utr, 3utr", but this file does not contain 
any of these feature types.

How can I get this information through BioPerl? If I treat the gaps between 
exons as introns, and the non-CDS parts of the first and last exons as UTRs, 
how can I extract them from this GFF file?
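
To make the rule above concrete, here is a minimal sketch of it in plain 
Python. It is not BioPerl and does not use Bio::DB::GFF. The function names 
parse_gff and derive_features are made up for illustration. It assumes 
tab-separated columns, one record per line, and that all rows of one gene 
model share the same name "..." attribute, as in the sample above. It 
slightly generalises the rule: any exon portion lying outside the overall 
CDS span is reported as UTR, and 5'/3' are assigned according to the strand.

```python
from collections import defaultdict


def parse_gff(lines):
    """Collect exon and CDS intervals per gene, keyed by the `name` attribute."""
    genes = defaultdict(
        lambda: {"seqid": None, "strand": None, "exon": [], "CDS": []}
    )
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] not in ("exon", "CDS"):
            continue  # skip comments, start/stop codons, malformed rows
        attrs = cols[8]
        if 'name "' not in attrs:
            continue  # assumption: rows are grouped by the name "..." attribute
        name = attrs.split('name "', 1)[1].split('"', 1)[0]
        gene = genes[name]
        gene["seqid"], gene["strand"] = cols[0], cols[6]
        gene[cols[2]].append((int(cols[3]), int(cols[4])))
    return genes


def derive_features(gene):
    """Infer introns and UTRs (1-based, inclusive) from exon and CDS intervals."""
    exons = sorted(set(gene["exon"]))
    cds = sorted(set(gene["CDS"]))

    # Introns: the gaps between consecutive exons along the sequence.
    introns = [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
        if next_start - prev_end > 1
    ]

    # UTRs: exon portions that fall outside the overall CDS span.
    left, right = [], []
    if cds:
        cds_start = min(start for start, _ in cds)
        cds_end = max(end for _, end in cds)
        left = [(s, min(e, cds_start - 1)) for s, e in exons if s < cds_start]
        right = [(max(s, cds_end + 1), e) for s, e in exons if e > cds_end]

    # On the minus strand the 5' end is at the higher coordinates.
    if gene["strand"] == "-":
        five, three = right, left
    else:
        five, three = left, right
    return {"intron": introns, "five_prime_UTR": five, "three_prime_UTR": three}


# Two exon/CDS pairs taken from the sample above.
sample = [
    'LG_I\tsrc\texon\t2598\t3280\t.\t-\t.\tname "fgenesh1_pg.C_LG_I000001"; transcriptId 62649',
    'LG_I\tsrc\tCDS\t2598\t3280\t.\t-\t0\tname "fgenesh1_pg.C_LG_I000001"; proteinId 62649; exonNumber 4',
    'LG_I\tsrc\texon\t3544\t3918\t.\t-\t.\tname "fgenesh1_pg.C_LG_I000001"; transcriptId 62649',
    'LG_I\tsrc\tCDS\t3544\t3918\t.\t-\t2\tname "fgenesh1_pg.C_LG_I000001"; proteinId 62649; exonNumber 3',
]
for name, gene in parse_gff(sample).items():
    print(name, derive_features(gene))
```

For the two exons used here this gives a single intron at 3281..3543. Note 
that in the sample above every exon has exactly the same coordinates as its 
CDS, so under this rule no UTRs come out for these gene models.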

Thanks~~

Wenkai
