[Bioperl-l] How to map a sequence to a known UniGene cluster?
sdavis2 at mail.nih.gov
Mon May 17 07:30:35 EDT 2004
Does your sequence come from GenBank? If so, you can simply search the
Mm.data file from NCBI for the matching GenBank accession and associate it
with the UniGene cluster number. (This is probably best done with a little
Perl code; you could probably use ClusterIO instead, but that would be much slower.)
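
A minimal sketch of that lookup, written in Python rather than Perl purely for illustration. It assumes the Mm.data layout in which each cluster record starts with an "ID" line (e.g. "ID          Mm.1"), lists its member sequences on "SEQUENCE" lines containing "ACC=<accession>;", and ends with "//". The function name build_accession_map is hypothetical, not part of any library.

```python
import re

def build_accession_map(lines):
    """Map GenBank accessions (version suffix stripped) to UniGene cluster IDs.

    Assumes records of the form:
        ID          Mm.1
        SEQUENCE    ACC=BC012345.1; NID=...; ...
        //
    """
    acc_to_cluster = {}
    cluster_id = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("ID "):
            # Start of a new cluster record
            cluster_id = line.split()[1]
        elif line.startswith("SEQUENCE") and cluster_id is not None:
            match = re.search(r"ACC=([^;\s]+)", line)
            if match:
                # Drop the ".1"-style version so BC012345.1 matches BC012345
                accession = match.group(1).split(".")[0]
                acc_to_cluster[accession] = cluster_id
        elif line.startswith("//"):
            # End of record
            cluster_id = None
    return acc_to_cluster
```

With the file on disk, something like `with open("Mm.data") as fh: acc_map = build_accession_map(fh)` builds the table in a single pass, after which `acc_map.get("BC012345")` gives the cluster for one accession (or None if it is not in any cluster).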
If not, I can think of three other possibilities. 1) BLAST against
some subset of GenBank and map the accessions of the GenBank hits to UniGene
clusters in the same manner as above. You may find that they all map to the
same UniGene cluster, but you may have to use a "majority rules" argument (e.g.,
if >75% of the "good hits" come from a single UniGene cluster, use that
cluster). 2) If you want to get to transcript annotation, you could BLAST
against RefSeq, Ensembl transcripts, or H-Invitational transcripts. 3) If
all else fails, you could BLAST against the genome and see where your
cDNA lands and what is in each of those regions.
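
The "majority rules" idea in option 1 could be sketched roughly as follows (again in Python for illustration). It assumes you already have the accessions of the "good hits" for one query and an accession-to-cluster table; the function name assign_cluster, the 0.75 default, and the choice to ignore hits that belong to no cluster are all assumptions for this sketch, not a fixed rule.

```python
from collections import Counter

def assign_cluster(hit_accessions, acc_to_cluster, threshold=0.75):
    """Return the UniGene cluster holding more than `threshold` of the
    mapped hits, or None if no single cluster dominates."""
    clusters = []
    for acc in hit_accessions:
        base = acc.split(".")[0]  # ignore the version suffix
        if base in acc_to_cluster:
            clusters.append(acc_to_cluster[base])
    if not clusters:
        return None  # none of the hits belong to a known cluster
    cluster, count = Counter(clusters).most_common(1)[0]
    if count / len(clusters) > threshold:
        return cluster
    return None
```

A None result flags a query whose hits are split between clusters, which is the case worth inspecting by hand.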
----- Original Message -----
From: "Wuming Gong" <wmgong at development.3322.org>
To: <bioperl-l at bioperl.org>
Sent: Saturday, May 15, 2004 10:12 PM
Subject: [Bioperl-l] How to map a sequence to a known UniGene cluster?
> I am trying to map some full-length cDNA sequences to NCBI UniGene clusters
> by BLAST, with an e-value less than 1e-100. The BLAST database, Mm.uniq.seq, was
> downloaded from the NCBI FTP site. However, the BLAST results show that,
> for some sequences, there are two or more hits with an e-value equal to zero.
> I wonder how I can map one sequence to exactly one UniGene cluster, or is it
> possible for one sequence to map to two or more clusters?
> Thanks in advance!
> Wuming Gong
> College of Life Science,
> Wuhan University, China.