[Bioperl-l] Bio::SeqIO::genbank, Bio::Species - can't get full species name

James Wasmuth james.wasmuth at ed.ac.uk
Thu May 13 09:51:57 EDT 2004

Hi Matthew,

I've fixed this; the change is in CVS, in SeqIO/genbank.pm.

I would give you the link but the docs have been playing up all day...

Let me know if it doesn't do what you want...
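
If you need the full name before you can update from CVS, one stopgap is to read the ORGANISM line straight from the flat file rather than going through Bio::Species. Below is a minimal sketch in Python (it is not BioPerl code, and it is not what the genbank.pm fix does). The helper name organism_name is made up for illustration, and the sketch assumes the name fits on the single ORGANISM line and does not wrap onto a continuation line:

```python
import re

# Hypothetical helper, not part of BioPerl or any other library.
# Assumption: the organism name sits entirely on the ORGANISM line;
# the lineage (Eukaryota; Metazoa; ...) starts on the following line.
_ORGANISM_RE = re.compile(r"^[ \t]*ORGANISM[ \t]+(.*\S)", re.MULTILINE)


def organism_name(record_text):
    """Return the full ORGANISM name from one GenBank record, or None."""
    match = _ORGANISM_RE.search(record_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    record = (
        "SOURCE      mitochondrion Tamias amoenus X Tamias ruficaudus\n"
        "  ORGANISM  Tamias amoenus X Tamias ruficaudus\n"
        "            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata;\n"
    )
    print(organism_name(record))
```

For the AY211864 entry quoted below, this should give the whole string "Tamias amoenus X Tamias ruficaudus" rather than only the first three words.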


Matthew Betts wrote:

>I am trying to reconcile gene trees with species trees, and to do this I
>need the species names to be the same in both. The gene trees come
>from a clustering of GenBank coding sequences, and the species trees come
>from the NCBI taxonomy. However, when using BioPerl to extract the species
>info from GenBank entries, it only seems possible to get the first
>three words of the ORGANISM line, which Bio::Species treats as genus,
>species, and subspecies. In several cases, such as the example below,
>the ORGANISM line contains more than three words. I suspect this means
>either that the subspecies name uses more than one word or that the
>entry breaks the GenBank format. However, this is also how the names
>appear in the NCBI taxonomy names.dmp file.
>
>The problem seems to be in Bio::SeqIO::genbank->_read_GenBank_Species().
>There is a special case there for viruses (the whole of the ORGANISM
>info is put onto the classification array), but my examples are
>chordates (there may be others).
>
>I'd be really grateful for any comments on the best thing for me to do.
>
>LOCUS       AY211864                 701 bp    DNA     linear   ROD 25-AUG-2003
>DEFINITION  Tamias amoenus X Tamias ruficaudus RBCM19680 cytochrome b (cytb)
>            gene, partial cds; mitochondrial gene for mitochondrial product.
>VERSION     AY211864.1  GI:33385214
>SOURCE      mitochondrion Tamias amoenus X Tamias ruficaudus
>  ORGANISM  Tamias amoenus X Tamias ruficaudus
>            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
>            Mammalia; Eutheria; Rodentia; Sciurognathi; Sciuridae; Sciurinae;
>            Tamias.
>REFERENCE   1  (bases 1 to 701)
>  AUTHORS   Good,J.M., Demboski,J.R., Nagorsen,D.W. and Sullivan,J.
>  TITLE     Phylogeography and introgressive hybridization: chipmunks (genus
>            Tamias) in the northern Rocky Mountains
>  JOURNAL   Evolution 57 (8), 1900-1916 (2003)
>REFERENCE   2  (bases 1 to 701)
>  AUTHORS   Good,J.M., Demboski,J.R., Nagorsen,D.W. and Sullivan,J.
>  TITLE     Direct Submission
>  JOURNAL   Submitted (08-JAN-2003) Ecology and Evolutionary Biology,
>            University of Arizona, 1041 E. Lowell Street, Tucson, AZ 85721, USA
>FEATURES             Location/Qualifiers
>     source          1..701
>                     /organism="Tamias amoenus X Tamias ruficaudus"
>                     /organelle="mitochondrion"
>                     /mol_type="genomic DNA"
>                     /specimen_voucher="Royal British Columbia Museum
>                     (RBCM19680)"
>                     /db_xref="taxon:231237"
>     gene            1..>701
>                     /gene="cytb"
>     CDS             1..>701
>                     /gene="cytb"
>                     /codon_start=1
>                     /transl_table=2
>                     /product="cytochrome b"
>                     /protein_id="AAP45298.1"
>                     /db_xref="GI:33385215"
>                     PFHPYYTIKDILGILL"
>        1 atgacaaaca tccgcaaaac ccatcccctc attaaaatca ttaaccactc attcattgac
>       61 ttacccgcac catccaacat ttctgcatga tgaaattttg gatccctctt aggtatttgc
>      121 ctaattatcc aaattctcac tggactattc ctagcaatac actacacatc cgacacaatg
>      181 acagctttct catctgtcac tcatatttgc cgagatgtaa actacggctg acttatccga
>      241 tacatacacg ctaacggagc ctccatattt tttatctgcc tattccttca tgtaggccga
>      301 ggactttact atggatcata tacctacttc gaaacatgaa acattggagt aattctttta
>      361 ttcgccgtta tagccactgc atttataggt tacgttctcc catgaggaca gatatccttt
>      421 tgaggtgcta ctgttattac aaatctccta tcagccatcc catatatcgg aacaacacta
>      481 gtagaatgaa tctgaggagg cttctcagta gacaaagcca ctctaacacg attctttgca
>      541 tttcatttta tcctcccatt cattattaca gcattagtta tagttcacct actcttcctt
>      601 catgaaaccg gatccaataa tccttccgga ttaatctctg actctgataa aattccattc
>      661 catccatatt acactattaa agatatccta ggcatcctcc t

"There are some days when I think I'm going to die from 
an overdose of satisfaction."
	     --- Salvador Dali

Nematode Bioinformatics          |
Blaxter Nematode Genomics Group  |
School of Biological Sciences    |
Ashworth Laboratories            | tel: +44 131 650 7403
University of Edinburgh          | web: www.nematodes.org
Edinburgh                        |
EH9 3JT                          |
UK                               |	
