[Bioperl-l] genbank parsing of multiple 'function' tags within primary tag
abualiga2 at gmail.com
Thu Sep 8 12:14:07 EDT 2011
> I only had a quick look at your code, so maybe I'm missing something, but
> you are currently pushing all products of all CDSs into the same array,
> i.e. you do not assign them to a data structure that links a particular
> CDS to a list of products. You then use the same index to print out a
> locus from the @loci array and a product from @products, but the two
> will not match up because you will have more products than loci.
That's right. Products are not the issue in this particular case: it's
E. coli, and as far as I know there is no alternative splicing, so there is a
single product per gene. But there are many more 'function' qualifiers,
for example, than loci, and I don't know how to create a data structure that
will link a 'gene' (as primary tag) to all the other qualifiers, whether they
belong to 'CDS', 'misc_RNA', 'misc_feature', or other primary tags.
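
One possible shape for such a structure, offered only as a sketch: a hash
keyed by locus_tag, then by primary tag, then by qualifier name, with every
value stored in an array so that repeated 'function' qualifiers are all kept.
It assumes that the gene, CDS and other features of one locus share a
/locus_tag qualifier, which may not hold for every record. The sub name
group_by_locus and the output layout are made up for illustration; the
feature methods used (primary_tag, has_tag, get_all_tags, get_tag_values)
are the standard Bio::SeqFeatureI ones.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Group qualifier values by locus_tag, then primary tag, then qualifier name.
# Works on any objects that provide the Bio::SeqFeatureI methods
# primary_tag, has_tag, get_all_tags and get_tag_values.
sub group_by_locus {
    my @features = @_;
    my %by_locus;
    for my $feat (@features) {
        # Assumption: features without a locus_tag cannot be linked and are skipped.
        next unless $feat->has_tag('locus_tag');
        my ($locus) = $feat->get_tag_values('locus_tag');
        my $ptag = $feat->primary_tag;
        for my $tag ($feat->get_all_tags) {
            next if $tag eq 'locus_tag';
            # Append, so several 'function' values on one feature are all kept.
            push @{ $by_locus{$locus}{$ptag}{$tag} }, $feat->get_tag_values($tag);
        }
    }
    return \%by_locus;
}

# Run only when a GenBank file name is given on the command line.
if (@ARGV) {
    require Bio::SeqIO;    # loaded lazily, only needed for reading the file
    my $in = Bio::SeqIO->new(-file => $ARGV[0], -format => 'genbank');
    while (my $seq = $in->next_seq) {
        my $genes = group_by_locus($seq->get_SeqFeatures);
        for my $locus (sort keys %$genes) {
            # All 'function' values of the CDS for this locus, if any.
            my $functions = $genes->{$locus}{CDS}{function} || [];
            print join("\t", $locus, join('; ', @$functions)), "\n";
        }
    }
}
```

With this layout, $genes->{$locus}{CDS}{function} is an array reference
holding every 'function' value for that locus, and the same lookup works for
'misc_RNA', 'misc_feature' or any other primary tag, so no parallel @loci
and @products arrays are needed.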