[Bioperl-l] (no subject)
Dhr. R. de Jonge
ronnie.dejonge at gmail.com
Mon Jul 21 10:17:51 EDT 2008
To be more precise, I am actually looking for the subset of the cDNA
sequence that encodes the domain that was found (and, for sequences with
multiple HMMer hits, one such subsequence per hit, so to speak).
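
A minimal sketch of that coordinate mapping, written in Python purely for
illustration (the thread itself is about BioPerl). It assumes the protein
searched with HMMer is a forward-strand translation of the cDNA, that the
domain coordinates are 1-based and inclusive in amino-acid space, and that
the coding region starts cds_offset nucleotides into the cDNA. The function
name and the cds_offset parameter are hypothetical, not part of any library.

```python
def domain_to_cdna(cdna, aa_start, aa_end, cds_offset=0):
    """Return the cDNA subsequence encoding residues aa_start..aa_end.

    aa_start and aa_end are 1-based, inclusive protein coordinates.
    cds_offset is the 0-based position in the cDNA of the first
    nucleotide of the first codon (assumption: forward strand, no gaps).
    """
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("invalid protein coordinates")
    # Residue n is encoded by the three nucleotides starting at (n - 1) * 3.
    nt_start = cds_offset + (aa_start - 1) * 3
    nt_end = cds_offset + aa_end * 3  # exclusive end for Python slicing
    if nt_end > len(cdna):
        raise ValueError("domain extends past the end of the cDNA")
    return cdna[nt_start:nt_end]


def domains_to_cdna(cdna, hits, cds_offset=0):
    """Apply domain_to_cdna to each (aa_start, aa_end) pair in hits."""
    return [domain_to_cdna(cdna, start, end, cds_offset) for start, end in hits]
```

For a sequence with several hits, domains_to_cdna(cdna, [(2, 4), (5, 5)])
would return one nucleotide subsequence per hit, in the same order.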
2008/7/21, Dave Messina <David.Messina at sbc.su.se>:
> Hi Ronnie,
> I'm not sure I'm following you -- you start with a database of cDNA
> sequences, but you're asking how to obtain the cDNA sequence.
> Do you mean, once you've identified in protein space with HMMer a subset of
> sequences that contain a certain domain, how do you pick out the
> corresponding cDNA sequences from your starting database?
> I'm not sure this is what you mean, but you should be able to generate a
> lookup hash of which protein sequence identifier corresponds to which cDNA
> identifier. Once you've used your protein IDs to get the list of cDNA IDs,
> then you can extract the cDNA sequences from your original database. It
> would probably be possible to use Bio::Annotation to keep track of the
> relationship between a protein ID and a cDNA ID, but this seems like
> overkill to me compared to a plain old hash.
> If you haven't already, you may want to check out the PAML HOWTO on the
> BioPerl website, which shows the pairwise_kaks script. Or look directly at
> the script itself, included in the bioperl-live distribution under
> scripts/utilities.
> In any case, I'm not sure I've answered your question -- please follow up if
> I've missed the point.
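
The plain lookup hash suggested in the quoted reply could be sketched as
follows, again in Python for illustration only. The identifiers and the
in-memory dictionaries are made up for the example; in practice the mapping
and the cDNA database would be built from the actual sequence files, and the
function name extract_cdnas is hypothetical.

```python
def extract_cdnas(protein_hit_ids, protein_to_cdna, cdna_db):
    """Look up the cDNA sequence for each protein ID that had a domain hit.

    protein_to_cdna maps protein ID -> cDNA ID (the "plain old hash").
    cdna_db maps cDNA ID -> nucleotide sequence.
    Returns (found, missing): found maps cDNA ID -> sequence, and missing
    lists protein IDs with no mapping or no sequence in cdna_db.
    """
    found = {}
    missing = []
    for protein_id in protein_hit_ids:
        cdna_id = protein_to_cdna.get(protein_id)
        if cdna_id is None or cdna_id not in cdna_db:
            missing.append(protein_id)
            continue
        found[cdna_id] = cdna_db[cdna_id]
    return found, missing


# Made-up example data.
protein_to_cdna = {"prot1": "cdna1", "prot2": "cdna2"}
cdna_db = {"cdna1": "ATGGCCAAATAA", "cdna2": "ATGTTTGGGTAA"}
found, missing = extract_cdnas(["prot2", "prot3"], protein_to_cdna, cdna_db)
```

Keeping the unmapped IDs in a separate list makes it easy to spot protein
hits whose cDNA could not be traced back to the starting database.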