[Bioperl-l] (no subject)

Ronnie de Jonge ronnie.dejonge at gmail.com
Mon Jul 21 11:09:13 EDT 2008

Hi Dave,
That is exactly my problem: steps 7 and 8.
As I said, I tried to solve it by BLASTing the domain sequence against the
cDNA database and using the first hit (parsed with SearchIO) to truncate the
cDNA hit. This gives false positives, though.
I'm now wondering whether I could do the same thing, but instead of BLASTing
against the entire database, BLAST the hit against just its own cDNA sequence
(a single FASTA file) 'on the fly'? (I guess that's not possible because of
the formatdb step?)


From: dave at davemessina.com [mailto:dave at davemessina.com] On Behalf Of Dave
Sent: Monday, 21 July 2008 17:04
To: Dhr. R. de Jonge
Cc: bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] (no subject)

Okay, let me see if I've got this straight: you want to do Ka/Ks on just the
subsequences of the cDNAs that match the HMMer domain?

1) You have a cDNA sequence. Let's call it Xn.
2) Xn is 300 nucleotides in length.
3) You translate Xn into protein Xp.
4) You use HMMer to search Xp against Pfam.
5) HMMer tells you that Xp has, for example, an SH2 domain from residue 30
to residue 51.
6) Likewise, let's say two additional proteins Yp and Zp have the same SH2
domain.

You want to:
7) Determine which nucleotides in Xn correspond to amino acids 30-51 in Xp.
8) Extract just those nucleotides (and also the nucleotides in Yn and Zn
corresponding to their SH2 domain hits).
9) Align those nucleotide sequences.
10) Give the resulting multiple alignment to PAML and calculate the Ka/Ks

Is that correct?
Is it steps 7 and 8 that you are trying to solve?
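
For what it's worth, steps 7 and 8 can probably be done with plain coordinate
arithmetic rather than a search, provided Xp was translated directly from Xn
in a known forward frame. The sketch below is only an illustration under that
assumption: the function names are made up for this example (they are not
BioPerl or HMMer calls), coordinates are taken to be 1-based and inclusive as
HMMer reports them, and reverse-strand frames are not handled.

```python
def domain_to_nt_coords(aa_start, aa_end, frame=1):
    """Map 1-based inclusive protein coordinates to 1-based inclusive
    cDNA coordinates, assuming a forward translation frame of 1, 2 or 3."""
    if not 1 <= aa_start <= aa_end:
        raise ValueError("expected 1 <= aa_start <= aa_end")
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3 (forward strand only)")
    # Residue k is encoded by the codon starting at nucleotide (k - 1) * 3 + frame.
    nt_start = (aa_start - 1) * 3 + frame
    # The last residue's codon ends two nucleotides after its first base.
    nt_end = aa_end * 3 + frame - 1
    return nt_start, nt_end


def extract_domain_nt(cdna, aa_start, aa_end, frame=1):
    """Return the cDNA substring that encodes residues aa_start..aa_end."""
    nt_start, nt_end = domain_to_nt_coords(aa_start, aa_end, frame)
    if nt_end > len(cdna):
        raise ValueError("domain extends past the end of the cDNA")
    # Convert the 1-based inclusive range to a Python slice.
    return cdna[nt_start - 1:nt_end]


if __name__ == "__main__":
    # Hypothetical 300-nt cDNA standing in for Xn; domain at residues 30-51.
    xn = "ACG" * 100
    print(domain_to_nt_coords(30, 51))
    print(len(extract_domain_nt(xn, 30, 51)))
```

Because the slice always covers whole codons, the extracted pieces for Xn, Yn
and Zn stay in frame, which is what a codon-based Ka/Ks calculation needs.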

