[Bioperl-l] blasting two identical seq yields only 88% identity

William Hsiao wlhsiao at yahoo.ca
Sun Dec 25 11:15:15 EST 2005

Hi Anders,
   This is due to BLAST's low-complexity filter,
which masks low-complexity regions of the query as
X's.  These X's are taken into account when the %
identity is calculated, so two identical sequences
score less than 100%.  If you turn the filter off,
you should see 100% identity.
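
To illustrate the arithmetic, here is a minimal Python sketch. It assumes
that a masked X in the query never counts as an identical position and that
the alignment is ungapped. The sequence, the masked positions, and the
function names are made up for the example (chosen so that 3 of 25 residues
are masked); they are not your data and not BLAST's actual code.

```python
def percent_identity(query: str, subject: str) -> float:
    """Percent identity over an ungapped alignment of equal-length strings.

    Assumption for this sketch: a masked 'X' never counts as an identity.
    """
    if len(query) != len(subject):
        raise ValueError("sketch only handles ungapped, equal-length alignments")
    if not query:
        raise ValueError("empty alignment")
    matches = sum(1 for q, s in zip(query, subject) if q == s and q != "X")
    return 100.0 * matches / len(query)


def mask(seq: str, start: int, end: int) -> str:
    """Replace seq[start:end] with X's, mimicking a low-complexity mask."""
    return seq[:start] + "X" * (end - start) + seq[end:]


# Hypothetical 25-residue protein with a short serine run at positions 10-12.
protein = "MKTAYIAKQRSSSLVDEGHNPWFCQ"

unfiltered = percent_identity(protein, protein)
filtered = percent_identity(mask(protein, 10, 13), protein)

print(f"filter off: {unfiltered:.0f}%")  # every position matches
print(f"filter on:  {filtered:.0f}%")    # the 3 masked positions are lost
```

With the legacy blastall program the filter is controlled by -F, so adding
-F F to the command line turns it off.  In BLAST+ the corresponding option
is -seg no for protein queries (blastp, tblastn) and -dust no for blastn.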



--- Anders Stegmann <anst at kvl.dk> wrote:

> Merry Christmas BioPerl!
> I obtained an odd result when blasting a protein
> sequence against
> a chromosome I knew encoded the protein, using
> tblastn.
> So I tested the problem by blasting the protein
> against a database containing only the exact same
> protein sequence using blastp (both files were FASTA
> formatted).
> I obtained an identity of only 88% instead of 100%.
> A lot of X's were incorporated in the query
> sequence.
> I figured that it had something to do with the
> database formatting, so I tried several possibilities
> with no luck
> (first I tried: formatdb -i SSD1pDB.txt -p T -o F).
> I have had this problem before when blasting nucleotides.
> What can I do about it?
> Regards Anders.

