[Bioperl-l] blasting two identical seq yields only 88% identity
wlhsiao at yahoo.ca
Sun Dec 25 11:15:15 EST 2005
This is due to BLAST's low-complexity filter, which
masks low-complexity regions of the query as X's.
These X's are taken into account when % identity is
calculated, so two identical sequences score less
than 100%. Turn the filter off and you should see
100% identity.
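
To make the arithmetic concrete, here is a minimal sketch of how masking alone can pull a self-comparison down to 88%. It is a toy model, not BLAST's actual scoring code: the sequence, the masked coordinates, and the helper names `mask` and `percent_identity` are all made up for illustration, and the real filter picks the regions to mask itself. With the legacy `blastall` program, the filter is controlled by the `-F` option (`-F F` turns it off).

```python
def mask(seq: str, start: int, end: int) -> str:
    """Replace residues in [start, end) with X, as a low-complexity filter would."""
    return seq[:start] + "X" * (end - start) + seq[end:]


def percent_identity(query: str, subject: str) -> float:
    """Percent of positions in an ungapped, full-length alignment that match.

    A masked position (X) is never counted as a match, but it still
    counts toward the alignment length.
    """
    if len(query) != len(subject):
        raise ValueError("this sketch only handles equal-length, ungapped alignments")
    matches = sum(1 for q, s in zip(query, subject) if q == s and q != "X")
    return 100.0 * matches / len(query)


# Hypothetical 50-residue protein with a serine run standing in for a
# low-complexity region (positions 10-15, zero-based).
protein = "MKTAYIAKQR" + "SSSSSS" + "QISFVKSHFSRQLEERLGLIEVQAPILSRVGDGT"

filtered_query = mask(protein, 10, 16)   # filter on: 6 of 50 residues become X
unfiltered_query = protein               # filter off: query is left untouched

print(f"filter on:  {percent_identity(filtered_query, protein):.0f}%")    # 88%
print(f"filter off: {percent_identity(unfiltered_query, protein):.0f}%")  # 100%
```

The database sequence is identical in both cases; only the masked query differs, which is why reformatting the database does not change the result.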
--- Anders Stegmann <anst at kvl.dk> wrote:
> Merry christmas BioPerl!
> I obtained some odd results blasting a protein
> sequence against
> a chromosome I knew encoded the protein using
> So I tested the problem by blasting the protein
> against a database only containing the exact same
> protein sequence using blastp (both files were fasta
> I obtained an identity of only 88% instead of 100%?
> A lot of X's were incorporated in the query
> I figured that it had something to do with the
> database formatting so I tried several possibilities
> with no luck
> (First I tried: formatdb -i SSD1pDB.txt -p T -o F).
> I have had this problem before blasting nucleotides.
> What can I do about it?
> Regards Anders.