Month: December 2016

DNA Contamination – The Implied Claim

In Jeff Tomkins’ most recent attempt to play down the genetic similarity between humans and chimpanzees, he suggests that there is widespread human DNA contamination in the current chimpanzee genome, and that:

“Sequences […] from the seemingly less contaminated data sets indicate that the chimpanzee genome is approximately 85% identical overall to human.”

Analysis of 101 Chimpanzee Trace Read Data Sets: Assessment of Their Overall Similarity to Human and Possible Contamination With Human DNA

We can do some rough calculations here. Suppose the current chimpanzee assembly contains 90% chimpanzee DNA (which Tomkins claims is only 85% identical to human DNA) and 10% human DNA (which is obviously 100% identical to itself). We can then work out the approximate “observed similarity” that we would see if we were to compare this supposedly contaminated chimpanzee genome to the human genome:

(90% x 85%) + (10% x 100%) = 86.5%

We can generalise this formula to work out the “observed similarity” given the relative amounts of chimpanzee DNA and human contaminant that found their way into the assembly:

(X% x 85%) + (Y% x 100%) = Z%

Where:

X% = Percentage of chimpanzee DNA
Y% = Percentage of human DNA (equal to 100% - X%)
Z% = Observed similarity

We can then rearrange this formula to work out X% (and therefore Y%) given the “observed similarity”:

X% = (Z% - 100%) / (85% - 100%)

If the “observed similarity” is somewhere around 98%, then X% = (98% - 100%) / (85% - 100%) ≈ 13%, so it follows that the current chimpanzee assembly is actually composed of roughly 87% human DNA. That is clearly absurd.

Even a very conservative “observed similarity” of 95% gives X% ≈ 33%, which implies that a full two-thirds of the chimpanzee assembly isn’t chimpanzee DNA at all. I’m sure I’m not the only one who thinks this is more than a little far-fetched.
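
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same mixture calculation. The function and constant names are my own, chosen purely for illustration, and the 85% figure is simply Tomkins’ claimed value taken at face value, so treat this as a rough check under those assumptions and nothing more.

```python
# Assumption: Tomkins' claimed chimp-human identity, taken at face value.
CHIMP_IDENTITY = 0.85
# Human contaminant is 100% identical to the human genome.
HUMAN_IDENTITY = 1.00


def observed_similarity(chimp_fraction, chimp_identity=CHIMP_IDENTITY):
    """Z: similarity we would observe for an assembly that is part chimp, part human."""
    human_fraction = 1.0 - chimp_fraction
    return chimp_fraction * chimp_identity + human_fraction * HUMAN_IDENTITY


def implied_chimp_fraction(observed, chimp_identity=CHIMP_IDENTITY):
    """X: fraction of genuine chimp DNA implied by an observed similarity Z."""
    return (observed - HUMAN_IDENTITY) / (chimp_identity - HUMAN_IDENTITY)


if __name__ == "__main__":
    for z in (0.98, 0.95):
        x = implied_chimp_fraction(z)
        print(f"observed {z:.0%}: chimp {x:.1%}, human contaminant {1 - x:.1%}")
```

Running it should reproduce the figures above: roughly 13% chimpanzee DNA at an observed similarity of 98%, and roughly a third at 95%.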