Month: July 2016

Chromosome 2 Fusion – What do subtelomeres look like?

As usual, we’ll start with a quote from Jeff Tomkins:

Second, the fusion-like sequence was very degenerate and only 70% similar to what one would expect of a pristine fusion sequence of the same size. Even if you assume an evolutionary timeline of up to six million years since the fusion event occurred, the data do not match up with known mutation rates or the variability found in human DNA.

Jeff Tomkins – More DNA Evidence Against Human Chromosome Fusion

The implicit assumption underlying this statement is that the sequences we find at the fusion site were – no more than six million years ago – pristine, perfect telomere repeats, and that they have since mutated into the “degenerate” arrays we see today. This assumption is utterly wrong-headed.

What he is ignoring is the fact that these “degenerate” arrays are found immediately adjacent to the telomeres in virtually all of the human chromosomes. What the chromosome 2 fusion sequence looks like to any reasonable, well-informed person is two chromosomes whose telomeres have been depleted to the point where these subtelomeric “degenerate” arrays are exposed, and the telomeres are no longer protecting the chromosomes from fusion.

First, here are some “degenerate” TTAGGG repeats, found at the “end” of some of our chromosomes:

>chromosome:GRCh38:4:190122446:190122745:1
ATGAGGGTTGGGGTTAGGGTTAGGGTTAGGGTGAGGGTGAGGGTGAGGGTGAGGGTGAGG
GTGAGGGTTAGGGTTAGGGGTTAGGGTCAGGGTCAGGGTCAGGGTCAGGGTCAGGGGTAG
GGTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTGGGTTAGGGTTAGGGTCAGGG
TCAGGGTCAGGGTCAGGGTCAGGGTCAGGGTTAGGGGTTAGGGGTTAGGGTCAGGGTTAG
GGTTAGGGTTAGGGTTTTAGGGTTAGGGTTGGGGTTGGGGTTAGGGTTAGGGTTAGGGTT
>chromosome:GRCh38:1:248946010:248946309:1
GGGTTAGGGTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGGTTAGGGTTA
GGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTGGGTTAGGGTTAGGGTTAGGGTTAG
GGTTAGGGGTTAGGGTTAGGGGTTAGGGTTGGGGTTGGGGTTGGGGTTGGGGTTGGGGTT
GGGGTTGGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTT
AGGGTGTTAGGGTGTTAGGGTGTTAGGGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGG
>chromosome:GRCh38:X:156030051:156030350:1
TGGGGTTAGGGTTAAGGGTTAGGGTTAGGGGTTAGGGGTTAGGGTTGGGGTTGGGGTTAG
GGTTAGGGTAGGGTTAGGGTTAGGGTTAGGGGTTAGGGGTTAGGGTAGGGTTAGGGTGAG
GGTGAGGGTGAGGGTGAGGGTGAGGGTGAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAG
GGGTTAGGGGTTAGGGTTAGGGTTAGGGGTTAGGGGTTAGGGTTAGGGTTAGGGGTTAGG
GTTAGGGTTAGGGGTTAGGGGTTAGGGGTTAGGGGTTAGGGTAGGGTAGGGTAGGGTAGG

Now here are some of the reverse motifs – CCCTAA – found at the “beginning” of our chromosomes:

>chromosome:GRCh38:1:10160:10459:1
CCCTAACCCTAACCCTAACCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCCTAACCCTAACCCTAAACCCTAAACCCTAACCCTAACCCTAACCCT
AACCCTAACCCCAACCCCAACCCCAACCCCAACCCCAACCCCAACCCTAACCCCTAACCC
TAACCCTAACCCTACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCC
CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAAC
>chromosome:GRCh38:9:10000:10299:1
NTAACCCTAACCCTAACCCTAACCCAACCCCACCCCAACCCCAACCCCAACCCAACCCTA
ACCCTAACCCTAACCCAACCCTAACCCTAACCCTAACCCAACCCTCACCCTCACCCTCAC
CCTCACCCTCACCCTCACCCTCACCCTAACCCTACCCTAACCCCTAACCCCTAACCCCTA
ACCCCTAACCCTTAACCCTAACCCTAACCCTACCCTAACCCTAACCCTAACCCCTAACCC
CTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCCTAACCTCTAACCCT
>chromosome:GRCh38:18:10270:10569:1
TAACCCTAACCCTAACCCTACCCTAACCCTACCCTACCCTAACCCTAACCCTAACCCTAA
CCCTTAACCCTAACCCTAACCCTAACCCTACCCCAACCCCAACCCCAACCCCAACCCCAA
CCCCAACCCCAACCCCAACCCTACCCTAACCCTAACCCTAACCCTAAACCCCAACCCTAA
CCCCTAACCCCTACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTA
CCCTAACCCTACCCTACCCTAACCCTAACCCTAACCCTAACCCTTAACCCTAACCCTAAC

As you can see, these are not perfect telomeric repeats, and it is quite easy to visualise the resulting sequence if one of these “degenerate” forward arrays fused head-to-head with one of these “degenerate” reverse arrays.
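To make that concrete, here is a minimal Python sketch that joins one forward array to one reverse array and counts the exact motifs on each side. The two excerpts are single lines copied from the chromosome 4 and chromosome 1 listings above; the choice of lines, the helper names and the idea of modelling a fusion as plain string concatenation are all assumptions made for illustration, not a reconstruction of the real event.

```python
"""Sketch: what a head-to-head join of two degenerate arrays looks like."""

# One 60-base line copied from the chromosome 4 listing (forward motifs) and
# one from the chromosome 1 listing (reverse motifs). Which lines to use is an
# arbitrary choice made for this illustration.
FORWARD_END = "GGTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTGGGTTAGGGTTAGGGTCAGGG"
REVERSE_START = "CCCTAACCCTAACCCTAACCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_motif(seq, motif):
    """Count exact, non-overlapping copies of motif in seq."""
    return seq.count(motif)


def fuse_head_to_head(forward_array, reverse_array):
    """Hypothetical helper: model a fusion as simple end-to-end joining."""
    return forward_array + reverse_array


if __name__ == "__main__":
    fused = fuse_head_to_head(FORWARD_END, REVERSE_START)
    half = len(FORWARD_END)
    # Report exact motif counts on each side of the join.
    for label, part in (("left", fused[:half]), ("right", fused[half:])):
        print(label,
              "TTAGGG:", count_motif(part, "TTAGGG"),
              "CCCTAA:", count_motif(part, "CCCTAA"))
```

CCCTAA is simply the reverse complement of TTAGGG, so the two halves are the same kind of array read from opposite strands. In the joined string the left half contains no exact CCCTAA and the right half no exact TTAGGG, and the imperfect units on both sides are carried straight across the join.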

It would look something like the picture that Jeff Tomkins himself has provided:

[Image: Tomkins-highlight]
http://creation.com/chromosome-2-fusion-2

Funny that.

Chromosome 2 Fusion – What should we expect?

Man, if I had a dollar for every time somebody told me that the fusion site doesn’t look like what we would expect it to look like, I’d have twenty-something million dollars. So, I thought I might download some DNA sequences for known mammalian fusions and show you what they look like.

Now the sequences I’m about to show you are from the Indian muntjac species, and if I may quote from a paper published in 2008:

Indian muntjac (Muntiacus muntjak vaginalis) has an extreme mammalian karyotype, with only six and seven chromosomes in the female and male, respectively. Chinese muntjac (Muntiacus reevesi) has a more typical mammalian karyotype, with 46 chromosomes in both sexes.

Comparative sequence analyses reveal sites of ancestral chromosomal fusions in the Indian muntjac genome

So clearly there have been a bunch of fusions in this Indian muntjac species and – luckily for us – they decided to sequence these fusion sites to see what they looked like. Now, remember these are not telomere-to-telomere fusions, these are telomere-to-satellite fusions, so they only correspond to one side of the human chromosome 2 fusion site.

“In chromosome fusion events that occur in nature in living mammals—a very rare event—the DNA signature always involves satDNA producing a DNA signature that occurs as either satDNA-satDNA or satDNA-teloDNA sequence.”

More DNA Evidence Against Human Chromosome Fusion

Behold!

This is what fusions actually look like.

http://www.ncbi.nlm.nih.gov/nuccore/DP000824

AGAGATCTAGTTTTCCACCAAAGATTTAAAATATCTCTCTGACCTCCTTTTTTTTGGGGAGGGGGGGGAG
GGTTTGAAGTTTTCATTTGCCCAAGGTTGAGGTCTCAGGCAAAGTAGGCAGTGTTTTCAAGGAAAGGGTT
CGGGTTCGGGTTCGGGTTCGGGTTCGGGTTCGGGTTAGGGTTCGGGTTAGGGTTAGGGTTCAGGTTAGGG
TTAGGGTTCGGGTTAGGGTTCGGGTTAGGGTTCGGGTTTAGGGTTAGGGTTCGGGTTAGGGTTAGGGGTT
AGGGTTAGGGATAGGGTTAGGGTTAGGGTTAGGGGTTAGGGTTCGGGTTAGGGTTAGGGTTCGGGTTAGG
TTTGGGGTTAGGGTTAGGGTTAGGGGTTAGGGTTAGGGTTAGGTTTAGGGTTTAGGGTTAGGGTTACGGT
TAGGGTTAGGGTTAGGGTTAGGGTTTAGGGTTAGTGTTAGGGTTAGGGTTAGGGTCAGGGTTAGGGTTTA
GGGTTAGGGTTAGGGTTAGGGTTAGGGTTTAGGGTTAGGGTTTAGGGTTAGGGTTAGGGTTAGGGTTAGG
GTTAGGGGTTAGGGTTAGGGTTAGGGTTAGGGATTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGT
GATAAGGCATTCTCTAGTGTCTCCCCAGGGGCTTCAGACTTCCCTTCATCTTGTGACACGTAATCCAGCT
TGTACTCAAGTCAGTGCAGGAATTCCGGCCTGATTTCCAGTCAGGGCATTTCAGGGTCGATTCCACTGGA

http://www.ncbi.nlm.nih.gov/nuccore/DP000825

TAGAATATCATGGCCATCAAACGTCGGGTCATCTTTCTTTCCTGATGGCATAGTTTTACCGCTGAATTAT
ACACAGATCAGCTGACAAGGTGATGTGAACCTGCGGAAGGAAGGATCACTAACGTGGTTCGGGAAAGGGG
TTTGGGTTCGGGTTCGGGTTCGGGTTCGGGTTCGGGTTCGGGTTAGGGTTCGGGTTAGGGTTAGGGTTAG
GGTTAGGGTTTGGGTTTGGGTTAGGGTTAGGGTTCGGGTTCGGGTTAGGGTTAGGGTTCGGGTTAGGGTT
CGGGTTAGGGTTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTCGGGTTCGGGTTCAGGTTAGGGTTAGG
GTTAGGGTTAGCGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGGTTAGTGTTAGGGTTAGGGTTAGGCT
TAGGGTTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTCAGGGTTTAGGGTTAGGGTTAGGGTTAGGGTTA
GGGTTTAGGGTTTAGGGTTTAGGGTTTAGGGGTTAGGGTTAGGGTAAGGTTTAGGGTTAGGGGTTAGGGT
TAGGGTTAGCGTTAGGGTTAGGGTTAGGGTTAGGGGTTAGGGTAGGGTTCGGGTTAGGGTTAGGGTTAGG
GTTAGGTTTCGGGTTAGGGTTCGGCTTAGGGTTCGGGTTAGGGTTTAGGGTTAGGGTTTAGGGTTAGGGT
TAGGGTTAGGGTTCGGGTTCGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGCTCCTTTCAAGT
TTGATTGCGAGCACGGAATTGCTCTGCACGCAGTGCAGGGGAATCGGGCCTCATCTCATGGTGAGGGGGA
AGTCTCATGGTTTTTCTCGAGTTGCGGCCGGAACCTGGGATATATTCTCGAGTTACGACAGGGATGGCCC

http://www.ncbi.nlm.nih.gov/nuccore/DP000826

CTAGGTCAGGTCATCTTTCCTTTCTGTAGTCATAGTTAGACAGATCAGCTGATAAAAACCTTGGATTGTG
TCTGTGGGGTGGATTTGCTTGTCATTGTGCTTCAGCTGGCCAAGGTAGCAACGCCCGCGCCCCTCCCACG
GAAGTAGGAGTGGGGGCGGGGCGCACCGGGGGAGTGTCGGGCCGGGTTAGGGCTAGGGCTTAGGGCTTAG
GGCTTAGGGCTTAGGGCTTAGGGCTGAGGTTGGGGTTAAGGCTTAGGGTTAGGGCTGGGGTTGGGGTTAG
GGCTTAGGGCTTAGGGCTAGGGTTAGGGTTGGGCTTAGGGCTAGGGTTAGGGCTTAGGGCTTAGGGCTAG
GGCTTAGGGCTAGGGTTAGGGGTTAGGGTTAGGGTTAGGGTTAGAGGGTTAGGGTTAGGGTTAGGGTTAG
GGTTAGGGGTTAGGGTTAGGGTTAGGGTTAGGGTTACGGTTAGCGTTAGGGTTAGGGTTAGGGGTTAGGG
TTAGGGTTAGGGTTAGGGTTACGGTTAGCGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTTAGGGTTA
GGGTTAGGTTTAGGGTTAGGGTTTAGGGTTAGGGTTAGGGTTAGTGTTAGGGTTAGGGTCTGGATTGTCT
CCAGAGCCATCCCGCCTTCCCCATCAAACATGACAAGTGGCTTGACTTCCTTTAGACAGTTCCGAAGATT
CCTCGAGAATATCGTCGTCACAAGTCTAGAGGAACACCAAGTTCAGCACAGCAACTCGAGAAAAGCTCCG

http://www.ncbi.nlm.nih.gov/nuccore/DP000827

CTAGCTACCAGTCATTTAGAATAATTTACTGGATGCCGTCAACCACATTCTGTTTAGAGTGTACATATGC
AAATTGAATACAAGAAAAAAAAAACAACGAAACAGAACCCACTCATCTGGTTTTAAACCAAGATCATAAT
CACTATATCTTTCTCAATATATGAGATGTTAGTGAAAAATAATGTTAGGGTTAGGGGTTAGGGTTAGGGT
TAGGGTTGGGGTTGGGGTTAGGGTTAGGGTTAGGGTTAGGGTTCGGGGTTAGGGGTTAGGGTTAGGGGTT
AGGGTTAGGGTAGGGTTAGGGTTAGGGGTTAGGGTTAGGGGTTAGGGGTTAGGGGTTAGGGTTAGGGTTA
GGGGTAAGGGGTTAGGGTTAGGCGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTTGGGTTAGGGTTAG
GGTTAGGGTTACAGTTAGGGTTAGGGTTAGAGTCAGTTAGGGTTAGGGTTAGGGTTTAGGGTTAGGGTTA
GGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGCTTGATTCCCACGTGATTCCCTCCTGATTCC
CACGTGATTCCCACGTGATTCCCACGTGATTCCCACGTCACTCCCACGTGATTCCCACGTGATTCCCACG
TGACTGAGACGTGATTCCCACGTGATTCCCACGTGATTCCCACGCGATTTCCACATGATTCCCACGTGAT

http://www.ncbi.nlm.nih.gov/nuccore/DP000828

GCTGCATATAGGCCCAGAGTTCTGAGATGGTGGAATGAACCAACTGGACACTTAAAGAGACACTCCAAGT
GGATTAGAGAGACTGACTGCTCTAGGGATTGGGCTAGGGCTCGGGCTTGGGCTCGGGTTTGGGGTTCGGG
TTCGGGGTTCGGGTTCAGGCTCGGGCTCGGGTTTGGGGTTCGGGTTCGGGTTAGGGTTCGAGTTAGGGTT
CGGGTTCGGGTTCGGGTTAGGGTTCGGGTTCGGGTTCGGGTTCGGGTTAGGGTTAGGGTTCGGGTTCGGG
TTAGGGGTTAGGGTTAGGGTTAGGGCTAGGGTTAGGGTTAGGGTTAGGGTTAGGGGTTAGGGTTAGGGTT
AGGGTTAGGGGTTAGGGTTAGGGTTAGGGGTTAGGGTTAGGGGTTAGGGTTAGGGTTAGGGTTTAGGGTT
AGGGTTAGGGTTAAGGTTAGGGTTAGGGGTTAGGGTTAGGGTTCGGGTTCGGGTTCGGGTTAGGGTTAGG
GTTAGGGTTAGGGTTAGTGTTAGGGTTAGGGTTAGGTTTAGGGTTAGTGTTAGGGTTTAGGGTTAGGGTT
AGGGTTAGGGTTAGGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGCTTAGGGATTAGGGTTAGGGTTAG
GGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGGTTAGGGTTAGGGTTAGCGTTTAGGTTTAGGATTAGGT
TTAGGGTTAAGGTTAGGGTTAGGGTTTGGGTTAGGGTTAGCTCCCAGACGTACAGCTGCGAGAAGCCTCC
TCTTGTGGTGCTTGTGGACAGTTGGCATTCCTCTTGATTTGAAGCTAGGAAATCAGCCCTCACCTCGAGA
TGATTGGCGGAACACGGAGCTCTTTCTGCTTGCTGCAGTGACCTCAGTTTCCATCTTGACTTGAGACAGT

http://www.ncbi.nlm.nih.gov/nuccore/DP000829

CCGTGTAAACACACAGCCTTCCTAGGCCTTATGCCTCCTCGTCCATAAGGGGATGGTGGTTTTTCTGCTC
ATGGGGTGGGGGGAGGGCACACCTATGGCCACAGCACTGGCTCCATGGACAGCATGGTTCCTTGGGGGCC
TGGAGCCCCACAACTGATAAGACTGACACAAGAGCTGACATTAGGGTTCGGGTTCGGGTCAGGGTTCGGG
TCAGGGTTCGGGTCAGGGGTTAGGGTCAGGGTTAGGGTCAGGGGTTAGGGTCAGGGTTAGGGTTAGGGTT
AGGGGTTAGGGTCAGGGTCAGGGTCAGGGTCAGGGTCAGGGGTTAGGGTCAGGGTCAGGGTCAGGGTCAG
GGTCAGGGTTAGGGGTTAGGGTTAGGGTTATGGTTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGGTTAGG
GTTAGGGTTAGGGTTTAGGGTTAGGGGTTAGGGTTAGGGTTAGGGTTAGGGTTTAGGGTTAGGGTTTAGG
GTTAGGGTTAGGGTTAGGGTTTAGGGTTAGAGTTAAGGTTAGGGTTAGGGTTAGGGGTTAGGGTCAGGGT
CAGGGTCAGGGGTTAGGGTCAGGGTCAGGGTCAGGGTCAGGTTAGGGTTAGGGTTAGGGTTAGGGTTTAG
GGTTAGGGTTAGGGTTAGGGTTTAGGGTTAGGGTTTAGGGTTAGGATAAGGGTTAGGGTTTAGGGTTAGG
GTTAGGGTTAGGGTTTAGGGTTTAGGGTTTAGGGTTAGGGTTAGGGTTAGGTTTAGTGTTTAGGGTTAGG
GTTAGGGTTAGGTTTAGGTTAGGGTTAGGGTTAGGGTTTGGGTTAGGGTTAGGGTTAGGGTTAGGGGTTA
GGGTTAGGGTTAGGCTTTAGGGTTAGGGTTAGGGTTTAGGATTAGGGTTAGGGTTAGGGTTAGGGTTAGG
GTTAGGGTTTAGGGTTAGGGTTAGGGTTACGGTATAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTTAGGG
TTAGGTTTAGGGTTTGGGTTAGGGTTAGGGTTAGGGTGGAGGCCGCAAATTCAACCTCCCTCAACCAGAC
CTACAGCTGCGAGAAGCCTCCTCTTGTGGTGCTTGTGGACAGTTGGCATTCCTCTTGATTTGAAGCCAGG
AAATCAGCTCTCACCTCGAGATGATTCGGAATACACGGAGCTGTTTCTGGTTGGTGAAGTGACTTTAGGA

http://www.ncbi.nlm.nih.gov/nuccore/DP000830

CGTAGGAATGTGCCCATGAGATGAAAATATTGTCCTGAGTTGAAACGGAAGTTAAATTCAAAATCACGTT
GATGAGGCCAAGCAGCAGGAGTTGATTATGTGACTTCATCCTGAAGGGGCTCATCCGGGATCTTCTTTTC
CTTCTTCTTCCTGAGGATGATCTTAGTCTCTTGGGGGTCAGTGGAGTCCCTCTCTGCAGCCTGTTAGGGT
TAGTGTTAGGGTTAGGGTTAGGGTTTAGGGATAGAGTTAGGGTTAGGGTTAGGGTTAGGGTTAGTTAGGG
TTAGGGTTAGGGTTAGGATTAGGGTTTAGGGTAAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTT
AGGGTTAGGGTTAGGGTTAGTAATGGATGGGAGGCCGCCTGTCGAGAAAGGGCAGGGAGCTAGGGCTTTC
TCTAGTGTCTCCGCAGGGGCTTCAGACATCCCTTCATCTTGTGAGATGAAATCCAGCTTGCACTCAGGTC
ACTGCAGGAATTCCGGCCTGATTTCGTGTCAGGGCATCTCGGGATCGATTCCACTGGAGCTCGCAAATTC
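One rough way to put a number on how imperfect these tracts are is to ask what fraction of the bases sit inside an exact TTAGGG unit. The Python sketch below does that for a single line copied from the DP000824 listing above. This is a crude proxy chosen for illustration, not the similarity measure behind the 70% figure, and the function name and choice of excerpt are assumptions.

```python
"""Sketch: a crude 'how pristine is this tract?' score."""

MOTIF = "TTAGGG"

# One 70-base line copied from the DP000824 listing above (arbitrary choice).
MUNTJAC_EXCERPT = (
    "TTAGGGTTCGGGTTAGGGTTCGGGTTAGGGTTCGGGTTTAGGGTTAGGGTTCGGGTTAGGGTTAGGGGTT"
)


def perfect_repeat_fraction(seq, motif=MOTIF):
    """Fraction of bases in seq that sit inside an exact copy of motif."""
    if not seq:
        return 0.0
    # str.count finds non-overlapping copies, so the result never exceeds 1.
    return len(motif) * seq.count(motif) / len(seq)


if __name__ == "__main__":
    print("pristine array:", perfect_repeat_fraction(MOTIF * 10))
    print("muntjac excerpt:", round(perfect_repeat_fraction(MUNTJAC_EXCERPT), 2))
```

A pristine array scores 1.0 by construction. Anything containing variant units such as TTCGGG or TTAGGGG scores lower, and that holds for these known fusion sites.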

In the Immortal Words of Dr Tomkins

First, the sequence was only about 800 bases long—not the 10,000 bases or more you would expect if two 5,000-base (or larger) telomeres fused together.

Second, the fusion-like sequence was very degenerate and only 70% similar to what one would expect of a pristine fusion sequence of the same size.

More DNA Evidence Against Human Chromosome Fusion

Who here agrees with him?

Chromosome 2 Fusion – The Cryptic Centromere

This is a brief tutorial on how one goes about demonstrating the existence of a cryptic centromere on human chromosome 2. It is in response to this point from Jeff Tomkins:

“The purported cryptic centromere on human chromosome 2, like the fusion site, is in a very different location to that predicted by a fusion event.”

New Research Undermines Key Argument for Human Evolution

So, first of all I need to mention that Jeff Tomkins implicitly admits that there is such a putative centromere, but his objection is that it is not where it should be. Nevertheless I’ll show you how to find it and then show that it is where it is expected to be.

So what are we looking for?

“The DNA evidence in question is based on the fact that human, great-ape, and other mammalian centromeres are composed of a highly variable class of DNA sequence that is repeated over and over called alpha-satellite or alphoid DNA. Alphoid DNA, although found in centromeric areas, is not unique to centromeres and is even highly variable between homologous regions throughout the same mammalian genome.”

So basically what we are looking for is a large cluster of these alphoid sequences. As Tomkins states, alphoid sequences are not unique to centromeres, but we shouldn’t find large clusters elsewhere on the chromosome.

BLAST away!

So let’s get a list of all the alphoid sequences that we can find on chromosome 2:

[glenn@macha] cat alphoid.fa
>gi|117911456|emb|CS444613.1| Sequence 51 from Patent WO2006110680
CATTCTCAGAAACTTCTTTGTGATGTGTGCATTCAACTCACAGAGTTGAACCTTCCTTTTCATAGAGCAG
TTTTGAAACACTCTTTTTGTAGAATCTGCAAGTGGATATTTGGACCGCTTTGAGGCCTTCGTTGGAAACG
GGAATATCTTCATATAAAAACTAGACAGAAG

… and then …

[glenn@macha] blastn -query alphoid.fa -subject /Users/glenn/Data/hg19/chr2.fa \
 -outfmt '10 sstart send pident nident length evalue' -out alphoid.csv \
 -task blastn -dust no -soft_masking false -word_size 7 -evalue 1e-30

This command searches chromosome 2 for anything that looks like an alphoid sequence and writes the results to a file named alphoid.csv. This is what the file looks like after it has been sorted by start position:

70658558,70658701,86.806,125,144,1.10e-42
92272684,92272854,84.211,144,171,1.74e-46
92272855,92273025,88.304,151,171,2.95e-56
92273026,92273194,80.117,137,171,4.38e-35
92273195,92273363,85.294,145,170,1.74e-46
92273366,92273535,80.588,137,170,8.46e-38
92273537,92273707,85.380,146,171,3.37e-49
92274458,92274566,89.091,98,110,6.51e-33
92274567,92274738,83.721,144,172,7.42e-45
92274739,92274909,85.380,146,171,3.37e-49
92274910,92275079,84.706,144,170,1.43e-47
92275081,92275250,85.965,147,171,9.65e-50
...
...
...

That first field (“sstart” in our command above) is where the matching DNA starts on chromosome 2. So if you look at the file in its entirety, you’ll see that there are 483 matches for this alphoid sequence across chromosome 2, and the vast majority – all but 2 of those 483 matches – are clustered around two locations.

The first location is around the 92Mb mark – and this corresponds to the beginning of the active centromere; the second location is around the 133Mb mark.

Could this be our centromere?

Well, it certainly is a cluster of alphoid sequences, but is it in the right place? Let’s have a look at the genes on either side of this cluster:

[Image: CentromereSynteny]
http://grch37.ensembl.org/Homo_sapiens/Location/Synteny?db=core&r=2%3A132000000-134000000&otherspecies=Pan_troglodytes

What you should be looking at here are all the genes that precede the cryptic centromere (from PLEKHB2 down to ANKRD30BL) and their corresponding position on chimpanzee chromosome 2B. Now a couple of the corresponding chimpanzee genes are found on scaffolds (the ones beginning with AACZ or GL), but for the genes that have been placed on the chromosome, you can see that they are all around the 132Mb mark.

For the genes on the other side of the cryptic centromere (GPR39 and LYPD1) you’ll notice that the corresponding genes on chimpanzee chromosome 2B are found near the 136Mb mark.

And what, pray tell, is in that gap between 132Mb and 136Mb on chimpanzee chromosome 2B? The centromere!

To recap

  1. On human chromosome 2 there are two clusters of alphoid sequences.
  2. One of those clusters is the current active centromere.
  3. The other cluster corresponds well to the centromere on chimpanzee chromosome 2B.

I’m gonna say it’s our cryptic centromere …