MITOCHONDRIAL DNA HAPLOGROUP K SURVEY AT 750 ENTRIES ON MITOSEARCH
AS OF AUGUST 16, 2006
(UPDATED SEPTEMBER 2, 2006)
(Note: A new survey at 1000 K entries is now available at K1000 Survey. However, the information below may still be of interest.)
This is my sixth survey of the mtDNA haplogroup K (Katrine’s Clan) entries on FamilyTreeDNA’s MitoSearch. The previous survey at 500 entries may be found at K500 Survey. There are several links on that page to previous surveys, supplements and charts. I had been publishing surveys at every one hundred entries, but this time I skipped over two of the century marks. It has only been four months since the K500 survey.
The K750 survey includes a new CHART, which is sorted by HVR2 then HVR1 mutations. The first step was to eliminate all but the high-resolution entries – those with both HVR1 and HVR2 mutations listed. First to be dropped were 41 shown as CRS, or matching the Cambridge Reference Sequence, which is represented by no mutations in HVR1. None of those would belong in K, so I suspect that they are entries which were never completed by entering the mutation lists. Under HVR2 were 369 listing Not Tested and 47 listing No Mutations. Since no K would lack HVR2 mutations entirely, I eliminated all of those. I also eliminated one duplicated entry.
The remaining high-resolution entries numbered 333 or 44.4% of the original total – slightly lower than the previous survey. The real percentage, after eliminating the probably incomplete entries labeled CRS, is 47%. Also, some may have upgraded their tests without adding the HVR2 results to MitoSearch. Four listed the testing company as the National Geographic Society’s Genographic Project. I suspect many more began their testing there, but completed it at FTDNA and so listed that. Two listed Oxford Ancestors, although they probably took the HVR2 upgrade at FTDNA. One each listed DNA Heritage and Sorenson. Three listed Other. The remaining 322 listed FTDNA. Forty-two entries (plus 14 with HVR1 only) contain pedigree charts.
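The counts in the two paragraphs above can be checked with a short Python sketch. The variable names are illustrative only. One assumption is built in: the 41 CRS entries are taken to fall among the entries already dropped for HVR2 reasons, since that is the reading under which the quoted totals reconcile.

```python
# Counts quoted in the survey text.
total_entries = 750
crs_entries = 41          # HVR1 shown as CRS
hvr2_not_tested = 369
hvr2_no_mutations = 47
duplicates = 1

# Assumption: the CRS entries are already included in the HVR2 drops,
# so they are not subtracted a second time here.
high_res = total_entries - hvr2_not_tested - hvr2_no_mutations - duplicates

# Share of all entries, and share after setting aside the CRS entries.
raw_share = 100 * high_res / total_entries
adjusted_share = 100 * high_res / (total_entries - crs_entries)

# Testing companies listed by the high-resolution entries.
by_company = {
    "FTDNA": 322,
    "Genographic Project": 4,
    "Other": 3,
    "Oxford Ancestors": 2,
    "DNA Heritage": 1,
    "Sorenson": 1,
}

print(high_res, round(raw_share, 1), round(adjusted_share))
print(sum(by_company.values()) == high_res)
```

The company counts sum to the same 333 entries, which is a useful cross-check on the filtering step.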
There are 178 entries, or 53.5%, with exact high-resolution matches in 42 different haplotypes or sequences. That leaves 155 unmatched singletons for a total of 197 different haplotypes. Dividing by the 333 total gives 59.2% for what I have been calling the “diversity percentage,” but which Ann Turner says is called “discrimination capacity” in scientific papers. That number is down from 67% in the previous survey and 78% in the K403 survey. Thus it is becoming more likely that new entries will find an exact match. One haplotype has 15 examples, four have 12, and two have 11. I will list these very common haplotypes with just the “extra” mutations, not including the six basic K mutations: HVR1 – 16224C, 16311C, 16519C and HVR2 – 73G, 263G, 315.1C.
15 – 16234T, 114T, 497T – This is the most common form of the largest “Ashkenazi” subclade K1a1b1a.
12 – 146C, 152C – This is the basic “either/or” haplotype which can be K2a or the ancestor of K1c and other haplotypes. See my recent discussion.
12 – 16320T, 146C, 152C, 498- – This is a “perfect” subclade K1c2, which possibly originated in the
12 – 146C, 152C, 512C – This is another Ashkenazi subclade, K2a2a.
12 – 16093C, 16524G, 195C, 497T – This is the third Ashkenazi subclade, K1a9.
11 – 16223T, 16234T, 114T, 497T – This adds 16223T to the basic K1a1b1a subclade above.
11 – 309.1C, 497T – This adds one of the most common recurrent mutations, 309.1C, to the defining mutation for K1a, 497T.
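The matching and discrimination-capacity figures quoted before this list can be reproduced with a small Python sketch. The function name and the toy sample are hypothetical and are not the survey data. The sketch assumes each entry is reduced to its set of “extra” mutations and that two entries match only when those sets are identical.

```python
from collections import Counter

def discrimination_capacity(haplotypes):
    """Return (distinct haplotypes, entries with an exact match, percent distinct)."""
    # Identical mutation sets collapse to one key, whatever their listed order.
    counts = Counter(frozenset(h) for h in haplotypes)
    distinct = len(counts)
    matched = sum(n for n in counts.values() if n > 1)
    return distinct, matched, 100 * distinct / len(haplotypes)

# Toy sample (hypothetical): two entries share one haplotype, two are singletons.
sample = [
    {"16234T", "114T", "497T"},
    {"497T", "114T", "16234T"},
    {"146C", "152C"},
    {"146C", "152C", "512C"},
]
distinct, matched, percent = discrimination_capacity(sample)
print(distinct, matched, percent)

# The same arithmetic with the totals quoted in the survey.
survey_distinct = 42 + 155                 # matched haplotypes plus singletons
survey_percent = 100 * survey_distinct / 333
survey_matched_percent = 100 * 178 / 333
print(survey_distinct, round(survey_percent, 1), round(survey_matched_percent, 1))
```

As more entries find exact matches, the number of distinct haplotypes grows more slowly than the number of entries, which is why the percentage falls from survey to survey.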
Notice that four of the most common haplotypes are forms of the three Ashkenazi K subclades mentioned in Dr. Doron Behar’s 2006 paper. Some other common subclades, such as K1c, have few perfect examples but have many with one or more additional mutations. The two shortest haplotypes without back mutations are those with one extra mutation, either 497T or 146C. The first has four examples and the second has three. Since no K has just the basic six mutations, these two haplotypes probably represent the closest in time to the founder of K. Although I have referred to several K subclades above, the actual subclade designations are usually based on coding-region mutations outside the HVR1/2 regions. See the Behar paper mentioned above. At present those mutations may only be tested for K with a full-sequence test.
In the accompanying chart I have added columns with codes for the countries and general areas for the most distantly-known origin as listed in the entries. This time I have added three charts: by area of origin, by country of origin, and by European origin. (Ten phylogenetic charts and one map were added on September 2, 2006. See discussion.)
As in the past, I have color-coded a few of the mutations on the chart. I again used yellow to mark the 498- and 16320T mutations, which usually indicate the K1c and K1c2 subclades. Also, green is used to mark the 16234T, 16223T, 16524G, and 512C mutations, which usually indicate the three “Ashkenazi” subclades, as discussed above under the most common haplotypes. In previous surveys I used blue to mark the 497T mutation which defines subclade K1a, but this time I am using turquoise to mark the insertions at position 524, which appear in 21% of the entries. Though those don’t indicate specific subclades, they don’t usually appear in the subclades marked in yellow or green. In rare instances a haplotype may be marked by two of the colors; that usually indicates that a mutation which normally defines a subclade is appearing in a random, non-defining manner. I will discuss the few examples on the chart. One haplotype, with green and turquoise, has the 16524G mutation which usually defines subclade K1a9, but it is missing the 497T mutation required to be even in K1a. Thus its 16524G mutation is just a random or personal mutation. Similarly, another haplotype has the 16320T mutation which usually defines K1c2, but it is missing the 498- mutation for K1c. In the past I have also said that this entry has the 524 insertions which never appear with the K1c/K1c2 subclades, but a new member of the mtDNA Haplogroup K Project has both 498- and 16320T plus a pair of 524 insertions. My new rule: never say never. Again I have used aqua to mark the odd haplotypes with the 133G mutation and many other differences from the standard K. There have been no new examples of these since I discussed them in the Geographical Supplement to the K403 Survey.
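The color rules described above can be written down as a lookup, sketched here in Python. This is only an illustration of the chart's marking scheme, not a subclade predictor. The names are hypothetical, and the notation for 524 insertions (labels beginning with “524.”, such as 524.1C) is an assumption about how the entries are written.

```python
# Marker mutations and the chart color used for each, as described in the text.
MARKER_COLORS = {
    "498-": "yellow", "16320T": "yellow",            # usually K1c / K1c2
    "16234T": "green", "16223T": "green",            # usually the "Ashkenazi"
    "16524G": "green", "512C": "green",              # subclades
    "133G": "aqua",                                  # the odd 133G haplotypes
}

def chart_colors(mutations):
    """Return the set of chart colors that apply to one haplotype."""
    colors = {MARKER_COLORS[m] for m in mutations if m in MARKER_COLORS}
    # Assumed notation: insertions at position 524 are written "524.<n><base>".
    if any(m.startswith("524.") for m in mutations):
        colors.add("turquoise")
    return colors

def needs_review(mutations):
    """Two or more colors suggest a marker appearing in a non-defining manner."""
    return len(chart_colors(mutations)) > 1

# Hypothetical entry resembling the green-and-turquoise case discussed above:
# 16524G present, 497T absent, plus a pair of 524 insertions.
entry = {"16524G", "524.1C", "524.2A"}
print(sorted(chart_colors(entry)), needs_review(entry))
```

Flagging two-color haplotypes for a manual look, rather than assigning a subclade automatically, follows the “never say never” caution in the paragraph above.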
Another pair of tables (in one file) of interest are those produced by Tom Glad’s mtDNAtool: An mtDNA Analysis Utility. The Summary Table shows the mutations for each entry plus a count and frequency for each mutation. (The Haplogroup column is blank, but all are K.) The Genetic Distance Report allows each entry to be compared to the others, ignoring the second of each pair of 524 insertions and the 523 deletion.
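To illustrate the kind of comparison the Genetic Distance Report makes, here is a minimal Python sketch. It is not mtDNAtool's actual algorithm, which the text does not describe. It simply counts mutations present in one entry but not the other, after dropping the 523 deletion and the second insertion of each 524 pair. The function names and the mutation notation (524.1C, 524.2A, 522-, 523-) are assumptions.

```python
import re

# Assumed notation for insertions at position 524: "524.<n><base>".
_INSERTION_524 = re.compile(r"^524\.(\d+)[ACGT]$")

def comparable(mutations):
    """Drop the 523 deletion and the even-numbered 524 insertions."""
    kept = set()
    for m in mutations:
        if m == "523-":
            continue
        match = _INSERTION_524.match(m)
        if match and int(match.group(1)) % 2 == 0:
            continue  # second member of an insertion pair
        kept.add(m)
    return kept

def genetic_distance(a, b):
    """Count mutations found in exactly one of the two entries."""
    return len(comparable(a) ^ comparable(b))

# Hypothetical entries.
plain = {"497T"}
with_insertions = {"497T", "524.1C", "524.2A"}
with_deletions = {"497T", "522-", "523-"}

print(genetic_distance(plain, with_insertions))
print(genetic_distance(plain, with_deletions))
print(genetic_distance(with_insertions, with_deletions))
```

Under these assumptions a paired insertion or a paired deletion counts as a single difference rather than two.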
For this survey I have created three new charts. The first is a pie chart for some predicted subclades by percentages for the 333 high-resolution entries. The largest fairly easily predicted subclade is K1a1b1a at 11%. Combined with K1a9 at 5% and K2a2a at 4%, the Ashkenazi subclades total 20%. The K1c and K1c2 subclades which tend to be found in
The other two are bar charts for the Ashkenazi subclades and the K1c/K1c2 subclades by European country of origin. The second one lends some support to my theory that K1c was brought to the
All K’s who tested at FTDNA or who transfer their results from the Genographic Project are welcome to join the mtDNA Haplogroup K Project. (Those testing with other companies may join by e-mail request, but their mutations will be listed on the Results page rather than the mtDNA Results page.) Further information is available on our project website.
Many aspects of the above comments have been explained in greater detail in the K500 or K403 surveys referenced above. Other charts and supplements may be produced from the K750 data in the near future.
William R. Hurst
Administrator, mtDNA Haplogroup K Project