PROPOSED K3 SUBCLADE
Until this year, every known full mtDNA haplogroup K sequence had been easily assigned to either subclade K1, defined by 1189C and 10398G, or subclade K2, defined by 146C and 9716C. I reported in June that a sequence (HM852886 in the GenBank database) from Georgia submitted with a paper by Schönberg et al. had none of those defining mutations. So I suggested it be called K*, since as a singleton sequence it could not be termed K3.
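The assignment rule described above can be sketched in a few lines of Python. This is only an illustration, not a haplogroup-calling tool. The function name `assign_k_subclade` and the idea of representing a sequence as a collection of mutation strings are assumptions made for the example. It also presumes the sequence is already known to belong to haplogroup K, and it makes no allowance for back mutations, which a real classifier would have to handle.

```python
# Defining mutations exactly as stated in the text above.
K1_DEFINING = frozenset({"1189C", "10398G"})
K2_DEFINING = frozenset({"146C", "9716C"})


def assign_k_subclade(mutations):
    """Assign a haplogroup K full sequence to K1, K2, or K*.

    `mutations` is an iterable of mutation strings such as "1189C".
    Hypothetical helper for illustration only: it assumes the sequence
    is already known to be haplogroup K and ignores back mutations.
    """
    found = set(mutations)
    if K1_DEFINING <= found:  # both K1-defining mutations are present
        return "K1"
    if K2_DEFINING <= found:  # both K2-defining mutations are present
        return "K2"
    # Neither defining pair is fully present: unassigned within K.
    return "K*"
```

A sequence carrying none of the four defining mutations, as reported for HM852886, falls through to K*.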
Last week I learned of a paper by Zheng et al. that discussed 367 sequences from China and Japan that were part of the “1000 Genomes Project.” Mutation lists for the sequences are in Table S1 of the Supporting Information. Not really expecting to find anything, I searched for a K-defining mutation and found that sequence NA18539, from the Beijing Han Chinese group, was certainly a K but belonged to neither K1 nor K2. I then compared it with the Schönberg sequence and found that the two had eleven mutations in common. Beyond those, the Georgian sequence had six extra mutations and the Chinese sequence had four. The PhyloTree.org rule usually requires three different sequences to qualify as a subclade, but there are exceptions. The large number of shared defining mutations, together with the very different extra mutations in each sequence, should qualify this as a subclade. Therefore, I have proposed it as a new subclade, K3. One remaining problem is that the new sequence is not on GenBank, although the whole-genome sequence from which the mtDNA was retrieved apparently is.
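The comparison described above amounts to splitting two mutation lists into shared and private mutations, which can be sketched with Python sets. The function names `compare_sequences` and `position` are hypothetical. The inputs are deliberately incomplete: they contain only the control-region mutations named elsewhere in this article, not the full lists from GenBank or Table S1, so the counts will not match the eleven, six and four given above. Treating 16239T as absent from the Chinese sequence is an assumption read from the text.

```python
from itertools import takewhile


def compare_sequences(seq_a, seq_b):
    """Split two mutation lists into shared and private mutations."""
    a, b = set(seq_a), set(seq_b)
    return {
        "shared": a & b,   # candidate subclade-defining mutations
        "only_a": a - b,   # extra (private) mutations in the first sequence
        "only_b": b - a,   # extra (private) mutations in the second sequence
    }


def position(mutation):
    """Return the leading numeric position of a mutation such as '16093C'."""
    return int("".join(takewhile(str.isdigit, mutation)))


# Partial, illustrative inputs: only mutations named in this article.
shared_named = ["16093C", "16148T", "16153A", "150T", "195C", "235G", "560T"]
georgian_partial = shared_named + ["16239T"]  # assumed absent from the Chinese sequence
chinese_partial = shared_named + ["16519C"]   # 16519C reported for the Chinese sequence

result = compare_sequences(georgian_partial, chinese_partial)
print("shared:", sorted(result["shared"], key=position))
print("Georgian only:", sorted(result["only_a"], key=position))
print("Chinese only:", sorted(result["only_b"], key=position))
```

Run against the complete mutation lists, the same three sets would give the shared and extra mutations on which the proposal rests.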
The distance from Batumi, Georgia, to Beijing, China, is 4,128 miles. Assuming that the subclade origin was in the Caucasus area, that was quite a journey for our fellow K – a journey that may have occurred thousands of years ago.
Next, I looked for HVR-only (control-region) sequences that matched the two full sequences. I found none in the K Project, relevant geographical projects, MitoSearch.org or SMGF.org. However, I found two on GenBank. The first, GU069324, was from Uzbekistan. It included the subclade’s defining mutations 16093C, 16148T, 16153A, 150T, 195C, 235G and 560T – as well as the usual K mutations. The second, AF285383, from Adygea, a Russian republic in the North Caucasus area, had only HVR1, with 16093C, 16148T and 16153A. It also shared 16239T with the Georgian sequence. Both partial sequences and the Georgian sequence have apparently had a back mutation at position 16519, while the Chinese sequence does have the 16519C mutation common to about 98% of K. Thus, the partial sequences are closer genetically and geographically to the Georgian sequence than to the Chinese one. Here is the tree for the proposed K3:
William R. Hurst
Administrator, mtDNA Haplogroup K Project