Useful DNA Glossary

autosomal pertaining to the numbered human chromosomes, 1-22; that is, all the human chromosomes except the “sex chromosomes”, the xChromosome and the yChromosome

base pair is two chemical bases bonded to one another forming a rung of the DNA ladder. The DNA molecule consists of two strands that wind around each other like a twisted ladder. Each strand has a backbone made of alternating sugar (deoxyribose) and phosphate groups. Attached to each sugar is one of four bases--adenine (A), cytosine (C), guanine (G), or thymine (T). The two strands are held together by hydrogen bonds between the bases, with adenine forming a base pair with thymine, and cytosine forming a base pair with guanine.
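The pairing rule just described (A with T, C with G) can be sketched in a few lines of Python. This is only an illustration; the function name `complement` and the sample strand are made up for the example.

```python
# Pairing rules as stated in the glossary entry: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return, position by position, the base that pairs with each base of `strand`."""
    return "".join(PAIRS[base] for base in strand.upper())

# Hypothetical sample strand, not taken from any real test result.
print(complement("GATTACA"))  # CTAATGT
```

Each letter of the output is the partner of the letter above it, so the two strings together form the rungs of the ladder.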

cell is the basic building block of living things. All cells can be sorted into one of two groups: eukaryotes and prokaryotes. A eukaryote has a nucleus and membrane-bound organelles, while a prokaryote does not. Plants and animals (including humans!) are made of numerous eukaryotic cells, while many microbes, such as bacteria, consist of single cells. An adult human body is estimated to contain between 10 and 100 trillion cells.

centimorgan (cM) is a measurement of how likely a segment of DNA is to recombine from one generation to the next. As a general rule, one centimorgan corresponds to about 1 million base pairs in humans on average. For the autosomal tester, the centimorgan value attached to a matching segment can be considered a measurement of the quality of the match. Generally, the higher the value, the closer the relationship, though there are large uncertainties in any estimate of the relationship.

centromere is a constricted region of a chromosome that separates it into a short arm (p) and a long arm (q). During cell division, the chromosomes first replicate so that each daughter cell receives a complete set of chromosomes. Following DNA replication, the chromosome consists of two identical structures called sister chromatids, which are joined at the centromere.

chromatid is one of two identical halves of a replicated chromosome. During cell division, the chromosomes first replicate so that each daughter cell receives a complete set of chromosomes. Following DNA replication, the chromosome consists of two identical structures called sister chromatids, which are joined at the centromere.

chromosome one of 46 strands of the complete human DNA that constitute the genetic blueprint for each individual, organized into 23 pairs, with one member of each pair inherited from the father, the other from the mother. 22 of these 23 chromosomal pairs are called autosomal chromosomes, while the remaining pair, made up of two xChromosomes in females or an xChromosome and a yChromosome in males, is called the sex chromosomes. Other species have different numbers of chromosomes. The chromosomes of an organism taken as a whole are called the genome.

crossover a process that occurs during the replication of one of a parent’s two chromosomal strands to pass on to the next generation, in which part of the genetic material is taken from the other chromosomal strand instead; since crossover is likely to occur at some point on most chromosomes each generation, over time the segments of DNA passed on from ancestors get smaller and smaller, and eventually frustrate attempts to demonstrate relationship through autosomal DNA testing.

haplogroup the deep ancestry of a particular individual. Haplogroups have a branching tree structure, dividing meta-groups like R, called “clades”, into “subclades” like R1b, or R1b1a2, with each subclade branch defined by the particular sequence of SNP mutations that have accumulated in the genome of the common male ancestor of members of that subclade. As the haplogroup tree has been progressively articulated over the years, the original system nomenclature for subclades has become increasingly unwieldy.

This old nomenclature is now less used, in favor of one that appends to the first, defining, letter of the human haplotree the name of the lowest level SNP that has tested positive. Thus, R1b1a2 is now preferably called R-M269. Since new SNPs are constantly being found, most people haven’t tested the latest of their line, and this is recognized by designating their haplogroup, e.g. R-M269+, while in cases where all the more recent (subordinate) SNPs have been tested, but come up negative, their haplogroup would be designated, e.g. R-M269*.

haplotype a set of ySTR/mtDNA marker values associated with a particular individual. ySTR marker values (also called alleles) are determined by testing a subset of highly mutable microsatellite sites on the yChromosome called ySTRs.

match, half-identical said of two humans who share at least one allele value at a particular SNP site. Long consecutive stretches of half-identical sampled SNPs, measured in cMs (centimorgans), are indicative of a shared descent from a common ancestor. The term HIR is sometimes used to mean half-identical region, whose length may be quantified either in cMs or in the number of SNPs.
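As a rough sketch of the half-identical idea, the Python below checks each SNP site for a shared allele and reports the longest consecutive half-identical stretch, counted in SNPs. The genotype lists and function names are invented for illustration. Real matching tools also convert the stretch to cMs and apply thresholds, which this sketch does not attempt.

```python
def half_identical(g1: str, g2: str) -> bool:
    """True if two genotypes (e.g. "AG" and "GG") share at least one allele."""
    return bool(set(g1) & set(g2))

def longest_half_identical_run(person_a, person_b) -> int:
    """Length, in SNPs, of the longest consecutive half-identical stretch."""
    best = current = 0
    for g1, g2 in zip(person_a, person_b):
        # Extend the current stretch on a shared allele, otherwise start over.
        current = current + 1 if half_identical(g1, g2) else 0
        best = max(best, current)
    return best

# Hypothetical genotypes at six consecutive SNP sites for two testers.
a = ["AA", "AG", "CC", "TT", "CT", "GG"]
b = ["AG", "GG", "CT", "AA", "CC", "AG"]
print(longest_half_identical_run(a, b))  # 3
```

The fourth site (TT against AA) shares no allele, so it breaks the stretch into one run of three sites and one run of two.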

match, IBD (Identical By Descent) confusing jargon for inherited, typically used to characterize a particular stretch of DNA that is known to have been inherited from some relatively recent ancestor, as opposed to the same stretch of DNA that is IBS (Identical By State), meaning simply identical between two individuals and not known to have been inherited from a common ancestor. Comparing results of the same area with other known ancestors or benchmarks allows one to deduce common ancestors through triangulation techniques.

marker (in the context of DNA testing) a stretch of DNA whose allele values are sampled as a means of identifying individuals or placing individuals within (deep) patrilineages. For STR markers, the stretch is a pattern in the DNA sequence which repeats over and over again in tandem, i.e., right after each other. Typically the repeat motif is less than six base pairs long.

By counting the repeats, one gets an allele value, which is reported in an individual's haplotype. (STRs are also known as microsatellites and simple sequence repeats (SSRs).) When viewing Y-DNA results, the values found for each of the markers are what is compared to those of others who have taken the same test. In the author's results, you find DYS393=13, DYS390=22, DYS438=11, etc. As technology has advanced, the number of markers that are viable for consumer testing has increased. The author first started with a 12-marker Y-DNA test, and the same process has since advanced to 111-marker tests. Of greater personal interest to the author, he recently found a Y-DNA test match of 109/111 markers. In a genetic genealogy context, this is a remarkable test result.
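A figure like 109/111 is simply a count of markers whose values agree. The sketch below shows that count for three markers. The first haplotype uses the author's values quoted above, while the second haplotype and the function name `count_matching_markers` are hypothetical.

```python
def count_matching_markers(h1: dict, h2: dict) -> tuple:
    """Return (markers with equal values, markers tested in both haplotypes)."""
    shared = sorted(set(h1) & set(h2))
    matches = sum(1 for marker in shared if h1[marker] == h2[marker])
    return matches, len(shared)

# The author's values as quoted in this glossary.
author = {"DYS393": 13, "DYS390": 22, "DYS438": 11}
# Hypothetical second tester, differing by one repeat at DYS390.
other = {"DYS393": 13, "DYS390": 23, "DYS438": 11}

print(count_matching_markers(author, other))  # (2, 3)
```

Only markers tested by both people are compared, which is why the result is reported as a pair such as 2 out of 3.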

nucleotide one of the basic chemical units of DNA, each carrying one of four bases, denominated “A”, “G”, “C”, and “T”; these four letters constitute the alphabet of the genetic code.

organelle is a subcellular structure that has one or more specific jobs to perform in the cell, much like an organ does in the body. Among the more important cell organelles are the nuclei, which store genetic information; mitochondria, which produce chemical energy; and ribosomes, which assemble proteins.

SNP (Single Nucleotide Polymorphism) an observed difference in allele values between single nucleotides on the chromosomal strands of two individuals of the same species. The term is also used to refer to the paired nucleotides, or "base pair" of the nuclear DNA of an individual of a diploid species, like we humans, who inherit a copy of each chromosome from each of our parents. In autosomal testing for genealogical purposes, large numbers of SNP sites (base pairs) are sampled across whole chromosomes in two individuals, with the aim of identifying long half-identical stretches that are likely indicative of shared DNA from a common ancestor.

STR (Short Tandem Repeat) a type of DNA sequence, here on the (male) yChromosome, composed of multiple copies (or repeats) of the same multi-nucleotide sequence; in the context of testing for genealogical purposes they are more familiarly called “markers”. Sets of these ySTR markers are preferred for constructing test haplotypes for genetic genealogical purposes. Several hundred of these ySTR sites, or markers, have been identified, but only 120 or so are currently being tested for genealogical purposes.
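To make the idea of counting repeats concrete, here is a minimal Python sketch that finds the longest back-to-back run of a motif in a sequence. The motif "GATA", the flanking letters, and the function name are assumptions chosen for illustration; they are not the actual sequence of any named marker, and testing labs do not work from plain strings like this.

```python
import re

def max_tandem_repeats(sequence: str, motif: str) -> int:
    """Largest number of back-to-back copies of `motif` found in `sequence`."""
    # Each match is one uninterrupted run of the motif.
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)

# Made-up sequence: 13 tandem copies of a hypothetical 4-base motif
# with a few unrelated bases on either side.
seq = "TTC" + "GATA" * 13 + "CCA"
print(max_tandem_repeats(seq, "GATA"))  # 13
```

The number printed is the kind of allele value that would be recorded for a marker, as in a result such as DYS393=13.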

