Glossary of Genetic Terms - 2014
Allele: Broadly, one of the alternative forms of a gene or genetic marker. More narrowly, the term allele value refers to a count of the number of repeats in an STR (pronounced ess-tee-are). A list of marker labels and their associated allele values constitutes an individual's haplotype.

Ancestral or Negative: The designation given a SNP when DNA testing determines that the SNP mutation is absent.

Anthrogenealogy: The tracing of human lineage beyond the limits of historical records through DNA testing, specifically SNP testing to determine haplogroups.

Anthropological Time Frame: A time frame from more than 1,000 to tens of thousands of years ago, which predates recorded history and surnames for most people. The Y-DNA haplogroup tree traces SNP mutations over anthropological time.

Anthropology: The science of human beings, especially the study of human beings in relation to distribution, origin, and classification.

Base: A small chemical molecule which is the information portion of the nucleotides in DNA. The chemical bases are A (Adenine), T (Thymine), C (Cytosine), and G (Guanine).

Base Pair (bp): A (Adenine) pairs with T (Thymine) and C (Cytosine) pairs with G (Guanine). These base pairs form the ladder of the DNA molecule.
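
The pairing rule can be written as a small lookup table. The following is a minimal sketch, not part of the source definitions; the names PAIRS and complement are chosen here purely for illustration.

```python
# Pairing rule from the entry above: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return, position by position, the base that pairs with each base of `strand`."""
    return "".join(PAIRS[base] for base in strand.upper())


print(complement("GATTACA"))  # CTAATGT
```

Applying complement twice returns the original strand, which reflects the fact that each rung of the ladder is fully determined by either of its two bases.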

Branch: A specific area on a haplogroup or phylogenetic tree that has associated SNPs. For example, E1, E2, and E3 are all branches of haplogroup E.

Chromosome: The self-replicating genetic structure of cells containing the cellular DNA that bears in its nucleotide sequence the linear array of genes. Chromosomes are normally found in pairs; human beings typically have 23 pairs of chromosomes.

Clade: From the Greek word klados, meaning branch. A clade on the Y chromosome tree is also called a haplogroup.

Coalescence Age: See TMRCA under MRCA. The term coalescence often implies a distant MRCA (thousands of years ago), and statistical models may add more parameters beyond the mutation rate, such as population growth or bottlenecks.

Confirmed SNP: See definition of confirmed SNP in Requirements for SNP Inclusion.

Deep Ancestry: Ancestry in an anthropological time frame, from more than 1,000 to tens of thousands of years ago, which predates recorded history and surnames for most people. The Y-DNA haplogroup tree traces SNP mutations to show deep ancestry.

Derived or Positive: The designation given a SNP when DNA testing determines that the SNP mutation is present.

DNA (deoxyribonucleic acid): The large molecule inside the nucleus of a cell that carries genetic instructions for making living organisms. See Y-DNA.

Downstream: A term used in association with a haplogroup or phylogenetic tree to designate the relationship between two SNPs or two branches or two clades, with downstream being the descriptor for the object that succeeds the first. The downstream item cannot exist without the existence of the upstream item.

DYS-DYF-DYZ: Markers that have been submitted to the HUGO gene nomenclature committee are given catalog numbers, more or less in order of discovery. D stands for DNA, Y stands for Y chromosome, and S/F/Z stand for different properties of the marker. However, the nomenclature has not always been applied consistently, as spelled out in a message by Thomas Krahn on the GENEALOGY-DNA mailing list.

Gene: The fundamental physical and functional unit of heredity. A gene is an ordered sequence of nucleotides located in a particular position on a particular chromosome that encodes a specific functional product.

Genetic Genealogy: The tracing of human lineage within the time frame of historical records through DNA testing and comparison of haplotypes.

Genealogical Time Frame: A time frame within roughly the last 500 to 1,000 years, since the adoption of surnames and written family records. An individual's haplotype is useful within this time frame and is compared to others to help identify branches within a family.

Haplogroup: A population descended from a common ancestor, as evidenced by specific SNP mutations. Haplogroups are not cultural groups, although a haplogroup can be strongly represented by a cultural population such as American Indians. The Y Chromosome Consortium (YCC) has assigned hierarchical alphanumeric labels, which can be presented graphically in the form of a phylogenetic or haplogroup tree.

Haplogroup Tree: A diagram showing evolutionary lineages of organisms. See also Phylogenetic Tree.

Haplotype: Broadly, the complete set of results obtained from multiple markers located on a single chromosome. For the Y chromosome, the term is restricted by convention to allele values (number of repeats) obtained from microsatellite (STR) markers, as described by the Y Chromosome Consortium (YCC).
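
As an illustration, a Y-STR haplotype can be modeled as a mapping from marker label to allele value, and two haplotypes can be compared marker by marker. This is a sketch under stated assumptions: the marker labels are real DYS names, but the allele values, the variable names, and the count_matches helper are invented for this example.

```python
# Hypothetical haplotypes: marker label -> allele value (number of repeats).
# The values below are made up for illustration only.
person_a = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS388": 12}
person_b = {"DYS393": 13, "DYS390": 23, "DYS19": 14, "DYS388": 12}


def count_matches(h1: dict, h2: dict) -> tuple:
    """Return (matching markers, markers compared) over the markers both haplotypes report."""
    shared = sorted(set(h1) & set(h2))
    matches = sum(1 for marker in shared if h1[marker] == h2[marker])
    return matches, len(shared)


matches, compared = count_matches(person_a, person_b)
print(f"{matches} of {compared} markers match")  # 3 of 4 markers match
```

Only markers present in both haplotypes are compared, since two individuals may have been tested on different marker panels.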

Human Genome Project: An international research project to map each human gene and to completely sequence human DNA, i.e. to sequence the entire human genome, the complete complement of all genetic material in the human species. The human genome sequencing was completed in 2003.

HUGO: Established in 1989, the Human Genome Organization (HUGO) is the international organisation of scientists involved in human genetics.

LINE: See Long Interspersed Nucleotide Element.

Long Interspersed Nucleotide Element (LINE): A Unique Event Polymorphism which is several thousand bases in length, with about 500,000 copies scattered over the human genome.

Marker: An identifiable physical location on a chromosome that is variable between individuals and whose inheritance can be monitored. A term commonly used along with allele values in describing an individual's haplotype. Marker labels, such as M173 or DYS388, have no intrinsic meaning.

Meiosis: The process of two consecutive cell divisions in the diploid progenitors of sex cells. Meiosis results in four rather than two daughter cells, each with a haploid set of chromosomes.

Microsatellite: Repetitive stretches of short sequences of DNA used as genetic markers to track inheritance in families. They are short sequences of nucleotides which are repeated over and over again a number of times in tandem. They are also called Short Tandem Repeats (STRs) and Simple Sequence Repeats (SSRs).

MRCA -- Most Recent Common Ancestor: In this context, it refers to the straight paternal line. The MRCA of brothers is their father, the MRCA of first cousins is their grandfather, and so forth. If the genealogy is not known, the time to the MRCA (TMRCA) can be estimated statistically, using the mutation rate of markers on the Y chromosome. On the average, the closer the DNA match, the more recent the TMRCA. The 95% confidence interval for the TMRCA typically covers a very wide range. If a single value is stated, it is usually the median value (50% of pairs who match each other at X out of Y markers will find their common ancestor within G generations, but 50% will have to keep on looking).
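
To show why the confidence interval is so wide, here is a deliberately simplified sketch of a TMRCA estimate. It is not the method of any particular testing company or published model. It assumes two men who match exactly on every marker tested, a single mutation rate shared by all markers, no prior knowledge about the number of generations, and a continuous approximation to the generation count. The function name and the rate of 0.002 mutations per marker per generation are illustrative assumptions.

```python
import math


def generations_to_mrca(n_markers: int, mutation_rate: float, probability: float = 0.5) -> float:
    """Generations within which the MRCA falls with the given probability.

    Simplified model (an assumption, see the text above): the chance that
    g generations pass with no mutation on either line is
    (1 - mutation_rate) ** (2 * n_markers * g).
    """
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    # Log of the chance that one generation passes with no mutation
    # at any marker on either of the two paternal lines.
    log_no_change = 2 * n_markers * math.log(1 - mutation_rate)
    return math.log(1 - probability) / log_no_change


median = generations_to_mrca(37, 0.002)          # 50% of such pairs
upper = generations_to_mrca(37, 0.002, 0.95)     # 95% of such pairs
print(f"median: {median:.1f} generations, 95% bound: {upper:.1f} generations")
```

Under these assumptions a perfect 37-marker match gives a median of about 5 generations, but the 95% bound is about 20 generations, roughly four times as far back.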

Mutation: A permanent structural alteration or change in the DNA sequence. Mutations in the sperm or egg are called germline mutations. Germline mutations in the Y chromosome of the male are passed on to all of his male-line descendants. Mutations that occur after conception are called somatic mutations; these mutations may be found in different tissues of the body and they are not passed on to offspring.

Negative or Ancestral: The designation given a SNP when DNA testing determines that the SNP mutation is absent.

NRY: Non-recombining Y, the large central portion of the Y chromosome that does not exchange material with the X chromosome.

Nucleotide: A sub-unit of DNA made of a molecule of sugar, a molecule of phosphoric acid, and a molecule called a base.

Nucleus: The central cell structure that houses DNA packaged in chromosomes.

Paternal Line of Descent: A direct line of descent from ancestral father to son to son to son along an all-male line, which is traced through Y-DNA.

Phylogenetic Tree: A diagram showing evolutionary lineages of organisms. See also Haplogroup Tree.

Phylogenetically Equivalent: A term used when describing the relationship between two or more SNPs; specifically, SNPs that belong on the same branch (or clade) of a haplogroup or phylogenetic tree. For example, Y-DNA haplogroup M is defined by the SNPs M4, M5, M106, M186, M189, and P35, which are said to be phylogenetically equivalent.

Polymorphism: A variation in the sequence of genetic information on a segment of DNA.

Population Genetics: The study of the genetics of groups of individual organisms, often shown through the graphic of a haplogroup or phylogenetic tree.

Positive or Derived: The designation given a SNP when DNA testing determines that the SNP mutation is present.

Provisional SNP: See definition of provisional SNP in Requirements for SNP Inclusion.

Private SNP: See definition of private SNP in Requirements for SNP Inclusion.

Short Tandem Repeats (STR - pronounced ess-tee-are): Patterns in the DNA sequence which repeat over and over again in tandem, i.e., right after each other. Typically the repeat motif is less than six (6) base pairs long. By counting the repeats, one gets an allele value which is given in an individual's haplotype. They are also called microsatellites and Simple Sequence Repeats (SSRs).
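
Counting the repeats can be sketched in a few lines. The example below is illustrative only: the DNA fragment is invented, the function name allele_value is chosen here, and real laboratory conventions for reporting allele values at a given marker can be more involved than a raw count.

```python
import re


def allele_value(sequence: str, motif: str) -> int:
    """Return the length, in repeats, of the longest run of back-to-back copies of `motif`."""
    runs = re.finditer(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run.group()) // len(motif) for run in runs), default=0)


# Invented fragment: two flanking bases, four tandem copies of GATA, two more bases.
fragment = "AA" + "GATA" * 4 + "CC"
print(allele_value(fragment, "GATA"))  # 4
```

The function returns 0 when the motif does not occur, and it counts only copies that sit directly next to each other, which is what "in tandem" means.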

Single Nucleotide Polymorphism (SNP which is pronounced 'snip'): Variation in the nucleotide allele at a certain nucleotide position in the human genome. When the change occurs it is called a polymorphism, and polymorphisms accumulate over time. A polymorphism can be very common (found in a significant fraction of global or localized populations) or very rare (found in a single individual). Common variations are used to track the evolution of the human genome over time (population genetics) and can be graphically represented in a haplogroup or phylogenetic tree.

Sister clade: A term used to describe clades that are on the same level of a haplogroup or phylogenetic tree. For example, R1b1c1, R1b1c2, R1b1c3, etc. would be considered sister clades.

SNP (which is pronounced 'snip'): See Single Nucleotide Polymorphism.

SSR (which is pronounced ess-ess-are): SSR (Simple Sequence Repeats) is synonymous with STR. See Short Tandem Repeats.

STR (which is pronounced ess-tee-are): See Short Tandem Repeats.

Sub-branch: A term to describe the relationship between two branches with the sub-branch being downstream. See also Subclade.

Subclade: A term to describe the relationship between two clades with the subclade being downstream. See also Sub-branch.

UEP: See Unique Event Polymorphism.

Unique Event Polymorphism (UEP): A mutation which is treated as if it occurred only once in all of human history, so that all persons sharing the mutation descend from a common ancestor. Most UEPs are Single Nucleotide Polymorphisms (SNPs), while some are insertions or deletions (for examples, see LINE and YAP).

Upstream: A term used in association with a haplogroup or phylogenetic tree to designate the relationship between two SNPs or two branches or two clades, with upstream being the descriptor for the object that precedes the second. The upstream item must exist before the creation of the downstream item.
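
The upstream and downstream relationships can be sketched with a toy tree stored as child-to-parent links. The labels E, E1, E2, and E3 follow the Branch example in this glossary; the E1a entry is included only to give the toy tree a second level, and the names PARENT, is_upstream, and is_downstream are assumptions made for this illustration.

```python
# Toy tree: each branch maps to the branch immediately upstream of it.
PARENT = {"E1": "E", "E2": "E", "E3": "E", "E1a": "E1"}


def is_upstream(a: str, b: str) -> bool:
    """True if branch `a` lies on the path from branch `b` back to the root."""
    node = PARENT.get(b)
    while node is not None:
        if node == a:
            return True
        node = PARENT.get(node)
    return False


def is_downstream(a: str, b: str) -> bool:
    """True if branch `a` descends from branch `b`."""
    return is_upstream(b, a)


print(is_upstream("E", "E1a"))    # True
print(is_downstream("E1a", "E1"))  # True
print(is_upstream("E2", "E1a"))   # False
```

The last call returns False because E2 and E1a sit on different branches: neither precedes the other, so neither is upstream or downstream of the other.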

X chromosome: One of two types of sex-determining chromosomes, the other being the Y chromosome. When two X chromosomes, one from each parent, are paired with each other in a fertilized egg cell, the resulting child will be female. If the fertilized egg cell contains both an X and a Y chromosome, the resulting child will be male. The X chromosomes become subject to cross-over effects during subsequent egg cell creation in the female offspring, so the homologous gene alleles and genetic marker alleles on the two X chromosomes can randomly swap positions in later generations. This makes it very difficult to track a particular X chromosome, or to determine a common ancestor for it, beyond a couple of generations; therefore, the X chromosome is not a very useful tool for genetic genealogy purposes.

YAP: See Y Alu Polymorphism.

Y Alu Polymorphism (YAP): A Unique Event Polymorphism that is an insertion of a few hundred base pairs. There are about a million Alu inserts scattered throughout the human genome.

YCC: Y Chromosome Consortium, a committee formed to standardize haplogroup nomenclature.

Y chromosome: The Y chromosome is the chromosome that makes a person a male and can be passed by a male only to his sons. It differs from all other chromosomes in that the majority of the chromosome is unique and does not recombine during meiosis (see NRY, or non-recombining Y). This means the historical pattern of mutations can easily be studied.

Y-DNA: The DNA in the Y chromosome that can be passed by a male only to his sons. This DNA can be tested to determine both haplotype and haplogroup of the individual.

Definitions adapted from the following references:
1. Avise, J.C. 1994. Molecular Markers, Natural History and Evolution. Chapman and Hall, New York.
2. Hartl, D. L. 2000. A Primer of Population Genetics (3rd ed.). Sinauer Associates, Sunderland, MA.
3. Kerchner, C. F. Jr. 2004. Genetic Genealogy DNA Testing Dictionary. C. F. Kerchner & Associates, Inc. Emmaus, PA.
4. National Human Genome Research Institute

