Listing Criteria for SNP Inclusion
into the ISOGG Y-DNA Haplogroup Tree - 2015
The entire work is identified by the Version Number and date given on the
Main Page. Directions for citing the document are given at
the bottom of the Main Page.
Version History
Last revision date for this specific page: 23 August 2015
These recommendations are intended to ensure a uniform set of criteria for accepting new mutations for inclusion
on the ISOGG Y-DNA haplogroup tree.
Because of the abundance of alternatives now available, only single nucleotide polymorphisms (SNPs) are accepted
as new additions; insertions and deletions (indels) are not. In exceptional cases other variants may be
considered for inclusion on a case-by-case basis if they can be clearly demonstrated to have properties equivalent
to SNPs, but the burden of proof will be much higher and acceptance is at the discretion of the committee.
Special Coding for Interpreting SNP status
This coding may need to be modified when the tree moves from .html coding to a database.
- Added SNPs are color coded red and defined as SNPs that have met all of the
criteria for inclusion and did not appear on last year's tree.
- Provisional SNPs are color coded purple and defined as: SNPs newly submitted
to the ISOGG committee that have sufficient information to be at an approximate tree position but require
better evidence for exact placement.
- SNPs under Investigation are color coded pink and are SNPs that have not yet
been placed on the tree because additional testing is needed to confirm adequate positive samples and/or
correct placement on the tree.
- Private SNPs are color coded blue and are either SNPs with fewer than two
samples or SNPs that do not meet the STR diversity requirements described under 1. Sanger Sequencing.
- SNPs found solely from next generation sequencing are colored either black or red and shown in
italics; they indicate quality, consistent reads found in Y sequencing. These are not confirmed by Sanger
sequencing or microarray testing and sometimes may not be amenable to either process.
- SNP(s) printed in bold in a subclade: The criterion for printing a representative SNP in bold for a
subclade is that it has traditionally represented that subgroup or, if in italics, seems the most promising
representative. Bolded items were frequently confirmed by Sanger sequencing.
- Identical SNPs are SNPs that have the same y-position, mutation, and subclade within a haplogroup and were
discovered in different labs. They are listed in alphabetical order (not necessarily in the order of
discovery) and are separated by "/". Examples: P257/U6, L31/S149.
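The status coding and the naming rule for identical SNPs could be represented in data roughly as follows. This is only a sketch: the names `SnpStatus` and `identical_snp_label` are hypothetical, and "alphabetical order" is assumed to mean plain string comparison.

```python
from enum import Enum


class SnpStatus(Enum):
    """Status categories and the display color listed for each (hypothetical type)."""
    ADDED = "red"
    PROVISIONAL = "purple"
    UNDER_INVESTIGATION = "pink"
    PRIVATE = "blue"


def identical_snp_label(names):
    """Join the names of identical SNPs alphabetically, separated by '/'.

    Assumption: alphabetical order is plain string comparison.
    """
    return "/".join(sorted(names))


label = identical_snp_label(["U6", "P257"])
```

Here `label` is the string used for the first example pair in the list above, regardless of which lab discovered the SNP first.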
General Requirements for SNP Validation
The requirements listed in this section apply to validating the SNPs discussed in the next section,
Requirements for Specific Type of Testing.
- Inserting a SNP Creating a Non-Terminal Branch to the ISOGG Tree
The supporting information provided by the proposer should demonstrate that the new SNP is downstream of
an established tree mutation. It must also be shown that the SNP was tested in individuals from all
parallel subgroups on the tree. In cases where relevant existing tree subgroups are from rare populations
and based solely on old research listing only one sample proving the existence of the SNP, an exception may
be granted for testing of the old subgroup. The mutations of the existing subgroup will then be listed
Example: Suppose that a new subgroup is being added with name of Q18.
Then the evidence for Q18 must show that a man is derived for both Q18 and L140. Simultaneously one man each
from L1266 and L13 must be ancestral for Q18. In addition, one man derived for Q18 must be derived for L1268,
and a second Q18 man ancestral for L1268. Derived means the mutation is present; ancestral means it is absent.
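The evidence pattern in the Q18 example can be expressed as a small check. This is a minimal sketch, not a committee tool: the function name and the data layout are hypothetical, and the tree relationships (L140 upstream of Q18, L1266 and L13 as parallel subgroups, L1268 downstream) are inferred from the wording of the example.

```python
from typing import Dict, List

# One tested man: SNP name -> True (derived) or False (ancestral).
# SNPs that were not tested are simply absent from the dict.
Sample = Dict[str, bool]


def supports_non_terminal(samples: List[Sample], new_snp: str, upstream: str,
                          parallel: List[str], downstream: str) -> bool:
    """Check the evidence pattern from the Q18 example (illustrative only)."""
    derived_men = [s for s in samples if s.get(new_snp) is True]
    ancestral_men = [s for s in samples if s.get(new_snp) is False]

    # A man derived for both the new SNP and the established upstream SNP.
    below_upstream = any(s.get(upstream) is True for s in derived_men)
    # One man from each parallel subgroup who is ancestral for the new SNP.
    parallels_ok = all(
        any(s.get(p) is True for s in ancestral_men) for p in parallel
    )
    # Carriers of the new SNP split on the downstream SNP.
    has_downstream = any(s.get(downstream) is True for s in derived_men)
    lacks_downstream = any(s.get(downstream) is False for s in derived_men)

    return (below_upstream and parallels_ok
            and has_downstream and lacks_downstream)


evidence = [
    {"L140": True, "Q18": True, "L1268": True},
    {"L140": True, "Q18": True, "L1268": False},
    {"L1266": True, "Q18": False},
    {"L13": True, "Q18": False},
]
result = supports_non_terminal(evidence, "Q18", "L140",
                               ["L1266", "L13"], "L1268")
```

Dropping the L13 man from `evidence` makes the check fail, because one parallel subgroup is then untested for Q18.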
- Adding a SNP Representing a New Terminal Branch to the ISOGG Tree
In the case where the new SNP is the terminal branch of an existing branch then:
- at least one individual who has the new SNP is found also to have a SNP defining the immediate
- at least one individual from any parallel subgroup to the new subgroup is found also to lack
the submitted SNP.
Example: Suppose that a new subgroup is being added with name QQ12.
Then the evidence for QQ12 must show that two men are derived for QQ12. Simultaneously one man from P343
must be ancestral for QQ12. Also, one of the QQ12 men must be derived for L5432.
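The QQ12 example can be sketched the same way. The function name is hypothetical, and the roles of L5432 (immediately upstream of QQ12) and P343 (a parallel subgroup) are assumptions read off the example.

```python
from typing import Dict, List

# One tested man: SNP name -> True (derived) or False (ancestral).
Sample = Dict[str, bool]


def supports_terminal(samples: List[Sample], new_snp: str, parent: str,
                      parallel: str) -> bool:
    """Check the evidence pattern from the QQ12 example (illustrative only)."""
    carriers = [s for s in samples if s.get(new_snp) is True]

    # Two men derived for the new SNP, as in the example.
    two_carriers = len(carriers) >= 2
    # At least one carrier is also derived for the upstream SNP.
    carrier_under_parent = any(s.get(parent) is True for s in carriers)
    # At least one man from the parallel subgroup lacks the new SNP.
    parallel_lacks = any(
        s.get(parallel) is True and s.get(new_snp) is False for s in samples
    )
    return two_carriers and carrier_under_parent and parallel_lacks


evidence = [
    {"L5432": True, "QQ12": True},
    {"QQ12": True},
    {"P343": True, "QQ12": False},
]
result = supports_terminal(evidence, "QQ12", "L5432", "P343")
```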
Requirements for Specific Type of Testing
Reference giving details about Y-DNA SNP testing companies:
Y-DNA SNP testing chart
Reference giving details about Y-DNA STR testing companies:
Y-DNA STR testing chart
- Sanger Sequencing
Examples of Sanger sequencing are the tests at the company ySeq and the Advanced Tests (SNP) at Family Tree DNA.
STR testing is available, for instance, at Genebase and Family Tree DNA. Acceptable testing for this category
consists of Sanger sequencing which targets a short segment of Y-DNA.
The objective of the ISOGG Tree at this time is to include all SNPs that arose prior to about the year 1500 C.E.
This guideline may be measured through STR diversity or alternative evidence.
Where a new terminal subgroup is being added, STR marker results or other evidence described below for two men
with the new SNP are needed.
To be accepted the SNP must be observed in at least two individuals and must meet the STR diversity requirement.
A SNP that does not meet this requirement will be classified as a Private SNP (see definition above).
The STR diversity requirement is met if the following conditions are satisfied:
- If the SNP is a Non-Terminal Branch SNP, no further proof of diversity is required.
- Genetic distance is calculated using the Infinite Alleles Model (IAM). A marker for which there is a null value
in one sample must be discarded from the calculations. Otherwise, most laboratories use the IAM.
- All markers tested by both individuals must be compared.
- If 74 markers (or fewer) are compared, the minimum genetic distance to meet the diversity
requirement is 5.
- If 75 (or more) markers are compared, the diversity requirement is a minimum of 7%, computed by
dividing the genetic distance by the number of markers compared and rounding the resulting percentage to the
nearest integer value.
If the submitter can otherwise provide evidence that the common ancestor of the two samples can be reasonably
expected to have lived more than 500 years ago, this will also be considered.
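The STR diversity conditions above can be sketched as a short calculation. Several points are assumptions rather than part of the criteria: the function names are hypothetical, each marker is treated as a single integer value (multi-copy markers are not handled), a marker is dropped if either sample has a null value, and ties in rounding are rounded up, since the text does not say how to break them.

```python
from typing import Dict, Optional, Tuple

# Marker name -> repeat count; None marks a null value.
Haplotype = Dict[str, Optional[int]]


def iam_distance(a: Haplotype, b: Haplotype) -> Tuple[int, int]:
    """Return (genetic distance, markers compared) under the IAM.

    Under the Infinite Alleles Model any difference at a marker counts
    as one step, whatever its size.
    """
    shared = [m for m in a
              if m in b and a[m] is not None and b[m] is not None]
    distance = sum(1 for m in shared if a[m] != b[m])
    return distance, len(shared)


def meets_str_diversity(a: Haplotype, b: Haplotype) -> bool:
    distance, compared = iam_distance(a, b)
    if compared == 0:
        return False
    if compared <= 74:
        return distance >= 5
    # Percentage rounded to the nearest integer (ties rounded up: assumption).
    percent = (200 * distance + compared) // (2 * compared)
    return percent >= 7


base = {f"M{i}": 12 for i in range(111)}
other = dict(base)
for i in range(8):
    other[f"M{i}"] = 13
other["M110"] = None  # null value: this marker is dropped from the comparison
distance, compared = iam_distance(base, other)
passes = meets_str_diversity(base, other)
```

In this made-up pair, 110 markers are compared and 8 differ, which is about 7.3% and rounds to 7%, so the pair meets the requirement. With only 7 differences over 111 markers the figure is about 6.3%, which rounds to 6% and fails.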
- Next Generation Sequencing
Next generation sequencing is available to the genealogical community from Full Genomes Corporation and through
Family Tree DNA's Big Y test. Next generation sequencing has the largest coverage of any type of SNP
testing currently available.
- The committee recognizes that sequencing information is available in a wide variety of forms.
Because of this, no specific criteria for sequencing information are provided here. The goal of
the reviewers of the sequencing submissions – at one extreme – will be to easily accept quality SNPs from old,
root branches found in many samples within all the downstream branches. At the opposite extreme, it is unlikely
reviewers will accept SNPs near or in terminal branches whose positions depend on the results from one sample.
- The submitter must provide the raw data report(s) pertaining to the sequencing. Two
examples of raw data reports are a VCF file showing the usual quality scores, DP scores for depth of
reads, etc. for the involved sample and pertinent additional ones, including ones from other haplogroups, or
instead the so-called “haplogroup compare report” from Full Genomes Corp. Results from Sanger sequencing or
from microarray products, such as Geno 2.0 or Chromo 2.0, might be acceptable comparative information in
certain cases. Having a large number of pertinent comparative samples on a VCF report can improve the
- The reviewer will have to take into consideration the coverage of the next generation sequencing,
varied quality scorings, position of the site on the chromosome, the percentage of samples with clean reads at
the site in question, possible indel relationships to the SNP, geographical separation of the samples, non-next
generation sequencing testing, results for the SNP site in other reports, and other factors in making a complex
judgment as to whether the submitted SNP is almost certain to show the same results in next generation sequencing
of new comparable samples.
- More precise criteria for next generation sequencing submissions may be provided as evidence
- When a new SNP creating a new terminal branch is being added to the tree, at least two of the
submitted samples must each have an average of 3 unique (singleton) SNPs per 10 million base pairs of sequencing
coverage. Reviewers will determine uniqueness according to comparisons to all available sequencing results
rather than samples tested at a particular laboratory.
- If the evidence for the SNP is based solely on next generation sequencing, the SNP will appear
in italics on the tree.
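The singleton requirement for a new terminal branch can be illustrated with a short calculation. This sketch reads "an average of 3" as "at least 3", which is an assumption. The function names are hypothetical, and the singleton counts are taken as given, since deciding uniqueness is left to the reviewers.

```python
from typing import List, Tuple


def singleton_rate(singletons: int, covered_bp: int) -> float:
    """Singleton SNPs per 10 million base pairs of sequencing coverage."""
    return singletons * 10_000_000 / covered_bp


def meets_singleton_requirement(samples: List[Tuple[int, int]],
                                minimum: float = 3.0) -> bool:
    """samples: (singleton count, covered base pairs) for each submitted sample.

    At least two samples must each reach the minimum rate
    (interpreting "an average of 3" as "at least 3": assumption).
    """
    qualifying = [s for s in samples if singleton_rate(*s) >= minimum]
    return len(qualifying) >= 2


# Made-up figures for three submitted samples.
submitted = [(4, 10_000_000), (5, 14_000_000), (1, 9_000_000)]
ok = meets_singleton_requirement(submitted)
```

The first two made-up samples have rates of 4.0 and about 3.6 per 10 million base pairs, so the requirement is met even though the third sample falls short.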
- Microarray Chip-based Genotyping
Examples of microarray chip-based genotyping are the Geno 2.0 or Geno 2.0 Next-Generation test, 23andMe, Chromo 2.0 and Family Tree DNA's Deep Clade
panels. Microarray chips target a selected group of SNPs.
- Novel SNPs found only in microarray products, with no presence in other qualifying sources
such as Sanger sequencing or next generation sequencing, cannot be submitted. However, chip-based genotyping
results can be used in combination with Sanger sequencing and/or next generation sequencing results as validating
evidence for one of the samples. If chip-based genotyping is part of the evidence, the approved SNP will be
listed in regular type, rather than italics, even if the other evidence is from next generation sequencing.
- Samples from chip-based genotyping used to prove a new terminal branch must meet the criteria
for STR diversity described in the Sanger sequencing section.
Acceptance Process for Placing a SNP on the ISOGG Y-DNA Haplotree
The discoverer of the SNP (or a knowledgeable third party) can email the Contact Person listed on the appropriate
haplogroup page and describe where the new SNP fits in the tree. The haplogroup experts will evaluate the evidence
for inclusion on the tree. If the information on tree placement is insufficient, it will be listed as
investigational in the section under the tree. If the Contact Person is not available, contact
Corrections/Additions made since 1 January 2015:
- Identified additional tests under microarray testing and next-generation sequencing on 21 August 2015.