Autosomal DNA statistics

From ISOGG Wiki

An understanding of autosomal DNA statistics is helpful when interpreting results from an autosomal DNA test. Autosomal DNA is inherited from both parents. It is randomly shuffled in a process called recombination, and the percentage of autosomal DNA inherited from any given ancestor is diluted with each new generation.

Autosomal DNA tests for finding cousins and verifying relationships for genetic genealogy purposes are offered by 23andMe, AncestryDNA and Family Tree DNA (the Family Finder test). For comparisons of the different services see Tim Janzen's autosomal DNA testing comparison chart.

Simple mathematical average of sharing

The figures in the table below show the average percentage of autosomal DNA shared with relatives, assuming that every child gets 50% from the mother and 50% from the father. The degree of sharing is measured by the testing companies in units of genetic distance known as centiMorgans (cMs), although in practice the length and number of shared segments are more significant than the total number of centiMorgans. The percentages and the number of centiMorgans can vary. For example, a brother might share 53% of his DNA with one sibling and 47% with another sibling. Because of the random way that autosomal DNA is inherited, third, fourth and more distant cousins will not necessarily match you with the currently available autosomal DNA tests. According to Family Tree DNA's figures there is a 90% chance that third cousins will share enough DNA for the relationship to be detected, but only a 50% chance that you will share enough DNA with a fourth cousin for the relationship to be identified.

23andMe provide information on both the percentage of DNA shared and the number of shared cMs. FTDNA do not provide percentages and only provide information on the number of shared cMs. The following table shows the FTDNA Family Finder cM data for comparison purposes. When using Family Finder data the calculation can also be made by dividing the total number of autosomal cMs by 68 to get the percentage of the autosomal genome that two people share. Note that the FTDNA figures exclude the X-chromosome cMs. Males have one X-chromosome and females have two X-chromosomes. If you want to include the X-chromosome in the calculations, then there are 7158.06 cMs in the entire genome if you are a female. Therefore divide the total number of cMs by 71.6 to get the percentage of the entire genome that two people share. If you are a male there are 6962.13 cMs when combining the atDNA with the X-chromosome.
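The conversion described above can be sketched in a few lines of Python. This is only an illustration: the function and constant names are hypothetical, and the divisors are simply the ones quoted in the text (68 for Family Finder autosomal cMs, 71.6 for a female when the X-chromosome is included).

```python
# Divisors quoted in the text (cMs per 1% of the genome). Names are illustrative only.
FTDNA_AUTOSOMAL_DIVISOR = 68.0   # Family Finder autosomal cMs, X-chromosome excluded
FEMALE_WITH_X_DIVISOR = 71.6     # autosomes plus X for a female (7158.06 cMs in total)


def shared_percentage(total_shared_cm: float,
                      divisor: float = FTDNA_AUTOSOMAL_DIVISOR) -> float:
    """Approximate percentage of the genome shared, given total shared cMs."""
    if total_shared_cm < 0:
        raise ValueError("shared cMs cannot be negative")
    return total_shared_cm / divisor


if __name__ == "__main__":
    # 850 shared autosomal cMs is the table value for first cousins.
    print(f"{shared_percentage(850):.2f}%")  # 12.50%
```

The default divisor applies to Family Finder totals only; pass a different divisor when the X-chromosome cMs are included in the total.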

23andMe include the X-chromosome in their cM count so their figures will be higher than those provided by FTDNA. The number of cMs will, therefore, vary according to gender. 23andMe made adjustments to the cM count in June 2013 so the number of cMs will vary slightly depending on when the test was taken:

  • When using 23andMe data prior to June 2013 divide the total cMs by 75 to get the percentage of the genome that two people share. There were 7494.8 cMs when combining the autosomal DNA and the X-chromosome per Family Inheritance: Advanced prior to June 2013.
  • When using 23andMe data after June 2013 divide the total cMs by 74 to get the percentage of the genome that two people share. There are 7438.6 cMs when combining the autosomal DNA and the X-chromosome per Family Inheritance: Advanced. There are 7074.6 autosomal cMs per 23andMe.
  • The above numbers for 23andMe are applicable to females, who have two X-chromosomes. The number of cMs (atDNA and X combined) will be lower for men. Men in 23andMe have 7256.8 cMs when combining the atDNA with the X-chromosome.
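As a rough sketch, the 23andMe figures above can be turned into a percentage by dividing by the exact genome total rather than the rounded divisor. The dictionary keys and function name below are hypothetical. The text does not say which period the male total of 7256.8 cMs belongs to, so it is stored under a plain "male" key as an assumption.

```python
# Genome totals in cMs (autosomal DNA plus X-chromosome) as quoted in the text.
# The keys are illustrative labels, not 23andMe terminology.
TOTAL_CM_23ANDME = {
    "female_before_june_2013": 7494.8,
    "female_after_june_2013": 7438.6,
    "male": 7256.8,  # assumption: the text gives no test date for this figure
}


def percent_shared_23andme(shared_cm: float, profile: str) -> float:
    """Percentage of the genome shared, using the exact total for the profile."""
    if profile not in TOTAL_CM_23ANDME:
        raise KeyError(f"unknown profile: {profile!r}")
    return 100.0 * shared_cm / TOTAL_CM_23ANDME[profile]


if __name__ == "__main__":
    # Half of the post-June 2013 female total should come out at about 50%.
    print(f"{percent_shared_23andme(3719.3, 'female_after_june_2013'):.1f}%")  # 50.0%
```

Using the exact totals gives results within a fraction of a percent of the "divide by 75" and "divide by 74" rules of thumb.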

Note that AncestryDNA do not provide information on the shared centiMorgans or the percentages of shared DNA. However, AncestryDNA customers can upload their raw data to the free GedMatch utility in order to extract the necessary cM data for making comparisons and to check the relationship predictions. David Pike's tools can also be used.

Percentage | centiMorgans (FTDNA) | Relationship | Notes
100% | 6766.2 | Identical twins (monozygotic twins) | Tiny differences between identical twins can now be detected by next generation sequencing. See: Weber-Lehman et al 2014. Finding the needle in the haystack: Differentiating "identical" twins in paternity testing and forensics by ultra-deep next generation sequencing. Forensic Science International: Genetics; 9: 42-46. See also the editorial by Bruce Budowle in Investigative Genetics: Molecular genetic investigative leads to differentiate monozygotic twins.
50% | 3400 | Mother, father, siblings | The actual half-identical value is 3338.31 cMs but has been rounded up here for convenience.
50% | 2640 | Full siblings | Siblings share both half-identical regions (HIRs) and fully identical regions (FIRs). The expected numbers are 50% half-identical, 25% completely identical, and 25% not identical for an overall average of 50%. However, both 23andMe and FTDNA only report the number of centiMorgans for the half-identical regions and do not include the additional cMs for the fully identical regions. Similarly, the FTDNA chromosome browser and the chromosome browser in 23andMe's Family Inheritance: Advanced feature only show the half-identical regions. The parts that are shared are painted one color, and you don't get extra credit for the completely identical segments. However, it is possible to see the fully identical regions at 23andMe by using the Family Traits chromosome browser (accessed via the Family and Friends menu). FIRs can also be seen if you upload your raw data to the free GedMatch utility. FIRs show up as "green" in the one-to-one comparison, though the one-to-many chromosome browser display at GedMatch only shows the regions as HIRs.
25% | 1700.00 | Grandfathers, grandmothers, aunts, uncles, half-siblings, double first cousins |
12.5% | 850.00 | Great-grandparents, first cousins, great-uncles, great-aunts, half-aunts/uncles, half-nephews/nieces |
6.25% | 425.00 | First cousins once removed, half first cousins |
3.125% | 212.50 | Second cousins, first cousins twice removed |
1.563% | 106.25 | Second cousins once removed, half second cousins |
0.781% | 53.13 | Third cousins, second cousins twice removed |
0.391% | 26.56 | Third cousins once removed |
0.195% | 13.28 | Fourth cousins |
0.0977% | 6.64 | Fourth cousins once removed |
0.0488% | 3.32 | Fifth cousins |
0.0244% | 1.66 | Fifth cousins once removed |
0.0122% | 0.83 | Sixth cousins |
0.0061% | 0.42 | Sixth cousins once removed |
0.00305% | 0.21 | Seventh cousins | ca. 92,000 base pairs
0.001525% | 0.10 | Seventh cousins once removed |
0.000763% | 0.05 | Eighth cousins | ca. 23,000 base pairs
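The cousin rows in the table follow a simple halving rule, which the sketch below reproduces. It is a minimal illustration under the table's own assumption of exactly 50% inheritance from each parent. The function names and the `half` flag are hypothetical, and the cM values use the rounded figure of 68 cMs per 1%.

```python
def expected_share(degree: int, removed: int = 0, half: bool = False) -> float:
    """Average fraction of autosomal DNA shared by cousins.

    degree  -- 1 for first cousins, 2 for second cousins, and so on
    removed -- number of generations removed (0, 1, 2, ...)
    half    -- True for half cousins (only one shared ancestor, not a couple)
    """
    if degree < 1 or removed < 0:
        raise ValueError("degree must be >= 1 and removed must be >= 0")
    # Full first cousins share 1/8; each extra degree adds two halvings,
    # each removal adds one, and a half relationship adds one more.
    halvings = 2 * degree + 1 + removed + (1 if half else 0)
    return 0.5 ** halvings


def expected_cm(share: float, cm_per_percent: float = 68.0) -> float:
    """Convert a shared fraction into cMs using the rounded FTDNA divisor."""
    return share * 100 * cm_per_percent


if __name__ == "__main__":
    share = expected_share(2)  # second cousins
    print(f"{100 * share}% ~ {expected_cm(share)} cM")  # 3.125% ~ 212.5 cM
```

These are averages only; as noted above, the amount actually shared by any given pair of relatives can differ considerably.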

The chart below (courtesy Dimario, Wikimedia Commons) shows the average amount of autosomal DNA inherited by all close relations up to the third cousin level.

[Figure: Cousin tree (with genetic kinship)]

Ranges of sharing percentage

Figures from 23andMe's Relative Finder:

  • Parent/child: 47.54% (for father/son pairs, who do not share the X-chromosome) to ~50%
  • 1st cousins: 7.31-13.8%
  • 1st cousins once removed: 3.3-8.51%
  • 2nd cousins: 2.85-5.04%
  • 2nd cousins once removed: 0.57-2.54%
  • 3rd cousins: ca. 0.3-2.0%
  • 3rd cousins once removed: 0.11-1.32%
  • 4th and more distant cousins: 0.07-0.5%
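Because these ranges overlap, a single sharing percentage is often compatible with more than one relationship. The sketch below lists every relationship whose range contains a given percentage. The names are hypothetical, the bounds are copied from the list above, and the approximate "~50%" upper bound for parent/child pairs is treated as exactly 50.0 as a simplifying assumption.

```python
# (relationship, low %, high %) from the Relative Finder ranges listed above.
SHARING_RANGES = [
    ("Parent/child", 47.54, 50.0),  # upper bound "~50%" taken as 50.0 (assumption)
    ("1st cousins", 7.31, 13.8),
    ("1st cousins once removed", 3.3, 8.51),
    ("2nd cousins", 2.85, 5.04),
    ("2nd cousins once removed", 0.57, 2.54),
    ("3rd cousins", 0.3, 2.0),
    ("3rd cousins once removed", 0.11, 1.32),
    ("4th and more distant cousins", 0.07, 0.5),
]


def candidate_relationships(percent_shared: float) -> list[str]:
    """Return every listed relationship whose range contains the percentage."""
    return [name for name, low, high in SHARING_RANGES
            if low <= percent_shared <= high]


if __name__ == "__main__":
    print(candidate_relationships(4.0))
    # ['1st cousins once removed', '2nd cousins']
```

An empty result means only that the percentage falls outside the ranges listed here, not that the two people are unrelated.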

Shared SNPs

Figures from 23andMe Compare Genes function (from Tim Janzen's data):

  • Parent-child pairs share between 83.94% and 84.20% of SNPs (50% of DNA in common)
  • Siblings share between 83.81% and 87.47% of SNPs (50% of DNA in common)
  • Uncle/aunt-niece/nephew pairs share between 78.48% and 79.57% of SNPs (25% of DNA in common)
  • Grandparent-grandchild pairs share between 77.96% and 80.59% of SNPs (25% of DNA in common)
  • First cousins and great uncle/great aunt-grandniece/grandnephew pairs share between 75.78% and 77.03% of SNPs (12.5% of DNA in common)
  • First cousins once removed share ca 75.5% of SNPs (6.25% of DNA in common)
  • Second cousins and first cousins twice removed share ca 75% of SNPs (3.125% of DNA in common)
  • Unrelated people of European descent share 73-74.6% of SNPs

Identical by descent segments

There is a limit to how far recombination breaks DNA down, meaning we do not inherit DNA segments from every genealogical ancestor. In other words we only inherit DNA from a small proportion of our genealogical ancestors. Luke Jostins calculated that on average a human has no more than ca. 125 genetic ancestors from the same ancestral generation. This means that segments from the ancestors in a given generation are detectable in a person's DNA only up to the seventh ancestral generation (128 ancestors).
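The arithmetic behind the seventh-generation figure can be checked with a short sketch. The number of genealogical ancestors doubles each generation, so the code below finds the first generation in which that number exceeds the cap of ca. 125 genetic ancestors. The function names are hypothetical, and treating 125 as a hard cap is a simplification of an average figure.

```python
GENETIC_ANCESTOR_CAP = 125  # ca. 125 per generation, per the Jostins figure quoted above


def genealogical_ancestors(generation: int) -> int:
    """Number of ancestor slots in a generation (1 = parents, 2 = grandparents, ...)."""
    if generation < 1:
        raise ValueError("generation must be >= 1")
    return 2 ** generation


def first_generation_exceeding(cap: int = GENETIC_ANCESTOR_CAP) -> int:
    """First generation whose genealogical ancestor count is larger than the cap."""
    generation = 1
    while genealogical_ancestors(generation) <= cap:
        generation += 1
    return generation


def max_fraction_contributing(generation: int, cap: int = GENETIC_ANCESTOR_CAP) -> float:
    """Upper bound on the fraction of that generation's ancestors who left DNA."""
    return min(1.0, cap / genealogical_ancestors(generation))


if __name__ == "__main__":
    print(first_generation_exceeding())  # 7
```

Under this simplification, `max_fraction_contributing(10)` is 125/1024, or roughly 12% of the 1024 ancestors in the tenth generation. The calculation ignores pedigree collapse, where the same person fills more than one ancestor slot.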

It is important to distinguish identical by descent (IBD) segments from identical by state (IBS) segments. John Walden's research on IBD and IBS segments can be used as a guideline:[1][2]

Segment size (cM) | % IBD | % IBS
10 | 99 | 1
9 | 80 | 20
8 | 50 | 50
7 | 30 | 70
6 | 20 | 80
5 | 5 | 95
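The guideline table can be applied to individual segments with a simple lookup, sketched below. The names are hypothetical. Rounding a fractional segment length down to the whole cM is an assumption, since the table lists only whole values, and nothing is returned outside the tabulated 5-10 cM range because the source gives no figures there.

```python
import math
from typing import Optional

# Percentage chance that a segment of the given size (cM) is IBD, per the table above.
IBD_PERCENT_BY_CM = {10: 99, 9: 80, 8: 50, 7: 30, 6: 20, 5: 5}


def ibd_percent(segment_cm: float) -> Optional[int]:
    """Tabulated % IBD for a segment, or None outside the 5-10 cM rows.

    Fractional lengths are rounded down to the whole cM (an assumption,
    not something stated in the source).
    """
    return IBD_PERCENT_BY_CM.get(math.floor(segment_cm))


def ibs_percent(segment_cm: float) -> Optional[int]:
    """Complementary % IBS for the same segment, or None outside the table."""
    ibd = ibd_percent(segment_cm)
    return None if ibd is None else 100 - ibd


if __name__ == "__main__":
    print(ibd_percent(7.4))  # 30
```

Rounding down gives the more cautious reading, since shorter segments are less likely to be IBD.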

Statistics categorized by genealogical relationship

In order to help people who have taken an autosomal DNA test gain greater insight into the genealogical relationships implied by the resultant data, Tim Janzen has created three charts that provide statistical information in various categories. The charts provide statistics on close relatives, distant endogamous relatives and distant non-endogamous relatives. The charts were originally designed for use with 23andMe data but now also incorporate data from FTDNA's Family Finder test. The charts are organized by the degree of relationship, with the most closely related people (parents and children, full siblings) listed at the top and more distant cousins listed at the bottom. The statistics are based on information from real people who have been tested by 23andMe and Family Tree DNA and who have a known genealogical relationship to someone else who has also been tested by the same company. The charts also include information on the median and the average number of shared cMs for people who are related to each other from the first cousin once removed level of relationship to the 5th cousin level of relationship. The charts can be downloaded from the Anabaptist Genetic Genealogy website.

An unidentified author has also provided a spreadsheet on DNA Inheritance Statistics to which anyone can add their data.

References

  1. John Walden's research reported in a message to the Autosomal DNA Rootsweb list by Tim Janzen, 6 January 2012.
  2. See also the files on John Walden's website.