Identical by descent

From ISOGG Wiki

Identical by descent (IBD) is a term used in genetic genealogy to describe a matching segment of DNA shared by two or more people that has been inherited from a recent common ancestor. The segments are considered to match if all the alleles are identical (barring rare mutations and genotyping errors). Two matching half-identical regions (HIRs) (see below) or full-identical regions (FIRs) which meet minimum threshold conditions can be considered identical by descent. Identity by descent is contrasted with identity by state (IBS).

Identity by descent can be considered on various timescales. In population genetics theory all individuals have common ancestry in the distant past. For the purposes of genetic genealogy the focus is on detecting IBD segments within a genealogical timeframe (effectively within the last ten generations) where there is a possibility of identifying the common ancestor through documentary records. Any given pair of individuals is related through many common ancestors, though many of these relationships will be too distant to result in detectable IBD segments. If the two individuals have ancestors from the same geographical region they might have many recent common ancestors, but many of the relationships will not result in IBD sharing, and there might only be one or two detectable segments inherited from a subset of the common ancestors.

IBD segments can be measured in centiMorgans (a unit of genetic distance) or in megabases (a unit of physical distance). 23andMe and Family Tree DNA report segment sizes in centiMorgans. 23andMe round the segment length to the nearest tenth of a centiMorgan and round the segment start and end co-ordinates to the nearest million base pairs to reflect the uncertainty in the exact locations of the segment boundaries. AncestryDNA originally used megabases for their matching algorithms but converted to centiMorgans in about January 2014. However, they do not provide the underlying segment data.

The terms IBD and IBS are more relevant to the results of SNP microarray testing than to results of DNA sequencing, because SNP testing provides so much less information per centiMorgan of DNA. SNP test results have an additional complexity since they report on both copies of the chromosome, but the results are not phased (that is, it is unknown which nucleotide is on which copy of the chromosome). Thus if one person's SNP result is (CC), this could be at least "Half-Identical" to either (CC) or (CT) in a second person. A homozygous mismatch such as (CC) vs. (TT) would be required before one could say the results are *not* identical.
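
The comparison rule described above can be sketched in a few lines of Python. This is only an illustration, not any testing company's implementation: the function names are hypothetical, and each genotype is assumed to be a two-letter unphased string such as "CT".

```python
def is_half_identical(genotype_a: str, genotype_b: str) -> bool:
    """True if two unphased genotypes share at least one allele.

    Assumption: each genotype is a two-letter string such as "CC" or "CT".
    """
    return bool(set(genotype_a) & set(genotype_b))


def is_fully_identical(genotype_a: str, genotype_b: str) -> bool:
    """True if both alleles match, ignoring the (unknown) phase order."""
    return sorted(genotype_a) == sorted(genotype_b)


# (CC) is at least half-identical to both (CC) and (CT).
half_cc_cc = is_half_identical("CC", "CC")   # True
half_cc_ct = is_half_identical("CC", "CT")   # True
# Only a homozygous mismatch rules out a match at this SNP.
half_cc_tt = is_half_identical("CC", "TT")   # False
# Unphased results do not record allele order, so "CT" equals "TC".
full_ct_tc = is_fully_identical("CT", "TC")  # True
```

A single half-identical SNP says very little on its own, because common alleles are shared by many unrelated people.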

Consecutive SNP results for a short segment of DNA may be half-identical in two individuals when in actuality the DNA sequences are not identical. Therefore we must resort to the use of statistics to determine whether two half-identical (apparently matching) segments are likely to actually be identical (neglecting rare errors and mutations). A long consecutive string of half-identical SNP results (typically about 5 cM / 700 SNPs, depending on the test's error rate and other factors) is required before one can say that two matching DNA segments are probably identical by descent. Thresholds for length and number of mismatches (errors or mutations) are set by each testing company; these criteria must be met before the company will report that two individuals very likely inherited their matching segments from a common ancestor.
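
As a rough sketch of the thresholding idea above, the following Python scans a list of SNPs for unbroken runs of half-identical results and keeps only runs that meet both a length and a SNP-count threshold. The function names and the input layout (genetic position in cM plus two unphased genotypes) are assumptions made for illustration. The defaults of 5 cM and 700 SNPs are the typical figures quoted above, not any company's actual settings, and unlike real matching algorithms this sketch tolerates no mismatches at all.

```python
from typing import List, Tuple

# Assumed layout: (genetic position in cM, genotype of person A, genotype of person B)
Snp = Tuple[float, str, str]
Run = Tuple[float, float, int]  # (start cM, end cM, number of SNPs)


def _keep_if_long_enough(positions: List[float], runs: List[Run],
                         min_cm: float, min_snps: int) -> None:
    """Record the run if it meets both thresholds."""
    if positions and len(positions) >= min_snps \
            and positions[-1] - positions[0] >= min_cm:
        runs.append((positions[0], positions[-1], len(positions)))


def half_identical_runs(snps: List[Snp], min_cm: float = 5.0,
                        min_snps: int = 700) -> List[Run]:
    """Find runs of consecutive half-identical SNPs that pass the thresholds.

    SNPs must be sorted by position on a single chromosome.
    """
    runs: List[Run] = []
    current: List[float] = []
    for pos_cm, genotype_a, genotype_b in snps:
        if set(genotype_a) & set(genotype_b):  # at least one shared allele
            current.append(pos_cm)
            continue
        # Homozygous mismatch (e.g. CC vs TT): the run ends here.
        _keep_if_long_enough(current, runs, min_cm, min_snps)
        current = []
    _keep_if_long_enough(current, runs, min_cm, min_snps)
    return runs


# Toy data: ten matching SNPs, one homozygous mismatch, then three matching SNPs.
toy = [(float(i), "CC", "CT") for i in range(10)]
toy.append((10.0, "CC", "TT"))
toy += [(float(i), "CT", "CT") for i in range(11, 14)]

# A tiny SNP threshold is used only because the toy data set is small.
segments = half_identical_runs(toy, min_cm=5.0, min_snps=5)
# segments == [(0.0, 9.0, 10)]; the second run is too short to report
```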


Thresholds for matches

For threshold details see Family Finder versus Relative Finder - Thresholds for relationship matches. AncestryDNA originally set their threshold for matches at 5 megabases. In about January 2014 they changed to measuring matches in centiMorgans, and in 2014 the threshold was changed to 5 cM, but the earlier matches were not rerun. The thresholds for other relationships at AncestryDNA are given here. As AncestryDNA have a much lower threshold than FTDNA and 23andMe, users will get proportionately many more matches, but a much larger percentage will be false positive matches (FPMs).

Ranges of total centiMorgans of IBD segments expected, based on family relationship

  • Parent/child: 3539-3748 centiMorgans (cMs)
  • 1st cousins: 548-1034 cMs
  • 1st cousins once removed: 248-638 cMs
  • 2nd cousins: 101-378 cMs
  • 2nd cousins once removed: 43-191 cMs
  • 3rd cousins: 43-ca 150 cMs
  • 3rd cousins once removed: 11.5-99 cMs
  • 4th and more distant cousins: 5-ca 50 cMs
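
Because the ranges above overlap, a given total usually fits more than one relationship. A minimal Python sketch of that lookup follows. The dictionary and function names are hypothetical, and the upper bounds marked "ca" in the list are approximate, so results near those limits should be read with caution.

```python
# Ranges copied from the list above, in cM. Upper bounds given as "ca"
# (3rd cousins, 4th and more distant cousins) are approximate.
CM_RANGES = {
    "Parent/child": (3539, 3748),
    "1st cousins": (548, 1034),
    "1st cousins once removed": (248, 638),
    "2nd cousins": (101, 378),
    "2nd cousins once removed": (43, 191),
    "3rd cousins": (43, 150),
    "3rd cousins once removed": (11.5, 99),
    "4th and more distant cousins": (5, 50),
}


def candidate_relationships(total_cm: float) -> list:
    """Return every listed relationship whose range contains total_cm."""
    return [name for name, (low, high) in CM_RANGES.items()
            if low <= total_cm <= high]


candidates_300 = candidate_relationships(300)
# ['1st cousins once removed', '2nd cousins']
candidates_3600 = candidate_relationships(3600)
# ['Parent/child']
```

A total that falls outside every listed range returns an empty list. That only means the relationship is not one of those tabulated here.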

Ranges of the number of shared IBD segments based on family relationship

  • Parent/child: 23-29
  • 1st cousins: 17-32
  • 1st cousins once removed: 12-23
  • 2nd cousins: 10-18
  • 2nd cousins once removed: 4-12
  • 3rd cousins: 2-6?
  • 3rd cousins once removed: 1-4
  • 4th and more distant cousins: 0-2

IBD accuracy of genetic tests and analysis methods

  • SNP microarray testing (23andMe, Family Finder, AncestryDNA, Chromo2, Geno 2.0, etc.: see Autosomal DNA testing comparison chart): the accuracy depends on the number and type of extracted autosomal and X-chromosome SNPs. Generally more is better.
  • Whole-genome sequencing (WGS) using next generation sequencing (NGS) technology is not currently affordable for the genetic genealogy market, but is being used in academic studies. With WGS data, IBD tools are able to detect all 1st through 6th degree relationships and 55% of 9th through 11th degree relationships, a 5% to 15% increase in relationship detection compared to high-density microarray data.[1]

Further reading

Blog posts

Scientific papers


  1. Li et al. 2014: Relationship Estimation from Whole-Genome Sequence Data. PLoS Genetics, Jan 2014; 10(1): e1004144.

See also