ChrisR/current NextGenSeq testing

NGS comparison table

FGC = Full Genomes Corporation, FTDNA = Family Tree DNA. See also: Y-DNA next generation sequencing, Y-DNA SNP testing chart, and Autosomal DNA testing comparison chart.

	FGC WGS 30×	FGC WGS 20×	FGC WGS 15× (GenomeGuide)	FGC WGS 10×	FGC WGS 4×	1000 Genomes Ph.3	FGC WGS 2×	FGC Y-Elite 2	FGC Y-Elite 1	FTDNA BigY
Introduced	Summer 2014	Late 2015	Early 2016	July 2015	July 2015	2011-2013	July 2015	May 2015	Late 2012-2015	November 2013
Price	$1250 ($42/×)	$1200 ($60/×)	$895 ($60/×)	$725 ($73/×)	$395 ($99/×)	-	$280 ($140/×)	$775	$850-1299	$575
Sequenced DNA focus	whole genome	whole genome	whole genome	whole genome	whole genome	whole genome	whole genome	Y-DNA, mtDNA	Y-DNA, mtDNA	Y-DNA (until April 2015 mtDNA)
Read depth, read length, Method	30× 150 bp (or 10 mb)^[1]	20× 150 bp	15× 150 bp	10× 150 bp	4× 150 bp	min. 4×, av. 7× > 70bp	2× 150 bp	30× 250 bp	50× 100 bp	60× 100 bp
Upgrade options	$55 per 1× + $100 data fee^[1]	price difference + $100 data fee^[1]	price difference + $100 data fee	price difference + $100 data fee^[2]	price difference to 10× + $100 data fee^[2]	-	price difference to 4× + $100 data fee^[2]	2nd order for 60× ^[3]	-	-
Y ≥1× coverage (FGC)	~22.9 mbp ^[4] 92% hg19	~22.8 mbp ^[5] 92% hg19	~ ?	~21.8 mbp ^[6] 89% hg19	~17.7 mbp ^[4] 72% hg19	? mbp	~13.8 mbp ^[4] 56% hg19	>22.0 mbp ^[4] 89% hg19	~22.8 mbp ^[4] 92% hg19 (21.5-23 mbp)^[7]	~16 mbp 65% hg19 (14-23 mbp)^[7]
Y Callable Loci (GATK) (FGC qual.-read-length)	~14.9 mbp ^[8]	~13.9 mbp ^[5]	~13.2 mbp ^[9]	~8.0 mbp	~1.1 mbp	?	~0.4 mbp	~14.8 mbp ^[10]	~14.1 mbp ^[11]	~8.8 mbp ^[11]
Y Method Analysis (YFull)	Mean/Av. 21× Median 12× ~22.8 mbp ~0.3 Gb BAM ~2900? SNPs ?/111 STRs			Mean/Av. 10-11× Median 4-5× ~88% Y-cov-hg19 ~0.2 Gb BAM ~2,762 known + ? novel SNPs ~?/111 STRs	^[12] Mean/Av. 9× Median 4× ~87% Y-cov-hg19 ~0.1 Gb BAM 2,764 known + 243 novel SNPs ~81/111 STRs	Mean/Av. ?× Median ?× ? bp ~0.18 Gb BAM ~2300? SNPs ca. 1/3 of 111 STRs	?	Mean/Av. ~47-72× Median ~31-47× 22 mbp ~1.2 Gb BAM ^[4] ~2750 SNPs^[6] ~107/111 STRs	Mean/Av. -76× Median 37-39× 22.7-25 mbp ~3 Gb BAM ~2800 SNPs ~98/111 STRs	Mean/Av. -91× Median 47-60× ~13.9 mbp ~0.8 Gb BAM ~2050 SNPs ~96/111 STRs
mt Method Analysis			~100% FMS Mean/Av. >1000X ^[9]	~100% FMS Mean/Av. >1000X ^[13]				92-100% FMS ^[14]	~95% FMS Mean/Av. ~26X ^[13] (75-100%)^[7]	~69% FMS Mean/Av. ~13-41X ^[13] (0-100%)^[7]
at/X Method ~3.60 mill. SNPs expected	~3.60 mill. SNPs (~100%)^[15] ca. 22.5×? Coverage ca. 95%.		~3.52 mill. SNPs (~98%)^[9]	~3.11 mill. SNPs (~86%)^[15]	~1.75 mill. SNPs (~49%)^[15]			not included	not included	not included
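
The per-× figures in the Price row appear to be the list price divided by the nominal read depth, rounded to whole dollars. A minimal Python sketch that reproduces them from the prices in the table; the dictionary and function names are illustrative only, and the half-up rounding is an assumption chosen because it matches the $73/× shown for the 10× test:

```python
# Prices (USD) keyed by nominal read depth, copied from the FGC WGS columns above.
FGC_WGS_PRICES = {30: 1250, 20: 1200, 15: 895, 10: 725, 4: 395, 2: 280}


def price_per_x(price_usd: float, depth: int) -> int:
    """Return the price per 1x of nominal depth, rounded half-up to whole dollars.

    Python's built-in round() rounds halves to the nearest even number
    (72.5 -> 72), so half-up rounding is done explicitly here.
    """
    return int(price_usd / depth + 0.5)


if __name__ == "__main__":
    for depth, price in FGC_WGS_PRICES.items():
        print(f"{depth}x: ${price} (${price_per_x(price, depth)}/x)")
```

The cost per × rises steeply as depth falls, so the lowest-depth tests are the cheapest overall but the most expensive per unit of sequence.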

Numbers of variants in the human genome / in WGS databases

The human nucleotide diversity is estimated to be 0.1% to 0.4% of base pairs. A difference of 1 to 4 in 1,000 amounts to approximately 3 to 12 million nucleotide differences, because the human genome has about 3 billion nucleotides.^[16]
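
The arithmetic behind that range can be checked with a short sketch. The genome size is the rounded 3 billion figure from the text, and the function name is illustrative:

```python
GENOME_SIZE_BP = 3_000_000_000  # approximate genome size quoted in the text


def expected_differences(diversity: float, genome_size: int = GENOME_SIZE_BP) -> int:
    """Return the expected number of differing nucleotides for a given diversity.

    `diversity` is the fraction of base pairs that differ (0.001 means 0.1%).
    """
    return round(diversity * genome_size)


if __name__ == "__main__":
    low = expected_differences(0.001)   # 0.1% of base pairs
    high = expected_differences(0.004)  # 0.4% of base pairs
    print(f"{low:,} to {high:,} nucleotide differences")
```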

SNVs and structural variation of some WGS databases (Francioli, Menelaou et al 2014)

Variants shared by whole genomes of 250 Dutch parent-offspring families from Genome of the Netherlands (GoNL) Project (20.4 million single-nucleotide variants and 1.2 million insertions and deletions, intermediate coverage ~13×)^[17]

Variant dataset	Variants (millions)	Percent
HapMap CEU 2005-2009	2.3	11%
1000G EUR 2011-2013	9.1	45%
1000G 2011-2013	1.2	6%
dbSNP 1998-2013 ^[18]	0.2	1%
GoNL 2014 only	7.6	37%
Sum	20.4	100%
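
The percent column is each dataset's share of the 20.4 million GoNL single-nucleotide variants. A small sketch that recomputes the shares from the counts in the table; the variable names are illustrative:

```python
# Variant counts in millions, copied from the table above.
variants_millions = {
    "HapMap CEU 2005-2009": 2.3,
    "1000G EUR 2011-2013": 9.1,
    "1000G 2011-2013": 1.2,
    "dbSNP 1998-2013": 0.2,
    "GoNL 2014 only": 7.6,
}

total = sum(variants_millions.values())

# Share of each dataset as a whole-number percentage of the total.
shares = {name: round(100 * count / total) for name, count in variants_millions.items()}

if __name__ == "__main__":
    for name, pct in shares.items():
        print(f"{name}: {pct}%")
    print(f"Sum: {total:.1f} million")
```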

For comparison, the widely used Illumina SNP chips (23andMe, FamilyFinder, Ancestry.com, etc.) provide only a few hundred thousand SNP markers; see the Autosomal DNA testing comparison chart.

Relationship between read depth and coverage in next-generation sequencing (Wang, Wei et al 2011)

Minimal read depth and coverage for variant (SNV/SNP) research

Sequencing depth represents the (often average) number of nucleotides contributing to a portion of an assembly. On a genome basis, it means that, on average, each base has been sequenced a certain number of times (10×, 20×, ...). For a specific nucleotide, it represents the number of sequences that added information about that nucleotide. Depth varies considerably between genomic regions. Consequently, an average sequencing depth of 30× still leaves many small portions of a genome unsequenced, while other portions receive far more reads.^[19]
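
One way to see why average depth and coverage differ is the idealized Lander-Waterman model, in which reads land uniformly at random and the depth at any base follows a Poisson distribution whose mean is the average depth. That model is an assumption made here for illustration; the cited post does not use it. Real data covers less than the model predicts because of GC bias, repeats, and regions that cannot be mapped. A minimal sketch:

```python
import math


def fraction_at_least(mean_depth: float, min_depth: int = 1) -> float:
    """Return P(depth >= min_depth) at a base, assuming Poisson(mean_depth) depth.

    This is the idealized expectation only; it ignores mapping and
    library biases, so it is an upper bound in practice.
    """
    # Probability of seeing fewer than min_depth reads at the base.
    below = sum(
        math.exp(-mean_depth) * mean_depth**k / math.factorial(k)
        for k in range(min_depth)
    )
    return 1.0 - below


if __name__ == "__main__":
    for depth in (2, 4, 10, 15, 20, 30):
        print(
            f"{depth}x average: "
            f">=1x at {fraction_at_least(depth, 1):.4f} of bases, "
            f">=10x at {fraction_at_least(depth, 10):.4f} of bases"
        )
```

Even under these ideal assumptions, the fraction of bases reaching a useful per-site depth falls off quickly as the average depth drops.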

Low confidence: 7×

The 1000 Genomes Project sequenced genomes of 2,504 individuals representing 26 populations to an average of 7× coverage. This dataset is used by many for variant research and has acceptable minimal confidence for haploid genome parts (mtDNA and hemizygous Y-DNA).

Accuracy of variant calling at various coverage depths (filtered on chr20; Cheng, Teo et al 2014)

Medium confidence: 10× - 59×

Anything above 7× is referred to as deep sequencing. For detecting human genome mutations, SNPs, and rearrangements, publications often recommend 10× to 30× depth of coverage, depending on the application and statistical model.^[20] A 2011 study treats 10× as sufficient SNP-calling capability for standard SNP analysis.^[21]

Analysis of the first sequenced human genome in 2008 suggests that homozygous SNVs are detected at a 15× average depth and an average depth of 33× is required to detect the same proportion of heterozygous SNVs.^[22] A 2011 study suggests improvements in sequencing set the required average mapped depth to 35× for reliable calling of SNVs and small indels across 95% of the genome.^[23]
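
The gap between the two thresholds follows from allele sampling: at a heterozygous site each read shows the variant allele only about half the time. The sketch below uses a deliberately simplified binomial model with an exact depth at the site, error-free reads, and a caller that needs a minimum number of supporting reads. The `min_alt_reads` threshold is hypothetical and chosen for illustration; it is not taken from the cited studies.

```python
from math import comb


def detection_probability(depth: int, min_alt_reads: int, alt_fraction: float) -> float:
    """Return P(at least min_alt_reads of `depth` reads carry the variant allele).

    alt_fraction is the chance that a single read shows the variant:
    1.0 for a homozygous variant, 0.5 for a heterozygous one.
    """
    # Probability that fewer than min_alt_reads reads support the variant.
    p_fewer = sum(
        comb(depth, k) * alt_fraction**k * (1 - alt_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    MIN_ALT_READS = 3  # hypothetical caller threshold
    for depth in (5, 10, 15, 33):
        hom = detection_probability(depth, MIN_ALT_READS, 1.0)
        het = detection_probability(depth, MIN_ALT_READS, 0.5)
        print(f"{depth}x: homozygous {hom:.4f}, heterozygous {het:.4f}")
```

Because depth at a real site also varies around the average, the average depth needed in practice is higher than this per-site model alone suggests.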

SNP call accuracy according to a 2014 study on single nucleotide variant detection and genotype calling (for chr20)^[24]

5×: 90-97%
10×: 96-98%
15×: 98% (Minimum for rare variants)
≥20×: 99%

Minimum depth for correct genotype call (Meynert, Ansari et al 2014)

High confidence: 60× and higher

For high confidence of Exome variants (medical)
"We calculated that 60× WGS data from the HiSeq 2000 platform are needed to recover ~95% of INDELs, much higher than that for SNP detection. Accurate detection of heterozygous INDELs requires ~1.2-fold higher coverage than that for homozygous INDELs" ^[25]

Interesting features for population genetics and genetic genealogy

Enrichment / target designs can help to provide better coverage for certain genome areas. The ability to deliver the following data seems crucial for competitiveness in the market:

Y

Detection of more than 2,000 derived Y-SNPs
Y-STR coverage, especially for the FTDNA Y37-Y67 panels
FASTQ files, to allow remapping

Autosomal/X

Coverage for the main DTC-chip SNPs useful for admixture and IBD comparisons like on Gedmatch: 23andMe (v1-v4), Ancestry.com, FTDNA FamilyFinder, Geno 2.0 (v1-v2), Chromo2;
Phasing: the possibility of distinguishing paternal from maternal DNA in an individual without having the parents' DNA or test results. At 34×, phasing 96% of SNPs into haplotype blocks should already be possible.^[26]
Potential for coverage of highly informative continental and regional SNPs (rare variants) to be used in future admixture and matching services (distant genetic relations).^[27]^[28]

mt

~100% FMS with good mean read depth (>50×)

References

[1] $2750 pilot project long read whole genome Chromium technology, Justin Loe, FGC, 2016-09-02, Forum http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=184229&viewfull=1#post184229
[2] Justin Loe, FGC, 2015-11-29, Forum http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=123473&viewfull=1#post123473
[3] Justin Loe, FGC, 2015-12-15, http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=126856#post126856
[4] Justin Loe, email message, 28 Dec 2015 and AG Forum 2016-01 http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=131470&viewfull=1#post131470
[5] Justin Loe, AG Forum 2016-01 http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=135412&viewfull=1#post135412
[6] Justin Loe, AG Forum 2016-01 http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=132860&viewfull=1#post132860
[7] Jim Kane, Which Y-DNA NGS test to take? November 12, 2015, http://www.it2kane.org/2015/11/which-ngs-test-to-take/
[8] Justin Loe, E-Mail 2016-06-10
[9] Justin Loe, E-Mail 2016-02
[10] Justin Loe, AG Forum 2016-03 http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=146390&viewfull=1#post146390
[11] Vince Tilroe analysis of FGC raw-data from Greg Magoon, shared by Iain McDonald, 2 Dec 2015
[12] Based on a single sample with initial QC problems, YF05650
[13] Petr, Forum post: Full Y Chromosome Sequencing: Phase III Pilot, 2015-12-25, http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=128756&viewfull=1#post128756
[14] Justin Loe, Batch 9006, email message, 28 Dec 2015
[15] Justin Loe, AG Forum 2016-01 http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=134491&viewfull=1#post134491
[16] Human genetic variation / Measures of variation, Wikipedia, 2016-01 https://en.wikipedia.org/wiki/Human_genetic_variation
[17] Francioli, Menelaou et al 2014: doi:10.1038/ng.3021, http://www.nature.com/ng/journal/v46/n8/abs/ng.3021.html
[18] ca. NCBI dbSNP Build 138, Apr 2013: http://www.ncbi.nlm.nih.gov/SNP/snp_summary.cgi?view+summary=view+summary&build_id=138
[19] Eric Normandeau, What Is The Sequencing 'Depth' ?, 2011-2012, https://www.biostars.org/p/638/#640
[20] Illumina, Sequencing Coverage, 2016-01, http://www.illumina.com/science/education/sequencing-coverage.html
[21] Wang, Wei et al 2011: Next generation sequencing has lower sequence coverage and poorer SNP-detection capability in the regulatory regions, http://dx.doi.org/10.1038/srep00055
[22] Bentley et al 2008: Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456, 53–59.
[23] Ajay et al 2011: Accurate and comprehensive sequencing of personal genomes. Genome Res. 21, 1498–1505.
[24] Cheng, Teo et al 2014: Assessing single nucleotide variant detection and genotype calling on whole-genome sequenced individuals. Bioinformatics Volume 30 Issue 12. Interpretation by FGC (JL)
[25] Fang, Narzisi et al 2014: Reducing INDEL errors in whole-genome and exome sequencing, http://dx.doi.org/10.1186/s13073-014-0089-z
[26] 10X Genomics GemCode platform, 2015-08-16, Forum http://www.anthrogenica.com/showthread.php?5178-WGS-tec-able-to-phase-96-of-SNPs-into-haplotype-blocks
[27] Al-Khudhair, Qiu et al 2015: Inference Of Distant Genetic Relations In Humans Using “1000 Genomes” http://dx.doi.org/10.1093/gbe/evv003
[28] Schiffels, Haak et al 2016: rarecoal in "Iron Age and Anglo-Saxon genomes from East England reveal British migration history" http://dx.doi.org/10.1038/ncomms10408
