Utility of Population-Level DNA Sequence Data in the Diagnosis of Hereditary Endocrine Disease.
Newey PJ., Berg JN., Zhou K., Palmer CNA., Thakker RV.
Context: Genetic testing is increasingly used for clinical diagnosis, although variant interpretation presents a major challenge because of high background rates of rare coding-region variation, which may contribute to inaccurate estimates of variant pathogenicity and disease penetrance. Objective: To use the Exome Aggregation Consortium (ExAC) data set to determine the background population frequencies of rare germline coding-region variants in genes associated with hereditary endocrine disease and to evaluate the clinical utility of these data. Design Setting Participants: Cumulative frequencies of rare nonsynonymous single-nucleotide variants were established for 38 endocrine disease genes in 60,706 unrelated control individuals. The utility of gene-level and variant-level metrics of tolerability was assessed, and the pathogenicity and penetrance of germline variants previously associated with endocrine disease evaluated. Results: The frequency of rare coding-region variants differed markedly between genes and was correlated with the degree of evolutionary conservation. Genes associated with dominant monogenic endocrine disorders typically harbored fewer rare missense and/or loss-of-function variants than expected. In silico variant prediction tools demonstrated low clinical specificity. The frequency of several endocrine disease‒associated variants in the ExAC cohort far exceeded estimates of disease prevalence, indicating either misclassification or overestimation of disease penetrance. Finally, we illustrate how rare variant frequencies may be used to anticipate expected rates of background rare variation when performing disease-targeted genetic testing. Conclusions: Quantifying the frequency and spectrum of rare variation using population-level sequence data facilitates improved estimates of variant pathogenicity and penetrance and should be incorporated into the clinical decision-making algorithm when undertaking genetic testing.