Pilot GWAS of caries in African-Americans shows genetic heterogeneity

Background Dental caries is the most common chronic disease in the US and disproportionately affects racial/ethnic minorities. Caries is heritable, and though genetic heterogeneity exists between ancestries for a substantial portion of loci associated with complex disease, a genome-wide association study (GWAS) of caries specifically in African Americans has not been performed previously. Methods We performed exploratory GWAS of dental caries in 109 African American adults (age > 18) and 96 children (age 3–12) from the Center for Oral Health Research in Appalachia (COHRA1 cohort). Caries phenotypes (DMFS, DMFT, dft, and dfs indices) assessed by dental exams were tested for association with 5 million genotyped or imputed single nucleotide polymorphisms (SNPs), separately in the two age groups. The GWAS was performed using linear regression with adjustment for age, sex, and two principal components of ancestry. A maximum of 1 million adaptive permutations were run to determine empirical significance. Results No loci met the threshold for genome-wide significance, though some of the strongest signals were near genes previously implicated in caries such as antimicrobial peptide DEFB1 (rs2515501; p = 4.54 × 10⁻⁶) and TUFT1 (rs11805632; p = 5.15 × 10⁻⁶). Effect estimates of lead SNPs at suggestive loci were compared between African Americans and Caucasians (adults N = 918; children N = 983). Significant (p < 5 × 10⁻⁸) genetic heterogeneity for caries risk was found between racial groups for 50% of the suggestive loci in children, and 12–18% of the suggestive loci in adults. Conclusions The genetic heterogeneity results suggest that there may be differences in the contributions of genetic variants to caries across racial groups, and highlight the critical need for the inclusion of minorities in subsequent and larger genetic studies of caries in order to meet the goals of precision medicine and to reduce oral health disparities.

groups [10][11][12]. Caries prevalence in primary teeth is 42% higher in non-Hispanic black children compared with non-Hispanic Caucasian children. Non-Hispanic black children have double the rate of untreated tooth decay in primary teeth compared to non-Hispanic Caucasian children [11], and among adults, non-Hispanic blacks have nearly double the rate of untreated decayed teeth (42%) of non-Hispanic Caucasians (22%) [10]. Some disparity is explained by sociocultural differences between racial groups. African Americans are less likely to have access to and utilize oral health care [13,14]. Other factors include differences in caretaker fatalism and oral health education [15], socioeconomic status, and transmission of cariogenic bacteria [16]. Genetic differences in caries predisposition are known: the 2% of African American children with localized juvenile periodontitis, a disease more common in African Americans, have fewer carious teeth than others, likely due to a variant in the gene encoding a protective component of saliva [17]. Other differences include those in immunity genes and propensity toward cariogenic oral flora [18]. While inter-racial genetic differences influence dental features [19], there is a dearth of studies on the role of genetics in differences in dentition across racial and ethnic groups.
Although dental caries is estimated to be 30-50% heritable [1,5,6,20], few specific caries-related genes have been discovered, with the majority of these identified in Caucasians [21]. Yet, it is known that some complex diseases exhibit differences in their predominant genetic architecture across races [22][23][24]. Genetic markers for disease vary in frequency between races, and the effect sizes of the genetic variants can display large heterogeneity [25]. Indeed, up to 25% of GWAS tagSNPs show effect heterogeneity by ancestry [26]. Thus, it is possible that there are different genetic risk factors for caries operating between races, or that the effects of risk variants are dissimilar. In spite of this, adequate information is lacking regarding the disease process in vulnerable groups such as racial/ethnic minorities; in particular, few studies have focused on the oral health of African Americans [12]. Genome-wide association studies (GWAS) of dental caries in African American samples have not been performed, and although African Americans are a large US minority group, little work has been done to understand their dental genetics. In this study, we describe a pilot caries GWAS in African American children and adults to generate hypotheses about the genetics of dental caries in African Americans. We consider primary and permanent dentition separately because previous work has estimated that only 18% of covariation in primary vs permanent tooth caries is due to common genetic factors [6]. Furthermore, we compare the GWAS scans in African Americans to analogous analyses in Caucasian children and adults to determine whether there is heterogeneity present between the two racial groups.

Study sample
One hundred nine African American adults (aged > 18 years) and 96 African American children (3-12 years) were recruited through the Center for Oral Health Research in Appalachia (COHRA, cohort COHRA1), a joint study of the University of Pittsburgh and West Virginia University [27]. Briefly, all participants provided consent or assent with written parental informed consent, in accordance with the Institutional Review Board policies of the University of Pittsburgh and West Virginia University. Two clinical examination sites were located in Pennsylvania and four in West Virginia. Admixed African ancestry was verified using Principal Component Analysis (PCA) with respect to HapMap controls from Europe, Asia, Africa, and Central/South America. Participants were genotyped for approximately 550,000 single nucleotide polymorphisms (SNPs) using the Illumina Human610-Quad Beadchip (Illumina, Inc., San Diego, CA). Genetic data were rigorously cleaned and quality-checked as previously described [28], and imputed to the 1000 Genomes Project (June 2011) phase 1 reference panel using SHAPEIT (for pre-phasing) [29] and IMPUTE2 [30]. SNPs were filtered for INFO score > 0.5 and MAF > 5% (separately for each age group). SNPs were not filtered for HWE due to the admixed nature of the African American population. Quality filters included participant call rates > 90% and SNP call rates > 99%. Approximately 4.9 million SNPs passed quality control and were included in the GWASs. Identical analyses were performed in COHRA-recruited cohorts of 918 Caucasian adults and 983 children (results for these cohorts have been previously published) [28,31]. The same filters were used in Caucasians (separately for each age group) along with a filter for HWE (p-value > 10⁻⁴). STROBE guidelines were followed for this observational study.
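The SNP-level filters described above can be summarized in a minimal sketch. The thresholds (INFO > 0.5, MAF > 5%, HWE p-value > 10⁻⁴ applied only in the Caucasian cohorts) come from the text; the `SnpStats` record and `passes_qc` function are hypothetical names introduced for illustration and are not part of the study's actual pipeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SnpStats:
    """Per-SNP summary statistics (hypothetical record layout)."""
    snp_id: str
    info: float                    # imputation INFO score
    maf: float                     # minor allele frequency within one age group
    hwe_p: Optional[float] = None  # Hardy-Weinberg test p-value, if computed


def passes_qc(snp: SnpStats, apply_hwe_filter: bool) -> bool:
    """Return True if the SNP passes the filters described in the text.

    apply_hwe_filter is False for the admixed African American cohorts
    and True for the Caucasian cohorts.
    """
    if snp.info <= 0.5 or snp.maf <= 0.05:
        return False
    if apply_hwe_filter:
        # A SNP with no HWE result is conservatively dropped (an assumption).
        return snp.hwe_p is not None and snp.hwe_p > 1e-4
    return True
```

Because the MAF and HWE filters are applied per cohort, the same SNP can pass in one cohort and fail in another, which is why not every lead SNP could later be compared across groups.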

Quantitative caries phenotypes
Ascertainment of caries status was conducted with a dental explorer by either a licensed dentist or a dental hygienist. The assessments were done on dried teeth in exam rooms with a dental chair and dental examination light. Examiners were mutually calibrated at the start of the study and several times over the course of data collection via a review of data collection techniques followed by reliability testing [27]. Inter- and intra-rater reliability of caries assessments was high [27]. From these assessments, the following caries phenotypes were generated: the DMFS index (Decayed, Missing, and Filled Tooth Surfaces) and DMFT index (Decayed, Missing, and Filled Teeth) in adults, and the dfs index (decayed and filled deciduous tooth surfaces) and dft index (decayed and filled deciduous teeth) in children. These caries indices represent the count of affected tooth surfaces or teeth, in accordance with the World Health Organization DMFS/dfs or DMFT/dft scales [32] and established dental caries research protocols [33,34]. For 31 of the 96 children in the African American pediatric cohort with mixed dentition, and 378 of 983 children in the Caucasian pediatric cohort with mixed dentition, both DMFS/DMFT and dfs/dft indices were scored at the time of the assessment. For the purposes of this study, only dfs/dft measures were tested for association in the pediatric cohorts. White spots were included in the DMFS/DMFT and dfs/dft counts because their inclusion has been shown to increase caries heritability estimates and thus improve power to detect association in gene mapping [6].

Statistical model
The GWASs were performed separately in adults (for DMFT and DMFS) and children (for dft and dfs) using linear regression while adjusting for age, sex, and two principal components of ancestry in PLINK v1.9 [35]. Statistical significance was determined using adaptive permutation with a maximum of 1,000,000 permutations per SNP as implemented in PLINK. P-value thresholds incorporated the burden of multiple testing: genome-wide significance was defined as p-value less than 5 × 10⁻⁸ and suggestive significance as p-value less than 5 × 10⁻⁶. Results were visualized in Manhattan plots using R (v3.2.0) [36].
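To make the statistical model concrete, the following is a minimal single-SNP sketch of covariate-adjusted linear regression with a permutation-based empirical p-value. It is not PLINK's implementation: the function names are hypothetical, the phenotype is simply shuffled while genotype and covariates are held fixed, and the early-stopping rule (stop after a fixed number of permuted statistics match or exceed the observed one) is a simplified stand-in for PLINK's own adaptive criterion.

```python
import numpy as np


def snp_statistic(y, g, covars):
    """t-statistic for the genotype term in y ~ 1 + g + covariates (OLS)."""
    X = np.column_stack([np.ones(len(y)), g, covars])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov_beta[1, 1])


def adaptive_permutation_p(y, g, covars, max_perms=1_000_000, min_hits=10, seed=0):
    """Empirical two-sided p-value with early stopping.

    Clearly non-significant SNPs accumulate exceedances quickly and stop
    early; only SNPs with small p-values use many permutations.
    Returns (empirical p-value, number of permutations used).
    """
    rng = np.random.default_rng(seed)
    observed = abs(snp_statistic(y, g, covars))
    hits = 0
    n_used = 0
    for n_used in range(1, max_perms + 1):
        y_perm = rng.permutation(y)  # shuffle phenotype labels
        if abs(snp_statistic(y_perm, g, covars)) >= observed:
            hits += 1
            if hits >= min_hits:     # simplified stopping rule (assumption)
                break
    return (hits + 1) / (n_used + 1), n_used
```

Here `covars` would hold age, sex, and the two ancestry principal components as columns. With at most 1,000,000 permutations, the smallest attainable empirical p-value under this estimator is roughly 10⁻⁶.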

Results annotation and comparison with Caucasian caries GWASs
Genes within 500 kb of the top associated SNP in each locus were queried for corroborating biological connections to dental caries in public databases, including OMIM, PubMed, and ClinVar. In addition, GREAT [37] was used to assess the functions of cis-regulatory regions of the associated loci using default parameters.
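The positional lookup described above can be sketched as follows. This is an illustrative assumption about how "within 500 kb" might be operationalized (any overlap between the gene interval and the window around the SNP); the `Gene` record, the `genes_near_snp` function, and any coordinates passed to them are hypothetical and do not reproduce the annotation source used in the study.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    """Gene interval on one chromosome (hypothetical record layout)."""
    name: str
    chrom: str
    start: int  # base-pair position of gene start
    end: int    # base-pair position of gene end


def genes_near_snp(chrom: str, pos: int, genes: List[Gene],
                   window: int = 500_000) -> List[str]:
    """Names of genes whose interval overlaps [pos - window, pos + window]."""
    lo, hi = pos - window, pos + window
    return [
        gene.name
        for gene in genes
        if gene.chrom == chrom and gene.start <= hi and gene.end >= lo
    ]
```

The genes returned by such a lookup would then be checked manually against OMIM, PubMed, and ClinVar, as described in the text.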
Heterogeneity in effect sizes between the GWAS results of African Americans and Caucasians was assessed via Cochran's Q statistic. The effect sizes for the lead SNPs at suggestive (p-value ≤ 5 × 10⁻⁶) loci observed in African Americans were compared with the effect sizes of the same SNPs in Caucasians, if present. Not all suggestively associated lead SNPs in African Americans were tested for heterogeneity because MAF and quality control filters yielded different sets of SNPs retained for African Americans and Caucasians. Specifically, the numbers of loci tested for heterogeneity were 17 of 25 for DMFT, 11 of 12 for DMFS, 20 of 26 for dft, and 12 of 18 for dfs. The genome-wide significance threshold for heterogeneity tests was p-value ≤ 5 × 10⁻⁸.
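For two cohorts, Cochran's Q reduces to a one-degree-of-freedom chi-square test on the difference between the two effect estimates. The sketch below assumes inverse-variance weights and takes the per-cohort effect size and standard error as inputs; the function name is hypothetical, and the source does not specify which software was used for this step.

```python
import math


def cochran_q_two_groups(beta_a, se_a, beta_b, se_b):
    """Cochran's Q and its p-value for two independent effect estimates.

    Q = sum_i w_i * (beta_i - beta_pooled)^2 with w_i = 1 / se_i^2.
    With two groups Q has 1 degree of freedom, so the chi-square
    survival function is erfc(sqrt(Q / 2)).
    """
    w_a, w_b = 1.0 / se_a ** 2, 1.0 / se_b ** 2
    pooled = (w_a * beta_a + w_b * beta_b) / (w_a + w_b)
    q = w_a * (beta_a - pooled) ** 2 + w_b * (beta_b - pooled) ** 2
    p_value = math.erfc(math.sqrt(q / 2.0))
    return q, p_value
```

In the two-group case Q equals (beta_a − beta_b)² / (se_a² + se_b²), so a large difference in effect size relative to the combined standard errors yields a small heterogeneity p-value.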

Results
Four GWASs of indices of dental caries were performed: DMFS and DMFT in 109 African American adults, and dfs and dft in 96 African American children. Cohort demographics are shown in Table 1. The GWAS in African Americans did not yield associations at genome-wide significance (p-value ≤ 5 × 10⁻⁸) for any phenotype (Fig. 1), while several loci with potential roles in caries etiology were associated at suggestive significance (p-value ≤ 5 × 10⁻⁶).

GWASs of caries in the permanent dentition in African Americans
The GWAS of DMFT yielded 94 suggestive (p-value ≤ 5 × 10⁻⁶) SNPs across 25 distinct loci. The GWAS of DMFS yielded 23 suggestive SNPs across 11 distinct loci. These loci and corroborating evidence for nearby genes are listed in Table 2 (DMFT) and Table 3 (DMFS). Many of the top loci for the two phenotypes overlapped (rs6947348, rs12171500, chr3:194035416, rs12488352, rs1003652). GREAT regulatory analysis results are available in the Appendix.

Comparison with Caucasian caries GWAS
Results of the tests for heterogeneity between African Americans and Caucasians are listed in Table 6. Significant (p-value ≤ 5 × 10⁻⁸) heterogeneity in effects between racial groups was observed for 50% of the loci in children, and 12-18% of loci in adults.

Discussion
Dental caries is a complex disease that disproportionately affects certain groups, including African Americans. This is one of few studies of the genetics of dental caries to specifically investigate African Americans. The purpose of this pilot study was to perform preliminary GWAS scans in African American children and adults and to contrast the evidence for genetic association between African Americans and Caucasians.
Though no significant associations were observed (which was expected given the small sample sizes), several suggestive loci showed strong evidence of genetic heterogeneity between African Americans and Caucasians. These findings suggest that the genetic architecture of dental caries differs across racial groups. Thus, gene-mapping efforts in African American and other minority racial groups are warranted, and may lead to the discovery of caries risk loci that would go undetected by studying Caucasians alone.
Several suggestive loci harboring genes with putative connections to caries were observed. Given the exploratory nature of this study, we describe suggestive hits to potentially help inform new hypotheses about caries genetics. We caution that these suggestive loci should be interpreted with much skepticism.
LGR4: Required for sequential development of molars [66].
AIRE: Mutations cause autoimmune polyendocrinopathy candidiasis-ectodermal dystrophy, a feature of which can be dental abnormalities [72].
TRPM2: Encodes an ion channel whose expression is increased in dental pulpitis. TRPM2 is activated in cancer radiation treatments to suppress Ca2+ signaling required for saliva production [73].
TSPEAR: Mutations affect Notch signaling and cause an ectodermal dysplasia with features including hypodontia [74].
Loci associated with caries, and genes within ±500 kb of the GWAS signal that have supporting evidence for a putative role in dental caries. Shown are lead SNPs of all loci meeting suggestive significance (p-value < 5 × 10⁻⁶), their effect size in the Caucasian cohort, and the heterogeneity test p-value. Loci associated in the African American cohort but not found in the Caucasian GWAS do not have values in the Caucasian cohort columns. Note: not all genes near the GWAS signal are listed.
S100Z: Upregulated as part of ameloblastoma signature [54].
SNORA47: Upregulated as part of ameloblastoma signature [54].
CRHBP: One of the most up-regulated genes in deciduous tooth pulp, as compared to that of permanent teeth [55].
[76], which can have oral manifestations [77].
NOD1: Innate immunity gene expressed by dental pulp fibroblasts in the recognition of invaded caries-related bacteria and the subsequent innate immune responses [78]; gene product mediates sensing of periodontal pathogens [79], including P. gingivalis [80]. Required for the bone resorption consequences of immune activation by commensal bacteria in a model of periodontitis [81].
CD1D: Gene product mediates mucosal immunity [90].
CD1C: Expressed in gingival environment on dendritic cells [92]. Locus contains clusters of OR6 and OR10 olfactory receptor family members [93].
PYHIN1: Involved in inflammasome activation in host response to pathogens [94]. Asthma susceptibility locus specific to African American ancestry [95].

RARB: Likely targeted by miRNAs involved in tooth morphogenesis and differentiation of dental cells [96]. Upregulated in ameloblastoma [53]. Has increased methylation in, and is associated with, head and neck squamous cell carcinoma, which is associated with dental hygiene and inflammation due to microbial factors [97].
VSTM2A: Exhibits high expression in mandibular molars relative to incisors [108].
EGFR: EGF receptors are found on the dental follicle, alveolar bone, and ameloblasts before and during tooth eruption [109,110]. EGFR is a biomarker for neoplastic potential of dysplastic oral tissues [111]. Product mediates proliferation of gingival fibroblasts [112].

RARB: Likely targeted by miRNAs involved in tooth morphogenesis and differentiation of dental cells [96]. Upregulated in ameloblastoma [53]. Has increased methylation in, and is associated with, head and neck squamous cell carcinoma, which, in turn, is associated with dental hygiene and inflammation due to microbial factors [97].
KCTD15: Associated with obesity and preference for carbohydrates [130].
Several genes related to the immune response and periodontal disease were identified. HES1 (chr3:194035416) encodes a transcription factor with roles in antimicrobial response within epithelial cells [49]. NOD1 (rs66691214; p-value 7.24 × 10⁻⁷) encodes a dental pulp protein with roles in sensing caries-related [78] and periodontal pathogens [79,80], and the subsequent immune response [78,81]. Protein products of several genes are involved in innate immunity [64,88] (SIGLEC9, CD33; rs4801855; p-value 3.24 × 10⁻⁶ and SLC5A12; rs7107282; p-value 3.21 × 10⁻⁶). PTGER3 (rs74086974; p-value 3.18 × 10⁻⁶) is a candidate gene for the outcome of periodontal disease therapy [38], and MIR186 (rs74086974) is differentially expressed between gingiva in health versus periodontitis [41]. The locus at rs28503910 (p-value 4.84 × 10⁻⁶) contained MIR1305, which is upregulated in response to smoking and may impair regeneration of periodontal tissues in that state [52]. TRPM2 (rs2838538; p-value 4.34 × 10⁻⁶) encodes an ion channel upregulated in dental pulpitis [137], and is involved in saliva production [138].
The locus rs2515501 (p-value 4.54 × 10⁻⁶) harbored several members of the alpha and beta defensin family of antimicrobial peptides [141], which are involved in chronic periodontal inflammation [116] and oral carcinogenesis [117]. Of note, this locus contains DEFB1, polymorphisms in which are associated with a > 5-fold increase in DMFT and DMFS scores [114], and general DMFT index [115]. An additional gene at this locus, ANGPT2, is also associated with oral cancer, and upregulated in response to P. gingivalis, a periodontal pathogen [113].
Three separate associated loci harbored genes associated with complex periodontal traits, proxies for different subgroups of periodontal disease, a condition closely associated with dental caries [142]. rs1235058 (p-value 3.14 × 10⁻⁶) harbored HPVC1, a candidate gene for a trait involving a mixed infection bacterial community [107]. rs7630386 (p-value 9.51 × 10⁻⁷) harbored RBMS3, a candidate gene for a trait involving a high periodontal pathogen load [107]. Thirdly, rs17606253 (p-value 1.85 × 10⁻⁶) harbored TRAF3IP2, a protein implicated in mucosal immunity and IL-17 signaling, and associated with a trait involving high levels of A. actinomycetemcomitans and a profile of aggressive periodontal disease [107].
Two loci were found to be related to asthma, a disease associated with a doubled risk of caries [143]. rs12125935 (p-value 2.78 × 10⁻⁶) harbors PYHIN1, which encodes a protein involved in inflammasome activation in response to pathogens [94], and represents an asthma susceptibility locus specific to African American ancestry [95]. rs11741099 (p-value 2.93 × 10⁻⁶) is intronic to ADAMTS2; the ADAMTS protein family is proposed to play a role in asthma [105]. Additionally, homozygous mutations in ADAMTS2 cause Ehlers-Danlos syndrome (VIIC), features of which can include multiple tooth agenesis and dentin defects [104]. rs7174369 (p-value 1.72 × 10⁻⁶) harbored IGF1R, involved in dental fibroblast apoptosis [127]. Interestingly, in addition to its receptor, the regulator of hard dental tissue encoded by IGF1 was also associated at a separate locus (rs79812076; p-value 2.17 × 10⁻⁶).

Comparison between association results across dentition type and across races
Aside from TUFT1 and DEFB1, the loci reported here have not been associated with dental caries in previous studies, which have largely comprised Caucasian individuals. This is in line with previous research showing differences in frequencies of risk alleles for complex disease across races, but may also be because the study was underpowered to detect associated loci in African Americans. In addition, no overlap was found in associated loci between this study and a multi-ethnic pilot GWAS of early childhood caries [144]. There was no overlap in loci associated with primary and permanent caries indices, but this might be expected given that the genetic determinants of caries are thought to largely differ between the dentitions [6]. However, we cannot rule out similarities in genetic determinants across dentitions because this pilot study was not designed to have sufficient power for this purpose.
Loci displaying significant heterogeneity between African Americans and Caucasians (Table 6) in permanent dentition were largely ones in gene deserts with unknown function. One locus (rs12171500; DMFT Q statistic [Q] p-value 6.46 × 10⁻¹⁰; DMFS Q p-value 3.37 × 10⁻¹²) contained genes involved in enamel and tooth development.
Among loci displaying significant heterogeneity in primary dentition, there were several that harbored genes related to periodontitis. Such loci represented genes related to periodontal inflammation (rs2515501; Q p-value 4.39 × 10⁻¹⁰), gingival healing (rs9915753; dft Q p-value 1.81 × 10⁻⁷, dfs Q p-value 1.47 × 10⁻¹⁰), and aggressive periodontal disease and high levels of oral A. actinomycetemcomitans (rs17606253; Q p-value 1.41 × 10⁻⁹). Notably, African American pre-teens are approximately 16 times as likely as Caucasian pre-teens to have localized aggressive periodontitis, and detection of A. actinomycetemcomitans is associated with early surrogates for periodontal inflammation in African American preadolescents [145].
Several broad categories of genes associated with caries in African Americans emerged, including those involved in tooth/enamel development, those causing single-gene disorders with craniofacial or dental malformations, those involved in immune response or periodontitis, those related to salivary glands and proteins, and those associated with obesity. These results support the known multifactorial nature of dental caries [21]. Further studies will be necessary to confirm the loci nominated in this pilot study. Nevertheless, these GWASs provide valuable insight into the differences in the genetic architecture of caries across populations, and suggest new candidate genes worth following up in hypothesis-driven studies.

Study limitations
This study has limitations, including the genotyping platform, which was not optimized for genomic coverage of the African American population [146,147]. Thus, studies in larger African American cohorts and with denser chips are needed to identify risk loci that may not have been well represented in this study. The ascertainment of caries was limited by the lack of X-ray examination to confirm white spots and approximal tooth surface caries, which may have led to underestimation of the true caries counts. Imprecision in the caries assessment would lower the power to detect association, but would not result in false positive associations. Therefore, the associations observed in this study would likely not be influenced by this limitation, but other true associations may have gone undetected. The pediatric cohort analyses were somewhat limited in that the primary caries indices (dfs/dft) were tested for genetic association in a sample that included some children with mixed dentition. Limiting the scope of the pediatric analyses to solely primary dentition caries indices allows for simplified interpretation of the association results because genetic determinants of primary and permanent tooth caries have been found to differ [6]. However, assessing dfs/dft scores in the mixed dentition provides an incomplete picture of the caries experience in the primary dentition, given the exfoliation of some teeth. This is another important source of measurement error, which would bias our analysis toward the null hypothesis of no association.

Conclusions
In summary, these results suggest that there may be genetic differences in caries susceptibility, and potentially differing genetic etiologies or differentially distributed genetic risk factors, across racial groups. Indeed, addressing the oral health disparity gap is a national priority according to both the US Surgeon General's Oral Health in America report [12] and the Healthy People 2020 public health goal framework [148]. This oral health disparity has parallels in the research sphere: relatively little work, to date, has been done on the genetics of caries in African Americans. Furthermore, African Americans represent a segment of the population traditionally underrepresented in biomedical research (UBR), and the importance of including such groups in research is recognized as foundational to the future of precision medicine by the National Institutes of Health initiative, All of Us [149]. Larger gene-mapping studies are thus needed in this population to help alleviate its disproportionate burden of the disease.