Identifying novel genetic variants for brain amyloid deposition: a genome-wide association study in the Korean population

Genome-wide association studies (GWAS) have identified a number of genetic variants for Alzheimer’s disease (AD). However, most GWAS were conducted in individuals of European ancestry, and non-European populations are still underrepresented in genetic discovery efforts. Here, we performed a GWAS to identify single nucleotide polymorphisms (SNPs) associated with amyloid β (Aβ) positivity using a large sample of the Korean population. A total of 1474 participants of Korean ancestry were recruited from multiple centers in South Korea. The discovery dataset consisted of 1190 participants (383 cognitively unimpaired [CU], 330 with amnestic mild cognitive impairment [aMCI], and 477 with AD dementia [ADD]) and the replication dataset consisted of 284 participants (46 CU, 167 with aMCI, and 71 with ADD). GWAS was conducted to identify SNPs associated with Aβ positivity (measured by amyloid positron emission tomography). Aβ prediction models were developed using the identified SNPs. Furthermore, bioinformatics analysis was conducted for the identified SNPs. In addition to APOE, we identified nine SNPs on chromosome 7, which were associated with a decreased risk of Aβ positivity at a genome-wide suggestive level. Of these nine SNPs, four novel SNPs (rs73375428, rs2903923, rs3828947, and rs11983537) were associated with a decreased risk of Aβ positivity (p < 0.05) in the replication dataset. In a meta-analysis, two SNPs (rs73375428 and rs2903923) reached a genome-wide significant level (p < 5.0 × 10⁻⁸). Prediction performance for Aβ positivity increased when rs73375428 was incorporated (area under curve = 0.75; 95% CI = 0.74–0.76) in addition to clinical factors and APOE genotype. Cis-eQTL analysis demonstrated that rs73375428 was associated with decreased expression levels of FGL2 in the brain. The novel genetic variants associated with FGL2 decreased the risk of Aβ positivity in the Korean population. This finding may provide a candidate therapeutic target for AD, highlighting the importance of genetic studies in diverse populations.


Keywords: Alzheimer's disease, Amyloid-beta, Genome-wide association studies, Positron emission tomography

Background
Genetic factors play an important role in the pathogenesis of Alzheimer's disease (AD) because heritability is estimated to be 58%-79% [1]. In addition to APOE ɛ4, recent genome-wide association studies (GWAS) have discovered a number of genetic risk variants for AD [2,3]. However, a large proportion of AD heritability is still unexplained.
Accumulation of amyloid-beta (Aβ) in the brain is the earliest pathogenic process in AD, followed by tau deposition, neurodegeneration, and cognitive impairment [4]. Therefore, detecting individuals with Aβ deposition is of utmost importance for the prevention and early treatment of AD [5]. Previous studies have evaluated the genetic basis of Aβ deposition using positron emission tomography (PET) imaging [6][7][8][9][10] and identified several novel Aβ-associated genetic variants outside the APOE region in individuals of European ancestry [11]. However, as each ancestry has a distinct genetic background, replication of the novel genetic findings in different populations is challenging. A number of previous studies failed to replicate European GWAS findings in other ethnic populations [12][13][14][15]. Furthermore, it should be noted that most previous GWAS were conducted in individuals of European ancestry, and non-European populations are underrepresented in genetic discovery efforts [16][17][18].
In this study, using a large sample of the Korean population, we conducted a GWAS to identify single nucleotide polymorphisms (SNPs) associated with Aβ deposition in the brain. We identified novel SNPs for Aβ deposition and demonstrated their associations in an independent cohort of the Korean population. Then, we assessed the topography of Aβ deposition related to the novel SNP. Furthermore, we developed an Aβ prediction model incorporating the novel SNP.

Participants
For the discovery dataset, a total of 1214 participants of Korean ancestry were recruited from 14 referral hospitals in South Korea from January 2013 to July 2019. Among them, 923 participants were recruited from the Samsung Medical Center, 201 participants were recruited from a multicenter study of the Korean Brain Aging Study for the Early Diagnosis and Prediction of AD (KBASE-V) [19], and 90 participants were recruited from a multicenter study of Clinical Research Platform based on Dementia Cohort.
For the replication dataset, we used data from 306 participants of Korean ancestry from the biobank of the Chronic Cerebrovascular Disease consortium, recruited from 2016 to 2018. This was part of the ongoing Biobank Innovation for chronic Cerebrovascular disease With ALZheimer's disease Study (BICWALZS) and the Center for Convergence Research of Neurological Disorders.
For the discovery and replication datasets, we included participants (i) who were diagnosed with amnestic mild cognitive impairment (aMCI), AD dementia (ADD), or were cognitively unimpaired (CU) based on detailed neuropsychological tests [20][21][22], and (ii) who underwent amyloid PET imaging. Participants with aMCI met the following criteria, modified from Petersen's criteria [23]: (i) normal activities of daily living; (ii) objective memory impairment on a verbal or visual memory test, below the 16th percentile of age- and education-matched norms; and (iii) absence of dementia. Those with ADD satisfied the core clinical criteria for probable ADD according to the National Institute of Neurological and Communicative Disorders and Stroke and Alzheimer's Disease and Related Disorders Association [21]. We excluded participants if they had (i) a causative genetic mutation for AD, such as PSEN1, PSEN2, and APP; (ii) structural abnormalities detected on brain MRI, such as severe cerebral ischemia, territorial infarction, or brain tumors; or (iii) other medical or psychiatric diseases that may cause cognitive impairment. All participants provided written informed consent, and the study was approved by the Institutional Review Board of each center.

Genotyping and imputation
Participants were genotyped using the Illumina Asian Screening Array BeadChip (Illumina, CA, USA) for discovery data and Affymetrix customized Korean chips (Affymetrix, CA, USA) for replication data. Only SNP markers were analyzed. We conducted quality control (QC) using PLINK software (version 1.9) [24]. Participants were excluded based on the following criteria: (i) call rate < 95%, (ii) mismatch between reported and genetically inferred sex, (iii) deviation from each population parameter, (iv) excess heterozygosity rate (5 standard deviations from the mean), and (v) relatedness (pairs identified with identity by descent ≥ 0.125) within or between the discovery and replication datasets.
SNPs were excluded based on the following criteria: (i) call rate < 98%, (ii) minor allele frequency (MAF) < 1%, and (iii) a p value < 1.0 × 10⁻⁶ for the Hardy-Weinberg equilibrium test. After QC, genome-wide imputation was performed using the Minimac4 software with all available reference haplotypes from HRC-r1.1 on the University of Michigan Imputation Server [25,26]. For post-imputation QC, we excluded SNPs based on the following criteria: (i) poor imputation quality (r² ≤ 0.8) and (ii) MAF ≤ 1%. Finally, a total of 4,906,407 SNPs were analyzed.
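The pre-imputation SNP filters listed above can be illustrated with a short Python sketch. This is not the PLINK implementation used in the study: the function names and genotype counts are hypothetical, and the Hardy-Weinberg p value is computed here with a 1-degree-of-freedom chi-square approximation, which may differ from the test a genotype QC tool applies.

```python
import math


def hwe_chisq_p(n_aa, n_ab, n_bb):
    """Approximate Hardy-Weinberg p value (chi-square, 1 degree of freedom)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_snp_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.98, min_maf=0.01, min_hwe_p=1.0e-6):
    """Return True if a SNP survives the three thresholds stated in the text."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1.0 - freq_a)
    if call_rate < min_call_rate or maf < min_maf:
        return False
    return hwe_chisq_p(n_aa, n_ab, n_bb) >= min_hwe_p
```

For example, `passes_snp_qc(490, 420, 90, 0)` keeps a SNP whose genotype counts match Hardy-Weinberg proportions exactly, whereas a SNP with no heterozygotes at all, such as `passes_snp_qc(500, 0, 500, 0)`, is rejected by the equilibrium filter.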

Amyloid PET acquisition and image analysis
Amyloid PET images were obtained using a Discovery STE PET/CT scanner (GE Medical Systems, Milwaukee, WI, USA). PET images were acquired for 20 min, starting at 90 min after intravenous injection of either ¹⁸F-florbetaben or ¹⁸F-flutemetamol. Aβ positivity or negativity was determined by well-trained nuclear physicians using visual assessments for florbetaben and flutemetamol PET [27,28]. Briefly, positivity for tracer uptake was assessed in four cortical regions (lateral temporal, frontal, parietal, and posterior cingulate cortices) for florbetaben PET and five regions (lateral temporal, frontal, parietal, posterior cingulate cortices, and striatum) for flutemetamol PET. Amyloid PET positivity was defined as having at least one region with evidence of positive uptake.
A subset of participants in the discovery cohort (n = 824) and the replication cohort (n = 260) had amyloid PET data available for PET image analysis. For PET image analysis, we performed the following preprocessing using Statistical Parametric Mapping software 12 (SPM, http://www.fil.ion.ucl.ac.uk/spm) running on MATLAB (MathWorks 2014b): (1) co-registration of PET to T1-weighted structural MRI, (2) structural MRI segmentation and calculation of the transformation matrix, (3) normalization of PET to Montreal Neurological Institute (MNI) space, and (4) spatial smoothing with a Gaussian kernel of 8-mm full width at half maximum. To calculate the standardized uptake value ratio (SUVR) for each PET image, we used tracer-specific reference regions (the cerebellar cortex for florbetaben and the pons for flutemetamol). The masks of reference regions were obtained from the GAAIN website (http://www.GAAIN.org).
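As a minimal sketch of the SUVR step described above, the ratio can be computed as the mean uptake in a target region divided by the mean uptake in the tracer-specific reference region. The function names and voxel values below are hypothetical; in practice the voxel lists would come from the normalized PET image and the reference-region masks, and this is not the authors' MATLAB pipeline.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("region contains no voxels")
    return sum(values) / len(values)


def suvr(target_voxels, reference_voxels):
    """Mean target uptake divided by mean reference-region uptake."""
    reference_mean = mean(reference_voxels)
    if reference_mean <= 0:
        raise ValueError("reference uptake must be positive")
    return mean(target_voxels) / reference_mean


def suvr_image(image_voxels, reference_voxels):
    """Scale every voxel by the reference mean to obtain a voxel-wise SUVR map."""
    reference_mean = mean(reference_voxels)
    if reference_mean <= 0:
        raise ValueError("reference uptake must be positive")
    return [v / reference_mean for v in image_voxels]
```

A call such as `suvr([1.2, 1.8], [1.0, 1.0])` gives a regional SUVR of 1.5 for these made-up values.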

Statistical analysis
GWAS analysis
Logistic regression analysis was performed to determine the association between SNPs and Aβ positivity, controlling for age, sex, and the first three principal components (PCs) of genetic ancestry, expressed as Aβ positivity = β₀ + β₁ age + β₂ sex + β₃ PC₁ + β₄ PC₂ + β₅ PC₃ + β₆ SNP, where the SNP term follows an additive model (coded as 0, 1, or 2 according to the number of minor alleles). Reported p values were two-tailed, and we defined a p value less than 5.0 × 10⁻⁸ as statistically significant and less than 1.0 × 10⁻⁵ or 1.0 × 10⁻⁶ as statistically suggestive based on previous studies [29][30][31]. We assessed genomic inflation according to a previous study [32]. For the replication analysis, reported p values were two-tailed, and a p value less than 0.05 was considered statistically significant. Furthermore, considering the small size of the replication dataset, we performed a permutation test to infer the statistical significance of SNPs from the null distribution. We recalculated the t values of SNPs from logistic regression analysis of randomly shuffled Aβ positivity (10,000 permutations). We then calculated the fraction of permutations that showed a more significant association than the observed t values of SNPs derived from the original dataset.
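The permutation procedure described above can be sketched in Python as follows. To keep the example self-contained, it replaces the covariate-adjusted logistic regression t value with a simpler stand-in statistic (the difference in mean minor-allele count between Aβ-positive and Aβ-negative participants); the function names are hypothetical and the sketch illustrates only the shuffling logic, not the authors' implementation.

```python
import random


def mean_dosage_difference(dosages, status):
    """Mean minor-allele count in Aβ-positive minus Aβ-negative participants."""
    pos = [d for d, s in zip(dosages, status) if s == 1]
    neg = [d for d, s in zip(dosages, status) if s == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)


def permutation_p_value(dosages, status, n_permutations=10000, seed=0):
    """Fraction of label shuffles at least as extreme as the observed statistic.

    Assumes both outcome classes are present. The statistic is a stand-in for
    the regression t value used in the study and ignores covariates.
    """
    rng = random.Random(seed)
    observed = abs(mean_dosage_difference(dosages, status))
    shuffled = list(status)
    n_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any SNP-outcome link
        if abs(mean_dosage_difference(dosages, shuffled)) >= observed:
            n_extreme += 1
    return n_extreme / n_permutations
```

Shuffling the outcome labels keeps the number of Aβ-positive and Aβ-negative participants fixed, so the null distribution reflects the same sample composition as the observed data.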
To check if SNPs were associated with Aβ positivity independent of APOE genotype, we performed a conditional analysis by further adjusting for APOE genotype. We also performed a p value based meta-analysis and calculated the summary effect size by averaging the study specific effect sizes, with weights reflecting the standard errors from the study specific effect sizes.
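One common way to obtain a summary effect size weighted by study-specific standard errors is fixed-effect inverse-variance weighting. The sketch below assumes that weighting scheme, which the text does not state explicitly, and derives a two-sided p value from the pooled z score; the function name and all input numbers are hypothetical and are not results from this study.

```python
import math


def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect meta-analysis of per-study log odds ratios.

    Returns (pooled_beta, pooled_se, z, two_sided_p).
    Weights of 1 / SE^2 are an assumption of this sketch.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical discovery and replication estimates (log odds ratios).
beta, se, z, p = inverse_variance_meta([-0.60, -0.55], [0.12, 0.25])
odds_ratio = math.exp(beta)
```

Because the weights grow as standard errors shrink, the larger discovery dataset dominates the pooled estimate, which is why a modest replication sample can still push a suggestive signal past the genome-wide threshold.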

Effects of the newly identified SNPs
After identifying associated SNPs, we calculated the risk of the identified SNPs on Aβ deposition in all participants and at each cognitive level (CU, aMCI, and ADD). We also examined whether Aβ-associated SNPs were associated with ADD risk in CU and ADD participants using the following logistic model: ADD = β₀ + β₁ age + β₂ sex + β₃ education + β₄ identified SNPs.
Next, using the previously reported cut-off values for Aβ positivity (SUVR 0.6 for flutemetamol [33], and SUVR 1.4 for florbetaben [34]), we also performed logistic regression to evaluate whether the identified SNPs were associated with Aβ deposition based on SUVR cutoff values.
Furthermore, we performed voxel-wise PET image analysis to determine which regional Aβ deposition is associated with SNPs after adjusting for the effects of age, sex, genetic PCs, APOE genotype, and PET tracer type. T-statistic maps were thresholded at p < 0.001 with cluster size > 20 when uncorrected for multiple tests, or at p < 0.05 when corrected for multiple tests using the family-wise error rate.
To test the clinical utility of the newly identified SNPs, we developed multivariable logistic models to predict Aβ positivity in each individual. To evaluate the performance of the logistic model, we measured the area under the curve (AUC) from the receiver operating characteristic curve analysis. For internal validation, we conducted a 10-fold cross-validation with 100 repeats using the discovery data. We reported the mean AUC with 95% confidence interval (CI) of the model. As an external validation, parameters estimated from the discovery data were used to test the Aβ prediction performance in the replication data. We used R software (http://www.r-project.org) and MATLAB for the statistical analyses and results visualization.
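The evaluation scheme above can be sketched with two small helpers: one computing the AUC as the probability that a randomly chosen Aβ-positive participant receives a higher predicted score than a randomly chosen Aβ-negative participant, and one generating repeated k-fold splits. Both function names are hypothetical, the splits are not stratified by outcome, and model fitting is left out; this is an illustration rather than the authors' R or MATLAB code.

```python
import random


def roc_auc(labels, scores):
    """AUC via pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def repeated_kfold_indices(n_samples, n_folds=10, n_repeats=100, seed=0):
    """Yield (train_indices, test_indices) for every fold of every repeat."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # new random partition for each repeat
        for k in range(n_folds):
            test = order[k::n_folds]
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield train, test
```

With the defaults, the generator produces 1000 train/test splits; fitting the model on each training set and averaging the test-set AUCs gives the kind of cross-validated estimate reported in the text.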
Finally, we characterized the function of the identified SNPs by leveraging bioinformatic tools and previously reported results. First, we checked whether the MAF of SNPs in our data was similar to that in the East Asian population using the 1000 Genomes Project dataset [35]. To evaluate the genotype-specific expression of identified SNPs in human brain tissues, we performed cis-expression quantitative trait loci (cis-eQTL) analysis through the Genotype-Tissue Expression portal (https://gtexportal.org) [36]. We reported genes that showed significant expression changes in the brain tissues (p < 0.05).

Participants
After QC of genotype data, a total of 1190 (383 CU, 330 aMCI, and 477 ADD) and 284 participants (46 CU, 167 aMCI, and 71 ADD) remained available for the discovery and replication data, respectively. Table 1 shows the baseline demographics for the two datasets (discovery and replication data).
Of the four SNPs, rs11983537 was genotyped while the remaining were imputed. Imputation qualities of the identified SNPs were high (mean r² 0.97 ± 0.02). Of note, two of the four SNPs (rs73375428 and rs2903923) showed genome-wide significant associations (p < 5.0 × 10⁻⁸) in the meta-analysis of the discovery and replication datasets (Table 2). When we adjusted for the effect of the APOE ɛ4 allele, all four SNPs were associated with Aβ positivity in the replication dataset (p < 0.05) (Table 2). Since the four identified SNPs showed high linkage disequilibrium (mean r² 0.95 ± 0.05) with each other, we selected rs73375428 for subsequent analyses because it showed the most significant association in the primary analysis of the discovery dataset.

Effects of the newly identified SNPs
In the logistic model, the APOE ɛ4 allele was associated with a 5-fold higher risk of Aβ positivity (odds ratio [OR] = 5.330; 95% CI = 4.188-6.788; p < 0.001) and rs73375428 was associated with a 2-fold lower risk of Aβ positivity (OR = 0.519; 95% CI = 0.404-0.666; p < 0.001). When we adjusted the effect of diagnosis (CU, aMCI, and ADD), the effect of rs73375428 remained significant (OR = 0.556; 95% CI = 0.406-0.666; p < 0.001). In the subgroup analysis, the association of rs73375428 with Aβ positivity was significant in the CU and aMCI groups but not in the ADD group, while the association of APOE ɛ4 was significant across all cognitive states (Table 3). When we defined Aβ positivity based on SUVR, rs73375428 was also associated with a decreased risk of Aβ positivity in both discovery (OR = 0.608; 95% CI = 0.523-0.707; p < 0.001) and  (Table S3).
We developed prediction models to test the clinical utility of the APOE ɛ4 allele and newly identified SNP (rs73375428) in predicting Aβ positivity. In the 10-fold cross-validation with 100 repetitions, the model (model 1) including only clinical factors (age, sex, and level of education) showed an AUC of 0.506 (95% CI = 0.500-0.512). After incorporating the APOE ɛ4 allele in the model (model 2), the prediction performance significantly increased (AUC = 0.723; 95% CI = 0.717-0.729). Moreover, when the model included rs73375428 (model 3), the prediction performance further increased (AUC = 0.749; 95% CI = 0.743-0.755) (Fig. 3). When each model, trained in the discovery data, was tested in the replication data, the highest AUC was also observed in the model including both APOE ɛ4 and rs73375428 (model 1 AUC = 0.509, model 2 AUC = 0.693, model 3 AUC = 0.714).

Discussion
We performed a GWAS to identify genetic factors associated with Aβ deposition in the brain using the largest amyloid PET imaging and GWAS data collected from multiple centers in South Korea. We identified four novel SNPs (rs73375428, rs2903923, rs3828947, and rs11983537) on chromosome 7, which were associated with a decreased risk of Aβ positivity in the brain at the suggestive level (< 1.0 × 10⁻⁶). These associations were also observed in the independent cohort (p < 0.05). Having a minor allele in rs73375428 (G) was associated with a 2-fold decreased risk of Aβ positivity (OR = 0.519) and decreased Aβ deposition in the precuneus, lateral parietal, and medial frontal areas. Incorporating rs73375428, in addition to age, sex, education, and APOE ɛ4, better predicted Aβ positivity. The minor allele of rs73375428 was associated with decreased expression levels of FGL2 in the brain.
We identified four novel SNPs (rs73375428, rs2903923, rs3828947, and rs11983537) associated with a decreased risk of Aβ positivity in the brain. In the discovery dataset, nine SNPs showed genome-wide suggestive significance (< 1.0 × 10⁻⁶), of which four SNPs were associated with a decreased risk of Aβ positivity (p < 0.05) in an independent cohort. Although the significance of the four novel SNPs was at the suggestive level, meta-analysis of the discovery and replication datasets showed that two SNPs (rs73375428 and rs2903923) reached a genome-wide significance level (p < 5.0 × 10⁻⁸). Furthermore, the obtained OR of rs73375428 for Aβ positivity was 0.519, which was strong compared with the ORs of previously reported Aβ- or ADD-associated SNPs (Aβ-associated SNPs OR from 0.84 to 1.2 [13]). In our cohort, about 30% of CU participants carried one or more minor alleles in rs73375428 (MAF of 0.160). This is in accordance with the previously reported MAF of rs73375428 in the East Asian population (MAF of 0.131) [35], which indicates that the samples used in this study were not biased and may reflect the East Asian population. In the subgroup analysis, the identified SNP (rs73375428) decreased the risk of Aβ positivity in the CU and aMCI groups but not in the ADD group. This finding may suggest that, over the course of the AD spectrum, the effect of rs73375428 diminishes in the dementia stage.
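The relation between the reported MAF and the proportion of minor-allele carriers can be checked with a short calculation. The sketch below assumes Hardy-Weinberg proportions, under which the expected fraction of participants carrying at least one minor allele is 1 − (1 − MAF)²; the function name is hypothetical, and the two MAF values are those quoted in the text.

```python
def carrier_fraction(maf):
    """Expected fraction with at least one minor allele under Hardy-Weinberg."""
    if not 0.0 <= maf <= 1.0:
        raise ValueError("MAF must lie between 0 and 1")
    return 1.0 - (1.0 - maf) ** 2


cohort_carriers = carrier_fraction(0.160)      # MAF reported for this cohort
east_asian_carriers = carrier_fraction(0.131)  # MAF reported for East Asians [35]
print(round(cohort_carriers, 4))      # 0.2944
print(round(east_asian_carriers, 4))  # 0.2448
```

A MAF of 0.160 therefore corresponds to roughly 29% expected carriers, consistent with the figure of about 30% given for CU participants.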
Further imaging analysis and prediction model for Aβ positivity showed consistent results. PET image analysis showed that the participants with minor allele in rs73375428 had less Aβ deposition in the precuneus, lateral parietal, and medial frontal areas. These areas are part of the default mode network, typical regions where Aβ deposits in AD [37]. Identifying patients with Aβ deposition is of the utmost importance in predicting the prognosis and selecting patients for clinical trials of anti-Aβ therapy [38]. Currently available diagnostic tools for measuring Aβ are either invasive (cerebrospinal fluid examination) or expensive (PET), hampering their widespread application in clinical practice [39]. We demonstrated that genetic data (APOE ɛ4 and rs73375428) obtained from blood samples with clinical information could predict Aβ positivity with an AUC of 0.749. Furthermore, we demonstrated that the prediction performance improved when rs73375428 was included in the model in addition to age, sex, and APOE ɛ4, suggesting the clinical utility of rs73375428.
The identified SNPs were associated with decreased expression of FGL2 in the brain cortex. Although further specific biological mechanistic studies are required, this result suggests that FGL2 may be a possible link between rs73375428 and decreased Aβ deposition in the brain. FGL2 is a membrane-bound or secreted protein expressed by immune cells that has either coagulation activity [40,41] or immune-suppressive functions [42,43]. A previous study demonstrated that FGL2 expression is associated with brain tumor progression through the immune system [44]. FGL2 has also been associated with AD. One prior study demonstrated that when human microglia were exposed to Aβ peptide, FGL2 expression in microglia was reduced more than six-fold as an inflammatory response to Aβ peptide [45]. Furthermore, Taguchi et al. obtained brain samples from both patients with AD and controls from a Japanese population and demonstrated that FGL2 was upregulated in the AD hippocampus as compared to controls [46]. Given these previous observations, we speculated that participants with minor alleles of rs73375428 could have a reduced risk of Aβ deposition in the brain through decreased expression of FGL2, which reflects the reactive inflammatory response (e.g., Aβ clearance) to Aβ peptide. More functional studies are necessary to elucidate the role of FGL2 in AD pathogenesis.
Our results showed some evidence for ethnic similarities and differences in genetic variants associated with Aβ. As expected, variants in the APOE locus exhibited a significant association with Aβ deposition in the brain, confirming that the APOE variants are important risk factors for AD across various ethnicities [47]. However, there were some ethnic differences. We observed a stronger effect of the variant in APOE (rs429358) on Aβ positivity in the Korean population than that in the European population (Korean, OR = 5.275; European, OR = 1.197 [11]). This is similar to the results in previous studies of the East Asian population, in which the effect of APOE ɛ4 on AD risk was stronger in Han Chinese [48] and Japanese [47] than in the European population. Furthermore, outside the APOE locus, Aβ-associated SNPs previously reported in European ancestry data [11] were not replicated in our cohort. Ethnic differences in the effect size and significance might be attributed to the differences in allele frequency and LD pattern across different populations [12]. Indeed, we observed heterogeneity in the allele frequency between the European and Korean cohorts (Table S4). Furthermore, epigenomic patterns, lifestyle, educational attainment, and other non-genetic factors may also account for differences across populations. However, it should be noted that the lack of replication might also be a result of the insufficient sample size of our cohort. Nevertheless, these findings suggest that discoveries from GWAS in one population may not be applicable to other populations. Therefore, continued efforts in population-specific and trans-ethnic studies are necessary to accurately discover genetic risk variants.

Limitations
This study has several limitations. First, the statistical significance of the novel SNP was at the genome-wide suggestive level, and the sample size of the replication dataset was small. Furthermore, although associations between four SNPs and Aβ (p < 0.05) were found in the independent dataset, the statistical significance disappeared after correction for multiple tests of nine SNPs. However, our study might present true findings for the following reasons: (i) nine suggestive SNPs at a more conservative p value (< 1.0 × 10⁻⁶) showed high LD with each other, which might reduce the number of independent tests to one; (ii) the permutation test of the four SNPs showed that if the null hypothesis was true, the chance of observing our findings would be extremely small for a given sample size; (iii) two SNPs (rs73375428 and rs2903923) showed genome-wide significant associations in the meta-analysis; and (iv) the biological relevance of FGL2 association with the identified SNPs in the brain tissue suggests a potential AD-associated gene. Nevertheless, our findings should be interpreted with caution and replicated in larger independent datasets. Second, imputation was performed using a large reference panel of mixed populations rather than the Korean population. However, we conducted a strict post-imputation QC, excluding SNPs with poor imputation quality (r² ≤ 0.8) or low frequency (MAF < 1%). As a result, the imputation qualities of the identified SNPs were high (mean r² 0.97 ± 0.02). Third, the cis-eQTL dataset was obtained from healthy populations and not from subjects with AD. Furthermore, the causality of the identified SNPs and FGL2 expression could not be evaluated in the current analysis. Functional studies using gene editing are necessary to determine the association between the identified SNPs and FGL2. Fourth, GWAS was conducted using Aβ positivity determined by visual assessment, not by quantitative Aβ SUVR. Since this study was conducted using large data obtained from multiple cohorts, some data were not available for SUVR analysis. However, the visual assessment of Aβ positivity has high correlations with histopathological findings of Aβ deposition in the brain [49,50], and it is more widely used in clinical practice.

Conclusions
We identified novel SNPs that reduce the risk of Aβ deposition in the brain and suggested a possible role of FGL2 in AD pathogenesis. This finding may provide a candidate therapeutic target for AD, highlighting the importance of genetic studies in diverse populations.
Additional file 2: Figure S1. Histogram of t-values obtained from the permutations. Red dotted lines indicate the lowest 5% of the 10,000 permutations. Red arrows indicate the observed t-value obtained from the original dataset.

Availability of data and materials
The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.