Joint effects of common genetic variants from multiple genes and pathways on the risk of premature coronary artery disease
Objective
The aim of this study is to discover common variants in 6 lipid metabolic genes and construct and validate a genetic risk score (GRS) based on the joint effects of genetic variants in multiple genes from lipid and other pathobiologic pathways.
Background
The genetic basis of coronary artery disease (CAD) remains incompletely explained. Discovering and aggregating genetic variants from multiple pathways may help to close this gap.
Methods
Premature CAD cases (n = 1,947) and CAD-free controls (n = 1,036) were selected from our angiographic registry. In a discovery phase, single nucleotide polymorphisms (SNPs) at 56 loci from internal discovery and external reports were tested for associations with biomarkers and CAD: 28 promising SNPs were then tested jointly for CAD associations, and a GRS consisting of SNPs contributing independently was constructed and validated in a replication set of familial cases and population-based controls (n = 1,320).
Results
Five variants contributed jointly to CAD prediction in a multigenic GRS model: odds ratio (OR) 1.24 (95% CI 1.16-1.33) per risk allele, P = 8.2 × 10⁻¹¹; adjusted OR 2.03 (1.53-2.70) for the fourth versus first quartile. The 5-SNP GRS (GRS5) had only a minor impact on the area under the receiver operating characteristic curve (P > .05) but resulted in substantial net reclassification improvement: 0.16 overall and 0.28 in intermediate-risk patients (both P < .0001). GRS5 predicted familial CAD with similar magnitude in the validation set.
Conclusions
The Intermountain Healthcare's Coronary Genetics study demonstrates the ability of a multigenic, multipathway GRS to improve discrimination of angiographic CAD. Genetic risk scores promise to increase understanding of the genetic basis of CAD and improve identification of individuals at increased CAD risk.
Much of the genetic basis of coronary heart disease (CHD) remains to be discovered.1 However, steady progress is occurring using both candidate gene and genome-wide association studies (GWAS).2, 3 Insights into pathobiologic pathways have guided candidate gene testing,2, 4 whereas GWAS makes no a priori assumptions about genetic site and provides broader, genome-wide coverage.3 We hypothesized that combining discoveries from both of these complementary approaches and extending consideration to multiple pathobiologic pathways could further advance this effort.
To date, the risk attributable to any individual variant has been modest. However, discovering and combining multiple loci with modest effects into a global genetic risk score (GRS) could improve the identification of high-risk populations and improve individual risk assessment.5, 6, 7 Only a limited number of GRS studies for CHD have been reported, and these have focused mostly on lipid-related genes and especially on low-density lipoprotein cholesterol (LDL-C)–related variants.5, 7 However, CHD pathogenesis involves multiple stages and other mechanisms and biopathways, including high-density lipoprotein cholesterol (HDL-C), inflammation, thrombosis, and vascular development.1, 2, 4
Methods
Study objectives
The primary objectives of the Intermountain Healthcare's Coronary Genetics (CorGen) study were (1) to discover all common single nucleotide polymorphisms (SNPs) among a set of 6 key genes in the reverse cholesterol transport system and test them for associations with angiographic coronary artery disease (CAD), and (2) to construct and validate a multivariant GRS, based on the joint effects of these and other (literature reported) genes in lipid and 3 other pathobiologic pathways, to discriminate CAD using precise angiographic phenotyping.
Study participants
Study subjects for the primary association study were selected from Intermountain Healthcare's ongoing Angiographic Registry and DNA Bank.8 This registry is approved by the hospital's institutional review board, and study subjects gave written consent before enrollment. Fasting blood is sampled at the time of angiography; DNA and plasma are extracted and stored; demographic and angiographic information is collected and entered into an electronic database; and patients are followed up prospectively.8, 9
To optimize genetic susceptibility, we studied younger subjects: men aged ≤60 years and women ≤70 years. Approximately 3,000 subjects (∼2,000 CAD cases and ∼1,000 angiographically normal controls, matched 2:1 for sex, age, and date of registry entry) were selected. Clinical diagnoses preceding angiography included stable disease (angina equivalent or other) in 56%, unstable angina in 25%, and acute myocardial infarction (MI) in 19%.
A separate set of cases with highly familial premature CAD (first-degree relative with CHD onset <55 in men, <65 in women) from the University of Utah Cardiovascular Genetics Family Tree Registry1 and a separate set of controls (randomly invited from a public records database) were enrolled as a replication set.10
Study design and selection of genetic variants
The study consisted of 3 discovery phases and a validation phase (Supplementary Figure 1, online Appendix), each consisting of separate, independent datasets. In discovery phase 1 (SNP discovery), 6 genes with key roles in reverse cholesterol transport (cholesteryl ester transfer protein [CETP], hepatic lipase [LIPC], lipoprotein lipase [LPL], lecithin-cholesterol acyltransferase [LCAT], scavenger receptor class B type I [SR-BI], and apolipoprotein F [apoF]) were scanned in 62 volunteers; 81 SNPs were discovered; and all variant genomic segments were sequenced. In discovery phase 2 (haplotype or tagging [t]SNP discovery), a separate set of 339 Euro-American volunteers was tested to establish tagging SNPs (tSNPs) using the Horne and Camp method of principal components analysis (online Appendix Supplementary materials and Refs.11, 12). A total of 38 tSNPs were determined to characterize the linkage disequilibrium groups of the 6 genes. Of these, 10 found to be associated univariably with lipoprotein markers or nominally (P < .2) with CAD in a discovery set of cases (n = 915) and controls (n = 522) (discovery phase 3) were selected for multivariable GRS modeling.
To complement and expand GRS candidates to other pathways and genome-wide, literature studies were reviewed as of April 2008. Eighteen SNPs showing the most robust and independent associations at genome-wide significance (P < 5 × 10⁻⁸) with a CAD-related biomarker (ie, lipids, divided among LDL-C, HDL-C, and TG, or CRP or MCP1) and/or CAD per se (ie, 9p21.3) and diversified among lipid/lipoprotein, thrombosis, inflammation, and vascular function/unassigned pathways were added to the 10 internal candidate SNPs for GRS modeling (Supplementary Figure 1; Supplementary Table I, online Appendix). The risk allele for SNP candidates was designated a priori based on associations with a recognized risk marker (eg, lipid levels) or, preferentially, when reported (for literature SNPs) or angiographically determined (for internal SNP candidates), on associations with CAD directly, before entry into multivariable GRS modeling.
Multivariable GRS modeling using backward logistic regression then was used in an angiographically phenotyped set of 1,918 cases and 1,032 controls to reduce the 28 candidate SNPs from the 2 sources to a set of SNPs contributing jointly to CAD prediction.
In validation testing, the reduced set GRS determined in the discovery set was prospectively tested in a completely independent set of familial CAD cases (n = 312) and population controls (n = 1,008).
Definition of angiographic characteristics and clinical covariables
The presence of CAD was determined by coronary angiographic analysis masked to genetic information. Patients were categorized as free of CAD (no lesions noted angiographically), mild/moderate CAD (ie, most severe lesion <70% stenosis), or severe CAD (ie, at least 1 lesion ≥70% stenosis). Patients with mild/moderate CAD were excluded as indeterminate.
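The three angiographic categories above can be expressed as a small rule. The following is a minimal sketch, assuming each patient is represented by a list of per-lesion percent stenosis values (an empty list meaning no lesions noted); the function names and category labels are hypothetical and are not taken from the study's database.

```python
from typing import Sequence


def classify_cad(stenoses_percent: Sequence[float]) -> str:
    """Assign an angiographic category from per-lesion percent stenosis.

    Thresholds follow the text: no lesions -> CAD-free; most severe
    lesion <70% -> mild/moderate; at least one lesion >=70% -> severe.
    The input representation is an assumption made for illustration.
    """
    if not stenoses_percent:
        return "cad_free"
    worst = max(stenoses_percent)
    return "severe" if worst >= 70 else "mild_moderate"


def included_in_analysis(stenoses_percent: Sequence[float]) -> bool:
    """Mild/moderate CAD was excluded as indeterminate."""
    return classify_cad(stenoses_percent) != "mild_moderate"


if __name__ == "__main__":
    for lesions in ([], [30, 50], [40, 70]):
        print(lesions, classify_cad(lesions), included_in_analysis(lesions))
```

Only the first and last categories enter the case/control comparison, which is what sharpens the phenotype contrast.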
Standard clinical criteria and specific therapies were used to define the presence of diabetes, hypertension, and hyperlipidemia.9 Smoking was defined as current smoking or a >10 pack-year smoking history. Family history was positive if a parent, sibling, or child manifested CAD or MI by age 55 in males or 65 in females.
DNA extraction and genotyping
DNA was extracted from blood samples, quality and quantity were assessed, and genotyping was performed using standard techniques as detailed in Supplementary Materials. Call rates for SNPs in the GRS were 95% to 97% and required to be >90% for all tested SNPs.
Computation of GRSs
Two methods were used to create the multivariable GRS: a simple, unweighted count method (count GRS) and a weighted method (weighted GRS).6, 13 Both methods assumed each SNP to be independently associated with risk. (The independence of SNP associations with CAD/CAD risk markers was recently tested in this dataset and found to be valid.14) An additive genetic model was assumed: weightings of 0, 1, and 2 were given according to the number of risk alleles present.15
The count method assumed that each SNP contributed equally to CAD risk and was calculated by summing the number of risk alleles across the panel of SNPs tested. This produced a score between 0 and twice the number of SNPs, that is, representing the total number of risk alleles. The weighted GRS was calculated by multiplying each β-coefficient for the CAD phenotype from the discovery set by the number of corresponding risk alleles (0, 1, or 2) and then summing the products. The GRS was modeled as a continuous variable and as quartiles.
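The two scoring methods can be illustrated with a short sketch. It assumes the β-coefficients later reported in Table II for the 5 SNPs retained in the final model; the subject's genotype below is hypothetical, and the function names are illustrative rather than the authors' own code.

```python
# Beta coefficients for the CAD phenotype, copied from Table II.
BETAS = {
    "rs599839": 0.223,    # CELSR2
    "rs2383206": 0.191,   # 9p21.3
    "rs289715": 0.217,    # CETP
    "rs78739461": 0.307,  # ApoF
    "rs1799963": 0.474,   # F2
}


def _check(risk_alleles):
    # Additive model: each SNP carries 0, 1, or 2 risk alleles.
    for snp, n in risk_alleles.items():
        if n not in (0, 1, 2):
            raise ValueError(f"{snp}: risk allele count must be 0, 1, or 2")


def count_grs(risk_alleles):
    """Unweighted score: total number of risk alleles across the panel."""
    _check(risk_alleles)
    return sum(risk_alleles.values())


def weighted_grs(risk_alleles, betas=BETAS):
    """Weighted score: sum of beta times risk allele count."""
    _check(risk_alleles)
    return sum(betas[snp] * n for snp, n in risk_alleles.items())


# Hypothetical subject (risk allele counts per SNP), for illustration only.
subject = {
    "rs599839": 2,
    "rs2383206": 1,
    "rs289715": 2,
    "rs78739461": 0,
    "rs1799963": 2,
}

if __name__ == "__main__":
    print("count GRS:", count_grs(subject))
    print("weighted GRS:", round(weighted_grs(subject), 3))
```

For a 5-SNP panel the count score ranges from 0 to 10, matching the "twice the number of SNPs" upper bound described above.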
Statistical analysis
Statistical analyses were performed using SPSS version 15.0 (Chicago, IL). Chi-square tests (Armitage 1 df test-of-trend for additive genetic model) and t tests were used for comparing proportions and means, respectively, between cases and controls. Logistic regression was used to determine the effect of each variant separately and combined on risk for angiographic CAD. Odds ratios (ORs) and 95% CIs are reported for the high- versus low-risk alleles assuming an additive risk model. Multivariable analyses were used to adjust for history of hypertension, hyperlipidemia, smoking, diabetes, family history, ethnicity/race, and body mass index (BMI). Sex and age were matched by design. Power calculations used nQuery Advisor v.4.0 (Statistical Solutions, Saugus, MA). For risk alleles with minor allele frequency ≥10%, the study had 80% power to detect an OR for CAD of ≥1.3 and >90% for OR ≥1.35, for 2-sided alpha of <.05 for 2,000 cases and 1,000 controls. Associations with CAD for individual SNPs were considered significant at P ≤ .05 and in aggregate for GRS models at P ≤ .01.
We plotted receiver-operating characteristic (ROC) curves and calculated areas under the curve (AUC) for logistic regression models including conventional risk factors without and with GRS.16 We classified Framingham risk scores (FRSs) into 4 categories, with intermediate-risk categories defined as 10-year risks of 5% to 9% (low intermediate) and 10% to 19% (high intermediate).17, 18 The potential of GRS to improve individual risk stratification then was measured using the net reclassification improvement (NRI) method, using flow-limiting CAD (>70% stenosis) as the clinical CHD equivalent and excluding patients with diabetes (an a priori high-risk equivalent and not included in FRS scoring).16, 19, 20
Sources of funding
This study was funded by a grant from the National Heart, Lung, and Blood Institute (R01- HL071878). The authors are solely responsible for the design and conduct of this study, all study analyses, and drafting and editing of the paper.
Results
Patient characteristics
Characteristics of the angiographic set of cases and controls are summarized in Table I. Age averaged 53 years, and one third were women. By design, cases were matched to controls for age and sex. Other traditional risk factors were more prevalent in cases. However, (treated) lipid profiles and blood pressures were generally similar in cases and controls.
Table I. Characteristics of discovery set angiographic CAD cases and controls
| Variable | Angiographic CAD cases | Angiographic normal controls |
|---|---|---|
| No. | 1947 | 1036 |
| Age, mean ± SD, y | 53.1 ± 8.0 | 53.2 ± 8.2 |
| Sex (% women) | 35 | 36 |
| Race/Ethnicity† (% white) | 94 | 94 |
| BMI (kg/m²) | 30.1 ± 6.2 | 29.9 ± 6.5 |
| H/o hyperlipidemia (%) | 67⁎ | 34 |
| H/o hypertension (%) | 61⁎ | 44 |
| H/o diabetes (%) | 28⁎ | 12 |
| H/o smoking (%) | 26⁎ | 14 |
| Family history CHD | 50⁎ | 29 |
| Systolic BP (mm Hg) | 140 ± 25⁎ | 138 ± 22 |
| Diastolic BP (mm Hg) | 82 ± 14 | 81 ± 14 |
| Total cholesterol (mg/dL) | 193 ± 52⁎ | 185 ± 44 |
| Triglycerides (mg/dL) | 209 ± 196⁎ | 181 ± 152 |
| LDL-C (mg/dL) | 111 ± 40 | 108 ± 37 |
| HDL-C (mg/dL) | 38.0 ± 12.7⁎ | 41.6 ± 13.7 |
| Glucose (mg/dL) | 131 ± 67⁎ | 110 ± 45 |
⁎P < .05 for cases versus controls.
†No blacks included.
Single nucleotide polymorphism discovery and association with CAD
A total of 38 tSNPs in the 6 key lipoprotein metabolic genes was identified by exhaustive scanning (Supplementary Figure 1, Supplementary Table I, online Appendix); of these, 10 were significantly associated with lipid levels (ie, at P ≤ .002) or at least nominally (P < .2) with CAD. For initial GRS modeling, these 10 were added to 18 leading literature SNPs reported from GWAS to be associated with CAD-related biomarkers or CAD risk. These 28 SNPs, with allelic frequencies in cases and controls, are shown in Supplementary Table I, online Appendix. All SNPs were in Hardy-Weinberg equilibrium (ie, P > .05 per SNP).
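The Hardy-Weinberg check mentioned above can be sketched as a 1-df chi-square goodness-of-fit test on genotype counts. The text does not state which test variant was used, so this is an assumption; the genotype counts in the usage lines are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg proportions.

    Takes observed counts of the three genotypes and returns
    (chi2, p_value). Assumes both alleles are present in the sample,
    so that all expected counts are positive.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: the first set fits HWE, the second does not.
    for counts in ((360, 480, 160), (400, 400, 200)):
        chi2, p_value = hwe_chi_square(*counts)
        print(counts, f"chi2={chi2:.2f}", f"P={p_value:.3g}")
```

Under the criterion quoted in the text, a SNP would be regarded as consistent with equilibrium when P > .05.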
Genetic risk score modeling
The multivariable genetic model could be simplified by operator-interactive stepwise elimination from a 28- to a 5-SNP genetic risk score (GRS5) without loss of discrimination (Table II). Five other SNPs showed multivariable association trends (.05 < P < .2) but were eliminated in the final model (Table II, footnote). The Hosmer-Lemeshow statistics for the 5 SNP model (7.92, P = .24, 6 df) and for the model using GRS5 as a single variable (3.91, P = .27, 3 df) both suggested a good fit to the data.
Table II. Logistic regression model for GRS5 in angiographic case/control set
| Gene/locus (rs) | Risk allele | β | Sig | OR (95% CI) |
|---|---|---|---|---|
| CELSR2/rs599839 | Major | .223 | <.001 | 1.25 (1.11-1.40) |
| 9p21.3/rs2383206 | Minor | .191 | <.001 | 1.21 (1.09-1.34) |
| CETP/rs289715⁎ | Major | .217 | .005 | 1.24 (1.07-1.44) |
| ApoF/rs78739461⁎ | Minor | .307 | .015 | 1.36 (1.06-1.74) |
| F2 (PT) /rs1799963 | Major | .474 | .046 | 1.61 (1.01-2.56) |
⁎Internally discovered SNP.
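The β and OR columns of Table II are linked by the usual logistic regression identity, OR = exp(β). As a quick consistency check, the sketch below exponentiates each reported β and compares it with the reported OR; the values are copied from the table, and the helper name is illustrative.

```python
import math

# (label, beta, reported OR) copied from Table II.
TABLE_II = [
    ("CELSR2/rs599839", 0.223, 1.25),
    ("9p21.3/rs2383206", 0.191, 1.21),
    ("CETP/rs289715", 0.217, 1.24),
    ("ApoF/rs78739461", 0.307, 1.36),
    ("F2 (PT)/rs1799963", 0.474, 1.61),
]


def odds_ratio(beta):
    """Per-allele odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)


if __name__ == "__main__":
    for label, beta, reported in TABLE_II:
        print(f"{label}: exp({beta}) = {odds_ratio(beta):.2f} (reported {reported})")
```

Each computed value agrees with the reported OR to 2 decimal places.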
The simple count method GRS5 yielded an OR of 1.24 per risk allele (CI 1.16-1.33, P = 8.2 × 10⁻¹¹, model χ²mv 44.8, model χ²GRS-5 42.8) (Table II). The GRS5 was predictive in diabetic (OR 1.34, CI 1.13-1.59, N = 668) as well as nondiabetic subgroups of cases/controls (OR 1.23, CI 1.14-1.32, N = 2,282).
Results for continuous GRS5 and for quartile GRS5 are shown in Table III. Angiographic CAD prevalence across the spectrum of count-GRS5 scores is presented in Figure 1. Comparisons of CAD prevalence in fourth versus first quartile subjects yielded an unadjusted OR of 2.07 (CI 1.59-2.70) and a risk factor adjusted OR of 2.03 (1.53-2.70) for count GRS5 (Table III). Weighted GRS5 results were similar: unadjusted OR 2.06 (CI 1.61-2.65); adjusted OR 2.02 (CI 1.55-2.64).
Table III. Associations between count GRS5 and angiographic CAD
| | Continuous GRS | GRS quartile 1 | GRS quartile 2 | GRS quartile 3 | GRS quartile 4 |
|---|---|---|---|---|---|
| No. of subjects | 2983 | 630 | 969 | 930 | 454 |
| Median GRS (range) | 6 (2-10) | 5 (2-5) | 6 (6) | 7 (7) | 8 (8-10) |
| OR (95% CI)/allele | 1.24 (1.16-1.33) | 1.0 (–) | 1.12 (0.91-1.37) | 1.54 (1.24-1.90) | 2.07 (1.59-2.70) |
| Adjusted⁎ | 1.21 (1.13-1.30) | 1.0 (–) | 1.20 (0.96-1.49) | 1.49 (1.18-1.87) | 2.03 (1.53-2.70) |
⁎Adjusted for hyperlipidemia, hypertension, BMI, diabetes, smoking, family history.

Figure 1. Prevalence of angiographic CAD by count GRS5 score. The trend in percent CAD across GRS5 categories was highly significant (P for trend = 8.2 × 10⁻¹¹). Reference mean count GRS score is 6.3.
The GRS validation testing
For validation testing, the GRS5 was prospectively tested in a completely separate set of 312 unrelated familial coronary disease cases and 1,008 apparently healthy population controls (Supplementary Table II, online Appendix). In this independent familial case/control set, the GRS5 remained highly significantly predictive of coronary disease, with a similar magnitude to that observed in the angiographic case/control set, that is, OR of 1.23 (CI 1.11-1.38) per risk allele, P = 2.0 × 10⁻⁴. In addition, each risk allele, except the rare ApoF variant, individually predicted risk in these familial cases, with the common CELSR2 and 9p21.3 polymorphisms showing the strongest associations (Supplementary Table III, online Appendix).
In addition, an internal analysis of reliability within the primary case/control set was performed by dividing the set into equally sized “discovery” and “replication” subsets. This analysis demonstrated a high degree of internal reliability, with GRS5 showing consistent and highly significant predictive value in both subsets (ORs 1.26, 1.23, respectively, both P < 10⁻⁵).
Incremental value of multiple source SNPs in GRS modeling
Of the 5 SNPs meeting criteria for GRS5 membership, 2 (ApoF rs78739461 and CETP rs289715) came from the internal lipoprotein gene discovery effort, 1 of which (ApoF rs78739461) was novel, and 3 were external candidates from other loci/pathways (9p21 rs2383206, CELSR2/PSRC1 rs599839, and F2 rs1799963). The SNPs from both sources contributed in a complementary fashion to overall risk prediction: both GRS3 (limited to the 3 external SNPs; χ²GRS-3 = 30.6) and GRS2 (the 2 internal SNPs; χ²GRS-2 = 13.6) predicted CAD but were inferior to the combined GRS5 (χ²GRS-5 = 42.8).
Multivariable predictive model
A multivariable predictive model for angiographic CAD incorporating genetic and clinical information is presented in Table IV. The contribution of GRS5 is intermediate, similar to family history and greater than hypertension. (The impact of age and sex cannot be determined, given the matching design, and the standard use of statins in CAD patients likely impacts the diagnosis of hyperlipidemia.)
Table IV. Multivariable predictive model for CAD using genetic variables and standard risk factors
| Variable | Wald χ² | OR (CI) | P |
|---|---|---|---|
| Hyperlipidemia | 123.51 | 2.75 (2.30-3.29) | 1.1 × 10⁻²⁸ |
| Diabetes | 46.12 | 2.21 (1.76-2.77) | 1.1 × 10⁻¹¹ |
| Smoking | 39.12 | 1.98 (1.60-2.46) | 4.0 × 10⁻¹⁰ |
| Family history | 30.01 | 1.64 (1.37-1.96) | 4.3 × 10⁻⁸ |
| GRS5 (per risk allele) | 28.66 | 1.21 (1.13-1.30) | 8.7 × 10⁻⁸ |
| Hypertension | 2.71 | 1.16 (0.97-1.39) | 0.10 |
| BMI (per kg/m²) | 0.02 | 1.001 (0.985-1.018) | 0.88⁎ |
⁎Body mass index was eliminated from the final model.
Incremental value of GRS in individual risk assessment
When count GRS5 was added to FRS in the full angiographic set, net reclassification fraction was found to be 0.233 in CAD cases (P < .0001) and 0.073 in no-CAD controls (P = .005), which when combined yielded an NRI of 0.160 (P < .0001) (Supplementary Table IV, online Appendix). Net reclassification improvement restricted to intermediate-risk subjects yielded an NRI of 0.283 (P < .0001) (Supplementary Table IV, online Appendix).17, 18
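The reclassification figures in this paragraph can be recomputed from the cross-tabulated counts in Supplementary Table IV. The sketch below uses the standard category-based NRI definition (net upward movement in cases minus net upward movement in controls); the counts are transcribed from that table, while the function and variable names are illustrative and the P values are not recomputed.

```python
# Rows: FRS category; columns: FRS + GRS category.
# Category order: <5%, 5%-9%, 10%-19%, >=20%.
# Counts transcribed from Supplementary Table IV.
CONTROLS = [  # CAD 0
    [185, 18, 5, 0],
    [33, 86, 39, 10],
    [2, 20, 39, 22],
    [0, 1, 3, 19],
]
CASES = [  # CAD +
    [336, 42, 7, 2],
    [39, 153, 122, 44],
    [4, 35, 92, 125],
    [0, 3, 8, 74],
]


def net_up_fraction(table, rows=range(4)):
    """(moved to a higher category - moved to a lower one) / subjects."""
    up = sum(table[i][j] for i in rows for j in range(4) if j > i)
    down = sum(table[i][j] for i in rows for j in range(4) if j < i)
    total = sum(sum(table[i]) for i in rows)
    return (up - down) / total


def nri(cases, controls, rows=range(4)):
    """Net reclassification improvement over the selected FRS rows."""
    return net_up_fraction(cases, rows) - net_up_fraction(controls, rows)


if __name__ == "__main__":
    print(f"net up, cases:    {net_up_fraction(CASES):.3f}")
    print(f"net up, controls: {net_up_fraction(CONTROLS):.3f}")
    print(f"NRI overall:      {nri(CASES, CONTROLS):.3f}")
    # Intermediate-risk subjects are the two middle FRS rows.
    print(f"NRI intermediate: {nri(CASES, CONTROLS, rows=(1, 2)):.3f}")
```

After rounding, this reproduces the reported values: 0.233 net upward movement in cases, 0.073 net upward (unfavorable) movement in controls, an overall NRI of 0.160, and an intermediate-risk NRI of 0.283.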
As with others' experience with novel (and most traditional) risk factors,7, 13, 18 the addition of GRS5 had only minor impact on area under the ROC curve: conventional model C-statistic 0.723 (CI 0.703-0.742); GRS-augmented C-statistic 0.731 (CI 0.712-0.750) (P > .05).
Discussion
Summary of key study results
CorGen demonstrates the feasibility and potential utility of simultaneously considering the joint effects of common genetic variants from multiple pathobiologic pathways, aggregated as a GRS, to predict the risk of premature CAD. CorGen also demonstrates the complementary effects of combining tSNPs derived from high-definition scanning of candidate genes (ie, those associated with lipoprotein metabolism) with biomarker- and other risk-related SNPs discovered through high-density GWAS.
Features of CorGen that provide assurance of a valid result include the angiographic characterization of CAD (defining a precise phenotype) and replication in an independent set of cases and controls. On average, each high-risk allele increased risk by 24%, and considered jointly as GRS5, a fourth quartile score increased CAD risk by over 2-fold. The GRS also contributed independently to standard risk factors in multivariable modeling. The GRS improved the FRS category in a net of 16% of subjects overall, which compares favorably with the 12% reported NRI for HDL cholesterol16, 17 and 10% NRI for systolic blood pressure.20 Further, application restricted to subjects in the 2 intermediate-risk categories improved classification in a net of 28%. These findings suggest its potential clinical utility in individual risk classification, despite (as with others' experience) its minor impact on population-level AUC.7, 13, 17, 20
Literature comparisons
Approximately one half of CHD risk appears to be genetically transmitted.1 However, the proportion of variation among markers of human traits and diseases (including CAD) attributed to individual SNPs has been modest,21, 22 leading to the “common disease, common variant hypothesis.”23 Combining these SNPs to form a more powerful risk predictor underlies the GRS concept. The GRS modeling for CHD is in its infancy, but a few reports have appeared, mostly modeling plasma lipids.5, 7, 13 CorGen extends these models to multiple additional genes and pathways identified by either candidate-gene or GWAS methods and introduces new markers.
Accounting for genetic susceptibility
Uncovering the genetic basis of CHD is an unrealized goal.1, 2, 3, 4, 5, 6, 7, 8, 9, 10 The pathophysiology of CHD is a complex interaction of genetic and environmental factors acting directly and indirectly on multiple disease stages from preclinical to clinical CAD to acute coronary syndromes, each stage with a distinct set of risk factors.1 Despite methodological advances such as GWAS, progress has been slow, and well-validated associations remain few, with modest impact. This emphasizes the need for broadening the genetic search, more precisely defining the coronary phenotype and aggregating individual genetic risk markers into an overall metric.
A major success in this effort has been the discovery and replication of a risk locus at chromosome 9p21.3 (Supplementary references, online Appendix). Although its mechanism remains to be precisely defined, it appears to be involved in vascular development and function, increasing susceptibility to CAD rather than precipitating MI.10, 24, 25 Some studies have suggested that knowledge of 9p21 status may improve individual risk classification.26, 27, 28 In CorGen, 9p21 emerged as a major contributor to the multivariant GRS.
Three other contributors to GRS5 are involved with lipoprotein metabolism, but in contrast to a previous report,7 these were not restricted to LDL-C–related genes. In addition, a less common variant from the thrombosis pathway (F2) with a relatively large effect was selected. Such uncommon SNPs have been suggested as a focus for future discovery efforts.12, 21 A variant representing vascular inflammation (CRP rs2794520) demonstrated a preliminary association but was eliminated in the final model. Future research should aim to discover additional genetic risk contributors, both universal and population-specific, including those from vascular inflammation and thrombosis pathways.
Study implications
Here we show that a multivariant GRS can discriminate CAD as well as or better than many standard nongenetic tests and may improve individual risk assessment. In contrast, GRS5 added little to the AUC for the ROC curve, that is, did not importantly improve risk prediction over the FRS at a population level. This dichotomy also has been reported for other risk predictors.7, 13, 18, 26
We view CorGen as a proof-of-concept study. Very recent29 and future genetic studies may identify SNPs that add to and refine the CorGen GRS. Nevertheless, of the many high-profile literature SNPs associated with lipids, other disease markers, or even CAD already reported that we tested, only 5 of 28 leading candidates contributed independently to CAD prediction in multivariable GRS modeling. Similarly, of 38 tSNPs discovered internally by extensive scanning to characterize variation in the 6 key lipoprotein metabolic genes, only 2 contributed to the final GRS model. Hence, the goal to account for the greater part of the genetic basis of CAD remains challenging.
The GRS may be of particular value in younger cohorts in whom traditional risk factors have not developed and who may benefit from closer surveillance and more aggressive preventive measures. Genetic risk scores, as other novel risk predictors, may be of greatest incremental value in those at intermediate pretest probability.18, 19, 30 Finally, by interfacing genetic loci from multiple pathways, additional insights into disease pathogenesis and treatment targets may be expected.
Strengths and limitations
The study, although prospective in enrollment and hypothesis, possesses the limitations of observational studies, including the possibility of uncorrected confounding. The predictive value of the GRS is restricted to angiographic CAD and not MI per se. This study focused on a younger population of Euro-Americans, minimizing population stratification and heterogeneity but potentially limiting applicability to other racial/ethnic groups (eg, African Americans) and older ages. Risk-associated SNPs, although valuable for disease discrimination, may not represent causal variants. Although moderately large, CorGen has limited power to discern associations with small effect sizes. However, small-effect variants are unlikely to have important clinical impact or be cost-effective for clinical application. Although the GRS5 was validated internally, external validation in geographically distinct populations is needed, as is further testing of the novel and rare apoF variant. Here we test GRS in the context of the commonly used ATP-III version of the FRS; other risk stratification methods (eg, FRS including diabetics; Diamond-Forrester score) may deserve testing. Given our case-control study design, we are limited to discriminatory analysis and technically cannot precisely estimate the predictive power of GRS for CAD risk. Finally, we acknowledge that our highlighting of a pathway-based approach is primarily conceptual; this effort represents an initial, selective rather than a thorough, definitive application of a pathway approach.
Conclusions
Using a multiple-step study design, CorGen has validated the ability of a GRS derived from 5 SNPs to predict premature CAD and has demonstrated the complementary nature of candidate gene and GWAS approaches. The GRS5 model provided greater discrimination than any single variant, predicted a 24% risk increment per allele, identified a highest quartile GRS with a 2-fold increase in CAD risk, and improved net risk classification in 28% of intermediate-risk individuals. Thus, CorGen demonstrates proof-of-concept that GRSs are a feasible and promising approach to account for a portion of the genetic basis of CAD and identify individuals at increased CAD risk.
Acknowledgements
We acknowledge the technical assistance of Samera Khan, John A. Huntinghouse, Matthew J. Kolek, Nathan Hull, and Brianna S. Ronnow.
Appendix. Supplementary data
Supplementary Text:
Extraction and Genotyping of DNA. DNA was extracted from blood samples and genotyped using standard techniques (currently, Gentra Autopure LS automated DNA extractor). Intactness of high-molecular-weight DNA was gauged routinely on random samples by electrophoresis using 1.5% agarose gels. The quantity and purity of eluted DNA was determined by UV absorption at 260 and 280 nm; DNA was adjusted to 200 μg/mL and stored at −70°C.
Genotyping employed polymerase chain reaction amplification of the genomic region bracketing the polymorphism of interest. Detection of SNPs employed 1 or more of 3 methods to ensure the highest proportion of successful calls: amplification and detection using (1) 5' exonuclease (Taqman®) chemistry with an ABI 7500 Prism® Sequence Detection System (Applied Biosystems, Foster City, CA); (2) melting curve analysis employing LC Green and a Light Scanner, RAPID-LT or an HR-1 (Idaho Technology, Salt Lake City, UT); and/or (3) traditional Sanger sequencing.
Principal Components Analysis: Method of Horne and Camp.31 This method evaluates the linkage disequilibrium (LD) simultaneously for all SNPs within a haplotype structure. Two principal component analyses are used to derive LD groups and then tagging SNPs based on an eigenvalue threshold. Components of each LD group are SNPs with significant factor loadings, and the tagging SNP is selected as the one with the highest factor loading.
The Chromosome 9p21.3 Locus and CHD Risk. One of the major successes in the effort to discover the genetic basis of CHD risk has been the discovery of a risk-associated locus at chromosome 9p21.3. First reported by several groups in 2007,32, 33, 34, 35 this association has been validated by multiple groups worldwide, across racial and geographic boundaries, and is independent of traditional risk factors.36

Supplementary Figure 1. Study design schematic. *Lipid-related genes studied were apoF, CETP, LCAT, LIPC, LPL, and SR-BI.
Supplementary Table I. Allelic distributions of GRS candidate SNPs in the angiographic set of CAD cases and controls
| SNP | Locus/Gene | Refs | Major allele | Minor allele | MAF, cases/controls |
|---|---|---|---|---|---|
| rs2383206δ | 9p21.3 | 32, 33, 34, 35, 36, 37 | A | G† | 0.53/0.48 |
| rs10811661 | 9p21 | 38 | T† | C | 0.21/0.22 |
| rs2794520 | CRP | 39 | C | T† | 0.34/0.32 |
| rs2494250 | FCER1A | 39 | C | G† | 0.26/0.25 |
| rs6025 | F5 | 40 | G | A† | 0.032/0.026 |
| rs1799963δ | F2 | 40 | G‡ | A† | 0.011/0.017 |
| rs11591147 | PCSK9 | 41, 42, 43 | G† | T | 0.034/0.032 |
| rs599839δ | CELSR2 | 41, 42, 44 | A† | G | 0.22/0.27 |
| rs754523 | ApoB | 44 | T | C† | 0.30/0.31 |
| rs12654264 | HMGCR | 41, 42 | A | T† | 0.37/0.36 |
| rs6511720 | LDLR | 41, 42, 44 | G† | T | 0.127/0.133 |
| rs4420638 | ApoC1/E | 41, 42, 44 | A | G† | 0.21/0.19 |
| rs328 | LPL | 41 | C† | G | 0.12/0.13 |
| rs11570897⁎ | LPL | | C | T† | 0.0050/0.0050 |
| rs4149268 | ABCA1 | 44 | G | A† | 0.35/0.36 |
| rs78739461⁎,δ | ApoF | | G† | C‡ | 0.048/0.037 |
| rs34934555⁎ | ApoF | | C | T† | 0.030/0.026 |
| rs4775041⁎ | LIPC | | G† | C | 0.28/0.28 |
| rs1800588⁎ | LIPC | | C† | T | 0.23/0.24 |
| rs36041167⁎ | LIPC | | A† | G | 0.08/0.10 |
| rs3764261 | CETP | 44 | G† | T | 0.31/0.33 |
| rs1800776⁎ | CETP | | C | A† | 0.090/0.087 |
| rs1800775⁎ | CETP | | A | C† | 0.46/0.48 |
| rs11076175⁎ | CETP | | A | G† | 0.20/0.18 |
| rs289715⁎,δ | CETP | | A† | T | 0.12/0.14 |
| rs2156552 | LIPG | 42, 44 | A | T† | 0.152/0.148 |
| rs780094 | GCKR | 44 | G | A† | 0.38/0.36 |
| rs12286037 | ZNF259 | 44 | C† | T | 0.08/0.07 |
⁎Internal SNP candidate.
δSNP selected in final GRS5 model.
†High-risk allele for associated CHD biomarker.
‡High-risk allele associated with CHD.
Supplementary Table II. Demographics of familial cases and population-based controls
| Characteristic | CHD cases (n = 414; 312 unrelated) | Population-based controls (n = 1008) |
|---|---|---|
| Age, mean (SD), y | 54.8 (8.9) | 53.4 (15.2) |
| Males (%) | 79 | 42 |
| H/o hyperlipidemia (%) | 59 | 36 |
| H/o hypertension (%) | 54 | 28 |
| H/o diabetes (%) | 20 | 7 |
| H/o smoking (%) | 40 | 20 |
Supplementary Table 3. CHD risk associations of individual GRS5 component SNPs in the validation set of highly familial CHD cases (n = 312) and population-based controls (n = 1008)
| Gene/Variant | MAF cases | MAF controls | Risk allele⁎ | OR (95% CI) |
|---|---|---|---|---|
| CELSR2/rs599839 | 14% | 30% | Major | 1.93 (1.57-2.36) |
| 9p21.3/rs2383206 | 55% | 45% | Minor | 1.47 (1.23-1.74) |
| CETP/rs289715 | 14% | 20% | Major | 1.36 (1.1-1.68) |
| F2 (PT) /rs1799963 | 0.9% | 1.6% | Major | 1.62 (0.72-3.61) |
| ApoF/rs78739461 | 0.6% | 4.9% | Minor | 0.12 (0.05-0.33)† |
⁎In angiographic case/control set.
†Estimate may not be reliable due to very few (<5) alleles in case set.
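To make the allele designations in Supplementary Tables 1 and 3 concrete, the following is a minimal sketch of how a per-person GRS5 value could be tallied. It assumes an unweighted count of risk alleles, which is consistent with the "per risk allele" odds ratio in the abstract but is not confirmed by this section. The function name, the two-letter genotype encoding, and the example genotypes are hypothetical.

```python
# Risk allele for each GRS5 SNP, read from Supplementary Table 1 (allele letters)
# and Supplementary Table 3 (major/minor risk designation).
GRS5_RISK_ALLELES = {
    "rs599839": "A",    # CELSR2, major allele
    "rs2383206": "G",   # 9p21.3, minor allele
    "rs289715": "A",    # CETP, major allele
    "rs1799963": "G",   # F2, major allele
    "rs78739461": "C",  # ApoF, minor allele
}


def risk_allele_count(genotypes):
    """Return the unweighted number of risk alleles carried (0 to 10).

    `genotypes` maps each SNP id to a two-letter genotype such as "AG".
    This encoding is an assumption made for illustration only.
    """
    total = 0
    for snp, risk_allele in GRS5_RISK_ALLELES.items():
        genotype = genotypes[snp]
        if len(genotype) != 2:
            raise ValueError(f"expected two alleles for {snp}, got {genotype!r}")
        total += genotype.count(risk_allele)
    return total


# Hypothetical individual, not taken from the study data.
example = {
    "rs599839": "AG",
    "rs2383206": "GG",
    "rs289715": "AA",
    "rs1799963": "GG",
    "rs78739461": "GG",
}
score = risk_allele_count(example)  # 1 + 2 + 2 + 2 + 0 = 7
```

A weighted variant would multiply each count by a per-SNP log odds ratio; whether the study used weights cannot be determined from these tables.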
Supplementary Table 4. Improvement in net reclassification in the validation set using GRS5
| CAD status | FRS ↓ / FRS + GRS → (counts) | Low (F/GRS <5%) | Intermed-1 (F/GRS 5%-9%) | Intermed-2 (F/GRS 10%-19%) | High (F/GRS ≥20%) | Total |
|---|---|---|---|---|---|---|
| CAD 0 | FRS <5% | 185 | 18 | 5 | 0 | 208 |
| CAD 0 | FRS 5%-9% | 33 | 86 | 39 | 10 | 168 |
| CAD 0 | FRS 10%-19% | 2 | 20 | 39 | 22 | 83 |
| CAD 0 | FRS ≥20% | 0 | 1 | 3 | 19 | 23 |
| CAD 0 | Total | 220 | 125 | 86 | 51 | 482 |
| CAD + | FRS <5% | 336 | 42 | 7 | 2 | 387 |
| CAD + | FRS 5%-9% | 39 | 153 | 122 | 44 | 358 |
| CAD + | FRS 10%-19% | 4 | 35 | 92 | 125 | 256 |
| CAD + | FRS ≥20% | 0 | 3 | 8 | 74 | 85 |
| CAD + | Total | 379 | 233 | 229 | 245 | 1086 |
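The counts in Supplementary Table 4 can be turned into a net reclassification improvement (NRI) with the standard categorical definition: the net proportion of CAD+ subjects moved to a higher risk category minus the net proportion of CAD 0 subjects moved to a higher category. The sketch below is an illustration under that assumption, not the authors' analysis code. The variable and function names are hypothetical, and the counts are transcribed from the table as printed.

```python
# Rows: FRS-only category; columns: FRS + GRS category.
# Category order in both directions: <5%, 5%-9%, 10%-19%, >=20%.
CONTROLS = [  # CAD 0
    [185, 18, 5, 0],
    [33, 86, 39, 10],
    [2, 20, 39, 22],
    [0, 1, 3, 19],
]
CASES = [  # CAD +
    [336, 42, 7, 2],
    [39, 153, 122, 44],
    [4, 35, 92, 125],
    [0, 3, 8, 74],
]


def net_upward_fraction(table):
    """(moved up - moved down) / total for one reclassification table."""
    total = sum(sum(row) for row in table)
    up = sum(n for i, row in enumerate(table) for j, n in enumerate(row) if j > i)
    down = sum(n for i, row in enumerate(table) for j, n in enumerate(row) if j < i)
    return (up - down) / total


def categorical_nri(cases, controls):
    """Categorical NRI: net upward movement in cases minus that in controls."""
    return net_upward_fraction(cases) - net_upward_fraction(controls)


nri = categorical_nri(CASES, CONTROLS)
print(round(nri, 2))  # 0.16
```

With these counts, 342 CAD+ subjects move up and 89 move down (net 253 of 1086), while 94 CAD 0 subjects move up and 59 move down (net 35 of 482). The resulting NRI rounds to 0.16, the overall value quoted in the abstract.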
References
- Family history and genetic factors. In: Wong ND, Black HR, Gardin JM, editors. Preventive cardiology, a practical approach. New York: McGraw-Hill; 2005. p. 92–148
- Progress in unraveling the genetics of coronary artery disease and myocardial infarction. Current Atherosclerosis Reports. 2007;9:179–186
- Genomewide association studies—illuminating biologic pathways. N Engl J Med. 2009;360:1699–1701
- Genetic susceptibility to myocardial infarction and coronary artery disease. Hum Molec Genet. 2006;15:R117–R123
- Generating genetic risk scores from intermediate phenotypes for use in association studies of clinically significant endpoints. Ann Human Genetics. 2005;69:176–186
- Evaluation of genetic risk scores for lipid levels using genome wide markers in the Framingham Heart Study. BMC Proc. 2009;3(Suppl 7):S46
- Polymorphisms associated with cholesterol and risk of cardiovascular events. N Engl J Med. 2008;358:1240–1249
- Implementation of a computerized cardiovascular information system in a private hospital setting. Am Heart J. 1998;136:792–803
- Multiple-polymorphism associations of seven matrix metalloproteinase and tissue inhibitor metalloproteinase genes with myocardial infarction and angiographic coronary artery disease. Am Heart J. 2007;154:751–758
- Association of variation in the chromosome 9p21 locus with myocardial infarction versus chronic coronary artery disease. Circulation Cardiovascular Genetics. 2008;1:85–92
- High-resolution characterization of linkage disequilibrium structure and selection of tagging SNPs for the cholesteryl ester transfer protein gene. Ann Human Genetics. 2006;70:524–534
- Multiple less common genetic variants explain the association of the cholesteryl ester transfer protein gene with coronary artery disease. J Am Coll Cardiol. 2007;49:2053–2060
- Joint effects of common genetic variants on the risk of type 2 diabetes in U.S. men and women of European ancestry. Ann Intern Med. 2009;150:541–550
- Modeling multiplicative SNP interactions in the presence of an additive genetic risk score. Genet Epidemiol. 2009;33:771
- A tutorial on statistical methods for population association studies. Nat Rev Genet. 2006;7:781–791
- Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Statist Med. 2008;27:157–172
- Criteria for evaluation of novel markers of cardiovascular risk: a scientific statement from the American Heart Association. Circulation. 2009;119:2408–2416
- Development and validation of improved algorithms for the assessment of global cardiovascular risk in women. JAMA. 2007;297:611–619
- Use and misuse of receiver operating characteristic curves in risk prediction. Circulation. 2007;115:928–935
- Advances in measuring the effect of individual predictors of cardiovascular risk: the role of reclassification measures. Ann Intern Med. 2009;150:795–802
- Common genetic variation and human traits. N Engl J Med. 2009;360:1696–1698
- Genetic risk prediction—are we there yet? N Engl J Med. 2009;360:1701–1703
- The new genomics: global views of biology. Science. 1996;274:536–539
- Genetic variation at the 9p21 locus predicts angiographic coronary artery disease prevalence but not extent and has clinical utility. Am Heart J. 2008;156:1155–1162
- Gene dosage of the common variant 9p21 predicts severity of coronary artery disease. J Am Coll Cardiol. 2010;56:[in press]
- Chromosome 9p21.3 coronary heart disease locus genotype and prospective risk of CHD in healthy middle-aged men. Clin Chem. 2008;54:467–474
- Impact of adding a single allele in the 9p21 locus to traditional risk factors on reclassification of coronary heart disease risk and implications for lipid-modifying therapy in Atherosclerosis Risk in Communities study. Circ Cardiovasc Genet. 2009;2:279–285
- Cardiovascular disease risk prediction with and without knowledge of genetic variation at chromosome 9p21.3. Ann Intern Med. 2009;150:65–72
- Large scale association analysis of novel genetic loci for coronary artery disease. Arterioscl Thromb Vasc Biol. 2009;29:775–780
- Third Report of the National Cholesterol Education Program (NCEP) Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel III) final report. Circulation. 2002;106:3143–3421
Supplementary References
- Principal component analysis for selection of optimal SNP-sets that capture intragenic genetic variation. Genet Epidemiol. 2004;26:11–21
- A common allele on chromosome 9 associated with coronary heart disease. Science. 2007;316:1488–1491
- A common variant on chromosome 9p21 affects the risk of myocardial infarction. Science. 2007;316:91–93
- Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447:661–678
- Genomewide association analysis of coronary artery disease. N Engl J Med. 2007;357:443–453
- Repeated replication and a prospective meta-analysis of the association between chromosome 9p21.3 and coronary artery disease. Circulation. 2008;117:1675–1684
- Genetic variation at the 9p21 locus predicts angiographic coronary artery disease prevalence but not extent and has clinical utility. Am Heart J. 2008;156:1155–1162
- Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science. 2007;316:1331–1336
- Genome-wide association with select biomarker traits in Framingham Heart Study. BMC Medical Genetics. 2007;8:1–12
- Seven haemostatic gene polymorphisms in coronary disease: meta-analysis of 66 155 cases and 91 307 controls. Lancet. 2006;367:651–658
- Polymorphisms associated with cholesterol and risk of cardiovascular events. N Engl J Med. 2008;358:1240–1249
- Six new loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, or triglycerides in humans. Nat Genet. 2008;40:189–197
- Sequence variations in PCSK9, low LDL, and protection against coronary heart disease. N Engl J Med. 2006;354:1264–1272
- Newly identified loci that influence lipid concentrations and risk of coronary artery disease. Nat Genet. 2008;40:161–169
- Development and validation of improved algorithms for the assessment of global cardiovascular risk in women. JAMA. 2007;297:611–619
PII: S0002-8703(10)00437-0
doi:10.1016/j.ahj.2010.05.031
© 2010 Mosby, Inc. All rights reserved.
