02-08-2018 | Breast cancer | Editorial | Article

Developing polygenic risk scores for breast cancer

Introduction

The potential for inheritance of breast cancer risk has been speculated on since Paul Broca reported on the apparent hereditary nature of breast cancer in his wife’s family in 1866 [1]. This was confirmed by epidemiological studies in the 1980s and, finally, by the identification of the high-risk genes BRCA1 (1994) and BRCA2 (1995), which together account for around 3% of breast cancers, the bulk of those due to single high-risk genes (around 4% of all breast cancers). Additional high-risk genes (lifetime risk ≥40%), TP53, PTEN, STK11, CDH1, and PALB2, as well as moderate-risk genes (lifetime risk 20–35%), notably ATM and CHEK2, have been identified. Generally, pathogenic variants in high-risk genes are rare (fewer than one in 500 individuals, and for most genes fewer than one in 5,000), compared with around one in 200 for the two well-known moderate-risk genes. Together, all of the high/moderate-risk genes account for little more than 25% of the inherited component of breast cancer [2]. Twin studies have shown that around 27% of breast cancer has an inherited component [3], despite fewer than 20% of women with breast cancer having a family history of the disease. This is partly because far from all women who carry risk variants will develop breast cancer, and because half of the variants will be inherited from the father; men have an average lifetime risk of only 0.1%.

From the early part of the 21st century, interest switched to identifying common genetic variants that affect breast cancer risk in a polygenic and potentially multiplicative way [4]. Large genome-wide association studies were developed using thousands of single nucleotide polymorphisms (SNPs) to identify breast cancer susceptibility loci. SNPs are single changes of a nucleotide, for instance changing an adenine (A) to a guanine (G), which are not thought to affect function if they are within a gene. In the aforementioned case, if A is the usual population copy, then the less common variant (G) is present in 5–49% of individuals. Initially, studies were disappointing and funding agencies were about to withdraw support when, in 2007, five new breast cancer-associated SNPs were identified, including one in FGFR2 and one in TOX3, which remain the two most important SNPs in the general population [5]. It soon became clear that the effects of these SNPs were multiplicative, and hence the relative risk for those who have inherited two copies of the risk allele is the square of that for those with one. Those with no risk allele are at below-average risk. By 2010, 18 SNPs had been identified, which rose to 77 in 2013 [2], and with imputed SNPs, there are now more than 300. Currently, SNPs account for more of the inherited component of breast cancer than all the known moderate/high-risk genes combined, despite their individual effects being substantially lower. Generally, carrying a single copy of the risk variant increases risk by only 1–27%. Assuming a population average of 10% lifetime risk, not carrying a variant may reduce the risk to 9%, whereas carrying one copy of the risk variant may increase the risk only to 11%.

Developing a polygenic risk score

Nearly all studies to date have confirmed that the odds ratios (ORs) for each independent SNP can be multiplied to provide a polygenic risk score (PRS). Using the per-SNP ORs identified in Breast Cancer Association Consortium studies of over 120,000 cases and a similar number of controls, together with the frequency of the variant copy in the population, an OR can be calculated for each SNP, normalized so that the population average risk is 1.0. The OR for each SNP is then multiplied by the ORs for every other independent SNP, and the product is the PRS. Although the average PRS should be 1.0, the median is usually well below this, reflecting that most of the population have a lifetime risk below the average population risk. The great majority of women will have a PRS between 0.5 and 2.0, though with 143 SNPs it can be as low as 0.2 or as high as 5.0 or more.
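The multiplication described above can be sketched in a few lines of Python. This is a minimal illustration rather than the method used in any published panel: the SNP names, per-allele ORs, and risk-allele frequencies are hypothetical, and the sketch assumes a per-allele multiplicative model, Hardy-Weinberg genotype proportions, and independent SNPs.

```python
from itertools import product
from math import prod

# Hypothetical per-allele odds ratios and risk-allele frequencies (illustrative only).
SNPS = {
    "snp_a": {"odds_ratio": 1.26, "risk_allele_freq": 0.38},
    "snp_b": {"odds_ratio": 1.20, "risk_allele_freq": 0.25},
    "snp_c": {"odds_ratio": 1.05, "risk_allele_freq": 0.45},
}


def genotype_frequency(risk_allele_freq, n_risk_alleles):
    """Hardy-Weinberg frequency of carrying 0, 1 or 2 risk alleles (assumption)."""
    p = risk_allele_freq
    return ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)[n_risk_alleles]


def normalized_genotype_risk(odds_ratio, risk_allele_freq, n_risk_alleles):
    """Relative risk for 0, 1 or 2 risk alleles, scaled so the population mean is 1.0."""
    p = risk_allele_freq
    # The mean of odds_ratio**g over Hardy-Weinberg genotypes equals (1 - p + p*OR)**2.
    population_mean = (1 - p + p * odds_ratio) ** 2
    return odds_ratio ** n_risk_alleles / population_mean


def polygenic_risk_score(genotypes, snps=SNPS):
    """Multiply the normalized per-SNP risks for one woman's genotypes."""
    return prod(
        normalized_genotype_risk(
            snps[name]["odds_ratio"], snps[name]["risk_allele_freq"], count
        )
        for name, count in genotypes.items()
    )


def population_mean_prs(snps=SNPS):
    """Frequency-weighted average PRS over every genotype combination."""
    names = list(snps)
    total = 0.0
    for counts in product((0, 1, 2), repeat=len(names)):
        genotypes = dict(zip(names, counts))
        frequency = prod(
            genotype_frequency(snps[name]["risk_allele_freq"], count)
            for name, count in genotypes.items()
        )
        total += frequency * polygenic_risk_score(genotypes, snps)
    return total


# Two hypothetical women: no risk alleles versus two copies at every SNP.
low = polygenic_risk_score({"snp_a": 0, "snp_b": 0, "snp_c": 0})
high = polygenic_risk_score({"snp_a": 2, "snp_b": 2, "snp_c": 2})
```

Because each SNP is normalized separately, the population-average PRS stays at 1.0 however many independent SNPs are multiplied together, while individual scores spread further from 1.0 as SNPs are added.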

Using a SNP PRS

Relatively little work has been done to assess the ability of SNP panels to stratify risk in combination with classic breast cancer risk factors (usually combined in a model) and with mammographic density, another important risk factor. Recently, results with panels of up to 77 SNPs have been published [6–8]. These show that a SNP PRS can be combined with other risk factors to improve the prediction of breast cancer in models. The discrimination of the PRS (the accuracy across the risk range, i.e. whether it predicts accurately at both the high-risk and low-risk ends) is extremely good, especially for SNP18, which contains the first 18 identified SNPs, has been validated the longest, and has the strongest evidence base [7]. Although the other two major risk factors (breast cancer family history, within the classical risk factors, and mammographic density) have a strong hereditary component, there is little overlap between the SNP PRS and these factors [7]. As such, the three can be incorporated together with very little adjustment. The main drawback is the relatively poor discriminatory accuracy of the standard risk-factor models.

Population risk stratification

Whilst a SNP PRS can give an accurate assessment of an individual woman’s risk and help her make decisions about screening and preventive measures, it will ultimately have its biggest impact in population screening. At present, all women are treated the same, with most national screening programmes offering 2- or 3-yearly mammography to all women aged 50–70 years. In reality, some women have a risk as low as 1% over this 20-year period and others have risks as high as 40%. To balance the potential harms of screening, such as a false-positive screen and overtreatment of tumors that would never have presented clinically, against the proven improvements in breast cancer survival, a different approach is required. Ideally, all women at entry to screening should have access to a full risk assessment, including a SNP PRS. Women at high risk (≥8% 10-year risk) in the UK can have more frequent National Institute for Health and Care Excellence (NICE)-approved screening and access to NICE-approved preventive medication such as tamoxifen or anastrozole, which can reduce breast cancer risk by 35–50%. This approach is backed up not only by the proven higher incidence in this group but also by the higher proportion of later-stage cancers found at normal screening intervals [7]. Women at low risk, for example less than 1.5% 10-year risk (the average risk of a 40-year-old who does not get screened for 10 years), could defer screening for 10 years. Again, this is supported by the lower incidence and earlier stages of cancer in the low-risk group [7].
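The two thresholds quoted above can be written down as a simple stratification rule. The following is only an illustrative sketch: the cut-offs (≥8% and <1.5% 10-year risk) come from the text, whereas the function name, the group labels, and the "standard" middle group are assumptions made for the example.

```python
HIGH_RISK_THRESHOLD = 0.08   # >= 8% 10-year risk, as quoted in the text
LOW_RISK_THRESHOLD = 0.015   # < 1.5% 10-year risk, as quoted in the text


def screening_group(ten_year_risk):
    """Assign a 10-year risk (as a fraction, e.g. 0.05 for 5%) to a group.

    The labels are hypothetical; only the two thresholds come from the text.
    """
    if not 0.0 <= ten_year_risk <= 1.0:
        raise ValueError("ten_year_risk must be a fraction between 0 and 1")
    if ten_year_risk >= HIGH_RISK_THRESHOLD:
        return "high"      # more frequent screening, preventive medication
    if ten_year_risk < LOW_RISK_THRESHOLD:
        return "low"       # screening could be deferred
    return "standard"      # assumed label for everyone in between


groups = [screening_group(risk) for risk in (0.01, 0.03, 0.10)]
```

Under this rule, a woman with a 1% 10-year risk falls in the low group, one with a 3% risk in the standard group, and one with a 10% risk in the high group.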

In an ideal world, the initial assessment should take place at 40 years of age or below (15% of breast cancers occur in women aged 40–49 years). Women aged 40 years and at more than 3% (moderate) 10-year risk are eligible for annual mammography in the UK and many other countries. The SNP PRS is particularly helpful in those with a family history of breast cancer and a low likelihood of BRCA1/2 positivity, or where BRCA1/2 testing has proven negative in the family. In this situation, 50% of women change risk group as a result of SNP18 assessment [9]. Indeed, a SNP PRS, which should cost less than £100 in the UK, will provide a meaningful output for every woman, whereas testing for a panel of 30–90 genes associated with moderate or high risk of cancer [10] (only nine of which are clearly actionable for breast cancer), costing £270–2000, will give a meaningful output in only around 2% of women in the general population. Furthermore, identification of variants of uncertain significance will leave 5–40% of women in limbo with ongoing uncertainty. In conclusion, SNP PRSs for breast cancer are ready for full implementation and provide accurate risk prediction for breast cancer at a one-off, very low cost.

Medicine Matters Oncology
