Several risk calculators (RCs) have been developed to predict prostate cancer (PCa) diagnosis at prostate biopsy. These multivariable tools have consistently been shown to be superior to risk prediction using PSA testing alone. Their use in personalized clinical decision-making is thus increasingly recommended to reduce overdiagnosis and overtreatment of PCa. Foley et al. conducted a multi-institutional external validation of the most recent versions of the European Randomised Study of Screening for Prostate Cancer Risk Calculator (ERSPC-RC) and the Prostate Cancer Prevention Trial Risk Calculator (PCPT-RC) in a large cohort of patients from six different Irish tertiary referral centres. The study showed that the two RCs performed moderately well, although both performed less well than in their original reports. The ERSPC-RC showed superior discrimination (area under the curve of 0.74 vs 0.69 for high grade PCa) and a greater net benefit in decision-curve analysis (DCA) than the PCPT-RC. Nevertheless, although the ERSPC-RC was superior to the PCPT-RC in this well-conducted study, neither RC can be recommended for PCa risk prediction in this specific Irish cohort.
The authors chose to perform DCAs, which are of great value for further assessing the utility of a risk prediction model because they visualize its clinical net benefit and net harm. The benefit threshold of >30%, as shown in the DCA of the ERSPC-RC for high grade PCa, is too high for a clinically meaningful prediction tool. Below this threshold, the RC provided no additional benefit over a strategy of performing a biopsy on everybody. It is questionable whether clinicians or patients would opt to use an RC that provides a benefit only when a 30% risk of high grade disease is accepted as the lowest threshold.
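To make the notion of a benefit threshold concrete, the following is a minimal sketch of the standard net benefit calculation that underlies a decision curve. It is an illustration only, not the analysis code of Foley et al.; the function names and the toy data in the usage note are assumptions introduced here. At a threshold probability pt, net benefit is the proportion of true positives minus the proportion of false positives weighted by pt / (1 - pt).

```python
def net_benefit(outcomes, risks, threshold):
    """Net benefit of biopsying only patients whose predicted risk >= threshold.

    outcomes: list of 0/1 (1 = cancer found at biopsy)
    risks: list of predicted probabilities from a risk calculator
    threshold: threshold probability pt, strictly between 0 and 1
    """
    n = len(outcomes)
    true_pos = sum(1 for y, r in zip(outcomes, risks) if r >= threshold and y == 1)
    false_pos = sum(1 for y, r in zip(outcomes, risks) if r >= threshold and y == 0)
    # False positives are penalised by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_biopsy_all(outcomes, threshold):
    """Net benefit of the reference strategy: biopsy every patient."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

As a usage example with hypothetical toy data, `net_benefit([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1], 0.2)` can be compared with `net_benefit_biopsy_all([1, 1, 0, 0], 0.2)`. An RC is useful at a given threshold only where the first value exceeds the second (and exceeds zero, the net benefit of performing no biopsies); the benefit threshold discussed above is the lowest threshold probability at which this starts to hold.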
What are the reasons for the suboptimal performance of the RCs in the Irish cohort? It is well known that RC performance is often poorer in external validations than in the original reports. Differences in cohort characteristics, biopsy strategies and screening recommendations between RC development cohorts and the tested cohorts, as well as changes in clinical practice over time, are potential reasons. Although the RCs have been modified repeatedly to establish their role as general, one-size-fits-all risk prediction models, their performance has varied considerably between cohorts. We recently evaluated the same RCs in a large Swiss single-centre cohort and found similar discrimination but better calibration, a greater net benefit and a lower, and thus clinically useful, benefit threshold in DCAs compared with the present Irish study. The cohort in the present study was unique because it consisted of a highly preselected group of patients. This is attributable to the specific referral practice for prostate biopsies in Ireland and is reflected in the high number of patients with a positive DRE (47% in the group diagnosed with PCa) or a positive family history (11%). Accordingly, the overall PCa detection rate (58%) and the detection rate of high grade disease (35%) were higher than usually expected. From a scientific point of view, the Irish cohort is not the optimal cohort in which to validate these RCs. Far more importantly, however, from a clinical point of view, the evaluation showed that these RCs are of limited use in the specific Irish health system.
What can be done to improve the performance of RCs in the future? It is obvious that specific characteristics of the tested cohorts will affect RC performance. These local or regional characteristics usually cannot be changed. Thus, modification of available RCs according to local patient characteristics and clinical practice might be necessary. This concept has recently been examined by Strobl et al., who were able to show that recalibration of the static PCPT-RC according to local cohort and practice characteristics can improve its accuracy. Additionally, RCs developed from contemporary clinical cohorts that were, for example, diagnosed using current state-of-the-art biopsy strategies (i.e. 12-core biopsies), rather than from historical cohorts such as those of randomized clinical trials, might also perform better in clinical practice. Furthermore, the inclusion of novel variables in the RC might be useful. Results from imaging studies, such as multiparametric prostate MRI, or promising new biomarkers might increase the overall performance of PCa RCs. The study by Foley et al. shows that the inclusion of novel markers can be of benefit. The ERSPC-PHI RC, which includes the Prostate Health Index (PHI) as an additional variable, was investigated in a subset of patients in their study and was superior to the conventional ERSPC-RC; however, when novel variables are integrated, their potential clinical harm (e.g. unpleasant or costly investigations) has to be balanced against their potential benefit.
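One common way to adapt a static RC to a local cohort is logistic recalibration, in which the local biopsy outcome is regressed on the logit of the original predicted risk to obtain a new intercept and slope. The sketch below illustrates that general idea under stated assumptions; it is not necessarily the method used by Strobl et al., and the function names and the small Newton-Raphson fitting routine are introduced here purely for illustration.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def expit(z):
    """Inverse of logit."""
    return 1 / (1 + math.exp(-z))


def fit_recalibration(risks, outcomes, iterations=25):
    """Fit outcome ~ intercept + slope * logit(original risk) on local data.

    Uses Newton-Raphson for a two-parameter logistic regression.
    Returns (intercept, slope); (0, 1) means no recalibration is needed.
    """
    x = [logit(r) for r in risks]
    a, b = 0.0, 1.0  # start from the unmodified risk calculator
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, yi in zip(x, outcomes):
            p = expit(a + b * xi)
            w = p * (1 - p)
            g_a += yi - p            # gradient with respect to intercept
            g_b += (yi - p) * xi     # gradient with respect to slope
            h_aa += w                # entries of the information matrix
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        if det == 0:
            break
        # Newton step: solve the 2x2 system (information matrix) * delta = gradient.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b


def recalibrate(risk, intercept, slope):
    """Map an original RC risk to a locally recalibrated risk."""
    return expit(intercept + slope * logit(risk))
```

In use, `fit_recalibration` would be run once on a local cohort with known biopsy outcomes, and `recalibrate` would then be applied to each new patient's RC output. Recalibration of this kind changes calibration but not the ranking of patients when the slope is positive, so it cannot by itself improve discrimination.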
The work of Foley et al. nicely illustrates the limitations of current PCa RCs. Locally tailored static RCs, RCs based on contemporary clinical cohorts, or RCs including novel variables need to be developed to assess whether overall RC performance can be improved in the future. There is still much work to do!