Editorial: Are historical studies relevant in the setting of grade migration?

While randomized controlled trials are the ‘gold standard’ for comparative effectiveness research, it is important that they be interpreted in the context of their limitations. This is especially true in surgical trials for prostate cancer. For one, blinding and allocation concealment are often impossible in surgery, and surgeon skill may have a large impact [1]. What is more, it can take over a decade before interventions yield detectable differences in prostate cancer survival. Consequently, shifts in diagnosis and management may make historical clinical trial findings less useful for contemporary patients. For example, the landmark Scandinavian Prostate Cancer Group Study number 4 (SPCG‐4) showed a survival benefit for men treated with radical prostatectomy rather than observation during the 1989–1999 time period [2], but management in the study differed from contemporary practice because strict ‘active surveillance’ protocols did not exist in the 1990s.

In addition to shifts in management, men diagnosed with prostate cancer today differ from those diagnosed in previous decades. This was shown by Dalela et al. [3] who compared registry‐based data from the USA with data on patients enrolled in the Prostate Cancer Intervention Versus Observation (PIVOT) trial, and found significant differences between the two cohorts.

In a similar vein, Cazzaniga et al. [4] designed an elegant study to assess the generalizability of the SPCG‐4 to contemporary cohorts of men with prostate cancer. They focused on histological grading and compared the natural history of men in the SPCG‐4 study to men in similar grade categories diagnosed approximately one decade later in Sweden.

The contemporary cohort was made up of men with localized prostate cancer drawn from the Swedish National Prostate Cancer Register (NPCR). Men in the NPCR diagnosed in 2005–2006 had lower prostate cancer‐specific and all‐cause mortality compared to men with similar grade cancer in the SPCG‐4 (hazard ratios 0.46, 95% CI 0.19–1.14, and 0.66, 95% CI 0.46–0.95, respectively). While some of the observed differences in survival may have been attributable to improved treatments, Cazzaniga et al. hypothesized that grade migration was to blame.

As expected, the authors found a shift in Gleason grading, with a decrease in Gleason Grade Group (GGG) 1 disease, corresponding to a historical score of Gleason 3 + 3 = 6, and a concurrent increase in GGG2 and GGG3 disease, corresponding to historical scores of 3 + 4 = 7 and 4 + 3 = 7, respectively. Importantly, these differences in prostate cancer‐specific and all‐cause mortality were mitigated after compensating for grade migration by increasing GGG by one for the NPCR group; in other words, men in the SPCG‐4 treated in the 1990s had similar prostate cancer‐specific and all‐cause mortality to men in a later period with a one‐unit higher GGG.

Grade migration has been a gradual process, which was hastened by the major 2005 International Society of Urological Pathology revision that recategorized some Gleason patterns from 3 to 4. A further revision in 2014 refined these criteria, and the concept of grade groups was introduced by Epstein two years later. Older cases of Gleason score 6 cancer include histological patterns, such as cribriform and poorly formed glands, which today would be considered Gleason pattern 4.

Grade migration was also demonstrated by Danneman et al. [5] who analysed the Gleason scoring of prostate biopsies from the NPCR in Sweden for the period 1998–2011. There was an increasing incidence of low‐risk cancer (cT1 20% in 1998 to 51% in 2011) and a concurrent decrease in high‐risk cancers (cT3 29% to 16%), reflecting earlier detection. With earlier diagnosis from screening, one would expect a shift towards lower grades at diagnosis, but they found the opposite. Among low‐risk tumours (stage cT1 and PSA 4–10 ng/mL) the proportion of Gleason score 7–10 increased from 16% to 40%. Among high‐risk tumours (stage cT3 and PSA 20–50 ng/mL) the proportion of Gleason 7–10 increased from 65% to 94%.

Gleason score reclassification was also addressed by Albertsen et al. [6], who had prostate biopsy slides from the period 1990 to 1992 re‐reviewed by an experienced pathologist in 2002–2004. They found an upward shift in Gleason grading, with 55% of the samples upgraded, 14% downgraded, and 31% unchanged. As a result, when comparing grade‐matched cohorts of historical and contemporary patients with prostate cancer, one might erroneously infer that survival has improved. This illusory change in prognosis is known as the ‘Will Rogers phenomenon’.
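The Will Rogers phenomenon can be made concrete with a small numerical sketch. The risks below are hypothetical values chosen purely for illustration, not data from any of the cited studies, and the function name and threshold are likewise assumptions. The sketch shows that moving the highest-risk tumour from the low-grade group into the high-grade group lowers the average risk in both groups, while the overall average stays the same.

```python
from statistics import mean

# Hypothetical 10-year mortality risks for individual tumours.
# These numbers are illustrative assumptions, not study data.
low_grade = [0.02, 0.04, 0.06, 0.12]
high_grade = [0.30, 0.40, 0.50]


def reclassify(low, high, threshold):
    """Move tumours with risk >= threshold from the low group to the high group.

    This mimics upgrading under revised criteria: the tumours themselves do
    not change, only the label attached to them.
    """
    stays_low = [risk for risk in low if risk < threshold]
    upgraded = [risk for risk in low if risk >= threshold]
    return stays_low, high + upgraded


new_low, new_high = reclassify(low_grade, high_grade, threshold=0.10)

print(f"Low grade:  {mean(low_grade):.2f} -> {mean(new_low):.2f}")
print(f"High grade: {mean(high_grade):.2f} -> {mean(new_high):.2f}")
print(f"Overall:    {mean(low_grade + high_grade):.3f} -> {mean(new_low + new_high):.3f}")
```

In this toy example each grade group appears to have a better prognosis after reclassification, although no individual patient's risk has changed. This is why grade-matched comparisons across eras can mislead.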

While randomized trials such as the SPCG‐4 represent one of the highest levels of clinical evidence, it is important to keep in mind that these trials have limitations. Given the interval changes in grading criteria for prostatic adenocarcinoma, predicting clinical outcomes based on historical cohorts is rarely as simple as it may seem. While the fundamental conclusions of the SPCG‐4 remain valid, the finding that Gleason grade did not modify the effect of prostatectomy on survival is now less certain. Physicians should therefore use caution when inferring prognosis based on those results.

Cazzaniga et al. should be congratulated for this important work which will help physicians better counsel patients making decisions based on trials like the SPCG‐4.


  1. Trinh QD, Cole AP, Dasgupta P. Weighing the evidence from surgical trials. BJU Int 2017; 119: 659–60
  2. Bill‐Axelson A, Holmberg L, Ruutu M et al. Radical prostatectomy versus watchful waiting in early prostate cancer. N Engl J Med 2011; 364: 1708–17
  3. Dalela D, Karabon P, Sammon J et al. Generalizability of the Prostate Cancer Intervention Versus Observation Trial (PIVOT) results to contemporary North American men with prostate cancer. Eur Urol 2017; 71: 511–4
  4. Cazzaniga W, Garmo H, Robinson D, Holmberg L, Bill‐Axelson A, Stattin P. Mortality after radical prostatectomy in a matched contemporary cohort in Sweden compared to the Scandinavian Prostate Cancer Group 4 (SPCG‐4) study. BJU Int 2019; 123: 421–8
  5. Danneman D, Drevin L, Robinson D, Stattin P, Egevad LJ. Gleason inflation 1998–2011: a registry study of 97,168 men. BJU Int 2015; 115: 248–55
  6. Albertsen PC, Hanley JA, Barrows GH et al. Prostate cancer and the Will Rogers phenomenon. J Natl Cancer Inst 2005; 97: 1248–53


© 2020 BJU International. All Rights Reserved.