Tag Archive for: AMSTAR


Reaching for the stars – rating the quality of systematic reviews with the Assessment of Multiple Systematic Reviews (AMSTAR) 2

The number of published systematic reviews and meta‐analyses in the urological literature has dramatically increased in recent years [1]. This is good news given their importance in guiding clinical decision‐making, guideline development and health policy. However, many of these studies are of low quality, raising concerns about the trustworthiness of their results. As with other research studies, it is therefore important for readers to have a framework for determining the quality of a given systematic review. For this reason, in 2017 BJU International launched a scoring system for systematic reviews that provides readers with a summary assessment as to whether established methodological safeguards against bias for systematic reviews have been met [2]. This is based on the Assessment of Multiple Systematic Reviews (AMSTAR), a validated instrument that assesses methodological quality on an 11‐point scale (0–11), with higher scores reflecting greater methodological rigor and all criteria being given the same relative weight [3].

Recently, an updated version of this instrument has become available, offering a better assessment of systematic reviews [4]. The revised instrument (AMSTAR 2) includes 10 of the original domains; it has 16 items in total (compared with 11 in the original), simpler response categories than the original AMSTAR, and provides an overall rating that is largely based on seven critical domains that should all be met. These relate to: (i) documentation of an a priori registered protocol in the Prospective Register of Systematic Reviews (PROSPERO) or through Cochrane, (ii) a comprehensive literature search, (iii) explicit justification for excluding studies, (iv) a risk of bias assessment of included studies, (v) appropriate use of meta‐analytical methods, (vi) consideration of risk of bias when interpreting the results of the review, and (vii) assessment of presence and likely impact of publication bias. Other, non‐critical domains include a clear description of the study question in Population, Intervention, Comparison, Outcome (PICO) format, study selection and data extraction in duplicate, and identification of sources of funding of the studies included in the review and the review itself. This results in a four‐tiered rating (high, moderate, low, and critically low) that reflects the confidence that a reader may place in the results. Notably, a high‐quality rating requires no critical weakness and allows for only one non‐critical weakness. More than one non‐critical weakness drops the rating down to moderate, and just one critical weakness (such as lack of an a priori protocol) drops the rating down to low. Any review that has more than one critical weakness will be rated as critically low.
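The rating rules described above can be written down as a short decision procedure. The following Python sketch is illustrative only and is not part of the AMSTAR 2 instrument or of the BJU International workflow: the function name and the idea of passing in simple counts of weaknesses are assumptions made for the example, and the numeric values (4 to 1) are taken from the summary scores given in the Figure 1 caption.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Overall confidence tiers; numeric values follow the Figure 1 caption."""
    CRITICALLY_LOW = 1
    LOW = 2
    MODERATE = 3
    HIGH = 4


def amstar2_rating(critical_weaknesses: int, non_critical_weaknesses: int) -> Confidence:
    """Hypothetical helper: map weakness counts to an overall confidence rating.

    Follows the rules as stated in the text: more than one critical weakness
    gives critically low, exactly one gives low, and with no critical weakness
    the rating is moderate if there is more than one non-critical weakness,
    otherwise high.
    """
    if critical_weaknesses < 0 or non_critical_weaknesses < 0:
        raise ValueError("weakness counts must be non-negative")
    if critical_weaknesses > 1:
        return Confidence.CRITICALLY_LOW
    if critical_weaknesses == 1:
        return Confidence.LOW
    if non_critical_weaknesses > 1:
        return Confidence.MODERATE
    return Confidence.HIGH
```

For example, a review with no critical weakness and a single non-critical weakness would come out as high under this sketch, whereas a review lacking an a priori protocol (one critical weakness) would come out as low regardless of how it performs on the non-critical domains.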

BJU International editors will routinely apply this AMSTAR 2‐based scoring system to screen for methodological quality in order to raise the awareness of this issue and promote reviews of higher quality (Fig. 1)[1]. Needless to say, BJU International is not the place for systematic reviews of sub‐optimal methodological quality in which the readers cannot place their trust. Meanwhile, we also fully understand that methodological rigor is not everything but has to be paired with clinical relevance and newsworthiness. Much has been written about the dramatic redundancy of systematic reviews on the same topic; in certain areas of medicine, the number of systematic reviews exceeds that of eligible studies that these reviews included [5]. Therefore, when systematic reviews already exist, there needs to be a clear rationale for any ‘encore’ performance. BJU International also encourages the development of systematic reviews by author teams that are financially unconflicted and have thoughtfully managed any intellectual conflict of interest.

Figure 1: New BJUI rating system of systematic reviews based on AMSTAR 2. The number of coloured stars in the inner and outer layers of the system represents completeness of an individual critical domain and overall confidence rating of the systematic review, respectively. The number in the middle of the system refers to the summary AMSTAR 2 score based on the overall confidence rating of the systematic review (high: 4, moderate: 3, low: 2, critically low: 1).

Through this initiative, BJU International not only intends to become the premier journal for high‐quality systematic reviews as they relate to urology, but also to move the field forward, reducing redundancy and waste. As we embrace the higher standards of AMSTAR 2, we present the first review to be scored using this method in this issue [6] and we encourage all systematic review authors to accept this challenge and reach with us for the stars.


  1. Han JL, Gandhi S, Bockoven CG, Narayan VM, Dahm P. The landscape of systematic reviews in urology (1998 to 2015): an assessment of methodological quality. BJU Int 2017; 119: 638–49
  2. Dahm P. Raising the bar for systematic reviews with Assessment of Multiple Systematic Reviews (AMSTAR). BJU Int 2017; 119: 193
  3. Shea BJ, Grimshaw JM, Wells GA et al. Development of AMSTAR: a measurement tool to assess the methodological quality of systematic reviews. BMC Med Res Methodol 2007; 7: 10


About the authors:

Dr Philipp Dahm is Professor of Urology and Vice Chair of Veterans Affairs at the University of Minnesota. He also serves as Director of Research and Education for Surgical Services at the Minneapolis Veterans Administration Medical Center (@EBMUrology).


Dr Jae Hung Jung is from the Department of Urology, Wonju College of Medicine, Yonsei University, Korea.




Changing the LATITUDE of Treatment for High-Risk Hormone-Naïve Prostate Cancer: STAMPEDE-ing Towards Androgen Biosynthesis Inhibition

Earlier this month at the annual American Society of Clinical Oncology (ASCO) meeting in Chicago, IL, Dr. Karim Fizazi and Dr. Nicholas James (@Prof_Nick_James) presented results from the LATITUDE and STAMPEDE trials, respectively. These randomized controlled trials (RCTs) assessed the utility of adding abiraterone acetate (AA) + prednisone to conventional androgen deprivation therapy (ADT) among men with high-risk, hormone-naïve prostate cancer. After Dr. Charles Huggins’ Nobel Prize-winning 1941 finding that ADT is highly effective in controlling metastatic prostate cancer, more than 70 years passed before CHAARTED and STAMPEDE demonstrated in 2015 that the addition of docetaxel to ADT prolongs survival in men with high-volume metastatic prostate cancer. The global incidence of de novo metastatic prostate cancer is striking: 3% in the US and rising, 6% across Europe, 4-10% in Latin America, and nearly 60% in Asia-Pacific. Historically, ADT has been the standard of care; however, most men with metastases progress to metastatic castration-resistant prostate cancer (mCRPC) driven by the reactivation of androgen receptor (AR) signaling. The rationale for adding AA + prednisone to ADT for metastatic hormone-naïve prostate cancer patients is threefold: (i) the mechanism of resistance to ADT may develop early, (ii) ADT alone does not inhibit androgen synthesis by the adrenal glands or prostate cancer cells, and (iii) AA + prednisone improves overall survival (OS) in mCRPC patients and reduces tumor burden in high-risk, localized prostate cancer.


LATITUDE was conducted at 235 sites in 34 countries in Europe, Asia-Pacific, Latin America, and Canada. The objectives of the study were to evaluate the addition of AA + prednisone to ADT on clinical benefit in men with newly diagnosed, high-risk, metastatic hormone-naïve prostate cancer. Patients were stratified by the presence of visceral disease (yes/no) and ECOG performance status (0, 1 vs 2) and then randomized 1:1 to either ADT + AA (1000 mg daily) + prednisone (5 mg) (n=597) or ADT + placebo (n=602). The co-primary endpoints were OS and radiographic progression-free survival (rPFS). Secondary endpoints included time to: (i) pain progression, (ii) PSA progression, (iii) next symptomatic skeletal event, (iv) chemotherapy, and (v) subsequent prostate cancer therapy. The study was powered to detect an HR of 0.67 and 0.81 in favor of AA for rPFS and OS, respectively.
Over a median follow-up of 30.4 months, patients treated with ADT + AA + prednisone had a 38% risk reduction of death (HR 0.62, 95%CI 0.51-0.76) compared to ADT + placebo.


Median OS was not yet reached in the ADT + AA + prednisone arm compared to 34.7 months in the ADT + placebo arm. The 3-year OS rate was 66% in the ADT + AA + prednisone arm, compared to 49% in the ADT + placebo arm. This OS benefit was consistently favorable across all subgroups including ECOG 0 and 1-2, visceral metastases, Gleason ≥8 disease, and bone lesions >10.

There was also a 53% reduction in the risk of radiographic progression or death for patients treated with ADT + AA + prednisone (median 33.0 months; HR 0.47, 95%CI 0.39-0.55) compared to ADT + placebo (14.8 months).
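The percentage risk reductions quoted for OS and rPFS follow directly from the reported hazard ratios as (1 − HR) × 100. The short Python sketch below illustrates that arithmetic using the two LATITUDE hazard ratios given above; the function name is hypothetical, and the result is a relative reduction in hazard, not an absolute difference in event rates.

```python
def relative_risk_reduction_percent(hazard_ratio: float) -> int:
    """Hypothetical helper: convert a hazard ratio into a rounded
    percentage relative reduction, computed as (1 - HR) * 100."""
    if hazard_ratio <= 0:
        raise ValueError("hazard ratio must be positive")
    return round((1.0 - hazard_ratio) * 100)


# Hazard ratios as reported for LATITUDE in the text above.
os_reduction = relative_risk_reduction_percent(0.62)    # overall survival
rpfs_reduction = relative_risk_reduction_percent(0.47)  # radiographic progression or death

print(f"OS: {os_reduction}% reduction, rPFS: {rpfs_reduction}% reduction")
```

With these inputs the sketch reproduces the 38% and 53% figures stated in the text.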


Secondary endpoints showed statistically significant improvement for ADT + AA + prednisone, including time to PSA progression (HR 0.30, 95%CI 0.26-0.35), time to pain progression (HR 0.70, 95%CI 0.58-0.83), time to next symptomatic skeletal event (HR 0.70, 95%CI 0.54-0.92), time to chemotherapy (HR 0.44, 95%CI 0.35-0.56), and time to subsequent prostate cancer therapy (HR 0.42, 95%CI 0.35-0.50).


Secondary to the results presented at ASCO, the study was discontinued after the first interim analysis. Adverse events were comparable in the two groups. Hypertension only rarely required treatment discontinuation, and only two patients discontinued treatment due to hypokalemia (no hypokalemia-related deaths). Two patients in each arm died of cerebrovascular events, and 10 patients treated with ADT + AA + prednisone compared to 6 patients treated with ADT + placebo died of cardiac disorders.


STAMPEDE is a large multi-stage, multi-arm RCT being conducted in the United Kingdom to assess the utility of novel therapeutic agents in conjunction with ADT. Currently being tested are AA, enzalutamide, zoledronic acid, docetaxel, celecoxib and radiotherapy (RT). The AA arm of the study was presented at ASCO as a late-breaking abstract. Inclusion criteria included men with locally advanced or metastatic prostate cancer, including newly diagnosed with N1 or M1 disease, or any two of the following: stage T3/4, PSA ≥ 40 ng/mL, or Gleason score 8-10. Patients undergoing prior radical prostatectomy or RT were eligible if they had more than one of the following: PSA ≥ 4 ng/mL and PSADT < 6 months, PSA ≥ 20 ng/mL, N1, or M1 disease. Patients were then randomized 1:1 to standard of care (SOC; ADT for ≥2 years, n=957) vs SOC + AA (1000 mg) + prednisone 5 mg daily (n=960). Treatment with RT was mandated in patients with N0M0 disease, while strongly encouraged for N1M0 patients. Primary outcomes were OS and failure-free survival (FFS), where failure was defined as PSA failure, local failure, lymph node failure, distant metastases or prostate cancer death. Secondary outcomes included toxicity and skeletal-related events (SREs). The study was powered to detect a 25% improvement in OS for the treatment group (requiring 267 control arm deaths).
Both groups were balanced and patients were predominantly metastatic (52% M1, 20% N+M0, 28% N0M0), median PSA was 53 ng/mL, and 99% were treated with LHRH analogues. Over a median follow-up of 40 months, there were 262 control arm deaths, of which 82% were prostate cancer-related; there were 184 deaths in the SOC + AA + prednisone arm. There was a 37% relative improvement in overall survival (HR 0.63, 95%CI 0.52-0.76) favoring SOC + AA + prednisone.


A forest plot split on stratification factors demonstrated no evidence of heterogeneity based on any of the factors, including M0/M1 status (p=0.37). Second, SOC + AA + prednisone demonstrated a 71% improvement in FFS (HR 0.29, 95%CI 0.25-0.34), with an early split in the KM curves.


SOC + AA + prednisone also significantly decreased SREs among the entire cohort (HR 0.46, 95%CI 0.37-0.58), as well as specifically in the M1 cohort (HR 0.45, 95%CI 0.37-0.58). This resulted in a 55% reduction in SREs in the M1 subset analysis.


When looking at treatment progression, 89% of the SOC arm went on to next line of therapy, whereas 79% of the SOC + AA + prednisone arm received additional therapy, most commonly docetaxel. As expected, the rate of Grade 3-5 adverse events was higher in the SOC + AA + prednisone arm (47% vs. 33%), and these events were primarily cardiovascular (HTN, MI, cardiac dysrhythmias) or hepatic (transaminitis) in nature.


As has become the norm during academic conferences, there was significant buzz on Twitter over the course of the two days these results were presented:


This also included the New England Journal of Medicine immediately tweeting after the presentations that LATITUDE and STAMPEDE were published instantaneously:


Furthermore, immediately following Dr. Fizazi’s presentation of LATITUDE, Dr. Eric Small from @UCSF presented a discussion of LATITUDE. A number of important points were raised. First, although this was a well-designed, placebo-controlled, randomized phase III study, the HR of 0.62 for OS that prompted early (although appropriate) unblinding is based on only 50% of the targeted total deaths. Conclusions based on interim analyses must be drawn with caution. However, given that every endpoint reached statistical significance, and on the basis of conditional probability modeling, the probability of reaching the same conclusions had the study remained blinded is high. Second, because twice as many patients in the ADT + placebo arm as in the ADT + AA + prednisone arm received life-prolonging therapy, the benefit of AA is not explained by more secondary life-prolonging therapy, strengthening the case for AA + ADT.

Perhaps the most interesting and pertinent clinical comparison is assessing outcomes of the LATITUDE and CHAARTED (high-volume disease) treatment arms (AA vs docetaxel). With similar median OS outcomes between the ADT control arms of the two trials (suggesting similar populations), the HRs for OS based on treatment are nearly identical:


Similarly, the rPFS outcomes were comparable between the two trials:


With nearly identical OS and rPFS outcomes for men receiving ADT + AA or ADT + docetaxel, the question becomes whether the impact of adding AA to ADT is volume or risk dependent. The remarkably similar outcomes in the STAMPEDE trial would suggest support for the use of AA + ADT in patients with less burden of disease. Arguably the most important slide of the meeting was captured and tweeted by Dr. Agarwal (@neerajaiims):


Dr. Small eloquently summarized future directions into two groups. Unanswered questions regarding efficacy include: (i) Can a genomic classifier be used to select patients more likely to benefit from AA or docetaxel? (ii) Can AA be added in even earlier settings (with radiation? Increasing PSAs?)? (iii) Should AA and docetaxel be combined or used sequentially? Additionally, there are also unanswered questions regarding AA resistance, including: (i) Will the mechanisms of resistance to AA be the same when used in the non-mCRPC setting? (ii) Will androgen receptor amplification still be observed? (iii) Will there be an increased risk of treatment-associated small cell/neuroendocrine prostate cancer? (iv) Does adding chemotherapy or AA to ADT result in more aggressive disease at the time of resistance? (v) What is the optimal therapy for a patient who progresses on ADT + AA, compared to a patient who progresses on ADT + docetaxel? Given the avoidance of potential chemotherapy-related side effects (e.g. neutropenic complications) with an oral, long-term treatment, AA + ADT should be considered standard of care for untreated, high-risk metastatic prostate cancer.

But what is the long-term economic landscape like when practice-changing trials such as LATITUDE and STAMPEDE suddenly thrust an expensive medication such as AA + prednisone directly to the forefront of hormone-naïve disease? Following these presentations, urologic oncologist, Twitter veteran, and Forbes correspondent Dr. Ben Davies (@daviesbj) wrote a provocative piece highlighting the potential ‘financial toxicity’ (particularly in the United States) that may result downstream of these trials:


A conservative estimate is a wholesale cost of $115,000 per year per patient for AA + prednisone, resulting in a crude estimate of a $2.8 billion annual expenditure for the drug in the United States alone if used in the hormone-naïve setting, according to Dr. Davies. As Dr. Davies also points out, although the patent for AA expired in 2016 and there are currently 13 applications to make generic AA, the patent covering its use with prednisone lasts until 2027, with $30 billion riding on the lawsuit. Dr. David Penson (@urogeek) succinctly summarized via Twitter:


Strictly academically speaking, LATITUDE and STAMPEDE, in addition to the docetaxel benefits of CHAARTED, have provided clinicians with exciting Level 1 evidence for improving patient care in the high-risk/metastatic setting. The investigators and, more importantly, the thousands of patients and families are to be thanked and congratulated for their perseverance, hard work, and willingness to participate in these practice-changing clinical trials. It is our job as clinicians to continue advocating the best treatment for our patients, whether this be through economic barriers in the United States, or access to appropriate care on a global scale.


Zach Klaassen, MD

Urologic Oncology Fellow

University of Toronto/Princess Margaret Cancer Centre

Toronto, Ontario, Canada



February Editorial: Raising the bar for systematic reviews with Assessment of Multiple Systematic Reviews (AMSTAR)

The BJUI has a longstanding track record in promoting the dissemination of high-quality unbiased evidence and helping its readership to understand why the principles of evidence-based medicine matter. This devotion is witnessed by the work that goes into every issue of the journal, as well as past initiatives such as providing a level of evidence rating for clinical research articles or publishing educational articles such as the ‘Evidence-Based Urology in Practice’ series [1, 2].

Major foci for clinically oriented specialty journals are systematic reviews and meta-analyses. Systematic reviews have a preeminent role in guiding the practice of evidence-based medicine by addressing focused clinical questions in a systematic, transparent and reproducible manner. Defining criteria of a high-quality systematic review include: an a priori registered protocol, a comprehensive search of multiple sources including unpublished studies (to avoid publication bias), an assessment of the quality of evidence that goes beyond study design alone, and a thoughtful interpretation of the findings. Systematic reviews inform clinicians and patients at the point of care, form the foundation of evidence-based clinical practice guidelines, and help shape health policy [3]. They also find frequent citation and can raise a journal’s impact factor. There is therefore more than one good reason for journals to care about the quality of systematic reviews.

Meanwhile, a study in this issue of the BJUI [4] shows that the methodological quality of systematic reviews published in the urological literature is modest, varies substantially, and has failed to improve over time. This contrasts with the reporting quality of randomised controlled trials, which appears to have improved substantially over time, probably due to increased awareness among clinical researchers, urology readers and journal reviewers [4, 5]. The study [4] used the Assessment of Multiple Systematic Reviews (AMSTAR), a validated 11-item instrument, to measure the methodological quality of systematic reviews with higher scores reflecting better quality.

The authors [4] surveyed four major urological journals and compared the periods 2013–2015 to 2009–2012 and 1998–2008. Despite a dramatic increase in the number of systematic reviews published each year, methodological quality has stagnated with mean AMSTAR scores ± standard deviations of 4.8 ± 2.4 (2013–2015; n = 125), 5.4 ± 2.3 (2009–2012; n = 113) and 4.8 ± 2.0 (1998–2008; n = 57). The average systematic review therefore has deficits in over half the 11 AMSTAR criteria and is of only modest quality, thereby undermining our confidence in its results. Although the mean AMSTAR score of 5.6 ± 2.9 for 25 systematic reviews published in the BJUI in 2013–2015 compared favourably to similar studies in other leading urology journals, the difference was not statistically significant.

What are we going to do about it? Inspired by these findings, the BJUI is launching a new initiative to raise awareness of the issue of methodological quality of systematic reviews among its readership and raise the bar for its contributors. Future systematic review authors will be asked to submit an AMSTAR-based checklist to provide enhanced transparency about their methods; this checklist will be reviewed as part of the editorial review process. Checklist items include documentation of an a priori written protocol and, ideally, registration of the systematic review through the Cochrane Collaboration or the Prospective Register of Systematic Reviews (PROSPERO). Such a protocol should outline all important steps of the review process including the definition of outcomes, study inclusion and exclusion criteria, details about the literature search, the study selection and data abstraction process, and the analytical approach including planned sensitivity and subgroup analyses. Authors should also rate the quality of evidence looking beyond study limitations alone by using an approach such as the Grading of Recommendations Assessment, Development, and Evaluation (GRADE), which recognises additional domains such as imprecision, inconsistency, indirectness and publication bias [6]. Critical steps of the systematic review process should be completed in duplicate to guard against random and systematic error and authors should provide readers with the information about who funded the studies included in the review, as well as their own potential conflicts of interests. To guard against publication bias, systematic review authors should also search for ongoing trials and unpublished studies through registries and abstract proceedings.

It is understood that the methodological handiwork that goes into the planning, execution and reporting of a systematic review does not assure clinical relevance or newsworthiness, nor does it address any issues surrounding the limited quality of studies that the review may be summarising. However, it is nevertheless a sine qua non to assure readers that they can be confident of the results. The new BJUI initiative will raise awareness of the issue of systematic review quality by providing a summary AMSTAR score to accompany each article. We hope that with this initiative we will provide a beacon for other specialty journals to follow, with the goal of raising the bar for all published systematic reviews and ultimately leading to improved patient care.

Philipp Dahm


Department of Urology, Minneapolis Veterans Administration Health Care System and University of Minnesota, Minneapolis, MN, USA



1 Dahm P, Preminger GM. Introducing levels of evidence to publications in urology. BJU Int 2007; 100: 246–7




4 Han JL, Gandhi S, Bockoven CG, Narayan VM, Dahm P. The landscape of systematic reviews in urology (1998 to 2015): an assessment of methodological quality. BJU Int 2016 [Epub ahead of print]. doi: 10.1111/bju.13653.


5 Narayan VM, Cone EB, Smith D, Scales CD Jr, Dahm P. Improved reporting of randomized controlled trials in the urologic literature. Eur Urol 2016; 70: 1044–9


6 Guyatt GH, Oxman AD, Vist GE et al. What is quality of evidence and why is it important to clinicians? BMJ 2008; 336: 995–8


© 2020 BJU International. All Rights Reserved.