In this edition of the BJUI, Ganesan et al. report a retrospective analysis of 552 ultrasonography (US) examinations, each followed by a non-contrast CT within 60 days, in 486 patients over an 18-year period (1995–2012). Compared with CT, the sensitivity of US for stone detection was 54% and its specificity was 91%. Sensitivity was positively associated with stone size (increasing from 73% for stones of 0–4 mm to 77% for 5–10 mm, and 89% for >10 mm; P < 0.001), but not with the intra-renal location of stones (P = 0.58). US overestimated the size of stones that were <10 mm (P < 0.001), and tended to underestimate the size of those >10 mm (P = 0.05).
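For readers less familiar with the terms, sensitivity and specificity follow directly from the counts of true and false positives and negatives against the reference standard (here, CT). The sketch below is illustrative only: the counts are hypothetical, chosen solely to reproduce the reported 54% and 91%, and are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of CT-confirmed stones that US also detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of CT-negative examinations that US also called negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (NOT study data), picked to match the reported figures.
us_sensitivity = sensitivity(true_pos=54, false_neg=46)
us_specificity = specificity(true_neg=91, false_pos=9)

print(f"sensitivity = {us_sensitivity:.0%}, specificity = {us_specificity:.0%}")
```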
Stones were grouped into three size categories, based on clinical relevance to stone management: ≤4 mm (where observation would likely be recommended), 5–10 mm (where shockwave lithotripsy [SWL] would be chosen) or >10 mm (where an endoscopic approach would be undertaken). Using these thresholds, 39% of cases would have been misassigned to observation and 14% of patients would have been inappropriately advised to undergo active treatment.
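The way a sizing error turns into a management error can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' method: the function names are hypothetical, and sizes falling between 4 and 5 mm are assigned to the middle group here, a boundary the three categories above do not specify.

```python
def management_category(size_mm: float) -> str:
    """Map the largest stone diameter (mm) to one of the three size groups."""
    if size_mm <= 4:
        return "observation"
    if size_mm <= 10:  # assumption: anything above 4 mm and up to 10 mm
        return "SWL"
    return "endoscopic"


def is_misassigned(us_size_mm: float, ct_size_mm: float) -> bool:
    """True when the US size and the CT size fall into different groups."""
    return management_category(us_size_mm) != management_category(ct_size_mm)


# Hypothetical example: a 4.0 mm stone on CT that US measures 1.9 mm larger.
ct_size = 4.0
us_size = ct_size + 1.9
print(management_category(ct_size), "->", management_category(us_size))
print("misassigned:", is_misassigned(us_size, ct_size))
```

A stone close to a threshold changes category with an error of only a millimetre or two, whereas the same error applied to a 7 mm stone leaves the category unchanged.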
One may question the use of CT as the ‘gold standard’, as CT is also prone to sizing inaccuracy. Nevertheless, the headline finding, that the inaccuracies inherent in US diagnosis and sizing may compromise clinical management, is important. Other authors have made similar observations: in a literature review, Ray et al. reported that US sensitivity was 45% for the detection of renal and ureteric calculi, with specificity up to 94% for ureteric stones and 88% for renal stones, and that US overestimated stone size by a mean of 1.9 mm over CT, especially with stones of <5 mm. Similarly, Sternberg et al. showed that the largest stone diameter was overestimated by an average of 2.2 mm with US, and that errors increased with reducing stone size, rising from a 3% difference in stones >10 mm to 27% for those of 5–10 mm, and an 85% difference in stones ≤5 mm.
It is well established that US, whilst having the advantage of no radiation dose, is a ‘user dependent’ study, but there are also inherent limitations of US compared to CT for stone imaging. CT is capable of much finer spatial resolution, whilst US is prey to more diagnostic confounders. Reflectivity arising from sinus fat or the edges of the papillae may be mistaken for small calculi. For size, it can be difficult to delineate stone edges with the same precision as with CT. The sensitivity of US for stone detection can be improved by switching between imaging modalities: ray line (the conventional form of US), spatial compound and harmonic imaging (the most accurate modality for stone size). Techniques such as increasing the gain and the transducer-to-stone depth and identifying ‘twinkle artefact’ using colour Doppler have also been used to improve stone detection.
However, manoeuvres to improve sensitivity of US may also compromise size measurement. An in vitro study has shown that each 2 cm increase in depth setting increases the size overestimation of stones by ~22%. Using calcium oxalate monohydrate stones, the same group have shown that measuring the posterior acoustic shadow provided a more precise assessment of stone size than measurement of the stone itself. Interestingly, the accuracy of stone width measurement was worse with greater transducer-to-stone depth, but measurement of the shadow width was independent of depth, and all US modalities (ray line, spatial compound, and harmonic imaging) performed similarly for shadow size. Shadow measurement was accurate to within 1 mm of the stone size, and similar findings have been shown in vivo, where 73% of the stone measurements and 85% of the shadow measurements were within 2 mm of the size on CT.
Unfortunately, not all stones cast an acoustic shadow, particularly the smaller ones, which are most likely to be over-sized. May et al. showed that 89% of stones >5 mm, but only 53% of stones <5 mm, demonstrated a posterior acoustic shadow. However, this may provide a further value for US-based clinical decision making, as stones that do not shadow are most likely <5 mm and are small enough to pass spontaneously, and therefore to be managed conservatively.
It is important to be aware that CT stone measurements are also prone to error and inter-observer variability. Comparing in vitro CT measurements of stones in a ‘kidney sized potato model’, Eisner et al. have shown that the most accurate measurements were obtained using magnified ‘bone window’ settings, which showed a mean 0.13 mm difference compared to a ‘gold standard’ measurement using callipers. This study also included a comparison of size estimates for spontaneously passed ureteric stones (thus a true reference standard), demonstrating that magnified ‘bone window’ measurements were equivalent to digital calliper measurements (the mean underestimation vs digital callipers was only 0.3 mm, P = 0.4), while measurements using magnified soft tissue windows were statistically different (mean underestimation 1.4 mm, P = 0.001).
With its safety and accessibility, US should be the ideal modality for postoperative follow-up: for assessing stone recurrence, for monitoring enlargement of residual fragments, and for identifying the rare but important finding of ‘silent obstruction’, with its potential for loss of renal function. However, given the ‘real-life’ data reported in this edition of the BJUI, and particularly the findings that 22% of patients might have been managed inappropriately when using US alone for decision making, increasing to 43% of patients who had stones between 5 and 10 mm on US, the authors have concluded that patients monitored by US might benefit from an additional CT if intervention is being considered, particularly for stones in the 5–10 mm range by US measurement.
Given the key importance of stone size to the outcome of interventions for stone disease, accurate imaging should translate into improved decision making and patient counselling and allow fairer inter-surgeon and departmental comparisons. Until the best US protocol and settings have been established, we recommend careful optimisation of the settings whenever US is used for diagnosis or follow-up. Colour Doppler for ‘twinkle artefact’ and a high gain setting can be used to reduce the risk of missing stones, combined with removing all filtering and compressing the grey scale range to enhance the posterior shadowing. Harmonic imaging (which is now available on most commercial machines) is more accurate than the cross beam or compound beam modes used for standard renal US settings. When decisions need to be made, particularly those based on stone size, CT of the kidneys, ureters and bladder remains invaluable; the longest stone diameter should be measured from it using magnified images and the ‘bone window’ setting. Current methods for accurate estimation of stone volume are impractical or imprecise. Manual segmentation can be accurate but is laborious, whilst standard semi-ellipsoid formulae cannot account for the wide variety of stone shapes seen in practice. Further studies devoted to simplifying stone volume estimation are necessary. There is also the wider challenge of how best to report stone imaging data. The key variables are stone size, density and location, and the morphology of the collecting system. Agreement between the various stakeholders – sonographers, radiologists and endourologists – over imaging standards and a minimal data set for stone imaging would improve management.
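To illustrate why formula-based volume estimates are imprecise, the sketch below uses the generic ellipsoid approximation, V = π/6 × length × width × height. This is an assumption about the general form of such formulae, not the specific formula used in any study cited here, and the function name and example dimensions are hypothetical.

```python
import math


def ellipsoid_volume_mm3(length_mm: float, width_mm: float, height_mm: float) -> float:
    """Volume of an ellipsoid from its three orthogonal diameters (mm)."""
    return math.pi / 6.0 * length_mm * width_mm * height_mm


# Two hypothetical stones with the same longest diameter (10 mm).
round_stone = ellipsoid_volume_mm3(10, 10, 10)
flat_stone = ellipsoid_volume_mm3(10, 6, 4)

print(f"round: {round_stone:.0f} mm^3, flat: {flat_stone:.0f} mm^3")
```

Both stones would be reported as 10 mm on longest diameter, yet their estimated volumes differ roughly fourfold, and neither estimate captures irregular or branched shapes.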