Characterization of high-grade prostate cancer at multiparametric MRI: assessment of PI-RADS version 2.1 and version 2 descriptors across 21 readers with varying experience (MULTI study)

Di Franco, Florian; Souchon, Rémi; Crouzet, Sébastien; Colombel, Marc; Ruffion, Alain; Klich, Amna; Almeras, Mathilde; Milot, Laurent; Rabilloud, Muriel; Rouvière, Olivier

doi:10.1186/s13244-023-01391-z

Original Article
Open access
Published: 20 March 2023


Florian Di Franco^1,
Rémi Souchon^2,
Sébastien Crouzet^2,3,4,5,
Marc Colombel^3,4,5,
Alain Ruffion^3,6,7,8,
Amna Klich^9,10,
Mathilde Almeras^9,10,
Laurent Milot^1,2,3,8,
Muriel Rabilloud^3,9,10 &
Olivier Rouvière ORCID: orcid.org/0000-0002-0030-478X^1,2,3,4
on behalf of the MULTI Study Group

Insights into Imaging volume 14, Article number: 49 (2023)


Abstract

Objective

To assess PI-RADSv2.1 and PI-RADSv2 descriptors across readers with varying experience.

Methods

Twenty-one radiologists (7 experienced (≥ 5 years) seniors, 7 less experienced seniors and 7 juniors) assessed 240 ‘predefined’ lesions from 159 pre-biopsy multiparametric prostate MRIs. They specified their location (peripheral, transition or central zone) and size, and scored them using PI-RADSv2.1 and PI-RADSv2 descriptors. They also described and scored ‘additional’ lesions if needed. Per-lesion analysis assessed the ‘predefined’ lesions, using targeted biopsy as reference; per-lobe analysis included ‘predefined’ and ‘additional’ lesions, using combined systematic and targeted biopsy as reference. Areas under the curve (AUCs) quantified the performance in diagnosing clinically significant cancer (csPCa; ISUP ≥ 2 cancer). Kappa coefficients (κ) or concordance correlation coefficients (CCC) assessed inter-reader agreement.

Results

At per-lesion analysis, inter-reader agreement on location and size was moderate-to-good (κ = 0.60–0.73) and excellent (CCC ≥ 0.80), respectively. Agreement on PI-RADSv2.1 scoring was moderate (κ = 0.43–0.47) for seniors and fair (κ = 0.39) for juniors. Using PI-RADSv2.1, juniors obtained a significantly lower AUC (0.74; 95% confidence interval [95%CI]: 0.70–0.79) than experienced seniors (0.80; 95%CI 0.76–0.84; p = 0.008) but not than less experienced seniors (0.74; 95%CI 0.70–0.78; p = 0.75). As compared to PI-RADSv2, PI-RADSv2.1 downgraded 17 lesions/reader (interquartile range [IQR]: 6–29), of which 2 (IQR: 1–3) were csPCa; it upgraded 4 lesions/reader (IQR: 2–7), of which 1 (IQR: 0–2) was csPCa. Per-lobe analysis, which included 60 (IQR: 25–73) ‘additional’ lesions/reader, yielded similar results.

Conclusions

Experience significantly impacted lesion characterization using PI-RADSv2.1 descriptors. As compared to PI-RADSv2, PI-RADSv2.1 tended to downgrade non-csPCa lesions, but this effect was small and variable across readers.

Key points

1. Juniors characterized aggressive cancers less well than experienced seniors on prostate MRI.
2. Agreement between readers remained moderate even for experienced readers.
3. As compared to version 2, PI-RADSv2.1 descriptors tended to show improved specificity.

Introduction

Interpretation of prostate multiparametric magnetic resonance imaging (MRI) is challenging because of potential discordance between findings from the different pulse sequences and substantial overlap between the appearance of benign and malignant conditions. These difficulties led to the creation of the Prostate Imaging-Reporting and Data System (PI-RADS). For each pulse sequence, semi-objective descriptors are used to classify lesions into specific categories. These categories are then combined into a final score assessing the likelihood of clinically significant prostate cancer (csPCa). PI-RADS version 2 (PI-RADSv2) showed good performance but moderate inter-reader agreement [1,2,3,4,5,6,7,8,9]. Version 2.1 (PI-RADSv2.1) was published in 2019 to address PI-RADSv2 limitations and improve reproducibility by clarifying some descriptors [10].

Although PI-RADSv2.1 has been extensively evaluated [11,12,13,14,15,16,17,18,19,20,21,22], meta-analyses yielded discordant results on the relative diagnostic performance of PI-RADSv2 and PI-RADSv2.1 [23,24,25]. In particular, whether PI-RADSv2.1 improves inter-reader agreement remains unclear.

MRI interpretation can be broken down into two phases: the detection phase, in which the radiologist sees the lesion, and the characterization phase, in which they assess its degree of suspicion. Each phase contributes to the scoring performance and variability.

In this study, we focussed on the characterization phase by asking 21 readers with varying experience to assess, using PI-RADSv2.1 and PI-RADSv2 descriptors, the same set of MRI lesions with known histology. Our primary objective was to determine whether these descriptors were precise enough to allow readers to assign similar scores to the same lesions.

Materials and methods

Prospective biopsy database

As of September 2008, consecutive patients undergoing prostate MRI and subsequent biopsy at our institution were included in a prospective database after signing institutional review board-approved consent forms [26]. MRIs combined T2-weighted (T2w), diffusion-weighted (Dw) and dynamic contrast-enhanced (DCE) imaging at 1.5 T or 3 T. Transrectal biopsies combined systematic and targeted cores obtained under cognitive or MRI/ultrasound fusion (Urostation, Koelis) depending on the lesions’ location and the operator’s preference. Two to five targeted cores were taken from each lesion and at least two systematic cores (one paramedian, one lateral) from each PZ sextant. The operator could omit systematic cores from PZ sextants with lesions targeted at biopsy. TZ was biopsied only if it contained suspicious lesions.

Readers

Twenty-one radiologists (14 seniors, 7 juniors), from nine different private and public hospitals, participated in the study. Seven seniors (experienced seniors) had more than 5 years and seven (less experienced seniors) less than 5 years of experience. Four juniors had completed a 6-month rotation in a department of uroradiology, three had passed an advanced diploma in genitourinary imaging, and two had no experience in prostate imaging (Additional file 1: I). Before starting the study, juniors took a 2-h class on PI-RADS scoring. Then, all readers attended a meeting during which representative cases were presented and differences between PI-RADSv2 and PI-RADSv2.1 were discussed.

Study sample

Consecutive biopsy-naïve patients included in the biopsy database between September 2015 and July 2016 were retrospectively selected. September 2015 corresponded to the date of implementation of PI-RADSv2 guidelines at our institution (Additional file 1: II). July 2016 was chosen to allow for at least four years of follow-up. These dates were also chosen because during that period, biopsy operators were instructed to target all focal lesions, even those with a low degree of suspicion, resulting in a large variety of targeted lesions.

Readers were given a four-month period (September-December 2019) to interpret the MRIs of the study sample. They were blinded to clinical and histological data, and to each other’s assessment.

Predefined lesions

First, readers assessed the ‘predefined lesions’, i.e. the MRI lesions targeted at biopsy. These were indicated on one T2w image. Readers were informed that, at the time the sample was acquired, biopsy operators were instructed to target all focal lesions, and thus, that a substantial proportion of the predefined lesions was expected to be benign. Nonetheless, the proportion of benign lesions and csPCas in the sample was not disclosed.

Readers noted the lesions’ maximal diameter, side and location (PZ, TZ or central zone (CZ)). When lesions extended into several zones, the zone in which most of the lesion was located was selected.

Then, readers defined the lesions’ PI-RADSv2 and PI-RADSv2.1 categories, for each pulse sequence, following as closely as possible the manual definitions of these categories (Additional file 1: II). The lesions’ final PI-RADSv2 and PI-RADSv2.1 scores were automatically calculated based on their location, size and pulse sequence categories.

Additional lesions

If needed, readers could note additional lesions that had not been targeted at biopsy. They defined, for each ‘additional lesion’, its location, diameter and pulse sequence categories according to PI-RADSv2 and PI-RADSv2.1 manual definitions. The overall scores were automatically calculated.

Per-lobe and per-patient scores

The PI-RADSv2 and PI-RADSv2.1 scores of each prostate lobe/patient were computed by selecting the highest scores of the predefined and additional lesions described in this lobe/patient. Lobes or patients with no lesion received default PI-RADSv2 and PI-RADSv2.1 scores of 1 (Additional file 1: III).
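The aggregation rule described above (highest lesion score per lobe or patient, default score of 1 when no lesion is described) can be sketched as follows. This is only an illustration: the function names, the tuple layout and the assumption that each lesion is assigned to exactly one lobe ("left" or "right") are hypothetical and are not taken from the study's own code.

```python
def lobe_scores(lesions, patient_ids):
    """Highest PI-RADS score per (patient, lobe).

    lesions: iterable of (patient_id, side, score), side in {"left", "right"}.
    patient_ids: every patient in the sample, including those without lesions.
    Lobes with no described lesion keep the default score of 1.
    """
    scores = {(pid, side): 1 for pid in patient_ids for side in ("left", "right")}
    for pid, side, score in lesions:
        key = (pid, side)
        scores[key] = max(scores.get(key, 1), score)
    return scores


def patient_scores(lobe_score_map):
    """Highest lobe score per patient (default 1 if both lobes are negative)."""
    result = {}
    for (pid, _side), score in lobe_score_map.items():
        result[pid] = max(result.get(pid, 1), score)
    return result
```

For example, a patient with two left-lobe lesions scored 3 and 4 and no right-lobe lesion would receive a left-lobe score of 4, a right-lobe score of 1 and a patient-level score of 4.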

Follow-up

Follow-up data were retrieved in June–September 2020. The medical files of the patients without csPCa at initial biopsy were searched for any additional prostate biopsy performed during follow-up. Patients without follow-up at our institution were contacted by telephone or through their general practitioner.

Reference standard and csPCa definition

For characterizing predefined lesions, targeted biopsy findings were used as reference standard. For per-lobe and per-patient analysis that took into account predefined and additional lesions, combined targeted and systematic biopsy findings were used as reference standard. csPCa was defined as International Society of Urological Pathology (ISUP) grade ≥ 2 cancer.

Statistical analysis

Quantitative characteristics were described using medians and interquartile ranges (IQRs). Qualitative characteristics were described using absolute and relative frequencies.

A mixed probit regression corresponding to the binormal model was used to model the receiver operating characteristic (ROC) curves according to the reader’s experience, with the reader as random effect [27, 28]. Regression coefficients for experienced and less experienced seniors in comparison to juniors made it possible to quantify and test the effect of the reader’s experience on the diagnostic performance of the scores. The model was also used to predict the ROC curve for each category of readers. Areas under the curve (AUCs) were estimated using the binormal method [28]. Stratified bootstrap with sampling at the level of patients within strata defined by the presence or absence of csPCa was used to build 95% confidence intervals (CIs) for the AUCs. A logistic mixed model was used to model sensitivity and specificity according to the reader’s experience, with the reader as random effect. Sensitivities and specificities were estimated with their 95% CIs for predefined thresholds of PI-RADS scores of ≥ 3 and ≥ 4. Inter-reader agreement was estimated using Cohen’s kappa coefficient (κ) for location and DCE categories, concordance correlation coefficient (CCC) for size, and weighted κ for T2w and Dw categories and overall scores. Coefficients of ≤ 0.20, 0.21–0.40, 0.41–0.60, 0.61–0.80 and > 0.80 indicate poor, fair, moderate, good and excellent agreement, respectively.
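To make the agreement scale concrete, the sketch below computes an unweighted Cohen's κ for two readers and maps a coefficient onto the bands listed above. It is a minimal illustration in Python (the study used R). The function names are hypothetical, and the text does not say how pairwise coefficients were pooled across the seven readers of each group, so only the two-reader case is shown.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two readers rating the same lesions."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("both readers must rate the same non-empty set of lesions")
    # Observed proportion of lesions on which the two readers agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each reader's marginal frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both readers use a single category")
    return (observed - expected) / (1.0 - expected)


def agreement_level(coefficient):
    """Map a kappa (or CCC) value onto the bands used in the text."""
    if coefficient <= 0.20:
        return "poor"
    if coefficient <= 0.40:
        return "fair"
    if coefficient <= 0.60:
        return "moderate"
    if coefficient <= 0.80:
        return "good"
    return "excellent"
```

With these bands, the κ of 0.39 reported for juniors falls in the "fair" range and the 0.43–0.47 reported for seniors in the "moderate" range.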

Similar analyses were performed at lobe and patient level. R software, version 3.6.1 (https://cran.r-project.org) was used for analysis. This study is registered with ClinicalTrials.gov, number NCT04299997.
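As a rough illustration of the AUC estimation described in this section: under the binormal model, a ROC curve of the form ROC(t) = Φ(a + b·Φ⁻¹(t)) has an area of Φ(a / √(1 + b²)). The sketch below shows that closed form together with a patient-level stratified bootstrap. It is written in Python rather than R, and the function names, the data layout and the use of an empirical (Mann-Whitney) AUC inside the bootstrap are assumptions made for illustration; the study itself derived AUCs from the mixed probit model.

```python
import math
import random
from statistics import NormalDist


def binormal_auc(a, b):
    """AUC of the binormal ROC curve ROC(t) = Phi(a + b * Phi^-1(t))."""
    return NormalDist().cdf(a / math.sqrt(1.0 + b * b))


def empirical_auc(pos_scores, neg_scores):
    """Mann-Whitney AUC: P(score_pos > score_neg) + 0.5 * P(tie)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def stratified_bootstrap_ci(patients, n_boot=1000, alpha=0.05, seed=0):
    """Percentile CI of the lesion-level AUC, resampling whole patients.

    patients: list of dicts {"cspca": bool, "lesions": [(score, is_cspca), ...]}.
    Patients are drawn with replacement separately within the csPCa-positive
    and csPCa-negative strata, so every lesion of a drawn patient stays together.
    """
    rng = random.Random(seed)
    strata = [
        [p for p in patients if p["cspca"]],
        [p for p in patients if not p["cspca"]],
    ]
    aucs = []
    for _ in range(n_boot):
        sample = [rng.choice(stratum) for stratum in strata for _ in stratum]
        pos = [s for p in sample for s, y in p["lesions"] if y]
        neg = [s for p in sample for s, y in p["lesions"] if not y]
        if pos and neg:  # skip degenerate resamples
            aucs.append(empirical_auc(pos, neg))
    if not aucs:
        raise ValueError("no usable bootstrap resample")
    aucs.sort()
    low = aucs[int(alpha / 2 * len(aucs))]
    high = aucs[min(len(aucs) - 1, int((1 - alpha / 2) * len(aucs)))]
    return low, high
```

Resampling patients rather than lesions keeps lesions from the same prostate together, which is the reason the text specifies sampling "at the level of patients".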

Results

Study sample

A total of 159 patients imaged at 1.5 T (n = 77) or 3 T (n = 82) were included (Fig. 1, Table 1). MRI scanners and protocols are detailed in Additional file 1: IV. Twelve patients had normal MRI, and 240 lesions were targeted in the 147 remaining patients. These 240 lesions constituted the ‘predefined lesions’ corpus.

Table 1 Patients’ characteristics


Predefined lesions

Agreement on location, size and PI-RADS categories

Agreement on lesions’ location was moderate-to-good (κ = 0.60–0.73), with experienced seniors obtaining the highest κ (Table 2). Perfect agreement across all readers was reached in only 142/240 lesions (PZ, n = 133; TZ, n = 9; CZ, n = 0). Depending on the reader, a median number of 204 (IQR, 202–210), 26 (IQR, 23–28) and 10 (IQR, 6–12) lesions were localized in PZ, TZ and CZ, respectively (Additional file 1: V.1). Agreement on size was excellent (CCC ≥ 0.80) for all groups of readers (Table 2).
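For readers unfamiliar with the concordance correlation coefficient used for lesion size, the sketch below implements Lin's CCC for two readers' diameter measurements. It is an illustrative Python version with a hypothetical function name; how the study pooled the coefficient across more than two readers is not described here and is not reproduced.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2),
    using population (1/n) moments. Unlike Pearson's r, it penalizes a
    systematic offset between the two readers.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((v - mean_x) ** 2 for v in x) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when all measurements are identical")
    return 2.0 * cov / denom
```

Two readers whose measurements differ by a constant 1 mm on lesions of 1, 2 and 3 mm would be perfectly correlated (r = 1) but have a CCC of only 4/7, which is why CCC is preferred to a plain correlation for agreement on size.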

Table 2 Inter-reader agreement (analysis of the 240 predefined lesions)


Agreement on PI-RADSv2.1 T2w and Dw categories was moderate (κ = 0.42–0.58 and κ = 0.48–0.57, respectively) and tended to increase with experience. For DCE categories, agreement was fair (κ = 0.30–0.38) for all groups of readers. Similar findings were obtained with PI-RADSv2 categories (Table 2).
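The weighted κ used for the ordinal T2w and Dw categories gives partial credit when two readers assign neighbouring categories. The sketch below shows one common form for two readers. The linear weighting scheme is an assumption, since the text does not state whether linear or quadratic weights were used, and the function name is hypothetical.

```python
def weighted_kappa(ratings_a, ratings_b, categories=(1, 2, 3, 4, 5)):
    """Linearly weighted Cohen's kappa for two readers on an ordinal scale."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("both readers must rate the same non-empty set of lesions")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    # Cross-tabulate the two readers' categories.
    table = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        table[index[a]][index[b]] += 1
    row = [sum(r) for r in table]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # disagreement weight, 0 on the diagonal
            observed += weight * table[i][j] / n
            expected += weight * row[i] * col[j] / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when both readers use a single category")
    return 1.0 - observed / expected
```

With this weighting, two readers who disagree only by one category on a five-point scale obtain a higher coefficient than two readers who disagree equally often but by four categories.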

PI-RADS scores

Inter-reader agreement for PI-RADSv2.1 scoring was moderate for seniors (κ = 0.43–0.47) and fair for juniors (κ = 0.39; Table 2). Using PI-RADSv2.1, juniors obtained a significantly lower AUC (0.74 [95%CI, 0.70–0.79]) than experienced seniors (0.80 [95%CI, 0.76–0.84], p = 0.008), but not than less experienced seniors (0.74 [95%CI, 0.70–0.78], p = 0.75). Experienced seniors tended to show higher specificity, but the difference was not statistically significant (Tables 3–4, Additional file 1: V.2-V.5).
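Sensitivity and specificity at the two predefined thresholds (PI-RADS ≥ 3 and ≥ 4) reduce, for a single reader, to simple proportions. The sketch below shows those raw proportions only. It is a simplified illustration with hypothetical names and does not reproduce the logistic mixed model with a reader random effect used in the study.

```python
def sensitivity_specificity(scores, is_cspca, threshold):
    """Raw sensitivity and specificity when 'score >= threshold' is called positive."""
    tp = fn = tn = fp = 0
    for score, positive in zip(scores, is_cspca):
        called_positive = score >= threshold
        if positive and called_positive:
            tp += 1
        elif positive:
            fn += 1
        elif called_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one csPCa and one non-csPCa lesion")
    return tp / (tp + fn), tn / (tn + fp)
```

Raising the threshold from 3 to 4 can only lower sensitivity and raise specificity (or leave them unchanged), which is the trade-off summarized for each reader group in Table 4.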

Table 3 PI-RADSv2.1 and PI-RADSv2 scores assigned by the three groups of readers


Table 4 Sensitivities and specificities obtained by the three groups of readers using PI-RADSv2.1 and PI-RADSv2 scoring


Similar findings were obtained with PI-RADSv2 (Tables 2–4, Additional file 1: V.2–V.5). All groups of readers tended to assign lower scores to non-csPCa lesions using PI-RADSv2.1 than using PI-RADSv2. As compared to PI-RADSv2, PI-RADSv2.1 downgraded a median number of 17 lesions per reader (IQR, 6–29), of which 2 (IQR, 1–3) were csPCa. It upgraded a median number of 4 lesions per reader (IQR, 2–7), of which 1 (IQR, 0–2) was csPCa. The most frequent downgradings were from PI-RADS scores of 3 to 2 and 4 to 2. In TZ, a median number of 2 lesions (IQR, 0–2) were downgraded from a score of 3 to 2, and a median number of 1 lesion (IQR, 0–2) was upgraded from a score of 2 to 3 (Additional file 1: V.6-V.7).
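The per-reader counts of downgraded and upgraded lesions, and their summary as a median with IQR across readers, can be sketched as below. The function names and data layout are hypothetical, and the quartile convention (Python's "inclusive" method) is an assumption, since the text does not state which one was used.

```python
from statistics import quantiles


def count_score_changes(v2_scores, v21_scores, is_cspca):
    """Count lesions whose score changes between PI-RADSv2 and PI-RADSv2.1
    for one reader, and how many of the changed lesions are csPCa."""
    counts = {"downgraded": 0, "downgraded_cspca": 0,
              "upgraded": 0, "upgraded_cspca": 0}
    for old, new, cancer in zip(v2_scores, v21_scores, is_cspca):
        if new < old:
            counts["downgraded"] += 1
            counts["downgraded_cspca"] += int(cancer)
        elif new > old:
            counts["upgraded"] += 1
            counts["upgraded_cspca"] += int(cancer)
    return counts


def median_iqr(values):
    """Median and (Q1, Q3) of one count across readers."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)
```

Applying `count_score_changes` to each of the 21 readers and passing the resulting list of `downgraded` counts to `median_iqr` would give a summary in the same form as the "17 lesions per reader (IQR, 6–29)" reported above.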

Additional lesions

Readers described a median number of 60 ‘additional lesions’ (IQR, 25–73; Additional file 1: VI.1).

Per-lobe and per-patient scores

At per-lobe analysis, after taking into consideration predefined and additional lesions, inter-reader agreement for PI-RADSv2.1 scoring was moderate-to-good (κ = 0.54–0.63; Table 5). Using PI-RADSv2.1, juniors obtained a significantly lower AUC (0.79 [95%CI, 0.75–0.83]) than experienced seniors (0.82 [95%CI, 0.79–0.86], p = 0.03), but not than less experienced seniors (0.79 [95%CI, 0.76–0.83], p = 0.71). Experienced seniors tended to show higher specificity, but the difference was not statistically significant (Table 5, Additional file 1: VI.2–VI.5).

Table 5 Inter-reader agreement (per-lobe analysis)


Similar findings were obtained with PI-RADSv2 (Tables 3–5, Additional file 1: VI.2–VI.5). As compared to PI-RADSv2, PI-RADSv2.1 downgraded a median number of 66 lobes per reader (IQR, 35–94), of which 6 (IQR, 1–11) contained csPCa at biopsy (Fig. 2). It upgraded a median number of 5 lobes per reader (IQR, 2–8), of which 1 (IQR, 0–2) contained csPCa. The most frequent downgradings were from PI-RADS scores of 2 to 1, 4 to 2 and 3 to 2 (Additional file 1: VI.6).

Per-patient analysis showed concordant results (Additional file 1: VII).

Follow-up

Of the 96 patients without csPCa at initial biopsy, 7 with an ISUP 1 cancer received immediate radical treatment. During a median follow-up of 51 months (IQR, 45–55), 7 of the 88 remaining patients were diagnosed with an ISUP 2 cancer and none with an ISUP ≥ 3 cancer.

Discussion

To specifically evaluate the characterizing value of the PI-RADSv2/v2.1 descriptors, we asked the readers to score the exact same corpus of lesions. To be clinically meaningful, this corpus had to include lesions with a large range of degrees of suspicion. Therefore, we selected consecutive patients who underwent MRI and biopsy at our institution in 2015–2016. At that time, our biopsy policy required targeting all focal lesions, even those with a low degree of suspicion. Biopsy operators could omit systematic biopsy in PZ sextants that had targeted biopsy, which allowed targeting several lesions without unreasonably increasing the number of cores taken. Hence, 92.5% (147/159) of the study patients underwent targeted biopsy while the csPCa prevalence was only 39.6% and 33% at patient and lesion level, respectively. Furthermore, in accordance with the recommendations of the time [29], MRI was not used to select patients for biopsy but only to indicate the lesions to target, which limited selection bias.

This set of predefined lesions was first used to assess inter-reader agreement on lesion size and location. Agreement on size was excellent (CCC ≥ 0.80). The overall agreement on lesion location (PZ, TZ or CZ) was moderate-to-good (κ = 0.60–0.73). Only 59% (142/240) of the predefined lesions were localized in the same zone by all readers. This is problematic since PZ and TZ lesions are scored differently, using different dominant sequences. Additionally, CZ lesions are also assessed differently, at least using PI-RADSv2.1 descriptors [10]. Thus, any variability in lesion location can have major consequences on the final scoring agreement. Variability in lesion location can be explained by two main factors. First, due to the lack of well-defined anatomical landmarks between CZ and PZ, the number of lesions localized in CZ was highly variable from one reader to another. Second, partial volume effects in some locations (e.g. anterior horn of the PZ, extreme apex) made it difficult to distinguish between PZ lesions and TZ nodules protruding into the PZ. 3D T2w acquisitions with multiplanar reformations might facilitate lesion location by reducing partial volume effects. Unfortunately, in this study, readers only had access to 2D T2w axial and sagittal imaging.

As others [30], we found that experienced seniors performed significantly better, mostly because they assigned lower scores to non-csPCa lesions. However, the impact of experience on inter-reader agreement was small and agreement remained moderate at best, even for experienced seniors. This is discordant with another study in which inter-reader agreement was substantial and better between dedicated uro-radiologists than between non-dedicated radiologists. However, in that study, all radiologists were from the same institution, which may have reduced interpretation variability, particularly among dedicated radiologists [15]. Taken together, our results suggest that, despite continuous efforts of standardization and clarification, most PI-RADS descriptors remain subjective. Distinguishing ‘marked’ from ‘non-marked’ abnormalities, ‘encapsulated’ from ‘mostly encapsulated’ nodules, or ‘focal’ from ‘non-focal’ enhancement is subjective but has a major effect on the final score. Interestingly, for PI-RADSv2.1 and PI-RADSv2, and for all groups of readers, κ values tended to be higher for T2-weighted and diffusion-weighted categories than for DCE categories. Although this finding should be interpreted with care since not all pulse sequences have the same number of categories, it may suggest that visually distinguishing positive from negative cases is difficult at DCE, especially in the presence of subtle enhancements from background.

Several solutions for improving MRI reproducibility can be suggested. Mentoring through systematic double reading with an experienced reader could probably accelerate the training of beginners, but this is made difficult by the heavy workload of radiologists [31]. Using quantitative thresholds for apparent diffusion coefficient or DCE-derived parameters may also improve prostate MRI accuracy and inter-reader agreement [16, 32,33,34], but there is still progress to be made on the reproducibility of MRI biomarkers [35,36,37,38]. Finally, assistance by Artificial Intelligence algorithms may facilitate prostate MRI reading in the future; however, conflicting results have been recently published on this matter [39,40,41,42,43,44,45].

Our sample size was not designed to statistically compare PI-RADSv2.1 and PI-RADSv2 performances, because the difference was expected to be small. Meaningful comparison would have needed an unrealistic number of patients. Yet, the strict application of PI-RADSv2.1 descriptors in predefined lesions tended to yield lower scores in non-csPCa lesions as compared to PI-RADSv2 descriptors. This was mainly observed in PZ lesions for which the PI-RADSv2.1 clarifications on Dw imaging categories 2, 3 and 4 seem to have favoured better characterization. However, this effect was too small and too heterogeneous across readers to induce a substantial difference between the AUCs of the two scores. Additionally, PI-RADSv2.1 clarifications did not improve inter-reader agreement.

After assessing the predefined lesions, readers were allowed to describe additional suspicious lesions. This was designed to evaluate whether the new PI-RADSv2.1 upgrading rules in TZ increased the number of suspicious lesions as compared to PI-RADSv2. In accordance with other studies [12,13,14, 18], we found that such upgradings were rare. As a result, per-lobe analysis, which included predefined and additional lesions, showed results similar to those of per-lesion analysis: experienced seniors out-performed the two other groups of readers, and, in all groups of readers, PI-RADSv2.1 showed a trend toward improved specificity as compared to PI-RADSv2. Of note, the number of additional lesions was highly variable across readers, with juniors tending to describe more lesions than seniors.

In this study, experienced readers were a priori defined as having more than 5 years of experience. A recent European consensus suggested that a minimum of 1000 cases should be read to become an expert [31]. All our experienced seniors fulfilled that condition, and our results are in line with those of the European consensus.

Readers assessed PI-RADSv2 and PI-RADSv2.1 during the same session. This may have resulted in underestimating the differences between the scores. However, independent scoring is illusory; most readers were familiar with the PI-RADSv2 descriptors and would have kept them in mind when using the new PI-RADSv2.1 criteria. In addition, assigning the scores in two different sessions introduces intra-reader variability, which may be substantial [46, 47]. Because reading the cases needed approximately 15–20 h, we were also afraid that the second reading would be biased by fatigue and the gradual lack of involvement of the readers. Thus, we chose to ask the readers to concentrate, during the same reading session, on the assessment of each pulse sequence category by following as closely as possible the written PI-RADSv2 and PI-RADSv2.1 descriptors without minding the overall score that was calculated automatically.

Our study has limitations. Firstly, because we indicated the predefined lesions to the readers, the AUCs obtained herein do not fully assess the diagnostic performance of the PI-RADS score in clinical routine. The detection phase, which is also a source of interpretation variability, was outside the scope of this study. However, many other studies have already assessed the overall performance of the PI-RADS score [23,24,25]. Instead, we wanted to specifically evaluate whether the PI-RADS descriptors were specific enough to induce reproducible characterization of the same lesion across multiple readers. This allowed the evaluation of factors of variability (size, location, PI-RADS categories of each pulse sequence) that, to our knowledge, had not been studied before. Secondly, prostate biopsy, used as reference standard, may have missed some csPCas. However, the small proportion of aggressive cancers detected during follow-up suggests that the sensitivity of our biopsy technique was good. Thirdly, we included only biopsy-naïve patients. Our results may not be valid for other populations.

In conclusion, when assessing the same set of MRI lesions using PI-RADSv2.1 and PI-RADSv2 descriptors, experienced seniors performed significantly better in characterizing csPCa than the other groups of readers. PI-RADSv2.1 descriptors tended to be more specific than PI-RADSv2 descriptors, but did not improve inter-reader variability.

Availability of data and materials

The biopsy database was collected at the Hospices Civils de Lyon and is not publicly available. Pseudonymized data from the MULTI dataset (i.e. individual score sheets of the readers) may be available from the corresponding author upon reasonable request. To gain access, data requestors will need to sign a data access agreement.

Abbreviations

AUC: Area under the curve
CI: Confidence interval
csPCa: Clinically significant prostate cancer
CZ: Central zone
DCE: Dynamic contrast-enhanced
Dw: Diffusion-weighted
IQR: Interquartile range
ISUP: International Society of Urological Pathology
MR: Magnetic resonance
MRI: Magnetic resonance imaging
PI-RADSv2.1: Prostate Imaging-Reporting and Data System version 2.1
PI-RADSv2: Prostate Imaging-Reporting and Data System version 2
PZ: Peripheral zone
ROC: Receiver operating characteristic
T2w: T2-weighted
TZ: Transition zone

References

Richenberg J, Logager V, Panebianco V, Rouviere O, Villeirs G, Schoots IG (2019) The primacy of multiparametric MRI in men with suspected prostate cancer. Eur Radiol 29:6940–6952
Drost FH, Osses DF, Nieboer D et al (2019) Prostate MRI, with or without MRI-targeted biopsy, and systematic biopsy for detecting prostate cancer. Cochrane Database Syst Rev 4: CD012663
Westphalen AC, McCulloch CE, Anaokar JM et al (2020) Variability of the positive predictive value of PI-RADS for prostate MRI across 26 centers: experience of the society of abdominal radiology prostate cancer disease-focused panel. Radiology 296:76–84
Greer MD, Shih JH, Lay N et al (2019) Interreader variability of prostate imaging reporting and data system version 2 in detecting and assessing prostate cancer lesions at prostate MRI. AJR Am J Roentgenol 212:1197–1205
Mussi TC, Yamauchi FI, Tridente CF et al (2020) Interobserver agreement of PI-RADS v. 2 lexicon among radiologists with different levels of experience. J Magn Reson Imaging 51:593–602
Barkovich EJ, Shankar PR, Westphalen AC (2019) A systematic review of the existing prostate imaging reporting and data system version 2 (PI-RADSv2) literature and subset meta-analysis of PI-RADSv2 categories stratified by gleason scores. AJR Am J Roentgenol 212:847–854
Park KJ, Choi SH, Lee JS, Kim JK, Kim MH, Jeong IG (2020) Risk stratification of prostate cancer according to PI-RADS(R) version 2 categories: meta-analysis for prospective studies. J Urol 204:1141–1149
Park KJ, Choi SH, Lee JS, Kim JK, Kim MH (2020) Interreader agreement with prostate imaging reporting and data system version 2 for prostate cancer detection: a systematic review and meta-analysis. J Urol 204:661–670
Rudolph MM, Baur ADJ, Haas M et al (2020) Validation of the PI-RADS language: predictive values of PI-RADS lexicon descriptors for detection of prostate cancer. Eur Radiol 30:4262–4271
Turkbey B, Rosenkrantz AB, Haider MA et al (2019) Prostate imaging reporting and data system version 2.1: 2019 update of prostate imaging reporting and data system version 2. Eur Urol 76:340–351
Tamada T, Kido A, Takeuchi M et al (2019) Comparison of PI-RADS version 2 and PI-RADS version 2.1 for the detection of transition zone prostate cancer. Eur J Radiol 121:108704
Byun J, Park KJ, Kim MH, Kim JK (2020) Direct comparison of PI-RADS version 2 and 2.1 in transition zone lesions for detection of prostate cancer: preliminary experience. J Magn Reson Imaging 52:577–586
Lim CS, Abreu-Gomez J, Carrion I, Schieda N (2021) Prevalence of prostate cancer in PI-RADS version 2.1 transition zone atypical nodules upgraded by abnormal DWI: correlation With MRI-directed TRUS-guided targeted biopsy. AJR Am J Roentgenol 216:683–690
Costa DN, Jia L, Subramanian N et al (2021) Prospective PI-RADS v2.1 atypical benign prostatic hyperplasia nodules with marked restricted diffusion: detection of clinically significant prostate cancer on multiparametric MRI. AJR Am J Roentgenol 217:395–403
Brembilla G, Dell’Oglio P, Stabile A et al (2020) Interreader variability in prostate MRI reporting using prostate imaging reporting and data system version 2.1. Eur Radiol 30:3383–3392
Linhares Moreira AS, De Visschere P, Van Praet C, Villeirs G (2021) How does PI-RADS v2.1 impact patient classification? A head-to-head comparison between PI-RADS v2.0 and v2.1. Acta Radiol 62:839–847
Hotker AM, Bluthgen C, Rupp NJ, Schneider AF, Eberli D, Donati OF (2020) Comparison of the PI-RADS 2.1 scoring system to PI-RADS 2.0: Impact on diagnostic accuracy and inter-reader agreement. PLoS One 15:e0239975
Rudolph MM, Baur ADJ, Cash H et al (2020) Diagnostic performance of PI-RADS version 2.1 compared to version 2.0 for detection of peripheral and transition zone prostate cancer. Sci Rep 10:15982
Walker SM, Mehralivand S, Harmon SA et al (2020) Prospective evaluation of PI-RADS version 2.1 for prostate cancer detection. AJR Am J Roentgenol. https://doi.org/10.2214/AJR.19.22679:1-6
Bhayana R, O’Shea A, Anderson MA et al (2021) PI-RADS versions 2 and 2.1: interobserver agreement and diagnostic performance in peripheral and transition zone lesions among six radiologists. AJR Am J Roentgenol 217:141–151
Xu L, Zhang G, Zhang D et al (2020) Comparison of PI-RADS version 2.1 and PI-RADS version 2 regarding interreader variability and diagnostic accuracy for transition zone prostate cancer. Abdom Radiol (NY) 45:4133–4141
Wei CG, Zhang YY, Pan P et al (2021) Diagnostic accuracy and interobserver agreement of PI-RADS version 2 and version 2.1 for the detection of transition zone prostate cancers. AJR Am J Roentgenol 216:1247–1256
Lee CH, Vellayappan B, Tan CH (2022) Comparison of diagnostic performance and inter-reader agreement between PI-RADS v2.1 and PI-RADS v2: systematic review and meta-analysis. Br J Radiol 95:20210509
Park KJ, Choi SH, Kim MH, Kim JK, Jeong IG (2021) Performance of prostate imaging reporting and data system version 2.1 for diagnosis of prostate cancer: a systematic review and meta-analysis. J Magn Reson Imaging 54:103–112
Annamalai A, Fustok JN, Beltran-Perez J, Rashad AT, Krane LS, Triche BL (2022) Interobserver agreement and accuracy in interpreting mpMRI of the prostate: a systematic review. Curr Urol Rep 23:1–10
Habchi H, Bratan F, Paye A et al (2014) Value of prostate multiparametric magnetic resonance imaging for predicting biopsy results in first or repeat biopsy. Clin Radiol 69:e120–e128. https://doi.org/10.1016/j.crad.2013.10.018
Alonzo TA, Pepe MS (2002) Distribution-free ROC analysis using binary regression techniques. Biostatistics 3:421–432
Pepe MS (2003) The statistical evaluation of medical tests for classification and prediction. Oxford University Press, New York
Mottet N, Bellmunt J, Bolla M et al (2017) EAU-ESTRO-SIOG guidelines on prostate cancer. part 1: screening, diagnosis, and local treatment with curative intent. Eur Urol 71:618–629
Stabile A, Giganti F, Kasivisvanathan V et al (2020) Factors influencing variability in the performance of multiparametric magnetic resonance imaging in detecting clinically significant prostate cancer: a systematic literature review. Eur Urol Oncol 3:145–167
de Rooij M, Israel B, Tummers M et al (2020) ESUR/ESUI consensus statements on multi-parametric MRI for the detection of clinically significant prostate cancer: quality requirements for image acquisition, interpretation and radiologists’ training. Eur Radiol 30:5404–5416
Ullrich T, Schimmoller L (2020) Perspective: a critical assessment of PI-RADS 2.1. Abdom Radiol (NY) 45:3961–3968
Moraes MO, Roman DHH, Copetti J et al (2020) Effects of the addition of quantitative apparent diffusion coefficient data on the diagnostic performance of the PI-RADS v2 scoring system to detect clinically significant prostate cancer. World J Urol 38:981–991
Abreu-Gomez J, Walker D, Alotaibi T, McInnes MDF, Flood TA, Schieda N (2020) Effect of observation size and apparent diffusion coefficient (ADC) value in PI-RADS v2.1 assessment category 4 and 5 observations compared to adverse pathological outcomes. Eur Radiol 30:4251–4261
Fedeli L, Belli G, Ciccarone A et al (2018) Dependence of apparent diffusion coefficient measurement on diffusion gradient direction and spatial position - a quality assurance intercomparison study of forty-four scanners for quantitative diffusion-weighted imaging. Phys Med 55:135–141
Shukla-Dave A, Obuchowski NA, Chenevert TL et al (2019) Quantitative imaging biomarkers alliance (QIBA) recommendations for improved precision of DWI and DCE-MRI derived biomarkers in multicenter oncology trials. J Magn Reson Imaging 49:e101–e121
Brunelle S, Zemmour C, Bratan F et al (2018) Variability induced by the MR imager in dynamic contrast-enhanced imaging of the prostate. Diagn Interv Imaging 99:255–264
Hoang-Dinh A, Nguyen-Quang T, Bui-Van L, Gonindard-Melodelima C, Souchon R, Rouviere O (2022) Reproducibility of apparent diffusion coefficient measurement in normal prostate peripheral zone at 1.5T MRI. Diagn Interv Imaging 103:545–554. https://doi.org/10.1016/j.diii.2022.06.001
Penzkofer T, Padhani AR, Turkbey B et al (2021) ESUR/ESUI position paper: developing artificial intelligence for precision diagnosis of prostate cancer using magnetic resonance imaging. Eur Radiol 31:9567–9578
Gaur S, Lay N, Harmon SA et al (2018) Can computer-aided diagnosis assist in the identification of prostate cancer on prostate MRI? a multi-center, multi-reader investigation. Oncotarget 9:33804–33817
Mehralivand S, Harmon SA, Shih JH et al (2020) Multicenter multireader evaluation of an artificial intelligence-based attention mapping system for the detection of prostate cancer with multiparametric MRI. AJR Am J Roentgenol 215:903–912
Zhu L, Gao G, Liu Y et al (2020) Feasibility of integrating computer-aided diagnosis with structured reports of prostate multiparametric MRI. Clin Imaging 60:123–130
Zhang KS, Schelb P, Netzer N et al (2022) Pseudoprospective paraclinical interaction of radiology residents with a deep learning system for prostate cancer detection: experience, performance, and identification of the need for intermittent recalibration. Invest Radiol 57:601–612
Labus S, Altmann MM, Huisman H et al (2022) A concurrent, deep learning-based computer-aided detection system for prostate multiparametric MRI: a performance study involving experienced and less-experienced radiologists. Eur Radiol 33:64–76. https://doi.org/10.1007/s00330-022-08978-y
Rouviere O, Jaouen T, Baseilhac P et al (2022) Artificial intelligence algorithms aimed at characterizing or detecting prostate cancer on MRI: How accurate are they when tested on independent cohorts? - a systematic review. Diagn Interv Imaging. https://doi.org/10.1016/j.diii.2022.11.005
Niaf E, Lartizien C, Bratan F et al (2014) Prostate focal peripheral zone lesions: characterization at multiparametric MR imaging–influence of a computer-aided diagnosis system. Radiology 271:761–769
Smith CP, Harmon SA, Barrett T et al (2019) Intra- and interreader reproducibility of PI-RADSv2: a multireader study. J Magn Reson Imaging 49:1694–1703


Acknowledgements

The authors thank the MULTI study group collaborators: Michel Abihanna, Alexandre Ben Cheikh, Flavie Bratan, Stéphanie Bravetti, Domitille Cadiot, Stéphane Cadot, Bénédicte Cayot, Jean Champagnac, Sabine Debeer, Marine Dubreuil-Chambardel, Nicolas Girouin, Leangsing Iv, Paul-Hugo Jouve de Guibert, Olivier Lopez, Paul Cezar Moldovan, Gaele Pagnoux, Clément Pernet, Louis Perrier, Sébastien Ronze, Rémy Rosset, Athivada Soto Thammavong, Nicolas Stacoffe, Sarah Transin. Hospices Civils de Lyon, Department of Imaging, Hôpital Edouard Herriot, Lyon, F-69437, France: Sabine Debeer, Marine Dubreuil-Chambardel, Stéphanie Bravetti, Stéphane Cadot, Bénédicte Cayot, Paul-Hugo Jouve de Guibert, Paul Cezar Moldovan, Gaele Pagnoux, Clément Pernet, Louis Perrier, Nicolas Stacoffe, Sarah Transin. Imagerie médicale Val d’Ouest Charcot (IMVOC), Ecully, France: Michel Abihanna, Sébastien Ronze. Clinique de la Sauvegarde, Department of Imaging, Lyon, France: Alexandre Ben Cheikh. Centre Hospitalier Saint Joseph Saint Luc, Department of Imaging, Lyon, France: Flavie Bratan, Rémy Rosset. Hospices Civils de Lyon, Department of Imaging, Centre Hospitalier Lyon Sud, Pierre Bénite, France: Domitille Cadiot, Leangsing Iv. Médipôle Lyon-Villeurbanne, Department of Imaging, Villeurbanne, France: Jean Champagnac. Norimagerie, Caluire et Cuire, France: Nicolas Girouin. Department of Vascular and Interventional Radiology, Image-Guided Therapy Center, François-Mitterrand University Hospital, Dijon Cedex, France: Olivier Lopez. Centre Hospitalo-Universitaire de Saint-Etienne, Department of Imaging, Hôpital Nord, Saint-Etienne, France: Athivada Soto Thammavong.

Funding

The Hospices Civils de Lyon covered the publication costs.

Author information

Authors and Affiliations

Hospices Civils de Lyon, Department of Imaging, Hôpital Edouard Herriot, 69437, Lyon, France
Florian Di Franco, Laurent Milot & Olivier Rouvière
INSERM, LabTau, U1032, Lyon, France
Rémi Souchon, Sébastien Crouzet, Laurent Milot & Olivier Rouvière
Université de Lyon, Université Lyon 1, Lyon, France
Sébastien Crouzet, Marc Colombel, Alain Ruffion, Laurent Milot, Muriel Rabilloud & Olivier Rouvière
Faculté de Médecine Lyon Est, Lyon, France
Sébastien Crouzet, Marc Colombel & Olivier Rouvière
Hospices Civils de Lyon, Department of Urology, Hôpital Edouard Herriot, 69437, Lyon, France
Sébastien Crouzet & Marc Colombel
Hospices Civils de Lyon, Department of Urology, Centre Hospitalier Lyon Sud, Pierre-Bénite, France
Alain Ruffion
Equipe 2, Centre d’Innovation en Cancérologie de Lyon, EA 3738, Lyon, France
Alain Ruffion
Faculté de Médecine Lyon Sud, 69003, Lyon, France
Alain Ruffion & Laurent Milot
Service de Biostatistique et Bioinformatique, Hospices Civils de Lyon, Pôle Santé Publique, 69003, Lyon, France
Amna Klich, Mathilde Almeras & Muriel Rabilloud
UMR 5558, Laboratoire de Biométrie et Biologie Évolutive, CNRS, Équipe Biostatistique-Santé, 69100, Villeurbanne, France
Amna Klich, Mathilde Almeras & Muriel Rabilloud

Authors

Florian Di Franco
Rémi Souchon
Sébastien Crouzet
Marc Colombel
Alain Ruffion
Amna Klich
Mathilde Almeras
Laurent Milot
Muriel Rabilloud
Olivier Rouvière

Consortia

on behalf of the MULTI Study Group

Sabine Debeer, Marine Dubreuil-Chambardel, Stéphanie Bravetti, Stéphane Cadot, Bénédicte Cayot, Paul-Hugo Jouve de Guibert, Paul Cezar Moldovan, Gaele Pagnoux, Clément Pernet, Louis Perrier, Nicolas Stacoffe, Sarah Transin, Michel Abihanna, Sébastien Ronze, Alexandre Ben Cheikh, Flavie Bratan, Rémy Rosset, Domitille Cadiot, Leangsing Iv, Jean Champagnac, Nicolas Girouin, Olivier Lopez & Athivada Soto Thammavong

Contributions

RS and OR designed the study. FDF, RS and OR organized the reading sessions and established the score sheets used by the readers. FDF and RS were responsible for data curation. All authors participated in the data analysis and interpretation. AK, MA and MR performed the formal statistical analysis. OR drafted the manuscript. The MULTI Study Group collaborators read the MRIs or participated in data curation. All authors participated in the intellectual revision of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Olivier Rouvière.

Ethics declarations

Ethics approval and consent to participate

The creation of the prospective database of patients undergoing prostate MRI and subsequent biopsy was approved by an Ethics Committee (Comité de Protection des Personnes Sud-Est IV, decision L09-04). All the patients included in this database gave written informed consent for the use of their imaging and histological data for research purposes.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Online Appendix.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Di Franco, F., Souchon, R., Crouzet, S. et al. Characterization of high-grade prostate cancer at multiparametric MRI: assessment of PI-RADS version 2.1 and version 2 descriptors across 21 readers with varying experience (MULTI study). Insights Imaging 14, 49 (2023). https://doi.org/10.1186/s13244-023-01391-z


Received: 24 January 2023
Accepted: 15 February 2023
Published: 20 March 2023
DOI: https://doi.org/10.1186/s13244-023-01391-z
