Introduction

Nodal involvement constitutes one of the most significant risk factors for local recurrence in rectal cancer [1]. For patients with node-positive disease the standard treatment consists of a long course of chemoradiation therapy (CRT), followed by surgery. In up to 35% of patients, no residual tumour cells are found at histological evaluation after surgery [2, 3]. These complete responders are known to have a good clinical outcome with respect to local control and survival [4, 5]. Therefore, it is now increasingly being suggested that minimally invasive treatments, such as a local excision or a wait-and-see policy, be advocated as an alternative to standard surgery with the aim of reducing treatment-related morbidity and mortality [6, 7]. When doing so it is, however, mandatory to ensure that no metastatic nodes are left behind and that the true node-negative (ypN0) patients are accurately selected. CT, endoluminal ultrasound and MRI are all known to be insufficiently accurate, with sensitivities and specificities in the 55–78% range, although some authors have reported more encouraging results after CRT [8–11].

Diffusion-weighted MRI (DWI) could be a valuable oncological imaging technique. DWI uses differences in water motion to discriminate between tissues of varying cellularity. As lymph nodes have a high cellular density, they generally show restricted diffusion and are easily detected on DWI [12, 13]. Quantification of the apparent diffusion coefficient (ADC) could be useful to discriminate between benign and metastatic nodes, as the cellular density, and therefore the diffusion, is likely to alter when nodes are invaded with tumour. Only a few studies, mainly in head/neck and uterine/cervical cancer, have focussed on DWI for lymph node characterisation and most showed only a per patient or per region validation [13–23]. To our knowledge there have been no reports so far of a per node DWI validation study in rectal cancer, in particular none focussing on nodal restaging after CRT.

This study therefore aims to assess the diagnostic value of diffusion-weighted MRI for nodal staging in rectal cancer patients undergoing preoperative chemoradiation therapy, using a lesion-by-lesion histological validation as the reference standard. A second aim is to evaluate the interobserver reproducibility for ADC measurements of rectal cancer nodes.

Materials and methods

Patients

Between April 2008 and December 2009, 40 consecutive locally advanced rectal cancer patients were considered for inclusion in this prospective study. The study was part of a nodal imaging study, approved by the local institutional review board, for which all patients provided written informed consent. Inclusion criteria consisted of (a) biopsy-proven rectal adenocarcinoma, (b) locally advanced disease, defined on primary staging T2W-MRI as tumour in the distal rectum (≤ 5 cm from the anorectal junction), tumour involvement of the circumferential resection margins and/or positive nodal stage, and (c) treatment consisting of CRT + resection. Ten patients were excluded: 3 did not undergo surgery, in 2 patients the MR images showed severe metal implant artefacts and in 5 patients there was no PhD fellow present during histopathological evaluation so that a lesion-by-lesion matching could not be performed. This left a total of 30 patients (25 male, 5 female; median age 71 years, range 47–90) who were included in the study. The CRT consisted of 28 × 1.8 Gy and 2 × 825 mg/m²/d capecitabine, followed by a restaging MRI 6–8 weeks after completion of CRT and subsequent surgery.

MR imaging

Patients did not receive a bowel preparation or spasmolytics. Imaging was performed with a 1.5-T MR unit (Intera; Philips Medical Systems, Best, The Netherlands) using a phased-array body coil. The imaging protocol consisted of standard 2D T2-weighted (T2W) fast spin-echo sequences in three orthogonal directions (TR/TE 3427/150 msec, 90° flip angle, 25 echotrain length, 6 NSA, 0.78 × 1.14 × 5.00 mm acquisition voxel size, 22 slices, 5.08 min acquisition time). For study purposes, two additional sequences were performed: (a) an axial diffusion-weighted sequence with background body signal suppression (DWIBS [24], b-values 0, 500 and 1000 s/mm², TR/TE 4829/70 msec, EPI factor 53, 4 NSA, 2.50 × 3.11 × 5.00 mm acquisition voxel size, 50 slices, 10.37 min acquisition time) and (b) an axial 3DT1W gradient-echo (GRE) sequence (TR/TE 9.8/4.6 msec, 15° flip angle, 1 NSA, 1.15 × 1.15 × 1.00 mm acquisition voxel size, 200 slices, 6.30 min acquisition time). The 3DT1W GRE sequence is not part of a standard rectal MRI protocol, but was used as a roadmap to depict small lymph nodes and confirm them anatomically as nodes and not vessels.

Image evaluation

The MR images were independently analysed by two readers: a senior (RGHBT) and junior (FCHB) gastrointestinal radiologist with 13 and 3 years of respective expertise in reading pelvic MRI. The readers were blinded to each other’s results. The pre-treatment images were used to identify the primary tumour and nodes. Lymph nodes were then identified on the post-treatment 3DT1W GRE images and drawn on an anatomical map, used as a template to ensure exact lesion-by-lesion matching with nodes harvested at histology [25]. On T2W-MRI each visible lymph node was scored using a 5-point confidence level score (Table 1). Criteria for malignancy were > 5 mm in size + spiculated or indistinct border and/or heterogeneous signal pattern [26, 27].

Table 1 Criteria for benign and malignant nodes on T2-weighted MRI

Visual diffusion analyses

In the same reading session, the readers evaluated the visibility of lymph nodes on the b1000 images. A lymph node was defined as a round/oval, high signal intensity structure corresponding with a node on the anatomical T1W GRE (Fig. 1). The signal of each node was subjectively scored as ‘high’, ‘average’ or ‘low’, with the signal of the primary tumour (on pre-treatment MRI) as the reference for ‘high’, the signal of the prostate/uterus for ‘average’ and the signal of the normal rectal wall for ‘low’.

Fig. 1

a 3DT1W GRE images with 1-mm isotropic voxels were used for accurate depiction of lymph nodes (arrowheads). b Nodes were scored as benign or malignant on standard T2W FSE images. c Nodes were identified on DWI, where regions of interest were placed and copied to the ADC map. d Nodal ADC was calculated from the ADC map including b-values 0, 500 and 1000 s/mm²

ADC analyses

ADC maps in greyscale were automatically generated at the operating system, using all three b-values. To evaluate interobserver reproducibility, two radiological PhD fellows (DMJL and MM) independently performed ADC measurements of all visible nodes, blinded to each other’s results and the radiologists’ scores. An oval-shaped region of interest (ROI) was manually drawn for each node on the b1000 images and copied to the ADC map. ROI size was chosen to include as much of the nodal parenchyma as possible. ADCs were averaged between the two observers for further diagnostic analyses.
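As an illustration of how an ADC value can follow from three b-values, the sketch below fits the mono-exponential model S(b) = S0 · exp(−b · ADC) by least squares on the logarithm of the signal. The mono-exponential model, the function name `fit_adc` and the signal values are assumptions made for illustration only; the fitting method used by the scanner software is not described here.

```python
import math

def fit_adc(b_values, signals):
    """Least-squares slope of ln(S) against b; the ADC is minus that slope (mm^2/s)."""
    if len(b_values) != len(signals) or len(b_values) < 2:
        raise ValueError("need at least two matching b-values and signals")
    logs = [math.log(s) for s in signals]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_log = sum(logs) / n
    sxy = sum((b - mean_b) * (y - mean_log) for b, y in zip(b_values, logs))
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    return -sxy / sxx

# Hypothetical, noise-free ROI signals generated from a known ADC (not study data)
b_values = [0, 500, 1000]            # s/mm^2, as in the DWI protocol
true_adc = 1.2e-3                    # mm^2/s, assumed value
signals = [1000.0 * math.exp(-b * true_adc) for b in b_values]

adc = fit_adc(b_values, signals)
print(f"ADC = {adc * 1e3:.2f} x 10^-3 mm^2/s")
```

With measured (noisy) signals the three points will not lie exactly on a line, and the fitted slope then represents a best estimate over the b-value range.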

Lesion-by-lesion histological evaluation

All patients underwent a total mesorectal excision. To provide accurate matching with the axial MRI, each specimen was sectioned perpendicular to the rectal lumen, every 5 mm. Each section was searched for lymph nodes by a dedicated pathologist (RGR). Optimal nodal matching was obtained by side by side comparison of the anatomical map with the axially sliced specimen, with special attention for the nodal size and the location of the nodes in relation to surrounding structures [25]. The matching process was performed in consensus by a PhD fellow (DMJL) and the pathologist. Each node was placed in a marked individual tray and processed according to standard methods. For each malignant node the pathologist reported the percentage of the node that contained normal lymphoid tissue, metastatic changes or necrosis, respectively.

Statistical analyses

Statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS, version 16.0; SPSS Inc., Chicago, IL, USA). Weighted kappa values with quadratic weighting were calculated to determine interobserver agreement (IOA) for T2W-MRI. IOA for the nodal ADC measurements was analysed according to the method of Bland and Altman [28] and by calculating the intraclass correlation coefficient (ICC). A Student’s t test was used to compare ADC (mean of two observers) between benign and malignant nodes. Receiver operating characteristic (ROC) curve analyses were performed to evaluate the diagnostic performance for (a) T2W-MRI, (b) ADC, and (c) the combination of T2W-MRI + ADC. For the latter, a predicted probability derived from a logistic regression analysis (using ADC and the confidence levels from T2W-MRI as independent variables and histologically benign/malignant as the dependent variable) was used as a test variable. Corresponding areas under the ROC curve (AUC), sensitivity, specificity, positive and negative predictive values and overall accuracy were calculated. For T2W-MRI the confidence levels were dichotomised with the cut-off set between 2 (possibly malignant) and 3 (probably malignant). For the ADC and combined T2W + ADC, the cut-offs were determined according to the point nearest to the upper left corner in the ROC curves. Each node that was not identified on T2W-MRI and/or DWI was assigned a negative score. Differences in diagnostic performance were analysed by comparing the ROC curves according to the method described by DeLong et al. [29]. P values < 0.05 were considered statistically significant.
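The cut-off rule "nearest to the upper left corner" can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: the function names and the example values are hypothetical, and a node is assumed to be called positive when its score is at or above the threshold.

```python
import math

def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each distinct score.

    labels: 1 = malignant, 0 = benign. A node is positive when score >= threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        points.append((t, tp / pos, tn / neg))
    return points

def closest_to_corner(points):
    """Pick the point with the smallest distance to (1 - specificity = 0, sensitivity = 1)."""
    return min(points, key=lambda p: math.hypot(1 - p[1], 1 - p[2]))

# Hypothetical test values (e.g. ADC in 10^-3 mm^2/s) and histology, not study data
scores = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7]
labels = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]

best = closest_to_corner(roc_points(scores, labels))
print(f"threshold {best[0]}, sensitivity {best[1]:.2f}, specificity {best[2]:.2f}")
```

The same selection applies unchanged when the score is a predicted probability from a logistic regression rather than a raw ADC value.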

Results

Histopathology

At histology 19 patients had a ypN0, 7 a ypN1 and 4 a ypN2 status. A total number of 321 nodes was harvested (median 12 nodes per patient; range 1–24; 281 N−, 40 N+). A node-by-node MR-histological analysis of 222 nodes could be performed (median size 3 mm, range 2–16 mm), of which 188 were benign and 34 were malignant. The other 99 harvested nodes were excluded because they could not be matched with MRI.

Interobserver agreement

IOA for the confidence level scores on T2W-MRI was good (κ 0.73). IOA for the visual (signal intensity) evaluation on DWI was moderate (κ 0.43). The Bland-Altman plot for the ADC measurements is displayed in Fig. 2. The mean bias in ADC between the two readers was 0.064 × 10⁻³ mm²/s and limits of agreement were ±0.442 × 10⁻³ mm²/s. The ICC was 0.77.
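For readers unfamiliar with the weighted kappa statistic, a minimal sketch of a quadratically weighted kappa for two readers is given here. The coding of the 5-point confidence scale as integers 0–4, the function name and the example scores are assumptions for illustration; they are not the study data.

```python
def quadratic_weighted_kappa(reader_a, reader_b, n_categories):
    """Cohen's kappa with quadratic disagreement weights for ordinal scores 0..n-1."""
    n = len(reader_a)
    observed = [[0.0] * n_categories for _ in range(n_categories)]
    for a, b in zip(reader_a, reader_b):
        observed[a][b] += 1
    row = [sum(observed[i]) for i in range(n_categories)]
    col = [sum(observed[i][j] for i in range(n_categories)) for j in range(n_categories)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            weight = (i - j) ** 2 / (n_categories - 1) ** 2  # 0 on the diagonal
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j] / n
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both readers use a single category")
    return 1.0 - weighted_observed / weighted_expected

# Hypothetical confidence scores (0 = definitely benign ... 4 = definitely malignant)
reader_1 = [0, 1, 2, 3, 4, 2, 1, 0]
reader_2 = [0, 1, 3, 3, 4, 2, 0, 0]
print(f"weighted kappa = {quadratic_weighted_kappa(reader_1, reader_2, 5):.2f}")
```

Quadratic weights penalise a disagreement of two scale steps four times as heavily as a disagreement of one step, which suits an ordinal confidence scale.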

Fig. 2

Interobserver reproducibility for nodal ADC measurements. Bland-Altman plot of the mean ADC of the two observers (x-axis) against the difference in ADC between the two observers (y-axis). The continuous line represents the mean difference (bias) in ADC between the two observers; the dashed lines represent the 95% limits of agreement around the mean difference

Diagnostic performance of T2W-MRI

On T2W-MRI, 101/222 (45%) nodes were identified (median size 4 mm; range 2–16 mm; 27 N+, 74 N−). The AUC for detection of metastatic nodes was 0.88 for reader 1 and 0.95 for reader 2. Sensitivity was 65%, specificity 93%, PPV 61% and NPV 94% for reader 1. These values were 71%, 91%, 60% and 95% for reader 2.
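The relation between a 2 × 2 table and the reported predictive values can be sketched as below. The node counts are not reported in the text; they are back-calculated here, as an assumption, to be consistent with the percentages for reader 1 given 34 malignant and 188 benign matched nodes (undetected nodes counted as negative). The function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard per-node measures derived from a 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Assumed counts (illustrative, not taken from the study tables):
# 22 + 12 = 34 malignant nodes, 14 + 174 = 188 benign nodes
metrics = diagnostic_metrics(tp=22, fp=14, fn=12, tn=174)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Because only 34 of the 222 matched nodes were malignant, a small number of false positives is enough to lower the PPV substantially, while the NPV stays high.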

Visual evaluation of DWI

On the b1000 images, 157/222 (71%) nodes were identified (median size 3 mm; range 2–16 mm; 37 N+, 180 N−). Signal intensities did not differ between the benign and metastatic nodes (Fig. 3) and rendered an AUC of 0.64 (95% CI 0.53–0.75) for reader 1 and 0.52 (95% CI 0.40–0.64) for reader 2.

Fig. 3

Distribution of subjectively scored signal intensities on DWI for benign and malignant nodes. The bars represent the proportion (%) of ‘low’, ‘average’ and ‘high’ signal intensities for the two readers

Diagnostic performance of ADC

Of the 157 nodes identified on DWI, ADC measurements could be obtained in 115 nodes. The remaining 42 nodes were excluded because the nodes were too small to place an ROI or because the standard errors of the ADC measurements were too large as a result of local image distortions. Mean ADC was 1.19 ± 0.27 × 10⁻³ mm²/s for the benign and 1.43 ± 0.38 × 10⁻³ mm²/s for the malignant nodes (p < 0.001). AUC for detection of metastatic nodes was 0.66. The optimal ADC threshold was 1.25 × 10⁻³ mm²/s, resulting in a sensitivity of 53%, specificity 82%, PPV 35% and NPV 91%.

Diagnostic performance of T2W-MRI + ADC

For 69 nodes, ADCs and T2W confidence level scores could be combined. For the other nodes, ADC was not available and/or the nodes were not visualised on T2W-MRI. The predicted probability for the combined assessment of T2W-MRI + ADC derived from the logistic regression rendered an AUC of 0.91 for reader 1 and 0.96 for reader 2. The optimal predicted probability was 0.51 for reader 1 and 0.69 for reader 2, which resulted in a sensitivity of 56%, specificity 98%, PPV 83% and NPV 92% for reader 1. These values were 56%, 99%, 95% and 93% for reader 2.

The diagnostic predictive values for the three reading methods are displayed in Table 2. The ROC curves are displayed in Fig. 4. The diagnostic performance when using ADC only was significantly lower than for T2W-MRI (p = 0.02 and p = 0.0003 for readers 1 and 2, respectively) and T2W-MRI + ADC combined (p = 0.001 and p < 0.0001). There was no significant difference in diagnostic performance between T2W-MRI and the combination of T2W + ADC (p = 0.17 and p = 0.26).

Table 2 Diagnostic predictive values for discrimination between benign and malignant nodes
Fig. 4

ROC curves used to evaluate the diagnostic performance for the detection of metastatic nodes for (a) T2-weighted (T2W) MRI only, (b) ADC measurements only and (c) for the combined assessment of T2W-MRI + ADC. There was no significant difference in performance between the use of T2W-MRI only, and the combination of T2W-MRI + ADC (p = 0.17 for reader 1 and 0.26 for reader 2). The use of ADC measurements only resulted in significantly worse performance than for T2W-MRI only or the combination of T2W-MRI + ADC

Histological features of the malignant nodes

In 22% of the malignant nodes, > 50% of the node consisted of necrosis. In 17% of the nodes, 10–30% of the node was necrotic (Fig. 5b). In the other 61% of nodes there was no or only a little (< 5%) necrosis. Of these nodes, a normal lymphoid architecture was preserved in a large proportion (> 20%) of the node in 45% of the cases (Fig. 5a). These nodes with a relatively preserved lymphoid architecture showed low ADC values (1.16 ± 0.14 × 10⁻³ mm²/s) that were comparable to the benign nodes (1.19 ± 0.27 × 10⁻³ mm²/s, p = 0.49) but were lower compared with the nodes that were fully metastatic or necrotic (1.40 ± 0.28 × 10⁻³ mm²/s), although this difference did not reach statistical significance (p = 0.08).

Fig. 5

Histological evaluation of 2 metastatic lymph nodes (haematoxylin and eosin staining; original magnification 50×). a Lymph node with a relatively preserved lymphoid architecture, where only small nests of metastatic adenocarcinoma can be visualised (arrowheads). This node still resembled a healthy lymph node and showed relatively low ADC. b Example of a node with large areas of necrosis (arrows). Because of the decreased cellular density in these necrotic areas, this node showed a relatively high ADC

Discussion

The goal of this study was to evaluate the diagnostic performance of DWI in addition to standard T2W-MRI for the evaluation of rectal cancer lymph nodes after neoadjuvant chemoradiation treatment. Our results indicate that visual DWI evaluation increases the number of lymph nodes detected (from 45% to 71%) but that it is not useful for discrimination between benign and metastatic nodes. The mean ADCs of lymph nodes are discriminative, with significantly higher values for the malignant nodes. However, we found an overlap in ADC between the benign and malignant nodes and insufficient accuracy when using ADC only for the detection of nodal metastases. Combining ADC with standard T2W-MRI improved the diagnostic performance, but the improvement over T2W-MRI only was not statistically significant.

Our results thus suggest that after chemoradiation, T2W-MRI on its own is already sufficient for nodal evaluation, reaching an AUC of 0.88–0.95. Such good results have also been reported by previous authors, with high NPVs ranging between 81% and 100%, indicating that a restaging MRI after CRT can accurately identify the ypN0 patients [10, 11]. This is remarkable, as it is known from two recent meta-analyses that MRI generally performs poorly for nodal staging [8, 9]. It has to be noted, however, that these meta-analyses only included studies that focussed on primary nodal staging. Apparently, morphological criteria (size, shape and border) work better in a restaging setting. Many small (2–5 mm) nodes disappear after CRT, while up to 50% of these nodes initially contained metastases [30, 31]. Of the small nodes that remain visible after CRT, up to 90% are sterilised [32]. Nodes that remain large are thus more likely to be malignant. Overall this facilitates a more reliable assessment of nodes based on morphological criteria only on a restaging MRI.

We found a significant difference in ADC between benign and malignant nodes. Nevertheless, there was a considerable overlap in values, making it difficult to identify a clinically reliable threshold for malignancy. With the optimal threshold of 1.25 × 10⁻³ mm²/s, a high NPV of 91% could only be obtained at the cost of a low PPV of 35%. This is in concordance with the previous results of Kim et al., who measured nodal ADCs in 125 cervical cancer patients and achieved an NPV of 99% on a patient basis, but a PPV of only 38%, leading to overstaging in a large proportion of patients [19]. Of interest was our finding that, although it did not improve the overall performance, the addition of ADC to standard T2W-MRI did improve the PPV from 60–61% to 83–95%, thus reducing the overstaging errors.

Contrary to most nodal DWI reports, we found significantly higher ADCs for the metastatic nodes. This discrepancy may be attributed to the fact that we included patients who all had undergone preoperative chemoradiation. It is known that irradiated malignant nodes often become (partly) necrotic. Due to a loss of cell structure, these necrotic areas are associated with increased diffusivity, resulting in higher ADCs. At histology, large areas of necrosis were present in over one third of the metastatic nodes (Fig. 5b), which may explain the higher ADCs found for the malignant nodes in our study. Previous studies in head/neck and oesophageal cancer also reported higher ADCs in metastatic nodes and found necrotic changes to be the major reason [16, 17, 33]. Interestingly, in studies that reported lower ADC values, none of the patients had been irradiated [14, 15, 19–21, 23].

Interobserver agreement for ADC measurements was good (ICC 0.77). However, the limits of agreement from the Bland-Altman plot (±0.442 × 10⁻³ mm²/s) exceeded the mean absolute difference of 0.240 × 10⁻³ mm²/s in ADC between the benign and malignant nodes, thus limiting the clinical utility of ADC. This was also observed in a previous study by Kwee et al. [34]. Although ADC measurements provide objective, quantifiable information, the manual placement of ROIs remains a subjective action and will limit reproducibility.
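The comparison made above can be sketched in a few lines: the bias is the mean of the paired differences and the 95% limits of agreement are the bias ± 1.96 standard deviations of those differences. The function name and the paired measurements are hypothetical; only the two figures in the final comparison (0.442 and 0.240) are taken from this study.

```python
import math

def bland_altman(observer_1, observer_2):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(observer_1, observer_2)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))  # sample SD
    half_width = 1.96 * sd
    return bias, bias - half_width, bias + half_width

# Hypothetical paired nodal ADCs (10^-3 mm^2/s) from two observers, not study data
obs_1 = [1.10, 1.35, 0.95, 1.60, 1.25, 1.40]
obs_2 = [1.05, 1.20, 1.10, 1.45, 1.30, 1.15]
bias, lower, upper = bland_altman(obs_1, obs_2)
print(f"bias {bias:.3f}, limits of agreement {lower:.3f} to {upper:.3f}")

# Reported values: the half-width of the limits exceeds the between-group difference
limits_half_width = 0.442   # 10^-3 mm^2/s
group_difference = 0.240    # 10^-3 mm^2/s, malignant minus benign mean ADC
print(limits_half_width > group_difference)
```

When the half-width of the limits is larger than the difference one wishes to detect, a single measurement by one observer cannot reliably place a node on either side of that difference.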

The ADC thresholds in our study were only tested retrospectively, whereas ideally these thresholds should be tested prospectively. At the same time we realise that measuring nodal ADCs requires a considerable time investment from a radiologist in a busy clinical practice. Furthermore, ADC measurements are influenced by MR equipment, imaging parameters and methods of analysis. It will therefore be difficult to obtain uniform thresholds that can be broadly implemented in clinical practice. Further research should therefore focus on standardising imaging and analysis protocols.

The main gain from the addition of DWI was the improved number of detected nodes compared with T2W-MRI. On DWI, high signal intensity nodes were more easily detected against the suppressed background signal of the surrounding tissues. DWI can thus be used to immediately focus a radiologist’s eye on the presence of nodes and reduce the time needed for the evaluation of nodal status.

Clinical impact

Accurate nodal restaging after chemoradiation may have a substantial impact on therapeutic decision-making. While at present all patients with locally advanced rectal cancer undergo preoperative chemoradiation followed by a total mesorectal excision, minimally invasive treatments could be a safe alternative when patients with a node-negative status and a good response of the primary tumour can be accurately selected. Although these treatment alternatives are still under debate, the gradual shift within the near future will be towards organ-saving surgery for the good responders. When we can provide a tool for better selection of these patients, in particular for the selection of those with truly sterilised nodes, patients with a small tumour remnant (ypT1-2N0) may be safely stratified for a local excision, while patients with a complete response (ypT0N0) could go for a wait-and-see policy [6, 7].

In conclusion, this study shows that after chemoradiation ADC measurements may have potential for nodal characterisation in rectal cancer, but DWI as a stand-alone technique is not reliable. The main gain from the addition of DWI is an increase in the number of detected nodes and an improved PPV for identification of metastatic nodes. However, it does not improve overall diagnostic performance, and after CRT T2W-MRI on its own is already sufficiently accurate.