
Table 2 Performance of deep learning segmentation models

From: Automatic segmentation of fat metaplasia on sacroiliac joint MRI using deep learning

| Models | Internal cross-validation: DSC (%) | Internal cross-validation: Precision (%) | Internal cross-validation: Recall (%) | External test set: DSC (%) | External test set: Precision (%) | External test set: Recall (%) |
| --- | --- | --- | --- | --- | --- | --- |
| 2.5D-AttentionUNet (ours) | 81.86 ± 1.55 | 80.49 (80.16–80.82) | 85.5 (85.34–85.66) | 85.44 ± 6.09 | 85.83 (82.62–89.04) | 86.43 (81.10–91.76) |
| 2D-UNet | 76.97 ± 2.11** | 78.5 (78.25–78.75) | 78.24 (77.84–78.64) | 66.53 ± 19.37** | 69.14 (62.20–74.04) | 68.09 (59.71–76.50) |
| 3D-UNet | 80.05 ± 1.57 | 80.18 (79.71–81.55) | 82.4 (82.00–82.80) | 67.40 ± 20.84** | 85.12 (80.31–89.37) | 62.87 (52.46–71.26) |
| ResUNet | 76.42 ± 1.92** | 76.49 (76.15–76.83) | 79.56 (79.31–79.81) | 66.41 ± 17.60** | 75.74 (69.66–79.58) | 65.66 (57.32–72.96) |
| UNETR | 73.94 ± 2.68** | 79.64 (79.12–80.16) | 72.93 (72.51–73.35) | 57.81 ± 21.58** | 68.83 (61.45–77.27) | 55.26 (46.22–67.42) |
| Attention U-Net | 79.73 ± 1.65* | 79.95 (79.59–80.31) | 82.47 (82.12–82.82) | 65.54 ± 24.25* | 72.29 (63.05–76.22) | 67.95 (56.84–76.66) |
  1. DSC is presented as a mean percentage ± standard deviation. Precision and recall are shown as percentages with 95% confidence intervals. Paired t-tests were performed to determine the statistical significance of differences between the 2.5D-AttentionUNet and each of the other models
  2. DSC: Dice similarity coefficient
  3. *represents p value < 0.05
  4. **represents p value < 0.001
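
To make the quantities in the table concrete, the following is a minimal sketch of how DSC, precision, and recall could be computed from a predicted and a reference binary mask, and how a paired t statistic could be formed from per-case scores of two models. It is not the authors' implementation. The function names `overlap_metrics` and `paired_t_statistic`, the flat-list mask representation, and the handling of empty masks are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def overlap_metrics(pred, truth):
    """Return (dsc, precision, recall) in percent for two flat binary masks.

    `pred` and `truth` are equal-length sequences of 0/1 values
    (hypothetical representation; real masks would be 2D or 3D arrays).
    """
    tp = sum(1 for p, t in zip(pred, truth) if p and t)      # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)  # false positives
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)  # false negatives

    # Assumed conventions for empty masks: DSC is 100 when both masks are
    # empty; precision and recall are 0 when their denominator is 0.
    dsc = 100.0 * 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 100.0
    precision = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    recall = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    return dsc, precision, recall


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic for per-case scores of two models on the same cases.

    Requires at least two cases and a non-zero spread of the differences.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


if __name__ == "__main__":
    pred = [1, 1, 1, 0]
    truth = [1, 1, 0, 0]
    print(overlap_metrics(pred, truth))
    print(paired_t_statistic([2.0, 4.0, 6.0], [1.0, 2.0, 3.0]))
```

The p value follows from comparing the t statistic against a t distribution with n − 1 degrees of freedom, where n is the number of paired cases; that step is omitted here to keep the sketch dependency-free.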