Table 4 Performance metrics of AI models and radiologists

From: Machine learning combined with radiomics and deep learning features extracted from CT images: a novel AI model to distinguish benign from malignant ovarian tumors

|  | Accuracy | Sensitivity | Specificity | AUC | Positive predictive rate | Negative predictive rate | F1 score |
|---|---|---|---|---|---|---|---|
| AI models |  |  |  |  |  |  |  |
| Radiomics | 0.61 | 0.32 | 0.76 | 0.66 | 0.40 | 0.68 | 0.35 |
| DL | 0.73 | 0.21 | 1 | 0.89 | 1 | 0.71 | 0.35 |
| Clinical | 0.73 | 0.53 | 0.84 | 0.82 | 0.63 | 0.78 | 0.57 |
| Radiomics + DL | 0.71 | 0.37 | 0.89 | 0.82 | 0.64 | 0.73 | 0.47 |
| Ensemble* | 0.82 | 0.68 | 0.89 | 0.83 | 0.77 | 0.85 | 0.72 |
| Radiologists without AI assistance |  |  |  |  |  |  |  |
| Radiologist 1 | 0.63 | 0.58 | 0.65 | 0.61 | 0.46 | 0.75 | 0.51 |
| Radiologist 2 | 0.64 | 0.58 | 0.68 | 0.63 | 0.48 | 0.76 | 0.52 |
| Radiologist 3 | 0.70 | 0.84 | 0.62 | 0.73 | 0.53 | 0.88 | 0.65 |
| Krippendorff’s alpha: 0.4757 |  |  |  |  |  |  |  |
| Radiologist 4 | 0.86 | 0.68 | 0.95 | 0.82 | 0.87 | 0.85 | 0.77 |
| Radiologist 5 | 0.79 | 0.95 | 0.70 | 0.83 | 0.62 | 0.96 | 0.75 |
| Krippendorff’s alpha: 0.4806 |  |  |  |  |  |  |  |
| Radiologists with AI assistance |  |  |  |  |  |  |  |
| Radiologist 1 | 0.77 | 0.74 | 0.78 | 0.76 | 0.64 | 0.85 | 0.68 |
| Radiologist 2 | 0.80 | 0.89 | 0.76 | 0.83 | 0.65 | 0.93 | 0.76 |
| Radiologist 3 | 0.86 | 0.84 | 0.86 | 0.85 | 0.76 | 0.91 | 0.80 |
| Krippendorff’s alpha: 0.6333 |  |  |  |  |  |  |  |
| Radiologist 4 | 0.88 | 0.79 | 0.92 | 0.85 | 0.83 | 0.89 | 0.81 |
| Radiologist 5 | 0.82 | 0.84 | 0.81 | 0.83 | 0.70 | 0.91 | 0.76 |
| Krippendorff’s alpha: 0.7331 |  |  |  |  |  |  |  |

  1. *Ensemble = radiomics + DL + clinical
  2. Junior radiologists: radiologist 1–3
  3. Senior radiologists: radiologist 4–5
  4. AI: artificial intelligence; AUC: area under the ROC curve; DL: deep learning
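
As a rough consistency check on the F1 column, the sketch below recomputes F1 as the harmonic mean of sensitivity and positive predictive rate. It assumes that "positive predictive rate" here means precision (positive predictive value), which the table itself does not state. The function name `f1_from_ppv_and_sensitivity` is hypothetical and is not taken from the article. Because the table values are rounded to two decimals, a recomputed F1 may differ from the reported one by about 0.01.

```python
def f1_from_ppv_and_sensitivity(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of precision (assumed equal to the positive
    predictive rate) and recall (sensitivity)."""
    if ppv + sensitivity == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# (sensitivity, positive predictive rate, reported F1), copied from Table 4
rows = {
    "Ensemble": (0.68, 0.77, 0.72),
    "DL": (0.21, 1.00, 0.35),
    "Radiologist 4 (without AI)": (0.68, 0.87, 0.77),
}

for name, (sens, ppv, reported) in rows.items():
    recomputed = f1_from_ppv_and_sensitivity(ppv, sens)
    print(f"{name}: recomputed {recomputed:.2f}, reported {reported:.2f}")
```

The DL row shows why F1 is a useful complement to accuracy in this table: a specificity and positive predictive rate of 1 coexist with a sensitivity of 0.21, and the F1 score of 0.35 reflects that imbalance.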