
Table 3 Diagnostic performance comparison among the DL model, radiologist evaluation without model assistance, and radiologist evaluation with model assistance

From: Deep learning for differentiation of osteolytic osteosarcoma and giant cell tumor around the knee joint on radiographs: a multicenter study

| Model | ACC, % (95% CI) [correct/total] | p value |
| --- | --- | --- |
| Deep learning | 93.1 (87.0–96.5) [108/116] | |
| Expert committee A-Wt | 72.4 (63.7–79.7) [84/116] | < 0.001 |
| Expert committee A-Wi | 91.4 (84.9–95.3) [106/116] | 0.051* |
| Expert committee B-Wt | 88.8 (81.8–93.3) [103/116] | 0.25 |
| Expert committee B-Wi | 90.5 (83.8–94.6) [105/116] | 0.67* |
| Expert committee C-Wt | 87.1 (79.8–92.0) [101/116] | 0.32 |
| Expert committee C-Wi | 89.7 (82.8–94.0) [104/116] | 0.54* |

  1. Expert committees A, B, and C denote three groups of radiologists with different levels of experience in reading musculoskeletal radiographs. “Wt” means without DL model assistance, and “Wi” means with DL model assistance. The p value reflects the comparison of accuracy between pairs of models: “Wt” versus DL (unmarked) or “Wt” versus “Wi” (marked with *) (the same below)
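
The accuracy and interval columns in the table above can be recomputed from the bracketed counts. The table does not state how the 95% CIs were calculated, so the sketch below assumes a Wilson score interval, which happens to reproduce the reported values for the rows checked here. The function names `wilson_ci` and `summarize` are illustrative and do not come from the study.

```python
from math import sqrt
from statistics import NormalDist


def wilson_ci(correct, total, confidence=0.95):
    """Return (lower, upper) Wilson score bounds for a proportion, as fractions.

    Assumption: the table does not name its CI method; Wilson is used here
    because it matches the reported intervals.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


def summarize(correct, total):
    """Return (accuracy, lower, upper) in percent, rounded to one decimal."""
    lower, upper = wilson_ci(correct, total)
    return tuple(round(100 * v, 1) for v in (correct / total, lower, upper))


# Deep learning row: 108 correct out of 116 cases
print(summarize(108, 116))  # (93.1, 87.0, 96.5)
```

Applying `summarize(84, 116)` to the Expert committee A-Wt row gives 72.4 (63.7, 79.7) under the same assumption.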