Dataset and model | AUC (95% CI) | Accuracy | Sensitivity | Specificity | p value |
---|---|---|---|---|---|
Training set (n = 217) | |||||
DL model | 0.94 (0.90–0.97) | 91.2% (198/217) | 90.2% (92/102) | 92.2% (106/115) | < 0.001 |
Internal test set (n = 62) | |||||
DL model | 0.97 (0.90–1.00) | 93.5% (58/62) | 90.9% (20/22) | 95.0% (38/40) | < 0.001 |
Clinical model | 0.77 (0.65–0.87) | 82.3% (51/62) | 59.1% (13/22) | 95.0% (38/40) | < 0.001 |
Integrated model | 0.94 (0.84–0.98) | 93.5% (58/62) | 86.4% (19/22) | 97.5% (39/40) | < 0.001 |
External test set (n = 54) | |||||
DL model | 0.97 (0.88–1.00) | 92.6% (50/54) | 100% (12/12) | 90.5% (38/42) | < 0.001 |
Clinical model | 0.64 (0.50–0.76) | 79.6% (43/54) | 41.7% (5/12) | 90.5% (38/42) | 0.17 |
Integrated model | 0.88 (0.76–0.95) | 90.7% (49/54) | 66.7% (8/12) | 97.6% (41/42) | < 0.001 |
Total test set (n = 116) | |||||
DL model | 0.97 (0.92–1.00) | 93.1% (108/116) | 94.1% (32/34) | 92.7% (76/82) | < 0.001 |
Clinical model | 0.72 (0.63–0.80) | 81.0% (94/116) | 52.9% (18/34) | 92.7% (76/82) | < 0.001 |
Integrated model | 0.91 (0.85–0.96) | 92.2% (107/116) | 79.4% (27/34) | 97.6% (80/82) | < 0.001 |
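
The percentages in the table follow from the counts shown in parentheses, and the total test set is the sum of the internal and external test sets. The sketch below recomputes the DL model's metrics from those counts. It assumes the usual definitions (sensitivity = correctly classified positives / all positives, specificity = correctly classified negatives / all negatives); the names `DL_COUNTS`, `metrics`, and `pool` are illustrative and not taken from the study.

```python
# Counts copied from the DL model rows of the table:
# (correct positives, all positives, correct negatives, all negatives).
DL_COUNTS = {
    "internal": (20, 22, 38, 40),
    "external": (12, 12, 38, 42),
    "total": (32, 34, 76, 82),
}


def metrics(tp, pos, tn, neg):
    """Return accuracy, sensitivity, specificity as percentages (one decimal)."""
    accuracy = 100 * (tp + tn) / (pos + neg)
    sensitivity = 100 * tp / pos
    specificity = 100 * tn / neg
    return round(accuracy, 1), round(sensitivity, 1), round(specificity, 1)


def pool(*cohorts):
    """Sum counts element-wise across cohorts (assumed to be disjoint)."""
    return tuple(sum(values) for values in zip(*cohorts))


pooled = pool(DL_COUNTS["internal"], DL_COUNTS["external"])
print(pooled == DL_COUNTS["total"])   # True
print(metrics(*DL_COUNTS["total"]))   # (93.1, 94.1, 92.7)
```

The same two functions can be applied to the clinical and integrated model rows to check each reported percentage against its fraction.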