From: Artificial intelligence for radiological paediatric fracture assessment: a systematic review
Author, year | Dataset | Body part | AUC (95% CI) | Accuracy (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | PPV (95% CI) | NPV (95% CI) | TP | FP | FN | TN |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Upper limb—elbow | ||||||||||||
England [31] | Validation | Elbow effusions | 0.985 (0.966–1.00) | NS | NS | NS | NS | NS | NS | NS | NS | NS |
England [31] | Test | Elbow effusions | 0.943 (0.884–1.00) | 0.907 (0.843–0.951) | 0.909 (0.788–1.00) | 0.906 (0.844–0.958) | NS | NS | 87 | 9 | 3 | 30 |
Rayan [33] | Validation | Elbow fractures | 0.947 (0.930–0.960) | 0.877 (0.856–0.895) | 0.908 (0.882–0.929) | 0.841 (0.807–0.870) | 0.867 (0.838–0.892) | 0.889 (0.858–0.914) | 536 | 82 | 54 | 434 |
Choi [17] | Validation | Supracondylar fractures | 0.976 (0.949–0.991) | 0.945 (0.910–0.967) | 0.948 (0.859–0.982) | 0.944 (0.902–0.968) | 0.833 (0.726–0.904) | 0.984 (0.954–0.995) | 55 | 11 | 3 | 185 |
Choi [17] | Temporal test set | Supracondylar fractures | 0.985 (0.962–0.996) | 0.904 (0.855–0.938) | 0.939 (0.852–0.983) | 0.922 (0.874–0.956) | 0.805 (0.717–0.871) | 0.978 (0.945–0.991) | 62 | 15 | 4 | 117 |
Choi [17] | Geographical test set | Supracondylar fractures | 0.992 (0.947–1.000) | 0.895 (0.817–0.942) | 1.000 (0.852–1.000) | 0.861 (0.759–0.931) | 0.697 (0.564–0.803) | 1.000 | 23 | 10 | 0 | 62 |
Dupuis [30] | Test | Elbow fractures (subgroup) | NS | 0.888 (0.847–0.919) | 0.918 (0.846–0.958) | 0.873 (0.819–0.913) | 0.781 (0.696–0.847) | 0.956 (0.915–0.977) | 89 | 25 | 8 | 172 |
Upper limb—other | ||||||||||||
Zhou [35] | Test set (best performing for AP ulnar view, using optimal central angle measurement of bone) | Forearm (Bowing fracture) | 0.992 (NS) | NS | 1.000 (NS) | 0.940 (NS) | NS | NS | NS | NS | NS | NS |
Zhang [35] | Test set—analysed per patient | Distal radius (ultrasound) | NS | 0.92 | 1.0 | 0.87 | NS | NS | NS | NS | NS | NS |
Lower limb | ||||||||||||
Malek [32] | Training | Lower limb fracture healing | 0.8 (NS) | 0.821 (0.673–0.910) | 0.792 (0.595–0.908) | 0.867 (0.621–0.963) | 0.905 (0.711–0.973) | 0.722 (0.491–0.875) | 19 | 2 | 5 | 13 |
Malek [32] | Validation | Lower limb fracture healing | NS | 0.556 (0.267–0.811) | 0.600 (0.231–0.882) | 0.500 (0.150–0.850) | 0.600 (0.231–0.882) | 0.500 (0.150–0.850) | 3 | 2 | 2 | 2 |
Malek [32] | Test | Lower limb fracture healing | NS | 0.889 (0.565–0.980) | 1.000 (0.566–1.000) | 0.750 (0.301–0.954) | 0.833 (0.436–0.970) | 1.000 (0.439–1.000) | 5 | 1 | 0 | 3 |
Starosolski [34] | Test | Distal tibia | 0.995 (NS) | 0.979 (0.929–0.994) | 0.959 (0.863–0.989) | 1.000 (0.927–1.000) | 1.000 (0.924–1.000) | 0.961 (0.868–0.989) | 47 | 0 | 2 | 49 |
Tsai [58] | Test (mean and SD for accuracy across models in fivefold cross-validation) | Distal tibia (corner metaphyseal fracture) | NS | 0.93 ± 0.018 | 0.88 ± 0.05 | 0.96 ± 0.015 | 0.89 ± 0.036 | 0.95 ± 0.023 | 13 | 2 | 2 | 33 |
Tsai [58] | Test (best performing model) | Distal tibia (corner metaphyseal fracture) | NS | 0.960 (0.865–0.989) | 0.929 (0.685–0.987) | 0.972 (0.858–0.995) | 0.929 (0.685–0.987) | 0.972 (0.858–0.995) | 13 | 1 | 1 | 35 |
All appendicular skeleton | ||||||||||||
Dupuis [30] | Test | Appendicular skeleton | NS | 0.926 (0.915–0.936) | 0.957 (0.940–0.969) | 0.912 (0.898–0.925) | 0.829 (0.803–0.852) | 0.979 (0.971–0.985) | NS | NS | NS | NS |
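
The accuracy, sensitivity, specificity, PPV and NPV columns can be recomputed from the TP, FP, FN and TN counts in each row. The sketch below shows one way to do this in Python. It assumes the 95% confidence intervals are Wilson score intervals; the table does not name the interval method, so this is an assumption, chosen because it agrees with the rows checked. The function names `wilson_interval` and `diagnostic_metrics` are illustrative and do not come from the review.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


def diagnostic_metrics(tp, fp, fn, tn):
    """Return {metric: (estimate, ci_low, ci_high)} from confusion-matrix counts."""
    # Each metric is a proportion: (numerator, denominator).
    pairs = {
        "accuracy": (tp + tn, tp + fp + fn + tn),
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {
        name: (k / n, *wilson_interval(k, n))
        for name, (k, n) in pairs.items()
    }


# Counts taken from the Choi [17] validation row of the table.
metrics = diagnostic_metrics(tp=55, fp=11, fn=3, tn=185)
for name, (estimate, low, high) in metrics.items():
    print(f"{name}: {estimate:.3f} ({low:.3f}-{high:.3f})")
```

For the Choi [17] validation counts, the accuracy, sensitivity and specificity estimates and intervals from this sketch agree with the tabulated values to three decimal places. Rows where the recomputed values differ from the table may reflect a different interval method or a transcription issue in the counts.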