Artificial intelligence for radiological paediatric fracture assessment: a systematic review

Shelmerdine, Susan C.; White, Richard D.; Liu, Hantao; Arthurs, Owen J.; Sebire, Neil J.

doi:10.1186/s13244-022-01234-3

Insights into Imaging

Table 4 Diagnostic accuracy of artificial intelligence algorithms for fracture detection, organised by body parts

Author, year	Dataset	Body part	AUC (95% CI)	Accuracy (95% CI)	Sensitivity (95% CI)	Specificity (95% CI)	PPV (95% CI)	NPV (95% CI)	TP	FP	FN	TN
Upper limb—elbow
England [31]	Validation	Elbow effusions	0.985 (0.966–1.00)	NS	NS	NS	NS	NS	NS	NS	NS	NS
England [31]	Test	Elbow effusions	0.943 (0.884–1.00)	0.907 (0.843–0.951)	0.909 (0.788–1.00)	0.906 (0.844–0.958)	NS	NS	30	9	3	87
Rayan [33]	Validation	Elbow fractures	0.947 (0.930–0.960)	0.877 (0.856–0.895)	0.908 (0.882–0.929)	0.841 (0.807–0.870)	0.867 (0.838–0.892)	0.889 (0.858–0.914)	536	82	54	434
Choi [17]	Validation	Supracondylar fractures	0.976 (0.949–0.991)	0.945 (0.910–0.967)	0.948 (0.859–0.982)	0.944 (0.902–0.968)	0.833 (0.726–0.904)	0.984 (0.954–0.995)	55	11	3	185
	Temporal test set	Supracondylar fractures	0.985 (0.962–0.996)	0.904 (0.855–0.938)	0.939 (0.852–0.983)	0.922 (0.874–0.956)	0.805 (0.717–0.871)	0.978 (0.945–0.991)	62	15	4	117
	Geographical test set	Supracondylar fractures	0.992 (0.947–1.000)	0.895 (0.817–0.942)	1.000 (0.852–1.000)	0.861 (0.759–0.931)	0.697 (0.564–0.803)	1.000	23	10	0	62
Dupuis [30]	Test	Elbow fractures (subgroup)	NS	0.888 (0.847–0.919)	0.918 (0.846–0.958)	0.873 (0.819–0.913)	0.781 (0.696–0.847)	0.956 (0.915–0.977)	89	25	8	172
Upper limb—other
Zhou [35]	Test set (best performing for AP ulnar view, using optimal central angle measurement of bone)	Forearm (Bowing fracture)	0.992 (NS)	NS	1.000 (NS)	0.940 (NS)	NS	NS	NS	NS	NS	NS
Zhang [35]	Test set—analysed per patient	Distal radius (ultrasound)	NS	0.92	1.0	0.87	NS	NS	NS	NS	NS	NS
Lower limb
Malek [32]	Training	Lower limb fracture healing	0.8 (NS)	0.821 (0.673–0.910)	0.792 (0.595–0.908)	0.867 (0.621–0.963)	0.905 (0.711–0.973)	0.722 (0.491–0.875)	19	2	5	13
	Validation	Lower limb fracture healing	NS	0.556 (0.267–0.811)	0.600 (0.231–0.882)	0.500 (0.150–0.850)	0.600 (0.231–0.882)	0.500 (0.150–0.850)	3	2	2	2
	Test	Lower limb fracture healing	NS	0.889 (0.565–0.980)	1.000 (0.566–1.000)	0.750 (0.301–0.954)	0.833 (0.436–0.970)	1.000 (0.439–1.000)	5	1	0	3
Starosolski [34]	Test	Distal tibia	0.995 (NS)	0.979 (0.929–0.994)	0.959 (0.863–0.989)	1.000 (0.927–1.000)	1.000 (0.924–1.000)	0.961 (0.868–0.989)	47	0	2	49
Tsai [58]	Test (mean and SD for accuracy across models in fivefold cross-validation)	Distal tibia (corner metaphyseal fracture)	NS	0.93 ± 0.018	0.88 ± 0.05	0.96 ± 0.015	0.89 ± 0.036	0.95 ± 0.023	13	2	2	33
Tsai [58]	Test (best performing model)	Distal tibia (corner metaphyseal fracture)	NS	0.960 (0.865–0.989)	0.929 (0.685–0.987)	0.972 (0.858–0.995)	0.929 (0.685–0.987)	0.972 (0.858–0.995)	13	1	1	35
All appendicular skeleton
Dupuis [30]	Test	Appendicular skeleton	NS	0.926 (0.915–0.936)	0.957 (0.940–0.969)	0.912 (0.898–0.925)	0.829 (0.803–0.852)	0.979 (0.971–0.985)	NS	NS	NS	NS

95% confidence intervals are omitted where they are not provided in the publication and cannot be calculated from the raw values in the confusion matrix
AP anterior–posterior, NS not stated, CI confidence interval, AUC area under the curve, PPV positive predictive value, NPV negative predictive value, TP true positive, FP false positive, FN false negative, TN true negative, SD standard deviation
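
The footnote above refers to confidence intervals calculated from the raw confusion matrix. The following is a minimal Python sketch of that calculation, not the authors' stated method: the table does not say which interval was used, so the Wilson score interval is an assumption here, and the function names `wilson_interval` and `diagnostic_metrics` are hypothetical. The example uses the Choi [17] validation counts from the table (TP 55, FP 11, FN 3, TN 185).

```python
from math import sqrt

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def wilson_interval(successes, total, z=Z_95):
    """Wilson score interval for a binomial proportion.

    Assumption: the table does not state its interval method; Wilson is
    used here only as one common choice.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def diagnostic_metrics(tp, fp, fn, tn):
    """Return {metric: (estimate, ci_low, ci_high)} from a 2x2 confusion matrix."""
    counts = {
        "accuracy": (tp + tn, tp + fp + fn + tn),
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {
        name: (k / n, *wilson_interval(k, n))
        for name, (k, n) in counts.items()
    }


if __name__ == "__main__":
    # Choi [17], validation set, counts as listed in the table
    for name, (est, low, high) in diagnostic_metrics(55, 11, 3, 185).items():
        print(f"{name}: {est:.3f} ({low:.3f}, {high:.3f})")
```

Under this assumption the sketch gives a sensitivity of about 0.948 with an interval of roughly 0.859 to 0.982 for that row, in line with the tabulated values. Rows reported as NS for TP, FP, FN and TN cannot be recomputed this way.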
