
Table 4 Diagnostic accuracy of artificial intelligence algorithms for fracture detection, organised by body parts

From: Artificial intelligence for radiological paediatric fracture assessment: a systematic review

| Author, year | Dataset | Body part | AUC | Accuracy, % (95% CI) | Sensitivity, % (95% CI) | Specificity, % (95% CI) | PPV, % (95% CI) | NPV, % (95% CI) | TP | FP | FN | TN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Upper limb: elbow | | | | | | | | | | | | |
| England [31] | Validation | Elbow effusions | 0.985 (0.966–1.00) | NS | NS | NS | NS | NS | NS | NS | NS | NS |
| England [31] | Test | Elbow effusions | 0.943 (0.884–1.00) | 0.907 (0.843–0.951) | 0.909 (0.788–1.00) | 0.906 (0.844–0.958) | NS | NS | 87 | 9 | 3 | 30 |
| Rayan [33] | Validation | Elbow fractures | 0.947 (0.930–0.960) | 0.877 (0.856–0.895) | 0.908 (0.882–0.929) | 0.841 (0.807–0.870) | 0.867 (0.838–0.892) | 0.889 (0.858–0.914) | 536 | 82 | 54 | 434 |
| Choi [17] | Validation | Supracondylar fractures | 0.976 (0.949–0.991) | 0.945 (0.910–0.967) | 0.948 (0.859–0.982) | 0.944 (0.902–0.968) | 0.833 (0.726–0.904) | 0.984 (0.954–0.995) | 55 | 11 | 3 | 185 |
| Choi [17] | Temporal test set | Supracondylar fractures | 0.985 (0.962–0.996) | 0.904 (0.855–0.938) | 0.939 (0.852–0.983) | 0.922 (0.874–0.956) | 0.805 (0.717–0.871) | 0.978 (0.945–0.991) | 62 | 15 | 4 | 117 |
| Choi [17] | Geographical test set | Supracondylar fractures | 0.992 (0.947–1.000) | 0.895 (0.817–0.942) | 1.000 (0.852–1.000) | 0.861 (0.759–0.931) | 0.697 (0.564–0.803) | 1.000 | 23 | 10 | 0 | 62 |
| Dupuis [30] | Test | Elbow fractures (subgroup) | NS | 0.888 (0.847–0.919) | 0.918 (0.846–0.958) | 0.873 (0.819–0.913) | 0.781 (0.696–0.847) | 0.956 (0.915–0.977) | 89 | 25 | 8 | 172 |
| Upper limb: other | | | | | | | | | | | | |
| Zhou [35] | Test set (best performing for AP ulnar view, using optimal central angle measurement of bone) | Forearm (Bowing fracture) | 0.992 (NS) | NS | 1.000 (NS) | 0.940 (NS) | NS | NS | NS | NS | NS | NS |
| Zhang [35] | Test set, analysed per patient | Distal radius (ultrasound) | NS | 0.92 | 1.0 | 0.87 | NS | NS | NS | NS | NS | NS |
| Lower limb | | | | | | | | | | | | |
| Malek [32] | Training | Lower limb fracture healing | 0.8 (NS) | 0.821 (0.673–0.910) | 0.792 (0.595–0.908) | 0.867 (0.621–0.963) | 0.905 (0.711–0.973) | 0.722 (0.491–0.875) | 19 | 2 | 5 | 13 |
| Malek [32] | Validation | Lower limb fracture healing | NS | 0.556 (0.267–0.811) | 0.600 (0.231–0.882) | 0.500 (0.150–0.850) | 0.600 (0.231–0.882) | 0.500 (0.150–0.850) | 3 | 2 | 2 | 2 |
| Malek [32] | Test | Lower limb fracture healing | NS | 0.889 (0.565–0.980) | 1.000 (0.566–1.000) | 0.750 (0.301–0.954) | 0.833 (0.436–0.970) | 1.000 (0.439–1.000) | 5 | 1 | 0 | 3 |
| Starosolski [34] | Test | Distal tibia | 0.995 (NS) | 0.979 (0.929–0.994) | 0.959 (0.863–0.989) | 1.000 (0.927–1.000) | 1.000 (0.924–1.000) | 0.961 (0.868–0.989) | 47 | 0 | 2 | 49 |
| Tsai [58] | Test (mean and SD for accuracy across models in fivefold cross-validation) | Distal tibia (corner metaphyseal fracture) | NS | 0.93 ± 0.018 | 0.88 ± 0.05 | 0.96 ± 0.015 | 0.89 ± 0.036 | 0.95 ± 0.023 | 13 | 2 | 2 | 33 |
| Tsai [58] | Test (best performing model) | Distal tibia (corner metaphyseal fracture) | NS | 0.960 (0.865–0.989) | 0.929 (0.685–0.987) | 0.972 (0.858–0.995) | 0.929 (0.685–0.987) | 0.972 (0.858–0.995) | 13 | 1 | 1 | 35 |
| All appendicular skeleton | | | | | | | | | | | | |
| Dupuis [30] | Test | Appendicular skeleton | NS | 0.926 (0.915–0.936) | 0.957 (0.940–0.969) | 0.912 (0.898–0.925) | 0.829 (0.803–0.852) | 0.979 (0.971–0.985) | NS | NS | NS | NS |

  1. 95% confidence intervals are omitted where they are not provided in the publication and cannot be calculated from the raw values in the confusion matrix
  2. AP anterior–posterior, NS not stated, CI confidence interval, AUC area under the curve, PPV positive predictive value, NPV negative predictive value, TP true positive, FP false positive, FN false negative, TN true negative, SD standard deviation
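
Footnote 1 refers to confidence intervals calculated from the raw values in the confusion matrix. As a minimal sketch of that arithmetic, the Python below derives accuracy, sensitivity, specificity, PPV and NPV from TP, FP, FN and TN, and attaches a 95% interval to each. The Wilson score interval is an assumption here, since the table does not state which interval method was used. The function names `wilson_interval` and `diagnostic_metrics` are illustrative only. The counts are those of the Rayan [33] validation row.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%).

    Assumption: the table does not say which interval method was used;
    Wilson is one common choice for proportions near 0 or 1.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def diagnostic_metrics(tp, fp, fn, tn):
    """Return {metric: (estimate, ci_low, ci_high)} from a 2x2 confusion matrix."""
    # Each metric is a proportion: (numerator, denominator).
    counts = {
        "accuracy": (tp + tn, tp + fp + fn + tn),
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {
        name: (k / n, *wilson_interval(k, n))
        for name, (k, n) in counts.items()
    }


# Counts from the Rayan [33] validation row of the table.
rayan = diagnostic_metrics(tp=536, fp=82, fn=54, tn=434)
for name, (est, low, high) in rayan.items():
    print(f"{name}: {est:.3f} ({low:.3f}-{high:.3f})")
```

For this row the point estimates agree with the tabulated values to three decimal places (accuracy 0.877, sensitivity 0.908, specificity 0.841, PPV 0.867, NPV 0.889). Intervals from other methods, such as Clopper-Pearson, would differ slightly.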