Fully automated measurement of paediatric cerebral palsy pelvic radiographs with BoneFinder: external validation using a national surveillance database.
Hughes K., Luzar J., Lang J., Perry DC., Gaston MS.
AIMS: BoneFinder is a machine-learning tool that automatically calculates Reimers migration percentage (RMP) and head-shaft angle (HSA) from paediatric cerebral palsy (CP) pelvic radiographs. The primary aim of this study was to compare BoneFinder's fully automated measurements with manual measurements and HipScreen-assisted measurements made by clinicians.
METHODS: Using the radiological database within Cerebral Palsy Integrated Care Pathway Scotland (CPIPS), BoneFinder's automatic RMP and HSA measurements were compared, on the same set of radiographs, with: routine manual measurements performed by clinical experts and recorded in the CPIPS database; additional manual measurements performed by two clinicians; and measurements performed by the same two clinicians using the smartphone application HipScreen.
RESULTS: A total of 509 anteroposterior pelvic radiographs (1,018 hips; mean age 7.4 years (1 to 17)) were selected at random from the CPIPS database. Gross Motor Function Classification System levels were I (n = 69), II (n = 37), III (n = 97), IV (n = 120), and V (n = 186). The mean absolute difference (MAD) in RMP between BoneFinder and CPIPS measurements, manual measurements, and HipScreen was 7.6% (SD 10.0%), 5.5% (SD 9.1%), and 5.8% (SD 9.2%), respectively. Interobserver reliability of RMP measurement across all methods was excellent (intraclass correlation coefficient (ICC) 0.89 (95% CI 0.87 to 0.91); p < 0.001). Agreement between BoneFinder and CPIPS measurements was good (ICC 0.80 (95% CI 0.65 to 0.87); p < 0.001). The area under the receiver operating characteristic curve for BoneFinder's ability to detect a hip with an RMP ≥ 30%/40%/50% was 0.95/0.97/0.98, respectively. The ICC of HSA measurement across all raters was moderate (ICC 0.72 (95% CI 0.67 to 0.76); p < 0.001). Image artefact was present in 138 of 1,018 hips (14%). In images with artefact, MAD increased and ICC decreased for both RMP and HSA measurement between BoneFinder and CPIPS, indicating a decline in agreement.
CONCLUSION: Fully automated RMP and HSA measurements using BoneFinder were highly reliable with clinically acceptable measurement error. Further refinement of BoneFinder is required for analysis of radiographs with artefact.
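As an illustration of two of the agreement statistics reported above, the following minimal Python sketch computes a mean absolute difference (with the SD of the absolute differences) and an area under the ROC curve for paired RMP measurements. It is not the authors' analysis code. The function names and the RMP values are hypothetical and invented for the example, the use of the sample SD is an assumption, and taking the manual measurement as the reference standard for the 30% threshold is also an assumption.

```python
from statistics import mean, stdev


def mean_absolute_difference(auto, manual):
    """Return (MAD, sample SD of the absolute differences) for paired values."""
    if len(auto) != len(manual) or len(auto) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    abs_diffs = [abs(a - m) for a, m in zip(auto, manual)]
    return mean(abs_diffs), stdev(abs_diffs)


def auc(scores, labels):
    """AUC as the probability that a positive case scores above a negative one.

    Computed by comparing every positive/negative pair; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical RMP values (%) for five hips; invented, not study data.
auto_rmp = [12.0, 35.0, 48.0, 22.0, 61.0]    # automated measurement
manual_rmp = [10.0, 31.0, 52.0, 25.0, 58.0]  # assumed reference measurement

mad, sd = mean_absolute_difference(auto_rmp, manual_rmp)

# Assumption: a hip counts as positive when the reference RMP is >= 30%.
labels = [m >= 30.0 for m in manual_rmp]
area = auc(auto_rmp, labels)

print(f"MAD = {mad:.1f}%, SD = {sd:.2f}%, AUC (RMP >= 30%) = {area:.2f}")
```

The pairwise AUC is quadratic in the number of hips, which is acceptable at the scale of about 1,000 hips. A rank-based implementation would be preferable for much larger datasets.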