Test-retest reproducibility of a deep learning-based automatic detection algorithm for the chest radiograph
Cited 7 times in Web of Science
Cited 10 times in Scopus
- Authors
- Issue Date
- 2020-04
- Publisher
- Springer Verlag
- Citation
- European Radiology, Vol.30 No.4, pp.2346-2355
- Abstract
- Objectives: To perform test-retest reproducibility analyses for a deep learning-based automatic detection algorithm (DLAD) using two stationary chest radiographs (CRs) acquired at short-term intervals, to analyze factors influencing test-retest variation, and to investigate the robustness of DLAD to simulated post-processing and positional changes. Methods: This retrospective study included patients with pulmonary nodules resected in 2017. Preoperative CRs without interval changes were used. Test-retest reproducibility was analyzed in terms of median differences of abnormality scores, intraclass correlation coefficients (ICC), and 95% limits of agreement (LoA). Factors associated with test-retest variation were investigated using univariable and multivariable analyses. Shifts in classification between the two CRs were analyzed using pre-determined cutoffs. Radiograph post-processing (blurring and sharpening) and positional changes (translations along the x- and y-axes, rotation, and shearing) were simulated, and agreement of abnormality scores between the original and simulated CRs was investigated. Results: The study analyzed 169 patients (median age, 65 years; 91 men). The median difference of abnormality scores was 1-2%, and ICC ranged from 0.83 to 0.90. The 95% LoA was approximately ±30%. Test-retest variation was negatively associated with solid portion size (beta, -0.50; p = 0.008) and good nodule conspicuity (beta, -0.94; p < 0.001). A small fraction of patients (15/169) showed discordant classifications when the high-specificity cutoff (46%) was applied to the model outputs (p = 0.04). DLAD was robust to the simulated positional changes (ICC, 0.984, 0.996), but relatively less robust to post-processing (ICC, 0.872, 0.968). Conclusions: DLAD was robust to test-retest variation. However, inconspicuous nodules may cause fluctuations in the model output and subsequent misclassifications.
- ISSN
- 0938-7994
- Files in This Item:
- There are no files associated with this item.
- Appears in Collections:
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
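
The agreement metrics named in the abstract (median difference, 95% limits of agreement, and ICC) can be illustrated with a minimal Python sketch. It rests on assumptions not stated in this record: the ICC form is taken to be ICC(2,1) (two-way random effects, absolute agreement, single measurement), the limits of agreement follow the usual Bland-Altman definition (mean difference ± 1.96 SD), and the function names and the abnormality scores at the bottom are hypothetical, not data from the study.

```python
from statistics import mean, median, stdev


def median_abs_difference(first, second):
    """Median of the absolute paired differences between two score lists."""
    return median([abs(b - a) for a, b in zip(first, second)])


def limits_of_agreement(first, second):
    """Bland-Altman 95% limits of agreement: mean difference +/- 1.96 SD."""
    diffs = [b - a for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias - spread, bias + spread


def icc_2_1(first, second):
    """ICC(2,1) for two measurements per subject (assumed ICC form)."""
    rows = list(zip(first, second))
    n, k = len(rows), 2
    grand = mean([x for row in rows for x in row])
    row_means = [mean(row) for row in rows]
    col_means = [mean(first), mean(second)]
    # Mean squares from a two-way ANOVA without replication.
    msr = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msc = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    sse = sum(
        (x - row_means[i] - col_means[j] + grand) ** 2
        for i, row in enumerate(rows)
        for j, x in enumerate(row)
    )
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical abnormality scores (%) for the same patients on two radiographs.
test_scores = [10, 20, 30, 40]
retest_scores = [12, 18, 33, 41]

print(median_abs_difference(test_scores, retest_scores))
print(limits_of_agreement(test_scores, retest_scores))
print(icc_2_1(test_scores, retest_scores))
```

Under these assumptions, a narrow interval from `limits_of_agreement` and an `icc_2_1` value close to 1 would both indicate that the two radiographs yield similar scores for the same patient.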