Test-retest reproducibility of a deep learning-based automatic detection algorithm for the chest radiograph

Cited 7 times in Web of Science; cited 10 times in Scopus
Authors

Kim, Hyungjin; Park, Chang Min; Goo, Jin Mo

Issue Date
2020-04
Publisher
Springer Verlag
Citation
European Radiology, Vol.30 No.4, pp.2346-2355
Abstract
Objectives: To perform test-retest reproducibility analyses for a deep learning-based automatic detection algorithm (DLAD) using two stationary chest radiographs (CRs) with short-term intervals, to analyze influential factors on test-retest variations, and to investigate the robustness of DLAD to simulated post-processing and positional changes.

Methods: This retrospective study included patients with pulmonary nodules resected in 2017. Preoperative CRs without interval changes were used. Test-retest reproducibility was analyzed in terms of median differences of abnormality scores, intraclass correlation coefficients (ICC), and 95% limits of agreement (LoA). Factors associated with test-retest variation were investigated using univariable and multivariable analyses. Shifts in classification between the two CRs were analyzed using pre-determined cutoffs. Radiograph post-processing (blurring and sharpening) and positional changes (translations in x- and y-axes, rotation, and shearing) were simulated, and agreement of abnormality scores between the original and simulated CRs was investigated.

Results: Our study analyzed 169 patients (median age, 65 years; 91 men). The median difference of abnormality scores was 1-2% and ICC ranged from 0.83 to 0.90. The 95% LoA was approximately ±30%. Test-retest variation was negatively associated with solid portion size (beta, -0.50; p = 0.008) and good nodule conspicuity (beta, -0.94; p < 0.001). A small fraction (15/169) showed discordant classifications when the high-specificity cutoff (46%) was applied to the model outputs (p = 0.04). DLAD was robust to the simulated positional change (ICC, 0.984, 0.996), but relatively less robust to post-processing (ICC, 0.872, 0.968).

Conclusions: DLAD was robust to the test-retest variation. However, inconspicuous nodules may cause fluctuations of the model output and subsequent misclassifications.
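As an illustration of two of the quantities named in the abstract, the sketch below computes 95% limits of agreement and counts classification shifts at a cutoff for paired test and retest abnormality scores. It is a minimal sketch and not the authors' code: it assumes the conventional Bland-Altman form (mean difference ± 1.96 × sample SD of the differences), it assumes a score at or above the cutoff counts as positive, and the function names and score values are hypothetical. Only the 46% cutoff comes from the abstract.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second):
    """Return (bias, lower, upper) for paired scores.

    Assumes the Bland-Altman convention: bias +/- 1.96 * sample SD
    of the pairwise differences.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # stdev() is the sample standard deviation
    return bias, bias - spread, bias + spread


def discordant_count(first, second, cutoff):
    """Count pairs whose classification differs between the two readings.

    Assumption: a score at or above the cutoff is classified as positive.
    """
    return sum((a >= cutoff) != (b >= cutoff) for a, b in zip(first, second))


# Hypothetical abnormality scores in percent; NOT data from the study.
test_scores = [12.0, 55.0, 80.0, 40.0, 95.0]
retest_scores = [15.0, 44.0, 78.0, 49.0, 90.0]

bias, lower, upper = limits_of_agreement(test_scores, retest_scores)
n_flip = discordant_count(test_scores, retest_scores, cutoff=46.0)

print(f"bias={bias:.1f}, LoA=({lower:.1f}, {upper:.1f}), discordant={n_flip}")
```

With these made-up values, two of the five pairs straddle the 46% cutoff, which is the kind of classification shift the abstract reports for 15 of 169 patients.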
ISSN
0938-7994
URI
https://hdl.handle.net/10371/208901
DOI
https://doi.org/10.1007/s00330-019-06589-8
Files in This Item:
There are no files associated with this item.
Related Researcher

  • College of Medicine
  • Department of Medicine
Research Area Radiology

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.