Resident Norwalk Hospital NYC H+H Guilford, Connecticut, United States
Purpose: Integrating eye tracking and gaze data into deep learning models for image analysis improves both explainability and accuracy. We survey eye tracking principles in radiology, including visual search techniques stratified by expertise, and review the principal deep learning architectures currently used for prediction in medical imaging. We then explore how eye tracking data can enhance accuracy and transparency, and discuss limitations and future implications.
Methods/Materials: Information processing in imaging involves consolidating salient patterns, as illustrated by the radiologist's scan path, the features of which are discussed. We review the literature on deep learning (DL) models and vision transformers, with attention to how eye-tracking and gaze (ETAG) data improve accuracy in radiology models and mitigate the 'black box' problem.
Results: Deep learning models learn statistical patterns, whereas radiologists’ gaze tracks clinically meaningful features. ETAG data quantify expert attention patterns, which can be adapted as learning signals for DL models, for example to guide focus toward clinically significant regions and away from artifacts. This improves accuracy and efficiency (including energy efficiency) and mitigates overfitting. Current challenges in applying ETAG data to DL models include scalability and nuances in the data; for example, dwell time sometimes reflects a pruned scan path.
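One way to picture gaze as a learning signal is sketched below. This is a minimal illustration, not a method drawn from the reviewed literature: it assumes fixations arrive as (row, column, dwell time) triples, converts them into a dwell-weighted Gaussian heatmap, and scores a model's attention map against that heatmap with a KL divergence. The function names, the Gaussian width `sigma`, and the choice of KL divergence are all assumptions made for illustration.

```python
import numpy as np


def gaze_heatmap(fixations, shape, sigma=2.0):
    """Build a normalized gaze heatmap from (row, col, dwell_seconds) fixations.

    Hypothetical helper: each fixation adds a Gaussian bump weighted by its
    dwell time, and the result is normalized to sum to 1.
    """
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    heat = np.zeros(shape, dtype=float)
    for r, c, dwell in fixations:
        dist_sq = (rows - r) ** 2 + (cols - c) ** 2
        heat += dwell * np.exp(-dist_sq / (2.0 * sigma ** 2))
    total = heat.sum()
    if total == 0:
        # No fixations recorded: fall back to a uniform map.
        return np.full(shape, 1.0 / heat.size)
    return heat / total


def attention_alignment_loss(model_attention, gaze, eps=1e-8):
    """KL(gaze || model_attention) over pixels; smaller means closer alignment.

    Assumed form of an auxiliary loss; in training it would typically be
    added to the task loss with a weighting factor.
    """
    p = gaze / gaze.sum()
    q = model_attention / model_attention.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))


# Illustrative data: a long fixation on one region, a brief glance at another.
fixations = [(4, 4, 0.8), (10, 12, 0.2)]
gaze = gaze_heatmap(fixations, (16, 16))

uniform = np.full((16, 16), 1.0 / 256)  # a model attending everywhere equally
loss_aligned = attention_alignment_loss(gaze, gaze)
loss_uniform = attention_alignment_loss(uniform, gaze)
```

In this sketch a model whose attention matches the gaze map incurs no penalty, while a model that spreads attention uniformly is penalized. Because the heatmap is weighted by dwell time, it would inherit the dwell-time nuance noted above.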
Conclusions: Integrating eye tracking and gaze data during the training of radiology DL models can enhance robustness, efficiency, and accuracy. Moreover, by aligning machine reasoning with clinical knowledge, ETAG data integration transforms ‘black box’ models into transparent systems that mirror human expert focus.