Abstract
This paper presents NeuroViT (Neuro-Adaptive Vision Transformer), a novel deep learning architecture for four-class ocular disease classification from retinal fundus images, targeting cataract, diabetic retinopathy, glaucoma, and normal conditions. Built upon the ViT-Base/16 backbone, NeuroViT integrates four neuro-inspired modules: a Saliency-Guided Attention Pathway (SGAP), a Predictive Coding Feedback Loop (PCFL), Dynamic Feature Routing, and Prototype-Based Classification. Evaluated on a stratified dataset, NeuroViT achieves state-of-the-art performance with 94.67% accuracy, 94.67% F1-score, and 99.50% AUC-ROC, while maintaining efficient inference at 6.22 ms per image. Ablation studies reveal that the Prototype-Based Classifier is the most critical component, contributing significantly to diagnostic robustness, whereas SGAP and PCFL show dataset-dependent effects: removing them slightly improves accuracy (to 94.91%), attributable to reduced noise in saliency estimation and less feature over-smoothing. Qualitative analysis and confusion matrix inspection confirm minimal misclassification, with no critical diagnostic errors (e.g., diabetic retinopathy mislabeled as normal). Despite its strong performance, limitations include single-source data, limited disease scope, lack of clinical validation, and untested deployment on edge devices. NeuroViT demonstrates that biologically inspired design principles can enhance vision transformers for medical imaging, offering a promising foundation for future clinical AI systems in ophthalmology.
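The abstract does not specify how the Prototype-Based Classifier works internally; a common formulation assigns each class a learned prototype vector and scores samples by (negative) distance in embedding space. A minimal nearest-prototype sketch, under that assumption (the function name, dimensions, and noise level are illustrative, not from the paper), might look like:

```python
import numpy as np

def prototype_logits(features, prototypes):
    """Logits as negative squared Euclidean distance from each feature
    vector to each class prototype; a closer prototype yields a larger logit.
    features: (batch, dim); prototypes: (num_classes, dim)."""
    d2 = ((features[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    return -d2

# Toy example: 4 classes (cataract, DR, glaucoma, normal), 8-dim embeddings.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(4, 8))
# Two samples perturbed slightly from the glaucoma and cataract prototypes.
feats = prototypes[[2, 0]] + 0.05 * rng.normal(size=(2, 8))
pred = prototype_logits(feats, prototypes).argmax(axis=1)
print(pred.tolist())  # each sample maps to its nearest prototype's class
```

In such heads, the prototypes are typically trained jointly with the backbone, which can improve robustness by anchoring decisions to class-representative embeddings rather than an arbitrary linear boundary.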
DOI: 10.1109/icimcis68501.2025.11326933