MSAL-MIR: Multi-Stage Adaptive Loss for Medical Image Retrieval
Authors
- Nguyen Van Hoang Phuc, The University of Danang - University of Science and Technology, Vietnam
- Le Quang Nhat, The University of Danang - University of Science and Technology, Vietnam
- Duong Manh Quan, The University of Danang - University of Science and Technology, Vietnam
- Hoang Phuong Le, The University of Danang - University of Science and Technology, Vietnam
- Nguyen Van Hieu, The University of Danang - University of Science and Technology, Vietnam
Keywords:
Abstract
Efficient and accurate retrieval of medical images underpins timely diagnosis and informed clinical decisions. This work introduces a novel multi-stage training paradigm designed for medical image retrieval. In the first stage, a ConvNeXt model pretrained on ImageNet is fine-tuned using Focal Loss to address class imbalance. Building on this foundation, the feature space is refined with Triplet Margin Loss, where carefully selected sample triplets enhance discriminative learning. Our approach further streamlines retrieval by applying Global Max Pooling, L2 normalization, and Principal Component Analysis (PCA) for dimensionality reduction, followed by integration with Facebook AI Similarity Search (FAISS) for efficient similarity search. Experiments on the ISIC 2017 and COVID-19 chest X-ray datasets demonstrate that the proposed method achieves significant improvements in evaluation metrics, including mean Average Precision at 5 (mAP@5), Precision at 1 (P@1), and Precision at 5 (P@5).
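The post-embedding stages described above (Global Max Pooling, L2 normalization, PCA, then similarity search) can be sketched in plain NumPy. This is an illustrative reconstruction, not the authors' implementation: the array shapes are assumed, PCA is done via SVD, and a brute-force inner-product search stands in for a FAISS index (equivalent to `faiss.IndexFlatIP` on unit-normalized vectors).

```python
import numpy as np

def global_max_pool(feature_maps):
    """Collapse (N, C, H, W) convolutional feature maps to (N, C) descriptors."""
    return feature_maps.max(axis=(2, 3))

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit Euclidean norm."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def fit_pca(x, n_components):
    """Learn a PCA projection (mean, components) from gallery descriptors."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]

def search(gallery, queries, k=5):
    """Brute-force inner-product search; on unit vectors this is cosine similarity."""
    scores = queries @ gallery.T
    return np.argsort(-scores, axis=1)[:, :k]  # top-k gallery indices per query

# Toy example with random stand-in feature maps (shapes are assumptions).
rng = np.random.default_rng(0)
maps = rng.standard_normal((100, 512, 7, 7))      # pretend backbone outputs
desc = l2_normalize(global_max_pool(maps))
mean, comps = fit_pca(desc, n_components=64)
desc_pca = l2_normalize((desc - mean) @ comps.T)  # re-normalize after PCA
top5 = search(desc_pca, desc_pca[:3], k=5)        # each query's best match is itself
```

In a production setting the `search` step would be replaced by building a FAISS index over `desc_pca` so that queries scale to large galleries.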

