Li, Liang. “Overview of Multimodal Generative Models in Natural Language Processing and Computer Vision.” Journal of Computer Technology and Applied Mathematics 1, no. 4 (November 2, 2024): 69–78. Accessed February 9, 2025. https://www.suaspress.org/ojs/index.php/JCTAM/article/view/v1n4a09.