Li, Liang. “Overview of Multimodal Generative Models in Natural Language Processing and Computer Vision”. Journal of Computer Technology and Applied Mathematics, vol. 1, no. 4, Nov. 2024, pp. 69-78, doi:10.5281/zenodo.13988327.