Li, L. (2024). Overview of Multimodal Generative Models in Natural Language Processing and Computer Vision. Journal of Computer Technology and Applied Mathematics, 1(4), 69–78. https://doi.org/10.5281/zenodo.13988327