A Technical Review of Sequence-to-Sequence Models

Tao Bo; Weiyi Li; Yue Liu

doi:10.70393/616a6e73.323834

Authors

Tao Bo TikTok.Inc
Weiyi Li Georgia Institute of Technology
Yue Liu TikTok.Inc

DOI:

https://doi.org/10.70393/616a6e73.323834

ARK:

https://n2t.net/ark:/40704/AJNS.v2n2a01

Disciplines:

Computer Science

Subjects:

Artificial Intelligence

References:

25

Keywords:

Sequence-to-sequence, Transformer, Attention Mechanism, Neural Machine Translation, Text Summarization, Conversational AI, Reinforcement Learning, Pointer-generator Network, Beam Search, Natural Language Processing

Abstract

Seq2Seq models and their variants have become a mainstay of modern natural language processing and sequence modelling tasks. Just Information about Seq2Seq models. In this paper, we provide a comprehensive overview of the evolution of Seq2Seq architecture from early-stage RNN based approaches to recent Transformer based methods. The paper extensively covers additional important methods such as attention mechanisms, bidirectional encoders, pointer-generator networks, as well as optimization methods such as beam search, scheduled sampling and reinforcement learning. It also discusses the challenges of data preprocessing, loss functions, and evaluation metrics, as well as applications in machine translation, summarization, speech recognition, and conversational AI. This paper provides a comprehensive report on the design and future directions of Seq2Seq models emphasizing on theoretical foundations as well as real world applications.

Downloads

Download data is not yet available.

Metrics

Metrics Loading ...

Author Biographies

Tao Bo, TikTok.Inc

TikTok.Inc, USA.

Weiyi Li, Georgia Institute of Technology

Georgia Institute of Technology, USA.

Yue Liu, TikTok.Inc

TikTok.Inc, USA.

References

[1] Lin, W., Xiao, J., & Cen, Z. (2024). Exploring Bias in NLP Models: Analyzing the Impact of Training Data on Fairness and Equity. Journal of Industrial Engineering and Applied Science, 2(5), 24-28.

[2] Katrompas, A., Ntakouris, T., & Metsis, V. (2022, June). Recurrence and self-attention vs the transformer for time-series classification: A comparative study. In International Conference on Artificial Intelligence in Medicine (pp. 99-109). Cham: Springer International Publishing.

[3] Lin, C. C., Huang, A. Y., & Yang, S. J. (2023). A review of ai-driven conversational chatbots implementation methodologies and challenges (1999–2022). Sustainability, 15(5), 4012.

[4] Lin, W. (2024). The Application of Real-time Emotion Recognition in Video Conferencing. Journal of Computer Technology and Applied Mathematics, 1(4), 79-88.

[5] Dong, Y., Li, G., & Jin, Z. (2023, July). CODEP: grammatical seq2seq model for general-purpose code generation. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis (pp. 188-198).

[6] Li, K., Chen, X., Song, T., Zhou, C., Liu, Z., Zhang, Z., Guo, J., & Shan, Q. (2025a, March 24). Solving situation puzzles with large language model and external reformulation.

[7] Lyu, S. (2024). Machine Vision-Based Automatic Detection for Electromechanical Equipment. Journal of Computer Technology and Applied Mathematics, 1(4), 12-20.

[8] Barakat, H., Turk, O., & Demiroglu, C. (2024). Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resources. EURASIP Journal on Audio, Speech, and Music Processing, 2024(1), 11.

[9] Lin, W. (2025). Enhancing Video Conferencing Experience through Speech Activity Detection and Lip Synchronization with Deep Learning Models. Journal of Computer Technology and Applied Mathematics, 2(2), 16-23.

[10] Orynbay, L., Razakhova, B., Peer, P., Meden, B., & Emeršič, Ž. (2024). Recent advances in synthesis and interaction of speech, text, and vision. Electronics, 13(9), 1726.

[11] Luo, M., Zhang, W., Song, T., Li, K., Zhu, H., Du, B., & Wen, H. (2021, January). Rebalancing expanding EV sharing systems with deep reinforcement learning. In Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence (pp. 1338-1344).

[12] Lyu, S. (2024). The Application of Generative AI in Virtual Reality and Augmented Reality. Journal of Industrial Engineering and Applied Science, 2(6), 1-9.

[13] Lin, W. (2024). A Systematic Review of Computer Vision-Based Virtual Conference Assistants and Gesture Recognition. Journal of Computer Technology and Applied Mathematics, 1(4), 28-35.

[14] Zhu, H., Luo, Y., Liu, Q., Fan, H., Song, T., Yu, C. W., & Du, B. (2019). Multistep flow prediction on car-sharing systems: A multi-graph convolutional neural network with attention mechanism. International Journal of Software Engineering and Knowledge Engineering, 29(11n12), 1727–1740.

[15] Lyu, S. (2024). The Technology of Face Synthesis and Editing Based on Generative Models. Journal of Computer Technology and Applied Mathematics, 1(4), 21-27.

[16] Xu, F. (2021). Gaozhi yuanxiao yingyu jiaoxue moshi chuangxin [Innovation of English teaching models in higher vocational colleges]. Jiuzhou Press.

[17] Li, K., Chen, X., Song, T., Zhang, H., Zhang, W., & Shan, Q. (2024). GPTDrawer: Enhancing Visual Synthesis through ChatGPT. arXiv preprint arXiv:2412.10429.

[18] Jia, Y., Weiss, R. J., Biadsy, F., Macherey, W., Johnson, M., Chen, Z., & Wu, Y. (2019). Direct speech-to-speech translation with a sequence-to-sequence model. arXiv preprint arXiv:1904.06037.

[19] Lin, W. (2024). A Review of Multimodal Interaction Technologies in Virtual Meetings. Journal of Computer Technology and Applied Mathematics, 1(4), 60-68.

[20] Li, X., Wang, X., Qi, Z., Cao, H., Zhang, Z., & Xiang, A. DTSGAN: Learning Dynamic Textures via Spatiotemporal Generative Adversarial Network. Academic Journal of Computing & Information Science, 7(10), 31-40.

[21] Gholami, M. J., & Al Abdwani, T. (2024). The rise of thinking machines: A review of artificial intelligence in contemporary communication. Journal of Business, Communication & Technology, 1-15.

[22] Luo, M., Du, B., Zhang, W., Song, T., Li, K., Zhu, H., ... & Wen, H. (2023). Fleet rebalancing for expanding shared e-Mobility systems: A multi-agent deep reinforcement learning approach. IEEE Transactions on Intelligent Transportation Systems, 24(4), 3868-3881.

[23] Li, X., Cao, H., Zhang, Z., Hu, J., Jin, Y., & Zhao, Z. (2024). Artistic Neural Style Transfer Algorithms with Activation Smoothing. arXiv preprint arXiv:2411.08014.

[24] Sun, Y., & Ortiz, J. (2024). An ai-based system utilizing iot-enabled ambient sensors and llms for complex activity tracking. arXiv preprint arXiv:2407.02606.

[25] Gligorea, I., Cioca, M., Oancea, R., Gorski, A. T., Gorski, H., & Tudorache, P. (2023). Adaptive learning using artificial intelligence in e-learning: A literature review. Education Sciences, 13(12), 1216.

A Technical Review of Sequence-to-Sequence Models

Authors

DOI:

ARK:

Disciplines:

Subjects:

References:

Keywords:

Abstract

Downloads

Metrics

Author Biographies

Tao Bo, TikTok.Inc

Weiyi Li, Georgia Institute of Technology

Yue Liu, TikTok.Inc

References

Downloads

Published

How to Cite

Issue

Section

ARK

License

Make a Submission

Keywords

digital-distribution

Indexing & Abstracting

Information

Announcements

Call for Reviewers

Call for Papers

Current Issue