Research on Stress Testing Automation of AI Server for High Concurrency Scenarios

Authors

  • Xingcheng Ren, Quanta Manufacturing Nashville LLC

DOI:

https://doi.org/10.70393/6a696574.343131

ARK:

https://n2t.net/ark:/40704/JIET.v1n2a02

Disciplines:

Intelligent Systems

Subjects:

Other

References:

22

Keywords:

Stress Testing, AI Server, High Concurrency Scenarios, Reinforcement Learning

Abstract

Traditional stress testing methods struggle to reproduce the complexity and dynamics of real business scenarios, which leads to distorted test results and low efficiency. To address these problems, this paper proposes an automated stress testing framework for AI servers in high-concurrency scenarios. The framework follows a design of hierarchical decoupling and intelligent decision-making and consists of four modules: an intelligent load generation layer, a system resource and performance monitoring layer, a dynamic tuning and control center, and a root cause analysis and report generation layer. The intelligent load generation layer supports mixed simulation of multi-modal AI loads; the dynamic tuning and control center dynamically optimizes test parameters using a reinforcement learning (RL) algorithm; and the root cause analysis and report generation layer automatically locates performance bottlenecks and generates reports through unsupervised learning and time series correlation analysis. Experimental results show that the framework effectively uncovers potential system bottlenecks, improves test efficiency, and shortens the fault diagnosis cycle, supporting performance optimization of AI servers.
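
As an illustration of how RL-based tuning of a test parameter might work, the following is a minimal sketch only. The abstract does not state which RL algorithm the framework uses, so an epsilon-greedy bandit is substituted here as a simplified stand-in. The concurrency levels, the latency budget `SLO_MS`, and the toy `simulated_server` model are all assumptions made for this example and do not come from the paper.

```python
import random

# Hypothetical candidate concurrency levels (the test parameter being tuned).
LEVELS = [16, 32, 64, 128, 256]
SLO_MS = 150.0  # assumed latency budget, not taken from the paper


def simulated_server(concurrency, rng):
    """Toy stand-in for a real AI server: returns (throughput_rps, latency_ms)."""
    capacity = 100        # assumed saturation point of the toy server
    base_latency = 50.0   # assumed unloaded latency in milliseconds
    # Latency grows once concurrency exceeds capacity; noise mimics jitter.
    overload = max(0.0, concurrency - capacity) / capacity
    latency = base_latency * (1.0 + 2.0 * overload) + rng.uniform(-5.0, 5.0)
    throughput = concurrency / (latency / 1000.0)
    return throughput, latency


def tune(episodes=200, epsilon=0.1, seed=0):
    """Epsilon-greedy search for the level with the best throughput within the SLO."""
    rng = random.Random(seed)
    counts = {c: 0 for c in LEVELS}
    values = {c: 0.0 for c in LEVELS}  # running mean reward per level
    for step in range(episodes):
        if step < len(LEVELS):
            level = LEVELS[step]                 # try every level once
        elif rng.random() < epsilon:
            level = rng.choice(LEVELS)           # explore
        else:
            level = max(LEVELS, key=values.get)  # exploit the best estimate
        throughput, latency = simulated_server(level, rng)
        # Reward is throughput, zeroed when the latency budget is violated.
        reward = throughput if latency <= SLO_MS else 0.0
        counts[level] += 1
        values[level] += (reward - values[level]) / counts[level]
    return max(LEVELS, key=values.get)


if __name__ == "__main__":
    print("Selected concurrency level:", tune())
```

In this toy model the tuner settles on the highest concurrency level that still meets the latency budget. A real controller would replace `simulated_server` with measurements from the monitoring layer and would likely need a richer state and action space than a bandit provides.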

Author Biography

Xingcheng Ren, Quanta Manufacturing Nashville LLC

Quanta Manufacturing Nashville LLC, US, xrenwork@yahoo.com.

Published

2026-04-10

How to Cite

Ren, X. (2026). Research on Stress Testing Automation of AI Server for High Concurrency Scenarios. Journal of Intelligence and Engineering Technology, 1(2), 7–12. https://doi.org/10.70393/6a696574.343131

Section

Articles
