Optimizing Cloud-Native Lakehouse Architectures for Real-Time Semiconductor Analytics: Balancing Performance, Cost, and Energy Efficiency

Authors

  • Min Yin, University of California-Berkeley

DOI:

https://doi.org/10.70393/6a69656173.333833

ARK:

https://n2t.net/ark:/40704/JIEAS.v4n1a07

Disciplines:

Computer Science

Subjects:

Semiconductor Analytics

References:

27

Keywords:

Cloud-native Lakehouse, Semiconductor Analytics, Storage Tiering, Columnar Compression, Query Routing, Cost-energy Optimization

Abstract

Semiconductor data analysis requires processing massive volumes of real-time data, and traditional data warehouses struggle to meet its demands for low-latency, high-concurrency queries. This paper therefore proposes a cloud-native Lakehouse architecture designed specifically for real-time semiconductor analysis. It introduces a query routing mechanism and a data lineage tracing framework, and designs a dynamic multi-tiered storage system that tiers data by access frequency to achieve efficient storage and faster query performance. The research provides a practical way to overcome the limitations of existing architectures and offers insights for the future development of cloud-native platforms for real-time industrial analysis.
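
As a rough illustration of what tiering data by access frequency can look like, the following minimal Python sketch assigns partitions to hot, warm, or cold storage from their observed read rate. It is not the paper's implementation: the tier names, the thresholds, and the Partition, assign_tier, and plan_tiers names are all hypothetical, since the abstract does not specify them.

```python
from dataclasses import dataclass

# Hypothetical thresholds in accesses per day; the abstract gives no values.
HOT_MIN_ACCESSES_PER_DAY = 100.0
WARM_MIN_ACCESSES_PER_DAY = 1.0


@dataclass(frozen=True)
class Partition:
    name: str
    accesses: int        # reads observed during the window
    window_days: float   # length of the observation window, in days


def assign_tier(p: Partition) -> str:
    """Return 'hot', 'warm', or 'cold' based on the partition's access rate."""
    if p.window_days <= 0:
        raise ValueError("window_days must be positive")
    rate = p.accesses / p.window_days
    if rate >= HOT_MIN_ACCESSES_PER_DAY:
        return "hot"
    if rate >= WARM_MIN_ACCESSES_PER_DAY:
        return "warm"
    return "cold"


def plan_tiers(partitions):
    """Group partition names by their assigned tier."""
    plan = {"hot": [], "warm": [], "cold": []}
    for p in partitions:
        plan[assign_tier(p)].append(p.name)
    return plan
```

A dynamic system would re-run such a plan periodically so that partitions move between tiers as their access pattern changes; fixed thresholds are only the simplest possible policy.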

Author Biography

Min Yin, University of California-Berkeley

University of California-Berkeley, US, gmiayinc@gmail.com.

Published

2026-02-05

How to Cite

[1] M. Yin, “Optimizing Cloud-Native Lakehouse Architectures for Real-Time Semiconductor Analytics: Balancing Performance, Cost, and Energy Efficiency”, Journal of Industrial Engineering & Applied Science, vol. 4, no. 1, pp. 49–61, Feb. 2026.

Section

Articles
