Research on the Application of Deep Learning-Based Text Classification Algorithms in Data Mining
DOI: https://doi.org/10.5281/zenodo.13762126
ARK: https://n2t.net/ark:/40704/AJNS.v1n1a03
References: 11
Keywords: Deep Learning, Text Classification, Data Mining, Convolutional Neural Network (CNN), Natural Language Processing (NLP)
Abstract
This article explores the application of deep learning-based text classification algorithms in data mining. It reviews the existing literature on the use of convolutional neural networks (CNNs), recurrent neural networks (RNNs), and Transformers in text classification, together with related research on data mining and analysis. The experiment uses the IMDB movie review dataset to evaluate a deep learning model through data preprocessing, feature extraction, and model training. On the test set, the model achieves an accuracy of 86%, a precision of 85%, a recall of 88%, and an F1 score of 86%. These results suggest that deep learning models can significantly improve text classification performance. The article also discusses the significance and limitations of the findings and outlines future research directions, including improvements in data acquisition and annotation, computing resource optimization, and model interpretability.
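As a quick consistency check on the reported metrics, the F1 score can be recomputed from the reported precision and recall. The sketch below assumes that "F1" means the standard harmonic mean of precision and recall, which the abstract does not state explicitly; the function name `f1_score` is illustrative and not taken from the article.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F1)."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract (test set, IMDB movie reviews).
precision = 0.85
recall = 0.88

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.4f}")  # prints "F1 = 0.8647"
```

Under that assumption, the recomputed value rounds to 86%, which agrees with the F1 score reported in the abstract.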

License
Copyright (c) 2024 The author retains copyright and grants the journal the right of first publication.

This work is licensed under a Creative Commons Attribution 4.0 International License.