Research on Text Classification and Topic Extraction Model of Medical English Corpus Based on Natural Language Processing

Citation Data: 2024 IEEE 6th International Conference on Power, Intelligent Computing and Systems, ICPICS 2024, Pages: 727-731

Publication Year: 2024

Conference Paper Description

Professional medical literature holds a large body of knowledge, and organizing and classifying it effectively matters for both medical research and practice. This paper proposes a deep learning-based text classification and topic extraction model for medical English corpora. We first collected a corpus of 1,000 medical texts and preprocessed them with natural language processing techniques. For classification, we propose a convolutional neural network (CNN) based model that learns and extracts features from text automatically. For topic extraction, we use a Latent Dirichlet Allocation (LDA) model. On the test set, our model outperforms other common text classification models, such as Naive Bayes, Support Vector Machines, and Logistic Regression, in both accuracy and F1-score. The work provides an effective method for the automatic classification and topic extraction of medical English texts, with broad practical applicability.
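
The abstract compares models on accuracy and F1-score without saying how F1 is averaged across classes. As a minimal sketch, assuming single-label classification and macro averaging (an assumption, not something the paper states), the two metrics could be computed as follows. The function names and the category labels in the example are hypothetical and chosen only for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is assumed)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the true class differs
            fn[t] += 1  # true class t was missed
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, not taken from the paper's corpus.
y_true = ["cardiology", "oncology", "oncology", "neurology"]
y_pred = ["cardiology", "oncology", "neurology", "neurology"]
print(accuracy(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Macro averaging weights every class equally, which is one common choice when categories are imbalanced; a micro or weighted average would give different numbers on the same predictions.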

Bibliographic Details

DOI: 10.1109/icpics62053.2024.10796240

URL ID: http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=85216194090&origin=inward; http://dx.doi.org/10.1109/icpics62053.2024.10796240; https://ieeexplore.ieee.org/document/10796240/

AUTHOR(S)

Xinli Zhang; Lingyue Xie

PUBLISHER(S)

Institute of Electrical and Electronics Engineers (IEEE)

TAG(S)

Energy; Computer Science; Engineering
