Improved email classification through enhanced data preprocessing approach

Citation Data: Spatial Information Research, ISSN: 2366-3294, Vol: 29, Issue: 2, Page: 247-255

Publication Year: 2021

Citations: 11
Usage: 0
Captures: 14
Mentions: 0
Social Media: 0


Metrics Details

Citations: 11
- Citation Indexes: 11
Captures: 14
- Readers: 14

Article Description

Email has become one of the most widely used forms of communication, and the resulting exponential growth in the number of emails received places an immense burden on existing approaches to email classification. Applying a classification method directly to raw data may degrade the performance of classifier algorithms, so the data must first be prepared for the machine learning classifiers. This paper proposes an enhanced data preprocessing approach for multi-category email classification. The proposed model removes the email signature. It then removes special characters and unwanted words using preprocessing methods such as stop-word removal, enhanced stop-word removal, and stemming. The proposed model is evaluated using classifiers such as Multinomial Naïve Bayes, Linear Support Vector Classifier, Logistic Regression, and Random Forest. The results show that the proposed data preprocessing approach to email classification outperforms the existing approach.
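The preprocessing steps named in the description (signature removal, special-character removal, stop-word removal, stemming) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the signature markers, and the suffix-stripping stemmer are all hypothetical stand-ins chosen for illustration, and the "enhanced stop-word removal" step is left out because the description does not define it.

```python
import re
from typing import List

# ASSUMPTION: a deliberately tiny stop-word list; the paper's lists are not given here.
STOP_WORDS = {
    "a", "an", "the", "is", "are", "to", "of", "and", "in", "on",
    "for", "you", "we", "i", "me", "my", "be", "will", "at", "it",
}

# ASSUMPTION: lines that commonly open an email signature or sign-off.
SIGNATURE_MARKERS = {
    "--", "regards", "best regards", "kind regards",
    "thanks", "thank you", "sincerely", "cheers",
}


def strip_signature(body: str) -> str:
    """Drop everything from the first line that looks like a sign-off."""
    kept = []
    for line in body.splitlines():
        normalized = line.strip().lower().rstrip(",.!")
        if normalized in SIGNATURE_MARKERS:
            break
        kept.append(line)
    return "\n".join(kept)


def naive_stem(token: str) -> str:
    """Crude suffix stripping; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(body: str) -> List[str]:
    """Signature removal, special-character removal, stop-word removal, stemming."""
    text = strip_signature(body).lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # digits, punctuation, symbols become spaces
    tokens = text.split()
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


sample = (
    "Hi team,\n"
    "The invoices are attached for the meeting on Monday!\n"
    "Best regards,\n"
    "Alice\n"
    "alice@example.com"
)
print(preprocess(sample))
# ['hi', 'team', 'invoic', 'attach', 'meet', 'monday']
```

The resulting token lists would then typically be vectorized and passed to a classifier. The sketch stops before that step, since the description does not say which features or settings were used.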

Bibliographic Details

DOI: 10.1007/s41324-020-00378-y

URL ID: http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=85100159790&origin=inward; http://dx.doi.org/10.1007/s41324-020-00378-y; http://link.springer.com/10.1007/s41324-020-00378-y; http://link.springer.com/content/pdf/10.1007/s41324-020-00378-y.pdf; http://link.springer.com/article/10.1007/s41324-020-00378-y/fulltext.html; https://dx.doi.org/10.1007/s41324-020-00378-y; https://link.springer.com/article/10.1007/s41324-020-00378-y

AUTHOR(S)

B. Aruna Kumara; Mallikarjun M. Kodabagi; Tanupriya Choudhury; Jung Sup Um

PUBLISHER(S)

Springer Science and Business Media LLC

TAG(S)

Social Sciences; Computer Science; Earth and Planetary Sciences
