Harassment Detection on Twitter using Conversations

Edupuganti, Venkatesh
Department of Computer Science and Engineering; Social Media; Twitter; Harassment; Computer Engineering; Computer Sciences; Engineering; Physical Sciences and Mathematics
thesis / dissertation description
Social media has brought people closer than ever before, but it has also brought a risk of online harassment. Such harassment can seriously affect a person, for example by causing low self-esteem and depression. Past research on detecting harassment on social media has relied primarily on the content of the messages exchanged. The lack of context when relying on a single social media post can result in a high rate of false alarms. In this study, I focus on the reliable detection of harassment on Twitter by better understanding the context in which a pair of users exchanges messages, thereby improving precision. Specifically, I use a comprehensive set of features covering content, the profiles of the users exchanging messages, and the sequence of messages. By analyzing the conversation between users and features such as change of behavior during the conversation, length of the conversation, and frequency of curse words, I find that the detection of harassment improves significantly over using content features and user profile information alone. Experimental results demonstrate that a supervised machine learning classifier using this comprehensive set of features achieves an F-score of 88.2 and an area under the Receiver Operating Characteristic (ROC) curve (AUC) of 94.3.
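The conversation-level features named in the abstract (length of conversation, frequency of curse words, change of behavior) could be computed roughly as in the sketch below. This is an illustration under stated assumptions, not the thesis implementation: the `Tweet` type, the `CURSE_WORDS` lexicon, the function names, and the definition of "behavior change" as the difference in curse rate between the second and first half of a conversation are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical placeholder lexicon; the thesis does not list its curse words here.
CURSE_WORDS = {"idiot", "stupid", "loser"}


@dataclass
class Tweet:
    sender: str
    text: str


def curse_count(text):
    """Count tokens that appear in the placeholder lexicon."""
    tokens = [tok.strip(".,!?;:").lower() for tok in text.split()]
    return sum(1 for tok in tokens if tok in CURSE_WORDS)


def conversation_features(tweets):
    """Compute simple features over an ordered list of tweets between two users."""
    n = len(tweets)
    if n == 0:
        return {"length": 0, "curse_rate": 0.0, "behavior_change": 0.0}

    counts = [curse_count(t.text) for t in tweets]

    # Assumed proxy for "change of behavior": curse rate in the second half
    # of the conversation minus curse rate in the first half.
    half = n // 2
    first, second = counts[:half], counts[half:]
    first_rate = sum(first) / len(first) if first else 0.0
    second_rate = sum(second) / len(second)

    return {
        "length": n,
        "curse_rate": sum(counts) / n,
        "behavior_change": second_rate - first_rate,
    }


conversation = [
    Tweet("a", "Nice photo!"),
    Tweet("b", "Thanks"),
    Tweet("a", "You are an idiot"),
    Tweet("a", "Stupid loser!"),
]
features = conversation_features(conversation)
```

A feature dictionary like this would be combined with content and user profile features before being passed to a supervised classifier; the specific classifier and feature encoding are not stated in the abstract.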