Neural Network Models for Hate Speech Classification in Tweets
Abstract
The increase in hate speech on social media in recent years calls for improved detection methods. While traditional techniques rely on manual monitoring, there is growing interest in applying machine learning to text classification. Improvements in hate speech classification would have important implications as social media companies such as Twitter, Facebook, and Reddit begin to enforce hate speech regulations. Linear classifiers, support vector machines, and neural networks have shown promising results in hate speech classification. We expand on prior research on a Twitter dataset of 16K annotated tweets by incorporating metadata, namely retweet and favorite counts on tweets and follower and friend counts of users, which improves classification accuracy by 2% on convolutional and recurrent neural network models. We also train hate speech-specific word embeddings to capture the code words that authors of hate speech appropriate to target specific groups of people. These task-specific word embeddings yield an additional 2% increase in classification accuracy.
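As a rough illustration of what combining tweet text with metadata can look like, the sketch below concatenates a text representation with the four counts named in the abstract. It is not the thesis's architecture: the averaged word embedding stands in for a convolutional or recurrent text encoder, the log scaling of counts is an assumption, and all function names and the toy embeddings are hypothetical.

```python
import math

def metadata_features(retweets, favorites, followers, friends):
    """Log-scale raw counts (an assumed preprocessing step) to tame heavy tails."""
    return [math.log1p(c) for c in (retweets, favorites, followers, friends)]

def mean_embedding(tokens, embeddings, dim):
    """Average the vectors of known tokens; return a zero vector if none are known.

    This is a simple stand-in for a CNN or RNN text encoder.
    """
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def combined_features(tokens, embeddings, dim,
                      retweets, favorites, followers, friends):
    """Concatenate the text representation with the metadata features."""
    text_part = mean_embedding(tokens, embeddings, dim)
    meta_part = metadata_features(retweets, favorites, followers, friends)
    return text_part + meta_part

# Toy 2-dimensional embeddings, made up for illustration only.
toy_embeddings = {"hello": [1.0, 0.0], "world": [0.0, 1.0]}
features = combined_features(
    ["hello", "world", "unknown"], toy_embeddings, 2,
    retweets=0, favorites=3, followers=120, friends=80,
)
print(len(features))  # 2 text dimensions + 4 metadata dimensions
```

The resulting vector would be passed to a classifier; in a neural model the same concatenation would typically happen just before the final dense layers.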
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:38811552
- FAS Theses and Dissertations