EAGER: Automated Content-Based Detection of Public Online Harassment

$150,000 | FY2015 | CSE | NSF

University of Maryland, College Park, College Park, MD

Investigators

Abstract

Public online harassment takes many forms, but at its core are posts that are offensive, threatening, and intimidating. It is not an isolated problem: the Pew Research Center found that 73% of people had witnessed harassment online, and a full 40% had experienced it directly. This research develops a method for analyzing what people post online and automatically detecting which posts constitute severe public online harassment, that is, messages posted simply to disrupt, offend, or threaten others. Such detection can help websites better limit which messages are posted and reduce the amount of harassment people experience online. The researchers build a corpus of online comments drawn from a number of media outlets and social media platforms, in which each post is labeled as harassing or non-harassing. They then apply a set of computational linguistic techniques to describe features of each message, including types of words and language structure, and pass those features to rule-based and machine learning artificial intelligence systems for classification. The goal is to develop models that automatically detect public online harassment messages with high accuracy.
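
The pipeline described above (labeled corpus, linguistic features, then rule-based and machine learning classification) can be illustrated with a minimal Python sketch. This is not the project's method: the word lists, the toy corpus, the feature names, and the choice of a naive Bayes classifier are all assumptions made for illustration, and a real system would rely on a much larger labeled corpus and richer features.

```python
import math
import re
from collections import Counter

# Hypothetical word lists, invented for this sketch (not from the project).
THREAT_WORDS = {"kill", "hurt", "destroy"}
INSULT_WORDS = {"idiot", "stupid", "loser"}
SECOND_PERSON = {"you", "your", "you're"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def extract_features(text):
    """Describe a message by counts of word types (assumed features)."""
    tokens = tokenize(text)
    return {
        "threat_count": sum(t in THREAT_WORDS for t in tokens),
        "insult_count": sum(t in INSULT_WORDS for t in tokens),
        "second_person": sum(t in SECOND_PERSON for t in tokens),
        "n_tokens": len(tokens),
    }


def rule_based_label(text):
    """Rule: flag a threat or insult word that co-occurs with 'you'."""
    f = extract_features(text)
    return (f["threat_count"] + f["insult_count"]) > 0 and f["second_person"] > 0


class NaiveBayes:
    """Tiny multinomial naive Bayes over word counts, add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = {True: Counter(), False: Counter()}
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        scores = {}
        for label in (True, False):
            n_words = sum(self.word_counts[label].values())
            score = math.log(self.class_counts[label] / total)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    count = self.word_counts[label][tok] + 1
                    score += math.log(count / (n_words + len(self.vocab)))
            scores[label] = score
        return scores[True] > scores[False]


# Toy labeled corpus (True = harassing), invented for illustration.
corpus = [
    ("you are a stupid loser", True),
    ("i will hurt you", True),
    ("you idiot nobody wants you here", True),
    ("thanks for the thoughtful article", False),
    ("i disagree with the second point", False),
    ("great reporting as always", False),
]
texts, labels = zip(*corpus)
model = NaiveBayes().fit(texts, labels)

print(rule_based_label("you are an idiot"))
print(model.predict("great article thanks"))
```

The rule-based function is transparent but only as good as its hand-written word lists, while the learned model generalizes from labeled examples. Comparing the two on a held-out portion of a labeled corpus is one way to measure the accuracy the abstract refers to.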
