Sounds familiar somehow...
"We've all heard anecdotes about trolling on Wikipedia and other
social platforms, but rarely has anyone been able to quantify
levels and origins of online abuse. That's about to change.
Researchers with Alphabet tech incubator Jigsaw worked with
Wikimedia Foundation to analyze 100,000 comments left on
English-language Wikipedia. They found predictable patterns behind
who will launch personal attacks and when.
The goal of the research team was to lay the groundwork for an
automated system to "reduce toxic discussions" on Wikipedia. The
team's work could one day lead to the creation of a warning system
for moderators. The researchers caution that this system would
require more research to implement, but they have released a paper
with some fascinating early findings.
To make the supervised machine-learning task simple, the Jigsaw
researchers focused exclusively on ad hominem or personal attacks,
which are relatively easy to identify. They defined personal
attacks as directed at a commenter (e.g., "you suck"), directed at
a third party ("Bill sucks"), quoting an attack ("Bill says Henri
sucks"), or just "another kind of attack or harassment." They used
Crowdflower to crowdsource the job of reviewing 100,000 Wikipedia
comments made between 2004 and 2015. Ultimately, they used over 4,000
Crowdflower workers to complete the task, and each comment was
annotated by 10 different people as an attack or not.
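
As a rough illustration of how ten yes/no judgments per comment
might be collapsed into one training label, here is a minimal
Python sketch. The article does not say how the researchers
combined the annotations, so the majority-vote rule, the 0.5
threshold, and the name aggregate_annotations are all assumptions
made for this example.

```python
from collections import defaultdict


def aggregate_annotations(annotations, threshold=0.5):
    """Collapse per-annotator votes into one label per comment.

    annotations: iterable of (comment_id, is_attack) pairs, one pair
    per annotator judgment. The threshold is an assumed cutoff, not
    a value taken from the paper.
    """
    votes = defaultdict(list)
    for comment_id, is_attack in annotations:
        votes[comment_id].append(bool(is_attack))

    labels = {}
    for comment_id, comment_votes in votes.items():
        # Fraction of annotators who flagged the comment as an attack.
        fraction = sum(comment_votes) / len(comment_votes)
        labels[comment_id] = {
            "attack_fraction": fraction,
            "is_attack": fraction > threshold,
        }
    return labels


# Made-up example: ten judgments for each of two comments.
annotations = (
    [("c1", True)] * 7 + [("c1", False)] * 3
    + [("c2", True)] * 2 + [("c2", False)] * 8
)
labels = aggregate_annotations(annotations)
```

Keeping the raw fraction alongside the yes/no label preserves how
strongly the annotators agreed, which a simple majority vote would
otherwise throw away.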
Once the researchers had their dataset, they trained a logistic
regression algorithm to recognize whether a comment was a personal
attack or not. "With testing, we found that a fully trained model
achieves better performance in predicting whether an edit is a
personal attack than the combined average of three human
crowd-workers," they write in a summary of their paper on Medium.
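
To make "trained a logistic regression algorithm" a little more
concrete, here is a toy sketch in plain Python. It is not the
researchers' model: the word-count features, the learning rate,
the number of epochs, and the four training sentences are all
invented for illustration, and a real system would use a proper
library and far more data.

```python
import math
from collections import Counter


def featurize(text):
    # Assumed feature scheme: lowercase word counts (bag of words).
    return Counter(text.lower().split())


def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme scores.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, text):
    """Return the modelled probability that text is an attack."""
    score = bias
    for word, count in featurize(text).items():
        score += weights.get(word, 0.0) * count
    return sigmoid(score)


def train(examples, epochs=200, learning_rate=0.5):
    """Fit weights with stochastic gradient descent on log loss."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            error = predict_proba(weights, bias, text) - label
            bias -= learning_rate * error
            for word, count in featurize(text).items():
                weights[word] = weights.get(word, 0.0) - learning_rate * error * count
    return weights, bias


# Invented toy data: 1 = personal attack, 0 = not an attack.
TOY_EXAMPLES = [
    ("you suck", 1),
    ("you are an idiot", 1),
    ("thanks for the edit", 0),
    ("please add a source", 0),
]
weights, bias = train(TOY_EXAMPLES)
attack_score = predict_proba(weights, bias, "you suck")
polite_score = predict_proba(weights, bias, "thanks for the source")
```

The output is a probability rather than a hard yes/no, so a
moderation tool built on such a model could pick its own cutoff
for when to raise a flag.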
Who is launching personal attacks?
The researchers unleashed their algorithm on Wikipedia comments
made during 2015, constantly checking results for accuracy. Almost
immediately, they found that they could debunk the time-worn idea
that anonymity leads to abuse. Although anonymous comments are
"six times more likely to be an attack," they represent less than
half of all attacks on Wikipedia. "Similarly, less than half of
attacks come from users with little prior participation," the
researchers write in their paper. "Perhaps surprisingly,
approximately 30% of attacks come from registered users with over
a 100 contributions." In other words, a third of all personal
attacks come from regular Wikipedia editors who contribute several
edits per month. Personal attacks seem to be baked into Wikipedia.
The researchers also found that an outsized percentage of attacks
come from a very small number of "highly toxic" Wikipedia
contributors. A whopping 9% of attacks in 2015 came from just 34
users who had made 20 or more personal attacks during the year.
"Significant progress could be made by moderating a relatively
small number of frequent attackers," the researchers note. This
finding bolsters the idea that problems in online communities
often come from a small minority of highly vocal users.
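
The "9% of attacks from 34 users" figure is a concentration
measure: count attacks per user, keep the users at or above a
cutoff, and divide their attacks by the total. A minimal Python
sketch of that arithmetic follows. The function name and the
sample data are hypothetical; only the cutoff of 20 attacks comes
from the article.

```python
from collections import Counter


def share_from_frequent_attackers(attack_authors, min_attacks=20):
    """Return (number of frequent attackers, their share of attacks).

    attack_authors: one user name per comment classified as an
    attack, so a user with five attacks appears five times.
    """
    counts = Counter(attack_authors)
    frequent = {user: n for user, n in counts.items() if n >= min_attacks}
    total = len(attack_authors)
    share = sum(frequent.values()) / total if total else 0.0
    return len(frequent), share


# Made-up data: two heavy attackers plus fifty one-off attackers.
attack_authors = (
    ["heavy1"] * 20 + ["heavy2"] * 30
    + ["user%d" % i for i in range(50)]
)
frequent_count, frequent_share = share_from_frequent_attackers(attack_authors)
```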
The algorithm was also able to identify a phenomenon often called
the "pile-on." They found that attacking comments are 22 times
more likely to occur close to another attacking comment. "Personal
attacks cluster together in time," the researchers write. "Perhaps
because one personal attack triggers another." Though this
shouldn't be surprising to anyone who has ever taken a peek at
Twitter, being able to quantify this behavior is a boon for
machine learning. It means that an algorithm might be able to
identify a pile-on before it really blows up, and moderators could
come in to de-escalate before things get really ugly.
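
One way to put a number on a pile-on is to compare how often a
comment is an attack when it closely follows another attack with
how often it is an attack otherwise. The Python sketch below does
that for a single thread. The article does not say how the
researchers defined "close," so the one-hour window, the function
name, and the sample timestamps are assumptions for illustration
only.

```python
def pile_on_ratio(comments, window_seconds=3600):
    """Ratio of attack rates: shortly after an attack vs. otherwise.

    comments: list of (timestamp_seconds, is_attack) sorted by time.
    Returns None when either group is empty or the baseline rate is
    zero, because the ratio is undefined in those cases.
    """
    near = [0, 0]  # [attacks, total] for comments soon after an attack
    far = [0, 0]   # [attacks, total] for all other comments
    last_attack_time = None

    for timestamp, is_attack in comments:
        follows_attack = (
            last_attack_time is not None
            and timestamp - last_attack_time <= window_seconds
        )
        bucket = near if follows_attack else far
        bucket[0] += int(is_attack)
        bucket[1] += 1
        if is_attack:
            last_attack_time = timestamp

    if near[1] == 0 or far[1] == 0 or far[0] == 0:
        return None
    return (near[0] / near[1]) / (far[0] / far[1])


# Made-up thread: two attacks 100 seconds apart, quiet otherwise.
thread = [
    (0, False), (10000, False), (20000, True), (20100, True),
    (20200, False), (40000, False), (50000, False),
]
ratio = pile_on_ratio(thread)
```

A ratio well above 1 on real data would point to the clustering
the researchers describe, though the exact value depends heavily
on the window chosen.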
Depressingly, the study also found that very few personal attacks
are moderated. Only 17.9% of personal attacks lead to a warning or
ban. Attackers are more likely to be moderated if they have
launched a number of attacks or have been moderated before. But
still, this is an abysmal rate of moderation for the most obvious
and blatant form of abuse that can happen in a community.
The researchers conclude their paper by calling for more research.
Wikipedia has released a dump of all talk-page comments to the
site between 2004 and 2015 via Figshare, so other researchers will
have access to the same dataset that the Jigsaw team did.
Understanding how attacks affect other users is urgent, say the
researchers. Do repeated attacks lead to user abandonment? Are
some groups attacked more often than others? The more we know, the
closer we get to having good tools to aid moderators. Such tools,
the researchers write, "might be used to help moderators build
dashboards that better visualize the health of Wikipedia
conversations or to develop systems to better triage comments for