The challenge of detecting hate speech

Updated Jun 11, 2021

Hate speech is especially difficult for technology and human review teams to detect. Idioms and nuances vary widely across cultures, languages and regions. People also sometimes share words that would normally be hate speech in order to raise awareness of the problem, or use them self-referentially in an effort to reclaim the term.

Those are the challenges of detecting hate speech in text alone. A lot of the hate speech we find on the Facebook app and Instagram is in photos or videos. A meme, for example, might use text and images together to attack a particular group of people. This is an even greater challenge for technology.

Detection gets harder still when people alter their content to avoid it. For example, they might misspell words, avoid certain phrases or modify their images and videos.

Progress in using artificial intelligence to detect hate speech

We have improved our tools for detecting hate speech over the last several years, so we now remove much of this content before people report it and, in some cases, before anyone sees it.

We use AI to identify images and text that are identical to content that we have already removed as hate speech. Our technology also looks at reactions and comments to assess how similar a piece of content is to hate speech we have seen before.
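
To illustrate the idea of matching identical content, here is a minimal sketch in Python. It is not a description of the production system: the names `fingerprint` and `RemovedContentIndex` are hypothetical, and the choice of SHA-256 over lightly normalized text is an assumption made for the example. The same approach could in principle be applied to the raw bytes of an image.

```python
import hashlib
import unicodedata


def fingerprint(text):
    """Return a SHA-256 hex digest of the text after light normalization."""
    # Assumption: NFKC folds equivalent Unicode forms, casefold ignores case,
    # and split/join collapses runs of whitespace, so trivial formatting
    # differences do not defeat an exact match.
    normalized = unicodedata.normalize("NFKC", text).casefold()
    normalized = " ".join(normalized.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class RemovedContentIndex:
    """Hypothetical store of fingerprints for content that was already removed."""

    def __init__(self):
        self._seen = set()

    def add(self, text):
        """Record a piece of removed content."""
        self._seen.add(fingerprint(text))

    def matches(self, text):
        """Return True if the text is identical to recorded content."""
        return fingerprint(text) in self._seen


index = RemovedContentIndex()
index.add("Example  Removed POST")
print(index.matches("example removed post"))  # same text, different case and spacing
print(index.matches("a different post"))
```

Storing only fingerprints means a lookup is a single set membership check, and the removed content itself does not need to be kept alongside the index.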

These techniques help our technology more accurately detect hate speech, even when the meaning is not obvious or the content is changed to avoid detection.
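
Exact matching fails as soon as a word is misspelled, so tolerant matching needs a similarity score rather than an equality check. The sketch below uses character trigram overlap (Jaccard similarity) purely as a stand-in to show the principle. It is not the AI-based technique described above, and the function names and the 0.6 threshold are assumptions chosen for the example.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of lowercased, whitespace-collapsed text."""
    cleaned = " ".join(text.casefold().split())
    if len(cleaned) < n:
        # Text shorter than n yields itself as the only gram (or nothing if empty).
        return {cleaned} if cleaned else set()
    return {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}


def jaccard_similarity(a, b, n=3):
    """Return shared n-grams divided by total distinct n-grams, from 0.0 to 1.0."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two empty strings are treated as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def is_near_duplicate(candidate, known, threshold=0.6):
    """Hypothetical check: flag the candidate if it scores at or above the threshold."""
    return jaccard_similarity(candidate, known) >= threshold


known = "example phrase here"
print(jaccard_similarity("exampel phrase here", known))      # altered spelling
print(jaccard_similarity("totally unrelated words", known))  # unrelated text
```

A misspelling changes only the few trigrams that overlap the edited characters, so the altered text still scores well above unrelated text. Where to set the threshold is a trade-off between missing altered content and flagging content that is merely similar.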