People who share encounters with racism are silenced online by humans and machines, but a guideline-reframing intervention holds promise

Proc Natl Acad Sci U S A. 2024 Sep 17;121(38):e2322764121. doi: 10.1073/pnas.2322764121. Epub 2024 Sep 9.

Abstract

Are members of marginalized communities silenced on social media when they share personal experiences of racism? Here, we investigate the role of algorithms, humans, and platform guidelines in suppressing disclosures of racial discrimination. In a field study of actual posts from a neighborhood-based social media platform, we find that when users talk about their experiences as targets of racism, their posts are disproportionately flagged for removal as toxic by five widely used moderation algorithms from major online platforms, including the most recent large language models. We show that human users disproportionately flag these disclosures for removal as well. Next, in a follow-up experiment, we demonstrate that merely witnessing such suppression negatively influences how Black Americans view the community and their place in it. Finally, to address these challenges to equity and inclusion in online spaces, we introduce a mitigation strategy: a guideline-reframing intervention that is effective at reducing silencing behavior across the political spectrum.

Keywords: content moderation; natural language processing (NLP); race; social media; toxicity classification.

MeSH terms

  • Algorithms
  • Black or African American
  • Humans
  • Racism*
  • Social Media*