CIDER: Context-sensitive polarity measurement for short-form text

James C Young; Rudy Arthur; Hywel T P Williams

doi:10.1371/journal.pone.0299490

PLoS One. 2024 Apr 18;19(4):e0299490. doi: 10.1371/journal.pone.0299490. eCollection 2024.

Authors

James C Young¹, Rudy Arthur¹, Hywel T P Williams¹

Affiliation

¹ Computer Science, Innovation Centre, University of Exeter, Exeter, United Kingdom.

Abstract

Researchers commonly perform sentiment analysis on large collections of short texts like tweets, Reddit posts or newspaper headlines that are all focused on a specific topic, theme or event. Usually, general-purpose sentiment analysis methods are used. These perform well on average but miss the variation in meaning that happens across different contexts. For example, the word "active" has a very different intention and valence in the phrase "active lifestyle" versus "active volcano". This work presents a new approach, CIDER (Context Informed Dictionary and sEmantic Reasoner), which performs context-sensitive linguistic analysis, where the valence of sentiment-laden terms is inferred from the whole corpus before being used to score the individual texts. In this paper, we detail the CIDER algorithm and demonstrate that it outperforms state-of-the-art generalist unsupervised sentiment analysis techniques on a large collection of tweets about the weather. CIDER is also applicable to alternative (non-sentiment) linguistic scales. A case study on gender in the UK is presented, with the identification of highly gendered and sentiment-laden days. We have made our implementation of CIDER available as a Python package: https://pypi.org/project/ciderpolarity/.
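
The idea of inferring word valence from the corpus before scoring individual texts can be illustrated with a toy sketch. This is not the CIDER algorithm and does not use the ciderpolarity package API; the functions infer_polarities and score_text, the seed words and the tiny corpora are all hypothetical and chosen only for illustration. Each non-seed word is assigned the average seed polarity of the texts it appears in, so "active" ends up with opposite signs in a volcano corpus and a fitness corpus.

```python
from collections import defaultdict


def infer_polarities(texts, seeds):
    """Toy corpus-level inference (an assumption, not the paper's method).

    Each non-seed word receives the mean polarity of the seed words
    found in the texts where it occurs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text in texts:
        words = set(text.lower().split())
        seed_values = [seeds[w] for w in words if w in seeds]
        if not seed_values:
            continue  # a text without seed words carries no signal here
        text_polarity = sum(seed_values) / len(seed_values)
        for w in words:
            if w not in seeds:
                totals[w] += text_polarity
                counts[w] += 1
    polarities = dict(seeds)
    for w, total in totals.items():
        polarities[w] = total / counts[w]
    return polarities


def score_text(text, polarities):
    """Score a text as the mean polarity of its known words (0.0 if none)."""
    values = [polarities[w] for w in text.lower().split() if w in polarities]
    return sum(values) / len(values) if values else 0.0


# Hypothetical seed lexicon and two small topic-specific corpora.
seeds = {"dangerous": -1.0, "lovely": 1.0}
volcano_corpus = ["active volcano dangerous", "active eruption dangerous"]
fitness_corpus = ["active lifestyle lovely", "active morning lovely"]

volcano_polarities = infer_polarities(volcano_corpus, seeds)
fitness_polarities = infer_polarities(fitness_corpus, seeds)

volcano_score = score_text("active volcano", volcano_polarities)
fitness_score = score_text("active lifestyle", fitness_polarities)
```

In this sketch the same word receives a different polarity depending on which corpus it was learned from, which is the behaviour a single general-purpose lexicon cannot reproduce.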

Copyright: © 2024 Young et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

MeSH terms

Algorithms
Gender Identity
Semantics
Sentiment Analysis
Social Media*

Grants and funding

H.T.P.W. acknowledges funding from UK Natural Environment Research Council (NE/P017436/1). J.C.Y. is funded by a PhD studentship from the UK Engineering and Physical Sciences Research Council. No funding bodies had any influence over the content of this report.