Advancements and challenges in Arabic sentiment analysis: A decade of methodologies, applications, and resource development

Heliyon. 2024 Oct 24;10(21):e39786. doi: 10.1016/j.heliyon.2024.e39786. eCollection 2024 Nov 15.

Abstract

The exponential growth of digital information, particularly user-generated content on social media and blogging platforms, has underscored the importance of sentiment analysis (SA). Arabic sentiment analysis (ASA) identifies the orientation of ideas, feelings, emotions, and attitudes in Arabic text to determine whether they convey positive, negative, or neutral sentiment. This paper presents a comprehensive review of the past decade of SA research on the Arabic language. It examines the applications, methodologies, and challenges associated with ASA, highlighting gaps and limitations in existing approaches, lexicons, and annotated datasets. The primary objective of this review is to help researchers identify these gaps and limitations while pointing them to accessible annotated datasets, preprocessing techniques, and procedures. We defined specific criteria for selecting research publications from the last 10 years and included 150 papers in the review, excluding earlier publications. The search drew on multiple databases, including Google Scholar, Scopus, and Web of Science. The inherent complexity of Arabic, with its unique traits, morphological variation, and diverse dialects, presents significant linguistic challenges for ASA. Moreover, the lack of annotated datasets, lexicon resources, and programming tools further complicates sentiment analysis in Arabic. To address these issues, it is crucial to develop additional resources and construct new Arabic sentiment lexicons that cover the various dialects in addition to Modern Standard Arabic (MSA). Our findings reveal that no standard public lexicon adequately supports ASA across different domains, such as e-commerce, politics, public health, and marketing.

Keywords: Arabic sentiment analysis; Lexical tools; Machine learning.

Publication types

  • Review