A Sentiment Analysis Approach to Predict an Individual's Awareness of the Precautionary Procedures to Prevent COVID-19 Outbreaks in Saudi Arabia

Int J Environ Res Public Health. 2020 Dec 30;18(1):218. doi: 10.3390/ijerph18010218.

Abstract

In March 2020, the World Health Organization (WHO) declared the outbreak of coronavirus disease 2019 (COVID-19) a pandemic, one that affected countries worldwide. During the outbreak, analyses of public sentiment contributed valuable information for shaping appropriate public health responses. This study aims to develop a model that predicts individuals' awareness of the precautionary procedures in the five main regions of Saudi Arabia. A dataset of Arabic COVID-19-related tweets posted during the curfew period was collected. The dataset was processed and classified with several machine learning predictive models, namely Support Vector Machine (SVM), K-nearest neighbors (KNN), and Naïve Bayes (NB), combined with the N-gram feature extraction technique. The results show that the SVM classifier with bigram Term Frequency-Inverse Document Frequency (TF-IDF) features outperformed the other models, with an accuracy of 85%. The awareness prediction results showed that the south region had the highest level of awareness of COVID-19 containment measures, whereas the middle region had the lowest. The proposed model can help the medical sector and decision-makers choose appropriate procedures for each region based on attitudes in that region towards the pandemic.

Keywords: Arabic sentiment analysis; K-nearest neighbor; N-gram; Twitter; machine learning; natural language processing; naïve bayes; support vector machine.

MeSH terms

  • Bayes Theorem
  • COVID-19 / prevention & control*
  • Disease Outbreaks / prevention & control*
  • Health Knowledge, Attitudes, Practice*
  • Humans
  • Public Health
  • Saudi Arabia / epidemiology
  • Support Vector Machine