A bibliometric analysis of 23,492 publications on rectal cancer by machine learning: basic medical research is needed

Therap Adv Gastroenterol. 2020 Jul 27;13:1756284820934594. doi: 10.1177/1756284820934594. eCollection 2020.

Abstract

Background and aims: The aim of this study was to analyse the landscape of publications on rectal cancer (RC) over the past 25 years by machine learning and semantic analysis.

Methods: Publications indexed in PubMed under the Medical Subject Headings (MeSH) term 'Rectal Neoplasms' from 1994 to 2018 were downloaded in September 2019. R and Python were used to extract the publication date, MeSH terms and abstract from the metadata of each publication for bibliometric assessment. Latent Dirichlet allocation was applied to the text of the articles' abstracts to identify more specific research topics. The Louvain algorithm was used to build a topic network and identify the relationships between the topics.
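The metadata extraction step can be illustrated with a minimal Python sketch. It assumes the records were exported from PubMed as XML, and the sample record, the function name `extract_records` and the choice of the standard-library parser are illustrative assumptions, not the authors' actual pipeline. The topic-modelling and network steps are not covered here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written record in the general shape of a PubMed XML export.
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>0</PMID>
      <Article>
        <Journal><JournalIssue><PubDate><Year>2018</Year></PubDate></JournalIssue></Journal>
        <Abstract><AbstractText>Placeholder abstract text.</AbstractText></Abstract>
      </Article>
      <MeshHeadingList>
        <MeshHeading><DescriptorName>Rectal Neoplasms</DescriptorName></MeshHeading>
        <MeshHeading><DescriptorName>Humans</DescriptorName></MeshHeading>
      </MeshHeadingList>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def extract_records(xml_text):
    """Return one dict per article with its year, MeSH descriptors and abstract."""
    root = ET.fromstring(xml_text)
    records = []
    for article in root.iter("PubmedArticle"):
        # Year is None when the record carries no PubDate/Year element.
        year = article.findtext(".//PubDate/Year")
        mesh = [d.text for d in article.iter("DescriptorName")]
        # Structured abstracts have several AbstractText parts; join them.
        abstract = " ".join(
            "".join(part.itertext()).strip()
            for part in article.iter("AbstractText")
        )
        records.append({"year": year, "mesh": mesh, "abstract": abstract})
    return records


records = extract_records(SAMPLE_XML)
print(records[0]["year"], records[0]["mesh"])
```

Each resulting dictionary holds the three fields named in the Methods, so the abstracts can be passed to a topic model and the MeSH terms to a frequency analysis.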

Results: A total of 23,492 published papers were identified and analysed in this study. Changes in research focus were traced through changes in the MeSH terms assigned to the publications. The content extracted from the publications fell into five areas: surgical intervention, radiotherapy and chemotherapy intervention, clinical case management, epidemiology and cancer risk, and prognosis studies.
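One way to trace a shift in research focus from MeSH terms is to compare the share of papers carrying each term in an early and a late period. The sketch below is only an illustration under stated assumptions: the toy records, the placeholder term names, the function `term_share` and the 1994-2005 / 2006-2018 split are hypothetical and are not taken from the study.

```python
from collections import Counter

# Hypothetical toy records with placeholder term names (not study data).
records = [
    {"year": 1995, "mesh": ["term_a", "term_b"]},
    {"year": 1996, "mesh": ["term_a", "term_b"]},
    {"year": 2017, "mesh": ["term_a", "term_c"]},
    {"year": 2018, "mesh": ["term_a", "term_b", "term_c"]},
]


def term_share(records, start, end):
    """Fraction of papers published in [start, end] that carry each term."""
    in_period = [r for r in records if start <= r["year"] <= end]
    if not in_period:
        return {}
    # set() ensures a term is counted at most once per paper.
    counts = Counter(t for r in in_period for t in set(r["mesh"]))
    return {t: n / len(in_period) for t, n in counts.items()}


# The period boundaries are an arbitrary assumption for illustration.
early = term_share(records, 1994, 2005)
late = term_share(records, 2006, 2018)

# Positive values mean the term became more common in the later period.
change = {t: late.get(t, 0.0) - early.get(t, 0.0) for t in set(early) | set(late)}
for term in sorted(change):
    print(term, change[term])
```

Using shares rather than raw counts keeps the comparison meaningful when the yearly number of publications grows, as it did over the study period.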

Conclusions: The number of publications indexed on RC has expanded rapidly over the past 25 years. Studies on RC have mainly focused on five areas. However, studies on basic research, postoperative quality of life and cost-effectiveness were relatively lacking. Basic research, inflammation and some other research fields are predicted to be potential hotspots in the future.

Keywords: LDA analyses; bibliometric analysis; machine learning; rectal cancer.