Neural classification of Norwegian radiology reports: using NLP to detect findings in CT-scans of children

BMC Med Inform Decis Mak. 2021 Mar 4;21(1):84. doi: 10.1186/s12911-021-01451-8.

Abstract

Background: For quality assurance purposes, machine learning models were trained to classify Norwegian radiology reports of paediatric CT examinations according to whether they describe abnormal findings.

Methods: 13,506 reports from CT scans of children, 1000 reports from CT scans of adults and 1000 reports from X-ray examinations of adults were classified as positive or negative by a radiologist, according to the presence of abnormal findings. Inter-rater reliability was evaluated by comparison with a clinician's classifications of 500 reports. Test-retest reliability of the radiologist was assessed on the same 500 reports. A convolutional neural network model (CNN), a bidirectional recurrent neural network model (bi-LSTM) and a support vector machine model (SVM) were trained on a random selection of the children's data set. Models were evaluated on the remaining children's CT reports and on the adult data sets.

Results: Test-retest reliability: Cohen's Kappa = 0.86 and F1 = 0.919. Inter-rater reliability: Kappa = 0.80 and F1 = 0.885. Model performances on the Children-CT data were as follows. CNN: (AUC = 0.981, F1 = 0.930), bi-LSTM: (AUC = 0.978, F1 = 0.927), SVM: (AUC = 0.975, F1 = 0.912). On the adult data sets, the models had AUC around 0.95 and F1 around 0.91.
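The agreement statistics reported above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the toy labels are assumptions made for illustration, and the abstract does not state which set of labels was treated as the reference when computing F1.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of reports given the same label twice.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single, identical label throughout
    return (observed - expected) / (1.0 - expected)


def f1_score(reference, predicted, positive=1):
    """F1 for the positive class, treating `reference` as ground truth."""
    pairs = list(zip(reference, predicted))
    tp = sum(r == positive and p == positive for r, p in pairs)
    fp = sum(r != positive and p == positive for r, p in pairs)
    fn = sum(r == positive and p != positive for r, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy labels (1 = abnormal finding, 0 = no finding); not data from the study.
first_reading = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
second_reading = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

kappa = cohen_kappa(first_reading, second_reading)
f1 = f1_score(first_reading, second_reading)
print(f"kappa = {kappa:.2f}, F1 = {f1:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is lower than F1 for the same pair of readings.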

Conclusions: The models performed close to perfectly on their defined domain, and also performed convincingly on reports pertaining to a different patient group and a different modality. The models were deemed suitable for classifying radiology reports for future quality assurance purposes, where the fraction of examinations with abnormal findings in different sub-groups of patients is a parameter of interest.
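The quality assurance parameter named in the conclusions, the fraction of examinations with abnormal findings per sub-group, could be derived from classifier output along the following lines. This is a sketch under stated assumptions: the sub-group names, the record layout and the function name are hypothetical and do not come from the study.

```python
from collections import defaultdict


def positive_fraction_by_group(records):
    """Map each sub-group to the fraction of its reports classified positive.

    `records` is an iterable of (group, label) pairs with label 1 for a
    report describing abnormal findings and 0 otherwise.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for group, label in records:
        counts[group][0] += int(label == 1)
        counts[group][1] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}


# Hypothetical classifier output; the age bands are illustrative only.
classified = [
    ("0-5 years", 1), ("0-5 years", 0), ("0-5 years", 0),
    ("6-17 years", 1), ("6-17 years", 1), ("6-17 years", 0),
]

fractions = positive_fraction_by_group(classified)
for group, fraction in sorted(fractions.items()):
    print(f"{group}: {fraction:.2f}")
```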

Keywords: Machine learning; Natural language processing; Reproducibility of results; Tomography, X-ray computed.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Child
  • Humans
  • Neural Networks, Computer
  • Radiography
  • Radiology*
  • Reproducibility of Results
  • Tomography, X-Ray Computed*