Development of machine learning model to predict pulmonary function with low-dose CT-derived parameter response mapping in a community-based chest screening cohort

J Appl Clin Med Phys. 2023 Nov;24(11):e14171. doi: 10.1002/acm2.14171. Epub 2023 Oct 2.

Abstract

Purpose: To construct a machine learning model based on parametric response mapping (PRM) derived from low-dose computed tomography (LDCT) and to evaluate its performance in predicting pulmonary function test (PFT) results.

Materials and methods: A total of 615 subjects (40-74 years old) from a community-based screening population were enrolled retrospectively. All had PFT parameters, including the ratio of forced expiratory volume in the first second to forced vital capacity (FEV1/FVC) and FEV1 as a percentage of the predicted value (FEV1%), as well as registered inspiratory-to-expiratory chest CT scans. Subjects were classified into normal, high-risk, and chronic obstructive pulmonary disease (COPD) groups based on PFT. A total of 72 PRM-derived quantitative parameters were collected, including the volume and volume percentage of emphysema, functional small airways disease, and normal lung tissue. A random forest regression model and a multilayer perceptron (MLP) model were constructed and tested on PFT prediction, followed by evaluation of classification performance based on the predicted PFT values.
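The PRM parameters described above come from labelling each voxel of the registered inspiratory and expiratory scans and then summarizing the labels as volumes and volume percentages. The following is a minimal sketch of that idea, not the study's pipeline: the -950 HU inspiratory and -856 HU expiratory cutoffs are commonly cited PRM thresholds that are assumed here because the abstract does not state them, and the function names and sample voxel values are hypothetical.

```python
from collections import Counter

# Assumed thresholds in Hounsfield units (not given in the abstract).
INSP_EMPHYSEMA_HU = -950     # inspiratory cutoff for emphysema
EXP_AIR_TRAPPING_HU = -856   # expiratory cutoff for air trapping

LABELS = ("emphysema", "fSAD", "normal", "unclassified")


def classify_voxel(insp_hu, exp_hu):
    """Label one registered inspiratory/expiratory voxel pair."""
    if exp_hu < EXP_AIR_TRAPPING_HU:
        # Air is trapped on expiration: emphysema if the voxel is also
        # very low density on inspiration, otherwise functional small
        # airways disease (fSAD).
        if insp_hu < INSP_EMPHYSEMA_HU:
            return "emphysema"
        return "fSAD"
    if insp_hu >= INSP_EMPHYSEMA_HU:
        return "normal"
    return "unclassified"


def prm_percentages(voxel_pairs):
    """Return the percentage of voxels carrying each PRM label."""
    counts = Counter(classify_voxel(i, e) for i, e in voxel_pairs)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no voxels supplied")
    return {label: 100.0 * counts[label] / total for label in LABELS}


# Hypothetical (inspiration HU, expiration HU) pairs for illustration only.
pairs = [(-960, -900), (-900, -870), (-880, -800), (-850, -700)]
print(prm_percentages(pairs))
```

In a real workflow the voxel pairs would come from deformably registered lung masks, and the resulting percentages (together with the corresponding volumes) would serve as input features for the regression models.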

Results: The random forest model based on PRM parameters predicted PFT better than the MLP, with a coefficient of determination (R²) of 0.749 and 0.792 for FEV1/FVC and FEV1%, respectively. For the random forest model, the mean squared errors (MSE) for FEV1/FVC and FEV1% were 0.0030 and 0.0097, and the root mean squared errors (RMSE) were 0.055 and 0.098, respectively. The sensitivity, specificity, and accuracy for differentiating between the normal group and the high-risk group were 34/40 (85%), 65/72 (90%), and 99/112 (88%), respectively. For differentiating between the non-COPD group and the COPD group, the sensitivity, specificity, and accuracy were 8/9 (89%), 112/112 (100%), and 120/121 (99%), respectively.
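The reported figures can be cross-checked with simple arithmetic: RMSE is the square root of MSE, and sensitivity, specificity, and accuracy follow from the counts given in the fractions. The sketch below assumes the confusion-matrix cells implied by those fractions (for example, 34 of 40 high-risk subjects detected means 6 missed); the helper name `diagnostic_metrics` is illustrative.

```python
import math


def diagnostic_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, and accuracy from cell counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# RMSE should equal sqrt(MSE) for each predicted PFT parameter.
rmse_fev1_fvc = math.sqrt(0.0030)
rmse_fev1_pct = math.sqrt(0.0097)
print(round(rmse_fev1_fvc, 3), round(rmse_fev1_pct, 3))

# Normal vs. high-risk: 34/40 detected, 65/72 correctly ruled out.
high_risk = diagnostic_metrics(tp=34, fn=6, tn=65, fp=7)

# Non-COPD vs. COPD: 8/9 detected, 112/112 correctly ruled out.
copd = diagnostic_metrics(tp=8, fn=1, tn=112, fp=0)

for name, metrics in (("high-risk", high_risk), ("COPD", copd)):
    summary = ", ".join(f"{k} {v:.1%}" for k, v in metrics.items())
    print(f"{name}: {summary}")
```

Both RMSE values round to the figures quoted in the results, and the accuracy denominators (112 and 121) match the sums of the sensitivity and specificity denominators.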

Conclusions: The machine learning-based random forest model predicts PFT results from PRM in a community screening population. It distinguishes high-risk subjects from the normal population with high sensitivity and reliably predicts high-risk COPD.

Keywords: chronic obstructive pulmonary disease; pulmonary function test; quantitative imaging; X-ray computed tomography.

MeSH terms

  • Adult
  • Aged
  • Forced Expiratory Volume / physiology
  • Humans
  • Lung* / diagnostic imaging
  • Middle Aged
  • Pulmonary Disease, Chronic Obstructive* / diagnostic imaging
  • Retrospective Studies
  • Tomography, X-Ray Computed / methods