An analysis of security vulnerabilities in container images for scientific data analysis

Gigascience. 2021 Jun 3;10(6):giab025. doi: 10.1093/gigascience/giab025.

Abstract

Background: Software containers greatly facilitate the deployment and reproducibility of scientific data analyses across various platforms. However, container images often contain outdated or unnecessary software packages, which increases the number of security vulnerabilities in the images, widens the attack surface in the container host, and creates substantial security risks for computing infrastructures at large. This article presents a vulnerability analysis of container images for scientific data analysis. We compare results obtained with 4 vulnerability scanners, focusing on the use case of neuroscience data analysis, and quantify the effect of image update and minification on the number of vulnerabilities.

Results: We find that container images used for neuroscience data analysis contain hundreds of vulnerabilities, that software updates remove roughly two-thirds of these vulnerabilities, and that removing unused packages is also effective.
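As a rough illustration of how a reduction such as "roughly two-thirds" can be computed, the sketch below compares the set of vulnerability identifiers reported for an image before and after an update. It is a minimal sketch and not the authors' method: the function name `fraction_removed` and the placeholder identifiers are hypothetical, and the numbers are invented for the example rather than taken from the study.

```python
def fraction_removed(before: set[str], after: set[str]) -> float:
    """Return the fraction of vulnerabilities in `before` that are absent from `after`."""
    if not before:
        # Nothing to remove, so report no reduction instead of dividing by zero.
        return 0.0
    return len(before - after) / len(before)


# Hypothetical placeholder identifiers, not real scan output.
original = {"CVE-A", "CVE-B", "CVE-C", "CVE-D", "CVE-E", "CVE-F"}
updated = {"CVE-A", "CVE-B"}

print(f"{fraction_removed(original, updated):.2f}")  # prints 0.67
```

Using sets of identifiers rather than raw counts means that a vulnerability introduced by the update is not mistaken for one that was already present.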

Conclusions: We provide recommendations on how to build container images with fewer vulnerabilities.

Keywords: Docker; containers; neuroimaging; security vulnerabilities; Singularity.

MeSH terms

  • Data Analysis*
  • Reproducibility of Results
  • Software*