Exploring COVID-19 pathogenesis on command-line: A bioinformatics pipeline for handling and integrating omics data

Adv Protein Chem Struct Biol. 2022:131:311-339. doi: 10.1016/bs.apcsb.2022.04.002. Epub 2022 May 12.

Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was first identified in late 2019 in Wuhan, China, and has proven to be highly pathogenic, making it a global public health threat. The immediate need to understand the mechanisms and impact of the virus made omics techniques stand out, as they can offer a holistic and comprehensive view of thousands of molecules in a single experiment. Mastering bioinformatics tools to process, analyze, integrate, and interpret omics data is a powerful way to enrich results. We present a robust, open-access computational pipeline for extracting information from public quantitative proteomics and transcriptomics data, covering every step from raw data to differentially expressed genes. We explore the processes and pathways related to the mapped transcripts and proteins. We also present a pipeline to integrate and compare proteomics and transcriptomics data using packages available in Bioconductor, and we provide the code used. Cholesterol metabolism, immune system activity, extracellular matrix (ECM), and proteasomal degradation pathways were increased in infected patients. A leukocyte activation profile was overrepresented in both the proteomics and the transcriptomics data. Finally, we found a panel of proteins and transcripts regulated in the same direction in the lung transcriptome and plasma proteome that distinguishes healthy from infected individuals. This panel of markers was confirmed in another cohort of patients, thus validating the robustness and functionality of the tools presented.
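The idea of a panel "regulated in the same direction" in two omics layers can be illustrated with a minimal sketch. This is not the authors' pipeline, which relies on Bioconductor packages; the function name `concordant_panel`, the `min_abs_lfc` cutoff, and all gene labels and values below are hypothetical and chosen only for illustration. It assumes each layer has already been reduced to a mapping from gene symbol to log2 fold change (infected versus healthy).

```python
def concordant_panel(transcript_lfc, protein_lfc, min_abs_lfc=1.0):
    """Return genes whose log2 fold changes share a sign in both layers.

    Both arguments map a gene symbol to a log2 fold change. The cutoff
    min_abs_lfc is an assumed threshold, not one taken from the article.
    """
    panel = {}
    # Only genes quantified in both layers can be compared.
    for gene in sorted(transcript_lfc.keys() & protein_lfc.keys()):
        t = transcript_lfc[gene]
        p = protein_lfc[gene]
        # Keep the gene if both changes pass the cutoff and agree in sign.
        if abs(t) >= min_abs_lfc and abs(p) >= min_abs_lfc and t * p > 0:
            panel[gene] = "up" if t > 0 else "down"
    return panel


# Made-up values for illustration only; these are not results from the study.
transcripts = {"GENE_A": 2.1, "GENE_B": -1.4, "GENE_C": 1.8, "GENE_D": 0.2}
proteins = {"GENE_A": 1.3, "GENE_B": -1.1, "GENE_C": -1.6, "GENE_E": 2.0}

panel = concordant_panel(transcripts, proteins)
print(panel)
```

In this toy input, GENE_C is dropped because its two layers disagree in sign, and GENE_D and GENE_E are dropped because each is missing from one layer. A real analysis would also filter on adjusted p-values before comparing directions.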

Keywords: COVID-19; Data integration; Proteomics; SARS-CoV-2; Transcriptomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19* / genetics
  • Computational Biology
  • Humans
  • Proteome / metabolism
  • Proteomics / methods
  • SARS-CoV-2 / genetics

Substances

  • Proteome