Being able to link clinical outcomes to SARS-CoV-2 virus strains is a critical component of understanding COVID-19. Here, we discuss how current processes hamper the sustainable data collection needed for meaningful analysis and insights. Following the 'Fast Healthcare Interoperability Resources' (FHIR) implementation guide, we introduce an ontology-based standard questionnaire to overcome these shortcomings and describe patient 'journeys' in coordination with the World Health Organization's recommendations. We identify the steps in the clinical health data acquisition cycle and workflows that are likely to have the biggest impact on the data-driven understanding of this virus. Specifically, we recommend recording detailed symptoms and medical history using the FHIR standards. We have taken the first steps towards this by making patient status mandatory in GISAID ('Global Initiative on Sharing All Influenza Data'), which immediately resulted in a measurable increase in the fraction of cases with useful patient information. The main remaining limitation is the lack of a controlled vocabulary or medical ontology.
Keywords: COVID-19; GISAID; SARS-CoV-2; genome sequence; ontology; patient information.
© 2020 The Authors. Transboundary and Emerging Diseases published by Wiley-VCH GmbH.