Objectives: Segmentation is a crucial task in medical imaging. Deep learning based on convolutional neural networks has shown promising results. However, the absence of large-scale datasets and a high degree of inter- and intra-observer variability pose a bottleneck. Crowdsourcing might be an alternative, as it allows many non-experts to provide reference annotations. We aim to compare different types of crowdsourcing for medical image segmentation.
Methods: We develop a crowdsourcing platform that integrates citizen science (incentive: participation in research), paid microtasks (incentive: financial reward), and gamification (incentive: entertainment). For evaluation, we choose the use case of sclera segmentation in fundus images as a proof of concept and analyze both the accuracy of the crowdsourced masks and the generalization of deep learning models trained on crowdsourced masks.
Results: The developed platform is suited to all three types of crowdsourcing and offers an easy and intuitive way to implement crowdsourcing studies. In the proof-of-concept study, citizen science, paid microtasks, and gamification yield median F-scores of 82.2%, 69.4%, and 69.3% against expert-labeled ground truth, respectively. Generating consensus masks improves the gamification result (78.3%). Despite the small training set (50 images), deep learning reaches median F-scores of 80.0%, 73.5%, and 76.5% for citizen science, paid microtasks, and gamification, respectively, indicating sufficient generalizability.
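The abstract does not specify how the consensus masks or the F-scores are computed. As a minimal sketch, assuming pixel-wise majority voting over the crowd masks and a pixel-level F-score (equivalent to the Dice coefficient for binary masks), the evaluation could look as follows; the function names and the 0.5 voting threshold are illustrative assumptions, not part of the original work.

```python
import numpy as np

def consensus_mask(masks, threshold=0.5):
    # Pixel-wise majority vote over a list of binary crowd masks (assumed approach).
    stack = np.stack(masks).astype(float)          # shape: (n_annotators, H, W)
    return (stack.mean(axis=0) >= threshold).astype(np.uint8)

def f_score(pred, truth):
    # Pixel-level F-score against an expert-labeled ground-truth mask.
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, combining several noisy crowd masks by majority vote tends to suppress individual annotation errors, which is consistent with the reported improvement of the gamification masks after consensus generation.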
Conclusions: As the platform has proven useful, we aim to make it available as open-source software for other researchers.
Keywords: crowdsourcing; deep learning; image segmentation; platform.