Manual annotation of ultrasound images relies on expert knowledge and demands significant time and financial resources. Semi-supervised learning (SSL) exploits large amounts of unlabeled data to improve model performance when labeled data are limited. However, it faces two challenges: fusing contextual information across multiple scales and correcting the bias of spatial information among multiple objects. We propose a consistency-learning-based multi-scale multi-object (MSMO) semi-supervised framework for ultrasound image segmentation. MSMO addresses these challenges with a context-aware encoder coupled with a multi-object semantic calibration and fusion decoder. First, the encoder extracts multi-scale, multi-object context-aware features and introduces an attention module to refine the feature maps and enhance channel-wise information interaction. Then, the decoder uses an HConvLSTM to calibrate the output features of the current object with the hidden state of the previous object, recursively fusing multi-object semantics at different scales. Finally, MSMO further reduces the variation among multiple decoders under different perturbations through consistency constraints, thereby producing consistent predictions for highly uncertain regions. Extensive experiments show that the proposed MSMO outperforms SSL baselines on four benchmark datasets for both single-object and multi-object ultrasound image segmentation. MSMO significantly reduces the burden of manual ultrasound image analysis and holds great potential as a clinical tool. The source code is publicly available at: https://github.com/lol88/MSMO.
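The consistency-constraint step described above can be sketched as a pairwise agreement loss among the probability maps produced by several decoders under different perturbations. The function name, the pairwise mean-squared-error formulation, and the toy array shapes below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def consistency_loss(predictions):
    """Mean pairwise MSE among decoder output maps (hypothetical formulation).

    A lower value means the perturbed decoders agree more closely,
    which is the goal of the consistency constraint.
    """
    n = len(predictions)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += np.mean((predictions[i] - predictions[j]) ** 2)
            pairs += 1
    return total / pairs

# Toy example: three decoders, each seeing a different feature perturbation,
# predict per-pixel foreground probability on a 4x4 map.
rng = np.random.default_rng(0)
base = rng.random((4, 4))
preds = [np.clip(base + rng.normal(0, 0.05, (4, 4)), 0, 1) for _ in range(3)]
loss = consistency_loss(preds)
```

Minimizing such a term on unlabeled images pushes the decoders toward a shared prediction in regions where supervision is absent, which is how consistency-based SSL exploits unlabeled data.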
Keywords: Consistent learning; Multi-object; Multi-scale; Semi-supervised learning; Ultrasound image segmentation.
Copyright © 2024 Elsevier Ltd. All rights reserved.