To address the difficulties of recognizing green fruit in machine vision systems, a new fruit detection model is proposed. The model, named FCOS-LSC, optimizes the FCOS (fully convolutional one-stage object detection) algorithm by incorporating LSC (level scales, spaces, channels) attention blocks into the network structure. The method achieves efficient recognition and localization of green fruit in images affected by overlapping occlusions, lighting conditions, and capture angles. Specifically, an improved ResNet50 feature extraction network with added deformable convolution is used to extract green fruit feature information. A feature pyramid network (FPN) then fuses low-level detail information with high-level semantic information through cross connections and top-down connections. Next, attention mechanisms are applied to each of the 3 dimensions of the generated multiscale feature map, namely scale, space (the height and width of the feature map), and channel, to improve the feature perception capability of the network. Finally, the classification and regression subnetworks of the model predict the fruit category and bounding box. In the classification branch, a new positive and negative sample selection strategy is applied: weights are designed in the loss function so that the supervised signals are better distinguished, which yields more accurate fruit detection. The proposed FCOS-LSC model has 38.65M parameters and 38.72G floating point operations, and it reaches a mean average precision of 63.0% for green apples and 75.2% for green persimmons. In summary, FCOS-LSC outperforms the state-of-the-art models in terms of precision and complexity, meeting the accuracy and efficiency requirements of green fruit recognition with intelligent agricultural equipment. FCOS-LSC can therefore be used to improve the robustness and generalization of green fruit detection models.
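The idea of gating a multiscale feature map along scale, space, and channel can be illustrated with a minimal sketch. It is not the FCOS-LSC implementation: the abstract does not specify the internals of the LSC blocks, so the sketch assumes parameter-free sigmoid gates computed from mean-pooled activations, applied in the order scale, space, channel. All function names (`scale_gate`, `spatial_gate`, `channel_gate`, `lsc_sketch`) and the toy pyramid are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# A pyramid level is stored as nested lists: level[channel][row][col].

def scale_gate(level):
    # Assumption: one gate per pyramid level, from the mean of the whole level.
    g = sigmoid(mean(v for ch in level for row in ch for v in row))
    return [[[v * g for v in row] for row in ch] for ch in level]

def spatial_gate(level):
    # Assumption: one gate per (row, col) position, from the mean over channels.
    C, H, W = len(level), len(level[0]), len(level[0][0])
    gates = [[sigmoid(mean(level[c][y][x] for c in range(C)))
              for x in range(W)] for y in range(H)]
    return [[[level[c][y][x] * gates[y][x] for x in range(W)]
             for y in range(H)] for c in range(C)]

def channel_gate(level):
    # Assumption: one gate per channel, from the mean over spatial positions.
    out = []
    for ch in level:
        g = sigmoid(mean(v for row in ch for v in row))
        out.append([[v * g for v in row] for row in ch])
    return out

def lsc_sketch(pyramid):
    # Apply the three gates in sequence to every level of the pyramid.
    return [channel_gate(spatial_gate(scale_gate(level))) for level in pyramid]

# Toy pyramid: level 0 has 2 channels of 2x2, level 1 has 2 channels of 1x1.
pyramid = [
    [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]],
    [[[2.0]], [[-1.0]]],
]
refined = lsc_sketch(pyramid)
```

Each gate lies strictly between 0 and 1, so the sketch only rescales activations and preserves the shape of every level. A trained attention block would instead learn its gates from data.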
Copyright © 2023 Ruina Zhao et al.