Recently, computer vision methods have been widely applied to agricultural tasks, such as robotic harvesting. In particular, fruit harvesting robots often rely on object detection or segmentation to identify and localize target fruits. During model selection for object detection, the average precision (AP) score typically serves as the de facto standard. However, AP is not intuitive for determining which model is best suited for robotic harvesting. It is based on the intersection-over-union (IoU) of bounding boxes, which reflects only regional overlap. IoU alone cannot reliably predict the success of robotic gripping, as identical IoU scores may yield different results depending on how the boxes overlap. In this paper, we propose a novel evaluation metric for robotic harvesting. To assess gripping success, our metric uses the center coordinates of bounding boxes and a margin hyperparameter that accounts for the gripper's specifications. We evaluated popular object detection models on peach and apple datasets. The experimental results showed that the proposed gripping success metric is much more intuitive and helpful for interpreting model performance.
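The abstract does not give the exact formulation of the metric, but the idea of judging gripping success from box centers and a gripper-dependent margin can be illustrated with a minimal sketch. Here the success criterion is assumed to be the Euclidean distance between the predicted and ground-truth box centers falling within the margin; the function name, box format, and distance choice are illustrative assumptions, not the paper's definition.

```python
import math

def gripping_success(pred_box, gt_box, margin):
    """Hypothetical sketch of a center-based gripping success test.

    A detection counts as a successful grip if the predicted box center
    lies within `margin` pixels of the ground-truth box center. The
    margin is a hyperparameter reflecting the gripper's tolerance.
    Boxes are (x_min, y_min, x_max, y_max).
    """
    px = (pred_box[0] + pred_box[2]) / 2
    py = (pred_box[1] + pred_box[3]) / 2
    gx = (gt_box[0] + gt_box[2]) / 2
    gy = (gt_box[1] + gt_box[3]) / 2
    return math.hypot(px - gx, py - gy) <= margin

# Two predictions with comparable overlap but different center offsets:
gt = (100, 100, 200, 200)
print(gripping_success((105, 105, 205, 205), gt, margin=10))  # True
print(gripping_success((140, 140, 240, 240), gt, margin=10))  # False
```

This illustrates the abstract's point: an IoU threshold alone cannot distinguish these two cases as cleanly, whereas a center-plus-margin criterion maps directly to whether the gripper can reach the fruit.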
Keywords: computer vision; grasp detection; object detection; robot harvesting; robotic grasping.