Objective: To investigate the feasibility and accuracy of computer vision-based artificial intelligence for detecting and recognizing instruments and organs during laparoscopic radical gastrectomy for gastric cancer.
Methods: Eight complete surgical videos of laparoscopic distal radical gastrectomy were collected from four large tertiary hospitals in China: First Medical Center of Chinese PLA General Hospital (three cases), Liaoning Cancer Hospital (two cases), Liyang Branch of Jiangsu Province People's Hospital (two cases), and Fudan University Shanghai Cancer Center (one case). PR software was used to extract one frame every 5-10 seconds and convert it into an image. To ensure quality, obviously duplicated and blurred frames were then removed manually. After conversion and deduplication, 3,369 frame images remained, each with a resolution of 1,920×1,080 pixels. LabelMe was used for instance segmentation of the images into the following 23 categories: veins, arteries, sutures, needle holders, ultrasonic knives, suction devices, bleeding, colon, forceps, gallbladder, small gauze, Hem-o-lok clips, Hem-o-lok appliers, electrocautery hooks, small intestine, hepatogastric ligaments, liver, omentum, pancreas, spleen, surgical staplers, stomach, and trocars. The frame images were randomly allocated to training and validation sets in a 9:1 ratio. The YOLOv8 deep learning framework was used for model training and validation. Precision, recall, average precision (AP), and mean average precision (mAP) were used to evaluate detection and recognition accuracy.
Results: The training set contained 3,032 frame images with 30,895 segmented instances across the 23 categories. The validation set contained 337 frame images with 3,407 segmented instances. The YOLOv8m model was used for training. The training loss decreased smoothly and gradually as the number of training iterations increased. In the training set, the AP values of all 23 categories were above 0.90, with an mAP of 0.99; in the validation set, the mAP across the 23 categories was 0.82. For individual categories, the AP values for ultrasonic knives, needle holders, forceps, gallbladder, small gauze, and surgical staplers were 0.96, 0.94, 0.91, 0.91, 0.91, and 0.91, respectively. The model was also successfully applied for inference on a 5-minute video segment of laparoscopic gastroenterostomy suturing.
Conclusion: This multicenter study provides preliminary evidence that computer vision can efficiently, accurately, and in real time detect organs and instruments in the various surgical scenarios of laparoscopic radical gastrectomy for gastric cancer.
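The random 9:1 allocation and the relationship between AP and mAP can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `split_frames` and `mean_average_precision`, the fixed seed, and the use of integer frame indices are all assumptions made for illustration. It only shows that a 9:1 split of 3,369 frames yields the reported 3,032 and 337 images, and that mAP is the arithmetic mean of per-category AP values.

```python
import random


def split_frames(frame_ids, train_fraction=0.9, seed=0):
    """Randomly split frame identifiers into training and validation lists.

    Hypothetical helper: the seed and the rounding rule (truncation) are
    assumptions, not details taken from the study.
    """
    ids = list(frame_ids)
    rng = random.Random(seed)  # local RNG so the split is reproducible
    rng.shuffle(ids)
    n_train = int(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


def mean_average_precision(ap_by_class):
    """mAP is the unweighted mean of the per-category AP values."""
    return sum(ap_by_class.values()) / len(ap_by_class)


# 3,369 frames split 9:1, as described in the Methods.
train_ids, val_ids = split_frames(range(3369))
print(len(train_ids), len(val_ids))

# The six per-category validation AP values reported in the Results.
# Their mean covers only these six categories; it is NOT the 23-category
# validation mAP of 0.82 reported in the abstract.
reported_ap = {
    "ultrasonic knife": 0.96,
    "needle holder": 0.94,
    "forceps": 0.91,
    "gallbladder": 0.91,
    "small gauze": 0.91,
    "surgical stapler": 0.91,
}
six_class_mean_ap = mean_average_precision(reported_ap)
print(f"{six_class_mean_ap:.3f}")
```

Because the six listed categories are those with the highest reported AP, their mean is necessarily higher than the mAP over all 23 categories.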