Objective: To study the association between histopathological features and HER2 overexpression/amplification in breast cancer using deep learning. Methods: A total of 345 HE-stained breast cancer slides from 2012 to 2018 were collected at the China-Japan Friendship Hospital, Beijing, China. Every sample had a confirmed HER2 result, classified as one of four expression levels (0, 1+, 2+, 3+). After digitization, 204 slides were used to train a weakly supervised model and 141 were used for testing. During training, regions of interest were extracted by a cancer detection model and then fed to the weakly supervised classification model to tune its parameters. During testing, we compared the single-threshold and double-threshold strategies to assess the value of the double-threshold strategy in clinical practice. Results: Under the single-threshold strategy, the deep learning model had a sensitivity of 81.6% and a specificity of 42.1%, with an AUC of 0.67 (95% confidence interval: 0.560 to 0.778). Under the double-threshold strategy, the model achieved a sensitivity of 96.3% and a specificity of 89.5%. Conclusions: Using HE-stained histopathological slides alone, deep learning could predict the HER2 status of breast cancer with satisfactory accuracy. With the double-threshold strategy, a large number of samples could be screened with high sensitivity and specificity.
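Objective: To explore the relationship between pathological morphology and HER2 overexpression/amplification in breast cancer using deep learning methods. Methods: A total of 345 HE-stained breast cancer slides were collected at the China-Japan Friendship Hospital from 2012 to 2018. All samples had an accurate HER2 diagnosis, and the set covered the HER2 categories 0, 1+, 2+, and 3+. After digital scanning, 204 slides were used to train a weakly supervised model and 141 were used for model testing. During training, a cancer-region detection model first extracted hotspot regions, which were then fed to the weakly supervised classification model to build the deep learning model. During testing, the single-threshold and double-threshold strategies were compared to verify the contribution of the double-threshold strategy to clinical usability. Results: Under the single-threshold strategy, the deep learning model reached a sensitivity of 81.6% and a specificity of 42.1%, with AUC = 0.67 [95% CI (0.560, 0.778)]. Under the double-threshold strategy, the sensitivity was 96.3% and the specificity reached 89.5%. Conclusions: Using HE histological slides alone, deep learning can predict the HER2 gene status of breast cancer with a certain degree of accuracy. With the double-threshold strategy, a large number of samples can be screened out with high sensitivity and specificity.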
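The abstract does not define the two thresholds, so the following is only a minimal sketch of one common reading. It assumes that slides scoring below a lower cutoff are called HER2-negative, slides at or above an upper cutoff are called HER2-positive, and slides in between receive no call and would go on to conventional testing. Under that assumption, sensitivity and specificity are computed only over the slides that receive a call. The function name `evaluate`, the `Metrics` fields, the cutoff values, the toy scores, and the binary labels (1 for HER2-positive) are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Metrics:
    sensitivity: float
    specificity: float
    coverage: float  # fraction of slides that received a definite call


def evaluate(scores: Sequence[float], labels: Sequence[int],
             low: float, high: Optional[float] = None) -> Metrics:
    """Score slide-level predictions with one or two cutoffs.

    Assumed rule (not stated in the abstract): score >= high is called
    positive, score < low is called negative, and anything in between is
    left uncalled. With high omitted, both cutoffs coincide and this
    reduces to the ordinary single-threshold rule.
    """
    if high is None:
        high = low
    if low > high:
        raise ValueError("low must not exceed high")
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        if score >= high:
            call = 1
        elif score < low:
            call = 0
        else:
            continue  # indeterminate slide: excluded from both metrics
        if label == 1:
            tp += call
            fn += 1 - call
        else:
            fp += call
            tn += 1 - call

    called = tp + fn + tn + fp
    nan = float("nan")
    return Metrics(
        sensitivity=tp / (tp + fn) if (tp + fn) else nan,
        specificity=tn / (tn + fp) if (tn + fp) else nan,
        coverage=called / len(scores) if len(scores) else nan,
    )


# Toy data, purely illustrative.
toy_scores = [0.1, 0.2, 0.45, 0.55, 0.6, 0.9, 0.4, 0.8]
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
single = evaluate(toy_scores, toy_labels, low=0.5)
double = evaluate(toy_scores, toy_labels, low=0.3, high=0.7)
```

On this toy data the single cutoff calls every slide, while the double cutoff calls only half of them and scores higher on both metrics. That is the trade-off this reading implies: the gain in sensitivity and specificity comes at the cost of leaving the uncertain slides for conventional testing, which is why `coverage` is reported alongside the two metrics.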