Anaplastic lymphoma kinase (ALK) and ROS oncogene 1 (ROS1) gene fusions are well-established oncogenic drivers in non-small cell lung cancer (NSCLC). Although their frequency is relatively low, their detection is important for patient care and guides therapeutic decisions. The accepted methods for their detection are immunohistochemistry (IHC) and fluorescence in situ hybridization (FISH) assays, as well as DNA- and RNA-based sequencing. These assays are expensive and time-consuming, and they require technical expertise, specialized equipment, and biological specimens that are not always available. Here we present an alternative detection method based on computer vision and deep learning. An advanced convolutional neural network (CNN) was used to train classifier models that detect ALK and ROS1 fusions directly from scanned hematoxylin and eosin (H&E) whole slide images of NSCLC tumors. A two-step training approach was applied: an initial unsupervised training step on a pan-cancer sample cohort, followed by a semi-supervised fine-tuning step. This approach supported the development of a classifier whose performance equals that accepted for diagnostic tests. The ALK/ROS1 classifier was validated on a cohort of 72 lung cancer cases that underwent ALK and ROS1 fusion testing at the pathology department of Sheba Medical Center. It achieved a sensitivity of 100% for both genes (six ALK-positive and two ROS1-positive cases) and specificities of 100% for ALK and 98.6% for ROS1, with a single false-positive ROS1 result. These results demonstrate the potential advantages of machine learning solutions in molecular pathology: fast, standardized, accurate, and robust biomarker detection that overcomes many limitations of current techniques. Integrating such novel solutions into the routine pathology workflow can support and improve the current clinical pipeline.
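The reported validation figures can be reproduced with simple arithmetic. The sketch below is illustrative only: the confusion counts are inferred from the numbers stated above under the assumption that all 72 cases were evaluated for both genes, and the `ConfusionCounts` class is a hypothetical helper, not part of the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Per-gene confusion counts (hypothetical helper for illustration)."""
    tp: int  # fusion-positive cases called positive
    fn: int  # fusion-positive cases called negative
    fp: int  # fusion-negative cases called positive
    tn: int  # fusion-negative cases called negative

    @property
    def sensitivity(self) -> float:
        # Fraction of truly positive cases that the classifier detects.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Fraction of truly negative cases that the classifier rejects.
        return self.tn / (self.tn + self.fp)


# Assumption: counts inferred from the text (72 cases, 6 ALK-positive,
# 2 ROS1-positive, one ROS1 false positive, no missed positives).
alk = ConfusionCounts(tp=6, fn=0, fp=0, tn=66)
ros1 = ConfusionCounts(tp=2, fn=0, fp=1, tn=69)

for name, counts in (("ALK", alk), ("ROS1", ros1)):
    print(
        f"{name}: sensitivity={counts.sensitivity:.1%}, "
        f"specificity={counts.specificity:.1%}"
    )
```

Under these assumed counts, the ROS1 specificity is 69/70, which rounds to the reported 98.6%. With only six and two positive cases respectively, the sensitivity estimates rest on very few samples.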
© 2022. The Author(s).