Application of Machine Learning Algorithms in Breast Cancer Diagnosis and Classification

Int J Sci Acad Res. 2021 Jan;2(1):3081-3086. Epub 2021 Oct 30.

Abstract

Breast cancer remains the most frequent cancer in women, affecting about one in eight, and causes the highest number of cancer-related deaths among women worldwide despite remarkable progress in early diagnosis, screening, and patient management. Not all breast lesions are malignant, and not all benign lesions progress to cancer. Diagnostic accuracy can be increased by combining preoperative tests such as physical examination, mammography, fine-needle aspiration cytology, and core needle biopsy. Despite some limitations, these combined procedures are more accurate, reliable, and acceptable than any single diagnostic procedure. Recent studies have shown that breast cancer can be accurately predicted and diagnosed using machine learning (ML) technology. The objective of this study was to explore the application of ML approaches to classify breast cancer based on feature values generated from a digitized image of a fine-needle aspiration (FNA) of a breast mass. To achieve this objective, we obtained a dataset of 569 patients from Kaggle (https://www.kaggle.com/uciml/breast-cancer-wisconsin-data), applied ML algorithms, and analyzed and interpreted the data based on ten real-valued features of a breast mass FNA: radius, texture, perimeter, area, smoothness, compactness, concavity, concave points, symmetry, and fractal dimension. Among the 569 patients, 63% had benign tumors and 37% had malignant tumors. Benign tumors grow slowly and do not spread, whereas malignant tumors grow rapidly and spread to other parts of the body.

Keywords: Breast cancer; benign; computer-based learning; machine learning; malignant.
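
The class split reported in the abstract can be checked against the downloaded Kaggle file. The following is a minimal sketch, not the authors' code. It assumes the CSV has a `diagnosis` column coded `B` (benign) and `M` (malignant); the function names `diagnosis_shares` and `shares_from_csv` and the path passed to them are hypothetical.

```python
import csv
from collections import Counter
from typing import Dict, Iterable, Mapping


def diagnosis_shares(rows: Iterable[Mapping[str, str]]) -> Dict[str, float]:
    """Return the percentage of rows carrying each diagnosis label.

    Assumes every row has a 'diagnosis' key (e.g. 'B' or 'M').
    """
    counts = Counter(row["diagnosis"] for row in rows)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


def shares_from_csv(path: str) -> Dict[str, float]:
    """Read a CSV file (hypothetical path) and compute the label shares."""
    with open(path, newline="") as handle:
        return diagnosis_shares(csv.DictReader(handle))
```

Calling `shares_from_csv` on the downloaded file and rounding the two percentages should give figures comparable to the 63% benign and 37% malignant split stated above, provided the column coding matches the assumption.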