A large multi-focus dataset for white blood cell classification

Sci Data. 2024 Oct 9;11(1):1106. doi: 10.1038/s41597-024-03938-1.

Abstract

The White Blood Cell (WBC) differential test ranks as the second most frequently performed diagnostic assay. It requires experts to manually review the peripheral blood smear to identify signs of abnormalities. Automated digital microscopy has emerged as a way to reduce this labor-intensive work and improve efficiency. Several publicly available datasets provide images of various WBC subtypes at differing quality and resolution, and these datasets have helped advance WBC classification using machine learning techniques. However, imaging blood cells at high magnification often requires a wider depth of field than a single focal plane provides, so automated digital microscopy must capture a stack of multiple focal planes to obtain a complete image of each cell. Our dataset provides 25,773 image stacks from 72 patients. The labels span 18 classes encompassing normal and abnormal cells, and each label was reviewed by two experts. Each image stack consists of 10 focal planes (a z-stack) of cropped 200 by 200 pixel images, captured using a 50X microscope at 400 nm intervals. This study presents a comprehensive multi-focus dataset for WBC classification.
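
To make the stack layout concrete, the following is a minimal Python sketch, not the authors' pipeline. It represents one image stack as a NumPy array of 10 focal planes of 200 by 200 pixels and picks the sharpest plane with a variance-of-Laplacian focus score, a common general-purpose focus measure that the abstract does not mention. The function names, the grayscale representation, the uniform 400 nm spacing measured from the first plane, and the synthetic test data are all assumptions made for illustration; the real files and their format are not described here.

```python
import numpy as np

N_PLANES = 10      # focal planes per stack (from the abstract)
PLANE_SIZE = 200   # cropped image side length in pixels (from the abstract)
Z_STEP_NM = 400    # spacing between focal planes in nm (from the abstract)


def z_offsets_nm(n_planes=N_PLANES, step_nm=Z_STEP_NM):
    """Offset of each plane from the first one, assuming uniform spacing."""
    return [i * step_nm for i in range(n_planes)]


def laplacian_variance(plane):
    """Focus score: variance of a 4-neighbour discrete Laplacian (interior pixels)."""
    p = plane.astype(np.float64)
    lap = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
           - 4.0 * p[1:-1, 1:-1])
    return float(lap.var())


def sharpest_plane(stack):
    """Return (index of the highest-scoring plane, list of per-plane scores)."""
    scores = [laplacian_variance(plane) for plane in stack]
    return int(np.argmax(scores)), scores


def make_synthetic_stack(sharp_index=6, seed=0):
    """Hypothetical stand-in for a real stack: one high-contrast plane, rest flat-ish."""
    rng = np.random.default_rng(seed)
    texture = rng.random((PLANE_SIZE, PLANE_SIZE))
    planes = []
    for i in range(N_PLANES):
        contrast = 1.0 if i == sharp_index else 0.1
        planes.append(0.5 + contrast * (texture - 0.5))
    return np.stack(planes)  # shape: (planes, height, width)


stack = make_synthetic_stack()
best, scores = sharpest_plane(stack)
print("stack shape:", stack.shape)
print("sharpest plane:", best, "at offset", z_offsets_nm()[best], "nm")
```

With 10 planes at 400 nm spacing there are 9 intervals, so under the uniform-spacing assumption a stack spans 3,600 nm from the first plane to the last. A classifier could consume the whole stack as a multi-channel input rather than a single selected plane; the selection above is only one simple way to inspect the data.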

Publication types

  • Dataset

MeSH terms

  • Humans
  • Image Processing, Computer-Assisted
  • Leukocyte Count
  • Leukocytes* / classification
  • Leukocytes* / cytology
  • Machine Learning
  • Microscopy*