BMSeg: An automated system for cell segmentation and classification in bone marrow biopsies
Lang Xiong & Joshua Liu
Lay Summary:
To help diagnose diseases like leukemia, doctors must manually analyze bone marrow samples under a microscope, a difficult and time-consuming process that can lead to errors. We are developing an AI-powered tool that automates this analysis by learning to rapidly and accurately identify all the different cell types in a sample, aiming to help doctors make faster and more reliable diagnoses.
Abstract:
The manual pathological review of bone marrow biopsies is a critical diagnostic process that is time-consuming, labor-intensive, and subjective, leading to misdiagnosis in up to 12% of cases. This highlights a significant need for standardized and efficient analytical tools. A major gap exists in computational pathology: there is no general-purpose deep learning model for comprehensive bone marrow analysis that leverages the crucial tissue architecture preserved in biopsies. To address this, we propose BMSeg, an automated pipeline for the segmentation and classification of cells in bone marrow whole-slide images (WSIs), designed to work with sparse annotations. Our method is developed on a dataset of 23 H&E-stained biopsies. We first employ a pre-trained HoverNet model for nuclear segmentation, which achieved an average Dice score of 0.7 against ground-truth annotations. To capture detailed morphological information, we then use a pre-trained Convolutional Neural Network (CNN) to generate a 512-feature vector for each segmented cell. These features serve as inputs to a Graph Neural Network (GNN), which learns to classify cells by modeling their spatial relationships within the tissue microenvironment. The GNN is trained in two steps: self-supervised pre-training on all detected cells to learn general structural patterns, followed by supervised fine-tuning on the sparse annotations. Our results demonstrate the efficacy of this pipeline and underscore the critical importance of feature representation. The GNN trained with the 512-feature CNN-based vectors achieved a classification accuracy of 28.29%, up from the 18.70% achieved with a simpler 7-feature set, a gain of 9.59 percentage points. This work establishes a complete, data-efficient pipeline for GNN-based bone marrow analysis.
BMSeg serves as a foundational model for developing more sophisticated tools capable of analyzing cellular niches and enhancing diagnostic accuracy in hematopathology.
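As a rough illustration of two steps named in the abstract, the sketch below computes a Dice score between a predicted and a ground-truth nuclear mask, and builds a spatial graph over cell centroids of the kind a GNN could consume. It is a minimal sketch under stated assumptions, not the BMSeg implementation: the abstract does not say how the cell graph is constructed, so the k-nearest-neighbor rule, the value of k, and the function names dice_score, knn_neighbors, and mean_aggregate are all hypothetical. The neighbor averaging stands in for a single, untrained message-passing step.

```python
import numpy as np


def dice_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice coefficient 2*|A and B| / (|A| + |B|) for two binary masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return float(2.0 * np.logical_and(pred, truth).sum() / total)


def knn_neighbors(centroids: np.ndarray, k: int) -> np.ndarray:
    """Return an (n_cells, k) array holding each cell's k nearest neighbors.

    Assumption: cells are linked by Euclidean distance between centroids.
    Uses a dense distance matrix, so it only suits small examples.
    """
    diff = centroids[:, None, :] - centroids[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a cell is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def mean_aggregate(features: np.ndarray, neighbors: np.ndarray) -> np.ndarray:
    """Concatenate each cell's features with the mean of its neighbors' features."""
    neighbor_mean = features[neighbors].mean(axis=1)
    return np.concatenate([features, neighbor_mean], axis=1)


if __name__ == "__main__":
    pred = np.array([[1, 1], [0, 0]])
    truth = np.array([[1, 0], [0, 0]])
    print("Dice:", dice_score(pred, truth))

    rng = np.random.default_rng(0)
    centroids = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [11.0, 0.0]])
    features = rng.normal(size=(4, 512))  # stand-in for per-cell CNN features
    neighbors = knn_neighbors(centroids, k=1)
    print("Neighbors:", neighbors.ravel().tolist())
    print("Aggregated shape:", mean_aggregate(features, neighbors).shape)
```

In a trained GNN the averaging step would be followed by learned weights and a nonlinearity; the point here is only that each cell's representation comes to depend on its spatial neighbors as well as its own morphology.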
Q&A:
Bios: Lang Xiong, Joshua Liu
Program Track: Advanced Research
GitHub Username:
langlglang -Lang Xiong
jliu-GH -Joshua Liu
What was your favorite seminar? Why?
Rishy's, not only because he was our mentor, but also because it was very inspiring to see what a fellow student was able to accomplish over the summer last year, and the potential of each project if you put in the time and effort. -Lang Xiong
My favorite seminar was Vivek’s at the end, since he was my mentor and it was very interesting to see the many projects that he was tackling, including how he used HoverNet and GNNs in one of them. It was also very inspiring when he talked about his personal journey and gave tips at the end. -Joshua Liu
If you were to summarize your summer internship experience in one sentence, what would it be?
A transformative experience in bioinformatics with great mentors and teammates, allowing me to dive deep into a cutting-edge research topic. -Lang Xiong
Working with and learning from my mentors how to conduct impactful research. -Joshua Liu