Neeraj Dandamudi

Malignant Cell Detection Using Deep Learning and Segmentation Techniques




Lay Summary:

My work develops an automated method to measure the proportions of various features within microscopic cell images. The method uses these ratios to recognize patterns that indicate a possible cancer diagnosis, helping doctors diagnose bladder cancer faster and with less effort.

Abstract:

Bladder cancer is the sixth most common cancer in the United States, with urothelial carcinoma representing a significant proportion of cases and carrying high recurrence rates. Early detection and accurate diagnosis are critical for improving patient outcomes, yet current screening methods, such as urine cytology, rely heavily on manual interpretation. This process is time-consuming, subjective, and susceptible to variability. The primary aim of this project is to develop and evaluate automated segmentation-based methods for accurate N/C ratio measurement as a malignancy marker in urothelial cells from urine cytology images. The study compares multiple traditional image processing techniques, including intensity thresholding, K-means clustering, and Random Forest classifiers, against deep learning architectures such as U-Net and Feature Pyramid Networks (FPN) with ResNet backbones. Results show that while deep learning models achieve the highest segmentation accuracy (mIoU ≈ 0.7715) and produce visually precise masks, K-means clustering achieves the strongest correlation with true N/C ratios (Spearman ≈ 0.781) and delivers the best malignancy classification performance (AUC ≈ 0.917). These findings highlight that optimal segmentation quality does not always translate to superior quantitative metric fidelity, and that simpler, resource-efficient methods can outperform more complex models in specific diagnostic contexts. The application of this workflow could significantly streamline cytology analysis, reduce diagnostic variability, and provide a scalable, objective screening tool to enhance clinical decision-making in bladder cancer detection.



Q&A:


Bios: Neeraj Dandamudi

Program Track: Skills Development

GitHub Username:

Xelerate97 -Neeraj Dandamudi

What was your favorite seminar? Why?

I enjoyed the presentation by Dr. Indrani Bhattacharya, as it resonated with the independent work I'm doing. Her past work on automated prostate cancer detection aligns with my own work on urothelial carcinoma detection, which made me extremely invested in her presentation, since I believed I could learn something useful and apply it to my own work. I find that image analysis, like her past work, is a fascinating subset of medical computing because of its potential for improving treatment possibilities. -Neeraj Dandamudi

If you were to summarize your summer internship experience in one sentence, what would it be?

I would summarize my summer internship by saying it was a great opportunity to learn more about a field that I was unfamiliar with through hands-on, independent problem solving. -Neeraj Dandamudi

Blog Post


Automating Malignant Cell Detection Using Segmentation-Based N/C Ratio Analysis

By: Neeraj Dandamudi – Emerging Diagnostic and Investigative Technologies, Department of Pathology, Dartmouth Hitchcock Medical Center

Introduction

Bladder cancer is one of the most common cancers in the United States, ranking sixth overall and notorious for its high recurrence rates. Urothelial carcinoma, a type of bladder cancer affecting the cells lining the urinary tract, often requires ongoing monitoring long after initial treatment. In clinical practice, urine cytology is widely used for screening because it is non-invasive and capable of detecting high-grade disease. However, urine cytology has important limitations. Pathologists must manually examine cell samples under a microscope, making judgments about subtle morphological features such as nuclear size, shape, and texture. One of the most established diagnostic indicators is the nuclear-to-cytoplasmic (N/C) ratio, which tends to be higher in malignant cells. While effective in skilled hands, manual N/C ratio estimation is time-consuming, subjective, and highly variable between observers. These limitations can lead to inconsistent results and missed opportunities for early intervention, especially in low-grade cases where features are less obvious.
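Once a cell has been segmented, the N/C ratio itself is a simple area computation. A minimal sketch in NumPy, assuming a label encoding of 0 = background, 1 = cytoplasm, 2 = nucleus and defining the ratio as nuclear area over total cell area (the exact label encoding and ratio convention used in the study are assumptions here):

```python
import numpy as np

# Assumed label encoding (not specified above):
BACKGROUND, CYTOPLASM, NUCLEUS = 0, 1, 2

def nc_ratio(mask: np.ndarray) -> float:
    """N/C ratio: nuclear area divided by total cell area (nucleus + cytoplasm)."""
    nucleus_px = np.count_nonzero(mask == NUCLEUS)
    cytoplasm_px = np.count_nonzero(mask == CYTOPLASM)
    cell_px = nucleus_px + cytoplasm_px
    if cell_px == 0:          # no cell in the mask
        return 0.0
    return nucleus_px / cell_px

# Toy 4x4 mask: 4 nucleus pixels, 8 cytoplasm pixels -> 4/12 ≈ 0.333
mask = np.array([
    [0, 1, 1, 0],
    [1, 2, 2, 1],
    [1, 2, 2, 1],
    [0, 1, 1, 0],
])
print(round(nc_ratio(mask), 3))  # 0.333
```

On a real 128×128 single-cell image, the same function would run on the predicted mask from any of the segmentation methods compared below.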

Our project addresses these challenges by exploring whether segmentation-based image analysis can automate and standardize N/C ratio measurement. We compare a range of segmentation techniques — both traditional computer vision methods and modern deep learning architectures — to evaluate which approaches most accurately reproduce pathologist-provided ground truth annotations and which yield the strongest clinical signals for malignancy prediction. The goal is to provide a tool that maintains the objectivity and reproducibility of quantitative metrics while achieving the speed and scalability needed for integration into clinical workflows.

From a clinical standpoint, early detection of malignant urothelial cells allows for timely, less invasive treatment and reduced patient morbidity. Therefore, I set out to answer whether segmentation combined with automated N/C ratio analysis can serve as a reliable malignancy marker, and which segmentation method is most appropriate for this task.

Methodology

We approached this as a comparative retrospective image analysis. Our dataset consisted of approximately 300 single-cell images (128×128 pixels) of urothelial cells, each with manually annotated masks labeling the background, cytoplasm, and nucleus. These masks served as the ground truth for segmentation accuracy and N/C ratio calculation. The dataset was divided into 70% training, 15% validation, and 15% test sets. To assess clinical validity, we included an additional set of 100 specimen images (25 each labeled as negative, atypical, suspicious, or positive) without segmentation masks but with diagnostic category labels.

We implemented six segmentation approaches. On the traditional side, we evaluated intensity thresholding, K-means clustering (k=3), and three variations of Random Forest classifiers. Thresholding applied a grayscale intensity cutoff, refined through Bayesian search, to distinguish the nucleus from the cytoplasm. K-means clustered pixels in RGB space and mapped the resulting groups to the three classes based on average pixel intensity. The Random Forest models progressed from a basic RGB + edge feature set to a hyperparameter-optimized version to a fully extended model incorporating GLCM texture and multiscale features for richer morphological representation.
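The K-means step above can be sketched compactly. The version below implements Lloyd's algorithm directly in NumPy rather than calling a library, initializes centers at intensity quantiles for determinism, and maps clusters to classes by descending mean intensity (brightest = background, darkest = nucleus, as is typical for stained cytology); these initialization and mapping choices are simplifying assumptions, not the project's exact configuration:

```python
import numpy as np

def kmeans_segment(img: np.ndarray, k: int = 3, iters: int = 20) -> np.ndarray:
    """Segment an RGB image by clustering pixels with k-means, then map
    clusters to 0=background, 1=cytoplasm, 2=nucleus by mean intensity."""
    h, w, _ = img.shape
    pixels = img.reshape(-1, 3).astype(float)
    # Deterministic init: pick pixels at evenly spaced intensity quantiles.
    intensity = pixels.mean(axis=1)
    idx = np.argsort(intensity)[np.linspace(0, len(pixels) - 1, k).astype(int)]
    centers = pixels[idx]
    for _ in range(iters):
        # Assign each pixel to its nearest center in RGB space.
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center as the mean of its assigned pixels.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = pixels[labels == c].mean(axis=0)
    # Brightest cluster -> background (0), darkest -> nucleus (2).
    order = np.argsort(-centers.mean(axis=1))
    remap = np.empty(k, dtype=int)
    remap[order] = np.arange(k)
    return remap[labels].reshape(h, w)

# Toy image with three vertical bands: bright, medium, dark.
img = np.zeros((6, 6, 3))
img[:, :2] = 230   # background-like
img[:, 2:4] = 150  # cytoplasm-like
img[:, 4:] = 40    # nucleus-like
seg = kmeans_segment(img)
print(seg[0, 0], seg[0, 2], seg[0, 4])  # 0 1 2
```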

For deep learning, we tested two architectures: U-Net and a Feature Pyramid Network (FPN), both with a ResNet-34 backbone. U-Net uses skip connections to preserve spatial detail during decoding, which is critical for tracing tight nuclear contours. FPN combines features from multiple scales, improving its ability to delineate both large and small nuclei within the same image.

We evaluated segmentation quality using mean Intersection over Union (mIoU) and Dice coefficient, measuring overlap between predicted and ground truth masks. To assess N/C ratio agreement, we calculated Spearman correlation and mean absolute error (MAE) between automated measurements and ground truth ratios. Finally, we tested whether N/C ratios could serve as a malignancy classifier by generating ROC curves and calculating AUC, sensitivity, and specificity.
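The overlap metrics above are straightforward to compute from label masks. A sketch of per-class IoU and Dice (mIoU is the mean of the per-class IoUs; treating an empty class present in neither mask as a perfect score of 1.0 is one common convention, assumed here):

```python
import numpy as np

def iou_dice(pred: np.ndarray, truth: np.ndarray, n_classes: int = 3):
    """Per-class IoU and Dice between two integer label masks."""
    ious, dices = [], []
    for c in range(n_classes):
        p, t = pred == c, truth == c
        inter = np.logical_and(p, t).sum()
        union = np.logical_or(p, t).sum()
        total = p.sum() + t.sum()
        # Convention (assumed): a class absent from both masks scores 1.0.
        ious.append(inter / union if union else 1.0)
        dices.append(2 * inter / total if total else 1.0)
    return ious, dices

# Toy 2x2 example with two classes and one mislabeled pixel.
truth = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
ious, dices = iou_dice(pred, truth, n_classes=2)
print([round(v, 3) for v in ious])   # [0.5, 0.667]
print([round(v, 3) for v in dices])  # [0.667, 0.8]
```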

Results and Analysis

[Figure: segmentation_iou_per_class.png]

Figure 1: Comparison of IoU between segmentation approaches

U-Net and FPN segmentation achieved the highest mIoU scores at approximately 0.7715, producing clean, well-defined masks with minimal leakage at the boundaries. Visually, these models excelled in complex regions where nuclei overlapped with cytoplasm or where staining was uneven. However, while these models dominated in pixel-level accuracy, their N/C ratio calculations showed a subtle but consistent bias toward overestimating nuclear size. This is likely due to the models’ tendency to slightly “inflate” nuclear boundaries, which has minimal impact on segmentation metrics but can meaningfully alter ratio-based measurements.

In contrast, K-means clustering produced the lowest mIoU at 0.6441, yet achieved the highest agreement with ground truth N/C ratios, with a Spearman correlation of 0.781 and an MAE of 0.097. In other words, K-means fails to produce high-quality masks but still calculates accurate N/C ratios. Although it struggled with fine boundary precision, its segmentation errors were more evenly balanced between over- and under-segmentation, yielding a truer representation of nucleus-to-cytoplasm proportions. This property made K-means particularly effective for clinical evaluation.

[Figure: nc_scatter_all_methods.png]

Figure 2: Scatterplot comparing True vs. Calculated N/C ratios for the top 3 segmentation approaches

Random Forest models fell in between. The fully extended model with texture features improved mask quality over the base version, but at a steep computational cost. Their N/C ratio agreement was reasonable, but they did not surpass K-means and were less consistent across the test set. Thresholding, though by far the cheapest computationally, performed worse than the other segmentation approaches across the board.

[Figures: specimen_nc_boxplot.png, specimen_roc.png]

Figures 3 & 4: Clinical Evaluation of N/C ratios developed by K-means segmentation

The clinical validation confirmed the predictive value of N/C ratios. Using K-means-derived N/C values, we achieved an ROC curve AUC of 0.917 at an optimal threshold of approximately 0.324, with an accuracy of 0.88, sensitivity of 0.84, and specificity of 0.92. Boxplots of the specimen set revealed clear separation between benign and malignant categories, and N/C medians increased from negative to positive diagnostic classes (ρ ≈ 0.761). This ordinal trend suggests that the N/C ratio not only serves as a binary classifier but also tracks with the spectrum of diagnostic severity, providing an interpretable clinical signal.
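For readers curious how an AUC and an "optimal threshold" like the one above fall out of a set of N/C scores, here is a sketch using the rank-based (Mann-Whitney) formulation of AUC and a Youden's J threshold search. Youden's J (sensitivity + specificity − 1) is one common criterion for picking a cutoff; whether the study used this exact criterion is an assumption:

```python
import numpy as np

def roc_auc_and_threshold(scores, labels):
    """AUC via the rank formulation, plus the cutoff maximizing Youden's J."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos, neg = scores[labels == 1], scores[labels == 0]
    # AUC = P(random positive scores above random negative), ties count half.
    diff = pos[:, None] - neg[None, :]
    auc = (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))
    best_t, best_j = None, -1.0
    for t in np.unique(scores):
        sens = np.mean(pos >= t)   # true positive rate at cutoff t
        spec = np.mean(neg < t)    # true negative rate at cutoff t
        j = sens + spec - 1
        if j > best_j:
            best_j, best_t = j, t
    return auc, best_t

# Toy data: two benign cells with low N/C, two malignant with high N/C.
auc, cutoff = roc_auc_and_threshold([0.1, 0.2, 0.6, 0.8], [0, 0, 1, 1])
print(auc, cutoff)  # 1.0 0.6
```

In the real workflow, `scores` would be the K-means-derived N/C ratios for the 100 specimen images and `labels` the binarized diagnostic categories.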

Conclusion

We can conclude that automated segmentation combined with N/C ratio analysis can yield a high-performance, interpretable marker for malignant urothelial cells in urine cytology. Deep learning methods such as U-Net and FPN produce the cleanest, most precise masks, making them valuable for applications requiring high-resolution morphological detail. For ratio-based malignancy detection, however, simpler approaches like K-means can outperform them thanks to more proportionally balanced errors. This points to future work: with further parameter optimization, deep learning methods could plausibly close the gap on ratio accuracy as well. The high AUC, sensitivity, and specificity achieved with automated N/C ratio measurement showcase its potential as an objective screening tool.