Leah Zhang

Evaluating SimCLR and Graph Contrastive Learning for Embedding and Clustering Chorionic Villi




Lay Summary:

Pathologists often disagree when distinguishing between villi types, so we use clustering to find a separation. Rather than focusing on cell types and villi metrics, we look for a connection between superficial clusters and morphological clusters.

Abstract:

The placenta is an essential organ that transfers nutrients from the mother to the unborn baby. Within it, chorionic villi are tiny projections of placental tissue that carry out nutrient and waste exchange between fetus and mother. To better understand the placenta and support maternal and newborn health, it is important to find metrics for, and identify abnormalities in, placental villi through clustering. Our goal is to evaluate which embedding method most effectively clusters tissue data without identifying internal components; in particular, we wanted to find a connection between superficial clusters and morphological clusters. We therefore tested two machine learning approaches against one based on size and shape. Using Lightly’s SimCLR and a graph contrastive approach for embeddings, with KMeans for clustering, we identified promising results. Our results yielded Silhouette Scores and Davies-Bouldin Indices of 0.577, 0.592, 0.452, 0.780, 0.217, and 1.536 for the three methods, respectively (higher Silhouette Scores and lower Davies-Bouldin Indices indicate better-separated clusters). As expected, morphological data yields the most distinct clusters, but of the two machine learning approaches, SimCLR greatly outperforms the graph contrastive approach. Because SimCLR operates on image data, it can tell from the relative size of the cells which villi are smaller or larger. In comparison, the graph contrastive approach captures villi but not the whole slide image (WSI), and so fails to cluster effectively. Overall, these results show a strong connection between superficial and morphological clustering. In the future, we plan to explore additional embedding methods and incorporate vessels and cells as evaluation metrics.
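
A minimal sketch of this kind of evaluation, assuming scikit-learn, might look like the following. It clusters a set of embeddings with KMeans and reports the Silhouette Score and Davies-Bouldin Index. The function name evaluate_embedding, the cluster count of 3, and the synthetic stand-in data are illustrative assumptions and are not taken from the study; real inputs would be the SimCLR, graph contrastive, or morphological feature vectors.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score, silhouette_score


def evaluate_embedding(embeddings, n_clusters, seed=0):
    """Cluster embeddings with KMeans; return (silhouette, davies_bouldin)."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(embeddings)
    sil = silhouette_score(embeddings, labels)      # higher is better, range [-1, 1]
    dbi = davies_bouldin_score(embeddings, labels)  # lower is better, minimum 0
    return sil, dbi


# Synthetic stand-in embeddings (assumption: NOT the study's data).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
well_separated = np.vstack(
    [c + rng.normal(scale=0.5, size=(50, 2)) for c in centers]
)
overlapping = rng.normal(scale=1.0, size=(150, 2))

# Hypothetical cluster count; the abstract does not state the value used.
N_CLUSTERS = 3

results = {
    name: evaluate_embedding(emb, N_CLUSTERS)
    for name, emb in [
        ("well_separated", well_separated),
        ("overlapping", overlapping),
    ]
}

for name, (sil, dbi) in results.items():
    print(f"{name}: silhouette={sil:.3f}, davies_bouldin={dbi:.3f}")
```

Running the same function on each method's embeddings gives one score pair per method, which can then be compared directly: the method with the higher Silhouette Score and lower Davies-Bouldin Index forms the more distinct clusters.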



Q&A:


Bios: Leah Zhang

Program Track: Advanced Research

GitHub Username:

Leahie -Leah Zhang

What was your favorite seminar? Why?

Spatial multimodal analysis by Aruesha Srivastava. I really liked the metaphor they used with the cake.

-Leah Zhang

If you were to summarize your summer internship experience in one sentence, what would it be?

Self-designed, crazy, and overwhelming. -Leah Zhang