The Thinking Eye: AI That Sees, Reads, and Reasons in Medicine
Yuyin Zhou is an Assistant Professor of Computer Science and Engineering at UC Santa Cruz. Her research lies at the intersection of machine learning and computer vision, with a primary focus on AI for healthcare and scientific discovery. Her work has been recognized with honors including the 2025 Google Research Scholar Award, a 2023 Hellman Fellowship, the Best Paper Award at KDD 2025 Health Day, a Best Paper Honorable Mention at DART 2022, and finalist recognition for the 2022 MICCAI Young Scientist Publication Impact Award. Beyond her research, Yuyin has organized over 20 workshops and tutorials at major conferences including ICML, MICCAI, ML4H, ICCV, CVPR, and ECCV, with coverage in media outlets such as ICCV Daily and Computer Vision News. She serves as a regular Area Chair for CVPR, ICLR, MICCAI, CHIL, and ISBI, and is the Doctoral Consortium Chair for WACV 2025.
Medical AI is undergoing a profound transformation, evolving from simple pattern recognition to systems capable of complex clinical reasoning. This talk will chart that evolution across three dimensions: data, models, and evaluation. I will first highlight the shift from limited, unimodal datasets to massive multimodal resources. In particular, I will introduce MedTrinity-25M, a novel collection of over 25 million richly annotated medical images that serves as a foundation for multimodal tasks such as visual question answering and report generation. Building on this, I will describe how grounding decision processes in a structured medical knowledge graph enables the generation of high-fidelity reasoning chains. Using these chains, we construct a large-scale medical reasoning dataset, which in turn allows us to develop a new class of reasoning models. These models not only achieve state-of-the-art performance on multiple clinical Q&A benchmarks but also produce reasoning outputs that physicians across seven specialties have independently verified as clinically reliable, interpretable, and more factually accurate than those of existing large language models. Finally, the talk will offer a deep dive into the critical evaluation of these advanced models, moving beyond standard benchmarks to expose their current limitations, particularly in interpreting dynamic clinical scenarios such as tracking disease progression from temporal image sequences. To foster a holistic understanding of the mechanisms underlying these reasoning models, I will introduce a new evaluation framework that examines performance from two complementary perspectives: their grasp of static knowledge versus their capacity for dynamic reasoning. Together, these advances point toward a future where AI systems can holistically analyze patient information and function as true collaborative partners in complex medical decision-making.