Jun 17, 2025
Towards Genomics Reasoning Models
Ignacio L. Ibarra
Helmholtz Zentrum München
hosted by Sven Sahle
4:00 PM
SR41
Abstract
Large-scale models trained on genomic sequences and single-cell data have made impressive strides in prediction and pattern discovery. Yet many of these advances fall short of enabling biological reasoning: the ability to infer mechanistic explanations, generalize across conditions, and propose causally informed hypotheses. In this talk, I will outline a roadmap towards genomics reasoning models, that is, systems that integrate flexible sequence representations with structured biological priors. Rather than claiming that current models truly reason, I will introduce the concept of reasoning mining: the use of hybrid machine learning approaches to extract testable mechanistic hypotheses from complex genomic data. Two concrete applications will be discussed: (i) transcriptional repressors acting as lineage stabilizers, and (ii) host–pathogen regulatory interactions in infectious disease. Drawing on recent work including muBind and genomic language models, I will highlight opportunities and challenges in developing models that connect statistical learning to mechanistic insight.
Biosketch
Ignacio L. Ibarra is a computational biologist and machine learning researcher specializing in regulatory genomics and single-cell data integration. He completed his PhD on transcription factor cooperativity, and his work has contributed to multi-omics atlases, graph-based regulatory modeling, and the development of interpretable machine learning frameworks for genomic data. He is particularly interested in bridging biology and AI to build sequence-to-insight models that advance our understanding of biology and support therapeutic discovery.