BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//UM//UM*Events//EN
CALSCALE:GREGORIAN
BEGIN:VTIMEZONE
TZID:America/Detroit
TZURL:http://tzurl.org/zoneinfo/America/Detroit
X-LIC-LOCATION:America/Detroit
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20070311T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20071104T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260403T135851
DTSTART;TZID=America/Detroit:20260414T140000
DTEND;TZID=America/Detroit:20260414T160000
SUMMARY:Lecture / Discussion:Transport-Based Methods for Inference and Generation with Graphical Structure
DESCRIPTION:Statistical learning posits a structured relationship between observed data and unobserved quantities — latent components underlying a mixture\, counterfactual outcomes unobserved under the realized treatment assignment\, or low-dimensional representations encoding complex generative factors — and makes inference over the parameters that govern this relationship. In each case\, a graphical model encodes the structural assumptions through conditional independence and factorization\, but inference over the resulting distributional objects demands tools that are stable under the geometric irregularities — limited overlap\, high dimensionality\, unknown model complexity — that arise in practice. This thesis develops a distributional framework that pairs graphical model structure with optimal transport geometry to address this need. We apply the framework to causal inference under limited overlap\, replacing density-ratio reweighting with geometrically stable transport maps and developing Wasserstein-based sensitivity analysis for partial identification\; to structured generative modeling\, introducing Structured Flow Autoencoders that combine conditional normalizing flows with latent graphical models via a novel flow matching objective\; and to mixture model estimation\, where Bayes fixed-point iteration and entropy-regularized semi-discrete optimal transport yield a geometry-driven approach to component recovery and model selection. Across all settings\, the thesis demonstrates that replacing pointwise inference procedures with distributional\, geometry-aware ones — anchored by graphical model structure — yields methods that are simultaneously more principled and more practically reliable.
UID:147391-21900960@events.umich.edu
URL:https://events.umich.edu/event/147391
CLASS:PUBLIC
STATUS:CONFIRMED
CATEGORIES:Dissertation
LOCATION:West Hall - 438
CONTACT:
END:VEVENT
BEGIN:VEVENT
DTSTAMP:20260403T112823
DTSTART;TZID=America/Detroit:20260416T130000
DTEND;TZID=America/Detroit:20260416T150000
SUMMARY:Lecture / Discussion:Modeling Structure in Unstructured Data: Statistical and Causal Perspectives
DESCRIPTION:Modern machine learning systems are trained on massive amounts of unstructured data such as text\, images\, and sequences. Despite the apparent lack of explicit structure\, they exhibit remarkable abilities to learn patterns\, perform reasoning\, and support decision-making. This apparent paradox raises a central question: what structure do these models recover from unstructured data\, and how can we understand and use it?\n\nThis dissertation investigates how language models (i) represent structure through their architectures\, (ii) learn structure from unstructured data\, and (iii) enable us to leverage this learned structure for principled causal inference with unstructured data.\n\nThe first part develops a statistical perspective on attention mechanisms\, the core building block of modern language models. We show that attention can be interpreted as adaptive mixture-of-experts models. This interpretation enables us to extend attention to model general exponential family-distributed data\, making it capable of modeling complex\, heterogeneous data beyond text. In turn\, this perspective reframes attention as a statistical model\, explaining how it captures complex dependencies and latent structure\, with guarantees on identifiability and generalization.\n\nThe second part examines how such structure arises from unstructured training data. We show that many in-context learning behaviors can emerge directly from co-occurrence patterns in unstructured text\, linking modern models to classical co-occurrence modeling tools like latent factor modeling. At the same time\, we identify the limits of this mechanism: positional structure becomes essential for more complex reasoning tasks. We further demonstrate that training data composition plays a critical role in shaping model behavior and alignment\, with example difficulty acting as a key factor.\n\nThe final part studies how learned representations in language models can be leveraged for causal inference in high-dimensional\, unstructured settings. Our approach identifies causal variables directly within the representation space\, enabling well-defined estimation of causal effects when treatments or outcomes are themselves unstructured. In particular\, we isolate representation directions corresponding to the most causally influential treatment components and the most salient treatment-induced outcome variations.\n\nTogether\, these results provide a unified perspective on how modern machine learning systems extract structure from unstructured data\, and how that structure can be harnessed for rigorous statistical and causal analysis.
UID:147382-21900950@events.umich.edu
URL:https://events.umich.edu/event/147382
CLASS:PUBLIC
STATUS:CONFIRMED
CATEGORIES:Dissertation
LOCATION:West Hall - 438
CONTACT:
END:VEVENT
BEGIN:VEVENT
DTSTAMP:20260403T113109
DTSTART;TZID=America/Detroit:20260429T090000
DTEND;TZID=America/Detroit:20260429T110000
SUMMARY:Lecture / Discussion:Principled Evaluation of Large Language Models: A Statistical Perspective
DESCRIPTION:The rapid progress of large language models has outpaced the development of principled methodologies for their evaluation. This dissertation draws on ideas from psychometrics and statistics to build rigorous\, efficient\, and interpretable evaluation frameworks for modern AI systems. In this talk\, I focus on three contributions that address complementary challenges in LLM evaluation.\n\nFirst\, I present PromptEval\, a method that confronts the problem of prompt sensitivity — the phenomenon whereby minor rephrasing of benchmark questions can substantially alter measured model performance. By combining Item Response Theory with matrix completion\, PromptEval efficiently approximates the full distribution of model performance across hundreds of prompt variations while requiring less than 5% of the total evaluations\, replacing arbitrary single-prompt assessments with statistically robust characterizations of model behavior.\n\nSecond\, I introduce skill-based scaling laws that model LLM performance through latent capabilities such as reasoning and instruction-following. Inspired by factor analysis\, this approach exploits the correlation structure among benchmark tasks to produce scaling predictions that are both more accurate and more interpretable than existing laws\, which typically focus on aggregate validation loss and fail to generalize across model families.\n\nThird\, I present Bridge\, a unified statistical framework that explicitly connects LLM-as-a-Judge evaluations to human assessments. Bridge models the systematic discrepancies between human and LLM judgments through a latent preference score and a linear transformation of divergence-capturing covariates\, enabling principled recalibration of automated scores and formal statistical testing for human–LLM gaps.\n\nTogether\, these contributions advance a vision of AI evaluation as a scientific discipline in its own right — one that demands the same statistical care we expect from the systems being evaluated.
UID:147383-21900951@events.umich.edu
URL:https://events.umich.edu/event/147383
CLASS:PUBLIC
STATUS:CONFIRMED
CATEGORIES:Dissertation
LOCATION:West Hall - 470
CONTACT:
END:VEVENT
END:VCALENDAR