BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//UM//UM*Events//EN
CALSCALE:GREGORIAN
BEGIN:VTIMEZONE
TZID:America/Detroit
TZURL:http://tzurl.org/zoneinfo/America/Detroit
X-LIC-LOCATION:America/Detroit
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20070311T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20071104T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260403T112823Z
DTSTART;TZID=America/Detroit:20260416T130000
DTEND;TZID=America/Detroit:20260416T150000
SUMMARY:Lecture / Discussion: Modeling Structure in Unstructured Data: Statistical and Causal Perspectives
DESCRIPTION:Modern machine learning systems are trained on massive amounts of unstructured data such as text\, images\, and sequences. Despite the lack of explicit structure\, these systems exhibit remarkable abilities to learn patterns\, perform reasoning\, and support decision-making. This apparent paradox raises a central question: what structure do these models recover from unstructured data\, and how can we understand and use it?\n\nThis dissertation investigates how language models (i) represent structure through their architectures\, (ii) learn structure from unstructured data\, and (iii) enable us to leverage this learned structure for principled causal inference with unstructured data.\n\nThe first part develops a statistical perspective on attention mechanisms\, the core building block of modern language models. We show that attention can be interpreted as an adaptive mixture-of-experts model. This interpretation enables us to extend attention to model general exponential-family-distributed data\, making it capable of modeling complex\, heterogeneous data beyond text. In turn\, this perspective reframes attention as a statistical model\, explaining how it captures complex dependencies and latent structure\, with guarantees on identifiability and generalization.\n\nThe second part examines how such structure arises from unstructured training data. We show that many in-context learning behaviors can emerge directly from co-occurrence patterns in unstructured text\, linking modern models to classical co-occurrence modeling tools such as latent factor models. At the same time\, we identify the limits of this mechanism: positional structure becomes essential for more complex reasoning tasks. We further demonstrate that training data composition plays a critical role in shaping model behavior and alignment\, with example difficulty acting as a key factor.\n\nThe final part studies how the learned representations in language models can be leveraged for causal inference in high-dimensional\, unstructured settings. Our approach identifies causal variables directly within the representation space\, enabling well-defined estimation of causal effects when treatments or outcomes are themselves unstructured. In particular\, we isolate representation directions corresponding to the most causally influential treatment components and the most salient treatment-induced outcome variations.\n\nTogether\, these results provide a unified perspective on how modern machine learning systems extract structure from unstructured data\, and on how that structure can be harnessed for rigorous statistical and causal analysis.
UID:147382-21900950@events.umich.edu
URL:https://events.umich.edu/event/147382
CLASS:PUBLIC
STATUS:CONFIRMED
CATEGORIES:Dissertation
LOCATION:West Hall - 438
END:VEVENT
END:VCALENDAR