BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//UM//UM*Events//EN
CALSCALE:GREGORIAN
BEGIN:VTIMEZONE
TZID:America/Detroit
TZURL:http://tzurl.org/zoneinfo/America/Detroit
X-LIC-LOCATION:America/Detroit
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20070311T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20071104T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260330T103539Z
DTSTART;TZID=America/Detroit:20260417T150000
DTEND;TZID=America/Detroit:20260417T160000
SUMMARY:Lecture / Discussion: AIM Seminar: Mean-Field Dynamics of Transformers: From Modeling to Clustering and Critical Scaling
DESCRIPTION:Abstract: Self-attention is a central component of modern transformer architectures and is one of the key mechanisms behind the success of Large Language Models. Understanding its mathematical structure is therefore essential for explaining how these models process information and learn useful representations. In this talk\, I will describe an interacting-particle perspective on self-attention\, which has been developed in recent work by several authors as a fruitful framework for analyzing transformer dynamics. Unlike the classical mean-field theory for deep neural networks\, where the particles are neurons and the mean-field limit is tied to overparameterization\, here the particles are tokens whose representations evolve through attention interactions. This mean-field perspective leads to a new viewpoint on transformer dynamics\, with consequences for both theory and practice. In particular\, I will explain how it helps illuminate clustering behavior in deep transformers and the critical temperature scaling laws that arise in many frontier models.\n\nContact: Zhiyan Ding
UID:141904-21889619@events.umich.edu
URL:https://events.umich.edu/event/141904
CLASS:PUBLIC
STATUS:CONFIRMED
CATEGORIES:Mathematics
LOCATION:East Hall - 1084
END:VEVENT
END:VCALENDAR