When / Where
All occurrences of this event have passed.
This listing is displayed for historical purposes.

Presented By: Student Machine Learning Seminar and Reading Group

Student ML Seminar: Lightning Talks on Attention-based Architectures, Diffusion Models and State Space Models

Michael Hwang, Andrej Leban and Yash Patel

The event will consist of three lightning talks, with the following abstracts.

Attention-based Architectures: We will briefly cover the basic architecture of transformers and the intuitions behind some of their components. As time permits, we will also walk through Andrej Karpathy’s “build-nanogpt” repository to understand some implementation details precisely.

Diffusion Models: Diffusion models, in essence, work by learning to denoise data that has been noised by a prescribed forward process. While much work has been done on different mathematical formulations of this process, the architecture that actually does the learning is often treated separately and relegated to a footnote in the articles. In this lightning talk, I will illustrate the development and workings of the main backbone architectures used in diffusion models, from the ubiquitous U-Net to approaches utilizing Transformers.
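
As a rough illustration of what a "prescribed forward process" can look like, the sketch below assumes the common variance-preserving (DDPM-style) formulation, in which a noised sample is x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. This is an assumption for illustration and is not taken from the talk; the function names and the linear beta schedule are hypothetical choices.

```python
import math
import random


def cumulative_alphas(betas):
    """Return alpha_bar_t = prod_{s<=t} (1 - beta_s) for each step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_sample(x0, t, alpha_bars, rng):
    """Draw x_t from x_0 in one shot (assumed DDPM-style forward process)."""
    scale = math.sqrt(alpha_bars[t])        # how much signal is kept
    sigma = math.sqrt(1.0 - alpha_bars[t])  # how much noise is mixed in
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [scale * x + sigma * e for x, e in zip(x0, eps)]
    return xt, eps


if __name__ == "__main__":
    # Hypothetical linear schedule with 10 steps, chosen only for the demo.
    betas = [0.01 + 0.02 * i for i in range(10)]
    alpha_bars = cumulative_alphas(betas)
    xt, eps = noise_sample([1.0, -0.5, 2.0], 9, alpha_bars, random.Random(0))
    print(xt)
```

In this setup, the backbone network (U-Net or Transformer) would be the part trained to predict `eps` from `xt` and `t`; that network is deliberately left out of the sketch.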

State Space Models: Much recent excitement has centered on sequential data modeling, a field recently dominated by the popular Transformer architecture. Transformers, however, suffer from a memory requirement that is quadratic in sequence length, limiting their use for ever-increasing context window demands. State-space models (SSMs), in particular Mamba, have arisen as a potential replacement for Transformers, offering linear memory scaling and comparable predictive performance. In this talk, we discuss the fundamentals of SSMs from the perspective of linear dynamical systems and the core of the Mamba architecture.
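
To make the linear-dynamical-systems view concrete, here is a minimal sketch of a discrete, time-invariant SSM with a diagonal state matrix: h_k = a * h_{k-1} + b * u_k, y_k = c · h_k. It is an illustrative assumption rather than the speakers' material, and it is not Mamba itself, which additionally makes the parameters depend on the input. The name `ssm_scan` is hypothetical.

```python
def ssm_scan(a, b, c, inputs):
    """Run a diagonal linear SSM over a scalar input sequence.

    a, b, c are lists of equal length (the state dimension).
    The hidden state has fixed size, so per-step memory does not
    grow with the length of `inputs`.
    """
    state = [0.0] * len(a)
    outputs = []
    for u in inputs:
        # State update: h_k = a * h_{k-1} + b * u_k (elementwise).
        state = [a_i * h_i + b_i * u for a_i, h_i, b_i in zip(a, state, b)]
        # Readout: y_k = sum_i c_i * h_k[i].
        outputs.append(sum(c_i * h_i for c_i, h_i in zip(c, state)))
    return outputs


if __name__ == "__main__":
    # Impulse response of a one-dimensional system with decay 0.5.
    print(ssm_scan([0.5], [1.0], [1.0], [1.0, 0.0, 0.0]))
```

The loop carries only the fixed-size `state` from one step to the next, whereas self-attention compares every position with every other position, which is where the quadratic cost mentioned above comes from.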
