When / Where
All occurrences of this event have passed.
This listing is displayed for historical purposes.

Presented By: Frontiers in Scientific Machine Learning (FSML)

FSML Lecture Series: Tokenization for Chemistry

Alex Wadell (University of Michigan)

Abstract:
Molecular foundation models are emerging as a powerful tool for molecular design, materials science, and cheminformatics. By leveraging the transformer architecture, these models attempt to learn the language of chemistry and discover robust molecular embeddings. However, current models are constrained by tokenizers that fail to capture the full breadth of chemical space, or even the full periodic table of elements. In this talk, Alex will introduce smirk, a new tokenizer for molecular foundation models that can represent the entirety of the OpenSMILES specification. He will also discuss performance metrics for tokenizers and the results of his systematic evaluation of thirteen chemistry-specific tokenizers, using N-gram language models as a low-cost proxy for transformer models.
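
To make the coverage problem in the abstract concrete, here is a minimal sketch of a generic atom-wise SMILES tokenizer with a fixed vocabulary. It is not smirk, nor any of the thirteen tokenizers evaluated in the talk. The regular expression, the `[UNK]` placeholder, the function names, and the toy training strings are all assumptions chosen for illustration. The sketch treats each bracketed atom as a single token, so any bracketed atom missing from the training data cannot be represented.

```python
import re

# Illustrative atom-wise pattern (an assumption, not taken from the talk):
# a bracketed atom such as [NH4+] or [Pt] is matched as ONE token.
ATOM_PATTERN = re.compile(
    r"\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|[()=#\-+\\/:~@?>*$.]|%[0-9]{2}|[0-9]"
)
UNK = "[UNK]"  # hypothetical placeholder for out-of-vocabulary tokens


def tokenize(smiles):
    """Split a SMILES string into atom-level tokens."""
    return ATOM_PATTERN.findall(smiles)


def build_vocab(corpus):
    """Collect every token that appears in the training strings."""
    return {tok for smiles in corpus for tok in tokenize(smiles)}


def encode(smiles, vocab):
    """Replace any token outside the vocabulary with UNK."""
    return [tok if tok in vocab else UNK for tok in tokenize(smiles)]


# Toy training set: mostly organic molecules, one salt.
train = ["CCO", "c1ccccc1", "CC(=O)O", "CCCl", "[NH4+].[Cl-]"]
vocab = build_vocab(train)

# A platinum complex, written here only for illustration.
encoded = encode("Cl[Pt](Cl)([NH3])[NH3]", vocab)
print(encoded)
```

In this toy setup, `[Pt]` and `[NH3]` never appeared in training, so they collapse to the same placeholder and the model can no longer tell platinum from ammonia. Growing the vocabulary only postpones the problem, since bracketed atoms can combine element, isotope, chirality, hydrogen count, and charge in a very large number of ways.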
