Chair: Jason Corso
In person in FRB 2300 and on Zoom:
https://umich.zoom.us/j/95963594618
Passcode: HAZY
Abstract:
While deep learning problems are often motivated as enabling technologies for human-computer interaction---a support robot, for example, must align natural language referents and sensor readings to operate in a human world---the assumptions underlying these works make them poorly suited to real-world human interaction. Specifically, evaluation typically assumes that humans are oracles that provide semantically correct and unambiguous information, and that all such information is equally useful. While this is enforced in controlled experiments via carefully curated datasets, models operating in the wild will need to compensate for the fact that humans are hazy oracles that may provide information that is incorrect, ambiguous, or misaligned with the features learned by the model. For example, given a choice of three mugs, a robot would not be able to satisfy a request to retrieve "the mug," but would be able to retrieve "the orange mug."
A natural question follows: how can we use deep learning models trained under the oracle assumption with hazy humans? We answer this question via a method we call deferred inference, which allows models trained via supervised learning to solicit and integrate additional information from the human when necessary. Deferred inference begins with a method for determining whether the model should defer inference and wait for additional human-provided information. Past work has generally simplified this into one of two questions: is the human-provided information correct, or is the output correct? We find that both approaches are insufficient due to the complex relationship between human inputs, sensor readings, and deep models: low-quality human-provided information may not cause error, while high-quality human-provided information may not correct it. To address this misalignment between input and output error, we introduce Dual-loss Additional Error Regression, or DAER, a method that successfully locates instances where a new human input can reduce error.
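To make the loop concrete, a minimal sketch of deferred inference is given below. It assumes hypothetical placeholders for the model, a DAER-style deferral score (daer_score), a threshold (tau), and a request_new_input helper; none of these names come from the talk itself, and this is an illustration of the general idea rather than the authors' implementation.

# Hedged sketch of a deferred-inference loop. All names (model, daer_score,
# request_new_input, tau) are hypothetical placeholders, not the authors' API.

def deferred_inference(model, daer_score, sensor_input, human_input,
                       request_new_input, tau=0.5, max_deferrals=2):
    """Run the model; if the predicted additional error is high, defer and
    solicit another human input before re-running inference."""
    prediction = model(sensor_input, human_input)
    for _ in range(max_deferrals):
        # daer_score estimates how much error a fresh human input could remove.
        if daer_score(sensor_input, human_input, prediction) < tau:
            break  # confident enough: accept the current prediction
        human_input = request_new_input()            # e.g., a new referring expression
        prediction = model(sensor_input, human_input)
    return prediction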
Although an effective deferral function is necessary to optimize the trade-off between human effort and error, we must additionally consider that the deferral response is itself subject to the effects of hazy oracles. For this reason, we must consider not only how to find error caused by human input but also how to integrate deferral responses and measure the performance of the human-model team. To this end, we introduce aggregation functions that integrate information across multiple inferences and a novel evaluation framework that measures the trade-off between error and additional human effort. Through this evaluation, we show that we can reduce error by up to 48% at a reasonable level of human effort, without any changes to training or architecture.
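A rough sketch of this kind of aggregation and error-versus-effort evaluation might look as follows, assuming hypothetical arrays of per-example deferral scores, errors before deferral, and errors after deferral; again, this is a sketch of the general idea, not the authors' code.

import numpy as np

# Hedged sketch: confidence-weighted aggregation across multiple inferences and
# an error-vs-effort curve traced by sweeping the deferral threshold.

def aggregate(predictions, confidences):
    """Combine per-inference outputs, weighted by model confidence."""
    weights = np.asarray(confidences) / np.sum(confidences)
    return np.average(np.asarray(predictions), axis=0, weights=weights)

def error_effort_curve(scores, errors_initial, errors_after_deferral, thresholds):
    """For each threshold, defer the examples whose score exceeds it and report
    (fraction deferred, mean error) so the trade-off can be plotted."""
    curve = []
    for tau in thresholds:
        defer = scores > tau
        error = np.where(defer, errors_after_deferral, errors_initial).mean()
        curve.append((defer.mean(), error))
    return curve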
Last, we consider how shifting from a dataset-based evaluation to interaction with an individual human affects deferred inference. Specifically, whereas crowdsourced datasets work well for rapid implementation and evaluation of deferral and aggregation functions, they do not accurately model human-computer interaction: the mechanisms used to procure high-quality data shift the input distribution, and the failure to track individual annotators makes the tacit assumptions that all humans are the same and that inputs do not change over time or deferral depth. Through a human-centered experiment, we find that these assumptions do not hold: an ideal deferral function must be calibrated to a specific user, users learn the model over time, and the deferral response is likely to be of lower quality than the initial query. Despite this mismatch with crowdsourced evaluation, we find that our proposed deferral and aggregation functions can still reduce error in practice.