Presented By: Department of Statistics Dissertation Defenses

Topics in Sequential Decision Making and Algorithmic Fairness

Laura Niss

Defense Flyer
Abstract:
The ability to collect and process data has greatly expanded the areas of application for data-driven inference, prediction, and decision making. How data are collected and modified depends on the ultimate goal. Two areas of research that focus on these questions are sequential decision making and algorithmic fairness. Sequential decision making is the process of a learner choosing an action, observing the outcome, and using this and previous information to determine the next action to take. Algorithmic fairness is the overarching term for settings in which an algorithmic decision is unfair to certain groups or individuals. Biases present in training data may arise from historical inequities or improper representation. This dissertation addresses four problems in these two areas: policies for contaminated stochastic multi-armed bandits, fair representation through convex-hull feasibility sampling, data debiasing, and the implications of a sequential pipeline of fair or biased decisions.

We start in Chapter 2 by considering the stochastic multi-armed bandit problem with the added assumption that rewards can be contaminated some fixed proportion of the time. This reflects sequential decision settings in which rewards come from human responses, so there is no guarantee that an observed reward is drawn from the action's true reward distribution. To account for the contamination, we propose an Upper Confidence Bound (UCB) policy that relies on robust mean estimators. We derive concentration inequalities for these estimators in the contaminated setting and give upper bounds on the regret, showing they are comparable to those of UCB policies in the standard stochastic setting. Through simulations, we show the effectiveness of our policies under different types of contamination.
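The contaminated-bandit idea can be illustrated with a small sketch. The code below is a generic UCB variant that swaps the empirical mean for a median-of-means estimator; it is a toy illustration of the general approach, not the dissertation's actual policy or bounds, and the contamination model (occasionally zeroed rewards) and all parameters are assumptions made up for the demo.

```python
import numpy as np

def median_of_means(x, k=5):
    """Robust mean estimate: split the samples into k blocks and take
    the median of the block means, which resists a small fraction of
    contaminated rewards."""
    x = np.asarray(x, dtype=float)
    blocks = np.array_split(x, min(k, len(x)))
    return float(np.median([b.mean() for b in blocks]))

def robust_ucb(pull, n_arms, horizon, conf=2.0):
    """UCB-style policy with the empirical mean replaced by a robust
    estimator. `pull(a)` returns a (possibly contaminated) reward."""
    rewards = [[] for _ in range(n_arms)]
    for a in range(n_arms):              # pull each arm once to start
        rewards[a].append(pull(a))
    for t in range(n_arms, horizon):
        ucb = [median_of_means(rewards[a])
               + np.sqrt(conf * np.log(t + 1) / len(rewards[a]))
               for a in range(n_arms)]
        a = int(np.argmax(ucb))
        rewards[a].append(pull(a))
    return [len(r) for r in rewards]     # pull counts per arm

# Toy contamination model: arm 1 is best, but 10% of its rewards
# are replaced by 0 (e.g. an inattentive human response).
rng = np.random.default_rng(0)
def pull(a):
    if a == 1 and rng.random() < 0.1:
        return 0.0
    return rng.normal([0.3, 0.7][a], 0.1)

counts = robust_ucb(pull, n_arms=2, horizon=500)
```

Even with 10% contamination, the median-of-means estimate of arm 1 stays well above arm 0's mean, so the policy concentrates its pulls on the better arm.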


Bias in training data is often split into two categories: representation bias and historical bias. Representation bias refers to data with no or limited samples from groups within the target population, and it can result in unfair outcomes for the underrepresented groups. Historical bias refers to unwanted correlations between protected attributes and other features caused by societal inequities; it is an inherent property of the data and cannot be attenuated by collecting more data.

Addressing representation bias, Chapter 3 introduces the convex-hull feasibility sampling problem. Here we develop a framework for sequentially testing whether a known point lies within the convex hull of a set of points with unknown distributions. This captures the question of whether it is possible to sample an equally representative data set across labeled groups when the distributions of the sampling sources are unknown. We provide theoretical results in the 2D setting and simulations of our policy in two and three dimensions.
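The feasibility question has a simple deterministic core: deciding whether a point lies in a convex hull can be posed as a small linear program. The sketch below tests the hull of empirical source means under naive round-robin sampling; the source means, target point, and sampling rule are all invented for illustration, and this is not the sequential testing policy developed in the chapter.

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(target, points):
    """Feasibility LP: is there a convex combination of `points`
    (weights w >= 0 summing to 1) that equals `target`?"""
    points = np.asarray(points, dtype=float)
    n = len(points)
    A_eq = np.vstack([points.T, np.ones(n)])      # combination = target
    b_eq = np.append(np.asarray(target, dtype=float), 1.0)
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n, method="highs")
    return bool(res.success)

# Round-robin sampling from three noisy 2D sources, then test whether
# the target lies in the hull of the empirical source means.
rng = np.random.default_rng(1)
source_means = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
samples = [[] for _ in source_means]
for t in range(300):
    i = t % len(source_means)
    samples[i].append(rng.normal(source_means[i], 0.05))
est_means = [np.mean(s, axis=0) for s in samples]
feasible = in_convex_hull([0.3, 0.3], est_means)
```

With enough samples per source, the hull of the empirical means is close to the hull of the true means, so an interior target point is correctly declared feasible.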

In contrast, Chapter 4 addresses historical bias by proposing a data-debiasing method based on a factor model. The goal is to remove variation caused by protected attributes that is undesirable during training. We compute the correlation between the debiased data and the original protected attributes and show that in ideal cases there is no correlation. We present empirical results in a case study.

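As rough intuition for the debiasing goal, the sketch below removes the linear component of each feature explained by a protected attribute via per-column least squares. The dissertation's factor-model method is more general; the synthetic data and coefficients here are invented for illustration.

```python
import numpy as np

def debias(X, z):
    """Residualize: remove the component of each column of X that is
    linearly explained by the protected attribute z."""
    Z = np.column_stack([np.ones(len(z)), z])     # intercept + attribute
    beta, *_ = np.linalg.lstsq(Z, X, rcond=None)  # per-column regression
    return X - Z @ beta

rng = np.random.default_rng(2)
n = 2000
z = rng.integers(0, 2, size=n).astype(float)      # protected attribute
latent = rng.normal(size=n)                       # legitimate factor
# Features loaded on both the legitimate factor and the attribute.
X = np.column_stack([latent + 0.8 * z, latent - 0.5 * z])
X += 0.1 * rng.normal(size=(n, 2))
X_db = debias(X, z)

corr_before = np.corrcoef(X[:, 0], z)[0, 1]
corr_after = np.corrcoef(X_db[:, 0], z)[0, 1]
```

In this ideal linear case the residuals are exactly orthogonal to z, so the post-debiasing correlation is zero up to numerical precision, mirroring the "no correlation in ideal cases" result.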
Chapter 5 explores how bias across multiple decisions, which we call a pipeline, impacts the final outcome. We show how fair decisions at each decision point can preserve a fair final outcome, and how a single biased decision can preclude fair outcomes further down the pipeline. This highlights the importance of representative data at each training and decision period.
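The pipeline effect can be seen in a two-stage toy simulation: even when the second stage is fair, a biased first screen caps what reaches it. All quantities below (the two groups, the 0.8 penalty, the thresholds) are invented for illustration and are not the dissertation's model.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
group = rng.integers(0, 2, size=n)    # two demographic groups
skill = rng.normal(size=n)            # true qualification

# Stage 1 is a biased screen that penalizes group 1; stage 2 is a
# fair decision based on skill alone.
stage1_biased = (skill - 0.8 * group) > 0.0
stage2_fair = skill > 0.5

hired_after_biased_screen = stage1_biased & stage2_fair
hired_after_fair_screen = (skill > 0.0) & stage2_fair

def rate_gap(hired):
    """Selection-rate gap between group 0 and group 1."""
    return hired[group == 0].mean() - hired[group == 1].mean()

gap_biased = rate_gap(hired_after_biased_screen)
gap_fair = rate_gap(hired_after_fair_screen)
```

The fair second stage cannot recover candidates the biased screen already rejected, so the selection-rate gap persists to the final outcome, while the all-fair pipeline shows essentially no gap.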

Livestream Information

Zoom
April 28, 2022 (Thursday) 10:00am
Meeting ID: 99505201852503909
