Presented By: Department of Statistics Dissertation Defenses

Statistics in the Modern Era: High Dimensions, Decision-Making, and Privacy

Saptarshi Roy

Abstract: The rapid growth of Artificial Intelligence (AI) and the abundance of data collected from edge devices such as cell phones, personal computers, and smartwatches have put the privacy of personal data at risk. As a result, differential privacy (DP), a mathematical framework that protects data privacy by ensuring similar output irrespective of the presence or absence of any individual in the database, has emerged as one of the leading research areas in modern AI. Although DP algorithms have been used in several machine learning problems, including risk minimization, density estimation, and hypothesis testing, theoretical investigation of model selection under the DP framework has remained scarce for high-dimensional data. This is concerning because model selection methods are heavily used on high-dimensional genetic data containing sensitive information that may compromise patients' privacy, which hinders data sharing and delays scientific advancement. In this talk, we propose a differentially private algorithm for model selection in the high-dimensional sparse regression setup. We adopt the well-known exponential mechanism to design a sampling scheme that can identify the true set of features under suitable conditions on the signal strength. In fact, in the low-privacy regime, we show that the minimum signal strength requirement exactly matches the requirement in the non-private setting. Moreover, because sampling from the exponential mechanism directly is computationally intractable, we design a Metropolis-Hastings chain that mixes rapidly to the target distribution to generate private estimates of the model. Our research therefore provides the first private algorithm for model selection that provably achieves high utility along with computational efficiency, allowing scientific discoveries to be shared efficiently with a broader community in support of open science.
