When / Where
All occurrences of this event have passed.
This listing is displayed for historical purposes.

Presented By: Michigan Program in Survey and Data Science

MPSDS JPSM Seminar Series - Flexible Formal Privacy for Public Data Curation

Jeremy Seeman - Michigan Institute for Data Science (MIDAS) and Institute for Social Research (ISR), University of Michigan

MPSDS JPSM Seminar Series
November 1, 2023
12:00 - 1:00 pm EDT

In person, room 1070 Institute for Social Research, and via Zoom.
The Zoom call will be locked 10 minutes after the start of the presentation.

Flexible Formal Privacy for Public Data Curation

Researchers rely extensively on public datasets disseminated by official statistics agencies, universities, non-governmental organizations, and other data curators. With the increasing availability of data and computing power come increased threats to privacy, as published statistics can more easily be used to reconstruct sensitive personal data. Formal privacy (FP) methods, like differential privacy (DP), provably limit such information leakage by injecting carefully chosen randomized noise into published statistics. However, the way DP accounts for privacy degradation requires that this noise be injected into every statistic dependent on the confidential dataset. This fails to reflect data curators' needs; social, legal, or ethical requirements; and complex dependency structures between public and confidential datasets. In this talk, I'll discuss statistical methodology that addresses these problems. We propose an FP framework with novel characterizations of disclosure risk when assessing collections of statistics wherein only some statistics are published with DP guarantees. We demonstrate FP properties maintained by our proposed framework, propose data release mechanisms that satisfy our proposed definition, and prove the optimality properties of downstream statistical estimators based on these mechanism outputs. For this talk, I'll discuss a few end-to-end data analysis examples in public health and surveys, showing how theoretical trade-offs between privacy, utility, and computation time manifest in practice when assessing disclosure risks and statistical utility. I'll conclude with a discussion of the implications of this work for survey researchers, focusing on opportunities to incorporate privacy by design in survey planning, experimental design, and other data collection operations.

Jeremy Seeman is a Michigan Data Science Fellow at the Michigan Institute for Data Science (MIDAS) and MPSDS. He recently graduated with his PhD in statistics from Penn State University. Jeremy's research focuses on statistical data privacy, quantitative methods in the social sciences, and social values in data governance. He is the recipient of the U.S. Census Bureau Dissertation Fellowship and the ASA Pride Scholarship. Prior to joining Penn State, Jeremy completed his BS in Physics and MS in Statistics at the University of Chicago, where he was a research fellow at the Center for Data Science and Public Policy.
