Research Accomplishments of Latanya Sweeney, Ph.D.

Overview
Medical Informatics
      Scrub
      Datafly
      Genomic identifiability
      Patient-centered management
Database Security
      k-anonymity
Surveillance
      Selective-revelation
      Risk assessment server
      PrivaMix
Vision
      Face de-identification
Biometrics
      Contactless capture
Policy and Law
      Identifiability of de-identified data
      HIPAA assessments
      Privacy-preserving surveillance
Public Education
      Identity angel
      SSNwatch
      CameraWatch
Quantitative assessments

Database Security: k-anonymity

[cite, cite, cite, cite, cite, cite, cite, cite, cite, cite, cite]
Problem Statement: Given person-specific field-structured data, produce a release of the data with scientific guarantees that the individuals who are the subjects of the data cannot be reidentified while the data remain practically useful.
Description: A solution provided in a series of papers is k-anonymity. A release provides k-anonymity if the data for each person cannot be distinguished from the data of at least k-1 other individuals whose data also appear in the release. One paper [cite] examines the re-identifications that remain possible against a k-anonymized release unless its assumptions and accompanying policies are respected. Another paper [cite] achieves k-anonymity by generalizing and suppressing values.
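The definition above can be illustrated with a minimal Python sketch. It is not the algorithm from the cited papers. The records, the field names, the choice of quasi-identifier fields, and the fixed generalization rules (a 3-digit ZIP prefix and a birth decade) are all hypothetical, and for brevity the sketch suppresses whole records rather than individual values.

```python
from collections import Counter

# Hypothetical toy records; field names and values are illustrative only.
RECORDS = [
    {"zip": "02138", "birth_year": 1961, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1964, "sex": "F", "diagnosis": "flu"},
    {"zip": "02141", "birth_year": 1972, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02142", "birth_year": 1975, "sex": "M", "diagnosis": "asthma"},
    {"zip": "90210", "birth_year": 1950, "sex": "M", "diagnosis": "flu"},
]
# Assumed set of fields that could be linked to outside data.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def group_sizes(records, fields):
    """Count how many records share each combination of values on `fields`."""
    return Counter(tuple(r[f] for f in fields) for r in records)


def is_k_anonymous(records, fields, k):
    """True if every combination of values on `fields` occurs at least k times."""
    return all(n >= k for n in group_sizes(records, fields).values())


def generalize(record):
    """Coarsen one record: keep a 3-digit ZIP prefix and the birth decade."""
    out = dict(record)
    out["zip"] = record["zip"][:3] + "**"
    out["birth_year"] = f"{record['birth_year'] // 10 * 10}s"
    return out


def k_anonymize(records, fields, k):
    """Generalize every record, then suppress records in groups smaller than k."""
    generalized = [generalize(r) for r in records]
    sizes = group_sizes(generalized, fields)
    return [r for r in generalized if sizes[tuple(r[f] for f in fields)] >= k]


release = k_anonymize(RECORDS, QUASI_IDENTIFIERS, k=2)
```

In this toy data, no two raw records agree on all three fields, so the raw table is not 2-anonymous. After generalization, the four Massachusetts records fall into two groups of two, and the single remaining record is suppressed because no other record shares its generalized values.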

Scientific Influence and Impact: k-anonymity was the first formal privacy protection model. Its original intention was to thwart the ability to link field-structured databases, but it has been viewed more broadly and, in so doing, has spurred a series of highly cited works. For example, other researchers have proposed efficiencies, alternatives, and hardness proofs [Meyerson, Williams, et al.]. To improve utility, k-anonymity allows the assumption that it need only be enforced on the subset of fields known to lead to re-identifications. L-diversity [Gehrke et al.] poses an alternative motivated by the case in which that subset is chosen incorrectly. T-closeness [Li et al.] poses an alternative that addresses concerns found in l-diversity and vulnerabilities that arise if k-anonymity is applied generally. Most recently, differential privacy [Dwork et al.] poses another alternative, which typically distorts data using randomization and noise, enforced across all values, so that only inexact, commonly occurring information is reported.
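To make the contrast with noise-based approaches concrete, the following is a minimal sketch of one standard differential privacy technique, the Laplace mechanism applied to a count. It is an illustration only and is not drawn from the cited works. The function names, the true count of 42, and the privacy parameter epsilon = 1.0 are assumptions chosen for the example.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng):
    """Report a count with Laplace noise of scale 1/epsilon.

    Adding or removing one person changes a count by at most 1, so the
    sensitivity of the query is assumed to be 1 here.
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
samples = [noisy_count(42, epsilon=1.0, rng=rng) for _ in range(20000)]
average = sum(samples) / len(samples)
```

Each individual report is inexact, but the average over many reports stays close to the true count. A smaller epsilon gives stronger protection at the cost of noisier answers.
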
Other Achievements: ¹²

Recognition. Workshop on Privacy Enhancing Technologies. [cite].

Patent 7,269,578, which includes k-anonymized privacy protection, was issued. [cite]

Three k-anonymity papers ([cite], [cite], and [cite]) have the most citations in a year, with counts still rising, and reached that level in half as many years as the most cited papers from Associate Professors in the School of Computer Science at Carnegie Mellon.

k-anonymity papers ([cite], [cite], [cite]) are among Dr. Sweeney's most cited papers, which jointly have the second highest citation count among the joint counts of Associate Professors in the School of Computer Science at Carnegie Mellon; the count is statistically significant at the 99.9th percentile.

Three of Dr. Sweeney's papers related to k-anonymity ([cite], [cite], [cite]) are among the 5% (58/1156) of papers from Associate Professors in the School of Computer Science at Carnegie Mellon that enabled successful work by others.

Notes

12 See quantitative assessments for more details.

Related links:

Latanya Sweeney's CV
Latanya Sweeney's Home Page
Data Privacy Lab

Fall 2009