Database Security: k-anonymity
Problem Statement: Given person-specific field-structured data, produce a release of the data with scientific guarantees that the individuals who are the subjects of the data cannot be reidentified while the data remain practically useful.
Description: A solution provided in a series of papers is k-anonymity. A release provides k-anonymity if the data for each person cannot be distinguished from the data of at least k-1 other individuals who also appear in the release. One paper [cite] examines re-identifications that remain possible against a k-anonymized release unless certain assumptions and accompanying policies are respected. Another paper [cite] achieves k-anonymity by generalizing and suppressing values.
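The k-anonymity condition above can be checked mechanically: group the release by its quasi-identifying fields and verify every group has at least k rows. The following sketch illustrates this; the field names and records are hypothetical, not taken from the cited papers.

```python
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k rows of the release."""
    counts = Counter(tuple(row[a] for a in quasi_identifiers) for row in rows)
    return all(c >= k for c in counts.values())

# A release generalized on ZIP code and age range (hypothetical data):
# trailing ZIP digits are suppressed and ages are binned into ranges.
release = [
    {"zip": "021**", "age": "20-29", "diagnosis": "flu"},
    {"zip": "021**", "age": "20-29", "diagnosis": "cold"},
    {"zip": "021**", "age": "30-39", "diagnosis": "flu"},
    {"zip": "021**", "age": "30-39", "diagnosis": "asthma"},
]

print(is_k_anonymous(release, ["zip", "age"], 2))  # True: each group has 2 rows
print(is_k_anonymous(release, ["zip", "age"], 3))  # False: no group has 3 rows
```

The choice of quasi-identifiers matters: the check passes or fails relative to the chosen subset of fields, which is the concern the alternatives discussed below address.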
Scientific Influence and Impact: k-anonymity was the first formal privacy protection model. Its original intention was to thwart the linking of field-structured databases, but it has been viewed more broadly and, in so doing, has spurred a series of highly cited works. For example, other researchers have proposed efficiency improvements, alternatives, and hardness proofs [Meyerson, Williams, et al.]. To improve utility, k-anonymity may be enforced only on the subset of fields known to lead to re-identifications. L-diversity [Gehrke et al.] poses an alternative motivated by the risks that arise if that subset is chosen incorrectly. T-closeness [Li et al.] poses an alternative addressing concerns found in l-diversity and vulnerabilities that arise if k-anonymity is applied generally. Most recently, differential privacy [Dwork et al.] poses another alternative, which typically distorts data using randomization and noise, enforced across all values, to report inexact but commonly occurring information.
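To illustrate the noise-based distortion that differential privacy relies on, the sketch below implements the standard Laplace mechanism for a counting query; this is a generic textbook construction, not drawn from the cited papers, and the epsilon value is an arbitrary choice for illustration.

```python
import math
import random

def laplace_count(true_count, epsilon):
    """Report a count perturbed with Laplace noise of scale 1/epsilon.
    A counting query changes by at most 1 when one person's record is
    added or removed (sensitivity 1), so this release satisfies
    epsilon-differential privacy."""
    scale = 1.0 / epsilon
    # Sample Laplace(0, scale) by inverse-transform sampling.
    u = random.random() - 0.5
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

# A noisy answer to "how many records have diagnosis = flu?"
print(laplace_count(100, epsilon=1.0))  # e.g. a value near 100
```

Unlike k-anonymity, the guarantee here does not depend on choosing the right subset of fields: noise is applied to every released answer, trading exactness for protection against arbitrary background knowledge.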
Other Achievements: 12