De-identification Project

Preserving Privacy by De-identifying Facial Images

by Elaine Newton, Latanya Sweeney, and Bradley Malin



In the context of sharing video surveillance data, a significant threat to privacy is face recognition software, which can automatically identify known people from a driver's license photo database, for example, and thereby track people regardless of suspicion. This paper introduces an algorithm to protect the privacy of individuals in video surveillance data by de-identifying faces such that many facial characteristics remain but the face cannot be reliably recognized. A trivial solution to de-identifying faces involves blacking out each face. This thwarts any possible face recognition, but because all facial details are obscured, the result is of limited use. Many ad hoc attempts, such as covering eyes or randomly perturbing image pixels, fail to thwart face recognition because of the robustness of face recognition methods. This paper presents a new privacy-enabling algorithm, named k-Same, that scientifically limits the ability of face recognition software to reliably recognize faces while maintaining facial details in the images. The algorithm determines similarity between faces based on a distance metric and creates new faces by averaging image components, which may be the original image pixels (k-Same-Pixel) or eigenvectors (k-Same-Eigen). Results are presented on a standard collection of real face images with varying k.
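The core idea described above can be sketched in a few lines: group each face with at least k - 1 of its nearest neighbors under a distance metric, then replace every face in a group with the group's average. The following is a minimal illustration of the k-Same-Pixel variant, assuming faces are given as flattened pixel vectors; the function name and the greedy seed-based clustering shown here are illustrative simplifications, not the authors' exact implementation.

```python
import numpy as np

def k_same_pixel(faces, k):
    """Greedy sketch of the k-Same-Pixel idea: each output face is the
    pixel-wise average of a cluster of at least k input faces, so a
    de-identified face cannot be linked to fewer than k originals."""
    faces = np.asarray(faces, dtype=float)
    n = len(faces)
    if k > n:
        raise ValueError("need at least k faces to de-identify")
    remaining = list(range(n))
    deidentified = np.empty_like(faces)
    while remaining:
        if len(remaining) < 2 * k:
            # Fewer than 2k faces left: merge them all into one cluster,
            # otherwise a leftover group smaller than k would remain.
            cluster = remaining
            remaining = []
        else:
            seed = remaining[0]
            # Euclidean distance from the seed face to each remaining face.
            dists = np.linalg.norm(faces[remaining] - faces[seed], axis=1)
            nearest = np.argsort(dists)[:k]
            cluster = [remaining[i] for i in nearest]
            remaining = [i for i in remaining if i not in cluster]
        # Replace every face in the cluster with the cluster average.
        deidentified[cluster] = faces[cluster].mean(axis=0)
    return deidentified
```

The k-Same-Eigen variant would average eigenface coefficients instead of raw pixels before reconstructing the images; the clustering step is the same.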

Keywords: Video surveillance, privacy, de-identification, privacy-preserving data mining, k-anonymity, microaggregation

E. Newton, L. Sweeney, and B. Malin. Preserving Privacy by De-identifying Facial Images. IEEE Transactions on Knowledge and Data Engineering, 17(2), February 2005, pp. 232-243.

Earlier version available as: Carnegie Mellon University, School of Computer Science, Technical Report CMU-CS-03-119, Pittsburgh, 2003. (26 pages.)

Related Links

Spring 2005 Data Privacy Lab [De-identification Project]