Seminar by Rahul Garg
Anonymization of High Dimensional Data Using Nearest Neighbor Clustering and Perturbation
Rahul Garg
Opera Solutions
Date: Wednesday, September 12th, 2012
Time: 4PM
Venue: CS102.
Abstract:
In this talk I will describe some new anonymization techniques for high-dimensional data sets with a large number of records. These methods combine the advantages of the k-anonymity and perturbation approaches to anonymization. The dataset is first clustered using a nearest-neighbor approach such that each cluster contains at least k points. The nearest neighbors of every point are obtained using a cover tree data structure. The clusters are then perturbed to obtain anonymized data, which is expected to retain most of the statistical properties of the original data.
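As a rough illustration of the two steps named in the abstract (cluster into groups of at least k points, then perturb each group), here is a minimal sketch. It is not the speaker's method: the function names knn_clusters and perturb are hypothetical, a brute-force neighbor search stands in for the cover tree, and the perturbation rule (resampling each point from a Gaussian matching its cluster's per-dimension mean and standard deviation) is an assumption chosen only to make the idea concrete.

```python
import math
import random


def knn_clusters(points, k):
    """Greedily group point indices into clusters of at least k members."""
    if len(points) < k:
        raise ValueError("need at least k points")
    unassigned = set(range(len(points)))
    clusters = []
    while len(unassigned) >= k:
        seed = min(unassigned)  # deterministic choice of the next seed point
        unassigned.remove(seed)
        # Brute-force nearest neighbours of the seed among unassigned points.
        # Assumption: a cover tree would replace this O(n log n) scan.
        nearest = sorted(
            unassigned, key=lambda j: math.dist(points[seed], points[j])
        )[: k - 1]
        unassigned.difference_update(nearest)
        clusters.append([seed] + nearest)
    # Fewer than k points remain: attach each to the cluster holding its
    # closest already-assigned point, so every cluster keeps >= k members.
    for i in sorted(unassigned):
        best = min(
            clusters,
            key=lambda c: min(math.dist(points[i], points[j]) for j in c),
        )
        best.append(i)
    return clusters


def perturb(points, clusters, rng):
    """Replace each point with a draw from its cluster's Gaussian summary."""
    dims = len(points[0])
    out = [None] * len(points)
    for members in clusters:
        n = len(members)
        mean = [sum(points[i][d] for i in members) / n for d in range(dims)]
        std = [
            math.sqrt(sum((points[i][d] - mean[d]) ** 2 for i in members) / n)
            for d in range(dims)
        ]
        for i in members:
            # Assumed perturbation rule: independent per-dimension noise.
            out[i] = tuple(rng.gauss(mean[d], std[d]) for d in range(dims))
    return out


rng = random.Random(0)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)]
groups = knn_clusters(data, k=5)
anonymized = perturb(data, groups, rng)
print(len(groups), "clusters; smallest has", min(len(g) for g in groups), "points")
```

Per-dimension resampling keeps each cluster's means and variances roughly intact but discards correlations between dimensions inside a cluster; a scheme that must preserve those would need a different perturbation step.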
About the speaker:
Dr. Rahul Garg is Principal Scientist at Opera Solutions, where he leads the R&D operation in India. His current areas of work include Predictive Business Analytics, Machine Learning and Algorithms. Prior to this he was at the IBM T.J. Watson Research Center in New York, USA, working on Computational Neuroscience. He established the high-performance computing group at IBM India Research Lab, New Delhi, which was part of the core team designing IBM's Blue Gene supercomputer. Dr. Garg has over 50 peer-reviewed publications in international conferences and journals in areas such as Machine Learning, Neuroscience, Supercomputing, Economics and Game Theory, Communication Networks and Algorithms. He holds a PhD from IIT-Delhi, an MS from UC Berkeley, and a B.Tech. from IIT-Delhi.