November 17, 2019


Protecting individual privacy in a data-driven world

With the analytic and predictive power of big data comes the responsibility to respect and protect individual privacy. As citizens, we should hold organizations to account; as data practitioners, we must find intelligent ways to analyze data without violating privacy. Jason McFall discusses privacy risks and surveys leading privacy-preserving analysis techniques.

Talk Title: Protecting individual privacy in a data-driven world
Speakers: Jason McFall (Privitar)
Conference: Strata + Hadoop World
Conf Tag: Making Data Work
Location: London, United Kingdom
Date: June 1-3, 2016
URL: Talk Page
Slides: Talk Slides

As data practitioners, we come to Strata because we are excited by the opportunities to unlock the value in data. But as individuals, we are each sensitive to how our own data is used, and we want our privacy to be respected. We expect organizations to keep our data secure, but we also expect them to use it ethically and not to exploit or leak it. Many citizens are simply unaware of the degree to which their trails of data can reveal highly private information. Meanwhile, organizations are not doing enough to preserve privacy; they need to find privacy-preserving ways to analyze and operationalize data. They may also be exposed to far greater liability from possible customer reidentification than they realize.
Jason McFall surveys the risks around private data and discusses examples of privacy breaches in which well-meaning and responsible organizations inadvertently violated privacy because they didn't understand the threats they faced. These threats include linkage attacks, where joining a dataset to a public dataset can reveal private information about individuals; network graph matching, where an attacker identifies segments of a graph (such as a social graph) and then walks the graph; and the risks of aggregate data, where a single data point often seems innocuous in isolation but many points together can reveal very private information. Real-world examples include mining social network comments, likes, and friend graphs; connecting location information to learn patterns about where a person lives, works, and travels; and exploiting Internet of Things data.
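
To make the linkage attack concrete, here is a minimal sketch in Python. It is not taken from the talk: the records, the field names, and the `link` helper are all hypothetical, and the choice of quasi-identifiers (postcode, birth year, sex) is an assumption made for illustration. It shows how a release with names removed can still be joined to a public register.

```python
# Hypothetical toy data: a "de-identified" release and a public register.
released = [
    {"zip": "02138", "birth_year": 1954, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1987, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02139", "birth_year": 1987, "sex": "M", "diagnosis": "flu"},
]
public = [
    {"name": "Alice Example", "zip": "02138", "birth_year": 1954, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1987, "sex": "M"},
]

# Assumed quasi-identifiers: fields that are not names but are shared by both datasets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link(released_rows, public_rows, keys=QUASI_IDENTIFIERS):
    """Map each public name to a released record when the match is unique."""
    index = {}
    for row in released_rows:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    matches = {}
    for person in public_rows:
        candidates = index.get(tuple(person[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match means reidentification
            matches[person["name"]] = candidates[0]
    return matches


reidentified = link(released, public)
for name, record in reidentified.items():
    print(name, "->", record["diagnosis"])
```

In this toy data only one person is reidentified. The second person shares quasi-identifiers with two released records, so the join alone cannot tell which record is theirs.
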
Jason outlines techniques that enable the safe and effective use of data while preserving privacy: tokenization and masking; generalization and blurring of data (such as k-anonymity); controlled privacy-preserving querying of data (such as differential privacy); homomorphic encryption; and randomized responses for the IoT. He explores the strengths and weaknesses of these approaches and closes with key lessons that individual citizens, organizations, and data scientists need to know about privacy.
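
Two of the techniques named above can be sketched briefly. The Python below is an illustration under stated assumptions, not the speaker's implementation: the records, the `generalize` rule (truncate the postcode, bucket age by decade), and all function names are hypothetical. The first part measures k-anonymity after generalization; the second answers a counting query with Laplace noise, a standard mechanism for differential privacy. Naive floating-point noise like this is known to be unsuitable for production use.

```python
import random
from collections import Counter

# Hypothetical records; "postcode" and "age" are treated as quasi-identifiers.
records = [
    {"postcode": "SW1A1", "age": 34, "has_condition": True},
    {"postcode": "SW1A2", "age": 37, "has_condition": False},
    {"postcode": "SW1B9", "age": 31, "has_condition": True},
    {"postcode": "N1C4A", "age": 52, "has_condition": False},
    {"postcode": "N1C9Z", "age": 58, "has_condition": True},
]


def generalize(record):
    """Blur quasi-identifiers: keep 3 postcode characters, bucket age by decade."""
    decade = (record["age"] // 10) * 10
    return (record["postcode"][:3] + "**", f"{decade}-{decade + 9}")


def k_anonymity(rows):
    """Return k: the size of the smallest group sharing generalized values."""
    groups = Counter(generalize(r) for r in rows)
    return min(groups.values())


def laplace_noise(scale, rng):
    """Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(rows, predicate, epsilon, rng):
    """Noisy count; a counting query has sensitivity 1, so scale = 1 / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in rows if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(42)  # seeded only to make the demo repeatable
print("k after generalization:", k_anonymity(records))
print("noisy count:", dp_count(records, lambda r: r["has_condition"], 1.0, rng))
```

A smaller `epsilon` adds more noise and gives stronger privacy at the cost of accuracy, which is one instance of the trade-offs between these approaches.
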
