For the Love of Big Data


description

Presentation by Bob Sutor at the International Association of Privacy Professionals in Washington, DC, USA, on March 6, 2014. This short presentation was meant to stimulate ideas that would then be complemented by discussions about privacy policies as they relate to Big Data. In that sense, it does not cover every aspect of privacy raised by the issues discussed.

Transcript of For the Love of Big Data

Page 1: For the Love of Big Data

IBM Research – Business Solutions and Mathematical Sciences

For the Love of Big Data

Dr. Bob Sutor, VP, Business Solutions and Mathematical Sciences

Page 2: For the Love of Big Data

© 2014 International Business Machines Corporation

What is Big Data?

Big data is being generated by everything around us.

Every digital process and social media exchange produces it.

Systems, sensors and mobile devices transmit it.

Big data is arriving from multiple sources at amazing velocities, volumes and varieties.

To extract meaningful value from big data, you need optimal processing power, storage, analytics capabilities, and skills.

Page 3: For the Love of Big Data

Why do data scientists want more data, rather than less?

It is there.

Data is the basis of the models we create to explain, predict, and affect behavior.

With more data, our models become more sophisticated and, we hope, more accurate.

How much data is too much data?

Page 4: For the Love of Big Data

What issues can analytics present?

Are all aspects of privacy, anonymization, and liability understood by the practitioners?

If I tell you that you cannot look at some data but you can infer the information (e.g., gender) anyway, is that all right?

What are the rules for working with metadata and summarized data?

How do we process static, collected data together with more real-time, rapidly changing information such as location?
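The inference question on this slide can be made concrete with a toy sketch. It assumes a withheld attribute that correlates with some other field the analyst is allowed to see (a "proxy"). The function names, the proxy values `s1`/`s2`, and the labels are all hypothetical and are not from the presentation; the point is only that a simple majority count over permitted data may recover what was meant to be hidden.

```python
from collections import Counter, defaultdict


def fit_proxy_model(labelled):
    """Learn, for each proxy value, the most common value of the attribute.

    `labelled` is an iterable of (proxy_value, attribute) pairs taken from
    records where the attribute happens to be known.
    """
    counts = defaultdict(Counter)
    for proxy, attribute in labelled:
        counts[proxy][attribute] += 1
    return {proxy: c.most_common(1)[0][0] for proxy, c in counts.items()}


def infer(model, proxy):
    """Guess the withheld attribute from the proxy; None if the proxy is unseen."""
    return model.get(proxy)


# Hypothetical toy records: (permitted proxy field, attribute that is known here).
known = [("s1", "F"), ("s1", "F"), ("s1", "M"), ("s2", "M"), ("s2", "M")]
model = fit_proxy_model(known)

# A record whose attribute was withheld still exposes its proxy value.
guess = infer(model, "s1")
```

Withholding a column does not by itself prevent this kind of guess, which is why the slide asks whether inference is acceptable rather than whether it is possible.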

Page 5: For the Love of Big Data

Approach to policy can determine outcomes

Reductions in the amount and kinds of data can produce diminished or inaccurate results.

Policy must take into account the value received by individuals for the use of their personal data.

Enforced data localization may decrease analytical completeness unless we can move intermediate results or the site of computation.
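One way to read "move intermediate results" is that each site computes a small summary locally and only the summaries cross the border. The sketch below illustrates that idea for a mean and a population variance. The `PartialStats` type, the function names, and the two example sites are assumptions made for illustration, not something described in the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialStats:
    """Intermediate result that leaves a site instead of the raw records."""
    count: int
    total: float
    total_sq: float


def summarize(values):
    """Runs where the data lives; returns only count, sum, and sum of squares."""
    return PartialStats(
        count=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def combine(parts):
    """Runs centrally; derives the overall mean and population variance."""
    n = sum(p.count for p in parts)
    if n == 0:
        raise ValueError("no data in any partial result")
    mean = sum(p.total for p in parts) / n
    variance = sum(p.total_sq for p in parts) / n - mean * mean
    return mean, variance


# Hypothetical raw data that never leaves its own site.
site_a = [1.0, 2.0, 3.0]
site_b = [4.0, 5.0]

mean, variance = combine([summarize(site_a), summarize(site_b)])
```

The combined result matches what a single pass over all five values would give, so for this kind of statistic localization need not reduce completeness. Analyses that need record-level joins across sites do not decompose this easily.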