Privacy and Anonymity in the Internet

Peter Sobe
Faculty of Informatics and Mathematics
HTW Dresden, Germany

Lecture Material for the Baltic Summer School (BaSoTI), Riga, Latvia 2014


Introduction

This lecture shows how personal and identity-related data is collected and analyzed on the Internet.

It introduces a number of techniques that are typically used to collect and analyze private data, most of them so-called “analytics”.

To protect privacy, infrastructures and cryptographic techniques are discussed: content encryption in combination with anonymization.

Structure of the Lecture

Three parts:

1: Introduction and Definitions

2: Observation and analytics techniques

3: Personal data protection and cryptographic anonymity techniques

+ Tutorial

Literature for further reading

C. Jensen, C. Potts, Ch. Jensen: Privacy practices of Internet users: Self-reports versus observed behavior, Int. Journal of Human-Computer Studies, 63 (2005), 203-227

O. Berthold, H. Federrath, S. Köpsell: Web MIXes: A system for anonymous and unobservable Internet access, Designing Privacy Enhancing Technologies: Workshop on Design Issues in Anonymity and Unobservability (2000), Springer, Lecture Notes in Computer Science, 2001

R. Dingledine, N. Mathewson, P. Syverson: Tor: The Second-Generation Onion Router. Proceedings of the 13th USENIX Security Symposium, 2004

D.A. Buell, R. Sandhu (Editors): Identity Management, IEEE Internet Computing, Nov./Dec. 2003

D. Kesdogan, C. Palmer: Technical challenges of network anonymity. Computer Communications, Volume 29, Issue 3, 1 February 2006, Pages 306-324, Internet Security

P. Eckersley: How unique is your web browser? Electronic Frontier Foundation, http://panopticlick.eff.org, accessed July 2012

Privacy Protection and Anonymity in the Internet

Part 1: Introduction and Definitions

Terms:
• Privacy
• Anonymity
• Pseudonymization
• Digital Identity

Introduction

Statement 1:

The Internet by design lacks unified provisions for identifying who communicates with whom; it lacks a well designed identity infrastructure. …

cited from: J. Camenisch (IBM Research Zurich), R. Leenes (Tilburg University), M. Hansen, J. Schallaböck (Unabhängiges Landeszentrum für Datenschutz): An Introduction to Privacy-Enhancing Identity Management, In: LNCS 6545, Pages 3-21, 2011, Springer, Berlin Heidelberg

Introduction

Statement 2:

At first glance those procedures based on personal contact or paper are transformed into digital procedures for use online. But below the surface, more fundamental differences between the offline and the online world exist, such as the relative permanence of memories and the ease with which experiences can be shared between many actors across time and space barriers.

cited from: J. Camenisch (IBM Research Zurich), R. Leenes (Tilburg University), M. Hansen, J. Schallaböck (Unabhängiges Landeszentrum für Datenschutz): An Introduction to Privacy-Enhancing Identity Management, In: LNCS 6545, Pages 3-21, 2011, Springer, Berlin Heidelberg

Introduction

Internet (online world) compared to the real world (offline world)

• often fewer (or no) identifying features that are apparent to the user
• identities based on user logins, email addresses, certificates
• several (role-specific) identities for a person are possible
• identification may take place by hidden techniques (not apparent to the user)
• automated data collection, at several places and over a long time
• an identity and its behaviour can be observed using automated techniques

What identifies a person?

The combination of (sex, postal code, birth date) is unique for 87% of U.S. citizens (1990: 248 million).
L. Sweeney, Uniqueness of Simple Demographics in the U.S. Population, LIDAP-WP4, Carnegie Mellon University, Laboratory for International Data Privacy, Pittsburgh, PA, 2000.

A few apparently trivial pieces of information, provided in addition, are enough to identify a person uniquely.

Today: name, birth date, email address, (telephone number,) country (registration data for Google services, Facebook, etc.)
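
The effect of combining attributes can be tried out on a small scale. The following Python sketch counts which share of records is unique under a chosen set of attributes. The five records are invented for illustration and have nothing to do with the census data used by Sweeney; the function name `unique_fraction` is likewise an assumption of this sketch.

```python
from collections import Counter

# Hypothetical toy records: (sex, postal_code, birth_date). Invented data.
records = [
    ("f", "01069", "1990-04-12"),
    ("m", "01069", "1990-04-12"),
    ("f", "01069", "1985-11-03"),
    ("f", "01069", "1985-11-03"),
    ("m", "01187", "1979-01-30"),
]


def unique_fraction(rows, columns):
    """Return the fraction of rows whose values in `columns` occur exactly once."""
    keys = [tuple(row[i] for i in columns) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(keys)


sex_only = unique_fraction(records, [0])        # no record is unique by sex alone
postal_only = unique_fraction(records, [1])     # one record has a postal code of its own
combined = unique_fraction(records, [0, 1, 2])  # three of five records become unique

print(sex_only, postal_only, combined)
```

Each attribute on its own singles out few records or none, while the combination singles out most of them. This is the same mechanism, on toy data, that the 87% figure describes.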

Privacy

Privacy is the ability to control the dissemination of personal data (attributes of a digital identity) and to keep it at the level of the offline world.

Offline World:
• Information is used for contacts that can be conducted anonymously
• Physical barriers prevent insights into the private sphere
• Space and time barriers make it hard to combine information
• Typically, a person notices and distinguishes between a private and a public context

Privacy

Relations and procedures are transferred to the web …

Web - Online World:
• for web contacts, more information than necessary is often collected, due to the absence of widely accepted, uniform authentication techniques
• anonymous operation is often seen as suspicious
• physical barriers that formerly prevented the combination of identity-related information disappear
• it is hard to determine whether web techniques collect personal information and whether someone combines this information with the user's identity

Privacy – An example of a privacy problem

Mobile web applications and location-based services: services that take the geographical position as one of their input variables, e.g. display of a map, location-adaptive information services

Technical basis:
Determination of the position using infrastructures such as GPS, WLAN, QR codes

W3C Geolocation API:
Returns a position object including longitude, latitude (WGS84), date, time and precision; optional: altitude, velocity, orientation

Position + user identification (name, ID number) possibly violates privacy

W3C Geolocation API – Security and Privacy Policies:
position determination requires agreement of the user, no storage of geographical data, deletion of positions when the process / app terminates
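
The three policy points can be made concrete with a small model. The following Python sketch is not the browser API; it only mirrors the position attributes as named on this slide and shows one way an application could honour agreement, in-memory handling and deletion on termination. The names `Position` and `LocationSession` and the sample coordinates are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Position:
    # Attribute names follow the slide's wording, not the browser API.
    longitude: float                     # WGS84 degrees
    latitude: float                      # WGS84 degrees
    timestamp: float                     # date and time, seconds since the epoch
    precision_m: float                   # estimated error radius in metres
    altitude: Optional[float] = None
    velocity: Optional[float] = None
    orientation: Optional[float] = None


class LocationSession:
    """Hypothetical holder for the positions of one application run."""

    def __init__(self, user_agreed: bool):
        self.user_agreed = user_agreed
        self._positions: List[Position] = []

    def record(self, position: Position) -> None:
        # Policy 1: no position determination without the user's agreement.
        if not self.user_agreed:
            raise PermissionError("user has not agreed to position determination")
        # Policy 2: positions stay in memory and are never written to storage.
        self._positions.append(position)

    def count(self) -> int:
        return len(self._positions)

    def close(self) -> None:
        # Policy 3: delete all positions when the process / app terminates.
        self._positions.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False


with LocationSession(user_agreed=True) as session:
    session.record(Position(longitude=13.74, latitude=51.05, timestamp=0.0, precision_m=25.0))
    held_during_run = session.count()
held_after_exit = session.count()
```

Using a context manager ties the deletion to the end of the run, so the clean-up does not depend on the application remembering to call `close()`.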

Anonymity

Anonymization is a process that alters person-related data in such a way that
• a reliable relation to a civil identity cannot be established
• different transactions initiated by the same person cannot be related to one another
• the activity of an identity becomes unobservable

Examples:
• IP addresses behind a NAT router (not completely)
• Usage of a proxy that handles a transaction on behalf of a client

Anonymity is not an absolute feature (given or not given); rather, it relates to some assumptions about the observer. In addition, anonymity provided by cryptographic techniques depends on the attacker's strength.

Anonymity

Anonymity is sometimes given by construction … classical money
• (coins): not trackable
• (bank notes): bank note numbers are normally not tracked and not connected to someone who pays with them

Anonymity is required by law (or regulations) for specific tasks:
• elections
• evaluations (teacher evaluations, anonymous review of conference articles)

Computer network infrastructures for these applications have to provide anonymity, in terms of unobservability

Pseudonymization

Pseudonymization changes significant attributes of an identity in a way that makes the assignment to a civil identity impossible without knowledge of how the attributes were changed.

Examples:
• a student matriculation number instead of the real name
• a car license plate number instead of the owner's name and address

Several activities of a pseudonymized identity can be related to one another.

Pseudonymization is required by law for specific situations:
• medical systems and hospital information systems
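
One possible way to derive pseudonyms is a keyed hash, sketched below in Python with the standard `hmac` module. This is an assumption of the sketch, not the method behind the examples above: matriculation numbers and license plates are assigned from a register rather than computed. The key, the names and the 16-character truncation are invented for illustration.

```python
import hashlib
import hmac


def pseudonym(identifier: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from an identifier with HMAC-SHA256."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability; the length is an arbitrary choice


# Hypothetical key, known only to the party that performs the pseudonymization.
key = b"hypothetical-secret-held-by-the-pseudonymizing-party"

visit_1 = {"patient": pseudonym("Alice Example", key), "event": "admission"}
visit_2 = {"patient": pseudonym("Alice Example", key), "event": "lab result"}
visit_3 = {"patient": pseudonym("Bob Example", key), "event": "admission"}

# Activities of the same pseudonymized identity can be related to one another.
linkable = visit_1["patient"] == visit_2["patient"]
# Different persons receive different pseudonyms.
distinct = visit_1["patient"] != visit_3["patient"]
# Without the key, the mapping cannot be recomputed from the name alone.
differs_under_other_key = pseudonym("Alice Example", b"another-key") != visit_1["patient"]
```

The key plays the role of the "knowledge of how the attributes were changed": whoever holds it can recompute the pseudonym for a known name, while the records themselves carry no name.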

Digital Identity

A digital identity
• describes a person or an object in a way that it can be distinguished from others reliably
• contains information (attributes) describing the person or the object and relationships to others

Examples: a car identification record (in Germany), which contains all the owners (if more than one), the current number plate and the chassis number

A digital identity is often used to authorize an action or to authorize access to data or to a system

Digital Identity

Typically, persons or objects use different parts of their identity for different purposes.

E-Shop-Customer: {name, date-of-birth, banking account, customer-id}
Employee: {name, date-of-birth, banking account, health insurance number, tax number}
Hospital patient: { …, sex, weight, genetic diseases, … }

Some of the attributes are related to the real person and need to be protected
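
The attribute sets above can be written down as Python sets to see where two partial identities overlap. This is only an illustrative sketch: the dictionary layout and the function name `shared_attributes` are assumptions, and the hospital patient is left out because its attribute list is elided on the slide.

```python
# Partial identities as attribute sets, taken from the two complete examples above.
partial_identities = {
    "e_shop_customer": {"name", "date_of_birth", "banking_account", "customer_id"},
    "employee": {
        "name",
        "date_of_birth",
        "banking_account",
        "health_insurance_number",
        "tax_number",
    },
}


def shared_attributes(role_a: str, role_b: str) -> set:
    """Attributes that appear in both partial identities."""
    return partial_identities[role_a] & partial_identities[role_b]


shared = shared_attributes("e_shop_customer", "employee")
only_employee = partial_identities["employee"] - partial_identities["e_shop_customer"]

print(sorted(shared))         # attributes through which both roles could be linked
print(sorted(only_employee))  # attributes the e-shop never needs to see
```

The shared attributes are the ones through which an observer could relate the two roles to the same person, which is one reason why they need protection.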

Digital Identity

Relation to data protection and security

(Diagram: digital identity, data protection and security, connected pairwise)

• data protection requires security techniques
• identity is data that must be protected
• security requires trustworthy communication partners (identities)

Digital Identity

Technically, an identity that can be trusted is provided by a certificate. Certificates are data records containing identity-related data that is signed by a trusted third party: the certification authority.

A certificate is commonly a signature over identity data and a public key of the identity. As long as the private key of the identity is kept secret, the certificate can be used to check the validity of digital signatures made by that identity.

Needed infrastructures: PKI, Identity Management
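
The structure "signature over identity data and a public key" can be illustrated with a toy example. The following Python sketch uses textbook RSA without padding, with keys built from small well-known Mersenne primes, so that it runs with the standard library alone. It is not secure and it is not how real certificates (for example X.509) are encoded; all names and the record layout are assumptions of this sketch.

```python
import hashlib
import json


def make_keypair(p: int, q: int, e: int = 65537):
    """Toy RSA key pair from two primes. Far too weak for real use."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse, needs Python 3.8 or newer
    return (n, e), (n, d)  # (public key, private key)


def digest_as_int(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data: bytes, private_key) -> int:
    n, d = private_key
    return pow(digest_as_int(data, n), d, n)


def verify(data: bytes, signature: int, public_key) -> bool:
    n, e = public_key
    return pow(signature, e, n) == digest_as_int(data, n)


def encode(identity: dict, subject_public_key) -> bytes:
    """Canonical byte form of the data the certification authority signs."""
    record = {"identity": identity, "public_key": list(subject_public_key)}
    return json.dumps(record, sort_keys=True).encode("utf-8")


def certificate_is_valid(cert: dict, ca_public_key) -> bool:
    signed_data = encode(cert["identity"], cert["public_key"])
    return verify(signed_data, cert["ca_signature"], ca_public_key)


# Hypothetical certification authority and subject.
ca_public, ca_private = make_keypair(2**61 - 1, 2**89 - 1)
alice_public, alice_private = make_keypair(2**107 - 1, 2**127 - 1)

identity = {"name": "Alice Example"}
certificate = {
    "identity": identity,
    "public_key": alice_public,
    "ca_signature": sign(encode(identity, alice_public), ca_private),
}

# A partner first checks the certificate with the CA's public key ...
cert_ok = certificate_is_valid(certificate, ca_public)

# ... and then uses the certified public key to check a signature made by Alice.
message = b"hello from Alice"
message_ok = verify(message, sign(message, alice_private), certificate["public_key"])

# Changing the identity data invalidates the CA's signature.
forged = dict(certificate, identity={"name": "Mallory Example"})
forged_ok = certificate_is_valid(forged, ca_public)
```

In practice the key generation, padding and certificate format would come from a vetted cryptographic library and a PKI; the sketch only shows who signs what and which key is used for which check.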

Digital Identity, Privacy and Trust

Digital identities are used to create trust between (communication, business, …) partners

(Diagram: partner A (Identity) with Certificate A, partner B (Identity) with Certificate B, a Certification Authority, and a Trust relation between A and B)

B trusts A, A trusts B with respect to identity and authenticity of messages

Digital Identity, Privacy and Trust

Trust in the data privacy of services is another issue

(Diagram: partner A (Service) and partner B, linked by Trust, which rests on the following)

• Knowledge: data privacy (privacy policy)
• Control of personal information: opt-in, opt-out
• Credibility of A: authentication as a basis, user reports and public information

Privacy Enhancing Technology

Privacy Enhancing Technology (PET) enables the users of communication systems to protect themselves from having their activities and behaviour traced. PET addresses confidentiality aspects:

– Anonymity of a sender or recipient (hiding the identity of a user),
– Unobservability of communication relations (hiding who is communicating with whom) or
– generally the unlinkability of actions (events).

Taken from: Hannes Federrath: Privacy Enhanced Technologies: Methods – Markets – Misuse. Proc. 2nd International Conference on Trust, Privacy, and Security in Digital Business (TrustBus '05). LNCS 3592, Springer-Verlag, Heidelberg 2005