BaSoTI 2014, Privacy and Anonymity in the Internet 1
Privacy and Anonymity in the Internet
Peter Sobe
Faculty of Informatics and Mathematics
HTW Dresden, Germany
Lecture Material for the Baltic Summer School (BaSoTI), Riga, Latvia 2014
2
Introduction
This lecture provides insight into how personal and identity-related data is collected and analyzed on the Internet.
A number of techniques are introduced that are typically used to collect and analyze private data, mostly so-called “analytics”.
To protect privacy, infrastructures and cryptographic techniques are discussed: content encryption in combination with anonymization.
3
Structure of the Lecture
Three parts:
1: Introduction and Definitions
2: Observation and analytics techniques
3: Personal data protection and cryptographic anonymity techniques
+ Tutorial
4
Literature for further reading
C. Jensen, C. Potts, Ch. Jensen: Privacy practices of Internet users: Self-reports versus observed behavior. Int. Journal of Human-Computer Studies, 63 (2005), 203-227
O. Berthold, H. Federrath, S. Köpsell: Web MIXes: A system for anonymous and unobservable Internet access. Designing Privacy Enhancing Technologies: Workshop on Design Issues in Anonymity and Unobservability (2000), Springer, Lecture Notes in Computer Science, 2001
R. Dingledine, N. Mathewson, P. Syverson: Tor: The Second-Generation Onion Router. Proceedings of the 13th USENIX Security Symposium, 2004
D.A. Buell, R. Sandhu (Editors): Identity Management. IEEE Internet Computing, Nov./Dec. 2003
D. Kesdogan, C. Palmer: Technical challenges of network anonymity. Computer Communications, Volume 29, Issue 3, 1 February 2006, Pages 306-324, Internet Security
P. Eckersley: How unique is your web browser? Electronic Frontier Foundation, http://panopticlick.eff.org, accessed July 2012
5
Privacy Protection and Anonymity in the Internet
Part 1: Introduction and Definitions
Terms:
• Privacy
• Anonymity
• Pseudonymization
• Digital Identity
6
Introduction
Statement 1:
The Internet by design lacks unified provisions for identifying who communicates with whom; it lacks a well designed identity infrastructure. …
cited from: J. Camenisch (IBM Research Zurich), R. Leenes (Tilburg University), M. Hansen, J. Schallaböck (Unabhängiges Landeszentrum für Datenschutz): An Introduction to Privacy-Enhancing Identity Management. In: LNCS 6545, Pages 3-21, Springer, Berlin Heidelberg, 2011
7
Introduction
Statement 2:
At first glance those procedures based on personal contact or paper are transformed into digital procedures for use online. But below the surface, more fundamental differences between the offline and the online world exist, such as the relative permanence of memories and the ease with which experiences can be shared between many actors across time and space barriers.
cited from: J. Camenisch (IBM Research Zurich), R. Leenes (Tilburg University), M. Hansen, J. Schallaböck (Unabhängiges Landeszentrum für Datenschutz): An Introduction to Privacy-Enhancing Identity Management. In: LNCS 6545, Pages 3-21, Springer, Berlin Heidelberg, 2011
8
Introduction
Internet (online world) compared to the real world (offline world):
• often fewer (or no) identifying features that are apparent to the user
• identities based on user logins, email addresses, certificates
• several (role-specific) identities for a person are possible
• identification may take place by hidden techniques (not apparent to the user)
• automated data collection, at several places and over a long time
• an identity and its behaviour can be observed using automated techniques
9
What identifies a person?
The combination of (sex, postal code, birth date) is unique for 87% of U.S. citizens (1990: 248 million).
L. Sweeney: Uniqueness of Simple Demographics in the U.S. Population, LIDAP-WP4, Carnegie Mellon University, Laboratory for International Data Privacy, Pittsburgh, PA, 2000.
A few apparently trivial pieces of information, provided in addition, are enough to identify a person uniquely.
Today: name, birth date, email address, (telephone number,) country (registration data for Google services, Facebook, etc.)
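The effect of combining attributes can be illustrated with a minimal Python sketch. It counts which share of records is unique for a given attribute combination. The records, the field names and the function `unique_fraction` are made up for illustration and are not taken from the Sweeney study.

```python
from collections import Counter


def unique_fraction(records, keys):
    """Return the fraction of records whose combination of `keys` occurs exactly once."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Hypothetical toy data (four invented persons).
people = [
    {"sex": "f", "zip": "01069", "birth": "1980-03-14"},
    {"sex": "f", "zip": "01069", "birth": "1980-03-14"},
    {"sex": "m", "zip": "01069", "birth": "1975-11-02"},
    {"sex": "f", "zip": "01187", "birth": "1991-07-30"},
]

print(unique_fraction(people, ["sex"]))                  # 0.25
print(unique_fraction(people, ["sex", "zip", "birth"]))  # 0.5
```

Each added attribute splits the population into smaller groups, so the share of persons who are alone in their group can only grow.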
10
Privacy
Privacy is the ability to control the dissemination of personal data (attributes of a digital identity) and to keep it at the level of the offline world.
Offline World:
• Use of information for contacts that can be conducted anonymously
• Physical barriers prevent insights into the private sphere
• Space and time barriers make it hard to combine information
• Typically, a person notices and distinguishes between a private and a public context
11
Privacy
Relations and procedures are transferred to the web …
Web - Online World:
• for web contacts, more information than necessary is often collected, due to the absence of widely accepted / uniform authentication techniques
• anonymous operation is often seen as suspicious
• physical barriers that formerly prevented the combination of identity-related information disappear
• it is hard to determine whether web techniques collect personal information and whether someone combines this information with the user's identity
12
Privacy – An example of a privacy problem
Mobile web applications and location-based services
A location-based service takes the geographical position as one of its input variables, e.g. display of a map, location-adaptive information services.
Technical basis:
Determination of the position using infrastructures such as GPS, WLAN, QR codes
W3C Geolocation API:
Returns a position object including longitude, latitude (WGS84), date, time and precision; optional: altitude, velocity, orientation
Position + User Identification (Name, ID number) possibly violates privacy
W3C Geolocation API, Security and Privacy Policies:
• position determination requires agreement of the user
• no storage of geographical data
• deletion of positions when the process / app terminates
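The three policies can be sketched in a few lines of Python. This is not the W3C Geolocation API itself (which is a browser API); the classes `Position` and `LocationSession` and all field names are hypothetical and only model the rules listed above: agreement before determination, no persistent storage, deletion on termination.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Position:
    """Fields modelled on the position object described above (WGS84 coordinates)."""
    latitude: float
    longitude: float
    accuracy_m: float
    timestamp: float
    altitude: Optional[float] = None


class LocationSession:
    """Hypothetical wrapper (not part of the W3C API) applying the three policies."""

    def __init__(self, user_agreed: bool) -> None:
        self.user_agreed = user_agreed
        self._positions: List[Position] = []  # held in memory only, never persisted

    def record(self, position: Position) -> Position:
        if not self.user_agreed:
            raise PermissionError("user has not agreed to position determination")
        self._positions.append(position)
        return position

    def count(self) -> int:
        return len(self._positions)

    def close(self) -> None:
        self._positions.clear()  # delete positions when the app terminates

    def __enter__(self) -> "LocationSession":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()


riga = Position(latitude=56.95, longitude=24.11, accuracy_m=50.0, timestamp=1404200000.0)
with LocationSession(user_agreed=True) as session:
    session.record(riga)
    print(session.count())  # 1: one position held during the session
print(session.count())      # 0: positions are deleted when the session ends
```

The sketch deliberately stores no user name or ID next to the position, since it is the combination of both that may violate privacy.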
13
Anonymity
Anonymization is a process that alters person-related data in such a way that
• a reliable relation to a civil identity cannot be established
• different transactions initiated by the same person cannot be related to one another
• the activity of an identity becomes unobservable
Examples:
• IP addresses behind a NAT router (not completely)
• Usage of a proxy that handles a transaction on behalf of a client
Anonymity is not an absolute feature (given or not given); rather, it relates to some assumptions about the observer.
In addition, anonymity provided by cryptographic techniques depends on the attacker's strength.
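That anonymity depends on the observer can be made concrete with a small Python sketch. It groups senders by what an observer is able to see; senders in the same group are indistinguishable to that observer. The flows, the cookie values and the function `anonymity_sets` are illustrative assumptions, and the cookie stands in for any additional feature an observer might see.

```python
from collections import defaultdict


def anonymity_sets(flows, observe):
    """Group senders by the features the observer can see in each flow."""
    groups = defaultdict(set)
    for flow in flows:
        groups[observe(flow)].add(flow["sender"])
    return dict(groups)


# Hypothetical flows of three clients behind one NAT router (shared public address).
flows = [
    {"sender": "alice", "public_ip": "203.0.113.7", "cookie": "c1"},
    {"sender": "bob", "public_ip": "203.0.113.7", "cookie": "c2"},
    {"sender": "carol", "public_ip": "203.0.113.7", "cookie": "c3"},
]

# Observer 1 sees only the public IP address: all three clients look alike.
ip_only = anonymity_sets(flows, lambda f: f["public_ip"])
print(len(ip_only["203.0.113.7"]))  # 3

# Observer 2 also sees a per-client cookie: every client stands alone.
ip_and_cookie = anonymity_sets(flows, lambda f: (f["public_ip"], f["cookie"]))
print(max(len(s) for s in ip_and_cookie.values()))  # 1
```

This is one way to read the "not completely" remark on NAT: the router hides the internal address, but any further distinguishing feature shrinks the group a client hides in.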
14
Anonymity
Anonymity is sometimes given by construction …
Classical money:
• coins: not trackable
• bank notes: bank note numbers are normally not tracked and not connected to someone who pays with them
Anonymity is required by law (or regulations) for specific tasks:
• elections
• evaluations (teacher evaluation, anonymous review of conference articles)
Computer network infrastructures for these applications have to provide anonymity, in terms of unobservability.
15
Pseudonymization
Pseudonymization changes significant attributes of an identity
• in a way that makes the assignment to a civil identity impossible without knowledge of how the attributes were changed
Examples:
• student matriculation number instead of the real name
• a car license plate number instead of the owner's name and address
Several activities of a pseudonymized identity can be related to one another.
Pseudonymization is required by law for specific situations:
• medical systems and hospital information systems
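One possible way to realize these properties is a keyed hash. The Python sketch below is an assumption for illustration, not the method prescribed by the lecture: the function `pseudonym`, the key and the names are made up. The secret key plays the role of the "knowledge of how the attributes were changed".

```python
import hashlib
import hmac


def pseudonym(identity: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from an identity using HMAC-SHA256 (truncated)."""
    mac = hmac.new(secret_key, identity.encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()[:16]


# Hypothetical key held only by the party that is allowed to re-identify.
key = b"hypothetical-secret-held-by-the-registrar"

visits = [
    ("Anna Schmidt", "2014-07-01"),
    ("Ben Weber", "2014-07-02"),
    ("Anna Schmidt", "2014-07-09"),
]
log = [(pseudonym(name, key), date) for name, date in visits]

# The same person always receives the same pseudonym, so activities stay linkable.
print(log[0][0] == log[2][0])  # True
print(log[0][0] == log[1][0])  # False
```

Unlike anonymization, the log remains linkable per person, which is what a hospital information system needs; whoever holds the key can recompute the mapping for a known name.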
16
Digital Identity
A digital identity
• describes a person or an object in a way that it can be distinguished from others reliably
• contains information (attributes) describing the person or the object and relationships to others
Examples:
• car identification record (in Germany), which contains all the owners (if more than one), the current number plate, and the chassis number
A digital identity is often used to authorize an action or to authorize access to data or to a system.
17
Digital Identity
Typically, persons or objects use different parts of their identity for different purposes.
E-Shop-Customer: {name, date-of-birth, banking account, customer-id}
Employee: {name, date-of-birth, banking account, health insurance number, tax number}
Hospital patient: { …, sex, weight, genetic diseases, … }
Some of the attributes relate to the real person and need to be protected.
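The attribute sets above can be read as role-specific views of one identity. A minimal Python sketch, with made-up attribute values and a hypothetical helper `partial_identity`, shows how each role discloses only its own subset:

```python
# Hypothetical full identity; all values are invented for illustration.
full_identity = {
    "name": "Anna Schmidt",
    "date_of_birth": "1980-03-14",
    "bank_account": "DE00 0000 0000 0000 0000 00",
    "customer_id": "C-1042",
    "health_insurance_number": "H-77",
    "tax_number": "T-99",
}

# Attribute sets per role, following the two complete examples above.
ROLE_ATTRIBUTES = {
    "e_shop_customer": {"name", "date_of_birth", "bank_account", "customer_id"},
    "employee": {"name", "date_of_birth", "bank_account",
                 "health_insurance_number", "tax_number"},
}


def partial_identity(identity, role):
    """Return only the attributes that the given role is meant to disclose."""
    allowed = ROLE_ATTRIBUTES[role]
    return {k: v for k, v in identity.items() if k in allowed}


shop_view = partial_identity(full_identity, "e_shop_customer")
print(sorted(shop_view))  # ['bank_account', 'customer_id', 'date_of_birth', 'name']
```

The e-shop never sees the tax number, and the employer never sees the customer ID; attributes shared by several roles (name, date of birth) are the ones that allow the views to be linked.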
18
Digital Identity
Relation to data protection and security
[Figure: diagram relating digital identity, data protection and security]
• data protection requires security techniques
• identity is data that must be protected
• security requires trustworthy communication partners (identities)
19
Digital Identity
Technically, an identity that can be trusted is provided by a certificate.
Certificates are data records containing identity-related data that is signed by a trusted third party, the certification authority.
A certificate is commonly a signature over identity data and a public key of the identity. When the private key of the identity is kept secret, the certificate can be used to check the validity of digital signatures created by that identity.
Needed infrastructures: PKI, Identity Management
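The idea of a certificate as a signature over identity data plus a public key can be sketched in Python with textbook RSA. This is a toy under loud assumptions: the key (p = 61, q = 53) is far too small to be secure, there is no padding, and the record layout and the names `issue_certificate` and `check_certificate` are invented. Real certificates follow standards such as X.509 and need a cryptographic library.

```python
import hashlib

# Toy key of the certification authority: n = 61 * 53, e * d = 1 mod 3120. NOT secure.
CA_N, CA_E, CA_D = 3233, 17, 2753


def digest_mod(data: bytes, n: int) -> int:
    """Hash the data and reduce it into the range of the toy RSA modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data: bytes, d: int, n: int) -> int:
    return pow(digest_mod(data, n), d, n)


def verify(data: bytes, signature: int, e: int, n: int) -> bool:
    return pow(signature, e, n) == digest_mod(data, n)


def issue_certificate(subject: str, subject_public_key: tuple) -> dict:
    """The CA signs the identity data together with the subject's public key."""
    body = f"{subject}|{subject_public_key}".encode("utf-8")
    return {"subject": subject, "public_key": subject_public_key,
            "signature": sign(body, CA_D, CA_N)}


def check_certificate(cert: dict) -> bool:
    """Anyone who knows the CA's public key (CA_N, CA_E) can check the certificate."""
    body = f"{cert['subject']}|{cert['public_key']}".encode("utf-8")
    return verify(body, cert["signature"], CA_E, CA_N)


# Hypothetical subject with a toy public key (n, e).
cert = issue_certificate("partner A", (3127, 3))
print(check_certificate(cert))    # True

forged = dict(cert, signature=(cert["signature"] + 1) % CA_N)
print(check_certificate(forged))  # False
```

Only the CA's private exponent can produce a valid signature, while checking needs only the CA's public key; this asymmetry is what a PKI distributes and manages.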
20
Digital Identity, Privacy and Trust
Digital identities are used to create trust between (communication, business, …) partners.
[Figure: partner A (identity) with Certificate A, Certification Authority, partner B (identity) with Certificate B; trust between A and B]
B trusts A and A trusts B with respect to identity and authenticity of messages.
21
Digital Identity, Privacy and Trust
Trust in data privacy for services is another issue.
[Figure: partner A (service) and partner B; trust between them]
• Knowledge: data privacy (privacy policy)
• Control of personal information: opt-in, opt-out
• Credibility of A: authentication as a basis, user reports and public information
22
Privacy Enhancing Technology
Privacy Enhancing Technology (PET) enables the user of communication systems to protect himself or herself from having his or her activities and behaviour traced. PET addresses confidentiality aspects:
– Anonymity of a sender or recipient (hiding the identity of a user),
– Unobservability of communication relations (hiding who is communicating with whom) or
– generally the unlinkability of actions (events).
Taken from: Hannes Federrath: Privacy Enhanced Technologies: Methods – Markets – Misuse. Proc. 2nd International Conference on Trust, Privacy, and Security in Digital Business (TrustBus '05). LNCS 3592, Springer-Verlag, Heidelberg 2005