SEGURIDAD DE INFORMACIÓN Y CRIPTOGRAFÍA · INTRODUCTORY ASSIGNMENT Implement in Python binary...
SEGURIDAD DE INFORMACIÓN Y CRIPTOGRAFÍA
Course taught in English
August-December 2014 semester
Elisa Schaeffer
COURSE CONTENTS
Introduction
One-time pads
Mathematical fundamentals
Pseudorandom generation
Key-exchange algorithms
Public-key cryptography
Digital identity and signatures
Block ciphers
Stream ciphers
One-way hashing
Steganography
Case studies: Authentication, E-commerce, E-voting, Anonymous browsing
HOMEWORK ASSIGNMENTS
One-time pads (intro, 5 pts)
Pseudorandomness (report, 7 pts)
Key exchange (program, 7 pts)
Public keys (program, 7 pts)
Digital signatures (program, 10 pts)
Block ciphers (report, 7 pts)
Stream ciphers (report, 7 pts)
Steganography (program, 10 pts)
ANY ATTEMPT TO COPY CODE FROM BOOKS OR WEBSITES OR
OTHER STUDENTS WILL RESULT IN IMMEDIATELY
FAILING THE ENTIRE COURSE
EXAMINATIONS
Midterm and final examination.
No books, notes, or internet access allowed.
BACKGROUND
Pre-WWI: regular research.
WWI provokes a boom, but little gets published.
Since the 1970s: rapid increase in the field.
Seminal textbook: Bruce Schneier’s Applied Cryptography.
SECURITY VS. OBSCURITY
TRUST
Confidentiality = only the intended persons can access the information.
Commitment without disclosure (zero-knowledge) = proving that one has access to some information without revealing the information per se.
BASIC SETUP
Alice Bob
Eve
TERMINOLOGY AND NOTATION
Plaintext: a message, denoted by M.
Encryption: a mechanism that hides the message contents; denoted by E.
Ciphertext: an encrypted message, denoted by C.
Decryption: a mechanism that recovers a hidden message; denoted by D.
E(M) = C
D(C) = D(E(M)) = M
AUTHENTICATION
Knowing with certainty the origin of a message.
INTEGRITY
Knowing with certainty that the message has not been modified.
NONREPUDIATION
Falsely denying having sent a message is not possible.
KEY
A piece of information used when encrypting or decrypting.
Can be the same for both processes (symmetric).
Kept secret (private) or provided for all (public).
CRYPTANALYSIS
Recovering the plaintext without the key.
Also, recovering the key from intercepted ciphertext.
Attempts are called attacks.
ATTACK TYPES
Ciphertext only.
Known plaintext.
Chosen plaintext.
Adaptive chosen plaintext.
Chosen ciphertext.
Chosen key (knowledge about keys).
Rubber-hose (personal threats or deceit).
TYPES OF BREAK-INS
Total break: attacker finds the key.
Global deduction: attacker finds an alternative algorithm.
Instance deduction: attacker recovers a plaintext.
Information deduction: attacker gains information about the key or the plaintext.
TYPES OF SECURITY
Unconditional: no amount of ciphertext is sufficient for an attack.
Not breakable even by brute force.
Computational: not breakable with available resources (current or future); data, storage & processing complexity.
SUBSTITUTION CIPHERS
Each symbol (or block of symbols) of the plaintext alphabet is substituted by another symbol (or block of symbols).
These can be chained one after another.
For example, ROT13 (very insecure).
Often attacked by means of statistics.
TRANSPOSITION CIPHERS
Positions of the symbols are shuffled in some systematic way to produce a ciphertext.
Knowing the fashion in which the plaintext was accommodated (along with parameters such as width and height) allows the recovery of the plaintext.
COMBINATIONS
It is very common to combine two or more techniques to create an encryption.
Pre-computer, physical machines were built to implement cryptosystems.
Rotor devices like Enigma.
BASIC GROUP THEORY
Gawron: Groups, Modular Arithmetic, and Cryptography, 2004.
GROUP G = (G, ◦)
G is a set of objects.
◦ is a binary operator.
∀g, h ∈ G, i = g ◦ h ∈ G (closure).
∀g, h, i ∈ G: (g ◦ h) ◦ i = g ◦ (h ◦ i) (associativity).
∃e ∈ G s.t. ∀g ∈ G: e ◦ g = g ◦ e = g (identity).
∀g ∈ G ∃h ∈ G s.t. h ◦ g = g ◦ h = e (inverse).
ORDER
The cardinality of G, denoted by |G|.
That is, the number of elements in the group G.
A group is finite if it has finite order.
CANCELLATION
g ◦ h = g ◦ i ⇒ h = i (left cancellation).
h ◦ g = i ◦ g ⇒ h = i (right cancellation).
ITERATED OPERATIONS
g ◦ g ∈ G
(g ○ g) ○ g = g ○ (g ○ g) ∈ G
We write g^2 = g ○ g, g^3 = g ○ g^2, etc.
g^i refers to a sequence of i applications of ○ to g.
CYCLIC GROUP
∃ g ∈ G s.t. ∀ h ∈ G ∃ i ∈ Z, i ⩾ 0 s.t. g^i = h.
This means that all elements in G are powers of a certain element g ∈ G.
That certain element is called a generator of the group G.
XOR ⨁
The “exclusive-or” is true if and only if exactly one of its arguments is true.
({⟙, ⟘}, ⨁) is a group.
XOR CIPHER
Make a binary key K with the same length as message blocks; denote an individual message block by M.
C = K ⨁ M.
M = K ⨁ C.
This is not at all secure as is.
Broken by counting coincidences.
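The two equations above can be tried out with a minimal Python sketch. The helper name `xor_bytes` and the sample key and messages are illustrative assumptions, not part of the course material. The last lines show one simple reason the scheme is weak when a key is reused: XORing two ciphertexts cancels the key. This is a simpler observation than the coincidence-counting attack itself.

```python
def xor_bytes(key: bytes, block: bytes) -> bytes:
    """XOR two byte strings of the same length."""
    if len(key) != len(block):
        raise ValueError("key and block must have the same length")
    return bytes(k ^ b for k, b in zip(key, block))

key = bytes.fromhex("5a3c")            # illustrative 16-bit key K
message = b"hi"                        # one message block M

cipher = xor_bytes(key, message)       # C = K xor M
recovered = xor_bytes(key, cipher)     # M = K xor C

# Key reuse: C1 xor C2 equals M1 xor M2, so the key drops out entirely.
c1 = xor_bytes(key, b"hi")
c2 = xor_bytes(key, b"yo")
leak = xor_bytes(c1, c2)
```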
ONE-TIME PADS
AT&T 1917 (Mauborgne & Vernam).
Generate a single-use, truly random key.
Elements are numbers representing the alphabet A.
Encrypt by addition modulo n = |A|.
Decrypt by subtraction with the same modulus.
Make two copies of the “pad”.
Distribution, storage, and capacity are problematic.
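A minimal sketch of the addition and subtraction modulo n = |A|, assuming the alphabet A is the 26 uppercase letters. The function names are hypothetical, and Python's `secrets` module is used as a stand-in for a truly random source, which it is not in the strict sense the one-time pad requires.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase      # assumed alphabet A, so n = |A| = 26
N = len(ALPHABET)

def make_pad(length):
    """One random number in [0, n) per message symbol; use each pad only once."""
    return [secrets.randbelow(N) for _ in range(length)]

def encrypt(message, pad):
    if len(pad) < len(message):
        raise ValueError("pad is shorter than the message")
    return "".join(ALPHABET[(ALPHABET.index(m) + k) % N]
                   for m, k in zip(message, pad))

def decrypt(cipher, pad):
    if len(pad) < len(cipher):
        raise ValueError("pad is shorter than the ciphertext")
    return "".join(ALPHABET[(ALPHABET.index(c) - k) % N]
                   for c, k in zip(cipher, pad))

pad = make_pad(5)
cipher = encrypt("HELLO", pad)
plain = decrypt(cipher, pad)           # recovers "HELLO"
```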
INTRODUCTORY ASSIGNMENT
Implement in Python binary one-time pad cryptography using a standard random number generator and the XOR operation.
Pad length and the number of keys in it are both given as parameters for the generation phase.
Each generated pad is stored as two identical files containing all the generated keys.
Both encryption and decryption are fully automated.
Used keys are automatically destroyed.
5 points maximum.
PERMUTATIONS
A composition of permutations is applying first one and then another.
Permutations can be represented as adjacency matrices indicating which element moves to which.
Multiplying such matrices yields the composition of the corresponding permutations.
These matrices and their multiplication form a group.
SHIFT CIPHERS
Take the indexing of the letters as natural numbers to be a permutation on natural numbers.
Caesar’s cipher with a 3-shift.
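As a small illustration, a shift cipher over the uppercase Latin alphabet might look like the sketch below. The function name `shift_cipher` and the sample text are assumptions for the example. A shift of 3 gives Caesar's cipher, and a shift of 13 gives the ROT13 mentioned under substitution ciphers.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_cipher(text, shift):
    """Shift every letter of ALPHABET by `shift` positions; leave other symbols alone."""
    n = len(ALPHABET)
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % n])
        else:
            out.append(ch)
    return "".join(out)

caesar = shift_cipher("ATTACK AT DAWN", 3)     # "DWWDFN DW GDZQ"
original = shift_cipher(caesar, -3)            # shifting back decrypts
rot13 = shift_cipher("HELLO", 13)              # "URYYB"
```

Because 13 + 13 = 26, applying ROT13 twice returns the original text.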
SUBGROUP
H is a subgroup of G if both G and H are groups under the same operation and H ⊆ G.
We denote this by H ⊑ G.
ORDER OF AN ELEMENT
Let i be the smallest positive integer such that g^i = e.
If such an i exists, it is the order of the element g.
If no such i exists, g is said to be of infinite order.
We denote the order of g by ord(g).
GENERATED SUBGROUP
H_g = { h | ∃i s.t. h = g^i }
|H_g| = the order of g
g is the generator of H_g
AN OBSERVATION
In finite groups, any element will generate a subgroup.
This means that there are no elements of infinite order in finite groups.
MORPHISM
Functions f() that map the members of one group (G, ○) to those of another (H, ●) in a way that preserves the structure of the group operation:
f: G → H ⋀ f (g ○ h) = f(g) ● f(h).
MORPHISM FROM Z
∀g ∈ G, G being a group, ɸ : n ↦ g^n is a morphism to the subgroup H_g.
Proving this requires defining g^0 = e and g^(-n) = (g^n)^(-1).
MODULAR ARITHMETIC
Gawron: Groups, Modular Arithmetic, and Cryptography, 2004.
CONGRUENCE
{ x | ∃ y ∈ N such that ((y × a) + b = x) }
is the set of numbers congruent to b mod a:
x ≣ b mod a
THEOREM
a ≣ b mod n ⟺ n | (a - b)
Meaning: a is congruent to b modulo n if and only if (a - b) is divisible by n.
RESIDUE SET
The set {0, 1, 2, ..., n-1} is called the residue set of n.
Each member of a residue set represents an equivalence class of integers.
It makes no structural difference if we add a fixed multiple of n to each element; we choose the minimum non-negative representative (that is, the actual residue).
MODULAR ADDITION
((x mod n) + (y mod n)) mod n = (x + y) mod n
Modular addition mod n within {0, 1, 2, ..., n - 1} forms a group.
MODULAR MULTIPLICATION
((x mod n) × (y mod n)) mod n = (x × y) mod n
Modular multiplication mod n forms a group Z_n on the subset of the residue set of n formed by those elements g that are relatively prime to n, denoted by g ⊥ n.
AN OBSERVATION
(i ⊥ n) ⋀ (j ⊥ n) ⇒
((i × j) mod n) ⊥ n
GCD
Recall the definition.
Recall Euclid’s algorithm, both iterative and recursive formulations.
Work out some examples as tables.
c = GCD(a, b) ⇒
∃x, y ∈ Z such that c = (x × a) + (y × b)
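The iterative and recursive formulations, plus the table of steps, can be sketched in Python as follows. The function names and the example pair (240, 46) are illustrative choices, not taken from the slides.

```python
def euclid_table(a, b):
    """Rows (a', b', t, s) with a' = t * b' + s, one row per step of the algorithm."""
    rows = []
    while b:
        t, s = divmod(a, b)
        rows.append((a, b, t, s))
        a, b = b, s
    return rows

def gcd_iterative(a, b):
    while b:
        a, b = b, a % b
    return a

def gcd_recursive(a, b):
    return a if b == 0 else gcd_recursive(b, a % b)

# Print the worked example as a table of divisions.
for a_, b_, t, s in euclid_table(240, 46):
    print("%d = %d * %d + %d" % (a_, t, b_, s))
```

The last nonzero remainder in the table is the GCD, here 2.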
EUCLID’S EXTENDED ALGORITHM
Recover the values of x and y while coming back up along the table that the original algorithm formulates.
When on the way down you were computing a’ = t × b’ + s, compute on the way up s = (x’ × a’) + (y’ × b’) using whatever you are propagating upwards as x’ and y’. On the way down, you terminate when you hit s = 0; on the way up, you terminate when a’ = a and b’ = b, meaning that you now know the values of x and y.
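One way to express the "way down" and "way up" is recursion: the recursive calls descend through the table, and the return values carry x and y back up. This is a sketch with a hypothetical function name, not necessarily the formulation used in class.

```python
def extended_gcd(a, b):
    """Return (c, x, y) such that c = gcd(a, b) = x * a + y * b."""
    if b == 0:
        return a, 1, 0                      # bottom of the table: a = 1 * a + 0 * b
    c, x1, y1 = extended_gcd(b, a % b)      # way down
    # Way up: c = x1 * b + y1 * (a % b), and a % b = a - (a // b) * b.
    return c, y1, x1 - (a // b) * y1

c, x, y = extended_gcd(240, 46)             # (2, -9, 47): 2 = -9 * 240 + 47 * 46
```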
MODULAR INVERSE
x ⊥ n ⟺ ∃ x^(-1) mod n.
The extended algorithm lets us compute inverse elements under modular arithmetic: x^(-1) ≣ y mod n ⇒ (x × y) ≣ 1 mod n.
Note that necessarily x ⊥ n ⋀ y ⊥ n. Also, ∃a such that (a × n) + (y × x) = 1.
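A sketch of computing the inverse with an iterative variant of the extended algorithm that tracks only the coefficient of x. The function name `mod_inverse` is an assumption for the example. Python 3.8 and later also offer `pow(x, -1, n)` for the same purpose.

```python
def mod_inverse(x, n):
    """Return y in [0, n) with (x * y) % n == 1; raise ValueError if x and n share a factor."""
    # Invariant: r0 ≡ y0 * x (mod n) and r1 ≡ y1 * x (mod n).
    r0, r1 = x % n, n
    y0, y1 = 1, 0
    while r1:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        y0, y1 = y1, y0 - q * y1
    if r0 != 1:
        raise ValueError("x is not relatively prime to n, so no inverse exists")
    return y0 % n

inv = mod_inverse(3, 7)        # 5, because 3 * 5 = 15 ≡ 1 mod 7
```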
COSET
Let G = (G, ○) be a group and H = (H, ○) ⊑ G.
For g ∈ G, we define the (left) coset of H as
g ○ H = { g ○ h | h ∈ H }.
∀g, H: |H| = |g ○ H|; if g ∉ H, then H ⋂ (g ○ H) = ∅.
COSET PARTITIONING
∀a, b ∈ G, H ⊑ G ⇒
((a ○ H = b ○ H) ∨ ((a ○ H) ⋂ (b ○ H) = ∅))
LAGRANGE THEOREM
(H ⊑ G ⋀ |G| = a < ∞ ⋀ |H| = b) ⇒
b | a
COROLLARY
|G| = n; ∀g ∈ G: ord(g) | n
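The corollary can be checked numerically on a small group. The sketch below assumes the multiplicative group modulo 7 (order 6) as the example and uses a hypothetical helper `element_order`; every order it finds divides 6.

```python
from math import gcd

def element_order(g, n):
    """Order of g under multiplication mod n; assumes n >= 2 and gcd(g, n) == 1."""
    if gcd(g, n) != 1:
        raise ValueError("g must be relatively prime to n")
    x, i = g % n, 1
    while x != 1:
        x = (x * g) % n
        i += 1
    return i

n = 7
group = [g for g in range(1, n) if gcd(g, n) == 1]     # [1, 2, 3, 4, 5, 6]
orders = {g: element_order(g, n) for g in group}
divides = all(len(group) % o == 0 for o in orders.values())
```

Elements 3 and 5 have order 6, so each of them generates the whole group.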
EULER’S TOTIENT FUNCTION
ɸ(n) = |{ p ∈ Z s.t. 1 ≤ p ≤ n ∧ p ⊥ n }|
For n prime, trivially ɸ(n) = n - 1.
We know that |Z_n| = ɸ(n).
PRODUCT THEOREM
p ⊥ q ⇒ ɸ(p × q) = ɸ(p) × ɸ(q)
EULER’S THEOREM
a ⊥ n ⇒ a^ɸ(n) ≣ 1 mod n
FERMAT’S LITTLE THEOREM
If p is prime and a ⊥ p, a^(p-1) ≣ 1 mod p.
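The totient, the product theorem, Euler's theorem, and Fermat's little theorem can all be checked by brute force on small numbers. This is a sketch; the helper name `totient` and the moduli 15 and 7 are chosen only for illustration, and the counting approach is far too slow for cryptographic sizes.

```python
from math import gcd

def totient(n):
    """Count the integers p with 1 <= p <= n that are relatively prime to n."""
    return sum(1 for p in range(1, n + 1) if gcd(p, n) == 1)

phi_15 = totient(15)                                   # 8
product_ok = totient(15) == totient(3) * totient(5)    # product theorem, 3 ⊥ 5

# Euler's theorem for every a relatively prime to 15.
euler_ok = all(pow(a, phi_15, 15) == 1
               for a in range(1, 15) if gcd(a, 15) == 1)

# Fermat's little theorem for the prime 7.
fermat_ok = all(pow(a, 7 - 1, 7) == 1 for a in range(1, 7))
```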
LEMMA
n ⊥ (p × q) ⇔ (n ⊥ p ⋀ n ⊥ q)
CHINESE REMAINDER THEOREM
If n = p_1 p_2 ... p_k, where the p_i are the distinct prime factors of n, the system of equations
a_i ≣ x mod p_i
has a unique solution 0 ≤ x < n.
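A sketch of the standard constructive solution, assuming pairwise relatively prime moduli. The function name `crt` and the example system are illustrative; `pow(m, -1, p)` requires Python 3.8 or later.

```python
def crt(residues, moduli):
    """Solve x ≡ a_i (mod p_i) for pairwise coprime moduli; return x in [0, n)."""
    n = 1
    for p in moduli:
        n *= p
    x = 0
    for a, p in zip(residues, moduli):
        m = n // p                       # product of all the other moduli
        x += a * m * pow(m, -1, p)       # m * m^(-1) ≡ 1 mod p and ≡ 0 mod the others
    return x % n

x = crt([2, 3, 2], [3, 5, 7])            # 23
```

Here 23 leaves remainders 2, 3, and 2 when divided by 3, 5, and 7, and it is the only such value below 105.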
ADEQUATE RANDOMNESS
Passes statistical tests for randomness.
Unpredictable.
Incompressible.
Unreproducible.
Secure = withstands attacks.
HOMEWORK
Investigate effects of inadequate randomness in information security.
Write a report; indicate bibliographical references.
Maximum 7 points.
Attempts to recycle content from Wikipedia without checking its validity result in failing the homework assignment.
PROTOCOL
A series of steps to carry out a task between two or more interested parties.
All participants must know and follow the protocol.
It must be nonambiguous and complete.
ARBITRATOR
A disinterested third party.
Trusted to complete a protocol.
An adjudicator is an arbitrator who is not directly involved in the protocol and is only called in to resolve disputes.
SELF-ENFORCING PROTOCOL
Guarantees fairness by design without a need for either an active arbitrator or an adjudicator.
Uncheatable, ideal.
Hard to find.
ATTACKS ON PROTOCOLS
Passive attack: eavesdropping.
Active attack: injection, deletion, replay, interruption, substitution, or alteration of messages.
CHEATERS
One of the parties involved attempts to attack the protocol.
Passive cheater: follows the protocol but attempts to recover more information than intended.
Active cheater: disrupts the protocol as part of an attack.
A GOOD CRYPTOSYSTEM
“All the security is inherent in the knowledge of the key and none is inherent in the knowledge of the algorithm.”
Hence the importance of key management.
ENCRYPTION
Given a plain-text message (m), produce an encoded message (c) using an encoding algorithm (E) with a key (k): E(m, k) = c.
Send the encoded message (c). Using the key (k) and a decoding algorithm (D), recover the original message (m): D(c, k) = m.
MERKLE SECRET KEY PROTOCOL
Bob sends Alice a million messages, each encrypted with its own randomly generated short key (for example, 20-bit). He stores all his messages for now.
Each message has the format (x, K_x), where x is a sequence number and K_x is a long key (for example, 128-bit).
Alice randomly picks a message and breaks it by brute force. She sends c = E(m, K_x) to Bob together with x in plain text. Bob looks up the K_x that corresponds to x and computes m = D(c, K_x).
ONE-WAY FUNCTIONS
For any x, we can compute f(x) with reasonable effort.
For any f(x), it is practically impossible to recover the value of x.
For x ≠ y, f(x) ≠ f(y).
GAME PROTOCOL
1. Alice chooses an even or an odd x (representing heads or tails).
2. Alice sends y = f(x) to Bob.
3. Bob calls heads or tails (even or odd).
4. Alice sends Bob her value x.
5. Bob computes f(x).
6. Bob verifies that f(x) is in fact equal to the y Alice sent.
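The six steps can be walked through in a short sketch. SHA-256 from `hashlib` is used here as an assumed stand-in for the one-way function f, and x is drawn from a large range so that Bob cannot simply try all candidates; both choices are illustrative and not prescribed by the slides.

```python
import hashlib
import secrets

def f(x: int) -> str:
    """Stand-in one-way function: SHA-256 of the decimal representation of x."""
    return hashlib.sha256(str(x).encode()).hexdigest()

# 1. Alice chooses x; its parity encodes heads (even) or tails (odd).
x = secrets.randbits(128)
# 2. Alice sends y = f(x) to Bob as her commitment.
y = f(x)
# 3. Bob calls even or odd.
bob_calls_even = True
# 4. Alice sends x. 5-6. Bob recomputes f(x) and compares it with y.
commitment_ok = (f(x) == y)
bob_wins = commitment_ok and ((x % 2 == 0) == bob_calls_even)
```

Alice cannot change her mind after step 2 unless she finds another x of the opposite parity with the same f(x), which a collision-resistant f makes impractical.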
POSSIBLE FUNCTIONS
It is easy to multiply but hard to factor.
Modular exponentiation is easy, but modular logarithms are hard.
SESSION KEY
Generating a secure key for a given cryptosystem.
Distributing this key among the parties involved.
Establishing secure communications using this key.
VERIFIABLE KEY EXCHANGE
Alice and Bob construct a key together.
Both provide some of the bits; those of Alice are x and those of Bob are y.
Alice sends to Bob f(x) and Bob sends to Alice f(y), where f is a one-way function.
We now need to be able to construct a key (k) from f(x) and y (on Bob’s end) and get the same key using x and f(y) (on Alice’s end), such that k is hard for an eavesdropper to recover.
DIFFIE-HELLMAN PROTOCOL
Alice and Bob publicly choose a prime p and a generator g ∈ Z_p.
Alice chooses x and sends to Bob f(x) = g^x mod p.
Bob chooses y and sends to Alice f(y) = g^y mod p.
K = f(y)^x mod p = f(x)^y mod p.
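The exchange can be traced with deliberately tiny numbers. The values p = 23 and g = 5 and the function name `dh_exchange` are assumptions for illustration only; a real deployment needs a far larger prime.

```python
import secrets

def dh_exchange(p, g, x, y):
    """Return (f(x), f(y), Alice's key, Bob's key) for secrets x and y."""
    fx = pow(g, x, p)            # Alice sends g^x mod p
    fy = pow(g, y, p)            # Bob sends g^y mod p
    k_alice = pow(fy, x, p)      # f(y)^x mod p
    k_bob = pow(fx, y, p)        # f(x)^y mod p
    return fx, fy, k_alice, k_bob

p, g = 23, 5                                 # toy public parameters
x = secrets.randbelow(p - 2) + 1             # Alice's secret in [1, p - 2]
y = secrets.randbelow(p - 2) + 1             # Bob's secret in [1, p - 2]
fx, fy, k_alice, k_bob = dh_exchange(p, g, x, y)
```

With x = 4 and y = 3, the exchanged values are 4 and 10, and both sides arrive at K = 18.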
FINDING A GENERATOR
∀i < (n - 1) such that i | (n - 1): g^i mod n ≠ 1 ⇒
g ∈ Z_n is a generator
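The divisor test can be turned into a small search, sketched here for a prime n ≥ 3. The names `is_generator` and `find_generator` are hypothetical, and trying every i below n - 1 is only practical for toy sizes.

```python
def is_generator(g, n):
    """For prime n: g generates the multiplicative group mod n if
    g^i mod n != 1 for every divisor i of n - 1 with i < n - 1."""
    order = n - 1
    return all(pow(g, i, n) != 1
               for i in range(1, order) if order % i == 0)

def find_generator(n):
    """Smallest generator of the multiplicative group mod a prime n >= 3."""
    for g in range(2, n):
        if is_generator(g, n):
            return g
    raise ValueError("no generator found; is n a prime >= 3?")

g7 = find_generator(7)       # 3
g23 = find_generator(23)     # 5
```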
HOMEWORK
Implement the DH protocol as well as a brute-force attack against it.
Program.
Max. 7 pts.
RSA ALGORITHM
Generate n = p × q, where both p and q are prime.
(e, n) is the public key; c = m^e mod n.
e is to be relatively prime with ɸ(n).
(d, n) is the private key; m = c^d mod n.
Requirement: e × d ≣ 1 mod ɸ(n); ɸ(n) = (p - 1) × (q - 1).
m^(e × d) ≣ m mod n due to Euler’s theorem.
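A toy walk-through of key generation, encryption, and decryption. The primes 61 and 53, the exponent e = 17, and the message 65 are assumptions chosen to keep the numbers readable; real keys use primes hundreds of digits long, and `pow(e, -1, phi)` needs Python 3.8 or later.

```python
from math import gcd

p, q = 61, 53                  # toy primes
n = p * q                      # 3233
phi = (p - 1) * (q - 1)        # 3120

e = 17                         # public exponent, must be relatively prime to phi
if gcd(e, phi) != 1:
    raise ValueError("e must be relatively prime to phi(n)")
d = pow(e, -1, phi)            # private exponent: e * d ≡ 1 mod phi(n)

m = 65                         # message, must satisfy 0 <= m < n
c = pow(m, e, n)               # encryption: c = m^e mod n
recovered = pow(c, d, n)       # decryption: m = c^d mod n
```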
HARD PRIMES
Difficult to factor
GCD(p - 1, q - 1) should be small.
Both p - 1 & q - 1 need to have large prime factors as well as p + 1 & q + 1.
(p - 1) / 2 & (q - 1) / 2 should be prime.
PRACTICAL USES
Computationally heavy, not widely used as a cipher.
Useful for authentication.
Used to establish a secure session key.
Then, some efficient (block) cipher will be used for the rest of the communication.
USE OF RSA FOR SESSION KEYS
Alice requests Trent for a session key to talk to Bob. Trent generates a random key.
Encrypts it once with Alice’s public key and sends it to her. Then encrypts the same key with Bob’s public key and sends it to him.
Both Alice and Bob use their private keys to obtain the session key. Alice and Bob use symmetric-key cryptography in their communication.
MAN IN THE MIDDLE
What if Alice just generates the key and sends it to Bob along with her public key and we get rid of Trent?
An attacker, Mallory, could intercept the public key Alice is sending and could send his own key to Bob, pretending to be Alice, and then send his own key as “Bob’s response” to Alice.
All communications would now pass through Mallory without Alice or Bob noticing.
INTERLOCK PROTOCOL
Rivest & Shamir.
Only sending half the message at a time, taking turns.
The man-in-the-middle attack described no longer works.
Also useful as a mutual authentication protocol, for example, for login.
RSA AUTHENTICATION
Avoid having to send passwords as plaintext.
The host sends Alice a random input.
Alice computes something based on this input and encrypts the result with her private key.
The host decrypts with Alice’s public key, computes, and compares.
If it matches, Alice is trusted to enter the host.
Alice may also challenge the host similarly to be able to trust it.
HOMEWORK ASSIGNMENT
Implement RSA authentication in Python as a login system for a website.
Program.
Max. 7 pts.
KEY EXCHANGE WITH DIGITAL SIGNATURES
We could also sign the messages when exchanging keys to make man-in-the-middle attacks difficult.
The public-key repository (Trent) also signs the public keys, verifying whose key each one is.
DIGITAL SIGNATURES
Using RSA:
Your friend sends you a challenge to encode.
Encode it with your secret key.
Your friend decodes with your public key.
If your friend recovers the challenge, it must have been you all along.
Works because (m^e)^d = (m^d)^e = m mod n.
One should never sign an entire message.
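One common reading of the last remark is to sign a one-way hash of the message rather than the message itself. The sketch below follows that reading as an assumption. It uses a toy key pair (n = 3233, e = 17, d = 2753, from p = 61 and q = 53), and the hash is reduced modulo n only so that it fits this tiny modulus; the function names are hypothetical.

```python
import hashlib

n, e, d = 3233, 17, 2753       # toy RSA key pair, illustration only

def digest(message: bytes) -> int:
    """SHA-256 of the message, reduced mod n to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    return pow(digest(message), d, n)         # "encode" with the private key

def verify(message: bytes, signature: int) -> bool:
    return pow(signature, e, n) == digest(message)   # "decode" with the public key

signature = sign(b"hello")
valid = verify(b"hello", signature)
forged = verify(b"hello", (signature + 1) % n)
```

Verification succeeds for the genuine signature and fails for the altered one, because raising to the power e is a one-to-one mapping modulo n.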
HOMEWORK
Implement a socket-based key exchange that employs RSA-based digital signatures.
Program.
Max. 10 pts.
MULTIPLE KEYS
Secret splitting.
Problems with missing parts.
Secret sharing:
(m, n)-threshold schemes.
With or without cheaters.
With or without a trusted third-party.
Verifiability.
Non-revealing.
ONE-WAY HASHING
A hash function maps an input to a hash value with the purpose of “fingerprinting” the input.
Collision-free: it is unlikely that two inputs produce the same hash value.
One-way: hashing is easy, but it is very hard to recover the input from the hash value.
The hash function itself is publicly available.
MAC = a one-way hash that uses a key.
USAGE WITH PASSWORDS
Instead of storing the password on server-side, store the hash value.
SALT
A mechanism against dictionary attacks on the one-way hash.
A random string, concatenated to a password before hashing.
The salt value is also stored.
UNIX traditionally uses 12 bits of salt.
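A minimal sketch of storing and checking a salted hash. SHA-256, the 16-byte salt, and the function names are assumptions for illustration; production systems usually prefer a deliberately slow function such as `hashlib.pbkdf2_hmac` or `hashlib.scrypt`.

```python
import hashlib
import hmac
import secrets

def hash_password(password: str, salt: bytes = None):
    """Return (salt, hex digest); a fresh random salt is drawn if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.sha256(salt + password.encode()).hexdigest()
    return salt, digest

def check_password(password: str, salt: bytes, stored_digest: str) -> bool:
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)   # constant-time comparison

salt, stored = hash_password("correct horse")    # the server stores salt and digest
ok = check_password("correct horse", salt, stored)
bad = check_password("wrong guess", salt, stored)
```

Because every user gets a different salt, a dictionary of precomputed hashes has to be rebuilt per salt value.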
CASE-STUDY TOPICS (20 PTS)
Password safety.
Online financial transactions and electronic currency.
Electronic voting systems.
Anonymous browsing.
Group assignment.
SAFE PASSWORDS
Generation of secure keys.
Key length versus time to crack it by brute force.
Number of possible keys.
Computational effort of each attempt.
The Chinese lottery.
The birthday attack against one-way hashing.
Malware for gaining passwords.
Keywords versus pass phrases (cf. xkcd).
ELECTRONIC COMMERCE
Web stores.
Online banking.
Digital cash.
Anonymous money (without an audit trail).
Bitcoins.
ELECTRONIC VOTING
Only authorized voters are able to place a vote.
No authorized voter may vote more than once.
It is impossible to determine who voted for whom.
Votes cannot be duplicated.
Votes cannot be altered unnoticed.
Each voter is able to verify that the vote placed was counted.
ANONYMOUS BROWSING
Untraceable navigation of the web.
What for?
How?
STEGANOGRAPHY
Hiding a message within another publicly available message without revealing the presence of the hidden message.
The statistical properties should not be significantly altered to avoid detection.
http://www.garykessler.net/library/steganography.html
http://qaa.ath.cx/PiggyPack.html
HOMEWORK
Prepare in Python a program to hide a text in an audio file and then recover it.
Upload to your blog your own code (or at least significant fragments of it) hidden in three different media files and also three other media files without any hidden messages.
Do not indicate anywhere which ones contain messages or what the messages are.
Document and publish your Python code in plaintext (as is, within a syntax highlighter).
Program, max. 10 pts.
BLOCK CIPHERS
Operate on blocks of plaintext and ciphertext.
The block length is a parameter; typically 64 bits.
A specific plaintext block will always produce the same ciphertext block.
Electronic codebook: list for each block its corresponding ciphertext block.
Easy to break with prolonged eavesdropping when one can access both plaintext and ciphertext.
Repetitive headers/footers are particularly vulnerable.
PADDING
Making the last block “complete” by adding a pattern.
Add the number of padded bytes as the last byte of the last block.
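A sketch of this padding rule for 64-bit (8-byte) blocks. The filler pattern is assumed to be zero bytes, and at least one byte is always added so that the count byte is always present; the function names are hypothetical.

```python
BLOCK = 8    # block length in bytes (64 bits)

def pad(data: bytes, block: int = BLOCK) -> bytes:
    """Fill with zero bytes and put the number of padded bytes in the last byte."""
    count = block - (len(data) % block)      # between 1 and block
    return data + b"\x00" * (count - 1) + bytes([count])

def unpad(padded: bytes) -> bytes:
    """Read the count from the last byte and strip that many bytes."""
    count = padded[-1]
    return padded[:-count]

padded = pad(b"HELLO")          # b"HELLO\x00\x00\x03"
restored = unpad(padded)        # b"HELLO"
```

A message that already fills its last block gets one whole extra block of padding, which keeps unpadding unambiguous.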
ATTACKS
Block replay.
Online banking attack: identify an authorization message with a certain structure and replay it with modifications to some blocks.
Solved by chaining.
Introduction of a feedback mechanism: previous blocks reenter the encryption process; CBC = ⨁ with the preceding ciphertext block before encryption.
CIPHER BLOCK CHAINING
http://cryptome.info/0001/bcm/bcm-f2.jpg
C_i = E_K(M_i ⨁ C_(i-1))
M_i = C_(i-1) ⨁ D_K(C_i)
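The chaining equations can be exercised with a placeholder block cipher. In the sketch below, E_K is a toy (XOR with the key followed by a one-byte rotation) that offers no security at all and is assumed purely so the example is self-contained; the function names, key, and IV are likewise illustrative.

```python
BLOCK = 8

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key, block):
    """Placeholder for E_K: XOR with the key, then rotate left by one byte. NOT secure."""
    mixed = xor(key, block)
    return mixed[1:] + mixed[:1]

def toy_decrypt_block(key, block):
    """Placeholder for D_K: undo the rotation, then XOR with the key."""
    return xor(key, block[-1:] + block[:-1])

def cbc_encrypt(key, iv, plaintext):
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be padded to whole blocks")
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)                       # C_i = E_K(M_i xor C_(i-1))
    return b"".join(out)

def cbc_decrypt(key, iv, ciphertext):
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        c = ciphertext[i:i + BLOCK]
        out.append(xor(prev, toy_decrypt_block(key, c)))   # M_i = C_(i-1) xor D_K(C_i)
        prev = c
    return b"".join(out)

key, iv = bytes(range(8)), b"\x10" * 8
message = b"SAMEBLOKSAMEBLOK"                  # two identical plaintext blocks
ciphertext = cbc_encrypt(key, iv, message)
decrypted = cbc_decrypt(key, iv, ciphertext)
```

The two identical plaintext blocks produce different ciphertext blocks, which is the property that defeats block replay on an electronic codebook.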
IV can be public with no risk.
DATA ENCRYPTION STANDARD
Known as DES and DEA.
Adopted by ANSI in 1981.
A symmetric block cipher with 56-bit keys.
Uses substitutions, permutations, and XORs.
http://www.itl.nist.gov/fipspubs/fip46-1.gif
ONE ROUND OF DES
http://www.jhdl.org/documentation/latestdocs/code/des.gif
ADVANCED ENCRYPTION STANDARD
AES is a replacement for DES.
Originally called Rijndael in a more general form.
Also based on a network of substitutions and permutations.
128-bit blocks.
Key length 128, 192, or 256 bits.
HOMEWORK
Choose a block cipher not discussed in class.
Describe how it operates.
Describe known attacks and weaknesses.
Brief presentation in the next class, with slides.
Max. 7 pts.
STREAM CIPHERS
Operate on streams of plaintext. Produce ciphertext one bit/byte/word at a time. Instead of a key, a keystream is used.
Typically XORed with the plaintext in the same unit that it is produced (bit/byte/word).
Decryption requires an identical keystream.
All security lies within the keystream generator.
The more random it appears, the better.
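A sketch of the keystream idea with one byte as the unit. The generator hashes the key together with a counter using SHA-256; this construction and the function names are assumptions made for illustration and not a vetted cipher design.

```python
import hashlib
from itertools import count

def keystream(key: bytes):
    """Yield keystream bytes from SHA-256(key || counter); illustration only."""
    for counter in count():
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        yield from block

def stream_xor(key: bytes, data: bytes) -> bytes:
    """XOR each data byte with the next keystream byte; the same call decrypts."""
    return bytes(b ^ k for b, k in zip(data, keystream(key)))

ciphertext = stream_xor(b"shared key", b"attack at dawn")
plaintext = stream_xor(b"shared key", ciphertext)
```

Decryption works only because both sides regenerate the identical keystream from the same key.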
SYNCHRONOUS
The keystream is independent of the message stream.
The two keystreams need to be synchronized in order for the decryption to be successful.
Mismatches should never be resolved by returning to a previous state.
Must be deterministic (which is unfortunate).
Will be periodic, but the period must be long.
Vulnerable to insertion attack.
SELF-SYNCHRONIZATION
Each bit of the keystream is a function of a fixed number of previous ciphertext bits.
Known as ciphertext auto key (CTAK).
Obviously, a key also enters into the function.
Vulnerable to playback attack.
COMBINATIONS
Block + stream: OFB (output-feedback).
Interleaving: mixing multiple streams.
HOMEWORK
Choose a stream cipher not discussed in class.
Describe how it operates.
Describe known attacks and weaknesses.
Brief presentation in the next class, with slides.
Max. 7 pts.