Quantitative Randomness Measuring Model for Pseudo-Random Functions
## Page 1
By Jyh-haw Yeh
Department of Computer Science
Boise State University
## Page 2
Measuring the correlation between inputs and outputs of complicated functions.
The model was designed for measuring cryptographic algorithms.
Other possible applications:
- Environmental factors vs. gene mutation
- Independent variables vs. natural changes, such as climate, land surface, sea level, etc.
## Page 3
Use neural networks to learn the relationship between a set of inputs and its corresponding set of outputs.
Predict outputs for N other sets of inputs.
Compare the predictions with the real outputs, then generate N chi-square statistics, one for each set of data.
## Page 4
From these N statistics, some quantitative measurements can be formulated.
These measurements indicate how strongly the tested inputs are related to the known outputs.
## Page 5
Cryptographic algorithms:
- For each algorithm, the model generates measurements.
- The measurements indicate how random the algorithm is.
- An algorithm is more secure if it is more random.
- Through this model, security strength among different algorithms can be quantitatively compared.
## Page 6
Natural changes:
- Scientists record a natural change (the dependent variable) over a period of time T - the outputs in our model.
- Over the same time period T, they also record the changes of several other factors (independent variables) that may cause the natural change - the inputs in our model.
- Our model evaluates which factor is most related to the natural change.
## Page 7
Gene mutation:
- Outputs to our model: recorded mutations over a time period T.
- Inputs to our model: environmental factors recorded over the same T - temperature, humidity, ...
- Our model evaluates which factor may be most related to gene mutation.
## Page 8
Raw data generation:
- A data set contains M, say 1,000k, pairs of plaintexts and ciphertexts.
- For each algorithm, generate N, say 101, data sets.
- One data set (the training set) is used for training the networks.
- The other 100 data sets (the testing sets) are used for testing the networks.
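As a sketch, the raw-data generation above might look like the following, with M and N scaled down and a deliberately trivial XOR "cipher" standing in for the algorithm under test (all names here are hypothetical):

```python
# Sketch: generate N data sets of M (plaintext, ciphertext) pairs using a
# placeholder cipher function. M and N are scaled down from the slides'
# 1,000k and 101 for illustration.
import random

def generate_data_sets(cipher, n_sets=5, m_pairs=1000, bits=128, seed=0):
    rng = random.Random(seed)
    data_sets = []
    for _ in range(n_sets):
        pairs = []
        for _ in range(m_pairs):
            p = rng.getrandbits(bits)
            pairs.append((p, cipher(p)))
        data_sets.append(pairs)
    return data_sets

# Placeholder cipher: XOR with a fixed key (NOT secure, illustration only).
KEY = 0x0F0F0F0F0F0F0F0F0F0F0F0F0F0F0F0F
sets = generate_data_sets(lambda p: p ^ KEY)

training_set, testing_sets = sets[0], sets[1:]
print(len(training_set), len(testing_sets))  # 1000 4
```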
## Page 9
Network training: use the training set to train the network.
Network testing: use each testing set to test the networks. For each testing set, there are 1,000k predictions of ciphers.
Observed data generation: 1,000k Hamming distances (HDs) are produced from the 1,000k (prediction, real cipher) pairs.
If the algorithm is truly random, the distribution of these HDs is binomial.
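The observed-data step can be sketched as follows. Since a real run would use the network's outputs, the random stand-ins for predictions and real ciphers below are assumptions for illustration; they behave like a truly random algorithm, so the HDs come out approximately Binomial(128, 1/2):

```python
# Sketch: Hamming distances between predicted and real ciphertexts.
# Block size and pair count are illustrative, not the paper's exact setup.
import random

BLOCK_BITS = 128

def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions where two BLOCK_BITS-wide values differ."""
    return bin(a ^ b).count("1")

random.seed(0)
# Stand-ins for the network's predictions and the real ciphers.
predictions = [random.getrandbits(BLOCK_BITS) for _ in range(10_000)]
real_ciphers = [random.getrandbits(BLOCK_BITS) for _ in range(10_000)]

hds = [hamming_distance(p, c) for p, c in zip(predictions, real_ciphers)]

# For a truly random algorithm each bit agrees with probability 1/2,
# so the HDs follow Binomial(BLOCK_BITS, 0.5) with mean near 64.
mean_hd = sum(hds) / len(hds)
print(mean_hd)
```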
## Page 10
Chi-square analysis: apply chi-square analysis to these 1,000k HDs and generate a statistic V.
- $N$ = 1,000k
- $N_i$: the number of HDs with value i.
- $P_i$: the probability of an HD with value i, for a truly random algorithm.
- $d$: degrees of freedom (or block size).
$$V = \sum_{i=0}^{d} \frac{(N_i - N P_i)^2}{N P_i}$$
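A minimal implementation of this statistic, assuming the binomial probabilities $P_i = \binom{d}{i} / 2^d$ of a truly random algorithm:

```python
# Sketch of the chi-square statistic V = sum_i (N_i - N*P_i)^2 / (N*P_i),
# with P_i the Binomial(d, 1/2) probability of Hamming distance i.
from math import comb

def chi_square_statistic(hds, d):
    """hds: observed Hamming distances; d: block size (degrees of freedom)."""
    n = len(hds)
    counts = [0] * (d + 1)              # N_i: how many HDs equal i
    for hd in hds:
        counts[hd] += 1
    v = 0.0
    for i in range(d + 1):
        p_i = comb(d, i) / 2 ** d       # P_i under a truly random algorithm
        expected = n * p_i
        v += (counts[i] - expected) ** 2 / expected
    return v

# A perfectly binomial sample of 2^8 HDs over an 8-bit block gives V = 0.
sample = [i for i in range(9) for _ in range(comb(8, i))]
print(chi_square_statistic(sample, 8))  # 0.0
```

(In practice, bins with very small expected counts are usually merged before computing V; the sketch omits that refinement.)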
## Page 11
Chi-square analysis:
- A critical statistic value CV can be calculated, based on a pre-picked significance level α.
- If V > CV, the analysis is considered failed, i.e., the data set being tested is statistically non-random, or the algorithm is considered non-random based on the tested data set.
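A sketch of the pass/fail decision. The Wilson-Hilferty approximation used for CV here is an assumption of this example; exact values would come from chi-square tables or a statistics library:

```python
# Sketch: decide pass/fail for one analysis. CV at significance level alpha
# is approximated with the Wilson-Hilferty formula.
from math import sqrt

def critical_value(d, z):
    """Approximate upper chi-square quantile with d degrees of freedom.
    z is the standard-normal quantile for the chosen alpha
    (e.g. z = 1.6449 for alpha = 0.05)."""
    c = 2.0 / (9.0 * d)
    return d * (1.0 - c + z * sqrt(c)) ** 3

def is_non_random(v, d, z=1.6449):
    """The analysis fails (data set considered non-random) when V > CV."""
    return v > critical_value(d, z)

cv = critical_value(128, 1.6449)   # roughly 155 for d = 128, alpha = 0.05
print(is_non_random(200.0, 128))   # True: 200 exceeds CV
print(is_non_random(100.0, 128))   # False
```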
## Page 12
More chi-square analyses:
- Deciding random/non-random from one data set and one chi-square analysis is risky; use 100 or more data sets.
- For each data set, perform many chi-square analyses: one for each bit, each 2-bit portion, each 4-bit portion, ..., up to the whole block (powers of 2).
- Let S be the set of portion sizes used for chi-square analysis.
- For a 128-bit algorithm, there are 25,500 chi-square analyses in total.
$$S = \{1, 2, 4, \ldots, 128\}, \qquad 100 \times \sum_{d \in S} \frac{128}{d} = 25{,}500$$
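The count can be checked directly:

```python
# 100 data sets, and for each portion size d in S a 128-bit block
# contains 128 // d non-overlapping d-bit portions.
S = [1, 2, 4, 8, 16, 32, 64, 128]

total = 100 * sum(128 // d for d in S)
print(total)  # 25500
```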
## Page 13
Generate quantitative measurements: after testing the 100 testing sets, 25,500 statistics are produced.
- $V_d(i,j)$: the statistic for the j-th d-bit analysis in the i-th data set.
- $CV_d$: the critical statistic for a d-bit analysis.
- $FW_d$: the failure weight for a d-bit analysis, with $FW_d \ge FW_{d'}$ if $d \ge d'$. For example, set $FW_d = \frac{d}{\text{block size}}$.
## Page 14
- $FF_d(i)$: the failure frequency of d-bit analyses in the i-th data set.
- Estimated failure rate for the i-th data set:
$$EFR(i) = \sum_{d \in S} FF_d(i) \times FW_d$$
- Estimated Failure Rate:
$$EFR = \frac{\sum_{i=1}^{100} EFR(i)}{100}$$
- $EFR$ represents the expected failure percentage for a data set generated from the algorithm.
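Under the example weights $FW_d = d/\text{block size}$, EFR can be computed as below; the uniform 10% failure frequencies are hypothetical inputs for illustration:

```python
# Sketch of EFR from the definitions above, with hypothetical inputs.
# ff[i][d] stands for FF_d(i), the failure frequency of d-bit analyses
# in data set i; fw[d] is the example weight FW_d = d / block size.
BLOCK = 128
S = [1, 2, 4, 8, 16, 32, 64, 128]
NUM_SETS = 100

fw = {d: d / BLOCK for d in S}          # FW_d = d / block size

def efr_per_set(ff_i):
    """EFR(i) = sum over d in S of FF_d(i) * FW_d."""
    return sum(ff_i[d] * fw[d] for d in S)

def efr(ff):
    """EFR: average of EFR(i) over all data sets."""
    return sum(efr_per_set(ff_i) for ff_i in ff) / len(ff)

# Hypothetical failure frequencies: every d-bit analysis fails 10% of the time.
ff = [{d: 0.10 for d in S} for _ in range(NUM_SETS)]
print(efr(ff))
```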
## Page 15
Estimated Failure Variance $EFV$:
- estimates how bad each (failed) non-random data set is.
- That is, for the tested non-random data sets, the chi-square statistics are about $EFV$ times the critical statistics.
$$EFV = \frac{1}{100} \sum_{i=1}^{100} \; \sum_{d \in S} \; \sum_{j:\, V_d(i,j) > CV_d} FW_d \cdot \frac{V_d(i,j)}{CV_d}$$
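A sketch of the EFV computation: over all failed analyses ($V_d(i,j) > CV_d$), average the weighted ratios $FW_d \cdot V_d(i,j)/CV_d$. The critical values and statistics below are made-up stand-ins, and the 8-bit block keeps the example small:

```python
# Sketch of EFV with hypothetical inputs.
BLOCK = 8
S = [1, 2, 4, 8]
fw = {d: d / BLOCK for d in S}            # example failure weight FW_d
cv = {1: 3.8, 2: 6.0, 4: 9.5, 8: 15.5}    # assumed critical values CV_d

def efv(stats, num_sets):
    """stats: (i, d, v) triples, one per chi-square analysis V_d(i, j)."""
    total = 0.0
    for i, d, v in stats:
        if v > cv[d]:                      # only failed analyses contribute
            total += fw[d] * v / cv[d]
    return total / num_sets

# Two failed analyses in data set 0; everything else passes.
stats = [(0, 8, 31.0), (0, 4, 19.0), (0, 1, 1.0)]
print(efv(stats, num_sets=100))
```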
## Page 16
Both EFR and EFV are relative, not absolute, quantities.
They are used to measure relative security strength among algorithms.
In general, the smaller the values of EFR and EFV, the more random the algorithm.
## Page 17
The measuring methodology described above is called the ANN test (it uses Artificial Neural Networks).
For comparison, two other measuring methodologies, the avalanche test and the plain-cipher test, were also performed.
The observed data for each test:
- Avalanche: the Hamming distance between two ciphertexts whose plaintexts differ by one bit.
- Plain-cipher: the Hamming distance between a plaintext and its ciphertext.
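The avalanche and plain-cipher observed data can be sketched with a toy stand-in cipher (an MD5-based mapping here, purely for illustration):

```python
import hashlib

def toy_cipher(plaintext: int) -> int:
    """Hypothetical 128-bit mapping built from MD5, for illustration only."""
    digest = hashlib.md5(plaintext.to_bytes(16, "big")).digest()
    return int.from_bytes(digest, "big")

def hd(a: int, b: int) -> int:
    """Hamming distance between two 128-bit values."""
    return bin(a ^ b).count("1")

p = 0x0123456789ABCDEF0123456789ABCDEF
c = toy_cipher(p)

# Avalanche: HD between ciphertexts of plaintexts differing in one bit.
avalanche_hd = hd(c, toy_cipher(p ^ 1))

# Plain-cipher: HD between a plaintext and its own ciphertext.
plain_cipher_hd = hd(p, c)

print(avalanche_hd, plain_cipher_hd)
```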
## Page 18
We have measured AES, MD5, and DES, each with 100 ANN tests, 100 avalanche tests, and 100 plain-cipher tests.
When comparing AES and MD5, the portion sizes to be chi-square analyzed are S = {1, 2, 4, ..., 128}; thus there are 255 chi-square analyses in each test.
When comparing all three algorithms, S = {1, 2, 4, ..., 64}, since the block size of DES is 64; thus there are 127 chi-square analyses in each test.
## Page 19
|     | ANN, MD5 | ANN, AES | avalanche, MD5 | avalanche, AES | plain-cipher, MD5 | plain-cipher, AES |
|-----|----------|----------|----------------|----------------|-------------------|-------------------|
| EFR | 12.98%   | 11.91%   | 11.31%         | 10.88%         | 10.48%            | 10.61%            |
| EFV | 1.878    | 2.808    | 2.184          | 1.904          | 3.529             | 1.784             |
|     | ANN, DES | ANN, MD5 | ANN, AES | avalanche, DES | avalanche, MD5 | avalanche, AES | plain-cipher, DES | plain-cipher, MD5 | plain-cipher, AES |
|-----|----------|----------|----------|----------------|----------------|----------------|-------------------|-------------------|-------------------|
| EFR | 12.95%   | 11.92%   | 10.94%   | 12.22%         | 10.20%         | 9.91%          | 6.44%             | 9.58%             | 9.38%             |
| EFV | 1.941    | 1.928    | 2.947    | 2.203          | 2.282          | 1.967          | 1.454             | 3.735             | 1.852             |
## Page 20
A hypothesis: the ANN test is more effective at identifying security weaknesses - more measuring methodologies are needed to solidify this.
What is a good ANN architecture? What is an appropriate parameter setting for the ANN training process?
Should a single ANN or multiple ANNs simulate the encryption mapping?
In the ANN test, what is a good prediction logic?
Besides Hamming distance, are there other ways to generate observed data? Cumulative sums, approximate entropy?
## Page 21
To avoid over- or under-counting the non-randomness, how many different portions within a block should be analyzed in a test?
Besides EFR and EFV, are there other meaningful quantitative measurements?
What comparison strategy should be used if the quantitative measurements give conflicting indications?
What is a fair comparison method for algorithms with different block sizes?
## Page 22
Data from other applications may not be binary.
Unlike with cryptographic algorithms, it may be difficult to gather large amounts of data for other applications.
The model is not meant to predict the future, but to measure the relative correlation among different factors.
Different applications may need to modify the model to varying degrees, and in different ways.