Objective Metrics and Gradient Descent Algorithms for … · 2020-06-16 · Objective Metrics and...
Transcript of Objective Metrics and Gradient Descent Algorithms for … · 2020-06-16 · Objective Metrics and...
Objective Metrics and Gradient Descent Algorithms forAdversarial Examples in Machine Learning
Uyeong JangUniversity of WisconsinMadison, [email protected]
Xi Wu∗Google
Somesh JhaUniversity of WisconsinMadison, [email protected]
ABSTRACTFueled bymassive amounts of data, models produced bymachine-
learning (ML) algorithms are being used in diverse domains
where security is a concern, such as, automotive systems, fi-
nance, health-care, computer vision, speech recognition, natural-
language processing, and malware detection. Of particular
concern is use of ML in cyberphysical systems, such as driver-
less cars and aviation, where the presence of an adversary can
cause serious consequences. In this paper we focus on attacks
caused by adversarial samples, which are inputs crafted by
adding small, often imperceptible, perturbations to force a ML
model to misclassify. We present a simple gradient-descent
based algorithm for finding adversarial samples, which per-
forms well in comparison to existing algorithms. The second
issue that this paper tackles is that of metrics. We present a
novel metric based on few computer-vision algorithms for
measuring the quality of adversarial samples.
KEYWORDSAdversarial Examples, Machine Learning
ACM Reference Format:Uyeong Jang, Xi Wu, and Somesh Jha. 2017. Objective Metrics andGradient Descent Algorithms for Adversarial Examples in MachineLearning. In Proceedings of December 4–8, 2017, San Juan, PR, USA(ACSAC 2017,). ACM, New York, NY, USA, 15 pages. https://doi.org/https://doi.org/10.1145/3134600.3134635
1 INTRODUCTIONMassive amounts of data are currently being generated in
domains such as health, finance, and computational science.
Fueled by access to data,machine learning (ML) algorithms are
also being used in these domains, for providing predictions
of lifestyle choices [7], medical diagnoses [17], facial recog-
nition [1], and more. However, many of these models that
are produced by ML algorithms are being used in domains
where security is a big concern – such as, automotive sys-
tems [26], finance [20], health-care [2], computer vision [21],
speech recognition [14], natural-language processing [29], and
cyber-security [8, 31]. Of particular concern is use of ML in
∗Work was done while at University of Wisconsin
Permission to make digital or hard copies of part or all of this work for personal
or classroom use is granted without fee provided that copies are not made or
distributed for profit or commercial advantage and that copies bear this notice
and the full citation on the first page. Copyrights for third-party components of
this work must be honored. For all other uses, contact the owner/author(s).
ACSAC 2017„ 2017© 2017 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-5345-8/17/12. . . $15.00
https://doi.org/https://doi.org/10.1145/3134600.3134635
Figure 1: To humans, these two images appear to be thesame. The image on the left is an ordinary image ofa stop sign. The image on the right was produced byadding a small, precise perturbation that forces a par-ticular image-classification DNN to classify it as a yieldsign.
cyberphysical systems, where the presence of an adversary can
cause serious consequences. For example, much of the technol-
ogy behind autonomous and driver-less vehicle development
is driven by machine learning [3, 4, 10]. Deep Neural Networks(DNNs) have also been used in airborne collision avoidance
systems for unmanned aircraft (ACAS Xu) [18]. However, indesigning and deploying these algorithms in critical cyberphysi-cal systems, the presence of an active adversary is often ignored.
In this paper, we focus on attacks on outputs or models that
are produced by machine-learning algorithms that occur aftertraining or “external attacks”, which are especially relevant
to cyberphysical systems (e.g., for a driver-less car the ML-
algorithm used for navigation has been already trained once
the “car is on the road”). These attacks are more realistic, and
are distinct from “insider attacks”, such as attacks that poison
the training data (see the paper [15] for a survey such attacks).
Specifically, we focus on attacks caused by adversarial exam-ples, which are inputs crafted by adding small, often impercep-
tible, perturbations to force a trained ML model to misclassify.
As a concrete example, consider a ML algorithm that is used
to recognize street signs in a driver-less car, which takes the
images, such as the one depicted in Figure 1, as input. While
these two images may appear to be the same to humans, the
image on the left [32] is an ordinary image of a stop sign while
the right image was produced by adding a small, precisely
crafted perturbation that forces a particular image-classifier to
classify it as a yield sign. Here, the adversary could potentially
use the altered image to cause the car to behave dangerously,
if the car did not have additional fail-safes such as GPS-based
maps of known stop-sign locations. As driver-less cars become
more common, these attacks are of a grave concern.
Our paper makes contributions along two dimensions: a
new algorithm for finding adversarial samples and better met-
rics for evaluating quality of adversarial samples. We summa-
rize these contributions below.
Algorithms: Several algorithms for generating adversarial
samples have been explored in the literature. Having a diverse
suite of these algorithms is essential for understanding the
nature of adversarial examples and also for systematically
evaluating the robustness of defenses. The second point is
underscored quite well in the following recent paper [6]. In-
tuitively, diverse algorithms for finding adversarial examples,
exploit different limitations of a classifier and thus stress the
classifier in a different manner. In this paper, we present a
simple gradient-descent based algorithm for finding adversar-
ial samples. Our algorithm is described in section 3 and an
enhancement to our algorithm appears in the appendix. Even
though our example is quite simple (although we discuss some
enhancements to the basic algorithm), it performs well in com-
parison to existing algorithms. Our algorithm, NewtonFool,
successfully finds small adversarial perturbations for all test
images we use, but also it does so by significantly reducing the
confidence probability. Detailed experimental results appear
in section 4.
Metrics: The second issue that this paper tackles is the
issue of metrics. Let us recall the adversary’s goal: given an
image I and a classifier F , the adversary wishes to find a "small"
perturbation δ , such that F (I ) and F (I + δ ) are different but Iand I + δ “look the same” to a human observer (for targeted
misclassification the label F (I + δ ) should match the label that
an adversary desires). The question is– how does one formal-
ize "small" perturbation that is not perceptible to a human
observer? Several papers have quantified "small perturbation"
by using the number of pixels changed or the difference in
the L2 norm between I and I + δ . On the other hand, in the
computer-vision community several algorithms have been
developed for tasks that humans perform quite easily, such
as edge detection and segmentation. Our goal is to leverage
these computer-vision algorithms to develop better metrics for
to address the question given before. As a first step towards
this challenging problem, we use three algorithms from the
computer-vision literature (which is described in our back-
ground section 2) for this purpose, but recognize that this is
a first step. Leveraging other computer-vision algorithms to
develop even better metrics is left as future work.
Related work: Algorithms for generating adversarial ex-
amples is a very active area of research, and we will not pro-
vide a survey of all the algorithms. Three important related
algorithms are described in the background section 2. How-
ever, interesting algorithms and observations about adversar-
ial perturbations are constantly being discovered. For exam-
ple, Kurakin, Goodfellow, and Bengio [22] show that even in
physical-world scenarios (such as deployment of cyberphysical
systems), machine-learning systems are vulnerable to adver-
sarial examples. They demonstrate this by feeding adversarial
images obtained from cell-phone camera to an ImageNet In-
ception classifier and measuring the classification accuracy of
the system and discover that a large fraction of adversarial ex-
amples are classified incorrectly even when perceived through
the camera. Moosavi-Dezfooli et al. [24] propose a systematic
algorithm for computing universal perturbations1In general,
the area of analyzing of robustness of machine-learning al-
gorithms is becoming a very important and several research
communities have started working on related problems. For
example, the automated-verification community has started
developing verification techniques targeted for DNNs [16, 19].
2 BACKGROUNDThis section describes the requisite background. We need
Moore-Penrose pseudo-inverse of a matrix for our algorithm,
which is described in section 2.1. Three techniques that are
used in our metrics are described in the next three sub-sections.
Formulation of the problem is discussed in section 2.5. Some
existing algorithms for crafting adversarial examples are dis-
cussed in section 2.6. We conclude this section with a discus-
sion section 2.7.
2.1 Moore-Penrose Pseudo-inverseGiven a n×m matrixA, a matrixA+ is called it Moore-Penrose
pseudo-inverse if it satisfies the following four conditions (AT
denotes the transpose of A):(1) AA+A = A(2) A+AA+ = A+
(3) A+A = (A+A)T
(4) AA+ = (AA+)T
For any matrix A, a Moore-Penrose pseudo-inverse exists and
is unique [11]. Given an equation Ax = b, x0 = A+b is thebest approximate solution to Ax = b (i.e., for any vector xsatisfying the equation, ∥ Ax0 − b ∥ is less than or equal to
∥ Ax−b ∥).
2.2 Edge DetectorsGiven an image, edges are defined as the pixels whose value
changes drastically from the values of its neighbor. The con-
cept of edges has been used as a fundamental local feature in
many computer-vision applications. The Canny edge detector(CED) [5] is a popular method used to detect edges, and is de-
signed to satisfy the following desirable performance criteria:
high true positive, low false positive, the distance between a
detected edge and a real edge is small, and there is no duplicate
detection of a single edge. In this section we will describe CED.
Preprocess – noise reduction.Most images contain random
noise that can cause errors in edge detection. Therefore, filter-
ing out noise is an essential preprocessing step to get stable
results. Applying convolution with a Gaussian kernel, also
know as Gaussian blur, is a common choice to smooth a given
image. For any n ∈ N, (2n + 1) × (2n + 1) Gaussian kernel is
defined as follows.
K (x ,y) =1
2πσ 2e−x2+y2
2σ 2where − n ≤ x ,y ≤ n
The denoising is dependent on the choice of n and σ .Computing the gradient. After the noise-reduction step,
CED computes the intensity gradients, by convolving with
1A single vector, which when added to an image causes its label to change.
Sobel filters. For a given image I ,the following shows an ex-
ample convolutions of 3 × 3 Sobel filters, which has been used
in our experiments (in the equations given below ∗ represents
convolution).
Gx =
−1 0 1
−2 0 2
−1 0 1
∗ I
Gy =
−1 −2 −1
0 0 0
1 2 1
∗ I
Gx and Gy encodes the variation of intensity along x-axisandy-axis respectively, therefore we can compute the gradient
magnitude and direction of each pixel using the following
formula.
G =√G2
x +G2
y
θ = tan−1
(Gy
Gx
)The angle θ is rounded to angles corresponding horizontal
(0◦), vertical (90
◦), and diagonal directions (45
◦,135◦).
Non-maximum suppression. The intensity gradient gives
us enough information about the changes of the values over
pixels, so edges can be computed by thresholding the intensity
magnitude. However, as we prefer thin and clear boundary
with no duplicated detection, Canny edge detector performs
edge thinning, which is done by suppressing each gradient
magnitude to zero unless it achieves the local maxima along
the gradient direction. Specifically, for each pixel point, among
its eight neighboring pixels, Canny edge detector chooses
two neighbors to compare according to the rounded gradient
direction θ .
• If θ = 0◦, choose the neighboring pixels at the east and
west
• If θ = 45◦, choose the neighboring pixels at the north
east and south west
• If θ = 90◦, choose the neighboring pixels at the north
and south
• If θ = 135◦, choose the neighboring pixels at the north
west and south east
These choices of pixels correspond to the direction perpen-
dicular to the possible edge, and the Canny edge detector tries
to detect a single edge achieving the most drastic change of
pixel intensity along that direction. Therefore, it compares the
magnitude of the pixel to its two neighbors along the gradient
direction, and set the magnitude to be 0 when it is smaller
than any magnitudes of its two neighbors.
Thresholding with hysteresis. Finally, the Canny edge de-
tector thresholds gradients using hysteresis thresholding. In
hysteresis thresholding, we first determine a strong edge (pixel
with gradient bigger than θhiдh ) and a weak edge (pixel with
gradient between θlow and θhiдh ), while suppressing all non-
edges (pixels with gradient smaller than θlow ). Then, theCanny edge detector checks the validity of each weak edge,
based on its neighborhood.Weak edges with at least one strong
edge neighbor will be detected as valid edges, while all the
other weak edges will be suppressed.
The performance of Canny edge detector depends highly
on the threshold parameters θlow and θhiдh , and those param-
eters should be adjusted according to the properties of input
image. There are various heuristics to determine the thresh-
olds, and we use the following heuristics in our experiments
• MNIST: While statistics over pixel values (e.g mean,
median) are usually used to determine thresholds, pixel
values in MNIST images are mostly 0, making such sta-
tistics unavailable. In this work, we empirically searched
the proper values for thresholds, sufficiently high to be
able to ignore small noise, and finally used θlow = 300
and θhiдh = 2 · θlow .
• GTSRB:When distribution of pixel value varies, usu-
ally statistics over pixel value are used to adjust thresh-
olds, because pixel gradient depends on overall bright-
ness and contrast of image. Since those image properties
varies in GTSRB images, we put thresholds as follows.
θlow = (1 − 0.33)µ
θhiдh = (1 + 0.33)µ
where µ is the mean of the values on pixels.
2.3 Fourier TransformIn signal processing, spectral analysis is a technique that trans-
forms signals into functions with respect to frequency, rather
than directly analyzing the signal on temporal or spatial do-
main. There are various mathematical operators (transforms)
converting signals into spectra, and Fourier transform is one
of the most popular operator among them. In this section we
mainly discuss two dimensional Fourier transform as it is an
operation on spatial domain where an image lies in and is
commonly applied to image analysis.
Considering an image as a function f of intensity on two
dimensional spatial domain, the Fourier transform F is written
in the following form.
F (u,v ) =
∫ ∞
−∞
∫ ∞
−∞
f (x ,y)e−2π i (ux+vy )dxdy
While this definition is written for continuous spatial do-
main, values in an image are sampled for a finite number of
pixels. Therefore, for computational purposes, the correspond-
ing transform on discrete two dimensional domain, or discrete
Fourier transform, is used in most applications. For a function
f of on a discrete grid of pixels, the discrete Fourier transform
maps it to another function F on frequency domain as follows.
F (k,l ) =M−1∑x=0
N−1∑y=0
f (x ,y) exp
[−2πi
(kx
M+ly
N
)]While naive computation of this formula requires quadratic
time complexity, there are several efficient algorithms comput-
ing discrete Fourier transformwith time complexityO (n logn),called Fast Fourier Transform (FFT), and we use the two dimen-
sional FFT in our analysis of perturbations.
Fourier transform on two dimensional spatial domain pro-
vides us spectra on two dimensional frequency (or spatial fre-
quency) domain. These spectra describe the periodic structures
across positions in space, providing us valuable information
of features of an image. Specifically, low spatial frequency cor-
responds to the rough shape structure of the image, whereas
spectrum on high spatial frequency part conveys detailed fea-
ture, such as sharp change of illumination and edges.
2.4 Histogram of Oriented GradientsThe histogram of oriented gradients (HOG) [9] is a feature
descriptor, widely used for object detection. Simply, HOG de-
scriptor is a concatenation of a number of locally normalized
histograms. Each histogram contains local information about
the intensity gradient directions, each local set of neighboring
histograms is normalized to improve the overall accuracy. By
concatenating those histogram vectors, HOG outputs a more
compact description of the shape. For object detection, a ma-
chine learning algorithm is trained over HOG descriptors of
training set images, to classify if any part of an input image
has HOG descriptor should be labeled as “detected”
Computing the gradient. Similarly to the edge detector in
2.2, HOG starts from computing gradient for each pixel. While
3×3 Sobel filters are used in Canny edge detector, HOG applies
the following simpler one dimensional filters to the input
image.
Gx =[−1 0 1
]∗ I
Gy =
−1
0
1
∗ I
Using the same formula used for Canny edge detector, HOG
computes the gradient magnitude G and direction θ of each
pixel. However, HOG does not round the angle θ , as it givesan important information for the next step.
Histogram construction. To construct histograms to be con-
catenated, HOG first divides the input image into smaller cells,consisting of pixels (e.g. 8 × 8 pixels) then compute a his-
togram for each cell. Each histogram has several bins and
each bin is assigned an angle. In [9], 9 bins corresponding to
0◦,20◦, . . . ,160◦ are used in each histogram.
Histograms are computed by the weighted vote for each
pixels in a cell by voting the weighted magnitude to one or two
bins. The weights are determined by the gradient direction θ ,based on how close the angle is from its closest two bins.
Block normalization. Since the magnitude of gradients is
highly dependent to the local illumination and contrast, each
histogram should be locally normalized. Block, consisting ofseveral number of cells (e.g. 2× 2 cells), is the region that HOG
applies the normalization to the histograms of the contained
cells. While there are various way to normalize a single block,
Dalal et.al.[9] reported that the performance is seldom influ-
enced by the choice of normalization method, except for the
case of using L1 normalization.
The resulting feature vector is dependent on the parameters
introduced above: the number of pixels per cell, the number
of cells per block, and the number of histogram bins. In our
experiments, we use the following values (which are also used
in [9]: 8 × 8 pixels per cell, 2 × 2 cells per block, and 9 bins
for each histogram. For normalization method, we use L2normalization.
2.5 FormulationThe adversarial goal is to take any input vector x ∈ ℜn
(vec-
tors will be denoted in boldface) and produce a minimally al-
tered version of x, adversarial sample denoted by x⋆, that hasthe property of being misclassified by a classifier F : ℜn → C.
For our proposes, a classifier is a function fromℜnto C, where
C is the set of class labels. Formally, speaking an adversary
wishes to solve the following optimization problem:
min∆∈ℜn µ (∆)such that F (x+∆) ∈ C
∆ ·M = 0
The various terms in the formulation are µ is a metric on
ℜn,C ⊆ C is a subset of the labels, andM (called the mask) is
a n-dimensional 0 − 1 vector of size n. The objective functionminimizes themetric µ on the perturbation∆. Next we describevarious constrains in the formulation.
• F (x+∆) ∈ CThe set C constrains the perturbed vector x+∆ 2
to
have the label (according to F ) in the set C . For mis-classification problems (the label of x and x+∆) aredifferent we have C = C −{F (x)}. For targeted mis-classification we have C = {l } (for l ∈ C), where l is thetarget that an attacker wants (e.g., the attacker wants lto correspond to a yield sign).
• ∆ ·M = 0
The vector M can be considered as a mask (i.e., an at-
tacker can only perturb a dimension i if M[i] = 0),
i.e., if M[i] = 1 then ∆[i] is forced to be 0.3Essen-
tially the attacker can only perturb dimension i if thei-th component of M is 0, which means that δ lies in
k-dimensional space where k is the number of non-zero
entries in ∆.• ConvexityNotice that even if the metric µ is convex (e.g., µ is
the l2 norm), because of the constraint involving Fthe optimization problem is not convex (the constraint∆ ·M = 0 is convex). In general, solving convex opti-
mization problems is more tractable non-convex opti-
mization [25].
Generally, machine-learning algorithms use a loss func-
tion ℓ(F ,x,y) to penalize predictions that are “far away” from
the true label tl (x) of x. For example, if we use 0 − 1 loss
function, then ℓ(F ,x,y) = δ (F (x),y), where δ (z,y) is equalto 1 iff z = y (i.e., if z , y, then δ (z,y) = 0). For nota-
tional convenience, we will write LF (·) for ℓ(F , ·, ·), whereLF (x) = ℓ(F ,x,tl (x)). Some classifiers F (x) are of the formargmaxl Fs (x) (i.e., the classifier F outputs the label with the
maximum probability). For example, in a deep-neural network
(DNN) that has a softmax layer, the output of the softmax layer
2The vectors are added component wise
3the i-the component of a vectorM is written asM[i].
is a probability distribution over class labels (i.e., the probabil-
ity of a label y intuitively means the belief that the classifier
has in the example has label y). Throughout the paper, we
sometimes refer to the function Fs as the softmax layer. In
these case, we will consider the probability distribution cor-
responding to a classifier. Formally, let c = | C | and F be a
classifier, we let Fs be the function that maps Rn to Rc suchthat ∥Fs (x)∥1 = 1 for any x (i.e., Fs computes a probability
vector). We denote F ls (x) to be the probability of Fs (x) at labell .
2.6 Some Existing AlgorithmsIn this section, we describe few existing algorithms. This sec-
tion is not meant to be complete, but simply to give a “flavor”
of the algorithms to facilitate the discussion.
Goodfellow et al. attack - This algorithm is also known as
the fast gradient sign method (FGSM) [13]. The adversary craftsan adversarial sample x⋆ = x+∆ for a given legitimate sample
x by computing the following perturbation:
∆ = ε sign(∇LF (x)) (1)
The gradient of the function LF is computed with respect
to x using sample x and label y = tl (x) as inputs. Note that∇LF (x)) is an n-dimensional vector and sign(∇LF (x)) is an-dimensional vector whose i-th element is the sign of the
∇LF (x))[i]. The value of the input variation parameter ε fac-toring the sign matrix controls the perturbation’s amplitude.
Increasing its value, increases the likelihood of x⋆ being mis-
classified by the classifier F but on the contrary makes adver-
sarial samples easier to detect by humans.
Papernot et al. attack - This algorithm is suitable for tar-
geted misclassification [28]. We refer to this attack as JSMA
throughout the rest of the paper. To craft the perturbation
∆, components are sorted by decreasing adversarial saliency
value. The adversarial saliency value S (x,t )[i] of component ifor an adversarial target class t is defined as:
S (x,t )[i] =
0 if∂Ft∂ x[i] (x) < 0 or
∑j,t
∂Fj∂ x[i] (x) > 0
∂Ft∂ x[i] (x)
����∑j,t
∂Fj∂ x[i] (x)
���� otherwise(2)
where matrix JF =[
∂Fj∂ x[i]
]
i jis the Jacobian matrix for Fs .
Input components i are added to perturbation ∆ in order of
decreasing adversarial saliency value S (x,t )[i] until the result-ing adversarial sample x⋆ = x+∆ achieves the target label t .The perturbation introduced for each selected input compo-
nent can vary. Greater individual variations tend to reduce the
number of components perturbed to achieve misclassification.
Deepfool. The recent work of Moosavi-Dezfooli et al. [23]
try to achieve simultaneously the advantages of the methods
in [33] and [12]: Being able to find small directions (compet-
itive with [33]), and being fast (competitive with [12]). The
result is the DeepFool algorithm, which is an iterative algo-
rithm that, starting at x0, constructs x1,x2, . . . . Suppose weare at xi , the algorithm first linearly approximates Fks for each
k ∈ C at xi :
Fks (x) ≈ Fks (xi ) + ∇Fks (xi ) · (x− xi )
With these approximations, thus at xi we want to find the
direction d such that for some l ′ , l
F l′
s (xi ) + ∇F l′
s (xi ) · d > F ls (xi ) + ∇Fls (xi ) · d
In other words, at iteration i let Pi be the polyhedron that
Pi =⋂k ∈C
{d : Fks (xi ) + ∇F
ks (xi ) · d
≤ F ls (xi ) + ∇Fls (xi ) · d
}
Therefore the d we are seeking for is dist(xi ,Pci ), the dis-tance of xi to the complement of Pi is the smallest. Solving this
problem exactly and iterating until a different label is obtained,results in the DeepFool algorithm.
2.7 DiscussionSeveral algorithms for crafting adversarial samples have ap-
peared in the literature. These algorithms differ in three di-
mensions: the choice of the metric µ, the set of target labelsC ⊆ L, and the maskM . None of these methods use a maskM ,
so we do not include it in the figure. We describe few attacks
according to these dimensions in Figure 2 (please refer to the
formulation of the problem at the beginning of the section).
3 OUR ALGORITHMWe now devise new algorithms for crafting adversarial pertur-
bations. Our starting point is similar to Deepfool, and assumes
that the classifier F (x) is of the form argmaxl Fs (x) and the
softmax output Fs is available to the attacker. Suppose that
F (x0) = l ∈ C, then F ls (x0) is the largest probability in Fs (x0).Note that F ls is a scalar function. Our method is to find a small
d such that F ls (x0 + d) ≈ 0. We are implicitly assuming that
there is a point x′, nearby x, that F “strongly believes” x′ doesnot belong to class l (as the belief probability is “close to 0”),
while it believes that x does. While our assumption is “aggres-
sive”, it turns out in our experiments this assumption works
quite well in practice.
More specifically, with the above discussion, we now want
to decrease the value of the function F ls as fast as possible to
0. Therefore, the problem is to solve the equation F ls (x) = 0
starting with x0. The above condition can be easily general-
ized to the condition if the probability of label l goes below
a specified threshold (i.e. F ls (x) < ϵ). We solve this problem
based on Newton’s method for solving nonlinear equations.
Specifically, for i = 0,1,2, . . . , suppose at step i we are at xi ,we first approximate F ls at xi using a linear function
F ls (x) ≈ F ls (xi ) + ∇Fls (xi ) · (x− xi ). (3)
Let di = x− xi be the perturbation to introduce at iteration
i , and pi = F ls (xi ) be the current belief probability of xi asin class l . Assuming that we have not found the adversarial
example, pi must assume the largest value at the softmax layer,
thus in particular pi ≥ 1/| C |. Finally, denote gi = ∇Fls (xi ) be
the gradient of F ls at xi . Then our goal is to solve the following
linear system for di :
pi + gi · di = pi+1 (4)
Method C Short Description
FGSM C −{F (x)} Adds a perturbation which is proportional to the sign of the
gradient of the loss function at the image x.Papernot et al. {l } Constructs a saliency matrix S (x,t ), where x is the image
JSMA t is the target label. In each iteration, change the pixel according to the saliency matrix.
Deepfool C −{F (x)} In each iteration, tries to push the probabilities of
other labels l ′ higher than the current label l .
Figure 2: The second column indicates whether the method is suited for targeted misclassification. The last columngives a short description of the method.
where pi+1 < pi is some appropriately chosen probability
value we want to achieve in the next iteration. Denote δi =pi − pi+1 be the decrease of probability. Now we are ready to
describe the basic version of our new algorithm called New-tonFool.
Basic Version of NewtonFool. We start with a simple ob-
servation: Given pi+1, the minimal-norm solution to (4) is
d∗i = g†i (pi+1 − pi ) = −δi g†i where g
†i is the Moore-Penrose
inverse of gi (see the background section for a description of
Moore-Penrose inverse). For rank-1 matrix gi , g†i is precisely
gi /∥ gi ∥2, and so
d∗i = −δi gi∥ gi ∥2
(5)
Therefore the question left is how to pick pi+1. We make two
observations: First, it suffices that pi+1 < 1/| C | in order to
“fool” the classifier into a different class. Therefore we have an
upper bound of δi , namely
δi < pi − 1/| C |.
Second, note that we want small perturbation, which means
that ∥ d∗i ∥ is small. This can be formalized as ∥ d∗i ∥ ≤ η∥ x0 ∥for some parameter η ∈ (0,1). Since ∥ d∗i ∥ = δi/∥ gi ∥, thisthus translates to
δi ≤ η∥ x0 ∥∥ gi ∥
Combining the two conditions above we thus have
δi ≤ min
{η∥ x0 ∥∥ gi ∥,pi −
1
| C |
}We can thus pick
δ∗i = min{η∥ x0 ∥∥ gi ∥,pi − 1/| C |}. (6)
Plugging this back into (5), we thus get direction
d∗i = −δ∗i gi /∥ gi ∥
2 .
This gives the algorithm shown in 1.
Astute readers may realize that this is nothing but gradient
descent with step size δ∗i /∥∇Fls (xi )∥2. However, there is an
important difference: The step size of a typical gradient descent
procedure is tuned on a complete heuristic basis. However,
in our case, exploiting the structure of softmax layer and our
assumption on the vulnerability of the neural network, the
step size is determined once the intuitively sensible parameter
η is fixed (which controls how small the perturbation we want).
Note also that the step size changes over time according to
Algorithm 1
Input: x0,η, maxIter , a neural network F with a softmax
layer Fs .1: function NewtonFoolMinNorm(x0,η,maxIter ,F )2: l ← F (x0)3: d← 04: for i = 0,1,2, . . . ,maxIter − 1 do5: δ∗i ← min
{η∥ x0 ∥∥∇F ls (xi )∥,F ls (xi ) −
1
| C |
}
6: d∗i ← −δ ∗i ∇F
ls (xi )
∥∇F ls (xi ) ∥2
7: xi+1 ← xi + d∗i8: d← d+ d∗i9: return d
(6): If initially pi − 1/| C | is too large, the step size is then
determined by gi , the gradient vector we obtained on the
current image xi . Otherwise, as pi becomes closer and closer
to 1/| C |, we will instead adjust the step size according to
pi − 1/| C |.
3.1 EnhancementsOur basic algorithm attempts to drive down the probability
of the label l with the maximum probability according Fs . We
have extended this algorithm to consider multiple labels, but
we have not implemented this enhancement. Our enhance-
ment tries to “drive down” the probability of all labels in a
set L+ and “drive up” the probability of all labels in the set
L− (presumably an adversary wants an adversarial example
whose label is in L−). Our enhanced algorithm is described in
the appendix A.
3.2 Possible Benefits in Convergence withNewton’s Method
We now give some heuristic arguments regarding the benefits
of using Newton’s method. The main possible benefit is faster
convergence to an adversarial example. At the high level, the
main reason under the hood is actually that Newton’s method
is locally quadratically convergent to a zero x∗ when solving a
non-linear system of equations [30].
To start with, let us first consider an extreme case where,
actually, the convergence of Newton’s method does not apply
directly. Suppose that in a reasonably small neighborhood
N (x ,δ ) (under some norm), we have x∗ so that F ls (x∗) = 0.
That is, x∗ is a perfect adversarial example where we believe
that it is not label l with probability 1. Note that in this case x∗
is also a local minimal of F ls because Fls is non-negative. In this
case, the gradient of F ls is however singular (0), so we cannot
apply the convergence result of Newton’s method to conclude
fast convergence to x∗, to give that F ls (x∗) = 0. However, a
crucial point now is that while we will not converge to x∗
exactly, at some point in this iterative process, F ls (xk ) will besmall enough so that the network will no longer predict xk as
label l (which is the correct label), and thus gives an adversarialexample. If we let k∗ be the first iterate such that xk∗ is anadversarial example, our goal is thus to argue fast convergence
to xk∗ (where the gradient exists but not singular).
Suppose now that F ls (xk∗ ) = p > 0, then if we instead solve
that F ls (x ) −p = 0, then the zero point (for example xk∗ ) has anon-singular gradient, and so the convergence result of New-
ton’s method applies and says that we will converge quadrati-
cally to xk∗ . But this is exactly what NewtonFool algorithm
is doing, modulo picking p: Starting with the hypothesis that
nearby we have a adversarial point with very low confidence
in l , we pick a heuristic p, and apply Newton’s method to solve
the equation so that the confidence hopefully decreases to p –
If our guess is somewhat accurate, then we will quickly con-
verge to an adversarial point. Of course, the crucial problem
left is to pick the p – which we give some sufficient condition
(such as 1/| C |) to get guarantees.
4 EXPERIMENTSIn this section we present an empirical study comparing our
algorithms with previous algorithms in producing adversarial
examples. Our goals are to examine the tradeoff achieved
by these algorithms in their (1) effectiveness in producing
“indistinguishable” adversarial perturbations, and (2) efficiency
in producing respective perturbations. More specifically, the
following are three main questions we aim to answer:
(1) How effective is NewtonFool in decreasing the confidenceprobability at the softmax layer for the correct label inorder to produce an adversarial example?
(2) How is quality of the adversarial examples produced byNewtonFool compared with previous algorithms?
(3) How efficient is NewtonFool compared with previous al-gorithms?
In summary, our findings are the following:
• (1)+(3) NewtonFool achieves a better effectiveness-efficiencytradeoff than previous algorithms. On one hand, New-
tonFool is significantly faster than either DeepFool (up
to 49X faster) or JSMA (up to 800X faster), and is only
moderately slower than FGSM. This is not surprising
since NewtonFool does not need to examine all classes
for complex classification networks with many classes.
On the other hand, under the objective metric of Cannyedge detection, NewtonFool produces competitive and
sometimes better adversarial examples when compared
with DeepFool, and both of them produce typically sig-
nificant better adversarial examples than JSMA and
FGSM.
• (2) NewtonFool is not only effective in producing goodadversarial perturbations, but also significantly reduces
the confidence probability of the correct class. Specifically,in our experiments not only NewtonFool successfully
finds small adversarial perturbations for all test images
we use, but also it does so by significantly reducing the
confidence probability, typically from 0.8-0.9 to 0.2-0.4.
While previous work have also shown that confidence
probability can be significantly reduced with small ad-
versarial perturbations, it is somewhat surprising that
a gradient descent procedure, constructed under an ag-gressive assumption on the vulnerabilities of deep neuralnetworks against adversarial examples, can achieve sim-
ilar effects in such a uniform manner.
Two remarks are in order. First, we stress that, different
from DeepFool and JSMA where these two algorithms are
“best-effort” in the sense that they leverage first-order informa-
tion from all classes in a classification network to construct
adversarial examples, NewtonFool exploits an aggressive as-
sumption that, “nearby” the original data point, there is an-
other point where the confidence probability in the “correct
class” is significantly lower. The exploitation of this assump-
tion is similar to the structural assumption made by FGSM, yet
NewtonFool gives significantly better adversarial examples
than FGSM. Second, note that typically the training of a neuralnetwork is gradient descent on the hypothesis space (i.e., pa-
rameters of the network) with features (training data points)
fixed. In this case, NewtonFool is “dual” in the sense that it is
a gradient descent on the feature space (i.e., training features)
with network parameters fixed. As a result of the above two
points, we believe that the success of NewtonFool in produc-
ing adversarial examples gives a more explicit demonstration
of vulnerabilities of deep neural networks against adversarial
examples than previous algorithms.
In the rest of this section we give more details of our ex-
periments. We begin by presenting experimental setup in Sec-
tion 4.1. Then in Section 4.2 we provide experimental results
on the effectiveness of NewtonFool. Finally in Section 4.3 and
Section 4.4 we report perturbation quality and efficiency of
NewtonFool compared with previous algorithms.
4.1 Experimental SetupThe starting point of our experiments is the cleverhans projectby Papernot et al. [27], which implements FGSM, JSMA and a
convolutional neural network (CNN) architecture for study-
ing adversarial examples. We reuse their implementations
of FGSM and JSMA, and the network architecture in our ex-
periments to ensure fair comparisons. Our algorithms are
implemented as extensions of cleverhans4 using Keras andTensorFlow. We also ported the publicly available implemen-
tation of DeepFool [23] to cleverhans. All the experiments
are done on a system with Intel Xeon E5-2680 2.50GHz CPUs
(48-core), 64GB RAM, and CUDA acceleration using NVIDIA
Tesla K40c.
Datasets. In our experiments we considered two datasets.
4cleverhans is a software library that provides standardized reference implemen-
tations of adversarial example construction techniques and adversarial training
and can be found at https://github.com/tensorflow/cleverhans.
MNIST. MNIST dataset consists of 60,000 grayscale images of
hand-written digit. There are 50,000 images for training and
10,000 images for testing. Each image is of size 28 × 28 pixels,
and there are ten labels {0, . . . ,9} for classification.
GTSRB. GTSRB dataset consists of colored images of traffic
signs, with 43 different classes. There are in total 39,209 images
for training and 12,630 images for testing. In official GTSRB
dataset, image sizes vary from 15 × 15 to 250 × 250 pixels, and
we use the preprocessed dataset containing all images rescaled
to size 32 × 32.
CNN Architecture. The CNN we used in experiments for
both datasets is the network architecture defined in cleverhans.This networks has three consecutive convolutional layers with
ReLu activation, followed by a fully connected layer outputting
softmax function values for each classes. We train a CNN
classifier achieving 94.14% accuracy for MNIST testset after
6-epoch training, and another CNN classifier achieving 86.12%
accuracy for GTSRB testset after 100-epoch training.
ExperimentalMethodology. Since JSMA andDeepFool take
too long to finish on the entire MNIST and GTSRB datasets, we
use sampling based method for evaluation. More specifically,
for MNIST we run ten independent experiments, where in
each experiment we randomly choose 50 images, for each of
the ten labels in {0, . . . ,9}. We do similar things for GTSRB
except that we choose 20 images at random for each of the
6 chosen classes among 43 possible classes. Those 6 classes
were intentionally chosen to include diverse shapes, different
colors, and varying complexity of embedded figures, of traffic
signs.
With the sampled images, we then attack each CNN classi-
fier using different adversarial perturbation algorithms , and
evaluate the quality of perturbation and efficiency. Specifically:
(1) For NewtonFool, given data input x0 where it is classifiedas l , F ls (x0) − F ls (x0 +d) in order to evaluate the effectiveness
of NewtonFool in reducing confidence probability in class l .(2) To evaluate the perturbation quality, we use three algo-
rithms (Canny edge detector, FFT, and HOG) to evaluate the
quality of the perturbed example.
(3) Finally we record running time of the attacking algorithm
in order to evaluate efficiency.
Tuning ofDifferentAdversarial PerturbationAlgorithms.Finally, we describe how we tune different adversarial pertur-
bation algorithms in order to ensure a fair comparison.
FGSM. To find the optimal value for ϵ minimizing the final
perturbation ∆, we iteratively search from 0 to 10, increasing
by 10−3. We use the first ϵ achieving adversarial perturbation
to generate the adversarial sample. Searching time for the best
ϵ value is also counted towards the running time.
JSMA. Since target label l must be specified in JSMA, we try
all possible targets, and then choose the best result that the
adversarial example achieves the minimum number of pixel
changes. For each trial, we use fixed parameters ϒ = 14.5%,
θ = 1, which are valued used in [28]. We always increase the
pixel values for perturbations. For all target labels, attack times
are counted to the running time.
DeepFool. We fix η = 0.02, which is the value used in [23] to
adjust perturbations in each step.
NewtonFool. We fix η = 0.01, which is the parameter used
to determine the step size δi/∥∇Fls (xi )∥2 in gradient descent
steps.
4.2 Effectiveness of NewtonFoolThis section reports results in evaluating the effectiveness
of NewtonFool. We do so by measuring (1) success rate of
generating adversarial examples, and (2) changes in CNN clas-
sifier’s confidence in its classification. Table 1 summarizes
results for both. Specifically, column 5 and 6 gives the total
number of images we use to test and the total number of
successful attacks. In our experiments, NewtonFool achieves
perfect success rate. Column 4 gives the success probability
if ones makes a uniformly random guess, which is the value
NewtonFool algorithm aims to achieve in theory. Column 2
and 3 gives results on the reduction of confidence before and
after attacks, respectively, for both MNIST and GTSRB. We see
that in practice while the confidence we end at is larger than
1/| C |, NewtonFool still significantly reduce the confidence
at the softmax layer, typically from “almost sure” (0.8-0.9) to
“not sure” (0.2-0.4), while succeeding to produce adversarial
perturbations.
4.3 Quality of Adversarial PerturbationNow we evaluate the quality of perturbations using an ob-jective metric: Recall that an objective metric measures the
quality of perturbation independent of the optimization objec-
tive, and thus serves a better role in evaluating to what degree
a perturbation is “indistinguishable.”. Specifically, in our exper-
iments we use the classic techniques of computer vision, as the
objective metrics: Canny edge detection, fast Fourier transform,
and histogram of oriented gradients.Given an input image, Canny edge detection finds a wide
range of edges in an image, which is a critical task in computer
vision. We use Canny edge detection in the following way: We
count the number of edges detected by the detector and use
that as a metric for evaluation. Intuitively, an indistinguish-
able perturbation should maintain the number of edges after
perturbation very close to the original image. Therefore, by
measuring how close this count is to the count on the original
image, we have a metric on the quality of the perturbation:
Smaller the metric, better the perturbation. Note that there are
many edge detection methods.We chose Canny edge detection
since it is one of the most popular and reliable edge detection
algorithms.
Discrete Fourier transform maps images to the two dimen-
sional spartial domain, allowing us to analyse the perturba-
tions according to their spectra.When analysing an imagewith
its spectrum of spatial frequencies, higher frequency corre-
sponds to feature details and abrupt change of values, whereas
lower frequency corresponds to global shape information. As
adversarial perturbations does not change the general shape
but corrupt detailed features of the input image, measuring
the size of the high frequency part of the spectrum is desir-
able as the metric for feature corruption. Therefore, we first
Dataset Confidence be-
fore attack
Confidence af-
ter attack1/|C|
Number of attacked
samples
Number of successful
attacks
MNIST 0.926 (0.135) 0.400 (0.065) 0.1 5000 5000
GTSRB 0.896 (0.201) 0.245 (0.102) 0.023 1200 1200
Table 1: Confidence reduction and success rate of NewtonFool. Column 2 and 3 gives results on the reduction ofconfidence before and after attacks. Column 4 gives the success probability if one makes a uniformly random guess,which is the value NewtonFool algorithm aims to achieve in theory. Column 5 and 6 gives the success rate results.
compute the Fourier transform of the perturbation, discard
the values lies in the low frequency part, then measure the
l2 norm difference of the remaining part as a metric: Smaller
metric implies that less feature corruption has been induced
by the adversarial perturbation. For the low frequency part
to be discarded, we chose the intersection of lower halves of
the frequency domain along each dimensions (horizontal and
vertical).
Object detection with HOG descriptor is done by sweep-
ing a large image with a fixed size window, computing the
HOG descriptor of the image restricetd in the window, and
using a machine learning algorithm (e.g. SVM) to classify the
descriptor vector as “detected” and “not detected”. That is, if
a perturbed image has HOG descriptor relatively close to the
HOG descriptor of the original image, then the detecting algo-
rithm is more likely to detect the perturbed image whenever
the original image is detected. From this, we suggest another
objective metric that measures the l2 norm difference between
two HOG descriptor vectors: One computed from the original
image and the other from the perturbed image. The perturba-
tions resulting smaller HOG vector difference are considered
to have better quality.
In summary, in our experiments we find that: (1) Among all
algorithms DeepFool and NewtonFool generally produce the
best results, and typically are significantly better than FGSM
and JSMA. (2) Further, DeepFool and NewtonFool are stableon both datasets and both of them give good results on both
datasets. On the other hand, while FGSM performs relatively
well on MNIST, it gives poor results on GTSRB, and JSMA
performs the opposite: It performs well on GTSRB, but not
on MNIST. (3) Finally, for all tests we have, NewtonFool gives
competitive and sometimes better results than DeepFool. In
the rest of this section we give detailed statistics. Results for
HOG are provided in the appendix B.
Table 2 and Table 3 reports the number of edges we find on
MNIST and GTSRB, respectively. Each column gives the results
on the class represented by the image at the top, and each row
presents the mean and standard deviation of the number of
detected edges. Specifically, the first row gives the statistics
on the original image, and the other rows gives the statistics
on the adversarial examples found by different adversarial
perturbation algorithms.
On MNIST (Table 2), FGSM, DeepFool and NewtonFool
give similar results, where the statistics is very close to the
statistics on the original images. DeepFool and NewtonFoolare
especially close. On the other hand, we note that JSMA pro-
duces significantly more edges compared to other methods.
The result suggests that perturbations from JSMA can be more
perceivable than perturbations produced by other algorithms.
On GTSRB dataset, the situation changes a little. Now FGSM
produces theworst results among all algorithms (instead of JSMA),
and it produces significantly more edges been detected on
its produced adversarial examples. While the quality of ad-
versarial examples produced by JSMA improves significantly,
it is still slightly worse than those produced by DeepFool
and NewtonFool in most cases (except for the last sign). Inter-
estingly, NewtonFool gives the best result for all signs (again,
except the last one).
Table 4 and Table 5 presents the distance on the domain of
high spatial frequency we computed on MNIST and GTSRB,
respectively. Again, each column gives the results on the class
represented by the image at the top, and each row presents the
mean and standard deviation of the hight frequency distances.
For each row, we first generated adversarial perturbations
using the algorithm of the row, computed the fast Fourier
transform of the perturbation, then computed the l2 norm of
the high spatial frequency part.
On both of MNIST dataset and GTSRB dataset, DeepFool
and NewtonFool produce very close results, achieving smaller
values than the other two algorithms. On the other hand, the
statistics on FGSM and JSMA are relatively worse. Especially,
JSMA changes the high frequency part significantly more.
From this result, perturbations from JSMA seem to corrupt
more feature details than other algorithms. Remarkably, New-
tonFool maintains the best result over all experiments on our
Fourier transform metric.
4.4 EfficiencyIn the final part of our experimental study we evaluate the
efficiency of different algorithms. We measure the end-to-end
time to produce adversarial examples. In short, we find that
NewtonFool is substantially faster than either DeepFool (up
to 49X) or JSMA (up to 800X), while being only moderately
slower than FGSM. This is not surprising because NewtonFool
does not need to examine first-order information for all classesof a complex classification network for many classes. However,
it is somewhat surprising that, NewtonFool, a gradient proce-dure exploiting an aggressive assumption on the vulnerability
of CNN, achieves substantially better efficiency while produc-
ing competitive and often times better adversarial examples.
Due to space limitations detailed results are provided in the
appendix B.
Original image 1.71 (0.49) 1.07 (0.35) 1.36 (0.59) 1.17 (0.52) 1.22 (0.53)
FGSM 1.62 (0.54) 1.07 (0.34) 1.37 (0.66) 1.20 (0.55) 1.25 (0.57)
JSMA 2.34 (0.84) 2.39 (1.30) 2.00 (1.01) 1.69 (0.96) 1.67 (0.82)
DeepFool 1.60 (0.53) 1.06 (0.31) 1.34 (0.61) 1.17 (0.53) 1.26 (0.60)
NewtonFool 1.62 (0.54) 1.07 (0.34) 1.34 (0.61) 1.19 (0.55) 1.25 (0.59)
Original image 1.22 (0.52) 1.78 (0.78) 1.08 (0.34) 2.23 (0.90) 1.56 (0.62)
FGSM 1.23 (0.51) 1.76 (0.78) 1.12 (0.38) 2.20 (0.90) 1.57 (0.68)
JSMA 1.97 (1.08) 2.47 (1.21) 1.87 (1.07) 2.95 (1.21) 2.21 (0.97)
DeepFool 1.23 (0.53) 1.75 (0.81) 1.11 (0.38) 2.14 (0.91) 1.53 (0.63)
NewtonFool 1.23 (0.55) 1.75 (0.81) 1.11 (0.38) 2.15 (0.91) 1.54 (0.64)
Table 2: MNIST: Results with Canny edge detection. FGSM, DeepFool and NewtonFool give close results, with New-tonFool being especially close with DeepFool. JSMA, on the other hand, produces significantly larger statistics withrespect to the Canny edge detection metric.
Original image 13.73 (6.65) 17.85 (6.84) 11.69 (6.00) 18.14 (7.41) 10.37 (5.42) 13.66 (6.28)
FGSM 18.88 (7.71) 22.63 (7.76) 19.11 (7.56) 20.97 (8.01) 17.89 (7.92) 20.45 (9.46)
JSMA 15.04 (6.62) 18.54 (6.69) 12.95 (5.86) 19.18 (7.83) 11.20 (5.53) 14.71 (6.24)
DeepFool 14.84 (6.61) 18.82 (7.34) 12.96 (5.67) 18.40 (7.03) 11.16 (5.30) 15.06 (7.13)
NewtonFool 14.68 (6.88) 18.34 (7.11) 12.81 (5.73) 18.37 (7.25) 10.62 (5.20) 15.02 (7.08)
Table 3: GTSRB: Results with Canny edge detection. Now JSMA, DeepFool and NewtonFool give close results, whileFGSM produces significantly worse results. NewtonFool gives the best results across all tests (except for the last sign).
FGSM 20.50 (8.13) 13.01 (3.38) 15.96 (8.56) 13.56 (7.72) 11.95 (6.64)
JSMA 44.19 (7.29) 50.28 (8.37) 44.73 (8.53) 42.00 (9.24) 36.60 (6.52)
DeepFool 10.26 (3.57) 5.14 (1.13) 8.88 (4.33) 6.78 (3.49) 5.98 (2.73)
NewtonFool 9.57 (3.53) 4.62 (1.08) 8.26 (4.11) 6.17 (3.30) 5.39 (2.69)
FGSM 12.57 (6.70) 15.33 (6.07) 15.79 (7.95) 11.99 (6.52) 10.43 (5.11)
JSMA 42.37 (9.04) 48.67 (9.42) 45.07 (8.69) 49.44 (11.69) 44.41 (9.35)
DeepFool 6.39 (3.11) 8.54 (3.17) 7.37 (3.09) 7.42 (3.82) 6.70 (3.01)
NewtonFool 5.95 (2.96) 7.83 (3.01) 6.60 (2.91) 6.76 (3.62) 5.96 (2.80)
Table 4: MNIST: Results with fast Fourier transform. DeepFool and NewtonFool give close results. FGSM and JSMAproduce worse results with JSMA being significantly larger. NewtonFool produces the best results for all labels.
5 LIMITATIONS AND FUTUREWORKWe have not implemented the enhancement to our algorithm
described in the appendix. In the future, we will implement
the enhancement and compare its performance to our basic
algorithm. The computer-vision research literature has sev-
eral algorithms for analyzing images, such as segmentation,
edge detection, and deblurring. In our evaluation we use one
algorithm (i.e., edge detection) as a metric. Incorporating other
computer-vision algorithms into a metric is a very interesting
avenue for futurework. For example let µ1,µ2, · · · ,µk bek met-
rics (e.g., based on number of pixels, edges, and segments). We
could consider a weighted metric (i.e. the difference between
two images x1 and x2 is∑ki=1wi µi (x1,x2)). The question re-
mains – what weights to choose? Perhaps mechanical turk
FGSM 46.86 (36.94) 37.50 (15.97) 44.34 (27.72) 37.81 (27.04) 15.12 (10.47) 50.84 (22.82)
JSMA 54.95 (28.17) 66.81 (18.04) 79.87 (23.89) 72.62 (29.64) 56.14 (15.56) 78.91 (24.02)
DeepFool 9.67 (7.18) 7.33 (2.70) 8.07 (3.96) 7.88 (6.43) 4.38 (1.21) 8.78 (4.41)
NewtonFool 9.02 (7.66) 6.16 (2.54) 6.84 (3.84) 6.93 (6.12) 3.49 (0.86) 7.95 (4.34)
Table 5: GTSRB: Results with fast Fourier transform. Again, DeepFool and NewtonFool give close results, while FGSMand JSMA produce worse results. Across all tests, NewtonFool achieves the best results.
studies can be used to train the appropriate weights. Moreover,
the weighted metric
∑ki=1wi µi (x1,x2) could also be incorpo-
rated into algorithms for constructing adversarial examples.
All these directions are very important avenues for future
work.
6 CONCLUSIONThis paper presented a gradient-descent based algorithm for
finding adversarial examples.We also presented edge detectors
as a way for evaluating the quality adversarial examples gen-
erated by different algorithms. The research area of crafting
adversarial examples is very active. On the other hand, metrics
for evaluating the quality of adversarial examples has not been
well studied. We believe that incorporating computer-vision
algorithms into metrics is worthwhile goal and worthy of fur-
ther research. Moreover, having a diverse set of algorithms
for crafting adversarial examples is very important towards a
thorough evaluation of proposed defenses.
ACKNOWLEDGMENTSWe are grateful to the ACSAC reviewers for their valuable
comments and suggestions. We also thank Adam Hahn for
his patience during the submission process. This material is
based upon work supported by the Army Research Office
(ARO) under contract number W911NF-17-1-0405. Any opin-
ions, findings, conclusions and recommendations expressed in
this material are those of the authors and do not necessarily
reflect the views of the funding agencies.
REFERENCES[1] DeepFace: Closing the Gap to Human-Level Performance in Face Verifica-
tion. In Conference on Computer Vision and Pattern Recognition (CVPR).[2] Babak Alipanahi, Andrew Delong, Matthew T Weirauch, and Brendan J
Frey. 2015. Predicting the sequence specificities of DNA-and RNA-binding
proteins by deep learning. Nature biotechnology (2015).
[3] M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L.
Jackel, M. Monfort, U. Muller, J. Zhang, X. Zhang, J. Zhao, and K. Zieba.
2016. End to End Learning for Self-Driving Cars. Technical Report.[4] Mariusz Bojarski, Davide Del Testa, Daniel Dworakowski, Bernhard Firner,
Beat Flepp, Prasoon Goyal, Lawrence D. Jackel, Mathew Monfort, Urs
Muller, Jiakai Zhang, Xin Zhang, Jake Zhao, and Karol Zieba. 2016. End
to End Learning for Self-Driving Cars. CoRR abs/1604.07316 (2016). http:
//arxiv.org/abs/1604.07316.
[5] J Canny. 1986. A Computational Approach to Edge Detection. IEEE Trans.Pattern Anal. Mach. Intell. 8, 6 (June 1986), 679–698. https://doi.org/10.
1109/TPAMI.1986.4767851
[6] Nicholas Carlini and David Wagner. 2017. Towards Evaluating the Robust-
ness of Neural Networks. In IEEE Symposium on Security and Privacy.[7] Chih-Lin Chi, W. Nick Street, Jennifer G. Robinson, and Matthew A. Craw-
ford. 2012. Individualized Patient-centered Lifestyle Recommendations:
An Expert System for Communicating Patient Specific Cardiovascular Risk
Information and Prioritizing Lifestyle Options. J. of Biomedical Informatics45, 6 (Dec. 2012), 1164–1174.
[8] George E Dahl, Jack W Stokes, Li Deng, and Dong Yu. 2013. Large-scale
malware classification using random projections and neural networks. In
Proceedings of the IEEE International Conference on Acoustics, Speech andSignal Processing (ICASSP). IEEE, 3422–3426.
[9] Navneet Dalal and Bill Triggs. 2005. Histograms of Oriented Gradients
for Human Detection. In Proceedings of the 2005 IEEE Computer SocietyConference on Computer Vision and Pattern Recognition (CV PR’05) - Volume1 - Volume 01 (CVPR ’05). IEEE Computer Society, Washington, DC, USA,
886–893. https://doi.org/10.1109/CVPR.2005.177
[10] Nathan Eddy. 2016. AI, Machine Learning Drive
Autonomous Vehicle Development. http://www.
informationweek.com/big-data/big-data-analytics/
ai-machine-learning-drive-autonomous-vehicle-development/d/
d-id/1325906. (2016).
[11] Leslie Hogben (Editor). 2013. Handbook of Linear Algebra. Chapman and
Hall/CRC.
[12] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. 2014. Explaining
and Harnessing Adversarial Examples. CoRR (2014).
[13] Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. 2015. Explain-
ing and Harnessing Adversarial Examples. In Proceedings of the 2015 In-ternational Conference on Learning Representations. Computational and
Biological Learning Society.
[14] Geoffrey Hinton, Li Deng, Dong Yu, George E Dahl, Abdel-rahman
Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick
Nguyen, Tara N Sainath, et al. 2012. Deep neural networks for acoustic
modeling in speech recognition: The shared views of four research groups.
IEEE Signal Processing Magazine 29, 6 (2012), 82–97.[15] Ling Huang, Anthony D Joseph, Blaine Nelson, Benjamin IP Rubinstein,
and JD Tygar. 2011. Adversarial machine learning. In Proceedings of the4th ACM workshop on Security and artificial intelligence. ACM, 43–58.
[16] X. Huang, M. Kwiatkowska, S. Wang, and M. Wu. 2017. Safety Verification
of Deep Neural Networks.
[17] International Warfarin Pharmacogenetic Consortium. 2009. Estimation of
the Warfarin Dose with Clinical and Pharmacogenetic Data. New EnglandJournal of Medicine 360, 8 (2009), 753–764.
[18] K. Julian, J. Lopez, J. Brush, M. Owen, and M. Kochenderfer. 2016. Policy
Compression for Aircraft Collision Avoidance Systems. In Proc. 35th DigitalAvionics Systems Conf. (DASC).
[19] Guy Katz, Clark Barrett, David Dill, Kyle Julian, and Mykel Kochenderfer.
2017. An Efficient SMT Solver for Verifying Deep Neural Networks.
[20] Eric Knorr. 2015. How PayPal beats the bad guys with machine learn-
ing. http://www.infoworld.com/article/2907877/machine-learning/how-
paypal-reduces-fraud-with-machine-learning.html. (2015).
[21] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet
classification with deep convolutional neural networks. In Advances inneural information processing systems. 1097–1105.
[22] A. Kurakin, I. J. Goodfellow, and S. Bengio. 2016. Adversarial Examples in
the Physical world. (2016).
[23] Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard.
2015. DeepFool: a simple and accuratemethod to fool deep neural networks.
CoRR (2015).
[24] Seyed Mohsen Moosavi Dezfooli, Alhussein Fawzi, Omar Fawzi, and Pascal
Frossard. 2017. Universal adversarial perturbations. In Proceedings of 2017IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Jorge Nocedal and Stephen Wright. 2006. Numerical Optimization.Springer.
[26] NVIDIA. 2015. NVIDIA Tegra Drive PX: Self-Driving Car Computer. (2015).
http://www.nvidia.com/object/drive-px.html
[27] Nicolas Papernot, Ian Goodfellow, Ryan Sheatsley, Reuben Feinman, and
Patrick McDaniel. 2016. cleverhans v1.0.0: an adversarial machine learning
library. arXiv preprint arXiv:1610.00768 (2016).[28] Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson,
Z. Berkay Celik, and Ananthram Swami. 2016. The Limitations of Deep
Learning in Adversarial Settings. In Proceedings of the 1st IEEE EuropeanSymposium on Security and Privacy. arXiv preprint arXiv:1511.07528.
[29] Jeffrey Pennington, Richard Socher, and Christopher D Manning. 2014.
Glove: Global vectors for word representation. Proceedings of the EmpiricialMethods in Natural Language Processing (EMNLP 2014) 12 (2014), 1532–1543.
[30] Alfio Quarteroni, Riccardo Sacco, and Fausto Saleri. 2000. Numericalmathematics. p.307.
[31] Eui Chul Richard Shin, Dawn Song, and Reza Moazzezi. 2015. Recogniz-
ing functions in binaries with neural networks. In 24th USENIX SecuritySymposium (USENIX Security 15). 611–626.
[32] J. Stallkamp, M. Schlipsing, J. Salmen, and C. Igel. 2012. Man vs. computer:
Benchmarking machine learning algorithms for traffic sign recognition.
Neural Networks 0 (2012), –. https://doi.org/10.1016/j.neunet.2012.02.016[33] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru
Erhan, Ian J. Goodfellow, and Rob Fergus. 2013. Intriguing properties of
neural networks. CoRR abs/1312.6199 (2013).
A EXTENSIONA.1 Problem FormulationIn general, given a point x0, we can consider two disjoint sets
of labels L+ and L− as follows.
• L+ = {l+1, . . . ,l+m } is the set ofm labels achieving high
confidences F ls (x0) of the classifier• L− = {l
−1, . . . ,l−n } is the set of n labels achieving low
confidences F ls (x0) of the classifier
The attacker tries to decrease (resp. increase) all Fs values oflabels in L+ (resp. in L−). Specifically, for a given point x0 anda pair of parameters α , β , the attacker tries to find a sample xnearby x0 such that,
• For all label l in L+, Fls (x) < α .
• For all label l in L−, Fls (x) > β .
Formally, the attack can be formulated with the following
optimization problem.
minimize ∥ d ∥
subject to F ls (x0 + d) < α for all l ∈ L+
F ls (x0 + d) > β for all l ∈ L−
A.2 Generalizing the Solution of 3Again, let di = x− xi be the perturbation to introduce at itera-
tion i , and let p(l )i = F ls (xi ) be the current belief probability of
xi as in class l . Similarly, denote g(l )i = ∇Fls (xi ) be the gradient
of F ls at xi . Finally, denote δ(l )i = p
(l )i − p
(l )i+1 be the decrease
of probability in class l .From equation 4, for each label l ∈ L+∪L−, our perturbation
di should satisfy the following equation,
g(l )i · di = p(l )i+1 − p
(l )i = −δ
(l )i (7)
In other words, di is a solution to a linear system consisting
of (m + n) equations. For notational convenience, we assignnumbers 1,2, . . . ,m + n to each label l ∈ L+ ∪ L−. Then, we
can define a matrix Gi whose jth row is g(j )i , and a vector δi
whose jth element is −δ(j )i , for j = 1,2, . . . ,m + n.
Gi =
*.......,
g(1)ig(2)i...
g(m+n)i
+///////-
δi =
−δ(1)i
−δ(2)i...
−δ(m+n)i
(8)
Now, we have the following linear system determining di .
Gi · di = δi (9)
In usual, g(j )i ’s are gradient vectors in a high dimensional
vector space, and (m + n) is much smaller than the dimension.
Therefore, the linear system (9) is underdetermined. For an un-
derdetermined linear system, ifGi is of full rank and δ(j )i ’s are
fixed for all j , then the min-norm solution d∗i can be computed
as,
d∗i = G†i · δi (10)
where G†i is the Moore-Penrose pseudoinverse of Gi . This
solution d∗i satisfies all constraints of the form (7) for each
label l , and achieves the minimum norm among vectors in the
solution set. In this section, we assume that Gi is always of
full rank, and the solution to the other case will be discussed
at §A.3. As long as Gi is of full rank, G†i can be defined, so
the only remaining task to compute (10) is to determine each
values in δi .
A.2.1 Trivial Constraints on δi . For any lable l ∈ L+ ∪ L−,
its corresponding δ(l )i should satisfy the following condition.
• If l ∈ L+, we want to decrease p(l )i , so δ
(l )i = p
(l )i −
p(l )i+1 ≥ 0
• If l ∈ L−, wewant to increasep(l )i , so δ
(l )i = p
(l )i −p
(l )i+1 ≤
0
Also,
• If l ∈ L+, it suffices to get p(l )i+1 < α , so we get an upper
bound of the decrease δ(l )i ,
δ(l )i ≤ max{0,p
(l )i − α }
• If l ∈ L−, it suffices to get p(l )i+1 > β , so we get an upper
bound of the increase −δ(l )i ,
−δ(l )i ≤ max{0,β − p
(l )i }
Note that themax fucntion is used to ensure no changes for the
labels which the adversarial goal is already achieved. There-
fore, for labels that allow more changes, we have the following
ranges that δ(l )i ’s can be adjusted.
0 ≤ δ(l )i ≤ p
(l )i − α for l ∈ L+
p(l )i − β ≤ δ
(l )i ≤ 0 for l ∈ L−
(11)
A.2.2 Enforcing the Norm Bound of d∗i . In this section, we
consider how to ensure small perturbation in each iteration, i.e.
∥ d∗i ∥ ≤ η∥ x0 ∥. From the minimal-norm solution d∗i = G†i ·δi ,
we have the following inequality.
∥ di ∥ = ∥G†i · δi ∥ ≤ ∥G
†i ∥∥δi ∥ (12)
Note that if Gi = g(1)i has only one row and δi = −δ(1)i is
a scalar value, then equality is satisfied, so we have ∥ di ∥ =|δi |∥ gi ∥
, which corresponds to the norm of the previous minimal-
norm solution 5. However, the equality may not hold for (12)
in general.
Let Gi = U ΣV⊤ be the singular value decomposition of
Gi . It is known that G†i = V Σ†U⊤, so the 2-norm of G†i is
∥G†i ∥ = ∥Σ†∥ = 1
σmin
where σmin is the smallest singluar
value of Gi . Combining with (12),
∥ di ∥ ≤∥δi ∥
σmin
To enforce the norm bound of d∗, we can set the 2-norm con-
straint of δi as follows.
∥δi ∥ ≤ ησmin∥ x0 ∥
To get constraints for individual δ(j )i , we strengthen the re-
striction further using the relationship between 2-norm and
∞-norm.
∥δi ∥ ≤√|L+ ∪ L− | max
l ∈L+∪L−|δ(l )i | ≤ ησmin∥ x0 ∥
With |L+ ∪ L− | =m + n, we get the following constraints thatensures the norm bound of d∗i .
|δ(l )i | ≤
ησmin∥ x0 ∥√m + n
for each label l ∈ L+ ∪ L− (13)
Combining the bounds (11) and (13), we get the following rules
to set δ(l )i for label l at iteration i . These rules are designed
to maximize the amount of possible change, i.e. |δ(l )i |, while
satisfying (11) and (13).
• If l ∈ L+,
– If p(l )i < α , we don’t need to decrease this probability.
Set δ(l )i = 0
– If not, choose δ(l )i = min{p
(l )i − α ,
ησmin ∥ x0 ∥√m+n
}
• If l ∈ L−,
– if p(l )i > β , we don’t need to increase this probability.
Set δ(l )i = 0
– If not, choose δ(l )i = max{p
(l )i − β ,−
ησmin ∥ x0 ∥√m+n
}
A.3 What if Gi has some linearlydependent rows?
In usual, Gi is not guaranteed to be a full-rank matrix, and
may contain linearly dependent rows, making it impossible
to define G†i . However, we can control the values of δ(l )i to
resolve the dependency. In this section, we will present how
to remove dependencies by setting δ(l )i .
Without loss of generality, assume we have two dependent
rows inGi , correnponding to labels l1 and l2. Then, there is a
nonzero constant c such that g(l1 )i = c · g(l2 )i . One naïve choice
to remove dependecy is just setting δ(l1 )i = δ
(l2 )i = 0. Then, the
constraints g(l1 )i · d = 0 and g(l2 )i · d = 0 become equivalent,
thus removing one of the dependent rows from Gi does not
change the solution set. Then, the algorithm proceeds with no
changes of probabilities in those labels, hoping that g(l1 )i+1 and
g(l2 )i+1 will become linearly independent at the next iteration.
The better solution is to make use of the constant factor c todefine another constraint between δ
(l1 )i and δ
(l2 )i . In previous
setting of two dependent labels, if we set δ(l1 )i = c · δ
(l2 )i ,
then the constraints g(l1 )i · d = δ(l1 )i and g(l2 )i · d = δ
(l2 )i are
equivalent upto multiplying constant c , so it is again safe to
ignore one dependent row from Gi as it does not change the
solution set.
Now, we summarize how to set δ(l )i ’s for two labels l1 and l2
to remove dependencies, while the added constraint is satisfied.
There are several cases depending on the membership of labels
l1, l2 and the sign of c .
(1) l1,l2 ∈ L+ (resp. l1,l2 ∈ L−)(a) If c > 0,
• If p(l1 )i < α (resp. p
(l1 )i > β), then δ
(l1 )i = 0 by our
rules at §A.2.2. To satisfy the dependency condition
δ(l1 )i = c · δ
(l2 )i , we set δ
(l1 )i = δ
(l2 )i = 0
• If p(l2 )i < α (resp. p
(l2 )i > β), then δ
(l2 )i = 0 by our
rules at §A.2.2. To satisfy the dependency condition
δ(l1 )i = c · δ
(l2 )i , we set δ
(l1 )i = δ
(l2 )i = 0
• If both probabilities can be changed, we first set
the δ(l )i with larger change |δ
(l )i | possible while
satisfying the constraint, then use δ(l1 )i = c · δ
(l2 )i ,
to determine the other value.
(b) If c < 0,
• sign(δ(l1 )i ) = sign(δ
(l2 )i ). To satisfy δ
(l1 )i = c · δ
(l2 )i
with c < 0, set δ(l1 )i = δ
(l2 )i = 0
(2) l1 ∈ L+, l2 ∈ L− (resp. l1 ∈ L−, l2 ∈ L+)(a) If c > 0,
• sign(δ(l1 )i ) , sign(δ
(l2 )i ). To satisfy δ
(l1 )i = c · δ
(l2 )i
with c > 0, set δ(l1 )i = δ
(l2 )i = 0
(b) If c < 0,
• If p(l1 )i < α (resp. p
(l1 )i > β), then δ
(l1 )i = 0 by our
rules at §A.2.2. To satisfy the dependency condition
δ(l1 )i = c · δ
(l2 )i , we set δ
(l1 )i = δ
(l2 )i = 0
• If p(l2 )i > β (resp. p
(l2 )i < α ), then δ
(l2 )i = 0 by our
rules at §A.2.2. To satisfy the dependency condition
δ(l1 )i = c · δ
(l2 )i , we set δ
(l1 )i = δ
(l2 )i = 0
• If both probabilities can be changed, we first set
the δ(l )i with larger change |δ
(l )i | possible while
satisfying the constraint, then use δ(l1 )i = c · δ
(l2 )i ,
to determine the other value.
The following pseudocode incorporates the rules in §A.2.2
and §A.3.
Algorithm 2
Input: x0,η,maxIter , α , β , L+, L−, a neural network F with
a softmax layer Fs .1: function NewtonFoolGeneral-
ized(x0,η,maxIter ,α ,β ,L+,L−,F )2: m← |L+ |3: n← |L− |4: d← 05: for i = 0,1,2, . . . ,maxIter − 1 do6: Compute g(l )i = ∇F
ls (xi ) for all l ∈ L+ ∪ L−
7: Construct Gi as defined in (8)
8: Remove dependencies of Gi using the tricks in
§A.3, record added constraints on δ(l )i ’s
9: Find the smallest singular value σmin of Gi10: for l ∈ L+ do11: if p (l )i > α then
12: δ(l )i ← min{p
(l )i − α ,
ησmin ∥ x0 ∥√m+n
}
13: else14: δ
(l )i ← 0
15: for l ∈ L− do16: if p (l )i < β then
17: δ(l )i ← max{p
(l )i − β ,−
ησmin ∥ x0 ∥√m+n
}
18: else19: δ
(l )i ← 0
20: Check if the constraints added at step 8 are all
satisfied. If not, adjust one of two δ(l )i , in the way that
|δ(l )i | always decreases.
21: d∗i ← G†i · δi22: xi+1 ← xi + d∗i23: d← d+ d∗i24: return d
B ADDITIONAL EXPERIMENTB.1 Results for HOGTable 6 and Table 7 shows the distances between the HOG
feature vectors of the original image and the perturbed image,
on MNIST and GTSRB, respectively. Again, each column gives
the results on the class represented by the image at the top,
and each row presents the mean and standard deviation. To
be specific, for each row, adversarial examples were generated
by the designated algorithm, then their HOG feature vectors
were compared to those of the corresponding orignal images.
OnMNIST dataset, FGSM gives the best results, while Deep-
Fool and NewtonFool give similar values in general with an
exception of the second label (digit 1). JSMA produces bigger
statistics except for the fifth label (digit 4), which implies that
the perturbed image is less likely to be detected when the
original image is detected correctly.
On GTSRB dataset, the situation changes entirely that now
JSMA produces noticeably better results. Among the other
three algorithms, NewtonFool gives the best results for all
experiments, even achieving the smaller or equal results to
that of JSMA for the last three labels. Similarly to the edge de-
tection metric, FGSM becomes much worse for GTSRB dataset,
whereas JSMA generates relatively better perturbations than
its output on MNIST dataset.
B.2 EfficiencyWe now give detailed statistics. Table 8 and Table 9 gives the
running time results for MNIST and GTSRB, respectively. Each
column corresponds to each class represented by the image at
top, and each row presents the mean and standard deviation
of the running time. For MNIST, (Table 8), NewtonFool is only
moderately slower than FGSM (if one calculates the absolute
difference in seconds). On the other hand, NewtonFool is sub-
stantially faster than DeepFool (up to 3X) and JSMA (up to
6X). The efficiency becomes more obvious on GTSRB dataset
because it has significantly more classes. In this case, we ob-
serve that NewtonFool is up to 49X faster than DeepFool, and
is up to 800X faster than JSMA.
FGSM 0.47 (0.17) 0.50 (0.16) 0.44 (0.20) 0.38 (0.17) 0.40 (0.17)
JSMA 0.57 (0.17) 0.85 (0.18) 0.82 (0.29) 0.52 (0.17) 0.44 (0.14)
DeepFool 0.51 (0.19) 0.86 (0.38) 0.47 (0.21) 0.42 (0.18) 0.48 (0.20)
NewtonFool 0.49 (0.19) 0.83 (0.37) 0.46 (0.21) 0.40 (0.17) 0.47 (0.20)
FGSM 0.39 (0.16) 0.40 (0.15) 0.46 (0.18) 0.37 (0.15) 0.34 (0.14)
JSMA 0.48 (0.16) 0.51 (0.17) 0.65 (0.21) 0.55 (0.19) 0.46 (0.16)
DeepFool 0.39 (0.16) 0.43 (0.16) 0.50 (0.19) 0.40 (0.15) 0.39 (0.16)
NewtonFool 0.38 (0.16) 0.42 (0.17) 0.47 (0.18) 0.38 (0.16) 0.37 (0.16)
Table 6: MNIST: Results with histogram of oriented gradients. FGSM, DeepFool and NewtonFool give close results,except for the second label (for digit 1) where JSMA, DeepFool and NewtonFool give close results. JSMA generallyproduces slightly larger statistics except for the fifth label (for digit 4).
FGSM 1.13 (0.57) 0.88 (0.21) 0.86 (0.20) 0.64 (0.29) 0.72 (0.28) 0.94 (0.23)
JSMA 0.42 (0.28) 0.57 (0.25) 0.69 (0.22) 0.48 (0.29) 0.99 (0.44) 0.87 (0.24)
DeepFool 0.96 (0.53) 0.85 (0.22) 0.96 (0.31) 0.56 (0.24) 0.96 (0.41) 0.89 (0.21)
NewtonFool 0.80 (0.48) 0.71 (0.18) 0.75 (0.21) 0.48 (0.19) 0.62 (0.23) 0.81 (0.19)
Table 7: GTSRB: Results with histogram of oriented gradients. Now JSMA gives better results than others. FGSM,DeepFool and NewtonFool produce slightly worse results, while NewtonFool performs better than the othermethodsbut JSMA.
FGSM 0.20 (0.02) 0.19 (0.02) 0.19 (0.03) 0.19 (0.03) 0.19 (0.02)
JSMA 8.72 (3.14) 4.21 (0.75) 8.29 (3.04) 7.18 (2.73) 6.47 (2.00)
DeepFool 5.90 (6.42) 3.98 (1.01) 5.46 (5.04) 4.30 (4.31) 3.46 (1.73)
NewtonFool 2.51 (1.02) 2.86 (0.96) 2.30 (1.29) 1.81 (1.06) 1.75 (0.96)
FGSM 0.18 (0.02) 0.19 (0.02) 0.19 (0.02) 0.18 (0.02) 0.18 (0.02)
JSMA 6.76 (2.64) 7.39 (2.59) 6.44 (2.18) 8.07 (3.20) 6.46 (2.20)
DeepFool 5.48 (15.02) 7.64 (19.04) 4.89 (4.54) 3.76 (3.53) 4.35 (8.55)
NewtonFool 1.75 (0.96) 2.31 (0.93) 2.43 (1.26) 1.59 (0.92) 1.50 (0.73)
Table 8: MNIST statistics for the running time (unit: sec.)
FGSM 0.19 (0.03) 0.19 (0.02) 0.19 (0.03) 0.18 (0.03) 0.17 (0.02) 0.20 (0.02)
Papernot et al. 755.67 (879.74) 321.35 (289.29) 316.57 (257.51) 372.64 (279.62) 108.89 (59.59) 689.81 (511.29)
Deepfool 118.55 (1097.21) 24.93 (13.07) 34.99 (23.22) 28.19 (24.68) 13.57 (4.43) 41.78 (28.01)
NewtonFool 0.85 (0.66) 0.64 (0.29) 0.80 (0.36) 0.65 (0.41) 0.56 (0.30) 0.85 (0.42)
Table 9: GTSRB statistics for the running time (unit: sec.)