©2012
TECHNOLOGY YOU CAN USE AGAINST THOSE
WHO USE TECHNOLOGY
BENFORD’S LAW: THE FUN, THE FACTS, AND THE FUTURE
Benford’s Law is named after physicist Frank Benford, who discovered that there were
predictable patterns to the frequencies of the digits in tabulated data. Using real-world fraud and
bankruptcy-related examples, this session will review the foundations of this interesting
statistical phenomenon and its place in forensic analytics. The session will also cover other
recent Benford applications, including the analysis of bankruptcy filings, financial statement
numbers, and the analysis of Ponzi scheme numbers.
MARK NIGRINI, PH.D.
Professor
The College of New Jersey
Pennington, NJ
Mark J. Nigrini, Ph.D., is a professor at The College of New Jersey, where he teaches
managerial accounting and forensic accounting. His current research involves advanced
theoretical work on Benford’s Law and the legal process surrounding fraud convictions.
Nigrini is the author of Forensic Analytics (Wiley 2011), which describes tests to detect
fraud, errors, estimates, and biases in financial data. Nigrini is also the author of Benford’s Law,
published by Wiley in April 2012. His next book, Losing the War Against Fraud, will be
published in March 2013. His work has been featured in national media, including The Financial
Times, The New York Times, and The Wall Street Journal, and he has published papers on
Benford’s Law in accounting academic journals, scientific journals, and pure mathematics
journals, as well as professional publications such as Internal Auditor and Journal of
Accountancy. His radio interviews have included the BBC in London and NPR in the United
States. His television interviews have included an appearance on NBC’s Extra. He regularly
presents professional seminars for accountants and auditors in North America, Europe, and Asia,
with recent events in Singapore, Switzerland, and New Zealand.
“Association of Certified Fraud Examiners,” “Certified Fraud Examiner,” “CFE,” “ACFE,” and the
ACFE Logo are trademarks owned by the Association of Certified Fraud Examiners, Inc. The contents of
this paper may not be transmitted, re-published, modified, reproduced, distributed, copied, or sold without
the prior consent of the author.
BENFORD’S LAW: THE FUN, THE FACTS, AND THE FUTURE
23rd Annual ACFE Fraud Conference and Exhibition ©2012
Introduction
Benford’s Law gives the expected patterns of the digits in
tabulated data. The law is named after Frank Benford, who
noticed that the first few pages of his tables of common
logarithms were more worn than the later pages (Benford,
1938). From this he hypothesized that people were looking
up the logs of numbers with low first digits (e.g., 1, 2, and
3) more often than the logs of numbers with high first digits
(e.g., 7, 8, and 9) because there were more numbers in the
world with low first digits. The first digit of a number is the
leftmost digit, and 0 is inadmissible as a first digit. The first
digits of 2,204, 0.0025, and 20 million are all 2. Benford
empirically tested the first digits of 20 diverse lists of
numbers and noticed a marked skewness in favor of the low
digits that approximated a logarithmic pattern. He then
made some assumptions related to the geometric pattern of
natural phenomena—despite the fact that some of his
datasets were not related to natural phenomena—and
formulated the expected patterns for the digits in tabulated
data. These expected frequencies are shown below, with D1
representing the first digit and D1D2 representing the first-
two digits of a number:
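P(D_1 = d_1) = \log\left(1 + \frac{1}{d_1}\right), \quad d_1 \in \{1, 2, \ldots, 9\} \qquad (1)
P(D_1 D_2 = d_1 d_2) = \log\left(1 + \frac{1}{d_1 d_2}\right), \quad d_1 d_2 \in \{10, 11, 12, \ldots, 99\} \qquad (2)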
P indicates the probability of observing the event in
parentheses, and log refers to the log to the base 10. For
example, the expected probability of the first digit 2 is
log(1 + ½), which equals 0.1761.
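As a minimal sketch of equations (1) and (2), the expected proportions can be computed directly. The function names and the Decimal-based digit extraction are illustrative assumptions, not part of the original paper.

```python
import math
from decimal import Decimal


def first_digit_prob(d1):
    """Equation (1): expected proportion of first digit d1 (1..9)."""
    return math.log10(1 + 1 / d1)


def first_two_digits_prob(d1d2):
    """Equation (2): expected proportion of first-two digits d1d2 (10..99)."""
    return math.log10(1 + 1 / d1d2)


def leading_digits(value, k=1):
    """First k significant digits of a nonzero number (hypothetical helper).

    Decimal(str(value)) drops leading zeros, so 0.0025 yields the digits 2, 5.
    """
    digits = Decimal(str(value)).as_tuple().digits
    return int("".join(str(d) for d in digits[:k]).ljust(k, "0"))


# Print the expected first-digit proportions for digits 1 through 9.
for d in range(1, 10):
    print(d, round(first_digit_prob(d), 4))
```

The helper reproduces the examples in the text: 2,204, 0.0025, and 20 million all have a first digit of 2.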
Durtschi, Hillison, and Pacini (2004) review the types of
accounting data that are likely to conform to Benford’s
Law and the conditions under which a “Benford Analysis”
is likely to be useful. Benford’s Law as a test of data
authenticity has not been limited to internal audit and the
attestation functions. Hoyle et al. (2002) apply Benford’s
Law to biological findings, and Nigrini and Miller (2007)
apply it to earth science data. The mathematical theory
supporting Benford’s Law is still evolving. Examples
include: Berger, Bunimovich, and Hill (2005); Kontorovich
and Miller (2005); Berger and Hill (2007); Miller and
Nigrini (2008); and Jang et al. (2009).
A set of numbers that closely conforms to Benford’s Law is
called a Benford Set in Nigrini (2012). The link between a
geometric sequence and a Benford Set is well known in the
literature, and is discussed in Raimi (1976). The link was
also evident to Benford, who titled a part of his paper
“Geometric Basis of the Law” and declared, “Nature
counts geometrically and builds and functions accordingly”
(Benford 1938, 563). Raimi (1976) relaxes the tight
restriction that the sequence should be perfectly geometric,
and states that a close approximation to a geometric
sequence will also produce a Benford Set. Raimi further
relaxes the geometric requirement and notes that “the
interleaving of a finite number of geometric sequences”
will also produce a Benford Set. A mixture of approximate
geometric sequences will therefore also produce a Benford
Set.
The digits of a geometric sequence will form a Benford Set
if two requirements are met. First, N, the number of records, should be large.
This requirement is necessarily vague because even a
perfect geometric sequence with 1,000 records cannot fit
Benford’s Law perfectly. For example, for the first-two
digits from 90 to 99, the expected proportions range from
0.0044 to 0.0048. Since any actual count must be an
integer, it means that the actual counts (probably either 4 or
5) will translate to actual proportions of either 0.004 or
0.005. As N increases, the actual proportions are able to
tend toward the exact expected proportions of Benford’s
Law. Second, the difference log(b)–log(a), where a and b are
the lower and upper bounds of the sequence, should be an
integer value. The geometric sequence needs to span a large
enough range to allow each of the possible first digits to
occur with the expected frequency of Benford’s Law. A
geometric sequence over the range [20, 82] will be clipped
short with no numbers beginning with either a 1 or a 9, and
very few numbers with a first digit of 8.
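The two requirements can be illustrated with a short sketch. It builds a geometric sequence of N = 1,000 records over a full decade, [10, 100), and over the clipped range [20, 82), then tabulates the first-two digits. The construction of the sequence (common ratio chosen so that N terms fill the range) and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def geometric_sequence(a, b, n):
    """n terms a * r**i with r = (b / a) ** (1 / n), so the terms fill [a, b)."""
    return [a * (b / a) ** (i / n) for i in range(n)]


def first_two_proportions(values):
    """Proportions of first-two digits; assumes every value lies in [10, 100)."""
    counts = Counter(int(v) for v in values)
    return {d: counts[d] / len(values) for d in range(10, 100)}


# Full decade: log(100) - log(10) = 1, an integer.
full = first_two_proportions(geometric_sequence(10, 100, 1000))
worst = max(abs(full[d] - math.log10(1 + 1 / d)) for d in range(10, 100))
print("largest deviation from Benford:", round(worst, 5))
print("counts for 90..99:", [round(full[d] * 1000) for d in range(90, 100)])

# Clipped range [20, 82): log(82) - log(20) is about 0.61, not an integer.
clipped = first_two_proportions(geometric_sequence(20, 82, 1000))
first_digit_8 = sum(clipped[d] for d in range(80, 90))
print("share with first digit 8:", round(first_digit_8, 4))
```

With 1,000 records the counts for 90 to 99 can only be whole numbers, so the proportions land near, but not exactly on, the expected values. The clipped sequence has no first digits of 1 or 9 at all.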
The Nigrini Cycle
In Forensic Analytics, Nigrini (2011) introduces the Nigrini
Cycle, which is a series of tests that should be run on every
dataset as a start to the data analysis phase. The Nigrini
Cycle comprises six tests made up of the periodic graph,
the data profile, the first-two digit tests, the number
duplication test, the summation test, and the second-order
test.
The Nigrini Cycle is demonstrated on purchasing card data
of a government entity. The specific entity was the victim
of fraud in the prior year and management wanted an
analysis of the current transactions to give some assurance
that the current year’s data was free of further fraud. The
focus was on fraud as opposed to waste and abuse. The first
test was the data profile, shown in Figure 1-01.
Figure 1-01 shows the data profile with the dollar amounts
and counts for the card purchases.
The data profile in Figure 1-01 shows that there were
approximately 95,000 transactions totaling $39 million.
The total should be compared to the payments made to the
card issuer. It is puzzling that there are no credits. This
might be because there is a field in the data table indicating
whether the amount is a debit or credit that was deleted
before the analysis. It might also signal that cardholders
aren’t too interested in getting credits where credits are due.
The data profile also shows that about one-third of the
charges are for amounts of $50 and under. Card programs
are there to make it easy for employees to pay for small
business expenses. The data profile shows one large
invoice for $3,102,000. The review showed that this
amount was actually in Mexican pesos, making the
transaction worth about $250,000. This transaction was
investigated and was a special circumstance where the
Mexican vendor needed to be paid with a credit card. This
finding showed that the Amount field was in the source
currency, not in U.S. dollars. Another query showed that
very few other transactions were in other currencies, and so
the Amount field was still used “as is.” The second high-
level overview was a periodic graph, shown in Figure 1-02.
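A data profile of the kind described above can be sketched as a count and total per stratum. The stratum boundaries, the sample amounts, and the function names below are assumptions chosen for illustration; the strata actually used in Figure 1-01 are not reproduced here.

```python
def data_profile(amounts):
    """Count and total of the amounts in each stratum (assumed boundaries)."""
    strata = [
        ("10.00 and above", lambda x: x >= 10),
        ("0.01 to 9.99", lambda x: 0 < x < 10),
        ("zero", lambda x: x == 0),
        ("-0.01 to -9.99", lambda x: -10 < x < 0),
        ("-10.00 and below", lambda x: x <= -10),
    ]
    profile = {}
    for label, in_stratum in strata:
        selected = [x for x in amounts if in_stratum(x)]
        profile[label] = (len(selected), round(sum(selected), 2))
    return profile


def share_at_or_below(amounts, cutoff):
    """Proportion of positive amounts that are at or below the cutoff."""
    positive = [x for x in amounts if x > 0]
    return sum(1 for x in positive if x <= cutoff) / len(positive)


# Hypothetical sample; negative amounts would represent credits.
sample = [3.62, 3.67, 45.00, 99.95, 2450.00, 2500.00]
for label, (count, total) in data_profile(sample).items():
    print(f"{label:>18}  {count:>3}  {total:>10.2f}")
print("share at or below $50:", share_at_or_below(sample, 50))
```

Empty negative strata in such a table correspond to the "no credits" observation, which is a prompt to check whether a debit/credit indicator was dropped from the extract.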
Figure 1-02 shows the monthly totals for card purchases.
The monthly totals are shown in Figure 1-02. Because the
“$3,102,000” purchase was an abnormal event, this number
was excluded from this graph. The graph shows that
August and September had the largest transaction totals.
The entity’s fiscal year ends on September 30. The
August/September spike might be the result of employees
making sure that they’re spending money that is “in the
budget.” The average monthly total is $3 million. The two
spikes averaged $1.18 million, which is a significant
amount of money. An earlier example of abuse was a
cardholder buying unnecessary helmets in one fiscal year,
only to return them the next fiscal year and use the funds
for other purchases. The transactions for 2011 should be
reviewed for this type of scheme. In another card analysis,
a utility company found that it had excessive purchases in
December, right around the holiday season. This suggested
that cardholders might be buying personal items with their
corporate cards.
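The monthly totals behind a periodic graph can be sketched as follows. The record layout (a date and an amount per transaction), the sample dates, and the exclude_above parameter used to drop the abnormal peso transaction are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date


def monthly_totals(transactions, exclude_above=None):
    """Sum amounts per (year, month); optionally skip abnormal amounts."""
    totals = defaultdict(float)
    for when, amount in transactions:
        if exclude_above is not None and amount > exclude_above:
            continue  # e.g. the 3,102,000 pesos purchase
        totals[(when.year, when.month)] += amount
    return dict(sorted(totals.items()))


# Hypothetical transactions: (posting date, amount).
sample = [
    (date(2010, 8, 3), 999.95),
    (date(2010, 8, 20), 2450.00),
    (date(2010, 9, 1), 3102000.00),
    (date(2010, 9, 28), 2500.00),
    (date(2010, 10, 5), 45.00),
]
for (year, month), total in monthly_totals(sample, exclude_above=1_000_000).items():
    print(f"{year}-{month:02d}  {total:,.2f}")
```

Months whose totals stand well above the average, such as the two months before a fiscal year-end, are the ones to select for review.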
The data profile and the periodic graph are high-level tests
that are well-suited to purchasing cards. The high-level
overview could also include a comparative analysis of the
descriptive statistics, which would require data for two
consecutive years.
The Benford’s Law tests work well on card transactions. It
would seem that the upper limit of $2,500 on card
purchases would make the test invalid, but this isn’t the
case because most of the purchases are below $1,000, and
the $1,000-plus stratum is dwarfed by the under $1,000
purchases. Also, the $2,500 limit can be breached if the
purchase is authorized. The purchase might also be in
another currency, and the analysis can be run on the
“transaction currency” as opposed to the amounts after
converting to U.S. dollars. The first-order test results are
shown in Figure 1-03.
Figure 1-03 shows the results of (a) all the card purchases.
The first-order results in the first panel of Figure 1-03 show
a large spike at 36. A review of the number duplication
results (by peeking ahead) shows a count of 5,903 amounts
in the $3.60 to $3.69 range. These transactions were almost
all for FedEx charges, and it seems that FedEx was used as
the default mail carrier for all documents larger than a
standard first-class envelope. While this was presumably
not fraud, it might be wasteful because U.S. Postal Service
first-class mail is cheaper for small documents. It is also
noteworthy that a government agency would prefer a
private carrier over the USPS. Another test was run on all
purchases of $10 and higher, and the results are in the
second panel of Figure 1-03.
Figure 1-03 also shows the results of (b) card purchases
that are $10 and higher.
The first-order test in the second panel in Figure 1-03
shows a reasonably good fit to Benford’s Law. There is,
however, a large spike at 24, which is the largest spike on
the graph. Also, there is a relatively large spike at 99, in
that the actual proportion is about double the expected
proportion. The spike at 24 is there because card users are
buying with great gusto for amounts that are just less than
the maximum allowed for the card. The first-order test
allows us to conclude that there are excessive purchases in
this range because we can compare the actual proportion to an
expected proportion. The number duplication test will look
at the 24 purchases in some more detail. The 99 purchases
showed many payments for seminars delivered over the
Internet (i.e., webinars), and it seemed reasonable that the
seminars would be priced just below a psychological
boundary. There were also purchases of computer and
electronic goods priced at just under $100. This pricing
pattern is normal for the computer and electronics industry.
The purchases also included a payment to a camera store
for $999.95. This might be an abusive purchase. The
procurement rules stated that a lower priced good should be
purchased when it will perform essentially the same task as
an expensive item. The camera purchase was made in
August, which was in the two-month window preceding the
end of the fiscal year.
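The first-order test on amounts of $10 and higher can be sketched as a comparison of actual and expected first-two digit proportions. This is a minimal illustration under stated assumptions: the function names, the sample amounts, and the Decimal-based digit extraction are hypothetical, and no significance measure is computed.

```python
import math
from collections import Counter
from decimal import Decimal


def first_two_digits(amount):
    """First-two significant digits of a positive amount, e.g. 2499.99 -> 24."""
    digits = Decimal(str(amount)).as_tuple().digits
    return int("".join(str(d) for d in digits[:2]).ljust(2, "0"))


def first_order_test(amounts, minimum=10):
    """Rows of (digits, actual, expected, difference) for digits 10..99."""
    selected = [a for a in amounts if a >= minimum]
    if not selected:
        raise ValueError("no amounts at or above the minimum")
    counts = Counter(first_two_digits(a) for a in selected)
    rows = []
    for d in range(10, 100):
        actual = counts[d] / len(selected)
        expected = math.log10(1 + 1 / d)
        rows.append((d, actual, expected, actual - expected))
    return rows


# Hypothetical amounts; 3.62 is dropped by the $10 minimum.
sample = [2450.00, 2499.99, 24.10, 99.95, 999.95, 36.20, 3.62]
rows = first_order_test(sample)
largest = max(rows, key=lambda row: row[3])
print("largest positive spike at", largest[0])
```

Sorting the rows by the difference column gives the list of digit combinations to drill into, which is how spikes such as 24 and 99 would surface.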
The summation test sums the amounts for each first-two digit
combination (10, 11, 12, … 99). The test identifies amounts with the same
first-two digits that are large relative to the rest of the
population. The results so far have highlighted the large
$3.102 million purchase, and the fact that there is an excess
of transactions just below the $2,500 threshold. The
summation graph is shown in Figure 1-04.
Figure 1-04 shows the results of the summation test applied
to the card data.
The summation test in Figure 1-04 shows that there is a
single record, or a group of records with the same first-two
digits, that are large when compared to the other numbers.
The spike is at 31. An Access query was used to select all
31 records and to sort the results by Amount descending.
The query identified the 3,102,000 pesos transaction.
The summation test was run on the Amounts greater than or
equal to $10. The summation test could be run on all the
positive amounts in a dataset. The expected sum for each
digit combination was $433,077 ($38,976,906/90). The 24
sum is $2.456 million. The difference is about $2 million.
The drill-down query showed that there were eight
transactions for about $24,500 and about 850 transactions
for about $2,450 each, together summing to about $2,250,000.
There is a large group of numbers that are relatively large
and have first-two digits of 24 in common. So, not only is
the spike on the first-order graph significant, but these
transactions are also larger than expected with respect to
their magnitudes.
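The summation test can be sketched by summing the amounts for each first-two digit combination and comparing each sum with the total divided by 90. The function names and sample amounts are assumptions for illustration.

```python
from collections import defaultdict
from decimal import Decimal


def first_two_digits(amount):
    """First-two significant digits of a positive amount, e.g. 24500 -> 24."""
    digits = Decimal(str(amount)).as_tuple().digits
    return int("".join(str(d) for d in digits[:2]).ljust(2, "0"))


def summation_test(amounts, minimum=10):
    """Return ({digits: sum of amounts}, expected sum per digit combination)."""
    sums = defaultdict(float)
    for amount in amounts:
        if amount >= minimum:
            sums[first_two_digits(amount)] += amount
    expected = sum(sums.values()) / 90  # equal share for each of 10..99
    return {d: sums.get(d, 0.0) for d in range(10, 100)}, expected


# Hypothetical amounts, including one very large transaction.
sample = [3102000.00, 2450.00, 2450.00, 24500.00, 45.00, 3.62]
sums, expected = summation_test(sample)
spike = max(sums, key=sums.get)
print("largest sum at digits", spike, "expected per combination", round(expected, 2))

# The expected sum quoted in the text: $38,976,906 / 90.
print(round(38_976_906 / 90))
```

A single very large record dominates its digit combination, which is why the drill-down query sorts the selected records by Amount descending.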
The last-two digits test is usually only run as a test for
number invention. The number invention tests are usually
not run on accounts payable data or other types of
payments data because any odd last-two digits results will
be noticeable from the number duplication test. For
purchase amounts this test will usually simply show that
many numbers end with 00. This should also be evident
from the number duplication test. The results are shown in
Figure 1-05.
Figure 1-05 shows the results of the last-two digits test
applied to the card data.
The result of the last-two digits test is shown in Figure 1-05.
There is a large spike at 00, which is as expected. The
00 occurs in amounts such as $10.00 or $25.00. An
interesting finding is the spike at 95. This was the result of
2,600 transactions with the cents amounts equal to 95 cents,
as in $99.95.
The last-two digits test was run on the numbers equal to or
larger than $10. If the test was run on all the amounts, there
would have been large spikes at 62 and 67 from the FedEx
charges for $3.62 and $3.67. The large spike in the first panel of
Figure 1-03 was for amounts of $3.62 and $3.67, which
have last-two digits of 62 and 67, respectively. The 62 and
67 spikes would not be there because of fraud, but rather
because of the abnormal duplications of one specific type
of transaction.
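For currency amounts, the last-two digits are the cents, so the test can be sketched as a tabulation of cents values from 00 to 99. The sketch assumes amounts recorded to two decimal places and assumes a uniform benchmark of 0.01 per combination; the function names and sample amounts are hypothetical.

```python
from collections import Counter
from decimal import Decimal


def last_two_digits(amount):
    """Cents portion of a currency amount, e.g. 99.95 -> 95 and 10.00 -> 0."""
    cents = Decimal(str(amount)).quantize(Decimal("0.01")) * 100
    return int(cents) % 100


def last_two_digits_test(amounts, minimum=10):
    """Actual proportion of each last-two digit combination 00..99."""
    selected = [a for a in amounts if a >= minimum]
    if not selected:
        raise ValueError("no amounts at or above the minimum")
    counts = Counter(last_two_digits(a) for a in selected)
    return {d: counts[d] / len(selected) for d in range(100)}


# Hypothetical amounts; the two FedEx-style charges fall below the $10 minimum.
sample = [10.00, 25.00, 99.95, 19.95, 3.62, 3.67]
proportions = last_two_digits_test(sample)
spikes = {d: p for d, p in proportions.items() if p > 0.01}
print(spikes)
```

Running the same function with minimum=0 would bring the 62 and 67 combinations into the table, which mirrors the point made about the FedEx charges.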
The second-order test looks at the relationships and patterns
found in data and is based on the digits of the differences
between amounts that have been sorted from smallest to
largest (ordered). These digit patterns are expected to
closely approximate the expected frequencies of Benford’s
Law. The second-order test gives few, if any, false
positives in that if the results are not as expected (close to
Benford’s Law), then the data does have some
characteristic that is rare and unusual, abnormal, or
irregular. The second-order results are shown in Figure 1-06.
Figure 1-06 shows the second-order results of the card
purchases amounts.
The result of the second-order test is shown in Figure 1-06.
The graph has a series of prime spikes (10, 20, … 90) that
have a Benford-like pattern and a second series of minor
spikes (11–19, 21–29, …) that follow another Benford-like
pattern. The prime spikes are relatively large. These results
are as expected for a large dataset with numbers that are
tightly clustered in a small ($1 to $2,500) range. The
second-order test doesn’t indicate any anomaly here, and
this test usually doesn’t indicate any anomaly except in
rare, highly anomalous situations.
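The mechanics of the second-order test can be sketched as follows: sort the amounts, take the differences between adjacent amounts, and tabulate the first-two digits of those differences. Working in whole cents and dropping zero differences (which have no first digit) are assumptions of this sketch, as are the function name and sample amounts.

```python
from collections import Counter


def second_order_digits(amounts):
    """Counts of the first-two digits of differences between sorted amounts."""
    # Convert to whole cents so that differences are exact integers.
    cents = sorted(round(a * 100) for a in amounts)
    diffs = [b - a for a, b in zip(cents, cents[1:]) if b > a]
    # Leading digits do not depend on scale, so cents work as well as dollars.
    return Counter(int(str(d)[:2].ljust(2, "0")) for d in diffs)


# Hypothetical amounts; the duplicate 10.30 produces a zero difference.
sample = [10.00, 10.10, 10.30, 10.30, 12.30, 35.30]
print(second_order_digits(sample))
```

When amounts are tightly clustered and recorded in cents, many differences are small round values such as 0.01, 0.02, or 0.10, which is consistent with the prime spikes at 10, 20, … 90 described above.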
The number duplication test analyzes the frequencies of the
numbers in a dataset. This test indicates which numbers
were causing the spikes in the first-order test. This test has
had good results when run against bank account numbers,
and has also been used with varying levels of success on
inventory counts, temperature readings, health care claims,
airline ticket refunds, airline flight liquor sales, electricity
meter readings, and election counts. The results are shown
in Figure 1-07.
Figure 1-07 shows the results of the number duplication
test.
The number duplication results in Figure 1-07 show four
amounts, all below $4.00, in the first four positions.
Another query showed that 99.9 percent of these amounts
were for FedEx charges. While the charges might be
wasteful, they were presumably not fraudulent. A second
number duplication test was run on the numbers below
$2,500. This would give some indication as to how
“creative” the cardholders were when trying to stay at or
below the $2,500 maximum allowed. Purchases could
exceed $2,500 if authorized. The “just below $2,500” table
is shown in Figure 1-08.
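The number duplication test amounts to a frequency count of the exact amounts, listed from most to least frequent. A minimal sketch, with a hypothetical function name and sample amounts:

```python
from collections import Counter


def number_duplication(amounts, top=10):
    """The most frequently occurring amounts with their counts."""
    return Counter(amounts).most_common(top)


# Hypothetical amounts with repeated small charges.
sample = [3.62, 3.62, 3.62, 3.67, 3.67, 2500.00, 45.00]
for amount, count in number_duplication(sample, top=3):
    print(f"{amount:>10.2f}  {count}")
```

The amounts at the top of this table are the ones that explain the spikes seen in the first-order test.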
Figure 1-08 shows the purchase amounts in the $2,495 to
$2,500 range.
The $2,495 to $2,500 transactions in Figure 1-08 show
some interesting patterns. The large count of “at the
money” purchases of $2,500 shows that this number has
some real financial implications. Either suppliers are
marginally reducing their prices so that the bill can be paid
easily and quickly, or some other factors are at play.
Another possible reason is that cardholders are splitting
their purchases and the excessive count of $2,500
transactions includes partial payments for other larger
purchases. Card transaction audits should select the $2,500
transactions for a close perusal. Also of interest in Figure 1-08
is the set of five transactions for exactly $2,499.99 and
the 42 transactions for exactly $2,499. There are also 21
other transactions in the $2,499.04 to $2,499.97 range. It is
surprising that people think that they are the only ones who
are capable of gaming the system. The review of the eight
transactions of $2,497.04 showed that these were all items
purchased from GSA Global Supply, a purchasing program
administered by the General Services Administration. It
seems that even the federal government itself takes the card
limit into account when setting prices.
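A table of purchases at or just below the card limit can be sketched as a filtered frequency count. The function name, the default window of $5 below the limit, and the sample amounts are assumptions for illustration.

```python
from collections import Counter


def near_limit_table(amounts, limit=2500.00, window=5.00):
    """Counts of amounts in [limit - window, limit], largest amount first."""
    selected = [a for a in amounts if limit - window <= a <= limit]
    return sorted(Counter(selected).items(), reverse=True)


# Hypothetical amounts around a $2,500 card limit.
sample = [2500.00, 2500.00, 2499.99, 2499.00, 2497.04, 2494.99, 100.00]
for amount, count in near_limit_table(sample):
    print(f"{amount:>10.2f}  {count}")
```

The same function can be pointed at any other control threshold by changing the limit and window arguments.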
Conclusions
The Nigrini Cycle is the start of the suite of risk-based
auditing tests designed to detect fraud, and to test both the
effectiveness of controls and transaction accuracy. The use
of audit techniques that cover the entire population reduces
detection risk and increases the chances of finding unusual
transactions.
Recent applications of Benford’s Law include:
Testing the accuracy of census data
The link to the Fibonacci sequence
Changing numbers to a different base
Tax evasion caused by the tax tables
Financial statement fraud
Authenticity of ledger balances
The accuracy of bankruptcy filing data
Identification of Ponzi schemes
Invention of charitable gift amounts
It is surprising that people are still surprised by the
discovery of fraud. The financial press and popular press
regularly report on only the largest cases. It seems that
when people are given the opportunity to commit fraud,
many do indeed commit the act. A few general comments
to take note of include:
Forensic analytics is only one part of the forensic
investigations process. An entire investigation cannot
be completed with the computer alone. The
investigation would usually include a review of paper
documents, interviews, reports and presentations, and
concluding actions.
It is best to collect and analyze the data at the start of
the investigation, and long before the suspect suspects
that an investigation is underway. In proactive fraud
detection, the data is automatically analyzed before the
suspect catches any wind of an investigation.
Incomplete and inaccurate data might give rise to
incorrect and incomplete insights. Data should be
checked for completeness and accuracy before being
analyzed.
References
Benford, F., “The Law of Anomalous Numbers,”
Proceedings of the American Philosophical Society 78
(1938), pages 551–572.
Berger, A., Bunimovich, L.A., and Hill, T.P., “One-
Dimensional Dynamical Systems and Benford’s Law,”
Transactions of the American Mathematical Society 357 (1)
(2005), pages 197–219.
Berger, A., and Hill, T.P., “Newton’s Method Obeys
Benford’s Law,” American Mathematical Monthly 114
(August/September 2007), pages 588–601.
Durtschi, C., Hillison, W., and Pacini, C., “The Effective
Use of Benford’s Law in Detecting Fraud in Accounting
Data,” Journal of Forensic Accounting 5 (2004), pages 17–33.
Hoyle, D., Rattray, M., Jupp, R., and Brass, A., “Making
Sense of Microarray Data Distributions,” Bioinformatics 18
(4) (2002), pages 576–584.
Jang, D., Kang, J., Kruckman, A., Kudo, J., and Miller,
S.J., “Chains of Distributions, Hierarchical Bayesian
Models, and Benford’s Law,” Journal of Algebra, Number
Theory: Advances and Applications: Forthcoming (2009).
Kontorovich, A.V., and Miller, S.J., “Benford’s Law,
Values of L-Functions, and the 3x+1 Problem,” Acta
Arithmetica 120 (2005), pages 269–297.
Miller, S. J., and Nigrini, M.J., “The Modulo 1 Central
Limit Theorem and Benford’s Law for Products,”
International Journal of Algebra 2 (3) (2008), pages 119–130.
Nigrini, M.J., Forensic Analytics: Methods and Techniques
for Forensic Accounting Investigations (New Jersey: John
Wiley & Sons, 2011).
Nigrini, M.J., Benford’s Law: Applications for Forensic
Accounting, Auditing, and Fraud Detection (New Jersey:
John Wiley & Sons, 2012).
Nigrini, M.J., and Miller, S.J., “Benford’s Law Applied to
Hydrology Data—Results and Relevance to Other
Geophysical Data,” Mathematical Geology 39 (5) (2007),
pages 469–490.
Raimi, R., “The First Digit Problem,” American
Mathematical Monthly 83 (August/September 1976), pages
521–538.