Why is Internal Audit so Hard?
2 ©2014
Waste, Abuse, Fraud
Waves of Change
1st Wave
Personal Computers
Electronic Spreadsheets
The end of hand calculation
2nd Wave: ERPs
Personal Computers
Electronic Spreadsheets
The end of hand calculation
ERPs – all our data in one place
Database analysis
Opens the Age of Rules
2nd Wave Also Opens the Age of CAATs (Computer-Assisted Audit Techniques)
Beginner’s CAATs:
Basic database manipulation: join,
summarize, append, stratify, sample, extract
Basic testing: duplicates, gaps
Intermediate CAATs:
Automate our rules and (limited) automated testing, for example in purchase-to-pay:
o P.O. with blank / zero amount
o Split P.O.s
o Duplicate invoices
o Invoice amount paid > goods received
o Invoices with no matching receiving report
o Multiple invoices for same P.O. and date
o Pattern of sequential invoices from a vendor
o Non-approved vendors
o Employee and vendor with same: Name, address, bank, etc.
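A few of the rules listed above can be sketched as simple deterministic tests. This is only an illustration: the record layouts, field names and sample values are hypothetical and would need to be mapped to the actual ERP extract.

```python
from collections import defaultdict

# Hypothetical sample data; field names are illustrative, not from any ERP.
purchase_orders = [
    {"po": "PO-10", "amount": 1200.00},
    {"po": "PO-11", "amount": None},
    {"po": "PO-12", "amount": 0.0},
]
invoices = [
    {"invoice_id": "INV-001", "vendor": "V1", "po": "PO-10", "amount": 1200.00, "date": "2014-03-01"},
    {"invoice_id": "INV-002", "vendor": "V1", "po": "PO-10", "amount": 1200.00, "date": "2014-03-01"},
    {"invoice_id": "INV-003", "vendor": "V3", "po": "PO-12", "amount": 450.00, "date": "2014-03-05"},
]
vendor_bank = {"V1": "111-222", "V3": "555-666"}      # vendor -> bank account
employee_bank = {"E1": "999-000", "E2": "555-666"}    # employee -> bank account


def blank_or_zero_amount(pos):
    """Rule: P.O. with blank / zero amount."""
    return [po["po"] for po in pos if not po.get("amount")]


def exact_duplicates(rows, keys=("vendor", "amount", "date")):
    """Rule: invoices that agree exactly on every key field."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row["invoice_id"])
    return [ids for ids in groups.values() if len(ids) > 1]


def shared_bank_accounts(vendors, employees):
    """Rule: employee and vendor with the same bank account."""
    by_account = defaultdict(list)
    for emp, account in employees.items():
        by_account[account].append(emp)
    return [(v, e) for v, account in vendors.items() for e in by_account.get(account, [])]


print(blank_or_zero_amount(purchase_orders))
print(exact_duplicates(invoices))
print(shared_bank_accounts(vendor_bank, employee_bank))
```

Calling `exact_duplicates(invoices, keys=("po", "date"))` gives the "multiple invoices for same P.O. and date" test with the same function.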
3rd Wave: Predictive Analytics
Personal Computers
Electronic Spreadsheets
The end of hand calculation
ERPs – data in one place
Database analysis
Age of Rules
Predictive Analytics focuses our attention on important /
suspect transactions.
Comes in many different flavors
o Each somewhat more sophisticated
o Each making audit work more accurate and our lives easier
(GTAG 16, 2011: “The use of data analysis can significantly reduce audit risk by honing the risk assessment and stratifying the population.”)
Predictive Analytics Sophisticated
Statistical Insights
True Predictive & Continuous Audit
5 Levels of Predictive Analytics
1. Statistical Insights
2. Fuzzy Logic
3. Clustering
4. Predictive Modeling
5. Big Data Analytics
Statistical Insights: Benford’s Law
The most famous name in forensic accounting does not belong to an accountant: Frank Benford (1883-1948) was an American physicist.
In 1938, at the age of 55, he published a paper titled “The Law of Anomalous Numbers”.
Benford’s Law is a statement about the occurrence of digits in lists of data.
Useful in detecting fraudulent invoices or other numbered documents.
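For reference, Benford’s Law gives the expected share of leading digit d as log10(1 + 1/d). The sketch below computes that expected distribution and an observed first-digit distribution for a list of amounts; the function names and the two-decimal formatting of amounts are assumptions made for illustration.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(amount):
    """First non-zero digit of an amount (formatted to cents), or None."""
    for ch in f"{abs(amount):.2f}":
        if ch in "123456789":
            return int(ch)
    return None


def observed_distribution(amounts):
    """Share of each leading digit 1-9 among the usable amounts."""
    digits = [d for d in map(first_digit, amounts) if d is not None]
    if not digits:
        raise ValueError("no usable amounts")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


expected = benford_expected()
for digit, share in expected.items():
    print(digit, f"{share:.1%}")
```

The digit 1 is expected to lead about 30% of the time and the digit 9 under 5%, which is why a flat or top-heavy first-digit profile stands out.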
Benford’s Law Distribution of 1st Digits
[Chart: Benford’s distribution vs. observed distribution]
Which to Investigate?
For distributions that appear to be anomalous:
1. Calculate the Kolmogorov-Smirnov distance between the vendor’s first-digit distribution and the ideal Benford distribution.
2. Investigate those with the largest numerical scores.
Kolmogorov-Smirnov distance is the greatest absolute difference between the two cumulative distribution functions (CDFs).
Source: Graph: Pivotal, Inc., Machine Learning for Forensic Accounting, 2013
Benford’s Law of first-digit distribution follows a logarithmic pattern and applies to a surprisingly large number of datasets, including country populations, Twitter users by follower count and many more. See testingbenefordslaw.com for more examples.
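The two steps above can be sketched as follows: build each vendor’s cumulative first-digit distribution, compare it with the cumulative Benford distribution, and rank vendors by the largest gap. The vendor names and amounts are made up, and with only a handful of invoices per vendor the score is noisy, so a minimum sample size would be a sensible extra assumption in practice.

```python
import math
from collections import Counter

DIGITS = range(1, 10)
BENFORD = [math.log10(1 + 1 / d) for d in DIGITS]


def first_digit(amount):
    """First non-zero digit of an amount (formatted to cents), or None."""
    return next((int(c) for c in f"{abs(amount):.2f}" if c in "123456789"), None)


def ks_distance(amounts):
    """Largest absolute gap between observed and Benford first-digit CDFs."""
    digits = [d for d in map(first_digit, amounts) if d is not None]
    if not digits:
        raise ValueError("no usable amounts")
    counts = Counter(digits)
    observed_cdf = expected_cdf = worst = 0.0
    for d, p in zip(DIGITS, BENFORD):
        observed_cdf += counts.get(d, 0) / len(digits)
        expected_cdf += p
        worst = max(worst, abs(observed_cdf - expected_cdf))
    return worst


def rank_vendors(amounts_by_vendor):
    """Vendors sorted from most to least anomalous first-digit profile."""
    scores = {v: ks_distance(a) for v, a in amounts_by_vendor.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical vendors: A bills only amounts starting with 9.
sample = {
    "Vendor A": [900, 950, 9100],
    "Vendor B": [100, 150, 200, 310, 120, 450, 180, 260, 700, 130],
}
for vendor, score in rank_vendors(sample):
    print(vendor, round(score, 3))
```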
Fuzzy Logic – Duplicate Invoice Detection
Problem: Deterministic rules expect key information to be exactly the same:
• Vendor name
• Address
• Phone
• Invoice amount
• Date
• Bank account
• TIN
If the criteria are kept tight: too many false negatives (missed duplicates).
If the criteria are made loose: too many false positives, which leaves too many items to investigate.
Fuzzy Matching Using Natural Language Processing
Vendors are considered ‘close’ matches when the following are identical or sufficiently similar:
• Vendor names
• Remit vendor
• Address & Phone
• Other text-based fields of your choosing
Steps in Natural Language Processing (NLP)
1. Tokenize the vendor names
2. Remove ‘stop words’ (of, and, the, …) and special characters
3. Process synonyms and abbreviations
4. Calculate the tf-idf for each word (term frequency – inverse document frequency)
5. Calculate the cosine similarity between documents to identify ‘close’ matches
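The NLP steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions: the synonym table, the 0.6 threshold and the sample vendor names are hypothetical, and the idf uses one common smoothed variant rather than any particular tool’s formula.

```python
import math
import re
from collections import Counter

STOP_WORDS = {"of", "and", "the"}                    # examples from the slide
SYNONYMS = {"inc": "incorporated", "corp": "corporation", "co": "company"}  # hypothetical


def tokens(name):
    """Tokenize, drop special characters and stop words, map abbreviations."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return [SYNONYMS.get(w, w) for w in words if w not in STOP_WORDS]


def tfidf_vectors(names):
    """One sparse tf-idf vector (dict) per vendor name."""
    docs = [Counter(tokens(n)) for n in names]
    doc_freq = Counter(term for doc in docs for term in doc)
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        vectors.append({
            term: (count / total) * (math.log((1 + len(docs)) / (1 + doc_freq[term])) + 1)
            for term, count in doc.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def close_matches(names, threshold=0.6):
    """All pairs of names whose cosine similarity reaches the threshold."""
    vecs = tfidf_vectors(names)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            score = cosine(vecs[i], vecs[j])
            if score >= threshold:
                pairs.append((names[i], names[j], round(score, 3)))
    return pairs


vendor_names = ["Acme Supply Co.", "ACME Supply Company", "Globex Corporation"]
print(close_matches(vendor_names))
```

Comparing every pair is quadratic in the number of vendors, so a real vendor master would likely need some blocking step first (for example, comparing only names that share a token).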
Fuzzy Matching in Numerical Strings
Numerical values (strings) are considered close when:
Invoice IDs
• Edit distance is small
Dates
• Are the same
• Are within 7 days of each other
• Have day and month swapped (3/11/14 vs 11/3/14)
Payments
• Amounts are identical
• Edit distances are small
TINs, Bank Accounts, Other Numerics
• Edit distances are small
‘Edit Distance’ counts substitutions, additions, deletions and transpositions, and is calculated as the Damerau-Levenshtein distance
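As a sketch of the checks above, the code below implements the optimal string alignment distance, a commonly used restricted form of Damerau-Levenshtein that counts substitutions, additions, deletions and adjacent transpositions (it is not the unrestricted variant). The date helper and its 7-day default follow the slide; the function names and sample values are illustrative.

```python
from datetime import date


def osa_distance(a, b):
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i                      # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j                      # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # addition
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)   # transposition
    return d[-1][-1]


def dates_close(d1, d2, window_days=7):
    """True if the dates are within the window or have day and month swapped."""
    if abs((d1 - d2).days) <= window_days:
        return True
    try:
        swapped = date(d2.year, d2.day, d2.month)
    except ValueError:                   # e.g. day 30 cannot be a month
        return False
    return swapped == d1


print(osa_distance("INV-10234", "INV-10243"))          # one adjacent transposition
print(dates_close(date(2014, 3, 11), date(2014, 11, 3)))
```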
Fuzzy Matching
Using as many features of the invoice as desired
o Not limited to 3 dimensions
1. Determine the best ‘distance’ metric for each dimension
o Some are text-based
o Others are numerical strings
2. Calculate the ‘distance’ between invoices
3. Adjust the measurement values to yield the best true positive result
4. Investigate any pair of invoices where the ‘distance’ is within your threshold
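One way to read the four steps above is as a weighted sum of per-field distances. The sketch below is only an assumption about how that could look: it uses Python’s `difflib.SequenceMatcher` as a stand-in text metric (not tf-idf or Damerau-Levenshtein), and the weights, threshold and sample invoices are hypothetical values that step 3 would tune.

```python
from datetime import date
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical weights per dimension; tuning these is step 3.
WEIGHTS = {"vendor": 0.4, "invoice_id": 0.3, "amount": 0.2, "date": 0.1}


def text_distance(a, b):
    """0.0 for identical strings, 1.0 for nothing in common."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def amount_distance(a, b):
    """Relative difference between two amounts, capped at 1.0."""
    return min(abs(a - b) / max(abs(a), abs(b), 1e-9), 1.0)


def date_distance(a, b, window_days=7):
    """Days apart as a fraction of the window, capped at 1.0."""
    return min(abs((a - b).days) / window_days, 1.0)


def invoice_distance(x, y, weights=WEIGHTS):
    """Weighted sum of the per-field distances, between 0.0 and 1.0."""
    parts = {
        "vendor": text_distance(x["vendor"], y["vendor"]),
        "invoice_id": text_distance(x["invoice_id"], y["invoice_id"]),
        "amount": amount_distance(x["amount"], y["amount"]),
        "date": date_distance(x["date"], y["date"]),
    }
    return sum(weights[k] * parts[k] for k in weights)


def candidate_pairs(invoices, threshold=0.15):
    """Pairs of invoice IDs whose combined distance is within the threshold."""
    return [
        (x["invoice_id"], y["invoice_id"])
        for x, y in combinations(invoices, 2)
        if invoice_distance(x, y) <= threshold
    ]


sample_invoices = [
    {"vendor": "Acme Supply Co", "invoice_id": "INV-10234", "amount": 1200.00, "date": date(2014, 3, 11)},
    {"vendor": "ACME Supply Co.", "invoice_id": "INV-10243", "amount": 1200.00, "date": date(2014, 3, 12)},
    {"vendor": "Globex Corporation", "invoice_id": "77-0001", "amount": 85.50, "date": date(2014, 1, 2)},
]
print(candidate_pairs(sample_invoices))
```

Loosening the threshold trades missed duplicates for more pairs to review, which is the same false negative / false positive balance described for deterministic rules.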
Clustering – Identify Invoice Anomalies with Vendor Baselining
Vendors will tend to have patterns in their billing but may have more than one pattern based on service, ordering business unit, specific users, delivery address, etc. There may be multiple ‘normal’ behaviors.
Identify the true outliers for investigation by:
• Featurizing the invoices (see fuzzy logic)
• Running a clustering algorithm such as K-Means
• Identifying clusters with low populations and low density as potential anomalies
Example: Vendor A
Pattern 1: Payments ~$1,000 to $5,000; Bus Unit: Bldg Maintenance; Users: Loc 1, Loc 2, Loc 3; Paid by ACH; To address ABC
Pattern 2: Payments >$100,000; Bus Unit: Construction; Users: Loc 4; Paid by ACH; To address DEF
Pattern 3: Payments <$700; Bus Unit: Security; Users: Loc Z; Paid by check; To address GHI
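The baselining idea can be sketched with a small K-Means written in plain Python. Everything here is illustrative: the invoices loosely mirror the Vendor A example, the two features (log of the amount and a paid-by-check flag) and the 20% population cut-off are assumptions, and only low population is checked, not density.

```python
import math
from collections import Counter


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    # Farthest-first initialisation keeps this sketch deterministic.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iterations):
        labels = [min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points]
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def sparse_clusters(labels, min_share=0.2):
    """Cluster labels holding less than min_share of all invoices."""
    counts = Counter(labels)
    return sorted(lab for lab, n in counts.items() if n / len(labels) < min_share)


# Hypothetical invoices for one vendor: (amount, paid_by_check)
vendor_invoices = [
    (1200, False), (2500, False), (3100, False), (4800, False),
    (1500, False), (2200, False), (3900, False), (1800, False),
    (120000, False), (150000, False), (250000, False), (180000, False),
    (450, True), (620, True),
]
features = [(math.log10(amount), float(by_check)) for amount, by_check in vendor_invoices]
labels = kmeans(features, k=3)
flagged = sparse_clusters(labels)
to_review = [inv for inv, lab in zip(vendor_invoices, labels) if lab in flagged]
print(to_review)
```

The choice of k and the feature scaling both shape the result, so in practice they would be reviewed per vendor rather than fixed as here.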
Predictive Modeling: Time Travel in the 21st Century
Type 1: Prediction by Scoring
Your Financial System
Current You: Examine lots of possible FWA (fraud, waste, abuse) invoices every month
Machine Learning System: Do this once; ML learns what is FWA
Future You: ML continuously monitors and scores from 1 to 100; examine only the high-scoring items.
Type 2: Prediction by Actual Value
$ Premium, SIC code, # employees, Address, $ Sales, N1 … N100
Claim File: N1 … N100
Machine Learning System
Predicted Premium    Actual Premium Paid    Variance
$ 10,254             $ 9,946                -3%
$ 25,687             $ 26,971               5%
$ 5,621              $ 5,452                -3%
$ 96,321             $ 98,247               2%
$ 85,741             $ 72,880               -18%
Example from Insurance
Historical data from many sources is combined to train the ML System to predict the correct $ premium.
Investigate the outliers.
Accuracy can be very high, in the range of 90% to 98%, based on historical data used.
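The "investigate the outliers" step can be sketched from the predicted and actual premiums above. The slide’s percentages are consistent with (actual - predicted) / actual, so that formula is assumed here; the 10% tolerance is a hypothetical cut-off.

```python
# (predicted premium, actual premium paid) pairs from the example above
premiums = [
    (10254, 9946),
    (25687, 26971),
    (5621, 5452),
    (96321, 98247),
    (85741, 72880),
]


def variance(predicted, actual):
    """Signed gap between actual and predicted, relative to actual (assumed)."""
    return (actual - predicted) / actual


def outliers(rows, tolerance=0.10):
    """Rows whose variance exceeds the tolerance in either direction."""
    return [
        (predicted, actual, round(variance(predicted, actual) * 100))
        for predicted, actual in rows
        if abs(variance(predicted, actual)) > tolerance
    ]


for predicted, actual in premiums:
    print(predicted, actual, f"{variance(predicted, actual):+.0%}")
print(outliers(premiums))
```

With a 10% tolerance only the last row is flagged for investigation.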
So What is a Machine Learning System?
ML Mathematical Cores
• Regression
• K-Means
• Bayesian Classifiers
• Decision Trees CART / CHAID
• Support Vector Machines
• Artificial Neural Nets (ANN)
• Genetic Programs
Systems (very partial list)
Advanced CAATs
• Pivotal
• Oversight (as a service)
• EMC
Proprietary – General Purpose
• SAS
• IBM SPSS
• RapidMiner
Open Source – Do It Yourself
• PSPP
• Weka
• R
• Python
4th Wave: Big Data Analytics
Personal Computers
Electronic Spreadsheets
The end of hand calculation
ERPs – data in one place
Database analysis
Age of Rules
Predictive Analytics
Statistical Insights
True Predictive & Continuous Audit
Big Data Analytics
o Addresses new concerns regarding social media and other risks from text- and image-based sources.
o Continues to improve the accuracy of predictive analytics, further reducing false positives and false negatives.
o Allows true continuous audit of even the largest enterprises as computation costs drop to fractions of previous investments.
Got Big Data?
Volume: High; Terabytes or Petabytes; Very long retrieval and processing times
Velocity: Batch; Near time; Real Time; Streams
Variety: Structured; Unstructured; Semi-structured; All at once
It’s Really About Big Data Technology
The database
Source: EMC
Search & Retrieve
What are Big Data Analytics?
1st – The haystack gets a lot bigger
• Traditional structured data
• Unstructured data
o Documents
o Email
o Web content
o Social Media
2nd – Thanks to Hadoop and Massively Parallel Processing
• Query and retrieval times are short
• Cost of even massive storage is very low
3rd – Many predictive modeling techniques can also be applied to structured and unstructured data
• Models become more accurate
4th – New techniques for unstructured data based on NLP
• Sentiment analysis
Focus on Social Media Risks*
*Risk also arises from other types of unstructured and semi-structured data:
• Internal documents
• Images stored centrally or on users’ machines
Social Media Risks
[Bar chart, axis 0 to 7; bar labels not recoverable. Values: 7.3, 6.6, 5.6, 4.9, 4.9, 4.0, 5.5, 6.1, 6.9, 2.9]
Source: 2014 Internal Audit Capabilities and Needs Survey Report, Protiviti
“They gave me financial aid then I cancelled all my classes and kept the money”
“Sit in at the Chancellor’s Office at 3:00”
“Can’t believe how much I made on eBay today”
“Professor X is such a perv”
“Did you hear we’re losing accreditation. Don’t sign up next term.”
“I found out they’re cutting my budget. I’m going to the union before this gets out”
“I’ll fix them. I put a virus on the lab computer.”
“The instructor said I could make money after school fixing cars in the auto shop”
“I just downloaded a bunch of student financial data from the finance system”
“Joe sold me the answers to tomorrow’s test”
Personal Computers
Electronic Spreadsheets
The end of hand calculation
ERPs – data in one place
Database analysis
Age of Rules
Predictive Analytics
Statistical Insights
True Predictive & Continuous Audit
Social media, text, image
Improved accuracy
Cost effective continuous audit
The Age of Smart CAATs
You Don’t Need to be a Data Scientist, Just a Smart Tool User
Questions
Contact Information
Bill Vorhies
President & Chief Data Scientist
Data-Magnum
www.Data-Magnum.com
818.257.2035
“I shall find a way or make one.” Admiral Robert Peary
Big Data & Predictive Analytics