Why is Internal Audit so Hard?

2 ©2014


[Figure: Waste, Abuse, Fraud]

Waves of Change

1st Wave
• Personal Computers
• Electronic Spreadsheets
• The end of hand calculation

2nd Wave: ERPs

[Timeline: Personal Computers; Electronic Spreadsheets; The end of hand calculation]

• ERPs: all our data in one place
• Database analysis
• Opens the Age of Rules

2nd Wave Also Opens the Age of CAATs

Beginner’s CAATs (computer-assisted audit techniques):
• Basic database manipulation: join, summarize, append, stratify, sample, extract
• Basic testing: duplicates, gaps

Intermediate CAATs:
• Automate our rules and (limited) automated testing, for example in purchase-to-pay:
o P.O. with blank / zero amount
o Split P.O.s
o Duplicate invoices
o Invoice amount paid > goods received
o Invoices with no matching receiving report
o Multiple invoices for same P.O. and date
o Pattern of sequential invoices from a vendor
o Non-approved vendors
o Employee and vendor with same: Name, address, bank, etc.
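A few of the tests named above (duplicates, gaps, blank / zero amounts) can be sketched in a few lines of Python. This is only an illustration: the record layout, the field names and the sample values are hypothetical and do not come from any particular ERP or CAAT product.

```python
from collections import defaultdict

# Hypothetical invoice records; field names are illustrative only.
invoices = [
    {"invoice_no": 1001, "vendor": "ACME", "po": "PO-1", "amount": 500.00, "date": "2014-03-01"},
    {"invoice_no": 1002, "vendor": "ACME", "po": "PO-1", "amount": 500.00, "date": "2014-03-01"},
    {"invoice_no": 1004, "vendor": "Bolt Co", "po": "PO-2", "amount": 0.00, "date": "2014-03-02"},
    {"invoice_no": 1005, "vendor": "Bolt Co", "po": "PO-3", "amount": 120.50, "date": "2014-03-05"},
]

def find_duplicates(records, keys):
    """Group records that share identical values on every key (exact match)."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[k] for k in keys)].append(rec["invoice_no"])
    return [nums for nums in groups.values() if len(nums) > 1]

def find_gaps(numbers):
    """Return the numbers missing from an otherwise sequential series."""
    present = set(numbers)
    return [n for n in range(min(present), max(present) + 1) if n not in present]

def blank_or_zero(records, field="amount"):
    """Flag records whose amount is missing, None or zero."""
    return [r["invoice_no"] for r in records if not r.get(field)]

duplicates = find_duplicates(invoices, keys=("vendor", "po", "amount", "date"))
gaps = find_gaps([r["invoice_no"] for r in invoices])
zero_amounts = blank_or_zero(invoices)
```

Each rule is deterministic: a record either matches exactly or it does not.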

3rd Wave: Predictive Analytics

[Timeline: Personal Computers; Electronic Spreadsheets; The end of hand calculation; ERPs (data in one place); Database analysis; Age of Rules]

Predictive Analytics focuses our attention on important / suspect transactions.

Comes in many different flavors
o Each somewhat more sophisticated
o Each making audit work more accurate and our lives easier

(GTAG 16, 2011: “The use of data analysis can significantly reduce audit risk by honing the risk assessment and stratifying the population.”)

Predictive Analytics: Sophisticated Statistical Insights; True Predictive & Continuous Audit

5 Levels of Predictive Analytics

1. Statistical Insights
2. Fuzzy Logic
3. Clustering
4. Predictive Modeling
5. Big Data Analytics

Statistical Insights: Benford’s Law

The most famous name in forensic accounting does not belong to an accountant.

In 1938, at the age of 55, Frank Benford published a paper titled “The Law of Anomalous Numbers”.

Benford’s Law is a statement about the occurrence of digits in lists of data.

Useful in detecting fraudulent invoices or other numbered documents.

Frank Benford (1883-1948), an American physicist.

Benford’s Law Distribution of 1st Digits

[Chart: Benford’s distribution vs. observed distribution of first digits]

Which to Investigate?

For distributions that appear to be anomalous:
1. Calculate the Kolmogorov-Smirnov distance between the vendor’s first digit distribution and the ideal Benford distribution.
2. Investigate those with the largest numerical scores.

The Kolmogorov-Smirnov distance is the largest absolute difference between the two cumulative distribution functions (CDFs).

Graph source: Pivotal, Inc., Machine Learning for Forensic Accounting, 2013

Benford’s Law of first digit distribution follows a logarithmic pattern and applies to a surprisingly large number of datasets, including country populations, Twitter users by follower count and many more. See testingbenefordslaw.com for more examples.
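The two steps above can be sketched as follows. The expected first-digit probability under Benford’s Law is log10(1 + 1/d) for d = 1 to 9; the vendor names and amounts below are made up for illustration, and this is a minimal sketch rather than the method used in the cited graph.

```python
import math
from collections import Counter

def benford_expected():
    """Expected first-digit probabilities: P(d) = log10(1 + 1/d), d = 1..9."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]

def first_digit(amount):
    """Leading non-zero digit of a non-zero amount."""
    # Scientific notation puts the leading digit first, e.g. '1.2345e+02'.
    return int(f"{abs(amount):.15e}"[0])

def observed_distribution(amounts):
    """Share of amounts starting with each digit 1..9 (zeros are skipped)."""
    digits = [first_digit(a) for a in amounts if a != 0]
    counts = Counter(digits)
    return [counts.get(d, 0) / len(digits) for d in range(1, 10)]

def ks_distance(observed, expected):
    """Largest absolute gap between the two cumulative distributions."""
    cum_obs = cum_exp = worst = 0.0
    for o, e in zip(observed, expected):
        cum_obs += o
        cum_exp += e
        worst = max(worst, abs(cum_obs - cum_exp))
    return worst

# Hypothetical invoice amounts per vendor (illustrative data only).
vendor_amounts = {
    "Vendor A": [1200.50, 134.00, 1875.25, 2210.00, 245.10,
                 310.75, 4120.00, 560.30, 6790.00, 880.45],
    "Vendor B": [9500.00, 9800.00, 9900.00, 9750.00, 9400.00, 9990.00],
}

expected = benford_expected()
scores = {v: ks_distance(observed_distribution(a), expected)
          for v, a in vendor_amounts.items()}
ranked = sorted(scores, key=scores.get, reverse=True)  # investigate from the top
```

With so few invoices per vendor the scores are only indicative; in practice a vendor needs a reasonably large number of invoices before its first-digit distribution means much.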

Fuzzy Logic: Duplicate Invoice Detection

Problem: Deterministic rules expect key information to be exactly the same.
• Vendor name
• Address
• Phone
• Invoice amount
• Date
• Bank account
• TIN

If the criteria are kept tight: too many false negatives (missed duplicates).

If the criteria are made loose: too many false positives, resulting in too many items to investigate.

Fuzzy Matching Using Natural Language Processing

Vendors are considered ‘close’ matches when the following are identical or sufficiently similar:
• Vendor names
• Remit vendor
• Address & Phone
• Other text-based fields of your choosing

Steps in Natural Language Processing (NLP)
1. Tokenize the vendor names
2. Remove ‘stop words’ (of, and, the, …) and special characters
3. Process synonyms and abbreviations
4. Calculate the tf-idf (term frequency, inverse document frequency) for each word
5. Calculate the cosine similarity between documents to identify ‘close’ matches
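The five NLP steps can be sketched with the Python standard library alone. The stop-word list, the synonym table, the smoothed idf formula and the 0.8 similarity threshold are all assumptions chosen for illustration; a production tool would use far larger word lists and a tuned threshold.

```python
import math
import re
from collections import Counter

STOP_WORDS = {"of", "and", "the"}
# Hypothetical synonym / abbreviation table.
SYNONYMS = {"inc": "incorporated", "corp": "corporation",
            "co": "company", "intl": "international"}

def tokenize(name):
    """Steps 1-3: split into lowercase words, drop special characters
    and stop words, then map abbreviations to a canonical form."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return [SYNONYMS.get(w, w) for w in words if w not in STOP_WORDS]

def tfidf_vectors(names):
    """Step 4: one sparse tf-idf vector (dict of term -> weight) per name."""
    docs = [tokenize(n) for n in names]
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed idf (an assumption) keeps every weight positive.
        vectors.append({t: (c / len(doc)) *
                        (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
                        for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Step 5: cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values())) *
            math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

vendors = ["Acme Supply Co.", "ACME Supply Company",
           "The Bolt & Nut Corp", "Zenith Office Products"]
vecs = tfidf_vectors(vendors)
close_pairs = [(vendors[i], vendors[j])
               for i in range(len(vendors))
               for j in range(i + 1, len(vendors))
               if cosine(vecs[i], vecs[j]) >= 0.8]
```

Here the first two vendor names reduce to the same tokens after step 3, so they are reported as a close pair even though an exact string comparison would miss them.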

Fuzzy Matching in Numerical Strings

Numerical values (strings) are considered close when:

Invoice IDs
• Edit distance is small

Dates
• Are the same
• Are within 7 days of each other
• Have day and month swapped (3/11/14 vs 11/3/14)

Payments
• Amounts are identical
• Edit distances are small

TINs, Bank Accounts, Other Numerics
• Edit distances are small

‘Edit Distance’ is calculated as the Damerau-Levenshtein distance, which counts:
• Substitutions
• Additions
• Deletions
• Transpositions
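A minimal sketch of these checks follows. The edit distance here is the optimal string alignment variant (often called restricted Damerau-Levenshtein), which counts substitutions, additions, deletions and swaps of adjacent characters; the function names and the 7-day default are illustrative assumptions.

```python
from datetime import date

def edit_distance(a, b):
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i                      # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j                      # add all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # addition
                          d[i - 1][j - 1] + cost)  # substitution
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]

def dates_close(d1, d2, window_days=7):
    """True if the dates match, fall within the window, or have
    day and month swapped (3/11/14 vs 11/3/14)."""
    if abs((d1 - d2).days) <= window_days:
        return True
    return d1.year == d2.year and d1.month == d2.day and d1.day == d2.month
```

For example, `edit_distance("INV-10234", "INV-10243")` is 1 because the last two digits are simply swapped, and `dates_close(date(2014, 3, 11), date(2014, 11, 3))` is true because the day and month are swapped.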

Fuzzy Matching

Using as many features of the invoice as desired
o Not limited to 3 dimensions

1. Determine the best ‘distance’ metric for each dimension
o Some are text-based
o Others numerical strings
2. Calculate the ‘distance’ between invoices
3. Adjust the measurement values to yield the best true positive result
4. Investigate any pair of invoices where the ‘distance’ is within your threshold
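One way to combine several per-dimension distances into a single invoice-to-invoice distance is a weighted sum, sketched below. The metrics, weights, threshold and sample invoices are all hypothetical; step 3 above corresponds to tuning those weights and the threshold against known duplicates. Text similarity uses `difflib.SequenceMatcher` from the standard library purely for brevity.

```python
from datetime import date
from difflib import SequenceMatcher

def text_distance(a, b):
    """0.0 for identical strings, approaching 1.0 as they diverge."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()

def amount_distance(a, b):
    """Relative difference between two amounts, capped at 1.0."""
    if a == b:
        return 0.0
    return min(1.0, abs(a - b) / max(abs(a), abs(b)))

def date_distance(d1, d2, window_days=30):
    """Days apart as a fraction of the window, capped at 1.0."""
    return min(1.0, abs((d1 - d2).days) / window_days)

# Hypothetical weights: one per dimension, summing to 1.
WEIGHTS = {"vendor": 0.4, "invoice_id": 0.2, "amount": 0.3, "date": 0.1}

def invoice_distance(x, y, weights=WEIGHTS):
    """Weighted sum of the per-dimension distances (0.0 = identical)."""
    parts = {
        "vendor": text_distance(x["vendor"], y["vendor"]),
        "invoice_id": text_distance(x["invoice_id"], y["invoice_id"]),
        "amount": amount_distance(x["amount"], y["amount"]),
        "date": date_distance(x["date"], y["date"]),
    }
    return sum(weights[k] * parts[k] for k in weights)

def suspect_pairs(invoices, threshold=0.15):
    """Every pair of invoices whose distance is within the threshold."""
    return [(a["invoice_id"], b["invoice_id"])
            for i, a in enumerate(invoices)
            for b in invoices[i + 1:]
            if invoice_distance(a, b) <= threshold]

invoices = [
    {"vendor": "Acme Supply Co", "invoice_id": "INV-10234",
     "amount": 1250.00, "date": date(2014, 3, 11)},
    {"vendor": "ACME Supply Co.", "invoice_id": "INV-10243",
     "amount": 1250.00, "date": date(2014, 3, 12)},
    {"vendor": "Zenith Office Products", "invoice_id": "ZX-778",
     "amount": 89.99, "date": date(2014, 1, 5)},
]
pairs = suspect_pairs(invoices)
```

Adding a dimension means adding one metric and one weight, which is why the approach is not limited to three dimensions.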

Clustering: Identify Invoice Anomalies with Vendor Baselining

Vendors will tend to have patterns in their billing but may have more than one pattern based on service, ordering business unit, specific users, delivery address, etc. There may be multiple ‘normal’ behaviors.

Identify the true outliers for investigation by:
• Featurizing the invoices (see fuzzy logic)
• Running a clustering algorithm such as K-Means
• Identifying clusters with low populations and low density as potential anomalies.

[Figure: billing patterns for Vendor A]
• Payments ~$1,000 to $5,000; Bus Unit: Bldg Maintenance; Users: Loc 1, Loc 2, Loc 3; Paid by ACH; To address ABC
• Payments >$100,000; Bus Unit: Construction; Users: Loc 4; Paid by ACH; To address DEF
• Payments <$700; Bus Unit: Security; Users: Loc Z; Paid by check; To address GHI
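The clustering idea can be sketched with a small K-Means written in plain Python. Everything specific here is an assumption made for illustration: each invoice is reduced to two features (log10 of the amount and a paid-by-check flag), the seeding is a deterministic farthest-first rule, and only low population is checked, not density.

```python
import math
from collections import Counter

def dist(p, q):
    """Euclidean distance between two feature tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def init_centroids(points, k):
    """Deterministic farthest-first seeding (an assumption; random
    restarts or k-means++ are the more common choices)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points,
                             key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iterations=50):
    """Plain K-Means: assign to nearest centroid, recompute, repeat."""
    centroids = init_centroids(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's seed
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical featurized invoices for one vendor:
# (log10 of payment amount, 1.0 if paid by check else 0.0)
points = [(3.0, 0.0), (3.3, 0.0), (3.5, 0.0), (3.6, 0.0), (3.2, 0.0), (3.4, 0.0),
          (5.1, 0.0), (5.2, 0.0), (5.0, 0.0), (5.3, 0.0),
          (2.5, 1.0)]

labels, centroids = kmeans(points, k=3)
sizes = Counter(labels)
MIN_MEMBERS = 2  # hypothetical cut-off for a 'low population' cluster
anomalies = [p for p, lab in zip(points, labels) if sizes[lab] < MIN_MEMBERS]
```

The two larger clusters play the role of the vendor’s ‘normal’ behaviors; the single small, check-paid invoice ends up alone in its own cluster and is the one flagged for review.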


Predictive Modeling: Time Travel in the 21st Century


Type 1: Prediction by Scoring

Your Financial System
• Current You: Examine lots of possible FWA (fraud, waste, and abuse) invoices every month
• Machine Learning System: Do this once; ML learns what is FWA
• Future You: ML continuously monitors and scores from 1 to 100; examine only the high-scoring items.

Type 2: Prediction by Actual Value

Example from Insurance

Historical data from many sources is combined to train the ML System to predict the correct $ premium.

[Diagram: records N1 to N100 ($ Premium, SIC code, # employees, Address, $ Sales) and Claim File records N1 to N100 feed the Machine Learning System]

Predicted Premium    Actual Premium Paid    Variance
$ 10,254             $ 9,946                -3%
$ 25,687             $ 26,971               5%
$ 5,621              $ 5,452                -3%
$ 96,321             $ 98,247               2%
$ 85,741             $ 72,880               -18%

Investigate the outliers.

Accuracy can be very high, in the range of 90% to 98%, depending on the historical data used.
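The variance column can be reproduced with a few lines of Python. The slide does not state the formula; computing (actual - predicted) / actual matches all five percentages shown, so that is the assumption used here. The 10% review threshold is hypothetical.

```python
# (predicted premium, actual premium paid), taken from the table above.
rows = [(10254, 9946), (25687, 26971), (5621, 5452),
        (96321, 98247), (85741, 72880)]

def variance_pct(predicted, actual):
    """Variance as a percentage of the actual premium paid.
    Assumed formula: it reproduces the percentages in the table."""
    return (actual - predicted) / actual * 100

THRESHOLD_PCT = 10  # hypothetical cut-off for 'outlier'

report = [(p, a, round(variance_pct(p, a))) for p, a in rows]
outliers = [r for r in report if abs(r[2]) > THRESHOLD_PCT]
```

With this threshold only the last row, at -18%, is selected for investigation.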

So What is a Machine Learning System?

ML Mathematical Cores
• Regression
• K-Means
• Bayesian Classifiers
• Decision Trees (CART / CHAID)
• Support Vector Machines
• Artificial Neural Nets (ANN)
• Genetic Programs

Systems (very partial list)

Advanced CAATs
• Pivotal
• Oversight (as a service)
• EMC

Proprietary, General Purpose
• SAS
• IBM SPSS
• RapidMiner

Open Source, Do It Yourself
• PSPP
• Weka
• R
• Python

4th Wave: Big Data Analytics

[Timeline: Personal Computers; Electronic Spreadsheets; The end of hand calculation; ERPs (data in one place); Database analysis; Age of Rules; Predictive Analytics; Statistical Insights; True Predictive & Continuous Audit]

Big Data Analytics
o Addresses new concerns regarding social media and other risks from text- and image-based sources.
o Continues to improve the accuracy of predictive analytics, further reducing false positives and false negatives.
o Allows true continuous audit of even the largest enterprises as computation costs drop to fractions of previous investments.

Got Big Data?

Volume
• High
• Terabytes or Petabytes
• Very long retrieval and processing times

Velocity
• Batch
• Near time
• Real Time
• Streams

Variety
• Structured
• Unstructured
• Semi-structured
• All at once

It’s Really About Big Data Technology

[Figure: The database; Search & Retrieve. Source: EMC]

What are Big Data Analytics?

1st: The haystack gets a lot bigger
• Traditional structured data
• Unstructured data
o Documents
o Email
o Web content
o Social Media

2nd: Thanks to Hadoop and Massively Parallel Processing
• Query and retrieval times are short
• Cost of even massive storage is very low

3rd: Many predictive modeling techniques can also be applied to structured and unstructured data
• Models become more accurate

4th: New techniques for unstructured data based on NLP
• Sentiment analysis

Focus on Social Media Risks*

*Risk also arises from other types of unstructured and semi-structured data:
• Email
• Internal documents
• Images stored centrally or on users’ machines

Social Media Risks

Source: 2014 Internal Audit Capabilities and Needs Survey Report, Protiviti

[Bar chart on a 0 to 7 scale; values 7.3, 6.6, 5.6, 4.9, 4.9, 4.0, 5.5, 6.1, 6.9, 2.9; bar labels not recoverable]

“They gave me financial aid then I cancelled all my classes and kept the money”
“Sit in at the Chancellor’s Office at 3:00”
“Can’t believe how much I made on eBay today”
“Professor X is such a perv”
“Did you hear we’re losing accreditation. Don’t sign up next term.”
“I found out they’re cutting my budget. I’m going to the union before this gets out”
“I’ll fix them. I put a virus on the lab computer.”
“The instructor said I could make money after school fixing cars in the auto shop”
“I just downloaded a bunch of student financial data from the finance system”
“Joe sold me the answers to tomorrow’s test”

[Timeline: Personal Computers; Electronic Spreadsheets; The end of hand calculation; ERPs (data in one place); Database analysis; Age of Rules; Predictive Analytics; Statistical Insights; True Predictive & Continuous Audit; Social media, text, image; Improved accuracy; Cost effective continuous audit]

The Age of Smart CAATs

You Don’t Need to be a Data Scientist, Just a Smart Tool User


Questions

Contact Information

Bill Vorhies

President & Chief Data Scientist

Data-Magnum

[email protected]

www.Data-Magnum.com

818.257.2035

“I shall find a way or make one.” Admiral Robert Peary


Big Data & Predictive Analytics