Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s...

39
© 2018 Association of Certified Fraud Examiners, Inc. Using Data Analytics to Detect Fraud Other Data Analysis Techniques

Transcript of Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s...

Page 1: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc.

Using Data Analytics to

Detect Fraud

Other Data Analysis Techniques

Page 2: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 2 of 27

Benford’s Law

▪ In a population of naturally occurring, multi-digit

numbers, the first digit of the numbers will be

distributed in a predictable way.

▪ Data set conditions for application:

• Should describe the sizes of similar phenomena

• Should have no built-in minimum or maximum values

• Should have no pre-assigned (non-natural) numbers

• Should have more small items than large ones

Page 3: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 3 of 27

Benford’s Law

Digit 1st 2nd 3rd 4th 5th or

Greater

0 11.97% 10.18% 10.02% 10.00%

1 30.103% 11.39% 10.14% 10.01% 10.00%

2 17.609% 10.88% 10.10% 10.01% 10.00%

3 12.494% 10.43% 10.06% 10.01% 10.00%

4 9.691% 10.03% 10.02% 10.00% 10.00%

5 7.918% 9.67% 9.98% 10.00% 10.00%

6 6.695% 9.34% 9.94% 9.99% 10.00%

7 5.799% 9.04% 9.90% 9.99% 10.00%

8 5.115% 8.76% 9.86% 9.99% 10.00%

9 4.576% 8.50% 9.83% 9.98% 10.00%

100.000% 100.00% 100.00% 100.00% 100.00%

Page 4: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 4 of 27

Benford’s Law

Page 5: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 5 of 27

Application: Benford’s Law

Excel ACL IDEA

▪ COUNTIFS() ▪ Benford’s Law

command

▪ Benford’s Law

command

Tableau

▪ Calculated Fields

▪ LEFT, LOG, INT

▪ onlinehelp.tableau.com/current/pro/desktop/en-

us/benford.html

Page 6: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 6 of 27

Statistical Analysis

▪ Averages

▪ Standard deviations

▪ Highest and lowest values

▪ Absolute values

Page 7: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 7 of 27

Application: Statistics

Excel ACL IDEA

▪ AVERAGEIFS()

▪ MIN()

▪ MAX()

▪ ABS()

▪ STDEV()

▪ STATISTICS

command

▪ Field Statistics

Tableau

▪ AVG, RUNNING_AVG, WINDOW_AVG

▪ SUM, RUNNING_SUM, WINDOW_SUM

▪ MAX, RUNNING_MAX, WINDOW_MAX

▪ MIN, RUNNING_MIN, WINDOW_MIN

▪ STDEV, WINDOW_STDEV

▪ STDEVP, WINDOW_STDEV

Page 8: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 8 of 27

Trend Analysis

▪ Compares data across time, location,

operational unit, or other source of

comparable information, and identifies

outliers

▪ Identifies unexpected trends, or lack of

trends where trends would be expected

Page 9: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 9 of 27

Regression Analysis

▪ Statistical technique that uses a series of

records to create a model relationship

between a dependent variable and one or

more independent variables:

• Example: predicting the number of items

manufactured based on amounts of materials and

labor used

Page 10: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 10 of 27

Regression Analysis

▪ Must have a pair-wise relationship between

the two sets of data:

• Each x variable must have a corresponding and

unique y variable.

• Determine which is x (independent variable) and

which is y (dependent variable).

Page 11: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 11 of 27

Application: Regression

Excel Tableau

▪ Analysis ToolPak ▪ Table calculations

Page 12: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 12 of 27

Correlation Strength

▪ Correlation analysis allows a fraud examiner

to evaluate the strength of the connection

between two or more pieces of information.

▪ The strength of the relationship between two

variables is indicated by the correlation

coefficient (r):

• r ranges from -1 to 1.

Page 13: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 13 of 27

Application: Correlation

Excel IDEA Tableau

▪ CORREL()

▪ Analysis

ToolPak

▪ Correlation

analysis

(Statistics)

▪ CORR

▪ WINDOW

CORR

Page 14: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 14 of 27

Linear Regression

▪ Graphing two variables against each other

provides a best-fit line:

• y = mx + b

▪ Best-fit line can be used to predict y for any

given x.

Page 15: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 15 of 27

Linear Regression Analysis Example

y = mx + b

▪ y is dependent variable (total cost of gala)

▪ b is constant amount (fixed cost of gala)

▪ x is independent variable (number of guests)

▪ m is coefficient of x (marginal cost of each

additional guest)

Page 16: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 16 of 27

Linear Regression Analysis Example

Number of gala guests Total gala cost

53 $21,672

81 $23,059

118 $32,848

157 $39,916

174 $40,317

202 $51,902

227 $63,522

269 $64,134

306 $82,083

Page 17: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 17 of 27

Linear Regression Analysis Example

$-

$10,000

$20,000

$30,000

$40,000

$50,000

$60,000

$70,000

$80,000

$90,000

0 50 100 150 200 250 300 350

To

tal

Co

st

Number of guests

Page 18: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 18 of 27

Linear Regression Analysis Example

$-

$10,000

$20,000

$30,000

$40,000

$50,000

$60,000

$70,000

$80,000

$90,000

0 50 100 150 200 250 300 350

To

tal

Co

st

Number of guests

y = 237.78x + 4677.36

Page 19: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 19 of 27

Reasonableness Testing

▪ This testing involves using nonfinancial and financial data to predict a financial result.

▪ Example: annual membership dues x number

of members = estimated annual membership

revenue

▪ Accuracy depends on reliability of the

nonfinancial data.

Page 20: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 20 of 27

Reasonableness Testing

Page 21: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 21 of 27

Predictive Modeling

▪ This type of analysis involves extracting

information from past data and using it to

predict trends, behavior patterns, and future

outcomes.

▪ Most predictive models generate a score,

with a higher score indicating a higher

likelihood of the given behavior or event

occurring:

• Example: credit scores

Page 22: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 22 of 27

Qualitative Data Analysis

▪ Most data analysis techniques require the

use of data in the form of numbers.

▪ Qualitative data analysis is the process used

to collect and analyze unstructured data.

▪ It requires extreme organization, as data sets

tend to be quite large.

▪ It is best used in conjunction with quantitative

data analysis.

Page 23: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 23 of 27

Textual Analytics

▪ Analysis of unstructured, text-based data for

keywords that reveal patterns, sentiments, and

relationships indicative of fraud

▪ Uses linguistic technologies, statistical

techniques, and scoring algorithms to

categorize and analyze textual data

▪ Requires:

• Sophisticated software

• Thorough understanding of the legal issues

regarding employee rights and workplace searches

Page 24: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 24 of 27

Textual Analytics Data Sources

▪ Emails

▪ Social media posts

▪ Instant messages

▪ Videos

▪ News feeds

Page 25: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 25 of 27

Textual Analytics Data Sources

▪ Apps

▪ Voice files

▪ User documents

▪ Sales and marketing

▪ Presentations

Page 26: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 26 of 27

Textual Analytics

▪ Emotive tone—derogatory, surprised, secretive,

angry, worried, or confused communications

▪ Ethical behavior—harassing, secretive, cursing

▪ Concept, entity, relationship, or event extraction

▪ Text link analysis

▪ Social network analysis

Page 27: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 27 of 27

Developing a List of Keywords

▪ Use the Fraud Triangle as a reference.

Deadline

Quota

Short

Problem

Concern

Reasonable

Deserve

Borrow

Therefore

Override

Write-off

Adjust

Discount

▪ Keywords depend on industry, relevant fraud

risks, and data set.

Page 28: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 28 of 27

Visual Analytics

▪ Some data sets are so

vast or complex that

they are difficult to

analyze using

traditional means.

▪ Visual analytics can

help identify patterns

or anomalies in such

data.

Page 29: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 29 of 27

Word Clouds

overstateWeak

Risky

unbalancedCreativejobs

Uncertainty

moderate

Liability

strategydelay

audit accrualcritical

Page 30: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 30 of 27

Heat Maps

Page 31: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 31 of 27

Tree Maps

Reputation

Risk

Fraudulent

Disbursements

Theft of

Assets

Natural

Disasters

Accounting Accounts Payable Information Technology

AdministrativeResearch and Development

Financial Misstatement

Fraudulent

Disbursements System Failures

Reputation Risk

Fraudulent Disbursements

Penalties

and Fines Thefts of

Assets

Reputation Risk

Penalties

and Fines

Financial Misstatement

System

Failures

Theft of Assets

Fraudulent

Disbursements

Financial MisstatementNatural

Disasters

Theft of

Assets

Sales

Page 32: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 32 of 27

Link Analysis

Page 33: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 33 of 27

Geospatial Analysis

Page 34: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 34 of 27

Timeline Analysis

▪ Highlight key times, dates, and facts.

▪ More readily determine a sequence of

events.

▪ Analyze multiple or concurrent sequences of

events.

▪ Track unaccounted-for time.

▪ Identify inconsistencies or impossibilities in

data.

Page 35: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 35 of 27

Conflict Checking

▪ Use of specialized software to identify

potentially problematic relationships:

• Conflicts of interest on a company’s board of

directors

• Relationships between potential new hires and

business units

• Vendors that operate from addresses that previously

have been associated with fraud

• Vendors that are not listed in any online commercial

database or that are listed on government excluded

party lists

Page 36: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 36 of 27

Log-in Monitoring Software

▪ Used to identify

anomalies in log-ins

to company system:

• Unusual times or

days

• Unusual locations

• Excessive log-ins

• Unauthorized log-ins

Page 37: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 37 of 27

Employee Risk Ranking and Scoring

▪ Ranking and scoring bring together big

data—both structured and unstructured

data—to identify and monitor high-risk

employees.

▪ All tests and analytics can be aggregated into

dashboard.

▪ A similar analysis can be performed for

vendors or customers.

▪ Data privacy and employee rights are integral

concerns.

Page 38: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 38 of 27

Employee Risk Ranking Example

1. KeywordsPercentage of keywords around pressure, opportunity, and rationalization in email

and IM communicationsScaling: 3

2. T&E analysis Ranking of T&E out-of-compliance hits and overall email scoring Scaling: 3

Custodian C1 C2 C3 C4 C5 C6 C7Scaling

C1

Scaling

C2Scaling

C3Scaling

C4

Scaling

C5

Scaling

C6

Scaling

C7Score

A, Week 1 1 3 3 4 6 2 3

3 3 4 2 2 3 545

A, Week 2 2 2 4 5 3 4 2 37

4. User activityPercentage of instances within period where custodian sends or receives information

involving those outside of peer group, as identified through hierarchiesScaling: 2

5. Third-party riskInstances where employee is linked to high-risk third parties (e.g., customers,

vendors, state-owned entities, etc.) in email, T&E, or sales activityScaling: 2

6. Alias

clustering

Percentage of instances within that week where custodian sends or receives ESI

involving at least one of their identified communicative aliasesScaling: 3

7. Emotive

tone

Percentage of instances where the employee sends or receives ESI with negative

emotions (angry, frustrated, secretive, etc.) identified through linguistic analysesScaling: 5

3. Sales activity Ranking of sales activity, field notes, and sales returns and allowances Scaling: 4

Page 39: Using Data Analytics to Detect Fraud...Excel ACL IDEA COUNTIFS() Benford’s Law command Benford’s Law command Tableau ... • y = mx + b Best-fit line can be used to predict y for

© 2018 Association of Certified Fraud Examiners, Inc. 39 of 27

Transaction Risk Scoring

▪ Brings multiple analytics together

▪ Monitors and scores transactions based on

number of breaches:

• Round-dollar amounts

• Incomplete invoice

• Keyword in payment description field

▪ Can apply weights to certain tests to influence

the overall risk score