Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will...

71
1 Oracle’s Machine Learning and ADWC to Detect and Fight Fraud and UK NHS Case Study Charlie Berger, MS Engineering, MBA Sr. Director Product Management, Machine Learning, AI and Cognitive Analytics [email protected] www.twitter.com/CharlieDataMine Phil Godfrey Data Scientist, Information Services UK NHS, Business Services Authority Move the Algorithms; Not the Data!

Transcript of Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will...

Page 1: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

1

Oracle’s Machine Learning and ADWC to Detect and Fight Fraud and UK NHS Case Study

Charlie Berger, MS Engineering, MBASr. Director Product Management, Machine Learning, AI and Cognitive [email protected] www.twitter.com/CharlieDataMine

Phil GodfreyData Scientist, Information ServicesUK NHS, Business Services Authority

Move the Algorithms; Not the Data!

Page 2: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

2

Page 3: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

People's Attitudes About FraudConsumers

• Nearly one of 10 Americans would commit insurance fraud if they knew they could get away with it.

• Nearly one of four Americans say it’s ok to defraud insurers

–About one in 10 people agree it’s ok to submit claims for items that aren’t lost or damaged, or for personal injuries that didn’t occur.

–Two of five people are “not very likely” or “not likely at all” to report someone who ripped of an insurer. Accenture Ltd.(2003)

• Nearly three of 10 Americans (29 percent) wouldn't report insurance scams committed by someone they know. Progressive Insurance (2001)

– About 35 percent say fraud costs their companies 5-10 percent of claim volume. More than 30 percent say fraud losses cost 10-20 percent of claim volume;

– Detecting fraud before claims are paid, and upgrading analytics, were mentioned most often as the insurers’ main fraud-fighting priorities;

Page 4: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

American Society of Certified Fraud Examiners20 Ways to Detect Fraud

1. Unusual Behavior The perpetrator will often display unusual behavior, that when taken as a whole is a strong indicator of fraud. The fraudster may not ever take a vacation or call in sick in fear of being caught. He or she may not assign out work even when overloaded. Other symptoms may be changes in behavior such as increased drinking, smoking, defensiveness, and unusual irritability and suspiciousness.

2. ComplaintsFrequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have been known to be some of the best sources of fraud and should be taken seriously. Although all too often, the motives of the complainant may be suspect, the allegations usually have merit that warrant further investigation.

3. Stale Items in ReconciliationsIn bank reconciliations, deposits or checks not included in the reconciliation could be indicative of theft. Missing deposits could mean the perpetrator absconded with the funds; missing checks could indicate one made out to a bogus payee.

4. Excessive VoidsVoided sales slips could mean that the sale was rung up, the payment diverted to the use of the perpetrator, and the sales slip subsequently voided to cover the theft.

5. Missing DocumentsDocuments which are unable to be located can be a red flag for fraud. Although it is expected that some documents will be misplaced, the auditor should look for explanations as to why the documents are missing, and what steps were taken to locate the requested items. All too often, the auditors will select an alternate item or allow the auditee to select an alternate without determining whether or not problem exists.

6. Excessive Credit MemosSimilar to excessive voids, this technique can be used to cover the theft of cash. A credit memo to a phony customer is written out, and the cash is taken to make total cash balance.

http://www.fraudiscovery.com/detect.html http://www.auditnet.org/testing_20_ways.htm

Page 5: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Automatically sift through large amounts of data to find hidden patterns, discover new insights and make predictions

What is Machine Learning?

• Identify most important factor (Attribute Importance)

• Predict customer behavior (Classification)

• Predict or estimate a value (Regression)

• Find profiles of targeted people or items (Decision Trees)

• Segment a population (Clustering)

• Find fraudulent or “rare events” (Anomaly Detection)

• Determine co-occurring items in a “baskets” (Associations)

A1 A2 A3 A4 A5 A6 A7

Page 6: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

Machine Learning/Data Mining ProvidesBetter Information, Valuable Insights and Predictions

Customer Months

Lease Churners vs. Loyal Customers

Insight & PredictionSegment #1

IF CUST_MO > 14 AND INCOME < $90K, THEN Prediction = Lease Churner

Confidence = 100%Support = 8/39

Segment #3 IF CUST_MO > 7 AND INCOME < $175K, THENPrediction = Lease Churner,

Confidence = 83%Support = 6/39

Source: Inspired from Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management by Michael J. A. Berry, Gordon S. Linoff

Page 7: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

Data Mining When Lack Sufficient Past ExamplesBetter Information, Valuable Insights and Predictions

Customer Months

Cell Phone Fraud vs. Loyal Customers

Source: Inspired from Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management by Michael J. A. Berry, Gordon S. Linoff

?

Page 8: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |

CLASSIFICATION– Naïve Bayes– Logistic Regression (GLM)– Decision Tree– Random Forest– Neural Network– Support Vector Machine– Explicit Semantic Analysis

CLUSTERING– Hierarchical K-Means– Hierarchical O-Cluster– Expectation Maximization (EM)

ANOMALY DETECTION– One-Class SVM

TIME SERIES– State of the art forecasting using

Exponential Smoothing. – Includes all popular models

e.g. Holt-Winters with trends, seasons, irregularity, missing data

REGRESSION– Linear Model– Generalized Linear Model– Support Vector Machine (SVM)– Stepwise Linear regression– Neural Network– LASSO

ATTRIBUTE IMPORTANCE– Minimum Description Length– Principal Comp Analysis (PCA)– Unsupervised Pair-wise KL Div – CUR decomposition for row & AI

ASSOCIATION RULES– A priori/ market basket

PREDICTIVE QUERIES– Predict, cluster, detect, features

SQL ANALYTICS– SQL Windows, SQL Patterns,

SQL Aggregates

A1 A2 A3 A4 A5 A6 A7

• OAA (Oracle Data Mining + Oracle R Enterprise) and ORAAH combined• OAA includes support for Partitioned Models, Transactional, Unstructured, Geo-spatial, Graph data. etc,

Oracle’s Machine Learning & Adv. Analytics Algorithms

FEATURE EXTRACTION– Principal Comp Analysis (PCA)– Non-negative Matrix Factorization– Singular Value Decomposition (SVD)– Explicit Semantic Analysis (ESA)

TEXT MINING SUPPORT– Algorithms support text type– Tokenization and theme extraction– Explicit Semantic Analysis (ESA) for

document similarity

STATISTICAL FUNCTIONS– Basic statistics: min, max,

median, stdev, t-test, F-test, Pearson’s, Chi-Sq, ANOVA, etc.

R PACKAGES– CRAN R Algorithm Packages

through Embedded R Execution– Spark MLlib algorithm integration

EXPORTABLE ML MODELS– REST APIs for deployment

Page 9: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

Where’s Waldo?

• Can you find Waldo in 10 seconds?

• Get ready….Go!

9

Page 10: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

Page 11: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Data Lab To Find Savings and Cost Reductions in Health Care Budget

• United Kingdom’s National Health Service

• Identify billing and identity fraud

• Optimize treatment by reducing use of less effective medical procedures

• Deployed Oracle Advanced Analytics, and Oracle Business Intelligence on Oracle Exadata and Oracle Exalytics

“With one vendor providing the whole solution, it’s very easy for us.”

Nina Monckton, NHS BSA

£350M Confirmed savings identified

Confidential – Oracle Internal/Restricted/Highly Restricted 11Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |

Page 12: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |

“It is despicable people would even claim things they are not entitled to.”

– Sue Frith, head of NHS Counter Fraud Authority

12

Dentists inventing patients fuelling

NHS fraud bill of more than £1bn

every year, officials warn

'Despicable' fraud costs NHS in

England £1bn a year

Page 13: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

Remember Big Bird’s Song

One of these things is not like the others,

One of these things doesn't belong,

Can you tell which thing is not like the others

By the time I finish my song?

https://youtu.be/ueZ6tvqhk8U

Page 14: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

Approach to Finding Anomalies

• LOTS of data and few or zero examples

• Many attributes

• For an single attribute, a value may seem ~“OK”

• Taken collectively, many attributes may indicate “anomalous”

• Look for what is “different”

X1

X2

X3

X4

X1

X2

X3

X4

Page 15: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

A Real Fraud ExampleMy credit card statement—Can you see the fraud?

May 22 1:14 PM FOOD Monaco Café $127.38May 22 7:32 PM WINE Wine Bistro $28.00…June 14 2:05 PM MISC Mobil Mart $75.00June 14 2:06 PM MISC Mobil Mart $75.00June 15 11:48 AM MISC Mobil Mart $75.00June 15 11:49 AM MISC Mobil Mart $75.00May 28 6:31 PM WINE Acton Shop $31.00May 29 8:39 PM FOOD Crossroads $128.14June 16 11:48 AM MISC Mobil Mart $75.00June 16 11:49 AM MISC Mobil Mart $75.00

Monaco?Gas Station?

All same $75 amount?

Pairs of $75?

Page 16: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

Start with a Business Problem Statement

“If I had an hour to solve a problem I'd spend 55 minutes thinking about the problem and 5 minutes thinking about solutions.”

― Albert Einstein

Clearly Define Problem

Page 17: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | 17

Manage and Analyze All Your Data

Big Data SQL / R

SQL / R

Object Store

“Engineered Features”– Derived attributes that reflect domain knowledge—key to best models e.g.:• Counts• Totals• Changes

over time

Boil down the Data Lake

Data Scientists, R Users, Citizen Data Scientists

Architecturally, Many Options and Flexibility

Page 18: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Multiple Data Scientist User Roles SupportedOracle’s Machine Learning/Advanced Analytics

Application DevelopersDBAs

Additional relevant data and “engineered

features”

Sensor data, Text, unstructured data, transactional data, spatial

data, etc.

Historical data Assembled historical data

Historical or Current Data to be “scored” for predictions

Predictions & Insights

Oracle Database 12c

R, Python Users, Data Scientists Data Analysts, Citizen Data ScientistsOML Notebook Users

Page 19: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |

Big Data Analytics using w Graph

• Add new engineered features– Percentage time spent in

zones

– Amount time/encounters with persons of interest

• Better predictions using available data– At risk customers

– Government approval processes

– Medical claims

– IoT predictive analytics

Oracle Advanced Analytics/Machine Learning with Enhanced Graph & Spatial Data SourcesTransactional network relationships data

Transactional geo-location data summarized to % time spent in areas or number of “hits” near a location

Better data and “engineered features”; better predictive models and predictive insights

Page 20: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

OAA Model Build and Real-time SQL Apply Prediction

BEGIN

DBMS_DATA_MINING.CREATE_MODEL(

model_name => 'BUY_INSUR1',

mining_function => dbms_data_mining.classification,

data_table_name => 'CUST_INSUR_LTV',

case_id_column_name => 'CUST_ID',

target_column_name => 'BUY_INSURANCE',

settings_table_name => ' CUST_INSUR_LTV_SET');

END;

/

Simple SQL Syntax

Select prediction_probability(BUY_INSUR1, 'Yes'

USING 3500 as bank_funds, 825 as checking_amount, 400 as credit_balance, 22 as age,

'Married' as marital_status, 93 as MONEY_MONTLY_OVERDRAWN, 1 as house_ownership)

from dual;

ML Model Build (PL/SQL)

Model Apply (SQL query)

Page 21: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Fraud Prediction Simplified

drop table CLAIMS_SET;exec dbms_data_mining.drop_model('CLAIMSMODEL');create table CLAIMS_SET (setting_name varchar2(30), setting_value varchar2(4000));insert into CLAIMS_SET values ('ALGO_NAME','ALGO_SUPPORT_VECTOR_MACHINES');insert into CLAIMS_SET values ('PREP_AUTO','ON');commit;

begindbms_data_mining.create_model('CLAIMSMODEL', 'CLASSIFICATION','CLAIMS', 'POLICYNUMBER', null, 'CLAIMS_SET');

end;/

-- Top 5 most suspicious fraud policy holder claimsselect * from(select POLICYNUMBER, round(prob_fraud*100,2) percent_fraud,

rank() over (order by prob_fraud desc) rnk from(select POLICYNUMBER, prediction_probability(CLAIMSMODEL, '0' using *) prob_fraudfrom CLAIMSwhere PASTNUMBEROFCLAIMS in ('2to4', 'morethan4')))where rnk <= 5order by percent_fraud desc;

Automated In-DB Analytical Methodology

POLICYNUMBER PERCENT_FRAUDRNK------------ ------------- ----------6532 64.78 12749 64.17 23440 63.22 3654 63.1 412650 62.36 5

Automated Monthly “Application” -- Just add:

Create

View CLAIMS2_30

As

Select * from CLAIMS2

Where mydate > SYSDATE – 30

Time measure: set timing on;

Page 22: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Operationalizing and Embedding Analytics for Action

22

How long does it take to put a defined model into operational use?

?

??Length of time to put a model into production.

Based on 141 respondents who stated they are doing this today

Page 23: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |

Introducing:Oracle Machine Learning SQL Notebook

Page 24: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

ADWC + Oracle Machine Learning Notebooks

Oracle Exadata Cloud Service

Oracle Database Cloud Service

Express Cloud Service

Data Warehouse Services(EDWs, DW, departmental marts and sandboxes)

Autonomous Data Warehouse Cloud

Service Console

Built-in Access Tools

Oracle Machine Learning

Service Management

Autonomous Database Cloud

Oracle SQL Developer

Developer Tools

Oracle Object Storage CloudFlat Files and Staging

Data Integration

Services

Oracle Data Integration Platform Cloud

3rd Party DI on Oracle Cloud Compute

3rd Party DI On-premises

24

Oracle Analytics Cloud

Analytics

Page 25: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

Page 26: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

Page 27: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Page 28: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Page 29: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Page 30: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Free Trial & Hands on Workshop:

https://go.oracle.com/adwc

Page 31: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Page 32: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Page 33: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |

Oracle Advanced AnalyticsBrief Demo + Additional Reference Slides

Page 34: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |

Tax Noncomplaince Audit SelectionTwo Example Approaches - There are many possible more!

• Anomaly Detection

– Build 1-Class Support Vector Machine (SVM) models on “normal or compliant” tax submissions

• Unsupervised machine learning when few know examples on which to train e.g. < 2%

– Build Decision Tree models for classification of Noncompliant tax submissions (yes/no) based on historical 2011 data• Supervised machine learning approach

when many known examples of target classes are available oh which to train

Page 35: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |

SQL Developer/Oracle Data Miner GUIAnomaly Detection—Simple Conceptual Workflow

Train on “normal” recordsApply model and sort on

likelihood to be “different”

Page 36: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |

Financial Sector/Accounting/ExpensesAnomaly Detection

Simple Fraud Detection Methodology—1-Class SVM

More Sophisticated Fraud Detection Methodology—Clustering + 1-Class SVM

Page 37: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |

Multiple Approaches To Detect Potential Fraud

1. Anomaly Detection (1-Class SVM) • Add feedback loop to purify the input training data over time

and improve model performance

2. Classification• IF you have a lot of examples (25% or more) of fraud on which

to train/learn

3. Clustering• Find records that don’t high very high probability to fit any

particular cluster and/or lie in the outlier/edges of the clusters

4. Hybrid of #3 and then #1• Pre-cluster the records to create “similar” segments and then

apply anomaly detection models for each cluster

5. Panel of Experts • i.e. 3 out of 5 models predict possibly anomalous above 40% or

any 1 out of N models considers this record unusual

Page 38: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |

Getting Started—Oracle ML/AA Resources & LinksOracle Advanced Analytics Overview Information

• Oracle's Machine Learning and Advanced Analytics 12.2c and Oracle Data Miner 4.2 New Features preso• Oracle Advanced Analytics Public Customer References• Oracle’s Machine Learning and Advanced Analytics Data Management Platforms white paper on OTN • Oracle INTERNAL ONLY OAA Product Management Wiki and Beehive Workspace (contains latest

presentations, demos, product, etc. information)

YouTube recorded Oracle Advanced Analytics Presentations and Demos, White Papers • Oracle's Machine Learning & Advanced Analytics 12.2 & Oracle Data Miner 4.2 New Features YouTube video• Library of YouTube Movies on Oracle Advanced Analytics, Data Mining, Machine Learning (7+ “live” Demos

e.g. Oracle Data Miner 4.0 New Features, Retail, Fraud, Loyalty, Overview, etc.)• Overview YouTube video of Oracle’s Advanced Analytics and Machine Learning

Getting Started/Training/Tutorials Link to OAA/Oracle Data Miner Workflow GUI Online (free) Tutorial Series on OTN Link to OAA/Oracle R Enterprise (free) Tutorial Series on OTN Link to Try the Oracle Cloud Now! Link to Getting Started w/ ODM blog entry Link to New OAA/Oracle Data Mining 2-Day Instructor Led Oracle University course. Oracle Data Mining Sample Code Examples

Additional Resources, Documentation & OTN Discussion Forums• Oracle Advanced Analytics Option on OTN page• OAA/Oracle Data Mining on OTN page, ODM Documentation & ODM Blog• OAA/Oracle R Enterprise page on OTN page, ORE Documentation & ORE Blog• Oracle SQL based Basic Statistical functions on OTN• Oracle R Advanced Analytics for Hadoop (ORAAH) on OTN

Analytics and Data Summit , All Analytics, All Data, No Nonsense. March 12-14, 2019, Redwood Shores, CA

Send email now to [email protected] and you’ll get my “away message” with the links.

Page 39: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Fraud and Error in the NHS

Phil Godfrey – Data Scientist

NHS Business Services Authority

Page 40: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

NHSBSA Portfolio

We have data covering:

• More than £35 billion NHS activity

• Billions of transactions

• Millions of NHS staff

• Tens of thousands of NHS

contracts

• Millions of citizens

Page 41: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Key facts data stats (2017/18)

1,000,000,000The number of

prescription items

processed every year

44,360,000FP17 Dental Claim

forms processed

LR 14,700,000 The number of claims

for free prescriptions

and dental procedures

checked by our loss

recovery teams

11,000,000Applications from

patients for help with

their NHS health

costs

3,840,000The number of calls

our Contact Centre

handled plus 785,000e-mails

5,500,000The number of

European Health

Insurance Cards

provided to UK

residents

2,200,000The number of

employee assignments

processed per month

for 1,500,000 NHS

Staff

1,400,000Actively contributing

NHS Pension

members, 550,000deferred members

paying 860,000recipients each month

3,000The number of people

registered and eligible

for support through the

England Infected

Blood Support

Scheme

£5,100,000Paid out in NHS

Student and Social

Work Bursaries to

80,000 Students

Page 42: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

In the beginning

• What are the questions we want to answer?

Questions that not only demonstrate business value

but also test the requirements of the volumes of

data.

• What data do we have to answer these questions?

Where is it?

• How can we get the data from a transactional

system to an environment where we can do some

advanced analytics?

• Who is going to own the actions from the insight

created?

• How do we engage/educate/involve the rest of the

organisation?

Page 43: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Absolute must haves of our Data Lab

• Security

• Performance

• Hold large amounts of data and process

efficiently

• Analysis of the data without not knowing the

person

• No issues with the network

• Don’t need huge computing power on your

desk

• No need for specialist skills such as Hadoop /

Spark

• Availability of training – online and classroom

Page 44: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Why Oracle Exadata and Advanced Analytics?

• Engineered in-memory system (algorithms executed within

database)

• Existing Oracle database experience within the

organisation

• Courses and support available

• Affordability

• Speed to deploy

• Integration with R

• Unstructured data

• Intuitive to use

• Oracle understood our business problem

Page 45: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Our Data Lab

Insight

Operational Improvements

Finding Fraud and Error

Data

Lab 0102

03

Page 46: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Examples of how we use our data to find unwarranted variations

• Polypharmacy

• Fraud in the NHS

• Unusual Quantities

• Primary Care

• Homeopathic analysis

• Drug prices

Page 47: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Using Oracle & Advanced Analytics in the NHS

Page 48: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Polypharmacy• Polypharmacy is described as ‘the concurrent use of multiple medication items by one

individual’

– Appropriate polypharmacy – will extend life expectancy and improve quality of life

– Inappropriate polypharmacy – can be harmful and may increase the risk of adverse drug

reactions, impairing quality of life for patients

Discussion Points1. Analysis

2. Findings

3. Recommendations

4. Outcomes

Page 49: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Polypharmacy findings by area

Percentage of patients in each Clinical Commission Group area who were dispensed ten or more items in June

2018

Knowlsey, Liverpool and Bradford City CCGs have the 1st, 2nd and 3rd highest percentage of patients who were dispensed ten or more

medications.

Page 50: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Polypharmacy basket analysis

Specific insights:

• Identification of commonly co-

occurring medicinal items (Market

Basket Analysis).

– In particular, highest co-occurring

items are multiple doses of

Warfarin.

– Food items frequently occur in

multiple varieties together.

• Identification of co-occurrences of

medicines with potentially harmful

interactions.

Potentially harmful NSAID combinations

in March 2015

Page 51: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Monitoring polypharmacy

Dashboard CCG Practice NHSBSA Practice Patient

• The polypharmacy dashboard was first published in July 2017.

• CCGs monitor the dashboard and notify practices of their performance. Practices request

polypharmacy patient details from the NHSBSA, upon which to take action.

• Practices from only 11 CCGs have requested patient details; one of those active CCGs is

North East Hampshire and Farnham. Its performance has decreased significantly

compared to nationally.

National

CCG

NE Hampshire & Farnham

Page 52: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Fraud in the NHS

• It is estimated that £1.25bn of fraud is being committed each year by patients, staff and contractors

(NHSCFA, 2017)

• The sum represents about 1% of the NHS budget

Page 53: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Irregularities in the prescribing and dispensing of dependency forming

drugs

• Drug abuse isn't just about recreational drugs. Besides marijuana, legal medicines

are the most commonly abused drugs in the UK. Some prescribed drugs can be

addictive and dangerous

• Prescribing of addictive medicines has increased 3% over five years

• One patient in eleven (8.9%) who receives a prescription is prescribed one of

these medicines

• Antidepressant prescriptions in England have more than doubled in the past 10

years

• A recent survey also found that 7.6% of adults had taken a prescription-only

painkiller not prescribed to them

• From April 2017 to March 2018 there were approximately 149 million dependency

forming drug items dispensed at a cost of £1.2bn

Page 54: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Unusual Quantities

Purpose

• The purpose of this analysis was to provide insight into outlier analysis of items dispensed at prescription

level, with a focus on products with unusually high quantities.

• Identifying these outliers could help the NHSBSA to realise cost savings, improve our internal processes

and have a positive impact on patient care.

• Research conducted by NHS National Patient Safety Agency (2007) identified over three thousand

Medication incidents in 2007 (5% of all Medication incidents) were categorised as “Wrong Quantity”.

Although the impact of these incidents is not clear, they could have the potential to cause harm to

patients.

Page 55: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Unusual Quantities

How?

• Utilising Oracle Anomaly Detection.

• Descriptive Statistics using the aggregate function to

describe attributes of drugs:

– Minimum, Median, Average, Maximum quantities

• New functionality allows us to include “Partition By”

drug, which makes this analysis possible.

Why this is important?

• Without the Partition By we would combine similar but

very different quantities across all drugs:

– 1,000 tablets

– 1,000 ml

– 1,000 litres

• All of the above would be considered the same!

Page 56: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Unusual Quantities

Findings

• The findings have been shared with a pilot CCG, and the potential benefits are:

• Aim to reduce prescribing error by identifying highly prescribed quantities (which could be made in error).

• Identifying any products which are commonly appearing

• Influence training needs for any CCG’s/practices that appear prominently in the analysis.

Next

• Looking to movie this analysis to be business as usual, investigating using Oracle Integration with R to

allow a script to produce this with no user intervention.

Page 57: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

DHSC Historic Drug Prices

Purpose?

• Department of Health and Social Care requested DALL to identify

products which display unusual price behaviour to help inform

investigations.

Why?

• An article published in the Times Newspaper identified 36 drugs of

particular interest which were supplied by the DHSC.

• Understanding attributes of these drugs

and prescribing patterns could provide a

methodology to identify additional drugs

Page 58: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

DHSC Historic Drug Prices

Why?

• DALL applied this insight over our entire

prescriptions dataset (December 2013 to

September 2017) with the aim of identifying

additional products to investigate.

How?

• Regression analysis conducted on a drugs

pack price and reimbursement values over

time.

• Only those with a positive regression slope

on pack price and total cost returns products

with an increasing cost to the NHS.

• The top 25% of products are returned to

reduce the dataset.

Page 59: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

DHSC Historic Drug Prices

Findings

Branded Products

– 1,120 branded products identified meeting the criteria, with

products being supplied by 111 wholesalers.

Generic Products

– 269 generic products identified meeting the criteria.

• All of the above products, along with a dashboard of data showcasing

the drug pack prices and cost to the NHS over time have been

provided to the DHSC for investigation.

Page 60: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Oracle SQL - Regression

• Regression functionality built-in to Oracle SQL.

• Allows regression to be calculated with single lines of code!

• As well as SLOPE , you’re also able to calculate all other

regression components, Intercept, R-Squared, average X and Y

values.

SELECT job_id, employee id id, salary, REGR_SLOPE(SYSDATE-hire_date, salary) OVER (PARTITION BY job_id) slopeFROM employees

Page 61: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Delivering Analysis through ePACT2

using Oracle Analytics Cloud

Page 62: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

ePACT2 Project

• Provides authorised NHS users with free access to a wide range of prescription

data not available in the public domain due to its sensitivity

• Delivered originally using BICS and now Oracle Analytics Cloud

• NHSBSA create a number of dashboards that allow CCG’s to understand the

variation in prescribing across GP practices, within a CCG and nationally.

• Dashboards are created in conjunction with cross-NHS working groups,

Academic Health Science Networks (AHSN’s) to ensure clinical

appropriateness.

Page 63: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

ePACT2 – Then and Now

ePACT

ePACT2

Export data to analyse and visualise

Analyse and visualise within ePACT2

Page 64: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

ePACT2 Functionality

• ePACT2 delivered through Oracle Analytics Cloud, allows us to create dashboards and produce ad-hoc

analysis.

• To date, ePACT2 has delivered 13 dashboards to reduce costs and improve health of patients

Page 65: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Polypharmacy• To provide insight into prescribing of

multiple medicines for individual

patients in primary care.

• To provide potential metrics to show

the variation of Polypharmacy across

England to help CCGs and others to

target their work plans to address

PolyPharmacy.

• Reducing the inappropriate use of

multiple medicinal items by an

individual will:

– Reduce the risk of harmful interaction of

drugs on individual patients.

– Reduce spend of ineffective or harmful

prescribing.

– Reduce hospital admissions for adverse

reactions to medicines that could be

avoided.

Page 66: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Items which should not be routinely prescribed in Primary

Care• Introduced in November 2017, this

dashboard reports on spend and

volume of items prescribed on a list

of 18 treatments, drawn up with

family doctors and pharmacists,

deemed to be ineffective, over-

priced and of low clinical value.

• Based on our current drug mapping,

the cost to the NHS is £24.4M in the

latest quarter (February to April

2018).

• The data can be broken down to GP

Practice level, making it as relevant

as possible for care commissioners

to use.

Page 67: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Homeopathic Analysis

• Prescription data can be utilised to

explore prescribing for particular

medicines.

• Analysis shows unusual prescribing

pattern in 50-54 year olds.

• Bristol CCG accounts for 77.3% of all

50-54 homeopathic prescribing.

Page 68: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Drug Prices

• Analyses of Atropine 1%

eye drops explores drug

price variation over time.

• The price of this product

has increased over

19,000% since 2004.

• We can use our data to

understand how supplier

behaviour could have

impacted prices.

• This type of analysis can

be produced for any drug,

over a large time period.

Page 69: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

In Summary

• Continue working with business areas and using our data to identify areas of fraud and error.

• Deploying some of our data science insights into operational efficiencies and anomaly detection.

• Collaborations with other NHS organisations to expand the availability of NHS data for data science.

Page 70: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have

Any Questions?

Page 71: Oracle’s Machine Learning and ADW to Detect and …. Complaints Frequently tips or complaints will be received which indicate that a fraudulent action is going on. Complaints have