TOWARDS A HOLISTIC FRAUD MANAGEMENT PRODUCT
Aris Papadopoulos [email protected]

Source: aris.onl/wp-content/uploads/2015/11/Fraud_Mgmt_Framework.pdf · 2015-11-17

A data-mining fraud management framework

• A holistic, big-data mining approach to narrow risk and potentially create space for strategic expansion (credit/loan application scoring, marketing/recommendations etc.).
• A cross-layer, cross-channel architecture that builds models from data on multiple levels: industry level, processor level, merchant level, down to individual buyers (also including geographical and seasonal patterns).
• A 360-degree view that predicts individual buyers’ behavior down to a “segment of one”.
• Intelligent knowledge fusion from multiple agents operating at all the levels mentioned above.
• Full value chain management based on transparent, actionable conversion of insights.

Aris Papadopoulos - A fraud management framework - Private & Confidential 2


Fraud examples

• Stolen/lost card
• Counterfeit card
• ID theft
• Account takeover
• Skimming: Stealing card information during a legitimate transaction
• Carding: Producing and testing card information
• Triangulation: Receiving payment on auction sites and sending goods bought with a stolen card
• Phishing
• Botnets
• Reshipping: People employed to re-ship goods coming from fraudulent purchases
• Affiliate fraud: Affiliate that drives consumer traffic
• Clean fraud: Transaction information seems legitimate in isolation
• Chargeback fraud (“friendly” fraud): Bogus claim that the mailed goods were not received


Goals

• Detect and prevent fraud with the greatest accuracy
• Minimise false positives
• Minimise chargebacks
• Minimise the number and the cost of manual reviews
• Transparent analytics and insightful reporting to discover the appropriate anti-fraud strategy
• Powerful strategy management tool to implement it
• Ultimately, maximise the customer’s revenues

[Diagram: product modules: Scoring, Manual review, Chargeback management, Reporting/analytics, Strategy management]

Key features

• Customizable workflows, queues, priorities, roles etc. during manual review
• Intelligent routing to the most appropriate agent
• Dispute and chargeback management
• Chargeback prediction
• Device fingerprinting
• Device-account-owner dynamics monitoring
• Geolocation and IP monitoring with proxy detection
• Affiliate monitoring
• 3rd-party services integration


Key features (cont.)

• Automatic rule creation from data
• Post-authorization retrospective scoring until shipping
• Compliance with local policies, regulations, tax procedures
• Customizable reports
• Real-time and historic monitoring and ad-hoc data analytics
• Events and alerts management
• Anti-fraud strategy management:
  • Easy rules management
  • Models management
• Cross-channel


M-payments

• A strategic opportunity.
  • Gartner: From $170bn in 2012 to $600bn by 2016 (other analysts predict even higher volumes).
  • In a global population of 7bn, 5bn have mobile phones and only 2bn have bank accounts.
• A collaboration/collision field for: banks, mobile operators, credit card companies, payment gateways, device makers.
• Two models, three broad markets:
  • M-commerce: Replace cash and cards to buy products.
    • Applies to sophisticated (1bn) and developed (4bn) markets.
  • Mobile money transfers: Replace bank accounts.
    • Applies to emerging (2bn) and developed markets.
• Some mobile payments companies:
  • Comviva, Fundamo (VISA), Gemalto, Monitise, Oberthur, Sybase (SAP), Utiba,


M-payments (cont.)

• New models are needed to accommodate:
  • The greater exposure of mobile devices to theft, which may result in data theft and account takeover.
  • Increased rates of authentication false positives due to mistyping (because of the small size of the device etc.).
  • New device-account patterns.
  • Mobility-sensitive behavior patterns such as IP velocity, out-of-band authentication etc.
  • Mobile payments innovation such as mobile POS (Square, PayPal Beacon etc.) and digital wallets (Google etc.).
  • Possibly different regulations depending on location.


A scoring framework


[Diagram: the scoring framework and what each family of methods detects]

• Rules (rules, decision trees): detect known patterns
• Predictive analytics (neural networks, support vector machines, artificial immune systems): detect complex known patterns
• Anomaly detection (distance-based, Gaussian): detect unknown patterns
• Clustering (centroid, hierarchical): detect segments
• Velocity analysis (Markov processes): detect patterns in the time domain
• Optimization
• Multi-agent knowledge fusion (ensemble learning, meta-learning)

Rules

• Aim: Decide an action (class) given a state (feature values).
• Pros:
  • Comprehensible
• Cons:
  • Scale poorly, especially with noisy data
• Product feature:
  • Automatic rule learning
    • Example algorithms: PRISM, RIPPER (underlying method: covering, aka separate-and-conquer), with pruning to simplify.
  • Fuzzy rules: Writing rules in natural language.
• Examples of rules engine execution algorithms: RETE, TREAT.
• Practically: Many false negatives => Insufficient
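
As a minimal sketch of "decide an action given a state", the snippet below evaluates a small rule set against one transaction and resolves conflicts by priority. The feature names (`amount`, `tx_last_hour`), the thresholds and the `Rule` class are hypothetical and chosen only for illustration; it is not RETE or TREAT, which optimise how many rules are matched against many facts.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Transaction = Dict[str, float]


@dataclass
class Rule:
    name: str
    priority: int                               # higher wins when several rules match
    condition: Callable[[Transaction], bool]    # state (feature values) -> does the rule fire?
    action: str                                 # e.g. "accept", "review", "reject"


def decide(tx: Transaction, rules: List[Rule], default: str = "accept") -> str:
    """Return the action of the highest-priority rule whose condition matches."""
    matching = [rule for rule in rules if rule.condition(tx)]
    if not matching:
        return default
    return max(matching, key=lambda rule: rule.priority).action


# Hypothetical rules; the thresholds are illustrative, not recommendations.
RULES = [
    Rule("large_amount", 1, lambda tx: tx["amount"] > 1000, "review"),
    Rule("many_recent_tx", 2, lambda tx: tx["tx_last_hour"] >= 5, "reject"),
]
```

Each rule stays an independent, readable chunk of knowledge, which is the comprehensibility advantage listed above; the priority field is one simple way to resolve conflicts when several rules fire.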


Decision trees

• Closely related to rules. Main differences:
  • Only one output (rules come with priority and conflict resolution methods).
  • Less comprehensible (you need to look at the entire structure, while rules are independent chunks of knowledge).
• Example algorithms: ID3, C4.5/C5.0 (underlying method: divide-and-conquer to minimize entropy).
• Devise rules indirectly from decision trees: Combine separate-and-conquer with divide-and-conquer.
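
The "divide-and-conquer to minimize entropy" step can be sketched as follows: compute the entropy of the class labels, then pick the feature whose split reduces it most (information gain, as used by ID3). The toy rows and labels in the usage note are made up; this is only the split-selection step, not a full tree builder.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows: List[Tuple], labels: List[Hashable], feature_index: int) -> float:
    """Entropy before the split minus the weighted entropy of the groups after it."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature_index], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_feature(rows: List[Tuple], labels: List[Hashable]) -> int:
    """Index of the feature to split on first (the 'divide' step)."""
    return max(range(len(rows[0])), key=lambda i: information_gain(rows, labels, i))
```

For example, with rows `[("new", "high"), ("new", "low"), ("old", "high"), ("old", "low")]` and labels `["fraud", "fraud", "legit", "legit"]`, the first feature separates the classes perfectly, so it is the one selected.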


Neural networks

• Predict the probability that a transaction is fraudulent given potentially vast numbers of features (parameters) in the complex patterns of the real world (non-linear classification).
• Cons:
  • A pattern must have been seen before to be recognised.
  • Non-convex (training can get stuck in local minima).
• Contextual challenge: Uneven classes: Fraud rate by order 0.8% (2012) => potentially many false positives.
  • To tackle this, M. Krivko combined a rules-based system with a logistic regression classifier (i.e. a single NN unit), deployed for a bank (“A hybrid model for plastic card fraud detection systems”, 2010).
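
A short worked example may help show why a 0.8% fraud rate leads to many false positives. The detector rates used in the usage note (99% of fraud caught, 1% of legitimate orders flagged) are assumptions for illustration, not figures from the slides.

```python
def precision(fraud_rate: float, true_positive_rate: float, false_positive_rate: float) -> float:
    """Share of flagged orders that are actually fraudulent (Bayes' rule)."""
    flagged_fraud = fraud_rate * true_positive_rate
    flagged_legit = (1.0 - fraud_rate) * false_positive_rate
    return flagged_fraud / (flagged_fraud + flagged_legit)
```

With `precision(0.008, 0.99, 0.01)` the result is about 0.44: even under these optimistic assumptions, more than half of the flagged orders would be legitimate, simply because legitimate orders outnumber fraudulent ones by more than 100 to 1.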


Neural Networks (cont.)

• Learning algorithms:
  • Back Propagation (supervised)
  • Self-Organizing Map (unsupervised)
• Advanced algorithms:
  • Deep learning: Self-taught neural networks that can be trained to recognize hierarchies of higher-level patterns. No manual classification during training.
    • Product feature: Identify patterns in false negatives?
    • Algorithm: Calculate the cost of each instance from itself, through the NN. After training, perceptrons of the last hidden layer output the probability of the input in the world they “fantasise”.
    • Underlying algorithm: Sparse coding: Learn a basis (“bottleneck”) into which all instances can be decomposed.
  • Dynamic NN architecture: Vary the number of hidden layers and the number of perceptrons at each layer to minimize the error of the maximum likelihood model.
    • Nature-inspired heuristics: Genetic Algorithms, Simulated Annealing etc.


Artificial Immune Systems

• In “Credit Card Fraud Detection with Artificial Immune System” (2008), Gadi et al. showed that a GA-optimized AIS outperformed the corresponding NN.
• Natural: The immune system protects the body by recognizing antigens. When B-cells encounter antigens, they respond with antibodies which attack the antigens.
  • AIS: B-cells respond like pattern matchers. Antigens and B-cells are represented as feature vectors.
• If the affinity of the B-cell to the antigen is high, the B-cell becomes stimulated and produces mutated clones (a better fit for the particular antigen).
  • AIS: Affinity: A similarity metric in the feature vector space (see clustering).
• As the new B-cells are added, inactive ones die (survival of the fittest for network diversity).


Artificial Immune Systems (cont.)

• Training an AIS (clustering within classification):
  • Present the new antigen.
  • From a pool of memory B-cells, select the one with the greatest affinity (similarity metric in the feature vector space, see clustering) to the antigen and clone it. The clones may also mutate.
  • B-cell stimulation is proportional to antigen affinity.
  • Repeat until a certain threshold of average stimulation is reached; the process then stops and the closest B-cell to the antigen is selected as the recognized class (added to the memory pool).
• Classification:
  • K-nearest neighbours among the B-cells of the memory pool.
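
The training loop above can be sketched roughly as below. This is a heavily simplified illustration, not the algorithm of the cited paper: affinity is assumed to be `1 / (1 + Euclidean distance)`, mutation is Gaussian noise, a new class starts from a random cell, and the stopping test uses the affinity of the best clone rather than an average stimulation level. All function names and parameters are hypothetical.

```python
import math
import random
from collections import Counter


def affinity(cell, antigen):
    """Similarity in (0, 1]; 1 means the two feature vectors coincide."""
    return 1.0 / (1.0 + math.dist(cell, antigen))


def mutate(cell, rng, scale=0.1):
    """Clone a B-cell with small random changes to every feature."""
    return tuple(x + rng.gauss(0.0, scale) for x in cell)


def train(antigens, labels, n_clones=10, threshold=0.9, max_rounds=1000, seed=0):
    """Build a memory pool of (B-cell, class) pairs, one matured cell per antigen."""
    rng = random.Random(seed)
    memory = []
    for antigen, label in zip(antigens, labels):
        same_class = [cell for cell, cell_label in memory if cell_label == label]
        if same_class:
            best = max(same_class, key=lambda c: affinity(c, antigen))
        else:
            best = tuple(rng.random() for _ in antigen)  # naive random starting cell
        for _ in range(max_rounds):
            if affinity(best, antigen) >= threshold:
                break
            clones = [mutate(best, rng) for _ in range(n_clones)]
            best = max(clones + [best], key=lambda c: affinity(c, antigen))
        memory.append((best, label))
    return memory


def classify(x, memory, k=3):
    """Majority class among the k memory B-cells nearest to x."""
    nearest = sorted(memory, key=lambda item: math.dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

A production AIS would also retire inactive cells and tune the parameters (the slides mention a GA for that); the sketch only shows the select, clone, mutate and memorise cycle followed by k-nearest-neighbour classification.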


Support Vector Machines

• SVMs are an extension of logistic regression to non-linear classification.
• They are reported to perform better than NNs under certain circumstances.
• They transform features to the kernel space in order to learn the maximum-margin hyperplane that separates the classes.
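
To make "transform features to the kernel space" concrete, the sketch below checks that a degree-2 polynomial kernel on two-dimensional inputs equals an ordinary dot product after an explicit feature mapping. The choice of kernel is an assumption for illustration; the slides do not name one, and this is not an SVM trainer.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def poly_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: (x . y)^2."""
    return dot(x, y) ** 2


def feature_map(x):
    """Explicit mapping of a 2-D point into the 3-D space implied by poly_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)
```

Because `poly_kernel(x, y)` equals `dot(feature_map(x), feature_map(y))`, a linear maximum-margin hyperplane in the mapped space corresponds to a non-linear boundary in the original features, without ever computing the mapping explicitly.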


Optimization

• Use nature-inspired heuristics:
  • Genetic Algorithms
    • A population of random solutions is produced. The next generation of solutions is produced by mutating (random changes) the best solutions of the previous generation or by breeding (combining) the best solutions.
  • Simulated Annealing
    • Gradually decreasing the “temperature” decreases the level of randomization in the search.
• Use them to search for the optimal:
  • Number of hidden layers in a neural network
  • Number of perceptrons at each layer
  • Feature subset out of all data (feature engineering)
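
A minimal simulated-annealing sketch for the architecture search described above. The cost function `toy_validation_error` is a made-up stand-in for actually training and validating a network with the given number of hidden layers and perceptrons; the geometric cooling schedule and the step sizes are likewise assumptions.

```python
import math
import random


def simulated_annealing(cost, start, neighbour, steps=500, t_start=1.0, t_end=0.01, seed=0):
    """Return the best solution seen while the temperature cools geometrically."""
    rng = random.Random(seed)
    current = best = start
    for step in range(steps):
        temperature = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        candidate = neighbour(current, rng)
        delta = cost(candidate) - cost(current)
        # Always accept improvements; accept worse moves less often as it cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
        if cost(current) < cost(best):
            best = current
    return best


def toy_validation_error(architecture):
    """Hypothetical error surface with its minimum at 2 layers of 16 perceptrons."""
    layers, units = architecture
    return (layers - 2) ** 2 + ((units - 16) / 8) ** 2


def step_architecture(architecture, rng):
    """Random neighbour: change the layer count by at most 1, the units by at most 4."""
    layers, units = architecture
    return (max(1, layers + rng.choice((-1, 0, 1))),
            max(1, units + rng.choice((-4, 0, 4))))
```

Calling `simulated_annealing(toy_validation_error, (6, 64), step_architecture)` returns an architecture that is never worse than the starting point, since the best solution seen is tracked separately from the current one.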


Clustering

• Aim: Compute groups according to a similarity metric:
  • Euclidean distance
  • Pearson correlation
  • Cosine similarity
• Product feature:
  • Cluster buyers into different groups (custom input features) and apply the appropriate scorecard.
  • Identify common features inside a group, e.g. discover the common features inside a group of fraudulent transactions (reporting) to adjust strategy.
• Example algorithms:
  • Centroid clustering (e.g. K-means)
  • Fuzzy C-means (membership of more than one cluster is allowed by adding a fuzzifier to the objective function that determines the degree of membership).
  • Hierarchical clustering
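
Centroid clustering with Euclidean distance can be sketched as a plain K-means loop. The initial centroids are passed in explicitly to keep the example deterministic; in practice they would usually be chosen at random or with a seeding heuristic. The two buyer features in the usage note are hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-centring."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids
```

For instance, with buyers described by (orders per month, average basket in tens of currency units), `kmeans([(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)], [(1, 1), (8, 8)])` separates the three occasional buyers from the three heavy buyers, and a different scorecard could then be applied to each group.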


Anomaly Detection

• Aim: Detect outliers, i.e. transaction patterns that have not been seen before.
• Example algorithms:
  • Distance from any cluster beyond a given threshold.
  • Gaussian distribution: Probability of an instance given the Gaussian distribution estimated from previous instances.
    • The Gaussian distribution is used when the actual distribution is unknown, because it is so common in the social sciences.
  • Scalable outlier detection.
  • Fuzzy-logic combinations of the above.
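
The Gaussian approach can be sketched with the standard library: fit one Gaussian per feature on previous instances, multiply the per-feature densities, and flag an instance whose density falls below a threshold. Treating features as independent and the value of `epsilon` are simplifying assumptions; the feature columns in the usage note (amount, items per order) are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def fit(history):
    """Fit one independent Gaussian per feature (column) of the previous instances."""
    return [NormalDist(mean(column), stdev(column)) for column in zip(*history)]


def density(model, instance):
    """Joint density of an instance under the per-feature Gaussians."""
    p = 1.0
    for gaussian, value in zip(model, instance):
        p *= gaussian.pdf(value)
    return p


def is_anomaly(model, instance, epsilon):
    """Flag instances that are very unlikely under the fitted model."""
    return density(model, instance) < epsilon
```

Fitted on a history such as `[(20.0, 1.0), (25.0, 2.0), (22.0, 1.0), (30.0, 3.0), (24.0, 2.0)]`, an order of `(24.0, 2.0)` has a much higher density than an order of `(500.0, 40.0)`, which would be flagged for any reasonable `epsilon`.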


Velocity

• Aim: Compute the posterior probability of a sequence of different transactions (filtering), given a Markov Model consisting of:
  • A transition model
  • An observation model (for a Hidden Markov Model where the latent variables are “legit” or “fraud” transactions).
• Each transaction may be represented by:
  • The amount spent
  • The time elapsed since the previous transaction etc.
• Algorithm:
  • Forward algorithm (conditional independence)
    • Scalability note: Heuristics, e.g. Gibbs sampling.
  • Learn an HMM: Baum-Welch
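
Filtering with the forward algorithm can be sketched for a two-state HMM as below. All probabilities are invented for illustration (in practice they would be learned, e.g. with Baum-Welch), and each transaction is reduced to a single discretised observation, "small" or "large" amount.

```python
STATES = ("legit", "fraud")

# Hypothetical model parameters, for illustration only.
PRIOR = {"legit": 0.99, "fraud": 0.01}
TRANSITION = {
    "legit": {"legit": 0.95, "fraud": 0.05},
    "fraud": {"legit": 0.10, "fraud": 0.90},
}
EMISSION = {
    "legit": {"small": 0.9, "large": 0.1},
    "fraud": {"small": 0.3, "large": 0.7},
}


def forward_filter(observations):
    """Posterior over the latent state after the last observation."""
    belief = dict(PRIOR)
    for t, obs in enumerate(observations):
        if t > 0:
            # Predict: push the belief through the transition model.
            belief = {
                s: sum(belief[prev] * TRANSITION[prev][s] for prev in STATES)
                for s in STATES
            }
        # Update: weight by the observation model, then normalise.
        unnormalised = {s: belief[s] * EMISSION[s][obs] for s in STATES}
        total = sum(unnormalised.values())
        belief = {s: value / total for s, value in unnormalised.items()}
    return belief
```

Under these assumed parameters, three "large" transactions in a row push the fraud posterior well above one half, while three "small" ones keep it at a few percent.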


Multi-agent knowledge fusion

• Product feature:
  • Intelligent fusion of the results from the smart agents discussed.
  • Allow users to fine-tune, e.g. with “overriding” rules:
    • Enforce a scorecard
    • Exceptions (e.g. airline tickets etc.)
• Ensemble strategies:
  • Same vs different input representation
  • Multi-agent (different agents work in parallel and then results are combined)
  • Multi-stage (classifiers are trained on subsets based on the results of previous classifiers, e.g. classify initial false positives into final true/false positives)
  • Cascading (the next classifier is used if the previous result had low confidence)
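
The cascading strategy can be sketched as a loop over classifiers that stops at the first confident answer. The two stage functions, their feature names and the confidence values are hypothetical placeholders for real agents.

```python
def cascade(tx, stages, confidence_threshold=0.8):
    """Run classifiers in order; stop as soon as one is confident enough."""
    label, confidence = None, 0.0
    for classify in stages:
        label, confidence = classify(tx)
        if confidence >= confidence_threshold:
            break
    return label, confidence


def cheap_rule(tx):
    """Hypothetical first stage: fast, confident only for small amounts."""
    if tx["amount"] < 50:
        return "legit", 0.95
    return "fraud", 0.55  # unsure, so the next stage will be consulted


def expensive_model(tx):
    """Hypothetical second stage: slower model used only when needed."""
    return ("fraud", 0.9) if tx["new_device"] else ("legit", 0.85)
```

The design benefit is that most transactions are settled by the cheap first stage, and only the uncertain ones pay for the more expensive agent.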


Multi-agent knowledge fusion (cont.)

• Homogeneous classifiers combination techniques:
  • Voting:
    • Max of all predictions, max over average (for each class) among all classifiers, max of products, max of sums, unanimity, accuracy-weighted voting, accuracy on training/testing etc.
  • Data manipulation:
    • Boosting: Classifiers trained on instances where previous classifiers failed (e.g. train on false positives).
    • Bagging: Train each classifier on a different portion of the training set chosen stochastically.
    • Correlation reduction: Train each classifier on a different portion of the training set chosen based on patterns (to the limit that each classifier is trained on one class).
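
Three of the voting schemes listed above can be sketched as follows. Each classifier is assumed to output a dictionary of class probabilities; that representation and the function names are choices made for this example.

```python
import math
from collections import defaultdict


def max_of_average(predictions):
    """Class with the highest mean probability across classifiers."""
    classes = predictions[0].keys()
    return max(classes, key=lambda c: sum(p[c] for p in predictions) / len(predictions))


def max_of_products(predictions):
    """Class with the highest product of probabilities across classifiers."""
    classes = predictions[0].keys()
    return max(classes, key=lambda c: math.prod(p[c] for p in predictions))


def accuracy_weighted_vote(predictions, accuracies):
    """Each classifier votes for its top class, weighted by its measured accuracy."""
    scores = defaultdict(float)
    for prediction, accuracy in zip(predictions, accuracies):
        scores[max(prediction, key=prediction.get)] += accuracy
    return max(scores, key=scores.get)
```

The schemes can disagree: one very confident classifier may dominate the average or the product, while accuracy-weighted voting lets two moderately confident but historically more accurate classifiers outvote it.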


Multi-agent knowledge fusion (cont.)

• Meta-learning (stacking):
  • Used for heterogeneous classifiers (e.g. an NN and an SVM).
  • Replaces voting with a meta-classifier that learns to classify based on the outputs of the previous-level classifiers.
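
Stacking can be sketched with a deliberately simple meta-classifier: a lookup table that learns, for each combination of base-classifier outputs, which true label was most frequent. The base classifiers and training rows are hypothetical, and a real system would typically train the meta-classifier on held-out data with a proper learner instead of a table.

```python
from collections import Counter, defaultdict


def train_meta(base_classifiers, rows, labels):
    """Learn a mapping from tuples of base outputs to the most frequent true label."""
    table = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(clf(row) for clf in base_classifiers)
        table[key][label] += 1
    fallback = Counter(labels).most_common(1)[0][0]

    def meta(row):
        key = tuple(clf(row) for clf in base_classifiers)
        if key in table:
            return table[key].most_common(1)[0][0]
        return fallback  # unseen combination of base outputs

    return meta


# Hypothetical base classifiers standing in for, e.g., an NN and an SVM.
def amount_model(row):
    return "fraud" if row["amount"] > 100 else "legit"


def device_model(row):
    return "fraud" if row["new_device"] else "legit"


TRAIN_ROWS = [
    {"amount": 500, "new_device": True},
    {"amount": 500, "new_device": False},
    {"amount": 20, "new_device": True},
    {"amount": 20, "new_device": False},
    {"amount": 300, "new_device": True},
]
TRAIN_LABELS = ["fraud", "legit", "legit", "legit", "fraud"]
```

With this toy data the meta-classifier learns to flag fraud only when both base classifiers agree, something a fixed voting rule would have to be told in advance.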


Social integration

• Potential to narrow risk even further, e.g.:
  • A location sign-in may prevent a false positive that would otherwise be signaled due to a location change.
  • Blacklisted individuals may be an extra feature for people who have them in their network.
  • Typing patterns may be used to enhance authentication.
