Market Basket Analysis & Neural Networks (chaps 7 & 11)
Retail Checkout Data
McGraw-Hill/Irwin ©2007 The McGraw-Hill Companies, Inc. All rights reserved
MARKET BASKET ANALYSIS
• INPUT: list of purchases by purchaser
  – do not have names
• Identify purchase patterns
  – what items tend to be purchased together
    • obvious: steak-potatoes; beer-pretzels
  – what items are purchased sequentially
    • obvious: house-furniture; car-tires
  – what items tend to be purchased by season
Market Basket Analysis
• Categorize customer purchase behavior
• Identify actionable information
  – purchase profiles
  – profitability of each purchase profile
  – use for marketing
    • layout or catalogs
    • select products for promotion
    • space allocation, product placement
Market Basket Analysis
• Affinity Positioning
  – coffee, coffee makers in close proximity
• Cross-Selling
  – cold medicines, tissue, orange juice
  – Monday Night Football kiosks on Monday p.m.
Possible Market Baskets
Customer 1: beer, pretzels, potato chips, aspirin
Customer 2: diapers, baby lotion, grapefruit juice, baby food, milk
Customer 3: soda, potato chips, milk
Customer 4: soup, beer, milk, ice cream
Customer 5: soda, coffee, milk, bread
Customer 6: beer, potato chips
Co-occurrence Table
              Beer   Pot. Chips   Milk   Diapers   Soda
Beer            3        2          1       0        0
Pot. Chips      2        3          1       0        1
Milk            1        1          4       1        2
Diapers         0        0          1       1        0
Soda            0        1          2       0        2
beer & potato chips - makes sense
milk & soda - probably noise
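The table can be rebuilt from the six baskets listed on the previous slide. The sketch below is one way to do the count, assuming each basket is treated as a set of items; the names `BASKETS`, `ITEMS`, and `cooccurrence` are illustrative and not from the slides.

```python
from itertools import combinations

# The six baskets from the "Possible Market Baskets" slide.
BASKETS = [
    {"beer", "pretzels", "potato chips", "aspirin"},
    {"diapers", "baby lotion", "grapefruit juice", "baby food", "milk"},
    {"soda", "potato chips", "milk"},
    {"soup", "beer", "milk", "ice cream"},
    {"soda", "coffee", "milk", "bread"},
    {"beer", "potato chips"},
]
# The five items tabulated on the slide.
ITEMS = ["beer", "potato chips", "milk", "diapers", "soda"]


def cooccurrence(baskets, items):
    """Count baskets holding both items of a pair (diagonal: the item itself)."""
    counts = {(a, b): 0 for a in items for b in items}
    for basket in baskets:
        present = [item for item in items if item in basket]
        for item in present:
            counts[(item, item)] += 1
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1  # keep the table symmetric
            counts[(b, a)] += 1
    return counts


table = cooccurrence(BASKETS, ITEMS)
for row in ITEMS:
    print(f"{row:<13}" + "".join(f"{table[(row, col)]:>4}" for col in ITEMS))
```

Because the count is symmetric, the milk and potato chips cell is the same in both rows: only Customer 3 bought both.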
Jaccard Coefficient
Ratio of cases together over total cases
            Beer    PotChip   Milk    Diapers
PotChip     0.333
Milk        0.143   0.143
Diapers     0       0         0.200
Soda        0       0.200     0.333   0
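The tabulated values appear to be reproduced by dividing the number of baskets containing both items by the sum of the two items' individual basket counts (for example 2 / (3 + 3) = 0.333 for beer and potato chips). The sketch below computes that ratio next to the usual set-based Jaccard coefficient, which divides by the number of baskets containing either item and so gives larger values. The function names are illustrative, and the reading of the slide's formula is an inference from its numbers, not something the slide states.

```python
# Item counts and pair counts for the six example baskets.
COUNT = {"beer": 3, "potato chips": 3, "milk": 4, "diapers": 1, "soda": 2}
TOGETHER = {
    ("beer", "potato chips"): 2,
    ("beer", "milk"): 1,
    ("potato chips", "milk"): 1,
    ("potato chips", "soda"): 1,
    ("milk", "diapers"): 1,
    ("milk", "soda"): 2,
}


def both(a, b):
    """Baskets containing both a and b (0 when the pair never occurs)."""
    return TOGETHER.get((a, b), TOGETHER.get((b, a), 0))


def slide_ratio(a, b):
    """Ratio that matches the slide's table: both / (count of a + count of b)."""
    return both(a, b) / (COUNT[a] + COUNT[b])


def jaccard(a, b):
    """Set-based Jaccard: both / (baskets containing a or b)."""
    return both(a, b) / (COUNT[a] + COUNT[b] - both(a, b))


for a, b in TOGETHER:
    print(f"{a} & {b}: slide {slide_ratio(a, b):.3f}, set-based {jaccard(a, b):.3f}")
```

Either version ranks beer with potato chips and milk with soda highest, so the ordering of candidate pairs is the same here.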
Market Basket Analysis
• Steve Schmidt - president of ACNielsen-US
• Market Basket Benefits
  – selection of promotions, merchandising strategy
    • sensitive to price: Italian entrees, pizza, pies, Oriental entrees, orange juice
  – uncover consumer spending patterns
    • correlations: orange juice & waffles
  – joint promotional opportunities
Market Basket Analysis
• Retail outlets
• Telecommunications
• Banks
• Insurance
  – link analysis for fraud
• Medical
  – symptom analysis
Market Basket Analysis
• Chain Store Age Executive (1995)
  1) Associate products by category
  2) What % of each category was in each market basket
• Customers shop on personal needs, not on product groupings
Purchase Profiles
Beauty conscious         Kids’ play         Smoker
Health conscious         Casual drinker     Pet lover
Sports conscious         New family         Gardener
Men’s image conscious    Casual reader      Hobbyist
Convenience food         Sentimental        Illness (OTC)
Home handyman            Automotive         Illness (prescription)
TV/stereo enthusiast     Photographer       Personal care
Seasonal/traditional     Homemaker          Men’s fashion
Student/home office      Home comfort       Kids’ fashion
Fashion footwear         Women’s fashion
Purchase Profiles
• Beauty conscious
  – cotton balls
  – hair dye
  – cologne
  – nail polish
Purchase Profile Use
• Each profile has an average profit per basket
Kids’ fashion          $15.24   Push these
Men’s fashion          $13.41   Push these
….
Smoker                  $2.88   Don’t push these
Student/home office     $2.55   Don’t push these
Market Basket Analysis
• LIMITATIONS
  – takes over 18 months to implement
  – market basket analysis only identifies hypotheses, which need to be tested
    • neural network, regression, decision tree analyses
  – measurement of impact needed
  – difficult to identify product groupings
  – complexity grows exponentially
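One way to see the last limitation is to count how many item combinations would have to be examined. The sketch below assumes every combination of two or more products is a candidate grouping; `candidate_itemsets` is an illustrative helper, not something named on the slides.

```python
from math import comb


def candidate_itemsets(n_items, max_size=None):
    """Number of item combinations of size 2..max_size among n_items products."""
    top = n_items if max_size is None else min(max_size, n_items)
    return sum(comb(n_items, k) for k in range(2, top + 1))


# Pairs only versus groupings of any size, for a growing product list.
for n in (5, 10, 20, 40):
    print(n, candidate_itemsets(n, max_size=2), candidate_itemsets(n))
```

Pairs alone grow quadratically, while groupings of any size total 2^n - n - 1, which is why practical analyses usually cap the grouping size or work at the category level.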
Market Basket Analysis
• BENEFITS:
  – simple computations
  – can be undirected (don’t have to have hypotheses before analysis)
  – different data forms can be analyzed
Market Basket Software
• Market Basket Analysis is highly unstructured
• Most popular data mining software doesn’t support it
  – Clementine does
• Specialty software market for this specific purpose
  – DataSage Customer Analysis
  – Xaffinity
Neural Networks
Automatic Model Building
(Machine Learning)
Artificial Intelligence
High-Growth Product
• Used for classifying data
  – target customers
  – bank loan approval
  – hiring
  – stock purchase
  – trading electricity
  – DATA MINING
• Used for prediction
Description
• Use network of connected nodes (in layers)
• Network connects input, output (categorical)
  – inputs like independent variable values in regression
  – outputs: {buy, don’t} {paid, didn’t}
    {red, green, blue, purple}
    {character recognition - alphabetic characters}
Perceptron
• Basic building block
• Composed of synaptic weights and a neuron
• Weights scale the input values
• Combination of weights and transfer function F(x) transforms inputs to needed output O
• Trained by changing weights until desired output is achieved
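[Figure: perceptron diagram. Inputs I1, I2, I3, ..., In pass through synaptic weights W1, W2, W3, ..., Wn into the neuron, which combines them with a bias and applies F(x) to produce output O.]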
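The perceptron described above can be sketched in a few lines. This is a minimal illustration, assuming a step function for F(x) and the classic perceptron learning rule; the AND training data, the learning rate, and all function names are illustrative choices that do not come from the slides.

```python
def step(x):
    """Transfer function F(x): output 1 when the weighted sum is non-negative."""
    return 1 if x >= 0 else 0


def predict(weights, bias, inputs):
    """Scale inputs by the synaptic weights, add the bias, apply F(x)."""
    total = bias + sum(w * i for w, i in zip(weights, inputs))
    return step(total)


def train(samples, n_inputs, rate=1.0, max_epochs=100):
    """Perceptron learning rule: adjust weights after each wrong output."""
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in samples:
            delta = target - predict(weights, bias, inputs)
            if delta != 0:
                errors += 1
                weights = [w + rate * delta * i for w, i in zip(weights, inputs)]
                bias += rate * delta
        if errors == 0:  # every sample now yields the desired output
            break
    return weights, bias


# Hypothetical training data: logical AND of two binary inputs.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(AND_SAMPLES, n_inputs=2)
outputs = [predict(weights, bias, x) for x, _ in AND_SAMPLES]
print(weights, bias, outputs)
```

A single perceptron like this can only separate classes with a straight line, which is one reason practical networks stack several in layers.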
Network
[Figure: network diagram with an input layer, hidden layers, and an output layer whose outputs are Good and Bad.]
Operation
• Randomly generate weights on model
  – based on brain neurons
    • input electrical charge transformed by neuron
    • passed on to another neuron
  – weight input values, pass on to next layer
  – predict which of the categorical outputs is true
• Measure fit
  – fine tune around best fit
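The cycle of random weights, measuring fit, and fine-tuning around the best fit can be illustrated with a simple random search. This is only a sketch of the idea: it assumes a tiny network with two inputs, two hidden nodes, and one output, a sigmoid transfer function, and hypothetical XOR data. Real packages typically adjust weights with backpropagation rather than random nudges.

```python
import math
import random


def sigmoid(x):
    """Squash a weighted sum into (0, 1); the input is clamped to avoid overflow."""
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def forward(weights, inputs):
    """Pass two inputs through two hidden nodes to one output (9 weights in all)."""
    h1 = sigmoid(weights[0] * inputs[0] + weights[1] * inputs[1] + weights[2])
    h2 = sigmoid(weights[3] * inputs[0] + weights[4] * inputs[1] + weights[5])
    return sigmoid(weights[6] * h1 + weights[7] * h2 + weights[8])


def fit_error(weights, samples):
    """Measure fit as the sum of squared errors over the samples."""
    return sum((forward(weights, x) - y) ** 2 for x, y in samples)


def search(samples, steps=2000, spread=0.5, seed=0):
    rng = random.Random(seed)
    best = [rng.uniform(-1, 1) for _ in range(9)]  # randomly generated weights
    start_error = best_error = fit_error(best, samples)
    for _ in range(steps):
        # Fine-tune around the best fit so far with a small random nudge.
        trial = [w + rng.gauss(0, spread) for w in best]
        trial_error = fit_error(trial, samples)
        if trial_error < best_error:
            best, best_error = trial, trial_error
    return best, start_error, best_error


# Hypothetical data: XOR, a pattern a single perceptron cannot learn.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
best, start_error, best_error = search(XOR)
print(f"error before: {start_error:.3f}, after: {best_error:.3f}")
```

Because a trial is kept only when it fits better, the error can never get worse, though how close it gets to zero depends on the random draws.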
Operation
• Useful for PATTERN RECOGNITION
• Can sometimes substitute for REGRESSION
  – works better than regression if relationships nonlinear
  – MAJOR RELATIVE ADVANTAGE OF NEURAL NETWORKS: YOU DON’T HAVE TO UNDERSTAND THE MODEL
Neural Network Testing
• Usually train on part of available data
  – package tries weights until it successfully categorizes a selected proportion of the training data
• When trained, test model on part of data
  – if given proportion successfully categorized, quits
  – if not, works some more to get better fit
• The “model” is internal to the package
• Model can be applied to new data
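The train-then-test loop above can be sketched as follows. This assumes a single logistic neuron, synthetic data (label 1 when the two inputs sum to more than 1), a 70/30 split, and a 95% target proportion; all of these, and the function names, are illustrative assumptions, not details from the slides.

```python
import math
import random


def predict(weights, bias, inputs):
    """Single logistic neuron: class 1 when the weighted sum is positive."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def accuracy(weights, bias, rows):
    """Proportion of rows categorized correctly."""
    return sum(predict(weights, bias, x) == y for x, y in rows) / len(rows)


def train_until(rows, target=0.95, rate=0.5, max_epochs=500):
    """Adjust weights until the target proportion of training rows is correct."""
    weights, bias = [0.0, 0.0], 0.0
    for epoch in range(1, max_epochs + 1):
        for x, y in rows:
            total = bias + sum(w * xi for w, xi in zip(weights, x))
            out = 1.0 / (1.0 + math.exp(-total))
            weights = [w + rate * (y - out) * xi for w, xi in zip(weights, x)]
            bias += rate * (y - out)
        if accuracy(weights, bias, rows) >= target:
            break
    return weights, bias, epoch


# Synthetic data set, split into training and test parts.
rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(200)]
rows = [(p, 1 if p[0] + p[1] > 1 else 0) for p in points]
train_rows, test_rows = rows[:140], rows[140:]

weights, bias, epochs = train_until(train_rows)
print("epochs:", epochs)
print("training accuracy:", accuracy(weights, bias, train_rows))
print("test accuracy:", accuracy(weights, bias, test_rows))
```

If the test accuracy falls short of the chosen proportion, the same loop can be rerun with more epochs or a different target, which is the "works some more" step.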
Business Application
• Best in classifying data
  mortgage underwriting      asset allocation
  bond rating                fraud prevention
  commodity trading
• Predicting interest rate, inventory
  firm failure               bank failure
  takeover vulnerability     stock price
  corporate merger profitability
Neural Network Process
1. Collect data
2. Separate into training, test sets
3. Transform data to appropriate units
   • Categorical works better, but not necessary
4. Select, train, & test the network
   • Can set number of hidden layers
   • Can set number of nodes per layer
   • A number of algorithmic options
5. Apply (need to use system on which built)
Marketing Applications
• Direct marketing
  – database of prospective customers
    • age, sex, income, occupation, education, location
    • predict positive response to mail solicitations
• THIS IS HOW DATA MINING CAN BE USED IN MICROMARKETING
Neural Nets to Predict Bankruptcy
Wilson & Sharda (1994)
Monitor firm financial performance
Useful to identify internal problems, investment evaluation, auditing
Predict bankruptcy - multivariate discriminant analysis of financial ratios (develop formula of weights over independent variables)
Neural network - inputs were 5 financial ratios - data from Moody’s Industrial Manuals (129 firms, 1975-1982; 65 went bankrupt)
Tested against discriminant analysis
Neural network significantly better
CASE: Support CRM
Drew et al. (2001), Journal of Service Research
• Identify customers to target
• Customer hazard function:
  – Likelihood of leaving to a competitor (CHURN)
• Gain in Lifetime Value (GLTV)
  – NPV: weight EV by prob{staying}
  – GLTV: quantified potential financial effects of company actions to retain customers
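A rough sketch of how these quantities relate, under simplifying assumptions that are not taken from the cited paper: a fixed expected value per month, a per-month hazard (churn probability), a constant monthly discount rate, and a retention action that cuts every hazard by the same fraction. All names and figures below are hypothetical.

```python
def lifetime_value(monthly_value, hazards, monthly_rate=0.01):
    """NPV of expected revenue: each month's value weighted by prob{staying}."""
    survival, total = 1.0, 0.0
    for month, hazard in enumerate(hazards, start=1):
        survival *= 1.0 - hazard  # probability the customer is still active
        total += survival * monthly_value / (1.0 + monthly_rate) ** month
    return total


def gain_in_lifetime_value(monthly_value, hazards, hazard_cut, monthly_rate=0.01):
    """GLTV: lifetime value after a retention action minus the baseline value."""
    reduced = [h * (1.0 - hazard_cut) for h in hazards]
    after = lifetime_value(monthly_value, reduced, monthly_rate)
    before = lifetime_value(monthly_value, hazards, monthly_rate)
    return after - before


# Hypothetical customer: $50 per month, 5% monthly churn over a 24-month horizon.
hazards = [0.05] * 24
print(round(lifetime_value(50.0, hazards), 2))
print(round(gain_in_lifetime_value(50.0, hazards, hazard_cut=0.2), 2))
```

Customers whose gain is large relative to the cost of the retention action would be the ones to target.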
Systems
A great many products
• general NN products
  $59 to $2,000: @Brain, BrainMaker, Discover-It
• components
  DATA MINING along with megadatabases, other products
• specialty products
  construction bidding, stock trading, electricity trading
Potential Value
• THEY BUILD THEMSELVES
  – humans pick the data, variables, set test limits
• CAN DEAL WITH FAST-MOVING SITUATIONS
  – stock market
• CAN DEAL WITH MASSIVE DATA
  – data mining
• Problem - speed unpredictable