BCS Talk - What sells well and when?
What sells well and when?
Stephen G. Matthews
Centre for Computational Intelligence (CCI), De Montfort University
7th February 2012 / BCS meeting
Stephen G. Matthews (DMU) What sells well and when? BCS meeting 1 / 36
Outline
1 Background
2 The Problem
3 The Solution
4 Experiments
1 Background
Association Rules
Significant correlations between items in datasets.
Uses: positioning stock on shelves, inventory control and cross-selling.
Descriptive data mining.
Example Rule: 20% of customers matched the rule
IF pizza THEN beer
Terminology and formal description
A dataset contains a set of $N$ transactions $T = \{t_1, t_2, \ldots, t_N\}$.
Each transaction comprises a subset of items, referred to as an itemset, from $M$ items $I = \{i_1, i_2, \ldots, i_M\}$.
Support count measures the number of transactions containing an itemset:
\[ \sigma(X) = |\{ t_i \mid X \subseteq t_i,\ t_i \in T \}| \tag{1} \]
The support-confidence framework:
\[ s(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{N} \tag{2} \]
\[ c(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{\sigma(X)} \tag{3} \]
Example
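TID  Cheese  Beer  Pizza
1    1       1     1
2    1       0     1
3    0       1     1
4    0       1     0
5    1       0     0
\[ \mathrm{Support}(\text{pizza} \Rightarrow \text{beer}) = \frac{\sigma(\text{pizza} \cup \text{beer})}{N} = \frac{2}{5} = 0.4 \]
\[ \mathrm{Confidence}(\text{pizza} \Rightarrow \text{beer}) = \frac{\sigma(\text{pizza} \cup \text{beer})}{\sigma(\text{pizza})} = \frac{2}{3} = 0.\dot{6} \]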
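The worked example above can be checked with a few lines of Python. This is a minimal sketch, not code from the talk: the names `transactions`, `support_count`, `support` and `confidence` are illustrative, and each transaction is assumed to be stored as a set of item names.

```python
# Transactions from the example table, one set of purchased items per TID.
transactions = [
    {"cheese", "beer", "pizza"},  # TID 1
    {"cheese", "pizza"},          # TID 2
    {"beer", "pizza"},            # TID 3
    {"beer"},                     # TID 4
    {"cheese"},                   # TID 5
]

def support_count(itemset, transactions):
    """Equation (1): number of transactions containing every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(x, y, transactions):
    """Equation (2): fraction of all transactions containing both X and Y."""
    return support_count(x | y, transactions) / len(transactions)

def confidence(x, y, transactions):
    """Equation (3): fraction of transactions containing X that also contain Y."""
    return support_count(x | y, transactions) / support_count(x, transactions)

print(support({"pizza"}, {"beer"}, transactions))     # 0.4
print(confidence({"pizza"}, {"beer"}, transactions))  # 2/3, about 0.667
```

Storing each transaction as a set keeps the subset test in equation (1) a single `<=` comparison.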
Fuzzy Association Rules
Represent quantities of items with words.
Interpretable and comprehensible.
Uncertainty in data (e.g., web server log) and linguistic uncertainty (human interpretation).
Imprecision in data (physical measurements from weighing goods in a butcher's, a fishmonger's and a sweet shop).
Example Rule: 20% of customers matched the rule
IF quantity of pizza is high THEN quantity of beer is high
What are Fuzzy Sets?
Fuzzy sets have elements that have degrees of membership in [0, 1].
Crisp boundary problem.
[Figure: membership µ (0 to 1) against height, with ticks at 6ft and 7ft, for (a) a crisp set tall and (b) a fuzzy set tall]
“Fuzzy Logic: An Introduction” - Award-winning video from the CCI
◮ https://www.youtube.com/watch?v=P8wY6mi1vV8
◮ http://www.cci.dmu.ac.uk/news-archive/212/
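As a rough illustration of the crisp boundary problem, the sketch below contrasts the two kinds of set. It is not taken from the talk: the crisp threshold of 6 ft and the linear ramp between 6 ft and 7 ft are assumptions chosen to match the two tick marks in the figure, and the function names are hypothetical.

```python
def crisp_tall(height_ft):
    """Crisp set: membership is either 0 or 1 (the 6 ft threshold is assumed)."""
    return 1.0 if height_ft >= 6.0 else 0.0

def fuzzy_tall(height_ft):
    """Fuzzy set: membership rises linearly from 0 at 6 ft to 1 at 7 ft (assumed shape)."""
    return min(1.0, max(0.0, height_ft - 6.0))

# Print height, crisp membership and fuzzy membership side by side.
for h in (5.9, 6.0, 6.5, 7.0):
    print(h, crisp_tall(h), fuzzy_tall(h))
```

With the crisp set, a tiny change in height around the threshold flips membership from 0 to 1, whereas the fuzzy set changes gradually.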
Example
TID  Cheese  Beer  Pizza
1    1       0     2
2    0       16    5
3    0       15    6
4    7       8     0
5    2       8     1
[Figure: membership functions low, medium and high over Quantity (1 to 20), with µ from 0 to 1; a quantity of 16 is marked at µ = 0.58]
Example
     Cheese            Beer              Pizza
TID  l     m     h     l     m     h     l     m     h
1    1     0     0     0     0     0     0.79  0.21  0
2    0     0     0     0     0.42  0.58  0.42  0.58  0
3    0     0     0     0     0.58  0.42  0.58  0.42  0
4    0.37  0.63  0     0.26  0.74  0     0     0     0
5    0.79  0.21  0     0.26  0.74  0     1     0     0
\[ \mathrm{FuzzySupport}(\text{cheese.l} \Rightarrow \text{pizza.l}) = \frac{\sum_{i=1}^{5} \min(\text{cheese.l}, \text{pizza.l})}{N} = \frac{0.79 + 0.79}{5} = 0.316 \]
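The calculation above can be reproduced as follows. This is a sketch under stated assumptions: the membership degrees are copied from the table rather than computed from the quantities, and the dictionary layout and the name `fuzzy_support` are illustrative, not the talk's implementation.

```python
# Membership degrees copied from the table above, one list entry per TID.
memberships = {
    "cheese.l": [1.0, 0.0, 0.0, 0.37, 0.79],
    "pizza.l": [0.79, 0.42, 0.58, 0.0, 1.0],
}

def fuzzy_support(labels, memberships):
    """Sum over transactions of the minimum membership degree, divided by N."""
    columns = [memberships[label] for label in labels]
    n = len(columns[0])
    return sum(min(degrees) for degrees in zip(*columns)) / n

print(round(fuzzy_support(["cheese.l", "pizza.l"], memberships), 3))  # 0.316
```

Only TIDs 1 and 5 contribute, since every other transaction has a membership degree of 0 in one of the two labels.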
Temporal Association Rules
Lifespan: occurs over a period of time, e.g., one week.
Cyclic: recurs at regular intervals.
Calendar: occurs in periods defined with a calendar, e.g., 1st January 1970.
2 The Problem
The goal
Example Rule: 20% of customers matched the following rule on one Friday evening
IF quantity of pizza is high THEN quantity of beer is high
Traditional Approach
1 Define linguistic labels and membership function parameters (clustering, GA and uniform partitioning).
2 Mine rules using the linguistic labels and membership functions, e.g., cheese.l, cheese.m, cheese.h, beer.l, beer.m, ...
[Figure: membership functions low, medium and high over Quantity (1 to 20), with µ from 0 to 1]
Assumes that the membership functions stay the same throughout the entire dataset ...
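Uniform partitioning, one of the options named in step 1, can be sketched as below. The triangular shape and the even spacing of the three labels over the quantity range 1 to 20 are assumptions; the slide only shows the labels low, medium and high. The function name is hypothetical.

```python
def uniform_triangular(x, lo=1.0, hi=20.0, labels=("low", "medium", "high")):
    """Membership degree of x in each label of an evenly spaced triangular partition."""
    step = (hi - lo) / (len(labels) - 1)  # distance between adjacent peaks
    degrees = {}
    for k, label in enumerate(labels):
        peak = lo + k * step
        # Degree falls linearly from 1 at the peak to 0 one step away.
        degrees[label] = max(0.0, 1.0 - abs(x - peak) / step)
    return degrees

print({k: round(v, 2) for k, v in uniform_triangular(16).items()})
# {'low': 0.0, 'medium': 0.42, 'high': 0.58}
```

Under these assumptions a quantity of 16 has degree 0.58 in high, which agrees with the 0.58 marked at quantity 16 in the talk's membership function figure.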
Losing Rules
TID  Cheese  Beer  Pizza
1    1       0     2
2    0       16    5
3    0       15    6
4    7       8     0
5    2       8     1
Quantities at intersection of membership function boundaries.
[Figure: membership functions low, medium and high over Quantity (1 to 20), with µ from 0 to 1; a quantity of 16 is marked at µ = 0.58]
Losing Rules
Less prominent across entire dataset, but more prominent in a temporal period.
Rigid definition of membership functions.
[Figure: membership functions low, medium and high over Quantity (1 to 20), with µ from 0 to 1]
Example
     Cheese            Beer              Pizza
TID  l     m     h     l     m     h     l     m     h
1    1     0     0     0     0     0     0.79  0.21  0
2    0     0     0     0     0.42  0.58  0.42  0.58  0
3    0     0     0     0     0.58  0.42  0.58  0.42  0
4    0.37  0.63  0     0.26  0.74  0     0     0     0
5    0.79  0.21  0     0.26  0.74  0     1     0     0
\[ \mathrm{FuzzySupport}(\text{cheese.l} \Rightarrow \text{pizza.l}) = \frac{\sum_{i=1}^{5} \min(\text{cheese.l}, \text{pizza.l})}{N} = \frac{0.79 + 0.79}{5} = 0.316 \]
\[ \mathrm{FuzzySupport}(\text{pizza.m} \Rightarrow \text{beer.h}) = \frac{\sum_{i=1}^{5} \min(\text{pizza.m}, \text{beer.h})}{N} = \frac{0.58 + 0.42}{5} = 0.2 \]
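To illustrate how a rule can be weak over the whole dataset yet strong in a temporal period, the sketch below restricts the second calculation to a window of transactions. The choice of window (TIDs 2 to 3) and the averaging over the transactions in that window are assumptions made for illustration; the slide itself only reports the whole-dataset value of 0.2.

```python
# Membership degrees copied from the table above, one list entry per TID.
pizza_m = [0.21, 0.58, 0.42, 0.0, 0.0]
beer_h = [0.0, 0.58, 0.42, 0.0, 0.0]

def fuzzy_support(antecedent, consequent, first_tid=1, last_tid=None):
    """Average of min(antecedent, consequent) over TIDs first_tid..last_tid (inclusive)."""
    last_tid = len(antecedent) if last_tid is None else last_tid
    pairs = list(zip(antecedent, consequent))[first_tid - 1:last_tid]
    return sum(min(a, c) for a, c in pairs) / len(pairs)

print(round(fuzzy_support(pizza_m, beer_h), 3))        # 0.2 over all five transactions
print(round(fuzzy_support(pizza_m, beer_h, 2, 3), 3))  # 0.5 within TIDs 2 to 3
```

A minimum support threshold applied to the whole dataset could discard this rule even though it holds strongly inside the shorter period.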
3 The Solution
2-tuple Linguistic Representation
Displace the membership function left or right.
Overcomes lack of flexibility.
Maintains interpretability whilst discovering more temporal rules.
[Figure: membership functions s0, s1 and s2 over Quantity (1 to 20), with µ from 0 to 1; s1 is displaced by α = −0.3 to give the 2-tuple (s1, −0.3)]
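A lateral displacement can be sketched as a shift of the membership function's peak. The assumptions here are not from the slides: triangular functions evenly spaced over the quantity range 1 to 20, and α expressed as a fraction of the distance between adjacent labels, so that (s1, −0.3) moves the peak of s1 left by 0.3 of that distance. The function name is hypothetical.

```python
def displaced_membership(x, label_index, alpha, lo=1.0, hi=20.0, n_labels=3):
    """Membership of x in label s_{label_index} after lateral displacement alpha."""
    step = (hi - lo) / (n_labels - 1)         # distance between adjacent peaks
    peak = lo + (label_index + alpha) * step  # alpha < 0 shifts left, alpha > 0 shifts right
    return max(0.0, 1.0 - abs(x - peak) / step)

print(round(displaced_membership(10.5, 1, 0.0), 2))   # 1.0: undisplaced peak of s1
print(round(displaced_membership(10.5, 1, -0.3), 2))  # 0.7: peak has moved to 7.65
```

The label itself is unchanged, which is why the representation stays interpretable: only the position of the function moves.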
Search for Rules with a Genetic Algorithm (GA)
What is a GA?
Search method based on principles of genetics and natural selection.
Solution to a problem encoded in a chromosome.
Many solutions compete in a population.
Performance of solutions measured with fitness function.
Population of solutions evolves over time.
Particularly good in large and complex search spaces.
Why is it used for this problem?
Searches for rules.
Combination of different search spaces.
Simultaneously:
◮ Tunes lateral displacements of membership functions.
◮ Discovers a rule.
◮ Discovers temporal period of a rule.
Chromosome
\[ C = (e_l, e_u, i_1, s_1, \alpha_1, a_1, \ldots, i_k, s_k, \alpha_k, a_k) \]
$e_l$: lower endpoint
$e_u$: upper endpoint
$i$: item (e.g., beer)
$s$: linguistic label (e.g., high)
$\alpha$: lateral displacement
$a$: antecedent/consequent flag
$k$: number of items in rule
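One way to hold this chromosome in code is shown below. It is only a sketch of the layout: the class and field names, the string label and the boolean antecedent flag are assumptions, since the slide lists the genes but not their encoding. The example values are those of the rule shown later in the talk (endpoints 9300–9400, Item38 and Item12).

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Gene:
    item: str            # i: item, e.g. "beer"
    label: str           # s: linguistic label, e.g. "high"
    alpha: float         # lateral displacement
    is_antecedent: bool  # a: True for antecedent, False for consequent

@dataclass
class Chromosome:
    lower: int           # e_l: lower endpoint of the temporal period
    upper: int           # e_u: upper endpoint of the temporal period
    genes: List[Gene]    # k genes, one per item in the rule

    def describe(self) -> str:
        """Render the chromosome as a readable IF ... THEN ... rule."""
        ante = [f"{g.item} is ({g.label}, {g.alpha})" for g in self.genes if g.is_antecedent]
        cons = [f"{g.item} is ({g.label}, {g.alpha})" for g in self.genes if not g.is_antecedent]
        return f"[{self.lower}, {self.upper}] IF {' AND '.join(ante)} THEN {' AND '.join(cons)}"

rule = Chromosome(9300, 9400, [
    Gene("Item38", "medium", -0.422, True),
    Gene("Item12", "medium", 0.315, False),
])
print(rule.describe())
# [9300, 9400] IF Item38 is (medium, -0.422) THEN Item12 is (medium, 0.315)
```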
Fitness Evaluation
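\[ \mathrm{Fitness}(C) = \mathrm{TemporalFuzzySupport}(C) + \mathrm{Confidence}(C) \tag{4} \]
\[ \mathrm{Fitness}(C) = \frac{\sum_{j=e_l}^{e_u} \mathrm{FuzzySupport}(C^{(j)}_X \cap C^{(j)}_Y)}{e_u - e_l} + \frac{\sum_{j=e_l}^{e_u} \mathrm{FuzzySupport}(C^{(j)}_X \cap C^{(j)}_Y)}{\sum_{j=e_l}^{e_u} \mathrm{FuzzySupport}(C^{(j)}_X)} \tag{5} \]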
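Equations (4) and (5) can be transcribed as follows. This is a literal sketch rather than the talk's implementation: the rule is limited to one antecedent item and one consequent item, per-transaction fuzzy support is assumed to be the minimum membership degree (as in the earlier fuzzy support example), the sum runs over j = e_l, ..., e_u inclusive, and the denominator e_u − e_l is kept as written in equation (5). The membership degrees are the pizza.m and beer.h columns from the earlier example table; the function and variable names are hypothetical.

```python
def fitness(antecedent, consequent, e_l, e_u):
    """Temporal fuzzy support plus confidence over transactions e_l..e_u (1-based, inclusive)."""
    window = range(e_l - 1, e_u)  # convert 1-based TIDs to list indices
    both = sum(min(antecedent[j], consequent[j]) for j in window)
    ante = sum(antecedent[j] for j in window)
    temporal_fuzzy_support = both / (e_u - e_l)  # denominator as written in equation (5)
    confidence = both / ante
    return temporal_fuzzy_support + confidence

pizza_m = [0.21, 0.58, 0.42, 0.0, 0.0]  # antecedent memberships, TIDs 1 to 5
beer_h = [0.0, 0.58, 0.42, 0.0, 0.0]    # consequent memberships, TIDs 1 to 5
print(round(fitness(pizza_m, beer_h, 1, 5), 3))  # 1.076
```

The sketch does not guard against e_u = e_l or an antecedent with zero total membership, both of which would divide by zero.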
Iterative Rule Learning
GA is run many times.
Best rule from each run of GA is stored.
Previously discovered rules penalised in fitness function.
[Flowchart: Begin; Run GA; Add to rule set; decision "Max. rules?" with branches Yes and No; End]
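The loop described above can be sketched as follows. The GA itself is replaced by a stand-in that picks the best candidate from a fixed table of fitness values, and the penalty is assumed to set the fitness of an already discovered rule to zero; both are simplifications for illustration, and all names and values are hypothetical.

```python
def run_ga_stub(candidates, found):
    """Stand-in for one GA run: return the candidate with the best penalised fitness."""
    def penalised(rule):
        return 0.0 if rule in found else candidates[rule]  # assumed form of the penalty
    return max(candidates, key=penalised)

def iterative_rule_learning(candidates, max_rules):
    """Run the search repeatedly, storing the best rule from each run."""
    rule_set = []
    while len(rule_set) < max_rules:
        rule_set.append(run_ga_stub(candidates, rule_set))
    return rule_set

# Hypothetical candidate rules with made-up fitness values.
candidates = {
    "pizza.h => beer.h": 1.4,
    "cheese.l => pizza.l": 1.1,
    "beer.m => cheese.m": 0.6,
}
print(iterative_rule_learning(candidates, 2))
# ['pizza.h => beer.h', 'cheese.l => pizza.l']
```

Because the first rule is penalised on the second run, the search is pushed towards a different rule instead of rediscovering the same one.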
4 Experiments
Methodology
Aims:
1 Improve existing rules discovered with the traditional approach.
2 Discover new rules not discovered with the traditional approach.
Compare rules produced from GA (CHC) and traditional exhaustive search (FuzzyApriori).
[Diagram: One dataset; Define membership functions and linguistic labels; Enumerate partitions of dataset; CHC; FuzzyApriori]
Dataset
IBM Quest synthetic dataset.
Benchmark dataset for association rule mining.
Parameters: 10,000 transactions, 64 items and quantities in the range 1–20.
General results
Measure                          GA       FuzzyApriori
Number of rules                  10000    90325
Average temporal fuzzy support   0.025    0.031
Average confidence (%)           99.986   24.187
Mode of dataset partitions       100      100
General results in pictures
What rules have improved?
GA (CHC) found rules that were also discovered with exhaustive search (FuzzyApriori).
Temporal Fuzzy Support
                      Decrease (%)  Increase (%)  Total (%)
CHC and FuzzyApriori  10.49         10.78         21.27
Only CHC              4.26          74.47         78.73
What rules are new?
GA (CHC) found rules that were NOT discovered with exhaustive search (FuzzyApriori).
Temporal Fuzzy Support
                      Decrease (%)  Increase (%)  Total (%)
CHC and FuzzyApriori  10.49         10.78         21.27
Only CHC              4.26          74.47         78.73
Why were the new rules lost?
FuzzyApriori discarded rules that fell below minimum thresholds.
Temporal Fuzzy Support
                             Decrease (%)  Increase (%)  Total (%)
Below min. temporal support  3.73          73.98         77.71
Below min. confidence        0.53          0.49          1.02
Total                                                    78.73
What rules are now above the min. thresholds?
Rules that are above minimum thresholds after CHC.
Temporal Fuzzy Support
                             Decrease (%)  Increase (%)  Total (%)
Below min. temporal support  0             24.65         24.65
Below min. confidence        0.23          0.50          0.73
Total                                                    25.38
What does a rule look like?
Endpoints: 9300–9400
Rule: IF quantity of Item38 is (medium, -0.422) THEN quantity of Item12 is (medium, 0.315)
[Figure: two plots of µ (0 to 1) against Quantity (1 to 20). Item38: medium displaced by α = −0.422 to give (medium, −0.422). Item12: medium displaced by α = 0.315 to give (medium, 0.315)]
Summary
Temporal rules can be lost by fixing membership functions.
2-tuple provides the flexibility required to discover these rules.
Analysis has unearthed lost rules.
Real-world datasets . . .
Thank you
Stephen G. Matthews
◮ [email protected]
◮ www.slideshare.net/stephengmatthews
“Fuzzy Logic: An Introduction”
◮ https://www.youtube.com/watch?v=P8wY6mi1vV8
◮ http://www.cci.dmu.ac.uk/news-archive/212/