BCS Talk - What sells well and when?

What sells well and when? Stephen G. Matthews Centre for Computational Intelligence (CCI) De Montfort University 7th February 2012 / BCS meeting Stephen G. Matthews (DMU) What sells well and when? BCS meeting 1 / 36

description

A 1 hour talk given to the BCS on 7th February about using computational intelligence in data mining.

Transcript of BCS Talk - What sells well and when?

Page 1: BCS Talk - What sells well and when?

What sells well and when?

Stephen G. Matthews

Centre for Computational Intelligence (CCI), De Montfort University

7th February 2012 / BCS meeting

Stephen G. Matthews (DMU) What sells well and when? BCS meeting 1 / 36

Page 2: BCS Talk - What sells well and when?

Outline

1 Background

2 The Problem

3 The Solution

4 Experiments

Page 4: BCS Talk - What sells well and when?

Association Rules

Significant correlations between items in datasets.

Uses: positioning stock on shelves, inventory control and cross-selling.

Descriptive data mining.

Example Rule: 20% of customers matched the rule

IF pizza THEN beer

Page 5: BCS Talk - What sells well and when?

Terminology and formal description

A dataset contains a set of N transactions T = {t1, t2, ..., tN}.

Each transaction comprises a subset of items, referred to as an itemset, from M items I = {i1, i2, ..., iM}.

Support count measures the number of transactions containing an itemset by

$$\text{Support count, } \sigma(X) = |\{t_i \mid X \subseteq t_i,\ t_i \in T\}|. \tag{1}$$

The support-confidence framework:

$$\text{Support, } s(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{N}; \tag{2}$$

$$\text{Confidence, } c(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{\sigma(X)}. \tag{3}$$

Page 9: BCS Talk - What sells well and when?

Example

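TID  Cheese  Beer  Pizza
1    1       1     1
2    1       0     1
3    0       1     1
4    0       1     0
5    1       0     0

$$\text{Support}(\text{pizza} \Rightarrow \text{beer}) = \frac{\sigma(\text{pizza} \cup \text{beer})}{N} = \frac{2}{5} = 0.4$$

$$\text{Confidence}(\text{pizza} \Rightarrow \text{beer}) = \frac{\sigma(\text{pizza} \cup \text{beer})}{\sigma(\text{pizza})} = \frac{2}{3} = 0.\dot{6}$$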
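The two figures in this example can be checked with a short Python sketch. The set-based layout of the transactions and the helper names `support_count`, `support` and `confidence` are illustrative choices, not something taken from the talk.

```python
# Transactions from the example table; each is the set of items bought.
transactions = [
    {"cheese", "beer", "pizza"},  # TID 1
    {"cheese", "pizza"},          # TID 2
    {"beer", "pizza"},            # TID 3
    {"beer"},                     # TID 4
    {"cheese"},                   # TID 5
]


def support_count(itemset, transactions):
    """sigma(X): number of transactions that contain every item in X."""
    return sum(1 for t in transactions if itemset <= t)


def support(antecedent, consequent, transactions):
    """s(X => Y) = sigma(X u Y) / N."""
    return support_count(antecedent | consequent, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """c(X => Y) = sigma(X u Y) / sigma(X)."""
    return (support_count(antecedent | consequent, transactions)
            / support_count(antecedent, transactions))


s = support({"pizza"}, {"beer"}, transactions)
c = confidence({"pizza"}, {"beer"}, transactions)
print(f"support = {s:.3f}, confidence = {c:.3f}")
```

Pizza and beer appear together in TIDs 1 and 3, and pizza appears in TIDs 1, 2 and 3, which gives the 2/5 and 2/3 shown on the slide.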

Page 11: BCS Talk - What sells well and when?

Fuzzy Association Rules

Represent quantities of items with words.

Interpretable and comprehensible.

Uncertainty in data (e.g., web server log) and linguistic uncertainty (human interpretation).

Imprecision in data (physical measurements from weighing goods in a butcher's, a fishmonger's and a sweet shop).

Example Rule: 20% of customers matched the rule

IF quantity of pizza is high THEN quantity of beer is high

Page 12: BCS Talk - What sells well and when?

What are Fuzzy Sets?

Fuzzy sets have elements that have degrees of membership in [0, 1].

Crisp boundary problem.

[Figure: membership µ (from 0 to 1) plotted against height (6ft to 7ft) for (a) a crisp set tall and (b) a fuzzy set tall.]
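The contrast between the two panels can be sketched in Python. The 6ft crisp threshold and the linear ramp from 6ft to 7ft are assumptions read off the axis labels; the slide does not give the exact shapes, and the function names are illustrative.

```python
def crisp_tall(height_ft, threshold_ft=6.0):
    """Crisp set: a height is either tall (1.0) or not tall (0.0).

    The 6ft threshold is an assumption for illustration.
    """
    return 1.0 if height_ft >= threshold_ft else 0.0


def fuzzy_tall(height_ft, start_ft=6.0, full_ft=7.0):
    """Fuzzy set: membership rises linearly from 0 at start_ft to 1 at full_ft.

    The linear ramp and its endpoints are assumptions for illustration.
    """
    if height_ft <= start_ft:
        return 0.0
    if height_ft >= full_ft:
        return 1.0
    return (height_ft - start_ft) / (full_ft - start_ft)


for h in (5.99, 6.0, 6.5, 7.0):
    print(h, crisp_tall(h), fuzzy_tall(h))
```

Under these assumptions the crisp set jumps from 0 to 1 between 5.99ft and 6ft, which is the crisp boundary problem, while the fuzzy set changes gradually.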

“Fuzzy Logic: An Introduction” - Award-winning video from the CCI

◮ https://www.youtube.com/watch?v=P8wY6mi1vV8
◮ http://www.cci.dmu.ac.uk/news-archive/212/

Page 13: BCS Talk - What sells well and when?

Example

TID  Cheese  Beer  Pizza
1    1       0     2
2    0       16    5
3    0       15    6
4    7       8     0
5    2       8     1

[Figure: membership functions low, medium and high for µ (0 to 1) over Quantity (1 to 20), with quantity 16 marked at µ = 0.58.]

Page 14: BCS Talk - What sells well and when?

Example

       Cheese              Beer                Pizza
TID    l     m     h       l     m     h       l     m     h
1      1     0     0       0     0     0       0.79  0.21  0
2      0     0     0       0     0.42  0.58    0.42  0.58  0
3      0     0     0       0     0.58  0.42    0.58  0.42  0
4      0.37  0.63  0       0.26  0.74  0       0     0     0
5      0.79  0.21  0       0.26  0.74  0       1     0     0

$$\text{FuzzySupport}(\text{cheese.l} \Rightarrow \text{pizza.l}) = \frac{\sum_{i=1}^{5} \min(\text{cheese.l}, \text{pizza.l})}{N} = \frac{0.79 + 0.79}{5} = 0.316$$
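The fuzzy support calculation can be reproduced with a few lines of Python. This is a minimal sketch: the list layout (one membership degree per TID) and the function name `fuzzy_support` are illustrative, and the values are copied from the cheese.l and pizza.l columns of the table.

```python
# Membership degrees from the fuzzified table, one entry per TID (1..5).
cheese_low = [1.0, 0.0, 0.0, 0.37, 0.79]
pizza_low = [0.79, 0.42, 0.58, 0.0, 1.0]


def fuzzy_support(antecedent, consequent):
    """Sum of per-transaction min(antecedent, consequent), divided by N."""
    if len(antecedent) != len(consequent):
        raise ValueError("membership lists must cover the same transactions")
    return sum(min(a, c) for a, c in zip(antecedent, consequent)) / len(antecedent)


print(round(fuzzy_support(cheese_low, pizza_low), 3))
```

Only TIDs 1 and 5 contribute a non-zero minimum (0.79 each), so the sum is 1.58 and the support is 0.316.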

Page 16: BCS Talk - What sells well and when?

Temporal Association Rules

Lifespan: Occurs over a period of time, e.g., one week.

Cyclic: Recurs at regular intervals.

Calendar: Occurs in periods defined with a calendar, e.g., 1st January 1970.

Page 17: BCS Talk - What sells well and when?

Outline

1 Background

2 The Problem

3 The Solution

4 Experiments

Page 18: BCS Talk - What sells well and when?

The goal

Example Rule: 20% of customers matched the following rule on one Friday evening

IF quantity of pizza is high THEN quantity of beer is high

Page 19: BCS Talk - What sells well and when?

Traditional Approach

1 Define linguistic labels and membership function parameters (clustering, GA and uniform partitioning).

2 Mine rules using the linguistic labels and membership functions, e.g., cheese.l, cheese.m, cheese.h, beer.l, beer.m, . . .

[Figure: membership functions low, medium and high for µ (0 to 1) over Quantity (1 to 20).]

Assumes that the membership functions stay the same throughout the entire dataset . . .
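As a rough illustration of step 1 with uniform partitioning, the sketch below places three triangular membership functions evenly over the quantity range. The triangular shape, the range 1 to 20 and the helper names `triangular`, `uniform_partition` and `fuzzify` are assumptions made for this example; the talk does not state the exact parameters.

```python
def triangular(x, left, peak, right):
    """Triangular membership function with its apex at `peak`."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def uniform_partition(lo, hi, labels):
    """Place one triangular function per label, with evenly spaced peaks.

    Each function reaches 0 at the peaks of its neighbours, so the
    memberships of adjacent labels sum to 1 inside [lo, hi].
    """
    step = (hi - lo) / (len(labels) - 1)
    return {
        label: (lo + (k - 1) * step, lo + k * step, lo + (k + 1) * step)
        for k, label in enumerate(labels)
    }


def fuzzify(x, partition):
    """Membership degree of quantity x in every linguistic label."""
    return {label: triangular(x, *params) for label, params in partition.items()}


partition = uniform_partition(1, 20, ["low", "medium", "high"])
print({label: round(mu, 2) for label, mu in fuzzify(16, partition).items()})
```

With these assumed parameters a quantity of 16 comes out at roughly 0.42 medium and 0.58 high, which agrees with the beer entry for TID 2 in the fuzzified example table. Once fixed, the same `partition` is applied to every transaction, which is the assumption the slide calls out.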

Page 21: BCS Talk - What sells well and when?

Losing Rules

TID  Cheese  Beer  Pizza
1    1       0     2
2    0       16    5
3    0       15    6
4    7       8     0
5    2       8     1

Quantities at intersection of membership function boundaries.

[Figure: membership functions low, medium and high for µ (0 to 1) over Quantity (1 to 20), with quantity 16 marked at µ = 0.58.]

Page 22: BCS Talk - What sells well and when?

Losing Rules

Less prominent across entire dataset, but more prominent in a temporal period.

Rigid definition of membership functions.

[Figure: membership functions low, medium and high for µ (0 to 1) over Quantity (1 to 20).]

Page 23: BCS Talk - What sells well and when?

Example

       Cheese              Beer                Pizza
TID    l     m     h       l     m     h       l     m     h
1      1     0     0       0     0     0       0.79  0.21  0
2      0     0     0       0     0.42  0.58    0.42  0.58  0
3      0     0     0       0     0.58  0.42    0.58  0.42  0
4      0.37  0.63  0       0.26  0.74  0       0     0     0
5      0.79  0.21  0       0.26  0.74  0       1     0     0

$$\text{FuzzySupport}(\text{cheese.l} \Rightarrow \text{pizza.l}) = \frac{\sum_{i=1}^{5} \min(\text{cheese.l}, \text{pizza.l})}{N} = \frac{0.79 + 0.79}{5} = 0.316$$

$$\text{FuzzySupport}(\text{pizza.m} \Rightarrow \text{beer.h}) = \frac{\sum_{i=1}^{5} \min(\text{pizza.m}, \text{beer.h})}{N} = \frac{0.58 + 0.42}{5} = 0.2$$

Page 25: BCS Talk - What sells well and when?

Outline

1 Background

2 The Problem

3 The Solution

4 Experiments

Page 26: BCS Talk - What sells well and when?

2-tuple Linguistic Representation

Displace the membership function left or right.

Overcomes lack of flexibility.

Maintains interpretability whilst discovering more temporal rules.

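[Figure: membership functions s0, s1 and s2 for µ (0 to 1) over Quantity (1 to 20); s1 is shown displaced by α = −0.3, giving the 2-tuple (s1, −0.3).]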
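A lateral displacement can be sketched as a sideways shift of a triangular membership function. The sketch assumes that α is expressed as a fraction of the distance between adjacent label peaks, which is a common convention for the 2-tuple representation but is not stated on the slide; the uniform partition of 1 to 20 and the names `triangular` and `displace` are likewise illustrative.

```python
def triangular(x, left, peak, right):
    """Triangular membership function with its apex at `peak`."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def displace(params, alpha, spacing):
    """Shift a (left, peak, right) triangle sideways by alpha * spacing.

    Assumption: alpha is a fraction of the distance between label peaks,
    so a negative alpha moves the function left and a positive one right.
    """
    shift = alpha * spacing
    left, peak, right = params
    return (left + shift, peak + shift, right + shift)


# Assumed uniform partition of quantities 1..20 into s0, s1 and s2.
spacing = 9.5
s1 = (1.0, 10.5, 20.0)
s1_shifted = displace(s1, -0.3, spacing)  # the 2-tuple (s1, -0.3)

print([round(v, 2) for v in s1_shifted])
print(round(triangular(8, *s1), 3), round(triangular(8, *s1_shifted), 3))
```

The label keeps its name and shape, which is why interpretability is maintained; only its position changes, so a quantity such as 8 can match the displaced s1 more strongly than the original one.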

Page 27: BCS Talk - What sells well and when?

Search for Rules with a Genetic Algorithm (GA)

What is a GA?

Search method based on principles of genetics and natural selection.

Solution to a problem encoded in a chromosome.

Many solutions compete in a population.

Performance of solutions measured with fitness function.

Population of solutions evolves over time.

Particularly good in large and complex search spaces.

Page 28: BCS Talk - What sells well and when?

Why is it used for this problem?

Searches for rules.

Combination of different search spaces.

Simultaneously:

◮ Tunes lateral displacements of membership functions.
◮ Discovers a rule.
◮ Discovers temporal period of a rule.

Page 29: BCS Talk - What sells well and when?

Chromosome

$$C = (e_l, e_u, i_1, s_1, \alpha_1, a_1, \ldots, i_k, s_k, \alpha_k, a_k)$$

e_l: lower endpoint

e_u: upper endpoint

i: item (e.g., beer)

s: linguistic label (e.g., high)

α: lateral displacement

a: antecedent/consequent flag

k: number of items in rule
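One way to picture this encoding is as a small Python data structure. The class names `Gene` and `Chromosome`, the field names and the `describe` helper are hypothetical; joining several antecedent items with AND is also an assumption. The example values are the rule shown later in the talk (Item38, Item12, endpoints 9300–9400).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    item: str            # i: item, e.g. "beer"
    label: str           # s: linguistic label, e.g. "high"
    alpha: float         # lateral displacement of the label
    is_antecedent: bool  # a: True for the IF part, False for the THEN part


@dataclass
class Chromosome:
    lower: int           # e_l: lower endpoint of the temporal period
    upper: int           # e_u: upper endpoint of the temporal period
    genes: List[Gene]    # one gene per item in the rule (k genes)

    def describe(self) -> str:
        """Render the encoded rule as text (AND between items is assumed)."""
        def side(flag: bool) -> str:
            return " AND ".join(
                f"quantity of {g.item} is ({g.label}, {g.alpha})"
                for g in self.genes if g.is_antecedent == flag
            )
        return f"[{self.lower}-{self.upper}] IF {side(True)} THEN {side(False)}"


rule = Chromosome(9300, 9400, [
    Gene("Item38", "medium", -0.422, True),
    Gene("Item12", "medium", 0.315, False),
])
print(rule.describe())
```

Keeping the period, the items, the labels and the displacements in a single chromosome is what lets one search tune all of them at once.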

Page 30: BCS Talk - What sells well and when?

Fitness Evaluation

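$$\text{Fitness}(C) = \text{TemporalFuzzySupport}(C) + \text{Confidence}(C) \tag{4}$$

$$\text{Fitness}(C) = \frac{\sum_{j=e_l}^{e_u} \text{FuzzySupport}\big(C_X^{(j)} \cap C_Y^{(j)}\big)}{e_u - e_l} + \frac{\sum_{j=e_l}^{e_u} \text{FuzzySupport}\big(C_X^{(j)} \cap C_Y^{(j)}\big)}{\sum_{j=e_l}^{e_u} \text{FuzzySupport}\big(C_X^{(j)}\big)}. \tag{5}$$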
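The fitness in equation (5) can be sketched in Python as follows. Two assumptions are made and marked in the code: the period is treated as half-open, so that it contains exactly e_u − e_l transactions, and periods with no antecedent support score 0. The function name `fitness` and the list layout are illustrative; the membership values are the pizza.m and beer.h columns from the fuzzified example table earlier in the talk.

```python
def fitness(antecedent, rule, lower, upper):
    """Temporal fuzzy support plus confidence over transactions lower..upper-1.

    antecedent[j]: membership of transaction j in the rule's antecedent X.
    rule[j]:       membership of transaction j in antecedent and consequent
                   together (here the min of the two degrees).
    Assumption: the period is half-open, so it holds upper - lower
    transactions, matching the e_u - e_l denominator.
    """
    rule_sum = sum(rule[lower:upper])
    antecedent_sum = sum(antecedent[lower:upper])
    if upper <= lower or antecedent_sum == 0:
        return 0.0  # assumed convention for empty or unsupported periods
    temporal_support = rule_sum / (upper - lower)
    confidence = rule_sum / antecedent_sum
    return temporal_support + confidence


pizza_medium = [0.21, 0.58, 0.42, 0.0, 0.0]
beer_high = [0.0, 0.58, 0.42, 0.0, 0.0]
both = [min(a, c) for a, c in zip(pizza_medium, beer_high)]

print(round(fitness(pizza_medium, both, 0, 5), 3))  # whole dataset
print(round(fitness(pizza_medium, both, 1, 3), 3))  # TIDs 2 and 3 only
```

On this toy data the rule scores higher when the period is restricted to TIDs 2 and 3 than over all five transactions, which is the effect the GA exploits when it searches for the endpoints.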

Page 32: BCS Talk - What sells well and when?

Iterative Rule Learning

GA is run many times.

Best rule from each run of GA is stored.

Previously discovered rules penalised in fitness function.

[Flowchart: Begin, Run GA, Add to rule set, Max. rules? (No: run GA again; Yes: End).]
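The loop can be sketched in Python with a stand-in for the GA. Everything here is hypothetical scaffolding: `run_ga` simply picks the best-scoring candidate from a fixed list instead of evolving a population, and the penalty of 1.0 for previously discovered rules is an arbitrary choice, since the talk does not say how the penalty is computed.

```python
def run_ga(candidates, discovered):
    """Stand-in for one GA run: return the name of the fittest candidate.

    Each candidate is (rule_name, fitness). Rules that are already in
    `discovered` are penalised so that later runs find different rules.
    """
    def penalised(candidate):
        name, score = candidate
        return score - (1.0 if name in discovered else 0.0)

    return max(candidates, key=penalised)[0]


def iterative_rule_learning(candidates, max_rules):
    """Run the GA repeatedly, storing the best rule from each run."""
    rule_set = []
    while len(rule_set) < max_rules:          # "Max. rules?" check
        best = run_ga(candidates, rule_set)   # "Run GA"
        rule_set.append(best)                 # "Add to rule set"
    return rule_set


candidates = [("rule A", 1.5), ("rule B", 1.2), ("rule C", 0.9)]
print(iterative_rule_learning(candidates, max_rules=3))
```

Because each stored rule is penalised in later runs, the three runs return three different rules rather than the same best rule three times.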

Page 33: BCS Talk - What sells well and when?

Outline

1 Background

2 The Problem

3 The Solution

4 Experiments

Page 34: BCS Talk - What sells well and when?

Methodology

Aims:

1 Improve existing rules discovered with traditional approach

2 Discover new rules not discovered with traditional approach

Compare rules produced from GA (CHC) and traditional exhaustive search (FuzzyApriori).

[Diagram: Define membership functions and linguistic labels, then two branches: CHC on one dataset, and FuzzyApriori on enumerated partitions of the dataset.]

Page 35: BCS Talk - What sells well and when?

Dataset

IBM Quest synthetic dataset.

Benchmark dataset for association rule mining.

Parameters: 10,000 transactions, 64 items and quantities in the range 1–20.

Page 36: BCS Talk - What sells well and when?

General results

Measure                          GA      FuzzyApriori
Number of Rules                  10000   90325
Average temporal fuzzy support   0.025   0.031
Average confidence (%)           99.986  24.187
Mode of dataset partitions       100     100

Page 37: BCS Talk - What sells well and when?

General results in pictures

Page 38: BCS Talk - What sells well and when?

What rules have improved?

GA (CHC) found rules that were also discovered with exhaustive search (FuzzyApriori).

Temporal Fuzzy Support
                       Decrease(%)  Increase(%)  Total(%)
CHC and FuzzyApriori   10.49        10.78        21.27
Only CHC               4.26         74.47        78.73

Page 39: BCS Talk - What sells well and when?

What rules are new?

GA (CHC) found rules that were NOT discovered with exhaustive search (FuzzyApriori).

Temporal Fuzzy Support
                       Decrease(%)  Increase(%)  Total(%)
CHC and FuzzyApriori   10.49        10.78        21.27
Only CHC               4.26         74.47        78.73

Page 40: BCS Talk - What sells well and when?

Why were the new rules lost?

FuzzyApriori discarded rules that fell below minimum thresholds.

Temporal Fuzzy Support
                              Decrease(%)  Increase(%)  Total(%)
Below min. temporal support   3.73         73.98        77.71
Below min. confidence         0.53         0.49         1.02
Total                                                   78.73

Page 41: BCS Talk - What sells well and when?

What rules are now above the min. thresholds?

Rules that are above minimum thresholds after CHC.

Temporal Fuzzy Support
                              Decrease(%)  Increase(%)  Total(%)
Below min. temporal support   0            24.65        24.65
Below min. confidence         0.23         0.50         0.73
Total                                                   25.38

Page 42: BCS Talk - What sells well and when?

What does a rule look like?

Endpoints: 9300–9400
Rule: IF quantity of Item38 is (medium, −0.422)
      THEN quantity of Item12 is (medium, 0.315)

[Figure: two plots of µ against Quantity (1 to 20). Item38: medium displaced by α = −0.422, giving (medium, −0.422). Item12: medium displaced by α = 0.315, giving (medium, 0.315).]

Page 43: BCS Talk - What sells well and when?

Summary

Temporal rules can be lost by fixing membership functions.

2-tuple provides the flexibility required to discover these rules.

Analysis has unearthed lost rules.

Real-world datasets . . .

Page 44: BCS Talk - What sells well and when?

Thank you

Stephen G. Matthews
◮ [email protected]
◮ www.slideshare.net/stephengmatthews

“Fuzzy Logic: An Introduction”
◮ https://www.youtube.com/watch?v=P8wY6mi1vV8
◮ http://www.cci.dmu.ac.uk/news-archive/212/
