multi-armed bandit
Transcript of multi-armed bandit
Intuit Confidential and Proprietary 1
CTG Data Science Lab, August 17, 2016
Multi-armed Bandit Problem: Potential Improvement for DARTS
Aniruddha Bhargava, Yika Yujia Luo
Agenda
1. Problem Overview
2. Algorithms
Non-contextual cases
Contextual cases
3. Industry Review
4. Advanced Topics
Problem Overview
When do we run into the Multi-armed Bandit Problem (MAB)?
Gambling, Research Funding, Clinical Trials, Content Management
What is the Multi-armed Bandit Problem (MAB)?
Goal: pick the best restaurant efficiently.
Logistics: select a restaurant for each person, who leaves you a tip afterwards. How?
[Figure: tips collected at three restaurants over repeated visits, with running averages of $2, $7, and $6]
MAB Terminology
Arm: a restaurant.
User: a person sent to a restaurant.
Reward: the tip.
Expected Reward: the average tip in the long run.
Exploration: the process of learning people's preferences; it always involves a certain degree of randomness.
Exploitation: using the current, reliable knowledge of a parameter to select a restaurant.
Regret: the expected tip loss after sending a person to a restaurant that is not the best.
Policy: the strategy you use to select restaurants.
Total Cumulative Regret: the total tips you lose; the standard performance measure for bandit algorithms.
[Figure: four people sent to restaurants with expected tips of $1, $1, $8, and $10 incur regrets of $9, $9, $2, and $0, for a total regret of $20]
Big Picture
MAB combines decision making and optimization:
Decision making: choose the best product (find the best restaurant to go to).
Optimization: minimize total regret (avoid sending people to bad restaurants as much as possible).
Algorithms (Non-contextual Cases)
“Anytime you are faced with the problem of both exploring and exploiting a search space, you have a bandit problem. Any method of solving that problem is a bandit algorithm”
-- Chris Stucchio
Non-contextual vs. Contextual
[Diagram: a user and a product, with no attributes attached]
IMPORTANT THING HERE: although everyone has different tastes, we pick one best restaurant for everyone.
MAB Policies
A/B Testing
Adaptive: ε-greedy, Upper Confidence Bound (UCB), Thompson Sampling
There are more bandit algorithms...
A/B Testing
Exploration: each person i is routed 100% at random, 33.3% to each restaurant.
Exploitation: every later person j is then sent 100% to the winning restaurant.
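The two phases above can be sketched as an explore-then-commit loop (a minimal sketch; `pull`, standing in for the tip each person leaves, is a hypothetical function):

```python
import random

def ab_test(pull, n_arms, explore_rounds, exploit_rounds):
    """Explore uniformly at random, then commit to the best average."""
    sums = [0.0] * n_arms
    counts = [0] * n_arms
    # Exploration: route each person to a uniformly random restaurant.
    for _ in range(explore_rounds):
        arm = random.randrange(n_arms)
        sums[arm] += pull(arm)
        counts[arm] += 1
    # Exploitation: send every later person to the best-looking restaurant.
    best = max(range(n_arms), key=lambda a: sums[a] / max(counts[a], 1))
    total = sum(sums) + sum(pull(best) for _ in range(exploit_rounds))
    return best, total
```

With average tips of $2, $7, and $6, the commit phase locks onto the middle restaurant once the exploration phase has sampled it enough.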
ε-greedy
Select (ε = 0.2): with probability 80%, person i goes to the restaurant with the highest average tips; with probability 20%, to a random restaurant (33.3% each).
Update: record person i's feedback and update that restaurant's average tips value.
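The select/update loop can be sketched as follows (a minimal sketch; `pull` is a hypothetical stand-in for the tip a person leaves):

```python
import random

def epsilon_greedy(pull, n_arms, rounds, eps=0.2):
    """With probability eps pick a random arm; otherwise exploit the arm
    with the highest average reward observed so far."""
    sums = [0.0] * n_arms
    counts = [0] * n_arms
    for _ in range(rounds):
        if random.random() < eps or 0 in counts:
            arm = random.randrange(n_arms)   # explore (20% of the time)
        else:                                # exploit the current best average
            arm = max(range(n_arms), key=lambda a: sums[a] / counts[a])
        sums[arm] += pull(arm)               # record this person's feedback
        counts[arm] += 1                     # and update that arm's average
    return max(range(n_arms), key=lambda a: sums[a] / max(counts[a], 1))
```

The `0 in counts` guard simply forces each arm to be tried once before exploiting.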
Upper Confidence Bound (UCB)
Select: person i goes (100%) to the restaurant with the highest upper confidence bound:
  (average tips from restaurant j) + sqrt(2 * ln(#people) / #people who went to restaurant j)
Update: record person i's feedback and update the upper confidence bound of that restaurant's average tips.
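In code, the bound on this slide (the standard UCB1 form) might look like (a minimal sketch; `pull` is a hypothetical tip function):

```python
import math

def ucb1(pull, n_arms, rounds):
    """Play the arm with the highest upper confidence bound:
    average reward + sqrt(2 * ln(#people) / #people sent to that arm)."""
    sums = [0.0] * n_arms
    counts = [0] * n_arms
    for arm in range(n_arms):                # send one person everywhere first
        sums[arm] += pull(arm)
        counts[arm] += 1
    for t in range(n_arms, rounds):
        bounds = [sums[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a])
                  for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: bounds[a])
        sums[arm] += pull(arm)               # record feedback, shrinking
        counts[arm] += 1                     # that arm's confidence bound
    return max(range(n_arms), key=lambda a: counts[a])
```

Note the bonus term shrinks as a restaurant is visited more, so exploration tapers off on its own.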
Thompson Sampling (Bayesian)
Sampling: simulate the 3 restaurants' average-tip distributions (McDonald's, Subway, Chili's) and randomly draw a value from each distribution.
Select: person i goes (100%) to the restaurant with the highest tips from the sampling.
Update: record person i's feedback and update that restaurant's average-tip distribution.
Thompson Sampling (Bayesian)
[Figure: overlapping vs. well-separated tip distributions, with Pr(r < b) = 10% and Pr(r < b) = 0.01%]
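A Bernoulli version of the sampling loop (a minimal sketch: the slide's tip amounts are simplified to win/no-win feedback so each arm can keep a Beta posterior; `pull` is a hypothetical feedback function):

```python
import random

def thompson(pull, n_arms, rounds):
    """Draw one value from each arm's Beta posterior and play the arm
    with the highest draw; update that arm's posterior with the result."""
    wins = [1] * n_arms      # Beta(1, 1) uniform prior for every arm
    losses = [1] * n_arms
    for _ in range(rounds):
        draws = [random.betavariate(wins[a], losses[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: draws[a])
        if pull(arm):        # record this person's feedback and update
            wins[arm] += 1   # that arm's posterior distribution
        else:
            losses[arm] += 1
    return max(range(n_arms), key=lambda a: wins[a] / (wins[a] + losses[a]))
```

As an arm's posterior tightens around a low value, its draws rarely win, so exploration fades naturally.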
Algorithm Comparison
1. Exploration vs. Exploitation
2. Total Regret
3. Batch Update
Algorithm Comparison: Exploration vs. Exploitation
IMPORTANT THING HERE: exploration costs money!
[Figure: exploration (%) against time (%). A/B testing explores 100% and then drops to 0; ε-greedy stays at a constant ε; UCB/Thompson taper exploration gradually]
Algorithm Comparison: Total Regret
[Figure: two panels, Adaptive and A/B Testing, showing traffic allocation across McDonald's/Subway/Chili's (44%/28%/28% and 70%/18%/12%) and total regret over time]
Algorithm Comparison: Batch Update
A/B Testing: very robust. ε-greedy: depends. UCB: not robust. Thompson: robust.
[Diagram: the system asks users questions and stores many answers before updating in a batch]
Algorithm Comparison: Summary
A/B Testing
• Pros: easy to implement; good for a small number of arms; robust to batch update
• Cons: high total regret
ε-greedy
• Pros: easy to implement; if a good ε is found, lower total regret and finds the best arm faster than ε-first
• Cons: need to figure out a good ε; high total regret
UCB
• Pros: good for a large number of arms; finds the best arm fast
• Cons: not robust to batch update
Thompson
• Pros: low total regret; robust to batch update
• Cons: sensitive to statistical assumptions
Non-contextual vs. Contextual
[Diagram: the user now has attributes (Female, Vegetarian, Married, Latino), and so does the product (Burger, Non-Vegetarian, Cheap, Good Service)]
IMPORTANT THING HERE: everyone has different tastes, so we pick one best restaurant for each person.
Agenda
1. Problem Overview
2. Algorithms
Non-contextual cases
Contextual cases
3. Industry Review
4. Advanced Topics
Algorithms (Contextual Bandits)
What do we mean by context?
User side:
• Likes spicy food, refined tastes, plays violin, male, ...
• From Wisconsin, likes German food, likes football, male, ...
• Student, doesn't like seafood, allergic to cats, female, ...
• Chief of AFC, watches shows on competitive eating, female, ...
Arm side:
• Tex-Mex style, sit-down dining, founded in 1975, ...
• Serves sandwiches, has veggie options, founded in 1965, ...
• Breakfast, lunch, and dinner, cheap, founded in 1940, ...
User Context
[Figure: average reward over time for a non-contextual policy vs. a contextual (user) policy, each shown against its best possible reward; user context raises the achievable reward]
Arm Context
[Figure: average reward over time for non-contextual, arm-context-only, user-context, and arm-plus-user-context policies; arm context reaches the ceiling faster, while user context raises the ceiling]
User context can increase the optimal reward; arm context can get you there faster!
Takeaway Message
Exploiting Context
User side:
• Population segmentation (e.g. DARTS)
• Clustering users
• Learning an embedding
Arm side:
• Linear models: LinUCB, Linear TS, OFUL
• Maintain an estimate of the best arm
• More data → shrink uncertainty
Exploiting User Context
Assumptions:
• Users can be represented as points in space
• Users cluster together, so that points that are close are similar
• Stationarity
Exploiting User Context
[Figures: users (Joe, Yao, Nichola, Peter, Aniruddha, Rachel, Sophie, Yika, Vineeta, Jason, Andre, Chris, Madeline, John) plotted on a meat/vegetarian vs. spicy/mild plane. Successive slides separate them with a linear boundary, then a quadratic boundary, and finally a hierarchical partition whose cluster-membership percentages (40%/35%/25% at the top level, refined to splits such as 80%/15%/5%, 5%/50%/45%, 15%/80%/5%, 5%/10%/85%, 10%/45%/45%, and 5%/5%/90%) sharpen as more data arrives]
Exploiting Arm Context: Linear Models
Look at only arm context, no user context.
Assumptions:
• We can represent arms as vectors.
• Rewards are a noisy version of the inner product.
• Stationarity.
Methods include:
• Linear UCB
• Linear Thompson Sampling
• OFUL (Optimism in the Face of Uncertainty - Linear)
• ... and many more.
The Math Slide
Standard noisy linear model: r_t = x_t^T θ* + η_t
θ*: the optimal arm
x_t: arm pulled at time t
r_t: reward at time t
η_t: noise at time t
C_t: confidence set
λ: ridge term
X_t: matrix of all arms pulled up to time t
Collect all the data and write: r = X θ* + η
Least squares solution: θ_LS = (X^T X)^(-1) X^T r
Ridge regression: θ_ridge = (X^T X + λI)^(-1) X^T r
Typical linear bandit algorithm:
θ_0 = 0
for t = 0, 1, 2, ...:
  x_t = argmax_{x ∈ C_t} (x^T θ_t)
  θ_t = (X_t^T X_t + λI)^(-1) X_t^T r_t
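The loop on this slide can be sketched with NumPy, adding the usual optimism bonus derived from the confidence set (a minimal sketch; the three arms, θ*, the noise level, and the constants λ and α are made-up illustration values):

```python
import numpy as np

rng = np.random.default_rng(0)

theta_star = np.array([0.25, 1.0])      # unknown parameter the bandit must learn
arms = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
lam, alpha = 1.0, 1.0                   # ridge term lambda, exploration width

A = lam * np.eye(2)                     # X_t^T X_t + lambda * I, kept incrementally
b = np.zeros(2)                         # X_t^T r_t
for t in range(500):
    theta = np.linalg.solve(A, b)       # ridge estimate (X^T X + lam I)^-1 X^T r
    A_inv = np.linalg.inv(A)
    # Optimistic score: estimated reward plus an uncertainty bonus per arm
    bonus = np.sqrt(np.einsum('ij,jk,ik->i', arms, A_inv, arms))
    x = arms[int(np.argmax(arms @ theta + alpha * bonus))]
    reward = x @ theta_star + 0.1 * rng.standard_normal()   # r_t = x^T theta* + eta_t
    A += np.outer(x, x)                 # more data -> smaller uncertainty
    b += reward * x

best_arm = int(np.argmax(arms @ np.linalg.solve(A, b)))
```

The bonus term shrinks along directions that have been pulled often, which is exactly the "more data → shrink uncertainty" point from the previous slide.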
Exploiting Arm Context
[Figure: the set of arms x_1, x_2, ... (Mince pie, Buffalo wings, Tofu scramble, Grilled vegetables, Ratatouille, Tandoori Chicken, Jalapeno scramble, Pad Thai, Penne Arrabiata) plotted on the meat/vegetarian vs. spicy/mild plane, with θ*, the optimal arm, marked]
Exploiting Arm Context
[Figure: the next arm chosen, Buffalo wings, sits at angle θ to the optimal arm]
The reward (= cos θ) is small, but we can still infer information about other arms!
Exploiting Arm Context
[Figures: after each pull, the estimate θ_1, θ_2, ... of the optimal arm improves and the region of uncertainty C_1, C_2, ... shrinks]
By the second arm x_2 we have already homed in on a pretty good choice, and the process continues ...
Some Caveats
• Big assumption that we know good features.
• Finding features takes a lot of work.
• Few arms, many people → learn an embedding of arms.
• Few people, many arms → featurize, use linear bandits.
• Linear models are a naive assumption; see kernel methods.
Agenda
1. Problem Overview
2. Algorithms
Non-contextual cases
Contextual cases
3. Industry Review
4. Advanced Topics
Industry Review
Companies using MAB
Headlines, Photos and Ads
Washington Post, Google
Used Upper Confidence Bound (UCB) to pick headlines and photos
Washington Post
Google Experiments
• Used Thompson Sampling (TS)
• Updated models twice a day
• Two metrics used to gauge the end of an experiment:
  - 95% confidence that an alternate arm is better, or
  - "potential value remaining in the experiment"
The more arms, the higher the gain over A/B testing.
Takeaway Message
Advanced Topics
Topics
Biasing
Data Joining and Latency
Non-Stationarity
Bias
              Website 1    Website 2
Probability   50% / 50%    90% / 10%
Number sold   100 / 20     100 / 20
Who did better?
Bias
• Be careful when using past data!
• Inverse Propensity Score Matching
• New sales estimates:
Website 1: 100*0.5 + 20*0.5 = 60
Website 2: 100*0.5*(0.5/0.9) + 20*0.5*(0.5/0.1) ≈ 78
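The reweighting above can be written out directly (a minimal sketch reproducing the slide's arithmetic; `ips_estimate` is a hypothetical helper name):

```python
def ips_estimate(sold, logged, target=(0.5, 0.5)):
    """Reweight each product's sales by (target prob / logging prob) so the
    two websites are compared as if both had shown each product 50/50."""
    return sum(n * t * (t / p) for n, p, t in zip(sold, logged, target))

site1 = ips_estimate(sold=[100, 20], logged=[0.5, 0.5])  # 100*0.5 + 20*0.5 = 60
site2 = ips_estimate(sold=[100, 20], logged=[0.9, 0.1])  # upweights the rarely shown product
```

Website 2 comes out ahead (about 78 vs. 60) once its rarely shown product is upweighted.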
Data Joining and Latency
[Diagram: the system logs (context, decision) when it serves a user; rewards arrive only after some latency and must be joined back to the logged decisions. Courtesy: Microsoft MWT white paper]
Non-Stationarity: Beer Example
My yearly beer taste:
January: stouts and porters
April: pale ales and IPAs
July: wits and lagers
October: Oktoberfests and reds
December: Christmas ales
Non-Stationarity
Preferences change over time, and there may be periodicity in the data; tax season is a great example.
Some solutions:
• Slow changes → a system with finite memory
• Abrupt changes → subspace tracking / anomaly detection
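The finite-memory idea can be sketched with a sliding-window average (a minimal sketch; the window length of 3 is an illustration value):

```python
from collections import deque

def windowed_mean(window):
    """Average only the last `window` rewards, so the estimate tracks
    slow drifts in preference instead of remembering all history."""
    buf = deque(maxlen=window)   # old observations age out automatically
    def update(reward):
        buf.append(reward)
        return sum(buf) / len(buf)
    return update

est = windowed_mean(3)
est(10.0); est(10.0); est(10.0)   # stable winter preference
est(0.0); est(0.0); est(0.0)      # tastes shift; the old tens age out
```

After three post-shift observations the window contains only new data, so the estimate has fully adapted.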
Preferences change over time, biases are added, and data needs to be joined from different sources.
Takeaway Message
Thank You. Questions?