Transcript of "On Choosing An Efficient Service Selection Mechanism In Dynamic Environments" by Murat Sensoy & Pinar Yolum, Bogazici University, Istanbul, Turkey.

Page 1:

On Choosing An Efficient Service Selection Mechanism In Dynamic Environments

Murat Sensoy & Pinar Yolum
Bogazici University, Istanbul, Turkey

Page 2:

OUTLINE

- Introduction
- Comparison of different service selection mechanisms
- Problem statement and proposed approach
- Evaluation
- Conclusions

Page 3:

INTRODUCTION

Service Selection Problem: We examine the problem of service selection in an e-commerce setting where consumer agents cooperate to identify the service providers that would best satisfy their service needs.

Page 4:

INTRODUCTION

Using Selective Ratings (SR): Ratings are taken only from agents who have similar demands.

Using Context-Aware Ratings (CAR): The context of each rating is described using an ontology, so ratings are evaluated with respect to their context.

Using Experiences: Instead of ratings, consumers' experiences are represented using ontologies and shared. An experience represents what was demanded and what was provided in response. There are two approaches for using experiences:

- Parametric classification with a Gaussian Model (GM)
- Case-Based Reasoning (CBR)

Page 5:

Comparison of Service Selection Methods

Simulations: 20 service providers, 400 service consumers, repeated 10 times.

Performance measures:
- Ratio of satisfaction (the fraction of service decisions that result in satisfaction)
- Time required for service selection

Page 6:

SIMULATION ENVIRONMENT

Several factors are varied in the simulations:

Variations in service demand (PCD): Each service consumer changes its demand characteristics after receiving a service, with a predefined probability denoted PCD.

Variations in service quality (PI): With a very small probability, providers deviate from their expected behavior in favor of the consumers (they produce an absolutely satisfactory service). This probability is called the probability of indeterminism (PI).

Page 7:

SIMULATION ENVIRONMENT

Variations in service satisfaction: The misleading similarity factor (β) is roughly the ratio of service consumers who have similar service demands but conflicting satisfaction criteria.

Example: β = 0.5 means that half of the consumers having similar demands have conflicting satisfaction criteria. In this case, half of the ratings given for this demand will be misleading. (A toy code sketch of these three factors follows.)
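As a toy sketch only, one simulation step driven by PCD, PI, and β might look like the following in Python; the constant values, the demand set, and the provider "quality" field are illustrative assumptions, not taken from the paper.

```python
# Toy sketch of one simulation step driven by the three environment factors;
# PCD, PI, and BETA values here are illustrative, not from the paper.
import random

PCD, PI, BETA = 0.1, 0.05, 0.5

def simulation_step(consumer, provider):
    # With probability PCD, the consumer changes its demand characteristics.
    if random.random() < PCD:
        consumer["demand"] = random.choice(["book", "music", "bicycle"])
    # With probability PI, the provider deviates in favor of the consumer
    # and produces an absolutely satisfactory service.
    satisfied = random.random() < PI or provider["quality"] > 0.5
    # With probability BETA, a rater with a similar demand has conflicting
    # satisfaction criteria, so its rating for this demand is misleading.
    rating_is_misleading = random.random() < BETA
    return satisfied, rating_is_misleading
```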

Page 8:

RATIO OF SATISFACTION

Configuration                                   Bad Performance     Good Performance
PCD = 0, β = 0, PI = 0                          { }                 {SR, CAR, CBR, GM}
Consumers vary their demands (PCD > 0)          {SR}                {CAR, CBR, GM}
Tastes of the consumers vary (β > 0)            {SR, CAR}           {CBR, GM}
Indeterminism (PI > 0)                          {CBR}               {SR, CAR, GM}
PCD > 0, β > 0, PI > 0                          {SR, CAR, CBR}      {GM}

Page 9:

TIME CONSUMPTION

Method    Average Time Consumption (msec)
SR        0.9
CAR       10.6
CBR       502.6
GM        2432.9

T_SR < T_CAR < T_CBR < T_GM

Page 10:

PROBLEM

A number of different service selection methods have been briefly described.

Each of these approaches has different strengths and weaknesses in different configurations of the environment.

The configuration of the environment is not observable: consumers can only observe the outcomes of their service selections.

How will an agent select among these methods, given its trade-offs and a partially observable environment?

Page 11:

Using Reinforcement Learning To Choose A Service Selection Mechanism Dynamically

Reinforcement learning (RL) is an ideal learning technique to enable agents to learn the environment and thus decide on which strategy to use in a particular situation.

Hence, we propose to use RL for choosing a service selection mechanism in dynamic environments.

Page 12:

Basics of Reinforcement Learning

In RL, an agent interacts with the environment.

Page 13:

Basics of Reinforcement Learning

The agent partially observes the states of the environment.

Page 14:

Basics of Reinforcement Learning

The agent has a number of actions to take in a given state of the environment.

Page 15:

Basics of Reinforcement Learning

As a result of this action, a new state of the environment is observed.

Page 16:

Basics of Reinforcement Learning

… and a reward is given. The purpose of RL is to construct an optimal action policy that maximizes the total reward (i.e., finds the best service providers throughout).

Page 17:

SERVICE SELECTION & RL

Actions are choosing one of the different service selection mechanisms (e.g., choosing context-aware ratings).

Rewards are computed using the result of the current service selection mechanism and the trade-offs of the agent.

States of the environment are observed in terms of the consequences of the agent's actions.

In order to use standard RL techniques, we need a reward function and a set of discrete states.

Page 18:

Reward Function

The reward function reflects the trade-offs of the service consumers. The reward function used in this study gives:

- A negative reward after choosing an action if there is another action with an expected ratio of satisfaction that is at least 10% better than that of the chosen action.
- A negative reward if the chosen action is at least 10% slower than another action whose ratio of satisfaction is at most 1% worse than that of the chosen action.

(A minimal code sketch of this reward function follows.)
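As a minimal sketch, the reward function above might be written as follows in Python. The 10% and 1% thresholds come from the slide; the penalty magnitude of -1.0 and the dictionary-based inputs are illustrative assumptions.

```python
def reward(chosen, R, T):
    """Return a (possibly negative) reward for choosing a selection mechanism.

    chosen -- name of the chosen mechanism, e.g. "CAR"
    R      -- dict: mechanism -> expected ratio of satisfaction, e.g. {"SR": 0.5, ...}
    T      -- dict: mechanism -> average selection time in msec
    """
    penalty = 0.0
    for other in R:
        if other == chosen:
            continue
        # Another action's expected satisfaction is at least 10% better.
        if R[other] >= 1.10 * R[chosen]:
            penalty -= 1.0  # penalty magnitude is an assumption
        # The chosen action is at least 10% slower than an action whose
        # satisfaction is at most 1% worse than the chosen one.
        if T[chosen] >= 1.10 * T[other] and R[other] >= 0.99 * R[chosen]:
            penalty -= 1.0
    return penalty
```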

Page 19:

States

Although we can parameterize the environment in our simulations (using β, PCD, PI), in real life these parameters are not visible to consumer agents.

Consumer agents observe the environment through the consequences of their actions.

Therefore, states of the environment are coded using the expected ratio of satisfaction of the known service selection mechanisms (actions), i.e., (R_SR, R_CAR, R_CBR, R_GM).

For example, if the consumer observes that R_SR = 0.5, R_CAR = 0.9, R_CBR = 0.7, and R_GM = 0.95, then the agent observes the state of the environment as (0.5, 0.9, 0.7, 0.95).

Page 20:

States

Different values of (R_SR, R_CAR, R_CBR, R_GM) may represent different states of the environment.

This results in a continuous state-space, which must be discretized in order to use standard reinforcement learning approaches.

We propose to use the k-means clustering algorithm to incrementally create discrete states, each of which encapsulates a portion of the continuous state-space. (A simplified code sketch follows.)
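A simplified Python sketch of this incremental discretization, assuming Euclidean distance and a fixed variance threshold; the split heuristic (moving the farthest point into a new cluster) and the threshold value are illustrative, and the paper's incremental k-means may differ in detail.

```python
import math

class StateDiscretizer:
    """Incrementally map continuous observations to discrete states."""

    def __init__(self, variance_threshold=0.05):
        self.threshold = variance_threshold   # assumed value
        self.clusters = []                    # each cluster: list of observation tuples

    @staticmethod
    def _dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    @staticmethod
    def _center(points):
        dims = len(points[0])
        return tuple(sum(p[i] for p in points) / len(points) for i in range(dims))

    def observe(self, obs):
        """Return the discrete state id for obs = (R_SR, R_CAR, R_CBR, R_GM)."""
        if not self.clusters:                 # initially a single cluster
            self.clusters.append([obs])
            return 0
        centers = [self._center(c) for c in self.clusters]
        state = min(range(len(centers)), key=lambda i: self._dist(obs, centers[i]))
        cluster = self.clusters[state]
        cluster.append(obs)
        # If the within-cluster variance exceeds the threshold, split off the
        # farthest point into a new cluster (and thus a new state).
        center = self._center(cluster)
        variance = sum(self._dist(p, center) ** 2 for p in cluster) / len(cluster)
        if variance > self.threshold and len(cluster) > 1:
            far = max(cluster, key=lambda p: self._dist(p, center))
            cluster.remove(far)
            self.clusters.append([far])
        return state

# Example: the initial observation from the slides creates the first state.
disc = StateDiscretizer()
assert disc.observe((0.5, 0.7, 0.9, 0.95)) == 0
```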

Page 21:

Discretization Example

Initially there is only one state, represented by a single cluster.

Initial observation of the agent: (0.5, 0.7, 0.9, 0.95)

Page 22:

Discretization Example

Observations of the agent are encapsulated by this cluster.

Page 23:

Discretization Example

If the within-cluster variance exceeds a predefined threshold, a new state and a corresponding new cluster are created.

Pages 24-27: Discretization Example (figure-only slides).

Page 28:

Determination of the Current State

Given the current observation of the agent and the states with their corresponding clusters, how can we determine the current state of the environment?

[Figure: three clusters labeled State 1, State 2, and State 3, plus the current observation.]

Page 29:

Determination of the Current State

The Euclidean distance from the current observation to the center of each cluster is computed. The current state is the nearest state.

[Figure: distances of 0.1, 0.15, and 0.2 from the observation to States 1, 2, and 3; the current state is State 1.]

Page 30:

Determination of the Current State

Then, k-means is used to update the clusters and compute the new cluster centers. If necessary, a new cluster (and a new state) is created.

[Figure: the clusters after the update.]

Page 31:

EVALUATION

We perform several runs to evaluate the performance of the proposed approach. In the first 8 runs, the environment has only one configuration throughout the simulations. In the last run, the environment is changed from the 1st configuration to the 8th configuration during the simulations.

Results (comparing the ratio of satisfaction, R, and the time consumption, T, of the proposed RL approach against GM):
- Same performance as GM, but 114 times faster.
- Almost the same performance as GM, but 46 times faster.
- Same performance as GM, but 95 times faster.
- Performance is 10% less than that of GM; the primary choice of RL is GM.
- When we combine different configurations, the performance of RL is slightly less than that of GM, and it is 32 times faster than GM.

[Figures: performance of modeling each provider using GM, performance of the proposed approach, and the ratio of the time consumptions of GM and the proposed approach.]

Page 32:

CONCLUSION

Our approach allows agents to learn how to choose the most useful service selection mechanism among different alternatives in dynamic environments.

Our experiments show that consumers choose the most useful service selection mechanism using the proposed approach.

The performance of the proposed approach does not go below the lower-bound defined by the trade-offs of the consumers.

As future work, we plan to enable online addition of new service selection mechanisms. We also plan to enable agents to share their observations of the environment.

Page 34:

Comparison of Service Selection Methods

[Table: configuration of the environment vs. performance of the methods in terms of ratio of satisfaction.]

Each approach has the same performance.

Page 35:

Comparison of Service Selection Methods

[Table: configuration of the environment vs. performance of the methods in terms of ratio of satisfaction.]

The performance of the rating-based approach decreases when consumers vary their demands.

Page 36:

Comparison of Service Selection Methods

[Table: configuration of the environment vs. performance of the methods in terms of ratio of satisfaction.]

The performances of the rating-based approach and context-aware ratings decrease when the tastes of the consumers vary significantly.

Page 37:

Comparison of Service Selection Methods

[Table: configuration of the environment vs. performance of the methods in terms of ratio of satisfaction.]

The performance of GM is high and does not change across different configurations of the environment.

The performance of the CBR approach decreases when providers produce services with a little indeterminism.

Page 38:

Comparison of Service Selection Methods

[Table: configuration of the environment vs. time consumption of the methods (msec).]

T_SR < T_CAR < T_CBR < T_GM

Page 39:

INTRODUCTION

Previous approaches to service selection are mainly based on ratings.

Ratings have two major drawbacks:
- Ratings disregard the context of the service demands.
- Ratings reflect the satisfaction criteria and taste of the raters.

Page 40:

Enrich ratings with context information (Context-Aware Rating):

- Define the context using an ontology and attach it to the rating.
- Consumers aggregate ratings from contexts that are similar to their current context.

Example: A consumer wants to buy a book.
- Context: buying a book
- Some context-aware ratings:
  - Positive rating for buying a book from Amazon.
  - Negative rating for buying a bicycle from Amazon.

Page 41:

MAIN IDEA OF EXPERIENCES

Ratings reflect the satisfaction criteria and taste of the raters.

Instead of negative/positive ratings: "Tell me about your experiences and let me evaluate them on my own."

Page 42:

EXPERIENCES

An experience of a consumer contains:
- The service demand of the consumer
- The identity of the selected service provider
- The service supplied in response to the service demand
- The date of the experience
- The commitments between the consumer and the provider, if any

(A minimal data-structure sketch follows.)
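As a minimal sketch, an experience record could be represented as follows in Python; the field names and types are illustrative assumptions, since the paper expresses experiences in an ontology rather than code.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List

@dataclass
class Experience:
    demand: Dict[str, str]        # the consumer's service demand
    provider_id: str              # identity of the selected service provider
    supplied: Dict[str, str]      # the service supplied in response
    when: date                    # date of the experience
    commitments: List[str] = field(default_factory=list)  # commitments, if any
```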

Page 43:

MAKING SERVICE DECISIONS USING EXPERIENCES

- Modeling service providers using a multivariate Gaussian model (parametric classification)
- Case-based reasoning

(A sketch of the Gaussian-model scoring idea follows.)
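The following Python sketch illustrates the parametric-classification idea, under the assumption that each provider has several past services encoded as numeric feature vectors; the regularization term, the density-based scoring, and the selection rule are illustrative, not the paper's exact formulation.

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_provider_model(service_vectors):
    """Summarize a provider's past services by a mean and covariance."""
    X = np.asarray(service_vectors, dtype=float)  # rows: past services as vectors
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # regularized
    return mean, cov

def select_provider(demand_vector, models):
    """Pick the provider whose Gaussian assigns the highest density to the demand."""
    return max(models, key=lambda pid: multivariate_normal.pdf(
        demand_vector, mean=models[pid][0], cov=models[pid][1]))
```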

Page 44:

Comparison of Service Selection Methods

- If the demands of the consumers do not change significantly and the tastes of the consumers are similar for a specific demand, ratings are better.
- If consumers significantly change their demands but their tastes are similar for a specific demand, using context-aware ratings is better.
- If consumers significantly change both their demands and their tastes, using experiences with CBR is better.
- In other cases, using experiences with GM is better.

(A compact restatement as code follows.)
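For illustration only, this comparison can be restated as a small lookup in Python; the boolean inputs are a simplification of the slide's "significantly change" conditions.

```python
def recommend(demands_change, tastes_change):
    """Return the mechanism suggested by the comparison above."""
    if not demands_change and not tastes_change:
        return "SR"    # plain (selective) ratings
    if demands_change and not tastes_change:
        return "CAR"   # context-aware ratings
    if demands_change and tastes_change:
        return "CBR"   # experiences with case-based reasoning
    return "GM"        # remaining cases: experiences with the Gaussian model
```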

Page 45:

Basics of Reinforcement Learning

Each state has a value in terms of the maximum discounted reward expected in that state:

V^π(s_t) = E[ r_t + γ·r_{t+1} + γ²·r_{t+2} + … ]

The equation expresses that the expected value of a state s_t is the weighted (discounted) sum of the rewards received when starting in state s_t and following the current policy.

The action selection at each step is based on Q-values, which are related to the goodness of the actions.

The Q-value, Q(s, a), is the total discounted reward that the agent would receive when it starts at a state s, performs an action a, and behaves optimally thereafter.

Page 46:

Basics of Reinforcement Learning

The purpose of RL is to construct an optimal action policy that maximizes the total reward (i.e., finds the best service providers throughout).

There are different approaches to achieve this, such as Q-Learning and SARSA.

We prefer SARSA in our work because it learns rapidly, and in the early part of learning its average policy is better than that of the other RL approaches.

Page 47:

SARSA

Q(s_t, a_t) ← Q(s_t, a_t) + α·[ r_t + γ·Q(s_{t+1}, a_{t+1}) − Q(s_t, a_t) ]

where α is the learning rate, γ is the discount factor, r_t is the reward, Q(s_t, a_t) is the old Q-value, s_{t+1} is the next state, and a_{t+1} is the action chosen in the next state according to the current policy.

(A minimal update-rule sketch follows.)
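A minimal Python sketch of this update rule; the α and γ values and the dictionary-based Q-table are illustrative assumptions.

```python
from collections import defaultdict

alpha, gamma = 0.1, 0.9        # learning rate and discount factor (assumed)
Q = defaultdict(float)         # Q-table keyed by (state, action), default 0.0

def sarsa_update(s, a, r, s_next, a_next):
    """Move Q(s, a) toward r + gamma * Q(s', a'), as in the equation above."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])
```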

Page 48:

Reward Function

Terminology:

R_X = the average ratio of service decisions that result in satisfaction when mechanism X is used for service selection.

T_X = the average time consumed when mechanism X is used for service selection.

For example, R_CAR = 0.8 means that, on average, 80% of service decisions result in satisfaction of the consumer when context-aware ratings are used for service selection.

Pages 49-51: [Figures: ratio of satisfaction of the GM, SR, CBR, CAR, and RL approaches, and the time-consumption ratio T_RL/T_GM, for the evaluation runs.]