Post on 20-Aug-2015
A Study on Behavior Mining of Cloud computing users
2014.02.14
Shree Krishna Shrestha
12054071
Graduate School of Engineering
Muroran Institute of Technology, Muroran, Hokkaido, Japan
CONTENTS
Introduction
Test-bed Cloud System: Jyaguchi Introduction
Problem Definition
Algorithm Description: 1. TWSMA 2. Recommendation
Experiment and Results
Conclusion
INTRODUCTION AND PURPOSE
Purpose: A framework to recommend services
Method to mine services based on the behavior of service users
Method to recommend services based on the results of service mining
Jyaguchi: A cloud system proposed by Bishnu Prasad Gautam
Based on a service-on-demand and pay-per-use business model
TEST BED CLOUD SYSTEM :JYAGUCHI (OVERVIEW)
Jyaguchi is a SaaS-based cloud that provides a platform for developing applications as services, with multi-language support.
[Figure: the Calculator Service Component, composed of AddServiceComponent, SubtractServiceComponent, MultiplyServiceComponent, and DivideServiceComponent; languages shown: JavaScript, Ruby, Python, Groovy]
Ref: As per the definition by the inventor of Jyaguchi, Asst. Prof. Bishnu Prasad Gautam
Features
Software as a Service (SaaS)
Distributed Resource Management
Pay-per-use Business Model
Service on Demand
MAJOR ISSUE IN MINING OF SERVICES
What is the difference between mining items and mining services?
Why can't current item mining be used for service mining?
Usage time: service mining looks for frequent service usage patterns considering not only service usage frequency but also service usage time.
ALGORITHM FOR SERVICE MINING
Proposed: an algorithm for service mining that considers the time of service usage.
Time Weight Sequence Mining Algorithm (TWSMA)
1. Create the Multi-dimensional Weighted Service Sequence Database
2. Mine the multi-dimensional sequences
CREATION OF SERVICE WEIGHT INPUT SEQUENCE
Input: Service usage logs; unit time u
Output: Multi-dimensional Weighted Service Sequence Database (MDWSSDB)
1: Calculate the service usage time from the service usage logs for each service at each position.
2: Create the multi-dimensional service usage time sequence from the service usage logs.
3: Calculate the Service Count (SC) for each service at each position.
4: Calculate the Absolute Service Weight (ASW) for each service at each position.
5: Calculate the Relative Service Weight (RSW) for each service at each position.
6: Make the weighted sequence (ws) by integrating each service id sj with its Relative Service Weight.
7: Create the MDWSSDB by integrating ws with the associated user id.
CALCULATION OF RELATIVE SERVICE WEIGHT
Seq. id User_id Sequence
1 10 (2,6),(123,16),(456,31),(2,33),(456,35)
2 10 (2,21),(2,20),(2,22),(1,22),(2,21)
3 16 (2,1),(123,9),(456,1),(123,1),(456,15)
4 15 (456,19),(456,24),(234,24),(456,43)
5 15 (234,20),(234,11),(234,30),(456,38)
6 16 (456,19),(123,39),(456,30),(234,30)
Service weight of service 2 for user 10:
ST_{2,10} = (6 + 33 + 21 + 20 + 22 + 21) min = 123 min
T_{10} = (6 + 16 + 31 + 33 + 35 + 21 + 20 + 22 + 22 + 21) min = 227 min
ASW_{2,10} = 123 / 227 = 0.542
For unit time (ut) 5 min, the service usage count for service 2 at position 1 of sequence 1 is SC_{2,1,1} = 6 / 5 = 1.2
RSW_{2,1,1} = 1.2 * 0.542 = 0.650
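The worked calculation can be sketched in Python. This is a minimal illustration that assumes ASW is the share of a user's total usage time spent on a service, SC is usage time divided by the unit time, and RSW is their product, as the example for service 2 and user 10 suggests. The function and variable names are hypothetical, not taken from the Jyaguchi implementation.

```python
from collections import defaultdict

# Usage-time sequences from the table: (seq_id, user_id, [(service_id, minutes), ...])
USAGE = [
    (1, 10, [(2, 6), (123, 16), (456, 31), (2, 33), (456, 35)]),
    (2, 10, [(2, 21), (2, 20), (2, 22), (1, 22), (2, 21)]),
    (3, 16, [(2, 1), (123, 9), (456, 1), (123, 1), (456, 15)]),
    (4, 15, [(456, 19), (456, 24), (234, 24), (456, 43)]),
    (5, 15, [(234, 20), (234, 11), (234, 30), (456, 38)]),
    (6, 16, [(456, 19), (123, 39), (456, 30), (234, 30)]),
]


def relative_service_weights(usage, unit_time):
    """Return weighted sequences (seq_id, user_id, [(service_id, RSW), ...])."""
    service_time = defaultdict(float)  # (service, user) -> total minutes (ST)
    user_time = defaultdict(float)     # user -> total minutes (T)
    for _, user, seq in usage:
        for service, minutes in seq:
            service_time[(service, user)] += minutes
            user_time[user] += minutes

    weighted = []
    for seq_id, user, seq in usage:
        ws = []
        for service, minutes in seq:
            asw = service_time[(service, user)] / user_time[user]  # absolute weight
            sc = minutes / unit_time                               # service count
            ws.append((service, sc * asw))                         # relative weight
        weighted.append((seq_id, user, ws))
    return weighted


mdwssdb = relative_service_weights(USAGE, unit_time=5)
print(mdwssdb[0][2][0])  # service 2 at position 1 of sequence 1
```

Because the slide rounds ASW to three decimals before multiplying, values computed this way can differ from the slide's figures in the last digit.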
Multi-dimensional service usage time sequence: each element, e.g. (456, 35), is a pair (Service ID, use time).
EXAMPLE OF INPUT SEQUENCE
Seq. id User_id Sequence
1 10 (2,0.650),(123,0.224),(456,1.804),(2,3.577),(456,2.037)
2 10 (2,2.276),(2,2.168),(2,2.385),(1,0.427),(2,2.276)
3 16 (2,0.0014),(123,0.608),(456,0.089),(123,0.068),(456,1.344)
4 15 (456,2.253),(456,2.846),(234,1.954),(456,5.1)
5 15 (234,1.628),(234,0.895),(234,2.442),(456,4.507)
6 16 (456,1.702),(123,2.636),(456,2.688),(234,1.242)
[Figure: the Jyaguchi log data is turned into usage-time sequences of (Service ID, use time) pairs, e.g. (456, 35); the calculation of service weights then turns these into weighted sequences of (Service ID, service weight) pairs, e.g. (456, 2.037).]
MINING MULTIDIMENSIONAL SEQUENCE
Input: Multi-dimensional Weighted Service Sequence Database (MDWSSDB); minimum support min_support
Output: The complete set of labeled frequent patterns
1: Calculate the sequence database weight SDW of the MDWSSDB
2: Calculate the minimum weight Wm
3: Call ModifiedPrefixSpan
4: End if no frequent pattern is found or at the end of the database
5: Form the Projected Sequence Database
6: Mine labeled frequent patterns from the Projected Sequence Database
MINING SEQUENTIAL PATTERN
Prefix: 2
Postfix (<2>-projected database): <_123,456,2,456>, <_2,2,1,2>, <_123,456,123,456>
Prefix: 123
Postfix (<123>-projected database): <_456,2,456>, <_456,123,456>, <_456,234>
Prefix: 2,123
Postfix (<2,123>-projected database): <_456,2,456>, <_456,123,456>, <_456>
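How a projected database is formed can be illustrated with a short sketch. It implements plain PrefixSpan-style projection on service ids only (weights are ignored), so it is not the ModifiedPrefixSpan procedure itself. The function name `project` is hypothetical.

```python
# Service-id sequences of the six example sequences (weights omitted).
SEQUENCES = [
    [2, 123, 456, 2, 456],
    [2, 2, 2, 1, 2],
    [2, 123, 456, 123, 456],
    [456, 456, 234, 456],
    [234, 234, 234, 456],
    [456, 123, 456, 234],
]


def project(sequences, prefix):
    """Return the postfix of every sequence that contains `prefix` in order.

    The postfix is what remains after the earliest in-order match of the
    prefix; sequences with no match or an empty postfix are skipped.
    """
    postfixes = []
    for seq in sequences:
        pos = 0
        matched = True
        for item in prefix:
            try:
                pos = seq.index(item, pos) + 1  # first occurrence at or after pos
            except ValueError:
                matched = False
                break
        if matched and pos < len(seq):
            postfixes.append(seq[pos:])
    return postfixes


print(project(SEQUENCES, [123]))
print(project(SEQUENCES, [2]))
```

For the prefixes 2 and 123 this reproduces the postfixes listed on the slide.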
Service id: 123
Total weight of service id 123: 0.224 + 0.608 + 0.068 + 2.636 = 3.53
Total Database Weight (SDW) = (0.650 + 0.224 + 1.804 + ... + 1.242) = 49.83
For min_support 5%: min_weight = 49.83 * 0.05 = 2.49
Frequent Pattern: 123,456
Prefix: 123
Postfix: <_456,2,456>, <_456,123,456>, <_456,234>
Prefix: 123,456
Postfix: <_2,456>, <_123,456>
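The weight threshold can be checked with a few lines of Python. This sketch sums the weights of the example database to obtain SDW, derives min_weight for a 5% minimum support, and compares the total weight of a service against it. Treating a service as frequent when its total weight is at least min_weight is an assumption here; the slides do not spell out the comparison rule. All names are hypothetical.

```python
# Weighted sequence database from the example: (seq_id, user_id, [(service_id, weight), ...])
MDWSSDB = [
    (1, 10, [(2, 0.650), (123, 0.224), (456, 1.804), (2, 3.577), (456, 2.037)]),
    (2, 10, [(2, 2.276), (2, 2.168), (2, 2.385), (1, 0.427), (2, 2.276)]),
    (3, 16, [(2, 0.0014), (123, 0.608), (456, 0.089), (123, 0.068), (456, 1.344)]),
    (4, 15, [(456, 2.253), (456, 2.846), (234, 1.954), (456, 5.1)]),
    (5, 15, [(234, 1.628), (234, 0.895), (234, 2.442), (456, 4.507)]),
    (6, 16, [(456, 1.702), (123, 2.636), (456, 2.688), (234, 1.242)]),
]


def sequence_database_weight(db):
    """SDW: sum of all service weights in the database."""
    return sum(w for _, _, seq in db for _, w in seq)


def service_weight(db, service_id):
    """Total weight of one service over the whole database."""
    return sum(w for _, _, seq in db for s, w in seq if s == service_id)


sdw = sequence_database_weight(MDWSSDB)
min_weight = 0.05 * sdw  # min_support of 5%

# Assumed rule: a service is frequent if its total weight reaches min_weight.
for service_id in (123, 1):
    total = service_weight(MDWSSDB, service_id)
    print(service_id, round(total, 3), total >= min_weight)
```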
Prefix: <123, 456>
Postfix (user ids of the sequences that contain the prefix): <10>; <16>; <16>
For the frequent service sequence <123, 456>, user_id 16 and * are found frequent in the postfix database.
Labeled frequent patterns: (16, <123, 456>); (*, <123, 456>)
RECOMMENDATION OF SERVICES
Based on the labeled frequent patterns produced by TWSMA
Categorized into 3 user groups:
1. Anonymous users / first-time user group
2. Registered users group without previous history of service usage (no current service usage log)
3. Registered users group with previous history of service usage (has a current service usage log)
RECOMMENDING SERVICE
Frequent Patterns
User_id   Sequence      Support
10        2             3
*         234           4
15        234,456       2
*         456,234       2
16        456,123,456   2
*         2,123,456     2
Anonymous Users Group
Services with the highest support
Recommended Service: 234
First-Time User (user_id 14)
Services with the highest support
Recommended Service: 234
Authorized User 10
Services with the highest support for that user
Recommended Service: 2
Authorized User 15, who has used service 234
Next service from the frequent pattern with the highest support
Recommended Service: 456
Logged-in User 16, who has used services 2, 456, 123
- This sequence is not in the frequent patterns
- Drop 2 and search with the remaining sequence, i.e. 456, 123
Recommended Service: 234
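The first four recommendation cases can be sketched as a simple lookup over the labeled frequent patterns. This is an illustration under stated assumptions, not the Jyaguchi implementation: the support of (*, 234) is taken as 4, as in most copies of the table; "next service" is read as the item following a pattern prefix that equals the user's current history; and the drop-and-retry fallback of the last case is not reproduced. The function name `recommend` is hypothetical.

```python
# Labeled frequent patterns: (user_id or "*", service sequence, support)
PATTERNS = [
    (10, (2,), 3),
    ("*", (234,), 4),
    (15, (234, 456), 2),
    ("*", (456, 234), 2),
    (16, (456, 123, 456), 2),
    ("*", (2, 123, 456), 2),
]


def recommend(patterns, user_id=None, history=()):
    """Pick one service to recommend, or None if nothing matches."""
    history = tuple(history)
    if history:
        # User with a current usage log: look for patterns (own or generic)
        # that start with the history and have at least one more service.
        pool = [
            p for p in patterns
            if p[0] in (user_id, "*")
            and len(p[1]) > len(history)
            and p[1][:len(history)] == history
        ]
        if not pool:
            return None
        best = max(pool, key=lambda p: p[2])
        return best[1][len(history)]

    # No current log: use the user's own patterns if any exist, otherwise
    # (anonymous or first-time user) fall back to all patterns.
    own = [p for p in patterns if p[0] == user_id]
    pool = own if own else patterns
    best = max(pool, key=lambda p: p[2])
    return best[1][0]


print(recommend(PATTERNS))                              # anonymous user
print(recommend(PATTERNS, user_id=14))                  # first-time user
print(recommend(PATTERNS, user_id=10))                  # registered, no current log
print(recommend(PATTERNS, user_id=15, history=[234]))   # registered, has used 234
```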
EXPERIMENTS (TWSMA)
Experiment Methodology
Implemented on the Jyaguchi system
Used actual logs of Jyaguchi users
Varied the minimum support to observe the variation in the number of patterns found and the processing time
Compared the number of patterns found and the processing time with the seq-dim algorithm
EXPERIMENT RESULTS (1)
• Number of patterns and processing time versus number of sequences, for varied minimum support
EXPERIMENT RESULTS (3)
• Number of patterns and processing time versus number of sequences, for varied minimum support
EXPERIMENTS (TWSMA)
Precision- and recall-based evaluation
Experiment Methodology
Learning Phase
• Found frequent services, for various minimum supports, from the log data recorded prior to implementing the TWSMA algorithm.
• Conducted an online survey among Jyaguchi users about their favorite services.
• Took the services common to the survey data and the frequent services, for each minimum support, as the relevant services.
Evaluation Phase
• Users used the Jyaguchi system while services were recommended by 3 algorithms: 1. TWSMA, 2. SEQ-DIM, and 3. Random.
• Calculated precision and recall for each user.
• Averaged precision and recall for each minimum support.
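The per-user metrics can be sketched with the standard set-based definitions: precision is the fraction of recommended services that are relevant, and recall is the fraction of relevant services that were recommended. The service ids below are made-up illustration data, not experiment results, and the function names are hypothetical.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for one user."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision_recall(per_user):
    """Average (precision, recall) pairs over users."""
    n = len(per_user)
    avg_p = sum(p for p, _ in per_user) / n
    avg_r = sum(r for _, r in per_user) / n
    return avg_p, avg_r


# Hypothetical data: services recommended to two users and their relevant services.
user_a = precision_recall(recommended=[234, 456, 2], relevant=[234, 2, 123, 1])
user_b = precision_recall(recommended=[456, 123], relevant=[456, 123])
print(user_a, user_b, average_precision_recall([user_a, user_b]))
```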
EXPERIMENT RESULTS (3)
Comparison of precision and recall for various minimum supports for the 3 algorithms
Minimum_support: 10%
Minimum_support: 7%
EXPERIMENT RESULTS (4)
Comparison of precision and recall for various minimum supports for the 3 algorithms
Minimum_support: 12%
Minimum_support: 15%
CONCLUSION AND FUTURE WORKS
Conclusion
• Proposed a framework for recommending services that uses service usage time as the service weight.
• Implemented the algorithm in the Jyaguchi system.
• Evaluated the proposed framework on the Jyaguchi system.
Future Tasks
• Implement and evaluate the algorithm on other SaaS-based cloud systems.
• Add the dimension of user profile for better recommendation.