Dr. Guandong Xu
Intelligent Web & Information Systems (IWIS), Department of Computer Science, Aalborg University
Web Usage Mining & Personalization
Slide 2
References
MOBASHER, B., Chapter 4: Web Usage Mining and Personalization. In Practical Handbook of Internet Computing, Munindar P. Singh (ed.), CRC Press, 2005.
XU, G., ZHANG, Y., & LI, L., Web Mining and Social Networking: Techniques and Applications, Springer, Nov. 2010, http://www.springer.com/978-1-4419-7734-2
Slide 3
Personalization
Web personalization can be described as any action that customizes a user's Web experience to that user's taste or preferences.
Principal elements of Web personalization include modeling of Web objects (such as pages or products) and subjects (such as users or customers), categorization of objects and subjects, matching between and across objects and/or subjects, and determination of the set of actions to be recommended for personalization.
Slide 4
Personalization categories
Three general groups:
Manual decision rule systems: rules based on user demographics or static profiles (collected through a registration process).
Content-based filtering agents: rely on personal profiles and the content similarity of Web documents.
Collaborative filtering systems: rely on user ratings or preferences, matched through a correlation engine.
Slide 5
Drawbacks of current systems
Rule-based filtering: relies on subjective descriptions of users and is prone to bias; static user profiles degrade performance over time.
Content-based filtering: relies on content similarity, which is not feasible for non-textual resources; does not take user preferences into account.
Collaborative filtering: the predominant approach in most commercial e-commerce systems. Instead of content features, it matches the ratings of like-minded users for objects (e.g., movies or products). The most commonly used algorithm is kNN.
Slide 6
Limitations of collaborative filtering
The high dimensionality of items in the system.
The sparsity of rating data, which decreases the likelihood of a significant overlap of ratings.
The high computational cost of real-time prediction.
The need to integrate explicit user ratings with implicit content or product-oriented features.
Slide 7
Improvement on these concerns
Optimization strategies, such as similarity indexing and dimensionality reduction.
Offline clustering of user records, so that the search is restricted to a matching cluster.
Integration of content and user demographics.
A promising technique: Web usage mining. Its goal is to capture and model the patterns and profiles of users interacting with a Web site.
Slide 8
The aims of web usage mining
The discovered patterns are
usually represented as collections of pages or items that are
frequently accessed by groups of users with common needs or
interests. Such patterns can be used to better understand
behavioral characteristics of visitors or user segments, improve
the organization and structure of the site, and create a
personalized experience for visitors by providing dynamic
recommendations.
Slide 9
Aspects of web usage mining
Enhances the approaches discussed above and remedies their shortcomings.
Benefits from advances in data mining: applying data mining algorithms to offline pattern discovery from user transactions improves the scalability of CF.
Techniques include clustering, association rule mining, navigation pattern mining, and so on.
Slide 10
Personalization based on WUM
Goal: to recommend a set of objects (e.g., links, ads, text, products, or services) tailored to the user's preferences.
Personalization process:
Derive the navigational patterns of users, long-term (the user activity history) or short-term (single sessions), through Web log mining or analysis.
Match the target active user with the derived patterns.
Make recommendations via the matched pattern (recommendation engine).
Slide 11
The overview of web personalization based on WUM
Consists of three phases: data preparation and transformation, pattern discovery, and recommendation (real-time).
Slide 12
The offline data preparation and pattern discovery
components
Slide 13
The online personalization component
Slide 14
Data Preparation and Modeling
The most time-consuming and computationally intensive step in the knowledge discovery process.
Requires special algorithms and heuristics not commonly employed in other domains.
Critical to the successful extraction of useful patterns from the data.
Slide 15
Sources and Types of Data
Usage data: log files
Content data: textual
Structure data: linkage map
User data: demographics, domain knowledge
Slide 16
Usage data preparation
Data preprocessing includes data cleaning, pageview identification, user identification, session identification (or sessionization), the inference of missing references due to caching, and transaction (episode) identification.

SESSION #925 (USER_ID = 338)
9745438  /news/default.asp               -
9745452  /admissions/                    > /news/default.asp
9745520  /admissions/requirements.asp    > /admissions/
9745846  /programs/                      > /admissions/requirements.asp
9745852  /programs/2002/gradect2002.asp  > /programs/
9745907  /pdf/promos/2002/ect2002.pdf    > /programs/2002/gradect2002.asp

Page URL                         Duration (s)   Weight (%)
/news/default.asp                14             2.98
/admissions/                     68             14.5
/admissions/requirements.asp     326            69.51
/programs/                       6              1.28
/programs/2002/gradect2002.asp   55             11.73
/pdf/promos/2002/ect2002.pdf
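The duration and weight columns for SESSION #925 can be reproduced from the timestamps alone. The following is a minimal sketch, assuming that a pageview's duration is the gap to the next request and that its weight is its share of the total measured time; the function name `pageview_weights` is introduced here for illustration only. The last pageview has no following request, so it gets no duration, which matches the empty final row.

```python
# Timestamps and URLs copied from SESSION #925.
session = [
    (9745438, "/news/default.asp"),
    (9745452, "/admissions/"),
    (9745520, "/admissions/requirements.asp"),
    (9745846, "/programs/"),
    (9745852, "/programs/2002/gradect2002.asp"),
    (9745907, "/pdf/promos/2002/ect2002.pdf"),
]


def pageview_weights(hits):
    """Return (url, duration, weight_percent) for every hit except the last."""
    # Duration of a pageview = timestamp of the next hit minus its own timestamp.
    durations = [
        (url, next_ts - ts)
        for (ts, url), (next_ts, _) in zip(hits, hits[1:])
    ]
    total = sum(d for _, d in durations)
    # Weight = share of the total measured viewing time, in percent.
    return [(url, d, 100.0 * d / total) for url, d in durations]


rows = pageview_weights(session)
for url, duration, weight in rows:
    print(f"{url:35s} {duration:4d} {weight:6.2f}")
```

The computed weights agree with the slide's values to within rounding in the second decimal place.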
Slide 17
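Web usage data model
The data preprocessing results in:
User sessions, $S = \{s_i \mid i = 1, \ldots, m\}$
Page corpus, $P = \{p_j \mid j = 1, \ldots, n\}$
Each user session $s_i = \{\langle p_1, w_{i1} \rangle, \langle p_2, w_{i2} \rangle, \ldots, \langle p_n, w_{in} \rangle\}$, or simply $s_i = \{w_{i1}, w_{i2}, \ldots, w_{in}\}$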
Slide 18
Data integration from multiple sources
Slide 19
Data integration example
TP matrix: transaction-pageview, $TP \in \mathbb{R}^{m \times n}$
PF matrix: pageview-feature, $PF \in \mathbb{R}^{n \times k}$
Multiply TP by PF: $TP \times PF = TF = \{t_1, t_2, \ldots, t_m\} \in \mathbb{R}^{m \times k}$, where each $t_i$ is a $k$-dimensional vector over the feature space.
A user transaction can thus be represented as a content feature vector, reflecting that user's interests in particular concepts or topics.
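The matrix product can be illustrated with a very small example. This is a sketch under stated assumptions: the numbers are hypothetical (m = 2 transactions, n = 3 pageviews, k = 2 features), the result is named `TF` here for "transaction-feature", and a plain-Python `matmul` helper is used only to keep the example dependency-free.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must agree"
    return [
        [sum(row[j] * b[j][c] for j in range(inner)) for c in range(len(b[0]))]
        for row in a
    ]


# Hypothetical data, not taken from the slides.
TP = [
    [1.0, 0.0, 1.0],  # transaction 1 visited pageviews 1 and 3
    [0.0, 1.0, 1.0],  # transaction 2 visited pageviews 2 and 3
]
PF = [
    [1.0, 0.0],  # pageview 1 is about feature 1 only
    [0.0, 1.0],  # pageview 2 is about feature 2 only
    [0.5, 0.5],  # pageview 3 covers both features equally
]

# Each row of TF is one transaction expressed over the k features.
TF = matmul(TP, PF)
print(TF)
```

Each row of the result describes a transaction in terms of content features rather than individual pages, so two transactions that share no pageview can still look similar if the pages they touched cover the same topics.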
Slide 22
Association rule mining
Given a set of transactions $T$ and a set $I = \{I_1, I_2, \ldots, I_k\}$ of frequent itemsets over $T$, the support of an itemset $I_i \in I$ is defined as
$$\sigma(I_i) = \frac{|\{t \in T : I_i \subseteq t\}|}{|T|}$$
An association rule $r$ is an expression of the form $X \Rightarrow Y \; (\sigma_r, \alpha_r)$, where $X$ and $Y$ are itemsets and $\sigma_r = \sigma(X \cup Y)$ is the support of $X \cup Y$, representing the probability that $X$ and $Y$ occur together in a transaction.
The confidence for the rule $r$, $\alpha_r$, is given by $\sigma(X \cup Y) / \sigma(X)$ and represents the conditional probability that $Y$ occurs in a transaction given that $X$ has occurred in that transaction.
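Support and confidence as described here can be computed directly by counting. The sketch below assumes transactions are plain sets of pageview names; the transactions and the function names `support` and `confidence` are hypothetical and chosen only for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset.issubset(t))
    return hits / len(transactions)


def confidence(x, y, transactions):
    """Confidence of the rule X => Y: support(X union Y) / support(X)."""
    support_x = support(x, transactions)
    if support_x == 0:
        raise ValueError("antecedent never occurs in the transactions")
    return support(set(x) | set(y), transactions) / support_x


# Hypothetical pageview transactions.
transactions = [
    {"home", "products", "cart"},
    {"home", "products"},
    {"home", "cart"},
    {"products", "cart"},
]

s = support({"products", "cart"}, transactions)
c = confidence({"products"}, {"cart"}, transactions)
print(s, c)
```

Here the rule {products} => {cart} holds in two of the three transactions that contain "products", so its confidence is two thirds while its support is one half.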
Slide 23
An association rule example
For example, a high-confidence rule such as {special-offers/, /products/software/} ⇒ {shopping-cart/} might provide some indication that a promotional campaign on software products is positively affecting online sales.
Such rules can also be used to optimize the structure of the site. For example, if a site does not provide direct linkage between two pages A and B, the discovery of a rule {A} ⇒ {B} would indicate that providing a direct hyperlink might aid users in finding the intended information.
Slide 24
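Sequential pattern mining
Given a transaction set $T$ and a set $S = \{S_1, S_2, \ldots, S_n\}$ of frequent sequential (respectively, contiguous sequential) patterns over $T$, the support of each $S_i$ is defined as follows:
$$\sigma(S_i) = \frac{|\{t \in T : S_i \text{ is a (contiguous) subsequence of } t\}|}{|T|}$$
The confidence of the rule $X \Rightarrow Y$, where $X$ and $Y$ are (contiguous) sequential patterns, is defined as
$$\alpha(X \Rightarrow Y) = \frac{\sigma(X \circ Y)}{\sigma(X)}$$
where $\circ$ denotes the concatenation operator.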
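The difference between sequential and contiguous sequential patterns comes down to whether gaps are allowed when matching a pattern against a transaction. The sketch below assumes the usual definitions (support as the fraction of transactions containing the pattern, confidence as the support of the concatenated pattern divided by the support of the antecedent); the transactions and helper names are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` in order, gaps allowed."""
    it = iter(sequence)
    # `item in it` consumes the iterator, so order is enforced.
    return all(item in it for item in pattern)


def is_contiguous(pattern, sequence):
    """True if `pattern` occurs in `sequence` as an unbroken run."""
    n = len(pattern)
    return any(
        list(sequence[i:i + n]) == list(pattern)
        for i in range(len(sequence) - n + 1)
    )


def support(pattern, transactions, contiguous=False):
    match = is_contiguous if contiguous else is_subsequence
    return sum(match(pattern, t) for t in transactions) / len(transactions)


def confidence(x, y, transactions, contiguous=False):
    """support(X concatenated with Y) / support(X); 0.0 if X never occurs."""
    support_x = support(x, transactions, contiguous)
    if support_x == 0:
        return 0.0
    return support(list(x) + list(y), transactions, contiguous) / support_x


# Hypothetical navigation sequences.
transactions = [
    ["A", "B", "C"],
    ["A", "D", "B", "C"],
    ["A", "B", "D"],
    ["B", "A", "C"],
]

seq_conf = confidence(["A", "B"], ["C"], transactions)
cont_conf = confidence(["A", "B"], ["C"], transactions, contiguous=True)
print(seq_conf, cont_conf)
```

The same rule gets a different confidence under the two definitions because the second transaction matches A, B, C only when the intervening D may be skipped.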
Slide 25
Clustering approaches
Clustering assigns data objects to groups or categories based on the similarity or distance between the objects, such that intra-group similarity is maximized and inter-group similarity is minimized.
It is an unsupervised approach that relies only on mutual similarity; classification, in contrast, is supervised learning.
In the context of usage data, there are two types of clustering: clustering the transactions (or users), and clustering pageviews.
Applications of clustering include Web usage mining, e-marketing, personalization, and collaborative filtering.
Slide 26
Clustering in user profiling An example of deriving aggregate
usage profiles from transaction clusters
Slide 29
Steps of clustering in user profiling
Given the mapping of user transactions into a multi-dimensional space as vectors of pageviews (i.e., the matrix TP):
Employ standard clustering algorithms, such as k-means, which generally partition this space into subgroups.
Obtain user segments; these do not by themselves capture an aggregated view of common user patterns.
Utilize the centroid (or mean vector) of each cluster to represent the aggregated view of user patterns.
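The steps above can be sketched with a bare-bones k-means over the rows of TP. This is an illustrative sketch, not the procedure used by the author: the data are hypothetical, the first k vectors are used as initial centroids purely to keep the run deterministic, and squared Euclidean distance is assumed.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean, i.e. the centroid of a cluster."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(vectors, k, iterations=20):
    """Plain k-means; the first k vectors seed the centroids (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        new_centroids = [
            mean_vector(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return clusters, centroids


# Hypothetical TP rows: 4 transactions over 4 pageviews.
TP = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0.5, 0, 0],
    [0, 0, 1, 0.5],
]

clusters, centroids = kmeans(TP, k=2)
print(centroids)
```

Each centroid is a vector of mean pageview weights, which is what lets it serve as an aggregate view of the segment rather than a list of individual transactions.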
Slide 30
Discovered Patterns for Personalization
Several effective recommendation algorithms are based on clustering (which can be seen as an extension of standard kNN-based collaborative filtering), association rule mining (AR), and sequential pattern (SP) or contiguous sequential pattern (CSP) discovery.
Slide 31
kNN-Based Approach
The k-Nearest-Neighbor (kNN) approach involves comparing the activity record for a target user with the historical records of other users in order to find the top $k$ users who have similar tastes or interests.
This is done by measuring the similarity or correlation between the active session $s$ and each transaction vector $t$ (where $t \in T$). The top $k$ most similar transactions to $s$ are considered to be the neighborhood for the session $s$, denoted by $NB(s)$ (taking the size $k$ of the neighborhood to be implicit): $NB(s) = \{t_{s_1}, t_{s_2}, \ldots, t_{s_k}\}$.
Various similarity measures can be used: the Pearson correlation coefficient, the cosine coefficient, the Jaccard distance, and so on.
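Forming the neighborhood NB(s) amounts to ranking the transaction vectors by similarity to the active session and keeping the top k. The sketch below assumes cosine similarity (one of the measures listed) and uses hypothetical binary pageview vectors; the function names are illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def neighborhood(session, transactions, k):
    """Return the k transactions most similar to `session`, best first."""
    ranked = sorted(transactions, key=lambda t: cosine(session, t), reverse=True)
    return ranked[:k]


# Hypothetical transaction vectors over 4 pageviews.
transactions = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
]
s = [1, 1, 0, 0]  # active session

nb = neighborhood(s, transactions, k=2)
print(nb)
```

Swapping in Pearson correlation or a Jaccard-based measure only requires replacing `cosine`; the ranking step stays the same.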
Slide 32
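Recommendation score
Given the active user session $s$ and the nearest neighbors $NB(s)$, the recommendation score of an item $p$ is
$$rec(s, p) = \sqrt{weight(p, NB(s)) \times sim(s, NB(s))}$$
where $weight(p, NB(s))$ is the mean weight for pageview $p$ in the neighborhood as expressed in the centroid vector, and $sim(s, NB(s))$ is the cosine similarity of $s$ to that centroid vector.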
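A neighborhood-based score can be sketched as follows. The exact formula is an assumption here: the sketch uses the form given in Mobasher's chapter, the square root of the pageview's mean weight in the neighborhood multiplied by the cosine similarity between the session and the neighborhood centroid. The vectors are hypothetical, and only pageviews not yet visited in the session are scored.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def recommendation_scores(session, neighbors):
    """Score unvisited pageviews by sqrt(mean weight * similarity).

    The formula is an assumed reconstruction, not quoted from the slide.
    """
    c = centroid(neighbors)
    sim = cosine(session, c)
    return {
        p: math.sqrt(weight * sim)
        for p, weight in enumerate(c)
        if session[p] == 0 and weight > 0
    }


# Hypothetical active session and its two nearest neighbors.
session = [1, 1, 0, 0]
neighbors = [[1, 1, 0, 0], [1, 1, 1, 0]]

scores = recommendation_scores(session, neighbors)
print(scores)
```

Taking the square root of the product keeps the score between 0 and 1 when both factors are in that range, and it rewards pageviews only when they are both common in the neighborhood and the neighborhood resembles the session.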
Slide 33
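Using Clustering For Personalization
A method: PACT (Profile Aggregation Based on Clustering Transactions) [Mobasher et al., 2002].
For a transaction cluster $c$, an aggregate usage profile $pr_c$ is a set of pageview-weight pairs:
$$pr_c = \{\langle p, weight(p, pr_c) \rangle \mid p \in P, \; weight(p, pr_c) \geq \mu\}$$
where the significance weight, $weight(p, pr_c)$, of the pageview $p$ within the usage profile $pr_c$ is the mean weight of $p$ over the transactions in $c$:
$$weight(p, pr_c) = \frac{1}{|c|} \sum_{t \in c} w(p, t)$$
and $\mu$ is a threshold that prunes low-weight pageviews from the profile.
Recommendation score:
$$rec(s, p) = \sqrt{weight(p, pr_c) \times sim(s, pr_c)}$$
where $sim(s, pr_c)$ is the similarity between the active session $s$ and the profile $pr_c$.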
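Deriving a PACT-style profile from one cluster can be sketched as below. Several details are assumptions rather than content recovered from the slide: the significance weight is taken as the mean pageview weight over the cluster, a threshold (called `mu` here) drops low-weight pageviews, and the score is the square root of the profile weight times the cosine similarity between session and profile. The cluster data are hypothetical.

```python
import math


def build_profile(cluster, mu):
    """Mean weight per pageview across the cluster, keeping weights >= mu."""
    n = len(cluster)
    means = [sum(col) / n for col in zip(*cluster)]
    return {p: w for p, w in enumerate(means) if w >= mu}


def similarity(session, profile):
    """Cosine similarity between a session vector and a sparse profile."""
    dot = sum(session[p] * w for p, w in profile.items())
    norm = math.sqrt(sum(x * x for x in session)) * math.sqrt(
        sum(w * w for w in profile.values())
    )
    return dot / norm if norm else 0.0


def recommend(session, profile):
    """Score profile pageviews the session has not visited (assumed formula)."""
    sim = similarity(session, profile)
    return {
        p: math.sqrt(w * sim) for p, w in profile.items() if session[p] == 0
    }


# Hypothetical cluster of 4 transactions over 4 pageviews.
cluster = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
]

profile = build_profile(cluster, mu=0.5)
recs = recommend([1, 1, 0, 0], profile)
print(profile, recs)
```

Because profiles are built offline and there are far fewer profiles than transactions, the online step only has to compare the active session against a handful of vectors.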
Slide 34
Association Rules for Personalization
Sample Web transactions involving pageviews A, B, C, D and E
Example of discovered frequent itemsets
Slide 35
An example of a Frequent Itemsets Graph (frequency threshold of 4)
Now, given the user's active session window $\langle B, E \rangle$, the recommendation generation algorithm finds items A and C as candidate recommendations. The recommendation scores of items A and C are 1 and 4/5, corresponding to the confidences of the rules $\{B, E\} \Rightarrow \{A\}$ and $\{B, E\} \Rightarrow \{C\}$, respectively.
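The scores quoted above can be reproduced by brute-force counting. This sketch makes two loud assumptions: the sample transaction table did not survive in this transcript, so the five transactions below are hypothetical and merely chosen to be consistent with the quoted scores; and the code scans the transactions directly instead of traversing a Frequent Itemsets Graph, which is a simplification of the algorithm the slide refers to.

```python
# Hypothetical transactions, consistent with the scores quoted on the slide.
transactions = [
    ["A", "B", "D", "E"],
    ["A", "B", "E", "C", "D"],
    ["A", "B", "E", "C"],
    ["B", "E", "B", "A", "C"],
    ["D", "A", "B", "E", "C"],
]


def count(itemset, transactions):
    """Number of transactions containing every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset.issubset(t))


def recommend(window, transactions, min_count):
    """Score item i by confidence(window => {i}) when window + {i} is frequent."""
    window = set(window)
    base = count(window, transactions)
    candidates = set().union(*map(set, transactions)) - window
    scores = {}
    for item in sorted(candidates):
        joint = count(window | {item}, transactions)
        if base and joint >= min_count:
            scores[item] = joint / base
    return scores


scores = recommend(["B", "E"], transactions, min_count=4)
print(scores)
```

Item D is not recommended because the itemset {B, D, E} falls below the frequency threshold of 4 in this data.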
Slide 36
Sequential Patterns for Personalization
Example of discovered sequential patterns
Slide 37
Example of discovered contiguous sequential patterns
Slide 38
Recommendation example
Given a user's active session window $\langle A, B \rangle$, the recommendation engine using sequential patterns finds item E as a candidate recommendation. The recommendation score of item E is 1, corresponding to the rule $\langle A, B \rangle \Rightarrow \langle E \rangle$. On the other hand, the recommendation engine using contiguous sequential patterns will, in this case, fail to give any recommendations.
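The contrast between the two engines can be checked with a small sketch. The pattern tables are not preserved in this transcript, so the transactions below are hypothetical and chosen to be consistent with the outcome described; a frequency threshold of 4 is assumed, and the code counts matches directly instead of using a precomputed pattern structure.

```python
# Hypothetical navigation sequences, consistent with the outcome on the slide.
transactions = [
    ["A", "B", "D", "E"],
    ["A", "B", "E", "C", "D"],
    ["A", "B", "E", "C"],
    ["B", "E", "B", "A", "C"],
    ["D", "A", "B", "E", "C"],
]


def occurs(pattern, sequence, contiguous):
    """Does `pattern` occur in `sequence`, with or without gaps allowed?"""
    if contiguous:
        n = len(pattern)
        return any(
            sequence[i:i + n] == pattern for i in range(len(sequence) - n + 1)
        )
    it = iter(sequence)
    return all(item in it for item in pattern)


def recommend(window, transactions, min_count, contiguous=False):
    """Score item i by confidence(window => i) when window + [i] is frequent."""
    base = sum(occurs(window, t, contiguous) for t in transactions)
    items = sorted({i for t in transactions for i in t})
    scores = {}
    for item in items:
        joint = sum(occurs(window + [item], t, contiguous) for t in transactions)
        if base and joint >= min_count:
            scores[item] = joint / base
    return scores


sequential = recommend(["A", "B"], transactions, min_count=4)
contiguous = recommend(["A", "B"], transactions, min_count=4, contiguous=True)
print(sequential, contiguous)
```

In this data A, B, E appears in order in four transactions but as an unbroken run in only three, so the contiguous variant has no frequent pattern to extend the window with.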