BEAR: Mining Behaviour Models from User-Intensive Web Applications
Carlo Ghezzi
Politecnico di Milano, Italy (IT)
Mauro Pezzè
Università della Svizzera Italiana, Lugano (CH)
Michele
Touchtype Ltd, UK
Giordano Tamburrelli
Università della Svizzera Italiana, Lugano (CH)
Modern Web Applications
• Scalability: millions of interactions per day
• Privacy: manage sensitive data
• Security: secure economic transactions
• Users: capture/measure user behaviours
User behaviours
• Users' behaviours cannot be predicted at design time.
• Only released applications allow us to collect statistics.
• Multiple, heterogeneous navigational behaviours that depend on several factors
• Behaviours may change unpredictably over time
Related work
• Monitoring + analysis/mining
• Little support from a general software engineering perspective
Google Analytics, Link Prediction, Web Caching
What is missing
• General abstraction to support software engineers
• Automated and unambiguous analysis tool
• Support for different user classes
• Other key features:
• extensibility (domain-specific analysis)
• incrementality
• applicability to legacy systems
Our Idea
Formal Methods + Web Development
• Exploit formal models to capture and quantitatively analyse user behaviours
• Focus on RESTful architectures
• Based on log-file mining, hence applicable to legacy systems
Ingredients
• User classes
• Give semantics to events in the log file
• Infer user-behaviour models: discrete-time Markov chains (DTMCs)
• Query the models
A real-world case study
• Small example, but general enough:
• URL with parameters
• URL with parametric structure
URL                          Description
/home/                       Homepage of findyourhouse.com
/anncs/sales/                First page of the sales announcements
/anncs/sales/?page=<n>       Nth page of sales announcements
/anncs/sales/<id>/           Detailed view of a sales announcement
/anncs/renting/              First page of the renting announcements
/anncs/renting/?page=<n>     Nth page of renting announcements
/anncs/renting/<id>/         Detailed view of a renting announcement
/search/                     Page containing the results of a search
/admin/.../                  Website's control panel
/admin/login/                Login page that gives access to the control panel
/contacts/                   Form to contact a sales agent
/contacts/submit/            Contact form has been submitted
/contacts/tou/               Page that describes the website's terms of use
URLs ➔ Atomic Propositions
• A set of atomic propositions (AP) gives semantics to the entries in the log
• Declarative approach: @BearFilter
// Detail page of a sales announcement.
@BearFilter(regex="^/anncs/sales/(\\w+)/$")
public static Proposition filterSales(LogLine line){
    return new Proposition("sales_anncs");
}
// Login attempt: a 302 redirect signals success.
// Assumes the status code is exposed as a String, as the original comparison suggests.
@BearFilter(regex="^/admin/login/$")
public static Proposition filterLogin(LogLine line){
    if("302".equals(line.getHTTPStatusCode()))
        return new Proposition("login_success");
    else
        return new Proposition("login_fail");
}
URLs ➔ Atomic Propositions
URL                          Atomic Propositions
/home/                       homepage
/anncs/sales/                sales_page, page_1
/anncs/sales/?page=<n>       sales_page, page_n
/anncs/sales/<id>/           sales_anncs
/anncs/renting/              renting_page, page_1
/anncs/renting/?page=<n>     renting_page, page_n
/anncs/renting/<id>/         renting_anncs
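The URL-to-proposition mapping in the table can be sketched with plain java.util.regex. This is not BEAR's filter mechanism: the class UrlLabeler, its rule table, and the "$1" placeholder convention are assumptions made for illustration, and reading page_n as "page_" followed by the requested page number is an interpretation of the table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: labels request URLs with atomic propositions. */
class UrlLabeler {
    // Hypothetical rule table mirroring the slide; the first matching rule wins.
    // "$1" in a proposition is replaced by the first capture group of the regex.
    private static final List<Map.Entry<Pattern, List<String>>> RULES = List.of(
        rule("^/home/$", "homepage"),
        rule("^/anncs/sales/$", "sales_page", "page_1"),
        rule("^/anncs/sales/\\?page=(\\d+)$", "sales_page", "page_$1"),
        rule("^/anncs/sales/(\\w+)/$", "sales_anncs"),
        rule("^/anncs/renting/$", "renting_page", "page_1"),
        rule("^/anncs/renting/\\?page=(\\d+)$", "renting_page", "page_$1"),
        rule("^/anncs/renting/(\\w+)/$", "renting_anncs"));

    private static Map.Entry<Pattern, List<String>> rule(String regex, String... props) {
        return Map.entry(Pattern.compile(regex), List.of(props));
    }

    /** Returns the propositions of the first matching rule, or an empty list. */
    static List<String> label(String url) {
        for (Map.Entry<Pattern, List<String>> r : RULES) {
            Matcher m = r.getKey().matcher(url);
            if (!m.matches()) continue;
            List<String> props = new ArrayList<>();
            for (String template : r.getValue()) {
                props.add(m.groupCount() > 0 ? template.replace("$1", m.group(1)) : template);
            }
            return props;
        }
        return List.of();
    }
}
```

Under these assumptions, UrlLabeler.label("/anncs/sales/?page=3") yields sales_page and page_3, and URLs that match no rule receive no propositions.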
Identify User Classes
• Code fragments called classifiers specify user classes
• Declarative approach: @BearClassifier
// Classifies each log line by the user agent of the request.
@BearClassifier(name="userAgent")
public static String UserAgentClassifier(LogLine logline) {
    return logline.getAgent();
}
{(userAgent = “Mozilla/5.0...”), (location = “Boston”)}
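One way to read the example above is that a user class is the set of (classifier name = value) pairs computed for a log line. The sketch below illustrates that reading without BEAR's annotations: UserClassSketch, the classifier registry, and the "mobile" classifier are hypothetical, and the (Android|iOS) pattern simply mirrors the regex used in the slides.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

/** Illustrative only: derives a user class from a request's user-agent string. */
class UserClassSketch {
    // Hypothetical classifiers: name -> function over the raw user-agent string.
    static final Map<String, Function<String, String>> CLASSIFIERS = new LinkedHashMap<>();
    static {
        CLASSIFIERS.put("userAgent", agent -> agent);
        CLASSIFIERS.put("mobile", agent -> String.valueOf(agent.matches(".*(Android|iOS).*")));
    }

    /** Returns the (classifier name = value) pairs that identify the user class. */
    static Map<String, String> userClass(String agent) {
        Map<String, String> pairs = new TreeMap<>();
        CLASSIFIERS.forEach((name, fn) -> pairs.put(name, fn.apply(agent)));
        return pairs;
    }
}
```

Two log lines whose maps are equal would fall into the same user class, so they could feed the same behaviour model.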
Infer the models
• BEAR infers a set of DTMCs
• Sequential and incremental process
• An independent DTMC for each user class
IP TIMESTAMP URL
1.1.1.1 - [20/Dec/2013:15:35:02] - /home/
2.2.2.2 - [20/Dec/2013:15:35:07] - /admin/login/
1.1.1.1 - [20/Dec/2013:15:35:12] - /anncs/sales/1756/
2.2.2.2 - [20/Dec/2013:15:35:19] - /admin/edit/
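A minimal sketch of how a DTMC might be estimated from such a log, assuming the entries have already been grouped by IP, ordered by timestamp, and translated into state names (for instance, 1.1.1.1 visits homepage and then sales_anncs). The class DtmcInference, the session lists, and the start state are illustrative assumptions, not BEAR's actual inference engine. Keeping raw counts makes the process incremental: new log lines only increase counters.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: estimates DTMC transition probabilities from sessions. */
class DtmcInference {
    /** Counts transitions between consecutive states of each session. */
    static Map<String, Map<String, Integer>> countTransitions(List<List<String>> sessions) {
        Map<String, Map<String, Integer>> counts = new LinkedHashMap<>();
        for (List<String> session : sessions) {
            for (int i = 0; i + 1 < session.size(); i++) {
                counts.computeIfAbsent(session.get(i), k -> new LinkedHashMap<>())
                      .merge(session.get(i + 1), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Normalizes each row of counts into relative frequencies. */
    static Map<String, Map<String, Double>> toProbabilities(Map<String, Map<String, Integer>> counts) {
        Map<String, Map<String, Double>> dtmc = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, Integer>> row : counts.entrySet()) {
            int total = row.getValue().values().stream().mapToInt(Integer::intValue).sum();
            Map<String, Double> probs = new LinkedHashMap<>();
            row.getValue().forEach((target, n) -> probs.put(target, (double) n / total));
            dtmc.put(row.getKey(), probs);
        }
        return dtmc;
    }
}
```

With one session start, homepage, sales_anncs, end and another start, homepage, end, the estimated probability of moving from homepage to sales_anncs is 0.5.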
Annotating the models (extensibility)
• Rewards: domain-specific metrics of interest
• Number of announcements displayed
• DB queries
Specifying the properties (generality)
• Probabilistic Computation Tree Logic (PCTL) augmented with rewards
• BEAR property = scope + PCTL formula
{userAgent = “(.*)Mozilla(.*)”} P=? [F contact_requested]
{userAgent = “(.*)(Android|iOS)(.*)”} R=? [F end]
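A query such as P=? [F contact_requested] asks for the probability of eventually reaching a state labelled contact_requested. The sketch below approximates that quantity by simple value iteration on a DTMC stored as nested maps. It only illustrates what the formula computes: the class Reachability, the map representation, and identifying states with their labels are assumptions, and a probabilistic model checker is what would be used in practice.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: approximates the probability of eventually reaching a target state. */
class Reachability {
    /**
     * Iterates x(s) = sum over t of P(s, t) * x(t), with x(target) fixed at 1
     * and every other state starting at 0.
     */
    static Map<String, Double> eventually(Map<String, Map<String, Double>> dtmc,
                                          String target, int iterations) {
        Map<String, Double> x = new HashMap<>();
        for (String s : dtmc.keySet()) x.put(s, 0.0);
        x.put(target, 1.0);
        for (int i = 0; i < iterations; i++) {
            Map<String, Double> next = new HashMap<>(x);
            for (Map.Entry<String, Map<String, Double>> row : dtmc.entrySet()) {
                if (row.getKey().equals(target)) continue; // target stays at 1
                double sum = 0.0;
                for (Map.Entry<String, Double> t : row.getValue().entrySet()) {
                    // States without outgoing transitions (e.g. end) contribute 0.
                    sum += t.getValue() * x.getOrDefault(t.getKey(), 0.0);
                }
                next.put(row.getKey(), sum);
            }
            x = next;
        }
        return x;
    }
}
```

For a chain where start always leads to homepage, and homepage leads to contact_requested with probability 0.25 and to end otherwise, the value computed for start is 0.25.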
Querying the models (automation)
• The scope identifies the set of relevant DTMCs among the inferred models
• The BEAR analysis engine composes the selected DTMCs into a single one
• PCTL verification is performed with PRISM on the composed model
Model Composition
• Union of the sets of states of the input DTMCs
• Law of total probability to compute transitions
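One plausible reading of these two steps is sketched below: the merged transition probability from s to t is the sum, over the user classes c, of P(c | s) times the probability of s to t in the DTMC of class c. The weight P(c | s) is assumed here to be the share of departures from s that class c contributed. The class DtmcComposition, the visit-count maps, and that weighting are assumptions for illustration, not BEAR's documented algorithm.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: merges per-class DTMCs with the law of total probability. */
class DtmcComposition {
    /**
     * dtmcs.get(c) holds the transition probabilities of class c;
     * visits.get(c) holds how often class c was observed leaving each state.
     */
    static Map<String, Map<String, Double>> compose(
            List<Map<String, Map<String, Double>>> dtmcs,
            List<Map<String, Integer>> visits) {
        Map<String, Map<String, Double>> merged = new TreeMap<>();
        for (int c = 0; c < dtmcs.size(); c++) {
            for (Map.Entry<String, Map<String, Double>> row : dtmcs.get(c).entrySet()) {
                String state = row.getKey();
                int total = 0;
                for (Map<String, Integer> v : visits) total += v.getOrDefault(state, 0);
                if (total == 0) continue; // no observations for this state
                // Assumed weight: P(class c | state).
                double weight = (double) visits.get(c).getOrDefault(state, 0) / total;
                Map<String, Double> out = merged.computeIfAbsent(state, k -> new TreeMap<>());
                for (Map.Entry<String, Double> t : row.getValue().entrySet()) {
                    out.merge(t.getKey(), weight * t.getValue(), Double::sum);
                }
            }
        }
        return merged;
    }
}
```

If one class left homepage three times, always towards sales_anncs, and another left it once with an even split between sales_anncs and renting_anncs, the merged row for homepage is 0.875 and 0.125, which still sums to 1.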
BEAR at work
• Detecting navigational anomalies:
• A difference between the actual and the expected user navigation actions.
• Comparing the BEAR models with the site map:
{} P=? [X s_i] {s_j}
• Measuring behaviours and attitudes
• {} P=? [(F sales_anncs) & (!(F renting_anncs))]
• {(?!(.*)(Android|iOS))(.*)} R=? [F end {sales_anncs}]
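The site-map comparison can be pictured as follows: any transition with non-zero inferred probability that the site map does not list is a candidate navigational anomaly. This sketch works directly on a nested-map DTMC rather than issuing one P=? [X s_i] {s_j} query per pair of states. The class AnomalyCheck and the site-map representation are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: flags observed transitions that the site map does not allow. */
class AnomalyCheck {
    /** Returns "source -> target" for every observed transition missing from the site map. */
    static List<String> unexpectedTransitions(Map<String, Map<String, Double>> dtmc,
                                              Map<String, Set<String>> siteMap) {
        List<String> anomalies = new ArrayList<>();
        for (Map.Entry<String, Map<String, Double>> row : dtmc.entrySet()) {
            Set<String> allowed = siteMap.getOrDefault(row.getKey(), Set.of());
            for (Map.Entry<String, Double> t : row.getValue().entrySet()) {
                if (t.getValue() > 0.0 && !allowed.contains(t.getKey())) {
                    anomalies.add(row.getKey() + " -> " + t.getKey());
                }
            }
        }
        return anomalies;
    }
}
```

A site map that links homepage only to sales_page and renting_page, combined with an inferred transition from homepage to a hypothetical admin_edit state, would report "homepage -> admin_edit".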