
A Framework to Evaluate Intelligent Environments

Chao Chen

Supervisor: Dr. Sumi Helal
Mobile & Pervasive Computing Lab, CISE Department
April 21, 2007

Motivation: Mark Weiser’s Vision

“The most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they are indistinguishable from it…” (Scientific American, 1991)

An increasing number of deployments in the past 16 years:
Lab: Gaia, Gator Tech Smart House, Aware Home, etc.
Real world: iHospital…

The Big Question: Are we there yet?
Our research community needs a ruler: quantitative metrics, a benchmark (suite), a common set of scenarios...

Conventional Performance Evaluation
Performance evaluation is not a new idea.
Evaluation parameters:

System throughput, transmission rate, response time, …
Evaluation approaches:

Test bed
Simulation / Emulation
Theoretical model (queueing theory, Petri nets, Markov chains, Monte Carlo simulation…; see the sketch after this list)
Evaluation tools:

Performance monitoring: MetaSim Tracer (memory), PAPI, HPCToolkit, Sigma++ (memory), DPOMP (OpenMP), mpiP, gprof, psrun, …

Modeling/analysis/prediction: MetaSim Convolver (memory), DIMEMAS (network), SvPablo (scalability), Paradyn, Sigma++, …

Runtime adaptation: Active Harmony, SALSA
Simulation: ns-2 (network), netwiser (network), …
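As an illustration of the theoretical-model approach listed above, here is a minimal sketch assuming an M/M/1 queue as an abstraction of a single service; the arrival and service rates are made-up examples, not measurements from any of these systems.

```python
# Minimal sketch: mean response time predicted by an M/M/1 queueing model.
# The rates used below are illustrative assumptions only.

def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean response time W = 1 / (mu - lambda) for a stable M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)

# e.g. 40 requests/s offered to a service that can handle 50 requests/s
print(mm1_response_time(40.0, 50.0))  # 0.1 s mean response time
```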

All déjà vu again? When it comes to pervasive computing, questions emerge: Does the same set of parameters apply? Are conventional tools sufficient? I have tons of performance data; now what?

It is not feasible to bluntly apply conventional evaluation methods for hardware, database or distributed systems to pervasive computing systems.

Pervasive computing systems are heterogeneous, dynamic, and heavily context-dependent. Evaluation of PerCom systems requires new thinking.

Related Work
Performance evaluations in related areas:

Atlas, University of Florida. Metrics: Scalability (memory usage / number of sensors)

one.world, University of Washington. Metrics: Throughput (tuples / time, tuples / senders)

PICO, University of Texas at Arlington. Metrics: Latency (Webcast latency / duration)
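The metrics behind these evaluations are simple ratios. A minimal sketch of how they might be computed from logged measurements follows; the function names and arguments are illustrative assumptions, not code from Atlas, one.world, or PICO.

```python
# Minimal sketch (hypothetical helpers): ratio-style metrics of the kind
# reported in the related work above.

def memory_per_sensor(memory_usage_pct: float, sensor_count: int) -> float:
    """Scalability in the Atlas style: memory usage (%) per connected sensor."""
    return memory_usage_pct / sensor_count

def tuple_throughput(tuples_delivered: int, elapsed_s: float) -> float:
    """Throughput in the one.world style: tuples per second."""
    return tuples_delivered / elapsed_s

def mean_webcast_latency(latencies_s: list[float]) -> float:
    """Latency in the PICO style: average webcast latency over a session."""
    return sum(latencies_s) / len(latencies_s)
```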

[Figure: Memory Usage (%) versus Number of Sensors (0–500), with and without Application Service; series: sensors only, and application service with sensors, each with a polynomial trend line.]

We are measuring different things, applying different metrics, and evaluating systems of different architectures.

Challenges
Pervasive computing systems are diverse.
Performance metrics: a panacea for all?
Taxonomy: a classification of PerCom systems.

Taxonomy

Systems perspective:
Centralized vs. Distributed
Stationary vs. Mobile
Geographic span: body-area, building, urban computing

Users perspective:
Application domain: mission-critical, auxiliary, remedial
User-interactivity: proactive, reactive

Performance Factors:
• Scalability
• Heterogeneity
• Consistency / coherency
• Communication cost / performance
• Resource constraints
• Energy
• Size / weight
• Responsiveness
• Throughput
• Transmission rate
• Failure rate
• Availability
• Safety
• Privacy & trust
• Context sentience
• Quality of context
• User intention prediction
• …


Outline

Taxonomy
Common Set of Scenarios
Evaluation Metrics

A Common Set of Scenarios
Re-defining research goals:

A variety of understandings and interpretations of pervasive computing

What researchers design may not be exactly what users expect

Evaluating pervasive computing systems is a process involving two steps: Are we building the right thing? (Validation) Are we building things right? (Verification)

A common set of scenarios defines: the capabilities a PerCom system should have, and the parameters to be examined when evaluating how well these capabilities are achieved.

Common Set of Scenarios
Setting: Smart House
Scenario: plasma display burnt out
System capabilities: service composability, fault resilience, heterogeneity compliance
Performance parameters: failure rate, availability, recovery time
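A minimal sketch of how these three parameters could be computed from a failure log; the MTBF/MTTR-based availability definition is a standard one assumed here, not taken from the slides.

```python
# Minimal sketch (assumed standard definitions) for the Smart House parameters.

def failure_rate(failures: int, operating_hours: float) -> float:
    """Failures per hour of operation."""
    return failures / operating_hours

def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def mean_recovery_time(recovery_times_min: list[float]) -> float:
    """Average time to restore the affected service after a failure."""
    return sum(recovery_times_min) / len(recovery_times_min)

print(round(availability(500.0, 2.0), 4))  # 0.996
```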

Common Set of Scenarios
Setting: Smart Office
Scenario: real-time location tracking, system overload, location prediction
System capabilities: adaptivity, proactivity, context sentience
Performance parameters: scalability, quality of context (freshness & precision), prediction rate
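A minimal sketch of two of these parameters under assumed definitions: quality of context approximated by the age of a reading (freshness), and prediction rate as the fraction of correct location predictions.

```python
import time

# Minimal sketch (assumed definitions) for the Smart Office parameters.

def context_freshness(reading_timestamp: float, now: float | None = None) -> float:
    """Age of a context reading in seconds; smaller means fresher."""
    current = time.time() if now is None else now
    return current - reading_timestamp

def prediction_rate(correct_predictions: int, total_predictions: int) -> float:
    """Fraction of location predictions matching the user's actual next location."""
    return correct_predictions / total_predictions if total_predictions else 0.0

print(prediction_rate(17, 20))  # 0.85
```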

Parameters

The taxonomy and the common set of scenarios enable us to identify performance parameters.

Observations: quantifiable vs. non-quantifiable parameters; parameters do not contribute equally to overall performance.

Performance metrics:
Quantifiable parameters: measurement
Non-quantifiable parameters: analysis & testing
Parameters may have different “weights” (see the sketch below).
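A minimal sketch of one way such weights could be applied, assuming each quantifiable parameter has already been normalized to a 0..1 score; the parameter names and weights are illustrative.

```python
# Minimal sketch (illustrative names and weights): weighted aggregation of
# normalized parameter scores, reflecting that parameters contribute unequally.

def weighted_score(normalized: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of normalized (0..1) parameter values."""
    total_weight = sum(weights[p] for p in normalized)
    return sum(normalized[p] * weights[p] for p in normalized) / total_weight

score = weighted_score(
    {"availability": 0.99, "scalability": 0.70, "prediction_rate": 0.85},
    {"availability": 3.0, "scalability": 2.0, "prediction_rate": 1.0},
)
print(round(score, 2))  # 0.87
```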

Quantifiable Parameters: Characteristics

System-related parameters (by measurement):
System performance, node-level characteristics, communication performance & cost, service and application software footprint, context characteristics, power profiles, security and privacy, data storage and manipulation, economical considerations, quality of context, knowledge representation, programming efficiency, architectural characteristics, reliability and fault-tolerance, adaptivity characteristics, scalability, standardization characteristics, adaptivity and self-organization

Usability-related parameters (by survey of users):
Effectiveness, acceptance, functionalities, performance, need, modality, learning curve, expectation, interface to backend and peer systems, measurement regarding users’ effort, knowledge/experience, dummy compliance, correctness of user intention prediction, attitude toward technology

Conclusion & Future Work
Contributions:
Performed a taxonomy of existing pervasive computing systems
Proposed a set of common scenarios as an evaluation benchmark
Identified evaluation metrics (a set of parameters) for pervasive computing systems

With the performance parameters listed, can we evaluate/measure them? How?

A test bed
• + measurement in a real setting
• - expensive, difficult to set up/maintain, replay is difficult
Simulation / Emulation
• + reduced cost, quick set-up, consistent replay, safe
• - not reality, needs modeling and validation
Theoretical model: abstraction of the pervasive space at a higher level
• Analytical
• Empirical
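As a side note on the simulation trade-off above, a minimal sketch (with a hypothetical event generator) of why simulation supports consistent replay: the same seed reproduces the same event trace, which a physical test bed cannot guarantee.

```python
import random

# Minimal sketch (hypothetical generator): a seeded sensor-event trace that
# can be replayed exactly, illustrating the "consistent replay" advantage.

def replay_sensor_events(seed: int, n_events: int = 5) -> list[tuple[float, str]]:
    """Generate a reproducible trace of timestamped sensor events."""
    rng = random.Random(seed)
    t, events = 0.0, []
    for _ in range(n_events):
        t += rng.expovariate(1.0)  # exponential inter-arrival times
        events.append((round(t, 3), rng.choice(["motion", "temperature", "pressure"])))
    return events

assert replay_sensor_events(42) == replay_sensor_events(42)  # identical traces
```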

Thank you!