An Introduction to the Prescience Lab Peter A. Dinda Prescience Lab Department of Computer Science Northwestern University http://plab.cs.northwestern.edu
Date posted: 21-Dec-2015

An Introduction to the Prescience Lab

Peter A. Dinda

Prescience Lab

Department of Computer Science

Northwestern University

http://plab.cs.northwestern.edu

2

Outline

• Motivations

• Questions

• Projects

• Conclusions

3

How do we deliver arbitrary amounts of computational power to ordinary people?

Assumptions: shared computing environments, limited utility of reservations

4

How do we deliver arbitrary amounts of computational power to ordinary people?

Distributed and Parallel Computing

Interactive Applications

5

• How do we build adaptive distributed interactive applications effectively?

• How does the demand for resources in these applications vary over time?

• How does the supply of resources vary over time?

• How can we use the adaptation mechanisms exposed by an application to match its resource demand with resource supply?

6

How do we build adaptive distributed interactive applications effectively?

• Applications
  – Virtualized Audio
    • Immersive audio
  – Interactive visualization of massive datasets
• Frameworks
  – Virtuoso
    • Grid computing using virtual machines
  – Dv

7

Virtualized Audio (with Dong Lu, Curtis Barrett)

[Figure: a user with microphones, headphones, GPS, head-tracking, wireless connectivity, and limited local computation, connected to distributed computational resources and to other users or audio sources]

8

Virtualized Audio: Interactive Auralization

[Figure: interactive auralization. Labels: listener at virtual location, headphones, auralization, sound field 2, virtual performer, HRTF, listener, performer, room, virtual listening room]

• Auralization injects performer into listener’s space
• Auralization adapts as listener moves or room changes
  • Recomputes impulse responses

9

Architecture of Interactive Auralization

[Figure: architecture diagram. A user-driven immersive audio client sends the current spatial model and source/sink positions to a scalable real-time simulation server, where parallel FD simulations feed filter generation. The resulting filter configuration drives a scalable audio filtering service (a master filtering server, filtering servers, and mixing servers) that processes sources 1 to n from a streaming audio service and returns left and right channels to the client as binaural audio output.]

Impulse response filters characterize user’s space

10

Adaptation in Virtualized Audio

• Numerous mechanisms
  • Sampling rate, impulse response length, algorithm for computing impulse response, filter approximations, server selection, …

• Can vary computational load over many orders of magnitude

• Compute/communicate ratio is huge

• How do we use these mechanisms to achieve consistent real-time response?
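A rough sense of why these mechanisms span orders of magnitude: a minimal sketch, assuming direct time-domain FIR filtering where each output sample costs one multiply-add per filter tap. The function name `filtering_load` and the example settings are illustrative assumptions, not figures from the Virtualized Audio system.

```python
def filtering_load(sample_rate_hz, impulse_response_s, num_sources, channels=2):
    """Multiply-adds per second for direct FIR filtering (assumed cost model)."""
    # Number of filter taps grows with both sampling rate and response length.
    taps = round(sample_rate_hz * impulse_response_s)
    # Every output sample of every source and channel touches every tap.
    return sample_rate_hz * taps * num_sources * channels


# Hypothetical low-quality and high-quality operating points.
low = filtering_load(8_000, 0.05, num_sources=1)
high = filtering_load(48_000, 2.0, num_sources=8)
ratio = high // low
```

Under these assumed settings the two operating points differ by about four orders of magnitude, before the choice of impulse response algorithm or server is even considered.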

11

Virtuoso (with Renato Figueiredo, Jose Fortes, Ananth Sundararaj, Ashish Gupta)

• Make Grids like PCs
  – User gets raw machine(s)
  – Machine appears to be on his network
  – User can install what he needs as owner
• Lower level of abstraction
  – Classic virtual machine monitors
  – Virtual networking
• Middleware support
  – Instantiation, migration of machines
  – Connectivity to remote files, machines
  – Resource control

12

Classic Virtual Machine: VMware

13

Why Virtual Networking?

• A running machine is suddenly plugged into your network. What happens?
  – Does it get an IP address?
  – Is it a routable address?
  – Does the firewall let its traffic through?
  – To any port?

Virtual machine hostile environment

14

A Simple Layer 2 Virtual Network

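[Figure: a client on the friendly local network and a server on the hostile remote network, each with a physical NIC. The server runs a VM monitor hosting the remote VM and its virtual NIC. The client reaches the server over SSH.]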

16

A Simple Layer 2 Virtual Network

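[Figure: a client on the friendly local network and a server on the hostile remote network, each with a physical NIC. A bridge on the client side and a bridge at the VM monitor on the server side are joined by an SSH tunnel, connecting the remote VM's virtual NIC to the local network.]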
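One way to picture bridging at layer 2 over a tunnel: a minimal sketch, assuming each Ethernet frame is carried over the tunnel's byte stream with a 2-byte length prefix. The framing format and the names `encapsulate` and `decapsulate` are assumptions for illustration; the slides do not specify how Virtuoso frames its traffic.

```python
import struct


def encapsulate(frame: bytes) -> bytes:
    """Prefix a raw Ethernet frame with its length (assumed wire format)."""
    return struct.pack("!H", len(frame)) + frame


def decapsulate(stream: bytes):
    """Split a received byte stream into complete frames plus leftover bytes."""
    frames = []
    offset = 0
    while offset + 2 <= len(stream):
        (length,) = struct.unpack_from("!H", stream, offset)
        if offset + 2 + length > len(stream):
            break  # frame not fully received yet; wait for more bytes
        frames.append(stream[offset + 2 : offset + 2 + length])
        offset += 2 + length
    return frames, stream[offset:]
```

A bridge on each end would read frames from its local interface, call `encapsulate`, and write the result into the tunnel; the far end calls `decapsulate` and injects the recovered frames into its own interface.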

17

Bootstrapping the Virtual Network

• Star topology always possible
  • TCP session from client must have been possible
• Better topology may be possible
  • Depends on security at each site
• Topology may change
  • Virtual machines can migrate
• Bootstrap to higher layers
  • Virtual filesystems

18

How does the demand for resources vary over time? How does the supply of resources vary over time?

• Resource demand in interactive applications
  – Instrumented games, preceding applications, …
  – Not much is known here
• Resource supply in distributed environments
  – URGIS
    • Grid information based on the relational data model
  – GridG
  – Clairvoyance
    • Online resource prediction for hosts and networks
  – Tsunami
    • Wavelet-based approaches to information dissemination
  – Diffusion
    • Zero-cost information dissemination

19

URGIS (with Beth Plale, Dong Lu)

• Unified Relational Grid Information Services
  – GIS based on the relational data model
  – Leverage results from database community
  – Northwestern work: MySQL, Oracle RDBMSes
• Compositional queries
  – Application-specific information aggregation
  – Like decision support queries (TPC-H)
• Support for information of varying dynamicity
  – Varying update rates and freshness requirements
  – Seamless inclusion of streaming data
• A common data model and query language
  – Powerful, high level, declarative, easy-to-optimize

20

Compositional Queries

• “Find four different hosts with a total memory between 512 MB and 1 GB”

• “Find all available sensors and predictors that provide information about the network path between a and b”

• “Tell me when the load on any of these four hosts diverges from the average by more than 50%”

21

Example

select host1.name, host2.name, host3.name, host4.name,
       hd1.mem+hd2.mem+hd3.mem+hd4.mem as TotalMem
from hosts as host1, hostdata as hd1,
     hosts as host2, hostdata as hd2,
     hosts as host3, hostdata as hd3,
     hosts as host4, hostdata as hd4
where host1.ip=hd1.ip and host2.ip=hd2.ip and
      host3.ip=hd3.ip and host4.ip=hd4.ip and
      hd1.mem+hd2.mem+hd3.mem+hd4.mem>=512 and
      hd1.mem+hd2.mem+hd3.mem+hd4.mem<=1024 and
      host1.ip!=host2.ip and host1.ip!=host3.ip and
      host1.ip!=host4.ip and host2.ip!=host3.ip and
      host2.ip!=host4.ip and host3.ip!=host4.ip
order by TotalMem desc
limit 10
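The four-host query can be tried on a toy database. This is a minimal sketch using Python's built-in `sqlite3` module rather than the MySQL or Oracle systems named earlier; the six hosts, their addresses, and their memory sizes are made up for illustration, and only the table and column names (`hosts`, `hostdata`, `ip`, `name`, `mem`) come from the slide.

```python
import sqlite3

FOUR_HOST_QUERY = """
select host1.name, host2.name, host3.name, host4.name,
       hd1.mem + hd2.mem + hd3.mem + hd4.mem as TotalMem
from hosts as host1, hostdata as hd1,
     hosts as host2, hostdata as hd2,
     hosts as host3, hostdata as hd3,
     hosts as host4, hostdata as hd4
where host1.ip = hd1.ip and host2.ip = hd2.ip
  and host3.ip = hd3.ip and host4.ip = hd4.ip
  and hd1.mem + hd2.mem + hd3.mem + hd4.mem >= ?
  and hd1.mem + hd2.mem + hd3.mem + hd4.mem <= ?
  and host1.ip != host2.ip and host1.ip != host3.ip
  and host1.ip != host4.ip and host2.ip != host3.ip
  and host2.ip != host4.ip and host3.ip != host4.ip
order by TotalMem desc
limit ?
"""


def make_toy_database():
    """Build an in-memory database with six hypothetical hosts (memory in MB)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table hosts (ip text primary key, name text)")
    conn.execute("create table hostdata (ip text primary key, mem integer)")
    for i, mem in enumerate([128, 128, 128, 128, 256, 512], start=1):
        ip = f"10.0.0.{i}"
        conn.execute("insert into hosts values (?, ?)", (ip, f"host{i}"))
        conn.execute("insert into hostdata values (?, ?)", (ip, mem))
    return conn


def find_four_hosts(conn, low_mb=512, high_mb=1024, limit=10):
    """Run the compositional query with the memory bounds as parameters."""
    return conn.execute(FOUR_HOST_QUERY, (low_mb, high_mb, limit)).fetchall()


rows = find_four_hosts(make_toy_database())
```

Because the query only requires the four addresses to differ, each set of four hosts appears once per ordering, which is part of why the join grows so quickly with the number of hosts.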

22

Time-bounded, non-deterministic queries

select nondeterministically host1.name, host2.name, host3.name, host4.name,
       hd1.mem+hd2.mem+hd3.mem+hd4.mem as TotalMem
from hosts as host1, hostdata as hd1,
     hosts as host2, hostdata as hd2,
     hosts as host3, hostdata as hd3,
     hosts as host4, hostdata as hd4
where host1.ip=hd1.ip and host2.ip=hd2.ip and
      host3.ip=hd3.ip and host4.ip=hd4.ip and
      hd1.mem+hd2.mem+hd3.mem+hd4.mem>=512 and
      hd1.mem+hd2.mem+hd3.mem+hd4.mem<=1024 and
      host1.ip!=host2.ip and host1.ip!=host3.ip and
      host1.ip!=host4.ip and host2.ip!=host3.ip and
      host2.ip!=host4.ip and host3.ip!=host4.ip
order by TotalMem desc
limit 10
in less than 5 seconds
using heuristic prefer_depth_first

23

Implementation of Non-deterministic, Time-bounded Queries

• Random number associated with each row in each table (on insert)

• Query is rewritten to incorporate random ranges on the input tables

• Range lengths chosen to meet deadline
  – This is not trivial and we don’t have this translation yet

• Heuristics not yet incorporated
• Hopefully RDBMS-independent
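The rewriting idea can be sketched as follows, assuming a two-host version of the query, SQLite as the database, and a caller-supplied selection probability in place of the deadline-to-range translation that the slide says does not exist yet. The column name `rnd` and the function names are assumptions for illustration.

```python
import random
import sqlite3


def make_db(n_hosts, seed=0):
    """Create a hostdata table where every row carries a random number in [0, 1)."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table hostdata (ip text primary key, mem integer, rnd real)"
    )
    rows = [
        (f"10.0.{i // 256}.{i % 256}", rng.choice([128, 256, 512, 1024]), rng.random())
        for i in range(n_hosts)
    ]
    conn.executemany("insert into hostdata values (?, ?, ?)", rows)
    return conn


def two_hosts_nondeterministic(conn, total_mb, selection_probability, rng):
    """Join only the rows whose random number falls in a randomly placed range."""
    p = selection_probability
    start_a = rng.uniform(0, 1 - p)  # independent range per input table
    start_b = rng.uniform(0, 1 - p)
    query = """
        select a.ip, b.ip, a.mem + b.mem
        from hostdata as a, hostdata as b
        where a.rnd >= ? and a.rnd < ?
          and b.rnd >= ? and b.rnd < ?
          and a.ip != b.ip and a.mem + b.mem = ?
    """
    args = (start_a, start_a + p, start_b, start_b + p, total_mb)
    return conn.execute(query, args).fetchall()
```

Each call samples a different slice of every input table, so the answer is a random subset of the full result and the join shrinks roughly with the square of the selection probability in the two-host case.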

24

RGIS1 Non-deterministic Query Performance

Find n hosts with a total memory of 1 GB

[Figure: query time and number of results, on logarithmic axes, versus number of hosts in the join (1 to 5), for a database of 100,000 hosts]

25

RGIS1 Non-deterministic Query Performance

Find 2 hosts with a total memory of 1 GB

[Figure: query time and number of results, on logarithmic axes, versus selection probability (0.0005 to 0.1), for a database of 100,000 hosts]

26

Clairvoyance (with Jason Skicewicz, Yi Qiao)

• Measure, Characterize, Predict, and Disseminate information about dynamic resource supply

• Resource signals
  – Discrete-time signals strongly correlated with resource supply
  – Currently univariate, working on multivariate
  – Currently
    • Host load
    • Windows performance counters (using WatchTower)
    • Network flow bandwidth and latency (using Remos)
    • Any text-based source
• Online predictive modeling
  – Simple models (MEAN, BESTMEAN, BESTMEDIAN, LAST, …)
  – Box/Jenkins models (AR, MA, ARMA, ARIMA, …)
  – Fractional ARIMAs
  – Nonlinear modeling (TARs, wavelet decompositions)
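To make the model families concrete, here is a minimal sketch of one-step-ahead predictors on a resource signal. It is not the RPS implementation: the AR(1) fit uses a simple lag-1 autocorrelation estimate, and `best_mean_window` is one plausible reading of a BESTMEAN-style model (pick the averaging window with the lowest past error), stated here as an assumption.

```python
def predict_last(history):
    """LAST: the next value equals the most recent one."""
    return history[-1]


def predict_mean(history, window):
    """Windowed mean: average of the most recent `window` samples."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def best_mean_window(history, max_window):
    """Pick the window whose windowed mean had the lowest one-step squared error."""
    def total_error(window):
        return sum(
            (history[t] - sum(history[t - window : t]) / window) ** 2
            for t in range(max_window, len(history))
        )
    return min(range(1, max_window + 1), key=total_error)


def predict_ar1(history):
    """AR(1): mean plus a fraction of the last deviation from the mean."""
    mean = sum(history) / len(history)
    centered = [x - mean for x in history]
    r0 = sum(c * c for c in centered)
    r1 = sum(a * b for a, b in zip(centered, centered[1:]))
    phi = r1 / r0 if r0 else 0.0
    return mean + phi * (history[-1] - mean)
```

On a strongly alternating signal, for example, LAST is the worst possible choice while an AR(1) model learns the negative correlation, which is why a toolkit offers several models and compares them on the signal at hand.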

27

RPS Toolkit

• Extensible toolkit for implementing resource signal prediction systems [CMU-CS-99-138]
  • Growing: RTA, RTSA, Wavelets, GUI, etc.
• Easy “buy-in” for users
  • C++ and sockets (no threads)
  • Prebuilt prediction components
  • Libraries (sensors, time series, communication)

28

Measurement and Prediction

29

Multiscale Network Prediction

• Large, recent study of predictability

• Hundreds of NLANR and other traces
  – Mostly WANs

• Different resolutions
  – Binning and low-pass via wavelets

• Sweet Spot
  – Predictability often maximized at particular resolution
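The sweet-spot effect can be reproduced on a synthetic signal. This is a minimal sketch, assuming binning by averaging and using the ratio of last-value prediction error to signal variance as the predictability measure (lower is better); the synthetic trace is invented for illustration and is not one of the traces in the study.

```python
def bin_signal(samples, bin_size):
    """Average consecutive samples into bins, dropping any partial final bin."""
    usable = len(samples) - len(samples) % bin_size
    return [
        sum(samples[i : i + bin_size]) / bin_size
        for i in range(0, usable, bin_size)
    ]


def last_value_error_ratio(signal):
    """Mean squared error of the LAST predictor divided by signal variance."""
    mean = sum(signal) / len(signal)
    variance = sum((x - mean) ** 2 for x in signal) / len(signal)
    errors = [(b - a) ** 2 for a, b in zip(signal, signal[1:])]
    return (sum(errors) / len(errors)) / variance


# Synthetic trace: a slow square wave (period 16) plus fast alternating noise.
trace = [(i // 8) % 2 + (0.5 if i % 2 else -0.5) for i in range(64)]
ratios = {b: last_value_error_ratio(bin_signal(trace, b)) for b in (1, 2, 4, 8)}
sweet_spot = min(ratios, key=ratios.get)
```

Here a bin size of 2 averages away the fast noise while still resolving the slow wave, so it predicts best; finer bins are dominated by noise and coarser bins start to alias the underlying pattern.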

30

Multiresolution Prediction Example

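[Figure: prediction results (vertical axis from 0 to 0.3, label not preserved) versus bin size in seconds (0.1 to 1000, logarithmic) for the predictors last, bm(8), ma(8), ar(8), ar(32), arma(4,4), arima(4,1,4), arima(4,2,4), and arfima(4,-1,4)]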

31

Tsunami (with Jason Skicewicz)

• Efficient dissemination of resource signals

• Wavelet-based methods for characterization, modeling, and prediction

• Tsunami toolkit will ship with the next RPS release

32

The Tension

[Figure: a sensor sends a resource signal (periodic sampling; example: host load) over the network to a video app and a grid app. Labels: coarse-grain measurement, resource-appropriate measurement, fine-grain measurement.]

33

Proposed System

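[Figure: the sensor's signal passes through a wavelet transform that produces levels 0 through M. The levels cross the network, and an inverse wavelet transform at the application reconstructs the signal from levels 0 through L. Other label: stream interval.]

Application receives levels based on its needs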
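A minimal sketch of the idea, assuming an unnormalized Haar wavelet (pairwise averages and differences) and a signal whose length is a multiple of 2 to the number of levels. The slides do not say which wavelet the system uses, so the filter choice and the function names are assumptions.

```python
def haar_decompose(signal, levels):
    """Return the coarsest approximation and the detail levels, finest first."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / 2 for a, b in pairs])
        approx = [(a + b) / 2 for a, b in pairs]
    return approx, details


def haar_reconstruct(approx, details, keep_levels):
    """Rebuild the signal using only the `keep_levels` coarsest detail levels."""
    signal = list(approx)
    for level in reversed(range(len(details))):
        wanted = (len(details) - level) <= keep_levels
        used = details[level] if wanted else [0.0] * len(details[level])
        # Invert the average/difference step: x0 = a + d, x1 = a - d.
        signal = [v for a, d in zip(signal, used) for v in (a + d, a - d)]
    return signal
```

An application that only needs a coarse view subscribes to the approximation and perhaps one detail level, and still gets a full-length but smoothed signal; an application that needs every sample subscribes to all levels and reconstructs the signal exactly.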

34

Delay

• Transforms introduce sample delay
  – Depends on number of levels and type of filter used
  – Exponential in the number of levels
  – Affects both streaming and block transforms
  – Seemingly inherent for wavelets
• Exploit prediction
  – Limited success
• Exploit “wavelet-like” decompositions
  – Trade-off between reconstruction accuracy and delay
  – Existing theory. Our evaluation not done yet.
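The exponential growth is easy to see for the simplest case. This sketch assumes a Haar block transform, where one coefficient at the coarsest of `levels` levels summarizes 2^levels input samples, so the oldest sample in the block waits for the rest to arrive. Longer filters add further delay that this sketch does not model.

```python
def haar_block_delay(levels, sample_period_s):
    """Wait, in seconds, imposed on the oldest sample of a Haar block."""
    block_length = 2 ** levels
    return (block_length - 1) * sample_period_s


# Delay for a 1 Hz signal (an assumed sampling rate) at increasing depth.
delays = [haar_block_delay(levels, 1.0) for levels in (1, 5, 10)]
```

At ten levels a 1 Hz signal already waits about seventeen minutes, which is why the slide looks to prediction and to wavelet-like decompositions for relief.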

35

Wavelets and Prediction

• Predict each level of transformed signal separately
  – “Detail signals”
    • Surprisingly ineffective in practice
    • Whitens the signal
  – “Approximation signals”
    • Smoothing, used in network prediction work discussed earlier
    • Reasonably effective, worth pursuing

36

Diffusion (with Brian Cornell, Jack Lange)

• Efficient dissemination of resource signals
• Piggyback additional information on existing packet transfers
  – No additional packets
  – Packet size unchanged
• Evaluations with traces, Minet
• Implementation as Linux kernel module
• >=86 bits per packet possible
• 17 bits per packet verified

Zero-Cost Information Dissemination

37

Diffusion Implementation

[Figure: two protocol stacks (App, Transport, Network, Data Link, Physical). On the sending side a sensor feeds header editing; on the receiving side data extraction feeds a consumer.]

Sensor data piggybacked on application packets
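Header editing can be sketched as follows. The slides do not say which header fields Diffusion rewrites, so this example makes a loud assumption: it reuses the 16-bit IPv4 identification field, which is only safe to overwrite for packets that will not be fragmented. It shows the two properties claimed above: the packet size is unchanged and no extra packets are sent.

```python
import struct

# Fixed 20-byte IPv4 header: version/IHL, TOS, total length, identification,
# flags/fragment offset, TTL, protocol, checksum, source, destination.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")
ID_FIELD, CHECKSUM_FIELD = 3, 7


def checksum(header: bytes) -> int:
    """Standard one's-complement sum over 16-bit words."""
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def piggyback(header: bytes, sensor_bits: int) -> bytes:
    """Write 16 bits of sensor data into the (assumed) spare field."""
    fields = list(IPV4_HEADER.unpack(header))
    fields[ID_FIELD] = sensor_bits & 0xFFFF
    fields[CHECKSUM_FIELD] = 0  # checksum is computed with this field zeroed
    fields[CHECKSUM_FIELD] = checksum(IPV4_HEADER.pack(*fields))
    return IPV4_HEADER.pack(*fields)


def extract(header: bytes) -> int:
    """Consumer side: read the piggybacked bits back out."""
    return IPV4_HEADER.unpack(header)[ID_FIELD]
```

The sender must recompute the header checksum after editing, otherwise the next hop would drop the packet; the consumer simply reads the field as packets pass by.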

38

SpyTalk

39

How can we use the adaptation mechanisms exposed by an application to match its resource demand with resource supply?

• Application-level performance predictions
  – Running Time Advisor
    • Confidence interval for running time of a task on a particular host
  – Message Time Advisor
    • Confidence interval for transfer time of a message
• Adaptation advisors
  – Real-time Scheduling Advisor
    • Chooses the host, from a set, on which a task is most likely to meet its deadline
• Real-time responsiveness requirement
  • Service for interactive applications
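The two advisors can be sketched together under a deliberately simple model. The assumptions here are not from the slides: a task with nominal (unloaded) time `nominal_s` is taken to run for `nominal_s * (1 + load)` on a host with the predicted load, and the load prediction error is treated as normally distributed. All function names and host figures are hypothetical.

```python
from statistics import NormalDist


def running_time_interval(nominal_s, load_mean, load_std, confidence=0.95):
    """Return (low, expected, high) running time under the assumed model."""
    expected = nominal_s * (1 + load_mean)
    spread = nominal_s * load_std
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return expected - z * spread, expected, expected + z * spread


def deadline_probability(nominal_s, load_mean, load_std, deadline_s):
    """Probability that the task finishes by the deadline on this host."""
    expected = nominal_s * (1 + load_mean)
    spread = nominal_s * load_std
    if spread == 0:
        return 1.0 if expected <= deadline_s else 0.0
    return NormalDist(expected, spread).cdf(deadline_s)


def choose_host(nominal_s, deadline_s, hosts):
    """Pick the host name with the highest chance of meeting the deadline."""
    return max(
        hosts,
        key=lambda name: deadline_probability(nominal_s, *hosts[name], deadline_s),
    )


# Hypothetical (predicted load mean, prediction error std) per host.
hosts = {"a": (0.5, 0.1), "b": (1.5, 0.1), "c": (0.6, 0.8)}
best = choose_host(nominal_s=10, deadline_s=18, hosts=hosts)
```

The point of returning an interval rather than a single number is visible in host "c": its expected time is close to that of "a", but its wide error makes the deadline far less certain.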

40

Running Time Advisor

41

Real-time Scheduling Advisor

42

• How do we build adaptive distributed interactive applications effectively?

• How does the demand for resources in these applications vary over time?

• How does the supply of resources vary over time?

• How can we use the adaptation mechanisms exposed by an application to match its resource demand with resource supply?

43

How do we deliver arbitrary amounts of computational power to ordinary people?

Distributed and Parallel Computing

Interactive Applications

44

Future Directions

• Continue pushing on projects discussed

• New directly related projects
  – Interactive hierarchical visualization of huge datasets
  – Resource demand characterization, modeling, and prediction

• Other directions
  – Intrusion detection using signal processing

45

For More Information

• Peter Dinda
  – http://www.cs.northwestern.edu/~pdinda

• Prescience Lab
  – http://plab.cs.northwestern.edu