An Introduction to the Prescience Lab
Peter A. Dinda
Prescience Lab
Department of Computer Science
Northwestern University
http://plab.cs.northwestern.edu
3
How do we deliver arbitrary amounts of computational power to ordinary people?
Assumptions: Shared computing environments, Limited utility of reservations
4
How do we deliver arbitrary amounts of computational power to ordinary people?
Distributed and Parallel Computing
Interactive Applications
5
• How do we build adaptive distributed interactive applications effectively?
• How does the demand for resources in these applications vary over time?
• How does the supply of resources vary over time?
• How can we use the adaptation mechanisms exposed by an application to match its resource demand with resource supply?
6
How do we build adaptive distributed interactive applications effectively?
• Applications
– Virtualized Audio
• Immersive audio
– Interactive visualization of massive datasets
• Frameworks
– Virtuoso
• Grid computing using virtual machines
– Dv
7
Virtualized Audio (with Dong Lu, Curtis Barrett)
Distributed Computational Resources
Other Users or Audio Sources
Microphones, Headphones
GPS, head-tracking
Wireless connectivity
Limited local computation
8
Virtualized Audio: Interactive Auralization
[Figure labels: Listener at Virtual Location; Headphones; Auralization; Sound Field 2; Virtual Performer; HRTF; Listener; Performer; Room; Virtual Listening Room]
• Auralization injects performer into listener’s space
• Auralization adapts as listener moves or room changes
• Recomputes impulse responses
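The bullets above describe placing a performer in the listener's space through impulse responses. A minimal sketch of that idea, assuming auralization reduces to convolving the dry source signal with one impulse response per ear (the function names and the toy impulse responses are illustrative, not taken from the slides):

```python
def convolve(signal, impulse_response):
    """Direct-form FIR filtering: y[n] = sum over k of h[k] * x[n - k]."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def auralize(source, left_ir, right_ir):
    """Return (left, right) channels for one source.

    Assumption: each ear's impulse response already combines the room
    response and the HRTF for the current listener position.
    """
    return convolve(source, left_ir), convolve(source, right_ir)


# Toy data (hypothetical): a short source and two tiny impulse responses.
source = [1.0, 0.5, 0.25]
left_ir = [1.0, 0.0, 0.5]   # direct sound plus an echo two samples later
right_ir = [0.0, 0.8]       # one sample later and attenuated
left, right = auralize(source, left_ir, right_ir)
```

When the listener moves or the room changes, only `left_ir` and `right_ir` would need to be recomputed; the filtering step itself stays the same.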
9
Architecture of Interactive Auralization
[Figure: architecture diagram. Components: Client (User-driven Immersive Audio Client); Streaming Audio Service (Source 1 … Source n); Scalable Audio Filtering Service (master filtering server, filtering servers, mixing servers; filter configuration; left and right channels; binaural audio output); Scalable Real-time Simulation Server (parallel FD simulations, filter generation). Annotations: “Current spatial model and source/sink positions”; “Impulse response filters characterize user’s space”]
10
Adaptation in Virtualized Audio
• Numerous mechanisms
– Sampling rate, impulse response length, algorithm for computing impulse response, filter approximations, server selection, …
• Can vary computational load over many orders of magnitude
• Compute/communicate ratio is huge
• How do we use these mechanisms to achieve consistent real-time response?
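To make the "orders of magnitude" point concrete, here is a rough sketch that assumes direct FIR filtering, where the cost per source and per ear is the sampling rate times the number of filter taps. The cost model, the candidate settings, and the rule "pick the most expensive feasible configuration as a proxy for quality" are all assumptions for illustration, not the lab's method.

```python
from itertools import product


def filtering_cost(sample_rate_hz, ir_seconds, sources):
    """Multiply-accumulates per second for direct FIR filtering, two ears."""
    taps = round(sample_rate_hz * ir_seconds)
    return sample_rate_hz * taps * sources * 2


def pick_configuration(budget_macs_per_s, sources,
                       rates=(8000, 16000, 44100),
                       ir_lengths=(0.05, 0.25, 1.0)):
    """Return the (rate, impulse response length) pair that uses the most
    compute while staying within the budget, or None if nothing fits."""
    feasible = [(r, t) for r, t in product(rates, ir_lengths)
                if filtering_cost(r, t, sources) <= budget_macs_per_s]
    if not feasible:
        return None
    return max(feasible, key=lambda c: filtering_cost(c[0], c[1], sources))


cheapest = filtering_cost(8000, 0.05, 1)
dearest = filtering_cost(44100, 1.0, 1)
```

Under this model the two adaptation knobs alone span a factor of several hundred in compute; changing the algorithm or approximating the filters would widen the range further.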
11
Virtuoso (with Renato Figueiredo, Jose Fortes, Ananth Sundararaj, Ashish Gupta)
• Make Grids like PCs
– User gets raw machine(s)
– Machine appears to be on his network
– User can install what he needs as owner
• Lower level of abstraction
– Classic virtual machine monitors
– Virtual networking
• Middleware support
– Instantiation, migration of machines
– Connectivity to remote files, machines
– Resource control
13
Why Virtual Networking?
• A running machine is suddenly plugged into your network. What happens?
– Does it get an IP address?
– Is it a routable address?
– Does the firewall let its traffic through?
– To any port?
Virtual machine hostile environment
14
A Simple Layer 2 Virtual Network
[Figure labels: Client; Server; Remote VM; VM monitor; Virtual NIC; Physical NIC (one on each side); SSH; Hostile Remote Network; Friendly Local Network]
16
A Simple Layer 2 Virtual Network
[Figure labels: Client; Server; Remote VM; VM monitor; Bridge (one on each side); Virtual NIC; Physical NIC (one on each side); SSH Tunnel; Hostile Remote Network; Friendly Local Network]
17
Bootstrapping the Virtual Network
• Star topology always possible
– TCP session from client must have been possible
• Better topology may be possible
– Depends on security at each site
• Topology may change
– Virtual machines can migrate
• Bootstrap to higher layers
– Virtual filesystems
18
How does the demand for resources vary over time? How does the supply of resources vary over time?
• Resource demand in interactive applications
– Instrumented games, preceding applications, …
– Not much is known here
• Resource supply in distributed environments
– URGIS
• Grid Information based on the relational data model
– GridG
– Clairvoyance
• Online resource prediction for hosts and networks
– Tsunami
• Wavelet-based approaches to information dissemination
– Diffusion
• Zero-cost information dissemination
19
URGIS (with Beth Plale, Dong Lu)
• Unified Relational Grid Information Services
– GIS based on the relational data model
– Leverage results from database community
– Northwestern work: MySQL, Oracle RDBMSes
• Compositional queries
– Application-specific information aggregation
– Like decision support queries (TPC-H)
• Support for information of varying dynamicity
– Varying update rates and freshness requirements
– Seamless inclusion of streaming data
• A common data model and query language
– Powerful, high level, declarative, easy-to-optimize
20
Compositional Queries
• “Find four different hosts with a total memory between 512 MB and 1 GB”
• “Find all available sensors and predictors that provide information about the network path between a and b”
• “Tell me when the load on any of these four hosts diverges from the average by more than 50%”
21
Example
select host1.name, host2.name, host3.name, host4.name,
       hd1.mem+hd2.mem+hd3.mem+hd4.mem as TotalMem
from hosts as host1, hostdata as hd1, hosts as host2, hostdata as hd2,
     hosts as host3, hostdata as hd3, hosts as host4, hostdata as hd4
where host1.ip=hd1.ip and host2.ip=hd2.ip and
      host3.ip=hd3.ip and host4.ip=hd4.ip and
      hd1.mem+hd2.mem+hd3.mem+hd4.mem>=512 and
      hd1.mem+hd2.mem+hd3.mem+hd4.mem<=1024 and
      host1.ip!=host2.ip and host1.ip!=host3.ip and
      host1.ip!=host4.ip and host2.ip!=host3.ip and
      host2.ip!=host4.ip and host3.ip!=host4.ip
order by TotalMem desc
limit 10
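The query can be tried on toy data with Python's built-in sqlite3 module. This is a sketch under assumptions: the two-table schema (`hosts(ip, name)`, `hostdata(ip, mem)`) is inferred from the column names in the query, the host rows are made up, and the pairwise `!=` tests are replaced with `<` so that each set of four hosts is reported once rather than in all 24 orderings.

```python
import sqlite3

QUERY = """
select host1.name, host2.name, host3.name, host4.name,
       hd1.mem + hd2.mem + hd3.mem + hd4.mem as TotalMem
from hosts as host1, hostdata as hd1, hosts as host2, hostdata as hd2,
     hosts as host3, hostdata as hd3, hosts as host4, hostdata as hd4
where host1.ip = hd1.ip and host2.ip = hd2.ip and
      host3.ip = hd3.ip and host4.ip = hd4.ip and
      hd1.mem + hd2.mem + hd3.mem + hd4.mem >= ? and
      hd1.mem + hd2.mem + hd3.mem + hd4.mem <= ? and
      host1.ip < host2.ip and host2.ip < host3.ip and host3.ip < host4.ip
order by TotalMem desc
limit 10
"""


def find_host_sets(rows, lo_mb=512, hi_mb=1024):
    """rows: iterable of (ip, name, mem_mb). Returns the query result rows."""
    con = sqlite3.connect(":memory:")
    con.execute("create table hosts (ip text primary key, name text)")
    con.execute("create table hostdata (ip text primary key, mem integer)")
    for ip, name, mem in rows:
        con.execute("insert into hosts values (?, ?)", (ip, name))
        con.execute("insert into hostdata values (?, ?)", (ip, mem))
    result = con.execute(QUERY, (lo_mb, hi_mb)).fetchall()
    con.close()
    return result


# Hypothetical hosts, memory in MB.
toy_hosts = [
    ("10.0.0.1", "h1", 128), ("10.0.0.2", "h2", 128),
    ("10.0.0.3", "h3", 256), ("10.0.0.4", "h4", 256),
    ("10.0.0.5", "h5", 512),
]
host_sets = find_host_sets(toy_hosts)
```

The eight-way join is what makes such queries expensive on large tables, which motivates the time-bounded variant on the next slide.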
22
Time-bounded, non-deterministic queries
select nondeterministically host1.name, host2.name, host3.name, host4.name,
       hd1.mem+hd2.mem+hd3.mem+hd4.mem as TotalMem
from hosts as host1, hostdata as hd1, hosts as host2, hostdata as hd2,
     hosts as host3, hostdata as hd3, hosts as host4, hostdata as hd4
where host1.ip=hd1.ip and host2.ip=hd2.ip and
      host3.ip=hd3.ip and host4.ip=hd4.ip and
      hd1.mem+hd2.mem+hd3.mem+hd4.mem>=512 and
      hd1.mem+hd2.mem+hd3.mem+hd4.mem<=1024 and
      host1.ip!=host2.ip and host1.ip!=host3.ip and
      host1.ip!=host4.ip and host2.ip!=host3.ip and
      host2.ip!=host4.ip and host3.ip!=host4.ip
order by TotalMem desc
limit 10
in less than 5 seconds
using heuristic prefer_depth_first
23
Implementation of Non-deterministic, Time-bounded Queries
• Random number associated with each row in each table (or insert)
• Query is rewritten to incorporate random ranges on the input tables
• Range lengths chosen to meet deadline
– This is not trivial and we don’t have this translation yet
• Heuristics not yet incorporated
• Hopefully RDBMS-independent
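The rewriting step can be sketched with sqlite3. Everything specific here is an assumption for illustration: the column name `r`, a two-host version of the query, and a caller-supplied selection probability `p` in place of the deadline-driven range length, which the slide says is not yet worked out.

```python
import random
import sqlite3

BASE = """select hd1.ip, hd2.ip
          from hostdata as hd1, hostdata as hd2
          where hd1.ip < hd2.ip and hd1.mem + hd2.mem = 1024"""


def build_db(n_hosts, seed=0):
    """Create a toy hostdata table; r is the per-row random number."""
    rng = random.Random(seed)
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table hostdata (ip integer primary key, mem integer, r real)")
    con.executemany(
        "insert into hostdata values (?, ?, ?)",
        [(i, rng.choice([128, 256, 512, 768]), rng.random())
         for i in range(n_hosts)])
    return con


def nondeterministic_pairs(con, p, rng):
    """Run BASE restricted to a random range of width p on each input table.

    Each row falls in a given range with probability p, so a two-table join
    sees roughly p * p of the full answer, at correspondingly lower cost.
    """
    lo1 = rng.uniform(0.0, 1.0 - p)
    lo2 = rng.uniform(0.0, 1.0 - p)
    sql = BASE + (" and hd1.r >= ? and hd1.r < ?"
                  " and hd2.r >= ? and hd2.r < ?")
    return con.execute(sql, (lo1, lo1 + p, lo2, lo2 + p)).fetchall()


con = build_db(200)
full_answer = set(con.execute(BASE).fetchall())
sampled = set(nondeterministic_pairs(con, 0.3, random.Random(1)))
```

Because the restriction is an ordinary range predicate, the rewritten query stays plain SQL, which is consistent with the hope that the approach is RDBMS-independent.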
24
RGIS1 Non-deterministic Query Performance
Find n hosts with a total memory of 1 GB
[Figure: query time and number of results (logarithmic axes) versus number of hosts in join (1 to 5); 100,000 hosts]
25
RGIS1 Non-deterministic Query Performance
Find 2 hosts with a total memory of 1 GB
[Figure: query time and number of results (logarithmic axes) versus selection probability (0.0005 to 0.1); 100,000 hosts]
26
Clairvoyance (with Jason Skicewicz, Yi Qiao)
• Measure, Characterize, Predict, and Disseminate information about dynamic resource supply
• Resource signals
– Discrete-time signals strongly correlated with resource supply
– Currently univariate, working on multivariate
– Currently:
• Host load
• Windows performance counters (using WatchTower)
• Network flow bandwidth and latency (using Remos)
• Any text-based source
• Online predictive modeling
– Simple models (MEAN, BESTMEAN, BESTMEDIAN, LAST, …)
– Box/Jenkins models (AR, MA, ARMA, ARIMA, …)
– Fractional ARIMAs
– Nonlinear modeling (TARs, wavelet decompositions)
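A few of the model families listed above can be sketched in plain Python. This is an illustration, not the RPS code: the function names are hypothetical, the windowed mean stands in for the BESTMEAN family without the window search, and the AR model is fitted with the textbook Yule-Walker equations solved by the Levinson-Durbin recursion.

```python
def predict_last(history):
    """LAST: the next value equals the most recent one."""
    return history[-1]


def predict_mean(history):
    """MEAN: the next value equals the long-term average."""
    return sum(history) / len(history)


def predict_window_mean(history, window):
    """Average of the most recent `window` samples."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def autocovariance(x, max_lag):
    n = len(x)
    m = sum(x) / n
    return [sum((x[t] - m) * (x[t + k] - m) for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def fit_ar(x, p):
    """Fit AR(p) by Yule-Walker / Levinson-Durbin. Returns (mean, coeffs).

    Assumes the series is not constant (non-zero variance).
    """
    r = autocovariance(x, p)
    coeffs = []
    err = r[0]
    for k in range(1, p + 1):
        acc = r[k] - sum(coeffs[j] * r[k - 1 - j] for j in range(k - 1))
        kappa = acc / err
        coeffs = [coeffs[j] - kappa * coeffs[k - 2 - j]
                  for j in range(k - 1)] + [kappa]
        err *= (1.0 - kappa * kappa)
    return sum(x) / len(x), coeffs


def predict_ar(model, history):
    """One-step-ahead AR prediction from the most recent samples."""
    mean, coeffs = model
    return mean + sum(c * (history[-1 - i] - mean)
                      for i, c in enumerate(coeffs))


def one_step_mse(x, predictor, warmup):
    """Mean squared one-step-ahead error of `predictor` over x[warmup:]."""
    errs = [(x[t] - predictor(x[:t])) ** 2 for t in range(warmup, len(x))]
    return sum(errs) / len(errs)
```

A typical use would fit once on a measurement window, for example `model = fit_ar(samples, 8)`, and then compare `one_step_mse(samples, lambda h: predict_ar(model, h), 32)` against the same measure for `predict_last` and `predict_mean`.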
27
RPS Toolkit
• Extensible toolkit for implementing resource signal prediction systems [CMU-CS-99-138]
• Growing: RTA, RTSA, Wavelets, GUI, etc.
• Easy “buy-in” for users
• C++ and sockets (no threads)
• Prebuilt prediction components
• Libraries (sensors, time series, communication)
29
Multiscale Network Prediction
• Large, recent study of predictability
• Hundreds of NLANR and other traces
– Mostly WANs
• Different resolutions
– Binning and low-pass via wavelets
• Sweet Spot
– Predictability often maximized at particular resolution
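The resolution sweep can be sketched as follows. Two things are assumed here for illustration: binning is done by averaging non-overlapping windows, and predictability is measured as the mean squared error of the LAST predictor divided by the signal variance (lower means more predictable). The study itself may use other predictors and measures.

```python
def bin_signal(x, bin_size):
    """Average non-overlapping bins of `bin_size` samples (a tail shorter
    than one bin is dropped)."""
    n_bins = len(x) // bin_size
    return [sum(x[i * bin_size:(i + 1) * bin_size]) / bin_size
            for i in range(n_bins)]


def last_error_ratio(x):
    """MSE of the LAST predictor divided by the variance of x.

    Returns NaN for a constant signal, where the ratio is undefined.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    mse = sum((x[t] - x[t - 1]) ** 2 for t in range(1, len(x))) / (len(x) - 1)
    return mse / var if var > 0 else float("nan")


def sweep(x, bin_sizes):
    """Error ratio at each resolution that leaves at least two bins."""
    return {b: last_error_ratio(bin_signal(x, b))
            for b in bin_sizes if len(x) // b >= 2}
```

Plotting the output of `sweep(trace, [1, 2, 4, 8, ...])` against bin size is one way to look for a resolution at which the ratio bottoms out.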
30
Multiresolution Prediction Example
[Figure: plotted value (y-axis label not recoverable, range 0 to 0.3) versus bin size in seconds (0.1 to 1000, logarithmic axis). Series: last, bm(8), ma(8), ar(8), ar(32), arma(4,4), arima(4,1,4), arima(4,2,4), arfima(4,-1,4)]
31
Tsunami (with Jason Skicewicz)
• Efficient dissemination of resource signals
• Wavelet-based methods for characterization, modeling, and prediction
• Tsunami toolkit will ship with the next RPS release
32
The Tension
[Figure labels: Sensor; Network; Video App; Grid App; …; Coarse-grain measurement; Fine-grain measurement; Resource-appropriate measurement]
Resource Signal (periodic sampling)
Example: host load
33
Proposed System
[Figure labels: Sensor; Wavelet Transform (Level 0, …, Level M-1, Level M); Network; Stream Interval; Inverse Wavelet Transform (Level 0, …, Level L); Application]
Application receives levels based on its needs
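A minimal sketch of the idea, assuming an unnormalized Haar transform (pairwise averages and differences) in place of whatever wavelet the real system uses. The sensor side splits the signal into one approximation and several detail levels; an application that receives only some of the detail levels reconstructs a coarser view of the signal.

```python
def haar_decompose(x, levels):
    """Split x into (approximation, details); details[j] is level j + 1.

    Requires len(x) to be a multiple of 2 ** levels.
    """
    if len(x) % (2 ** levels) != 0:
        raise ValueError("length must be a multiple of 2 ** levels")
    approx = list(x)
    details = []
    for _ in range(levels):
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details.append([(a - b) / 2 for a, b in pairs])
        approx = [(a + b) / 2 for a, b in pairs]
    return approx, details


def haar_reconstruct(approx, details, keep=None):
    """Invert the transform, treating detail levels not in `keep` as zero.

    `keep` holds 1-based level numbers; None means all levels were received.
    """
    current = list(approx)
    for level in range(len(details), 0, -1):
        d = details[level - 1]
        if keep is not None and level not in keep:
            d = [0.0] * len(d)
        nxt = []
        for a, diff in zip(current, d):
            nxt.extend([a + diff, a - diff])
        current = nxt
    return current


signal = [4, 6, 10, 12, 8, 6, 5, 5]          # hypothetical host-load samples
approx, details = haar_decompose(signal, 3)
fine_view = haar_reconstruct(approx, details)            # every level
coarse_view = haar_reconstruct(approx, details, keep={3})  # coarsest detail only
```

Here `coarse_view` is a two-step staircase, which a coarse-grain consumer could use while receiving a fraction of the coefficients that `fine_view` needs.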
34
Delay
• Transforms introduce sample delay
– Depends on number of levels and type of filter used
– Exponential in the number of levels
– Affects both streaming and block transforms
– Seemingly inherent for wavelets
• Exploit prediction
– Limited success
• Exploit “wavelet-like” decompositions
– Trade-off between reconstruction accuracy and delay
– Existing theory. Our evaluation not done yet.
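The exponential growth can be illustrated with a short calculation. It assumes a standard dyadic filter bank in which every stage uses FIR filters of the same length F; one coefficient at level j then depends on (2^j - 1)(F - 1) + 1 consecutive input samples, so at least that many samples must have arrived before it can be produced. This is a general property of such filter banks, offered as a sketch rather than the lab's own delay analysis.

```python
def samples_spanned(level, filter_length):
    """Input samples covered by one coefficient at `level` (1-based) in a
    dyadic filter bank whose filters all have `filter_length` taps."""
    return (2 ** level - 1) * (filter_length - 1) + 1


# Haar has 2 taps and the Daubechies D4 wavelet has 4.
haar_spans = [samples_spanned(j, 2) for j in range(1, 6)]
d4_spans = [samples_spanned(j, 4) for j in range(1, 6)]
```

Each extra level roughly doubles the span, and longer filters scale it further, which matches the dependence on both the number of levels and the filter type noted above.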
35
Wavelets and Prediction
• Predict each level of transformed signal separately
– “Detail signals”
• Surprisingly ineffective in practice
• Whitens the signal
– “Approximation signals”
• Smoothing, used in network prediction work discussed earlier
• Reasonably effective, worth pursuing
36
Diffusion (with Brian Cornell, Jack Lange)
• Efficient dissemination of resource signals
• Piggyback additional information on existing packet transfers
– No additional packets
– Packet size unchanged
• Evaluations with traces, Minet
• Implementation as Linux kernel module
• >=86 bits per packet possible
• 17 bits per packet verified
Zero Cost Information Dissemination
37
Diffusion Implementation
[Figure: two protocol stacks (App, Transport, Network, Data Link, Physical). One side is annotated with Sensor and Header Editing, the other with Data Extraction and Consumer.]
Sensor data piggybacked on application packets
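Header editing can be sketched on a raw IPv4 header. The slides do not say which header bits Diffusion uses, so this example simply assumes the 16-bit Identification field as a carrier; the field choice, the function names, and the load quantization are illustrative only. The header checksum is recomputed so the packet stays valid and its size is unchanged.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of 16-bit words, as used by the IPv4 header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def embed(header: bytes, value16: int) -> bytes:
    """Write value16 into the Identification field (bytes 4-5, assumed
    carrier) and refresh the checksum (bytes 10-11)."""
    edited = bytearray(header)
    edited[4:6] = struct.pack("!H", value16)
    edited[10:12] = b"\x00\x00"
    edited[10:12] = struct.pack("!H", ipv4_checksum(bytes(edited)))
    return bytes(edited)


def extract(header: bytes) -> int:
    """Consumer side: read the piggybacked value back out."""
    return struct.unpack("!H", header[4:6])[0]


def quantize_load(load: float, max_load: float = 16.0) -> int:
    """Map a host-load sample onto 16 bits (hypothetical encoding)."""
    return min(0xFFFF, max(0, round(load / max_load * 0xFFFF)))


# A plain 20-byte header: version/IHL, TOS, length, ID, flags/fragment,
# TTL, protocol, checksum, source 10.0.0.1, destination 10.0.0.2.
plain = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0,
                    bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
edited = embed(plain, quantize_load(1.5))
```

Overwriting the Identification field would interfere with reassembly of fragmented packets, so a real implementation would have to restrict itself to bits that are safe to reuse.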
39
How can we use the adaptation mechanisms exposed by an application to match its resource demand with resource supply?
• Application-level performance predictions
– Running Time Advisor
• Confidence interval for running time of a task on a particular host
– Message Time Advisor
• Confidence interval for transfer time of a message
• Adaptation advisors
– Real-time Scheduling Advisor
• Choose the host, from a set, on which a task is most likely to meet its deadline
• Real-time responsiveness requirement
• Service for interactive applications
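The two advisors can be sketched under a simple model. The assumptions are stated loudly because none of them come from the slides: running time is taken as the nominal (unloaded) time multiplied by one plus the average host load during execution, the predicted load is treated as normally distributed with a known mean and standard deviation, and all names are hypothetical.

```python
from statistics import NormalDist


def running_time_interval(nominal_s, load_mean, load_std, confidence=0.95):
    """Confidence interval for running time, assuming
    running time = nominal_s * (1 + load) and load ~ Normal(mean, std)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    low_load = max(0.0, load_mean - z * load_std)
    high_load = load_mean + z * load_std
    return nominal_s * (1 + low_load), nominal_s * (1 + high_load)


def probability_of_meeting(deadline_s, nominal_s, load_mean, load_std):
    """Probability that the task finishes within deadline_s on this host."""
    max_load = deadline_s / nominal_s - 1
    if load_std == 0:
        return 1.0 if load_mean <= max_load else 0.0
    return NormalDist(load_mean, load_std).cdf(max_load)


def choose_host(deadline_s, nominal_s, hosts):
    """hosts maps a host name to (predicted load mean, predicted load std).
    Returns the name with the highest chance of meeting the deadline."""
    return max(hosts, key=lambda name: probability_of_meeting(
        deadline_s, nominal_s, *hosts[name]))


# Hypothetical predictions for two hosts.
predictions = {"hostA": (0.5, 0.1), "hostB": (1.5, 0.1)}
best = choose_host(deadline_s=20.0, nominal_s=10.0, hosts=predictions)
```

In this toy case the task needs an average load of at most 1.0 to finish in time, so the lightly loaded host is chosen.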
42
• How do we build adaptive distributed interactive applications effectively?
• How does the demand for resources in these applications vary over time?
• How does the supply of resources vary over time?
• How can we use the adaptation mechanisms exposed by an application to match its resource demand with resource supply?
43
How do we deliver arbitrary amounts of computational power to ordinary people?
Distributed and Parallel Computing
Interactive Applications
44
Future Directions
• Continue pushing on projects discussed
• New directly related projects
– Interactive hierarchical visualization of huge datasets
– Resource demand characterization, modeling, and prediction
• Other directions
– Intrusion detection using signal processing