Automatic Misconfiguration Diagnosis with PeerPressure
Helen J. Wang, John C. Platt, Yu Chen, Ruyun Zhang, and Yi-Min Wang
Microsoft Research
OSDI 2004, San Francisco, CA
Misconfiguration Diagnosis
• Technical support contributes 17% of TCO [Tolly2000]
• Much of application malfunctioning comes from misconfigurations
• Why?
  – Shared configuration data (e.g., Registry) and uncoordinated access and update from different applications
• How about maintaining the golden config state?
  – Very hard [Larsson2001]
    • Complex software components and compositions
    • Third party applications
    • …
Outline
• Motivation
• Goals
• Design
• Prototype
• Evaluation results
• Future work
• Concluding remarks
Goals
• Effectiveness
  – Small set of sick configuration candidates that contain the root-cause entries
• Automation
  – No second party involvement
  – No need to remember or identify what is healthy
Intuition behind PeerPressure
• Assumption
  – Applications function correctly on most machines; malfunctioning is an anomaly
• Succumb to the peer pressure
An Example
Suspects  Mine  P1’s  P2’s  P3’s  P4’s
e1        0     1     1     1     1
e2        on    on    on    on    off
e3        57    4     0     100   34
• Is e1 sick? Most likely
• Is e2 sick? Probably not
• Is e3 sick? Maybe not
  – e3 looks like an operational state
• We use Bayesian statistics to estimate the sick probability of a suspect, which is our ranking metric
System Overview
[Figure: system overview. Running the faulty app under AppTracer yields the Registry Entry Suspects. PeerPressure (Canonicalizer, Search & Fetch, Statistical Analyzer) works with a Database or a Peer-to-Peer Troubleshooting Community and outputs the Troubleshooting Result.]
Registry Entry Suspects
Entry                   Data
HKLM\System\Setup\...   0
HKLM\Software\Msft\...  On
HKCU\%\Software\...     null
Troubleshooting Result
Entry                   Prob.
HKLM\System\Setup\...   0.2
HKLM\Software\Msft\...  0.6
HKCU\%\Software\...     0.003
The Sick Probability
• P(Sick) = (N + c) / (N + ct + cm(t - 1))
  – N: the number of samples
  – c: cardinality
  – t: the number of suspects
  – m: the number of entries that match the suspect entry value
• Properties:
  – As m increases, P decreases
  – As c increases, P decreases; when m = 0, smaller c implies larger P
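The formula can be checked against the three-suspect example from the earlier slide. The sketch below is only an illustration, not the prototype's code: the function names are hypothetical, and it assumes that cardinality c is the number of distinct values observed for an entry across the samples and the suspect machine, which the slides do not spell out.

```python
def sick_probability(n, c, t, m):
    """P(Sick) = (N + c) / (N + ct + cm(t - 1)), as given on the slide."""
    return (n + c) / (n + c * t + c * m * (t - 1))


def rank_suspects(mine, peers):
    """Rank suspect entries by sick probability, highest first.

    mine:  dict mapping entry name -> value on the sick machine
    peers: list of dicts with the same keys, one per sample machine
    """
    n = len(peers)   # N: number of samples
    t = len(mine)    # t: number of suspects
    scores = {}
    for entry, value in mine.items():
        peer_values = [peer[entry] for peer in peers]
        # Assumption: cardinality = distinct values seen, including our own.
        c = len(set(peer_values) | {value})
        # m: number of samples whose value matches the suspect's value.
        m = peer_values.count(value)
        scores[entry] = sick_probability(n, c, t, m)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Values taken from the "An Example" slide.
mine = {"e1": 0, "e2": "on", "e3": 57}
peers = [
    {"e1": 1, "e2": "on", "e3": 4},
    {"e1": 1, "e2": "on", "e3": 0},
    {"e1": 1, "e2": "on", "e3": 100},
    {"e1": 1, "e2": "off", "e3": 34},
]

ranking = rank_suspects(mine, peers)
for entry, p in ranking:
    print(f"{entry}: {p:.3f}")
```

Under these assumptions e1 ranks first, e3 second, and e2 last, which agrees with the slide's reading: e1 is most likely sick, e2 probably is not, and e3 looks like an operational state.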
The PeerPressure Prototype
• Database of 87 live Windows XP registry snapshots as our sample pool
  – Hierarchical persistent storage for named, typed entries
• PeerPressure troubleshooter implemented in C#
• Needed to “sanitize” the entry values
  – 1, “1”, “#1”
  – Heuristics: unifying values of entries with different types
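The slides do not list the actual sanitizing heuristics, so the following is only a guess at their flavor. The function `sanitize` is hypothetical; it assumes that surrounding double quotes and a leading "#" are formatting noise and that integer-like strings should compare equal to integers.

```python
def sanitize(value):
    """Map differently typed spellings of one value to a canonical string.

    Hypothetical heuristic, not the prototype's rule set.
    """
    if value is None:
        return "null"
    text = str(value).strip()
    # Drop one pair of surrounding double quotes: '"1"' -> '1'.
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        text = text[1:-1].strip()
    # Drop a leading '#' marker: '#1' -> '1'.
    if text.startswith("#"):
        text = text[1:]
    # Normalize integer-like strings so that 1 and '1' compare equal.
    try:
        return str(int(text))
    except ValueError:
        return text


# The three spellings from the slide collapse to a single value.
variants = [1, '"1"', "#1"]
canonical = {sanitize(v) for v in variants}
print(canonical)
```

Without a step like this, the three spellings would count as three distinct values, inflating the cardinality c and lowering the match count m for what is really the same setting.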
Outline
• Motivation
• Goals
• Design
• Prototype
• Evaluation results
• Future work
• Concluding remarks
Windows Registry Characteristics
• Max size: 333,193
• Min size: 77,517
• Average size: 198,376
• Median size: 198,608
• Cardinality: 1 for 87% of entries, <= 2 for 94%
• Distinct canonicalized entries in GeneBank: 1,476,665
• Common canonicalized entries: 43,913
• Distinct entries data-sanitized: 1,820,706
Evaluation Data Set
• 87 live Windows XP registry snapshots (in the database)
  – Half of these snapshots are from three diverse organizations within Microsoft: Operations and Technology Group (OTG) Helpdesk in Colorado, MSR-Asia, and MSR-Redmond
  – The other half are from machines across Microsoft that were reported to have potential Registry problems
• 20 real-world troubleshooting cases with known root causes
Response Time
• # of suspects: 8 to 26,308, with a median of 1,171
• 45 seconds on average, with the SQL server hosted on a 2.4 GHz CPU workstation with 1 GB RAM
• Sequential database queries dominate
[Figure: response time in seconds versus # of suspects]
Troubleshooting Effectiveness
• Metric: root cause ranking
• Results:
  – Rank = 1 for 12 cases
  – Rank = 2 for 3 cases
  – Rank = 3, 9, 12, 16 for 4 cases, respectively
  – One case could not be solved
Source of False Positives
• Nature of the root-cause entry
  – Root-cause entry has a large cardinality
• How unique the other suspects are
  – A highly customized machine likely produces more noise
• The database is not pristine
Impact of the Sample Set Size
• A larger sample set doesn’t necessarily yield better accuracy
  – Strong conformity doesn’t depend on the number of samples
  – Operational state doesn’t depend on the number of samples
  – Only helps with a non-pristine sample set
• 10 samples are large enough for most cases
Related Work
• Blackbox-based techniques
  – Strider: need to identify the healthy [Wang ‘03]
  – Hardware, software component dependencies [Brown ‘01]
• Much prior work on leveraging statistics to pinpoint anomalies
  – Bug as deviant behavior [Engler et al SOSP ‘01]
  – Host-based intrusion detection based on system calls [Forrest ’96] and based on registry behavior [Apap et al, ‘99]
Future Work
• We have only scratched the surface!
• Multiple root cause entries
• Cross-application troubleshooting
• Database maintenance
• Privacy
  – Friends Troubleshooting Network
Concluding Remarks
• Automatic misconfiguration diagnosis is possible
  – Use statistics from the mass to automate manual identification of the healthy
  – Initial results are promising