Windows
Thoughts on dependability of Windows
Rob Short
Corporate Vice President
Windows architecture and kernel
Today’s talk
End user view of dependability
Windows overview – some stats
Progress we’ve made already
Tough challenges
Expectations have changed
Computers are now appliances
People no longer accept the blame when the computer fails or when they can’t find out how to use a feature
Highest cost issues are complex design problems, rather than traditional “software failure” issues
Complexity of diagnosis is beyond most users
Microsoft must be responsible for everything running on the system
The end user doesn’t know or care if they loaded a virus, they see “Windows crashed”
Our approach needs to change
Chart from pre-reading shows reduction in hardware and software failures as cause of downtime
System management, user interference component has increased
BUT
Computer is an appliance - system manager is no longer an expert
We must find approaches to deal with this
Windows – some stats
Windows is an entire family of products
Almost 1 billion copies sold
~200 million per year, >500K a day
Over 1,000,000 devices supported
100,000s of applications available
Embedded in phones and handhelds with 32 MB of memory
64-CPU, 64-bit systems with > 1 TB of RAM
Developer view of product range
Embedded/specific purpose
  As small as possible, but composable by an expert
  Tool kit to add/remove components
  Devices chosen by engineer, not end user
  Builds a run-time, not a general purpose OS
Server
  Just the files for a particular role, others available when needed
  GUI/Wizards allow user to choose role
  Administrator may want to hand configure some devices such as SANs
  How to add support such as NUMA without slowing client?
Client – most complex Windows system
  Everything on by default
  Full automatic plug and play, streaming media, audio, etc.
  Most systems have hundreds of drivers and extensions running
  Corporate IT wants complete control over devices and installed software
Huge progress
Systems are much more reliable and functional than a few years ago
Newer hardware and better driver tools have helped with hardware issues
We’re very good at the “hard faults”, i.e. crashes etc., where we have real information
Hangs, slowdowns etc. still haunt users
Security may be the worst problem, since the solution is social as much as technical
Progress in Vista
Reliability and security were a top priority
Significantly strengthened performance and reliability teams
Source level analysis tools
Diagnosis and tracing infrastructure
Auto diagnosis for common problem areas
User mode driver frameworks
Hang detection infrastructure
Cancelable synchronous I/Os
Hardware error architecture
Using feedback to improve quality (The nice marketing slide)
[Diagram labels: Partners; Internet; Customers and Community; Problems, crashes, annoyances; Fixes, patches, updates, etc.; Windows Update]
Analysis used to prioritize Dev work
Online Crash Analysis (OCA)
Automatically takes dumps from customers, analyzes them, and sends a solution back
Stores the output of the analysis in the OCA Database
Dev and MSR worked together to create analysis tools and to mine the data
Search for common devices/software and themes
We save dumps for further analysis
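The idea of mining analyzed dumps for common devices and software can be illustrated with a small sketch. This is not the OCA implementation; the `CrashReport` fields, the bucketing key, and the driver names below are all hypothetical, chosen only to show how grouping reports by the blamed component surfaces the most frequent causes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CrashReport:
    """Hypothetical, simplified result of analyzing one crash dump."""
    stop_code: str        # error code reported at crash time (made-up values below)
    faulting_module: str  # driver or component blamed by the analysis


def bucket_key(report):
    # Assumption: reports with the same module and stop code share a cause.
    # Module names are lower-cased so "EXAMPLENET.SYS" and "examplenet.sys" match.
    return (report.faulting_module.lower(), report.stop_code)


def top_buckets(reports, n=10):
    """Return the n largest (bucket, count) pairs, most frequent first."""
    return Counter(bucket_key(r) for r in reports).most_common(n)


# Made-up sample data, not real crash statistics.
reports = [
    CrashReport("0xD1", "examplenet.sys"),
    CrashReport("0xD1", "EXAMPLENET.SYS"),
    CrashReport("0xD1", "examplenet.sys"),
    CrashReport("0x50", "examplevideo.sys"),
    CrashReport("0x50", "examplevideo.sys"),
    CrashReport("0x7F", "exampleaudio.sys"),
]

ranking = top_buckets(reports, n=2)
```

Ranking buckets by count is one simple way to decide which fixes would help the most users first.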
Why Windows crashes
Top ten causes; the OS Core category includes malware
Architecture challenges
Windows has grown explosively, but organically
System became intertwined and complex
Organization is also large and complex
Sharing code base across products is great, but can serialize development
Adding new product variants is too hard
Servicing it all is a challenge
Architectural focus areas in Windows
Application model
State
Extensions, both user and kernel mode
Application compatibility
Layering and partitioning
Top management added security
State
State is everything persistent
Schema to identify system, user, and application state
Users should be able to move to new machines
How to really understand impact of changes
Rules for developers
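As a rough illustration of what a schema for persistent state might look like, the sketch below tags each item with a scope and an owner. The `Scope` values, the `StateItem` fields, and the rule that only user-scoped state moves to a new machine are assumptions made for this example, not something the talk specifies.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    SYSTEM = "system"            # tied to this particular machine
    USER = "user"                # belongs to a person
    APPLICATION = "application"  # belongs to one installed application


@dataclass(frozen=True)
class StateItem:
    name: str     # hypothetical identifier for a piece of persistent state
    scope: Scope
    owner: str    # component that writes this state


def items_to_migrate(items):
    """Assumed rule: only user-scoped state follows the user to a new machine."""
    return [item.name for item in items if item.scope is Scope.USER]


def impact_of_change(items, owner):
    """List the state a change to one component could affect."""
    return sorted(item.name for item in items if item.owner == owner)


# Made-up inventory for illustration.
inventory = [
    StateItem("display-driver-settings", Scope.SYSTEM, "example-display-driver"),
    StateItem("desktop-wallpaper", Scope.USER, "example-shell"),
    StateItem("recent-documents", Scope.USER, "example-editor"),
    StateItem("editor-license-cache", Scope.APPLICATION, "example-editor"),
]

to_move = items_to_migrate(inventory)
touched = impact_of_change(inventory, "example-editor")
```

With every item labeled this way, both questions on the slide (what moves with the user, and what a change touches) become simple queries over the inventory.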
Extensibility
Microsoft is successful because we’re a platform company – it's what we do
Providing the ultimate platform means well thought-out extensibility points throughout the system
The system needs a common way to identify, load and enumerate extensions
We need to make extensibility consistent and robust so customers feel comfortable using software, all of which includes extensions
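A common way to identify, load and enumerate extensions could look roughly like the registry sketched below. This is a minimal illustration, not a Windows API; `ExtensionRegistry`, its method names, and the extension identifiers are all hypothetical.

```python
class ExtensionRegistry:
    """Hypothetical single entry point for every extension on a system."""

    def __init__(self):
        self._entries = {}  # extension id -> (kind, factory)

    def register(self, extension_id, kind, factory):
        # Identify: each extension has exactly one unique id.
        if extension_id in self._entries:
            raise ValueError(f"duplicate extension id: {extension_id}")
        self._entries[extension_id] = (kind, factory)

    def enumerate(self, kind=None):
        # Enumerate: list ids in a stable order, optionally filtered by kind.
        return sorted(
            ext_id
            for ext_id, (ext_kind, _) in self._entries.items()
            if kind is None or ext_kind == kind
        )

    def load(self, extension_id):
        # Load: every extension is created through the same code path.
        _, factory = self._entries[extension_id]
        return factory()


# Made-up extensions for illustration.
registry = ExtensionRegistry()
registry.register("example.printer-driver", "kernel", lambda: "printer driver loaded")
registry.register("example.shell-preview", "user", lambda: "shell preview loaded")
registry.register("example.disk-filter", "kernel", lambda: "disk filter loaded")

kernel_extensions = registry.enumerate("kernel")
loaded = registry.load("example.shell-preview")
```

Routing all three operations through one place is what would let tools answer questions such as "what is extending this system?" consistently.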
Drivers – extensibility example
Windows driver model designed for performance first and extensibility second
Wrong choice for today
100,000 drivers, 1,000,000 versions
Created driver “frameworks” for Vista
Re-architecting the boundaries
Huge effort on tools for developers
Static driver verifier
Joint MSR and development effort
Software Engineering Research challenges
Large systems are too complex to fully analyze
  How to think about full impact of design?
  Ways to think about interactions more formally
  What should the extension model be?
  Component model with cross-component tools
Focus on entire lifecycle
  Requirements, specification and architecture, failure analysis
  Design/coding/test and verification/maintenance, patching etc.
Help with education?
  Raise awareness of value of correctness, test and verification etc.
Summary
Huge improvements in capability and reliability in the past decade
Requirements and expectations increased faster than improvements in dependability
System complexity has increased faster than our ability to manage it
Development teams are very good at evolutionary improvements
We need new, end-to-end approaches to help the entire product lifecycle
The right people are here
Let's do something about it
Questions?
Security
More than just a technology issue
Worldwide network of hackers spreads the word on vulnerabilities; most attacks take advantage of more than one
Reverse engineer patches - race to get the patch out before the hack
Attacks are increasingly sophisticated
Most issues are design problems, not simple coding errors
Threat models, design reviews, code reviews, tools etc.
Tradeoff between usability and security
PC hardware capabilities drive entire industry
[Chart, 2002 to 2007. Storage labels: 40 / 100 GB, 80 / 200 GB, 213 / 500 GB, 568 GB / 1 TB. Network labels: 100 Mb/s wired with 11 Mb/s wireless; 100 Mb/s wired with 11 / 54 Mb/s wireless; 1 Gb/s wired with 54 Mb/s wireless. Clock speed labels: 3 GHz, 10 GHz, 20-30 GHz]