Green Computing Observatory
-
Upload
germainrenaud -
Category
Technology
-
view
824 -
download
2
description
Transcript of Green Computing Observatory
The Green Computing Observatory
Cécile Germain-Renaud1, Fredéric Fürst2, Gilles Kassel2, Julien Nauroy1, Michel Jouvin3, Guillaume Philippon3
1: Laboratoire de Recherche en Informatique, U. Paris Sud, CNRS, INRIA
2: Université Picardie Jules Verne
3: Laboratoire de l’Accélérateur Linéaire, CNRS-IN2P3
Outline
� Contexts
� Sensors
� Information Model
� Scientific issues
� Conclusion
GCG 2011 The Green Computing Observatory
Motivation and Goals
� Energy usages are complex systems � Sophisticated HW/SW mechanisms eg ACPI, dynamically over-clocking of
active cores, and other optimisations based on on-line statistical monitoring. � Interaction with local cooling provisioning (eg. fan speed) and global cooling
� Validating generative or predictive models and policies requires behavioral models based on real data
� The first barrier to improved energy efficiency is the difficulty of collecting data on the energy usage of individual components, and the lack of overall data collection.
GCO monitors energy usage at a large computing center, and publishes them through the Grid Observatory.
� A second barrier is making the collected data usable, consistent and complete.
GCO adopts an ontological approach in order to rigorously define the semantics of the data and the context of their production.
GCG 2011 The Green Computing Observatory
The GRIF-LAL computing room
GCG 2011 The Green Computing Observatory
� 13 racks hosting 1U systems, 4 lower-density racks (network, storage), resulting in ≈240 machines and 2200+ cores, and 500TB of storage.
� Mainly a Tier 2 in the EGI grid, but also includes local services and the StratusLab Cloud testbed
� High-throughput, worldwide workload, analysis-oriented production facility, accessible approximation of a data center
Sensors
GCG 2011 The Green Computing Observatory
1GByte/day at 5 minutes sampling period
IPMI
� IPMI = Intelligent Platform Management Interface,
� Based on a specialized processor card (BMC) � 1998: IPMI v1.0, 2001: IPMI v1.5, originally by Intel, HP, NEC, Dell � 2004: IPMI v2.0 (matured version of IMPI) � De facto standard implemented by all motherboard vendors
� Fine grain monitoring of individual system parts: temperatures, fans, voltages, etc. and much more: Recovery Control (power on/off/reset a server), Logging (System Event Log), Inventory (FRU information)
� Why? To contribute to a global approach, e.g. cooling inefficiency leads to increased fan speed which leads to +20% in power consumption – vs the “hot servers” trend.
� http://www.intel.com/design/servers/ipmi
GCG 2011 The Green Computing Observatory
Source: http://www.netways.de/uploads/media/Werner_Fischer_-The-Power-Of-IPMI.pdf
GCG 2011 The Green Computing Observatory
IPMI
� IPMI = Intelligent Platform Management Interface
� The exchange protocols are defined and heavily documented
� But NOT the sensors (nor defined nor documented)
� At LAL, we have DELL and IBM PowerEdge motherboards � Very different sensors: e.g. AVGPower (Watts) vs
PSCurrent (Amps) � Many inactive (NA), may depend on the BIOS version
GCG 2011 The Green Computing Observatory
Smart PDU
� PGEP PULTI � 16 outlets
� Each PDU outlet managed separately
� Query protocol : SNMP
� Embedded Web server
� Issue: last systems are Twin2
� 4 systems in 2U, 8-16 processors
� Useful for calibration
� Not all racks will be equipped
GCG 2011 The Green Computing Observatory
Ganglia
� De facto standard
� Sensors associated with an OS
� CPU load (average number of processes during a given duration, 1-2-15 minutes) , Memory (buffered, cached, free, shared, swap, total) and network usage
� Applies to Virtual Machines as well
� Periodic acquisition of system monitoring information (/proc), transfer and storage protocols. RDD storage inadequate for the GCO.
GCG 2011 The Green Computing Observatory
GCG 2011 The Green Computing Observatory
Information model
� There is no standard for � The output of the physical sensors
� The integration of computational usage and physical sensors’ output
� There are standards for � OS information: Ganglia
� Virtual Machine definition: OVF � Centralized statistics publication: SDMX (Statistical
Data and Metadata Exchange). Successful experience of porting to a Linked Data model.
GCG 2011 The Green Computing Observatory
Extension of DOLCE
GCG 2011 The Green Computing Observatory
Ontology – measurement concepts
� Define the semantics of the data (what is measured?) and the context of their production (how are they acquired and/or calculated?) � Observables are individual qualities of endurants (e.g.
temperature of a component) and/or perdurants (e.g. speed of the rotation of a fan).
� Observations make use of sensors and data acquisition chains which are physical and non-physical (software) artifacts.
� Observation values are boolean/numeric/scalar qualia.
� Extensions/adaptations of DOLCE, FOOM (Functional Ontology of Observation and Measurement) and OGC-O&M (OGC’s Observations and Measurements standard)
� Work in progress
GCG 2011 The Green Computing Observatory
Publication: XML files
� At 5 minutes sampling period, 1GB/day.
� Scalable querying w/ Xpath
� Integration capability w/ Xinclude
� Easy conversion to analysis-focused formats eg matlab w/ XSL
GCG 2011 The Green Computing Observatory
Preliminary XML schema
GCG 2011 The Green Computing Observatory
How to
Get an account Download files
GCG 2011 The Green Computing Observatory
www.grid-observatory.org
Status and Roadmap
� Acquisition of timeseries and metadata for IPMI, Ganglia, PDU and temperature are in production
� Examples of raw timeseries for IPMI, PDU and Ganglia released
� Metadata integration and temperature timeseries, stable XML schema V1 1T 2012
� Global energy consumption 2T 2012
� Ontology-consistent XML schema (V2) 4T 2012
� Also: rack monitoring
GCG 2011 The Green Computing Observatory
Non-stationarity
The Green Computing Observatory
The “physical” process is not stationary
� Trends: Rogers’s curve of adoption
� Technology innovations
� Real-world events � Experimental discoveries
� Slashdotted accesses
NON-STATIONARITY IS A REASONABLE HYPOTHESIS BUT PRECLUDES NAÏVE STATISTICS
GCG 2011
How to build knowledge?
� Supervised learning? No reference, too rare experts
� Let’s build it on-line! Model-free policies e.g. Reinforcement Learning!
� Unfortunately, tabula rasa policies and vanilla ML methods are too often defeated [Rish & Tesauro 2006).
The Green Computing Observatory
Intelligibility
Exploration/exploitation tradeoff
GCG 2011
Methods
Intelligibility: Uncovering hidden causes
� Semantic inference [Y. Kim et al. Characterizing E-Science File Access Behavior via Latent Dirichlet Allocation, UCC 2011]
� Collaborative Prediction, Rank approximation [D. Feng et al. Distributed Monitoring with Collaborative Prediction]
Dealing with non stationarity
� Segmentation [T. Elteto et al. Towards non stationary Grid Models, JoGC Dec. 2011]
� Adaptive clustering with changepoint detection [X. Zhang et al. Toward Autonomic Grids: Analyzing the Job Flow with Affinity Streaming. SIGKDD'2009]
GCG 2011 The Green Computing Observatory
With the support of
� France Grilles – French NGI member of EGI
� EGI-Inspire (FP7 project supporting EGI)
� INRIA – Saclay (ADT programme)
� CNRS (PEPS programme)
� University Paris Sud (MRM programme)
n
GCG 2011 The Green Computing Observatory
Conclusion: Digital Curation
� Establish long-term repositories of digital assets for current and future reference � Continuously monitoring a large computing facility
� Providing digital asset search and retrieval facilities to scientific communities through a gateway � Data published through Grid Observatory portal
� Tackling the good data creation and management issues, and prominently interoperability, � Formal mainstream ontology, standard-aware
� Adding value to data by generating new sources of information and knowledge � Semantic and Machine Learning based inference.
GCG 2011 The Green Computing Observatory