
Special Report
Evolving Data Center Cooling Environmental Ranges and Standards
Getting Ready for Revisions to ASHRAE Standards

by Julius Neudorfer, DCEP

Copyright © 2016, Data Center Frontier

For the majority of traditional data centers, environmental operating conditions have long been based on ASHRAE recommendations defined in the "Thermal Guidelines for Data Processing Environments," first published in 2004 and since updated twice. With each succeeding edition, the facility environmental envelope ranges have broadened in response to the increased environmental ruggedness of newer generations of IT hardware. These broader ranges have given facility operators the opportunity to improve their cooling energy efficiency. The industry's reliance on these ASHRAE guidelines has allowed data center facility managers to consider raising operating temperatures and adjusting humidity ranges to save energy, while weighing any effects on IT equipment reliability. The 4th edition is expected to be finalized and released in 2016.

This whitepaper will examine the underlying relationship of temperature, humidity and energy usage, as well as the operational risk considerations of the expanded environmental ranges on both the facility and the IT equipment. It will also examine the existing issues of the ASHRAE 90.1 standard, which is used by many state and local building departments, as well as the potential impact of the pending ASHRAE 90.4 standard, which is now in its 3rd review for public comment and is also expected to become effective in the fall of 2016.

Introduction

The demand for more efficient and cost-effective computing has driven organizations large and small to reevaluate their strategies. This examination can incorporate many aspects, encompassing system architecture and software platforms, as well as the IT hardware and, of course, the data center facility. Moreover, there are many strategic options for the enterprise CIO and CTO to consider, such as the possibility of direct or indirect ownership and operation of their own data center facility, as well as colocation, cloud, or hybrid combinations thereof. Nonetheless, the IT hardware must ultimately reside in a physical data center that will provide conditioned power and safe environmental conditions for the IT equipment.

Contents

Introduction
Overview of Data Center Cooling Practices and ASHRAE Recommendations and Standards
Data Center Guidelines, Metrics and Standards
Power Usage Effectiveness (PUE)
Understanding Temperature References
  Dry Bulb
  Wet Bulb
  Dew Point
  Recommended vs. Allowable Temperatures
Temperature Measurements – Room vs. IT Inlet
ASHRAE vs. NEBS Environmental Specifications
Controlling Supply and IT Air Intake Temperatures
Airflow Management
Server Reliability vs. Ambient Temperature – the X-Factor
The Impact of New Servers and Energy Star IT Equipment
IT Fan Power and Noise Levels vs. Intake Temperature
Understanding the Implications of ASHRAE 90.1 and 90.4 Standards
Examining the Proposed ASHRAE 90.4 Standard
Mandatory Electrical and Mechanical Energy Compliance (90.4)
  Electrical Loss Component (ELC)
  Mandatory PUE
  Mechanical Loss Component (MLC)
The Bottom Line
Biography



Overview of Data Center Cooling Practices and ASHRAE Recommendations and Standards

The majority of data center cooling practices have been based on the work of ASHRAE's Technical Committee 9.9 (TC 9.9), Mission Critical Facilities, Technology Spaces, and Electronic Equipment, which published the first edition of the "Thermal Guidelines for Data Processing Environments" in 2004. TC 9.9 was a cooperative effort to create a unified environmental specification range for data center operators and IT equipment manufacturers. The first edition defined a relatively narrow recommended temperature range of 68°F to 77°F (20–25°C), as well as 40–55% relative humidity (RH). These boundaries were based on the range of temperature and humidity limits provided by each of the different IT equipment manufacturers of that era, as well as historic operating practices of data centers containing older-generation, highly sensitive, so-called legacy computing equipment. Moreover, at that time energy efficiency was not given much consideration; the primary focus was maintaining very tightly controlled environmental conditions to minimize the risk of IT hardware failure.

In 2008 the second edition of the Thermal Guidelines was released and the recommended temperature range for Class 1 facilities was broadened to 64.4–80.6°F (18–27°C). The humidity limits became a bit more complex, now based on a combination of dew point (DP) temperatures and relative humidity (41.9°F [5.5°C] DP to 60% RH and 59°F [15°C] DP). This was in response to newer generations of IT hardware's increased environmental tolerances and the industry impetus to begin to explore ways to save cooling system energy. However, the historic reliance on the original, more conservative ASHRAE guidelines had already become a de facto practice, embedded in institutional memory. It is also important to note that while the 2nd edition included a wider "allowable" envelope, it was clearly marked "for reference only," and the focus was still primarily on the "recommended" envelope. In fact, it warned that "prolonged operation outside of the recommended operating ranges can result in decreased equipment reliability and longevity." As a result, despite the 2008 broadening of the recommended environmental envelope, most data center managers continued to keep their temperatures at 68°F (or less) and maintain a tightly controlled humidity of 50% RH.

Almost concurrently, The Green Grid introduced the Power Usage Effectiveness (PUE) metric in 2007, defined as the ratio of the total power used by the facility to the power used by the IT equipment.

Initially PUE was slow to take hold, but by 2010 awareness of the need for energy efficiency was beginning to spread throughout the industry. At first, very low PUEs of 1.2 or less made headlines when they were announced by Internet search and social media giants. This was accomplished by building custom hyperscale data centers which utilized a variety of leading-edge cooling systems designed to minimize cooling energy, such as direct outside air economizers and higher and wider temperature ranges. These designs and broader IT temperature ranges broke with conventional data center cooling practices, and were not readily accepted by traditional enterprise organizations.

However, they also proved that low-cost commodity IT equipment could operate fairly reliably over a much wider temperature and humidity range than the typical, more conservative industry practice of 68°F at 50% RH. This ultimately helped spur more facility managers to begin using PUE to analyze and improve their energy efficiency. It then became much clearer that, in most cases, cooling consumed the majority of the non-IT facility power and therefore offered the greatest opportunity for improvement.

In 2011, ASHRAE provided an early release of the key details of the newly introduced expanded allowable temperature ranges (A1–A4) that would be included in the 3rd edition of the Thermal Guidelines (published in 2012). Moreover, the 3rd edition openly encouraged the more common use of non-mechanical (compressor-less) cooling, so-called free cooling, using direct outside air economizers to take maximum advantage of ambient air temperatures (within the newly expanded allowable limits) to cool data centers. This would have been considered pure heresy just a few years prior.

This seemingly radical declaration was the result of information from the IT equipment manufacturers, who internally shared their projected failure rates vs. temperature over the expanded temperature ranges. From this anonymized data, they created the X-Factor risk projections. The publication of the X-Factor was meant to encourage data center managers to save energy by providing the information needed to consider increasing operating temperatures and the use of free cooling, while still maintaining acceptable expectations of IT equipment reliability. Although released in 2011, the X-Factor is still highly debated and sometimes misinterpreted. We will delve into the details in the Server Reliability section.



Data Center Guidelines, Metrics and Standards

It is important to note that, while closely followed by the industry, the TC 9.9 Thermal Guidelines are only recommendations for the environmental operating ranges inside the data center; they are not a legal standard. ASHRAE also publishes many standards, such as 90.1, "Energy Standard for Buildings Except Low-Rise Residential Buildings," which is used as a reference and has been adopted by many state and local building departments. Prior to 2010, the 90.1 standard virtually exempted data centers; the 2010 revision mandated highly prescriptive methodologies for data center cooling systems, which concerned many data center designers and operators whose leading-edge designs could conflict with those prescriptive requirements. We will examine 90.1 and 90.4 in more detail in the Standards section.

Power Usage Effectiveness (PUE)

While the original version of the PUE metric became better known, it was criticized by some because "power" (kW) is an instantaneous measurement at a point in time, and some facilities claimed very low PUEs based on a power measurement made on the coldest day of the year, which minimized cooling energy. In 2011 it was updated to PUE version 2, which is focused on annualized energy rather than power.

The revised 2011 version is recognized by ASHRAE, as well as the US EPA and DOE, became part of the basis of the Energy Star program, and has become a globally accepted metric. It defined four PUE categories (PUE0–PUE3) and three specific points of measurement. Many data centers do not have energy meters at the specified points of measurement. To address this issue, PUE0 is still based on power, but it requires the highest power draw, typically during warmer weather (highest PUE), rather than a best-case, cold-weather measurement, to negate inflated PUE claims. The next three PUE categories are based on annualized energy (kWh). In particular, PUE Category 1 (PUE1) specifies measurement at the output of the UPS and is the most widely used point of measurement. The points of measurement for PUE2 (PDU output) and PUE3 (at the IT cabinet) represent more accurate measurements of the actual IT loads, but are harder and more expensive to implement.

The Green Grid clearly stated that the PUE metric was not intended to compare data centers; it was meant only as a method to baseline and track changes to help data centers improve their own efficiency. The use of a mandatory PUE for compliance purposes in the 90.1–2013 building standard, and in the proposed ASHRAE 90.4 Data Center Energy Efficiency standard, is in conflict with that intended purpose. The issue is discussed in more detail in the section on ASHRAE standards.
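To make the categories concrete, the minimal sketch below computes annualized PUE from metered energy at two of the specified measurement points. The kWh figures are hypothetical meter readings, not data from any particular facility.

# Minimal sketch: annualized PUE per The Green Grid's PUE version 2 categories.
# The kWh figures below are hypothetical meter readings for one year.

def pue(total_facility_kwh, it_kwh):
    """PUE = total facility energy / IT equipment energy (dimensionless, >= 1.0)."""
    return total_facility_kwh / it_kwh

annual_total_facility_kwh = 8_760_000   # all energy entering the data center
annual_ups_output_kwh     = 5_600_000   # IT energy measured at UPS output (PUE Category 1)
annual_pdu_output_kwh     = 5_450_000   # IT energy measured at PDU output (PUE Category 2)

print(f"PUE1 (UPS output): {pue(annual_total_facility_kwh, annual_ups_output_kwh):.2f}")
print(f"PUE2 (PDU output): {pue(annual_total_facility_kwh, annual_pdu_output_kwh):.2f}")

Note that measuring the IT load further downstream (PDU or cabinet level) counts more of the distribution losses as facility overhead, which is why PUE2 comes out slightly higher, and more accurate, than PUE1.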



Understanding Temperature References

In order to discuss evolving operating temperatures, it is important to examine the differences between dry bulb, wet bulb and dew point temperatures.

Dry Bulb

The "dry bulb" thermometer (analog or digital) is the type most commonly referenced in the specification of IT equipment operating ranges. Its readings are unaffected by the humidity level of the air.

Wet Bulb

In contrast, there is also a "wet bulb" thermometer, wherein the "bulb" (or sensing element) is covered with a water-saturated material such as a cotton wick and a standardized velocity of air flows past it to cause evaporation, cooling the thermometer bulb (a device known as a sling psychrometer). The rate of evaporation and the related cooling effect are directly affected by the moisture content of the air. As a result, at 100% RH the air is saturated, the water in the wick will not evaporate, and the reading will equal that of a "dry bulb" thermometer. At lower humidity levels, the drier the air, the faster the moisture in the wick evaporates, causing the "wet bulb" thermometer to read lower than a "dry bulb" thermometer. Wet bulb temperatures are commonly used as a reference for calculating a cooling unit's capacity related to latent heat load (i.e., condensation; see Dew Point below), while dry bulb temperatures are used to specify sensible cooling capacity. Wet bulb temperatures are also used to project the performance of external heat rejection systems, such as evaporative cooling towers or adiabatic cooling systems. For non-evaporative systems, such as fluid coolers or refrigerant condensers, dry bulb temperatures are used.

Dew Point

Dew point temperature represents the temperature at which the water vapor in the air reaches saturation (100% RH). This temperature varies, and its effect can be commonly seen when condensation forms on an object that is colder than the dew point. This is an obvious concern for IT equipment. When reviewing common IT equipment operating specifications, it should be noted that the humidity range is specified as "non-condensing."

Dew point considerations are also important for minimizing latent heat loads on cooling systems. A typical CRAC/CRAH unit's cooling coil operates below the dew point and therefore inherently dehumidifies while cooling (latent cooling, which requires energy). This in turn requires the humidification system to use more energy to replace the moisture removed by the cooling coil. Newer cooling systems can avoid this double-sided waste of energy by implementing dew point control.
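As a simple illustration of how dew point relates to dry bulb temperature and relative humidity, the short sketch below uses the Magnus approximation, a standard psychrometric approximation (an assumption here, not part of the ASHRAE guidelines), to estimate the dew point of room air and show when a cold coil will condense.

import math

# Magnus approximation for dew point (deg C); a standard psychrometric
# approximation, accurate to roughly +/-0.4 C over typical data center ranges.
A, B = 17.27, 237.7  # Magnus coefficients

def dew_point_c(dry_bulb_c, rh_percent):
    gamma = math.log(rh_percent / 100.0) + (A * dry_bulb_c) / (B + dry_bulb_c)
    return (B * gamma) / (A - gamma)

# Example: room air at 24 C (75.2 F) and 45% RH.
dp = dew_point_c(24.0, 45.0)
print(f"Dew point: {dp:.1f} C")   # roughly 11.3 C (52.3 F)

# A cooling coil (or any surface) colder than ~11 C will condense moisture
# from this air, adding latent load that the humidifier must then replace.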

Recommended vs. Allowable Temperatures

As of 2011, the "recommended" temperature range remained unchanged at 64.4–80.6°F (18–27°C). While the new A1–A2 "allowable" ranges surprised many IT and facility personnel, it was the upper limits of the A3 and A4 ranges that really shocked the industry.

While meant to provide more information and options, the new expanded “allowable” data center classes significantly complicated the decision process for the data center operator when trying to balance the need to optimize efficiency, reduce total cost of ownership, address reliability issues, and improve performance.


2011 Equipment Class Ranges

Class          Low °F   High °F   Low °C   High °C
Recommended    64.4     80.6      18       27
Allowable A1   59       89.6      15       32
Allowable A2   50       95        10       35
Allowable A3   41       104       5        40
Allowable A4   41       113       5        45




Temperature Measurements – Room vs. IT Inlet

As indicated in the summary of the Thermal Guidelines, the temperature of the "room" was originally used as the basis for measurement. However, "room" temperatures were never truly meaningful, since temperatures can vary greatly across different areas of the whitespace. Fortunately, in 2008 there was an important but often overlooked change in where the temperature is measured. The 2nd edition referenced the temperature of the "air entering IT equipment." This highlighted the need to understand and address airflow management issues in response to higher IT equipment power densities, and the recommendation of the cold-aisle/hot-aisle cabinet layout. The 2012 guidelines added recommendations for the locations for monitoring temperatures in the cold aisle. These covered placing sensors inside the face of the cabinet and the position and number of sensors per cabinet (depending on the power density of the cabinets and IT equipment). While this provided better guidance on where to monitor temperatures, very few facility managers had temperature monitoring in the cold aisles, much less inside the racks. Moreover, it did not directly address how to control the intake temperatures of the IT hardware.

ASHRAE vs. NEBS Environmental Specifications

Although the ASHRAE Thermal Guidelines are well known in the data center industry, the telecommunications industry created environmental parameters long before TC 9.9 released the first edition in 2004. The NEBS1 environmental specifications provide a set of physical, environmental, and electrical requirements for the local exchanges of telephone system carriers. The NEBS specifications have evolved and been revised many times, and their ownership has changed as telecommunications companies reorganized. Nonetheless, NEBS and its predecessors effectively defined the standards for ensuring reliable equipment operation of the US telephone system for over a hundred years.

In fact, NEBS is referenced in the ASHRAE Thermal Guidelines. The NEBS "recommended" temperature range of 64.4–80.6°F (18–27°C) existed well before the original TC 9.9 guidelines, but it was not until the 2nd edition in 2008 that the Thermal Guidelines were expanded to the same values. More interestingly, in 2011 the new TC 9.9 A3 specification matched the long-standing NEBS allowable temperature range of 41–104°F (5–40°C). However, it is the NEBS allowable humidity range that would shock most data center operators: 5–85% RH. The related note in the ASHRAE Thermal Guidelines states: "Generally accepted telecom practice; the major regional service providers have shut down almost all humidification based on Telcordia research."

1 NEBS (previously known as Network Equipment-Building System) is currently owned and maintained by Telcordia, formerly known as Bell Communications Research, Inc., or Bellcore, the telecommunication research and development company created as part of the break-up of the American Telephone and Telegraph Company (AT&T).

Revised Low Humidity Ranges and Risk of Static Discharge

In 2015 TC 9.9 completed a study of the risk of electrostatic discharge (ESD) and found that lower humidity did not significantly increase the risk of damage from ESD, as long as proper grounding was used when servicing IT equipment.

It is expected that the 2016 edition of the Thermal Guidelines will expand the allowable low humidity level down to 8% RH. This will allow substantial energy savings by avoiding the need to run humidification systems to raise humidity unnecessarily.



Controlling Supply and IT Air Intake Temperatures

There are multiple issues related to controlling the temperature of the air reaching the intake of the IT equipment; however, they generally fall into two major areas. The first is the method of controlling the supply air temperature that leaves the cooling units. The second is controlling the airflow to avoid or minimize the mixing of supply and return air before it enters the IT equipment.

One of the most common methods of temperature control in data center cooling systems has traditionally been based on sensing the temperature of the air returning to the cooling units. The temperature is typically set and controlled on each cooling unit individually: when the return air temperature rises above the set-point, the unit simply begins the cooling cycle. In most systems, if set for 70°F, the cooling unit will lower the temperature by 18–20°F, resulting in supply air temperatures of 50–52°F. This simple return-temperature-based control method has been used for over 50 years, and its inherent drawback has essentially been overlooked by most facility managers; it was, and still is, considered a normal and "safe" operating practice for most data centers. It wastes energy, and despite the very low supply air temperatures it does not really ensure that the IT equipment receives enough airflow, or that the air is within the recommended temperature range, primarily due to poor airflow management.

To improve energy efficiency, more recently there has been a trend to sense and control supply air temperatures, either at the output of the cooling system, in the underfloor plenum, in the cold aisle, or at the cabinet level. Supply air temperatures can be controlled relatively easily in newer CRAHs, which can continuously modulate the chilled water flow rate from 0–100% and may also vary fan speeds to maintain the supply temperature, adapting more efficiently to heat load changes. It is more difficult to implement supply-based temperature control in DX CRAC units, which need to cycle their internal compressors on and off and are therefore limited to only a few stages of cooling (dependent on the number of compressors). While there have been some recent developments utilizing variable speed compressors, the majority of installed CRACs have only simple on-off control of the compressors.

There are a wide variety of opinions and recommendations about the best, most energy-efficient way to control supply temperature, such as centralized control of individual CRAC/CRAH units, control by averaged supply air using under-floor sensors or sensors in the cold aisles, or a combination of inputs. That, of course, leads to the inevitable questions of where and how many sensors are required, and how the control system should respond if some areas are too warm and others too cold. Nonetheless, there is ongoing development of more sophisticated control systems that can optimize and adapt to these challenges.
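To make the sensor-averaging idea concrete, the fragment below is an illustrative sketch only (not any vendor's control logic) of how a control system might nudge a shared supply-air setpoint from averaged cold-aisle readings while protecting the single warmest inlet; the sensor values and limits are hypothetical.

# Illustrative sketch only (not any vendor's control logic): adjust a shared
# supply-air setpoint from averaged cold-aisle sensors, while protecting the
# single warmest location. Sensor readings and limits are hypothetical.

cold_aisle_temps_f = [68.2, 71.5, 69.8, 74.9, 70.1]  # IT inlet sensors, deg F
target_inlet_f = 75.0      # warmest inlet temperature we are willing to allow
max_inlet_f = 80.6         # top of the ASHRAE "recommended" range

avg_inlet = sum(cold_aisle_temps_f) / len(cold_aisle_temps_f)
hottest_inlet = max(cold_aisle_temps_f)

# Raise supply temperature while both the average and the worst-case inlet
# stay below their limits; otherwise lower it to protect the hot spot.
if hottest_inlet < target_inlet_f and avg_inlet < target_inlet_f - 2.0:
    setpoint_change_f = +0.5
elif hottest_inlet >= max_inlet_f:
    setpoint_change_f = -1.0
else:
    setpoint_change_f = 0.0

print(f"avg={avg_inlet:.1f}F hottest={hottest_inlet:.1f}F adjust={setpoint_change_f:+.1f}F")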


Airflow Management

Setting aside the supply temperature control strategies, one of the most significant issues in ensuring IT equipment intake temperatures is airflow management. While the basic hot aisle/cold aisle layout has been generally adopted as the commonly accepted cabinet arrangement, it is only the beginning of ensuring the right temperature and amount of airflow to the IT equipment. There are varying methods and levels of airflow management to avoid recirculation or bypass airflow. These range from the most basic recommendations, such as blanking plates in the cabinets to prevent exhaust recirculation within the cabinet, all the way up to complete aisle containment systems (hot or cold). By minimizing the mixing of cold supply air with warm IT exhaust air (both within the cabinets and between the aisles), they reduce or negate the need to unnecessarily overcool supply temperatures to make certain that the air entering the IT equipment is within the desired operational range (either recommended or allowable).

In an ideal world, the airflow through the cooling system would perfectly match the airflow requirements (CFM) of the IT gear, even as the CFM requirements change dynamically; the result would be no bypass air and no recirculation, a perfectly optimized airflow balance. However, the increasingly dynamic nature of server heat loads also means that the fans operate over a very wide CFM range. This level of airflow control may be accomplished in very tightly enclosed systems, such as row-level containment systems. However, it is typically not feasible for most multi-tenant colocation data centers with centralized cooling, which use a common open return airflow path to provide flexibility to various customers.

Moreover, even well-managed dedicated enterprise facilities still need the flexibility to adapt to major additions and upgrades of IT equipment, as well as regular operational moves and changes. Temperatures can vary greatly across the various cold aisles, depending on the power density and airflow in uncontained aisles. In particular, there is typically temperature stratification from the bottom to the top of racks, which can range from only a few degrees to 20°F or more in cases of poor airflow management. Higher temperatures at the top of the racks and at the ends of aisles, as well as any "hot spots," are one of the reasons that many sites still need to keep supply temperatures low, to ensure that these problem areas remain within the desired temperature range.
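A quick way to see how much airflow a given IT load actually needs is the standard sensible-heat approximation for air at sea-level density (roughly BTU/hr = 1.08 x CFM x delta-T in °F); the sketch below applies it to a few hypothetical rack loads.

# Airflow needed to carry away a rack's heat load, using the standard
# sensible-heat approximation for air at sea level: BTU/hr = 1.08 * CFM * dT(F).
# The rack power and delta-T values below are hypothetical examples.

BTU_PER_HR_PER_WATT = 3.412

def required_cfm(rack_watts, delta_t_f):
    return (rack_watts * BTU_PER_HR_PER_WATT) / (1.08 * delta_t_f)

for watts, dt in [(5000, 20), (5000, 30), (10000, 25)]:
    print(f"{watts/1000:.0f} kW rack, {dt} F rise -> {required_cfm(watts, dt):,.0f} CFM")

If the cooling system delivers less airflow than this to the cold aisle, the servers make up the difference by drawing in recirculated exhaust air, which is exactly the hot-spot behavior described above.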

Server Reliability vs. Ambient Temperature – the X-Factor

Besides introducing and defining the expanded "allowable" temperature ranges, the 3rd edition compiled the projected failure rates provided by the various IT equipment manufacturers and used them as the basis for the "X-Factor" mentioned previously. This provides a statistical projection of the relative failure rate vs. temperature, and is described in the 2011 guidelines as a method "to allow the reader to do their own failure rate projections for their respective data center locale." The higher allowable temperature ranges were meant to promote the use of "free cooling" wherever possible, to help reduce facility cooling system energy, and the X-Factor was provided to help operators assess the projected failure rate impact of operating in the expanded ranges. The guidelines also provide a table of time-weighted X-Factors for major cities in the US, as well as in Europe and Asia.

Upon first inspection, the X-Factor seems to run counter to everything that the previous editions of the ASHRAE guidelines held as sacrosanct for maximum reliability: a constant temperature of 68°F (20°C), tightly controlled 24x7x365. This may still be the preferred target temperature for many facilities seeking to minimize risk, but it is used as the baseline X-Factor reference point for A2-rated volume servers, wherein 68°F (20°C) is assigned a reference risk value of 1.0. The risk factor increases with temperature above 68°F and decreases below 68°F. (See the footnote regarding humidity.2)

However, part of the confusion about the X-Factor stems from the fact that the increase (or decrease) only represents a statistical deviation from the existing equipment failure rate (assuming the server was continuously operating at the 68°F reference). It is important to note that the underlying failure rate (i.e., some number of failures per 1,000 units per year) is not disclosed and is already a "built-in" IT failure risk. This "normal" failure rate is a fact of life which must be dealt with in any IT architecture. For example, if the undisclosed failure rate from a given server manufacturer were 4 failures per 1,000 servers per year, then operating those servers at 90°F (an X-Factor of 1.48) for a year would yield a statistically projected rate of about 6 failures per 1,000 servers per year, which should not have a significant operational impact. That is why the projected and actual historic failure rate of any particular server needs to be understood in the context of the impact of a failure, and evaluated when making operating temperature decisions.
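The arithmetic above is simple enough to sketch directly; in the fragment below the baseline failure rate is a hypothetical figure (manufacturers do not disclose it), and the X-Factor values are the ones cited in the text.

# Worked example of the X-Factor arithmetic described above. The baseline
# failure rate is hypothetical (manufacturers do not disclose it); the
# X-Factor values are the ones cited in the text (1.00 at 68 F, 1.48 at 90 F).

baseline_failures_per_1000_per_year = 4.0   # hypothetical, at a constant 68 F
servers_installed = 1000

def projected_failures(x_factor):
    return baseline_failures_per_1000_per_year * (servers_installed / 1000) * x_factor

print(f"At 68 F (X=1.00): {projected_failures(1.00):.1f} failures/year")
print(f"At 90 F (X=1.48): {projected_failures(1.48):.1f} failures/year")

# A time-weighted X-Factor would instead sum hours-in-each-temperature-bin
# multiplied by that bin's X-Factor, divided by 8,760 hours.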

2 It should be noted that all of the above is based only on dry bulb temperatures and does not take into account the effects of pollution and humidity introduced by the use of airside economizers. Waterside economizers do not expose the IT intake air to the wider humidity variations or pollution of direct outside air; those effects are ignored in these calculations, so reliability with waterside economizers should compare favorably to exposure to direct outside air.





The Impact of New Servers and Energy Star IT Equipment

In the past, most IT equipment power loads did not decrease much even when idle, representing a huge direct waste of IT energy, as well as creating a heat load that required, and wasted, cooling system energy. In 2009, the EPA released the first Energy Star for Servers program, which defined a series of energy usage and efficiency requirements. These requirements were essentially focused on increasing overall server energy efficiency, lowering overall power, and especially lowering the power drawn while at idle. The Energy Star program now also includes storage systems, and the large network equipment specification is expected to be finalized in 2016. While not all IT equipment is Energy Star rated, the rating does represent an energy cost reduction factor for IT buyers to consider when making a purchasing decision.

IT Fan Power and Noise Levels vs. Intake Temperature

While operating temperatures and energy efficiency have been the primary focus of the industry, noise in the data center is a long-standing and growing issue. In particular, it is high-density IT equipment, such as high-power 1U servers and blade servers, that is the primary source of the increased noise, due to its higher airflow requirements.

Moreover, while new IT equipment energy efficiency considerations have minimized fan speeds at low intake temperatures and CPU loads, the internal thermal management systems will still increase fan speed significantly as intake temperatures and CPU loads increase. This can significantly increase the amount of power the IT fans use, as well as the level of noise.

ASHRAE charts show that for A2 servers, increasing intake temperatures from 59°F to 95°F can increase airflow requirements by up to 250%, with a commensurate and substantial increase in fan noise. This could also result in server power increases of up to 20% (note that these are the maximum projections, based on an anonymized composite of vendor data ranging from 7–20%, so check with your IT equipment manufacturer for specific performance).

Hidden Exposure – Rate of Change

While the wider temperature ranges get the most attention, one of the lesser-noticed operational parameters in the TC 9.9 thermal guidelines is the rate of temperature change. Since 2008 it has been specified as 36°F (20°C) per hour (except for tape-based back-up systems, which are limited to less than 9°F/5°C per hour). While expressed "per hour," this should be evaluated in minutes or even seconds, especially when working with modern high-density servers (such as a rack of 5 kW blade servers with a high delta-T), since even a 10°F rise could occur in 5 minutes (i.e., 2°F per minute), which would effectively represent a 120°F-per-hour rate of rise and could result in internal component damage to servers from thermal shock (see the short sketch following this sidebar).

While operating at 90°F may result in a statistical increase of a few more server failures a year, a sudden increase in temperature from the loss of cooling due to a failure (or related to the 5–15 minute compressor restart delays after a utility failure, once generator power comes on-line) could prove catastrophic to all the IT equipment. It is therefore just as important, if not more important, to ensure a stable temperature in the event of a cooling system incident.

The possibility of a loss or interruption of cooling is another operational reason to keep supply temperatures low, thus increasing thermal ride-through time in the event of a brief loss of cooling, especially for higher-density cabinets, where an event of only a few minutes could cause unacceptably high IT intake temperatures. There are several ways to minimize or mitigate this risk, such as increased cooling system redundancy, rapid-restart chillers, and/or thermal storage systems. Ultimately this is still a technical and business decision for the operator.
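As referenced in the sidebar above, the conversion from a short-term temperature rise to the per-hour rate the guideline uses is simple arithmetic; the sketch below applies it to the sidebar's 10°F-in-5-minutes example.

# Convert a short-term temperature rise into the per-hour rate that the
# TC 9.9 guideline limits (36 F / 20 C per hour). Uses the sidebar's example
# of a 10 F rise in 5 minutes.

GUIDELINE_LIMIT_F_PER_HR = 36.0

def rate_of_change_f_per_hr(delta_t_f, minutes):
    return delta_t_f * (60.0 / minutes)

rate = rate_of_change_f_per_hr(delta_t_f=10.0, minutes=5.0)
print(f"Observed rate: {rate:.0f} F/hr "
      f"({'exceeds' if rate > GUIDELINE_LIMIT_F_PER_HR else 'within'} the 36 F/hr guideline)")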


ASHRAE has been aware of the rising noise issue for many years and notes that fan laws generally predict that the sound power level of an air-moving device increases with the fifth power of rotational speed. Fan noise, energy, airflow and the related fan affinity laws can be quite complex engineering studies. However, without delving into too many technical calculations, the example provided in the 2012 guidelines postulates that a 3.6°F (2°C) increase in IT intake temperature (to save cooling system energy) would result in an estimated 20% increase in fan speed (e.g., 3,000 to 3,600 rpm). This would equate to roughly a 4 dB increase in fan noise, and it would not be unreasonable to expect increases in the range of 3 to 5 dB if temperatures are raised from 68°F to 72°F (still within the "recommended" range).
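That fifth-power relationship is easy to check directly: the change in sound power level in decibels is 10·log10 of the power ratio, or 50·log10 of the speed ratio. The short sketch below reproduces the 3,000-to-3,600 rpm example; the cube-law fan power comment is a standard affinity-law approximation, not a figure from the guidelines.

import math

# Fan-law estimate of the change in sound power level: for an air-moving
# device whose sound power scales with the fifth power of rotational speed,
# delta_dB = 10 * log10((n2/n1)**5) = 50 * log10(n2/n1).

def fan_noise_increase_db(rpm_before, rpm_after):
    return 50.0 * math.log10(rpm_after / rpm_before)

print(f"3000 -> 3600 rpm: +{fan_noise_increase_db(3000, 3600):.1f} dB")  # ~ +4 dB

# Fan power rises roughly with the cube of speed (affinity law), so the same
# 20% speed increase implies roughly (1.2**3 - 1) ~ 73% more fan power.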

While the example cited by ASHRAE uses 3,600 rpm fans, in many cases high-density 1U servers (which have 500–1,000 watt power supplies) use very small fans that can run at up to 15,000 rpm in order to provide enough airflow at full load and high intake temperatures. They can produce higher noise levels (at much higher frequencies) than the larger fans utilized in bigger server chassis and blade servers, which also create less noise and use less energy to deliver the necessary airflow.

Lower IT fan speeds improve energy efficiency in several ways, as well as reducing fan noise. In addition to directly lowering the fan energy of the IT server, they indirectly allow the facility-side cooling units (CRAC/CRAH) to lower their CFM delivery requirements, thus also lowering facility fan energy.

In order to meet Energy Star for Data Center IT Equipment requirements, every component and energy management system is energy optimized. This allows the system to idle at very low power levels (CPU, memory, etc.); as a result, fans will idle down to minimum speed whenever possible, but will also ramp up quickly as intake temperature rises. This fan energy management function has now become fairly standard in most new servers, and its effect on power consumption and airflow requirements can be seen in the ASHRAE airflow curves for A2 servers.

As a result, server fan speed controllers have become more intelligent and are normally configured to keep fan speed (and energy) as low as possible, while still keeping the CPU and other components within their safe operating region. Many regular (non-Energy Star) servers now use this scheme to save energy. While it is known that raising supply air temperatures can save cooling system energy, beyond a certain point IT fan power may increase faster than the reduction in cooling energy, resulting in an overall increase in total energy.


[Figure: Temperature vs. Power and Temperature vs. Airflow for A2 servers. Source: ASHRAE TC 9.9 whitepaper "2011 Thermal Guidelines for Data Processing Environments – Expanded Data Center Classes and Usage Guidance"]


Therefore, finding the warmest intake temperature range that does not cause the server fans to rapidly increase speed requires a monitoring system. The monitoring system should be able to look at IT power and temperature, as well as cooling system energy, in order to optimize overall energy usage, not just facility PUE. This is one of the justifications for a Data Center Infrastructure Management (DCIM) system, which can track and correlate these and other data points, such as IT workloads (i.e., processing).
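To illustrate the kind of analysis such a monitoring system enables, the sketch below sweeps intake temperature against two made-up curves (purely illustrative assumptions, not measured data) for cooling energy and IT fan energy, and finds the setpoint that minimizes their sum; a DCIM tool would substitute real trended kW values for these functions.

# Illustrative only: made-up curves (not measured data) showing why total
# energy can have a minimum as supply/intake temperature rises. A DCIM-style
# analysis would substitute real trended kW values for these functions.

def cooling_kw(intake_f):
    # assume chiller/economizer load falls gently as intake temperature rises
    return 300.0 - 4.0 * (intake_f - 65.0)

def it_fan_kw(intake_f):
    # assume server fans stay near idle until ~77 F, then ramp up steeply
    return 20.0 + (0.0 if intake_f <= 77.0 else 6.0 * (intake_f - 77.0) ** 1.5)

candidates = range(65, 91)
total = {t: cooling_kw(t) + it_fan_kw(t) for t in candidates}
best = min(total, key=total.get)
print(f"Lowest total (cooling + IT fan) power at ~{best} F intake: {total[best]:.0f} kW")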

Understanding the Implications of ASHRAE 90.1 and 90.4 Standards

The 90.1 standard, "Energy Standard for Buildings Except Low-Rise Residential Buildings," has long been used as a reference for commercial buildings and is used by many state and local building departments. Prior to 2010, the 90.1 standard virtually exempted data centers. In 2010 the revised 90.1 standard included and mandated highly prescriptive methodologies for data center cooling systems. This concerned many data center designers and operators, especially the Internet and social media sites, which utilized a wide variety of leading-edge cooling systems designed to minimize cooling energy. These designs broke with traditional data center cooling practice and could potentially conflict with the prescriptive requirements of 90.1, thus limiting rapidly developing innovations in the more advanced data center designs.

In response to these complaints, the 2013 revision introduced performance-based criteria, which included PUE. This addressed some of the prescriptive concerns, but also created other issues and is still being protested by the industry. In that same time period, a committee was formed to create 90.4, the "Energy Standard for Data Centers." The public review process began with the initial draft of 90.4, released for its first 45-day public review and comment on February 2, 2015. The second draft for public review was released September 4, 2015, and both were met with further data center industry concerns. The 3rd and latest revision to 90.4 was released for public review on January 29, 2016. In addition, proposed addendum "cz" to the 90.1–2013 standard was concurrently posted for public review. This proposed 90.1 revision very clearly indicated that, going forward, 90.4 would define data center energy efficiency standards, hopefully eliminating future overlap and confusion.

Temperature vs. Server Performance

In the quest to improve facility and IT energy efficiency, one of the other issues that is either overlooked or misinterpreted is intake temperature vs. server performance at higher intake temperatures. The power drawn by the CPU increases exponentially with chip temperature, due to increased "silicon leakage current" within the chip.3 This also causes the CPU to run even hotter, which further raises the chip temperature. The thermal management software in recent-generation servers interactively monitors and controls fan speed based on multiple factors. These factors include intake temperature, as well as key internal component temperatures (the CPU and, in some cases, memory), compared against the computing load. As previously discussed, an energy management system will keep the fan speed low whenever possible; however, as CPU loading or intake temperature increases, the fans will speed up. If the CPU or other internal temperatures rise beyond a certain point and the fan speed has reached 100 percent, the thermal management system will begin to reduce the CPU voltage and/or clock frequency. This allows the system to remain operational, with reduced performance, which is preferable to allowing the system to go into thermal overload and ultimately shut down.

3 CPU leakage current vs. temperature increases power:
http://www.intel.la/content/dam/www/public/us/en/documents/white-papers/data-center-server-cooling-power-management-paper.pdf
http://svlg.org/wp-content/uploads/2014/11/WarmDataCentersWasteEnergy.pdf


Examining the Proposed ASHRAE 90.4 Standard

The stated purpose of the proposed ASHRAE 90.4P standard is "to establish the minimum energy efficiency requirements of Data Centers for: the design, construction, and a plan for operation and maintenance, and utilization of on-site or off-site renewable energy resources." The scope covers a) new data centers or portions thereof and their systems, b) new additions to data centers or portions thereof and their systems, and c) modifications to systems and equipment in existing data centers or portions thereof. It also states that the provisions of the standard do not apply to: a) telephone exchange(s), b) essential facility(ies), or c) information technology equipment (ITE).

Mandatory Electrical and Mechanical Energy Compliance (90.4)

Electrical Loss Component (ELC)

The designers and builders of new data centers will need to demonstrate how their design will comply with the highly specific and intertwined mandatory compliance paths. This is defined by the "design electrical loss component (design ELC)" sections. These involve a multiplicity of tables of minimum electrical system energy efficiency values related to IT design load capacity at both 50 and 100 percent. In addition, they require electrical system path losses and UPS efficiencies to be evaluated at 25, 50 and 100 percent load, at various levels of redundancy (N, N+1, 2N and 2N+1).

Moreover, it delineates three distinct segments of the power chain losses: the incoming service segment, the UPS segment, and the ITE distribution segment (which extends cable losses down to the IT cabinets). Furthermore, it states that the design ELC "shall be calculated using the worst case parts of each segment of the power chain in order to demonstrate a minimum level of electrically efficient design."
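As a rough sketch of that segment structure only (a simplified interpretation, not the 90.4 calculation itself, and with hypothetical placeholder efficiencies rather than values from the 90.4 tables), the chain of worst-case segment efficiencies can be combined as follows.

# Sketch of the "design ELC" idea only: the power chain is split into the
# incoming service, UPS, and ITE distribution segments, and the worst-case
# efficiency of each segment is combined. The efficiencies below are
# hypothetical placeholders, NOT values from the 90.4 tables.

segments = {
    "incoming service": 0.990,   # e.g., transformer/switchgear at design load
    "UPS":              0.940,   # worst-case UPS efficiency for the topology
    "ITE distribution": 0.985,   # PDUs and cabling down to the IT cabinets
}

chain_efficiency = 1.0
for name, eff in segments.items():
    chain_efficiency *= eff

design_loss = 1.0 - chain_efficiency   # fraction of incoming power lost before the ITE
print(f"Chain efficiency: {chain_efficiency:.3f}  ->  electrical loss: {design_loss:.1%}")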

Mandatory PUE

The second revision of the proposed 90.4 standard explicitly listed mandated maximum PUE values ranging from 1.30 to 1.61, each specifically tied to a geographic area in each of the 18 ASHRAE climate zones. However, these PUE values seemed prohibitively low, and many felt they would substantially increase initial design and build costs, which generated a lot of industry concerns, comments and protests. Moreover, the requirement did not take into account the local cost of energy, its availability and fuel sources, or water usage and any potential restrictions or shortages of water, which recently became an issue during California's ongoing drought. The PUE reference was removed in the next revision; however, the issue of local resources remains unaddressed.

Mechanical Loss Component (MLC)

The 3rd revision removed all references to The Green Grid PUE requirements; however, it contains highly detailed compliance requirements for minimum energy efficiency of the mechanical cooling systems, again specifically listed for each of the 18 climate zones. The 3rd revision has another cooling performance table (again for each of the 18 climate zones) called the "design mechanical load component (design MLC)," defined as the sum of all cooling, fan, pump, and heat rejection design power divided by the data center ITE design power (evaluated at 100% and 50% of ITE load).
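The MLC ratio itself is straightforward to compute once the design powers are known; the sketch below uses hypothetical design kW figures (the actual compliance limits come from the 90.4 climate zone tables).

# Design MLC as described above: the sum of cooling, fan, pump, and heat
# rejection design power divided by the data center ITE design power,
# evaluated at 100% and 50% of ITE load. The kW figures are hypothetical.

def design_mlc(cooling_kw, fan_kw, pump_kw, heat_rejection_kw, ite_design_kw):
    return (cooling_kw + fan_kw + pump_kw + heat_rejection_kw) / ite_design_kw

mlc_100 = design_mlc(cooling_kw=220, fan_kw=60, pump_kw=25, heat_rejection_kw=45,
                     ite_design_kw=1000)
mlc_50  = design_mlc(cooling_kw=130, fan_kw=35, pump_kw=18, heat_rejection_kw=30,
                     ite_design_kw=500)
print(f"MLC at 100% ITE load: {mlc_100:.2f}")   # 0.35 in this example
print(f"MLC at  50% ITE load: {mlc_50:.2f}")    # 0.43 in this example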

Mandatory Compliance Through Legislation

The proposed 90.4 standard states that: "Compliance with this standard is voluntary until and unless a legal jurisdiction makes compliance mandatory through legislation." As previously stated, one of the many data center industry concerns involves the "authorities having jurisdiction" (AHJ). This encompasses many state and local building departments, as well as the federal government, which use 90.1 as a basis for many types of new commercial and industrial buildings or for requirements to upgrade existing buildings. Therefore, they will also use 90.4 as their new data center standard, since it is referenced in the revision to the 90.1 standard. Moreover, 90.4 also acts as a more detailed supplement to 90.1 and in fact specifically requires compliance with all other 90.1 building provisions. In addition to any design compromises and increased costs, it seems unlikely that many AHJs' local building inspectors will fully understand complex data-center-specific issues, such as redundancy levels, or even be familiar with TC 9.9 or The Green Grid PUE metric. This could delay approvals or force unnecessary design changes, simply based on how the local building inspector interprets the 90.1 and 90.4 standards.


Another, perhaps more significant, issue is that the cooling efficiency calculations would seem to preclude the effective use of the "allowable" ranges to meet the mandatory and prescriptive requirements: "The calculated rack inlet temperature and dew point must be within Thermal Guidelines for Data Processing Environments recommended thermal envelope for more than 8460 of the hours per year." So even if the data center design is intended for newer, higher-temperature IT equipment (such as A2, A3 or A4), it would unnecessarily need to be designed and constructed for the lower recommended range, which could substantially increase cooling system costs.

Concurrently with the release of the 3rd draft of 90.4, ASHRAE also released proposed addendum "cz" to 90.1–2013 for public review, which removed all references to mandatory PUE compliance. The addendum provides a clear-cut reference transferring all data center energy efficiency requirements to 90.4, which should reduce potential conflict and confusion (other aspects of the building would still need to comply with local building codes). The goal is to publish the final version of 90.4 in the fall of 2016.

Nonetheless, while this was a significant issue, why should a data center still be limited to the "recommended" temperature and dew point by designing a system to meet the mandatory cooling system energy efficiency requirements? It should be up to the operators if, and for how long, they intend to operate in the expanded "allowable" ranges. This is especially true now that virtually all commodity IT servers can operate within the A2 range (50–95°F). Moreover, solid state disks (SSDs) now have a much wider operating temperature range, up to 32–170°F. While originally expensive, SSDs continue to come down in price while matching spinning disk storage capacity, and they are substantially faster, delivering increased system throughput and improving overall server performance and energy efficiency. They have become more common in high-performance servers and, more recently, a cost-effective option in commodity servers, which will eventually result in greater thermal tolerance as servers are refreshed.

So with all of these additional factors defined in the ASHRAE Thermal Guidelines and the proposed 90.4 standard, many of which overlap and potentially conflict with each other, how should facility designers and managers decide on the "optimum" or "compliant" operating environmental conditions in the data center?

Clearly, ASHRAE TC 9.9 members feel it should still be up to the owners and operators to decide where to build the facility and to make their own decisions about Total Cost of Ownership (TCO), based not just on mandatory cooling system efficiency calculated for the "recommended" environmental range, but on the ever-evolving IT hardware environmental envelope and the interrelated energy efficiency performance. Conversely, it would seem that the 90.4 committee members prefer more complex standards, with the potential to be interpreted (or misinterpreted) and legally enforced by local authorities.

A3 and A4 Servers

While A3 and A4 servers did not exist in 2011 when the expanded ranges were originally introduced, as of 2015 there were several A4-rated servers on the market whose manufacturers' specifications state that those models can "run continuously at 113°F (45°C) with no impact on reliability." This new generation of A3 and A4 hardware overcomes the early restrictions by some manufacturers regarding limiting the exposure time to higher temperatures.

Mutually Aligned Goals: Energy Efficiency and Lower Total Cost of Ownership

In sharp contrast to the 90.4 standard, ASHRAE's TC 9.9 2011 whitepaper stated the new environmental envelope "was created for general use across all types of businesses and conditions. However, different environmental envelopes may be more appropriate for different business values and climate conditions. Therefore, to allow for the potential to operate in a different envelope that might provide even greater energy savings, this whitepaper provides general guidance on server metrics that will assist data center operators in creating a different operating envelope that matches their business values…. The global interest in expanding the temperature and humidity ranges continues to increase driven by the desire for achieving higher data center operating efficiency and lower total cost of ownership (TCO)."


The Bottom Line

There is no question that improving data center energy efficiency is an important goal. However, the basic function of a data center is to provide a secure physical space and to ensure the availability of power and environmental operating conditions suitable for the reliable operation of the IT equipment. Moreover, there are a variety of complex, interacting business and technical issues that can affect the design, as well as the present and future operating conditions, over the operating life of the data center. Therefore, being able to efficiently adjust to IT equipment environmental requirements as they continue to broaden will remain an important and ongoing factor.

Nonetheless, enterprise and other organizations may choose to follow the ASHRAE thermal guidelines (recommended or allowable) for their own business requirements as they see fit. However, other organizations, such as hyperscale Internet and cloud service providers, have different requirements and commonly use custom hardware made to their specifications. The environmental specifications of that equipment may even be broader than the current A4 range, and how they choose to cool it (mechanically or otherwise) is again a business decision. It is therefore not sensible to mandate that a data center cooling system's energy efficiency be designed and bound by the "recommended" envelope when the operator has no intention of operating in such a manner.

As previously noted, the Internet search and social media giants continue to pioneer and explore the expanded use of economizers and higher IT air intake temperatures, while the more traditional organizations are left to ask: what is the safe operating temperature and humidity range for the classic enterprise mission-critical data centers operated by financial institutions, airlines, and governments? Moreover, what should colocation operators, who are driven by the need to satisfy a wide variety of customers' requirements, specify or offer to support as their environmental operating ranges?

Some extreme proponents of "sustainability," who are usually the most critical of data centers, will say that governmental regulation is the only way to improve energy efficiency, and that it should not matter if it affects the cost of building or operating any type of building, including data centers. There are others who are part of the industry, such as the members of ASHRAE TC 9.9, who have publicly declared their concerns about the restrictive compliance requirements of 90.4. They are clearly inherent supporters of energy efficiency, yet recognize that over-zealous and prescriptive mandatory measures are not the best method, and in fact could actually hinder this process.

The most likely impact of 90.4 is that it will increase the cost of preparing documentation during the design stage to submit plans for building department approvals. It will also almost certainly require much more time for review by AHJs who are not familiar with many data-center-specific issues. Worse yet, it could limit or prohibit the hyperscale pioneers, which previously had the freedom to design and build energy-saving but unconventional cooling systems, by having local building departments reject or limit their future design options.

Moreover, unlike traditional enterprise data centers (which perhaps were previously less aware of or concerned about energy efficiency), colocation and cloud service providers are highly competitive businesses. Their TCO model inherently drives them to improve their energy efficiency wherever possible. Their decisions should be able to be made based on other factors, such as the cost of energy, local tax incentives, and customer market demands, as well as their own strategic technical and business objectives. Therefore, they should be allowed to build their facilities wherever customers need them, and to use whatever designs allow them to adapt to the best and most cost-effective use of local conditions and resources.




It is important for the data center industry to be aware of and follow the changes in the upcoming 4th edition of the TC 9.9 Thermal Guidelines and the finalization of the 90.4 and 90.1 standards in 2016. As has been discussed, the recommended (and allowable) operating ranges have expanded as IT equipment has become tolerant of much broader ranges than previous generations of hardware. However, while historic industry practices and perceived risk have inhibited many enterprise data centers from increasing their operating temperatures, this is changing.

Nevertheless, all IT manufacturers have a vested interest in continuing to develop energy-efficient hardware that is even more environmentally tolerant, to minimize or eliminate the need for mechanical cooling. Greater understanding and interaction between the facilities and IT operational management domains benefits everyone: together they can ensure that more power is used for IT hardware and less for cooling the facility. This allows the data center to install more IT equipment and computing capacity without increasing the total power used by the site, which is the ultimate energy efficiency goal.

Biography

Julius Neudorfer, DCEP

Julius Neudorfer is the CTO and founder of North American Access Technologies, Inc. (NAAT). Based in Westchester, NY, NAAT's clients include Fortune 500 firms and government agencies. NAAT has been designing and implementing data center infrastructure and related technology projects for over 25 years. He also developed and holds a patent for high-density cooling.

Julius is a member of AFCOM, ASHRAE, IEEE and The Green Grid. In addition, he is an instructor for the US Department of Energy "Data Center Energy Practitioner" (DCEP) program. Julius has written numerous articles and whitepapers for various IT and data center publications and has delivered seminars and webinars on data center power, cooling and energy efficiency.

Compass Datacenters delivers high-end, certified, dedicated data centers faster, with more features and personalization options at a lower cost than competing alternatives, anywhere our customers want them. For more information, visit www.compassdatacenters.com.

Data Center Frontier charts the future of data centers and cloud computing. We write about what's next for the Internet, and the innovations that will take us there. The data center is our prism. We tell the story of the digital economy through the facilities that power the cloud and the people who build them. In writing about data centers and thought leaders, we explain the importance of how and where these facilities are built, how they are powered, and their impact on the Internet and the communities around them. For more information, visit www.datacenterfrontier.com.
