You asked, we answered: Top Data Center Cooling Questions
Our most recent webinar focused on the application of 5 energy conservation measures (ECMs) to optimize data center cooling. In the colocation case study, we discussed the application of these ECMs to the data center and how each one contributed to optimizing cooling. To guide our interactive discussion, we asked our audience to list their top data center cooling questions, and today we’re sharing those answers with you.
Why optimize airflow?
In a data center, air removes the heat generated by the electronic components in the IT equipment, which is what cools the equipment. All IT equipment has operating specifications for temperature and humidity to ensure proper operation. The intent of “optimizing airflow” is to ensure adequate airflow is being provided to maintain thermal conditions at a consistent level so that the IT equipment will operate within appropriate manufacturers’ specifications.
A common misconception is that by providing lots of cooling and airflow, the IT equipment is less likely to fail due to a thermal event resulting in conditions outside the manufacturer’s specifications. In actual fact, providing too much airflow by operating more cooling systems than what is required to meet the cooling needs of the IT equipment often results in less airflow being delivered to the IT equipment. As well, the energy costs to provide the excess cooling are unnecessarily high.
On the other hand, inadequate airflow can cause equipment failure due to overheating. Supplying airflow at too low a temperature requires cooling systems to work excessively, resulting in high energy bills and more wear and tear on the cooling equipment.
Optimizing airflow means following good airflow management practices and knowing what the IT heat load is and how much airflow should be delivered. Doing so improves both the operation of the IT equipment and the efficiency of the cooling system. There are a number of benefits to optimizing airflow:
- Cooling systems will operate more energy efficiently
- There is less risk of downtime due to thermal events
- The cooling capacity of the data center can be maximized, enabling additional IT load to be added without incurring the cost of more, very expensive cooling units.
How do you reduce or remove hot spots?
Hot spots occur for a number of reasons. However, they are very seldom, if ever, due to insufficient cooling capacity. Ironically, most data centers encounter hot spots due to too much cooling and poor airflow management practices.
Hot spots are defined as air temperature in the cold aisle being close to or above the ASHRAE recommended maximum inlet temperature of 27°C. IT equipment receiving inlet air at these temperatures can be prone to failure or shutdown due to internal electronic components overheating. Poor airflow management practices can result in the hot exhaust air recirculating to the inlet side of the rack and mixing with the cooler supply air, so that the inlet air temperature is too high to provide adequate cooling.
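As a rough illustration of that definition, the sketch below flags racks whose inlet temperature is close to or above 27°C. The function name, the rack names, the sample readings and the 2°C "close to" margin are all assumptions made for the example; they are not values from the webinar.

```python
# Upper end of the ASHRAE recommended inlet temperature cited in the text.
ASHRAE_RECOMMENDED_MAX_INLET_C = 27.0


def find_hot_spots(inlet_temps_c, margin_c=2.0):
    """Return the racks whose inlet temperature is within margin_c of the limit, or above it.

    inlet_temps_c maps a rack name to its measured inlet temperature in Celsius.
    margin_c is an assumed definition of "close to" the limit.
    """
    threshold_c = ASHRAE_RECOMMENDED_MAX_INLET_C - margin_c
    return sorted(rack for rack, temp_c in inlet_temps_c.items() if temp_c >= threshold_c)


# Hypothetical readings, for illustration only.
readings = {"rack-A1": 22.5, "rack-A2": 25.4, "rack-B1": 28.1}
print(find_hot_spots(readings))  # ['rack-A2', 'rack-B1']
```

A flagged rack is a prompt to check airflow management around it, not a sign that more cooling capacity is needed.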
The primary intent of good airflow management is to separate the cool supply air from the warm exhaust air as much as possible. To achieve this the following practices should be followed:
- ensure all empty spaces in the racks are filled with blanking panels
- fill or close off spaces between racks to keep exhaust air from recirculating to the front of the rack
- use perforated tiles in front of racks that are properly sized relative to the rack heat load
- place IT equipment properly in racks – no equipment blowing hot exhaust air into the cold aisle
- avoid excess use of high flow rate perforated tiles
- do not put perforated tiles in the hot aisle
- ensure cooling capacity is adequate, but not excessive, for the data center heat load
- avoid placing obstructions in the supply plenum
How do you determine how much cooling you really need in the data center?
The amount of cooling capacity required in a data center is dependent on the amount of heat being generated. Calculating cooling capacity is not an exact science. Other conditions, such as placement of cooling units and IT equipment racks, depth of supply plenum, ceiling height and rack heat distribution, all influence how much cooling capacity is required.
Cooling systems circulate enough cool air to remove the heat generated by the IT and other equipment. The largest source of heat is the IT equipment, accounting for 90 to 95% of total heat load. Other equipment consuming electricity, such as lighting, power distribution systems, fans on cooling systems, etc., also generates heat. The heat from these sources is low, typically less than 10% of overall heat generated, but should be taken into account. (Learn more about how this works in “Debunking Data Center Cooling Myths: You Can Cool Better with Less Equipment”.)
The easiest way to determine IT heat load is to read the values on the data center power plant supplying electricity to all the IT equipment. This could be done at the UPS level, or at the power distribution units (PDUs). For each kW of electricity consumed by the IT equipment, one kW of heat is being generated and will require cooling. By knowing the total kW of heat being generated, the required cooling capacity can be calculated. To allow for the potential failure of a cooling system, capacity should be sized to N+1, meaning one more cooling unit than the heat load requires.
All cooling equipment has spec sheets that list the cooling capacity, typically in kW and BTU/h (British Thermal Units per hour). As a guide in estimating required cooling capacity, a one to one ratio of IT kW heat load to a kW of cooling can be used. This simple calculation will give you an estimate of how well aligned the cooling capacity is with the IT requirements.
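The estimate described above can be sketched in a few lines. This is only a first-pass calculation under stated assumptions: the PDU readings and the 60 kW unit capacity are made-up inputs, the function and field names are hypothetical, and the 90% IT share is the low end of the 90 to 95% range mentioned earlier. The kW to BTU/h factor is the standard conversion.

```python
import math

KW_TO_BTU_PER_HOUR = 3412.14  # 1 kW is roughly 3,412 BTU/h


def estimate_cooling(pdu_readings_kw, unit_capacity_kw, it_share_of_heat=0.90):
    """First-pass cooling estimate from PDU readings.

    Assumes 1 kW of IT power becomes 1 kW of heat, and that IT equipment
    accounts for it_share_of_heat of the total heat load.
    """
    it_load_kw = sum(pdu_readings_kw)
    total_heat_kw = it_load_kw / it_share_of_heat
    units_needed = math.ceil(total_heat_kw / unit_capacity_kw)
    return {
        "it_load_kw": it_load_kw,
        "total_heat_kw": total_heat_kw,
        "total_heat_btu_per_hour": total_heat_kw * KW_TO_BTU_PER_HOUR,
        "units_needed": units_needed,
        "units_with_n_plus_1": units_needed + 1,  # one extra unit for redundancy
    }


# Hypothetical site: three PDUs and cooling units rated at 60 kW each.
result = estimate_cooling([40.0, 35.0, 15.0], unit_capacity_kw=60.0)
for name, value in result.items():
    print(f"{name}: {value:,.1f}")
```

Comparing `units_with_n_plus_1` against the number of units actually running gives a quick indication of whether a site is over-cooled.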
To determine more accurately how much cooling is needed, a much more thorough analysis of cooling system operation, IT load and site conditions is required. This is an integral part of the SCTi Audit process, which then enables us to accurately define the level of available cooling capacity and airflow in a site.
Additional Cooling Capacity Factors You Must Consider
Calculating the required cooling capacity is also influenced by a number of other conditions.
Airflow is a very important consideration in ensuring the cooling requirements of the IT equipment can be met. Cooling capacity only defines the value of mechanical cooling available. However, if airflow is not well managed the cooling units will not be providing the level of cooling expected. Poor airflow management has a number of negative impacts. First, it means the supply air may not be delivered to the IT equipment as expected. Secondly, without good separation of the cool supply air and warm exhaust air, the return air temperature is lowered, causing the cooling unit to operate less efficiently. Thirdly, the mixing of the cool supply air and warm exhaust air will cause the data center operators to think more cooling capacity is required.
In a raised floor data center, too much airflow will result in a high pressure differential between the supply plenum and data center space, which will reduce the airflow through the perforated tiles. Airflow exiting the cooling system is at a high velocity, which will result in poor airflow, even negative airflow, through the perforated tiles that are within a few feet of the cooling unit.
The placement of cooling units is an important factor. Air takes the path of least resistance. A cooling unit close to the IT equipment will end up working a lot more than a cooling unit that is far away from the heat source. In these cases, it can appear as though additional cooling capacity is required when, in actual fact, the cooling system as a whole is operating at 50% of its capacity or less.
Too much cooling results in a low return air temperature to the cooling unit, or what is referred to as a low delta temperature (ΔT) between the supply air temperature and return air temperature. This causes cooling units to be very energy inefficient, reducing cooling capacity by 25% or more, and not producing the level of cooling capacity specified.
External factors, such as ambient temperature, can have some influence on heat levels in a data center; however, in the Canadian market, this is not a major issue. It has also been shown that, in cases where this could be a factor, the size of the data center is important – smaller data centers with less total heat load can be impacted more by ambient conditions than larger data centers with high heat loads.
Personnel in a data center were a consideration years ago when there might have been offices with staff dedicated to the data center. Even then, humidity was a major concern. Permanent offices are a thing of the past, and in most cases, personnel working in the data center are limited in number and are only present long enough to perform the necessary tasks.
How do you make cooling systems work together?
In a data center with multiple cooling units operating independently, it is not uncommon to have units “in conflict” or “fighting” with each other. In these cases, units may have their temperature set points at different levels, so one unit is cooling, and another unit is not doing any cooling or, worse yet, is sending warm exhaust air directly back into the supply plenum or ducting without being cooled. Conflict, in the form of one unit humidifying while a second unit is dehumidifying, is not unusual if units are operating independently. This is a very energy-expensive operation.
Networking of cooling units is quite straightforward and will help avoid the issues noted above. There are a variety of ways to network cooling units, depending on the vintage and type of unit. The benefits of networking include:
- Cooling units can be sequenced so only the required amount of cooling is being generated – this is a significant energy saving step
- Creating a fail-over mode so if a unit fails, a standby unit turns on
- Networked alarms, so each unit knows the operating state of the other units
- Units will not “fight” with each other, significantly reducing energy costs
- Units may operate less, resulting in less maintenance being required.
- Cooling capacity of the data center may be increased, allowing for more IT load without having to add more cooling units.
How the networking is established is dependent on the types of cooling units and vintage. Newer units may only require cabling between each unit, while older models will require a network switch for communications to be installed.
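The sequencing and fail-over logic can be pictured with a small sketch. It is illustrative only: in practice this logic lives in the cooling unit controllers, and the function name, unit names, capacities and loads below are all hypothetical.

```python
import math


def plan_sequence(unit_names, unit_capacity_kw, heat_load_kw, failed=()):
    """Pick the minimum number of healthy units to run and keep the rest on standby.

    Assumes every unit has the same capacity; failed lists units that are out of service.
    """
    available = [name for name in unit_names if name not in failed]
    needed = math.ceil(heat_load_kw / unit_capacity_kw)
    if needed > len(available):
        raise RuntimeError("not enough healthy units for the heat load")
    return {"running": available[:needed], "standby": available[needed:]}


units = ["CRAC-1", "CRAC-2", "CRAC-3", "CRAC-4"]

# Normal operation: only as many units as the load requires are running.
print(plan_sequence(units, unit_capacity_kw=60.0, heat_load_kw=100.0))

# Fail-over: CRAC-1 is out of service, so a standby unit takes its place.
print(plan_sequence(units, unit_capacity_kw=60.0, heat_load_kw=100.0, failed=("CRAC-1",)))
```

A real sequence would usually also rotate which units run so that runtime hours are shared evenly; that is left out here to keep the sketch short.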
What should you do before raising temperature setpoints?
Raising cooling unit temperature setpoints can be a very effective method for improving cooling unit efficiency and an opportunity for energy savings. However, it is imperative the space and its operations, down to the rack or server level, are understood prior to increasing the setpoints.
Poor airflow management practices leading to hotspots or server air recirculation will be magnified at a higher setpoint and can lead to equipment failures. We always prioritize airflow management so that these underlying issues are identified and resolved before any setpoints are adjusted.
When raising temperature setpoints, a good practice is to monitor the rack intake air temperatures (IAT) on an ongoing basis. ASHRAE recommendations or the equipment manufacturer’s specifications can help quantify how far the temperature setpoints can be raised. Changes to the setpoints should be incremental and gradual to allow the site to balance out. Once the setpoints have been increased, verify the rack IAT is in compliance with any site constraints.
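One way to picture the incremental approach is a rule that raises the setpoint by a small step only while every rack still has headroom below the IAT limit. The sketch below is an assumption-laden illustration, not a control procedure: the function name, the step size, the headroom, the target and the sample readings are invented for the example, and the 27°C limit is the ASHRAE figure cited earlier.

```python
def next_setpoint(current_setpoint_c, rack_iat_c, target_c,
                  iat_limit_c=27.0, headroom_c=1.5, step_c=0.5):
    """Return the setpoint to apply next: one small step up, or hold.

    The setpoint is held if the hottest rack intake air temperature (IAT)
    is already within headroom_c of iat_limit_c.
    """
    hottest_iat_c = max(rack_iat_c)
    if hottest_iat_c > iat_limit_c - headroom_c:
        return current_setpoint_c  # hold and investigate airflow first
    return min(current_setpoint_c + step_c, target_c)


# Hypothetical readings taken after the site has settled at the current setpoint.
print(next_setpoint(20.0, [22.0, 23.5, 24.8], target_c=24.0))  # 20.5
print(next_setpoint(20.0, [22.0, 23.5, 26.0], target_c=24.0))  # 20.0
```

Each call represents one adjustment; the readings should be taken again once the site has balanced out before the rule is applied a second time.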
How do you prepare your data center for high density IT heat load?
Servers are constantly improving, becoming smaller and more powerful, which leads to higher density IT loads. Many data center owners automatically turn to installing brand new, best-in-class cooling units to accommodate this high-density IT equipment. However, purchasing new cooling units is often unnecessary and should be the last resort. Many spaces already have the necessary cooling infrastructure in place to handle high density IT loads – it has just not been optimized, meaning you are not getting as much cooling out of the system as you think. And if the data center was encountering cooling issues previously, the addition of more cooling will only exacerbate those problems.
In preparing for a high density IT load, a clear understanding of the total new heat load is required. The actual power draw of IT equipment is typically only 50 to 60% of its nameplate power specification. This means if the manufacturer’s specifications list the server as 3 kW power draw, in reality, the maximum power draw will be no more than 1.8 kW and more likely 1.5 kW. This translates directly to much less heat being generated. If the plan is to install 30 new servers, the total maximum heat generated will be in the 45 to 54 kW range, not the 90 kW value that would be implied from the spec sheets.
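The arithmetic in this example is easy to reproduce for other server counts. In the sketch below, the 50% and 60% fractions come from the 1.5 kW and 1.8 kW figures for a 3 kW nameplate; the function name is hypothetical.

```python
def expected_heat_range_kw(server_count, nameplate_kw,
                           low_fraction=0.5, high_fraction=0.6):
    """Return the (likely, maximum) heat load in kW for a batch of servers.

    Assumes actual draw is between low_fraction and high_fraction of nameplate,
    and that every kW drawn becomes a kW of heat.
    """
    nameplate_total_kw = server_count * nameplate_kw
    return nameplate_total_kw * low_fraction, nameplate_total_kw * high_fraction


likely_kw, maximum_kw = expected_heat_range_kw(30, 3.0)
print(f"{likely_kw:.0f} to {maximum_kw:.0f} kW")  # 45 to 54 kW
```

Measured power draw from comparable equipment already on site, where available, is a better input than these fractions.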
It is important to quantify the available cooling capacity in the space. Doing so will indicate the amount of additional load the site can handle. However, to take advantage of this available cooling capacity, good airflow practices must be followed to maximize and optimize cooling capacity and to ensure the existing cooling is performing at maximum efficiency. Industry-standard guidelines are for 100-150 CFM per kW of IT load depending on the ΔT across the server. High-density equipment should be placed in areas with adequate supply airflow at the intake and ideally the shortest path for the exhaust air to reach the cooling unit, but not placed closer than 8 feet from the cooling unit (as explained above).
Aisle containment, high-velocity grate tiles, and directional tiles can also be considered to accommodate high density loads. However, each of these solutions requires additional considerations, including how it will impact the overall operation of the data center.
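To show how the 100-150 CFM per kW guideline relates to ΔT, the sketch below computes both the guideline range and an estimate from the standard-air sensible heat relation, BTU/h = 1.08 × CFM × ΔT (°F). That relation and its 1.08 factor are not from the webinar; they are a common approximation that assumes air at roughly sea-level density. The function names and the 54 kW load are example inputs.

```python
KW_TO_BTU_PER_HOUR = 3412.14  # 1 kW is roughly 3,412 BTU/h


def airflow_range_cfm(it_load_kw, cfm_per_kw_low=100.0, cfm_per_kw_high=150.0):
    """Airflow range implied by the 100-150 CFM per kW guideline."""
    return it_load_kw * cfm_per_kw_low, it_load_kw * cfm_per_kw_high


def airflow_from_delta_t_cfm(it_load_kw, delta_t_f):
    """Airflow estimate from the sensible heat relation for standard air.

    Assumption: BTU/h = 1.08 * CFM * delta_t_f, valid near sea-level air density.
    """
    return it_load_kw * KW_TO_BTU_PER_HOUR / (1.08 * delta_t_f)


low_cfm, high_cfm = airflow_range_cfm(54.0)
print(f"Guideline range: {low_cfm:.0f} to {high_cfm:.0f} CFM")
print(f"Estimate at a 25°F server ΔT: {airflow_from_delta_t_cfm(54.0, 25.0):.0f} CFM")
```

A larger ΔT across the server means less airflow is needed per kW, which is why the guideline is a range and not a single number.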
How do you defer capital spend on new cooling systems?
Installing new cooling systems comes with a huge price tag and disruption to data center operation and should only be considered as a last resort. The SCTi Cooling Optimization Program can allow you to defer capital spend on new cooling units by increasing the available capacity and extending the service life of existing cooling units. By optimizing the space through Energy Conservation Measures, such as Airflow Management, CRAC sequencing, CRAC networking and Operational Settings Management, the runtime hours of the CRAC units can be reduced and the units will operate more efficiently, while the site’s available capacity improves. The efficiency of the CRAC units can be further improved through a Cooling Tech Upgrade. Therefore, installing new cooling systems should only be considered as an End of Life replacement or a requirement for expansion.
Data centers are very dynamic environments, each with their own unique issues that require customized solutions. SCTi understands this, which is why we have multiple energy conservation measures that can be implemented to suit each individual client’s needs.
Still have questions? Watch the webinar “5 Energy Conservation Measures To Optimize Data Center Cooling” to explore data center cooling optimization techniques from four leading experts.