N+1 Air Flow Capacity – Is it something I should be concerned about?
The data center world is all about uptime. For many companies, every minute of downtime can cost tens of thousands of dollars in lost revenue.
To ensure uptime, the two most important elements are power and cooling. Having a resilient power system means duplicating feeds to critical equipment in case one feed goes down. Similarly, in cooling, N+1 is the commonly used term for having one more cooling system than the load requires, so that if one cooling system goes down the remaining systems still provide sufficient cooling to meet the cooling needs.
Although N+1 power can be easily defined and designed, effective N+1 air flow capacity is not as easy.
A common misconception is that if there is N+1 cooling capacity on paper then there will be N+1 air flow capacity, which on a basic level is correct. The question to be asked though is, where is that air flow being delivered? Cooling systems are only effective if they are delivering supply air to where it is needed and the hot exhaust air from IT equipment can make its way back to the cooling system. Having the cooling capacity located a distance away from the majority of heat load is a clear formula for failure, even though on paper N+1 or more cooling capacity may exist.
In designing data centers, the required cooling capacity is calculated based on the expected overall IT load. An ideal set of conditions is assumed, such as evenly distributed IT load and no obstructions in the supply plenum, and architectural design is seldom, if ever, taken into account. Generally, cooling systems are located where it is most convenient from an installation perspective.
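As a rough illustration of the paper calculation described above, the following Python sketch checks whether the remaining cooling units can carry the overall IT load after the largest unit is lost. The function name and all figures are hypothetical and chosen only for illustration. Note that the check knows nothing about where the air is actually delivered.

```python
def has_paper_n_plus_1(unit_capacities_kw, it_load_kw):
    """Return True if the units that remain after losing the largest
    single unit can still carry the overall IT load (on paper)."""
    if not unit_capacities_kw:
        return False
    # Worst case for a single failure: the largest unit goes down.
    remaining_kw = sum(unit_capacities_kw) - max(unit_capacities_kw)
    return remaining_kw >= it_load_kw


# Hypothetical room: four 100 kW cooling units and a 280 kW IT load.
units_kw = [100.0, 100.0, 100.0, 100.0]
it_load_kw = 280.0

if has_paper_n_plus_1(units_kw, it_load_kw):
    print("N+1 cooling capacity exists on paper")
else:
    print("Not N+1 on paper")
```

A room can pass this check and still have areas that are starved of supply air, which is the gap the rest of this article is concerned with.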
As a data center matures, the original design guidelines are very seldom followed. IT equipment and heat load are not evenly distributed throughout the room, resulting in heat load in one area being substantially higher than in another. Equipment may not be in neat rows, and obstructions begin to appear in the supply and return plenums. A walk-through of the hot aisle will show the variations in temperature of the exhaust air from the IT equipment. Higher temperature air being exhausted means more cool air is required on the inlet side for appropriate cooling, but is that supply air being delivered, or is the IT equipment pulling warm air from elsewhere in the room? In the event of a failed cooling unit, air flow to the higher density area may not be sufficient to cool the IT equipment, even though the N+1 Cooling Capacity criteria may still be met.
What happens then? Air recirculation occurs. Cool air is needed to keep the internal server components from overheating, and if there is not sufficient cool supply air available, the IT equipment will draw air from other sources, such as hot exhaust air from the back of the rack or from adjacent servers. This is air recirculation. The net result is high inlet air temperatures that may cause equipment failure. A paper exercise of calculating N+1 Air Flow capacity will not uncover this potential problem.
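To make the point concrete, here is a minimal Python sketch of a room that has enough total air flow with any one unit failed, yet leaves one area short. The unit names, zone names and CFM figures are all hypothetical. The sketch also assumes, as a simplification, that the surviving units keep delivering the same air flow to each zone after a failure. In a real room the air flow redistributes, which is why measurement or modeling is needed.

```python
# Hypothetical air flow (CFM) that each cooling unit delivers to each zone.
delivery_cfm = {
    "CRAC-1": {"zone_a": 6000, "zone_b": 1000},
    "CRAC-2": {"zone_a": 5000, "zone_b": 2000},
    "CRAC-3": {"zone_a": 1000, "zone_b": 6000},
}
# Hypothetical air flow (CFM) that the IT equipment in each zone needs.
required_cfm = {"zone_a": 7000, "zone_b": 6000}


def shortfalls_on_failure(delivery, required):
    """For each single-unit failure, return the zones whose delivered
    air flow falls below demand, with the size of the shortfall in CFM."""
    results = {}
    for failed in delivery:
        short = {}
        for zone, need in required.items():
            supplied = sum(
                flows.get(zone, 0)
                for unit, flows in delivery.items()
                if unit != failed
            )
            if supplied < need:
                short[zone] = need - supplied
        results[failed] = short
    return results


for unit, short in shortfalls_on_failure(delivery_cfm, required_cfm).items():
    print(unit, "failed:", short if short else "all zones supplied")
```

In this example any two units together deliver 14,000 CFM against a total demand of 13,000 CFM, so the room is N+1 on paper. Even so, losing CRAC-1 leaves zone_a short by 1,000 CFM and losing CRAC-3 leaves zone_b short by 3,000 CFM.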
N+1 Air Flow Capacity issues can occur for other reasons. One is the use of perforated tiles with substantial openings, such as grate tiles with 56% opening. We see these used to “cool” high heat density racks. There are a number of problems associated with their use. First, the velocity of air exiting these tiles can be so high that the IT equipment cannot pull the cool air into the inlets. Instead, recirculated air will be drawn from the back of the rack or from the exhaust of adjacent servers. Blanking panels will reduce this type of air recirculation but do not eliminate the problem of high velocity air plumes bypassing the servers, as the air rushes to the ceiling and returns to the cooling unit without serving any useful purpose. Second, the high volume air plumes steal air supply from other tiles. In normal operation, enough air may be available from the regular tiles to meet the cooling needs. However, if air flow drops due to the failure of a cooling unit, the air flow through the regular tiles will also drop, possibly to the point of not providing adequate supply air, resulting in another case of air recirculation.
So how do you know if you have N+1 Air Flow Capacity?
By doing a Cooling Audit, air flow problems can be identified. Measuring the air flow of each tile creates a profile of the air being delivered through the perforated tiles versus other openings in the floor. It also shows where the air is being delivered and can be used to determine the best placement for higher density equipment.
Real-time temperature sensors are deployed throughout the data center to develop a profile of temperature conditions, which is then correlated with rack power draw to identify where air flow is being wasted.
By correlating the tile air flow and rack inlet temperature data with rack heat load information, the mapping will provide direction on what changes should be made.
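The correlation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the method used in an actual audit. It uses the common rule of thumb for air near sea level, CFM ≈ 3.16 × watts / ΔT (°F), to estimate the air flow a rack needs. The rack data, the 20 °F temperature rise, the 80.6 °F (27 °C) inlet limit and the oversupply ratio of 2 are all assumed values.

```python
def required_cfm(power_watts, delta_t_f=20.0):
    """Estimate the air flow (CFM) needed to remove a heat load, using the
    rule of thumb CFM = 3.16 * watts / delta-T (deg F) for sea-level air."""
    return 3.16 * power_watts / delta_t_f


def classify(rack, delta_t_f=20.0, max_inlet_f=80.6, oversupply_ratio=2.0):
    """Compare measured tile air flow and inlet temperature with rack load."""
    need = required_cfm(rack["power_w"], delta_t_f)
    ratio = rack["tile_cfm"] / need
    if ratio < 1.0 or rack["inlet_f"] > max_inlet_f:
        return "undersupplied"  # likely drawing recirculated air
    if ratio > oversupply_ratio:
        return "oversupplied"   # air flow is probably being wasted here
    return "ok"


# Hypothetical audit readings: rack power draw, tile air flow, inlet temp.
racks = [
    {"rack": "A01", "power_w": 4000, "tile_cfm": 900, "inlet_f": 72.0},
    {"rack": "A02", "power_w": 8000, "tile_cfm": 700, "inlet_f": 84.5},
    {"rack": "B01", "power_w": 1000, "tile_cfm": 800, "inlet_f": 66.0},
]

for rack in racks:
    print(rack["rack"], classify(rack))
```

With these sample readings, rack A02 is flagged as undersupplied and rack B01 as oversupplied, which suggests moving tile air flow from B01 toward A02. The thresholds would need to be set for each site.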
Air flow modeling using Computational Fluid Dynamics (CFD) software can also be used to show air flow patterns. It is especially useful for seeing how proposed changes, such as air containment or a reconfigured perforated tile layout, could improve the air flow. A cooling system failure and the resulting drop in air flow can be easily simulated in the model without disrupting data center operation. By deploying effective air flow management, it is quite likely that some cooling units can be put in standby mode, significantly reducing energy costs.
Creating resilient data center cooling requires effective air flow management. It does not require running 3-4 times more cooling capacity than needed.
Once the cooling and air flow have been optimized, how can that level be maintained?
IT equipment will be added and other equipment taken out, so the environment is constantly changing. The Ekkosense Critical system is an effective tool to ensure site optimization is maintained and downtime due to thermal issues is avoided. Maximizing data center capacity, uptime and energy efficiency should be the objective of all organizations. The cost of building a new facility, adding more cooling to an existing data center or having to turn away business in the co-lo world is extremely high. By having detailed insight into what is happening in the data center, IT capacity can be maximized with the assurance that the site has N+1 Air Flow Capacity.
Give us a call to discuss how we can help ensure N+1 Air Flow is available, resulting in the higher uptime that is crucial to meeting the needs of your business and clients.