March 2011 Interview
The Do’s and Don’ts of Hot Aisle/Cold Aisle Containment
Data Center Containment has become widely accepted as one of the foremost strategies to improve energy efficiency in the server room. While manufacturers offer a wide range of hot aisle and cold aisle containment solutions, sifting through the myriad of options and choosing the right containment strategy can be a daunting task.
Scot Heath, 42U’s CTO, compares the benefits of different containment approaches, discusses the do’s and don’ts, and helps you determine which is best for your needs.
This interview covers the following hot aisle / cold aisle containment topics:
- Containment Basics
- Hot Aisle vs. Cold Aisle Containment
- Airflow Strategies
- Half Truths and Pitfalls
Read the Full Transcription
Tanisha White: Ladies and gentlemen, thanks for standing by, and welcome to today’s session in the 2011 42U web seminar series. My name is Tanisha White and I’ll be your moderator for The Dos and Don’ts of Hot Cold Aisle Containment. At this time, I’d like to welcome Scot.
Scot Heath: Thank you, Tanisha, and thank you, everyone for attending. Today I’d like to talk of course about containment. We’re going to go over a few cooling basics, probably dispel some myths along the way, so the half truths and pitfalls aren’t going to necessarily wait until the end. We’ll talk about the differences between hot and cold aisle containment and then if we have time, we’ll certainly entertain questions, at the end.
So, a quote from TechTarget is that ten years ago, you know, most data centers ran 500 watts to a kilowatt per rack, if you even had racks. We were still kind of in the era of having things on desks. Today’s densities can get to 20 kilowatts per rack and beyond. I, in fact, ran some numbers the other day. G7s are a fairly new generation, HP’s got a power advisor for the C7000, which is a pretty dense box, and I actually got almost up to a kilowatt per U. So certainly, you can outstrip the cooling capacity of nearly any data center out there.
I’d like to spend a little bit of time talking about evolution there, referencing that TechTarget quote. So, I came up with a few acronyms. RSE I call that, the random stuff everywhere. And I’m sure many of you have seen that. It was the days when computer centers were really big mainframes, big tape drives and there were printers and paper in there and lots of things that we don’t encourage today. And every small business out there, you know, they had desk tops sitting around, and things were generally not really organized in any fashion with respect to airflow and heat and cooling. In fact, the thinking at that time was this was comfort cooling, right. In your house you have a thermostat, you’d like to set that thermostat and no matter where you go in the house, ideally the temperature’s the same.
Well that’s the way it used to be in data centers. And then along came this thinking of HACAA, which is the hot aisle, cold aisle age, where we began to realize that it was not prudent to have the same temperature everywhere because the machines themselves certainly don’t expect that, right? They take air in on the inlet side and that’s really the temperature that’s critical. They depend on a range of temperatures there as specified by the manufacturer, typically up to 90, 95 degrees Fahrenheit. And the exhaust at the outlet depends on how much heat load is in there, and as things have progressed along, of course, the algorithm for running the fans in each machine has become quite a bit more sophisticated, so the temperature of the air going out the back can certainly be quite high.
And having the air find its way back around to the front is a problem. So this notion of lining up the aisles so we only supply cold air to the cold side and the hot air goes out the hot side is a good thing. However, it’s not strictly contained there and it does sneak around somewhat. So finally, a lot of current thinking is that close-coupled cooling, or CCC, will be much more prevalent going forward. And the real advantage here is, of course, that you move the air that it takes to cool the equipment less distance. So fans take energy. The closer we can keep the air to the actual cooling coils in the cooling device, the less energy it takes.
Containment’s kind of an offshoot of this. So is in-rack cooling: it’s definitely close-coupled and it is 100% contained. I’m not going to talk about liquid cooling today, that’s kind of a whole different animal and does not fall in the realm of containment here.
So, cooling equipment characteristics. Let’s talk a little bit about the cooling devices that you’ve got in your data center. By and large they are one of two types: either direct expansion or water. And that really speaks to the medium that we use to exchange heat to the outside. Now every one of these has got a compressor in it someplace, so in the DX systems, the compressor actually expands a gas, whatever it happens to be, directly into a coil that’s in the airflow.
So the air heats that coil and in the process becomes cooled. Some characteristics of these machines are their fixed fan speed, so you don’t have variability there, because the temperature of that expanded gas is quite low and the danger of having condensation and icing is always there. So a minimum amount of airflow is provided such that that doesn’t occur. With water, you have a little bit more flexibility. It’s an intermediate transfer mechanism. Eventually that water’s gonna make its way out to a chiller, which is nothing more than a compressor with refrigerant, just like the compressor in the DX machine, but there’s an intermediate exchange.
We go air to water and then water to that refrigerant. And both of them reject their heat to the outside, to the atmosphere. And that can be by a variety of means: another water loop, or in the case of DX it could be directly air cooled condensers, so on and so forth. But the point I really wanted to chat about here is we have limited flexibility here in the amount of energy it takes to cool our data center. So we have a lot of misconceptions out there about, you know, what containment can do, what blanking panels can do, what under floor baffling can do. And I wanted to spell out right now, with this statement, that unless we change the total airflow or the temperature set point, the capacity and efficiency of our cooling equipment remain constant.
It does not change. The temperature that comes back is completely dependent on the amount of air I put out there and the total heat load in the data center. I can run the air around any way I want in the data center, twice through the machines, whatever the case may be. The total amount that comes back then mixes with the bypass that’s there, assuming we have an oversupply, and the temperature of the air that comes back is constant no matter what happens to the air when it was on the floor.
Now per machine it may vary a little bit, but if you look at the total cooling system as a whole, it’s a constant. So I’d like to take a couple questions now if we have any, Tanisha?
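[Editor's sketch] The claim that the return temperature depends only on total airflow and total heat load can be checked with a simple energy balance. The following is a minimal sketch, not something shown in the webinar: it assumes standard air properties, a steady state, and a single well-mixed return stream, and it ignores recirculation into the IT inlets. The function name and the 100 kW / 10 m³/s figures are hypothetical.

```python
# Assumed air properties near sea level and room temperature.
AIR_DENSITY = 1.2           # kg/m^3
AIR_SPECIFIC_HEAT = 1005.0  # J/(kg*K)


def return_delta_t(heat_load_w, total_airflow_m3s, bypass_fraction):
    """Temperature rise (K) of the mixed return air over the supply air.

    bypass_fraction is the share of supplied air that never passes
    through IT equipment (0 <= bypass_fraction < 1).
    """
    if not 0.0 <= bypass_fraction < 1.0:
        raise ValueError("bypass_fraction must be in [0, 1)")
    it_airflow = total_airflow_m3s * (1.0 - bypass_fraction)
    # Rise across the IT equipment: all of the heat goes into this stream.
    it_rise = heat_load_w / (AIR_DENSITY * AIR_SPECIFIC_HEAT * it_airflow)
    # Bypass air returns at supply temperature (zero rise), so the
    # flow-weighted mix is the IT rise scaled by its share of the flow.
    return (1.0 - bypass_fraction) * it_rise


if __name__ == "__main__":
    # Hypothetical room: 100 kW of IT load, 10 m^3/s of supplied air.
    for bypass in (0.0, 0.2, 0.5):
        rise = return_delta_t(100_000, 10.0, bypass)
        print(f"bypass {bypass:.0%}: return delta T = {rise:.2f} K")
```

Under these assumptions the bypass fraction cancels out, so every line reports the same rise; only changing the total airflow or the load moves it.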
Tanisha: Great, we have a question here. Doesn’t containment affect the Delta T of my CRAC?
Scot Heath: Yeah, so that’s exactly my point here. No, it does not. You know, it enables you to do some things, and the things that you can do are change the set point of your CRAC or, if you have a water-cooled air handler, you could potentially change the airflow if you have some means to do that, EC fans or variable frequency drives, but in and of itself it does not change the Delta T of the CRAC. Probably have time for one more.
Tanisha White: Great, what about redundancy?
Scot Heath: So redundancy is certainly a prudent thing to have, and in an open environment, redundancy is very flexible. We tend to think of, especially with under floor delivery, we pressurize that plenum under the floor and it’s like many sources. If one source goes away, there’s some pressure differential, but with a properly designed data center, it’s not enough to cause any of the machines to go out of regulation and cause a cooling issue on the floor. With containment, it depends on the scheme used. Close-coupled cooling is probably the biggest concern here since we are limiting the airflow to a minimum set of equipment. With close-coupled cooling then, we have to make sure that we have redundancy in each pod or enclosure or whatever your designation happens to be.
So I talked a little bit about high density equipment already, characteristics. We already chatted about how dense this stuff can be and, you know, would it be prudent to install four of these in a 42U rack? No, it would not. You know, if you really have that much load, which in and of itself is a very difficult proposition, you know, this is at the maximum load on the processors. This is full of memory, this is, you know, as bad as I could possibly make it, I could get close to 1,000 watts per U, but even half that for most data centers is quite a challenge. On top of that, they make it more challenging for the designer, and especially the implementer, of containment because the airflow varies significantly through these machines based on external conditions as well as internal loads.
Lots and lots of sensors looking at temperature compliance. You know, the goal is to keep the worst-case temperature in there at a point that makes the critical components reliable. Anything more than that is a waste of energy and, as good equipment designers, the smart guys that make these are now quite proud of the fact that they minimize that fan energy. And that’s a big deal: the fan curve for increased airflow is a cubic curve, the relation between power and airflow. So if I need to flow twice as much air because I’ve doubled the power in there, and I need to remove twice as much heat without an increase in temperature, it takes eight times as much energy to do that, so certainly it’s prudent to minimize that.
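[Editor's sketch] The cubic relation between fan power and airflow described above is the fan affinity law. A minimal sketch, assuming an ideal fan working against an unchanged system; the function names and the 50 W example are hypothetical, and real server fans will deviate from the ideal curve.

```python
def fan_power_ratio(airflow_ratio):
    """Relative fan power for a relative change in airflow (cube law)."""
    if airflow_ratio < 0:
        raise ValueError("airflow_ratio must be non-negative")
    return airflow_ratio ** 3


def fan_power_w(base_power_w, base_airflow, new_airflow):
    """Scale a known fan power to a new airflow using the cube law.

    base_airflow and new_airflow may be in any unit, as long as both
    use the same one.
    """
    return base_power_w * fan_power_ratio(new_airflow / base_airflow)


if __name__ == "__main__":
    # Hypothetical server: fans draw 50 W at the baseline airflow.
    print(fan_power_w(50.0, 100.0, 200.0))  # doubling airflow
    print(fan_power_w(50.0, 100.0, 50.0))   # halving airflow
```

Doubling the airflow multiplies the fan power by eight, matching the figure in the talk; the same law is why slowing fans even slightly pays off so well.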
It also certainly makes, you know, properly balancing any kind of tightly contained system a challenge. Finally, the note I’d like to leave you with on this slide is the highest inlet air temperature does not necessarily guarantee absolute minimum data center energy. Now it probably does guarantee best PUE or lowest PUE because, you know, as my temperature ramps up, my fans ramp up, that means I’m taking more power out there on the floor. At the same time, uh, you know, it’s possible that the return temperature is going up depending on whether I’m actually turning the airflow supply or not. So I’m either, you know, keeping my operating point of my CRACs and my condensers, or I mean my chiller the same or maybe getting slightly better.
So that balance goes towards a better PUE, not necessarily towards the minimum data center energy, and that’s just a plug for tightly monitoring these things. I mean if you really want to squeeze the most performance out of your data center, you have to measure, and that’s all there is to it. So let’s put all these things together. We just had a question about overprovisioning, redundancy. So overprovisioning, redundancy, what we’re doing is introducing margin into the system. And it’s prudent to have margin in your system. You don’t want to run so tight that a disturbance in the system causes a failure.
So, you know, take something like a power outage. I’ve got a certain amount of thermal mass in the air, and in the room and in the mass in the components, and if I ran that all the way up to my 95 degree point, let’s say, and all of a sudden power goes out, I lose airflow and I’m stuck with recirculation of what I’ve got. I may or may not lose the machines, so I probably want to operate somewhat lower than that. So there are a lot of factors here to consider in minimizing energy while keeping the availability that you require for your business. Good designs are often ruined by poor floor practices. I see this over and over and over. I was just in a data center a couple weeks ago where they have the rows, in part of the data center, laid out completely in the wrong direction.
Now that isn’t the way they started out but for whatever reason, the last time that set of equipment got refreshed, someone decided to put those rows in there the wrong way and now I’ve got hot air flowing over the top of many, many racks before it gets back to the CRACs. And finally, you know, if you don’t measure it, you can’t control it, just another plug in there for control methodology. You know, if we think about car engines for instance, car engines not too long ago were not nearly the performance or the economy they are now for the same cubic inches. Well, what changed? It’s still pistons, they still go up and down, they’re still connected to a crankshaft. What really changed is the sophistication of the controls involved. We look at everything.
The fuel air ratio is continuously monitored on a tight, tight tolerance. All kinds of input signals from people are monitored to decide how to control that engine in the most economical fashion. That level of control continues to encroach in the data center world, and what it does is reduce that margin to a minimum. Containment gives you another tool in being able to reduce that margin to a minimum. Okay, Tanisha, I think we can take a couple questions if you have any?
Tanisha White: Great. The first question here is, how important are blanking panels?
Scot Heath: Ah, so I haven’t talked about that but certainly I’m planning on best practices being followed completely for any containment scheme to work. Blanking panels, that’s another – one of those misnomers where I’ve even seen data that says every blanking panel saves $1.30 or some metric and fact of the matter is, it’s just like containment. Putting in blanking panels and causing that air to go in different directions does not gain efficiency. What it does do is give you a tool to then raise the set point, which does gain efficiency, so are they important? They are very important and they actually become more important depending on the containment scheme you’ve got because there’s a possibility of having concentrated, you know, airflow in small gaps with a containment scheme.
So common woes, boy, the common woe that we’re trying to address here is hot spots. Uhm, what containment really does, I mean the function you’re really after here is, uh, to reduce the intake temperature variance to a minimum. Any data center that you go into has got some maximum temperature, some place in the data center that air’s being drawn in and some minimum temperature. You probably are all familiar with where those are, end of rows, top of racks are prime candidates for warm temperatures. And right out of the floor, sometimes you get recirc under the rack but close to the bottom there, six inch, one foot level is typically a very cool spot, so ideally we’d like to have air at exactly the same temperature available to every piece of IT equipment in the data center.
And even better would be if we could regulate, you know, the airflow so that every molecule of air that we spent time cooling in the air handlers went through a piece of IT equipment and did its job by cooling. So those are ideal. We don’t get there and we’re not talking about caulking up gaps and hermetically sealing anything here, but any time we can reduce recirculation, we, uhm, increase the possibility that our gradient is gonna be smaller. So by putting in containment, what we’re trying to do is reduce recirculation, that’s the bugaboo that’s causing these temperature differentials, and thereby give us a tool then to go make our data center more efficient.
Because as I say here, the single hottest spot in the data center of course limits the maximum set point I can use. So finally, let’s get to a little bit of containment. So what will containment do for me? Well, if we do it right, and I caution everyone that containment is, on the surface, very simple, but certainly can have some undesired effects if it’s not done correctly. Done correctly, it will do all these things for you, so reduce recirculation.
What it won’t do, it won’t reduce bypass, and I say that with a caveat of, unless I have variable airflow. So I’m dumping a certain amount of air out into the data center environment. Some will go through the IT equipment, some of it does not. I can force a little bit more through the IT equipment if I keep it tightly contained, and do a cold aisle scheme and increase the pressure in the front of the machines, but it’s not a great effect. There’s still that same amount of air that has to propagate through the data center and come back.
So whether or not, you know, I have curtains up or whatever the case may be, I’m going to have some bypass available. Uh, it will reduce the thermal intake gradient and that’s my real goal here. I mean if I measure the Delta T to be, let’s just say, three degrees across all the intakes to my data center, I’ve achieved, you know, my wildest dreams here, I just can’t expect better than that. It will not change the Delta T back at the CRAC as I mentioned earlier. That’s dictated by two things, airflow and total load. So again, the caveat is, if I’ve got variable airflow and I can adjust that once I put containment in, then certainly I get some benefit from that.
But if I don’t, if I have DX machines, then I’m stuck with the Delta T that I’ve got. Uh, it will allow an increase in temperature set point, so a huge benefit, right, uhm, certainly there’s an increase in efficiency of the compressor in the unit. You know, on DX machines it’s on the order of, I’ve got various studies and in fact, I even did a study when I was still in a previous position, around 1 1/2% per degree that you increase the temperature, you’ll see a performance increase on the compressors themselves. It doesn’t save fan energy in the DX unit, that’s still the same.
It doesn’t change anything out on the condenser because that temperature is regulated separately. But it does raise the average evaporator temperature and that’s what makes that machine run more efficiently. Same thing holds true for water. If I increase the water temperature it’s got a bigger benefit. Even if I don’t increase the water temperature I get to run with less flow, so less pumping, and that raises the average temperature at the evaporator itself. It will enable free cooling hours and this is a biggie. Anytime that I can employ free cooling, I see a tremendous benefit from the number of hours I can run.
And, depending on where you are, relative to the climate in your area, containment and an increase in set point could push you over to be able to get to a lot of “free cooling”. Here in Colorado it’s very dry. Uh, we even have some data centers here, in fact I just worked at a data center here, it’s a 6 megawatt, 32,000 square foot data center that is 100% free cooled, it is all evap cooled, so free, well, you’ve got to run the fans on the evaporators, you’ve got to run the cooling towers, you’ve got to run the pumps, but no compressors involved. They had one, because it’s prudent to overprovision, but they plan to never run it.
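[Editor's sketch] To get a feel for the "around 1 1/2% per degree" compressor figure quoted above, the sketch below turns it into a rough power estimate. This is an illustration only: it assumes the gain is an efficiency improvement that compounds per degree, that the cooling load stays the same, and that fan and condenser power are untouched (as the talk says for DX units). The talk does not state the temperature unit for that figure, and the function name and the 100 kW example are hypothetical.

```python
def compressor_power_after_raise(base_power_kw, degrees_raised,
                                 gain_per_degree=0.015):
    """Estimate compressor power after raising the temperature set point.

    Assumption: each degree of set point increase improves compressor
    efficiency by gain_per_degree, and the improvements compound, so
    the power needed for the same load is divided by that factor once
    per degree.
    """
    if degrees_raised < 0:
        raise ValueError("degrees_raised must be non-negative")
    return base_power_kw / (1.0 + gain_per_degree) ** degrees_raised


if __name__ == "__main__":
    # Hypothetical plant: compressors draw 100 kW at the current set point.
    for degrees in (0, 5, 10):
        power = compressor_power_after_raise(100.0, degrees)
        print(f"+{degrees} degrees: about {power:.1f} kW")
```

Measured data from the actual units should replace the rule-of-thumb gain wherever it is available, in line with the talk's emphasis on measuring.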
Uh, it won’t increase efficiency in and of itself. Again, you know, I can do anything on the floor I want, with mucking about with blanking panels and curtains and so on and so forth. I’m stuck with that airflow and that Delta T unless I make a change. I’ve got a little example here of a place where containment could benefit the data center significantly, and I actually got these courtesy of a visit that I paid a couple weeks ago to a data center where they have some interesting layout issues. And if you look at the picture on the left, that’s actually a cold aisle and it’s the last aisle next to the wall.
Now the CRACs in this particular data center are laid out wrong, if I can say so. They are orthogonal to the row direction. I’d like to see them at the end of the hot aisle corner so that hot air’s got a clean shot, but not so. They’re facing the faces of the IT equipment. And to make matters worse, the first row here is a cold aisle. And so you can imagine what’s happening to that cold air. And in fact you can see it. Here are thermal photographs. The racks you see there on the left side of that left picture have got clearly cold air going up, you know, halfway on some of them, maybe not quite halfway on others.
And a significant amount of that cold air is flowing back to the right, right back into the intake of that CRAC that’s right there. You can just see a little thin strip of it on that left-hand picture. This was quite an interesting observation. I walked up to the CRAC and I’m looking at set points and whether they’re in regulation or not and sure enough this guy thought he was doing his job. Had a 70 degree return air set point. He said, yep, my return air’s 70 degrees and I stuck my infrared thermometer over the top and shot the filter and, gosh, I came away with 80 some degrees, two or three or something and I went, oh, something’s wrong here. Do we have a sensor failure, what’s going on?
And so since they have the infrared camera available, I talked him into climbing up on a ladder and taking a picture of the top of this CRAC, and lo and behold there’s this big extreme temperature differential from front to back of the CRAC. Guess where the temperature sensors are? Sure enough, they’re near the front of the machine, so there’s some mixing of air going on. It’s essentially just reading its own supply air temperature, and it’s not nearly doing the work that we expect out of this CRAC to keep this equipment cool. So we’re getting false readings there. In addition, you can see the effect of the recirculation coming over the tops of the racks on the left.
The temperature gradient top to bottom on these racks, you know, I measured to be in excess of 20 degrees, so any equipment that’s up there very high is, uh, is in danger of suffering mightily if any event should occur here. So imagine what’s going to happen if we lose, uh, power here and the air in this data center starts to heat up until we can get the compressors and the CRACs restarted?
So would containment help here? Yeah, it would help a lot. Now whether it can be implemented or not is another question. There are other factors to consider. I think in this case it could be but it would take more study. So, uh, in that case, cold aisle containment I think would be indicated. So let’s talk a little bit about cold aisle containment here. Some pros about cold aisle containment are, it maximizes supply uniformity so you can drive that inlet temperature gradient to the minimum if you contain the cold air.
Depending on how much cold air you supply, an oversupply of cold air will definitely lead to minimizing that gradient. It uses the under floor air delivery scheme, which is prevalent in many, many data centers, so you’ve already got the path for your cold air available and you don’t need any special return. So the big bugaboo in most open air return data centers is recirculation. You can imagine these arrows on the right here showing the warm air coming out and propagating back over the racks are exactly the case that I just showed you in that thermal picture. Uh, not all the air, uh, goes back into the CRAC in the case where it’s exposed to the inlet side of the racks, and when it is exposed it gets pulled in.
The machines don’t know, they’re happy to pull air in from any place. Uhm, some of the cons, well, now you have to do something to cool the equipment in the rest of the room, so a lot of times networking racks are separate. PDUs are in the room. You may have tape libraries that are kind of standalone or non uniform airflow devices top to bottom. And these things, uh, typically are not high density but still require cooling, and would be very grumpy to be directly in the hot air stream from the exhaust from the machines. Now if you’ve got under floor delivery that’s not a big deal. You put a tile down over there and you supply them cooled air near their inlet. The overall effect on the data center is minimal.
Adequate airflow is critical. I have to supply enough air to this intake. If I don’t do that, then my comment before about missing blanking panels causing a concentrated recirculation can certainly come into play. I was in a data center once where, uh, there was containment with significant negative pressure in the contained area and the flow of hot air underneath the racks was extreme. In fact, I measured temperatures in excess of 100 degrees going into the machines in the bottom of the rack. And they were still running. Kudos to whoever designed those machines, but golly, you are living on the edge if you’re living like that in a production data center.
Uh, the other thing that is a con in this case is it minimizes ride through time, so remember that total mass of air that I’ve got to work with when I’m not actually cooling, you know, when something has happened here, uhm, is minimized here because the majority of the air here is hot. If I look at the total air available, everything in the room is warm. Only that under the floor and in the contained area is cool. So I’m going to heat that air up quicker than I would if the majority of the room was cool.
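[Editor's sketch] The ride through argument above can be made concrete with a rough estimate of how long a volume of cool air takes to warm up once cooling stops. This is a deliberately crude sketch: it counts only the heat capacity of the air, assumes standard air properties and perfect mixing, and ignores the thermal mass of the equipment and the building, which the talk also mentions. The function name, the volumes and the load are hypothetical.

```python
# Assumed air properties near sea level and room temperature.
AIR_DENSITY = 1.2           # kg/m^3
AIR_SPECIFIC_HEAT = 1005.0  # J/(kg*K)


def air_ride_through_s(cool_air_volume_m3, allowed_rise_k, heat_load_w):
    """Seconds until the cool air volume warms by allowed_rise_k.

    Only the air itself is counted as thermal storage.
    """
    if heat_load_w <= 0:
        raise ValueError("heat_load_w must be positive")
    stored_j = (AIR_DENSITY * AIR_SPECIFIC_HEAT
                * cool_air_volume_m3 * allowed_rise_k)
    return stored_j / heat_load_w


if __name__ == "__main__":
    load_w = 200_000.0  # hypothetical 200 kW room
    # Cold aisle containment: only the plenum and contained aisles are cool.
    cold_aisle = air_ride_through_s(400.0, 10.0, load_w)
    # Hot aisle containment: most of the room volume is cool.
    hot_aisle = air_ride_through_s(2000.0, 10.0, load_w)
    print(f"cold aisle containment: about {cold_aisle:.0f} s")
    print(f"hot aisle containment:  about {hot_aisle:.0f} s")
```

The absolute numbers are not meaningful on their own; the point is that ride through scales directly with the volume of cool air available, which is the contrast the talk draws between the two schemes.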
Okay, Tanisha, let’s take a couple questions if you have some?
Tanisha: Yeah, one question here is, are there concerns with too much pressure of cold air in the cold aisle containment configuration?
Scot Heath: You know, not so much over pressure in the cold aisle, other than, you know, it may physically move your containment around. And I’ve actually seen this on hot aisle containment where the customers thought they had a return path but for some reason it just wasn’t there and, gee, we put those curtains in and they looked like sheets blowing in the breeze. So the air will find its way out. No damage is going to occur to the CRAC. The IT equipment itself, some more air will flow through it, it’s not going to hurt anything. The much more dangerous condition is starving these machines for air. And, you know, that can happen on a dynamic basis.
We just talked about that extremely capable machine that varies its fan speed up and down a lot. If I am running an application that is very load heavy and takes a lot of equipment, so something like rendering for instance, I see those machines ramp up and down a lot, and if the temperature is at a point where increased load causes a dramatic change in airflow, I need to be able to respond to that. And really the only way to do that with a DX machine is to assure I have adequate airflow under worst-case conditions. So, again, prudently supplying more air is a good thing here. The savings is going to come, specifically in DX, from raising that set point.
Let’s look at hot aisle containment for a minute here. Hot aisle containment has its set of pros and cons. The balance of the room is cool, that’s a nice thing, and it maximizes the volume of cool air to maximize my ride through times, so I have less concern about the equipment that’s sitting around unable to be contained, primarily those networking racks. We just did a data center like this in my previous position. It’s 100% contained and it’s all hot aisle contained, and it’s very nice to not be so concerned about the mix of high density, low density equipment because in essence what we have here is this tremendous plenum for cold air, and if we didn’t have raised floor, I mean, yes, if we didn’t have raised floor we could even do this on flat. I mean it basically is a very free path for air. Air is a quite low viscosity fluid.
It’ll go around just hunky dory to where it’s needed based on the pressure in the area. So, uhm, it’s great if you’ve already got an over ceiling plenum, if you’re doing a retrofit. If you’re doing a new design, uhm, I would kind of tend to the hot aisle containment for the reasons you see listed here. The downside is, of course, in the retrofit, you do have to have an overhead plenum, some path to get back, or you can duct, but there are expenses involved in retrofitting if those things don’t exist already. I mean if you don’t have plenum-rated cable in the ceiling you have to go do something about that. You may have to abate years’ worth of dust, even asbestos, things like that. So the bottom line is, the decision here is economic.
A, is it worth doing? B, if it is worth doing, what scheme, you know, makes the most sense? It’s all decided by the pocketbook like any business decision is. The hot aisle can be quite hot. Some of those pieces of equipment have extreme Delta Ts. 30 degrees C I’ve seen quoted. I’ve never measured that high, but nonetheless, you know, people being in that hot aisle for extended periods of time is a concern, and OSHA has guidelines on the kind of work you do in there, so they’ve got kind of a light, medium, heavy work definition and they have a heat index which takes both humidity and temperature into account. And they limit the amount of time you can be exposed to that doing those different kinds of work.
So in a raised floor environment, it’s quite easy to temporarily circumvent this by installing a perforated floor tile in the hot aisle where you’re working. Good idea to remember to take it out, but in an on slab kind of construction where you don’t have the availability of introducing cold air where personnel are, then your choice is to limit their activity in that area. Okay, Tanisha, any questions here?
Tanisha: Sure, one question here is how do you deal with devices with non standard airflow like KVM or network switches?
Scot Heath: That’s a good question, and there are a good amount of those, of course. You know, side to side airflow in switches is a big irritation for data center designers, and oftentimes a kit is available to retrofit the airflow at the sides so that you essentially turn a side to side airflow, uh, device into a front to back delivery scheme. If that’s not possible, in cases like this, put the networking switches at the end of the aisle, leave them out of the containment, supply them with air there. There’s not a whole lot of heat load associated with them, but certainly exposing them to high temperatures, with the possibility of recirculation that exists with side to side airflow, uh, could be a bad thing.
Tanisha: Not now, no.
Scot Heath: Okay. Okay, a few things that we haven’t covered yet. Myths and pitfalls. What about fire protection? Well, fire protection is completely up to the AHJ, the authority having jurisdiction, in your region. Uhm, I’ve read quite a bit about gas suppression systems, FM-200 or what have you. And you know, it’s kind of an even division of, uh, in fact, it’s not an even division. The actual test data on dispersion with and without containment is clear. Containment does nothing to the dispersion of the suppressant in any of the cases that I’ve read, in any measurable amount. And the reason for that is there’s just a tremendous amount of air movement through the data center and so, uh, whether or not you dump that out with containment or without, it quickly disperses throughout the data center. That doesn’t keep the AHJ from saying you still have to do something about non blocking [inaudible] and certainly if your scheme is wider than you have to in every case but [inaudible] associated with, you know, do something about that.
Now in the data center we just built that is, uhm, uh, purpose built from the beginning with containment, something we did about it was plan sprinkler heads so that we had even distribution in the hot and cold aisles, right down the center of the aisle, so there’s no decrease in coverage due to the containment. But that’s hardly ever the case. In a retrofit, you’re stuck with where the sprinkler heads are, so there are a couple different methods for dealing with that. Thermal links, which are designed to drop slightly ahead of, or at, the point when sprinkler heads come on, so certainly in wet type systems or pre-action systems they are typically just fine.
If you have a [inaudible] system, or in the case where the AHJ is worried about a gas system, then there are electrical drops available, activated by smoke detection, thermals or what have you. And those come in two flavors: either a single-shot thing where we melt a few links just like we would if it got hot, you do it one time and you’re done, or a latching-relay kind of mechanism that you can actually do testing with but that of course is significantly more expensive to install. So again, it’s just a pocketbook decision.
Concentrated recirculation we’ve actually covered significantly, and my example would be at the end of the 400 some degree. It’s real, it happens, and it’s something to be very, very aware of. I just can’t emphasize enough how, you know, worst-case conditions have to be taken into account when you’re doing this, so properly engineering the solution is a must. “All my temperature problems are solved”? By no means is this true, you know, in and of the containment itself. It’s a big help, you know, but paying attention to the details is critical here, so absolute adherence to best practices, sealing up those gaps as well as possible, is the general guiding principle. So 100% use of blanking panels, and blanking panels aren’t all created equal.
There are certainly different blanking panels that seal better and worse. So the more attention you pay to it, the more chance you’ve got of raising that set point to the maximum amount to find your best efficiency. A couple here that aren’t strictly containment but are kind of accentuated by containment, right? “All CRACs should have the same set point.” Well, you know, CRAC technology being what it is, it typically has the regulation point at the return air of the CRAC. So in that example I showed you before, the thermal picture, you know, it’s right up there in that return stream, kind of at the front of the CRAC, and the CRAC does everything it can to keep that temperature constant. Well, the heat load not being evenly distributed then causes every CRAC in the data center to have a different set point, I mean, a different supply air temperature. What we’d like to do, and what we enable with containment, is have a common supply air temperature, and so unless we change the control scheme of the CRAC, which may or may not be possible, it’s prudent to go and adjust the set points of the CRACs to maintain a constant supply temperature.
And this isn’t something you have to do every day, but you do have to do it on a regular basis because data centers are dynamic things, you know, slow changes. I just showed you that example where they started off, it rose in the right direction and it ended up in the wrong direction. Sometimes facilities get involved, sometimes they don’t. And so it’s quite essential that you keep a careful eye on this. Measure, measure, measure. The more you can monitor this, the better. You know, monitoring solutions are becoming commonly available. We sell them, you know, lots of people sell them, and employing them with containment is just a great idea. “I need 45% relative humidity.”
Again, you know, because that set point is a return air point at the CRAC, this becomes not necessarily more critical but more of an opportunity with containment. I don’t care what the humidity of the air coming back to my CRAC is. [Inaudible] doesn’t care what the humidity of the air coming back to my CRAC is. I care what the humidity going into that machine is, and in fact, I personally don’t care very much. You know, I’ve been associated with data centers here in Colorado for a long time. Many, many data centers do no humidity control.
And in fact there was an article in the ASHRAE Journal a year ago now; it was March of 2010. It questioned the need to specify humidity at all. Now on the bottom end, I’ll agree with that all day long. On the top end, it’s very important that condensation doesn’t occur, and certainly, depending on air quality and the kinds of gaseous contaminants in the air, humidity can accelerate corrosion in some cases. In the United States it is rarely, rarely, rarely an issue. In my previous position, I saw rare instances of corrosion that we believed we could trace to humid environments with gaseous contaminants, but they are few and far between. Nonetheless, it’s very prudent to keep that in mind.
So, ideally, I’d like to have humidity float between the recommended, or rather the allowed, values, 20 to 80%; that’s really not possible with most humidity control schemes today. The important thing is that if I’m setting it in the return air stream, I need to calculate what the equivalent humidity at my supply air temperature is and adjust it per CRAC as well. Now it may be that in your scheme you’re going to have very uniform return air temperature.
In that case it might be just fine to have all the humidifiers and temperatures set the same. But it’s not necessarily the case and certainly bears watching. Okay, before we move on to our last slide, do we have any questions here, Tanisha?
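The return-to-supply humidity conversion Scot describes can be sketched in a few lines. This is a minimal illustration, not a procedure from the presentation: it assumes the moisture content of the air is unchanged between the return and supply streams (sensible cooling only, no condensation, constant pressure) and uses the Magnus approximation for saturation vapor pressure. The function names and the example temperatures are hypothetical.

```python
import math


def saturation_vapor_pressure_hpa(temp_c: float) -> float:
    """Magnus approximation of saturation vapor pressure over water, in hPa."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def equivalent_rh(rh_percent: float, from_temp_c: float, to_temp_c: float) -> float:
    """Relative humidity of the same air after a dry-bulb temperature change.

    Assumes the water vapor pressure stays constant, i.e. no moisture is
    added or removed. A result above 100 means the air would have reached
    saturation and the assumption no longer holds.
    """
    vapor_pressure = rh_percent / 100.0 * saturation_vapor_pressure_hpa(from_temp_c)
    return 100.0 * vapor_pressure / saturation_vapor_pressure_hpa(to_temp_c)


if __name__ == "__main__":
    # Hypothetical CRAC: humidity sensed at 25% RH in a 35 C return stream,
    # delivered at a 20 C supply temperature (about 60% RH at the IT inlet).
    print(round(equivalent_rh(25.0, 35.0, 20.0), 1))
```

Running the same conversion per CRAC, each with its own return temperature, shows why identical return-air humidity set points do not give identical inlet humidity.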
Tanisha White: Yeah, you mentioned that it is important to measure the IT inlet temperature. What’s the best way to monitor that?
Scot Heath: Well, you know, most modern IT equipment has got built-in sensors, but unfortunately we’re not quite to the point of having a uniform communication strategy for that. I actually chaired a committee for a while that was looking at that very thing, finding a standard set of parameters to be reported by all IT equipment and a method for reporting them. But until we get there, you know, the monitoring equipment I mentioned before is highly prudent. It comes in a variety of configurations: wireless, quite easy to set up, very valuable statistics. Many of them monitor much more than just temperature: under-floor pressure, power on a per-rack, per-socket basis. A lot of them interface with the very common Square D power monitoring devices, so you can even display a real-time PUE.
I’m just enamored with them, I think they’re fantastic, but you have to use them. Installing them and ignoring them does no good, so keep a careful eye no matter what. All right, so what’s the bottom line here? I bet I had 30 questions in the right inbox about what’s best, hot or cold. Well, what’s best is completely an economic decision, as I mentioned several times. Your particular situation is going to dictate the containment scheme that you choose, or if you choose a containment scheme at all. You know, it’s very prudent to do a study, you know, get some numbers in place, know what your expected return is. What is your goal? You know, if your goal is to save a gazillion dollars and get a one-year payback, it’s probably not gonna do that.
I was just out at the Green Grid tech conference last week, and Disney, I believe it was, presented a case study where they hadn’t quite got to containment but they tightened up their whole gap abatement scheme: 100% blanking panels. They even sealed up and down the sides of the equipment in the racks, tightened up the racks, adjusted the floor tiles and installed the EFTs, and even with that, they got less than a 24-month payback, which for a structure that’s going to last as long as a data center is just excellent.
They plan to continue this, hopefully they’ll come back with some more data, you know, after having implemented containment. But, you know, it’ll be interesting to see. All right, I’ve talked enough. If we have some more questions, I’m happy to entertain those in the time that we’ve got left.
Tanisha White: Great. Thank you, Scot, for a great presentation. Now I want to open the discussion for any questions. A reminder to our participants, please feel free to use the chat feature to submit your questions. For our first question, there are a number of questions around low humidity and ESD effects. Can you please address this?
Scot Heath: Sure. You know, again, referring to that study that was published in the ASHRAE Journal about a year ago, they basically cited specifics that said no measurable increase in failures can be attributed to increased ESD. And, frankly, ESD testing on the equipment itself is extensive. We ESD test every exposed – well, we don’t manufacture equipment here, but they test every exposed connector, they use a human body model, they shock these things up one side and down the other. So a lot of attention is paid to the effects of ESD discharges directly to the equipment. So now let’s look at the probability of actually having an event like that. You know, this thing is in a grounded metal case which is in a grounded metal cabinet, which, you know, 99-plus percent of the time you’re going to have contact with before you can get close to touching anything that’s going to be critical. So this is empirical evidence, not specifics from data centers I’m associated with, but certainly things I’ve noticed: I have actually no direct experience that ESD has ever been a factor in a failure of a piece of equipment after it’s been mounted in a rack, or even just assembled.
Think about even your house, and I don’t know where you live, but where I live, in Colorado, it’s dry here. And every time I walk up to my computer at home I might shock the keyboard or the mouse or the front panel, and that doesn’t fail, and these things are cheap, basically, compared to the equipment that goes in the data center.
Tanisha White: Great. Mike also has a question here, and he is asking: what is the point at which higher CRAH return temperatures result in higher data equipment loads, such as increased UPS load, taking away from the return air temperature efficiency gains?
Scot Heath: Great question. And it’s dependent on the specific equipment set that you’ve got and the mix of equipment that’s there: old and new, manufacturer, so on and so forth. I actually did some testing, you know, on a piece of equipment that I was responsible for as power and cooling architect at HP, and it was a little bit skewed. I used a chamber which has got hot gas bypass control on the compressor, so not the most efficient thing in the world, but in fact I found that for that particular situation, a temperature of around 15 degrees C gave me the best energy point. Now, substituting in a model that had a more normalized coefficient of performance curve for the compressor, where I didn’t employ hot gas bypass, gave me somewhere in the neighborhood of 25 degrees C.
So the bottom line is, again, measure, measure, measure. If you have the ability, measure what the total energy consumption is, raise the temperature a degree at a time and do an extended test, because none of this stuff is measurable on a short-term basis. Way too many variables are swinging around: outside temperature, for one. In the test I did, I was convinced I could just run an hour at a time. Guess what? They use an ice melt system to cool the water loop during the day, and as the ice melts the water loop temperature increases. And I could see in my data the power on the compressor increase as that water loop temperature increased, and I had no idea what was going on, so I ended up investigating that further, and then I ended up having to do 24-hour runs at a single temperature, even with as controlled an experiment as that was.
So I wish I could give you a crystal ball answer, but it just can’t happen.
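The "raise the temperature a degree at a time and do an extended test" procedure can be sketched as a small analysis step. This is only an illustration under stated assumptions: the readings are hypothetical total-power samples (IT plus cooling, in kW) logged across full 24-hour runs at each set point, and the function names are invented for the example.

```python
from statistics import mean


def average_power_by_setpoint(samples):
    """Average total power per set point.

    samples: iterable of (setpoint_c, total_kw) readings, where each set
    point was held for a full 24-hour run so that daily swings (outside
    temperature, chilled-water loop temperature) average out.
    """
    by_setpoint = {}
    for setpoint_c, total_kw in samples:
        by_setpoint.setdefault(setpoint_c, []).append(total_kw)
    return {sp: mean(readings) for sp, readings in by_setpoint.items()}


def minimum_energy_setpoint(samples):
    """Return the set point with the lowest average total power."""
    averages = average_power_by_setpoint(samples)
    return min(averages, key=averages.get)


if __name__ == "__main__":
    # Hypothetical readings: total power falls as the set point rises,
    # then climbs again once the IT fans ramp up.
    readings = [(22, 100.0), (22, 104.0), (23, 98.0), (23, 99.0), (24, 101.0), (24, 103.0)]
    print(average_power_by_setpoint(readings))
    print(minimum_energy_setpoint(readings))
```

Averaging over the whole run is the point of the sketch: a one-hour sample taken while the water loop temperature is drifting would attribute that drift to the set point.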
Tanisha White: Okay, thanks. The next question is: you showed CRACs for all the containment solutions. Are there other types of cooling solutions appropriate for containment?
Scot Heath: Well, sure. Close-coupled cooling, which was mentioned up front and which I largely ignored, you’re absolutely right, is a great candidate for containment. In fact, some manufacturers sell that as a whole system that’s got, you know, the in-row cooling and the containment, kind of a package deal. And the big advantage to that is, you know, as I mentioned earlier, the airflow path is so short, response time is excellent, and you have great control over fan speed to affect, you know, overall performance. So it’s just another way to eliminate margin by bringing that equipment out there close. Now the downside is, depending on what your total load is and, you know, how discrete you can get with the in-row devices, you may end up with a more or less solution for redundancy’s sake. But you know, as densities increase and the flexibility of in-row devices continues to increase, it’s highly likely to be a very economical choice if you’re looking at going in that direction.
Tanisha White: Thank you. We have a question from Andrew, and he asks: if the containment system eliminates excessive bypass air, won’t that raise your return temperature?
Scot Heath: It doesn’t eliminate excess bypass air, that’s the key, right? I am still supplying the same amount of air, and that same amount of air has got to make it back to my CRAC. If it doesn’t go through the IT equipment, it has to bypass. And even if it does go through the IT equipment, the temperature out the back is lowered relative to the temperature that would have been there had it not gone through there, so, no, it doesn’t affect the delta T at the CRAC. Now that goes out the window if I have the ability to vary the amount of air that I deliver. So if I have EC fans, or I have VSDs, and I can actually reduce the air that I’m delivering, then you bet, it does.
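The energy balance behind this answer can be checked with a simple mixing calculation. The sketch below is illustrative only: it assumes steady state, a fixed IT heat load, constant air density and specific heat, and that all supplied air returns to the CRAC either through the IT equipment or as bypass. The constants, function name and example figures are assumptions, not values from the presentation.

```python
AIR_DENSITY_KG_M3 = 1.2   # assumed typical value near sea level
AIR_CP_KJ_KG_K = 1.005    # specific heat of dry air


def return_temp_c(supply_c, it_load_kw, total_airflow_m3s, bypass_fraction):
    """Mixed return air temperature at the CRAC.

    A fraction of the supplied air bypasses the IT equipment and returns at
    supply temperature; the rest picks up the whole IT load and returns hotter.
    """
    it_airflow = total_airflow_m3s * (1.0 - bypass_fraction)
    exhaust_c = supply_c + it_load_kw / (AIR_DENSITY_KG_M3 * AIR_CP_KJ_KG_K * it_airflow)
    return bypass_fraction * supply_c + (1.0 - bypass_fraction) * exhaust_c


if __name__ == "__main__":
    # Same CRAC airflow, different bypass fractions: the delta T is unchanged.
    for bypass in (0.1, 0.3, 0.5):
        print(bypass, round(return_temp_c(18.0, 100.0, 10.0, bypass) - 18.0, 2))
    # Halving the delivered airflow (variable-speed fans) doubles the delta T.
    print(round(return_temp_c(18.0, 100.0, 5.0, 0.1) - 18.0, 2))
```

With a fixed supply airflow, less bypass means a cooler IT exhaust in exactly the proportion that keeps the mixed return temperature constant; only reducing the delivered airflow raises the delta T at the CRAC.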
Tanisha White: Okay, from Russell, he’s asking, I have read that a 20 ton unit can provide more than 20 tons with hot aisle containment, and less than 20 tons with no containment. To confirm, you’re saying this is not the case?
Scot Heath: No, I’m not saying that at all. The capacity of the equipment, and I didn’t go into depth on this just because of time, but you are absolutely right, Russell, congratulations for reading that. Look at any manual: take a [inaudible] 27 manual for the Deluxe System 3. I guess that’s an older one, but take a DX system today, and certainly as the entering air temperature, which is the return air temperature, changes, you get different capacities, both sensible and total capacity.
And you have a choice now, as a consumer, to do something with the increased capacity. You can either use it by installing more equipment, or you can allow the compressor to run at a more efficient point without changing the load and actually save energy. So if I gave that impression, I apologize; that is absolutely not the case. It’s not because airflow changes, it’s because entering air temperature changes, and that only occurs when I get a change in set point, not from putting in containment and changing bypass, because bypass is the same. But if I raise the set point so my entering air temperature goes up, presto, I get more capacity.
Tanisha White: Okay, from Jason, he’s asking, are humidity levels more difficult to control in a cold aisle versus a hot aisle contained area?
Scot Heath: You know, the control point is the same. I wouldn’t say they’re more or less difficult to control. I think the trick, to get the optimum performance, is that balancing act of trying to set the CRACs so as to get a common supply temperature, which would be ideal. If I have the capability, and some equipment has got variable-capacity compressors, so if I have a digital Scroll compressor, to use a [inaudible] term, and variable-speed fans, then I have the ability to move my set point out into the cold aisle, and I’m a big fan of that. It’s just not possible in a large number of current installations because the equipment set is already there. So in newer data centers the trend is certainly in that direction, setting the supply air temperature rather than the return air temperature.
Tanisha White: Okay, this one comes from John. How does containment compare to chimney cabinets?
Scot Heath: Well, chimney cabinets are certainly a method of containment, slightly different than a totally contained system where I enclose a rack and do in-rack cooling. Again, it’s an economic decision. If I have no containment scheme in place whatsoever, I don’t have a cabinet to my name, I need to look at that. I’m going to have to buy cabinets anyway; if I purchase chimney cabinets, is that going to make it more expensive or less expensive than putting in standard cabinets and curtains? As far as effectiveness, depending on the total load employed and the kind of chimney cabinets that you have, most of them are quite capable of very dense loads. So the real decision is, what’s going to be the most economical way to get there?
Not necessarily a big difference in performance between the two.
Tanisha White: Okay, this one’s coming from Rich. Once you create containment, what is the suggested supply air temperature at the rack level to satisfy the majority of data center equipment? What is the top-end temperature?
Scot Heath: Well, for the very top-end temperature, go do a little search. All your manufacturers will publish that. You know, in my past employment our range was quite tight. I can’t think of a piece of equipment that was outside 90 to 95 degrees Fahrenheit, so 90 would be your absolute top that your equipment would be guaranteed to run at, but that doesn’t mean it’s going to shut down at 90.001. In fact, in the experiment I did, I ran quite a bit past that point and it was fine; it’s just that we don’t guarantee that it’s going to run past that point. So the real best point is the minimum energy point, and again I can’t tell you what that is because it’s dependent on the equipment set that you’ve got.
It’s unlikely that it’s going to be past that threshold for where the equipment’s guaranteed to operate, just because of the control schemes, the monitor-and-control schemes for fans; you’re going to have the fans ramped up significantly at that point. I mean, the machine I tested is a big machine, it has a lot of fans, but the swing in just fan power alone was a kilowatt. So I’m dumping a kilowatt more power into that piece of IT equipment. Not only that, but now I have to take that kilowatt back out of the environment. So despite the fact that my compressors were getting more efficient, they had that much more load to contend with, and the increase in load negated the increase in coefficient of performance.
Tanisha White: All right, thank you, Scot. We have a question here, with respect to density, does it matter what containment system you have?
Scot Heath: Well, you know, hot aisle containment gives you a little bit more resiliency for changes in density because the supply of air is more flexible. If I have a cold aisle containment system and I try to be very careful about matching my supply air to my load, changes in load can cause me to have insufficient air supply, so I have to prudently set that too high. Now if I am prudent about that, there’s probably not that much difference.
But the flexibility of having that, you know, vast air, if you will, in the hot aisle containment case lets me be less sensitive to unexpected changes in density. If that IT guy forgot to tell me that he was putting three blade chassis in this particular contained area, I’d probably be able to adapt to that much more easily with hot aisle containment than cold.
Tanisha White: Okay, Blake is asking, how much can a CFD study assist with decision making?
Scot Heath: Well, I love CFD studies, particularly for what-ifs. So they can assist a lot in decision-making, not only on the containment scheme itself but on what’s the best place to put this equipment, and do I need to make any under-floor modifications if I’m going to put, you know, extremely dense equipment someplace. So, you know, nothing is easier than going and adjusting some parameters in a model rather than going and doing the experiment to find that stuff out. That said, for day-to-day management of temperatures, I’m a big believer in monitoring. So I think both these tools are very valuable, and I think they have some overlap, but I think that, you know, the major contribution of CFD modeling is in the what-if. The major contribution of monitoring is in the day-to-day management and alarming.
Tanisha White: Okay, we have time for one more question, from Mike, and he’s asking: how, or what method, would you use to measure CRAC power consumption before and after raising set point temperatures, to prove ROI?
Scot Heath: Well, the absolute best method is to measure the power directly, so there are lots of, you know, places you can get that power. You can measure the total power to the CRAC, and what you’re really interested in is saving, you know, power to your data center. Unfortunately, outside conditions cause extreme swings in power level. So I’ll give you a good example of that, right? The humidity in some places can go from sub-10% to in excess of 90%. Well, guess what that does to your humidity control scheme? These are significantly power-hungry operations. In fact, I just ran some numbers on a fairly large DX unit where, you know, I have one CRAC that is dehumidifying and at the same time reheating while the guy across the room is humidifying. So what am I doing on the CRAC that’s dehumidifying and reheating? I’m actually consuming around 75 kilowatts for nothing but taking water out of the air. Well, at the same time, you go across the room, and I’ve got 12 kilowatts’ worth of humidification going on. So, you know, big swings aren’t necessarily associated with the compressor power. You can measure compressor power directly, just find the leads inside the CRAC, but you have to do it over a long period of time. That’s a nice thing to know because it lets you, you know, model that as well. But what you’re interested in overall is everything put together. Can I make an adjustment to my humidity and see what the effect of that is?
So if you have a point where all your CRACs come together, or a few points, putting in a big power monitoring system, and everybody makes them, Square D, GE, pick your favorite, is the best way to go make that measurement.