Linux clusters give scientists the opportunity to run large-scale computations that were previously out of reach, opening up a new world of possibilities not just for businesses, but for society at large. CIO Insight reporter Debra D’Agostino spoke to Rob Pennington, senior associate director of computing and data management for the National Center for Supercomputing Applications (NCSA), about the benefits and challenges of Linux clusters, and potential future breakthroughs.
CIO Insight: What are some of the benefits—and challenges—to using Linux clusters rather than supercomputers? Is it just about saving money?
Pennington: Well, that’s one aspect of it, but there’s another aspect that’s extremely important: you can use Linux on a much larger number of different machines. You are not tied to a particular vendor. If you have a large-scale machine from vendor A, you have to use software developed for vendor A’s operating system. With Linux machines, that’s not quite as true. So it’s possible to keep code portable without a great deal of pain when moving from machine to machine.
To be blunt, there is no single machine that solves all of the application problems equally efficiently. There are certain applications that work well on clusters, certain applications that work well on vector machines, and certain applications that work well on large-scale SMPs. So your application should drive the choice of machine you put on the floor. Linux clusters are a viable alternative to classic supercomputers, depending on the application. It so happens there are a large number of applications that work very well on high-end Linux clusters. In terms of scalability and stability, these are very respectable machines at this point. They have matured considerably over the past few years, and vendor support is consistently getting better.
As for the challenges, it is a little more difficult to get support from some of the vendors simply because it’s not their in-house operating system. But that is also changing.
Where are we in terms of adoption?
It has taken off tremendously in the past couple of years. About five years ago, we built our first Linux cluster, and we basically had to do everything ourselves from start to finish: buying the machines, doing the integration, doing the testing, finding the software that worked, getting everything sorted out, and then putting it out for the applications people. Now we’re buying large-scale Linux clusters from the vendors as a package. So it has changed from a do-it-yourself R&D project to a platform supported by a number of vendors, and it’s important to note that it’s not a single vendor. You can buy Linux clusters from Dell, HP and IBM, to name a few.
What are some examples where Linux clusters really can make a big impact in terms of research?
Well, ask any application scientist and watch their eyes kind of light up. One example is a detailed nationwide weather forecast, or very detailed local or regional forecasts, so you can do things like tornado forecasts. You can look at the incoming data from the Doppler radar and the other sensors, run your model, and provide information on where a tornado is likely to form.
If you look at weather forecasting a hundred years ago, it basically consisted of going out and looking in the direction of the prevailing winds to see what was coming. We have much better sensors now, and much better simulations. When I got up this morning, the weatherman said, “OK, it’s nice now, it’s probably going to rain by noon but it’ll clear off by this afternoon.” So I drove to work with the top down on my car, and when I got to work, there were no clouds in the sky. I came out at noon and my car was full of water because I forgot to put the top up.
So that’s an example of a simulation that is already working. But what I’d really like to know is whether there’s a 90 percent chance of rain within one mile of where I live, how much rain we’re going to get, and when it’s going to happen. If you think about that, you can see its importance to agriculture, for example. You can see its importance to the airline industry: airlines can route around major thunderstorms and reschedule flights if they have a few hours’ notice.
So as these computers become more and more powerful, we’re going to be able to solve issues that can have an impact not just on business, but on society in general?
The essence is that as the machines get faster and the codes improve, it’s possible to put more physics and chemistry into the simulation so it looks more and more like real life. The closer you get it to looking like real life, the better the chances are that you’re going to be able to predict what may actually happen. We’re seeing some of it now, and it’s only going to get better. As you give the application scientists better tools, they can give us better answers. And the better tools consist of more powerful systems, better data management environments and better applications development tools. This is all about information management, and people are at the point now where they are doing direct comparisons between the models. As the computers become more capable and the applications are able to improve the quality of the physics and chemistry that’s going in, the results are going to be better for us.