More Power to Google
NEW YORK - Google is seeking the optimal energy efficiency for its large data centers, and it is counting on its top engineers to help deliver it.
Luiz Barroso, a distinguished engineer at Google, discussed the company's projects to reach optimal energy efficiency in a talk entitled, "Watts, faults and other fascinating 'dirty' words computer architects can no longer afford to ignore," at the company's complex here on April 5.
Barroso, a former Digital Equipment engineer with a history of delivering load balancing software for large-scale systems and of working on the design of the core Google infrastructure, summarized two projects he has been working on.
One, a power provisioning study, will be formally released in a paper this summer, Barroso said.
Two main points arose from the power provisioning study, he said: "Maximizing usage of available power capacity is key," and "systems are typically very power-inefficient on nonpeak conditions."
Moreover, Barroso said, "Power/energy efficiency and fault-tolerance are central to the design of large-scale computing systems today. And technology trends are likely to make them even more relevant in the future, increasingly affecting smaller-scale systems."
Barroso acknowledged that Google is building data centers where there is hydroelectric power and "engineers are squeezing every little watt out of every card."
Indeed, while circuit designers have to worry about temperature and similar issues, "we worry about the affordability of building data centers," Barroso said.
He noted that it costs between $10 and $22 per watt to build a data center, while the U.S. average energy cost is only 80 cents per watt per year. So "it costs more to build a data center than to power it for 10 years," Barroso said.
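The comparison behind that remark can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the talk and assumes the 80-cent energy figure is a cost per watt per year, which is the reading under which the 10-year comparison works out; the function and constant names are illustrative, not anything Google published.

```python
# Figures quoted in the talk. Reading the energy cost as "per watt per year"
# is an assumption made for this sketch.
BUILD_COST_PER_WATT = (10.0, 22.0)   # dollars per watt of data center capacity
ENERGY_COST_PER_WATT_YEAR = 0.80     # dollars per watt per year (assumed unit)
YEARS = 10


def energy_cost(years, rate=ENERGY_COST_PER_WATT_YEAR):
    """Dollars spent powering one watt continuously for `years` years."""
    return years * rate


ten_year_energy = energy_cost(YEARS)
low, high = BUILD_COST_PER_WATT

print(f"Energy cost per watt over {YEARS} years: ${ten_year_energy:.2f}")
print(f"Build cost per watt: ${low:.2f} to ${high:.2f}")
print(f"Build cost is {low / ten_year_energy:.2f}x to "
      f"{high / ten_year_energy:.2f}x the {YEARS}-year energy cost")
```

Under that assumption, even the low end of the construction range exceeds a decade of electricity for the same watt.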
"You want to get as close as possible to optimal usage," because unused watts cost money, he said.
So for the power provisioning study, Google looked at how much energy its machines were using over six months.
The sample for the study covered only 800 of the thousands of machines Google employs, and one of the findings was that "you spend 60 percent of your time at or below your peak, and racks of machines are never at peak at the same time."
Moreover, "the data center as a whole is never going above 70 percent of capacity, and that shows we could have deployed 40 percent more machines."
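The step from a 70 percent peak to roughly 40 percent more machines is a simple ratio. The sketch below is a back-of-the-envelope version that assumes aggregate power draw scales linearly with the number of machines and that the observed peak fraction would hold for the larger fleet; neither assumption comes from the talk, and the function name is hypothetical.

```python
def extra_machine_fraction(peak_fraction):
    """Fraction of additional machines that fit under the same power budget.

    Assumes power draw scales linearly with machine count and that the
    fleet's peak draw stays at `peak_fraction` of provisioned capacity.
    """
    if not 0 < peak_fraction <= 1:
        raise ValueError("peak_fraction must be in (0, 1]")
    return 1.0 / peak_fraction - 1.0


# Peak usage quoted in the talk: 70 percent of provisioned capacity.
headroom = extra_machine_fraction(0.70)
print(f"Additional machines that would fit: {headroom:.0%}")
```

With a 70 percent peak, the ratio comes to about 43 percent, in line with the rounded figure Barroso quoted.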
Barroso highlighted two hot areas of computer design made famous in the '90s that have proven to be flawed. One is the acceleration of single-thread performance, which he referred to as the megahertz race. The other is the building of big, distributed shared memory systems, which he called the DSM race.
The theory behind the DSM race was that large-scale computing systems should use a shared-memory programming model because it was familiar to programmers and facilitated sharing of expensive resources, among other things. But the undoing of the DSM race was fault containment, Barroso said.
"A single fault can bring down the entire shared memory domain," Barroso said. "It's a very hard problem to solve and most of the solutions are inadequate."
Meanwhile, the megahertz race promised that even unmodified software would simply get faster by itself because of computer architecture tricks. But "the megahertz race crashes into the power wall," Barroso said.
He said that every year enterprises can buy faster servers for about the same price, "but much more energy is being used so systems become power-inefficient."
Joked Barroso: "When you get to the point where power costs more than servers, you'll have a situation like the cell phone industry model where utility companies might say, 'I'll give you these servers for free if you sign this energy contract.'"
Barroso also mentioned H.R. 5646, a congressional bill signed into law last year to promote the use of energy-efficient computer servers in the United States.
"There are a lot of things you can do to reduce energy conversion losses, like go to single-voltage rail power supply units [PSUs]," Barroso said. "You can get up to a four times reduction in conversion losses."
Moreover, Barroso said Google is "working with [its] partners to create open standards for higher-efficiency PSUs." He later said the list of partners includes Intel and AMD.