Page 65 - How Cloud Computing Is Transforming Business and Why You Can't Afford to Be Left Behind
THE AMORPHOUS CLOUD
that Google operation managers are alerted, the failing server
is identified, and its workload is moved elsewhere before the
battery is exhausted. I suspect, but do not know, that all this
happens automatically. A human somewhere notes the server
outage. At some point during a regular maintenance sweep,
the power supply unit is replaced and the server is brought
back online, or perhaps the entire server is replaced when it
reaches a certain age.
Google officials have talked about how they’ve designed
their data center expecting such component failures. When
there are tens of thousands of servers working together, such
failures, which are infrequent for the home computer user,
start to occur on a regular basis. Disk drives fail, power supplies
fail, network interface cards fail, other components seize
up, and the server grinds to a halt.
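The arithmetic behind that shift can be sketched in a few lines. The figures here are illustrative assumptions, not Google's numbers: each server is assumed to fail on average once every three years, failures are assumed independent, and the fleet size of 30,000 is hypothetical.

```python
def expected_failures_per_day(servers: int, years_between_failures: float) -> float:
    """Expected server failures per day.

    Assumes independent failures at a constant rate; both inputs
    are illustrative, not measured values.
    """
    return servers / (years_between_failures * 365.0)


# One home computer versus a hypothetical fleet of 30,000 servers,
# each assumed to fail about once every three years.
home = expected_failures_per_day(1, 3.0)
fleet = expected_failures_per_day(30_000, 3.0)

print(f"home user: {home:.4f} failures per day")
print(f"fleet:     {fleet:.1f} failures per day ({fleet / 24:.2f} per hour)")
```

Under these assumptions the home user waits years between failures, while the fleet sees more than one failure every hour.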
In a paper outlining many aspects of the cloud data center,
Urs Hölzle, senior vice president of engineering at Google,
and Luiz Barroso, Google distinguished engineer, say, “An
application (such as a search engine) running across thousands
of machines may need to react to failure conditions on an
hourly basis.” Hölzle and Barroso have given us a major clue
to the rise of cloud computing: it achieves new economies
of scale yet remains broadly available to multitenant users
because it’s being managed by software, not humans, and it
achieves fault tolerance in that software, not the hardware.
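What fault tolerance in software might look like can be shown with a minimal sketch. It is not Google's implementation: the Node class, the healthy flag, and the reassign_failed_work function are hypothetical names invented for illustration, and a real system would detect failures itself (for example, through missed heartbeats) rather than have the flag set by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical server in a cluster, with the tasks it is running."""
    name: str
    healthy: bool = True
    tasks: list = field(default_factory=list)


def reassign_failed_work(nodes):
    """Move tasks off unhealthy nodes onto the least-loaded healthy node.

    Returns the names of the tasks that were moved.
    """
    healthy = [n for n in nodes if n.healthy]
    if not healthy:
        raise RuntimeError("no healthy nodes left to take over the work")
    moved = []
    for node in nodes:
        if node.healthy:
            continue
        while node.tasks:
            task = node.tasks.pop()
            # Pick whichever healthy node currently has the fewest tasks.
            target = min(healthy, key=lambda n: len(n.tasks))
            target.tasks.append(task)
            moved.append(task)
    return moved


nodes = [
    Node("a", tasks=["t1"]),
    Node("b", tasks=["t2", "t3"]),
    Node("c", tasks=["t4"]),
]
nodes[1].healthy = False  # simulate a hardware failure on node "b"
moved = reassign_failed_work(nodes)
print("moved:", sorted(moved))
```

The hardware is allowed to fail; the software notices and spreads the failed node's work across the survivors, with no person involved at the moment of failure.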
For example, Google has designed its search engine operation
with the expectation that one or more nodes
within the cluster will fail. Rather than try to build infallibility
into the hardware, it has kicked the responsibility upstairs to