Pushing Technology to the Limit

Written By
John Parkinson
Oct 5, 2007

The hardware and software platforms on which we run our businesses are remarkably reliable, and we take that reliability for granted. Few of us consider just how this level of reliability is achieved. Historically, we haven’t stressed the platforms all that hard; we’ve used only a fraction of the theoretical capability of the technologies we deploy. But that’s about to change.

As the pressure to improve asset usage pushes technologies such as virtualization and grid computing closer to the mainstream, usage rates of our three main technology resources (compute cycles, storage and connectivity) will rise, and those rates will begin to approach the theoretical limits of capacity. Then not only will we need better instrumentation and monitoring tools, but we may also be forced to deal with some surprising reliability issues.

One of my clients inadvertently stepped into this uncharted territory. The company discovered a set of circumstances under which a combination of hardware and software that had always performed flawlessly hit a threshold. The result was a cascade of failures, eventually traced to certain actions not having enough time to complete, which in turn triggered a previously unrecognized endless-loop condition.

As a result, one of the company’s core business processes was sidelined for two days. This situation wasn’t caused by some esoteric combination of technologies, but by a very high usage level. When the client queried its vendors, it discovered that the products had never been tested at such high usage.

The manufacturers had concluded that such testing would not be cost-effective, and had not indicated the maximum level at which their quality assurance and capability claims applied. Interestingly, when my client re-ran the workload on a different combination of technologies at the same usage level, the problem disappeared.

I’ve found other examples of this threshold-of-failure phenomenon in other highly used technology combinations, including another client that runs many virtualized servers in an environment where it routinely consumes more than 95 percent of available physical hardware capacity. This company is starting to see unanticipated-failure events that seem to be random and difficult to attribute to a root cause, which makes diagnosis and prevention tough. You might conclude that this is simply too high a usage level (most virtualization software vendors would suggest staying at 80 percent usage maximum for stable operation), but financial pressure will drive more infrastructure managers to try to get closer to 100 percent. And these examples raise a more fundamental question for CIOs and technology strategists: Just how do vendors establish reliability claims in the first place?

The answer seems to be 50 percent physical testing (run to destruction) and 50 percent simulation and modeling based on the physical test data. This is good enough as long as you stay near the middle of the performance envelope, but what does it tell you about life at the edge of theoretical performance? "Not a lot" and "not enough" seem to be the emerging answers.
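To see why the edge of the envelope can behave so differently from the middle, consider a rough sketch based on the textbook single-server (M/M/1) queueing formula, in which mean response time equals service time divided by (1 - utilization). This is an illustrative assumption, not a model of any vendor's product or of the client systems described above; the function name and the 10 ms service time are hypothetical.

```python
def mean_response_time(service_time_s: float, utilization: float) -> float:
    """Mean time in system for a textbook M/M/1 queue: S / (1 - rho).

    Assumes random (Poisson) arrivals and a single server; real
    platforms are more complicated, so treat this as intuition only.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_s / (1.0 - utilization)


if __name__ == "__main__":
    service_time_s = 0.010  # hypothetical: 10 ms of work per request
    for utilization in (0.50, 0.80, 0.95, 0.99):
        t = mean_response_time(service_time_s, utilization)
        print(f"{utilization:.0%} busy -> {t * 1000:.0f} ms mean response")
```

Under this simplified model, moving from 80 percent to 95 percent usage quadruples the mean response time, which is one plausible way that actions which always completed in time at moderate load could start running out of time near the limit.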

So next time you think about pushing the limits of your infrastructure capacity to meet your budget constraints, consider asking your suppliers what they really know about their products’ limits, and be prepared to adjust your plans accordingly.
