Technology: Storage
By Gary Bolles | Posted 08-19-2002
Every day at the Aviation Weather Center in Kansas City, Mo., a team of 60 meteorologists sifts through a torrent of weather data from thousands of sources, such as satellites, radar ground stations, weather balloons, ships, pilots, even offshore buoys. The group's nine different forecast desks run the information through a variety of computers, including one of the world's fastest supercomputers, generating model after model of dense graphics detailing weather conditions covering two-thirds of the globe, from tornados in Kansas to floods in central Pakistan. All in all, says Clinton Wallace, an IT specialist at the center, AWC's two aging Hewlett-Packard K-class servers, which move all the data from one place to another, have to juggle 13 gigabytes of data a day, roughly equivalent to four and a half million pages of text in the nondigital world.
When they were functioning, that is. But the servers used to crash frequently. "We're just beating them to death," admits Wallace. "We're driving them at 100 miles an hour, all day long." Outages used to last up to an hour, leaving forecasters without the ability to generate advisories for thunderstorms, ice, turbulence and visibility for thousands of pilots crisscrossing the planet.
Aviation Weather's data storms aren't exceptions in today's data-saturated world, and with the amount of information to be warehoused in the digital economy growing at 30 percent a year, according to Gartner Inc., companies and organizations everywhere will need to create a better and cheaper way to handle the data glut, or risk costly, even life-threatening, information failures.
Companies "should have secure access to information at any time, over any distance, no matter where the information is kept, no matter the type of computing platform, all at the fastest possible speed," says Dan Tanner, a senior storage analyst for the Aberdeen Group, a research firm. "That's the Holy Grail. That should be the strategic aim [of IT]. If you can do that, your business will run smoothly."
Storing and accessing critical business data wasn't always a concern. In the days of mainframe computers, storage was centralized, and users always knew where it was (on the big iron), even if they couldn't always get at it. But with the advent of cheaper and more accessible networked computers in the 1980s and 1990s, IT departments opted to phase out their companies' more reliable, enterprise-class storage systems. Cheap and accessible networked devices also let companies put computing and storage horsepower where they thought it belonged: in a workgroup, for instance, or a hosting company's data center. But the downside is complexity. "When you vastly increase storage, and you vastly complicate the network," says Tanner, "managing the storage and movement of data becomes steeply more difficult."
Corporate strategy should always drive every major IT architecture decision. The infographic suggests a way to think about how a particular storage design affects strategy, in a company where increased nimbleness and employee productivity are essential to delivering value to customers. To ensure rapid access to data, IT must build an efficient networking infrastructure that delivers the necessary speed. The data must be widely available, which typically means a strategy that blends centralization, reliability and scalability. These factors, in turn, are supported by an increasing trend toward standardization, allowing IT to "virtualize" storage hardware, making it possible to save and access data wherever necessary in the networking infrastructure.
Part of the problem stems from the fact that distributed computers merged all major pieces of the computing puzzle into one box. That means storage was merged with other components, from applications to operating systems to processing. These puzzle pieces aren't easily uncoupled, making it difficult to isolate storage to maximize its effectiveness. Says Tanner: "Now the data center, instead of a mainframe, might have a bunch of open systems. But these computers won't share the information very well if they all have their own storage."
And that complexity can be costly. For IT, managing distributed storage means heavy spending on hardware and staff time. According to Mike Kahn, chairman and cofounder of the Clipper Group, a Wellesley, Mass.-based consulting firm, the cost of labor can be as high as seven to eight times the cost of the hardware itself. That's why most IT shops, especially amid the current economic downturn, are trying to do more with less. The challenge, says Kahn, is: "How can I manage two to four times more storage, and do it better?"
Also contributing to the storage problem: Few business executives have any interest in the technology their company is using to store data. Few understand the link between a company's storage strategy and its ability to use data, in real time, in the course of cutting costs, boosting profits, or predicting hurricanes. "The most critical thing we need for forecast operations is data," says the Aviation Weather Center's Wallace, and that means "having your data available when people need it."
But the trade-off between cost and complexity doesn't have to be a bad one. In an effort to keep costs down while achieving the reliability of the mainframes of old, many companies are rethinking their digital storage networks.
At Maimonides University Medical Center in Brooklyn, N.Y., a 750-bed hospital with more than $500 million in medical billings a year, the storage problem involved making the right information available to doctors, fast. "Our goal is to use technology as a focal point for improving healthcare and reducing errors," says Mark Moroses, senior director of technology services for the hospital.
Before Maimonides' storage overhaul, medical records for, say, ambulatory care weren't in the same place as the records for generalized patient care information. In fact, much of the data wasn't even digital. The result: Wasted time at the point of care as medical personnel tried to pull records together, and increased risk of error. Worse, system uptime hovered between 90 and 95 percent, far below the "five 9's," or 99.999 percent uptime, considered acceptable in many businesses, and absolutely critical in a hospital.
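To put those uptime percentages in concrete terms, the short sketch below converts an availability figure into hours of downtime per year. It is only an illustration: the function name is hypothetical, and it assumes a 365-day year with downtime spread evenly, which the article does not state.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumes a non-leap year


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Return the hours of downtime per year implied by an uptime percentage."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


# Compare the hospital's old range (90-95 percent) with 99 percent and "five 9's".
for pct in (90.0, 95.0, 99.0, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% uptime -> {hours:.2f} hours ({hours * 60:.1f} minutes) down per year")
```

Under these assumptions, 90 percent uptime works out to about 876 hours of downtime a year (more than 36 days), while "five 9's" allows only a little over five minutes.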
To centralize its data, the hospital decided to uncouple the traditional links between applications, operating system and storage. And because Moroses didn't want the hospital to be limited to one vendor's storage devices, he looked at using inexpensive industry-standard hardware. The result: a storage area network that combines IBM SSA hardware with DataCore's SANSymphony. Maimonides brought all of its records for ambulatory care into one place, speeding the ability of medical personnel to search and update patient information.
The payoff to the hospital for the effort so far can be measured in one simple metric: uptime. "Since we've been up, there have been four hard drive failures," says Moroses, "but we had zero downtime because of it." In fact, uptime for the past year has been over 99 percent, "a pretty huge increase for a hospital."
Edward Jones, a $2 billion financial services subsidiary of The Jones Financial Companies, had a different storage headache. EJ's biggest concern is keeping always-on links to the firm's 8,500 branch offices in three countries, to be able to handle some 25 million to 30 million transactions a day. "Our product is information," says Rich Malone, Edward Jones' CIO. "For us, the information about our customers and our products is crucial to our business. The only profit center is our branches."
Edward Jones moves and stores a massive amount of data on its IBM systems (up to 120 terabytes, at last count), and that amount has been increasing at a rate of 50 percent to 75 percent a year for the past several years. Since 1987, the company has used a nationwide satellite network to move data between the company's branch offices and its main data center in St. Louis. But satellite links have one notorious Achilles' heel: clouds. Heavy storm activity can knock out a data connection, sometimes for hours at a time. "We had a lot of inclement weather that caused outages in the past," says Larry Steele, Edward Jones' chief technology officer. "If there was a storm in Paducah, Ky., we'd lose one office. If it was in St. Louis, we'd lose all the branches."
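The growth figures quoted above compound quickly. As a rough sketch, the snippet below projects the 120-terabyte total forward at the 50 percent and 75 percent annual rates. The three-year horizon and the function name are assumptions made for illustration; the article gives only the current total and the historical growth range.

```python
def project_capacity(start_tb: float, annual_growth: float, years: int) -> list[float]:
    """Return projected capacity (in TB) at the end of each year, compounding annually."""
    return [start_tb * (1 + annual_growth) ** year for year in range(1, years + 1)]


# 120 TB today, growing at the low and high ends of the quoted range.
for rate in (0.50, 0.75):
    projected = project_capacity(120, rate, 3)
    print(f"{rate:.0%} growth:", ", ".join(f"{tb:.0f} TB" for tb in projected))
```

If the historical rates held for three more years, the total would land somewhere between roughly 405 and 643 terabytes.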
Four years ago, Malone began designing a redundant data center for his company that could remove some of the vulnerabilities of the old architecture. In October 2001, the company opened the center, in Tempe, Ariz. The site, which triples the company's information storage capacity, is equipped with a complete backup computing system that maintains storage from a number of vendors, including EMC and Network Appliance. "Our goal here is to get a fully redundant environment, where each data center could run the whole system at once" if it had to, says Malone. Ever since the site in Tempe went live, his team reports, there hasn't been a single major outage affecting the branches. Whenever heavy storms blanket St. Louis, the less-tempestuous Tempe facility takes over the centralized computing chores without a hiccup. To allow the data centers to work in lockstep, the company now replicates 25 terabytes of data over redundant, high-speed OC3 communications lines between the two centers, and a good portion of that is done continuously. "We're working it down to as close to real-time as possible," reports Hayden.
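A back-of-the-envelope calculation shows why much of that replication has to run continuously rather than as an occasional bulk copy. The sketch below estimates how long 25 terabytes would take to cross a single OC-3 line. It assumes decimal terabytes, the nominal OC-3 line rate of 155.52 megabits per second with no protocol overhead, and one line in use; none of these details come from the article, and the function name is hypothetical.

```python
OC3_LINE_RATE_BPS = 155.52e6  # nominal OC-3 line rate in bits/second (usable payload is lower)
SECONDS_PER_DAY = 86_400


def transfer_days(terabytes: float, rate_bps: float = OC3_LINE_RATE_BPS) -> float:
    """Days needed to move `terabytes` (10**12 bytes each) at a sustained `rate_bps`."""
    bits = terabytes * 1e12 * 8
    return bits / rate_bps / SECONDS_PER_DAY


print(f"{transfer_days(25):.1f} days to copy 25 TB over one OC-3 line")
```

Under these idealized assumptions a full copy takes roughly two weeks, so keeping two data centers in lockstep depends on shipping changes as they happen.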
Centralization has also helped the Aviation Weather Center to solve its reliability problems. By installing a centralized storage server from Auspex, Wallace was able to use industry-standard disks that can be easily added as storage grows. The result: zero downtime and a flexible storage service that can scale as the tide of data rises.
What's In Store?
As companies face the need to store and move more and more data (AWC's Wallace, for example, predicts the weather center will have to handle some 100 gigabytes of data per day in a few years, compared with today's 13 gigs), some companies are moving to software that turns storage into a centrally managed resource.
Today's distributed storage creates inevitable business frictions. Workers who need data that's stored in separate systems are often frustrated in their efforts to gain rapid access to critical information, hampering business initiatives and slowing the pace of decision-making. Imagine a company trying to efficiently move products between dozens of warehouses rather than coordinating goods from a single location, and you'll see the challenge.
These new approaches, loosely grouped under the phrase Storage Area Management, or SAM, use software to stitch together disparate storage devices so they can be used like one big virtual pool. The goal: more flexible businesses unencumbered by balkanized data. Rather than forcing users to look for a spreadsheet "needle" in a series of server "haystacks," SAM is intended to provide flexible software services to help users and applications easily store and locate files anywhere in the network.
This approach radically increases flexibility, but it potentially creates new frictions as workgroups are required to toss their storage resources into the pool. It also means that IT needs to learn a new skillset, tracking and managing storage resources to create a rock-solid utility that users and applications can rely on without thinking.
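The pooling idea described above can be illustrated with a toy model. The sketch below is not how any SAM product works; the class names, device names and placement rule (put each file on the device with the most free space) are all hypothetical, chosen only to show several devices presented as one pool with a single catalog and a single view of free capacity.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """One physical storage device contributed to the pool (hypothetical model)."""
    name: str
    capacity_gb: int
    files: dict[str, int] = field(default_factory=dict)  # file name -> size in GB

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - sum(self.files.values())


class StoragePool:
    """Presents several devices as one pool with a single file catalog."""

    def __init__(self, devices: list[Device]) -> None:
        self.devices = devices
        self.catalog: dict[str, Device] = {}  # file name -> device holding it

    @property
    def free_gb(self) -> int:
        # One pool-wide figure instead of a separate number per device.
        return sum(device.free_gb for device in self.devices)

    def store(self, file_name: str, size_gb: int) -> str:
        if file_name in self.catalog:
            raise ValueError(f"{file_name} is already stored")
        # Simplistic placement rule: use the device with the most free space.
        target = max(self.devices, key=lambda device: device.free_gb)
        if target.free_gb < size_gb:
            raise ValueError(f"no single device can hold {file_name}")
        target.files[file_name] = size_gb
        self.catalog[file_name] = target
        return target.name

    def locate(self, file_name: str) -> str:
        # Users ask the pool, not individual servers, where a file lives.
        return self.catalog[file_name].name


pool = StoragePool([Device("workgroup-nas", 100), Device("datacenter-san", 500)])
pool.store("q3-forecast.xls", 2)
print(pool.locate("q3-forecast.xls"), pool.free_gb)
```

The point of the design is that callers only ever talk to `StoragePool`; which box actually holds a file becomes a detail that the management software, not the user, keeps track of.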
Need for Standards
Ultimately, though, what this approach really means is that IT needs to push heavily for support of standards. If IT departments are to tailor storage schemes to their companies' needs, they need industry standards to ensure interoperability. "Interoperability is a huge issue today," says Steve Kenniston, an analyst at Enterprise Storage Group, an industry consulting firm, because it requires users to understand what's happening in the various industry standards initiatives. "Companies are reluctant to buy from vendors who aren't participating in those standards committees. The customer today has all the power. The economic times are such that [users can say], 'I'll never buy your product, and I'm going to go down the street and buy whichever product is going to offer me the most flexibility,'" he says. Adds Bill Hayden, a data strategist at Edward Jones: "That's one of the things that we're trying to communicate to the industry: that we want interoperability, and we want it as soon as we can get it."
Agreement over standards won't happen any time soon, but storage is suddenly getting the attention it deserves. "Two or three years ago, you couldn't find a business card that said 'storage administrator' on it," says ESG's Kenniston. Today, it's commonplace. And the increasing mentality among those professionals is to make storage a ubiquitous offering. "We want disk storage to be out there as a service, without caring what the hardware is behind it. We only want one fuel gauge," says Edward Jones' Steele. How long will companies have to wait for that? Experts say it will be two to three years before the standards solidify and the software potentially makes storage a pleasure to manage.
In the meantime, don't worry too much about costs of the commodity components like disks and bandwidth, which should continue their rapid descent. "You can expect round brown memory to get cheaper," says Aberdeen's Tanner. "You also can expect bandwidth to get cheaper. But never lose sight of the fact that the thing that always means the most is intelligence." His words to the wise: "Put all the emphasis on the software, the management [and] the skill level of the people you use."
A well-managed storage architecture provides a solid infrastructure for critical business initiatives that depend on flexible, rapid access to transparently available data. The move to make storage a companywide utility requires a variety of new technologies to create a stable storage platform.
Effective Storage Allows the Business To
- Provide users with continuous uptime.
- Deliver data to users more rapidly, and in the formats they require.
- Make more flexible decisions about moving and merging workgroups.
- More readily establish new data-intensive business initiatives.
Benefits
- Gives the organization a rationale for centralizing and standardizing resources.
- Provides a competitive advantage to the organization by making it more agile.
- Saves money through better use of existing hardware, and by allowing companies to buy less expensive hardware.
Drawbacks
- Technology is in flux, with industrywide standards still emerging.
- Some technologies, such as Storage Area Networks, can be initially more expensive, especially for smaller companies.
- May create frictions between departments that are required to merge their storage resources.
enterprisestorageforum.com: EarthWeb site devoted to large-scale storage issues.
www.networkstorageu.com: Variety of online courses on enterprise storage (some vendor involvement).
snia.org: Web site of the vendor-sponsored Storage Network Industry Association.
networkbuyersguide.com/netstorguide: Series of online guides provided by Strategic Research Corp.
searchstorage.com: Storage-specific search site.
storagemagazine.com: Exclusive focus on storage management strategies.
EMC (emc.com): Top vendor of networked storage and storage management software, and an OEM to many other storage companies. Offerings include CLARiiON NAS, Celerra HighRoad and ESN Manager for SANs.
IBM (www.storage.ibm.com/snetwork): Number two in networked storage, third in storage management software. Broad product range includes NAS 100 through 300 line, SAN Switches and iSCSI NAS box.
Veritas (veritas.com): Second in storage management software. ServPoint software can be used with industry-standard hardware to create SAN and NAS devices.
Compaq (www.compaq.com/storage): Third in networked storage. Rechristened HP StorageWorks line includes NAS1000 departmental to NAS Executor E7000 enterprise unit, as well as a variety of SAN products.
Virtualization: The process of separating the formerly merged components of operating system, application processing and storage.
Network-Attached Storage (NAS): Allows storage devices to be connected directly to the network, to be shared by servers and users over an IP network.
Storage Area Network (SAN): A pool of storage devices managed in a separate network based on the Fibre Channel scheme.
Storage Area Management (SAM): An emerging set of software components that allow IT departments to manage all network storage as a pool of resources.