At Hogan & Hartson LLP, a Washington D.C.-based law firm with about 1,000 attorneys in roughly 20 offices around the globe, Bill Gregory recognized the need for better data quality when he took over as CIO in June 2003. His law firm had undergone rapid growth in a short period, and customer files and other data needed to be integrated and checked for accuracy. But getting lawyers and senior partners to support the initiative, and to agree to own and manage the data, was a challenge. Gregory says the biggest obstacle in rolling out his strategy was getting employees to "understand the implications of their data and visualize how the quality of their data affects business processes. That takes a lot of effort." Gartner's Friedman says this is a common problem. "It's a thorny issue because who's really responsible for quality of data? People abdicate responsibility," he notes. "Many people think data is IT's problem, and no one wants to step up and say it's their responsibility."
But analysts, vendors and CIOs agree: While IT's job is certainly to facilitate a good data-cleansing strategy, it's the business unit that has to own and manage the data. They're the ones who use the data, after all. "IT knows how the systems work, but the business side knows the method and the madness of its own processes," says AMR's Kirby. "The business must be the ultimate judge, but it needs help from IT on a systems level."
Patrick Wise, vice president of advanced technology for Landstar System Inc., a multimillion-dollar transportation company, says it wasn't difficult to convince business managers of this. "They were terribly excited about having a single view of the business, so they were very willing [to own the data]," he says. His finance department sees the value as well, Wise says, adding, "It's a big relief that we have this in place now that Sarbanes-Oxley is an issue."
To ensure that data quality remains high, analysts agree that CIOs should appoint a data steward: a liaison in charge of managing all the data, all the time. But they disagree on where the data steward should sit, and on how many of them should be employed. Friedman argues that business units should appoint many data stewards, each responsible for specific data sets. For example, a call-center manager might be tapped as the steward of all customer data in a particular region. That way, the function will be smaller and can be handled by an employee who's already close to the data. "You want to place the accountability as close to the user as possible," he says. Other analysts think companies should appoint an overall data czar who reports to the COO.
Data Quality Methodology
Improving data quality isn't just a desirable (and profitable) goal, it's a process, one that requires input and support from all areas of a business. Creating complementary and parallel processes not only saves time and gets better results (cleaner data), it fosters alignment between IT and the business units IT serves. But remember: The responsibility for data quality ultimately lies with the leaders of the business units, not the IT department.
Source: The Data Warehousing Institute
However the data-steward issue is solved, it's imperative for IT to sit down with members from each business unit in order to understand where the greatest data-quality problems are and decide upon standards for data handling. The goal here is to come to consensus on who owns which data, what types of information should be included in each record, how the data should be entered and how relationships between records should be defined.
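Standards like these can be made concrete as simple validation rules that run wherever data enters the system. The sketch below is illustrative only: the field names, the list of business owners, and the rules themselves are hypothetical placeholders for whatever a company's IT and business units actually agree on.

```python
# Hypothetical record standard agreed between IT and a business unit.
# Field names and owner labels are assumptions for illustration.
REQUIRED_FIELDS = {"customer_id", "name", "country", "owner_unit"}
ALLOWED_OWNERS = {"sales", "finance", "support"}

def validate_record(record: dict) -> list:
    """Return a list of standards violations for one customer record."""
    problems = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        problems.append("missing fields: %s" % sorted(missing))
    if record.get("owner_unit") not in ALLOWED_OWNERS:
        problems.append("no agreed business owner for this record")
    if not str(record.get("customer_id", "")).strip():
        problems.append("empty customer_id")
    return problems

record = {"customer_id": "C-1001", "name": "Acme Corp", "country": "US"}
print(validate_record(record))
```

The point of a check like this is less the code than the conversation it forces: every rule in it corresponds to a decision the business units must make about ownership and content before IT can enforce anything.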
At Emerson Process Management, a $3.2 billion operating unit of Emerson, Nancy Rybeck, the division's customer-data warehouse-strategy architect, was tasked with building a data warehouse after an earlier attempt had failed. The division, which produces valves, pressure devices and software for power plants, refineries and food and beverage manufacturers, has 14 subdivisions, more than 10,000 employees and over 100,000 customers around the world. While each division makes separate products, their client base often overlaps, and because each division owned its own data, there was no way to get an overall view of the company's business.
Three years ago, Rybeck began an effort to consolidate and clean the division's customer database. It was a significant challenge, she says. "Conceptually it doesn't sound that difficult, but you can't tell just by looking at the information how these pieces all match up."
To make it happen, Rybeck first spoke to all the business unit heads in order to get a better understanding of their processes, the kind of data they needed, and how that data could best be delivered. Then she created a relational data model to meet those parameters. Finally, Rybeck took in the data from the various units and began the long process of cleaning, consolidating and "de-duplicating," using software from Group 1 Software Inc., and appointing a data steward in each business unit to coordinate the effort. Today, the company is able to look at its customers from many angles. "We have a report that shows all the business we have done with a client, broken down by division. Then we can break it down by world area, then by country, then by site," she says.
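The core of "de-duplicating" is deciding when two differently entered records refer to the same customer. Commercial tools like the one Emerson used apply far richer matching logic, but the basic idea can be sketched with a crude normalization key (the sample names and suffix list below are assumptions for illustration):

```python
import re

def normalize(name: str) -> str:
    """Crude match key: lowercase, strip punctuation and a few
    common legal suffixes. Real matching software uses much richer rules."""
    key = re.sub(r"[^a-z0-9 ]", "", name.lower())
    key = re.sub(r"\b(inc|llc|llp|corp|co|ltd)\b", "", key)
    return " ".join(key.split())

def dedupe(names):
    """Group records that share a normalization key; each group of
    two or more is a candidate duplicate set for review or merging."""
    groups = {}
    for name in names:
        groups.setdefault(normalize(name), []).append(name)
    return groups

records = ["Emerson Process Mgmt", "EMERSON PROCESS MGMT, Inc.", "Acme Ltd"]
for key, dupes in dedupe(records).items():
    print(key, "->", dupes)
```

Why it's hard in practice, as Rybeck notes, is that no single key catches everything: two records may match on name but be different sites, or differ in name but be the same client, which is why candidate groups usually go to a human for the final call.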
Keeping the data clean is a constant effort. The system automatically processes 1.7 million addresses per month, some of which are reviewed by individuals. Software does much of the work, but the reviewing process obviously still requires a lot of manual labor. "Some data requires actual eyes," she says, especially when dealing with information about companies in foreign countries. And finding the resources to make that happen can be tough. "You can't just clean the data in one pass. It takes a manual review in some places, and it might take several passes through the software to get it straight," she says.
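A pipeline like the one Rybeck describes typically splits each batch into records the software is confident about and records routed to "actual eyes." The sketch below is a minimal illustration of that triage pattern; the cleaning rule and the confidence scores are placeholders, not how any particular product scores addresses.

```python
def clean_pass(address: str):
    """One automated cleaning pass; returns (cleaned, confidence).
    The normalization and the confidence heuristic are assumptions."""
    cleaned = " ".join(address.split()).title()
    # Pretend non-ASCII (e.g. foreign-country) addresses are harder to score,
    # echoing the point that foreign data more often needs human review.
    confidence = 0.5 if any(ord(c) > 127 for c in cleaned) else 0.9
    return cleaned, confidence

def process(addresses, threshold=0.8):
    """Split a batch into auto-accepted records and a manual-review queue."""
    auto_ok, needs_review = [], []
    for addr in addresses:
        cleaned, conf = clean_pass(addr)
        (auto_ok if conf >= threshold else needs_review).append(cleaned)
    return auto_ok, needs_review

auto_ok, needs_review = process(["12  main st", "1 rue de la Paix, Genève"])
print(auto_ok, needs_review)
```

Running records through several such passes, tightening the rules each time, mirrors Rybeck's observation that "you can't just clean the data in one pass."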
It's also important to decide when data gets cleaned, and how often. Some companies deploy cleansing software for one-time events, such as a marketing promotion, so data is cleaned only when a particular department needs the information. While analysts admit that many companies still approach data quality this way, nearly all warn that one-time cleaning is a waste of time. "Data quality degrades," says Friedman. "It has a half-life, like radioactive material, depending on the business activity. In certain situations quality will degrade faster than in others. So you can't cleanse once and stop, you have to do it on a constant basis."
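Taking Friedman's radioactive-decay analogy literally makes the case for continuous cleansing easy to quantify. The half-life figure below is an assumed example, not a measured one:

```python
def fraction_still_accurate(months_elapsed: float, half_life_months: float) -> float:
    """Friedman's analogy taken literally: the share of records still
    accurate after a given time, if accuracy halves every half-life."""
    return 0.5 ** (months_elapsed / half_life_months)

# Assumed example: customer contact data with an 18-month half-life.
print(round(fraction_still_accurate(18, 18), 2))  # 0.5 - half wrong after 18 months
print(round(fraction_still_accurate(36, 18), 2))  # 0.25 - a quarter right after 36
```

On those assumptions, a database cleaned once and then left alone is mostly wrong within a few years, which is exactly why one-time cleansing events don't hold their value.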