12.07.06
What the heck is “data quality management”? (first in a series)
Once in a while, though very rarely, I visit the press releases section of our official website. I say this not because it is not a source of tremendous information (note to readers: the same group is tasked with monitoring this blog...), but because GXS is really good about internal communication, so I seldom need the press releases to know what is going on. But sometimes it is really interesting to read the external communications, for instance . . .
“The consequences of poor data quality are well-known and pervasive in the retail and CPG industry. Data quality is critical, but is only one rung on the ladder of complete end-to-end data synchronisation,”
Okay.
Feeling decidedly uninformed, I got in contact with Bryan Larkin, a member of the GXS marketing team who is our resident expert in this area, and asked him about this.
“Oh yeah, if anything [Jon Mier, CTO of UDEX] is understating it. In some instances up to 70% of the product data inside an organization is bad data…”
Basically, this means that in some cases well over half of the information about products is wrong, even in the companies that actually make the products! One can only speculate about the unfortunate outfits that receive such data through the GDSN (Global Data Synchronization Network; more on this in a future post).
Interestingly, according to Bryan, it was the initial effort to synchronize product data between companies that actually highlighted how terrible the quality of internal data was. Imagine a grocery store that has set up shelves for shampoo bottles that are supposed to be 10 inches tall, and then receives 12-inch bottles (that is an example of “bad” data). Or envision having to deal with three different terms for the same color (black, blk, bl), which is an example of a “semantic” problem. This can seem like trivia until you start to figure out how many products and suppliers there are (a good-sized general merchandiser can have more than 10,000 suppliers...).
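To make the two kinds of problem concrete, here is a minimal sketch in Python. Everything in it is my own illustration, not something taken from GDSN or any real product: the synonym table, the function names, and the quarter-inch tolerance are all assumptions.

```python
# Hypothetical synonym table: maps the variant spellings from the example
# above to one canonical term. A real table would be far larger.
COLOR_SYNONYMS = {"black": "black", "blk": "black", "bl": "black"}


def normalize_color(raw):
    """Return the canonical color for a raw term, or None if it is unknown."""
    return COLOR_SYNONYMS.get(raw.strip().lower())


def height_mismatch(recorded_in, measured_in, tolerance_in=0.25):
    """True when the measured height differs from the recorded height
    by more than the tolerance (all values in inches; the default
    tolerance is an arbitrary illustrative choice)."""
    return abs(recorded_in - measured_in) > tolerance_in


# The "semantic" problem: three spellings, one meaning.
colors = [normalize_color(c) for c in ("black", "BLK", " bl ")]

# The "bad data" problem: the record says 10 inches, the bottle is 12.
shampoo_is_wrong = height_mismatch(recorded_in=10, measured_in=12)
```

Returning None for an unknown term, rather than guessing, leaves the decision to a person. That matters because an abbreviation like "bl" could just as easily have meant blue at another supplier.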
It turns out that the retail shelf is not the only place this causes trouble: the same information is used for logistics. The case dimensions (and weights) are used to plan distribution, like the packing of a truck. If your weight goes down (new plastic packaging, instead of metal), but you don’t update the product data, you are still paying for the higher weight. If you have the wrong dimensions for cases or pallets, it may prevent trucks from being packed as planned.
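The stale-weight case is simple arithmetic, sketched below in Python. The function name, the per-pound rate, and every number are hypothetical, and the sketch assumes the carrier bills purely on recorded weight.

```python
def freight_overpayment(recorded_weight_lb, actual_weight_lb,
                        cases_shipped, rate_per_lb):
    """Extra freight paid because the recorded case weight is higher
    than the actual one. Assumes billing is purely by recorded weight.
    Returns 0 if the record is not heavier than reality."""
    excess_lb = max(recorded_weight_lb - actual_weight_lb, 0.0)
    return excess_lb * cases_shipped * rate_per_lb


# Hypothetical numbers: the record still says 24 lb per case, the new
# plastic packaging brought it down to 18 lb, and 10,000 cases shipped
# at an assumed 5 cents per pound.
overpaid = freight_overpayment(24.0, 18.0, 10_000, 0.05)
```

Six pounds per case looks like nothing until it is multiplied by the number of cases and the number of years nobody updates the record.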
Worse still, you have to fix all that every time it happens. But the problems were so spread out across companies, functions, and geography that most companies just learned to live with it, until the arrival of GDSN turned it into a customer service issue...

omarf said,
January 19, 2007 at 5:01 am
It would be interesting to know how a logistics solution or ERP solution could accommodate the dimension and weight data in the orders that are being generated. If you are sending your goods through a carrier that charges by either weight or volume, they had better know the overall weight or volume of the goods transported. There should be some way to get all that data into the shipping notice and/or bill of lading.
radkoj said,
January 27, 2007 at 3:16 am
Omar, very good point. Oddly, logistics charges are a major anecdote in data synch, but kind of the opposite case. It turns out that logistics providers have an “idea” (a recorded piece of data) of what things weigh, but are really only likely to worry when a shipment weighs more than that, not less.
As I mentioned in the post, when a company switched from glass to plastic and the weight went down, they kept paying to ship the heavier product for years until they discovered the error. The first requirement for sending a weight to a partner is to agree internally on the weight, and that is what product information management is all about!