The Fallacy of a Single Version of the Truth About a Customer

If your business tracks detailed information on customers, then we have great news for you: Substantial increases in revenue and profit will result from the thoughtful, sustained application of data mining within a comprehensive CRM framework.

Such a framework was outlined by Tom Collins and DMA Hall of Fame member Stan Rapp years before the invention of the term "CRM," in their seminal 1987 book, "MaxiMarketing." For example, they say on page 211 that:

Your customer database will be your own private marketplace where you can promote additional sales, cross-promote, explore new channels of distribution, test new products, add new revenue streams, start new ventures, and build lifetime customer loyalty, and your competitors can never tell exactly what you are doing until after the fact.

Of course, the foundation for any data mining and CRM program is an accurate and robust centralized data repository, complete with a well-executed Customer Data Integration ("CDI") strategy. But does the necessity of CDI imply the need for the latest industry buzz-phrase: a single version of the truth about a customer?

Is It Even Possible?

Is a single version of the truth about a customer even possible? Many would argue that today's technology renders this goal attainable. For example, one CDI trend of the past several years is the use of compiled demographic databases that go back one or more decades. The idea is to track individuals as they move from location to location, often across multiple towns, cities and states.

The capabilities of this technology are impressive. Consider how one such database service recently tracked one of this article's co-authors, Jim Wheaton, back to 1983. It was able to link:

  • His current Chapel Hill, NC residence back to 1997, despite a ZIP Code and street name change.
  • Two Colorado moves between 1991 and 1997.
  • Four Chicago moves between 1984 and 1991, including a household merge as a result of marriage.
  • A Stamford, CT rental apartment from 1981 to 1983.

Likewise, Jim's wife was tracked through multiple moves and states, along with a maiden-to-married-surname change, back to a mid-1970s rental apartment during her time as an MBA student.

However, despite the illustrated virtuosity, the unfortunate reality is that there is no such thing in the CDI business as a single version of the truth about a customer. Or, more precisely, there is no single version of the truth that can be acquired in a financially affordable way. The following two records illustrate why:

Record 1:
James Wheaton
151 Thurton Drive
New Canaan, CT 06840

Record 2:
James Wheaton
151 Thurton Drive
New Canaan, CT 06840

It appears obvious that these represent a single individual at the same address. Surprisingly, they are two different people, a father and his son, with their respective suffixes (Jr. and III) deleted. It was a constant source of confusion for one of the authors while growing up, as is the case with any son who is named after his father.

If a Single Version of the Truth is Not Possible, Then What to Do?

Short of telephoning every single customer, there is no way to be 100 percent correct when it comes to CDI. Instead, CDI professionals are presented with a spectrum of statistical probability. In other words, they "play the odds" as the habitués of Las Vegas would say. They do this by developing programmed logic to make the best possible guesses.
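One way to picture this kind of programmed logic is a similarity score compared against a cutoff. The sketch below is purely illustrative: the field weights, the 0.85 cutoff and the function names are our assumptions, not the workings of any particular CDI product.

```python
from difflib import SequenceMatcher

# Hypothetical field weights and cutoff, chosen only for illustration.
WEIGHTS = {"name": 0.4, "street": 0.4, "zip": 0.2}
MATCH_CUTOFF = 0.85


def similarity(a, b):
    """Return a 0-to-1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_score(rec_a, rec_b):
    """Weighted average of the per-field similarities."""
    return sum(w * similarity(rec_a[f], rec_b[f]) for f, w in WEIGHTS.items())


def same_customer(rec_a, rec_b):
    """A best guess only: True when the score reaches the cutoff."""
    return match_score(rec_a, rec_b) >= MATCH_CUTOFF


# The two records shown earlier, with the Jr. and III suffixes deleted.
father = {"name": "James Wheaton", "street": "151 Thurton Drive", "zip": "06840"}
son = {"name": "James Wheaton", "street": "151 Thurton Drive", "zip": "06840"}

print(same_customer(father, son))
```

Because the two records are identical once the suffixes are gone, they earn a perfect score, and no cutoff can tell father from son. Any scoring scheme of this sort can only play the odds.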

The result of this uncertainty is that all forms of customer consolidation logic produce two types of errors. "Over-kill" occurs when records that reflect two separate customers are consolidated. Conversely, "under-kill" takes place when records that reflect the same customer are not consolidated.
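The two error types can be tallied whenever the true identities are known for a sample of records, for instance after a manual audit. The following is a minimal sketch under that assumption; the function name and the four sample records are hypothetical.

```python
from itertools import combinations


def count_errors(true_ids, merged_ids):
    """Count over-kill and under-kill across all pairs of records.

    true_ids[i]   -- the person record i really belongs to (known only for audited samples)
    merged_ids[i] -- the consolidated customer that the match logic assigned to record i
    """
    over_kill = under_kill = 0
    for i, j in combinations(range(len(true_ids)), 2):
        same_truth = true_ids[i] == true_ids[j]
        same_merge = merged_ids[i] == merged_ids[j]
        if same_merge and not same_truth:
            over_kill += 1   # two different customers were consolidated
        elif same_truth and not same_merge:
            under_kill += 1  # one customer was left split across records
    return over_kill, under_kill


# Hypothetical audit sample: a father and son merged, one woman split in two.
true_ids = ["jr", "iii", "ann", "ann"]
merged_ids = ["c1", "c1", "c2", "c3"]
errors = count_errors(true_ids, merged_ids)
print(errors)  # (1, 1): one over-kill pair and one under-kill pair
```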

There are costs to both over-kill and under-kill:

The Cost of Over-Kill

For over-kill, the consolidation of data representing two customers into a single record results in the mirage of one very desirable customer. Direct marketers experience this phenomenon whenever two actual "single-buyers" are combined to create a supposed "multi-buyer." This, in turn, reduces the accuracy and effectiveness of subsequent database marketing efforts. For example:

Not only are long-term value estimates overstated, but statistics-based predictive models are negatively impacted when "pseudo multi-buyers" purchase less frequently in the future than expected. Likewise, there is the cost of customer dissatisfaction that occurs when the data corresponding to the incorrectly-merged customer drives targeted promotions. Finally, there is the opportunity cost of not soliciting an incremental sale from the inadvertently-merged customer.

The Cost of Under-Kill

Conversely, for under-kill, the splitting of one customer's data into two records results in the mirage of two customers, each of whom appears less desirable than would be the case with proper consolidation. These "fractional customers" will display a lower-than-actual long-term value. They will receive the unnecessary marketing promotions that result from targeting an assumed extra customer who does not, in fact, exist. And they will likely take offense when it becomes apparent that the company is unaware of knowable facts.

Cost Asymmetry and Its Ramifications

Consider all of these ramifications of over-kill and under-kill. Clearly, their corresponding costs are very different. The methodology for quantifying such costs is beyond the scope of this article. Nevertheless, the goal should be to identify the set of consolidation rules that minimizes the total cost of all over-kill and under-kill errors. For example:

Expensive marketing collateral tilts the cost ratio towards over-kill compared with inexpensive email blasts, because every duplicate piece mailed as a result of under-kill is costly. Likewise, the cost of over-kill looms more ominously in customer service than in marketing because of privacy issues. This may be accentuated when customer service functions are moved off-shore, and especially to countries where less rigorous privacy laws are in effect. Finally, promotions to high-value customer segments, regardless of marketing channel, might tilt the cost ratio towards under-kill. This is because of the steep opportunity cost of not contacting legitimate, highly-desirable customers, which is an unfortunate side-effect of over-kill.
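To make the trade-off concrete, suppose an audit has estimated the number of over-kill and under-kill errors at several match cutoffs. The sketch below picks the cutoff with the lowest total cost for a given pair of unit costs. Every number in it is invented for illustration, and the function name is our own. As noted above, the real work of quantifying such costs is outside the scope of this article.

```python
# Hypothetical audit results: (match cutoff, over-kill count, under-kill count).
# A looser (lower) cutoff merges more readily, so over-kill rises and under-kill falls.
AUDIT = [
    (0.70, 120, 10),
    (0.80, 60, 25),
    (0.90, 20, 70),
    (0.95, 5, 150),
]


def best_cutoff(cost_over, cost_under):
    """Return the cutoff whose combined over-kill and under-kill cost is lowest."""
    def total_cost(row):
        _, over, under = row
        return over * cost_over + under * cost_under
    return min(AUDIT, key=total_cost)[0]


# Expensive catalog: each duplicate mailed (under-kill) is assumed to be costly.
catalog_cutoff = best_cutoff(cost_over=1.0, cost_under=5.0)
# Inexpensive email: both errors are assumed to cost about the same.
email_cutoff = best_cutoff(cost_over=1.0, cost_under=1.0)
# Customer service: each wrongly merged account (over-kill) is assumed to be costly.
service_cutoff = best_cutoff(cost_over=10.0, cost_under=1.0)

print(catalog_cutoff, email_cutoff, service_cutoff)  # 0.7 0.8 0.95
```

The same audit data yields three different "best" rule sets, one for each business decision.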

Final Comments

As counter-intuitive as it may seem, different versions of the truth, when tailored to different business decisions, generally are beneficial to a company's overall fortunes. Fortunately, it is not prohibitively expensive to craft a flexible CDI solution. In fact, direct marketers have been doing just this for the past quarter century. For example:

Catalogers such as L.L.Bean routinely mail millions of catalogs at a time. Often, these mailings are composed of prospect "rental lists" and customer "segments." It is not unusual for rental lists and customer segments to number in the hundreds. Typically, a large percentage of the corresponding records will appear several times across these lists and segments. These "duplicates" have to be identified and "purged" so that multiple catalogs do not get mailed into the same household.

It is industry-standard for the software that performs these sorts of "merge/purges" to operate with several simultaneous over-kill and under-kill parameters. For example:

Most catalogers maintain a small suppression file of high-risk individuals who should not be contacted under any circumstances. Some are fraud artists. Others have threatened to initiate legal action if they are ever mailed again. Typically, these suppression files are run with parameters that result in very aggressive over-kill compared with the other areas of the merge/purge. This is because it is far cheaper to fail to contact a handful of legitimate, "over-killed" prospects or customers than to mail even one individual on the suppression file.
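A rough sketch of running one merge/purge with different parameters for different files follows. The two cutoffs, the function names and the sample records are assumptions made for illustration. Commercial merge/purge software is considerably more elaborate.

```python
from difflib import SequenceMatcher

# Hypothetical cutoffs: the lower the cutoff, the more readily records are matched.
CUTOFFS = {"suppression": 0.75, "rental": 0.90}


def similarity(a, b):
    """Return a 0-to-1 similarity ratio between two name-and-address strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def matches_any(record, reference, cutoff):
    """True if the record resembles any reference entry at the given cutoff."""
    return any(similarity(record, ref) >= cutoff for ref in reference)


suppression_file = ["john smith 12 oak st"]
already_selected = ["john smith 12 oak st"]
incoming = "jon smyth 12 oak street"

# Against the suppression file the loose cutoff applies, so the record is dropped.
suppressed = matches_any(incoming, suppression_file, CUTOFFS["suppression"])
# Against ordinary lists the stricter cutoff applies, so it is not called a duplicate.
duplicate = matches_any(incoming, already_selected, CUTOFFS["rental"])

print(suppressed, duplicate)  # True False
```

The same pair of records is treated as a match in one part of the run and as two different people in another. That is deliberate: it reflects two different costs of being wrong.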

In short, direct marketing firms learned long ago that flexible CDI, implemented with care, is much less expensive than single-version-of-the-truth CDI. Now, it is time for the rest of industry to embrace this reality.