Asking the Right Questions in Data Analysis

Jim Wheaton

Co-Founder & Principal of Wheaton Group

Found in: Data Mining & Data Quality

Derived from an article that appeared in Chief Marketer

What is required for a company to do effective data analysis? Many would respond, "People with advanced degrees in statistics!" This is most assuredly a worthwhile characteristic. However, I would also add, "The ability to ask the right question!"

Asking the right question typically does not require an advanced degree in statistics. Conversely, having an advanced degree in statistics does not guarantee that the right question will be asked. I have seen too many advanced-degreed data mining professionals who have trouble asking the right question. Often, these individuals get so bollixed up in the numbers that it is difficult for them to think strategically.

An Example

I recently participated in a discussion group on the best ways to build a Retail attrition model. Specifically, the stated goal was to predict which customers are likely to defect, and when. It was clear that my fellow participants were both brainy and highly educated. For example, there were several references to dense academic papers on data mining. Nevertheless, until I raised the question, no one had asked, "Does it even make sense to try to build an attrition model in a Retail environment?"

There is no question that attrition models are appropriate for industries in which contractual relationships are the norm between a company and its customers. Financial Services, Publishing and Telecommunications immediately come to mind. For example, we know exactly when a customer cancels his or her credit card, magazine, or cell phone plan. Therefore, it makes good sense to build models to predict which customers will defect. Likewise, it might even be possible to predict when this will occur.

It is not so clear that attrition models make good sense for non-contractual verticals such as Retail, Catalog and E-Commerce, where the likelihood of being a continuing customer is probabilistic. In these sorts of industries, we typically never know for sure if a customer has defected, much less the exact moment at which the defection took place. And, even when we think we know for sure, the reality is not so cut-and-dried. For example:

Sometimes, a customer requests to never be contacted again. However, with the advent of the Web, the lack of future promotions does not necessarily mean that there will be no future purchases.

Sometimes, companies receive notification about the death of a customer. However, I have seen examples of responder records tagged by an overlay Deceased File. How could that be? The answer is that, in many instances, purchase decisions are made at the household rather than individual level.

So, what options are available to non-contractual industries where there is no way to know for sure if a customer has defected, much less when? Fortunately, many of us have been predicting attrition all along. It's just that we didn't realize it. I am referring to "implicit" attrition models that are a byproduct of trying to predict upcoming purchase activity; for example, near-term dollar volume. Specifically, as a given customer's point score and corresponding model-segment assignment decline over time, the likelihood that defection has taken place increases accordingly. (For companies that employ rules-driven segments such as RFM, the equivalent is being assigned to less favorable cells over time.)

It is important to note that, in non-contractual verticals, even customers who are assigned to the worst segments still retain some probability of making a future purchase. The same is true of Recency (the time since the last purchase), a popular single-variable proxy for defection. For example, I once had a client whose business dynamics were such that several reasonably sized subsets of customers with Recency as high as nine years could be consistently mailed at a profit.
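The arithmetic behind a finding like this is simple, and a short sketch makes it concrete. The Python below is only an illustration: the function names are hypothetical, and every figure in the usage note is made up for the example rather than taken from the client mentioned above. It assumes profit per mail piece is response rate times average order value times margin rate, minus the cost of the piece.

```python
def profit_per_piece(response_rate, avg_order_value, margin_rate, cost_per_piece):
    """Expected profit from mailing one customer in a segment.

    All inputs are assumptions supplied by the analyst; rates are
    fractions (0.02 means 2%), money is in the same currency throughout.
    """
    expected_margin = response_rate * avg_order_value * margin_rate
    return expected_margin - cost_per_piece


def breakeven_response_rate(avg_order_value, margin_rate, cost_per_piece):
    """Response rate at which a mailing exactly pays for itself."""
    return cost_per_piece / (avg_order_value * margin_rate)


def is_mailable(response_rate, avg_order_value, margin_rate, cost_per_piece):
    """A segment is worth mailing if its expected profit is positive."""
    return profit_per_piece(
        response_rate, avg_order_value, margin_rate, cost_per_piece
    ) > 0
```

With an illustrative $0.50 mail piece, a $100 average order and a 50% margin, the breakeven response rate is 1%. Any segment that responds above that rate is profitable to mail, no matter how old its Recency. That is why a poor segment assignment signals a lower purchase probability, not a confirmed defection.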

Four Rules of Thumb

The following are four rules of thumb for effectively dealing with attrition in non-contractual industries such as Retail, Catalog and E-Commerce:

First, do not even try to build an explicit attrition model. Instead, build a model to predict future purchase activity; that is, where the dependent variable ("target") is revenue, response, and the like.

Second, such a model will, by definition, also serve as an implicit attrition model. As a given customer's score and segment assignment degrade, his or her likelihood of having defected increases.

Third, in order to try to reduce attrition, develop business rules that are triggered by patterns of downward segment migration, and then run over-time tests to measure their effectiveness. For example, analysis might indicate that customers who first drop from Decile 1 to Decile 2, and then to Decile 3, are very likely to never make a subsequent purchase. In this way, analysis can establish, retrospectively, that attrition has almost certainly taken place. (But, again, we typically never know for sure.) Therefore, it might make sense to execute an anticipatory intervention strategy as soon as a customer makes that first drop, from Decile 1 to Decile 2.
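The migration rule in this example can be sketched in a few lines of Python. This is an illustration only. It assumes that each customer's decile is saved at every scoring run, oldest first, with Decile 1 as the best segment. The function names are hypothetical, and the actual trigger pattern should come from your own analysis and over-time tests.

```python
def collapse_repeats(decile_history):
    """Reduce a scoring history to its sequence of segment moves.

    [1, 1, 2, 2, 3] becomes [1, 2, 3]: repeated scorings in the same
    decile are dropped so only the migrations remain.
    """
    moves = []
    for decile in decile_history:
        if not moves or moves[-1] != decile:
            moves.append(decile)
    return moves


def followed_path(decile_history, path=(1, 2, 3)):
    """Retrospective check: did the customer migrate along `path`?"""
    moves = collapse_repeats(decile_history)
    n = len(path)
    return any(
        tuple(moves[i:i + n]) == tuple(path)
        for i in range(len(moves) - n + 1)
    )


def just_made_first_drop(decile_history, from_decile=1, to_decile=2):
    """Anticipatory trigger: the latest scoring run shows the first drop."""
    if len(decile_history) < 2:
        return False
    return decile_history[-2] == from_decile and decile_history[-1] == to_decile
```

Here `followed_path` supports the retrospective analysis (which migration patterns precede a lack of further purchases), while `just_made_first_drop` is the rule that would run at each scoring to select customers for the intervention.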

Fourth, be sure that the model segments (e.g., deciles) are predefined (i.e., "hard-coded"). Otherwise, the score ranges that define each segment will change every time the model is deployed, which will render problematic the business rules and corresponding intervention strategies that you have developed.
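One way to hard-code segments, sketched here under stated assumptions: derive the score cutoffs once from the model-build file, store them, and apply those same numbers at every later deployment. The cutoff values and function names below are hypothetical, and the sketch assumes that higher scores are better and that Decile 1 is the best segment.

```python
from bisect import bisect_right
from statistics import quantiles


def derive_cutoffs(build_scores):
    """Run ONCE, on the model-build file: nine ascending decile boundaries."""
    return quantiles(build_scores, n=10)


# Hypothetical cutoffs, saved at build time and never recomputed.
FROZEN_CUTOFFS = [10, 20, 30, 40, 50, 60, 70, 80, 90]


def assign_decile(score, cutoffs=FROZEN_CUTOFFS):
    """Return 1 (best) through 10 (worst) using the frozen cutoffs.

    A score equal to a cutoff falls into the better of the two deciles.
    """
    return 10 - bisect_right(cutoffs, score)
```

Because `assign_decile` never looks at the current file's score distribution, a score of 85 lands in the same decile in every deployment. Re-ranking each new file into fresh tenths would instead let the boundaries drift, so "Decile 2" would mean something different each time.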
