The Hype and Reality of Business Intelligence Software, Part 1

You've heard the pitch: for you, the seeker of Customer Relationship Management ("CRM") heights, there is business intelligence software that will take you there. With a click of a mouse, you will count and profile your customers, select your names for a marketing promotion, and then analyze your results on the back-end. Wow!

But after the initial thrill is over, you might be disappointed for one or more reasons:

  • The software's data manipulation capability is not powerful enough.
  • Data-driven, statistics-based predictive modeling is only superficially supported.
  • No training is provided for navigation in the dangerous waters of data analysis.
  • It does not address the issue of data integrity.
  • Simple queries are, well, simple, but complicated ones are nearly impossible.
  • Even if you manage to express your complicated questions in the language of the software, to get answers in a reasonable amount of time may require a significant investment in hardware.
  • There is no framework to translate answers, particularly the more complex ones such as customer behavior models, into the optimization of business decisions.

Of course, nothing can be perfect. However, to minimize disappointment, remember to ask eight fundamental questions:

Question #1: Is Important Functionality Missing?

Although automated counts and selects are very important, they are insufficient for sophisticated CRM. That is because CRM is first and foremost the process of using customer history to anticipate ("predict") future productivity under different scenarios. Defined as such, the practice of CRM must center on developing an understanding of the relationships between what was known about a given customer at one point in time and that customer's subsequent behavior.

Data-driven, statistics-based models to predict customer behavior are core to CRM. Therefore, counting and profiling qualify only as the first stage of CRM: getting familiar with a business and its data. However, marketers new to CRM are likely to stop at counts and profiles. Does this mean they do not use models to predict customer behavior? Of course they do! However, their models are based on judgment, and not methodically driven by data using the engine of rigorous quantitative analysis.

Judgmental models reflect one's intuitions, the subconscious sum of one's experiences. There is nothing wrong with that. But when there is hard data to lend or deny credence to a hunch, it makes sense to use it. Software that doesn't go much beyond counts and profiles doesn't unlock the full potential of CRM.

Question #2: Can Software Really Build Models?

Many business intelligence software packages claim to perform modeling. Supposedly, you feed them your promotional results, crank away with logistic regression, neural networks, or some other quantitative wizardry, and out pops the result. But all they are offering you is model calibration. For parametric models, this is referred to as "estimating the parameters." For non-parametric models, it is "discovering the structure."

But, who decides what variables to toss into the magician's hat? You do! But how? If you are using intuition alone, then you are not being sufficiently data-driven, and you are not practicing sophisticated CRM.
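
To make the distinction concrete, here is a minimal sketch, using the open-source scikit-learn library and entirely hypothetical file and column names, of what the software actually automates. The single fit call is the calibration; the choice of what goes into the predictor list happened before any code ran, and no package makes it for you:

    # Minimal calibration sketch; the file name and column names are hypothetical.
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    history = pd.read_csv("promotion_history.csv")

    # The analyst, not the software, decides which variables to try.
    predictors = ["months_since_last_order", "orders_last_24_months", "avg_order_size"]
    target = "responded_to_promotion"  # 1 = responded, 0 = did not

    # "Estimating the parameters" -- this is all the modeling the package does.
    model = LogisticRegression().fit(history[predictors], history[target])
    print(dict(zip(predictors, model.coef_[0])))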

The process of predictive modeling is first and foremost the process of deciding, by business and data analysis, what data to use and how to transform it to illuminate patterns. For example (a brief data-preparation sketch follows this list):

  • Defining the modeling subset. What time frames and business segments are relevant? Last fall? This spring? All spring seasons? Last five years? General media? Specialty media?
  • Defining the dependent ("target") variables. What should you try to predict and how should it be measured? Response Rate? Average Purchase Size? Demand per media? Per marketing dollar? Long-term value? Gross or net sales? And, within a multi-channel marketing environment that contains a mix of direct response and brand-building efforts, how is the incremental effect of each effort to be determined?
  • Defining the independent ("predictor") variables. What variables should be tried as independent variables? Here, the list of possibilities is endless, considering the potential combinations of variables including differences, sums, ratios and percentages. It is in the creation and testing of predictors that interesting data analysis truly takes place.
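
As a rough illustration of the work involved, the following sketch, again with hypothetical column names, restricts the modeling subset, defines a target, and derives a few candidate predictors as ratios and differences of raw fields. The specific transformations are placeholders; finding the ones that actually illuminate patterns is the real analysis:

    # Data-preparation sketch; the DataFrame, columns and cutoffs are hypothetical.
    import pandas as pd

    customers = pd.read_csv("customer_summary.csv")

    # Modeling subset: e.g., the last three fall seasons only.
    subset = customers[customers["season"].isin(["fall_2002", "fall_2003", "fall_2004"])]

    # Dependent ("target") variable: e.g., demand in the year after the snapshot.
    subset = subset.assign(target=subset["demand_next_12m"])

    # Independent ("predictor") variables: differences, ratios and percentages.
    orders = subset["order_count"].clip(lower=1)
    subset = subset.assign(
        spend_per_order=subset["total_spend"] / orders,
        order_trend=subset["orders_last_12m"] - subset["orders_prior_12m"],
        pct_specialty_media=100 * subset["specialty_media_orders"] / orders,
    )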

Once you are past the first round of these questions, you will be ready to calibrate the model. Then you have to validate its performance. The results might suggest fine-tuning and send you back to data analysis in search of new ideas.
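
Here is a sketch of that calibrate-then-validate loop, assuming the same hypothetical columns as above: the most recent season is held out of calibration, and disappointing holdout performance is the signal to return to data analysis:

    # Validation sketch; the file, columns and holdout rule are hypothetical.
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    history = pd.read_csv("promotion_history.csv")
    predictors = ["months_since_last_order", "orders_last_24_months", "avg_order_size"]
    target = "responded_to_promotion"

    calibration = history[history["season"] != "fall_2004"]  # earlier seasons
    holdout = history[history["season"] == "fall_2004"]      # never used to fit

    model = LogisticRegression().fit(calibration[predictors], calibration[target])

    # Judge the model only on data it has never seen.
    scores = model.predict_proba(holdout[predictors])[:, 1]
    print("holdout AUC:", round(roc_auc_score(holdout[target], scores), 3))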

Model calibration does not develop any new concepts, nor does it provide theories about what is driving your customers. That is your job! Software that restricts you to variables you had the foresight to record prior to a marketing effort hinders the practice of CRM.

Also, after you have constructed a predictive model, or a network of several, you still have to incorporate it into an overall decision model. Hopefully, an outline of the decision model (how the results of the predictive model(s) will be translated into decisions) was informing your efforts all along. But, in the end, you have to put together the nuts and bolts in such a way that your CRM and marketing resources are optimally allocated. (See "CRM Growth Simulator: Extending the Data Warehouse," Jim Wheaton, Business Intelligence Network, April 21, 2005, www.b-eye-network.com/view/788.)

Question #3: Will Standard Reports Be Enough?

Perhaps the majority of a CRM professional's daily needs can be satisfied with a stack of standard reports and "fill-in-the-blanks" queries. But just as surely, the remainder, reports produced ad hoc in search of penetrating insight, is what will provide your company with a true competitive edge.

Standard reports help you monitor your business through the lens of your existing predictive models, as well as the robustness of the models themselves. They serve to trigger new questions. The ad hoc, never-anticipated queries help answer those questions and move both the models and your business forward. Therefore, while prepackaged report templates are often useful, good CRM software should excel at ad hoc reporting of any depth and complexity. If the answer can be found in the data, the tool should give you the power to formulate the question.

Question #4: How Do You Avoid Discovering Useless Things?

The barrier to building robust, actionable, customer behavior models cannot be overcome by software alone. Data analysis expertise is equally essential.

The data, not the software, interacting with an analyst's logical faculties and imagination, drive the course of analysis. Decisions based on that analysis, once set in motion, may have a profound, irreversible and long-lasting impact on your business.

An entire body of research exists on human fallibility as it pertains to data, probabilities and statistics. (An excellent resource is "Judgment under Uncertainty: Heuristics and Biases," by Daniel Kahneman, Paul Slovic and Amos Tversky.) Training and experience in data analysis and interpretation are all that stand between you and disaster. For example, the predictive modeling process is full of potential pitfalls such as:

Example #1: In the exploratory phase of modeling, there is a danger of selecting predictor variables that are contaminated by the "target," that is, by whatever it is you are trying to predict.

Suppose you have a hunch that customers with children are your best buyers. You decide to add a question to your order entry script and a new field, "presence of children," to your marketing database. The field is initialized to "no." After a while, you start analyzing whether those who answered "yes" bought more frequently. And sure enough, "yes" customers are more frequent buyers than "no" customers.

Did you just find an important key to your business? Before you start paying for demographic overlays and over-circulating households with children, consider that those who buy more frequently were more likely to have been asked and to have provided an answer. Therefore, being a frequent buyer makes a "yes" more likely, and not the other way around.

The specific lesson is that a single code should not mean two different things. In this example, "unknown" should be a separate code. Moreover, a new segmentation variable must be evaluated while holding constant other variables that are already known to be good segmenters; for example, rules-driven customer segments or your current scoring model.
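
A small sketch of both lessons, with hypothetical column names and assuming never-asked customers are stored as missing rather than defaulted to "no": keep "unknown" as its own code, and read the new variable within bands of a segmenter you already trust, such as purchase frequency:

    # Sketch of evaluating a new field without contaminating it with the target;
    # the file, columns and frequency bands are hypothetical.
    import pandas as pd

    db = pd.read_csv("marketing_database.csv")

    # A single code should not mean two things: never-asked is not "no".
    db["children"] = db["children_response"].fillna("unknown")  # yes / no / unknown

    # Hold a known segmenter constant before crediting the new variable.
    db["frequency_band"] = pd.cut(
        db["orders_last_24_months"],
        bins=[0, 1, 3, 6, 999],
        labels=["1", "2-3", "4-6", "7+"],
    )
    print(db.groupby(["frequency_band", "children"])["future_demand"].mean())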

A more general lesson is that, without being keenly aware of how your business is reflected in the imperfect mirror of your data, and how to evaluate the incremental value of a new idea, it is easy to "discover" useless things.

Part 2 will provide two more examples as well as a discussion of the following questions:

  • Who is verifying what and how?
  • Who is in charge of data integrity?
  • How does the interface deal with query complexity?
  • Was the testing ground the same as your battle ground?