Database Management: Get to Know Your Customers Even Better

In an ideal world, every cataloger would have access to a state-of-the-art CRM system, including Web-enabled business intelligence, campaign management and customer touch-point capabilities. Every organization would enjoy the continuous, widespread internal dissemination of complete, accurate and compellingly-packaged data. All data miners and marketers, as well as all employees interacting with customers, would have instant access to all of the data required for them to excel at their jobs.

In such a world, smaller catalogers would operate with the same technological advantages as their larger competitors. Unfortunately, many catalogers do not have the budget to invest in cutting-edge CRM systems. Hence, choices have to be made. At such times, it is important to remember "Wheaton's Law":

Never scrimp on database content. Without best-practices content, sophisticated, data-driven CRM is impossible.

Characteristics of Best-Practices Content

Best-practices database content provides a consolidated view of all customers and inquirers across all channels, including catalog, e-commerce and "when applicable "retail. It is as robust as the underlying methods of data collection are capable of supporting.

The complete history of transactional detail and relationships must be captured, because high-quality content supports deep insight into the behavior patterns that form the foundation for data-driven decision-making. Everything within reason must be kept, even if its value is not immediately apparent.

Best-practices content includes the following four characteristics:

First, the purchase data "all orders and items "must be time-stamped and at the atomic-level. Robust purchase detail provides the necessary input for seminal data mining exercises such as product affinity analysis. Remember that you can always aggregate, but you can never disaggregate.

Within reason, data must not be archived or eliminated. For example, it is difficult to do a product affinity analysis if orders and items are rolled off the file every "say "36 months. Ideally, even ancient data will be retained. Unlike ten or twenty years ago, disk space is cheap and you never know when you might need the data.

The data "semantics" must be consistent and accurate. For example, merchandise must be easily identifiable over time, despite any changes that have taken place in naming conventions and/or the numbering of inventory. Consider how untenable analysis would be if the data semantics were so inconsistent that "say "item number 1956 referenced a type of necktie several years ago but umbrellas now.

Second, post-demand activity must be kept, such as returns, exchanges, allowances and cancels. These are essential for important exercises such as identifying the customers who will be less likely to make future purchases without remedial action. After all, customers who are disappointed with unavailable, ill-fitting or damaged merchandise probably will have a lower probability of ordering again.

Third, ship-to/bill-to (often, gift-giver/receiver) relationships must be maintained. These enable targeted promotions to extend the customer universe beyond those who placed the original order.

Fourth, promotion history across all available channels must be kept. This is necessary to rapidly and accurately create the past-point-in-time "views" required for most analytical projects (more on this later), including predictive models. One marquee, multi-billion dollar catalog/retailer has learned the hard way the importance of including promotion history. Although it spends seven figures a year on its CRM system, the underlying database does not contain promotion history. As a result, most data mining projects take a week longer than they should.

Multiple, Properly-Linked Levels

Individuals must be scrupulously de-duped and, for B-to-C, properly linked to households. For B-to-B, individuals must be linked to sites, and sites to companies. This is essential for the calculation of accurate performance metrics, including promotional financials.

Optionally, database linkages can be supplemented with third-party overlay data to create a complete view of individual customers and inquirers, households, sites and companies. For B-to-C, the identity of additional adults within customer and inquirer households can be appended, including descriptive demographics such as exact date of birth, age and gender. For B-to-B, additional individuals can be appended to sites, and additional sites to companies, including "firmographics" such as industry type and number of employees.

Rapid Recreation of Past-Point-In-Time Views

Best-practices marketing database content must support the ability to easily and rapidly recreate past-point-in-time customer and inquirer views ("time-0" or "freeze" files). These, in turn, form the basis for virtually all meaningful direct marketing-oriented analytics. For example, they allow the creation of the analysis and validation files required for predictive models. Likewise, they support the creation of the underlying data required for all cohort analysis; for example, long-term value and the monitoring of changes in customer inventories such as fluctuations in segment sizes over time.

Case Study

John Craig is Co-Founder and Principal of Windward Group (, which offers strategic brand, marketing and merchandising consulting to the direct marketing industry. However, he also spent over two decades on the client-side of the catalog industry, including executive management positions with substantial profit and loss responsibility.

At one of his employers, a multi-channel business-to-business marketer, Craig was a Wheaton Group client. Unfortunately, he did not have the budget to build a CRM system with every bell and whistle. So, Craig brainstormed with experts, asked plenty of questions, and arrived at an innovative outsourced/in-house hybrid solution:

"I outsourced the construction and maintenance of the database, and the ongoing house file processing, including promotional selections and matchback reports," says Craig. "No one in my IT department had the background to do this as well as I could get it done on the outside. Plus, even if I had the talent in-house, I knew that this processing would always vie with other internal work for priority, and that a sudden resignation would be disastrous."

"Besides, there was no way I was going to compromise on database content," continued Craig. "The project was a real challenge in terms of content. Our internal systems were not properly documented. We did not have a consistent and unique coding scheme for our thousands of SKU's. The way that our operational systems were engineered made it difficult to extract backorder, cancel and returns transactions. And, as a business-to-business marketer, the entity relationships and order tracking were extremely difficult to map."

As for the in-house portion of the solution, Craig took the $50,000 he had budgeted for an Associate Circulation Manager, and invested it in hiring a technical professional who could program in SQL and leverage inexpensive reporting tools such as Crystal Reports. "Out of every update, a copy of the database was transferred by Wheaton Group to the in-house technical professional. She would load it onto her PC and then run all of the necessary reports, ad hoc counts and queries. She manually pushed information throughout the entire organization on a regular basis. It was an unorthodox solution, but it worked. By the end of the first year, I was able to show a very impressive ROI to the company president."

Final Thoughts

When you do not have the budget to do everything you want, it is time for creative thinking. The ultimate solution will vary, depending on your specific circumstances and needs. You might decide that it is best to outsource, go in-house, or "divide and conquer" by arriving at some combination of the two. Regardless, as John Craig learned first-hand, great success is possible as long as you do not compromise on database content.


Case Study: Another Creative Solution

John Worsley, a former client, had a problem when he was Vice President of Database Marketing for a niche cataloger. "I didn't have a budget to build a standard database," says Worsley, "so I went 'skunk works' instead. I was willing to compromise in all areas but content."

Worsley evaluated his options and decided that the one thing he had in-house was intellectual capital. "I had years of hands-on experience with technical database and merge/purge issues, had done quite a bit of graduate work in statistics, and was proficient in SAS. Plus, I had a staff member with database experience. So, I taught him SAS and created a PC-based database, all programmed and managed in SAS."

"The database had top notch data content," says Worsley. "I had tens of millions of promotion history records going back five years, and pretty much a lifetime of orders and items. Plus, I captured returns, backorders, cancels, gross margins and loyalty club information."

Worsley has some caveats. "With an in-house solution, you have to think about what will happen if you get a resignation. Also, be wary of the 'database modules' that are available with the common fulfillment software packages. Often, they do not offer the essential individual-to-household linkages, summary data and promotion history."


Case Study: What Not to Do

Recently, I discussed a potential data mining project with a gift-oriented cataloger that, unfortunately, has been in decline. It soon became apparent that the company's database content would support neither the project nor any other form of meaningful data mining:

Data is archived after 36 months and is difficult to resurrect. Some portions of the database are maintained at the surname level and others at the individual level. For surname-level database records, only one individual's identity is retained. This means that if a husband orders the first time, and then the wife orders "say "five subsequent times, the database will reflect six orders from the husband. This is particularly problematic for a gift-oriented business. To complicate matters, the database does not track bill-to/ship-to linkages and the corresponding gift relationships that these imply, nor does it contain gender codes.

Often, the acquisition source is inaccurate, which renders problematic many worthwhile analyses such as long-term value. Also, SKU discipline does not exist, the Website does not allow key codes to be entered, and customer records do not reflect a significant portion of post-demand information such as returns.

Promotion history is essentially unusable because the database tracks massive amounts of "spurious" history; for example, "event occurrences" such as records that have been sent to the service bureau for NCOA. Also, there are significant problems with tying promotion history to specific names and addresses, and email promotions are not tracked.

Finally, on the Retail side, distance-to-store calculations are based on imprecise ZIP-to-ZIP Centroids. And, they reflect only the nearest store, not where the actual purchase activity occurred.