Customer Acquisition and Data Mining

Excerpted from the Book Building Data Mining Applications for CRM
by Alex Berson, Stephen Smith, and Kurt Thearling


For most businesses, the primary means of growth involves the acquisition of new customers. This could involve finding customers who previously were not aware of your product, were not candidates for purchasing your product (for example, baby diapers for new parents), or customers who in the past have bought from your competitors. Some of these customers might have been your customers previously, which could be an advantage (more data might be available about them) or a disadvantage (they might have switched as a result of poor service). In any case, data mining can often help segment these prospective customers and increase the response rates that an acquisition marketing campaign can achieve.

The traditional approach to customer acquisition involved a marketing manager developing a combination of mass marketing (magazine advertisements, billboards, etc.) and direct marketing (telemarketing, mail, etc.) campaigns based on their knowledge of the particular customer base that was being targeted. In the case of a marketing campaign trying to influence new parents to purchase a particular brand of diapers, the mass marketing advertisements might be focused in parenting magazines (naturally). The ads could also be placed in more mainstream publications whose readership demographics (age, marital status, gender, etc.) were similar to those of new parents.

In the case of traditional direct marketing, customer acquisition is relatively similar to mass marketing. A marketing manager selects the demographics that they are interested in (which could very well be the same characteristics used for mass market advertising), and then works with a data vendor (sometimes known as a service bureau) to obtain lists of customers who meet those characteristics. The service bureaus have large databases containing millions of prospective customers that can be segmented based on specific demographic criteria (age, gender, interest in particular subjects, etc.). To prepare for the "diapers" direct mail campaign, the marketing manager might request a list of prospects from a service bureau. This list could contain people, aged 18 to 30, who have recently purchased a baby stroller or crib (this information might be collected from people who have returned warranty cards for strollers or cribs). The service bureau will then provide the marketer with a computer file containing the names and addresses for these customers so that the diaper company can contact these customers with their marketing message.

It should be noted that because of the number of possible customer characteristics, the concept of "similar demographics" has traditionally been an art rather than a science. There usually are not hard-and-fast rules about whether two groups of customers share the same characteristics. In the end, much of the segmentation that took place in traditional direct marketing involved hunches on the part of the marketing professional. In the case of 18-to-30 year old purchasers of baby strollers, the hunch might be that people who purchase a stroller in this age group are probably making the purchase before the arrival of their first child (because strollers are saved and used for additional children). They also haven't yet decided which brand of diapers to use. Seasoned veterans of the marketing game know their customers well and are often quite successful in making these kinds of decisions.

How Data Mining and Statistical Modeling Change Things

Although a marketer with a wealth of experience can often choose relevant demographic selection criteria, the process becomes more difficult as the amount of data increases. The complexities of the patterns increase, both with the number of customers being considered and the increasing detail for each customer. The past few years have seen tremendous growth in consumer databases, so the job of segmenting prospective customers is becoming overwhelming.

Data mining can help this process, but it is by no means a solution to all of the problems associated with customer acquisition. The marketer will need to combine the potential customer list that data mining generates with offers that people are interested in. Deciding what is an interesting offer is where the art of marketing comes in.

Defining Some Key Acquisition Concepts

Before the process of customer acquisition begins, it is important to think about the goals of the marketing campaign. In most situations, the goal of an acquisition marketing campaign is to turn a group of potential customers into actual customers of your product or service. This is where things can get a bit fuzzy. There are usually many kinds of customers, and it can often take a significant amount of time before someone becomes a valuable customer. When the results of an acquisition campaign are evaluated, there are often different kinds of responses that need to be considered.

The responses that come in as a result of a marketing campaign are called "response behaviors." The use of the word "behavior" is important because the way in which different people respond to a particular marketing message can vary. How a customer behaves as a result of the campaign needs to take into consideration this variation. A response behavior defines a distinct kind of customer action and categorizes the different possibilities so that they can be further analyzed and reported on.

Binary response behaviors are the simplest kind of response. With a binary response behavior, the customer response is either a yes or no. If someone is sent a catalog, did they buy something from the catalog or not? At the highest level, this is often the kind of response that is talked about. Binary response behaviors do not convey any subtle distinctions between customer actions, and these distinctions are not always necessary for effective marketing campaigns.

Beyond binary response behaviors are categorical response behaviors. As you would expect, a categorical response behavior allows for multiple behaviors to be defined. The rules that define the behaviors are arbitrary and are based on the kind of business you are involved in. Going back to the example of sending out catalogs, one response behavior might be defined to match if the customer purchased women's clothing from the catalog, whereas a different behavior might match when the customer purchased men's clothing. These behaviors can be refined as far as deemed necessary (for example, "purchased men's red polo shirt").

It should be noted that it is possible for different response behaviors to overlap. A behavior might be defined for customers that purchased over $100 from the catalog. This could overlap with the "purchased men's clothing" behavior if the clothing that was purchased cost more than $100. Overlap can also be triggered if the customer purchases more than one item (both men's and women's shirts, for example) as a result of a single offer. Although the use of overlapping behaviors can tend to complicate analysis and reporting, the use of overlapping categorical response behaviors tends to be richer and therefore will provide a better understanding of your customers in the future.
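As a rough illustration of overlapping categorical behaviors, the following Python sketch tags a single order with every behavior it matches. The behavior names, the category labels, and the LineItem structure are assumptions made for this example; only the $100 threshold and the men's/women's clothing split come from the discussion above.

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    category: str  # hypothetical label, e.g. "mens_clothing"
    price: float


def response_behaviors(items):
    """Return every behavior matched by one customer's order (may overlap)."""
    behaviors = set()
    if not items:
        return behaviors
    categories = {item.category for item in items}
    if "mens_clothing" in categories:
        behaviors.add("purchased_mens_clothing")
    if "womens_clothing" in categories:
        behaviors.add("purchased_womens_clothing")
    # The spending behavior overlaps with the clothing behaviors above.
    if sum(item.price for item in items) > 100:
        behaviors.add("purchased_over_100")
    return behaviors


order = [LineItem("mens_clothing", 80.0), LineItem("womens_clothing", 45.0)]
matched = response_behaviors(order)
print(sorted(matched))
```

Because a set is returned, one order can carry several behaviors at once, which is the richer (but harder to report on) situation described above.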

Figure 10-1: Example response analysis broken down by behavior

There are usually several different kinds of positive response behaviors that can be associated with an acquisition marketing campaign. (This assumes that the goal of the campaign is to increase customer purchases, as opposed to an informational marketing campaign in which customers are simply told of your company's existence.) Some of the general categories of response behaviors (Figure 10-1) are the following:

There are also typically two kinds of negative responses. The first is a non-response. This is not to be confused with a definite refusal of your offer. For example, if you contacted the customer via direct mail, there may be any number of reasons why there was no response (wrong address, offer misplaced, etc.). Other customer contact channels (outbound telemarketing, email, etc.) can also result in ambiguous non-responses. The fact that there was no response does not necessarily mean that the offer was rejected. As a result, the way you interpret a non-response as part of additional data analysis will need to be thought out (more on this later).

A rejection (also known simply as a "no") by the prospective customer is the other kind of negative response. Depending on the offer and the contact channel, you can often determine exactly whether or not the customer is interested in the offer (for example, an offer made via outbound telemarketing might result in a definitive "no, I'm not interested" response). Although it probably does not seem useful, the definitive "no" response is often as valuable as the positive response when it comes to further analysis of customer interests.

It All Begins with the Data

One of the differences between customer acquisition and most other marketing applications of data mining revolves around the data that is used to build predictive models. The amount of information that you have about people that you do not yet have a relationship with is much more limited than the information you have about your existing customers. In some cases, the data might be limited to their address and/or phone number. The key to this process is finding a relationship between the information that you do have and the behaviors you want to model.

Most acquisition marketing campaigns begin with the prospect list. A prospect list is simply a list of customers that have been selected because they are likely to be interested in your products or services. There are numerous companies around the world that will sell lists of customers, often with a particular focus (for example, new parents, retired people, new car purchasers, etc.).

Sometimes, it is necessary to add additional information to a prospect list by overlaying data from other sources. For example, consider a prospect list containing only names and addresses. In terms of a potential data mining analysis, the information contained in the prospect list is very weak. There might be some patterns in the city, state, or ZIP code fields, but they would be limited in their predictive power. To augment the data, information about customers on the prospect list could be matched with external data. One simple overlay involves combining the customer's ZIP code with U.S. census data about average income, average age, and so on. This can be done manually or, as is often the case with overlays, your list provider can take care of this automatically.
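A manual ZIP code overlay amounts to a lookup keyed on the ZIP code. The sketch below assumes the prospect list and the census summary are already loaded as Python dictionaries; the field names are hypothetical and the numbers are invented placeholders, not real census figures.

```python
# Hypothetical prospect list containing only names and ZIP codes.
prospects = [
    {"name": "A. Smith", "zip": "02139"},
    {"name": "B. Jones", "zip": "99999"},
]

# Placeholder census summary keyed by ZIP code (values are made up).
census_by_zip = {
    "02139": {"avg_income": 75000, "avg_age": 31.0},
}


def overlay(prospects, census_by_zip):
    """Return copies of the prospect rows with census fields appended.

    Prospects whose ZIP code has no census entry get None for the
    overlay fields rather than being dropped from the list.
    """
    enriched = []
    for prospect in prospects:
        stats = census_by_zip.get(prospect["zip"], {})
        row = dict(prospect)
        row["avg_income"] = stats.get("avg_income")
        row["avg_age"] = stats.get("avg_age")
        enriched.append(row)
    return enriched


enriched = overlay(prospects, census_by_zip)
```

Keeping unmatched prospects with empty overlay fields, instead of discarding them, is a design choice made here so that the size of the prospect list does not change silently.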

More complicated overlays are also possible. Customers can be matched against purchase, response, and other detailed data that the data vendors collect and refine. This data comes from a variety of sources including retailers, state and local governments, and the customers themselves. If you are mailing out a car accessories catalog, it might be useful to overlay information (make, model, year) about any known cars that people on the prospect list might have registered with their department of motor vehicles.

Test Campaigns

Once you have a list of prospect customers, there is still some work that needs to be done before you can create predictive models for customer acquisition. Unless you have data available from previous acquisition campaigns, you will need to send out a test campaign in order to collect data for analysis. Besides the customers you have selected for your prospect list, it is important to include some other customers in the campaign, so that the data is as rich as possible for future analysis. For example, assume that your prospect list (that you purchased from a list broker) was composed of men over age 30 who recently purchased a new car. If you were to market to these prospective customers and then analyze the results, any patterns found by data mining would be limited to sub-segments of the group of men over 30 who bought a new car. What about women or people under age 30? By not including these people in your test campaign, it will be difficult to expand future campaigns to include segments of the population that are not in your initial prospect list. The solution is to include a small random selection of customers whose demographics differ from the initial prospect list. This random selection should constitute only a small percentage of the overall marketing campaign, but it will provide valuable information for data mining. You will need to work with your data vendor in order to add a random sample to the prospect list.
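The idea of mixing a small random selection into the test campaign can be sketched as follows. The function name, the 10 percent default, and the "prospect"/"random" tags are assumptions chosen for illustration; in practice your data vendor would supply the wider population to sample from.

```python
import random


def build_test_campaign(prospect_list, wider_population,
                        random_fraction=0.1, seed=0):
    """Combine the prospect list with a small random sample of outsiders.

    random_fraction is expressed relative to the size of the prospect
    list (an assumption of this sketch). Each recipient is tagged so the
    two groups can be told apart when responses are analyzed.
    """
    rng = random.Random(seed)
    prospect_ids = set(prospect_list)
    outside = [c for c in wider_population if c not in prospect_ids]
    n_random = min(len(outside), round(len(prospect_list) * random_fraction))
    extra = rng.sample(outside, n_random)
    tagged = [(c, "prospect") for c in prospect_list]
    tagged += [(c, "random") for c in extra]
    return tagged


# Customer IDs 0-99 are the purchased list; 0-999 is the wider population.
campaign = build_test_campaign(list(range(100)), list(range(1000)))
```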

More sophisticated techniques than random selection do exist, such as those found in statistical experiment design and multi-variable testing (MVT). Deciding when and how to implement these approaches is beyond the scope of this book, but there are numerous resources in the statistical literature that can provide more information.

Although this circular process (customer interaction → data collection → data mining → customer interaction) exists in almost every application of data mining to marketing, there is more room for refinement in customer acquisition campaigns. Not only do the customers that are included in the campaigns change over time, but the data itself can also change. Additional overlay information can be included in the analysis when it becomes available. Also, the use of random selection in the test campaigns allows for new segments of people to be added to your customer pool.

Evaluating Test Campaign Responses

Once you have started your test campaign, the job of collecting and categorizing the response behaviors begins. Immediately after the campaign offers go out, you need to track responses. The nature of the response process is such that responses tend to trickle in over time, which means that the campaign can go on forever. In most real-world situations, though, there is a threshold after which you no longer look for responses. At that time, any customers on the prospect list that have not responded are deemed "non-responses." Before the threshold, customers who have not responded are in a state of limbo, somewhere between a response and a non-response.
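The threshold logic can be expressed as a small function. The 60-day window and the three status labels are assumptions made for this sketch; the appropriate window depends on your channel and your business.

```python
from datetime import date, timedelta


def response_status(sent_on, responded_on, today, window_days=60):
    """Classify one prospect as 'response', 'non-response', or 'pending'.

    window_days is a hypothetical threshold; responses that arrive after
    the cutoff are treated the same as no response at all.
    """
    cutoff = sent_on + timedelta(days=window_days)
    if responded_on is not None and responded_on <= cutoff:
        return "response"
    if today > cutoff:
        return "non-response"
    return "pending"  # still in limbo before the threshold


sent = date(2024, 1, 1)
status_early = response_status(sent, None, date(2024, 1, 20))
status_late = response_status(sent, None, date(2024, 4, 1))
status_ok = response_status(sent, date(2024, 1, 15), date(2024, 4, 1))
```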

Building Data Mining Models Using Response Behaviors

With the test campaign response data in hand, the actual mining of customer response behaviors can begin. The first part of this process requires you to choose which behaviors you are interested in predicting, and at what level of granularity. The level at which the predictive models work should reflect the kinds of offers that you can make, not the kinds of responses that you can track. It might be useful (for reporting purposes) to track catalog clothing purchases down to the level of color and size. If all catalogs are the same, however, it really doesn't matter what the specifics of a customer purchase are for the data mining analysis. In this case (all catalogs are the same), binary response prediction is the way to go. If separate men's and women's catalogs are available, analyzing response behaviors at the gender level would be appropriate. In either case, it is a straightforward process to turn the lower-level categorical behaviors into a set of responses at the desired level of granularity. If there are overlapping response behaviors, the duplicates should be removed prior to mining.
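Rolling tracked behaviors up to the level of the offers you can make might look like the sketch below. The behavior names and the mapping are hypothetical; the point is only that several low-level behaviors collapse into one higher-level response, with duplicates removed along the way.

```python
# Hypothetical mapping from low-level tracked behaviors to catalog level.
CATALOG_LEVEL = {
    "purchased_mens_red_polo": "mens_catalog",
    "purchased_mens_jeans": "mens_catalog",
    "purchased_womens_scarf": "womens_catalog",
}


def roll_up(behaviors, mapping=CATALOG_LEVEL):
    """Map behaviors to the coarser level; the set drops duplicates."""
    return sorted({mapping[b] for b in behaviors if b in mapping})


def to_binary(behaviors):
    """Collapse any positive behavior into a single yes/no response."""
    return 1 if behaviors else 0


tracked = [
    "purchased_mens_red_polo",
    "purchased_mens_jeans",
    "purchased_womens_scarf",
]
rolled = roll_up(tracked)
binary = to_binary(tracked)
```

Here two men's purchases become a single men's catalog response, so the customer is not counted twice at the level being modeled.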

In some circumstances, predicting individual response behaviors might be an appropriate course of action. With the movement toward one-to-one customer marketing, the idea of catalogs that are custom-produced for each customer is moving closer to reality. Existing channels such as the Internet or outbound telemarketing also allow you to be more specific in the ways you target the exact wants and needs of your prospective customers. A significant drawback of the modeling of individual response behaviors is that the analytical processing power required can grow dramatically because the data mining process needs to be carried out multiple times, once for each response behavior that you are interested in.

How you handle negative responses also needs to be thought out prior to the data analysis phase. As discussed previously, there are two kinds of negative responses: rejections and non-responses. Rejections, by their nature, correspond to specific records in the database that indicate the negative customer response. Non-responses, on the other hand, typically do not represent records in the database. Non-responses usually correspond to the absence of a response behavior record in the database for customers who received the offer.

There are two ways in which to handle non-responses. The most common way is to translate all non-responses into rejections, either explicitly (by creating rejection records for the non-responding customers) or implicitly (usually a function of the data mining software used). This approach will create a data set comprised of all customers who have received offers, with each customer's response being positive (inquiry or purchase) or negative (rejections and non-responses).

The second approach is to leave non-responses out of the analysis data set. This approach is not typically used because it throws away so much data, but it might make sense if the number of actual rejections is large (relative to the number of non-responses); experience has shown that non-responses do not necessarily correspond to a rejection of your product or services offering.
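Both ways of handling non-responses can be sketched with one function and a flag. The outcome labels and the dictionary layout are assumptions for illustration; a non-response is represented, as described above, by the absence of any response record for a customer who received the offer.

```python
POSITIVE = ("inquiry", "purchase")


def build_training_rows(mailed, responses, nonresponse_as_rejection=True):
    """Return (customer, target) pairs with target 1=positive, 0=negative.

    mailed: every customer who received the offer.
    responses: customer -> outcome; customers with no entry are
    non-responses. With the flag off, non-responses are left out.
    """
    rows = []
    for customer in mailed:
        outcome = responses.get(customer)
        if outcome is None:
            if not nonresponse_as_rejection:
                continue  # second approach: drop non-responses
            outcome = "rejection"  # first approach: explicit rejection
        rows.append((customer, 1 if outcome in POSITIVE else 0))
    return rows


mailed = ["c1", "c2", "c3", "c4"]
responses = {"c1": "purchase", "c2": "rejection", "c3": "inquiry"}
all_rows = build_training_rows(mailed, responses)
strict_rows = build_training_rows(mailed, responses,
                                  nonresponse_as_rejection=False)
```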

Once the data has been prepared, the actual data mining can be performed. The target variable that the data mining software will predict is the response behavior type at the level you have chosen (binary or categorical). Because some data mining applications cannot predict non-binary variables, some finessing of the data will be required if you are modeling categorical responses using non-categorical software. The inputs to the data mining system are the input variables and all of the demographic characteristics that you might have available, especially any overlay data that you combined with your prospect list.
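One common way to finesse categorical responses for software that only predicts binary targets is to build a separate yes/no target for each category (often called one-vs-rest). This is offered as one possible approach, not necessarily the one any particular tool expects; the labels below are hypothetical.

```python
def one_vs_rest_targets(labels):
    """Turn one categorical target into one binary target per category."""
    categories = sorted(set(labels))
    return {
        category: [1 if label == category else 0 for label in labels]
        for category in categories
    }


# Hypothetical categorical responses for five customers.
labels = ["mens", "none", "womens", "mens", "none"]
targets = one_vs_rest_targets(labels)
```

Each binary target then requires its own modeling run, which is where the additional processing cost of predicting many behaviors comes from.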

In the end, a model (or models, if you are predicting multiple categorical response behaviors) will be produced that will predict the response behaviors that you are interested in. The models can then be used to score lists of prospect customers in order to select only those who are likely to respond to your offer. Depending on how the data vendors you work with operate, you might be able to provide them with the model, and have them send you only the best prospects. In the situation in which you are purchasing overlay data in order to aid in the selection of prospects, the output of the modeling process should be used to determine whether all of the overlay data is necessary. If a model does not use some of the overlay variables, you might want to save some money and leave out these unused variables the next time you purchase a prospect list.
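Scoring a prospect list and spotting unused overlay variables can be sketched with a stand-in model. The logistic form, the weights, the intercept, and the 0.3 cutoff below are all invented for illustration; a real model would come out of your data mining software.

```python
import math

# Hypothetical fitted model: a weight of 0.0 means the variable is unused.
WEIGHTS = {"avg_income": 0.00002, "owns_new_car": 1.2, "avg_age": 0.0}
INTERCEPT = -3.0


def score(prospect):
    """Return a predicted response probability between 0 and 1."""
    z = INTERCEPT + sum(w * prospect.get(name, 0.0)
                        for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def select_prospects(prospects, threshold):
    """Keep only prospects whose score meets the threshold."""
    return [p["name"] for p in prospects if score(p) >= threshold]


prospects = [
    {"name": "A", "avg_income": 80000, "owns_new_car": 1, "avg_age": 34},
    {"name": "B", "avg_income": 30000, "owns_new_car": 0, "avg_age": 52},
]
selected = select_prospects(prospects, threshold=0.3)

# Overlay variables the model ignores are candidates to stop purchasing.
unused = [name for name, w in WEIGHTS.items() if w == 0.0]
```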
