## Building Marketing Campaigns using Association Analysis

Association analysis, called Market Basket Analysis in the context of consumer product affinity, is an unsupervised algorithm, and it is easier to comprehend once you understand the supervised model. Suppose you are given the temperature and humidity at a particular location and are trying to predict the quantity of rain on that day. The inputs are standardized, and you record the inputs (temperature and humidity) and, of course, the rainfall. Given this data, called training data, you try to arrive at a formula like (0.007 * temperature + 0.03 * humidity) = rain in mm. The converse of this example, where you just have a mass of data with no known outcome and are trying to figure out its structure, is called unsupervised learning.
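
To make the supervised half of that comparison concrete, here is a minimal sketch of how such a formula could be recovered from training data. The readings are made up for illustration, the rainfall values are generated from the example formula above so that the fit can be checked, and `fit_two_inputs` is a hypothetical helper, not something from a library.

```python
# Made-up training data: (temperature, humidity) readings. Illustrative only.
inputs = [(20.0, 60.0), (25.0, 80.0), (30.0, 50.0), (15.0, 90.0)]

# Rainfall generated from the example formula, so we know what the fit should find.
rain_mm = [0.007 * t + 0.03 * h for t, h in inputs]


def fit_two_inputs(inputs, targets):
    """Least-squares fit of target ~ a*x1 + b*x2 (no intercept).

    Solves the 2x2 normal equations with Cramer's rule.
    """
    s11 = sum(x1 * x1 for x1, _ in inputs)
    s12 = sum(x1 * x2 for x1, x2 in inputs)
    s22 = sum(x2 * x2 for _, x2 in inputs)
    s1y = sum(x1 * y for (x1, _), y in zip(inputs, targets))
    s2y = sum(x2 * y for (_, x2), y in zip(inputs, targets))
    det = s11 * s22 - s12 * s12
    a = (s1y * s22 - s12 * s2y) / det
    b = (s11 * s2y - s12 * s1y) / det
    return a, b


a, b = fit_two_inputs(inputs, rain_mm)
print(f"rain_mm ~ {a:.3f} * temp + {b:.3f} * humidity")
```

The point is only that a supervised model has a known answer (the rainfall) to learn from; association analysis has no such column.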

Association is one such algorithm: it identifies patterns, or data items, that frequently occur together. The famous story around the beer → diaper correlation, i.e. *“Friday afternoons, young American males who buy diapers (nappies) also have a predisposition to buy beer”*, is a classic example of the output of an Association Analysis. Of all the algorithms we have put up so far, this is conceptually the easiest to understand. Consider a set of consumer purchase transactions. For simplicity, let’s assume that a customer never purchases more than 3 items in one shot.

A single glance tells you that 3 out of the 5 transactions have BEEF & CHEESE occurring together. You don’t really need a complex mining algorithm to see this: a simple cross tab / Cartesian query will give you results like the table below, where each cell shows the number of times the two products occur together divided by the total number of transactions.
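
The cross-tab idea can be sketched in a few lines of Python. The five baskets here are made up for illustration (they are not the original transaction table); they are chosen only so that BEEF & CHEESE occur together in 3 of the 5 and no basket has more than 3 items.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, at most 3 items each. Not the article's actual data.
transactions = [
    {"BEEF", "CHEESE", "BAKED BREAD"},
    {"BEEF", "CHEESE"},
    {"BEEF", "CHEESE", "MILK"},
    {"BAKED BREAD", "MILK"},
    {"BEEF", "BAKED BREAD"},
]

# Count every unordered pair of products that appears in the same basket.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Each cross-tab cell: co-occurrences / total number of transactions.
total = len(transactions)
for (left, right), n in pair_counts.most_common():
    print(f"{left} & {right}: {n} / {total}")
```

Sorting each basket before pairing means a pair is always stored in the same order, so BEEF & CHEESE and CHEESE & BEEF land in the same cell.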

But Association analysis does much more. It digs deep across all this data and gives you the following metrics for all combinations of products. Let’s take the example of beef and cheese occurring together.

Confidence, if high enough, can be used for placement strategies, since it indicates that people buy both products together rather than just Beef. Use this intelligence to show these products together or, if you are a brick-and-mortar establishment, physically collocate them so that they are within the customer’s eye range in the aisle. Lift indicates the strength of the rule: the greater the value, the stronger the rule. Well, enough math for the time being, I guess.
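
For readers who want a little more math after all, here is a rough sketch of how these metrics are commonly computed for the rule BEEF → CHEESE. It assumes the usual textbook definitions (support = share of baskets containing the items, confidence = support of both / support of the left-hand side, lift = confidence / support of the right-hand side). The baskets are made up for illustration, and `support`, `confidence` and `lift` are hypothetical helper names.

```python
from fractions import Fraction

# Hypothetical baskets, not the article's actual data.
transactions = [
    {"BEEF", "CHEESE", "BAKED BREAD"},
    {"BEEF", "CHEESE"},
    {"BEEF", "CHEESE", "MILK"},
    {"BAKED BREAD", "MILK"},
    {"BEEF", "BAKED BREAD"},
]


def support(items):
    """Share of baskets that contain every product in `items` (a set)."""
    hits = sum(1 for basket in transactions if items <= basket)
    return Fraction(hits, len(transactions))


def confidence(lhs, rhs):
    """Of the baskets containing `lhs`, the share that also contain `rhs`."""
    return support(lhs | rhs) / support(lhs)


def lift(lhs, rhs):
    """Confidence relative to how often `rhs` is bought anyway."""
    return confidence(lhs, rhs) / support(rhs)


beef, cheese = {"BEEF"}, {"CHEESE"}
print("support   :", support(beef | cheese))    # 3 of 5 baskets
print("confidence:", confidence(beef, cheese))  # 3 of the 4 BEEF baskets
print("lift      :", lift(beef, cheese))        # above 1: bought together more than chance
```

Exact fractions are used instead of floats purely so the numbers are easy to check by hand against the baskets.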

We will take off from the previous story of a focused, narrow, data-driven consumer persona. Highly Engaged, Valuable Middle Female Customer from Tennessee is where our previous adventures got us. Now we have Jane, who fits these exact criteria but has only just signed up, and Elena, who has been around for some time. We do not have enough information about Jane’s behavioral patterns, but we do know something about Elena. Several use cases can be of interest to the marketer if (a big if, mind you) you know how she could behave at a given point in time. For instance, some nice questions that we can ask are:

- What is the product that Jane could most probably buy? What is the first campaign I can send to her that will be relevant to her?
- Elena has been around for some time and has been unresponsive to campaigns. Is there a risk that she will move away?
- Do consumers like Elena & Jane show patterns of attrition, e.g. typically 3 months after hitting a net value of, say, $10,000?

Well, we didn’t go through the whole bit for nothing. Let’s put the algorithm to work and see what it spits out. The size of each circle indicates confidence, and the color intensity indicates lift (the more intense the color, the higher the lift).

If you analyze the data, reading it as LHS (vertical) → RHS (horizontal), the first entry would be the propensity of buying BAKED BREAD & CHEESE together. If you look at the big, intense circles, which is really where your sweet spot is, BEEF & CHEESE wins! So you have your first campaign for Jane (for BEEF) and a cross-sell option after that as well: CHEESE.
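
The "look for the big, intense circle" step can also be done without the chart. Below is a minimal sketch that, for a product a customer has already bought, picks the cross-sell candidate with the highest lift (ties broken by confidence). It reuses the same made-up baskets and textbook metric definitions as assumptions; `best_cross_sell` is a hypothetical helper, and a real mining tool would typically also apply minimum support and confidence thresholds.

```python
from fractions import Fraction

# Hypothetical baskets, not the article's actual data.
transactions = [
    {"BEEF", "CHEESE", "BAKED BREAD"},
    {"BEEF", "CHEESE"},
    {"BEEF", "CHEESE", "MILK"},
    {"BAKED BREAD", "MILK"},
    {"BEEF", "BAKED BREAD"},
]


def support(items):
    """Share of baskets that contain every product in `items` (a set)."""
    hits = sum(1 for basket in transactions if items <= basket)
    return Fraction(hits, len(transactions))


def best_cross_sell(bought):
    """Return (product, confidence, lift) for the strongest rule bought -> product."""
    candidates = set().union(*transactions) - {bought}
    scored = []
    for product in candidates:
        both = support({bought, product})
        if both == 0:
            continue  # never bought together, so there is no rule to score
        conf = both / support({bought})
        scored.append((conf / support({product}), conf, product))
    if not scored:
        return None
    rule_lift, conf, product = max(scored)  # highest lift, then confidence
    return product, conf, rule_lift


print(best_cross_sell("BEEF"))
```

On these made-up baskets the strongest rule starting from BEEF points to CHEESE, which mirrors the cross-sell conclusion above.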

Cheers and happy Mining!!!