The long-term goal of any business is to grow, improve, and ultimately, increase profits. To achieve this lofty goal, managers should take full advantage of their existing data resources and base their decisions not only on experience, but also on insights gleaned from reliable data. For example, they may have an intuition that the customer most likely to purchase a given product or service fits a certain profile, or they may have heard through other sources that a certain type of customer is more likely to discontinue service. The attributes of this type of customer would be considered potential indicators. However, the attributes that the manager considers could be very narrow, perhaps only including the most basic indicators, such as age or gender. Other information might be more powerful in explaining which customers are more likely to convert, churn, upgrade, and so on. Finding the most valuable indicators is not only helpful for basic business intelligence applications, but is also a key step in the advanced predictive analytics process.
Unfortunately, for many companies, these indicators reside across different, siloed databases, which makes analysis difficult. But if the data is successfully and accurately linked together, we can begin to take a more comprehensive look at customer behavior. Looking at odds ratios in relation to a particular target of interest can allow us to gain insights across a wide array of indicators. While interpretation and understanding of statistical or predictive models isn’t always simple or straightforward, the ability to interpret odds and odds ratios is a key step in being able to better understand the results of logistic regression output.
Example One: Calculating the Odds of Survival
To illustrate how an odds ratio is calculated, Dr. Jeff Knisley of East Tennessee State University gives a video example using data on mountain climbers who descended from Mount Everest between 1978 and 1999. The counts are summarized in the following table:
| | Survived | Did not survive | Total climbers | Odds of survival |
|---|---|---|---|---|
| Had supplemental oxygen | 1,045 | 88 | 1,133 | 11.875 |
| Did not have supplemental oxygen | 32 | 8 | 40 | 4 |
| Total | 1,077 | 96 | 1,173 | Odds ratio: 11.875 / 4 = 2.97 |
In this example, survival is the target (i.e., dependent) indicator, and having supplemental oxygen is an independent indicator. The data shows that 1,133 climbers had supplemental oxygen, of whom 1,045 survived and 88 did not. This translates to odds of survival of 1,045/88 = 11.875. Only 40 climbers did not have supplemental oxygen, of whom 32 survived and 8 did not. This translates to odds of survival of 32/8 = 4. The odds ratio is simply the ratio of these two odds: the odds of survival given supplemental oxygen divided by the odds of survival given no supplemental oxygen. In this case the odds ratio is 11.875/4 = 2.97, which suggests that the odds of survival are about 3 times as high with supplemental oxygen as without it. Three main takeaways for the analysis of an odds ratio are as follows:
- Odds ratios greater than 1 mean that the indicator is associated with higher odds of the target outcome (e.g., survival, conversion, or churn).
- An odds ratio of 1 means that the indicator is not associated with higher or lower odds of the target outcome. This would be the case if, instead of 1,045 climbers surviving with oxygen, only 352 had survived while 88 still did not, because the odds with oxygen would then be 352/88 = 4, the same as without oxygen.
- Odds ratios less than 1 mean that the indicator is associated with lower odds of the target outcome.
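The arithmetic behind these takeaways can be sketched in a few lines of Python. This is only an illustration using the Everest counts from the table; the function names `odds` and `odds_ratio` are made up for this sketch and do not come from any library.

```python
def odds(events: int, non_events: int) -> float:
    """Odds of an outcome: count with the outcome divided by count without it."""
    return events / non_events


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 table.

    a, b: outcome / no outcome when the indicator is present
    c, d: outcome / no outcome when the indicator is absent
    """
    return odds(a, b) / odds(c, d)


# Everest counts: survived / did not survive, with and without oxygen.
with_oxygen = odds(1045, 88)      # 11.875
without_oxygen = odds(32, 8)      # 4.0
everest_or = odds_ratio(1045, 88, 32, 8)
print(round(everest_or, 2))       # 2.97

# Hypothetical "no association" case from the second bullet:
# 352 survivors with oxygen and the same 88 non-survivors.
no_association = odds_ratio(352, 88, 32, 8)
print(no_association)             # 1.0
```

Swapping the two rows inverts the ratio (about 0.34 here), which corresponds to the "less than 1" case: lacking oxygen is associated with lower odds of survival.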
Note: Additional calculations can show whether an odds ratio is statistically significant and what its confidence interval is.
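One common way to do this, which the original example does not spell out, is a Wald confidence interval on the log of the odds ratio. The sketch below assumes that method. The function name `odds_ratio_ci` is hypothetical, and the normal approximation it relies on can be rough when a cell count is small, as with the 8 climbers here.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(a: int, b: int, c: int, d: int, level: float = 0.95):
    """Odds ratio with a Wald confidence interval (assumed method, not from the source).

    a, b: outcome / no outcome with the indicator
    c, d: outcome / no outcome without the indicator
    """
    ratio = (a / b) / (c / d)
    # Standard error of log(odds ratio) for a 2x2 table.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = math.exp(math.log(ratio) - z * se)
    upper = math.exp(math.log(ratio) + z * se)
    return ratio, lower, upper


everest_or, lower, upper = odds_ratio_ci(1045, 88, 32, 8)
print(f"OR = {everest_or:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

If the interval excludes 1, the association is usually described as statistically significant at the chosen level. For the Everest counts the interval is wide, which reflects how few climbers went without oxygen.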
Example Two: Using the Odds Ratio to Examine Age as an Indicator for Churn
Turning back to a business example, odds ratios can be calculated across a wide array of independent indicators – age, gender, ethnicity, month of year, education level, state, time since last visit, number of transactions, etc. – that help describe your customers. For example, age is an indicator that can be binned into 5, 10, or however many groups you would like to assess. Suppose that we have a business that has had 80,000 active customers within the last three years. You may have 10,000 customers between the ages of 18 and 24, another 10,000 between 25 and 33, and another 10,000 in the oldest group, aged 70 and older. Suppose this yields 8 bins of varying age ranges, each with about the same number of customers. An odds ratio calculation can then be performed to see which customer group has greater odds of discontinuing service, or churning.
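As a rough sketch of the binning step, the snippet below splits customers into 8 age bins of about equal size using quantile cut points from Python's standard library. The ages are synthetic, generated only for illustration, and the helper `age_bin` is a hypothetical name. Real customer data would replace the `ages` list.

```python
import random
from bisect import bisect_right
from collections import Counter
from statistics import quantiles

# Synthetic ages for illustration only; not real customer data.
random.seed(0)
ages = [random.randint(18, 90) for _ in range(1000)]

N_BINS = 8
# quantiles() returns N_BINS - 1 cut points that split the data
# into N_BINS groups of roughly equal size.
cut_points = quantiles(ages, n=N_BINS)


def age_bin(age: int) -> int:
    """Return the 0-based index of the bin that an age falls into."""
    return bisect_right(cut_points, age)


bin_counts = Counter(age_bin(age) for age in ages)
for index in range(N_BINS):
    print(index, bin_counts[index])
```

Because ages are whole numbers, ties at the cut points mean the bins come out only approximately equal in size, which matches the "about the same number" framing in the example.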
Using this example of about 80,000 customers, we can walk through the same analysis as done on the mountain climbers. Our target indicator is whether or not a customer has discontinued service. In the three-year time frame analyzed, you have gained some new customers. But some existing customers also discontinued service and became former customers, and you want to learn more about these former customers. Suppose that 15,000 customers churned within the last three years, and of those that churned, 3,000 were in the youngest age group. For the 10,000 customers in the 18 to 24 age group, this translates into odds of churning of 3,000/7,000 = 0.4286. The remaining 70,000 customers account for the other 12,000 who churned, so the odds of churning for customers not in the youngest age group are 12,000/58,000 = 0.2069. The odds ratio is calculated by dividing the odds of churning within the youngest age group by the odds of churning for customers not in the youngest age group. In this case the odds ratio is 0.4286/0.2069 = 2.07, which suggests that the odds of churning are about twice as high within this youngest age group as among those not in it. This insight can lead management to strategize on ways of retaining these young customers, especially if that segment has been demonstrated to be more profitable in comparison to others.
| | Churned | Did not churn | Total customers | Odds of churning |
|---|---|---|---|---|
| 18 to 24 age group | 3,000 | 7,000 | 10,000 | 0.4286 |
| All other age groups | 12,000 | 58,000 | 70,000 | 0.2069 |
| Total | 15,000 | 65,000 | 80,000 | Odds ratio: 0.4286 / 0.2069 = 2.07 |
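The churn calculation can be checked with a short Python sketch that uses the counts from this example. The variable names are chosen for illustration only.

```python
# Counts from the churn example: youngest age group versus everyone else.
churned_young, stayed_young = 3_000, 7_000
churned_other, stayed_other = 12_000, 58_000

odds_young = churned_young / stayed_young   # odds of churning, ages 18 to 24
odds_other = churned_other / stayed_other   # odds of churning, all other ages
churn_or = odds_young / odds_other

print(round(odds_young, 4), round(odds_other, 4), round(churn_or, 2))
# 0.4286 0.2069 2.07
```

Note that odds are not the same as the churn rate: 3,000 of 10,000 is a 30% churn rate, but odds of 3,000/7,000, or about 0.43.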
We can extend this calculation to the other age range indicator bins and to all the other available indicators to better understand those customers who have churned. We can also extend the analysis to other questions of interest. In the example we used churned customers as our target. But we could have also analyzed our new customers. Perhaps the youngest age group also has higher odds of becoming new customers. Perhaps not. In either case, odds ratios can either confirm a suspicion or provide some insight that allows a manager to ask more informed questions and dive deeper in understanding how to best respond to customer behavior. This can be taken a step further with predictive analytics. Armed with these odds ratios, a predictive model can be developed that allows the organization to take action, and ultimately, improve the business.
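Extending the calculation to every age bin could look like the sketch below, which compares each bin against all other customers combined. Only the 18-24 figures and the overall totals come from the example. The churn counts for the other bins, and the labels "bin 3" through "bin 7", are hypothetical placeholders chosen so that the totals still add up to 15,000 churned out of 80,000.

```python
# (churned, total customers) per age bin.
# Only "18-24" and the overall totals come from the example;
# all other churn counts and the "bin N" labels are hypothetical.
bins = {
    "18-24": (3_000, 10_000),
    "25-33": (2_200, 10_000),
    "bin 3": (1_800, 10_000),
    "bin 4": (1_600, 10_000),
    "bin 5": (1_500, 10_000),
    "bin 6": (1_500, 10_000),
    "bin 7": (1_600, 10_000),
    "70+": (1_800, 10_000),
}

total_churned = sum(churned for churned, _ in bins.values())
total_customers = sum(total for _, total in bins.values())


def one_vs_rest_odds_ratio(churned: int, total: int) -> float:
    """Odds of churning inside a bin divided by the odds for everyone else."""
    stayed = total - churned
    rest_churned = total_churned - churned
    rest_stayed = (total_customers - total) - rest_churned
    return (churned / stayed) / (rest_churned / rest_stayed)


ratios = {label: one_vs_rest_odds_ratio(*counts) for label, counts in bins.items()}

# List the bins from highest to lowest odds ratio.
for label, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{label:>6}: {ratio:.2f}")
```

Sorting by odds ratio puts the bins most associated with churn at the top. The same loop could be repeated for other indicators, or with new customers as the target instead of churned ones.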