Attribution Models in Marketing

Apr 11, 2022

“Half the money I spend on marketing is wasted, and the problem is I don’t know which half.”

This is a very famous marketing quote, and it describes the biggest problem in marketing.

Imagine I’m the owner of Maggie and I advertise my product on different platforms: newspapers, TV, radio and social media.

Now there’s an increase in the sales of my product. This can be attributed to the marketing I did. But which marketing? Was it the TV ads, social media, or some other form of advertising?

If I could find out which type of advertisement contributed the most to the sales increase, I could spend more money on it. More importantly, if I could find out which type contributed the least, I could cut it. It is also important to find out how much sales there would have been with no marketing at all. All of this helps us derive better marketing strategies.

This matters a lot, especially since companies like Google make billions from advertising and companies like Coca-Cola spend billions on it.

This is what we call MMM, Marketing Mix Modeling.

This can be done simply by fitting a regression model and deriving the feature importance.

I can write x(TV) + y(Radio) + z(Social Media) = Overall Sales.

Then I find the coefficients x, y and z, which tell me how much sales would increase if the spend on the corresponding channel is increased by one unit.
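To make this concrete, here is a minimal sketch of fitting that equation with ordinary least squares in plain Python. The spend numbers are made up, the sales are generated from coefficients I picked myself so the result can be checked, and the intercept (sales with no marketing at all) is my own addition to the equation above. The helper name `fit_linear` is hypothetical.

```python
def fit_linear(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical weekly ad spend per channel: (TV, Radio, Social Media)
spend = [(10, 5, 2), (12, 3, 4), (8, 6, 1), (15, 2, 6), (11, 4, 3), (9, 7, 5)]

# Made-up sales built from known coefficients (assumption, not real data):
# baseline 50, plus 3 per unit of TV, 1 per unit of Radio, 2 per unit of Social
sales = [50 + 3 * tv + 1 * radio + 2 * social for tv, radio, social in spend]

# The leading 1 in each row is the intercept, i.e. sales with no marketing
X = [(1, tv, radio, social) for tv, radio, social in spend]
baseline, x_tv, y_radio, z_social = fit_linear(X, sales)

print(f"baseline={baseline:.2f} TV={x_tv:.2f} "
      f"Radio={y_radio:.2f} Social={z_social:.2f}")
```

On real data the fit would not recover clean numbers like this, and in practice you would reach for a library instead of hand-rolling the solver.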

But there are certain finer points for us to consider, like seasonal sales and currency differences.

Linear regression is mainly used because it’s more interpretable than more complex algorithms.

But feature importance is not perfect. Any such model has what is called the carry-over problem: when advertisements run, the sales don’t all happen on the same day. An advertisement that ran a month ago can contribute to a sale next month, so we need to carry over the effect of past marketing onto the sales made in that month.
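One common way to model carry-over is a geometric "adstock" transform, where a fraction of each period's advertising effect spills into the next period. This is a sketch of that idea, not necessarily what any particular MMM tool does. The function name `adstock` and the decay rate of 0.5 are assumptions for illustration.

```python
def adstock(spend, decay):
    """Geometric adstock: effect_t = spend_t + decay * effect_(t-1)."""
    carried = 0.0
    out = []
    for x in spend:
        carried = x + decay * carried  # today's spend plus what carries over
        out.append(carried)
    return out

# Hypothetical TV spend: one burst in the first month, nothing afterwards
tv_spend = [100, 0, 0, 0]
tv_effect = adstock(tv_spend, decay=0.5)  # decay=0.5 is an assumed value
print(tv_effect)  # [100.0, 50.0, 25.0, 12.5]
```

The transformed series, rather than the raw spend, would then be used as the regression input, so that a sale next month can still be credited to this month's ad.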

So we can use something better than plain feature importance, and that would be game theory.

Game theory is the study of mathematical models of strategic interactions among rational agents.

This is what Wikipedia says. Didn’t understand a thing, right? Me neither.

In human-understandable language: imagine 10 people are working on building a website. Now I want to know how much each of them contributed to the whole process of building it.

This sounds super strange and unattainable, right? Quantifying the contribution each player (a game theory term) made to building the website.

But this is exactly what game theory does, and it uses Shapley values to do it.

Imagine we have n players. The Shapley value of the ith player tells us how much credit to give to that player.

The math behind Shapley values is pretty complex.

In game theory we have a function, called the characteristic function, which in our case gives the total number of conversions (sales) generated by a given group of channels.
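Here is a small sketch of the idea for three channels. The characteristic function `v` below is entirely made up: it maps every possible group of channels to the number of conversions that group would generate. The Shapley value of a channel is its average marginal contribution over every order in which the channels could be added.

```python
from itertools import permutations

channels = ["TV", "Radio", "Social"]

# Hypothetical characteristic function: conversions per group of channels
v = {
    frozenset(): 0,
    frozenset({"TV"}): 40,
    frozenset({"Radio"}): 20,
    frozenset({"Social"}): 30,
    frozenset({"TV", "Radio"}): 70,
    frozenset({"TV", "Social"}): 90,
    frozenset({"Radio", "Social"}): 60,
    frozenset({"TV", "Radio", "Social"}): 120,
}

def shapley_values(players, value):
    """Average each player's marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value[with_p] - value[coalition]
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}

credit = shapley_values(channels, v)
for channel, share in credit.items():
    print(f"{channel}: {share:.2f}")

# The credits always add up to the conversions of all channels together
print(sum(credit.values()))
```

Note that this loops over every ordering of the channels, and the characteristic function needs a value for every subset. Both grow very fast as channels are added.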

Although game theory can provide a more accurate answer for how much each marketing technique contributes to sales, it’s not used that often, and mostly only by big companies like Google and Amazon. This is because of its computational cost. That’s where another technique comes into play, one with a low computational cost.


That technique is the Markov chain. In it, S0 is the state where the customer starts and Sp is the final destination, where the customer purchases.

There are multiple paths from S0 to Sp. We calculate the importance of a channel by calculating the overall probability of reaching Sp from S0, then calculating the same probability with that channel removed. Subtracting the two gives the contribution of that channel.
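The calculation above can be sketched in a few lines. The customer paths here are invented, and I add a "null" state (my own name) for journeys that end without a purchase. Removing a channel is modeled by treating every visit to it as a dead end.

```python
from collections import defaultdict

# Hypothetical customer journeys: (channels visited, did they purchase?)
paths = [
    (["TV", "Social"], True),
    (["Social"], True),
    (["TV"], False),
    (["Radio", "Social"], True),
    (["Radio"], False),
]

def transition_probs(paths):
    """Estimate P(next state | current state) by counting observed steps."""
    counts = defaultdict(lambda: defaultdict(int))
    for channels, converted in paths:
        states = ["S0", *channels, "Sp" if converted else "null"]
        for a, b in zip(states, states[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(nxt.values()) for b, n in nxt.items()}
            for a, nxt in counts.items()}

def conversion_prob(probs, removed=None, iters=1000):
    """Probability of reaching Sp from S0, by repeated propagation."""
    p = defaultdict(float)  # every state starts at 0.0, "null" stays there
    p["Sp"] = 1.0
    for _ in range(iters):
        for state, nxt in probs.items():
            if state != removed:  # a removed channel keeps probability 0
                p[state] = sum(w * p[t] for t, w in nxt.items())
    return p["S0"]

probs = transition_probs(paths)
base = conversion_prob(probs)
effects = {ch: base - conversion_prob(probs, removed=ch)
           for ch in ["TV", "Radio", "Social"]}
print(base, effects)
```

In this toy data every purchase goes through Social, so removing it drops the conversion probability to zero and it gets the largest contribution.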

The biggest assumption a Markov chain makes, which is also its biggest drawback, is that it ignores past history.

When a Markov chain calculates probabilities, it only considers the most recent channel and ignores all the channels that came before it.

For example, a customer can reach Sp from S0 by following the path S0, Snp, Sg, Sp. The model only considers the step from Sg to Sp; it doesn’t consider the earlier history of the path.

This is a big issue with Markov chains, as they ignore the past marketing history. To solve this issue we use survival analysis.

Survival Analysis

It is mainly used in health care.

I’ll explain with a simple example. Imagine I have a patient who has cancer, and I need to know the probability of his survival.

Simply put, a survival model helps us calculate the probability of the patient’s survival over time. We plot a graph with days on the x-axis and the proportion surviving on the y-axis.

We calculate the probability of an event occurring at a given time. The model has a function called the hazard function, which models the risk of death at a given time (for example, at a given patient age) for a patient who has survived until then.
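As a rough illustration, here is a sketch of the Kaplan-Meier estimator, one standard way to build that survival curve (my choice for the example, there are other survival models). The patient data is invented. `observed` is False for patients who were still alive when we last saw them, so we only know they survived at least that long.

```python
def kaplan_meier(durations, observed):
    """Return [(day, hazard, survival)] at each day a death was observed."""
    survival = 1.0
    curve = []
    event_days = sorted({d for d, e in zip(durations, observed) if e})
    for day in event_days:
        # Patients still being followed just before this day
        at_risk = sum(1 for d in durations if d >= day)
        deaths = sum(1 for d, e in zip(durations, observed) if d == day and e)
        hazard = deaths / at_risk  # chance of dying now, given alive so far
        survival *= 1 - hazard     # chance of having survived past this day
        curve.append((day, hazard, survival))
    return curve

# Hypothetical follow-up data: days observed, and whether death was observed
durations = [5, 8, 8, 12, 15, 20]
observed = [True, True, False, True, False, True]

curve = kaplan_meier(durations, observed)
for day, hazard, survival in curve:
    print(f"day {day}: hazard={hazard:.3f} survival={survival:.3f}")
```

Plotting `day` against `survival` gives the graph described above. For attribution, the "event" would be a purchase rather than a death.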

That’s all for the theory of attribution models. I’ll soon post a comparison of the performance of survival analysis and Markov chains in a separate blog.



