I recently learnt that many people think that the more variables you use in your model, the better the model is! What surprised me is just how many such people I’ve met, and that recommendations for simpler models haven’t been taken too kindly.
The conversation usually goes like this:
“So what variables have you considered for your analysis of ______ ?”
“A, B, C.”
“Why don’t you consider D, E, F, … X, Y, Z also? These variables matter for these reasons. You should keep all of them and build a more complete model.”
“Well, I considered them, but they were not significant, so my model didn’t pick them up.”
“No, but I think your model is too simplistic if it uses only three variables.”
This is a conversation I’ve had with so many people that I wonder what kind of conceptions people have about analytics. I also wonder whether this is because I communicate differently from other “analytics professionals”.
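For what it’s worth, the three-variable side of that conversation can be illustrated with a few lines of code. This is only a minimal sketch on made-up data, not an analysis from any real project: the outcome is assumed to depend on three variables (standing in for A, B, C), the twenty extra columns are pure noise (standing in for D, E, F and so on), and the helper names `fit_ols` and `r_squared` are my own. It also compares fit on held-out data rather than running significance tests, which is a different check from the one mentioned in the conversation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical data: the outcome truly depends only on three variables.
signal = rng.normal(size=(n, 3))        # stand-ins for A, B, C
noise_vars = rng.normal(size=(n, 20))   # stand-ins for D, E, F, ... (pure noise)
y = signal @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=n)


def fit_ols(X, y):
    """Ordinary least squares with an intercept column prepended."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def r_squared(X, y, coef):
    """R-squared of the fitted coefficients on the given data."""
    X1 = np.column_stack([np.ones(len(X)), X])
    resid = y - X1 @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


# Fit on the first half, evaluate on both halves.
train, test = slice(0, 100), slice(100, n)
models = {
    "small": signal,                                  # 3 variables
    "big": np.column_stack([signal, noise_vars]),     # 23 variables
}

results = {}
for name, X in models.items():
    coef = fit_ols(X[train], y[train])
    results[name] = (
        r_squared(X[train], y[train], coef),  # fit on the data used for fitting
        r_squared(X[test], y[test], coef),    # fit on data the model has not seen
    )
    print(f"{name}: in-sample R2 = {results[name][0]:.3f}, "
          f"holdout R2 = {results[name][1]:.3f}")
```

The in-sample number for the bigger model can never be lower than that of the smaller one, since least squares with extra columns can always reproduce the smaller fit. That is probably why “more variables” feels like “more complete”. The holdout number carries no such guarantee, and with noise columns like these it typically does not improve.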
When you do analytics, there are two ways to communicate: to simplify and to complicate (for lack of a better word). In my experience, a majority of analytics professionals and modelers prefer to complicate: they talk about the complicated statistical techniques they use to solve the problem (usually ones with fancy names) and bulldoze the counterparty into thinking they are indeed doing something hi-funda.
The other approach, followed by (in my opinion) a smaller number of people, is to simplify. You try to explain your model in simple terms that the counterparty will understand. So if your final model contains only three explanatory variables, you tell them that only three variables are used, and you show how each of these variables (and combinations thereof) contributes to the model. You draw analogies to models the counterparty can appreciate, and use those to explain.
Now, just as analytics professionals can be divided into two kinds (as above), I think consumers of analytics can also be divided into two kinds. There are those who like to understand the model, and those who simply want to get to the insights. The former are better served by the complicating type of analytics professional, and the latter by the simplifying type. The other two combinations lead to disaster.
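Like a good management consultant, I represent this problem using the following two-by-two:
[Figure: two-by-two of analytics professional type (simplifying or complicating) against consumer type (wants to understand the model or wants the insights)]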
As a principle, I like to explain models in a simplified fashion, so that the consumer can completely understand them and use them as they see fit. The more pragmatic among you, however, can take a guess at what type the consumer is and tweak your communication accordingly.