I learnt Operations Research thrice. The first was when I had just finished school and was about to go to IIT. My father had just started on a part-time MBA, and his method of making sure he had learnt something properly was to try and teach it to me. And so, using some old textbook he had bought some twenty years earlier, he taught me how to solve the transportation problem. I had already learnt to solve 2-variable linear programming problems in school (so yes, I learnt OR 4 times then). And my father taught me how to solve 3-variable problems using the Simplex table.
I got quite good at it, but having not used it for the subsequent two years, I forgot it. And then I happened to take Operations Research as a minor at IIT. And so in my fifth semester I learnt the basics again. I was taught by the highly rated Prof. G Srinivasan. He lived up to his rating. Again, he taught us simplex, transportation and assignment problems, among other things. He showed us how to build and operate the simplex table. It was fun, and surprisingly (in hindsight) never once did I consider it to be laborious.
This time I didn’t forget. OR being my minor meant that I had OR-related courses in the following three semesters, and I liked it enough to even consider applying for a PhD in OR. Then I got cold feet and decided to do an MBA instead, and ended up at IIMB. And there I learnt OR for the fourth time.
The professor who taught us wasn’t particularly reputed, and she lived up to her not-so-particular-reputation. But there was a difference here. When we got to the LP part of the course (it was part of “Quantitative Methods 2”, which included regression and OR), I thought I would easily ace it, given my knowledge of simplex. Initially I was stunned to learn that we wouldn’t be taught simplex. “What do they teach in an OR course if they don’t teach Simplex?”, I thought. Soon I would know why. Computer!
We were all asked to install this software called Lindo on our PCs, which would solve any linear programming problem you would throw at it, in multiple dimensions. We also discovered that Excel had the Solver plugin. With programs like these, what was the use of knowing Simplex? Simplex was probably useful back in the day when readymade algorithms were not available. Also, IIT being a technical school might have seen value in teaching us the algorithm (though we always solved procedurally; I don’t remember ever writing down pseudocode for simplex). The business school would have none of it.
It didn’t matter how the problem was actually solved, as long as we knew how to use the solver. What was more important was the art of transforming a real-life problem into one that could be solved using Solver/Lindo. In terms of formulation, the problems we got in our assignments and exams were tough – back in IIT, when we solved manually, such problems were out of bounds since Simplex would take too long on them.
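To make the formulation point concrete, here is a minimal sketch in Python. The product-mix numbers are made up for illustration and are not from any course I took. The `solve_two_variable_lp` function is a hypothetical stand-in for a black box like Solver or Lindo: it simply checks every corner of the feasible region, which only works for small, bounded 2-variable problems. The part that needs thought is the first few lines, where the business problem becomes an objective and a list of constraints.

```python
from itertools import combinations

# Hypothetical product-mix problem (illustrative numbers only):
#   maximise 3x + 5y
#   subject to x <= 4, 2y <= 12, 3x + 2y <= 18, x >= 0, y >= 0
objective = (3.0, 5.0)

# Each constraint is (a, b, rhs), meaning a*x + b*y <= rhs.
constraints = [
    (1.0, 0.0, 4.0),    # capacity of the first plant
    (0.0, 2.0, 12.0),   # capacity of the second plant
    (3.0, 2.0, 18.0),   # capacity of the third plant
    (-1.0, 0.0, 0.0),   # x >= 0, rewritten as -x <= 0
    (0.0, -1.0, 0.0),   # y >= 0, rewritten as -y <= 0
]


def solve_two_variable_lp(objective, constraints, tol=1e-9):
    """Stand-in for a black-box solver (assumes a bounded, feasible problem).

    Intersects every pair of constraint lines, keeps the feasible
    intersection points, and returns the one with the largest objective.
    """
    best_point, best_value = None, float("-inf")
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel lines never cross
        # Cramer's rule for the 2x2 system
        x = (r1 * b2 - r2 * b1) / det
        y = (a1 * r2 - a2 * r1) / det
        if all(a * x + b * y <= r + tol for a, b, r in constraints):
            value = objective[0] * x + objective[1] * y
            if value > best_value:
                best_point, best_value = (x, y), value
    return best_point, best_value


point, value = solve_two_variable_lp(objective, constraints)
print(point, value)
```

Swapping the toy solver for a real one changes nothing about the formulation step, which is the whole point.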
I remember taking a few more quant electives at IIM. They were all the same – some theory would be taught where we knew something about the workings of some of the algorithms, but the focus was on applications. How do you formulate a business problem in a way in which you can use the particular technique? How do you decide what technique you use for what problem? These were some of the questions I learnt to answer through the course of my studies at IIM.
I once interviewed with a (now large) marketing analytics firm in Bangalore. They expected me to know how to measure “feelings” and other BS so I politely declined after one round. From what I understood, they had two kinds of people. First they had experienced marketers who would do the “business end” of the problem. Then they had stats/math grads who actually solved the problem. I think that is problematic. But as I have observed in a few other places, that is the norm.
You have tech guys doing absolutely tech stuff and reporting to business guys who know very little of the tech. Because of the business guy’s disinterest in tech, he is unlikely to get his hands dirty with the data. And is likely to take what the tech guy gives him at face value. As for the tech guy doing the data work, he is unlikely to really understand the business problem that he is solving, and so he invariably ends up solving a “tech problem”, which may or may not have business implications.
There are times when people ask me if I “know big data”. When I reply in the negative, they wonder (sometimes aloud) how I can call myself a data scientist. Then there are times when people ask me about a particular statistical technique. Again, it is extremely likely I answer in the negative, and extremely likely they wonder how I call myself a data scientist.
My answer is that if I deem a problem to be solvable by a particular technique, I can then simply read up on the technique! As long as you have the basics right, you don’t need to mug up all available techniques.
Currently I’m working (for a client) on a problem that requires me to cluster data (yes, I know enough stats to know that the next step is to cluster). So this morning I decided to read up on some clustering algorithms. I’m amazed at the techniques that are out there. I hadn’t even heard of most of them. Then I read up on each of them and considered how well they would fit my data. After reading up, and taking another look at the data, I made what I think is an informed choice, and selected a technique which I think is appropriate. And I had no clue of the existence of the technique two hours before.
Given that I solve business problems using data, I make sure I use techniques that are appropriate to solve the business problem. I know of people who don’t even look at the data at hand and start implementing complex statistical techniques on it. In my last job (at a large investment bank), I knew of one guy who suggested five methods (supposedly popular statistical techniques – I had never heard of them; he had a PhD in statistics) to attack a particular problem, without having even seen the data! As far as he was concerned he was solving a technical problem.
Now that this post is turning out to be an advertisement for my consulting services, let me go all the way. Yes, I call myself a “management consultant and data scientist”. I’m both a business guy and a data guy. I don’t know complicated statistical techniques, but don’t see the need to know either – since I usually have the internet at hand while working. I solve business problems using data. The data is only an intermediary step. The problem definition is business-like. As is the solution. Data is only a means.
And for this, I have to thank the not-so-highly-reputed professor who taught me Operations Research for the fourth time – who taught me that it is not necessary to know Simplex (Excel can do it), as long as you can formulate the problem properly.
@karthiks Great post on finding solutions http://t.co/Tgae7El8Yx
you are actually less of a data scientist and more of a data consultant. i attend data science roundtables, and this is a new, upcoming job description. they do what you do, essentially. they try understanding the domain, the problems, the data and try framing things as a data mining problem. and maybe provide techniques to solve it. or even go all the way to solve it.
data scientist has more to it than what you do, at least from what i’ve seen of folks who are in the data science teams in google, facebook, bit.ly and other places. you’ll probably retain the title coz it’s definitely more marketable than ‘data consultant’, but ‘data consultant’ is a more appropriate job title than ‘data scientist’ for you.
Do enjoy reading your Blogs…very interesting…Thanks Ram