About a month or so back I had a long telephonic conversation with this guy who runs an offshored analytics/data science company in Bangalore. Like most other companies that are being built in the field of analytics, this follows the software services model – a large team in an offshored location, providing long-term standardised data science solutions to a client in a different “geography”.

As is usual with conversations like this one, we talked about our respective areas of work and kind of projects we take on, and soon we got to the usual bit in such conversations where we were trying to “find synergies”. Things were going swimmingly when this guy remarked that it was the first time he was coming across a freelancer in this profession. “I’ve heard of freelance designers and writers, but never freelance data scientists or analytics professionals”, he mentioned.

In a separate event I was talking to one old friend about another old friend who has set up a one-man company to do provide what is basically freelance consulting services. We reasoned that the reason this guy had set up a company rather than calling himself a freelancer given the reputation that “freelancers” (irrespective of the work they do) have – if you say you are a freelancer people think of someone smoking pot and working in a coffee shop on a Mac. If you say you are a partner or founder of a company, people imagine someone more corporate.

Now that the digression is out of the way let us get back to my conversation with the guy who runs the offshored shop. During the conversation I didn’t say much, just saying things like “what is wrong with being a freelancer in this profession”. But now that i think more about it, it is simply a function of the profession being a fundamentally creative profession.

For a large number of people, data science is simply about statistics, or “machine learning” or predictive modelling – it is about being given a problem expressed in statistical terms and finding the best possible model and model parameters for it. It is about being given a statistical problem and finding a statistical solution – I’m not saying, of course, that statistical modelling is not a creative profession – there is a fair bit of creativity involved in figuring out what kind of model to model, and picking the right model for the right data. But when you have a large team working on the problem, working effectively like an assembly line (with different people handling different parts of the solution), what you get is effectively an “assembly line solution”.

Coming back, let us look at this “a day in the life” post I wrote about a year back about a particular day in office for me. I’ve detailed in that the various kinds of problems I had to solve that day – hidden markov models and bayesian probability to writing code using dynamic programming and implementing the code in R, and then translating the solution back to the business context. Notice that when I started off working on the problem it was not known what domain the problem belonged in – it took some poking and prodding around in order to figure out the nature of the problem and the first step in solution.

And then on, it was one step leading to another, and there are two important facts to consider about each step – firstly, at each step, it wasn’t clear as to what the best class of technique was to get beyond the step – it was about exploration in order to figure out the best class of technique. Next, at no point in time was it known what the next step was going to be until the current step was solved. You can see that it is hard to do it in an assembly line fashion!

Now, you can talk about it being like a game of chess where you aren’t sure what the opponent will do, but then in chess the opponent is a rational human being, while here the “opponent” is basically the data and the patterns it shows, and there is no way to know until you try something as to how the data will react to that. So it is impossible to list out all steps beforehand and solve it – solution is an exploratory process.

And since solving a “data science problem” (as I define it, of course) is an exploratory, and thus creative, process, it is important to work in an atmosphere that fosters creativity and “thinking without thinking” (basically keep a problem in the back of your mind and then take your mind off it, and distract yourself to solve the problem). This is best done away from a traditional corporate environment – where you have to attend meetings and be liable to be disturbed by colleagues at all times, and this is why a freelance model is actually ideal! A small partnership also works – while you might find it hard to “assembly line” the problem, having someone to bounce thoughts and ideas with can have a positive impact to the creative process. Anything more like a corporate structure and you are removing the conditions necessary to foster creativity, and are in such situations more likely to come up with cookie-cutter solutions.

So unless your business model deals with doing repeatable and continuous analytical work for a client, you are better off organising yourselves in an environment that fosters creativity and not a traditional office kind of structure if you want to solve problems using data science. Then again, your mileage might vary!

Leave a Reply

Your email address will not be published. Required fields are marked *