analytics – Page 2 – Karthik Shashidhar

Categories: analytics, mathematics | Tags: central limit theorem, income, maths, surveys

Surveying Income

February 20, 2019

For a long time now, I’ve been sceptical of the practice of finding out the average income in a country or state or city or locality by doing a random survey. The argument I’ve made is “whether you keep Mukesh Ambani in the sample or not makes a huge difference in your estimate”. So far,…
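
A minimal sketch of the sensitivity described above. The incomes are made up (arbitrary units, not real survey data) and the helper `mean_with_and_without` is hypothetical. Adding one extreme respondent to a small sample moves the mean by orders of magnitude, while the median barely changes.

```python
import statistics

def mean_with_and_without(incomes, outlier):
    """Return the sample mean without and with one extra (extreme) respondent."""
    base = statistics.mean(incomes)
    with_outlier = statistics.mean(incomes + [outlier])
    return base, with_outlier

# Hypothetical incomes in arbitrary units; assumed purely for illustration.
sample = [3, 4, 5, 6, 7, 8, 9, 10, 12, 15]
extreme = 1_000_000  # stand-in for a single extremely rich respondent

base_mean, skewed_mean = mean_with_and_without(sample, extreme)
base_median = statistics.median(sample)
skewed_median = statistics.median(sample + [extreme])

print(f"mean without outlier:   {base_mean:.2f}")
print(f"mean with outlier:      {skewed_mean:.2f}")
print(f"median without outlier: {base_median}")
print(f"median with outlier:    {skewed_median}")
```

The median is shown only as a contrast; the excerpt itself does not say which estimator the post goes on to recommend.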

Categories: analytics, arbit, blogging, cricket | Tags: analytics, cricket, deepextracover, outklip, sports, videos, vlogging

Vlogging!

February 7, 2019

The first seed was sown in my head by Harish “the Psycho” J, who told me a few months back that nobody reads blogs any more, and I should start making “analytics videos” to increase my reach and hopefully hit a new kind of audience with my work. While the idea was great, I wasn’t…

Categories: analytics, football | Tags: goal difference, league table, premier league

Premier League Points Efficiency

January 2, 2019

It would be tautological to say that you win in football by scoring more goals than your opponent. What is interesting is that scoring more goals and letting in fewer works across games in a season as well, as data from the English Premier League shows. We had seen an inkling of this last year,…

Categories: analytics, data, football, sport | Tags: bill shankly, football, liverpool, manager

Built by Shanks

December 20, 2018

This morning, I found this tweet by John Burn-Murdoch, a statistician at the Financial Times, about a graphic he had made for a Simon Kuper (of Soccernomics fame) piece on Jose Mourinho. Burn-Murdoch also helpfully shared the code he had written to produce this graphic, through which I discovered ClubElo, a website that produces chess-style…

Categories: analytics, arbit, bangalore, data | Tags: age, bangalore, names, oldest, youngest

Bangalore names are getting shorter

November 25, 2018

The Bangalore Names Dataset, derived from the Bangalore Voter Rolls (cleaned version here), validates a hypothesis that a lot of people had – that given names in Bangalore are becoming shorter. From an average of 9 letters in the name for a male aged around 80, the length of the name comes down to 6.5…

Categories: analytics, data, food, technology | Tags: R, recommendation, shiny, similarity, single malt, whisky

Single Malt Recommendation App

November 2, 2018

Life is too short to drink whisky you don’t like. How often have you found yourself in a duty free shop in an airport, wondering which whisky to take back home? Unless you are a pro at this already, you might want something you haven’t tried before, but don’t want to end up buying something…

Categories: analytics, books | Tags: books, challenge, goodreads, reading

Book challenge update

September 29, 2018

At the beginning of this year, I took a break from Twitter (which lasted three months), and set myself a target to read at least 50 books during the calendar year. As things stand now, the number stands at 28, and it’s unlikely that I’ll hit my target, unless I count Berry’s story books in…

Categories: analytics, visualization, work | Tags: flamewar, graphics, interactive, visualisation

Taking your audience through your graphics

September 12, 2018

A few weeks back, I got involved in a Twitter flamewar with Shamika Ravi, a member of the Indian Prime Minister’s Economic Advisory Council. The object of the argument was a set of gifs she had released to show different aspects of the Indian economy. Admittedly I started the flamewar. Guilty as charged. Thinking about…

Categories: analytics, business, work | Tags: analytics, business school, managers, quantitative

Analytics for general managers

September 11, 2018

While good managers have always been required to be analytical, the level of analytical ability being asked of managers has been going up over the years, with the increase in availability of data. Now, this post is once again based on that one single and familiar data point – my wife. In fact, if you…

Categories: analytics, business, data, work | Tags: analytics, business, clustering, consulting, data science

The missing middle in data science

September 4, 2018

Over a year back, when I had just moved to London and was job-hunting, I was getting frustrated by the fact that potential employers didn’t recognise my combination of skills of wrangling data and analysing businesses. A few saw me purely as a business guy, and most saw me purely as a data guy, trying…

Categories: analytics, data, work | Tags: data science, machine learning, regression, statistics

Statistics and machine learning approaches

September 3, 2018

A couple of years back, I was part of a team that delivered a workshop in machine learning. Given my background, I had been asked to do a half-day session on Regression, and was told that the standard software package being used was the scikit-learn package in python. Both the programming language and the package…

Categories: analytics, data, work | Tags: analytics, collaboration, data science, excel

Why data scientists should be comfortable with MS Excel

September 1, 2018

Most people who call themselves “data scientists” aren’t usually fond of MS Excel. It is slow and clunky, can only handle a million rows of data (and can nearly crash your computer if you go anywhere close to that), and despite the best efforts of Visual Basic, is not very easy to program for doing repeatable…

Categories: analytics, data, work | Tags: classification, credit scoring, domain knowledge, image recognition, machine learning

Meaningful and meaningless variables (and correlations)

August 30, 2018

A number of data scientists I know like to go about their business in a domain-free manner. They make a conscious choice to not know anything about the domain in which they are solving the problem, and instead treat a dataset as just a set of anonymised data, and attack it with the usual methods.…

Categories: analytics, data | Tags: anlaytics, classfication, data science, twobytwos

Yet another way of classifying data scientists

August 29, 2018

There are many axes along which we can classify data scientists. We can classify based on the primary specialty, into “analytics”, “business intelligence” and “machine learning”. We can classify based on domain, into “financial data scientists” and “retail data scientists” and “industrial data scientists”. We can classify by the choice of primary software tool,…

Categories: analytics, data, finance | Tags: analytics, balance sheet, financial statements, flows, p/e, stocks

Stocks and flows

August 28, 2018

One common mistake even a lot of experienced analysts make is comparing stocks to flows. Recently, for example, Apple’s trillion dollar valuation was compared to countries’ GDP. A few years back, an article compared the quantum of bad loans in Indian banks to the country’s GDP. Following an IPL auction a few years back, a…

Categories: analytics, visualization | Tags: graphics, tumblr, visualisations

New blog on visualisations

July 27, 2018

For a while now I’ve been commenting on visualisations on Twitter, pointing out the good (and especially bad) graphs. I also have a “chart of the edition” section in my newsletter. Recently, the legendary Krish Ashok suggested that I collect all these bad visualisations in a Tumblr, and I decided to oblige. You can follow…

Categories: analytics, computer science, data, technology | Tags: collaborative filtering, computer science, machine learning, netflix

Beer and diapers: Netflix edition

April 14, 2018

When we started using Netflix last May, we created three personas for the three of us in the family – “Karthik”, “Priyanka” and “Berry”. At that time we didn’t realise that there was already a pre-created “kids” (subsequently renamed “children” – don’t know why that happened) persona there. So while Priyanka and I mostly use…

Categories: analytics, arbit, technology, visualization | Tags: mobile, technology, visualisation

More on interactive graphics

April 10, 2018

So for a while now I’ve been building this cricket visualisation thingy. Basically it’s what I think is a pseudo-innovative way of describing a cricket match, by showing how the game ebbs and flows, and marking off the key events. Here’s a sample, from the ongoing game between Chennai Super Kings and Kolkata Knight Riders.…

Categories: analytics, cricket, investment banking, sport, sports analytics, visualization | Tags: backward induction, banking, cricket, markovian, visualisation

A banker’s apology

April 8, 2018

Whenever there is a massive stock market crash, like the one in 1987, or the crisis in 2008, it is common for investment banking quants to talk about how it was a “1 in zillion years” event. This is on account of their models that typically assume that stock prices are lognormal, and that stock…
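
To see where “1 in zillion years” figures can come from, here is a rough sketch under stated assumptions: daily log returns are independent and normally distributed (the lognormal-price assumption mentioned above) and there are 252 trading days in a year. The function name `years_between_moves` is hypothetical, and this is not the calculation any particular bank uses.

```python
import math

TRADING_DAYS_PER_YEAR = 252  # assumed convention, not taken from the post

def years_between_moves(k_sigma, days_per_year=TRADING_DAYS_PER_YEAR):
    """Expected years between daily drops of at least k_sigma standard
    deviations, assuming i.i.d. normal daily log returns."""
    # One-sided tail probability P(Z <= -k) for a standard normal variable.
    tail_prob = 0.5 * math.erfc(k_sigma / math.sqrt(2))
    return 1.0 / (tail_prob * days_per_year)

for k in (3, 5, 10):
    print(f"{k}-sigma daily drop: roughly once every {years_between_moves(k):.3g} years")
```

Under these assumptions the waiting time grows explosively with the size of the move, which is why a crash that looks ordinary in hindsight can be labelled almost impossible by the model.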

Categories: analytics, football, sports analytics | Tags: correlation, english premier league, football, goal difference

English Premier League: Goal Difference to points correlation

March 5, 2018

So I was just looking down the English Premier League Table for the season, and I found that as I went down the list, the goal difference went lower. There’s nothing counterintuitive in this, but the degree of correlation seemed eerie. So I downloaded the data and plotted a scatter-plot. And what do you have?…
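
A minimal sketch of how such a correlation could be computed. The goal differences and points below are invented for illustration and are not the league table the post uses; `pearson_r` is a hypothetical helper that implements the standard Pearson correlation coefficient.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented values for eight hypothetical teams (not real league data).
goal_difference = [50, 30, 20, 5, 0, -10, -25, -40]
points = [85, 75, 70, 55, 50, 44, 35, 28]

r = pearson_r(goal_difference, points)
print(f"correlation between goal difference and points: {r:.3f}")
```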

Categories: analytics, computer science, data | Tags: analytics, data science, jupyter, machine learning, python, sklearn, stirring the pile

Stirring the pile efficiently

February 14, 2018

Warning: This is a technical post, and involves some code, etc. As I’ve ranted a fair bit on this blog over the last year, a lot of “machine learning” in the industry can be described as “stirring the pile”. Regular readers of this blog will be familiar with this image from XKCD by now: Basically…