As the old song went, “when the giver gives, he tears the roof and gives”. Last week the Government of Karnataka released its report on the covid-19 serosurvey done in the state. You might recall that it had concluded that the number of cases had been undercounted by a factor of 40, but then some…
Covid-19 Prevalence in Karnataka
Finally, many months after other Indian states had conducted a similar exercise, Karnataka released the results of its first “covid-19 sero survey” earlier this week. The headline number being put out is that about 27% of the state has already suffered from the infection, and has antibodies to show for it. From the press release:…
Opinion polling in India and the US
(Relatively) old-time readers of this blog might recall that in 2013-14 I wrote a column called “Election Metrics” for Mint, where I used data to analyse elections and everything else related to that. This being the election where Narendra Modi suddenly emerged as a spectacular winner, the hype was high. And I think a lot…
Scrabble
I’ve forgotten at which stage of lockdown or “unlock” e-commerce for “non-essential goods” reopened, but among the first things we ordered was a Scrabble board. It was an impulse decision. We were on Amazon ordering puzzles for the daughter, and she had just about started putting together “sounds” to make words, so we thought “scrabble tiles…
Covid-19 superspreaders in Karnataka
Through a combination of luck and competence, my home state of Karnataka has handled the Covid-19 crisis rather well. While the total number of cases detected in the state edged past 2000 recently, the number of locally transmitted cases detected each day has hovered in the 20-25 range. It might make news that Karnataka has…
Placing data labels in bar graphs
If you think you’re a data visualisation junkie, it’s likely that you’ve read Edward Tufte’s Visual Display Of Quantitative Information. If you are only a casual observer of the topic, you are likely to have come across these gifs that show you how to clean up a bar graph and a data table. And if…
More on covid testing
There has been a massive jump in the number of covid-19 positive cases in Karnataka over the last couple of days. Today, there were 44 new cases discovered, and yesterday there were 36. This is a big jump from the average of about 15 cases per day in the preceding 4-5 days. The good news…
Simulating Covid-19 Scenarios
I must warn that this is a super long post. Also, I wonder if I should put this on Medium in order to get more footage. Most models of disease spread use what is known as a “SIR” framework. This Numberphile video gives a good primer on this framework. The problem with the framework is…
Distribution of political values
Through Baal on Twitter I found this “Political Compass” survey. I took it, and it said this is my “political compass”. Now, I’m not happy with the result. I mean, I’m okay with the average value where the red dot has been put for me, and I think that represents my political leanings rather well.…
Statistical analysis revisited – machine learning edition
Over ten years ago, I wrote this blog post that I had termed a “lazy post” – it was an email that I’d written to a mailing list, which I’d then copied onto the blog. It was triggered by someone on the group making an off-hand comment of “doing regression analysis”, and I had…
Big Data and Fast Frugal Trees
In his excellent podcast episode with EconTalk’s Russ Roberts, psychologist Gerd Gigerenzer introduces the concept of “fast and frugal trees”. When someone needs to make decisions quickly, Gigerenzer says, they don’t take into account a large number of factors, but instead rely on a small set of thumb rules. The podcast itself is based on…
Liverpool FC: Mid Season Review
After 20 games played, Liverpool are sitting pretty on top of the Premier League with 58 points (out of a possible 60). The only jitter in the campaign so far came in a draw away at Manchester United. I made what I think is a cool graph to put this performance in perspective. I looked…
This year on Spotify
I’m rather disappointed with my end-of-year Spotify report this year. I mean, I know it’s automated analytics, and no human has really verified it, etc. but there are some basics that the algorithm failed to cover. The first few slides of my “annual report” told me that my listening changed by seasons. That in January…
Spurs right to sack Pochettino?
A few months back, I built my “football club elo by manager” visualisation. Essentially, we take the week-by-week Premier League Elo ratings from ClubElo and overlay it with managerial tenures. A clear pattern emerges – a lot of Premier League sackings have been consistent with clubs going down significantly in terms of Elo Ratings. For…
Alchemy
Over the last 4-5 days I kinda immersed myself in finishing Rory Sutherland’s excellent book Alchemy. It all started with a podcast, with Sutherland being the guest on Russ Roberts’ EconTalk last week. I’d barely listened to half the podcast when I knew that I wanted more of Sutherland, and so immediately bought the book…
EPL: Mid-Season Review
Going into the November international break, Liverpool are eight points ahead at the top of the Premier League. Defending champions Manchester City have slipped to fourth place following their loss to Liverpool. The question most commentators are asking is if Liverpool can hold on to this lead. We are two-thirds of the way through the…
Segmentation and machine learning
For best results, use machine learning to do customer segmentation, but then get humans with domain knowledge to validate the segments. There are two common ways in which people do customer segmentation. The “traditional” method is to manually define the axes through which the customers will get segmented, and then simply look through the data…
Fishing in data pukes
When a data puke is presented periodically, consumers of the puke learn to “fish” for insights in it. I’ve been wondering why data pukes are so common. After all, they need significant effort on the part of the consumer to understand what is happening, and to get any sort of insight from it. In contrast, a…
Taking Intelligence For Granted
There was a point in time when the use of artificial intelligence or machine learning or any other kind of intelligence in a product was a source of competitive advantage and differentiation. Nowadays, however, many people have got so spoiled by the use of intelligence in many products they use that it has become more…
More on statistics and machine learning
I’m thinking of a client problem right now, and I thought that something that we need to predict can be modelled as a function of a few other things that we will know. Initially I was thinking about it from the machine learning perspective, and my thought process went “this can be modelled as a…
Periodicals and Dashboards
The purpose of a dashboard is to give you a live view of what is happening with the system. Take for example the instrument it is named after – the car dashboard. It tells you at the moment what the speed of the car is, along with other indicators such as which lights are on,…