
The Association Between Blitz and Puzzle Ratings

Lichess, Software Development, Analysis, Puzzle, Chess
Let's crunch some numbers...

How much of an association exists between puzzle ratings and blitz ratings?

This was the initial question that I wanted to start with. What I thought would be a simple project turned out to be more extensive than I had originally expected.

In order to answer this question, I needed a list of players, especially active ones, so that I could compare their ratings. Unfortunately, lichess does not publish a full list of the millions of players on the site. I even searched through the source code but could not find anything. Thankfully, I had an alternative plan. The two most populous teams on lichess are the Lichess Swiss and Zhigalko_Sergei & Friends teams. After locating the members tab of each team, I scrolled all the way down and selected all the members from each team (or at least as many as lichess allowed).

It turns out that the number of players shown on a team's members page is limited, to 1470 to be exact. So I was only able to collect 2940 usernames. Additionally, since some players appeared in both lists, I had to run the combined list through a program to remove duplicates. This shrank my list to 2872 usernames.
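The post does not say which program did the duplicate check, so the following is only a minimal sketch of one way to do it. The function name `dedupe_usernames` and the sample names are hypothetical, and it assumes usernames should be compared case-insensitively.

```python
def dedupe_usernames(names):
    """Return names with duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for name in names:
        cleaned = name.strip()
        key = cleaned.lower()  # assumption: usernames compare case-insensitively
        if key and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique


# Hypothetical usernames standing in for the two copied member lists.
team_a = ["Alice", "bob", "Carol"]
team_b = ["bob", "Dave", "alice "]
combined = dedupe_usernames(team_a + team_b)
print(len(team_a + team_b), "->", len(combined))
```

Keeping the first-seen order means each surviving username can still be numbered by its position in the list.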

From these 2872 usernames, I then used random.org's random number generator to pick exactly 100 numbers. I had assigned each username a number between 1 and 2872, and thus 100 people were selected to be included in my data set.
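For anyone who wants to reproduce the selection step locally, here is a small sketch. It swaps random.org for Python's built-in pseudorandom generator, which is not what the post used, and the seed is an arbitrary hypothetical value. `random.sample` draws without replacement, so the same number cannot come up twice.

```python
import random

POOL_SIZE = 2872    # usernames left after removing duplicates
SAMPLE_SIZE = 100   # players to include in the data set

rng = random.Random(2024)  # hypothetical seed, only so the run is repeatable
# Draw 100 distinct numbers between 1 and 2872 (inclusive).
picks = rng.sample(range(1, POOL_SIZE + 1), SAMPLE_SIZE)
print(sorted(picks)[:10])
```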
[Image: The 2872 players (usernames hidden)]
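But of course, somehow, two of my 100 picks were duplicates. My first instinct was that this was astronomically unlikely: the chance that one particular username comes up on two particular picks is 1/2872², about 0.00000012 or 0.000012%, which has only two fewer leading zeroes than the roughly 0.0000000034 (0.00000034%) chance of winning the lottery. That is the wrong comparison, though. Among 100 picks there are 4,950 pairs that could match, so this is really the birthday problem: about 1.7 matching pairs are expected on average, and the chance of at least one repeat is better than 80%. Two duplicates is about what you should expect.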
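The chance of a repeat among the 100 picks can be checked directly. This is a rough sketch that assumes every pick is independent and uniform over the 2872 numbers, with repeats allowed; the function names are made up for illustration.

```python
from math import comb


def prob_any_repeat(pool_size, draws):
    """Chance that at least two of `draws` uniform picks coincide."""
    p_all_distinct = 1.0
    for i in range(draws):
        # The (i + 1)-th pick must avoid the i numbers already taken.
        p_all_distinct *= (pool_size - i) / pool_size
    return 1.0 - p_all_distinct


def expected_matching_pairs(pool_size, draws):
    """Average number of pairs of picks that land on the same number."""
    return comb(draws, 2) / pool_size


p_repeat = prob_any_repeat(2872, 100)
pairs = expected_matching_pairs(2872, 100)
print(f"P(at least one repeat) = {p_repeat:.3f}")
print(f"expected matching pairs = {pairs:.2f}")
```

The number of pairs grows with the square of the number of picks, which is why repeats show up far sooner than intuition suggests.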

After selecting two more people using the random number generator to account for the two duplicates, I looked through all 100 players' profiles in order to plot their puzzle rating against their blitz rating. This took a good amount of time, but once it was finished, I found that 30 of the players were missing a blitz rating, a puzzle rating, or both. Now with the data set shrunk to 70, here's the grand reveal:
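The profile lookup was done by hand in the post. As a sketch of how it could be scripted, the code below assumes the public endpoint `https://lichess.org/api/user/{username}` returns JSON with a `perfs` object holding `rating` and `games` per category; treat those names as assumptions to check against the lichess API documentation. The helper names and sample profiles are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def fetch_user(username):
    """Download one public profile as a dict (assumed endpoint, see above)."""
    url = "https://lichess.org/api/user/" + urllib.parse.quote(username)
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def rating_or_none(profile, perf):
    """Return the rating for one category, or None if it was never played."""
    entry = profile.get("perfs", {}).get(perf)
    if not entry or entry.get("games", 0) == 0:  # assumed field names
        return None
    return entry.get("rating")


def rating_pair(profile):
    """Return (puzzle, blitz), or None if either rating is missing."""
    puzzle = rating_or_none(profile, "puzzle")
    blitz = rating_or_none(profile, "blitz")
    if puzzle is None or blitz is None:
        return None
    return (puzzle, blitz)


# Hypothetical profiles shaped like the assumed API response.
sample_profiles = [
    {"perfs": {"blitz": {"rating": 1650, "games": 420},
               "puzzle": {"rating": 1900, "games": 800}}},
    {"perfs": {"blitz": {"rating": 1500, "games": 0}}},
]
pairs_found = [p for p in map(rating_pair, sample_profiles) if p is not None]
print(pairs_found)
```

Only `rating_pair` is exercised here; `fetch_user` is included to show where real profiles would come from and is not called.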
[Image: The graph]
As you might notice, the graph looks quite messy, but there is a good amount to learn from it as well. The x axis shows the puzzle rating and the y axis shows the blitz rating. I tried both a linear regression and a curve fit, and the linear model fit better.

71.73% of the variation in blitz rating is explained by the variation in puzzle rating,

which we determine by reading the coefficient of determination, or the r-squared value, of the linear fit.
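For readers who have not met r-squared before, here is a minimal sketch of an ordinary least-squares line fit and its coefficient of determination. The five data points are hypothetical placeholders, not the 70 players from the post, and `linear_fit` is an illustrative name.

```python
def linear_fit(xs, ys):
    """Least-squares line y = intercept + slope * x, plus r-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # r-squared = 1 - (unexplained variation) / (total variation)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical (puzzle, blitz) ratings, for illustration only.
puzzle = [1500, 1700, 1900, 2100, 2300]
blitz = [1300, 1550, 1600, 1900, 1950]
slope, intercept, r_squared = linear_fit(puzzle, blitz)
print(f"slope={slope:.3f} intercept={intercept:.1f} r^2={r_squared:.4f}")
```

An r-squared of 1 would mean every point sits exactly on the line, while 0 would mean the line explains none of the spread in blitz ratings.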

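There is clearly a decent amount of association between these two ratings, which makes sense. It is also worth noting that the slope of the fitted line is 0.86: on average, a player whose puzzle rating is 100 points higher has a blitz rating about 86 points higher. A slope below one cannot by itself tell us which rating is higher, because that also depends on the line's intercept. Even so, in this sample it is more likely that your puzzle rating is higher than your blitz rating, although that is not true for all cases.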
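To see what a slope of 0.86 does and does not imply, it helps to plug in numbers. The post does not report the intercept of the fitted line, so the intercepts below are purely hypothetical, as are the function names.

```python
SLOPE = 0.86  # slope reported in the post


def predicted_blitz(puzzle, intercept):
    """Blitz rating predicted by the line blitz = intercept + SLOPE * puzzle."""
    return intercept + SLOPE * puzzle


def crossover_puzzle_rating(intercept):
    """Puzzle rating at which the line predicts blitz == puzzle."""
    return intercept / (1 - SLOPE)


for intercept in (-100.0, 100.0, 300.0):  # hypothetical intercepts
    cross = crossover_puzzle_rating(intercept)
    print(f"intercept {intercept:+.0f}: predicted blitz at puzzle 2000 = "
          f"{predicted_blitz(2000, intercept):.0f}, lines cross at {cross:.0f}")
```

Above the crossover rating the line predicts a blitz rating below the puzzle rating, and below it the opposite, so the slope has to be read together with the intercept.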

Overall, this gives us a general picture of the link between blitz and puzzle ratings, and I hope it offers a little more insight into these two components.

If any of this interests you, I recommend you join this new team: https://lichess.org/team/team-surveys-and-statistics