Chess’s recent surge in popularity and the Netflix show “The Queen’s Gambit,” featuring a female chess prodigy, have unsurprisingly inspired a flurry of new conversations about the old question of differences in chess ability between men and women.

Prof Wei Ji Ma (NYU, Psychology) had a few recent pieces arguing that the over-representation of men at the top level merely reflects, statistically, the larger number of men than women who play chess.  They include a recent piece published in Slate (https://slate.com/technology/2020/12/why-are-the-best-chess-players-men.html), which summarized an analysis posted at Chessbase.com (https://en.chessbase.com/post/what-gender-gap-in-chess).  These focus on chess players in India and largely replicate a prior study of chess players in Germany (Bilalic, Smallbone, McLeod & Gobet, 2008).

Both studies are fairly straightforward and follow from the basic idea that if you sample more often from a normal distribution, you’ll get more values in the extremes.  More men play chess than women, so you get higher top ‘scores’ for men.  Note that this is related to but different from the “fat tails” hypothesis, since this simple fact occurs even if both groups are drawn from distributions with identical mean and variance (the “fat tails” idea is that the distribution of men has higher variance).  Ma’s conclusion is that the difference among top players is completely predicted from the difference in population size.
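To make the sampling idea concrete, here is a minimal simulation sketch.  It is not the method used in either study; the group sizes (5000 and 500), the rating mean of 1500 and the standard deviation of 300 are all hypothetical numbers chosen only for illustration.  Both groups are drawn from the same normal distribution, and the only thing that differs is how many players are sampled.

```python
import random
import statistics

def mean_top_score(n_players, trials, rng, mean=1500.0, sd=300.0):
    """Average, over many simulated groups, of the best score among n_players.

    mean and sd are hypothetical rating parameters, identical for every group.
    """
    best_scores = []
    for _ in range(trials):
        best_scores.append(max(rng.gauss(mean, sd) for _ in range(n_players)))
    return statistics.fmean(best_scores)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical group sizes: ten times as many players in the larger group.
top_large = mean_top_score(5000, 200, rng)
top_small = mean_top_score(500, 200, rng)

print(f"average best score, larger group:  {top_large:.0f}")
print(f"average best score, smaller group: {top_small:.0f}")
```

The larger group ends up with the higher average best score even though both groups share one mean and one variance, which is the whole argument in miniature.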

However, two other analyses suggest this approach does not account for differences in other countries.  Jose Camacho Collados (https://josecamachocollados.medium.com/the-gender-gap-in-top-level-chess-15591d8990ba) argues that the observed gap in many countries is bigger than would be expected from population sampling alone.  Nikos Bosse (https://followtheargument.org/gender-differences-among-top-performers-in-chess) makes a similar point.

The approach and ideas are familiar, and they serve as a good example of two major flaws that every one of these analyses suffers from.  First, the statistical sampling approaches do not take into account that they are sampling performance data from a restricted and high-performing range.  Second, even if something about chess ability is innate, chess is a learned skill and you cannot draw any real conclusions about performance data without considering influences on the learning process.

The first point, about sampling, is fairly simple math but curiously impactful, and I have never seen it fully considered before in distribution studies.  All the analyses of chess players estimate the distribution of performance (as a proxy for ability) from rated players, which is one of the reasons people use chess as an expertise model, since the rating system provides a nicely quantifiable metric of current ability.  However, only the relatively better chess players have a rating, so rated chess players are not an effective representation of the distribution of chess ability in the population.  This restriction leads to underestimating the population variance, and that really throws off your estimate of the tails.  It can easily change your estimate of the expected number of people in the tails by a factor of 2 or 3.
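Here is one way to sketch the restriction problem with numbers.  The setup is my own simplification, not anything taken from the analyses above: population ability is assumed to be standard normal, a player is assumed to get a rating only if their ability is above a hard cutoff, and the naive analyst is assumed to fit a normal distribution to the rated players and read the tail off that fit.  The cutoff of -1 and the threshold of 3 are hypothetical values, and the size of the error depends heavily on both.

```python
from statistics import NormalDist

# Assumption: ability in the whole population is standard normal.
population = NormalDist(0.0, 1.0)

def rated_sample_moments(cutoff):
    """Mean and sd of a standard normal truncated below at `cutoff`."""
    hazard = population.pdf(cutoff) / (1.0 - population.cdf(cutoff))
    mean = hazard
    variance = 1.0 + cutoff * hazard - hazard ** 2
    return mean, variance ** 0.5

def tail_shares(cutoff, threshold):
    """Share of rated players above `threshold`: true value vs. naive normal fit."""
    mean, sd = rated_sample_moments(cutoff)
    # True share: population tail, renormalized to the rated (truncated) group.
    true_share = (1.0 - population.cdf(threshold)) / (1.0 - population.cdf(cutoff))
    # Naive share: tail of a normal fitted to the rated group's mean and sd.
    fitted_share = 1.0 - NormalDist(mean, sd).cdf(threshold)
    return true_share, fitted_share, sd

# Hypothetical values: players below -1 sd never get rated; "top" means above +3 sd.
true_share, fitted_share, rated_sd = tail_shares(cutoff=-1.0, threshold=3.0)

print(f"sd among rated players: {rated_sd:.3f} (population sd is 1)")
print(f"true share above threshold:  {true_share:.5f}")
print(f"naive normal-fit prediction: {fitted_share:.5f}")
```

Under these assumptions the spread measured among rated players is smaller than the population spread, and the fitted normal predicts fewer players above the threshold than are actually there.  Moving the cutoff or the threshold changes the size of the gap, so the specific numbers should not be read as an estimate for real chess data.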

So you can’t really look at the rated chess population and infer that there should be X men or women at the top unless you have an unbiased estimate of the variance in the broader population, which you don’t get from a sample that overrepresents experts.  This seems like a pretty basic sample/population issue, and it’s almost weird that some of the people making this mistake are trained statisticians who should know better.

The second point is much more commonly made.  It’s also pretty clear that chess rating (Elo) is quite strongly affected by learning and experience, and this will be dramatically impacted by societal factors that discourage women from pursuing chess (or limit opportunities or access to high-level training).  One of my minor quibbles with The Queen’s Gambit is that Beth Harmon doesn’t seem to need to actually learn any chess.  Even Bobby Fischer lost to top players as he moved up to playing professionals.  Maybe the point of the scriptwriters was to show that the only way a woman could crack the top echelons of chess would be to be extraordinarily talented from the beginning.  But if so, maybe they should have made that point clearly enough for people to see it.