In the Elo system the interval of a category is defined as one Standard Deviation (1 sigma) [corresponding to a band 200 Elo points wide, with 2000 "as a reference point"; both values were "already steeped in tradition" when Elo chose to use them. JM]
Rating categories (based on a 200 point interval and with 2000 as midpoint) -
3000+ Artificial Intelligences
2800+ Human World Champion & contenders
2600+ Strong GMs
2400+ IMs & weaker GMs
2200+ FMs & National Masters
2000+ Candidate Masters & other experts
1800+ Category 1 amateurs
1600+ Category 2 amateurs
1400+ Category 3 amateurs
1200+ Category 4 amateurs
1000+ Novices
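The category table above can be sketched as a simple lookup. This is just an illustration of the band structure (the `BANDS` list and `category` function are my own names, not anything from Elo):

```python
# Band floors and labels copied from the table above; each floor is
# one sigma (200 points) below the next.
BANDS = [
    (3000, "Artificial Intelligences"),
    (2800, "Human World Champion & contenders"),
    (2600, "Strong GMs"),
    (2400, "IMs & weaker GMs"),
    (2200, "FMs & National Masters"),
    (2000, "Candidate Masters & other experts"),
    (1800, "Category 1 amateurs"),
    (1600, "Category 2 amateurs"),
    (1400, "Category 3 amateurs"),
    (1200, "Category 4 amateurs"),
    (1000, "Novices"),
]

def category(rating):
    # Walk down from the highest band and return the first floor reached.
    for floor, label in BANDS:
        if rating >= floor:
            return label
    return "Below 1000"

print(category(2450))  # IMs & weaker GMs
```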
Each step up/down from the 2000 midpoint is 200 points and thus 1 sigma (one Standard Deviation).
For a player rated exactly 2000, a -
2200+ performance is 1+ sigma
2400+ performance is 2+ sigma
2600+ performance is 3+ sigma
2800+ performance is 4+ sigma
3000+ performance is 5+ sigma
3200+ performance is 6+ sigma
And so on...
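The steps above are just the 200-points-per-sigma convention applied to the gap between a performance and the 2000 midpoint. A minimal sketch (function name and defaults are mine):

```python
# Sigma count of a performance relative to a given rating, using the
# 200-points-per-sigma convention described above.
def sigmas(performance, rating=2000, sd=200):
    return (performance - rating) / sd

print(sigmas(2200))  # 1.0 sigma
print(sigmas(3200))  # 6.0 sigma
```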
The theory underpinning it is given by A. Elo -
"A player will perform around some average level... Deviations occur... (with) large deviations less frequently than small ones. These facts suggest the basic assumption of the Elo system -
The many performances of an individual will be normally distributed..." [as shown by the well-known symmetrical bell curve, as given further above in this thread. JM]
"Extensive investigation (Elo 1965, McClintock 1977) bore out the validity of this assumption. Alternative assumptions are discussed..." [elsewhere - JM]
"Statistical and probability theory provides a widely used measure of these performance spreads [deviations from the average - JM] a measure which has worked quite well for many other natural phenomena... Standard Deviation." [1 SD is denoted by 1 sigma - JM]
"The central bulk [of the bell curve] - about two-thirds [68% JM] - of an individual's performances lie within 2 Standard Deviations" [i.e. minus 1 to plus 1 SD (-1 sigma to +1 sigma. And that leaves 32% of the player's performances outside of that central bulk. With 16% higher than +1 sigma and the remaining 16% being lower than -1 sigma].
Q.E.D. (At least I'd like to think so.)
Thanks for the warning, Jack.

IM Jack Rudd wrote: ↑Tue Jun 16, 2020 10:40 pm
"Be careful with this analysis: five-round tournaments and nine-round tournaments will probably show significantly different numbers when it comes to variation around a TPR."
Is this the kind of thing you are warning about?
Matthew Turner wrote: ↑Mon Jun 15, 2020 4:22 pm
"So, the 200 points per one of z is a rule of thumb, but generally it relates pretty accurately to the Regan tests that we are talking about. The difference with the numbers in the paper you quote comes down to sample size."

DavidWalker wrote: ↑Mon Jun 15, 2020 3:38 pm
"Does a z-score of 4 really equate to an 800-point rating boost? I know that the Elo standard deviation is supposed to be 200, but this z-score is in a different domain (basically measuring actual moves matched against expected matches, I believe). There is a draft paper by Dr Regan in which he gives rating estimates for players based on moves played in their games. Randomly checking a few examples from this paper shows an SD of the rating estimates lower than 200."
So for example we have
Steinitz, World Championship match 1886: Est. performance 2352, 2-sigma range 2150–2553, 593 moves (from 20 games)
If we take an example from fewer games we have
Larsen, Candidates 1971: Est. performance 2187, 2-sigma range 1702–2602, 181 moves (from 6 games)
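The sample-size point can be made concrete from these two examples. Assuming the quoted "2 sigma range" means plus/minus 2 SD (so the range is 4 SD wide), we can recover the SD of each estimate and compare it with the 1/sqrt(number-of-moves) shrinkage we would expect if each move contributed roughly independent evidence (the helper function is mine, not Regan's):

```python
import math

# Recover the SD of a rating estimate from a quoted +/-2-sigma range:
# the range spans about 4 standard deviations.
def sd_from_range(lo, hi):
    return (hi - lo) / 4

steinitz = sd_from_range(2150, 2553)   # 593 moves, 20 games
larsen = sd_from_range(1702, 2602)     # 181 moves, 6 games

print(round(steinitz, 1))              # ~100.8 (well under 200)
print(round(larsen, 1))                # 225.0
# Predicted vs observed ratio of the two SDs under 1/sqrt(n) scaling:
print(round(math.sqrt(593 / 181), 2))  # ~1.81 predicted
print(round(larsen / steinitz, 2))     # ~2.23 observed
```

The rough agreement is consistent with Matthew Turner's point: more moves means a tighter estimate, which is why the Steinitz SD sits well below 200 while the six-game Larsen figure is wider.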