Paul Dupré wrote: Well, obviously they didn't use good test data. Otherwise, they would have understood why iteration doesn't work on big percentages (e.g. Callum Kilpatrick 8½/9), or for people who score zero.
You do them a disservice. Very few rating systems work if someone scores 9/9 or 0/9, or a score very near to it. Take Elo, and suppose I play a field with an average rating of 2000. If I score 0 out of 9, I'd have an initial rating of 1200, because they subtract 800 as a substitute for infinity. This is why you need at least 1 out of 9 for an initial rating. If I score 9 out of 9, I'd have an initial rating of 2135. If you score more than 50%, this is a fudge to lower your rating, on the expectation that you will then bring it up to par while your k is still 30.
If I win all my games apart from one, and draw that one, what grade should I be assigned?
Paul Dupré wrote: It reminds me of the two lighthouse men who played each other 30 games during one chess season. One was graded 180, and the other 140. During their matches there were a lot of wins and draws, but at the end the score was 15-15. Guess what happened next, when the new grades came out?
The 180 would come out as 140, and the 140 as 180.
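To see why the grades swap, here is a minimal sketch of the arithmetic, assuming a grade is simply the average over all games of the opponent's grade plus 50 for a win, minus 50 for a loss, and unchanged for a draw. The function name `ecf_grade` and the 10/10/10 split of wins, draws and losses are made up for illustration, and any other adjustments the real system applies are ignored.

```python
def ecf_grade(opponent_grade, wins, draws, losses, bonus=50):
    """Average per-game grading points against a single opponent.

    Assumed rule: a win scores opponent + bonus, a draw scores the
    opponent's grade, a loss scores opponent - bonus.
    """
    games = wins + draws + losses
    total = (wins * (opponent_grade + bonus)
             + draws * opponent_grade
             + losses * (opponent_grade - bonus))
    return total / games


# Hypothetical split of a 15-15 result: 10 wins, 10 draws, 10 losses each.
new_grade_of_180 = ecf_grade(140, wins=10, draws=10, losses=10)
new_grade_of_140 = ecf_grade(180, wins=10, draws=10, losses=10)

print(new_grade_of_180)  # 140.0
print(new_grade_of_140)  # 180.0
```

Because wins and losses cancel in a level score, each player simply ends up with the other's old grade, whatever the split between wins and draws.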
Exactly the same would happen under Elo. Take the same thirty games between a 1400 and an 1800, with each player winning 15 and losing 15: for k = 15, the 1800 becomes 1611 and the 1400 becomes 1589. For k = 30, on the other hand, the 1800 would become 1422, and the 1400 would become 1778.
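These figures can be reproduced with a single batch update over the whole season. The sketch below assumes an expected score of 0.92 per game for the higher-rated player at a 400-point gap (the tabulated value, which is what the numbers above imply, rather than the 0.909 the logistic formula gives). The names `batch_elo` and `EXPECTED_400` are illustrative only.

```python
EXPECTED_400 = 0.92  # assumed per-game expectation for a 400-point favourite


def batch_elo(rating, expected_per_game, score, games, k):
    """One rating update over a whole batch of games against one opponent."""
    return rating + k * (score - games * expected_per_game)


results = {}
for k in (15, 30):
    high = batch_elo(1800, EXPECTED_400, score=15, games=30, k=k)
    low = batch_elo(1400, 1 - EXPECTED_400, score=15, games=30, k=k)
    results[k] = (round(high), round(low))

print(results[15])  # (1611, 1589)
print(results[30])  # (1422, 1778)
```

With k = 30 the single update is large enough to carry both players past the midpoint, which is the overshoot seen in the lighthouse example.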
One issue here is the size of the k-factor. The ECF's version of k is adding 50 for winning and subtracting 50 for losing. The system would be smoother if a lower number than 50 were used.
The other issue is how often you keep count. If FIDE ratings were updated after every game, then the two ratings would indeed converge to 1600 if they played enough games. With ECF they wouldn't, but the ECF system is based on the lighthouse men issue not arising in practice. I think that, as grading assumptions go, this is a pretty safe one. Most players, over the course of their 9-30 games, will play a variety of different opponents, so the system doesn't need to cater for this. This notwithstanding, if ECF grades were on a rolling, live basis, and incorporated previous grading history as they went, then in the lighthouse men problem the grades would converge too. So the second issue is how often you produce a new grading list. Do it live, and it converges. Do it after a year, and you could overshoot.
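The convergence claim for live updating can be checked with a small simulation. This is only a sketch under stated assumptions: it uses the standard logistic expected-score formula rather than a lookup table, k = 15, ratings kept as unrounded floats, and a made-up result pattern in which the two players simply alternate wins (15-15 per season). The names `expected_score` and `play_live` are hypothetical.

```python
def expected_score(rating, opponent):
    """Logistic Elo expectation for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def play_live(high, low, results, k=15):
    """Update both ratings after every game.

    `results` holds the higher-rated player's score in each game
    (1.0 win, 0.5 draw, 0.0 loss).
    """
    for score_high in results:
        delta = k * (score_high - expected_score(high, low))
        high += delta
        low -= delta  # the opponent gains exactly what the other loses
    return high, low


season = [1.0, 0.0] * 15  # assumed pattern: alternate wins, 15-15 overall

after_one = play_live(1800.0, 1400.0, season)
after_many = play_live(1800.0, 1400.0, season * 100)

print(after_one)   # the gap has narrowed, with no overshoot
print(after_many)  # both ratings hover within a few points of 1600
```

Because each per-game step is small relative to the gap, the ratings close in on 1600 gradually instead of jumping past each other, which is the contrast with the once-a-year batch update.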
The second issue will always be present in any rating system. This is why FIDE, presumably, are trying to bring out rating lists with ever-increasing frequency.
Having 50 for a win and 50 for a loss was reasonable over a 12-month period of grading. Now that we have 6-monthly grading, perhaps we need to reduce 50 to 25? People commented elsewhere that their grades seemed more up and down than before. Clearly this was only anecdotal, but reducing 50 to 25 would help to reduce their up-and-down-ness.
Why do we use 50? That's what Clarke did, presumably, and we've copied him ever since. He was working with Elo, so Elo presumably agreed with the choice of 50.