Richard Bates wrote: I agree that there is a danger that changes (upwards) might be made for 'excitement'.
I think where I would see Keith's argument as flawed is that he is effectively comparing two methods - 6-monthly lists and monthly lists - but only considering how they compare at their common intersections, i.e. the six-monthly points. In other words, he is considering two lists both published six-monthly but calculated differently (with, I suppose, unofficial, unpublished ratings for the monthly methodology).
The six-monthly lists had the great danger that significantly 'mis-rated' individuals could play a lot of games and massively "overshoot" their true level of strength (which could then be replicated in the other direction if they played a lot of games in the next period, and so on). Against that, they also offered the protection that isolated extreme performances would have less effect on those whose ratings were roughly accurate - the lengthy time between lists meant that the good performances would be balanced by the bad.
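The "overshoot" danger can be sketched numerically. This is a hypothetical illustration, not a real player's record: a player rated 2500 whose true strength is 2600 plays many games in one six-monthly period. Because the published rating stays frozen until the next list, every update is computed against the stale 2500, so gains accumulate linearly and can sail past the true level.

```python
# Hypothetical sketch of the six-monthly "overshoot" danger: the published
# rating stays fixed for the whole period, so per-game gains add up
# linearly with no self-correction.

def expected_score(own: float, opp: float) -> float:
    """Standard Elo expected score."""
    return 1 / (1 + 10 ** ((opp - own) / 400))

K = 10
published = 2500.0      # rating frozen for the whole 6-month period
true_strength = 2600.0  # what the player actually plays like
opponent = 2500.0       # assume all opposition rated 2500 for simplicity

# Actual scores average the *true* expectancy, but every update is
# computed against the stale *published* rating.
gain_per_game = K * (expected_score(true_strength, opponent)
                     - expected_score(published, opponent))

for n_games in (30, 60, 100):
    print(n_games, "games ->", round(published + n_games * gain_per_game, 1))
```

With these numbers the gain is about 1.4 points per game, so after 100 games the new list rating lands around 2640 - forty points past the player's true strength, purely because the published rating never caught up mid-period.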
Monthly/very frequent lists offer almost exactly the opposite danger. On the one hand there is little danger of "overshoot", but on the other there is a greater chance of 'noise' from natural variation in performance distorting the accuracy of ratings. For an obvious example at the top, Caruana has shown that it is still possible to gain significant points with K = 10. With K = 20 he might have been on the verge of 2900! What excitement!
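The scaling here is exact, because the per-game change is K times (score minus expected score): doubling K from 10 to 20 doubles every swing, noise and genuine gains alike. A quick check with purely illustrative numbers (not Caruana's actual results) - a 2800 player scoring 7/10 against 2750 opposition:

```python
# Rating change = K * (actual score - expected score), so the swing is
# strictly proportional to K.  Illustrative numbers only.

def expected_score(own: float, opp: float) -> float:
    return 1 / (1 + 10 ** ((opp - own) / 400))

score, games, opp = 7.0, 10, 2750.0
exp_total = games * expected_score(2800.0, opp)  # expected points from 10 games

for K in (10, 20):
    print("K =", K, "-> change:", round(K * (score - exp_total), 1))
```

With these figures the K = 10 gain is about +13 and the K = 20 gain about +26 - the same result, simply doubled.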
So comparing K-factors on 6-monthly vs monthly lists leads to the following points on accuracy:
- Individuals with "accurate" ratings (if there is such a thing as an 'accurate' rating) arguably "should" have much lower K-factors on monthly lists than on 6-monthly lists, to control the increased distortion caused by natural variation.
- Individuals who are 'mis-rated' "should" have higher K-factors (with the proviso that they did not play so many games under 6-monthly lists that they were 'over-shooters' and needed a reduction anyway).
Keith's argument seems to be based solely on the theoretical cases of the latter individuals (who are far less likely to exist at higher rating levels), whereas the actual K-factor should be a balance that best aligns with the actual population of the rating pool (which is presumably why it is higher at lower levels - the influence of the 'mis-rated' individuals there is far greater).
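The trade-off for the mis-rated player under monthly lists can be sketched in expectation. Hypothetical setup again: true strength 2600, starting rating 2500, all opponents rated 2500, with the rating updated game by game (so, unlike the frozen six-monthly case, each update uses the current rating and the gap shrinks as it closes - no overshoot in expectation, just faster or slower convergence depending on K):

```python
# Expected-value sketch of monthly-list convergence for a mis-rated
# player.  Each step applies the *expected* update, so this shows the
# average trajectory, not any single noisy realisation.

def expected_score(own: float, opp: float) -> float:
    return 1 / (1 + 10 ** ((opp - own) / 400))

def expected_rating_after(games: int, K: float,
                          start=2500.0, true=2600.0, opp=2500.0) -> float:
    r = start
    for _ in range(games):
        # actual results average the true expectancy; the update uses
        # the current (continually refreshed) rating
        r += K * (expected_score(true, opp) - expected_score(r, opp))
    return r

for K in (10, 20):
    print("K =", K, "-> after 50 games:", round(expected_rating_after(50, K), 1))
```

With these numbers, after 50 games the K = 10 player has closed only about half the 100-point gap while the K = 20 player has closed roughly three-quarters of it - and neither overshoots 2600 in expectation, which is the argument for a higher K for this population.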
I'd like to think that if/when FIDE adopts 'changes (upwards)' (i.e. K-factor 20 where it is currently 10), its motive will be to correctly rate 'mis-rated' individuals more quickly, while putting up with the disadvantageous by-product of more 'excitement', rather than because it (wrongly) sees 'excitement' as beneficial.
Richard may be overly concerned about the effect of 'noise from natural variation in performance' by players with 'accurate ratings', i.e. 'excitement'. I like to back up my arguments with examples, so let's imagine that an 'accurately rated' 2500 IM creates some 'excitement' by making a GM norm (a 2600 performance) in a 9-round tournament. I can hardly be accused of using a far-fetched example this time, so let's see what effect these 9 games would have on his rating were K-factor 20 in operation: new monthly rating 2525. Is that really too much 'excitement'? He is now 25 points over-rated, so there will be a slight pull back towards 2500 in his subsequent games.
I would argue that this 'excitement' is a price worth paying in order to more quickly correctly rate mis-rated players.
On a more whimsical level, here is a thought experiment to define an 'accurately rated player': let us imagine that there are 1000 universes, duplicated in every way except that the player takes on a different opponent in each. The strength of his opposition is evenly spread, from those significantly stronger than himself all the way down to those significantly weaker. What's more, let us assume, even though this may be logically impossible, that in each of the duplicated universes the player has free will. If the player's resulting performance from the 1000 games is the same as his current rating, then he is an 'accurately rated player'.
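The thought experiment has a neat deterministic version, because the Elo expectancy curve is symmetric: E(d) + E(-d) = 1 for any rating difference d. So if a (hypothetical) 2500 player faces 1000 opponents spread evenly from 2100 to 2900, the expected total score is exactly 500/1000 = 50%, which is a performance equal to the average opposition - his own rating:

```python
# Deterministic version of the 1000-universe experiment: a 2500 player
# against 1000 opponents evenly spread from 2100 to 2900.  Opponents pair
# off symmetrically about 2500, and E(d) + E(-d) = 1, so the expected
# total is exactly 500 -- a performance of exactly 2500.

def expected_score(own: float, opp: float) -> float:
    return 1 / (1 + 10 ** ((opp - own) / 400))

player = 2500.0
opponents = [2100 + i * 800 / 999 for i in range(1000)]  # evenly spread

total = sum(expected_score(player, o) for o in opponents)
avg_opp = sum(opponents) / len(opponents)
print("expected score:", round(total, 6), "/ 1000, vs average", round(avg_opp, 1))
```

A 50% score against average-2500 opposition is a 2500 performance, so the accurately rated player's expected performance equals his rating, exactly as the thought experiment requires.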
As a total digression, can you calculate how I, with Black, finished off my round 3 opponent, Johan Salomon (2343), here at the London Classic Open? White: Qe1, Rb1, Nc6, Kg2, pawns h4, g3, f2, d4, b5. Black: Qf5, Rs a3, a8, Kg7, pawns h5, g6, f7, e6. Jack, or anyone, can you make this into a diagram for me please!
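In lieu of a proper graphical diagram, here is a small script that builds a text board from the piece list exactly as given above (upper case = White, lower case = Black) - assuming I have transcribed the squares correctly:

```python
# Text diagram of the position as listed in the post.
# White: Qe1, Rb1, Nc6, Kg2, pawns h4 g3 f2 d4 b5
# Black: Qf5, Ra3, Ra8, Kg7, pawns h5 g6 f7 e6

pieces = {
    "e1": "Q", "b1": "R", "c6": "N", "g2": "K",
    "h4": "P", "g3": "P", "f2": "P", "d4": "P", "b5": "P",
    "f5": "q", "a3": "r", "a8": "r", "g7": "k",
    "h5": "p", "g6": "p", "f7": "p", "e6": "p",
}

def diagram(pieces: dict) -> str:
    rows = []
    for rank in range(8, 0, -1):  # rank 8 (Black's back rank) at the top
        squares = [pieces.get(f"{file}{rank}", ".") for file in "abcdefgh"]
        rows.append(f"{rank} " + " ".join(squares))
    rows.append("  a b c d e f g h")
    return "\n".join(rows)

print(diagram(pieces))
```

Anyone who prefers a graphical board can feed the same position into their diagram tool of choice; the script is just a stopgap until someone obliges with a proper image.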