Then, according to GS, when the grade of player 'A' is calculated, it is assumed that player 'A' scored less than the grade difference would suggest exclusively because player 'A' played below the level of his or her grade (while player 'B' played exactly at the level of his or her grade). But when the grade of player 'B' is calculated, it is assumed that player 'B' scored more than the grade difference would suggest exclusively because player 'B' played above the level of his or her grade (while player 'A' played exactly at the level of his or her grade). These two assumptions contradict each other.
On the other hand, according to AGS, when the grades of players 'A' and 'B' are calculated, it is assumed that player 'A' scored less than the grade difference would suggest both because player 'A' played below the level of his or her grade and because player 'B' played above the level of his or her grade. It is assumed that player 'B' played better by as much as player 'A' played worse, i.e., that both players contributed equally to the difference between actual and expected performance.
Therefore, AGS should be better than GS.
In ÉGS2 a further improvement has been made: the two players need not contribute equally to the difference between actual and expected performance. How much each player contributes is calculated from an estimate of how much that player's grade can be trusted (which is in turn based on the number of games the player played in the last season). It is assumed that the less a grade is trusted, the larger its holder's contribution to the difference between actual and expected performance. Consequently, less trusted grades change more rapidly than more trusted grades.
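To make the idea of unequal contributions concrete, here is a minimal Python sketch. The weighting rule in it (distrust taken as 1 divided by the number of games played last season) is purely an assumption for illustration and is not the actual ÉGS2 formula; the function name `contribution_shares` is also hypothetical.

```python
def contribution_shares(games_a, games_b):
    """Split the gap between actual and expected performance between two players.

    ASSUMPTION (illustration only, not the ÉGS2 formula): a grade's
    "distrust" is 1 / (games played last season), and each player's share
    of the gap is proportional to that distrust.
    """
    if games_a <= 0 or games_b <= 0:
        raise ValueError("both players need at least one game")
    distrust_a = 1.0 / games_a
    distrust_b = 1.0 / games_b
    total = distrust_a + distrust_b
    return distrust_a / total, distrust_b / total


# Equal activity reproduces the equal split assumed by AGS.
print(contribution_shares(30, 30))

# A player with fewer games has a less trusted grade and takes a larger share.
print(contribution_shares(10, 30))
```

In this picture GS corresponds to attributing the whole gap to whichever player is being graded, AGS to a fixed half-and-half split, and ÉGS2 to a split that depends on how far each grade is trusted.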
Another improvement in ÉGS2 over AGS is in the definition of 'p = f(d)', which corresponds to the red line in Figure 1 below (the relationship used in the Élo system, which is regarded as a very accurate relationship between differences in chess ability and expected performances).
Figure 1: Relationship between expected performance 'p' and grade difference 'd' as defined in GS (green line), CGS and AGS (blue line) and ÉGS and ÉGS2 (red line). Expected performance 'p' is a function of grading difference 'd', i.e. 'p = f(d)'.
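For readers without the figure, two of these relationships can be sketched in Python. This is only an approximation under stated assumptions: the GS line is taken to be the linear relationship implied by scoring each game as the opponent's grade plus or minus 50, with grade differences limited to 40 points, and the Élo line is taken to be the usual logistic curve, with one grade point assumed to equal 8 Élo points. The blue CGS/AGS line is not reproduced, and the function names are hypothetical.

```python
def p_linear_capped(d, cap=40.0):
    """Expected score implied by a linear rule: p = 0.5 + d / 100.

    ASSUMPTION: the grade difference d is limited to +/- cap points.
    """
    d = max(-cap, min(cap, d))
    return 0.5 + d / 100.0


def p_elo_logistic(d, elo_per_grade_point=8.0):
    """Logistic Élo expected score for a difference of d grade points.

    ASSUMPTION: one grade point is worth `elo_per_grade_point` Élo points.
    """
    elo_difference = d * elo_per_grade_point
    return 1.0 / (1.0 + 10.0 ** (-elo_difference / 400.0))


# Compare the two curves at a few grade differences.
for d in (0, 10, 20, 30, 40, 60):
    print(d, round(p_linear_capped(d), 3), round(p_elo_logistic(d), 3))
```

Under these assumptions the linear rule stops rising once the difference reaches 40 points, while the logistic curve keeps approaching 1 without ever reaching it.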
If player 'A' (graded 180) and player 'B' (graded 170) both play a pool of 30 players with the same grade (say 150) and score the same (say 50%), it seems that player 'A', whose grade is higher than that of player 'B', should be penalized more (he should lose more grading points than player 'B'), because more was expected from player 'A' than from player 'B'.
Sean Hewitt wrote: I haven't seen that suggested before.
The first consequence of it is that two players with identical results would get different grades, e.g.
Player A (graded 180) plays 30 games against opponents whose average grade is 150 and scores 50%. Under this suggestion he would be graded (180+150)/2 = 165.
Player B (graded 170) plays the same 30 opponents and also scores 50%. His new grade would be (170+150)/2 = 160. So graded 5 points less for an identical performance. This does not seem logical to me.
Under the ECF system both players would get a grade of 150 in the above scenario.
Indeed, this is the case under both systems: under AGS (David Shepherd's proposal is AGS with s=50, whereas AGS has s=40), player 'A' is penalized by 15 points and player 'B' by 10 points, and under GS, player 'A' is penalized by 30 points and player 'B' by 20 points.
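The numbers in this example can be checked with a short Python sketch. It assumes that GS averages per-game performances of the opponent's grade plus 50 for a win, minus 50 for a loss, and the opponent's grade itself for a draw (the 40-point limit on grade differences does not come into play here). The averaging proposal is implemented only for a 50% score, as in the quoted example, since the general formula is not given in this thread. The function names are made up for illustration.

```python
def gs_grade(opponent_grades, scores):
    """Average of per-game performances: opponent's grade + 100 * (score - 0.5).

    Scores are 1 (win), 0.5 (draw) or 0 (loss). ASSUMPTION: no opponent is
    more than 40 points away, so no grade difference needs to be limited.
    """
    performances = [g + 100 * (s - 0.5) for g, s in zip(opponent_grades, scores)]
    return sum(performances) / len(performances)


def averaged_grade_at_50_percent(own_grade, opponent_grades):
    """Quoted proposal for a 50% score only: mean of own and opponents' grade."""
    mean_opponent = sum(opponent_grades) / len(opponent_grades)
    return (own_grade + mean_opponent) / 2


pool = [150] * 30      # 30 opponents, all graded 150
scores = [1, 0] * 15   # 15 wins and 15 losses, i.e. 50%

for name, own in (("A", 180), ("B", 170)):
    gs = gs_grade(pool, scores)
    avg = averaged_grade_at_50_percent(own, pool)
    print(name, "GS:", gs, "penalty", own - gs, "| averaged:", avg, "penalty", own - avg)
```

Under these assumptions both players end up on 150 under GS (penalties of 30 and 20 points), and on 165 and 160 under the averaging proposal (penalties of 15 and 10 points).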
In my opinion, GS penalizes players 'A' and 'B' too much: as much as if their under-performance were due exclusively to them playing worse than their grades would suggest (i.e., assuming that the players in the pool played exactly as their grades would suggest). In fact, we cannot tell for sure why players 'A' and 'B' under-performed, so AGS penalizes them as if their under-performance were due both to them playing worse than their grades would suggest and to the players in the pool playing better than their grades would suggest (it is assumed that the players in the pool played better by as much as players 'A' and 'B' played worse). Also, when the grades of the players in the pool are calculated according to GS, the assumption is that the players in the pool over-performed exclusively because they played better (i.e., it is assumed that players 'A' and 'B' performed exactly as their grades would suggest), which contradicts the assumption made when calculating the grades of players 'A' and 'B'.