GRADING ANOMALIES
-
- Posts: 155
- Joined: Wed Apr 11, 2007 9:29 am
Dear Robert & others,
Thank you for initiating this debate. I’d like to offer a couple of observations. Please forgive me if I’ve misunderstood anything posted previously. I almost certainly have.
Robert – I believe you make a compelling case for change, at least from a mathematical viewpoint. Given the choice between AGS and EGS, I would be inclined to plump for EGS – not as a result of any statistical argument, but the Elo system does seem to be the way that many countries are going. If I understand correctly, both AGS and EGS would be considerably better methodologies than the one we currently use.
The debate so far has concentrated on the best methodology of measuring a player’s past performance. But in practice most players think of their grade as a measure of their current playing strength, hence comments like “juniors are under graded”. Do folks feel there is any merit in adjusting a methodology to try to address this? The Elo system has its k-factor, for example.
I concur with the previous comment about negative grades. This is a no-no.
Peter
The problem with juniors is that they are not all undergraded (as many wrongly believe). Some are, some are not. In fact, the only consistent thing about juniors is that they are consistently inconsistent!!
In my paper on the grading system published last year I advocated treating juniors in exactly the same way as we do players with negative grades, and that is to ignore their previous grade. Instead, you calculate their grading performance for the current year and use that when calculating the grades of their opponents.
If we were to switch to Elo then the simple way to deal with juniors in that methodology is to give them a larger k factor. If average club players were to have k=32, you might give juniors k=48 (or more) so that their rating reflects their true playing strength more quickly.
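As a rough illustration of what a larger k factor does, here is a minimal sketch of the standard Elo update (logistic expected score on a 400-point scale). The k values 32 and 48 are the ones suggested above; the ratings of 1500 and the function names are made up for the example.

```python
def elo_expected(r_a: float, r_b: float) -> float:
    """Expected score of player A against player B (standard Elo logistic curve)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_update(r_a: float, r_b: float, score_a: float, k: float) -> float:
    """New rating of A after one game; score_a is 1 (win), 0.5 (draw) or 0 (loss)."""
    return r_a + k * (score_a - elo_expected(r_a, r_b))


# Hypothetical example: two equally rated players, A wins.
adult_new = elo_update(1500, 1500, 1.0, k=32)   # adult with k=32
junior_new = elo_update(1500, 1500, 1.0, k=48)  # junior with k=48
print(adult_new, junior_new)
```

With equal ratings the expected score is 0.5, so the same win moves the k=48 player by 24 points and the k=32 player by 16.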
-
- Posts: 526
- Joined: Sun May 13, 2007 11:23 pm
-
- Posts: 207
- Joined: Wed May 16, 2007 1:31 pm
- Location: Surrey
Hello Peter, Sean and Paul,
Thank you for your comments.
Peter Sowray wrote: I concur with the previous comment about negative grades. This is a no-no.
Artificially increasing negative grades to zero would not accord with the grade definition of any of the mentioned systems and would result in overrating weaker players. Of course, if so desired, negative grades could be published as zero (although they are in fact negative). Please note that a grade of zero is not a special grade, and neither is a negative grade: it simply happens that one particular relative chess ability has been assigned a grade of zero. If all grades were, say, 100 points higher, there would most likely be no negative grades.
Grading systems classification...
New grades ('a2' and 'b2') for players A (with grade 'a') and B (with grade 'b') in the grading systems mentioned so far, GS, CGS, AGS, ÉGS and NGS, are calculated by the formulas...
Code:
a2 = a + k*(q - p);
b2 = b + k*((100-q) - (100-p));
...where 'q' is the actual and 'p' the expected performance (of player A). The systems differ in how they define grade (the relationship between expected performance 'p' and grade difference 'd') and in the coefficient 'k'.
GS defines grade as (green line in the diagram below, 'd' on the horizontal axis, 'p' on the vertical axis)...
CGS and AGS define grade as (blue line in the diagram below, 'd' on the horizontal axis, 'p' on the vertical axis)...
ÉGS defines grade as (red line in the diagram below, 'd' on the horizontal axis, 'p' on the vertical axis)...
NGS defines grade as (green line in the diagram below, 'd' on the horizontal axis, 'p' on the vertical axis)...
GS and CGS take...
Code:
k=1;
...which effectively means that when calculating the grades of players A and B, the grade of the other player is taken to match his or her current chess ability. This is a contradiction in itself: if the players did not perform as expected and, say, player A's grade is taken to match his or her chess ability, then this cannot be the case for the grade of player B, and vice versa.
AGS, ÉGS and NGS take...
Code:
k=1/2;
...which effectively means that when calculating the grades of players A and B, the grade of the other player is not necessarily taken to match his or her current chess ability. That is to say, if the players did not perform as expected it is assumed that the chess ability of both players has changed; if the players performed as expected their grades do not change.
Grade difference...
According to GS's grade definition, if a stronger player keeps winning against a weaker player their grade difference keeps increasing without limit, whereas in all the other grading systems mentioned the grade difference stops increasing once it reaches some limit: 50 for CGS and AGS, approximately 120 for ÉGS, and 120 for NGS. I prefer systems where the grade difference stops increasing at some limit, and frankly do not see any advantage in that limit being 120 rather than 50.
Conservation of total grade...
All of the mentioned grading systems, GS, CGS, AGS, ÉGS and NGS, preserve total grade. That is to say, in a closed system (no players entering or leaving) the sum of all grades remains the same in every season; only the distribution of grades amongst the players may change. That is because if a player's grade increases by some amount, the grade of his or her opponent decreases by the same amount.
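The per-game argument can be checked numerically. The sketch below applies the two update formulas quoted earlier to a single game and confirms that the sum of the two grades is unchanged when both players use the same 'k'. The grades, the expected performance 'p=70' and the function name are illustrative assumptions; the check covers one game only and says nothing about how season averaging or a per-player 'k' would behave.

```python
def update_pair(a: float, b: float, q: float, p: float, k: float):
    """Apply a2 = a + k*(q - p) and b2 = b + k*((100-q) - (100-p)).

    q is player A's actual and p player A's expected performance, both on a 0..100 scale.
    """
    a2 = a + k * (q - p)
    b2 = b + k * ((100 - q) - (100 - p))
    return a2, b2


# Hypothetical game: A (120) draws with B (100); A was expected to score 70.
a2, b2 = update_pair(a=120, b=100, q=50, p=70, k=0.5)
print(a2, b2, a2 + b2)  # the total stays at 220
```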
Grade inflation and deflation...
Grade inflation or deflation could be caused by new players entering a system or by players within the system whose chess abilities change rapidly, say rapidly improving juniors. Other factors may include inactive players, etc.
If the chess abilities of new players entering the system are underestimated (their initial grades are set too low) this would cause grade deflation, and vice versa.
Rapidly improving juniors cause grade deflation as their chess abilities are in principle above what their grades would suggest. In principle, no player's grade reflects his or her true chess ability, as a player's chess ability changes, a player's grade may be statistically insignificant, etc., but it is a well-known fact that most juniors are stronger than their grades would suggest.
I do not know if there are any other obvious examples of grade inflation or deflation which could cause problems in practice...
In my opinion the present grading system (GS) may cause grade deflation for players with low grades and grade inflation for players with high grades. That, I believe, could be the effect of the fact that... if a stronger player keeps winning against a weaker player their grade difference will keep increasing without limit.
Addressing the problem of grade deflation...
Peter Sowray wrote: The debate so far has concentrated on the best methodology of measuring a player’s past performance. But in practice most players think of their grade as a measure of their current playing strength, hence comments like “juniors are under graded”. Do folks feel there is any merit in adjusting a methodology to try to address this? The Elo system has its k-factor, for example.
Sean Hewitt wrote: If we were to switch to ELO then the simple way to deal with juniors in that methodology is to give them a larger k factor. If average club players were to have k=32, you may give juniors k=48 (or more) so that their rating reflects their true playing strength more quickly.
Paul D wrote: I agree with the higher k factor for juniors. Certainly Ireland used to set k=50 for under 21's
Assuming that the only (inflation/deflation) problem to address is grade deflation caused by the juniors, the solution could be as follows...
When calculating a new grade for a junior who was playing a non-junior, use CGS ('k=1': if the player did not perform as expected the grade changes more rapidly), and in all other cases (a junior playing a junior, and a non-junior playing either a junior or a non-junior) use AGS ('k=1/2': if the player did not perform as expected the grade does not change so rapidly).
Such mixing of the systems would mean that the resulting grading system does not conserve the total grade (for improving juniors the total system grade will increase), but this increase in total grade may account for the fact that 'juniors are under graded'.
For example, if a 100 junior draws against a 120 non-junior then, according to the above proposal to mix CGS with AGS, the junior's new grade would be 120 (not 110, which would be the result under AGS) and the non-junior's new grade would be 110. The total grade thus increases by 10 grading points (the junior's grade increased by 20 points, the non-junior's grade decreased by 10 points). Juniors would gain points more rapidly (if they performed better than expected) and non-juniors would not lose as many points as they do now under GS or CGS when underperforming against juniors. If a junior performs worse than expected then he or she will also lose more points than under AGS, as the above proposal only assumes that juniors are 'rapidly changing'; you do not know if it is for better or worse.
This could be achieved with two simple rules...
CGS's rule reads...
Rule 1b: For a win you score your opponent's grade plus 50; for a draw, your opponent's grade; and for a loss, your opponent's grade minus 50. Note that, if your opponent's grade differs from yours by more than 50 (not 40) points, it is taken to be exactly 50 (not 40) points above (or below) yours. At the end of the season an average of points-per-game is taken, and that is your new grade.
AGS's rule reads...
Rule 2: For a win you score average grade plus 25; for a draw, average grade; and for a loss, average grade minus 25. Note that, if your opponent's grade differs from yours by more than 50 points, it is taken to be exactly 50 points above (or below) yours. Average grade is half of the sum of your and your opponent's grade. At the end of the season an average of points-per-game is taken, and that is your new grade.
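The two rules and the junior/non-junior mixing can be sketched in a few lines. This is only one reading of Rules 1b and 2 above, not an official implementation: all function names are invented for the example, and it assumes that the "average grade" in Rule 2 is taken after the 50-point limit has been applied to the opponent's grade.

```python
RESULT_SIGN = {"win": 1, "draw": 0, "loss": -1}


def clamp_opponent(own: float, opp: float, limit: float = 50) -> float:
    """Treat an opponent more than `limit` points away as exactly `limit` points away."""
    return max(own - limit, min(own + limit, opp))


def cgs_score(own: float, opp: float, result: str) -> float:
    """Rule 1b: opponent's (limited) grade plus 50, plus 0, or minus 50."""
    return clamp_opponent(own, opp) + 50 * RESULT_SIGN[result]


def ags_score(own: float, opp: float, result: str) -> float:
    """Rule 2: average grade plus 25, plus 0, or minus 25 (assumed: average uses the limited grade)."""
    average = (own + clamp_opponent(own, opp)) / 2
    return average + 25 * RESULT_SIGN[result]


def game_score(own, opp, result, own_is_junior, opp_is_junior):
    """Mixed proposal: CGS for a junior playing a non-junior, AGS in all other cases."""
    if own_is_junior and not opp_is_junior:
        return cgs_score(own, opp, result)
    return ags_score(own, opp, result)


def season_grade(scores):
    """New grade: the average of the points scored per game over the season."""
    return sum(scores) / len(scores)


# The worked example from the post: a 100 junior draws against a 120 non-junior.
junior_new = season_grade([game_score(100, 120, "draw", True, False)])
adult_new = season_grade([game_score(120, 100, "draw", False, True)])
print(junior_new, adult_new)  # 120.0 and 110.0, so the total rises by 10
```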
Robert Jurjevic
Vafra
-
- Posts: 207
- Joined: Wed May 16, 2007 1:31 pm
- Location: Surrey
Hello Sean,
Sean Hewitt wrote: The problem with juniors is that they are not all under graded (as many wrongly believe). Some are, some are not. In fact, the only consistent thing about juniors is that they are consistently inconsistent!!
Well, I did not know that, but the above proposal...
When calculating a new grade for a junior who was playing a non-junior use CGS ('k=1', if not performed as expected the grade changes more rapidly) and in all other cases (junior playing junior and non-junior playing either junior or non-junior) use AGS (k=1/2, if not performed as expected the grade does not change so rapidly).
...should work in this case as well, as...
the above proposal only assumes that juniors are 'rapidly changing', you do not know if it is for better or worse
Sean Hewitt wrote: In my paper on the grading system published last year I advocated treating juniors in exactly the same way as we do players with negative grades and that is to ignore their previous grade. Instead, you calculate their grading performance for the current year and use that when calculating the grades of their opponents.
Treating juniors that way might be a good idea; then AGS could be applied to all. Regarding negative grades, I guess one should use them in the calculation, although a zero grade can be published.
Sean Hewitt wrote: If we were to switch to ELO then the simple way to deal with juniors in that methodology is to give them a larger k factor. If average club players were to have k=32, you may give juniors k=48 (or more) so that their rating reflects their true playing strength more quickly.
ÉGS's equivalent of FIDE's 'k' of 'kf=32' is 'k=1/25' ('k=kf/800'). That can be obtained from the scaling facts...
Élo suggested scaling ratings so that a difference of 200 rating points in chess would mean that the stronger player has an expected score of approximately 0.75.
In order to keep the present ECF grade scale we could suggest scaling ratings so that a difference of 25 (not 200) rating points in chess would mean that the stronger player has an expected score of approximately 0.75.
...and the difference between FIDE Élo formulas...
Code:
a2 = a + k*(q - p);
b2 = b + k*((1-q) - (1-p)) = b - k*(q - p);
...where 'p' and 'q' are in the range between 0 and 1 ...and ÉGS's formulas...
Code:
a2 = a + k*(q - p);
b2 = b + k*((100-q) - (100-p)) = b - k*(q - p);
...where 'p' and 'q' are in the range between 0 and 100...
Code:
ClearAll[kf, pf, qf, k];
(* p = 100*pf and q = 100*qf are substituted directly *)
Solve[kf*(qf - pf)*25/200 == k*(100*qf - 100*pf), k]
{{k -> kf/800}}
So, we can see that FIDE's 'k' which would match ÉGS's 'k=1/2' would be 'kf=400', and we know that FIDE's 'k' is 'kf=16' for masters and 'kf=32' for weaker players.
I think that the reason for such a big difference is that FIDE's 'k' is chosen for a system in which grading is done many times in a season, after each tournament, etc. As '400/32 = 12.5' (the ratio between the FIDE equivalent of ÉGS's 'k', 'kf=400', and the actual FIDE 'k', 'kf=32'), and assuming that a player plays on average 9 games in a tournament, etc., we see that 'kf=400' would approximately match a once-per-season calculation where a player has played around 112 games. In my opinion, if you adopt the FIDE rating for the ECF (as it is now, not ÉGS, which is adapted FIDE) and calculate the grade only at the end of the season using 'kf=32', the grades would change much more slowly than they do now.
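The arithmetic in this argument is easy to reproduce. The sketch below uses only the relation 'k=kf/800' and the figures quoted in the post (kf=32, k=1/2, 9 games per tournament); the function names are invented for the example.

```python
def egs_k_from_fide(kf: float) -> float:
    """Convert a FIDE-scale k factor to the ÉGS scale, using k = kf/800."""
    return kf / 800


def fide_k_from_egs(k: float) -> float:
    """Inverse conversion: kf = 800*k."""
    return 800 * k


k_club = egs_k_from_fide(32)           # 0.04, i.e. 1/25
kf_half = fide_k_from_egs(0.5)         # 400.0
ratio = kf_half / 32                   # 12.5
games_per_season = ratio * 9           # 112.5, the "around 112 games" in the post
print(k_club, kf_half, ratio, games_per_season)
```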
This discussion may suggest that 'k' in AGS (or ÉGS) might be smaller than 1/2, as on average players may play fewer than 112 games per season. Nevertheless, 'k' is in any case a measure of how much a grade is to be changed if a player did not perform as expected in a season, and in my opinion it should be '1/2' or smaller for all except rapidly changing players (juniors), for whom we may take 'k=1'.
We could relate 'k' to the number of games a player has played in a season (compared to an average). Say a player has played 6 games in a season and the average was 30: his 'k' could be, say, 'k=1/2/5=1/10', and his grade (at the end of the season) would be adjusted by a smaller amount, as his actual performance is not as statistically significant as that of a player who played 30 games in the season (for whom we take 'k=1/2'). This approach may be better than the current one, where at least 30 games are always taken into account for grading (from previous seasons if necessary), as the grades should then be closer to the actual chess abilities of the players (old results would not be taken into account). Of course, such a system would again not conserve the total grade of the system, and the grades would not be so easy to calculate (changing 'k' would most likely mean abandoning the simple grading rules and using the formulas instead).
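A minimal sketch of such an activity-dependent 'k', reproducing the 6-games-out-of-30 example. The function name is made up, and capping 'k' at the base value for players who play more than the average is an assumption added here; the post does not say what should happen in that case.

```python
def scaled_k(games_played: int, average_games: int, base_k: float = 0.5) -> float:
    """Scale k in proportion to activity relative to the seasonal average.

    Assumption (not in the post): k never exceeds base_k, even for very active players.
    """
    return base_k * min(1.0, games_played / average_games)


print(scaled_k(6, 30))   # one fifth of 1/2, i.e. 1/10
print(scaled_k(30, 30))  # an average-activity player keeps k = 1/2
```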
If one likes to keep it simple...
...I think that...
When calculating a new grade for a junior who was playing a non-junior, use CGS ('k=1': if the player did not perform as expected the grade changes more rapidly), and in all other cases (a junior playing a junior, and a non-junior playing either a junior or a non-junior) use AGS ('k=1/2': if the player did not perform as expected the grade does not change so rapidly).
...should be good enough (addressing the junior problem to some extent, keeping the simple original ECF grade definition, and taking in most cases a more plausible 'k=1/2').
Another alternative could be to use AGS for all but...
...ignoring juniors' previous season grades and instead calculating their grading performance for the current year and using that (the newly calculated grades) when calculating the grades of their opponents.
Last edited by Robert Jurjevic on Mon Jun 11, 2007 12:51 pm, edited 1 time in total.
Robert Jurjevic
Vafra
-
- Posts: 246
- Joined: Tue Apr 10, 2007 8:49 pm
- Location: Derbyshire, England
-
- Posts: 1420
- Joined: Fri Jun 01, 2007 6:31 pm
Robert - I expect these comments will score around 6.3 on the Truran-Richter boredom scale but here goes.
Robert Jurjevic wrote: Conservation of total grade...
All of the mentioned grading systems, GS, CGS, AGS, ÉGS and NGS, preserve total grade, that is to say, if you have a closed system (no players are entering or leaving the system) the sum of all grades remains the same (in every season), only the distribution of the grades amongst the players may change. That is because if a player's grade is increased by an amount the grade of his or her opponent will be decreased by the same amount.
None of the systems you mention is zero-sum as regards total grade increments. In particular, while I am not saying that ECF is better or worse than other systems, e.g. FIDE, the current ECF (GS) system produces some unusual results. For example, if 6 players all graded 100 form a club and only play graded games amongst themselves, then after 4 years it is possible for all 6 players to be rated over 200. It is not necessary for players to lose points over the medium term when other players increase their grades. Also under ECF, the maximum amount a player's grade can increase from one year to the next is about 98, whereas intuition says it should be 90.
The stabilising equilibrium force from one year to the next is that the weighted mean grade of active players at the beginning of the year will be equal to the weighted mean new grade at the end of the year, where the weights are the number of games played by each player. That holds provided that only players who have grades at both the beginning and the end of the season are included, only games played in the season amongst those players are graded, and new players are excluded. Category B - E grades upset this equilibrium a bit, as do new players.
Robert Jurjevic wrote: I do not know if there are any other obvious examples of grade inflation or deflation which could cause problems in practice...
Combining the equilibrium force with varying degrees of player activity is a cause of deflation/inflation which you do not mention, and it also causes the above 2 examples. Grading systems are not primarily mathematical models as we know them but index creation methodologies, where the grades are the indexes, and like most indexes they need to be monitored and rebased every now and again. In this respect ECF grades have not been given the attention they should have over the last 20 years.
A useful statistic to calculate would be the weighted mean grade mentioned previously, excluding new players and only including results against other players in this group. The weighted mean grade is in effect almost the expected average grade of the two players in a game taken at random from those played during the season, and it measures the global level of play. I am not suggesting this mean should stay exactly the same each year, as players improve and deteriorate depending on, among other things, the age structure of the player pool, and players drop out. Also, over time playing standards should improve. If, for the same group of players, the statistic is calculated again using the same weights but the normal published end of season grades, the deflationary or inflationary effect of new players can be estimated. This type of inflation/deflation is intrinsic to the system, is independent of the 40 point rule and needs regular monitoring.
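As a rough illustration of the statistic described above, the sketch below computes a games-weighted mean grade for a closed group at the start and at the end of a season. The function name and the sample figures are illustrative assumptions, not ECF data; the point is only that the weighted mean can stay put while the plain sum of grades moves.

```python
def weighted_mean_grade(grades, games):
    """Mean grade weighted by the number of games each player played."""
    total_games = sum(games)
    return sum(g * n for g, n in zip(grades, games)) / total_games

# Hypothetical closed group of six: one very active player loses every game.
games = [150, 30, 30, 30, 30, 30]          # games played in the season
start = [100, 100, 100, 100, 100, 100]     # grades at the start of the season
end = [50, 150, 150, 150, 150, 150]        # grades at the end of the season

start_mean = weighted_mean_grade(start, games)
end_mean = weighted_mean_grade(end, games)
print(start_mean, end_mean)    # games-weighted means before and after
print(sum(start), sum(end))    # unweighted totals before and after
```

Comparing the two printed lines season by season would be one way to monitor the drift that the unweighted total hides.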
Other types of inflation/deflation include player psychological and technical factors and are a result of decisions being influenced by the grades produced by the system. For example, a player rated 159 may become more active and play in many U160 events to win prize money, whereas the next season, when he is rated say 178, he may become less active as he will not get the prize money to offset against travelling expenses. This is a common effect with mathematical models or indexes: it becomes very difficult to predict the outcome of decisions taken on the basis of results produced by the model, and to allow for that decision feedback in the model design.
I believe that the FIDE and ECF practice of pairing the top half of a tournament (by grade) against the bottom half in the 1st round causes irregularities, resulting in increased dispersion of grades where the top players gain and the lower players lose. This is another example of decisions based on model results, and to some degree it affects the validity of statistical surveys carried out, even where these go back to the original game results and ignore historic grades.
Peter Sowray wrote: Do folks feel there is any merit in adjusting a methodology to try to address this? The Elo system has its k-factor, for example.
I think Peter has the right approach; the first question is what the grades are required to show and for what purpose. Back in about 1960 local grading lists were usually referred to as ranking lists and were intended to ensure that teams played in the correct board order. In 1971 the BCF published numbers instead of grades like 3b, 6a etc, and at about that time cheap electronic calculators became available and players began reverse engineering the sums. The words "theoretical probabilities" (TPs) entered the vocabulary of the street-wise county captain. Many players and team tournament organisers used TPs in support of playing teams out of order of strength, as they felt the mathematical expected total score was unaffected by board order. This is endowing the metrics with more probabilistic power than they were designed for, but what is expected of the system and the reasons for its existence have clearly changed.
-
- Posts: 207
- Joined: Wed May 16, 2007 1:31 pm
- Location: Surrey
Michael White wrote: None of the systems you mention are zero-sum as regards total grade increments.
I'm afraid that I'm not following you here... a generic formula for all of the mentioned grading systems, GS, CGS, AGS, ÉGS and NGS, is...
Code: Select all
a2 = a + k*(q - p);
b2 = b + k*((100-q) - (100-p)) = b - k*(q - p);
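A small sketch of the formula above, written in Python purely as an illustration. The function name `update` and the sample numbers are assumptions; `p` is passed in directly because the expected score depends on which system is used. Whatever `k` and `p` are, the two grades in a single game move by equal and opposite amounts.

```python
def update(a, b, q, p, k):
    """Return (a2, b2) after one game.

    a, b : grades of players A and B before the game
    q    : A's actual score in percent (0, 50 or 100)
    p    : A's expected score in percent
    k    : factor controlling how fast grades react
    """
    change = k * (q - p)
    return a + change, b - change

# Illustrative values only: A (120) beats B (100) when expected to score 70%.
a2, b2 = update(a=120, b=100, q=100, p=70, k=0.5)
print(a2, b2, a2 + b2)   # new grades and their total
```

This shows conservation game by game only; it does not cover a season grade computed as an average over an uneven number of games per player.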
Michael White wrote: In particular, while I am not saying that ECF is better or worse than other systems eg FIDE, the current ECF (GS) system produces some unusual results. For example:- if 6 players all graded 100 form a club and only play graded games amongst themselves, then after 4 years it is possible for all 6 players to be rated over 200.
I cannot see how this can happen, can you give an example please? In my opinion the sum should be preserved, i.e. remain 600 forever.
What I do not like about the current ECF (GS) system is that...
According to GS's grade definition, if a stronger player keeps winning against a weaker player their grade difference will keep increasing up to infinity, while for all other mentioned grading systems their grade difference will stop increasing after reaching some limit: for CGS and AGS that limit is 50, for ÉGS it is approximately 120, and for NGS it is 120. I prefer systems where the grade difference stops increasing after some limit, and frankly speaking do not see any advantage if that limit is 120 rather than 50.
...so if, say, one player in the group of 6 is constantly winning, his grade will keep rising and the grades of the other players will keep falling, and eventually become negative, but the total sum should remain 600.
Michael White wrote: Also under ECF, the amount a player's grade can increase from one year to the next is about 98 whereas intuition says it should be 90.
The grade increase is in principle unlimited. Imagine that a player is constantly winning and that he played a lot of games in the season: in the case of GS his grade will keep rising, while all other systems have some upper limit, and for AGS once a player is 50 grading points above his competition his grade will stop rising even if he keeps winning (the grade limit is 50 for CGS and AGS, approximately 120 for ÉGS, 120 for NGS, and infinity for GS).
Michael White wrote: Category B - E grades upset this equilibrium a bit as do new players.
True, category A grades should be more statistically significant than category B to E grades. I've already proposed what could be done about this...
We could relate 'k' to the number of games a player has played in a season (compared to an average). Say a player has played 6 games in a season and the average was 30: his 'k' could be 'k = (1/2)/5 = 1/10', and his grade (at the end of the season) would be adjusted by a smaller amount, as his actual performance is not as statistically significant as that of a player who played 30 games in the season (whose 'k = 1/2'). This approach may be better than the current one, where at least 30 games are always taken into account for grading (from previous seasons if necessary), as the grades should then be closer to the actual chess abilities of the players (old results would not be taken into account). Of course, such a system would again not conserve the total grade of the system, and the grades would not be so easy to calculate (changing 'k' would most likely mean abandoning simple grading rules and using the formulas instead).
...but this may prove to be too complex to justify implementing in practice and the simple solution...
When calculating a new grade for a junior who was playing a non-junior use CGS ('k=1', if not performed as expected the grade changes more rapidly) and in all other cases (junior playing junior and non-junior playing either junior or non-junior) use AGS (k=1/2, if not performed as expected the grade does not change so rapidly).
...might be good enough.
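The two proposals above can be sketched as follows. This is only an illustration under stated assumptions: the function names are made up here, the activity scaling is taken to be linear in games played (which matches the 6-games-out-of-30 example), and no cap is applied for players above the average because the post does not say whether there should be one.

```python
def activity_k(games_played, average_games=30, base_k=0.5):
    """Scale k in proportion to how active the player was this season."""
    return base_k * games_played / average_games

def junior_rule_k(player_is_junior, opponent_is_junior):
    """k=1 (CGS) for a junior facing a non-junior, otherwise k=1/2 (AGS)."""
    if player_is_junior and not opponent_is_junior:
        return 1.0
    return 0.5

print(activity_k(6))     # player with 6 games against an average of 30
print(activity_k(30))    # player with an average number of games
print(junior_rule_k(True, False), junior_rule_k(False, True))
```

Under the junior rule the two players in a junior v non-junior game use different values of 'k', so the pair total is no longer conserved for that game.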
Michael White wrote: This type of inflation/deflation is intrinsic to the system, is independent of the 40 point rule and needs regular monitoring.
I agree that inflation/deflation is independent of the 40 point rule; what I do not like about the 40 point rule is that...
According to GS's grade definition if a stronger player keeps winning against a weaker player their grade difference will keep increasing up to infinity...
I do not know what you mean by... This type of inflation/deflation is intrinsic to the system ...but it should be obvious that the grades need not match the relative chess abilities of the players, as a) the grades need not be statistically significant (a category E grade is less statistically significant than a category A grade with say 60 games played in the season), and b) relative chess abilities of chess players may change (rapidly) during the season (say improving juniors).
Michael White wrote: I believe that the FIDE and ECF practice of 1st round paring of the top half by grades of a tournament against the bottom half causes irregularities resulting in increased dispersion of grades where the top players gain and the lower players lose.
In my opinion these 'irregularities' affect the statistical significance of one's grade, and one could try to impose some rules, such as which and how many players one should play in a season, etc., but I guess this would be impractical.
Michael White wrote: I think Peter has the right approach; the first question is what are the grades required to show and for what purpose.
I think that a grade is a measure of the relative chess ability of a chess player. Currently, the measure is directly related to the player's performance, and the mentioned grade definitions relate expected performance 'p' and grade difference 'd' (i.e. a grade is defined in terms of expected performance 'p'). It looks inevitable that a grade does not always show a player's true relative chess ability. The reasons include the fact that players' relative chess abilities change (say a player does not improve and all other players improve, or a player improves much faster than others, say a promising junior), grades may not be calculated from statistically significant data (say a player plays only a few games in a season), new players entering the system were under or over graded, players leaving the system were over or under graded, etc.
Our goal should be, I think, to find a mathematically sound grading system (with a chosen grade definition) and try to address some grade inflation/deflation problems... a simple solution (addressing the junior deflation problem) could be...
When calculating a new grade for a junior who was playing a non-junior use CGS ('k=1', if not performed as expected the grade changes more rapidly) and in all other cases (junior playing junior and non-junior playing either junior or non-junior) use AGS (k=1/2, if not performed as expected the grade does not change so rapidly).
...or...
Use AGS for all but ignoring juniors' previous season grade and instead calculate their grading performance for the current year and use that (newly calculated grades) when calculating the grades of their opponents.
Last edited by Robert Jurjevic on Tue Jun 12, 2007 3:41 pm, edited 1 time in total.
Robert Jurjevic
Vafra
-
- Posts: 723
- Joined: Thu Apr 05, 2007 8:30 am
- Location: Aylesbury, Bucks, UK
Michael White wrote: Example:- if 6 players all graded 100 form a club and only play graded games amongst themselves, then after 4 years it is possible for all 6 players to be rated over 200.
Really? Amazing. Assuming that it's possible for all 6 to improve over that 4 year period to such a point where they actually ARE 200 strength. Where do they get the points from?
...Alternatively, what if there were some unscrupulous and shady characters out there who decided to do just such a thing without even playing (for whatever reasons), then they could continue all the way to, say, 300!!!
Michael White wrote: Also under ECF, the maximum amount a player's grade can increase from one year to the next is about 98 whereas intuition says it should be 90.
Not just intuition. Surely it's not possible to go over 90 unless you are a junior and get a bonus. If the maximum one can gain for a win is +90 then that is the maximum for an entire season - surely.
e.g. you are rated 100 and play 30 games against players rated 140 or higher and win them all. You would score 190 for each. 190x30/30=190
Hatch End A Captain (Hillingdon League)
Controller (Hillingdon League)
-
- Posts: 723
- Joined: Thu Apr 05, 2007 8:30 am
- Location: Aylesbury, Bucks, UK
Robert Jurjevic wrote: According to GS's grade definition, if a stronger player keeps winning against a weaker player their grade will keep increasing up to infinity, while for all other mentioned grading systems their grade will stop increasing after reaching some limit. For CGS and AGS that limit is 50, for ÉGS it is approximately 120, and for NGS it is 120. I prefer systems where the grade difference stops increasing after some limit and frankly speaking do not see any advantage if that limit is 120 rather than 50.
This is the key point. Whilst I actually quite like the current ECF system I do think there should be a limit somewhere. A GM playing a novice should not profit from winning, but what if they draw or lose? Were they taking it seriously? Should they be penalised to the maximum?
More questions than answers again, sorry!
Hatch End A Captain (Hillingdon League)
Controller (Hillingdon League)
-
- Posts: 207
- Joined: Wed May 16, 2007 1:31 pm
- Location: Surrey
Greg Breed wrote: Not just intuition. Surely it's not possible to go over 90 unless you are a junior and get a bonus. If the maximum one can gain for a win is +90 then that is the maximum for an entire season - surely. e.g. you are rated 100 and play 30 games against players rated 140 or higher and win them all. You would score 190 for each. 190*30/30=190.
Right, I made a mistake when I said...
The grade increase is in principle unlimited. Imagine that a player is constantly winning and that he played a lot of games in the season: in the case of GS his grade will keep rising, while all other systems have some upper limit, and for AGS once a player is 50 grading points above his competition his grade will stop rising even if he keeps winning (the grade limit is 50 for CGS and AGS, approximately 120 for ÉGS, 120 for NGS, and infinity for GS).
...and I should have said...
The grade increase is in principle unlimited. Imagine that a player is constantly winning and that he played a lot of games over a number of seasons: in the case of GS his grade will keep rising, while all other systems have some upper limit, and for AGS once a player is 50 grading points above his competition his grade will stop rising even if he keeps winning (the grade limit is 50 for CGS and AGS, approximately 120 for ÉGS, 120 for NGS, and infinity for GS).
For AGS the maximum one can gain is 50 points, so that would be AGS's limit on grade increase per season. For ÉGS the limit is the same, 50 points, but the dependence on grade difference is more complex: say, if you are rated 100 and play 30 games against players rated 150 your new grade will be 145 (you gain 45), but if you played 30 games against players rated 220 your new grade would be 150 (you would gain 50).
Greg Breed wrote: This is the key point. Whilst I actually quite like the current ECF system I do think there should be a limit somewhere. A GM playing a novice should not profit from winning, but what if they draw or lose? Were they taking it seriously? Should they be penalised to the maximum? More questions than answers again, sorry!
Let us assume that a player A with grade 'a' plays a player B with grade 'b' and scores 'q'. The mentioned grading systems then give ('a2' is the new grade of player A, 'b2' is the new grade of player B)...
Code: Select all
-------------------------------------------------------------
                GS        CGS       AGS       ÉGS       NGS
  a   b   q    a2  b2    a2  b2    a2  b2    a2  b2    a2  b2
-------------------------------------------------------------
270 160 100   280 150   270 160   270 160   270 160   271 159
270 160  50   230 200   220 210   245 185   245 185   246 184
270 160   0   180 250   170 260   220 210   220 210   221 209
-------------------------------------------------------------
270 140 100   280 130   270 140   270 140   270 140   270 140
270 140  50   230 180   220 190   245 165   245 165   245 165
270 140   0   180 230   170 240   220 190   220 190   220 190
-------------------------------------------------------------
270 100 100   280  90   270 100   270 100   270 100   270 100
270 100  50   230 140   220 150   245 125   245 125   245 125
270 100   0   180 190   170 200   220 150   220 150   220 150
-------------------------------------------------------------
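The GS, CGS and AGS columns of the table can be reproduced from the generic formula under one assumption, inferred from the table itself rather than taken from any official definition: the expected score is 'p = 50 + d', with the grade difference 'd' clamped at 40 for GS and at 50 for CGS and AGS, and with 'k = 1' for GS and CGS and 'k = 1/2' for AGS. ÉGS and NGS are left out because their expected-score curves are not spelled out in this thread. All names below are illustrative.

```python
def expected_score(d, limit):
    """Expected score in percent for grade difference d, clamped at +/-limit."""
    return 50 + max(-limit, min(limit, d))

# (clamp on grade difference, k), inferred from the table above (assumption)
SYSTEMS = {"GS": (40, 1.0), "CGS": (50, 1.0), "AGS": (50, 0.5)}

def new_grades(a, b, q, system):
    """Return (a2, b2) for one result q (0, 50 or 100) under the given system."""
    limit, k = SYSTEMS[system]
    change = k * (q - expected_score(a - b, limit))
    return a + change, b - change

for q in (100, 50, 0):
    row = [new_grades(270, 160, q, s) for s in ("GS", "CGS", "AGS")]
    print(q, row)
```

In each case 'a2 + b2' equals 'a + b', which is the per-game conservation being argued for.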
Why I think GS is bad... let us assume that GM Kiril Georgiev (miraculously) agrees to play a 30-game match (games graded) against me (this season I'm 99 but next season I expect to be below 80) and wins all the games. His grade would be around 280 and, I think, he would not have had to fight for these 10 points as hard as if he had played GMs instead.
Robert Jurjevic
Vafra
How many games with a difference in grading of 50+ are recorded each year? And in how many games was the result not the expected 1-0 to the higher grade?
Is it possible that you can simply put a cap on such games, e.g. the 1st 4 or a random 4 from each season are graded, the remainder are ignored and ungraded.
Perhaps you could ignore them completely.
-
- Posts: 1420
- Joined: Fri Jun 01, 2007 6:31 pm
Robert Jurjevic wrote:
Michael White wrote:
In particular, while I am not saying that ECF is better or worse than other systems eg FIDE, the current ECF (GS) system produces some unusual results. For example:- if 6 players all graded 100 form a club and only play graded games amongst themselves, then after 4 years it is possible for all 6 players to be rated over 200.
---
I cannot see how this can happen, can you give an example please? In my opinion the sum should be preserved, i.e. remain 600 forever.
I have done this a bit quickly so do you want to check the figures ?
Year 2000/2001
At the start of the 2000 season, players A - F are all graded 100 and the grading total is 600 ie:-
(2000,100,100,100,100,100,100) GT(600)
To avoid calculating B-E grades the number of games played is large
A plays 30 games against each of the others and loses them all
grades now (2001,50,150,150,150,150,150) GT(800) Intrinsic activity based grade inflation (33.3%)
During the next season B plays 30 games against each of the others and loses them all
(2002,140,92,200,200,200,200) GT(1032) ABI(29%)
Then it's C for the chop, who loses all his games, 30 against each of the others
(2003,230,182,134,250,250,250) GT(1296) ABI(26%)
Unfortunately the local grader joins the club, so each player now plays the others 6 times, but they tell the grader the tournament is restricted to those over 130!
B wins 1 game against each of A, D, E and F and draws the remaining 26, scoring 56.67%
C wins 4 games against each of A, D, E and F and draws the remaining 14, scoring 76.67%
(2004,218,213,201,222,222,222) GT(1298) ABI(+0.2%)
All the players are now graded over 200.
This situation arises due to different activity rates ie the number of games each player plays in the season varies wildly.
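For anyone wanting to check the figures, here is a minimal simulation of the four seasons. It assumes the usual ECF arithmetic as used in this thread (a game scores the opponent's grade plus 50 for a win, plus 0 for a draw, minus 50 for a loss, with the opponent's grade limited to within 40 points of one's own, and the season grade is the average over the games played) and that every game not mentioned in the final season is drawn. Function names and data layout are illustrative.

```python
from itertools import combinations

def game_points(own, opp, score):
    """Points for one game: opponent's grade (limited to within 40 of own)
    plus 50 for a win, 0 for a draw, minus 50 for a loss."""
    capped = max(own - 40, min(own + 40, opp))
    return capped + (score - 0.5) * 100

def season(grades, results):
    """New grades as per-player averages of game points.

    results holds tuples (i, j, score_for_i, n_games); score is 1, 0.5 or 0.
    """
    points = [0.0] * len(grades)
    played = [0] * len(grades)
    for i, j, score, n in results:
        points[i] += n * game_points(grades[i], grades[j], score)
        points[j] += n * game_points(grades[j], grades[i], 1 - score)
        played[i] += n
        played[j] += n
    return [pts / n if n else g for pts, n, g in zip(points, played, grades)]

def loses_all(loser, n_players=6, n_games=30):
    """One player loses n_games to each of the others; nobody else plays."""
    return [(loser, j, 0, n_games) for j in range(n_players) if j != loser]

def final_season():
    """All-play-all, 6 games per pairing; B and C win some, the rest are drawn."""
    wins = {1: 1, 2: 4}   # wins per opponent for B and C against A, D, E, F
    results = []
    for i, j in combinations(range(6), 2):
        if {i, j} == {1, 2} or (i not in wins and j not in wins):
            results.append((i, j, 0.5, 6))
        else:
            winner, other = (i, j) if i in wins else (j, i)
            results.append((winner, other, 1, wins[winner]))
            results.append((winner, other, 0.5, 6 - wins[winner]))
    return results

grades = [100.0] * 6                  # players A..F
history = [grades]
for loser in (0, 1, 2):               # A, then B, then C loses everything
    grades = season(grades, loses_all(loser))
    history.append(grades)
grades = season(grades, final_season())
history.append(grades)

for g in history:
    print([round(x) for x in g], round(sum(g)))
```

Each printed line gives the rounded grades of A to F followed by the grading total for that point in time.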
Greg Breed wrote:
Michael White wrote:
Also under ECF, the maximum amount a player's grade can increase from one year to the next is about 98 whereas intuition says it should be 90.
---
Not just intuition. Surely it's not possible to go over 90 unless you are a junior and get a bonus. If the maximum one can gain for a win is +90 then that is the maximum for an entire season - surely.
e.g. you are rated 100 and play 30 games against players rated 140 or higher and win them all. You would score 190 for each. 190x30/30=190
The highest figure I can produce for an increase in grade from one year to the next is 97.95. Here is an example of an increase of 92:-
In the year 1999/2000 an ungraded player of approximate real strength 60 scores 18 wins at an average of 95. So he has a D grade of 95
(1999,18,95,D,95)
The next year he loses 12 games to opponents graded 75 as he is still only actual strength 60.
(2000,12,25,C,67) .............. (12*25+18*95)/30 =67
In 2001/2002 he practises on the Internet and wins 1 graded OTB game at 160 which counts as 157 due to the >40 rule.
(2001,1,160,E,69.07) ........ (1*157+12*25+17*95)/30 = 69.07
In 2002/2003 he wins 3 games at 170 which count as 159.07 due to the >40 rule.
(2002,3,170,E,58.39) ........... (3*159.07 +1*157+12*25)/16 = 58.39
In 2003/2004 he wins 16 games at 210 which count as 148.39 due to the >40 rule. His grade is therefore (16*148.39+ 3*159.07 +1*157)/20 giving a D grade of 150.42
(2003,16,210,D,150.42) ....... (16*148.39+ 3*159.07 +1*157)/20 =150.42
The increase in the last year is 92.03.
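The worked example can be replayed with a short script. The rules coded here are inferred from the example itself and should be treated as assumptions: all games of the current season count, older seasons only top the sample up to 30 games, nothing older than the two previous seasons is used, and the 40 point limit is applied against the player's grade at the start of the season (with no limit while ungraded). The opponent grades are the quoted results minus 50.

```python
def capped_points(own_grade, opp_grade, score):
    """Game points with the opponent's grade limited to within 40 of own."""
    if own_grade is not None:
        opp_grade = max(own_grade - 40, min(own_grade + 40, opp_grade))
    return opp_grade + (score - 0.5) * 100

def published_grade(seasons, min_games=30, window=3):
    """seasons: list of (n_games, points_per_game), most recent first.

    All of the current season counts; older seasons inside the window only
    top the sample up to min_games.
    """
    total, used = 0.0, 0
    for idx, (n, ppg) in enumerate(seasons[:window]):
        take = n if idx == 0 else max(0, min(n, min_games - used))
        total += take * ppg
        used += take
    return total / used

# (games, opponent grade, score) per season; opponents of 45, 75, 110, 120
# and 160 correspond to the quoted results of 95, 25, 160, 170 and 210.
play = [(18, 45, 1), (12, 75, 0), (1, 110, 1), (3, 120, 1), (16, 160, 1)]

grade, seasons, grades = None, [], []
for n, opp, score in play:
    seasons.insert(0, (n, capped_points(grade, opp, score)))
    grade = published_grade(seasons)
    grades.append(grade)

rise = grades[-1] - grades[-2]
print([round(g, 2) for g in grades])   # grade after each season
print(round(rise, 2))                  # rise in the final season
```

The script keeps full precision between seasons, so its intermediate values can differ from the hand calculation in the last decimal place.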
I know grades are normally rounded to 0 places of decimals. Grading systems aren't always all they seem.
-
- Posts: 723
- Joined: Thu Apr 05, 2007 8:30 am
- Location: Aylesbury, Bucks, UK
Having browsed the Scottish Chess website posted by Heather Lang, I decided (on my lunch break) to try out their system using my results.
Greg Breed wrote: Standard-Play Games for Greg Breed from 12 Oct 2006 - rated 106
Total Games: 61
Total Wins: 36
Total Draws: 9
Total Losses: 16
Ave. Opponent Grade: 110
Current ECF
Total Points: 7623
Performance: 124 (rounded down from 124.9672)
R.J.'s AGS
Total Points: 7075.5
Performance: 115.9918
R.J.'s AGS with Decimals Ignored
Total Points: 7060
Performance: 115.7377
Note: These are my own figures not those of the ECF or any other organisation. Having played a number of ungraded or estimated players I had to estimate my performance rating from my games against them as it would normally be done retrospectively.
Using the ECF to ELO calculation: ECF*5+1250=ELO (none of my opponents were even near 213 rated where the calculation changes to ECF*8+600=ELO) I converted my 106 ECF rating into 1780 ELO and did the same to all my opponents. I even added the Junior bonuses correctly.
At the end I came up with a season performance of 1945 which equates to 139 ECF.
So if the ELO system is supposedly the better of the two (ECF v ELO) how come my grade is even higher? I thought that the problem with ECF grades was that they over-inflated grades!
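For reference, the conversion used above can be written out as a short sketch. The two formulas and the 213 changeover are taken from the post; treating a grade of exactly 213 as belonging to the upper formula is an assumption, and the function names are illustrative. The inverse covers the lower range only, which is all that was needed here.

```python
def ecf_to_elo(ecf, breakpoint=213):
    """Convert an ECF grade to an Elo-style rating using the two formulas quoted."""
    if ecf < breakpoint:
        return ecf * 5 + 1250
    return ecf * 8 + 600

def elo_to_ecf_lower(elo):
    """Invert the lower-range formula only (valid below the changeover)."""
    return (elo - 1250) / 5

print(ecf_to_elo(106))           # the starting grade from the post
print(elo_to_ecf_lower(1945))    # the season performance from the post
```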
Hatch End A Captain (Hillingdon League)
Controller (Hillingdon League)
Greg Breed wrote: So if the ELO system is supposedly the better of the two (ECF v ELO) how come my grade is even higher? I thought that the problem with ECF grades were that they over-inflated grades!
Greg,
I have been telling the ECF and anyone else who is interested for a year now that the ECF grades have suffered from deflation and so they are nearly all too low. I told them in July last year how to fix it, but they don't even acknowledge that the problem exists!
Under my proposed fix, 110 ECF would become approx 134.
If you're interested, the details are to be found at http://www.lrca.org.uk/lrca/Grading/Gra ... ction1.doc