Chess Grading Destruction

Michael Flatt
Posts: 1235
Joined: Tue Jul 02, 2013 7:36 am
Location: Hertfordshire

Re: Chess Grading Destruction

Post by Michael Flatt » Sat Apr 18, 2015 8:24 am

Maurice,

I still find the basis of your argument difficult to grasp. You really need to illustrate your points with examples of how your intended revised grading system would work and how it differs from the current system.

The ECF does try to be inclusive of all players regardless of how many graded games they play. In fact, activity of individual chess players varies greatly from those who play one or two games a year up to the most active who play many hundreds in a single grading period. The ECF system does provide an indication of a player's activity (i.e. letters A-E) from which it might be inferred how current their grade is.

Grades are statistically meaningful (if you ignore the new F grade). Ideally, results of at least 30 games are required but for less active players this may fall to 9 games over 3 years, with at least one game in the last 12 months.

Specifically how would your system work? Is it intended to cater solely for the most active players?

Brian Towers
Posts: 1266
Joined: Tue Nov 18, 2014 7:23 pm

Re: Chess Grading Destruction

Post by Brian Towers » Sat Apr 18, 2015 10:45 am

Maurice Lawson wrote:On the contrary I regard smoothness as a failing in that it never tells me anything interesting. What is the point of publishing a number which has been so smoothed as to hardly ever vary?
I think what you are looking for is a more sophisticated system which reports both mean and standard deviation. I suspect that many chess players would lack the necessary knowledge and mathematical sophistication to understand but perhaps I'm being overly pessimistic? Maybe the ECF should go ahead and try it and people would grasp what the numbers meant reasonably quickly as they compared them with how players they knew were performing?

Currently both ECF and Elo gradings use some kind of moving average or mean. BTW, I really like the Yorkshire experiment of using an exponential moving average. I assume these gradings are generated by computer programs rather than by old-fashioned constipated mathematicians working it out with a pencil on paper (sorry, I'll get my hat). In that case it should be a simple matter to modify the program to also generate the standard deviation. This second number would differentiate between rapidly rising juniors, stable mature players and gently declining very mature players ;-).
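As a rough illustration of what reporting both numbers would involve, here is a minimal Python sketch. The per-game performance figures are invented, and `grade_summary` is a hypothetical helper rather than anything the ECF or Yorkshire programs are known to contain.

```python
from statistics import mean, stdev

def grade_summary(performances):
    """Return (mean, sample standard deviation) of per-game performance figures."""
    return mean(performances), stdev(performances)

# Invented per-game performance figures for two players with the same average.
steady = [168, 172, 170, 169, 171, 170]
erratic = [140, 200, 150, 190, 160, 180]

for name, games in (("steady", steady), ("erratic", erratic)):
    avg, spread = grade_summary(games)
    print(f"{name}: grade {avg:.0f}, standard deviation {spread:.1f}")
```

Both invented players average 170, but the second number separates them.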

The Yorkshire graders seem to be more enterprising than just about everybody else. Maybe they would like to give it a go?
Ah, but I was so much older then. I'm younger than that now.

Maurice Lawson
Posts: 23
Joined: Sun Apr 05, 2015 10:38 pm

Re: Chess Grading Destruction

Post by Maurice Lawson » Sat Apr 18, 2015 12:21 pm

Hello Michael and Brian,

Thanks for the comments. I apologise for lack of clarity in what I am saying. Perhaps I have been making too many peripheral points in order to try to get my thoughts across.

Firstly, let me say I am not proposing additional complexity such as standard deviations etc. We are chess players (even sports people!) and we are essentially interested in performance. I don't believe standard deviations would tell us anything which is of practical value in chess, and I don't think we need that complexity.

Michael says.... "The ECF does try to be inclusive of all players regardless of how many graded games they play. In fact, activity of individual chess players varies greatly from those who play one or two games a year up to the most active who play many hundreds in a single grading period. The ECF system does provide an indication of a player's activity (i.e. letters A-E) from which it might be inferred how current their grade is."

Michael, I entirely agree with the inclusive approach within the ECF grading system. Inclusivity, even for relatively few games, is very good. Perhaps you have thought that I have a problem with the whole ECF approach, when this is not really what I am saying. On the contrary, I have no issue with the A to F categories, which are designed to be as inclusive as possible. I don't mind that, for example, E and F are statistically less significant, since the important point is that people should be graded in a simple, understandable way. When I look up someone's grade and see that they are, say, 175 with an E category, it still helps me enormously to understand roughly what strength of player they are. It's far more useful than looking them up and finding "ungraded". At this end of the category listings it's mainly about having a rough idea of playing strength, since these are people who by definition are not competing very often.

But at the other end of the scale it's much more about performance, since there we are dealing with people who play a great deal and are interested in tracking performance as well as generalised playing strength. The A category requires 30 games within 12 months. This is a reasonable volume of playing, but I would not describe it as satisfactorily including the most active category of players. There are a great many active players who compete in several tournaments per year as well as club and many other matches. These players typically play more than 30 games in a six month period, and some of them play a great deal more.

So the real problem which I am trying to highlight is the loss of the X Category. At the other end, the inclusiveness of the A to E categories is excellent. But ECF, by making sure it is dealing with inclusiveness (good) and by focussing too heavily on stability, has overlooked the fact that its most active players have lost the ECF's best performance measure, namely the X category.

So when you ask me how my intended grading system would work, all I am really saying is that the X category was incredibly important for the most active players, and yet it has now been removed. And I can see no tangible benefit in its removal. It measured performance as well as playing strength, and it did that to an acceptable level of accuracy.

However, for those who are more concerned with smoothness (as compared with my concern about recent playing performance) the answer is very simple. Publish both the X category and the A category grading result for those who play more than 30 games in six months. So to that extent I am not asking for anything which causes any administrative issues. The systems and the numbers already exist, but the result has now been hidden. I am simply saying please bring it back.

Brian, you mention that you "really like the Yorkshire experiment of using an exponential moving average". Whilst I'm not familiar with their system, I too like the sound of it in principle, since this is what performance measurement is all about. The X category (for those playing 30 games or more in six months) together with a suitable exponential moving average would indeed move our grading system in the right direction for active players. But the parameters required for the exponential moving average might require some experiment to find the best balance.

In the interim simply bring back the X category grade and publish it alongside the A category result.

Roger de Coverly
Posts: 21318
Joined: Tue Apr 15, 2008 2:51 pm

Re: Chess Grading Destruction

Post by Roger de Coverly » Sat Apr 18, 2015 1:05 pm

Maurice Lawson wrote: But ECF, by making sure it is dealing with inclusiveness (good) and by focussing too heavily on stability, has overlooked the fact that its most active players have lost the ECF's best performance measure, namely the X category.
Provided the grading archive of detailed results is available, it's not especially difficult to paste your personal results to a spreadsheet and work it out for yourself.
Maurice Lawson wrote:
Publish both the X category and the A category grading result for those who play more than 30 games in six months. So to that extent I am not asking for anything which causes any administrative issues.
But which do you use as the starting grade for the next period? The deliberate intention of switching back from the X grade to the A grade in future calculations was to increase stability. The point being that under X grades, a player whose results fluctuated between say playing at an under 180 standard and playing at an above 200 standard could be 190 over twelve months, but 180 and 200 on the discrete six months of performance.
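To put numbers on that point, a minimal sketch in Python, assuming the grade is simply the mean of per-game performance figures and using invented results (30 games at a 180 standard followed by 30 at 200):

```python
from statistics import mean

# Invented per-game performances: 30 games at a 180 standard, then 30 at 200.
first_half = [180] * 30
second_half = [200] * 30

# Two discrete six-month windows versus one twelve-month window.
six_month_grades = [mean(first_half), mean(second_half)]
twelve_month_grade = mean(first_half + second_half)

print(six_month_grades)    # the two six-month figures
print(twelve_month_grade)  # the twelve-month figure
```

The same sixty games give 180 and 200 on the six-month windows and 190 on the twelve-month window.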

Whilst it's easy to write other people's programs, publishing a "for information" grade based on the last thirty games or the season so far would not seem intrinsically difficult. The more difficult task would be to get league reporting for grading closer to real time, or even end of each month, which would be a necessary condition for the output to have a measure of completeness.


Maurice Lawson
Posts: 23
Joined: Sun Apr 05, 2015 10:38 pm

Re: Chess Grading Destruction

Post by Maurice Lawson » Sat Apr 18, 2015 1:25 pm

Roger,

"But which do you use as the starting grade for the next period?" No problem - they can continue to be independently calculated. Both are averages of a set of games, so neither technically requires a starting point.

"The deliberate intention of switching back from the X grade to the A grade in future calculations was to increase stability." Yes I get that - it's been repeated over and over. But what I'm saying is that we have lost something good and valuable in the process. And we can have both with no great difficulty.

"Provided the grading archive of detailed results is available, it's not especially difficult to paste your personal results to a spreadsheet and work it out for yourself." That's not the point! I already do that! But I want to be able to see everyone's X Category grade, and the system is now hiding it.

Roger de Coverly
Posts: 21318
Joined: Tue Apr 15, 2008 2:51 pm

Re: Chess Grading Destruction

Post by Roger de Coverly » Sat Apr 18, 2015 1:34 pm

Maurice Lawson wrote: they can continue to be independently calculated.
Suppose in July 2015, your reform is carried out. A player is published as 200X, 190A. Which grade is used in the calculations of his opponents for the 2015-16 season?

Brian Towers
Posts: 1266
Joined: Tue Nov 18, 2014 7:23 pm

Re: Chess Grading Destruction

Post by Brian Towers » Sat Apr 18, 2015 2:05 pm

I think we need to go down the financial route.

In stock charts along with the actual share price you can display additional information like the 50 day moving average, 200 day moving average and, most excitingly for champagne lovers, Bollinger bands ;-).

When you click on somebody's grading you should get to see a similar chart with dates along the x axis, gradings along the y axis and a user selectable choice of 5 game, 10 game, 20 game, 50 game, 100 game, etc. moving averages plus, if you want, Bollinger bands. Bollinger bands have nothing to do with alcohol. Rather they are the +/- 2 standard deviation lines.
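A minimal sketch of the arithmetic behind such a chart, in Python. It assumes a plain list of per-game performance figures in date order (invented here), and `rolling_bands` is a hypothetical helper. The population standard deviation is used for the bands; the sample version would be an equally defensible choice.

```python
from statistics import mean, pstdev

def rolling_bands(performances, window):
    """Yield (moving average, lower band, upper band) for each full window of games."""
    for end in range(window, len(performances) + 1):
        chunk = performances[end - window:end]
        avg = mean(chunk)
        spread = pstdev(chunk)  # population standard deviation of the window
        yield avg, avg - 2 * spread, avg + 2 * spread

# Invented per-game performance figures, oldest first.
games = [150, 190, 160, 200, 170, 180, 210, 175, 185, 195]

for avg, low, high in rolling_bands(games, window=5):
    print(f"average {avg:.1f}, bands {low:.1f} to {high:.1f}")
```

Changing `window` gives the 5 game, 10 game, 20 game and so on variants.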

RSI, MACD anyone?

Maurice Lawson
Posts: 23
Joined: Sun Apr 05, 2015 10:38 pm

Re: Chess Grading Destruction

Post by Maurice Lawson » Sat Apr 18, 2015 3:06 pm

Roger, Thanks for the clarification. I understand your point.

My suggestion would be that the A category becomes the base number for the next period grading calculations. There are two reasons.

Firstly, I can see that there would be complete uproar from those who favour stability over performance if I suggested the X category be used for the A category calculations! And I don't see the need to upset their wish to see an A category.

Secondly, I do not believe the use of A category base grades will be likely to have any deleterious effect on the usefulness of the X category calculation. This is because the X category is primarily impacted by performance, rather than by exactness of base grades. Put another way, I am saying that when one takes the mean of base grades across the 30 or more opponents which would form the X category, it is unlikely to differ much whether we use A or X as a base. There will be individual ups and downs opponent by opponent, but the mean will tend to congregate to a similar point. Therefore the ultimate X category calculation will be satisfactory and similar using either A or X as a base. And this makes it perfectly practicable to operate.

Additionally the use of a continuing A category base will keep the stability supporters happy in that it forms the underlying data, whilst keeping the performance supporters satisfied that more recent performance continues to be established and tracked.

Roger de Coverly
Posts: 21318
Joined: Tue Apr 15, 2008 2:51 pm

Re: Chess Grading Destruction

Post by Roger de Coverly » Sat Apr 18, 2015 3:17 pm

Maurice Lawson wrote: My suggestion would be that the A category becomes the base number for the next period grading calculations.
In terms of the overall working of the system from one year to the next, the calculation of an X grade becomes an item of information only.

If the grading team is working on ideas for additional more up-to-date data to be published, I dare say it could be added to the list of possibilities to calculate and display a figure for last x games, last y months or even both.

MartinCarpenter
Posts: 3052
Joined: Tue May 24, 2011 10:58 am

Re: Chess Grading Destruction

Post by MartinCarpenter » Sun Apr 19, 2015 4:38 pm

How to set the parameters for an exponential system is actually an interesting question :) It's the one thing I don't like in Yorkshire.

That system basically says that X games a season is worth a smoothed X game rolling window. Here X is called weight. Works very well if playing ~30. 10-15 and your grade jumps about, 50+ and it takes an earthquake to move it.

I do think it might be improved somehow. If you wanted to track current ECF policy, I think it'd mean setting everyone's weight to a minimum of 20/30 at the end of each season, so long as they've played that many in the last 3 seasons, and actually imposing a maximum weight of 30 on people's grades. Since the weights just go up/down with time it'd be very easy to do this if you want to. Whatever really :)
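For anyone who wants to experiment with the parameters, here is a minimal Python sketch of one common form of exponential averaging. It is an assumption about the general technique, not the actual Yorkshire formula: each new game moves the grade by 1/weight of the gap between that game's performance figure and the current grade. `ema_step` and `clamp_weight` are hypothetical names, and the 20 and 30 limits are just the figures floated above.

```python
def ema_step(grade, performance, weight):
    """One exponential-average step: move 1/weight of the way towards the new result."""
    return grade + (performance - grade) / weight

def clamp_weight(weight, lowest=20, highest=30):
    """Keep the weight inside the suggested minimum and maximum (assumed limits)."""
    return max(lowest, min(weight, highest))

# A 170-graded player produces a single 200 performance.
for raw_weight in (10, 30, 60):
    free = ema_step(170, 200, raw_weight)
    capped = ema_step(170, 200, clamp_weight(raw_weight))
    print(f"weight {raw_weight}: unclamped {free:.1f}, clamped {capped:.1f}")
```

With a weight of 10 the single result moves the grade three points, with 60 only half a point; clamping pulls both towards the behaviour at 20 to 30.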

The really big problems with the X grades were all the horrible 'edge' effects and they were all due to them being based on a discrete window :(

The other thing in terms of Brian's note - and in general for people thinking time is the one dimension to track grade changes along - is of course that there are plenty of people who are 10-15 pts stronger in evening leagues than long play stuff, or of course vice versa. Sometimes huge grading differences for white vs black too. Or against different grading bands of opposition...... Even win/draw/loss percentage while you're at it.

Might as well show all of those online while you're at it ;) What are computers for anyway?

Maurice Lawson
Posts: 23
Joined: Sun Apr 05, 2015 10:38 pm

Re: Chess Grading Destruction

Post by Maurice Lawson » Mon Apr 20, 2015 3:41 pm

Martin, I agree with much of what you say in respect of the exponential system. And I particularly agree with your 20-30 weighting and maximum 30. This is what I think we need - it's just the right balance, and it tells us interesting and useful performance information.

But I don't believe it would be useful or meaningful to start showing Black/White gradings, or the kind of stratified gradings you have talked about. This would cause too much confusion and I don't feel the results would be particularly meaningful. There might well be a case for introducing some kind of parameter adjuster in the A and exponential X categories to adjust for relative numbers of Blacks and Whites, since this factor is absolutely widely understood as impacting the set of results we all achieve in real match play. Over a 30 game range one can easily get for example an 18-12 colour split, and this really does impact overall grade. The adjuster would not be difficult to formulate, and would be a meaningful new parameter which would help to correct the relatively crude unadjusted calculation. Actually I am quite surprised that no system, including FIDE, has yet seen fit to make this obvious correction given the certainty with which we all know how much it influences results.

Between us I venture to suggest that the various contributors to this thread have come up with a number of potential improvements to the ECF system which are practical, meaningful and forward thinking!

MartinCarpenter
Posts: 3052
Joined: Tue May 24, 2011 10:58 am

Re: Chess Grading Destruction

Post by MartinCarpenter » Mon Apr 20, 2015 4:27 pm

The reason you can't adjust for white/black balance is that it's so thoroughly different for different people :) The norm is ~+10ish of course. Some people are close to balanced and some 20-30 pts stronger with white. That makes it intriguing to know and very useful for captains etc, although easy enough to run the numbers yourself if need be.

The thing that the Yorkshire website does publish which really is interesting to look at is stuff like best percentages/points totals for the season so far for every league in Yorkshire, who's played for which teams and has done well etc. I find that much more interesting than the live grades actually.

Brian Valentine
Posts: 577
Joined: Fri Apr 03, 2009 1:30 pm

Re: Chess Grading Destruction

Post by Brian Valentine » Thu Apr 23, 2015 10:09 am

As things have died down for now, I thought I'd at least point out that I have been watching the discussion with interest.

As far as I can see Maurice still wants the X grade calculated, but now accepts that it could just be supplementary information. There have been some ideas about how an up-to-date grade might work and that this might be an improved solution for Maurice's grouse.

I'm sure that the ECF will want to look at these issues at some time. Clearly there is a short term issue in that information already available is not being published and that is the first thing that needs to be fixed. Secondly some results are much delayed in reporting. Congress chess is generally pretty good, but at the other end of the spectrum, internal club games are often a big problem. At the slower end we could improve things by cutting down the number of handovers. That requires some systems upgrades; we are working along this path, but it takes time. I also notice the possible ways the website might add more information. Some of this might just be extractions from our database, others will require extra fields (e.g. a supplementary grade).

While all this is going on we will need to look at how we modify the grading model to adapt to more periodic updates. At some stage members will need to decide what they want out of our system. What I have got out of this debate has been the looseness of some of the concepts described here and therefore how difficult it will be to define what exactly we want. I've got to think a bit more about how we propose any tricky changes. But first the grading team need to speed up capturing the data.
Brian
Manager of ECF grading

Brian Towers
Posts: 1266
Joined: Tue Nov 18, 2014 7:23 pm

Re: Chess Grading Destruction

Post by Brian Towers » Thu Apr 23, 2015 10:43 am

Brian Valentine wrote:But first the grading team need to speed up capturing the data.
Where I live results are entered into the system by club secretaries and tournament organisers. Why is this a problematic solution?

A few months ago I visited the grading website for my federation to check on my opponent's grading from the match the day before. While I was there I clicked on my name in the list to check on recent activity and was shocked to see that my game from the day before was already in the system. It showed my opponent, her grading (I don't think I really want to get involved on that other thread) and the delta change to my grading (minus, I'm afraid, since I lost).

Stewart Reuben
Posts: 4550
Joined: Tue Apr 03, 2007 11:04 pm
Location: writer

Re: Chess Grading Destruction

Post by Stewart Reuben » Thu Apr 23, 2015 11:00 am

Capturing the maximum amount of data surely comes first. Just ignoring most chess played by English players outside the UK results in the loss of a great deal of important information. Even some of the games, played representing England internationally, are ignored. One year Michael Adams was ungraded.
As Brian says, speed of capturing the data is also very important.
We rely on players playing about 50% white. The system used internationally for a few years, where the blacks and whites were taken into account, was dropped very quickly. Presumably that was lack of interest.
A player who scores 55 wins and 45 losses in 100 games is a very different animal from one who draws 90 and wins 10, although against the same field they would have the same grade.
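A quick check of that arithmetic in Python, assuming the familiar averaging rule (opponent's grade plus 50 for a win, minus 50 for a loss, unchanged for a draw) and, purely for illustration, a field in which every opponent is graded 150:

```python
def performance(opponent_grade, result):
    """Per-game figure: +50 for a win, -50 for a loss, unchanged for a draw (assumed rule)."""
    return opponent_grade + {"win": 50, "draw": 0, "loss": -50}[result]

def grade(results, opponent_grade=150):
    """Mean performance against identically graded opponents (an illustrative assumption)."""
    return sum(performance(opponent_grade, r) for r in results) / len(results)

decisive = ["win"] * 55 + ["loss"] * 45   # 55 wins, 45 losses
drawish = ["win"] * 10 + ["draw"] * 90    # 10 wins, 90 draws

print(grade(decisive), grade(drawish))
```

Both records score 55 points from 100 games, so the averaged figure is identical even though the results look nothing alike.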
The London Chess Association system was run for some years with no adverse comments. That was probably because the congress data was collected efficiently and published monthly. It resulted in a growth of chess being played in the area. Updating the concept:
Games where, for 60 moves, the thinking time is at least 120 minutes count threefold. Thus all FIDE Rated chess.
Games less than 120 minutes with quickplay finishes count twofold.
Rapidplay games, or games which will be adjudicated if they go on long enough, count onefold.
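A minimal Python sketch of how such multipliers could feed into an average. The multipliers are the ones listed above; the category labels, the `weighted_grade` helper and the results are invented for illustration.

```python
# Multipliers as listed above; the category names are hypothetical labels.
MULTIPLIER = {"long": 3, "quickplay": 2, "rapid": 1}

def weighted_grade(games):
    """games: list of (performance, category) pairs; returns the weighted mean."""
    total = sum(perf * MULTIPLIER[cat] for perf, cat in games)
    count = sum(MULTIPLIER[cat] for _, cat in games)
    return total / count

# Invented results: one game of each type.
games = [(200, "long"), (170, "quickplay"), (140, "rapid")]

print(weighted_grade(games))                     # weighted by type of game
print(sum(perf for perf, _ in games) / len(games))  # plain average for comparison
```

Here the long-play result pulls the weighted figure to 180, against a plain average of 170.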
I imagine one problem in England would be getting the information accurately as to what type of chess it was.
Of course, it is correct that getting the organisers to input the data is better than sending it to graders who then input the information. In FIDE this would have the problem that the organisers sometimes deliberately falsify the information, e.g. by not submitting data where their own players perform poorly, or by inventing games that have never occurred.