4NCL Online

MartinCarpenter · Post by **MartinCarpenter** » Fri Jun 19, 2020 9:13 am

Matthew Turner wrote: ↑
Thu Jun 18, 2020 7:20 pm
David,
I don’t really understand what you are trying to say here. For Ken Regan’s test the standard deviation is based on the entire population. So if knightie is 206 that is 2245 and he would have to select engine moves at a rate associated with a player rated 3045 to get a z score of 4.
Matt

He's hypothesising that Knightie has effectively behaved like a junior in improvement terms, so his existing grade is notably lagging his actual playing strength.

That does demonstrably happen with some older players, if really rather rarely. Probably only moved up 1 or 2 sigma but that obviously moves the odds hugely.

DavidWalker · Post by **DavidWalker** » Fri Jun 19, 2020 9:40 am

Matthew,
I was just making the point that an event which appears very unlikely can be made much more plausible by a moderate shift in our assumptions.

The 3045 rating for a 2245 player is only as high as it is because the test sample size is so small (5 or 6 games I assume) and hence there is significant uncertainty in what the model predicts. A 3045 rating is what the model considers the most likely figure, but the distribution is spread out so 2245 sneaks in at the z=-4 level.

For another example, consider this paper which I have mentioned before. Dr. Regan uses his model to estimate the performance rating of players in various high profile matches and tournaments.

Consider Linares 1999, this was an eight player double round robin, so 112 half games in all - not unreasonable for a full season by an ambitious player. The model's 4 sigma value for this number of games is 181 ELO points (~24 ECF points) rather than 800, so not totally unreasonable to achieve by a rapidly improving player. Even if a player only improved by 90 points in a season, this would significantly change their z-score.
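The arithmetic in the paragraph above can be checked with a minimal Python sketch. It assumes that a gap of 7.5 Elo points equals one ECF grading point. That ratio is inferred from the figures quoted elsewhere in this thread (206 corresponds to 2245 and 230 to 2425), not taken from an official source, and the function names are made up for illustration.

```python
def half_games_double_round_robin(players):
    """Count player-games: every player meets every other player twice."""
    return players * (players - 1) * 2


def elo_diff_to_ecf_diff(elo_points):
    """Convert an Elo gap to an ECF gap (assumed ratio: 7.5 Elo per ECF point)."""
    return elo_points / 7.5


print(half_games_double_round_robin(8))   # 112 half-games at Linares 1999
print(round(elo_diff_to_ecf_diff(181)))   # 24 ECF points for the 4 sigma gap
print(elo_diff_to_ecf_diff(90))           # 12.0 ECF points for a 90 point gain
```

On these assumptions a 90 point improvement is about half of the 4 sigma gap of 181 points, so it would take roughly two sigma off the score.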

Broadly I agree that Dr Regan's model is effective at catching cheats, and that most players caught by it will have cheated. However, I think it is important to have an open mind when considering any appeal, particularly for rapidly improving players or those who may be underrated for whatever reason.

Matthew Turner · Post by **Matthew Turner** » Fri Jun 19, 2020 9:42 am

Martin,
Players improve and some players improve rapidly, but I don't really see the relevance of the past grading history. All we need to know is that Knightie starts at 206 and ends at 230.
He goes from 2245 to 2425

He has got a z score of 4 by selecting engine moves at a rate associated with a 3045

However, if we use the 2425 rating then his z score would drop to 3.1 (that corresponds to a 1 in 1030 chance that the player played the game without assistance)

An appeal would have to decide whether that was sufficient for knightie to be cleared. There are some difficult decisions to be made.
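The figures in this post can be reproduced with a short Python sketch. It assumes the ECF to Elo conversion implied by the numbers above (Elo = 7.5 × ECF + 700) and a normal model in which z = 4 at a baseline of 2245 fixes the width of one sigma. The helper names are hypothetical and this is not Ken Regan's actual procedure. The tail value is the one-sided normal tail, i.e. how often the model expects an unassisted player at that baseline to score at least this high.

```python
from statistics import NormalDist


def ecf_to_elo(ecf):
    """Assumed conversion, consistent with 206 -> 2245 and 230 -> 2425."""
    return 7.5 * ecf + 700


def z_score(performance, baseline, points_per_sigma):
    """Number of sigmas by which the performance exceeds the baseline."""
    return (performance - baseline) / points_per_sigma


old_elo = ecf_to_elo(206)   # 2245.0
new_elo = ecf_to_elo(230)   # 2425.0
performance = 3045

# z = 4 against the old baseline implies (3045 - 2245) / 4 = 200 points per sigma.
points_per_sigma = (performance - old_elo) / 4

z_new = z_score(performance, new_elo, points_per_sigma)
tail = 1 - NormalDist().cdf(z_new)   # one-sided tail probability

print(round(z_new, 2))   # 3.1
print(round(1 / tail))   # a little over 1000, in line with the "1 in 1030" quoted
```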

Roger de Coverly · Post by **Roger de Coverly** » Fri Jun 19, 2020 9:49 am

Matthew Turner wrote: ↑
Fri Jun 19, 2020 9:42 am
There are some difficult decisions to be made.

In online chess, yes. In live chess over the board, no.
There's the evidence of witnesses of suspicious behaviour or otherwise.

Matthew Turner · Post by **Matthew Turner** » Fri Jun 19, 2020 9:51 am

DavidWalker wrote: ↑
Fri Jun 19, 2020 9:40 am
Matthew,
I was just making the point that an event which appears very unlikely can be made much more plausible by a moderate shift in our assumptions.

The 3045 rating for a 2245 player is only as high as it is because the test sample size is so small (5 or 6 games I assume) and hence there is significant uncertainty in what the model predicts. A 3045 rating is what the model considers the most likely figure, but the distribution is spread out so 2245 sneaks in at the z=-4 level.

For another example, consider this paper which I have mentioned before. Dr. Regan uses his model to estimate the performance rating of players in various high profile matches and tournaments.

Consider Linares 1999, this was an eight player double round robin, so 112 half games in all - not unreasonable for a full season by an ambitious player. The model's 4 sigma value for this number of games is 181 ELO points (~24 ECF points) rather than 800, so not totally unreasonable to achieve by a rapidly improving player. Even if a player only improved by 90 points in a season, this would significantly change their z-score.

Broadly I agree that Dr Regan's model is effective at catching cheats, and that most players caught by it will have cheated. However, I think it is important to have an open mind when considering any appeal, particularly for rapidly improving players or those who may be underrated for whatever reason.

Agreed

John McKenna · Post by **John McKenna** » Fri Jun 19, 2020 2:42 pm

DavidWalker wrote: ↑
Thu Jun 18, 2020 4:43 pm

Joseph Conlon wrote:Suppose chess player Knightie McKnightFace plays six tournaments a year and has a stable rating...
A stable rating is the key assumption, but suppose that Knightie has had more free time available recently and has been working much harder than usual on their game. As a consequence their playing strength has increased over the course of a season.

Suppose Knightie's average yearly ECF grade over the past 20 years is 210, with a standard deviation (SD) of 4 points. This means that most of the time their published grade would vary between 206 and 214, but values of 202 and 218 would not be unexpected. I think this is a plausible range of grades based on my own experience (in particular between 1994 and 2013. This does assume that the effect of 2009 new grades is relatively small in this grading band).

Suppose that Knightie's work pays off and their rating increases to 230 (again similar to my own experience). This represents a 5 sigma deviation from the previous average. However, this does not mean that anything tremendously unlikely has happened. What it does mean is that it is very unlikely that Knightie's increased rating is solely due to chance but instead is due at least in part to the extra work put in.

The key point is that a very high z-score does not necessarily relate to an implausibly high change in rating.
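The sigma arithmetic in the quoted example can be sketched in a few lines of Python. The mean of 210 and the SD of 4 come from the quoted post; the function name is hypothetical.

```python
MEAN_GRADE = 210   # Knightie's long-run average ECF grade (from the post)
SD = 4             # standard deviation of the yearly grade (from the post)


def sigmas_from_mean(grade):
    """How many standard deviations a grade lies from the long-run mean."""
    return (grade - MEAN_GRADE) / SD


# One and two sigma bands around the mean.
print(MEAN_GRADE - SD, MEAN_GRADE + SD)           # 206 214
print(MEAN_GRADE - 2 * SD, MEAN_GRADE + 2 * SD)   # 202 218

# The improved grade of 230 sits five sigmas above the old average.
print(sigmas_from_mean(230))   # 5.0
```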

MartinCarpenter wrote: ↑
Fri Jun 19, 2020 9:13 am

Matthew Turner wrote: ↑
Thu Jun 18, 2020 7:20 pm
David,
I don’t really understand what you are trying to say here. For Ken Regan’s test the standard deviation is based on the entire population. So if knightie is 206 that is 2245 and he would have to select engine moves at a rate associated with a player rated 3045 to get a z score of 4.
Matt

He's hypothesising that Knightie has effectively behaved like a junior in improvement terms, so his existing grade is notably lagging his actual playing strength.

That does demonstrably happen with some older players, if really rather rarely. Probably only moved up 1 or 2 sigma but that obviously moves the odds hugely.

I thought the same thing - "that Knightie has effectively behaved like a junior in improvement terms, so his existing grade is notably lagging his actual playing strength."

In addition all UK players have been playing online since mid-March and a good many have had plenty of time to concentrate on improving their chess. Many adults were effectively temporarily "retired" and all juniors effectively on an extended home-schooling course.

Players of almost any age below retirement age and rated in the 1950-2250 range are well placed to take advantage of the "lockdown" to prepare for games much better and analyse their results more closely. All of which could see some such players rapidly producing exceptional results online without the need to cheat at all.

Playing online in the current circumstances is unique in the history of modern chess and there are bound to be reverberations (earth-shaking performances) as well as repercussions (an increased level of cheating).

Matt Bridgeman · Post by **Matt Bridgeman** » Fri Jun 19, 2020 3:19 pm

It’s certainly been a unique experience, but I think in terms of time controls people have been playing a lot of 3+2, 5+2, 5+3 and 10+5 in general. I think the challenge later in the year, if over-the-board returns, will be to slow down and think again. It would be nice to see a lot of norms achieved, but I’d guess you’d probably get a lot of rusty players just trying to get match fit again.

John McKenna · Post by **John McKenna** » Fri Jun 19, 2020 3:33 pm

Matt Bridgeman wrote: ↑
Fri Jun 19, 2020 3:19 pm
It’s certainly been a unique experience, but I think in terms of time controls people have been playing a lot of 3+2, 5+2, 5+3 and 10+5 in general. I think the challenge later in the year, if over-the-board returns, will be to slow down and think again.

No need to wait for some, Matt.

I've seen a number of games in this 45 mins. + 15 secs. event played at blitz speed by both players and other longer-lasting ones where players self-destructed with plenty of thinking time left on the clock.

Caoimhín de Búrca · Post by **Caoimhín de Búrca** » Sun Jun 21, 2020 6:54 pm

Bit of a long-time lurker first-time caller on this one, so hopefully this isn't repeating things which have been covered already! Just have a bit of interest in this though, so keen to set things out correctly in my head - so maybe feel free to correct anything in the below.

The Lichess algorithm is different to the Ken Regan software used by 4NCL I think? (If maybe based on the same idea?)

Ken Regan's algorithm is based on the player's rating - the idea being that a 1700 playing 2500 moves over a prolonged period is cheating, whereas a 2500 playing 2500 moves isn't.

For a 1700 player, 0<z<1 means they're playing at around 1700 strength, which is to be expected. 1<z<2 means around 1900 strength, which means they're on good form. 2<z<3 means around 2100 strength, which is a very good performance indeed. 3<z<4 means around 2300 strength, which is dodgy but the player is given the benefit of the doubt. And z>4 means around 2500 strength, which is where you say "Ah would you stop" and ban the player. Would that - very roughly - be what the data is saying?
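That rule of thumb can be written down as a small Python sketch. It assumes a flat 200 rating points per unit of z, which matches the bands listed above and the 800 points for z = 4 mentioned earlier in the thread. It is a rough illustration only; the real model's sigma depends on the number of games analysed, and the function name is hypothetical.

```python
def implied_strength(rating, z, points_per_sigma=200):
    """Rough playing strength implied by a z-score (assumed 200 points per sigma)."""
    return rating + z * points_per_sigma


# Lower edge of each band for a 1700 player.
for z in range(5):
    print(z, implied_strength(1700, z))
# 0 1700
# 1 1900
# 2 2100
# 3 2300
# 4 2500
```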

The strength of your opponents isn't relevant, because playing 2500 moves is the same regardless of your opponent. But your own rating is fundamentally relevant, because it's the bar you're being measured against.

Lichess doesn't know my OTB rating - so what is its baseline? Is it taking my lichess rating, maybe adjusted for the rating deviation (to filter out provisional, unstable ratings)?

4NCL do know my rating, so I guess they can use that. I have two though - a national one (1780) and a FIDE (1880). These are roughly the same (national + 100 = FIDE at my level, again as a rough rule of thumb), but I think the two baselines would give me different z scores?

Of course, ratings have effectively been frozen for the last 3/4 months, whereas playing strength can still change. So a junior on a national training programme, for example, could gain, say, 200 points in that time, but would still be judged against their old baseline? His strength has changed, but his rating hasn't.

Again, apologies for the long post! Not trying to stoke everything up again - more trying to summarise down what I've read on here over the past dozen pages or so and see if I understand it correctly. Online chess may be around for a little while yet, and I may as well know what's going on!

Ian Thompson · Post by **Ian Thompson** » Sun Jun 21, 2020 7:59 pm

Caoimhín de Búrca wrote: ↑
Sun Jun 21, 2020 6:54 pm
Lichess doesn't know my OTB rating - so what is its baseline? Is it taking my lichess rating, maybe adjusted for the rating deviation (to filter out provisional, unstable ratings)?

I don't think that's been disclosed. I assume one thing it does do is to say that a player consistently playing 3000+ rating moves over a sufficiently large number of games must be cheating because no human is capable of doing that, which, obviously, doesn't require any knowledge of the player's rating.

MartinCarpenter · Post by **MartinCarpenter** » Mon Jun 22, 2020 9:30 am

Caoimhín de Búrca wrote: ↑
Sun Jun 21, 2020 6:54 pm
Of course, ratings have effectively been frozen for the last 3/4 months, whereas playing strength can still change. So a junior on a national training programme, for example, could gain, say, 200 points in that time, but would still be judged against their old baseline? His strength has changed, but his rating hasn't.

Again, apologies for the long post! Not trying to stoke everything up again - more trying to summarise down what I've read on here over the past dozen pages or so and see if I understand it correctly. Online chess may be around for a little while yet, and I may as well know what's going on!

You've understood everything fine from what I can tell

Juniors are given special dispensation for grades lagging playing strength when running the calculations by Ken at least. I doubt if LiChess can as they likely don't collect birth dates. In general LiChess don't like to say quite what they're doing, which is somewhat annoying.

They'll convert your national rating to equivalent FIDE before running the sums, so you'd be very similar either way. Some people have big differences and that can be a tiny bit tricky.

Roger Lancaster · Post by **Roger Lancaster** » Mon Jun 22, 2020 10:57 am

MartinCarpenter wrote: ↑
Mon Jun 22, 2020 9:30 am

They'll convert your national rating to equivalent FIDE before running the sums, so you'd be very similar either way. Some people have big differences and that can be a tiny bit tricky.

A couple of years back, my published ECF standard grade - converted to ELO by the normal formula - exceeded my published FIDE rating by 380 points. The reasons aren't particularly important but, statistically, are explained by the fact that the latter figure was based on just five games.

This has no relevance to Lichess which neither knows nor cares what my true rating is. In any case, I had forgotten the discrepancy until the past few days when I began to wonder which figure Ken Regan's software might have used as the basis for determining my expected performance - if the lower [FIDE] figure was being used when my true playing strength was much higher then any apparent z=4 result would seem to reduce to a true figure of around z=2.2.

This seemed a matter of some concern so I enquired of someone, non-participant in this discussion, more familiar with Ken Regan's methods than I am. I was told, and I have absolutely no reason to doubt the integrity of my informant, that - in such cases - Ken Regan is informed of the higher figure and uses this as a safeguard against producing false results. I'm not totally reassured, as human errors {for example, in overlooking someone's national rating} can still occur but in principle it seems to answer the "tiny bit tricky" point raised by Martin.

David Sedgwick · Post by **David Sedgwick** » Mon Jun 22, 2020 11:24 am

Roger Lancaster wrote: ↑
Mon Jun 22, 2020 10:57 am
This seemed a matter of some concern so I enquired of someone, non-participant in this discussion, more familiar with Ken Regan's methods than I am. I was told, and I have absolutely no reason to doubt the integrity of my informant, that - in such cases - Ken Regan is informed of the higher figure and uses this as a safeguard against producing false results. I'm not totally reassured, as human errors {for example, in overlooking someone's national rating} can still occur but in principle it seems to answer the "tiny bit tricky" point raised by Martin.

Please also bear in mind that Ken Regan himself doesn't ban anyone or publish anything about anyone. He provides information to organisers and arbiters who decide what action (if any) to take on the basis of his findings.

Caoimhín de Búrca · Post by **Caoimhín de Búrca** » Mon Jun 22, 2020 11:26 am

MartinCarpenter wrote: ↑
Mon Jun 22, 2020 9:30 am
You've understood everything fine from what I can tell

Mildly surprised at that if I'm honest!

But it's good to hear!

I suppose the background for my query is that we did have a player banned during the season when their account was marked as receiving computer assistance. It leaves me in a tricky position of trying to understand objectively what has happened - and this experience may be of benefit to the thread in general.

Our player denies cheating (but of course, no-one ever admits it). Both 4NCL and Lichess have said that the player's 4NCL games were not actionable. (Those games scored high, but from reading this thread, it seems this could be explained by a double lag in FIDE rating in an improving player - a natural lag anyway as not all games are FIDE-rated, and then a covid lag as there is now a disconnect between rating and strength)

Aside from the 4NCL 45+15 games, the player only played rapid or blitz games on the site - so the implication is that the computer assistance was at a time control of 10 minutes or less. That strikes me as unlikely, although I accept it's possible.

Other than that, I understand Lichess have not told our player why specifically the ban was issued. Their response time in general has, I understand, been very slow (about a week to answer an e-mail). And while I appreciate there may be valid reasons not to share exact data, our player is in the position of being asked to appeal against something they've been given no information on. That's kind of hard. In the meantime, of course, the league is continuing and we're a player down.

The lichess and 4NCL approach to all this - broadly speaking - is that there's no point appealing because you're wrong. I think we've seen a bit of that frustration on this thread. Certainly given the speed of lichess' responses, it's not possible to appeal in practical terms relevant to 4NCL Online. And 4NCL's rule, of course, is technically inarguable as once an account is marked, the why of it does not feature in things (as it's not 4NCL's marker).

But I think it shouldn't really be the case that a player or captain has to read through 25 pages of posts of not unadvanced statistical information and learn how the system works before being able to frame an appeal. It does appear from reading this thread that there are factors which could skew a result in some minority of cases, and if either lichess or 4NCL were able to instead point a player towards those factors and ask for evidence that any of those may apply, then that might make for a more open and fair appeals process (if maybe impractical from lichess' point of view - how do you prove who's behind a lichess account?) It would be slightly akin to an arbiter taking a player aside at an OTB tournament and asking them to solve some puzzles of a certain standard as part of the process (which I think happens?).

Again, some or all of the above may well be wrong; I'm just trying to work my head around all this and get an objective view of things.

Roger Lancaster · Post by **Roger Lancaster** » Mon Jun 22, 2020 11:32 am

David Sedgwick wrote: ↑
Mon Jun 22, 2020 11:24 am

Roger Lancaster wrote: ↑
Mon Jun 22, 2020 10:57 am
This seemed a matter of some concern so I enquired of someone, non-participant in this discussion, more familiar with Ken Regan's methods than I am. I was told, and I have absolutely no reason to doubt the integrity of my informant, that - in such cases - Ken Regan is informed of the higher figure and uses this as a safeguard against producing false results. I'm not totally reassured, as human errors {for example, in overlooking someone's national rating} can still occur but in principle it seems to answer the "tiny bit tricky" point raised by Martin.

Please also bear in mind that Ken Regan himself doesn't ban anyone or publish anything about anyone. He provides information to organisers and arbiters who decide what action (if any) to take on the basis of his findings.

Entirely true and my post wasn't intended to imply otherwise.

English Chess Forum
