Same position, different assessment

JustinHorton
Posts: 10364
Joined: Mon Aug 04, 2008 10:06 am
Location: Somewhere you're not

Same position, different assessment

Post by JustinHorton » Mon Jan 04, 2021 1:47 pm

I was just looking at a line that might have occurred in an online game and noticed that though the two best lines were in fact identical courtesy of the en passant rule, my onboard computer was apparently giving them slightly different assessments.

Image

I've seen similar instances of this before, involving transpositions, but their occurring in this way makes it particularly easy to illustrate.

Why does this happen?
"Do you play chess?"
"Yes, but I prefer a game with a better chance of cheating."

lostontime.blogspot.com

Reg Clucas
Posts: 606
Joined: Mon May 16, 2011 3:45 pm

Re: Same position, different assessment

Post by Reg Clucas » Mon Jan 04, 2021 3:36 pm

This is pure speculation, but I wonder if, when several threads are being analysed simultaneously, they are for some reason allocated different amounts of system resources (e.g. RAM, CPU share) which makes them perform differently.

I don't have Stockfish but I've noticed the version used by Chessbomb sometimes produces weird, even nonsensical, evaluations.

John McKenna

Re: Same position, different assessment

Post by John McKenna » Mon Jan 04, 2021 3:43 pm

That may be peculiar to Stockfish.

How long the engine had been analysing the given position was not specified. (Edit: t:14.86 is shown in the screenshot and could be the time the engine was analysing.)

FWIW, after a few minutes HIARCS settles on -

1 +/- (1.31) 20.f3 gxf3 21.Rc2 Rd8 22.Bf4 Nd3...
2 +/- (1.31) 20.f4 gxf3 21.Rc2 Rd8 22.Bf4 Nd3...
3 -+ (-7.79) 20.Re3 Rd8 21.Rc3 Nd3 22.Be3 Ne5...

Perhaps someone else using Stockfish could try to confirm your findings?

Wadih Khoury
Posts: 604
Joined: Sun Jul 12, 2020 8:14 pm

Re: Same position, different assessment

Post by Wadih Khoury » Mon Jan 04, 2021 5:26 pm

It seems there are a ton of different potential reasons:

* Different CPU threads having different speed/load
* Stockfish keeping transposition state, so two identical positions can end up with different continuations (one of them will be suboptimal).
* and this technical explanation:
"However, in fact Stockfish is using a lot of forward pruning, reduction and extension techniques that are state/history dependent, so the result of searching a subtree depends on the state/history with which it entered the subtree. Therefore, starting with hash table entries that influence move ordering (and can actually return a different evaluation than searching the subtree again) leads to a different result for a search of the same depth."
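
To make the hash table point concrete, here is a minimal sketch in Python. It is a toy model and not Stockfish code: the game, `static_eval`, `moves` and the table layout are all invented for illustration, and a real engine adds pruning, reductions and move ordering on top. It only shows that the same position, searched to the same nominal depth, can return a different score depending on what is already in the table.

```python
# Toy model only: this is NOT Stockfish code. The game, static_eval, moves
# and the table layout are all made up for illustration.

def static_eval(state):
    """Made-up evaluation from the side to move's point of view."""
    return (state * 7) % 11 - 5


def moves(state):
    """Made-up move generator: two moves, so different move orders transpose."""
    return [state + 1, state + 2]


def negamax(state, depth, table):
    """Depth-limited negamax with a simple hash (transposition) table."""
    entry = table.get(state)
    # Reuse a stored score if it came from an equal or deeper search.
    if entry is not None and entry[0] >= depth:
        return entry[1]
    if depth == 0:
        score = static_eval(state)
    else:
        score = max(-negamax(child, depth - 1, table) for child in moves(state))
    table[state] = (depth, score)
    return score


position = 0

# Depth-1 search starting from an empty table.
fresh = negamax(position, 1, {})

# Depth-1 search after an earlier depth-2 search left its entries behind.
table = {}
negamax(position, 2, table)
warm = negamax(position, 1, table)

print(fresh, warm)  # prints "2 1"
```

In this sketch the second depth-1 request is answered straight from the table with the score of the earlier, deeper search, so the two "depth 1" assessments of the same position disagree. Whether this is what produced the difference in the screenshot is not something the toy model can settle.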

JustinHorton
Posts: 10364
Joined: Mon Aug 04, 2008 10:06 am
Location: Somewhere you're not

Re: Same position, different assessment

Post by JustinHorton » Thu Jan 21, 2021 5:13 pm

From this afternoon

Image

Image

Roger de Coverly
Posts: 21315
Joined: Tue Apr 15, 2008 2:51 pm

Re: Same position, different assessment

Post by Roger de Coverly » Thu Jan 21, 2021 6:06 pm

JustinHorton wrote:
Mon Jan 04, 2021 1:47 pm
I've seen similar instances of this before, involving transpositions, but their occurring in this way makes it particularly easy to illustrate.

If evidence from Ken Regan's and similar programs were ever subjected to a determined attempt in a court of law to discredit it, this inconsistency of evaluations would be a weakness.

John McKenna

Re: Same position, different assessment

Post by John McKenna » Thu Jan 21, 2021 6:50 pm

Whatever this thread purports to be it's not "science".

E Michael White
Posts: 1420
Joined: Fri Jun 01, 2007 6:31 pm

Re: Same position, different assessment

Post by E Michael White » Thu Jan 21, 2021 6:55 pm

Roger de Coverly wrote:
Thu Jan 21, 2021 6:06 pm
JustinHorton wrote:
Mon Jan 04, 2021 1:47 pm
I've seen similar instances of this before, involving transpositions, but their occurring in this way makes it particularly easy to illustrate.
If evidence from Ken Regan's and similar programs were ever subjected to a determined attempt in a court of law to discredit it, this inconsistency of evaluations would be a weakness.

I noticed this effect about 9 years ago when analysing positions and concluded that it was due to entries in the hash table potentially being different when the program starts its next block of analysis. I came across it because it was possible to improve the depth of analysis and performance by forcing the beast down the lines you intuitively felt were best, then backtracking one move at a time. This forced subsequent analysis to include the line you had selected, which was still in the hash table. The end result was that your best guess would be checked out against new computer-selected lines, and you would sometimes be correct!

I would expect this to be more of an issue recently with the advent of SSDs and ensuring files needing fast access are stored there. The suggestion in the cheating thread that machine type and size are irrelevant is, in my view, probably misguided.

These are more things that Ken R needs to consider, but as he is fundamentally a Computer Science Prof rather than a statistician, I expect he can resolve them.

John McKenna

Re: Same position, different assessment

Post by John McKenna » Thu Jan 21, 2021 7:04 pm

So, in less than 50 minutes it's gone from speculation to conjecture - not bad but still not enough to settle anything.

Matt Mackenzie
Posts: 5237
Joined: Tue Mar 31, 2009 11:51 pm
Location: Millom, Cumbria

Re: Same position, different assessment

Post by Matt Mackenzie » Thu Jan 21, 2021 7:11 pm

John McKenna wrote:
Thu Jan 21, 2021 6:50 pm
Whatever this thread purports to be it's not "science".
Who has claimed that it is?
"Set up your attacks so that when the fire is out, it isn't out!" (H N Pillsbury)

John McKenna

Re: Same position, different assessment

Post by John McKenna » Thu Jan 21, 2021 7:58 pm

Matt Mackenzie wrote:
Thu Jan 21, 2021 7:11 pm
John McKenna wrote:
Thu Jan 21, 2021 6:50 pm
Whatever this thread purports to be it's not "science".
Who has claimed that it is?
Do you really not know? Nor even hazard a guess?

In the interminable "Cheating in chess" thread the original poster in this thread has asked for Prof. Ken Regan's cheat-detection s/w to be subjected to standards of scientific proof.

Another longstanding critic of Prof. Regan's method has now brought up Regan's name in this thread.

Just try putting 1 plus 1 together...

If this thread is now going to start to critique Regan's s/w, as well as the other (that does so intermittently), perhaps it would be better to say so now.

If this thread is seriously meant to find a meaningful flaw in Regan's s/w, Justin H and E Michael White could start by saying whether or not they have conducted trials to determine how often Stockfish produces output of the kind shown twice by EJH and described once, in words, by EMW.