Stockfish 15.1

Ian Thompson
Posts: 3562
Joined: Wed Jul 02, 2008 4:31 pm
Location: Awbridge, Hampshire

Stockfish 15.1

Post by Ian Thompson » Tue Dec 06, 2022 12:02 pm

Stockfish 15.1 was released a couple of days ago.

Their website contains the interesting comment that:
This release also introduces a new convention for the evaluation that is reported by search. An evaluation of +1 is now no longer tied to the value of one pawn, but to the likelihood of winning the game. With a +1 evaluation, Stockfish has now a 50% chance of winning the game against an equally strong opponent.
That raises a few questions, such as what does an evaluation of +0.5, +2, or anything else mean as a percentage chance of winning the game? Presumably, it won't work properly with any software that expects the evaluations it gets from the engine to be pawn advantages.

I can't find anything more on it than what's on the Stockfish website and a tweet from Matthew Sadler.

Anyone else found more information about it?

Roger de Coverly
Posts: 21322
Joined: Tue Apr 15, 2008 2:51 pm

Re: Stockfish 15.1

Post by Roger de Coverly » Tue Dec 06, 2022 12:40 pm

Ian Thompson wrote:
Tue Dec 06, 2022 12:02 pm
That raises a few questions, such as what does an evaluation of +0.5, +2, or anything else mean as a percentage chance of winning the game?
The other known point is that +0.0 equates to a belief by Stockfish that it can draw against itself. But is their distribution linear? Presumably not, or it would be capped at +2 when Stockfish believes it can beat itself with 100% reliability.
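
To make the cap concrete, here is a minimal sketch of the linear reading, assuming the win chance is 0% at 0.0 and rises by 50 percentage points per evaluation point. The function name is made up for illustration, and this is only the hypothetical linear model being questioned, not how Stockfish actually maps evaluations.

```python
def linear_win_chance(evaluation: float) -> float:
    """Win chance under a purely linear reading (an assumption).

    Passes through 0% at 0.0 and 50% at +1.0, then is clamped to
    the valid probability range [0, 1].
    """
    return min(1.0, max(0.0, 0.5 * evaluation))


for ev in (0.5, 1.0, 2.0, 3.0):
    print(f"{ev:+.1f} -> {linear_win_chance(ev):.0%} win chance")
```

Under that reading every evaluation of +2 or more would report the same 100%, which is why a linear scale looks unlikely.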

NickFaulks
Posts: 8475
Joined: Sat Jan 02, 2010 1:28 pm

Re: Stockfish 15.1

Post by NickFaulks » Tue Dec 06, 2022 2:57 pm

"With a +1 evaluation, Stockfish has now a 50% chance of winning the game against an equally strong opponent."

And a 0% chance of losing? That is presumably implied, but must be wrong in general.
If you want a picture of the future, imagine a QR code stamped on a human face — forever.

Roger de Coverly
Posts: 21322
Joined: Tue Apr 15, 2008 2:51 pm

Re: Stockfish 15.1

Post by Roger de Coverly » Tue Dec 06, 2022 3:03 pm

NickFaulks wrote:
Tue Dec 06, 2022 2:57 pm
And a 0% chance of losing? That is presumably implied, but must be wrong in general.
They are using the reporting method first seen with AlphaZero, namely the number of wins or percentage from a Monte Carlo process. Or at least I think that's how it works. Have the statistics of the number of draws and losses ever been reported in a similar manner? Perhaps they ought to express it in terms of a 100 game match, so that plus 1 indicates a 75% result (50 wins 50 draws) or whatever.
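
As a rough illustration of the 100 game idea, a match result can be turned into a percentage score by counting a win as one point and a draw as half a point. The function name below is made up for this sketch; the 50 wins and 50 draws are the figures from the example above, not measured Stockfish results.

```python
def match_score_percent(wins: int, draws: int, losses: int) -> float:
    """Percentage score for a match: win = 1 point, draw = 0.5, loss = 0."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("match must contain at least one game")
    return 100.0 * (wins + 0.5 * draws) / games


# The hypothetical +1 match from the post: 50 wins, 50 draws, no losses.
print(match_score_percent(50, 50, 0))
```

Note that the same percentage can come from very different win/draw/loss splits, which is why the draw and loss counts would be worth reporting separately.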

Brian Egdell
Posts: 49
Joined: Sun Apr 15, 2018 2:38 pm
Location: The Netherlands

Re: Stockfish 15.1

Post by Brian Egdell » Tue Dec 06, 2022 3:23 pm

Instead of thinking of the probability of winning against an equally strong opponent, think of the probability of not winning, i.e. drawing or losing. This is apparently also 50% when the advantage is +1. Then think of a factor K to multiply that probability by whenever you add one more point to that advantage.

For example, K might be two thirds.

That would mean that an evaluation of 0.0 (equal position) means that you have a probability of 75% of not winning the game against an opponent of equal strength. (That would evidently mean 25% probability of winning, 25% of losing, and 50% of a draw.)

Adding one to the evaluation, giving 1.0, multiplies the 75% probability of not winning against equal strength opposition by two thirds to give 50%.

Adding one again to give 2.0 evaluation means 50% * 2/3 = 33.3% probability of not winning.

Et cetera.

I don't know what the value of K actually is in this construction, but I am guessing that it works along these lines.
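
The construction above can be sketched in a few lines, assuming the probability of not winning is pinned at 50% for +1.0 and multiplied by K for each extra point. The function name is hypothetical and K = 2/3 is only the guessed example value, not a constant taken from Stockfish.

```python
def prob_not_winning(evaluation: float, k: float = 2 / 3) -> float:
    """Probability of drawing or losing under the guessed geometric model.

    Anchored at 50% for an evaluation of +1.0; each further point
    multiplies the probability by k (k = 2/3 is an assumption).
    """
    return 0.5 * k ** (evaluation - 1.0)


for ev in (0.0, 1.0, 2.0, 3.0):
    p = prob_not_winning(ev)
    print(f"{ev:+.1f}: {p:.1%} not winning, {1 - p:.1%} winning")
```

With K = 2/3 the formula exceeds 100% once the evaluation drops to about -0.7, so a rule like this could only apply to the side that stands equal or better.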

Maybe the change has been done because the value of a pawn is a little too arbitrary and rigid for chess positions as evaluated by modern programs.

MartinCarpenter
Posts: 3053
Joined: Tue May 24, 2011 10:58 am

Re: Stockfish 15.1

Post by MartinCarpenter » Tue Dec 06, 2022 5:36 pm

NickFaulks wrote:
Tue Dec 06, 2022 2:57 pm
"With a +1 evaluation, Stockfish has now a 50% chance of winning the game against an equally strong opponent."

And a 0% chance of losing? That is presumably implied, but must be wrong in general.
The chance of SF 15 losing from a slightly better position isn't precisely zero but is certainly frighteningly low. It just doesn't do any of the human things that make us lose them.

Angus French
Posts: 2153
Joined: Thu May 15, 2008 1:37 am

Re: Stockfish 15.1

Post by Angus French » Tue Dec 06, 2022 11:00 pm

lichess apparently now uses Stockfish 15.1 for whole-game analysis but 14+ NNUE for examination of individual moves. I'm guessing the evaluation scheme is the same for both but can't see that it has been explained.

MartinCarpenter
Posts: 3053
Joined: Tue May 24, 2011 10:58 am

Re: Stockfish 15.1

Post by MartinCarpenter » Wed Dec 07, 2022 9:50 am

Angus French wrote:
Tue Dec 06, 2022 11:00 pm
lichess apparently now uses Stockfish 15.1 for whole-game analysis but 14+ NNUE for examination of individual moves. I'm guessing the evaluation scheme is the same for both but can't see that it has been explained.
Should be, yes. The evaluation functions are all (I'm fairly sure) based on neural nets trained on massive numbers of self-played games. So a %age expected score is actually what the numbers mean these days.

Of course the score that SF gets playing against itself is not automatically a reliable indicator of how well humans will score from a given position! You could probably actually use databases and so on to try and train something that predicted the expected human %age score.

Roger de Coverly
Posts: 21322
Joined: Tue Apr 15, 2008 2:51 pm

Re: Stockfish 15.1

Post by Roger de Coverly » Wed Dec 07, 2022 10:33 am

MartinCarpenter wrote:
Wed Dec 07, 2022 9:50 am
Of course the score that SF gets playing against itself is not automatically a reliable indicator of how well humans will score from a given position! You could probably actually use databases and so on to try and train something that predicted the expected human %age score.
It would be useful to humans to have some measure of the volatility of the advantage. You get Stockfish evaluations of plus 2 or plus 3 with level material. That means the opponent is believed to be busted, but some degree of tactical accuracy is needed to maintain the advantage. That contrasts with, say, just being a pawn up where there's nothing really happening, the task then being to nurse the extra material to victory.