Transcript for:
Enhancements in Stockfish Chess Engine

I've improved Stockfish, the world's strongest chess engine, four times, and I just had two more Elo-gaining changes accepted into the official code. This was probably my most clever idea yet, so today I'm going to show you how this monster chess bot analyzes a position and how I made it stronger.

Stockfish is no ordinary chess player. Within seconds of starting a search, it's already looking deep into the endgame, analyzing tens of millions of positions per second. To do an initial evaluation of a position, it uses a neural network, a structure loosely inspired by the way human brains work, to assign the position a score, where higher values are better. This neural network started with random outputs but learned from millions of positions what makes a chess position good, and my first improvement to Stockfish was speeding this network up.

The engine builds a structure of potential future positions by taking each position, or node, and splitting it up based on what happens when you play each move. When you flip it over, it looks like a tree, which is why computer scientists call it a search tree. Then, as Stockfish searches, it calculates a dynamic score for each position by taking the maximum value it can get from each move. But when it's the opponent's turn, it picks the move that minimizes the score, because it knows the opponent wants it to lose. This is called minimax, and it's the bread and butter of combinatorial game theory, which studies two-player games with perfect information about the game state and no randomness.

But if you let the bot keep going deeper and deeper into the search tree, you'll end up searching more positions than there are atoms in the universe. So Stockfish starts the analysis with a depth parameter that controls how far it will probe into the search tree. Each time it descends one step into the tree, it decreases the depth by one, and when the depth hits zero, it falls back on that static evaluation I mentioned earlier. There's a minimal sketch of this depth-limited minimax below.

But with the plain minimax algorithm, Stockfish spends the same amount of time on each move, no matter how bad it is. On one hand, you don't want to waste time on bad moves; on the other hand, you want to look at every move so you don't miss something important. So how does Stockfish solve this dilemma? The answer is late move reduction. If you've watched my last video, you'll know that Stockfish ranks the moves from best to worst by keeping running statistics of which moves have been good lately. For moves that rank badly, Stockfish actually decreases the search depth by calculating a reduction based on a bunch of conditions we've refined over the years. You want the size of the reduction to grow as you get more confident the move is bad, which happens when the move ranks low in the move ordering or the search depth is very high, but you don't want it to grow so fast that important ideas get discarded entirely. Stockfish uses a simple recipe: a constant times a slow-growing function of the depth times a slow-growing function of the move's rank in the ordering. It then does a reduced search by subtracting the reduction from the depth, and if the value from that search is the best so far, it verifies the result with a full-effort search, resetting the depth to its value before the reduction. The second sketch below shows how these pieces fit together.
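Here's a minimal sketch of that depth-limited minimax, written in the "negamax" style engines actually use, where every score is from the side to move's point of view. The Position class, children(), and staticEval() are hypothetical stubs standing in for the real engine, not Stockfish's actual code:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for Stockfish's Position class.
struct Position {
    // Positions reachable by playing each legal move (stubbed out here;
    // a real engine generates these from the board state).
    std::vector<Position> children() const { return {}; }
    // Static evaluation of the side to move (the neural network's job
    // in the real engine; stubbed out here).
    int staticEval() const { return 0; }
};

// Depth-limited minimax in negamax form: because the score is always
// from the side to move's point of view, "maximize our score" and
// "minimize the opponent's score" collapse into one negated recursive
// call. A real engine would also detect mate and stalemate when a
// position has no legal moves.
int search(const Position& pos, int depth) {
    if (depth == 0)
        return pos.staticEval();   // depth ran out: use the static eval

    int best = -1000000;           // stands in for minus infinity
    for (const Position& child : pos.children())
        best = std::max(best, -search(child, depth - 1));
    return best;
}
```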
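And here's a rough sketch of the baseline reduction and the verification re-search, reusing the stubs above. The logarithms and the constant 0.5 are illustrative stand-ins for Stockfish's tuned numbers; only the overall shape of the formula, a constant times two slow-growing functions, comes from what I described:

```cpp
#include <algorithm>
#include <cmath>

// Baseline late move reduction: constant * f(depth) * g(moveRank),
// with logarithms as the slow-growing functions. Values are made up.
int reduction(int depth, int moveRank) {
    return std::max(0, int(0.5 * std::log(depth) * std::log(moveRank)));
}

int searchLMR(const Position& pos, int depth) {
    if (depth <= 0)
        return pos.staticEval();   // out of depth: static evaluation

    int best = -1000000;
    int rank = 1;                  // 1 = ranked best by the move ordering
    for (const Position& child : pos.children()) {
        int r = reduction(depth, rank++);

        // Cheap probe: search supposedly-bad moves to a reduced depth.
        int value = -searchLMR(child, depth - 1 - r);

        // If the probe turns up the best value so far, verify it with a
        // full-effort search at the depth before the reduction.
        if (r > 0 && value > best)
            value = -searchLMR(child, depth - 1);

        best = std::max(best, value);
    }
    return best;
}
```

One nice property of the logarithm here: log(1) is zero, so the top-ranked move gets no reduction at all, and the moves the ordering trusts most are always searched at full depth.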
This pattern, a cheap, low-effort test to filter out ideas followed by a rigorous, high-effort test to verify them, is a lot like how we test changes to Stockfish: we start out testing a change on 10-second games, and only move to 60-second games if it shows promise. Changing the size of the reduction based on information about the position has been one of the best ways we've found to squeeze Elo, our measure of playing strength, out of the engine, and it's how I got this newest pair of improvements. This code may not look like much, but its impact on performance is equal to nearly a year of development progress.

In addition to the baseline reduction, we've added a bunch of extra conditions over the years based on other things we know about the position. My favorites are these two: one increases the reduction for moves that don't capture a piece when our best move does capture a piece, since missing out on a free piece is generally a mistake, and the other increases or decreases the reduction based on running statistics of how good each move has been in similar positions. Both are sketched in the first snippet below.

I had tried several dozen changes to this part of the code, and every time I did, I felt like I was missing something. There was something about these 20 or so lines of code that felt like it was holding the engine back. The thing about working on a problem is that if you do it for long enough, the answer will sometimes just come to you. One morning, I woke up with a mental image of Stockfish's main rival, Leela Chess Zero, analyzing a position, and I realized what had been bugging me so much. When calculating a reduction, Stockfish only looks at information we know from the current position: this term is based on how confused we are about the current position, this term is based on whether we think the last move leading up to the current position was bad, and most importantly, this term, which by itself is probably worth more than 10 Elo, is based on how good we think the move is right now. When calculating a reduction, we never actually look at what happens after we make the move.

And the best piece of information we get after playing a move is how the static evaluation changes. If it increases, the move is probably good, so we want to search it more. At the same time, you don't want this trick to apply to moves that already have a small reduction, so I set it to apply only when the move was going to be reduced by a depth of at least three. I tested this technique on Stockfish's testing website and it worked quite well, giving around an Elo point or two. It's nice to think that whenever Stockfish initially thinks a move is bad and then gets evidence it's worth searching, it can put more effort into that move, just like a human would.

But wait. Didn't I say that I made a pair of improvements, not just one? The thing is, sometimes the static evaluation increases after making a move, but other times a move is so bad that it completely tanks the evaluation. When that happens, it's a really good sign the move is going to be no good, so instead of increasing the depth like we did before, we can decrease it. For my first test of this idea, I increased the reduction when the static evaluation dropped by 200 internal units, which is about the value of a pawn. I was pretty careful when I implemented this idea, since decreasing the depth can have dangerous side effects; some of the conditions guarding this early version made sure we weren't simply in check. The second snippet below sketches both halves of the idea.
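To make the shape of those two extra conditions concrete, here's a hypothetical sketch. The step size and the history scaling are made-up placeholders; only the direction of each adjustment comes from what I just described:

```cpp
// Hypothetical sketch of the two extra reduction terms. The +1 step
// and the 8192 divisor are illustrative, not Stockfish's tuned values.
int adjustReduction(int r, bool moveIsCapture, bool bestMoveIsCapture,
                    int historyScore)  // running stats: higher means the
                                       // move has done well lately
{
    // Quiet moves look worse when the best move so far grabs a piece:
    // missing out on a free piece is generally a mistake.
    if (!moveIsCapture && bestMoveIsCapture)
        r += 1;

    // Moves with a good track record in similar positions get searched
    // deeper; moves with a bad one get reduced even more.
    r -= historyScore / 8192;

    return r;
}
```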
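And here's a sketch of the new pair of adjustments themselves. The function name, step sizes, and exact guards are placeholders; the 200-unit margin and the reduced-by-at-least-three condition are the ones from my tests:

```cpp
#include <algorithm>

// Hypothetical sketch of the new pair of ideas. evalDelta is the change
// in static evaluation from the moving side's point of view after the
// move is played (in a real negamax engine you'd negate the
// opponent-to-move score to get this).
int adjustForEvalChange(int r, int evalDelta, bool inCheck, int depth) {
    // First improvement: the evaluation went up after the move, so it
    // deserves more effort, but only when the move was already going to
    // be reduced by a depth of at least three.
    if (evalDelta > 0 && r >= 3)
        r -= 1;

    // Second improvement: the evaluation tanked by about a pawn, a
    // strong hint the move is no good, so reduce it even further. The
    // in-check guard was one of the conditions on the early version.
    if (evalDelta <= -200 && !inCheck)
        r += 1;

    // Clamp so the reduced search keeps a positive depth:
    // depth - 1 - r stays at least 1.
    return std::min(r, depth - 2);
}
```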
I also make sure the depth is still positive after the reduction, like the clamp at the end of the sketch above. This gave around two Elo points, which for top engines is quite a large gain. I then ran a simplification test to confirm those guard conditions could be removed. That didn't gain any Elo, but it made the code easier to understand; in fact, simplifying the code is how we've kept the Stockfish code base roughly the same size over the years. I tried a bunch of things to build on this idea, like altering the conditions and increasing the size of the effect. We did have a small tune that changed the evaluation-drop margin from a full pawn to three quarters of a pawn, but other than that, here are the four lines of code that represent my newest contribution to the world's strongest chess engine.

Thank you all so much for 1,000 subscribers. I've linked the next video in the "Improving Stockfish" series, along with the full playlist, so you can enjoy the whole journey from the beginning.