+150 Elo in self-play (1s / move)
- Increased Futility and Reverse Futility Pruning Depths
- Tweaked LMR
- Cleaner, easier to understand code
- Re-searches with a null window first before searching the full window
- Added Second Set of Killer Moves
- Added King Safety in the form of King Tropism
  - Extra bonus to diagonals in line with the king
  - Extra bonus to attack if enemy king is near semi-open files
  - Weighted by attacker's material
- Evaluation tuned using a logistic regression over a custom constructed dataset, similar to the Texel Method
  - Black-box tuning was done using simulated annealing + local search, with a pseudo-Huber loss
    - Pseudo-Huber loss was used here since the dataset likely contains outliers that would unfavorably skew the relatively simple evaluation function. This was a choice based on what I understood about the dataset, and it gave a marginal improvement in evaluation quality over the traditional MSE loss (roughly +10 Elo over 2000 games). If my evaluation were more complex, I might be more tempted to stay with MSE loss, as long as on-board checkmates are removed from the dataset
- Fixed some timeout bugs
- Increased hash table stability
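
To illustrate the loss choice mentioned in the tuning notes above, here is a minimal Python sketch comparing MSE with a pseudo-Huber loss in a Texel-style objective. It is not the engine's actual tuning code: the function names, the scaling constant `k`, the `delta` value, and the `(eval, result)` data layout are all assumptions made for illustration.

```python
import math

def sigmoid(score_cp, k=1.0):
    """Map a centipawn evaluation to an expected game result in (0, 1).
    The constant k is a hypothetical scaling factor, usually fitted to the data."""
    return 1.0 / (1.0 + 10.0 ** (-k * score_cp / 400.0))

def mse_loss(residual):
    """Squared error: large residuals (outliers) dominate the total."""
    return residual ** 2

def pseudo_huber_loss(residual, delta=0.5):
    """Approximately residual^2 / 2 near zero, approximately linear for
    large |residual|, so outliers pull on the parameters less than with MSE.
    delta (assumed value) sets where the transition happens."""
    return delta ** 2 * (math.sqrt(1.0 + (residual / delta) ** 2) - 1.0)

def dataset_loss(positions, loss_fn):
    """Average loss over (static_eval_cp, game_result) pairs,
    with game_result in {0.0, 0.5, 1.0} from White's point of view."""
    total = 0.0
    count = 0
    for eval_cp, result in positions:
        total += loss_fn(result - sigmoid(eval_cp))
        count += 1
    return total / count
```

A black-box optimizer such as simulated annealing would then perturb the evaluation weights and keep changes that lower `dataset_loss(positions, pseudo_huber_loss)`.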
(95)
Gauntlet run for test ratings (1 min + 0.5 sec increment), with Elo centered on the v1.4 release (ratings from bayeselo):
Rank | Name | Elo | + | - | Games | Score | Oppo. | Draws |
---|---|---|---|---|---|---|---|---|
1 | Barbarossa-0.6.0 | 38 | 34 | 33 | 240 | 55% | 95 | 23% |
2 | CeeChess-v1.4 | 0 | 13 | 13 | 1664 | 65% | -13 | 26% |
3 | Barbarossa-0.5.0-win10-64 | -34 | 33 | 33 | 240 | 45% | 95 | 28% |
4 | Kingfisher.v1.1.1 | -107 | 32 | 33 | 240 | 34% | 95 | 36% |
5 | gopher_check | -146 | 34 | 35 | 238 | 29% | 95 | 26% |
6 | CeeChess 1.3.2 | -149 | 34 | 36 | 238 | 29% | 95 | 25% |
... |
Since CCRL ratings were adjusted down recently (Stockfish went from 3900 CCRL to ~3630, as far as I know), this release no longer breaks the CCRL 2400 barrier. However, comparing the results here to the old ratings of Barbarossa-0.6.0 (2468), Barbarossa-0.5.0 (~2375, I believe) and the others suggests that it would have broken that barrier under the old scale. I now expect the engine to land in the range of 2300-2350, given that Barbarossa-0.6.0 has a new rating of 2355.
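
As a rough sanity check on the table above, the standard logistic Elo formula can be used to convert between a rating difference and an expected score. This is only a sketch: bayeselo fits its own model (including draw and first-move parameters), so its numbers will not match this formula exactly, and the function names here are made up for illustration.

```python
import math

def expected_score(elo_diff):
    """Expected score (0..1) for a player rated elo_diff points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_diff_from_score(score):
    """Inverse of expected_score; score must be strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)
```

With v1.4 centered at 0, `expected_score(38)` is roughly 0.55 and `expected_score(-149)` is roughly 0.30, which is close to the 55% and 29% scores reported for Barbarossa-0.6.0 and CeeChess 1.3.2.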