June 2006 DRAFT Aspects of Chess Engine Programming == Board Representation == One wants efficiency, preferably with some readability. There are only 64 squares on the board but for computational reasons the internal numbering range may well be larger with some numbers skipped. It is convenient to have bishops land on square numbers so as to allow easily tracking their "color". There would be no obvious advantage to square numbering beyond 2 digits. == Move Notation == Classic notation (P-QB3) lacks "at a glance" info as TO opponent piece captured as well as column. PGN/SAN lacks "at a glance" info as to the pieces involved in the move. Engine encoding can improve on both. One format (of endless possible) is shown here: SWWXYYZZ where S= +/-1 (sign, - denotes black) WW=move/data code (see appendix) X=piece # YY=from square ZZ=to square Notation does not have to mimic that of human play but the engine should be coded to translate. Pawns do not have to be distinguished but all other pieces should be. One piece numbering is: 1=pawn 2=QB 3=KB 4=QN 5=KN 6=QR 7=KR 8=queen 9=king WW max code may be a multiple of 64 (64,128) After a bit of exposure, the moves are fairly readable in their engine format. For simplicity, simple move (advance or capture) codes will be within the range <900000. As example, here is the opening few moves of a game with square numbering as depicted: PENDING == Move Evaluation - general == In general, moves are evaluated by depth. Evaluation criteria reflect piece threat, mobility and position. How each component is weighted is up to the programmer. Analysis of historical games can help with tuning of the weights. The typical piece strength is 1,3,3,5,9 for pawn,bishop,knight,rook and queen. Some engines assign a nominal strength to the king. Engine treatment of when to use what amount of time under timed format can vary considerably. Some engines will frontload time, some will use simple averaging and others will use other techniques. == Move Evaluation - details == Some engines highly prize analysis efficiency, using bitboard representations and tree-pruning algorithms. Other are more modest. Depending on the play format and move evaluation design, any engine can be competitive. The simplest approach is pure numerical scoring of every evaluated potential position. == Interfaces == One may have a custom interface and/or one of the established ones. Established GUIs include Winboard, UCI and Arena. If using a standard interface, the programmer needs to address the setup and play options of the interface. This includes ponder during opponent's turn, undo/redo, displaying best move evaluations and depth, providing hints and displaying the pawn equivalent points advantage/disadvantage. == Nuances == Strength assigned to pieces can vary with game progression. An engine might weight a pawn in rank 6,7 or 8 as a 2 rather than a 1. An engine may alter piece strength for move eval due to liklihood of opponent castling. For proper draw evaluation, the engine must keep a move count as well as enough recent move history to check for draw by repetition. To properly handle any "undo" interface option certainly the last few moves must be retained. To handle "hint request" the engine must be ready to analyze for its opponents best move on the opponents turn. == Stored Data == The typical engine will want to record information to disk to capture every move (possibly the 2nd choice too), play settings (ponder), starting color, piece values used (if edittable), result, play format used, constraints (depth/ponder), and opponent information. For suspended games, the existing FEN format would be the desired board setup snapshot image. == Move/data codes == The size constraint of an signed (32-bit) long variable type is 2147483647 (2E9). It is better to fit within 32-bit size constraints for backwards-compatibility reasons. The following talks in terms of absolute move codes ie abs(code). For terminal codes (like checkmate), the remainder of the code will typically be clear/zero eg. checkmate=1200000 as mentioned previously, move codes < 900000 (9E5) cover all advance/capture movements. move code = 9XYYZZ are reserved for the remaining movements (enpassant,castling,promotion) eg 9XYYZZ indicates pawn promotion where X=new piece 8 thru 2 eg 91YYZZ indicates en passant eg 90YYZZ indicates castle queenside (**) and 99YYZZ castle kingside (**) move codes beyond 4999999 (4E6) are reserved for critical/analyst comments. The move codes are listed in order to ensure the requisite ones fit in 32 bits. The notation allows for ease of filtering certain moves. This includes by player (<0), by piece # (masking), by non-capture moves (<100000), etc. Here is a fairly extensive list of move/data bullets == Move data (32/64 bits) == capture move 1-8 being the taken piece (9 is reserved for all other special moves) eg ...-114231 indicates a black pawn capturing a white pawn 10 check 11 resign 12 checkmate 13 stalemate 14 draw by agreement 15 draw by insufficient material 16 draw by 50 moves 17 draw by repetition 18 win by adjudication 19 loss on time 20 loss by adjudication 21 walkover 22 forfeit 23 disqualification 24 game protest filed 25 undiscovered check 26 game to be replayed 27 loss by engine operation issue (malfunction) 28 undo/redo operation seen (human-engine match) 29 hint requested (human-engine match) 30 illegal move attempted (includes XYYZZ data) 31 play suspended/resumed (toggle) 32 immediate move forced by opponent 34-49 available/tbd 50-53 reserved for analyst codes 54(critic code) both sides are considered equal here = 55 (critic code) white is slightly better +/= 56(critic code) black is slightly better =/+ 57 (critic code) white has a clear advantage +/- 58 (critic code) black has a clear advantage -/+ 59 (critic code) an excellent move ! 60 (critic code) a blunder ? 61 (critic code) an interesting move that may not be best !? 62 (critic code) a dubious move, but not easily refuted ?! 63 (critic code) note fork 64 (critic code) double check == Tourney data preamble (32/64 bits) == tourney format is Swiss RR tourney format is single elimination tourney format is double elimination tourney format is triple elimination tourney format is true RR (everyone plays everyone) championship played after predetermined game/round limit (2 highest rated) tourney game assignment random for the beginning rounds tourney game assignment purely random thruout all rounds tourney game assignment driven by winner/loser bracketing tourney nationally sanctioned by site country tourney internationally sanctioned local level tourney regional level tourney state level tourney national level tourney international level tourney open event (all-comers) restricted event (qualifying) restricted event (invitation) engine is defending champion or #1 seed play is time per move format play is Blitz play is off the clock play is depth-restricted play uses time credit per move play is elapsed playing time (game in xx) play is engine operation by programmer or designee only play is engine operation by tourney rep engine play is all onsite engine play is can be offsite (proctored) engine play is can be offsite (honor system) ponder allowed during opponent turn use of move database NOT allowed all players use identical hash table size platform is Windows OS platform is Unix/Linux type platform OS is other platform using Intel processor platform using AMD processor platform processor is other platform (OS/processor) at player discretion 43-64 available/tbd == Engine data preamble (32/64 bits) == winning record against humans winning record against engines winning record playing as White winning record playing as Black winning record in sanctioned tourney play 64-bit binary 32-bit binary only engine can skew play for a draw engine can play variant A (Fischer Random) engine can play variant B (Chaturanga) engine can play variant C (play to lose) engine can play variant D (handicapped with missing piece) engine can play variant E (3D) engine code is Java-based engine code is C-based engine code is Perl-based engine code language is other engine version is latest release engine version is beta release engine version is NOT the latest engine is not ELO rated engine rating is Average club player 1600-2100 ELO engine rating is Strong club player 2100-2300 ELO engine rating is International league player 2300-2450 ELO engine rating is International Master (IM) 2450-2600 ELO engine rating is Grandmaster (GM) 2600-2850 ELO engine rating is Supergrandmaster (2851+ ELO) engine is nationally ranked engine is internationally ranked engine can play Blitz engine is unbeaten by human opponents engine is unbeaten by engine opponents engine has a winning record against every engine played multiple times engine is or has been champion engine of the state engine is or has been champion engine of the county engine is or has been champion engine of the country engine is reigning World Champion engine has international play experience engine has known or suspected bugs engine can use a move database optionally engine always uses a move database engine never uses a move database 41-64 available/tbd == References == here are some web links http://www.uschess.org/ http://en.wikipedia.org/wiki/Computer_chess http://wwwcs.uni-paderborn.de/~IPCCC/ http://www.playwitharena.com/ http://www.tim-mann.org/xboard.html http://www.uciengines.de/ http://en.wikipedia.org/wiki/Chess_notation http://en.wikipedia.org/wiki/ELO_rating_system http://www.chessvariants.com/d.chess/chess.html as far as books, I am not in a position to recommend certain references over any other. Bill McEachen C programmer and mediocre chess player