Standard: Portable Game Notation Specification and Implementation Guide

Revised: 1994.03.12

Authors: Interested readers of the Internet newsgroup rec.games.chess

Coordinator: Steven J. Edwards (send comments to sje@world.std.com)


0: Preface

From the Tower of Babel story:

"If now, while they are one people, all speaking the same language, they have
started to do this, nothing will later stop them from doing whatever they
propose to do."

Genesis XI, v.6, _New American Bible_


1: Introduction

PGN is "Portable Game Notation", a standard designed for the representation of
chess game data using ASCII text files. PGN is structured for easy reading and
writing by human users and for easy parsing and generation by computer
programs. The intent of the definition and propagation of PGN is to facilitate
the sharing of public domain chess game data among chessplayers (both organic
and otherwise), publishers, and computer chess researchers throughout the
world.

PGN is not intended to be a general purpose standard that is suitable for every
possible use; no such standard could fill all conceivable requirements.
Instead, PGN is proposed as a universal portable representation for data
interchange. The idea is to allow the construction of a family of chess
applications that can quickly and easily process chess game data using PGN for
import and export among themselves.


2: Chess data representation

Computer usage among chessplayers has become quite common in recent years and a
variety of different programs, both commercial and public domain, are used to
generate, access, and propagate chess game data. Some of these programs are
rather impressive; most are now well behaved in that they correctly follow the
Laws of Chess and handle users' data with reasonable care.
Unfortunately, many
programs have had serious problems with several aspects of the external
representation of chess game data. Sometimes these problems become more
visible when a user attempts to move significant quantities of data from one
program to another; if there has been no real effort to ensure portability of
data, then the chances for a successful transfer are small at best.


2.1: Data interchange incompatibility

The reasons for format incompatibility are easy to understand. In fact, most
of them are correlated with the same problems that have already been seen with
commercial software offerings for other domains such as word processing,
spreadsheets, fonts, and graphics. Sometimes a manufacturer deliberately
designs a data format using encryption or some other secret, proprietary
technique to "lock in" a customer. Sometimes a designer may produce a format
that can be deciphered without too much difficulty, but at the same time
publicly discourage third party software by claiming trade secret protection.
Another software producer may develop a non-proprietary system, but it may work
well only within the scope of a single program or application because it is not
easily expandable. Finally, some other software may work very well for many
purposes, but it uses symbols and language not easily understood by people or
computers available to those outside the country of its development.


2.2: Specification goals

A specification for a portable game notation must observe the lessons of
history and be able to handle probable needs of the future. The design
criteria for PGN were selected to meet these needs. These criteria include:

1) The details of the system must be publicly available and free of unnecessary
complexity.
Ideally, if the documentation is not available for some reason,
typical chess software developers and users should be able to understand most
of the data without the need for third party assistance.

2) The details of the system must be non-proprietary so that users and software
developers are unrestricted by concerns about infringing on intellectual
property rights. The idea is to let chess programmers compete in a free market
where customers may choose software based on their real needs and not based on
artificial requirements created by a secret data format.

3) The system must work for a variety of programs. The format should be such
that it can be used by chess database programs, chess publishing programs,
chess server programs, and chessplaying programs without being unnecessarily
specific to any particular application class.

4) The system must be easily expandable and scalable. The expansion ability
must include handling data items that may not exist currently but could be
expected to emerge in the future. (Examples: new opening classifications and
new country names.) The system should be scalable in that it must not have any
arbitrary restrictions concerning the quantity of stored data. Also, planned
modes of expansion should either preserve earlier databases or at least allow
for their automatic conversion.

5) The system must be international. Chess software users are found in many
countries and the system should be free of difficulties caused by conventions
local to a given region.

6) Finally, the system should handle the same kinds and amounts of data that
are already handled by existing chess software and by print media.


2.3: A sample PGN game

Although its description may seem rather lengthy, PGN is actually fairly
simple. A sample PGN game follows; it has most of the important features
described in later sections of this document.

[Event "F/S Return Match"]
[Site "Belgrade, Serbia JUG"]
[Date "1992.11.04"]
[Round "29"]
[White "Fischer, Robert J."]
[Black "Spassky, Boris V."]
[Result "1/2-1/2"]

1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 5. O-O Be7 6. Re1 b5 7. Bb3 d6 8. c3
O-O 9. h3 Nb8 10. d4 Nbd7 11. c4 c6 12. cxb5 axb5 13. Nc3 Bb7 14. Bg5 b4 15.
Nb1 h6 16. Bh4 c5 17. dxe5 Nxe4 18. Bxe7 Qxe7 19. exd6 Qf6 20. Nbd2 Nxd6 21.
Nc4 Nxc4 22. Bxc4 Nb6 23. Ne5 Rae8 24. Bxf7+ Rxf7 25. Nxf7 Rxe1+ 26. Qxe1 Kxf7
27. Qe3 Qg5 28. Qxg5 hxg5 29. b3 Ke6 30. a3 Kd6 31. axb4 cxb4 32. Ra5 Nd5 33.
f3 Bc8 34. Kf2 Bf5 35. Ra7 g6 36. Ra6+ Kc5 37. Ke1 Nf4 38. g3 Nxh3 39. Kd2 Kb5
40. Rd6 Kc5 41. Ra6 Nf2 42. g4 Bd3 43. Re6 1/2-1/2


3: Formats: import and export

There are two formats in the PGN specification. These are the "import" format
and the "export" format. These are the two different ways of formatting the
same PGN data according to its source. The details of the two formats are
described throughout the following sections of this document.

Other than formats, there is the additional topic of PGN presentation. While
both PGN import and export formats are designed to be readable by humans, there
is no recommendation that either of these be an ultimate mode of chess data
presentation. Rather, software developers are urged to consider all of the
various techniques at their disposal to enhance the display of chess data at
the presentation level (i.e., highest level) of their programs. This means
that the use of different fonts, character sizes, color, and other tools of
computer aided interaction and publishing should be explored to provide a high
quality presentation appropriate to the function of the particular program.


3.1: Import format allows for manually prepared data

The import format is rather flexible and is used to describe data that may have
been prepared by hand, much like a source file for a high level programming
language. A program that can read PGN data should be able to handle the
somewhat lax import format.


3.2: Export format used for program generated output

The export format is rather strict and is used to describe data that is usually
prepared under program control, something like a pretty printed source program
reformatted by a compiler.


3.2.1: Byte equivalence

For a given PGN data file, export format representations generated by different
PGN programs on the same computing system should be exactly equivalent, byte
for byte.


3.2.2: Archival storage and the newline character

Export format should also be used for archival storage. Here, "archival"
storage is defined as storage that may be accessed by a variety of computing
systems. The only extra requirement for archival storage is that the newline
character have a specific representation that is independent of its value for a
particular computing system's text file usage. The archival representation of
a newline is the ASCII control character LF (line feed, decimal value 10,
hexadecimal value 0x0a).

Sadly, there are some accidents of history that survive to this day that have
baroque representations for a newline: multicharacter sequences, end-of-line
record markers, start-of-line byte counts, fixed length records, and so forth.
It is well beyond the scope of the PGN project to reconcile all of these to the
unified world of ANSI C and those enjoying the bliss of a single '\n'
convention. Some systems may just not be able to handle an archival PGN text
file with native text editors.
In these cases, an indulgence of sorts is
granted to use the local newline convention in non-archival PGN files for those
text editors.


3.2.3: Speed of processing

Several parts of the export format deal with exact descriptions of line and
field justification that are absent from the import format details. The main
reason for these restrictions on the export format is to allow the
construction of simple data translation programs that can easily scan PGN data
without having to have a full chess engine or other complex parsing routines.
The idea is to encourage chess software authors to always allow for at least a
limited PGN reading capability. Even when a full chess engine parsing
capability is available, it is likely to be at least two orders of magnitude
slower than a simple text scanner.


3.2.4: Reduced export format

A PGN game represented using export format is said to be in "reduced export
format" if all of the following hold: 1) it has no commentary, 2) it has only
the standard seven tag roster identification information ("STR", see below), 3)
it has no recursive annotation variations ("RAV", see below), and 4) it has no
numeric annotation glyphs ("NAG", see below). Reduced export format is used
for bulk storage of unannotated games. It represents a minimum level of
standard conformance for a PGN exporting application.


4: Lexicographical issues

PGN data is composed of characters; non-overlapping contiguous sequences of
characters form lexical tokens.


4.1: Character codes

PGN data is represented using a subset of the eight bit ISO 8859/1 (Latin 1)
character set. ("ISO" is the short name of the International Organization for
Standardization.) This set is also known as ECMA-94 and is similar to other ISO
Latin character sets.
ISO 8859/1 includes the standard seven bit ASCII
character set for the 32 control character code values from zero to 31. The 95
printing character code values from 32 to 126 are also equivalent to seven bit
ASCII usage. (Code value 127, the ASCII DEL control character, is a graphic
character in ISO 8859/1; it is not used for PGN data representation.)

The 32 ISO 8859/1 code values from 128 to 159 are non-printing control
characters. They are not used for PGN data representation. The 32 code values
from 160 to 191 are mostly non-alphabetic printing characters and their use for
PGN data is discouraged as their graphic representation varies considerably
among other ISO Latin sets. Finally, the 64 code values from 192 to 255 are
mostly alphabetic printing characters with various diacritical marks; their use
is encouraged for those languages that require such characters. The graphic
representations of this last set of 64 characters are fairly constant for the
ISO Latin family.

Printing character codes outside of the seven bit ASCII range may only appear
in string data and in commentary. They are not permitted for use in symbol
construction.

Because some PGN users' environments may not support presentation of non-ASCII
characters, PGN game authors should refrain from using such characters in
critical commentary or string values in game data that may be referenced in
such environments. PGN software authors should have their programs handle such
environments by displaying a question mark ("?") for non-ASCII character codes.
This is an important point because there are many computing systems that can
display eight bit character data, but the display graphics may differ among
machines and operating systems from different manufacturers.
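
The question mark fallback described above can be sketched in a few lines of
Python. This is only an illustration for an ASCII-only display environment;
the function names decode_pgn_bytes and to_ascii_display are hypothetical and
are not part of PGN.

```python
def decode_pgn_bytes(raw: bytes) -> str:
    """Decode raw PGN bytes as ISO 8859/1 (Latin 1).

    Every byte value maps to exactly one character, so this never fails.
    """
    return raw.decode("latin-1")


def to_ascii_display(text: str) -> str:
    """Return text with every non-ASCII character replaced by '?'.

    Assumption: the display can only show seven bit ASCII, so code
    values of 128 and above are masked.
    """
    return "".join(ch if ord(ch) < 128 else "?" for ch in text)


if __name__ == "__main__":
    name = decode_pgn_bytes(b"R\xe9ti, Richard")
    print(to_ascii_display(name))
```

Only the displayed text is altered here; the stored PGN data keeps its
original character codes.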

Only four of the ASCII control characters are permitted in PGN import format;
these are the horizontal and vertical tabs along with the linefeed and carriage
return codes.

The external representation of the newline character may differ among
platforms; this is an acceptable variation as long as the details of the
implementation are hidden from software implementors and users. When a choice
is practical, the Unix "newline is linefeed" convention is preferred.


4.2: Tab characters

Tab characters, both horizontal and vertical, are not permitted in the export
format. This is because the treatment of tab characters is highly dependent
upon the particular software in use on the host computing system. Also, tab
characters may not appear inside of string data.


4.3: Line lengths

PGN data are organized as simple text lines without any special bytes or
markers for secondary record structure imposed by specific operating systems.
Import format PGN text lines are limited to having a maximum of 255 characters
per line including the newline character. Lines with 80 or more printing
characters are strongly discouraged because of the difficulties experienced by
common text editors with long lines.

In some cases, very long tag values will require 80 or more columns, but these
are relatively rare. An example of this is the "FEN" tag pair; it may have a
long tag value, but this particular tag pair is only used to represent a game
that doesn't start from the usual initial position.


5: Commentary

Comment text may appear in PGN data. There are two kinds of comments. The
first kind is the "rest of line" comment; this comment type starts with a
semicolon character and continues to the end of the line. The second kind
starts with a left brace character and continues to the next right brace
character. Comments cannot appear inside any token.
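
The two comment kinds can be recognized by a simple character scanner. The
Python sketch below is illustrative only; the function name strip_comments is
hypothetical, and the choice to copy quoted strings through untouched (so that
a semicolon or brace inside a string is not mistaken for a comment) is an
assumption based on the rule that comments cannot appear inside a token.

```python
def strip_comments(text: str) -> str:
    """Remove semicolon and brace comments from PGN text.

    Quoted strings are copied verbatim. Each brace comment is replaced
    by one space so that neighboring tokens are never fused together.
    """
    out = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch == '"':
            # Copy a string token, honoring backslash escapes.
            j = i + 1
            while j < n and text[j] != '"':
                j += 2 if text[j] == "\\" else 1
            out.append(text[i:j + 1])
            i = j + 1
        elif ch == ";":
            # Rest of line comment: skip up to, but not past, the newline.
            j = text.find("\n", i)
            i = n if j == -1 else j
        elif ch == "{":
            # Brace comment: skip through the next right brace.
            j = text.find("}", i)
            i = n if j == -1 else j + 1
            out.append(" ")
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(strip_comments("1. e4 {best; by test} e5 ; a remark\n2. Nf3"))
```

An unterminated brace comment is treated here as running to the end of the
data; a stricter reader might report an error instead.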

Brace comments do not nest; a left brace character appearing in a brace comment
loses its special meaning and is ignored. A semicolon appearing inside of a
brace comment loses its special meaning and is ignored. Braces appearing
inside of a semicolon comment lose their special meaning and are ignored.

*** Export format representation of comments needs definition work.


6: Escape mechanism

There is a special escape mechanism for PGN data. This mechanism is triggered
by a percent sign character ("%") appearing in the first column of a line; the
data on the rest of the line is ignored by publicly available PGN scanning
software. This escape convention is intended for the private use of software
developers and researchers to embed non-PGN commands and data in PGN streams.

A percent sign appearing in any place other than the first position in a
line does not trigger the escape mechanism.


7: Tokens

PGN character data is organized as tokens. A token is a contiguous sequence of
characters that represents a basic semantic unit. Tokens may be separated from
adjacent tokens by white space characters. (White space characters include
space, newline, and tab characters.) Some tokens are self delimiting and do
not require white space characters.

A string token is a sequence of zero or more printing characters delimited by a
pair of quote characters (ASCII decimal value 34, hexadecimal value 0x22). An
empty string is represented by two adjacent quotes. (Note: an apostrophe is
not a quote.) A quote inside a string is represented by the backslash
immediately followed by a quote. A backslash inside a string is represented by
two adjacent backslashes. Strings are commonly used as tag pair values (see
below). Non-printing characters like newline and tab are not permitted inside
of strings. A string token is terminated by its closing quote.
Currently, a
string is limited to a maximum of 255 characters of data.

An integer token is a sequence of one or more decimal digit characters. It is
a special case of the more general "symbol" token class described below.
Integer tokens are used to help represent move number indications (see below).
An integer token is terminated just prior to the first non-symbol character
following the integer digit sequence.

A period character (".") is a token by itself. It is used for move number
indications (see below). It is self terminating.

An asterisk character ("*") is a token by itself. It is used as one of the
possible game termination markers (see below); it indicates an incomplete game
or a game with an unknown or otherwise unavailable result. It is self
terminating.

The left and right bracket characters ("[" and "]") are tokens. They are used
to delimit tag pairs (see below). Both are self terminating.

The left and right parenthesis characters ("(" and ")") are tokens. They are
used to delimit Recursive Annotation Variations (see below). Both are self
terminating.

The left and right angle bracket characters ("<" and ">") are tokens. They are
reserved for future expansion. Both are self terminating.

A Numeric Annotation Glyph ("NAG", see below) is a token; it is composed of a
dollar sign character ("$") immediately followed by one or more digit
characters. It is terminated just prior to the first non-digit character
following the digit sequence.

A symbol token starts with a letter or digit character and is immediately
followed by a sequence of zero or more symbol continuation characters. These
continuation characters are letter characters ("A-Za-z"), digit characters
("0-9"), the underscore ("_"), the plus sign ("+"), the octothorpe sign ("#"),
the equal sign ("="), the colon (":"), and the hyphen ("-"). Symbols are used
for a variety of purposes.
All characters in a symbol are significant. A
symbol token is terminated just prior to the first non-symbol character
following the symbol character sequence. Currently, a symbol is limited to a
maximum of 255 characters in length.


8: Parsing games

A PGN database file is a sequential collection of zero or more PGN games. An
empty file is a valid, although somewhat uninformative, PGN database.

A PGN game is composed of two sections. The first is the tag pair section and
the second is the movetext section. The tag pair section provides information
that identifies the game by defining the values associated with a set of
standard parameters. The movetext section gives the usually enumerated and
possibly annotated moves of the game along with the concluding game termination
marker. The chess moves themselves are represented using SAN (Standard
Algebraic Notation), also described later in this document.


8.1: Tag pair section

The tag pair section is composed of a series of zero or more tag pairs.

A tag pair is composed of four consecutive tokens: a left bracket token, a
symbol token, a string token, and a right bracket token. The symbol token is
the tag name and the string token is the tag value associated with the tag
name. (There is a standard set of tag names and semantics described below.)
The same tag name should not appear more than once in a tag pair section.

A further restriction on tag names is that they are composed exclusively of
letters, digits, and the underscore character. This is done to facilitate
mapping of tag names into key and attribute names for use with general purpose
database programs.

For PGN import format, there may be zero or more white space characters between
any adjacent pair of tokens in a tag pair.
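
A lax import format reader for tag pairs can be sketched with a regular
expression. The Python below is one possible approach and not a definitive
implementation; the names TAG_PAIR, unescape, and parse_tag_pairs are
hypothetical, and keeping only the first occurrence of a repeated tag name is
an assumption, since the text says only that repeats should not appear.

```python
import re

# Left bracket, tag name (letters, digits, underscore), quoted string
# value with backslash escapes, right bracket; optional white space
# may separate any adjacent pair of tokens.
TAG_PAIR = re.compile(r'\[\s*([A-Za-z0-9_]+)\s*"((?:\\.|[^"\\])*)"\s*\]')


def unescape(value: str) -> str:
    """Turn backslash-quote and double backslash into plain characters."""
    return re.sub(r'\\(["\\])', r"\1", value)


def parse_tag_pairs(text: str) -> dict:
    """Collect the tag pairs found in text as a name to value mapping."""
    tags = {}
    for name, raw_value in TAG_PAIR.findall(text):
        # Assumption: on a repeated tag name, the first value wins.
        tags.setdefault(name, unescape(raw_value))
    return tags


if __name__ == "__main__":
    sample = '[ Event "Casual Game" ][Site\n"?"]'
    print(parse_tag_pairs(sample))
```

The sketch expects only the tag pair section as input; bracket characters
inside movetext commentary would need a real tokenizer.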

For PGN export format, there are no white space characters between the left
bracket and the tag name, there are no white space characters between the tag
value and the right bracket, and there is a single space character between the
tag name and the tag value.

Tag names, like all symbols, are case sensitive. All tag names used for
archival storage begin with an upper case letter.

PGN import format may have multiple tag pairs on the same line and may even
have a tag pair spanning more than a single line. Export format requires each
tag pair to appear left justified on a line by itself; a single empty line
follows the last tag pair.

Some tag values may be composed of a sequence of items. For example, a
consultation game may have more than one player for a given side. When this
occurs, the single character ":" (colon) appears between adjacent items.
Because of this use as an internal separator in strings, the colon should not
otherwise appear in a string.

The tag pair format is designed for expansion; initially only strings are
allowed as tag pair values. Tag value formats associated with the STR (Seven
Tag Roster, see below) will not change; they will always be string values.
However, there are long term plans to allow general list structures as tag
values for non-STR tag pairs. Use of these expanded tag values will likely be
restricted to special research programs. In all events, the top level
structure of a tag pair remains the same: left bracket, tag name, tag value,
and right bracket.


8.1.1: Seven Tag Roster

There is a set of tags defined for mandatory use for archival storage of PGN
data. This is the STR (Seven Tag Roster). The interpretation of these tags is
fixed as is the order in which they appear.
Although the definition and use of
additional tag names and semantics is permitted and encouraged when needed, the
STR is the common ground that all programs should follow for public data
interchange.

For import format, the order of tag pairs is not important. For export format,
the STR tag pairs appear before any other tag pairs. (The STR tag pairs must
also appear in order; this order is described below). Also for export format,
any additional tag pairs appear in ASCII order by tag name.

The seven tag names of the STR are (in order):

1) Event (the name of the tournament or match event)

2) Site (the location of the event)

3) Date (the starting date of the game)

4) Round (the playing round ordinal of the game)

5) White (the player of the white pieces)

6) Black (the player of the black pieces)

7) Result (the result of the game)

A set of supplemental tag names is given later in this document.

For PGN export format, a single blank line appears after the last of the tag
pairs to conclude the tag pair section. This helps simple scanning programs to
quickly determine the end of the tag pair section and the beginning of the
movetext section.


8.1.1.1: The Event tag

The Event tag value should be reasonably descriptive. Abbreviations are to be
avoided unless absolutely necessary. Consistent event naming should be used
to help facilitate database scanning. If the name of the event is unknown, a
single question mark should appear as the tag value.

Examples:

[Event "FIDE World Championship"]

[Event "Moscow City Championship"]

[Event "ACM North American Computer Championship"]

[Event "Casual Game"]


8.1.1.2: The Site tag

The Site tag value should include city and region names along with a standard
name for the country.
The use of the IOC (International Olympic Committee)
three letter names is suggested for those countries where such codes are
available. If the site of the event is unknown, a single question mark should
appear as the tag value. A comma may be used to separate a city from a region.
No comma is needed to separate a city or region from the IOC country code. A
later section of this document gives a list of three letter nation codes along
with a few additions for "locations" not covered by the IOC.

Examples:

[Site "New York City, NY USA"]

[Site "St. Petersburg RUS"]

[Site "Riga LAT"]


8.1.1.3: The Date tag

The Date tag value gives the starting date for the game. (Note: this is not
necessarily the same as the starting date for the event.) The date is given
with respect to the local time of the site given in the Site tag. The Date
tag value field always uses a standard ten character format: "YYYY.MM.DD". The
first four characters are digits that give the year, the next character is a
period, the next two characters are digits that give the month, the next
character is a period, and the final two characters are digits that give the
day of the month. If any of the digit fields are not known, then question
marks are used in place of the digits.

Examples:

[Date "1992.08.31"]

[Date "1993.??.??"]

[Date "2001.01.01"]


8.1.1.4: The Round tag

The Round tag value gives the playing round for the game. In a match
competition, this value is the number of the game played. If the use of a
round number is inappropriate, then the field should be a single hyphen
character. If the round is unknown, a single question mark should appear as
the tag value.

Some organizers employ unusual round designations and have multipart playing
rounds and sometimes even have conditional rounds.
In these cases, a multipart
round identifier can be made from a sequence of integer round numbers separated
by periods. The leftmost integer represents the most significant round and
succeeding integers represent round numbers in descending hierarchical order.

Examples:

[Round "1"]

[Round "3.1"]

[Round "4.1.2"]


8.1.1.5: The White tag

The White tag value is the name of the player or players of the white pieces.
The names are given as they would appear in a telephone directory. The family
or last name appears first. If a first name or first initial is available, it
is separated from the family name by a comma and a space. Finally, one or more
middle initials may appear. (Wherever a comma appears, the very next character
should be a space. Wherever an initial appears, the very next character should
be a period.) If the name is unknown, a single question mark should appear as
the tag value.

The intent is to allow meaningful ASCII sorting of the tag value that is
independent of regional name formation customs. If more than one person is
playing the white pieces, the names are listed in alphabetical order and are
separated by the colon character between adjacent entries. A player who is
also a computer program should have appropriate version information listed
after the name of the program.

The format used in the FIDE Rating Lists is appropriate for use for player name
tags.

Examples:

[White "Tal, Mikhail N."]

[White "van der Wiel, Johan"]

[White "Acme Pawngrabber v.3.2"]

[White "Fine, R."]


8.1.1.6: The Black tag

The Black tag value is the name of the player or players of the black pieces.
The names are given here as they are for the White tag value.

Examples:

[Black "Lasker, Emmanuel"]

[Black "Smyslov, Vasily V."]

[Black "Smith, John Q.:Woodpusher 2000"]

[Black "Morphy"]


8.1.1.7: The Result tag

The Result field value is the result of the game. It is always exactly the
same as the game termination marker that concludes the associated movetext. It
is always one of four possible values: "1-0" (White wins), "0-1" (Black wins),
"1/2-1/2" (drawn game), and "*" (game still in progress, game abandoned, or
result otherwise unknown). Note that the digit zero is used in both of the
first two cases; not the letter "O".

All possible examples:

[Result "0-1"]

[Result "1-0"]

[Result "1/2-1/2"]

[Result "*"]


8.2: Movetext section

The movetext section is composed of chess moves, move number indications,
optional annotations, and a single concluding game termination marker.

Because illegal moves are not real chess moves, they are not permitted in PGN
movetext. They may appear in commentary, however. One would hope that illegal
moves are relatively rare in games worthy of recording.


8.2.1: Movetext line justification

In PGN import format, tokens in the movetext do not require any specific line
justification.

In PGN export format, tokens in the movetext are placed left justified on
successive text lines each of which has less than 80 printing characters. As
many tokens as possible are placed on a line with the remainder appearing on
successive lines. A single space character appears between any two adjacent
symbol tokens on the same line in the movetext. As with the tag pair section,
a single empty line follows the last line of data to conclude the movetext
section.

Neither the first nor the last character on an export format PGN line is a
space. (This may change in the case of commentary; this area is currently
under development.)
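
The export format line filling rule amounts to a greedy word wrap. The Python
sketch below is a minimal illustration, assuming the movetext has already been
split into printable elements such as "1." and "e4" (a move number with its
period is treated as one element here for simplicity); the function name
wrap_movetext and its limit parameter are hypothetical.

```python
def wrap_movetext(elements, limit=79):
    """Greedily pack elements into lines of at most `limit` characters.

    One space separates adjacent elements on a line, and no line
    begins or ends with a space.
    """
    lines = []
    current = ""
    for element in elements:
        if not current:
            current = element
        elif len(current) + 1 + len(element) <= limit:
            current += " " + element
        else:
            # The element does not fit, so it starts the next line.
            lines.append(current)
            current = element
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    opening = ("1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 5. O-O Be7 "
               "6. Re1 b5 7. Bb3 d6 8. c3 O-O 9. h3 Nb8").split()
    for line in wrap_movetext(opening):
        print(line)
```

Applied to the opening moves of the sample game in section 2.3, this breaks
the first line after "8. c3", as the sample does. Commentary is not handled.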


8.2.2: Movetext move number indications

A move number indication is composed of one or more adjacent digits (an integer
token) followed by zero or more periods. The integer portion of the indication
gives the move number of the immediately following white move (if present) and
also the immediately following black move (if present).


8.2.2.1: Import format move number indications

PGN import format does not require move number indications. It does not
prohibit superfluous move number indications anywhere in the movetext as long
as the move numbers are correct.

PGN import format move number indications may have zero or more period
characters following the digit sequence that gives the move number; one or more
white space characters may appear between the digit sequence and the period(s).


8.2.2.2: Export format move number indications

There are two export format move number indication formats, one for use
appearing immediately before a white move element and one for use appearing
immediately before a black move element. A white move number indication is
formed from the integer giving the fullmove number with a single period
character appended. A black move number indication is formed from the integer
giving the fullmove number with three period characters appended.

All white move elements have a preceding move number indication. A black move
element has a preceding move number indication only in two cases: first, if
there is intervening annotation or commentary between the black move and the
previous white move; and second, if there is no previous white move in the
special case where a game starts from a position where Black is the active
player.

There are no other cases where move number indications appear in PGN export
format.
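
The export numbering rules can be sketched as follows for an unannotated game.
This Python is illustrative only; the function name number_moves and its
parameters are hypothetical, and the case of a black move that follows
annotation or commentary is deliberately left out.

```python
def number_moves(san_moves, first_fullmove=1, black_starts=False):
    """Interleave export format move number indications with SAN moves.

    A white move gets "N." and a black move gets "N..." only when it is
    the first move of a game that starts with Black to play.
    """
    elements = []
    fullmove = first_fullmove
    white_to_move = not black_starts
    for index, san in enumerate(san_moves):
        if white_to_move:
            elements.append(f"{fullmove}.")
        elif index == 0:
            # No previous white move: Black is the active player at the start.
            elements.append(f"{fullmove}...")
        elements.append(san)
        if not white_to_move:
            # The fullmove number advances after each black move.
            fullmove += 1
        white_to_move = not white_to_move
    return elements


if __name__ == "__main__":
    print(" ".join(number_moves(["e4", "e5", "Nf3", "Nc6"])))
    print(" ".join(number_moves(["Nc6", "Bb5"], first_fullmove=2,
                                black_starts=True)))
```

The returned elements are ready to be joined with single spaces and filled
into lines according to the justification rules of section 8.2.1.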


8.2.3: Movetext SAN (Standard Algebraic Notation)

SAN (Standard Algebraic Notation) is a representation standard for chess moves
using the ASCII Latin alphabet.

Examples of SAN recorded games are found throughout most modern chess
publications. SAN as presented in this document uses English language single
character abbreviations for chess pieces, although this is easily changed in
the source. English is chosen over other languages because it appears to be
the most widely recognized.

An alternative to SAN is FAN (Figurine Algebraic Notation). FAN uses miniature
piece icons instead of single letter piece abbreviations. The two notations
are otherwise identical.


8.2.3.1: Square identification

SAN identifies each of the sixty four squares on the chessboard with a unique
two character name. The first character of a square identifier is the file of
the square; a file is a column of eight squares designated by a single lower
case letter from "a" (leftmost or queenside) up to and including "h" (rightmost
or kingside). The second character of a square identifier is the rank of the
square; a rank is a row of eight squares designated by a single digit from "1"
(bottom side [White's first rank]) up to and including "8" (top side [Black's
first rank]). The initial squares of some pieces are: white queen rook at a1,
white king at e1, black queen knight pawn at b7, and black king rook at h8.


8.2.3.2: Piece identification

SAN identifies each piece by a single upper case letter. The standard English
values: pawn = "P", knight = "N", bishop = "B", rook = "R", queen = "Q", and
king = "K".

The letter code for a pawn is not used for SAN moves in PGN export format
movetext. However, some PGN import software disambiguation code may allow for
the appearance of pawn letter codes.
Also, pawn and other piece letter codes
are needed for use in some tag pair and annotation constructs.

It is admittedly a bit chauvinistic to select English piece letters over those
from other languages. There is a slight justification in that English is a de
facto universal second language among most chessplayers and program users. It
is probably the best that can be done for now. A later section of this
document gives alternative piece letters, but these should be used only for
local presentation software and not for archival storage or for dynamic
interchange among programs.


8.2.3.3: Basic SAN move construction

A basic SAN move is given by listing the moving piece letter (omitted for
pawns) followed by the destination square. Capture moves are denoted by the
lower case letter "x" immediately prior to the destination square; pawn
captures include the file letter of the originating square of the capturing
pawn immediately prior to the "x" character.

SAN kingside castling is indicated by the sequence "O-O"; queenside castling is
indicated by the sequence "O-O-O". Note that the upper case letter "O" is
used, not the digit zero. The use of a zero character is not only incompatible
with traditional text practices, but it can also confuse parsing algorithms
which also have to understand move numbers and game termination markers.
Also note that the use of the letter "O" is consistent with the practice of
having all chess move symbols start with a letter; also, it follows the
convention that all non-pawn move symbols start with an upper case letter.

En passant captures do not have any special notation; they are formed as if the
captured pawn were on the capturing pawn's destination square.
Pawn promotions
are denoted by the equal sign "=" immediately following the destination square
with a promoted piece letter (indicating one of knight, bishop, rook, or queen)
immediately following the equal sign. As above, the piece letter is in upper
case.


8.2.3.4: Disambiguation

In the case of ambiguities (multiple pieces of the same type moving to the same
square), the first appropriate disambiguating step of the three following steps
is taken:

First, if the moving pieces can be distinguished by their originating files,
the originating file letter of the moving piece is inserted immediately after
the moving piece letter.

Second (when the first step fails), if the moving pieces can be distinguished
by their originating ranks, the originating rank digit of the moving piece is
inserted immediately after the moving piece letter.

Third (when both the first and the second steps fail), the two character square
coordinate of the originating square of the moving piece is inserted
immediately after the moving piece letter.

Note that the above disambiguation is needed only to distinguish among moves of
the same piece type to the same square; it is not used to distinguish among
attacks of the same piece type to the same square. An example of this would be
a position with two white knights, one on square c3 and one on square g1 and a
vacant square e2 with White to move. Both knights attack square e2, and if
both could legally move there, then a file disambiguation is needed; the
(nonchecking) knight moves would be "Nce2" and "Nge2". However, if the white
king were at square e1 and a black bishop were at square b4 with a vacant
square d2 (thus an absolute pin of the white knight at square c3), then only
one white knight (the one at square g1) could move to square e2: "Ne2".
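
The three disambiguation steps can be sketched in Python as follows. The
function name "disambiguator" and its arguments are hypothetical. The sketch
assumes the caller has already generated legal moves, so that "rivals"
contains only the originating squares of other pieces of the same type that
can legally move to the same destination; a pinned piece, as in the knight
example above, must not be listed.

```python
def disambiguator(origin, rivals):
    """Return the text inserted after the piece letter for a SAN move.

    Hypothetical helper. `origin` is the moving piece's square (e.g. "c3");
    `rivals` lists origin squares of other same-type pieces that can
    legally reach the same destination square.
    """
    if not rivals:
        return ""  # no ambiguity, nothing is inserted
    file_letter, rank_digit = origin[0], origin[1]
    if all(square[0] != file_letter for square in rivals):
        return file_letter  # first step: originating file suffices
    if all(square[1] != rank_digit for square in rivals):
        return rank_digit  # second step: originating rank suffices
    return origin  # third step: full originating square


if __name__ == "__main__":
    print("N" + disambiguator("c3", ["g1"]) + "e2")   # both knights can move
    print("N" + disambiguator("g1", []) + "e2")       # c3 knight is pinned
    print("R" + disambiguator("a1", ["a5"]) + "a3")   # rooks share a file
```

Check and checkmate suffixes are appended separately and play no part in this
choice.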


8.2.3.5: Check and checkmate indication characters

If the move is a checking move, the plus sign "+" is appended as a suffix to
the basic SAN move notation; if the move is a checkmating move, the octothorpe
sign "#" is appended instead.

Neither the appearance nor the absence of either a check or checkmating
indicator is used for disambiguation purposes. This means that if two (or
more) pieces of the same type can move to the same square, a difference in the
checking status of the moves does not alleviate the need for the standard rank
and file disambiguation described above. (Note that a difference in checking
status for the above may occur only in the case of a discovered check.)

Neither the checking nor the checkmating indicator is considered annotation,
as neither communicates subjective information. Therefore, they are
qualitatively different from move suffix annotations like "!" and "?".
Subjective move annotations are handled using Numeric Annotation Glyphs as
described in a later section of this document.

There are no special markings used for double checks or discovered checks.

There are no special markings used for drawing moves.


8.2.3.6: SAN move length

SAN moves can be as short as two characters (e.g., "d4"), or as long as seven
characters (e.g., "Qa6xb7#", "fxg1=Q+"). The average SAN move length seen in
realistic games is probably just fractionally longer than three characters. If
the SAN rules seem complicated, be assured that the earlier notation systems of
LEN (Long English Notation) and EDN (English Descriptive Notation) are much
more complex, and that LAN (Long Algebraic Notation, the predecessor of SAN) is
unnecessarily bulky.


8.2.3.7: Import and export SAN

PGN export format always uses the above canonical SAN to represent moves in the
movetext section of a PGN game.
Import format is somewhat more relaxed and it
makes allowances for moves that do not conform exactly to the canonical format.
However, these allowances may differ among different PGN reader programs. Only
data appearing in export format is in all cases guaranteed to be importable
into all PGN readers.

There are a number of suggested guidelines for use with implementing PGN reader
software for permitting non-canonical SAN move representation. The idea is to
have a PGN reader apply various transformations to attempt to discover the move
that is represented by non-canonical input. Some suggested transformations
include: letter case remapping, capture indicator insertion, check indicator
insertion, and checkmate indicator insertion.


8.2.3.8: SAN move suffix annotations

Import format PGN allows for the use of traditional suffix annotations for
moves. There are exactly six such annotations available: "!", "?", "!!", "!?",
"?!", and "??". At most one such suffix annotation may appear per move, and if
present, it is always the last part of the move symbol.

When exported, a move suffix annotation is translated into the corresponding
Numeric Annotation Glyph as described in a later section of this document. For
example, if the single move symbol "Qxa8?" appears in an import format PGN
movetext, it would be replaced with the two adjacent symbols "Qxa8 $2".


8.2.4: Movetext NAG (Numeric Annotation Glyph)

An NAG (Numeric Annotation Glyph) is a movetext element that is used to
indicate a simple annotation in a language independent manner. An NAG is
formed from a dollar sign ("$") with a non-negative decimal integer suffix.
The non-negative integer must be from zero to 255 in value.


8.2.5: Movetext RAV (Recursive Annotation Variation)

An RAV (Recursive Annotation Variation) is a sequence of movetext containing
one or more moves enclosed in parentheses.
An RAV is used to represent an
alternative variation. The alternate move sequence given by an RAV is one that
may be legally played by first unplaying the move that appears immediately
prior to the RAV. Because the RAV is a recursive construct, it may be nested.

*** The specification for import/export representation of RAV elements needs
further development.


8.2.6: Game Termination Markers

Each movetext section has exactly one game termination marker; the marker
always occurs as the last element in the movetext. The game termination marker
is a symbol that is one of the following four values: "1-0" (White wins), "0-1"
(Black wins), "1/2-1/2" (drawn game), and "*" (game in progress, result
unknown, or game abandoned). Note that the digit zero is used in the above;
not the upper case letter "O". The game termination marker appearing in the
movetext of a game must match the value of the game's Result tag pair. (While
the marker appears as a string in the Result tag, it appears as a symbol
without quotes in the movetext.)


9: Supplemental tag names

The following tag names and their associated semantics are recommended for use
for information not contained in the Seven Tag Roster.


9.1: Player related information

Note that if there is more than one player field in an instance of a player
(White or Black) tag, then there will be corresponding multiple fields in any
of the following tags. For example, if the White tag has the three field value
"Jones:Smith:Zacharias" (a consultation game), then the WhiteTitle tag could
have a value of "IM:-:GM" if Jones was an International Master, Smith was
untitled, and Zacharias was a Grandmaster.


9.1.1: Tags: WhiteTitle, BlackTitle

These use string values such as "FM", "IM", and "GM"; these tags are used only
for the standard abbreviations for FIDE titles.
A value of "-" is used for an
untitled player.


9.1.2: Tags: WhiteElo, BlackElo

These tags use integer values; these are used for FIDE Elo ratings. A value of
"-" is used for an unrated player.


9.1.3: Tags: WhiteUSCF, BlackUSCF

These tags use integer values; these are used for USCF (United States Chess
Federation) ratings. Similar tag names can be constructed for other rating
agencies.


9.1.4: Tags: WhiteNA, BlackNA

These tags use string values; these are the e-mail or network addresses of the
players. A value of "-" is used for a player without an electronic address.


9.1.5: Tags: WhiteType, BlackType

These tags use string values; these describe the player types. The value
"human" should be used for a person while the value "program" should be used
for algorithmic (computer) players.


9.2: Event related information

The following tags are used for providing additional information about the
event.


9.2.1: Tag: EventDate

This uses a date value, similar to the Date tag field, that gives the starting
date of the Event.


9.2.2: Tag: EventSponsor

This uses a string value giving the name of the sponsor of the event.


9.2.3: Tag: Section

This uses a string; this is used for the playing section of a tournament (e.g.,
"Open" or "Reserve").


9.2.4: Tag: Stage

This uses a string; this is used for the stage of a multistage event (e.g.,
"Preliminary" or "Semifinal").


9.2.5: Tag: Board

This uses an integer; this identifies the board number in a team event and also
in a simultaneous exhibition.


9.3: Opening information (locale specific)

The following tag pairs are used for traditional opening names. The associated
tag values will vary according to the local language in use.


9.3.1: Tag: Opening

This uses a string; this is used for the traditional opening name. This will
vary by locale. This tag pair is associated with the use of the EPD opcode
"v0" described in a later section of this document.


9.3.2: Tag: Variation

This uses a string; this is used to further refine the Opening tag. This will
vary by locale. This tag pair is associated with the use of the EPD opcode
"v1" described in a later section of this document.


9.3.3: Tag: SubVariation

This uses a string; this is used to further refine the Variation tag. This
will vary by locale. This tag pair is associated with the use of the EPD
opcode "v2" described in a later section of this document.


9.4: Opening information (third party vendors)

The following tag pairs are used for representing opening identification
according to various third party vendors and organizations. References to
these organizations do not imply any endorsement of them or any endorsement
by them.


9.4.1: Tag: ECO

This uses a string of either the form "XDD" or the form "XDD/DD" where the "X"
is a letter from "A" to "E" and the "D" positions are digits; this is used for
an opening designation from the five volume _Encyclopedia of Chess Openings_.
This tag pair is associated with the use of the EPD opcode "eco" described in a
later section of this document.


9.4.2: Tag: NIC

This uses a string; this is used for an opening designation from the _New in
Chess_ database. This tag pair is associated with the use of the EPD opcode
"nic" described in a later section of this document.


9.5: Time and date related information

The following tags assist with further refinement of the time and date
information associated with a game.


9.5.1: Tag: Time

This uses a time-of-day value in the form "HH:MM:SS"; similar to the Date tag
except that it denotes the local clock time (hours, minutes, and seconds) of
the start of the game. Note that colons, not periods, are used for field
separators for the Time tag value. The value is taken from the local time
corresponding to the location given in the Site tag pair.


9.5.2: Tag: UTCTime

This tag is similar to the Time tag except that the time is given according to
the Coordinated Universal Time (UTC) standard.


9.5.3: Tag: UTCDate

This tag is similar to the Date tag except that the date is given according to
the Coordinated Universal Time (UTC) standard.


9.6: Time control

The following tag is used to help describe the time control used with the game.


9.6.1: Tag: TimeControl

This uses a list of one or more time control fields. Each field contains a
descriptor for each time control period; if more than one descriptor is present
then they are separated by the colon character (":"). The descriptors appear
in the order in which they are used in the game. The last field appearing is
considered to be implicitly repeated for further control periods as needed.

There are six kinds of TimeControl fields.

The first kind is a single question mark ("?") which means that the time
control mode is unknown. When used, it is usually the only descriptor present.

The second kind is a single hyphen ("-") which means that there was no time
control mode in use. When used, it is usually the only descriptor present.

The third TimeControl field kind is formed as two positive integers separated
by a solidus ("/") character. The first integer is the number of moves in the
period and the second is the number of seconds in the period.
Thus, a time
control period of 40 moves in 2 1/2 hours would be represented as "40/9000".

The fourth TimeControl field kind is used for a "sudden death" control period.
It should only be used for the last descriptor in a TimeControl tag value. It
is sometimes the only descriptor present. The format consists of a single
integer that gives the number of seconds in the period. Thus, a blitz game
would be represented with a TimeControl tag value of "300".

The fifth TimeControl field kind is used for an "incremental" control period.
It should only be used for the last descriptor in a TimeControl tag value and
is usually the only descriptor in the value. The format consists of two
positive integers separated by a plus sign ("+") character. The first integer
gives the minimum number of seconds allocated for the period and the second
integer gives the number of extra seconds added after each move is made. So,
an incremental time control of 90 minutes plus one extra minute per move would
be given by "5400+60" in the TimeControl tag value.

The sixth TimeControl field kind is used for a "sandclock" or "hourglass"
control period. It should only be used for the last descriptor in a
TimeControl tag value and is usually the only descriptor in the value. The
format consists of an asterisk ("*") immediately followed by a positive
integer. The integer gives the total number of seconds in the sandclock
period. The time control is implemented as if a sandclock were set at the
start of the period with an equal amount of sand in each of the two chambers
and the players invert the sandclock after each move with a time forfeit
indicated by an empty upper chamber. Electronic implementation of a physical
sandclock may be used. An example sandclock specification for a common three
minute egg timer sandclock would have a tag value of "*180".
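
The six field kinds can be recognized with a short Python sketch such as the
following. The function name "parse_time_control" and the kind labels it
returns ("unknown", "none", "moves_in_time", "sudden_death", "incremental",
"sandclock") are hypothetical and are not defined by this standard. The sketch
assumes a well formed tag value; malformed integers simply raise ValueError.

```python
def parse_time_control(value):
    """Split a TimeControl tag value into (kind, details) descriptors.

    Hypothetical helper; kind labels are illustrative only.
    """
    parsed = []
    for field in value.split(":"):
        if field == "?":
            parsed.append(("unknown", {}))
        elif field == "-":
            parsed.append(("none", {}))
        elif field.startswith("*"):
            # Sandclock: asterisk followed by total seconds.
            parsed.append(("sandclock", {"seconds": int(field[1:])}))
        elif "/" in field:
            # Moves per period / seconds per period.
            moves, seconds = field.split("/")
            parsed.append(("moves_in_time",
                           {"moves": int(moves), "seconds": int(seconds)}))
        elif "+" in field:
            # Base seconds + increment seconds per move.
            base, increment = field.split("+")
            parsed.append(("incremental",
                           {"seconds": int(base), "increment": int(increment)}))
        else:
            # A bare integer is a sudden death period in seconds.
            parsed.append(("sudden_death", {"seconds": int(field)}))
    return parsed


if __name__ == "__main__":
    print(parse_time_control("40/9000:300"))
    print(parse_time_control("*180"))
```

The descriptors are returned in the order in which they apply to the game; a
consumer would treat the last one as repeating for any further periods.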

Additional TimeControl field kinds will be defined as necessary.


9.7: Alternative starting positions

There are two tags defined for assistance with describing games that did not
start from the usual initial array.


9.7.1: Tag: SetUp

This tag takes an integer that denotes the "set-up" status of the game. A
value of "0" indicates that the game has started from the usual initial array.
A value of "1" indicates that the game started from a set-up position; this
position is given in the "FEN" tag pair. This tag must appear for a game
starting with a set-up position. If it appears with a tag value of "1", a FEN
tag pair must also appear.


9.7.2: Tag: FEN

This tag uses a string that gives the Forsyth-Edwards Notation for the starting
position used in the game. FEN is described in a later section of this
document. If a SetUp tag appears with a tag value of "1", the FEN tag pair is
also required.


9.8: Game conclusion

There is a single tag that discusses the conclusion of the game.


9.8.1: Tag: Termination

This takes a string that describes the reason for the conclusion of the game.
While the Result tag gives the result of the game, it does not provide any
extra information and so the Termination tag is defined for this purpose.

Strings that may appear as Termination tag values:

* "abandoned": abandoned game.

* "adjudication": result due to third party adjudication process.

* "death": losing player called to greater things, one hopes.

* "emergency": game concluded due to unforeseen circumstances.

* "normal": game terminated in a normal fashion.

* "rules infraction": administrative forfeit due to losing player's failure to
observe either the Laws of Chess or the event regulations.

* "time forfeit": loss due to losing player's failure to meet time control
requirements.

* "unterminated": game not terminated.


9.9: Miscellaneous

These are tags that can be briefly described and that don't fit well in other
sections.


9.9.1: Tag: Annotator

This tag uses a name or names in the format of the player name tags; this
identifies the annotator or annotators of the game.


9.9.2: Tag: Mode

This uses a string that gives the playing mode of the game. Examples: "OTB"
(over the board), "PM" (paper mail), "EM" (electronic mail), "ICS" (Internet
Chess Server), and "TC" (general telecommunication).


9.9.3: Tag: PlyCount

This tag takes a single integer that gives the number of ply (half moves) in
the game.


10: Numeric Annotation Glyphs

NAG zero is used for a null annotation; it is provided for the convenience of
software designers as a placeholder value and should probably not be used in
external PGN data.

NAGs with values from 1 to 9 annotate the move just played.

NAGs with values from 10 to 135 modify the current position.

NAGs with values from 136 to 139 describe time pressure.

Other NAG values are reserved for future definition.

Note: the number assignments listed below should be considered preliminary in
nature; they are likely to be changed as a result of reviewer feedback.

NAG    Interpretation
---    --------------
  0    null annotation
  1    good move (traditional "!")
  2    poor move (traditional "?")
  3    very good move (traditional "!!")
  4    very poor move (traditional "??")
  5    speculative move (traditional "!?")
  6    questionable move (traditional "?!")
  7    forced move (all others lose quickly)
  8    singular move (no reasonable alternatives)
  9    worst move
 10    drawish position
 11    equal chances, quiet position
 12    equal chances, active position
 13    unclear position
 14    White has a slight advantage
 15    Black has a slight advantage
 16    White has a moderate advantage
 17    Black has a moderate advantage
 18    White has a decisive advantage
 19    Black has a decisive advantage
 20    White has a crushing advantage (Black should resign)
 21    Black has a crushing advantage (White should resign)
 22    White is in zugzwang
 23    Black is in zugzwang
 24    White has a slight space advantage
 25    Black has a slight space advantage
 26    White has a moderate space advantage
 27    Black has a moderate space advantage
 28    White has a decisive space advantage
 29    Black has a decisive space advantage
 30    White has a slight time (development) advantage
 31    Black has a slight time (development) advantage
 32    White has a moderate time (development) advantage
 33    Black has a moderate time (development) advantage
 34    White has a decisive time (development) advantage
 35    Black has a decisive time (development) advantage
 36    White has the initiative
 37    Black has the initiative
 38    White has a lasting initiative
 39    Black has a lasting initiative
 40    White has the attack
 41    Black has the attack
 42    White has insufficient compensation for material deficit
 43    Black has insufficient compensation for material deficit
 44    White has sufficient compensation for material deficit
 45    Black has sufficient compensation for material deficit
 46    White has more than adequate compensation for material deficit
 47    Black has more than adequate compensation for material deficit
 48    White has a slight center control advantage
 49    Black has a slight center control advantage
 50    White has a moderate center control advantage
 51    Black has a moderate center control advantage
 52    White has a decisive center control advantage
 53    Black has a decisive center control advantage
 54    White has a slight kingside control advantage
 55    Black has a slight kingside control advantage
 56    White has a moderate kingside control advantage
 57    Black has a moderate kingside control advantage
 58    White has a decisive kingside control advantage
 59    Black has a decisive kingside control advantage
 60    White has a slight queenside control advantage
 61    Black has a slight queenside control advantage
 62    White has a moderate queenside control advantage
 63    Black has a moderate queenside control advantage
 64    White has a decisive queenside control advantage
 65    Black has a decisive queenside control advantage
 66    White has a vulnerable first rank
 67    Black has a vulnerable first rank
 68    White has a well protected first rank
 69    Black has a well protected first rank
 70    White has a poorly protected king
 71    Black has a poorly protected king
 72    White has a well protected king
 73    Black has a well protected king
 74    White has a poorly placed king
 75    Black has a poorly placed king
 76    White has a well placed king
 77    Black has a well placed king
 78    White has a very weak pawn structure
 79    Black has a very weak pawn structure
 80    White has a moderately weak pawn structure
 81    Black has a moderately weak pawn structure
 82    White has a moderately strong pawn structure
 83    Black has a moderately strong pawn structure
 84    White has a very strong pawn structure
 85    Black has a very strong pawn structure
 86    White has poor knight placement
 87    Black has poor knight placement
 88    White has good knight placement
 89    Black has good knight placement
 90    White has poor bishop placement
 91    Black has poor bishop placement
 92    White has good bishop placement
 93    Black has good bishop placement
 94    White has poor rook placement
 95    Black has poor rook placement
 96    White has good rook placement
 97    Black has good rook placement
 98    White has poor queen placement
 99    Black has poor queen placement
100    White has good queen placement
101    Black has good queen placement
102    White has poor piece coordination
103    Black has poor piece coordination
104    White has good piece coordination
105    Black has good piece coordination
106    White has played the opening very poorly
107    Black has played the opening very poorly
108    White has played the opening poorly
109    Black has played the opening poorly
110    White has played the opening well
111    Black has played the opening well
112    White has played the opening very well
113    Black has played the opening very well
114    White has played the middlegame very poorly
115    Black has played the middlegame very poorly
116    White has played the middlegame poorly
117    Black has played the middlegame poorly
118    White has played the middlegame well
119    Black has played the middlegame well
120    White has played the middlegame very well
121    Black has played the middlegame very well
122    White has played the ending very poorly
123    Black has played the ending very poorly
124    White has played the ending poorly
125    Black has played the ending poorly
126    White has played the ending well
127    Black has played the ending well
128    White has played the ending very well
129    Black has played the ending very well
130    White has slight counterplay
131    Black has slight counterplay
132    White has moderate counterplay
133    Black has moderate counterplay
134    White has decisive counterplay
135    Black has decisive counterplay
136    White has moderate time control pressure
137    Black has moderate time control pressure
138    White has severe time control pressure
139    Black has severe time control pressure


11: File names and directories

File names chosen for PGN data should be both informative and portable. The
directory names and arrangements should also be chosen for the same reasons and
also for ease of navigation.

Some of the suggested file and directory names may be difficult or impossible
to represent on certain computing systems. Use of appropriate conversion
customs is encouraged.


11.1: File name suffix for PGN data

The use of the file suffix ".pgn" is encouraged for ASCII text files containing
PGN data.


11.2: File name formation for PGN data for a specific player

PGN games for a specific player should have a file name consisting of the
player's last name followed by the ".pgn" suffix.


11.3: File name formation for PGN data for a specific event

PGN games for a specific event should have a file name consisting of the
event's name followed by the ".pgn" suffix.


11.4: File name formation for PGN data for chronologically ordered games

PGN data files used for chronologically ordered (oldest first) archives use
date information as file name root strings. A file containing all the PGN
games for a given year would have an eight character name in the format
"YYYY.pgn". A file containing PGN data for a given month would have a ten
character name in the format "YYYYMM.pgn". Finally, a file for PGN games for a
single day would have a twelve character name in the format "YYYYMMDD.pgn".
Large files are split into smaller files as needed.

As game files are commonly arranged by chronological order, games with missing
or incomplete Date tag pair data are to be avoided. Any question mark
characters in a Date tag value will be treated as zero digits for collation
within a file and also for file naming.

Large quantities of PGN data arranged by chronological order should be
organized into hierarchical directories. A directory containing all PGN data
for a given year would have a four character name in the format "YYYY";
directories containing PGN files for a given month would have a six character
name in the format "YYYYMM".


11.5: Suggested directory tree organization

A suggested directory arrangement for ftp sites and CD-ROM distributions:

* PGN: master directory of the PGN subtree (pub/chess/Game-Databases/PGN)

* PGN/Events: directory of PGN files, each for a specific event

* PGN/Events/News: news and status of the event collection

* PGN/Events/ReadMe: brief description of the local directory contents

* PGN/MGR: directory of the Master Games Repository subtree

* PGN/MGR/News: news and status of the entire PGN/MGR subtree

* PGN/MGR/ReadMe: brief description of the local directory contents

* PGN/MGR/YYYY: directory of games or subtrees for the year YYYY

* PGN/MGR/YYYY/ReadMe: description of local directory for year YYYY

* PGN/MGR/YYYY/News: news and status for year YYYY data

* PGN/News: news and status of the entire PGN subtree

* PGN/Players: directory of PGN files, each for a specific player

* PGN/Players/News: news and status of the player collection

* PGN/Players/ReadMe: brief description of the local directory contents

* PGN/ReadMe: brief description of the local directory contents

* PGN/Standard: the PGN standard (this document)

* PGN/Tools: software utilities that access PGN data


12: PGN collating sequence

There is a standard sorting order for PGN games within a file. This collation
is based on eight keys; these are the seven tag values of the STR and also the
movetext itself.

The first (most important, primary) key is the Date tag. Earlier dated games
appear prior to games played at a later date. This field is sorted by
ascending numeric value first with the year, then the month, and finally the
day of the month. Query characters used for unknown date digit values will be
treated as zero digit characters for ordering comparison.

The second key is the Event tag. This is sorted in ascending ASCII order.

The third key is the Site tag. This is sorted in ascending ASCII order.

The fourth key is the Round tag. This is sorted in ascending numeric order
based on the value of the integer used to denote the playing round. A query or
hyphen used for the round is ordered before any integer value. A query
character is ordered before a hyphen character.

The fifth key is the White tag. This is sorted in ascending ASCII order.

The sixth key is the Black tag. This is sorted in ascending ASCII order.

The seventh key is the Result tag. This is sorted in ascending ASCII order.

The eighth key is the movetext itself. This is sorted in ascending ASCII order
with the entire text including spaces and newline characters.


13: PGN software

This section describes some PGN software that is either currently available or
expected to be available in the near future. The entries are presented in
rough chronological order of their being made known to the PGN standard
coordinator.
Authors of PGN capable software are encouraged to contact the
coordinator (e-mail address listed near the start of this document) so that the
information may be included here in this section.

In addition to the PGN standard, there are two more chess standards of interest
to the chess software community. These are the FEN standard (Forsyth-Edwards
Notation) for position notation and the EPD standard (Extended Position
Description) for comprehensive position description for automated interprogram
processing. These are described in a later section of this document.

Some PGN software is freeware and can be gotten from ftp sites and other
sources. Other PGN software is payware and appears as part of commercial
chessplaying programs and chess database managers. Those who are interested in
the propagation of the PGN standard are encouraged to support manufacturers of
chess software that use the standard. If a particular vendor does not offer
PGN compatibility, it is likely that a few letters to them along with a copy of
this specification may help them decide to include PGN support in their next
release.

The staff at the University of Oklahoma at Norman (USA) have graciously
provided an ftp site (chess.uoknor.edu) for the storage of chess related data
and programs. Because file names change over time, those accessing the site
are encouraged to first retrieve the file "pub/chess/ls-lR.gz" for a current
listing. A scan of this listing will also help locate versions of PGN programs
for machine types and operating systems other than those listed below. Further
information about this archive can be gotten from its administrator, Chris
Petroff (chris@uoknor.edu).

For European users, the kind staff at the University of Hamburg (Germany) have
provided the ftp site ftp.math.uni-hamburg.de; this carries a daily mirror of
the pub/chess directory at the chess.uoknor.edu site.


13.1: The SAN Kit

The "SAN Kit" is an ANSI C source chess programming toolkit available for free
from the ftp site chess.uoknor.edu in the directory pub/chess/Unix as the file
"SAN.tar.gz" (a gzip tar archive). This kit contains code for PGN import and
export and can be used to "regularize" PGN data into reduced export format by
use of its "tfgg" command. The SAN Kit also supports FEN I/O. Code from this
kit is freely redistributable for anyone as long as future distribution is
unhindered for everyone. The SAN Kit is undergoing continuous development,
although dates of future deliveries are quite difficult to predict and releases
sometimes appear months apart. Suggestions and comments should be directed to
its author, Steven J. Edwards (sje@world.std.com).


13.2: pgnRead

The program "pgnRead" runs under MS Windows 3.1 and provides an interactive
graphical user interface for scanning PGN data files. This program includes a
colorful figurine chessboard display and scrolling controls for game and game
text selection. It is available from the chess.uoknor.edu ftp site in the
pub/chess/DOS directory; several versions are available with names of the form
"pgnrd**.exe"; the latest at this writing is "PGNRD130.EXE". Suggestions and
comments should be directed to its author, Keith Fuller (keithfx@aol.com).


13.3: mail2pgn/GIICS

The program "mail2pgn" produces a PGN version of chess game data generated by
the ICS (Internet Chess Server).
It can be found at the chess.uoknor.edu ftp
site in the pub/chess/DOS directory as the file "mail2pgn.zip". A C language
version is in the directory pub/chess/Unix as the file "mail2pgn.c".
Suggestions and comments should be directed to its author, John Aronson
(aronson@helios.ece.arizona.edu). This code has been reportedly incorporated
into the GIICS (Graphical Interface for the ICS); suggestions and comments
should be directed to its author, Tony Acero (ace3@midway.uchicago.edu).

There is a report that mail2pgn has been superseded by the newer program
"MV2PGN" described below.


13.4: XBoard

"XBoard" is a comprehensive chess utility running under the X Window System
that provides a graphical user interface in a portable manner. A new version
now handles PGN data. It is available from the chess.uoknor.edu ftp site in
the pub/chess/X directory as the file "xboard-3.0.pl9.tar.gz". Suggestions and
comments should be directed to its author, Tim Mann (mann@src.dec.com).


13.5: cupgn

The program "cupgn" converts game data stored in the ChessBase format into PGN.
It is available from the chess.uoknor.edu ftp site in the
pub/chess/Game-Databases/CBUFF directory as the file "cupgn.tar.gz". Another
version is in the directory pub/chess/DOS as the file "cupgn120.exe".
Suggestions and comments should be directed to its author, Anjo Anjewierden
(anjo@swi.psy.uva.nl).


13.6: Zarkov

The current version (3.0) of the commercial chessplaying program "Zarkov" can
read and write games using PGN. This program can also use the EPD standard for
communication with other EPD capable programs. Historically, Zarkov is the
very first program to use EPD. Suggestions and comments should be directed to
its author, John Stanback (jhs@icbdfcs1.fc.hp.com).

A vendor for North America is:

    International Chess Enterprises
    P.O.
    Box 19457
    Seattle, WA 98109
    USA
    (800) 262-4277

A vendor for Europe is:

    Gambit-Soft
    Feckenhauser Strasse 27
    D-78628 Rottweil
    GERMANY
    49-741-21573


13.7: Chess Assistant

The upcoming version of the multifunction commercial database program "Chess
Assistant" will be able to use the PGN standard as an import and export option.
There is a report of a freeware program, "PGN2CA", that will convert PGN
databases into Chess Assistant format. For more information, the contact is
Victor Zakharov, one of the members of the Chess Assistant development team
(VICTOR@ldis.cs.msu.su).

A vendor for North America is:

    International Chess Enterprises
    P.O. Box 19457
    Seattle, WA 98109
    USA
    (800) 262-4277


13.8: BOOKUP

The MS-DOS edition of the multifunction commercial program BOOKUP, version 8.1,
is able to use the EPD standard for communication with other EPD capable
programs. It may also be PGN capable.

The BOOKUP 8.1.1 Addenda notes dated 1993.12.17 provide comprehensive
information on how to use EPD in conjunction with "analyst" programs such as
Zarkov and HIARCS. Specifically, the search and evaluation abilities of an
analyst program are combined with the information organization abilities of the
BOOKUP database program to provide position scoring. This is done by first
having BOOKUP export a database in EPD format, then having an analyst program
annotate each EPD record with a numeric score, and then having BOOKUP import
the changed EPD file. BOOKUP can then apply minimaxing to the imported
database; this results in scores from terminal positions being propagated back
to earlier positions and even back to moves from the starting array.

For some reason, BOOKUP calls this process "backsolving", but it's really just
standard minimaxing.
In any case, it's a good example of how different
programs from different authors performing different types of tasks can be
integrated by use of a common, non-proprietary standard. This allows for a new
set of powerful features that are beyond the capabilities of any one of the
individual component programs.

BOOKUP allows for some customizing of EPD actions. One such customization is
to require the positional evaluations to follow the EPD standard; this means
that the score is always given from the viewpoint of the active player. This
is explained more fully in the section on the "ce" (centipawn evaluation)
opcode in the EPD description in a later section of this document. To ensure
that BOOKUP handles the centipawn evaluations in the "right" way, the EPD
setting "Positive for White" must be set to "N". This makes BOOKUP work
correctly with Zarkov and with all other programs that use the "right"
centipawn evaluation convention. There is an apparent problem with HIARCS that
requires this option to be set to "Y"; but this really means that, if true,
HIARCS needs to be adjusted to use the "right" centipawn evaluation convention.

A vendor in North America is:

    BOOKUP
    2763 Kensington Place West
    Columbus, OH 43202
    USA
    (800) 949-5445
    (614) 263-7219


13.9: HIARCS

The current version (2.1) of the commercial chessplaying program "HIARCS" is
able to use the EPD standard for communication with other EPD capable programs.
It may also be PGN capable. More details will appear here as they become
available.

A vendor in North America is:

    HIARCS
    c/o BOOKUP
    2763 Kensington Place West
    Columbus, OH 43202
    USA
    (800) 949-5445
    (614) 263-7219


13.10: Deja Vu

The chess database "Deja Vu" from ChessWorks is a PGN compatible collection of
over 300,000 games.
It is available only on CD-ROM and is scheduled for
release in 1994.05 with periodic revisions thereafter. The introductory price
is US$329. For further information, the authors are John Crayton and Eric
Schiller and they can be contacted via e-mail (chesswks@netcom.com).


13.11: MV2PGN

The program "MV2PGN" can be used to convert game data generated by both current
and older versions of the GIICS (Graphical Interface - Internet Chess Server).
The program is included in the self extracting archive available from
chess.uoknor.edu in the directory pub/chess/DOS as the file "ics2pgn.exe".
Source code is also included. This program is reported to supersede the older
"mail2pgn" and was needed due to a change in ICS recording format in late 1993.
For further information about MV2PGN, the contact person is Gary Bastin
(gbastin@x102a.ess.harris.com).


13.12: The Hansen utilities (cb2pgn, nic2pgn, pgn2cb, pgn2nic)

The Hansen utilities are used to convert among various chess data
representation formats. The PGN related programs include: "cb2pgn.exe"
(convert ChessBase to PGN), "nic2pgn.exe" (convert NIC to PGN), "pgn2cb.exe"
(convert PGN to ChessBase), and "pgn2nic.exe" (convert PGN to NIC).

The ChessBase related utilities (cb2pgn/pgn2cb) are found at chess.uoknor.edu
in the pub/chess/Game-Databases/ChessBase directory.

The NIC related utilities (nic2pgn/pgn2nic) are found at chess.uoknor.edu in
the pub/chess/Game-Databases/NIC directory.

For further information about the Hansen utilities, the contact person is the
author, Carsten Hansen (ch0506@hdc.hha.dk).


13.13: Slappy the Database

"Slappy the Database" is a commercial chess database and translation program
scheduled for release no sooner than late 1994.
It is a low cost utility with
a simple character interface intended for those who want a supported product
but who do not need (or cannot afford) a comprehensive, feature-laden program
with a graphical user interface. Slappy's two most important features are its
batch processing ability and its full implementation of each and every standard
described in this document. Versions of Slappy the Database will be provided
for various platforms including: Intel 386/486 Unix, Apple Macintosh, and
MS-DOS.

Slappy may also be useful to those who have a full feature program who also
need to run time consuming chess database tasks on a spare computer.

Suggestions and comments should be directed to its author, Steven J. Edwards
(sje@world.std.com). More details will appear here as they become available.


13.14: CBASCII

"CBASCII" is a general utility for converting chess data between ChessBase
format and ASCII representations. It has PGN capability, and it is available
from the chess.uoknor.edu ftp site in the pub/chess/DOS directory as the file
"cba1_2.zip". The contact person is the program's author, Andy Duplain
(duplain@btcs.bt.co.uk).


13.15: ZZZZZZ

"ZZZZZZ" is a chessplaying program, complete with source, that also includes
some database functions. A recent version is reported to have both PGN and EPD
capabilities. It is available from the chess.uoknor.edu ftp site in the
pub/chess/Unix directory as the file "zzzzzz-3.2b1.tar.gz". The contact person
is its author, Gijsbert Wiesenecker (wiesenecker@sara.nl).


13.16: icsconv

The program "icsconv" can be used to convert Internet Chess Server games, both
old and new format, to PGN. It is available from the chess.uoknor.edu site in
the pub/chess/Game-Databases/PGN/Tools directory as the file "icsconv.exe".
The contact person is the author, Kevin Nomura (chow@netcom.com).


13.17: CHESSOP (CHESSOPN/CHESSOPG)

CHESSOP is an openings database and viewing tool with support for reading PGN
games. It runs under MS-DOS and displays positions rather than games. For
each position, both good and bad moves are listed with appropriate annotation.
Transpositions are handled as well. The distributed database contains over
100,000 positions covering all the common openings. Users can feed in their
own PGN data as well. CHESSOP takes 3 Mbyte of hard disk, costs US$39 and can
be obtained from:

    CHESSX Software
    12 Bluebell Close
    Glenmore Park
    AUSTRALIA 2745.

The ideas behind CHESSOP can be seen in CHESSOPN (alias CHESSOPG), a free
version on the ICS server which has a reduced openings database (25,000
positions) and no PGN or transposition support but is otherwise the same as
CHESSOP. (These are the files "chessopg.zip" in the directory pub/chess/DOS at
the chess.uoknor.edu ftp site.)


13.18: CAT2PGN

The program "CAT2PGN" is a utility that translates data from the format used by
Chess Assistant into PGN. It is available from the chess.uoknor.edu ftp site.
The contact person for CAT2PGN is its author, David Myers
(myers@frodo.biochem.duke.edu).


13.19: pgn2opg

The utility "pgn2opg" can be used to convert PGN files into a text format used
by the "CHESSOPG" program mentioned above. Although it does not perform any
semantic analysis on PGN input, it has been demonstrated to handle known
correct PGN input properly. The file can be found in the pub/chess/PGN/Tools
directory at the chess.uoknor.edu ftp site. For more information, the author
is David Barnes (djb@ukc.ac.uk).


14: PGN data archives

The primary PGN data archive repository is located at the ftp site
chess.uoknor.edu as the directory "pub/chess/Game-Databases/PGN". It is
organized according to the description given in section 11.5 of this document.
The European site ftp.math.uni-hamburg.de is also reported to carry a regularly
updated copy of the repository.


15: International Olympic Committee country codes

International Olympic Committee country codes are employed for Site nation
information because of their traditional use with the reporting of
international sporting events. Due to changes in geography and linguistic
custom, some of the following may be incorrect or outdated. Corrections and
extensions should be sent via e-mail to the PGN coordinator whose address is
listed near the start of this document.

AFG: Afghanistan
AIR: Aboard aircraft
ALB: Albania
ALG: Algeria
AND: Andorra
ANG: Angola
ANT: Antigua
ARG: Argentina
ARM: Armenia
ATA: Antarctica
AUS: Australia
AZB: Azerbaijan
BAN: Bangladesh
BAR: Bahrain
BHM: Bahamas
BEL: Belgium
BER: Bermuda
BIH: Bosnia and Herzegovina
BLA: Belarus
BLG: Bulgaria
BLZ: Belize
BOL: Bolivia
BRB: Barbados
BRS: Brazil
BRU: Brunei
BSW: Botswana
CAN: Canada
CHI: Chile
COL: Colombia
CRA: Costa Rica
CRO: Croatia
CSR: Czechoslovakia
CUB: Cuba
CYP: Cyprus
DEN: Denmark
DOM: Dominican Republic
ECU: Ecuador
EGY: Egypt
ENG: England
ESP: Spain
EST: Estonia
FAI: Faroe Islands
FIJ: Fiji
FIN: Finland
FRA: France
GAM: Gambia
GCI: Guernsey-Jersey
GEO: Georgia
GER: Germany
GHA: Ghana
GRC: Greece
GUA: Guatemala
GUY: Guyana
HAI: Haiti
HKG: Hong Kong
HON: Honduras
HUN: Hungary
IND: India
IRL: Ireland
IRN: Iran
IRQ: Iraq
ISD: Iceland
ISR: Israel
ITA: Italy
IVO: Ivory Coast
JAM: Jamaica
JAP: Japan
JRD: Jordan
JUG: Yugoslavia
KAZ: Kazakhstan
KEN: Kenya
KIR: Kyrgyzstan
KUW: Kuwait
LAT: Latvia
LEB: Lebanon
LIB: Libya
LIC: Liechtenstein
LTU: Lithuania
LUX: Luxembourg
MAL: Malaysia
MAU: Mauritania
MEX: Mexico
MLI: Mali
MLT: Malta
MNC: Monaco
MOL: Moldova
MON: Mongolia
MOZ: Mozambique
MRC: Morocco
MRT: Mauritius
MYN: Myanmar
NCG: Nicaragua
NET: The Internet
NIG: Nigeria
NLA: Netherlands Antilles
NLD: Netherlands
NOR: Norway
NZD: New Zealand
OST: Austria
PAK: Pakistan
PAL: Palestine
PAN: Panama
PAR: Paraguay
PER: Peru
PHI: Philippines
PNG: Papua New Guinea
POL: Poland
POR: Portugal
PRC: People's Republic of China
PRO: Puerto Rico
QTR: Qatar
RIN: Indonesia
ROM: Romania
RUS: Russia
SAF: South Africa
SAL: El Salvador
SCO: Scotland
SEA: At Sea
SEN: Senegal
SEY: Seychelles
SIP: Singapore
SLV: Slovenia
SMA: San Marino
SPC: Aboard spacecraft
SRI: Sri Lanka
SUD: Sudan
SUR: Surinam
SVE: Sweden
SWZ: Switzerland
SYR: Syria
TAI: Thailand
TMT: Turkmenistan
TRK: Turkey
TTO: Trinidad and Tobago
TUN: Tunisia
UAE: United Arab Emirates
UGA: Uganda
UKR: Ukraine
UNK: Unknown
URU: Uruguay
USA: United States of America
UZB: Uzbekistan
VEN: Venezuela
VGB: British Virgin Islands
VIE: Vietnam
VUS: U.S. Virgin Islands
WLS: Wales
YEM: Yemen
YUG: Yugoslavia
ZAM: Zambia
ZIM: Zimbabwe
ZRE: Zaire


16: Additional chess data standards

While PGN is used for game storage, there are other data representation
standards for other chess related purposes. Two important standards are FEN
and EPD, both described in this section.


16.1: FEN

FEN is "Forsyth-Edwards Notation"; it is a standard for describing chess
positions using the ASCII character set.

A single FEN record uses one text line of variable length composed of six data
fields. The first four fields of the FEN specification are the same as the
first four fields of the EPD specification.

A text file composed exclusively of FEN data records should have a file name
with the suffix ".fen".


16.1.1: History

FEN is based on a 19th century standard for position recording designed by the
Scotsman David Forsyth, a newspaper journalist. The original Forsyth standard
has been slightly extended for use with chess software by Steven Edwards with
assistance from commentators on the Internet. This new standard, FEN, was
first implemented in Edwards' SAN Kit.


16.1.2: Uses for a position notation

Having a standard position notation is particularly important for chess
programmers as it allows them to share position databases. For example, there
exist standard position notation databases with many of the classical benchmark
tests for chessplaying programs, and by using a common position notation format
many hours of tedious data entry can be saved. Additionally, a position
notation can be useful for page layout programs and for confirming position
status for e-mail competition.

Many interesting chess problem sets represented using FEN can be found at the
chess.uoknor.edu ftp site in the directory pub/chess/SAN_testsuites.


16.1.3: Data fields

FEN specifies the piece placement, the active color, the castling availability,
the en passant target square, the halfmove clock, and the fullmove number.
These can all fit on a single text line in an easily read format. The length
of a FEN position description varies somewhat according to the position.
In
some cases, the description could be eighty or more characters in length and so
may not fit conveniently on some displays. However, these positions aren't too
common.

A FEN description has six fields. Each field is composed only of non-blank
printing ASCII characters. Adjacent fields are separated by a single ASCII
space character.


16.1.3.1: Piece placement data

The first field represents the placement of the pieces on the board. The board
contents are specified starting with the eighth rank and ending with the first
rank. For each rank, the squares are specified from file a to file h. White
pieces are identified by uppercase SAN piece letters ("PNBRQK") and black
pieces are identified by lowercase SAN piece letters ("pnbrqk"). Empty squares
are represented by the digits one through eight; the digit used represents the
count of contiguous empty squares along a rank. A solidus character "/" is
used to separate data of adjacent ranks.


16.1.3.2: Active color

The second field represents the active color. A lower case "w" is used if
White is to move; a lower case "b" is used if Black is the active player.


16.1.3.3: Castling availability

The third field represents castling availability. This indicates potential
future castling that may or may not be possible at the moment due to blocking
pieces or enemy attacks. If there is no castling availability for either side,
the single character symbol "-" is used. Otherwise, a combination of one to
four characters is present. If White has kingside castling availability,
the uppercase letter "K" appears. If White has queenside castling
availability, the uppercase letter "Q" appears. If Black has kingside castling
availability, the lowercase letter "k" appears. If Black has queenside
castling availability, then the lowercase letter "q" appears.
Those letters
which appear will be ordered first uppercase before lowercase and second
kingside before queenside. There is no white space between the letters.


16.1.3.4: En passant target square

The fourth field is the en passant target square. If there is no en passant
target square then the single character symbol "-" appears. If there is an en
passant target square then it is represented by a lowercase file character
immediately followed by a rank digit. Obviously, the rank digit will be "3"
following a white pawn double advance (Black is the active color) or else be
the digit "6" after a black pawn double advance (White being the active color).

An en passant target square is given if and only if the last move was a pawn
advance of two squares. Therefore, an en passant target square field may have
a square name even if there is no pawn of the opposing side that may
immediately execute the en passant capture.


16.1.3.5: Halfmove clock

The fifth field is a nonnegative integer representing the halfmove clock. This
number is the count of halfmoves (or ply) since the last pawn advance or
capturing move. This value is used for the fifty move draw rule.


16.1.3.6: Fullmove number

The sixth and last field is a positive integer that gives the fullmove number.
This will have the value "1" for the first move of a game for both White and
Black. It is incremented by one immediately after each move by Black.


16.1.4: Examples

Here's the FEN for the starting position:

rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1

And after the move 1. e4:

rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1

And then after 1. ... c5:

rnbqkbnr/pp1ppppp/8/2p5/4P3/8/PPPP1PPP/RNBQKBNR w KQkq c6 0 2

And then after 2.
Nf3:

rnbqkbnr/pp1ppppp/8/2p5/4P3/5N2/PPPP1PPP/RNBQKB1R b KQkq - 1 2

For two kings on their home squares and a white pawn on e2 (White to move) with
thirty eight full moves played with five halfmoves since the last pawn move or
capture:

4k3/8/8/8/8/8/4P3/4K3 w - - 5 39


16.2: EPD

EPD is "Extended Position Description"; it is a standard for describing chess
positions along with an extended set of structured attribute values using the
ASCII character set. It is intended for data and command interchange among
chessplaying programs. It is also intended for the representation of portable
opening library repositories.

A single EPD uses one text line of variable length composed of four data fields
followed by zero or more operations. The four fields of the EPD specification
are the same as the first four fields of the FEN specification.

A text file composed exclusively of EPD data records should have a file name
with the suffix ".epd".


16.2.1: History

EPD is based in part on the earlier FEN standard; it has added extensions for
use with opening library preparation and also for general data and command
interchange among advanced chess programs. EPD was developed by John Stanback
and Steven Edwards; its first implementation is in Stanback's master strength
chessplaying program Zarkov.


16.2.2: Uses for an extended position notation

Like FEN, EPD can also be used for general position description. However,
unlike FEN, EPD is designed to be expandable by the addition of new operations
that provide new functionality as needs arise.

Many interesting chess problem sets represented using EPD can be found at the
chess.uoknor.edu ftp site in the directory pub/chess/SAN_testsuites.
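
As a rough illustration of the six FEN fields described in section 16.1.3 (the
first four of which are shared with EPD), the following Python sketch splits a
FEN record and applies the field rules stated there. The names "FenRecord",
"expand_placement", and "parse_fen" are hypothetical and are not part of this
standard. The sketch checks only the syntax of the fields; it makes no attempt
to verify that the position is legal under the Laws of Chess.

```python
from itertools import combinations
from typing import List, NamedTuple

# Legal castling fields: "-" or a non-empty subset of "KQkq" kept in that
# order (uppercase before lowercase, kingside before queenside).
CASTLING_VALUES = {"-"} | {
    "".join(letters)
    for count in range(1, 5)
    for letters in combinations("KQkq", count)
}


class FenRecord(NamedTuple):  # hypothetical container, not defined by FEN
    board: List[str]          # eight strings, rank 8 first; "." = empty square
    active_color: str
    castling: str
    en_passant: str
    halfmove_clock: int
    fullmove_number: int


def expand_placement(placement: str) -> List[str]:
    """Expand the piece placement field into eight 8-character rows."""
    ranks = placement.split("/")
    if len(ranks) != 8:
        raise ValueError("piece placement needs eight ranks")
    board = []
    for rank in ranks:
        row = ""
        for ch in rank:
            if ch in "12345678":
                row += "." * int(ch)      # digit = run of empty squares
            elif ch in "PNBRQKpnbrqk":
                row += ch                 # SAN piece letter
            else:
                raise ValueError(f"bad placement character: {ch!r}")
        if len(row) != 8:
            raise ValueError(f"rank does not cover eight squares: {rank!r}")
        board.append(row)
    return board


def parse_fen(record: str) -> FenRecord:
    """Split one FEN text line into its six fields and check each one."""
    fields = record.split(" ")
    if len(fields) != 6:
        raise ValueError("a FEN record has six space-separated fields")
    placement, color, castling, en_passant, halfmove, fullmove = fields
    if color not in ("w", "b"):
        raise ValueError("active color must be 'w' or 'b'")
    if castling not in CASTLING_VALUES:
        raise ValueError("bad castling availability field")
    if en_passant != "-":
        # Rank 3 after a white double advance (Black to move), else rank 6.
        rank = "3" if color == "b" else "6"
        if (len(en_passant) != 2 or en_passant[0] not in "abcdefgh"
                or en_passant[1] != rank):
            raise ValueError("bad en passant target square")
    for number in (halfmove, fullmove):
        if not (number.isascii() and number.isdigit()):
            raise ValueError("clock and move number must be integers")
    if int(fullmove) < 1:
        raise ValueError("fullmove number must be positive")
    return FenRecord(expand_placement(placement), color, castling,
                     en_passant, int(halfmove), int(fullmove))
```

Applied to the example given after 1. e4 in section 16.1.4, this sketch yields
a row of "....P..." for the fourth rank and an en passant target square of
"e3". Raising ValueError on any malformed field is a design choice of the
sketch; a tolerant reader could instead report and skip bad records.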


16.2.3: Data fields

EPD specifies the piece placement, the active color, the castling availability,
and the en passant target square of a position. These can all fit on a single
text line in an easily read format. The length of an EPD position description
varies somewhat according to the position and any associated operations. In
some cases, the description could be eighty or more characters in length and so
may not fit conveniently on some displays. However, most EPD descriptions pass
among programs only and these are not usually seen by program users.

(Note: due to the likelihood of future expansion of EPD, implementors are
encouraged to have their programs handle EPD text lines up to 1024 characters
long.)

Each EPD data field is composed only of non-blank printing ASCII characters.
Adjacent data fields are separated by a single ASCII space character.


16.2.3.1: Piece placement data

The first field represents the placement of the pieces on the board. The board
contents are specified starting with the eighth rank and ending with the first
rank. For each rank, the squares are specified from file a to file h. White
pieces are identified by uppercase SAN piece letters ("PNBRQK") and black
pieces are identified by lowercase SAN piece letters ("pnbrqk"). Empty squares
are represented by the digits one through eight; the digit used represents the
count of contiguous empty squares along a rank. A solidus character "/" is
used to separate data of adjacent ranks.


16.2.3.2: Active color

The second field represents the active color. A lower case "w" is used if
White is to move; a lower case "b" is used if Black is the active player.


16.2.3.3: Castling availability

The third field represents castling availability.
This indicates potential
future castling that may or may not be possible at the moment due to blocking
pieces or enemy attacks. If there is no castling availability for either side,
the single character symbol "-" is used. Otherwise, a combination of one to
four characters is present. If White has kingside castling availability,
the uppercase letter "K" appears. If White has queenside castling
availability, the uppercase letter "Q" appears. If Black has kingside castling
availability, the lowercase letter "k" appears. If Black has queenside
castling availability, then the lowercase letter "q" appears. Those letters
which appear will be ordered first uppercase before lowercase and second
kingside before queenside. There is no white space between the letters.


16.2.3.4: En passant target square

The fourth field is the en passant target square. If there is no en passant
target square then the single character symbol "-" appears. If there is an en
passant target square then it is represented by a lowercase file character
immediately followed by a rank digit. Obviously, the rank digit will be "3"
following a white pawn double advance (Black is the active color) or else be
the digit "6" after a black pawn double advance (White being the active color).

An en passant target square is given if and only if the last move was a pawn
advance of two squares. Therefore, an en passant target square field may have
a square name even if there is no pawn of the opposing side that may
immediately execute the en passant capture.


16.2.4: Operations

An EPD operation is composed of an opcode followed by zero or more operands and
is concluded by a semicolon.

Multiple operations are separated by a single space character.
If there is at least one operation present in an EPD line, it is separated from
the last (fourth) data field by a single space character.


16.2.4.1: General format

An opcode is an identifier that starts with a letter character and may be
followed by up to fourteen more characters. Each additional character may be a
letter or a digit or the underscore character.

An operand is either a set of contiguous non-white space printing characters or
a string. A string is a set of contiguous printing characters delimited by a
quote character at each end. A string value must have fewer than 256 bytes of
data.

If at least one operand is present in an operation, there is a single space
between the opcode and the first operand. If more than one operand is present
in an operation, there is a single blank character between every two adjacent
operands. If there are no operands, a semicolon character is appended to the
opcode to mark the end of the operation. If any operands appear, the last
operand has an appended semicolon that marks the end of the operation.

Any given opcode appears at most once per EPD record. Multiple operations in a
single EPD record should appear in ASCII order of their opcode names
(mnemonics). However, a program reading EPD records may allow for operations
not in ASCII order by opcode mnemonics; the semantics are the same in either
case.

Some opcodes that allow for more than one operand may have special ordering
requirements for the operands. For example, the "pv" (predicted variation)
opcode requires its operands (moves) to appear in the order in which they would
be played. All other opcodes that allow for more than one operand should have
operands appearing in ASCII order.
An example of the latter set is the "bm" (best move[s]) opcode; its operands
are moves that are all immediately playable from the current position.

Some opcodes require one or more operands that are chess moves. These moves
should be represented using SAN. If a different representation is used, there
is no guarantee that the EPD will be read correctly during subsequent
processing.

Some opcodes require one or more operands that are integers. Some opcodes may
require that an integer operand must be within a given range; the details are
described in the opcode list given below. A negative integer is formed with a
hyphen (minus sign) preceding the integer digit sequence. An optional plus
sign may be used for indicating a non-negative value, but such use is not
required and is indeed discouraged.

Some opcodes require one or more operands that are floating point numbers.
Some opcodes may require that a floating point operand must be within a given
range; the details are described in the opcode list given below. A floating
point operand is constructed from an optional sign character ("+" or "-"), a
digit sequence (with at least one digit), a radix point (always "."), and a
final digit sequence (with at least one digit).


16.2.4.2: Opcode mnemonics

An opcode mnemonic used for archival storage and for interprogram communication
starts with a lower case letter and is composed of only lower case letters,
digits, and the underscore character (i.e., no upper case letters). These
mnemonics will also all be at least two characters in length.

Opcode mnemonics used only by a single program or an experimental suite of
programs should start with an upper case letter. This is so they may be easily
distinguished should they inadvertently be encountered by other programs.
When such a "private" opcode is demonstrated to be widely useful, it should be
brought into the official list (appearing below) in a lower case form.

If a given program does not recognize a particular opcode, that operation is
simply ignored; it is not signaled as an error.


16.2.5: Opcode list

The opcodes are listed here in ASCII order of their mnemonics. Suggestions for
new opcodes should be sent to the PGN standard coordinator listed near the
start of this document.


16.2.5.1: Opcode "acn": analysis count: nodes

The opcode "acn" takes a single non-negative integer operand. It is used to
represent the number of nodes examined in an analysis. Note that the value may
be quite large for some extended searches and so use of (at least) a long (four
byte) representation is suggested.


16.2.5.2: Opcode "acs": analysis count: seconds

The opcode "acs" takes a single non-negative integer operand. It is used to
represent the number of seconds used for an analysis. Note that the value may
be quite large for some extended searches and so use of (at least) a long (four
byte) representation is suggested.


16.2.5.3: Opcode "am": avoid move(s)

The opcode "am" indicates a set of zero or more moves, all immediately playable
from the current position, that are to be avoided in the opinion of the EPD
writer. Each operand is a SAN move; they appear in ASCII order.


16.2.5.4: Opcode "bm": best move(s)

The opcode "bm" indicates a set of zero or more moves, all immediately playable
from the current position, that are judged to be the best available by the EPD
writer. Each operand is a SAN move; they appear in ASCII order.
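
As an illustration of the record layout described in sections 16.2.3 and
16.2.4, the following Python sketch splits one EPD text line into its four data
fields and its operations. It is a minimal sketch and not a reference
implementation: the function name parse_epd, the dictionary keys, and the
sample record are assumptions made for this example. It also assumes that
spaces and semicolons inside a quoted string are data, and it returns string
operands with their quote characters so that they stay distinguishable from
other operands.

```python
def parse_epd(line):
    """Split one EPD record into its data fields and its operations.

    Hypothetical helper for illustration; it checks only the layout of the
    piece placement field and does not verify that the position is legal.
    """
    parts = line.strip().split(" ", 4)
    if len(parts) < 4:
        raise ValueError("an EPD record needs four data fields")

    # Eight ranks separated by "/", each describing exactly eight squares.
    ranks = parts[0].split("/")
    if len(ranks) != 8:
        raise ValueError("piece placement must describe eight ranks")
    for rank in ranks:
        width = sum(int(ch) if ch.isdigit() else 1 for ch in rank)
        if width != 8:
            raise ValueError("rank %r does not describe eight squares" % rank)

    fields = {
        "placement": parts[0],
        "active_color": parts[1],
        "castling": parts[2],
        "en_passant": parts[3],
    }

    operations = {}    # opcode -> list of operands, in order of appearance
    tokens = []        # opcode and operands of the operation being read
    current = ""       # token being accumulated
    in_string = False  # True while inside a quoted string operand
    for ch in (parts[4] if len(parts) == 5 else ""):
        if in_string:
            current += ch
            if ch == '"':
                in_string = False
        elif ch == '"':
            current += ch
            in_string = True
        elif ch in " ;":
            if current:
                tokens.append(current)
                current = ""
            if ch == ";" and tokens:
                operations[tokens[0]] = tokens[1:]
                tokens = []
        else:
            current += ch
    if tokens or current:
        raise ValueError("operation not terminated by a semicolon")
    return fields, operations


# Sample record (an assumption for this example): the position after
# 1. e4 e5 2. Nf3 Nc6 with three operations in ASCII order of their opcodes.
record = ('r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - '
          'bm Bb5 d4; c0 "open game; White to move"; id "demo.0001";')
fields, operations = parse_epd(record)
```

Because a dictionary is keyed by opcode, a repeated opcode would silently
replace the earlier one; a stricter reader could reject such a record, since
any given opcode appears at most once per EPD record.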


16.2.5.5: Opcode "c0": comment (primary, also "c1" through "c9")

The opcode "c0" (lower case letter "c", digit character zero) indicates a top
level comment that applies to the given position. It is the first of ten
ranked comments, each of which has a mnemonic formed from the lower case letter
"c" followed by a single decimal digit. Each of these opcodes takes either a
single string operand or no operand at all.

This ten member comment family of opcodes is intended for use as descriptive
commentary for a complete game or game fragment. The usual processing of these
opcodes is as follows:

1) At the beginning of a game (or game fragment), a move sequence scanning
program initializes each element of its set of ten comment string registers to
be null.

2) As the EPD record for each position in the game is processed, the comment
operations are interpreted from left to right. (Actually, all operations in an
EPD record are interpreted from left to right.) Because operations appear in
ASCII order according to their opcode mnemonics, opcode "c0" (if present) will
be handled prior to all other comment opcodes, then opcode "c1" (if present),
and so forth until opcode "c9" (if present).

3) The processing of opcode "cN" (0 <= N <= 9) involves two steps. First, all
comment string registers with an index equal to or greater than N are set to
null. (This is the set "cN" through "c9".) Second, and only if a string
operand is present, the value of the corresponding comment string register is
set equal to the string operand.


16.2.5.6: Opcode "ce": centipawn evaluation

The opcode "ce" indicates the evaluation of the indicated position in centipawn
units. It takes a single operand, an optionally signed integer that gives an
evaluation of the position from the viewpoint of the active player; i.e., the
player with the move.
Positive values indicate a position favorable to the moving player while
negative values indicate a position favorable to the passive player; i.e., the
player without the move. A centipawn evaluation value close to zero indicates
a neutral positional evaluation.

Values are restricted to integers that are equal to or greater than -32767 and
are less than or equal to 32766.

A value greater than 32000 indicates the availability of a forced mate to the
active player. The number of plies until mate is given by subtracting the
evaluation from the value 32767. Thus, a winning mate in N fullmoves is a mate
in ((2 * N) - 1) halfmoves (or ply) and has a corresponding centipawn
evaluation of (32767 - ((2 * N) - 1)). For example, a mate on the move (mate
in one) has a centipawn evaluation of 32766 while a mate in five has a
centipawn evaluation of 32758.

A value less than -32000 indicates the availability of a forced mate to the
passive player. The number of plies until mate is given by subtracting the
evaluation from the value -32767 and then negating the result. Thus, a losing
mate in N fullmoves is a mate in (2 * N) halfmoves (or ply) and has a
corresponding centipawn evaluation of (-32767 + (2 * N)). For example, a mate
after the move (losing mate in one) has a centipawn evaluation of -32765 while
a losing mate in five has a centipawn evaluation of -32757.

A value of -32767 indicates an illegal position. A stalemate position has a
centipawn evaluation of zero as does a position drawn due to insufficient
mating material. Any other position known to be a certain forced draw also has
a centipawn evaluation of zero.


16.2.5.7: Opcode "dm": direct mate fullmove count

The "dm" opcode is used to indicate the number of fullmoves until checkmate is
to be delivered by the active color for the indicated position.
It always takes a single operand which is a positive integer giving the
fullmove count. For example, a position known to be a "mate in three" would
have an operation of "dm 3;" to indicate this.

This opcode is intended for use with problem sets composed of positions
requiring direct mate answers as solutions.


16.2.5.8: Opcode "draw_accept": accept a draw offer

The opcode "draw_accept" is used to indicate that a draw offer made after the
move that led to the indicated position is accepted by the active player.
This opcode takes no operands.


16.2.5.9: Opcode "draw_claim": claim a draw

The opcode "draw_claim" is used to indicate a claim by the active player that a
draw exists. The draw is claimed because of a third time repetition or because
of the fifty move rule or because of insufficient mating material. A supplied
move (see the opcode "sm") is also required to appear as part of the same EPD
record. The draw_claim opcode takes no operands.


16.2.5.10: Opcode "draw_offer": offer a draw

The opcode "draw_offer" is used to indicate that a draw is offered by the
active player. A supplied move (see the opcode "sm") is also required to
appear as part of the same EPD record; this move is considered played from the
indicated position. The draw_offer opcode takes no operands.


16.2.5.11: Opcode "draw_reject": reject a draw offer

The opcode "draw_reject" is used to indicate that a draw offer made after the
move that led to the indicated position is rejected by the active player.
This opcode takes no operands.


16.2.5.12: Opcode "eco": _Encyclopedia of Chess Openings_ opening code

The opcode "eco" is used to associate an opening designation from the
_Encyclopedia of Chess Openings_ taxonomy with the indicated position.
The opcode takes either a single string operand (the ECO opening name) or no
operand at all. If an operand is present, its value is associated with an
"ECO" string register of the scanning program. If there is no operand, the ECO
string register of the scanning program is set to null.

The usage is similar to that of the "ECO" tag pair of the PGN standard.


16.2.5.13: Opcode "fmvn": fullmove number

The opcode "fmvn" represents the fullmove number associated with the position.
It always takes a single operand that is the positive integer value of the move
number.

This opcode is used to explicitly represent the fullmove number in EPD that is
present by default in FEN as the sixth field. Fullmove number information is
usually omitted from EPD because it does not affect move generation (commonly
needed for EPD-using tasks) but it does affect game notation (commonly needed
for FEN-using tasks). Because of the desire for space optimization for large
EPD files, fullmove numbers were dropped when EPD was derived from its parent,
FEN. The halfmove clock information was similarly dropped.


16.2.5.14: Opcode "hmvc": halfmove clock

The opcode "hmvc" represents the halfmove clock associated with the position.
The halfmove clock of a position is equal to the number of plies since the last
pawn move or capture. This information is used to implement the fifty move
draw rule. The opcode always takes a single operand that is the non-negative
integer value of the halfmove clock.

This opcode is used to explicitly represent the halfmove clock in EPD that is
present by default in FEN as the fifth field. Halfmove clock information is
usually omitted from EPD because it does not affect move generation (commonly
needed for EPD-using tasks) but it does affect game termination issues
(commonly needed for FEN-using tasks).
Because of the desire for space optimization for large EPD files, halfmove
clock values were dropped when EPD was derived from its parent, FEN. The
fullmove number information was similarly dropped.


16.2.5.15: Opcode "id": position identification

The opcode "id" is used to provide a simple identifying label for the indicated
position. It takes a single string operand.

This opcode is intended for use with test suites used for measuring
chessplaying program strength. An example "id" operand for the seven hundred
fifty seventh position of the one thousand one problems in Reinfeld's _1001
Winning Chess Sacrifices and Combinations_ would be "WCSAC.0757" while the
fifteenth position in the twenty four problem Bratko-Kopec test suite would
have an "id" operand of "BK.15".


16.2.5.16: Opcode "nic": _New In Chess_ opening code

The opcode "nic" is used to associate an opening designation from the _New In
Chess_ taxonomy with the indicated position. The opcode takes either a single
string operand (the NIC opening name) or no operand at all. If an operand is
present, its value is associated with an "NIC" string register of the scanning
program. If there is no operand, the NIC string register of the scanning
program is set to null.

The usage is similar to that of the "NIC" tag pair of the PGN standard.


16.2.5.17: Opcode "noop": no operation

The "noop" opcode is used to indicate no operation. It takes zero or more
operands, each of which may be of any type. The operation involves no
processing. It is intended for use by developers for program testing purposes.


16.2.5.18: Opcode "pm": predicted move

The "pm" opcode is used to provide a single predicted move for the indicated
position. It has exactly one operand, a move playable from the position.
This move is judged by the EPD writer to represent the best move available to
the active player.

If a non-empty "pv" (predicted variation) line of play is also present in the
same EPD record, the first move of the predicted variation is the same as the
predicted move.

The "pm" opcode is intended for use as a general "display hint" mechanism.


16.2.5.19: Opcode "pv": predicted variation

The "pv" opcode is used to provide a predicted variation for the indicated
position. It has zero or more operands which represent a sequence of moves
playable from the position. This sequence is judged by the EPD writer to
represent the best play available.

If a "pm" (predicted move) operation is also present in the same EPD record,
the predicted move is the same as the first move of the predicted variation.


16.2.5.20: Opcode "rc": repetition count

The "rc" opcode is used to indicate the number of occurrences of the indicated
position. It takes a single, positive integer operand. Any position,
including the initial starting position, is considered to have an "rc" value of
at least one. A value of three indicates a candidate for a draw claim by the
position repetition rule.


16.2.5.21: Opcode "resign": game resignation

The opcode "resign" is used to indicate that the active player has resigned the
game. This opcode takes no operands.


16.2.5.22: Opcode "sm": supplied move

The "sm" opcode is used to provide a single supplied move for the indicated
position. It has exactly one operand, a move playable from the position. This
move is the move to be played from the position.

The "sm" opcode is intended to communicate the most recently played move in an
active game. It is used to communicate moves between programs in automatic
play via a network.
This includes correspondence play using e-mail and also programs acting as
network front ends to human players.


16.2.5.23: Opcode "tcgs": telecommunication: game selector

The "tcgs" opcode is one of the telecommunication family of opcodes used for
games conducted via e-mail and similar means. This opcode takes a single
operand that is a positive integer. It is used to select among various games
in progress between the same sender and receiver.


16.2.5.24: Opcode "tcri": telecommunication: receiver identification

The "tcri" opcode is one of the telecommunication family of opcodes used for
games conducted via e-mail and similar means. This opcode takes two order
dependent string operands. The first operand is the e-mail address of the
receiver of the EPD record. The second operand is the name of the player
(program or human) at the address who is the actual receiver of the EPD record.


16.2.5.25: Opcode "tcsi": telecommunication: sender identification

The "tcsi" opcode is one of the telecommunication family of opcodes used for
games conducted via e-mail and similar means. This opcode takes two order
dependent string operands. The first operand is the e-mail address of the
sender of the EPD record. The second operand is the name of the player
(program or human) at the address who is the actual sender of the EPD record.


16.2.5.26: Opcode "v0": variation name (primary, also "v1" through "v9")

The opcode "v0" (lower case letter "v", digit character zero) indicates a top
level variation name that applies to the given position. It is the first of
ten ranked variation names, each of which has a mnemonic formed from the lower
case letter "v" followed by a single decimal digit. Each of these opcodes
takes either a single string operand or no operand at all.

This ten member variation name family of opcodes is intended for use as
traditional variation names for a complete game or game fragment. The usual
processing of these opcodes is as follows:

1) At the beginning of a game (or game fragment), a move sequence scanning
program initializes each element of its set of ten variation name string
registers to be null.

2) As the EPD record for each position in the game is processed, the variation
name operations are interpreted from left to right. (Actually, all operations
in an EPD record are interpreted from left to right.) Because operations
appear in ASCII order according to their opcode mnemonics, opcode "v0" (if
present) will be handled prior to all other variation name opcodes, then opcode
"v1" (if present), and so forth until opcode "v9" (if present).

3) The processing of opcode "vN" (0 <= N <= 9) involves two steps. First, all
variation name string registers with an index equal to or greater than N are
set to null. (This is the set "vN" through "v9".) Second, and only if a string
operand is present, the value of the corresponding variation name string
register is set equal to the string operand.


17: Alternative chesspiece identifier letters

English language piece names are used to define the letter set for identifying
chesspieces in PGN movetext. However, authors of programs which are used only
for local presentation or scanning of chess move data may find it convenient to
use piece letter codes common in their locales. This is not a problem as long
as PGN data that resides in archival storage or that is exchanged among
programs still uses the SAN (English) piece letter codes: "PNBRQK".
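
A program that accepts local piece letters would need to convert them back to
the English codes before exporting. The following Python sketch shows one way
this might be done for a single move; the function name to_english_san is an
assumption made for this example, and the German letters "BSLTDK" are taken
from the table given in this section. It relies on the fact that file letters
in SAN are lower case, so every upper case letter in a move other than the "O"
of castling is a piece letter.

```python
ENGLISH_LETTERS = "PNBRQK"  # pawn, knight, bishop, rook, queen, king


def to_english_san(move, local_letters):
    """Rewrite the piece letters of one SAN move using the English codes.

    local_letters lists the local codes in the order pawn, knight, bishop,
    rook, queen, king; for example "BSLTDK" for German. Hypothetical helper
    for illustration only.
    """
    if move.startswith("O-O"):
        return move  # castling uses the letter "O", which is not a piece code
    # str.translate maps every character in one pass, so a letter produced by
    # the mapping is never translated a second time.
    return move.translate(str.maketrans(local_letters, ENGLISH_LETTERS))


german = "BSLTDK"
converted = [to_english_san(m, german) for m in ["Sf3", "Lxe5+", "e8=D", "O-O-O"]]
```

With the German letters, the four sample moves become "Nf3", "Bxe5+", "e8=Q",
and "O-O-O".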

For the above authors only, a list of alternative piece letter codes is
provided:

Language     Piece letters (pawn knight bishop rook queen king)
----------   --------------------------------------------------
Czech        P J S V D K
Danish       B S L T D K
Dutch        O P L T D K
English      P N B R Q K
Estonian     P R O V L K
Finnish      P R L T D K
French       P C F T D R
German       B S L T D K
Hungarian    G H F B V K
Icelandic    P R B H D K
Italian      P C A T D R
Norwegian    B S L T D K
Polish       P S G W H K
Portuguese   P C B T D R
Romanian     P C N T D R
Spanish      P C A T D R
Swedish      B S L T D K


18: Formal syntax

<PGN-database> ::= <PGN-game> <PGN-database>
                   <empty>

<PGN-game> ::= <tag-section> <movetext-section>

<tag-section> ::= <tag-pair> <tag-section>
                  <empty>

<tag-pair> ::= [ <tag-name> <tag-value> ]

<tag-name> ::= <identifier>

<tag-value> ::= <string>

<movetext-section> ::= <element-sequence> <game-termination>

<element-sequence> ::= <element> <element-sequence>
                       <recursive-variation> <element-sequence>
                       <empty>

<element> ::= <move-number-indication>
              <SAN-move>
              <numeric-annotation-glyph>

<recursive-variation> ::= ( <element-sequence> )

<game-termination> ::= 1-0
                       0-1
                       1/2-1/2
                       *

<empty> ::=


19: Canonical chess position hash coding

*** This section is under development.


20: Binary representation (PGC)

*** This section is under development.

The binary coded version of PGN is PGC (PGN Game Coding). PGC is a binary
representation standard of PGN data designed for the dual goals of storage
efficiency and program I/O. A file containing PGC data should have a name with
a suffix of ".pgc".

Unlike PGN text files that may have locale dependent representations for
newlines, PGC files have data that does not vary due to local processing
environment. This means that PGC files may be transferred among systems using
general binary file methods.

PGC files should be used only when the use of PGN is impractical due to time
and space resource constraints. As the general level of processing
capabilities increases, the need for PGC over PGN will decrease. Therefore,
implementors are encouraged not to use PGC as the default representation
because it is much more difficult (than PGN) to understand without proper
software.

PGC data is composed of a sequence of PGC records. Each record is composed of
a sequence of one or more bytes. The first byte is the PGC record marker and
it specifies the interpretation of the remaining portion of the record. This
remaining portion is composed of zero or more PGC record items. Item types
include move sequences, move sets, and character strings.


20.1: Bytes, words, and doublewords

At the lowest level, PGC binary data is organized as bytes, words (two
contiguous bytes), and doublewords (four contiguous bytes). All eight bits of
a byte are used. Longwords (eight contiguous bytes) are not used. Integer
values are stored using two's complement representation. Integers may be
signed or unsigned depending on context. Multibyte integers are stored in
little-endian format with the least significant byte appearing first.

A one byte integer item is called "int-1". A two byte integer item is called
"int-2". A four byte integer item is called "int-4".

Characters are stored as bytes using the ISO 8859/1 Latin-1 (ECMA-94) code set.
There is no provision for other character sets or representations.


20.2: Move ordinals

A chess move is represented using a move ordinal.
This is a single unsigned byte quantity with values from zero to 255. A move
ordinal is interpreted as an index into the list of legal moves from the
current position. This list is constructed by generating the legal moves from
the current position, assigning a SAN ASCII string to each move, and then
sorting these strings in ascending order. Note that a seven bit ordinal, as
used by some inferior representation systems, is insufficient as there are
some positions that have more than 128 moves available.

Examples: From the initial position, there are twenty moves. Move ordinal 0
corresponds to the SAN move string "Na3"; move ordinal 1 corresponds to "Nc3",
move ordinal 4 corresponds to "a3", and move ordinal 19 corresponds to "h4".

Moves can be organized into sequences and sets. A move sequence is an ordered
list of moves that are played, one after another from first to last. A move
set is a list of moves that are all playable from the current position.

Move sequence data is represented using a length header followed by move
ordinal data. The length header is an unsigned integer that may be a byte or a
word. The integer gives the number, possibly zero, of following move ordinal
bytes. Most move sequences can be represented using just a byte header; these
are called "mvseq-1" items. Move sequence data using a word header are called
"mvseq-2" items.

Move set data is represented using a length header followed by move ordinal
data. The length header is an unsigned integer that is a byte. The integer
gives the number, possibly zero, of following move ordinal bytes. All move
sets are represented using just a byte header; these are called "mvset-1"
items. (Note the implied restriction that a move set can only have a maximum
of 255 of the possible 256 ordinals present at one time.)
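
The ordinal assignment and the move sequence layout described above can be
sketched in Python as follows. This is a minimal sketch under stated
assumptions: the names INITIAL_MOVES, move_ordinal, and encode_mvseq are
invented for this example, and the list of legal moves is written out by hand
for the initial position only, since a real program would obtain it from its
own move generator for each position in the sequence.

```python
import struct

# The twenty SAN moves available from the initial position, written out by
# hand for this example: sixteen pawn moves and four knight moves.
INITIAL_MOVES = [f + r for f in "abcdefgh" for r in "34"] + [
    "Na3", "Nc3", "Nf3", "Nh3"]


def move_ordinal(legal_san_moves, san):
    """Return the index of san in the ascending ASCII ordering of the list."""
    # Python compares strings by code point, which for ASCII SAN text is the
    # ASCII order: upper case piece letters sort before lower case files.
    ordered = sorted(legal_san_moves)
    return ordered.index(san)


def encode_mvseq(ordinals, header_size=1):
    """Encode move ordinals as an mvseq-1 (header_size=1) or mvseq-2 item."""
    # "<" selects least significant byte first; "B" is an unsigned byte and
    # "H" an unsigned two byte word. struct.pack raises struct.error if the
    # count does not fit in the chosen header.
    header_format = {1: "<B", 2: "<H"}[header_size]
    header = struct.pack(header_format, len(ordinals))
    # bytes() raises ValueError for any ordinal outside the range 0..255.
    return header + bytes(ordinals)


e4 = move_ordinal(INITIAL_MOVES, "e4")
short_item = encode_mvseq([e4])                 # byte header, one ordinal
long_item = encode_mvseq([e4], header_size=2)   # word header, one ordinal
```

The caller chooses the header size explicitly because the choice is not
recorded in the item itself; it is implied by the marker of the enclosing
record.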


20.3: String data

PGC string data is represented using a length header followed by bytes of
character data. The length header is an unsigned integer that may be a byte, a
word, or a doubleword. The integer gives the number, possibly zero, of
following character bytes. Most strings can be represented using just a byte
header; these are called "string-1" items. String data using a word header are
called "string-2" items and string data using a doubleword header are called
"string-4" items. No special ASCII NUL termination byte is required for PGC
storage of a string as the length is explicitly given in the item header.


20.4: Marker codes

PGC marker codes are given in hexadecimal format. PGC marker code zero (marker
0x00) is the "noop" marker and carries no meaning. Each additional marker code
defined appears in its own subsection below.


20.4.1: Marker 0x01: reduced export format single game

Marker 0x01 is used to indicate a single complete game in reduced export
format. This refers to a game that has only the Seven Tag Roster data, played
moves, and no annotations or comments. This record type is used as an
alternative to the general game data begin/end record pairs described below.
The general marker pair (0x05/0x06) is used to help represent game data that
can't be adequately represented in reduced export format. There are eight
items that follow marker 0x01 to form the "reduced export format single game"
record.
In order, these are:

1) string-1 (Event tag value)

2) string-1 (Site tag value)

3) string-1 (Date tag value)

4) string-1 (Round tag value)

5) string-1 (White tag value)

6) string-1 (Black tag value)

7) string-1 (Result tag value)

8) mvseq-2 (played moves)


20.4.2: Marker 0x02: tag pair

Marker 0x02 is used to indicate a single tag pair. There are two items that
follow marker 0x02 to form the "tag pair" record; in order these are:

1) string-1 (tag pair name)

2) string-1 (tag pair value)


20.4.3: Marker 0x03: short move sequence

Marker 0x03 is used to indicate a short move sequence. There is one item that
follows marker 0x03 to form the "short move sequence" record; this is:

1) mvseq-1 (played moves)


20.4.4: Marker 0x04: long move sequence

Marker 0x04 is used to indicate a long move sequence. There is one item that
follows marker 0x04 to form the "long move sequence" record; this is:

1) mvseq-2 (played moves)


20.4.5: Marker 0x05: general game data begin

Marker 0x05 is used to indicate the beginning of data for a game. It has no
associated items; it is a complete record by itself. Instead, it marks the
beginning of PGC records used to describe a game. All records up to the
corresponding "general game data end" record are considered to be part of the
same game. (PGC record type 0x01, "reduced export format single game", is not
permitted to appear within a general game begin/end record pair. The general
game construct is to be used as an alternative to record type 0x01 in those
cases where the latter is too restrictive to contain the data for a game.)


20.4.6: Marker 0x06: general game data end

Marker 0x06 is used to indicate the end of data for a game.
It has no associated items; it is a complete record by itself. Instead, it
marks the end of PGC records used to describe a game. All records after the
corresponding (and earlier appearing) "general game data begin" record are
considered to be part of the same game.


20.4.7: Marker 0x07: simple-nag

Marker 0x07 is used to indicate the presence of a simple NAG (Numeric
Annotation Glyph). This is an annotation marker that has only a short type
identification and no operands. There is one item that follows marker 0x07 to
form the "simple-nag" record; this is:

1) int-1 (unsigned NAG value, from 0 to 255)


20.4.8: Marker 0x08: rav-begin

Marker 0x08 is used to indicate the beginning of an RAV (Recursive Annotation
Variation). It has no associated items; it is a complete record by itself.
Instead, it marks the beginning of PGC records used to describe a recursive
annotation. It is considered an opening bracket for a later rav-end record;
the recursive annotation is completely described between the bracket pair. The
rav-begin/data/rav-end structures can be nested.


20.4.9: Marker 0x09: rav-end

Marker 0x09 is used to indicate the end of an RAV (Recursive Annotation
Variation). It has no associated items; it is a complete record by itself.
Instead, it marks the end of PGC records used to describe a recursive
annotation. It is considered a closing bracket for an earlier rav-begin
record; the recursive annotation is completely described between the bracket
pair. The rav-begin/data/rav-end structures can be nested.


20.4.10: Marker 0x0a: escape-string

Marker 0x0a is used to indicate the presence of an escape string. This is a
string represented by the use of the percent sign ("%") escape mechanism in
PGN.
The data that is escaped is the sequence of characters immediately following
the percent sign up to but not including the terminating newline. As is the
case with the PGN percent sign escape, the use of a PGC escape-string record is
limited to non-archival data. There is one item that follows marker 0x0a to
form the "escape-string" record; this is the string data being escaped:

1) string-2 (escaped string data)


21: E-mail correspondence usage

*** This section is under development.


Standard: EOF