1Standard: Portable Game Notation Specification and Implementation Guide
2
3Revised: 1994.03.12
4
5Authors: Interested readers of the Internet newsgroup rec.games.chess
6
7Coordinator: Steven J. Edwards (send comments to sje@world.std.com)
8
9
100: Preface
11
12From the Tower of Babel story:
13
14"If now, while they are one people, all speaking the same language, they have
15started to do this, nothing will later stop them from doing whatever they
16propose to do."
17
18Genesis XI, v.6, _New American Bible_
19
20
211: Introduction
22
23PGN is "Portable Game Notation", a standard designed for the representation of
24chess game data using ASCII text files.  PGN is structured for easy reading and
25writing by human users and for easy parsing and generation by computer
26programs.  The intent of the definition and propagation of PGN is to facilitate
27the sharing of public domain chess game data among chessplayers (both organic
28and otherwise), publishers, and computer chess researchers throughout the
29world.
30
31PGN is not intended to be a general purpose standard that is suitable for every
32possible use; no such standard could fill all conceivable requirements.
33Instead, PGN is proposed as a universal portable representation for data
34interchange.  The idea is to allow the construction of a family of chess
35applications that can quickly and easily process chess game data using PGN for
36import and export among themselves.
37
38
392: Chess data representation
40
41Computer usage among chessplayers has become quite common in recent years and a
42variety of different programs, both commercial and public domain, are used to
43generate, access, and propagate chess game data.  Some of these programs are
44rather impressive; most are now well behaved in that they correctly follow the
45Laws of Chess and handle users' data with reasonable care.  Unfortunately, many
46programs have had serious problems with several aspects of the external
47representation of chess game data.  Sometimes these problems become more
48visible when a user attempts to move significant quantities of data from one
49program to another; if there has been no real effort to ensure portability of
50data, then the chances for a successful transfer are small at best.
51
52
532.1: Data interchange incompatibility
54
55The reasons for format incompatibility are easy to understand.  In fact, most
56of them are correlated with the same problems that have already been seen with
57commercial software offerings for other domains such as word processing,
58spreadsheets, fonts, and graphics.  Sometimes a manufacturer deliberately
59designs a data format using encryption or some other secret, proprietary
60technique to "lock in" a customer.  Sometimes a designer may produce a format
61that can be deciphered without too much difficulty, but at the same time
62publicly discourage third party software by claiming trade secret protection.
63Another software producer may develop a non-proprietary system, but it may work
64well only within the scope of a single program or application because it is not
65easily expandable.  Finally, some other software may work very well for many
66purposes, but it uses symbols and language not easily understood by people or
67computers available to those outside the country of its development.
68
69
702.2: Specification goals
71
72A specification for a portable game notation must observe the lessons of
73history and be able to handle probable needs of the future.  The design
74criteria for PGN were selected to meet these needs.  These criteria include:
75
761) The details of the system must be publicly available and free of unnecessary
77complexity.  Ideally, if the documentation is not available for some reason,
78typical chess software developers and users should be able to understand most
79of the data without the need for third party assistance.
80
812) The details of the system must be non-proprietary so that users and software
82developers are unrestricted by concerns about infringing on intellectual
83property rights.  The idea is to let chess programmers compete in a free market
84where customers may choose software based on their real needs and not based on
85artificial requirements created by a secret data format.
86
873) The system must work for a variety of programs.  The format should be such
88that it can be used by chess database programs, chess publishing programs,
89chess server programs, and chessplaying programs without being unnecessarily
90specific to any particular application class.
91
924) The system must be easily expandable and scalable.  The expansion ability
93must include handling data items that may not exist currently but could be
94expected to emerge in the future.  (Examples: new opening classifications and
95new country names.)  The system should be scalable in that it must not have any
96arbitrary restrictions concerning the quantity of stored data.  Also, planned
97modes of expansion should either preserve earlier databases or at least allow
98for their automatic conversion.
99
1005) The system must be international.  Chess software users are found in many
101countries and the system should be free of difficulties caused by conventions
102local to a given region.
103
1046) Finally, the system should handle the same kinds and amounts of data that
105are already handled by existing chess software and by print media.
106
107
1082.3: A sample PGN game
109
110Although its description may seem rather lengthy, PGN is actually fairly
111simple.  A sample PGN game follows; it has most of the important features
112described in later sections of this document.
113
114[Event "F/S Return Match"]
115[Site "Belgrade, Serbia JUG"]
116[Date "1992.11.04"]
117[Round "29"]
118[White "Fischer, Robert J."]
119[Black "Spassky, Boris V."]
120[Result "1/2-1/2"]
121
1221. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 5. O-O Be7 6. Re1 b5 7. Bb3 d6 8. c3
123O-O 9. h3 Nb8 10. d4 Nbd7 11. c4 c6 12. cxb5 axb5 13. Nc3 Bb7 14. Bg5 b4 15.
124Nb1 h6 16. Bh4 c5 17. dxe5 Nxe4 18. Bxe7 Qxe7 19. exd6 Qf6 20. Nbd2 Nxd6 21.
125Nc4 Nxc4 22. Bxc4 Nb6 23. Ne5 Rae8 24. Bxf7+ Rxf7 25. Nxf7 Rxe1+ 26. Qxe1 Kxf7
12627. Qe3 Qg5 28. Qxg5 hxg5 29. b3 Ke6 30. a3 Kd6 31. axb4 cxb4 32. Ra5 Nd5 33.
127f3 Bc8 34. Kf2 Bf5 35. Ra7 g6 36. Ra6+ Kc5 37. Ke1 Nf4 38. g3 Nxh3 39. Kd2 Kb5
12840. Rd6 Kc5 41. Ra6 Nf2 42. g4 Bd3 43. Re6 1/2-1/2
129
130
1313: Formats: import and export
132
133There are two formats in the PGN specification.  These are the "import" format
134and the "export" format.  These are the two different ways of formatting the
135same PGN data according to its source.  The details of the two formats are
136described throughout the following sections of this document.
137
138Other than formats, there is the additional topic of PGN presentation.  While
139both PGN import and export formats are designed to be readable by humans, there
140is no recommendation that either of these be an ultimate mode of chess data
141presentation.  Rather, software developers are urged to consider all of the
142various techniques at their disposal to enhance the display of chess data at
143the presentation level (i.e., highest level) of their programs.  This means
144that the use of different fonts, character sizes, color, and other tools of
145computer aided interaction and publishing should be explored to provide a high
146quality presentation appropriate to the function of the particular program.
147
148
1493.1: Import format allows for manually prepared data
150
151The import format is rather flexible and is used to describe data that may have
152been prepared by hand, much like a source file for a high level programming
153language.  A program that can read PGN data should be able to handle the
154somewhat lax import format.
155
156
1573.2: Export format used for program generated output
158
159The export format is rather strict and is used to describe data that is usually
160prepared under program control, something like a pretty printed source program
161reformatted by a compiler.
162
163
1643.2.1: Byte equivalence
165
166For a given PGN data file, export format representations generated by different
167PGN programs on the same computing system should be exactly equivalent, byte
168for byte.
169
170
1713.2.2: Archival storage and the newline character
172
173Export format should also be used for archival storage.  Here, "archival"
174storage is defined as storage that may be accessed by a variety of computing
175systems.  The only extra requirement for archival storage is that the newline
176character have a specific representation that is independent of its value for a
177particular computing system's text file usage.  The archival representation of
178a newline is the ASCII control character LF (line feed, decimal value 10,
179hexadecimal value 0x0a).
180
181Sadly, there are some accidents of history that survive to this day that have
182baroque representations for a newline: multicharacter sequences, end-of-line
183record markers, start-of-line byte counts, fixed length records, and so forth.
184It is well beyond the scope of the PGN project to reconcile all of these to the
185unified world of ANSI C and the those enjoying the bliss of a single '\n'
186convention.  Some systems may just not be able to handle an archival PGN text
187file with native text editors.  In these cases, an indulgence of sorts is
188granted to use the local newline convention in non-archival PGN files for those
189text editors.
190
191
1923.2.3: Speed of processing
193
194Several parts of the export format deal with exact descriptions of line and
195field justification that are absent from the import format details.  The main
196reason for these restrictions on the export format are to allow the
197construction of simple data translation programs that can easily scan PGN data
198without having to have a full chess engine or other complex parsing routines.
199The idea is to encourage chess software authors to always allow for at least a
200limited PGN reading capability.  Even when a full chess engine parsing
201capability is available, it is likely to be at least two orders of magnitude
202slower than a simple text scanner.
203
204
2053.2.4: Reduced export format
206
207A PGN game represented using export format is said to be in "reduced export
208format" if all of the following hold: 1) it has no commentary, 2) it has only
209the standard seven tag roster identification information ("STR", see below), 3)
210it has no recursive annotation variations ("RAV", see below), and 4) it has no
211numeric annotation glyphs ("NAG", see below).  Reduced export format is used
212for bulk storage of unannotated games.  It represents a minimum level of
213standard conformance for a PGN exporting application.
214
215
2164: Lexicographical issues
217
218PGN data is composed of characters; non-overlapping contiguous sequences of
219characters form lexical tokens.
220
221
2224.1: Character codes
223
224PGN data is represented using a subset of the eight bit ISO 8859/1 (Latin 1)
225character set.  ("ISO" is an acronym for the International Standards
226Organization.)  This set is also known as ECMA-94 and is similar to other ISO
227Latin character sets.  ISO 8859/1 includes the standard seven bit ASCII
228character set for the 32 control character code values from zero to 31.  The 95
229printing character code values from 32 to 126 are also equivalent to seven bit
230ASCII usage.  (Code value 127, the ASCII DEL control character, is a graphic
231character in ISO 8859/1; it is not used for PGN data representation.)
232
233The 32 ISO 8859/1 code values from 128 to 159 are non-printing control
234characters.  They are not used for PGN data representation.  The 32 code values
235from 160 to 191 are mostly non-alphabetic printing characters and their use for
236PGN data is discouraged as their graphic representation varies considerably
237among other ISO Latin sets.  Finally, the 64 code values from 192 to 255 are
238mostly alphabetic printing characters with various diacritical marks; their use
239is encouraged for those languages that require such characters.  The graphic
240representations of this last set of 64 characters is fairly constant for the
241ISO Latin family.
242
243Printing character codes outside of the seven bit ASCII range may only appear
244in string data and in commentary.  They are not permitted for use in symbol
245construction.
246
247Because some PGN users' environments may not support presentation of non-ASCII
248characters, PGN game authors should refrain from using such characters in
249critical commentary or string values in game data that may be referenced in
250such environments.  PGN software authors should have their programs handle such
251environments by displaying a question mark ("?") for non-ASCII character codes.
252This is an important point because there are many computing systems that can
253display eight bit character data, but the display graphics may differ among
254machines and operating systems from different manufacturers.
255
256Only four of the ASCII control characters are permitted in PGN import format;
257these are the horizontal and vertical tabs along with the linefeed and carriage
258return codes.
259
260The external representation of the newline character may differ among
261platforms; this is an acceptable variation as long as the details of the
262implementation are hidden from software implementors and users.  When a choice
263is practical, the Unix "newline is linefeed" convention is preferred.
264
265
2664.2: Tab characters
267
268Tab characters, both horizontal and vertical, are not permitted in the export
269format.  This is because the treatment of tab characters is highly dependent
270upon the particular software in use on the host computing system.  Also, tab
271characters may not appear inside of string data.
272
273
2744.3: Line lengths
275
276PGN data are organized as simple text lines without any special bytes or
277markers for secondary record structure imposed by specific operating systems.
278Import format PGN text lines are limited to having a maximum of 255 characters
279per line including the newline character.  Lines with 80 or more printing
280characters are strongly discouraged because of the difficulties experienced by
281common text editors with long lines.
282
283In some cases, very long tag values will require 80 or more columns, but these
284are relatively rare.  An example of this is the "FEN" tag pair; it may have a
285long tag value, but this particular tag pair is only used to represent a game
286that doesn't start from the usual initial position.
287
288
2895: Commentary
290
291Comment text may appear in PGN data.  There are two kinds of comments.  The
292first kind is the "rest of line" comment; this comment type starts with a
293semicolon character and continues to the end of the line.  The second kind
294starts with a left brace character and continues to the next right brace
295character.  Comments cannot appear inside any token.
296
297Brace comments do not nest; a left brace character appearing in a brace comment
298loses its special meaning and is ignored.  A semicolon appearing inside of a
299brace comment loses its special meaning and is ignored.  Braces appearing
300inside of a semicolon comments lose their special meaning and are ignored.
301
302*** Export format representation of comments needs definition work.
303
304
3056: Escape mechanism
306
307There is a special escape mechanism for PGN data.  This mechanism is triggered
308by a percent sign character ("%") appearing in the first column of a line; the
309data on the rest of the line is ignored by publicly available PGN scanning
310software.  This escape convention is intended for the private use of software
311developers and researchers to embed non-PGN commands and data in PGN streams.
312
313A percent sign appearing in any other place other than the first position in a
314line does not trigger the escape mechanism.
315
316
3177: Tokens
318
319PGN character data is organized as tokens.  A token is a contiguous sequence of
320characters that represents a basic semantic unit.  Tokens may be separated from
321adjacent tokens by white space characters.  (White space characters include
322space, newline, and tab characters.)  Some tokens are self delimiting and do
323not require white space characters.
324
325A string token is a sequence of zero or more printing characters delimited by a
326pair of quote characters (ASCII decimal value 34, hexadecimal value 0x22).  An
327empty string is represented by two adjacent quotes.  (Note: an apostrophe is
328not a quote.)  A quote inside a string is represented by the backslash
329immediately followed by a quote.  A backslash inside a string is represented by
330two adjacent backslashes.  Strings are commonly used as tag pair values (see
331below).  Non-printing characters like newline and tab are not permitted inside
332of strings.  A string token is terminated by its closing quote.  Currently, a
333string is limited to a maximum of 255 characters of data.
334
335An integer token is a sequence of one or more decimal digit characters.  It is
336a special case of the more general "symbol" token class described below.
337Integer tokens are used to help represent move number indications (see below).
338An integer token is terminated just prior to the first non-symbol character
339following the integer digit sequence.
340
341A period character (".") is a token by itself.  It is used for move number
342indications (see below).  It is self terminating.
343
344An asterisk character ("*") is a token by itself.  It is used as one of the
345possible game termination markers (see below); it indicates an incomplete game
346or a game with an unknown or otherwise unavailable result.  It is self
347terminating.
348
349The left and right bracket characters ("[" and "]") are tokens.  They are used
350to delimit tag pairs (see below).  Both are self terminating.
351
352The left and right parenthesis characters ("(" and ")") are tokens.  They are
353used to delimit Recursive Annotation Variations (see below).  Both are self
354terminating.
355
356The left and right angle bracket characters ("<" and ">") are tokens.  They are
357reserved for future expansion.  Both are self terminating.
358
359A Numeric Annotation Glyph ("NAG", see below) is a token; it is composed of a
360dollar sign character ("$") immediately followed by one or more digit
361characters.  It is terminated just prior to the first non-digit character
362following the digit sequence.
363
364A symbol token starts with a letter or digit character and is immediately
365followed by a sequence of zero or more symbol continuation characters.  These
366continuation characters are letter characters ("A-Za-z"), digit characters
367("0-9"), the underscore ("_"), the plus sign ("+"), the octothorpe sign ("#"),
368the equal sign ("="), the colon (":"),  and the hyphen ("-").  Symbols are used
369for a variety of purposes.  All characters in a symbol are significant.  A
370symbol token is terminated just prior to the first non-symbol character
371following the symbol character sequence.  Currently, a symbol is limited to a
372maximum of 255 characters in length.
373
374
3758: Parsing games
376
377A PGN database file is a sequential collection of zero or more PGN games.  An
378empty file is a valid, although somewhat uninformative, PGN database.
379
380A PGN game is composed of two sections.  The first is the tag pair section and
381the second is the movetext section.  The tag pair section provides information
382that identifies the game by defining the values associated with a set of
383standard parameters.  The movetext section gives the usually enumerated and
384possibly annotated moves of the game along with the concluding game termination
385marker.  The chess moves themselves are represented using SAN (Standard
386Algebraic Notation), also described later in this document.
387
388
3898.1: Tag pair section
390
391The tag pair section is composed of a series of zero or more tag pairs.
392
393A tag pair is composed of four consecutive tokens: a left bracket token, a
394symbol token, a string token, and a right bracket token.  The symbol token is
395the tag name and the string token is the tag value associated with the tag
396name.  (There is a standard set of tag names and semantics described below.)
397The same tag name should not appear more than once in a tag pair section.
398
399A further restriction on tag names is that they are composed exclusively of
400letters, digits, and the underscore character.  This is done to facilitate
401mapping of tag names into key and attribute names for use with general purpose
402database programs.
403
404For PGN import format, there may be zero or more white space characters between
405any adjacent pair of tokens in a tag pair.
406
407For PGN export format, there are no white space characters between the left
408bracket and the tag name, there are no white space characters between the tag
409value and the right bracket, and there is a single space character between the
410tag name and the tag value.
411
412Tag names, like all symbols, are case sensitive.  All tag names used for
413archival storage begin with an upper case letter.
414
415PGN import format may have multiple tag pairs on the same line and may even
416have a tag pair spanning more than a single line.  Export format requires each
417tag pair to appear left justified on a line by itself; a single empty line
418follows the last tag pair.
419
420Some tag values may be composed of a sequence of items.  For example, a
421consultation game may have more than one player for a given side.  When this
422occurs, the single character ":" (colon) appears between adjacent items.
423Because of this use as an internal separator in strings, the colon should not
424otherwise appear in a string.
425
426The tag pair format is designed for expansion; initially only strings are
427allowed as tag pair values.  Tag value formats associated with the STR (Seven
428Tag Roster, see below) will not change; they will always be string values.
429However, there are long term plans to allow general list structures as tag
430values for non-STR tag pairs.  Use of these expanded tag values will likely be
431restricted to special research programs.  In all events, the top level
432structure of a tag pair remains the same: left bracket, tag name, tag value,
433and right bracket.
434
435
4368.1.1: Seven Tag Roster
437
438There is a set of tags defined for mandatory use for archival storage of PGN
439data.  This is the STR (Seven Tag Roster).  The interpretation of these tags is
440fixed as is the order in which they appear.  Although the definition and use of
441additional tag names and semantics is permitted and encouraged when needed, the
442STR is the common ground that all programs should follow for public data
443interchange.
444
445For import format, the order of tag pairs is not important.  For export format,
446the STR tag pairs appear before any other tag pairs.  (The STR tag pairs must
447also appear in order; this order is described below).  Also for export format,
448any additional tag pairs appear in ASCII order by tag name.
449
450The seven tag names of the STR are (in order):
451
4521) Event (the name of the tournament or match event)
453
4542) Site (the location of the event)
455
4563) Date (the starting date of the game)
457
4584) Round (the playing round ordinal of the game)
459
4605) White (the player of the white pieces)
461
4626) Black (the player of the black pieces)
463
4647) Result (the result of the game)
465
466A set of supplemental tag names is given later in this document.
467
468For PGN export format, a single blank line appears after the last of the tag
469pairs to conclude the tag pair section.  This helps simple scanning programs to
470quickly determine the end of the tag pair section and the beginning of the
471movetext section.
472
473
4748.1.1.1: The Event tag
475
476The Event tag value should be reasonably descriptive.  Abbreviations are to be
477avoided unless absolutely necessary.  A consistent event naming should be used
478to help facilitate database scanning.  If the name of the event is unknown, a
479single question mark should appear as the tag value.
480
481Examples:
482
483[Event "FIDE World Championship"]
484
485[Event "Moscow City Championship"]
486
487[Event "ACM North American Computer Championship"]
488
489[Event "Casual Game"]
490
491
4928.1.1.2: The Site tag
493
494The Site tag value should include city and region names along with a standard
495name for the country.  The use of the IOC (International Olympic Committee)
496three letter names is suggested for those countries where such codes are
497available.  If the site of the event is unknown, a single question mark should
498appear as the tag value.  A comma may be used to separate a city from a region.
499No comma is needed to separate a city or region from the IOC country code.  A
500later section of this document gives a list of three letter nation codes along
501with a few additions for "locations" not covered by the IOC.
502
503Examples:
504
505[Site "New York City, NY USA"]
506
507[Site "St. Petersburg RUS"]
508
509[Site "Riga LAT"]
510
511
5128.1.1.3: The Date tag
513
514The Date tag value gives the starting date for the game.  (Note: this is not
515necessarily the same as the starting date for the event.)  The date is given
516with respect to the local time of the site given in the Event tag.  The Date
517tag value field always uses a standard ten character format: "YYYY.MM.DD".  The
518first four characters are digits that give the year, the next character is a
519period, the next two characters are digits that give the month, the next
520character is a period, and the final two characters are digits that give the
521day of the month.  If the any of the digit fields are not known, then question
522marks are used in place of the digits.
523
524Examples:
525
526[Date "1992.08.31"]
527
528[Date "1993.??.??"]
529
530[Date "2001.01.01"]
531
532
5338.1.1.4: The Round tag
534
535The Round tag value gives the playing round for the game.  In a match
536competition, this value is the number of the game played.  If the use of a
537round number is inappropriate, then the field should be a single hyphen
538character.  If the round is unknown, a single question mark should appear as
539the tag value.
540
541Some organizers employ unusual round designations and have multipart playing
542rounds and sometimes even have conditional rounds.  In these cases, a multipart
543round identifier can be made from a sequence of integer round numbers separated
544by periods.  The leftmost integer represents the most significant round and
545succeeding integers represent round numbers in descending hierarchical order.
546
547Examples:
548
549[Round "1"]
550
551[Round "3.1"]
552
553[Round "4.1.2"]
554
555
5568.1.1.5: The White tag
557
558The White tag value is the name of the player or players of the white pieces.
559The names are given as they would appear in a telephone directory.  The family
560or last name appears first.  If a first name or first initial is available, it
561is separated from the family name by a comma and a space.  Finally, one or more
562middle initials may appear.  (Wherever a comma appears, the very next character
563should be a space.  Wherever an initial appears, the very next character should
564be a period.)  If the name is unknown, a single question mark should appear as
565the tag value.
566
567The intent is to allow meaningful ASCII sorting of the tag value that is
568independent of regional name formation customs.  If more than one person is
569playing the white pieces, the names are listed in alphabetical order and are
570separated by the colon character between adjacent entries.  A player who is
571also a computer program should have appropriate version information listed
572after the name of the program.
573
574The format used in the FIDE Rating Lists is appropriate for use for player name
575tags.
576
577Examples:
578
579[White "Tal, Mikhail N."]
580
581[White "van der Wiel, Johan"]
582
583[White "Acme Pawngrabber v.3.2"]
584
585[White "Fine, R."]
586
587
5888.1.1.6: The Black tag
589
590The Black tag value is the name of the player or players of the black pieces.
591The names are given here as they are for the White tag value.
592
593Examples:
594
595[Black "Lasker, Emmanuel"]
596
597[Black "Smyslov, Vasily V."]
598
599[Black "Smith, John Q.:Woodpusher 2000"]
600
601[Black "Morphy"]
602
603
6048.1.1.7: The Result tag
605
606The Result field value is the result of the game.  It is always exactly the
607same as the game termination marker that concludes the associated movetext.  It
608is always one of four possible values: "1-0" (White wins), "0-1" (Black wins),
609"1/2-1/2" (drawn game), and "*" (game still in progress, game abandoned, or
610result otherwise unknown).  Note that the digit zero is used in both of the
611first two cases; not the letter "O".
612
613All possible examples:
614
615[Result "0-1"]
616
617[Result "1-0"]
618
619[Result "1/2-1/2"]
620
621[Result "*"]
622
623
6248.2: Movetext section
625
626The movetext section is composed of chess moves, move number indications,
627optional annotations, and a single concluding game termination marker.
628
629Because illegal moves are not real chess moves, they are not permitted in PGN
630movetext.  They may appear in commentary, however.  One would hope that illegal
631moves are relatively rare in games worthy of recording.
632
633
6348.2.1: Movetext line justification
635
636In PGN import format, tokens in the movetext do not require any specific line
637justification.
638
639In PGN export format, tokens in the movetext are placed left justified on
640successive text lines each of which has less than 80 printing characters.  As
641many tokens as possible are placed on a line with the remainder appearing on
642successive lines.  A single space character appears between any two adjacent
643symbol tokens on the same line in the movetext.  As with the tag pair section,
644a single empty line follows the last line of data to conclude the movetext
645section.
646
647Neither the first or the last character on an export format PGN line is a
648space.  (This may change in the case of commentary; this area is currently
649under development.)
650
651
6528.2.2: Movetext move number indications
653
654A move number indication is composed of one or more adjacent digits (an integer
655token) followed by zero or more periods.  The integer portion of the indication
656gives the move number of the immediately following white move (if present) and
657also the immediately following black move (if present).
658
659
6608.2.2.1: Import format move number indications
661
662PGN import format does not require move number indications.  It does not
663prohibit superfluous move number indications anywhere in the movetext as long
664as the move numbers are correct.
665
666PGN import format move number indications may have zero or more period
667characters following the digit sequence that gives the move number; one or more
668white space characters may appear between the digit sequence and the period(s).
669
670
6718.2.2.2: Export format move number indications
672
673There are two export format move number indication formats, one for use
674appearing immediately before a white move element and one for use appearing
675immediately before a black move element.  A white move number indication is
676formed from the integer giving the fullmove number with a single period
677character appended.  A black move number indication is formed from the integer
678giving the fullmove number with three period characters appended.
679
680All white move elements have a preceding move number indication.  A black move
681element has a preceding move number indication only in two cases: first, if
682there is intervening annotation or commentary between the black move and the
683previous white move; and second, if there is no previous white move in the
684special case where a game starts from a position where Black is the active
685player.
686
687There are no other cases where move number indications appear in PGN export
688format.
689
690
6918.2.3: Movetext SAN (Standard Algebraic Notation)
692
693SAN (Standard Algebraic Notation) is a representation standard for chess moves
694using the ASCII Latin alphabet.
695
696Examples of SAN recorded games are found throughout most modern chess
697publications.  SAN as presented in this document uses English language single
698character abbreviations for chess pieces, although this is easily changed in
699the source.  English is chosen over other languages because it appears to be
700the most widely recognized.
701
702An alternative to SAN is FAN (Figurine Algebraic Notation).  FAN uses miniature
703piece icons instead of single letter piece abbreviations.  The two notations
704are otherwise identical.
705
706
7078.2.3.1: Square identification
708
709SAN identifies each of the sixty four squares on the chessboard with a unique
710two character name.  The first character of a square identifier is the file of
711the square; a file is a column of eight squares designated by a single lower
712case letter from "a" (leftmost or queenside) up to and including "h" (rightmost
713or kingside).  The second character of a square identifier is the rank of the
714square; a rank is a row of eight squares designated by a single digit from "1"
715(bottom side [White's first rank]) up to and including "8" (top side [Black's
716first rank]).  The initial squares of some pieces are: white queen rook at a1,
717white king at e1, black queen knight pawn at b7, and black king rook at h8.
718
719
7208.2.3.2: Piece identification
721
722SAN identifies each piece by a single upper case letter.  The standard English
723values: pawn = "P", knight = "N", bishop = "B", rook = "R", queen = "Q", and
724king = "K".
725
726The letter code for a pawn is not used for SAN moves in PGN export format
727movetext.  However, some PGN import software disambiguation code may allow for
728the appearance of pawn letter codes.  Also, pawn and other piece letter codes
729are needed for use in some tag pair and annotation constructs.
730
731It is admittedly a bit chauvinistic to select English piece letters over those
732from other languages.  There is a slight justification in that English is a de
733facto universal second language among most chessplayers and program users.  It
734is probably the best that can be done for now.  A later section of this
735document gives alternative piece letters, but these should be used only for
736local presentation software and not for archival storage or for dynamic
737interchange among programs.
738
739
7408.2.3.3: Basic SAN move construction
741
742A basic SAN move is given by listing the moving piece letter (omitted for
743pawns) followed by the destination square.  Capture moves are denoted by the
744lower case letter "x" immediately prior to the destination square; pawn
745captures include the file letter of the originating square of the capturing
746pawn immediately prior to the "x" character.
747
748SAN kingside castling is indicated by the sequence "O-O"; queenside castling is
749indicated by the sequence "O-O-O".  Note that the upper case letter "O" is
750used, not the digit zero.  The use of a zero character is not only incompatible
751with traditional text practices, but it can also confuse parsing algorithms
752which also have to understand about move numbers and game termination markers.
753Also note that the use of the letter "O" is consistent with the practice of
754having all chess move symbols start with a letter; also, it follows the
755convention that all non-pwn move symbols start with an upper case letter.
756
757En passant captures do not have any special notation; they are formed as if the
758captured pawn were on the capturing pawn's destination square.  Pawn promotions
759are denoted by the equal sign "=" immediately following the destination square
760with a promoted piece letter (indicating one of knight, bishop, rook, or queen)
761immediately following the equal sign.  As above, the piece letter is in upper
762case.
763
764
7658.2.3.4: Disambiguation
766
767In the case of ambiguities (multiple pieces of the same type moving to the same
768square), the first appropriate disambiguating step of the three following steps
769is taken:
770
771First, if the moving pieces can be distinguished by their originating files,
772the originating file letter of the moving piece is inserted immediately after
773the moving piece letter.
774
775Second (when the first step fails), if the moving pieces can be distinguished
776by their originating ranks, the originating rank digit of the moving piece is
777inserted immediately after the moving piece letter.
778
779Third (when both the first and the second steps fail), the two character square
780coordinate of the originating square of the moving piece is inserted
781immediately after the moving piece letter.
782
783Note that the above disambiguation is needed only to distinguish among moves of
784the same piece type to the same square; it is not used to distinguish among
785attacks of the same piece type to the same square.  An example of this would be
786a position with two white knights, one on square c3 and one on square g1 and a
787vacant square e2 with White to move.  Both knights attack square e2, and if
788both could legally move there, then a file disambiguation is needed; the
789(nonchecking) knight moves would be "Nce2" and "Nge2".  However, if the white
790king were at square e1 and a black bishop were at square b4 with a vacant
791square d2 (thus an absolute pin of the white knight at square c3), then only
792one white knight (the one at square g1) could move to square e2: "Ne2".
793
794
7958.2.3.5: Check and checkmate indication characters
796
797If the move is a checking move, the plus sign "+" is appended as a suffix to
798the basic SAN move notation; if the move is a checkmating move, the octothorpe
799sign "#" is appended instead.
800
801Neither the appearance nor the absence of either a check or checkmating
802indicator is used for disambiguation purposes.  This means that if two (or
803more) pieces of the same type can move to the same square the differences in
804checking status of the moves does not allieviate the need for the standard rank
805and file disabiguation described above.  (Note that a difference in checking
806status for the above may occur only in the case of a discovered check.)
807
808Neither the checking or checkmating indicators are considered annotation as
809they do not communicate subjective information.  Therefore, they are
810qualitatively different from move suffix annotations like "!" and "?".
811Subjective move annotations are handled using Numeric Annotation Glyphs as
812described in a later section of this document.
813
814There are no special markings used for double checks or discovered checks.
815
816There are no special markings used for drawing moves.
817
818
8198.2.3.6: SAN move length
820
821SAN moves can be as short as two characters (e.g., "d4"), or as long as seven
822characters (e.g., "Qa6xb7#", "fxg1=Q+").  The average SAN move length seen in
823realistic games is probably just fractionally longer than three characters.  If
824the SAN rules seem complicated, be assured that the earlier notation systems of
825LEN (Long English Notation) and EDN (English Descriptive Notation) are much
826more complex, and that LAN (Long Algebraic Notation, the predecessor of SAN) is
827unnecessarily bulky.
828
829
8308.2.3.7: Import and export SAN
831
832PGN export format always uses the above canonical SAN to represent moves in the
833movetext section of a PGN game.  Import format is somewhat more relaxed and it
834makes allowances for moves that do not conform exactly to the canonical format.
835However, these allowances may differ among different PGN reader programs.  Only
836data appearing in export format is in all cases guaranteed to be importable
837into all PGN readers.
838
839There are a number of suggested guidelines for use with implementing PGN reader
840software for permitting non-canonical SAN move representation.  The idea is to
841have a PGN reader apply various transformations to attempt to discover the move
842that is represented by non-canonical input.  Some suggested transformations
843include: letter case remapping, capture indicator insertion, check indicator
844insertion, and checkmate indicator insertion.
845
846
8478.2.3.8: SAN move suffix annotations
848
849Import format PGN allows for the use of traditional suffix annotations for
850moves.  There are exactly six such annotations available: "!", "?", "!!", "!?",
851"?!", and "??".  At most one such suffix annotation may appear per move, and if
852present, it is always the last part of the move symbol.
853
854When exported, a move suffix annotation is translated into the corresponding
855Numeric Annotation Glyph as described in a later section of this document.  For
856example, if the single move symbol "Qxa8?" appears in an import format PGN
857movetext, it would be replaced with the two adjacent symbols "Qxa8 $2".
858
859
8608.2.4: Movetext NAG (Numeric Annotation Glyph)
861
862An NAG (Numeric Annotation Glyph) is a movetext element that is used to
863indicate a simple annotation in a language independent manner.  An NAG is
864formed from a dollar sign ("$") with a non-negative decimal integer suffix.
865The non-negative integer must be from zero to 255 in value.
866
867
8688.2.5: Movetext RAV (Recursive Annotation Variation)
869
870An RAV (Recursive Annotation Variation) is a sequence of movetext containing
871one or more moves enclosed in parentheses.  An RAV is used to represent an
872alternative variation.  The alternate move sequence given by an RAV is one that
873may be legally played by first unplaying the move that appears immediately
874prior to the RAV.  Because the RAV is a recursive construct, it may be nested.
875
876*** The specification for import/export representation of RAV elements needs
877further development.
878
879
8808.2.6: Game Termination Markers
881
882Each movetext section has exactly one game termination marker; the marker
883always occurs as the last element in the movetext.  The game termination marker
884is a symbol that is one of the following four values: "1-0" (White wins), "0-1"
885(Black wins), "1/2-1/2" (drawn game), and "*" (game in progress, result
886unknown, or game abandoned).  Note that the digit zero is used in the above;
887not the upper case letter "O".  The game termination marker appearing in the
888movetext of a game must match the value of the game's Result tag pair.  (While
889the marker appears as a string in the Result tag, it appears as a symbol
890without quotes in the movetext.)
891
892
8939: Supplemental tag names
894
895The following tag names and their associated semantics are recommended for use
896for information not contained in the Seven Tag Roster.
897
898
8999.1: Player related information
900
901Note that if there is more than one player field in an instance of a player
902(White or Black) tag, then there will be corresponding multiple fields in any
903of the following tags.  For example, if the White tag has the three field value
904"Jones:Smith:Zacharias" (a consultation game), then the WhiteTitle tag could
905have a value of "IM:-:GM" if Jones was an International Master, Smith was
906untitled, and Zacharias was a Grandmaster.
907
908
9099.1.1: Tags: WhiteTitle, BlackTitle
910
911These use string values such as "FM", "IM", and "GM"; these tags are used only
912for the standard abbreviations for FIDE titles.  A value of "-" is used for an
913untitled player.
914
915
9169.1.2: Tags: WhiteElo, BlackElo
917
918These tags use integer values; these are used for FIDE Elo ratings.  A value of
919"-" is used for an unrated player.
920
921
9229.1.3: Tags: WhiteUSCF, BlackUSCF
923
924These tags use integer values; these are used for USCF (United States Chess
925Federation) ratings.  Similar tag names can be constructed for other rating
926agencies.
927
928
9299.1.4: Tags: WhiteNA, BlackNA
930
931These tags use string values; these are the e-mail or network addresses of the
932players.  A value of "-" is used for a player without an electronic address.
933
934
9359.1.5: Tags: WhiteType, BlackType
936
937These tags use string values; these describe the player types.  The value
938"human" should be used for a person while the value "program" should be used
939for algorithmic (computer) players.
940
941
9429.2: Event related information
943
944The following tags are used for providing additional information about the
945event.
946
947
9489.2.1: Tag: EventDate
949
950This uses a date value, similar to the Date tag field, that gives the starting
951date of the Event.
952
953
9549.2.2: Tag: EventSponsor
955
956This uses a string value giving the name of the sponsor of the event.
957
958
9599.2.3: Tag: Section
960
961This uses a string; this is used for the playing section of a tournament (e.g.,
962"Open" or "Reserve").
963
964
9659.2.4: Tag: Stage
966
967This uses a string; this is used for the stage of a multistage event (e.g.,
968"Preliminary" or "Semifinal").
969
970
9719.2.5: Tag: Board
972
973This uses an integer; this identifies the board number in a team event and also
974in a simultaneous exhibition.
975
976
9779.3: Opening information (locale specific)
978
979The following tag pairs are used for traditional opening names.  The associated
980tag values will vary according to the local language in use.
981
982
9839.3.1: Tag: Opening
984
985This uses a string; this is used for the traditional opening name.  This will
986vary by locale.  This tag pair is associated with the use of the EPD opcode
987"v0" described in a later section of this document.
988
989
9909.3.2: Tag: Variation
991
992This uses a string; this is used to further refine the Opening tag.  This will
993vary by locale.  This tag pair is associated with the use of the EPD opcode
994"v1" described in a later section of this document.
995
996
9979.3.3: Tag: SubVariation
998
999This uses a string; this is used to further refine the Variation tag.  This
1000will vary by locale.  This tag pair is associated with the use of the EPD
1001opcode "v2" described in a later section of this document.
1002
1003
10049.4: Opening information (third party vendors)
1005
1006The following tag pairs are used for representing opening identification
1007according to various third party vendors and organizations.  References to
1008these organizations does not imply any endorsement of them or any endorsement
1009by them.
1010
1011
10129.4.1: Tag: ECO
1013
1014This uses a string of either the form "XDD" or the form "XDD/DD" where the "X"
1015is a letter from "A" to "E" and the "D" positions are digits; this is used for
1016an opening designation from the five volume _Encyclopedia of Chess Openings_.
1017This tag pair is associated with the use of the EPD opcode "eco" described in a
1018later section of this document.
1019
1020
10219.4.2: Tag: NIC
1022
1023This uses a string; this is used for an opening designation from the _New in
1024Chess_ database.  This tag pair is associated with the use of the EPD opcode
1025"nic" described in a later section of this document.
1026
1027
10289.5: Time and date related information
1029
1030The following tags assist with further refinement of the time and data
1031information associated with a game.
1032
1033
10349.5.1: Tag: Time
1035
1036This uses a time-of-day value in the form "HH:MM:SS"; similar to the Date tag
1037except that it denotes the local clock time (hours, minutes, and seconds) of
1038the start of the game.  Note that colons, not periods, are used for field
1039separators for the Time tag value.  The value is taken from the local time
1040corresponding to the location given in the Site tag pair.
1041
1042
10439.5.2: Tag: UTCTime
1044
1045This tag is similar to the Time tag except that the time is given according to
1046the Universal Coordinated Time standard.
1047
1048
10499.5.3: Tag:; UTCDate
1050
1051This tag is similar to the Date tag except that the date is given according to
1052the Universal Coordinated Time standard.
1053
1054
10559.6: Time control
1056
1057The follwing tag is used to help describe the time control used with the game.
1058
1059
10609.6.1: Tag: TimeControl
1061
1062This uses a list of one or more time control fields.  Each field contains a
1063descriptor for each time control period; if more than one descriptor is present
1064then they are separated by the colon character (":").  The descriptors appear
1065in the order in which they are used in the game.  The last field appearing is
1066considered to be implicitly repeated for further control periods as needed.
1067
1068There are six kinds of TimeControl fields.
1069
1070The first kind is a single question mark ("?") which means that the time
1071control mode is unknown.  When used, it is usually the only descriptor present.
1072
1073The second kind is a single hyphen ("-") which means that there was no time
1074control mode in use.  When used, it is usually the only descriptor present.
1075
1076The third Time control field kind is formed as two positive integers separated
1077by a solidus ("/") character.  The first integer is the number of moves in the
1078period and the second is the number of seconds in the period.  Thus, a time
1079control period of 40 moves in 2 1/2 hours would be represented as "40/9000".
1080
1081The fourth TimeControl field kind is used for a "sudden death" control period.
1082It should only be used for the last descriptor in a TimeControl tag value.  It
1083is sometimes the only descriptor present.  The format consists of a single
1084integer that gives the number of seconds in the period.  Thus, a blitz game
1085would be represented with a TimeControl tag value of "300".
1086
1087The fifth TimeControl field kind is used for an "incremental" control period.
1088It should only be used for the last descriptor in a TimeControl tag value and
1089is usually the only descriptor in the value.  The format consists of two
1090positive integers separated by a plus sign ("+") character.  The first integer
1091gives the minimum number of seconds allocated for the period and the second
1092integer gives the number of extra seconds added after each move is made.  So,
1093an incremental time control of 90 minutes plus one extra minute per move would
1094be given by "4500+60" in the TimeControl tag value.
1095
1096The sixth TimeControl field kind is used for a "sandclock" or "hourglass"
1097control period.  It should only be used for the last descriptor in a
1098TimeControl tag value and is usually the only descriptor in the value.  The
1099format consists of an asterisk ("*") immediately followed by a positive
1100integer.  The integer gives the total number of seconds in the sandclock
1101period.  The time control is implemented as if a sandclock were set at the
1102start of the period with an equal amount of sand in each of the two chambers
1103and the players invert the sandclock after each move with a time forfeit
1104indicated by an empty upper chamber.  Electronic implementation of a physical
1105sandclock may be used.  An example sandclock specification for a common three
1106minute egg timer sandclock would have a tag value of "*180".
1107
1108Additional TimeControl field kinds will be defined as necessary.
1109
1110
11119.7: Alternative starting positions
1112
1113There are two tags defined for assistance with describing games that did not
1114start from the usual initial array.
1115
1116
11179.7.1: Tag: SetUp
1118
1119This tag takes an integer that denotes the "set-up" status of the game.  A
1120value of "0" indicates that the game has started from the usual initial array.
1121A value of "1" indicates that the game started from a set-up position; this
1122position is given in the "FEN" tag pair.  This tag must appear for a game
1123starting with a set-up position.  If it appears with a tag value of "1", a FEN
1124tag pair must also appear.
1125
1126
11279.7.2: Tag: FEN
1128
1129This tag uses a string that gives the Forsyth-Edwards Notation for the starting
1130position used in the game.  FEN is described in a later section of this
1131document.  If a SetUp tag appears with a tag value of "1", the FEN tag pair is
1132also required.
1133
1134
11359.8: Game conclusion
1136
1137There is a single tag that discusses the conclusion of the game.
1138
1139
11409.8.1: Tag: Termination
1141
1142This takes a string that describes the reason for the conclusion of the game.
1143While the Result tag gives the result of the game, it does not provide any
1144extra information and so the Termination tag is defined for this purpose.
1145
1146Strings that may appear as Termination tag values:
1147
1148* "abandoned": abandoned game.
1149
1150* "adjudication": result due to third party adjudication process.
1151
1152* "death": losing player called to greater things, one hopes.
1153
1154* "emergency": game concluded due to unforeseen circumstances.
1155
1156* "normal": game terminated in a normal fashion.
1157
1158* "rules infraction": administrative forfeit due to losing player's failure to
1159observe either the Laws of Chess or the event regulations.
1160
1161* "time forfeit": loss due to losing player's failure to meet time control
1162requirements.
1163
1164* "unterminated": game not terminated.
1165
1166
11679.9: Miscellaneous
1168
1169These are tags that can be briefly described and that doon't fit well inother
1170sections.
1171
1172
11739.9.1: Tag: Annotator
1174
1175This tag uses a name or names in the format of the player name tags; this
1176identifies the annotator or annotators of the game.
1177
1178
11799.9.2: Tag: Mode
1180
1181This uses a string that gives the playing mode of the game.  Examples: "OTB"
1182(over the board), "PM" (paper mail), "EM" (electronic mail), "ICS" (Internet
1183Chess Server), and "TC" (general telecommunication).
1184
1185
11869.9.3: Tag: PlyCount
1187
1188This tag takes a single integer that gives the number of ply (moves) in the
1189game.
1190
1191
119210: Numeric Annotation Glyphs
1193
1194NAG zero is used for a null annotation; it is provided for the convenience of
1195software designers as a placeholder value and should probably not be used in
1196external PGN data.
1197
1198NAGs with values from 1 to 9 annotate the move just played.
1199
1200NAGs with values from 10 to 135 modify the current position.
1201
1202NAGs with values from 136 to 139 describe time pressure.
1203
1204Other NAG values are reserved for future definition.
1205
1206Note: the number assignments listed below should be considered preliminary in
1207nature; they are likely to be changed as a result of reviewer feedback.
1208
1209NAG    Interpretation
1210---    --------------
1211  0    null annotation
1212  1    good move (traditional "!")
1213  2    poor move (traditional "?")
1214  3    very good move (traditional "!!")
1215  4    very poor move (traditional "??")
1216  5    speculative move (traditional "!?")
1217  6    questionable move (traditional "?!")
1218  7    forced move (all others lose quickly)
1219  8    singular move (no reasonable alternatives)
1220  9    worst move
1221 10    drawish position
1222 11    equal chances, quiet position
1223 12    equal chances, active position
1224 13    unclear position
1225 14    White has a slight advantage
1226 15    Black has a slight advantage
1227 16    White has a moderate advantage
1228 17    Black has a moderate advantage
1229 18    White has a decisive advantage
1230 19    Black has a decisive advantage
1231 20    White has a crushing advantage (Black should resign)
1232 21    Black has a crushing advantage (White should resign)
1233 22    White is in zugzwang
1234 23    Black is in zugzwang
1235 24    White has a slight space advantage
1236 25    Black has a slight space advantage
1237 26    White has a moderate space advantage
1238 27    Black has a moderate space advantage
1239 28    White has a decisive space advantage
1240 29    Black has a decisive space advantage
1241 30    White has a slight time (development) advantage
1242 31    Black has a slight time (development) advantage
1243 32    White has a moderate time (development) advantage
1244 33    Black has a moderate time (development) advantage
1245 34    White has a decisive time (development) advantage
1246 35    Black has a decisive time (development) advantage
1247 36    White has the initiative
1248 37    Black has the initiative
1249 38    White has a lasting initiative
1250 39    Black has a lasting initiative
1251 40    White has the attack
1252 41    Black has the attack
1253 42    White has insufficient compensation for material deficit
1254 43    Black has insufficient compensation for material deficit
1255 44    White has sufficient compensation for material deficit
1256 45    Black has sufficient compensation for material deficit
1257 46    White has more than adequate compensation for material deficit
1258 47    Black has more than adequate compensation for material deficit
1259 48    White has a slight center control advantage
1260 49    Black has a slight center control advantage
1261 50    White has a moderate center control advantage
1262 51    Black has a moderate center control advantage
1263 52    White has a decisive center control advantage
1264 53    Black has a decisive center control advantage
1265 54    White has a slight kingside control advantage
1266 55    Black has a slight kingside control advantage
1267 56    White has a moderate kingside control advantage
1268 57    Black has a moderate kingside control advantage
1269 58    White has a decisive kingside control advantage
1270 59    Black has a decisive kingside control advantage
1271 60    White has a slight queenside control advantage
1272 61    Black has a slight queenside control advantage
1273 62    White has a moderate queenside control advantage
1274 63    Black has a moderate queenside control advantage
1275 64    White has a decisive queenside control advantage
1276 65    Black has a decisive queenside control advantage
1277 66    White has a vulnerable first rank
1278 67    Black has a vulnerable first rank
1279 68    White has a well protected first rank
1280 69    Black has a well protected first rank
1281 70    White has a poorly protected king
1282 71    Black has a poorly protected king
1283 72    White has a well protected king
1284 73    Black has a well protected king
1285 74    White has a poorly placed king
1286 75    Black has a poorly placed king
1287 76    White has a well placed king
1288 77    Black has a well placed king
1289 78    White has a very weak pawn structure
1290 79    Black has a very weak pawn structure
1291 80    White has a moderately weak pawn structure
1292 81    Black has a moderately weak pawn structure
1293 82    White has a moderately strong pawn structure
1294 83    Black has a moderately strong pawn structure
1295 84    White has a very strong pawn structure
1296 85    Black has a very strong pawn structure
1297 86    White has poor knight placement
1298 87    Black has poor knight placement
1299 88    White has good knight placement
1300 89    Black has good knight placement
1301 90    White has poor bishop placement
1302 91    Black has poor bishop placement
1303 92    White has good bishop placement
1304 93    Black has good bishop placement
1305 84    White has poor rook placement
1306 85    Black has poor rook placement
1307 86    White has good rook placement
1308 87    Black has good rook placement
1309 98    White has poor queen placement
1310 99    Black has poor queen placement
1311100    White has good queen placement
1312101    Black has good queen placement
1313102    White has poor piece coordination
1314103    Black has poor piece coordination
1315104    White has good piece coordination
1316105    Black has good piece coordination
1317106    White has played the opening very poorly
1318107    Black has played the opening very poorly
1319108    White has played the opening poorly
1320109    Black has played the opening poorly
1321110    White has played the opening well
1322111    Black has played the opening well
1323112    White has played the opening very well
1324113    Black has played the opening very well
1325114    White has played the middlegame very poorly
1326115    Black has played the middlegame very poorly
1327116    White has played the middlegame poorly
1328117    Black has played the middlegame poorly
1329118    White has played the middlegame well
1330119    Black has played the middlegame well
1331120    White has played the middlegame very well
1332121    Black has played the middlegame very well
1333122    White has played the ending very poorly
1334123    Black has played the ending very poorly
1335124    White has played the ending poorly
1336125    Black has played the ending poorly
1337126    White has played the ending well
1338127    Black has played the ending well
1339128    White has played the ending very well
1340129    Black has played the ending very well
1341130    White has slight counterplay
1342131    Black has slight counterplay
1343132    White has moderate counterplay
1344133    Black has moderate counterplay
1345134    White has decisive counterplay
1346135    Black has decisive counterplay
1347136    White has moderate time control pressure
1348137    Black has moderate time control pressure
1349138    White has severe time control pressure
1350139    Black has severe time control pressure
1351
1352
135311: File names and directories
1354
1355File names chosen for PGN data should be both informative and portable.  The
1356directory names and arrangements should also be chosen for the same reasons and
1357also for ease of navigation.
1358
1359Some of suggested file and directory names may be difficult or impossible to
1360represent on certain computing systems.  Use of appropriate conversion customs
1361is encouraged.
1362
1363
136411.1: File name suffix for PGN data
1365
1366The use of the file suffix ".pgn" is encouraged for ASCII text files containing
1367PGN data.
1368
1369
137011.2: File name formation for PGN data for a specific player
1371
1372PGN games for a specific player should have a file name consisting of the
1373player's last name followed by the ".pgn" suffix.
1374
1375
137611.3: File name formation for PGN data for a specific event
1377
1378PGN games for a specific event should have a file name consisting of the
1379event's name followed by the ".pgn" suffix.
1380
1381
138211.4: File name formation for PGN data for chronologically ordered games
1383
1384PGN data files used for chronologically ordered (oldest first) archives use
1385date information as file name root strings.  A file containing all the PGN
1386games for a given year would have an eight character name in the format
1387"YYYY.pgn".  A file containing PGN data for a given month would have a ten
1388character name in the format "YYYYMM.pgn".  Finally, a file for PGN games for a
1389single day would have a twelve character name in the format "YYYYMMDD.pgn".
1390Large files are split into smaller files as needed.
1391
1392As game files are commonly arranged by chronological order, games with missing
1393or incomplete Date tag pair data are to be avoided.  Any question mark
1394characters in a Date tag value will be treated as zero digits for collation
1395within a file and also for file naming.
1396
1397Large quantities of PGN data arranged by chronological order should be
1398organized into hierarchical directories.  A directory containing all PGN data
1399for a given year would have a four character name in the format "YYYY";
1400directories containing PGN files for a given month would have a six character
1401name in the format "YYYYMM".
1402
1403
140411.5: Suggested directory tree organization
1405
1406A suggested directory arrangement for ftp sites and CD-ROM distributions:
1407
1408* PGN: master directory of the PGN subtree (pub/chess/Game-Databases/PGN)
1409
1410* PGN/Events: directory of PGN files, each for a specific event
1411
1412* PGN/Events/News: news and status of the event collection
1413
1414* PGN/Events/ReadMe: brief description of the local directory contents
1415
1416* PGN/MGR: directory of the Master Games Repository subtree
1417
1418* PGN/MGR/News: news and status of the entire PGN/MGR subtree
1419
1420* PGN/MGR/ReadMe: brief description of the local directory contents
1421
1422* PGN/MGR/YYYY: directory of games or subtrees for the year YYYY
1423
1424* PGN/MGR/YYYY/ReadMe: description of local directory for year YYYY
1425
1426* PGN/MGR/YYYY/News: news and status for year YYYY data
1427
1428* PGN/News: news and status of the entire PGN subtree
1429
1430* PGN/Players: directory of PGN files, each for a specific player
1431
1432* PGN/Players/News: news and status of the player collection
1433
1434* PGN/Players/ReadMe: brief description of the local directory contents
1435
1436* PGN/ReadMe: brief description of the local directory contents
1437
1438* PGN/Standard: the PGN standard (this document)
1439
1440* PGN/Tools: software utilities that access PGN data
1441
1442
144312: PGN collating sequence
1444
1445There is a standard sorting order for PGN games within a file.  This collation
1446is based on eight keys; these are the seven tag values of the STR and also the
1447movetext itself.
1448
1449The first (most important, primary key) is the Date tag.  Earlier dated games
1450appear prior to games played at a later date.  This field is sorted by
1451ascending numeric value first with the year, then the month, and finally the
1452day of the month.  Query characters used for unknown date digit values will be
1453treated as zero digit characters for ordering comparison.
1454
1455The second key is the Event tag.  This is sorted in ascending ASCII order.
1456
1457The third key is the Site tag.  This is sorted in ascending ASCII order.
1458
1459The fourth key is the Round tag.  This is sorted in ascending numeric order
1460based on the value of the integer used to denote the playing round.  A query or
1461hyphen used for the round is ordered before any integer value.  A query
1462character is ordered before a hyphen character.
1463
1464The fifth key is the White tag.  This is sorted in ascending ASCII order.
1465
1466The sixth key is the Black tag.  This is sorted in ascending ASCII order.
1467
1468The seventh key is the Result tag.  This is sorted in ascending ASCII order.
1469
1470The eighth key is the movetext itself.  This is sorted in ascending ASCII order
1471with the entire text including spaces and newline characters.
1472
1473
147413: PGN software
1475
1476This section describes some PGN software that is either currently available or
1477expected to be available in the near future.  The entries are presented in
1478rough chronological order of their being made known to the PGN standard
1479coordinator.  Authors of PGN capable software are encouraged to contact the
1480coordinator (e-mail address listed near the start of this document) so that the
1481information may be included here in this section.
1482
1483In addition to the PGN standard, there are two more chess standards of interest
1484to the chess software community.  These are the FEN standard (Forsyth-Edwards
1485Notation) for position notation and the EPD standard (Extended Position
1486Description) for comprehensive position description for automated interprogram
1487processing.  These are described in a later section of this document.
1488
1489Some PGN software is freeware and can be gotten from ftp sites and other
1490sources.  Other PGN software is payware and appears as part of commercial
1491chessplaying programs and chess database managers.  Those who are interested in
1492the propagation of the PGN standard are encouraged to support manufacturers of
1493chess software that use the standard.  If a particular vendor does not offer
1494PGN compatibility, it is likely that a few letters to them along with a copy of
1495this specification may help them decide to include PGN support in their next
1496release.
1497
1498The staff at the University of Oklahoma at Norman (USA) have graciously
1499provided an ftp site (chess.uoknor.edu) for the storage of chess related data
1500and programs.  Because file names change over time, those accessing the site
1501are encouraged to first retrieve the file "pub/chess/ls-lR.gz" for a current
1502listing.  A scan of this listing will also help locate versions of PGN programs
1503for machine types and operating systems other than those listed below.  Further
1504information about this archive can be gotten from its administrator, Chris
1505Petroff (chris@uoknor.edu).
1506
1507For European users, the kind staff at the University of Hamburg (Germany) have
1508provided the ftp site ftp.math.uni-hamburg.de; this carries a daily mirror of
1509the pub/chess directory at the chess.uoknor.edu site.
1510
1511
151213.1: The SAN Kit
1513
1514The "SAN Kit" is an ANSI C source chess programming toolkit available for free
1515from the ftp site chess.uoknor.edu in the directory pub/chess/Unix as the file
1516"SAN.tar.gz" (a gzip tar archive).  This kit contains code for PGN import and
1517export and can be used to "regularize" PGN data into reduced export format by
1518use of its "tfgg" command.  The SAN Kit also supports FEN I/O.  Code from this
1519kit is freely redistributable for anyone as long as future distribution is
1520unhindered for everyone.  The SAN Kit is undergoing continuous development,
1521although dates of future deliveries are quite difficult to predict and releases
1522sometimes appear months apart.  Suggestions and comments should be directed to
1523its author, Steven J. Edwards (sje@world.std.com).
1524
1525
152613.2: pgnRead
1527
1528The program "pgnRead" runs under MS Windows 3.1 and provides an interactive
1529graphical user interface for scanning PGN data files.  This program includes a
1530colorful figurine chessboard display and scrolling controls for game and game
1531text selection.  It is available from the chess.uoknor.edu ftp site in the
1532pub/chess/DOS directory; several versions are available with names of the form
1533"pgnrd**.exe"; the latest at this writing is "PGNRD130.EXE".  Suggestions and
1534comments should be directed to its author, Keith Fuller (keithfx@aol.com).
1535
1536
153713.3: mail2pgn/GIICS
1538
1539The program "mail2pgn" produces a PGN version of chess game data generated by
1540the ICS (Internet Chess Server).  It can be found at the chess.uoknor.edu ftp
1541site in the pub/chess/DOS directory as the file "mail2pgn.zip"  A C language
1542version is in the directory pub/chess/Unix as the file "mail2pgn.c".
1543Suggestions and comments should be directed to its author, John Aronson
1544(aronson@helios.ece.arizona.edu).  This code has been reportedly incorporated
1545into the GIICS (Graphical Interface for the ICS); suggestions and comments
1546should be directed to its author, Tony Acero (ace3@midway.uchicago.edu).
1547
1548There is a report that mail2pgn has been superseded by the newer program
1549"MV2PGN" described below.
1550
1551
155213.4: XBoard
1553
1554"XBoard" is a comprehensive chess utility running under the X Window System
1555that provides a graphical user interface in a portable manner.  A new version
1556now handles PGN data.  It is available from the chess.uoknor.edu ftp site in
1557the pub/chess/X directory as the file "xboard-3.0.pl9.tar.gz".  Suggestions and
1558comments should be directed to its author, Tim Mann (mann@src.dec.com).
1559
1560
156113.5: cupgn
1562
1563The program "cupgn" converts game data stored in the ChessBase format into PGN.
1564It is available from the chess.uoknor.edu ftp site in the
1565pub/chess/Game-Databases/CBUFF directory as the file "cupgn.tar.gz".  Another
1566version is in the directory pub/chess/DOS as the file "cupgn120.exe".
1567Suggestions and comments should be directed to its author, Anjo Anjewierden
1568(anjo@swi.psy.uva.nl).
1569
1570
157113.6: Zarkov
1572
1573The current version (3.0) of the commercial chessplaying program "Zarkov" can
1574read and write games using PGN.  This program can also use the EPD standard for
1575communication with other EPD capable programs.  Historically, Zarkov is the
1576very first program to use EPD.  Suggestions and comments should be directed to
1577its author, John Stanback (jhs@icbdfcs1.fc.hp.com).
1578
1579A vendor for North America is:
1580
1581    International Chess Enterprises
1582    P.O. Box 19457
1583    Seattle, WA 98109
1584    USA
1585    (800) 262-4277
1586
1587A vendor for Europe is:
1588
1589    Gambit-Soft
1590    Feckenhauser Strasse 27
1591    D-78628 Rottweil
1592    GERMANY
1593    49-741-21573
1594
1595
159613.7: Chess Assistant
1597
1598The upcoming version of the multifunction commercial database program "Chess
1599Assistant" will be able to use the PGN standard as an import and export option.
1600There is a report of a freeware program, "PGN2CA", that will convert PGN
1601databases into Chess Assistant format.  For more information, the contact is
1602Victor Zakharov, one of the members of the Chess Assistant development team
1603(VICTOR@ldis.cs.msu.su).
1604
1605A vendor for North America is:
1606
1607    International Chess Enterprises
1608    P.O. Box 19457
1609    Seattle, WA 98109
1610    USA
1611    (800) 262-4277
1612
1613
161413.8: BOOKUP
1615
1616The MS-DOS edition of the multifunction commercial program BOOKUP, version 8.1,
1617is able to use the EPD standard for communication with other EPD capable
1618programs.  It may also be PGN capable as well.
1619
1620The BOOKUP 8.1.1 Addenda notes dated 1993.12.17 provide comprehensive
1621information on how to use EPD in conjunction with "analyst" programs such as
1622Zarkov and HIARCS.  Specifically, the search and evaluation abilities of an
1623analyst program are combined with the information organization abilities of the
1624BOOKUP database program to provide position scoring.  This is done by first
1625having BOOKUP export a database in EPD format, then having an analyst program
1626annotate each EPD record with a numeric score, and then having BOOKUP import
1627the changed EPD file.  BOOKUP can then apply minimaxing to the imported
1628database; this results in scores from terminal positions being propagated back
1629to earlier positions and even back to moves from the starting array.
1630
1631For some reason, BOOKUP calls this process "backsolving", but it's really just
1632standard minimaxing.  In any case, it's a good example of how different
1633programs from different authors performing different types of tasks can be
1634integrated by use of a common, non-proprietary standard.  This allows for a new
1635set of powerful features that are beyond the capabilities of any one of the
1636individual component programs.
1637
1638BOOKUP allows for some customizing of EPD actions.  One such customization is
1639to require the positional evaluations to follow the EPD standard; this means
1640that the score is always given from the viewpoint of the active player.  This
1641is explained more fully in the section on the "ce" (centipawn evaluation)
1642opcode in the EPD description in a later section of this document.  To ensure
1643that BOOKUP handles the centipawn evaluations in the "right" way, the EPD
1644setting "Positive for White" must be set to "N".  This makes BOOKUP work
1645correctly with Zarkov and with all other programs that use the "right"
1646centipawn evaluation convention.  There is an apparent problem with HIARCS that
1647requires this option to be set to "Y"; but this really means that, if true,
1648HIARCS needs to be adjusted to use the "right" centipawn evaluation convention.
1649
1650A vendor in North America is:
1651
1652    BOOKUP
1653    2763 Kensington Place West
1654    Columbus, OH 43202
1655    USA
1656    (800) 949-5445
1657    (614) 263-7219
1658
1659
166013.9: HIARCS
1661
1662The current version (2.1) of the commercial chessplaying program "HIARCS" is
1663able to use the EPD standard for communication with other EPD capable programs.
1664It may also be PGN capable as well.  More details will appear here as they
1665become available.
1666
1667A vendor in North America is:
1668
1669    HIARCS
1670    c/o BOOKUP
1671    2763 Kensington Place West
1672    Columbus, OH 43202
1673    USA
1674    (800) 949-5445
1675    (614) 263-7219
1676
1677
167813.10: Deja Vu
1679
1680The chess database "Deja Vu" from ChessWorks is a PGN compatible collection of
1681over 300,000 games.  It is available only on CD-ROM and is scheduled for
1682release in 1994.05 with periodic revisions thereafter.  The introductory price
1683is US$329.  For further information, the authors are John Crayton and Eric
1684Schiller and they can be contacted via e-mail (chesswks@netcom.com).
1685
1686
168713.11: MV2PGN
1688
1689The program "MV2PGN" can be used to convert game data generated by both current
1690and older versions of the GIICS (Graphical Interface - Internet Chess Server).
1691The program is included in the self extracting archive available from
1692chess.uoknor.edu in the directory pub/chess/DOS as the file "ics2pgn.exe".
1693Source code is also included.  This program is reported to supersede the older
1694"mail2pgn" and was needed due to a change in ICS recording format in late 1993.
1695For further information about MV2PGN, the contact person is Gary Bastin
1696(gbastin@x102a.ess.harris.com).
1697
1698
169913.12: The Hansen utilities (cb2pgn, nic2pgn, pgn2cb, pgn2nic)
1700
1701The Hansen utilities are used to convert among various chess data
1702representation formats.  The PGN related programs include: "cb2pgn.exe"
1703(convert ChessBase to PGN), "nic2pgn.exe" (convert NIC to PGN), "pgn2cb.exe"
1704(convert PGN to ChessBase), and "pgn2nic.exe" (convert PGN to NIC).
1705
1706The ChessBase related utilities (cb2pgn/pgn2cb) are found at chess.uoknor.edu
1707in the pub/chess/Game-Databases/ChessBase directory.
1708
1709The NIC related utilities (nic2pgn/pgn2nic) are found at chess.uoknor.edu in
1710the pub/chess/Game-Databases/NIC directory.
1711
1712For further information about the Hansen utilities, the contact person is the
1713author, Carsten Hansen (ch0506@hdc.hha.dk).
1714
1715
171613.13: Slappy the Database
1717
1718"Slappy the Database" is a commercial chess database and translation program
1719scheduled for release no sooner than late 1994.  It is a low cost utility with
1720a simple character interface intended for those who want a supported product
1721but who do not need (or cannot afford) a comprehensive, feature-laden program
1722with a graphical user interface.  Slappy's two most important features are its
1723batch processing ability and its full implementation of each and every standard
1724described in this document.  Versions of Slappy the Database will be provided
1725for various platforms including: Intel 386/486 Unix, Apple Macintosh, and
1726MS-DOS.
1727
1728Slappy may also be useful to those who have a full feature program who also
1729need to run time consuming chess database tasks on a spare computer.
1730
1731Suggestions and comments should be directed to its author, Steven J. Edwards
1732(sje@world.std.com).  More details will appear here as they become available.
1733
1734
173513.14: CBASCII
1736
1737"CBASCII" is a general utility for converting chess data between ChessBase
1738format and ASCII representations.  It has PGN capability, and it is available
1739from the chess.uoknor.edu ftp site in the pub/chess/DOS directory as the file
1740"cba1_2.zip".  The contact person is the program's author, Andy Duplain
1741(duplain@btcs.bt.co.uk).
1742
1743
174413.15: ZZZZZZ
1745
1746"ZZZZZZ" is a chessplaying program, complete with source, that also includes
1747some database functions.  A recent version is reported to have both PGN and EPD
1748capabilities.  It is available from the chess.uoknor.edu ftp site in the
1749pub/chess/Unix directory as the file "zzzzzz-3.2b1.tar.gz".  The contact person
1750is its author, Gijsbert Wiesenecker (wiesenecker@sara.nl).
1751
1752
175313.16: icsconv
1754
1755The program "icsconv" can be used to convert Internet Chess Server games, both
1756old and new format, to PGN.  It is available from the chess.uoknor.edu site in
1757the pub/chess/Game-Databases/PGN/Tools directory as the file "icsconv.exe".
1758The contact person is the author, Kevin Nomura (chow@netcom.com).
1759
1760
176113.17: CHESSOP (CHESSOPN/CHESSOPG)
1762
1763CHESSOP is an openings database and viewing tool with support for reading PGN
1764games.  It runs under MS-DOS and displays positions rather than games.  For
1765each position, both good and bad moves are listed with appropriate annotation.
1766Transpositions are handled as well.  The distributed database contains over
1767100,000 positions covering all the common openings.  Users can feed in their
1768own PGN data as well.  CHESSOP takes 3 Mbyte of hard disk, costs US$39 and can
1769be obtained from:
1770
1771    CHESSX Software
1772    12 Bluebell Close
1773    Glenmore Park
1774    AUSTRALIA 2745.
1775
1776The ideas behind CHESSOP can be seen in CHESSOPN (alias CHESSOPG), a free
1777version on the ICS server which has a reduced openings database (25,000
1778positions) and no PGN or transposition support but is otherwise the same as
1779CHESSOP.  (These are the files "chessopg.zip" in the directory pub/chess/DOS at
1780the chess.uoknor.edu ftp site.)
1781
1782
178313.18: CAT2PGN
1784
1785The program "CAT2PGN" is a utility that translates data from the format used by
1786Chess Assistant into PGN.  It is available from the chess.uoknor.edu ftp site.
1787The contact person for CAT2PGN is its author, David Myers
1788(myers@frodo.biochem.duke.edu).
1789
1790
179113.19: pgn2opg
1792
1793The utility "pgn2opg" can be used to convert PGN files into a text format used
1794by the "CHESSOPG" program mentioned above.  Although it does not perform any
1795semantic analysis on PGN input, it has been demonstrated to handle known
1796correct PGN input properly.  The file can be found in the pub/chess/PGN/Tools
1797directory at the chess.uoknor.edu ftp site.  For more information, the author
1798is David Barnes (djb@ukc.ac.uk).
1799
1800
180114: PGN data archives
1802
1803The primary PGN data archive repository is located at the ftp site
1804chess.uoknor.edu as the directory "pub/chess/Game-Databases/PGN".  It is
1805organized according to the description given in section C.5 of this document.
1806The European site ftp.math.uni-hamburg.de is also reported to carry a regularly
1807updated copy of the repository.
1808
1809
181015: International Olympic Committee country codes
1811
1812International Olympic Committee country codes are employed for Site nation
1813information because of their traditional use with the reporting of
1814international sporting events.  Due to changes in geography and linguistic
1815custom, some of the following may be incorrect or outdated.  Corrections and
1816extensions should be sent via e-mail to the PGN coordinator whose address
1817listed near the start of this document.
1818
1819AFG: Afghanistan
1820AIR: Aboard aircraft
1821ALB: Albania
1822ALG: Algeria
1823AND: Andorra
1824ANG: Angola
1825ANT: Antigua
1826ARG: Argentina
1827ARM: Armenia
1828ATA: Antarctica
1829AUS: Australia
1830AZB: Azerbaijan
1831BAN: Bangladesh
1832BAR: Bahrain
1833BHM: Bahamas
1834BEL: Belgium
1835BER: Bermuda
1836BIH: Bosnia and Herzegovina
1837BLA: Belarus
1838BLG: Bulgaria
1839BLZ: Belize
1840BOL: Bolivia
1841BRB: Barbados
1842BRS: Brazil
1843BRU: Brunei
1844BSW: Botswana
1845CAN: Canada
1846CHI: Chile
1847COL: Columbia
1848CRA: Costa Rica
1849CRO: Croatia
1850CSR: Czechoslovakia
1851CUB: Cuba
1852CYP: Cyprus
1853DEN: Denmark
1854DOM: Dominican Republic
1855ECU: Ecuador
1856EGY: Egypt
1857ENG: England
1858ESP: Spain
1859EST: Estonia
1860FAI: Faroe Islands
1861FIJ: Fiji
1862FIN: Finland
1863FRA: France
1864GAM: Gambia
1865GCI: Guernsey-Jersey
1866GEO: Georgia
1867GER: Germany
1868GHA: Ghana
1869GRC: Greece
1870GUA: Guatemala
1871GUY: Guyana
1872HAI: Haiti
1873HKG: Hong Kong
1874HON: Honduras
1875HUN: Hungary
1876IND: India
1877IRL: Ireland
1878IRN: Iran
1879IRQ: Iraq
1880ISD: Iceland
1881ISR: Israel
1882ITA: Italy
1883IVO: Ivory Coast
1884JAM: Jamaica
1885JAP: Japan
1886JRD: Jordan
1887JUG: Yugoslavia
1888KAZ: Kazakhstan
1889KEN: Kenya
1890KIR: Kyrgyzstan
1891KUW: Kuwait
1892LAT: Latvia
1893LEB: Lebanon
1894LIB: Libya
1895LIC: Liechtenstein
1896LTU: Lithuania
1897LUX: Luxembourg
1898MAL: Malaysia
1899MAU: Mauritania
1900MEX: Mexico
1901MLI: Mali
1902MLT: Malta
1903MNC: Monaco
1904MOL: Moldova
1905MON: Mongolia
1906MOZ: Mozambique
1907MRC: Morocco
1908MRT: Mauritius
1909MYN: Myanmar
1910NCG: Nicaragua
1911NET: The Internet
1912NIG: Nigeria
1913NLA: Netherlands Antilles
1914NLD: Netherlands
1915NOR: Norway
1916NZD: New Zealand
1917OST: Austria
1918PAK: Pakistan
1919PAL: Palestine
1920PAN: Panama
1921PAR: Paraguay
1922PER: Peru
1923PHI: Philippines
1924PNG: Papua New Guinea
1925POL: Poland
1926POR: Portugal
1927PRC: People's Republic of China
1928PRO: Puerto Rico
1929QTR: Qatar
1930RIN: Indonesia
1931ROM: Romania
1932RUS: Russia
1933SAF: South Africa
1934SAL: El Salvador
1935SCO: Scotland
1936SEA: At Sea
1937SEN: Senegal
1938SEY: Seychelles
1939SIP: Singapore
1940SLV: Slovenia
1941SMA: San Marino
1942SPC: Aboard spacecraft
1943SRI: Sri Lanka
1944SUD: Sudan
1945SUR: Surinam
1946SVE: Sweden
1947SWZ: Switzerland
1948SYR: Syria
1949TAI: Thailand
1950TMT: Turkmenistan
1951TRK: Turkey
1952TTO: Trinidad and Tobago
1953TUN: Tunisia
1954UAE: United Arab Emirates
1955UGA: Uganda
1956UKR: Ukraine
1957UNK: Unknown
1958URU: Uruguay
1959USA: United States of America
1960UZB: Uzbekistan
1961VEN: Venezuela
1962VGB: British Virgin Islands
1963VIE: Vietnam
1964VUS: U.S. Virgin Islands
1965WLS: Wales
1966YEM: Yemen
1967YUG: Yugoslavia
1968ZAM: Zambia
1969ZIM: Zimbabwe
1970ZRE: Zaire
1971
1972
197316: Additional chess data standards
1974
1975While PGN is used for game storage, there are other data representation
1976standards for other chess related purposes.  Two important standards are FEN
1977and EPD, both described in this section.
1978
1979
198016.1: FEN
1981
1982FEN is "Forsyth-Edwards Notation"; it is a standard for describing chess
1983positions using the ASCII character set.
1984
1985A single FEN record uses one text line of variable length composed of six data
1986fields.  The first four fields of the FEN specification are the same as the
1987first four fields of the EPD specification.
1988
1989A text file composed exclusively of FEN data records should have a file name
1990with the suffix ".fen".
1991
1992
199316.1.1: History
1994
1995FEN is based on a 19th century standard for position recording designed by the
1996Scotsman David Forsyth, a newspaper journalist.  The original Forsyth standard
1997has been slightly extended for use with chess software by Steven Edwards with
1998assistance from commentators on the Internet.  This new standard, FEN, was
1999first implemented in Edwards' SAN Kit.
2000
2001
200216.1.2: Uses for a position notation
2003
2004Having a standard position notation is particularly important for chess
2005programmers as it allows them to share position databases.  For example, there
2006exist standard position notation databases with many of the classical benchmark
2007tests for chessplaying programs, and by using a common position notation format
2008many hours of tedious data entry can be saved.  Additionally, a position
2009notation can be useful for page layout programs and for confirming position
2010status for e-mail competition.
2011
2012Many interesting chess problem sets represented using FEN can be found at the
2013chess.uoknor.edu ftp site in the directory pub/chess/SAN_testsuites.
2014
2015
201616.1.3: Data fields
2017
2018FEN specifies the piece placement, the active color, the castling availability,
2019the en passant target square, the halfmove clock, and the fullmove number.
2020These can all fit on a single text line in an easily read format.  The length
2021of a FEN position description varies somewhat according to the position. In
2022some cases, the description could be eighty or more characters in length and so
2023may not fit conveniently on some displays.  However, these positions aren't too
2024common.
2025
2026A FEN description has six fields.  Each field is composed only of non-blank
2027printing ASCII characters.  Adjacent fields are separated by a single ASCII
2028space character.
2029
2030
203116.1.3.1: Piece placement data
2032
2033The first field represents the placement of the pieces on the board.  The board
2034contents are specified starting with the eighth rank and ending with the first
2035rank.  For each rank, the squares are specified from file a to file h.  White
2036pieces are identified by uppercase SAN piece letters ("PNBRQK") and black
2037pieces are identified by lowercase SAN piece letters ("pnbrqk").  Empty squares
2038are represented by the digits one through eight; the digit used represents the
2039count of contiguous empty squares along a rank.  A solidus character "/" is
2040used to separate data of adjacent ranks.
2041
2042
204316.1.3.2: Active color
2044
2045The second field represents the active color.  A lower case "w" is used if
2046White is to move; a lower case "b" is used if Black is the active player.
2047
2048
204916.1.3.3: Castling availability
2050
2051The third field represents castling availability.  This indicates potential
2052future castling that may of may not be possible at the moment due to blocking
2053pieces or enemy attacks.  If there is no castling availability for either side,
2054the single character symbol "-" is used.  Otherwise, a combination of from one
2055to four characters are present.  If White has kingside castling availability,
2056the uppercase letter "K" appears.  If White has queenside castling
2057availability, the uppercase letter "Q" appears.  If Black has kingside castling
2058availability, the lowercase letter "k" appears.  If Black has queenside
2059castling availability, then the lowercase letter "q" appears.  Those letters
2060which appear will be ordered first uppercase before lowercase and second
2061kingside before queenside.  There is no white space between the letters.
2062
2063
206416.1.3.4: En passant target square
2065
2066The fourth field is the en passant target square.  If there is no en passant
2067target square then the single character symbol "-" appears.  If there is an en
2068passant target square then is represented by a lowercase file character
2069immediately followed by a rank digit.  Obviously, the rank digit will be "3"
2070following a white pawn double advance (Black is the active color) or else be
2071the digit "6" after a black pawn double advance (White being the active color).
2072
2073An en passant target square is given if and only if the last move was a pawn
2074advance of two squares.  Therefore, an en passant target square field may have
2075a square name even if there is no pawn of the opposing side that may
2076immediately execute the en passant capture.
2077
2078
207916.1.3.5: Halfmove clock
2080
2081The fifth field is a nonnegative integer representing the halfmove clock.  This
2082number is the count of halfmoves (or ply) since the last pawn advance or
2083capturing move.  This value is used for the fifty move draw rule.
2084
2085
208616.1.3.6: Fullmove number
2087
2088The sixth and last field is a positive integer that gives the fullmove number.
2089This will have the value "1" for the first move of a game for both White and
2090Black.  It is incremented by one immediately after each move by Black.
2091
2092
209316.1.4: Examples
2094
2095Here's the FEN for the starting position:
2096
2097rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
2098
2099And after the move 1. e4:
2100
2101rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1
2102
2103And then after 1. ... c5:
2104
2105rnbqkbnr/pp1ppppp/8/2p5/4P3/8/PPPP1PPP/RNBQKBNR w KQkq c6 0 2
2106
2107And then after 2. Nf3:
2108
2109rnbqkbnr/pp1ppppp/8/2p5/4P3/5N2/PPPP1PPP/RNBQKB1R b KQkq - 1 2
2110
2111For two kings on their home squares and a white pawn on e2 (White to move) with
2112thirty eight full moves played with five halfmoves since the last pawn move or
2113capture:
2114
21154k3/8/8/8/8/8/4P3/4K3 w - - 5 39
2116
2117
211816.2: EPD
2119
2120EPD is "Extended Position Description"; it is a standard for describing chess
2121positions along with an extended set of structured attribute values using the
2122ASCII character set.  It is intended for data and command interchange among
2123chessplaying programs.  It is also intended for the representation of portable
2124opening library repositories.
2125
2126A single EPD uses one text line of variable length composed of four data field
2127followed by zero or more operations.  The four fields of the EPD specification
2128are the same as the first four fields of the FEN specification.
2129
2130A text file composed exclusively of EPD data records should have a file name
2131with the suffix ".epd".
2132
2133
213416.2.1: History
2135
2136EPD is based in part on the earlier FEN standard; it has added extensions for
2137use with opening library preparation and also for general data and command
2138interchange among advanced chess programs.  EPD was developed by John Stanback
2139and Steven Edwards; its first implementation is in Stanback's master strength
2140chessplaying program Zarkov.
2141
2142
214316.2.2: Uses for an extended position notation
2144
2145Like FEN, EPD can also be used for general position description.  However,
2146unlike FEN, EPD is designed to be expandable by the addition of new operations
2147that provide new functionality as needs arise.
2148
2149Many interesting chess problem sets represented using EPD can be found at the
2150chess.uoknor.edu ftp site in the directory pub/chess/SAN_testsuites.
2151
2152
215316.2.3: Data fields
2154
2155EPD specifies the piece placement, the active color, the castling availability,
2156and the en passant target square of a position.  These can all fit on a single
2157text line in an easily read format.  The length of an EPD position description
2158varies somewhat according to the position and any associated operations. In
2159some cases, the description could be eighty or more characters in length and so
2160may not fit conveniently on some displays.  However, most EPD descriptions pass
2161among programs only and these are not usually seen by program users.
2162
2163(Note: due to the likelihood of future expansion of EPD, implementors are
2164encouraged to have their programs handle EPD text lines of up to 1024
2165characters long.)
2166
2167Each EPD data field is composed only of non-blank printing ASCII characters.
2168Adjacent data fields are separated by a single ASCII space character.
2169
2170
217116.2.3.1: Piece placement data
2172
2173The first field represents the placement of the pieces on the board.  The board
2174contents are specified starting with the eighth rank and ending with the first
2175rank.  For each rank, the squares are specified from file a to file h.  White
2176pieces are identified by uppercase SAN piece letters ("PNBRQK") and black
2177pieces are identified by lowercase SAN piece letters ("pnbrqk").  Empty squares
2178are represented by the digits one through eight; the digit used represents the
2179count of contiguous empty squares along a rank.  A solidus character "/" is
2180used to separate data of adjacent ranks.
2181
2182
218316.2.3.2: Active color
2184
2185The second field represents the active color.  A lower case "w" is used if
2186White is to move; a lower case "b" is used if Black is the active player.
2187
2188
218916.2.3.3: Castling availability
2190
2191The third field represents castling availability.  This indicates potential
2192future castling that may or may not be possible at the moment due to blocking
2193pieces or enemy attacks.  If there is no castling availability for either side,
2194the single character symbol "-" is used.  Otherwise, a combination of from one
2195to four characters are present.  If White has kingside castling availability,
2196the uppercase letter "K" appears.  If White has queenside castling
2197availability, the uppercase letter "Q" appears.  If Black has kingside castling
2198availability, the lowercase letter "k" appears.  If Black has queenside
2199castling availability, then the lowercase letter "q" appears.  Those letters
2200which appear will be ordered first uppercase before lowercase and second
2201kingside before queenside.  There is no white space between the letters.
2202
2203
220416.2.3.4: En passant target square
2205
2206The fourth field is the en passant target square.  If there is no en passant
2207target square then the single character symbol "-" appears.  If there is an en
2208passant target square then is represented by a lowercase file character
2209immediately followed by a rank digit.  Obviously, the rank digit will be "3"
2210following a white pawn double advance (Black is the active color) or else be
2211the digit "6" after a black pawn double advance (White being the active color).
2212
2213An en passant target square is given if and only if the last move was a pawn
2214advance of two squares.  Therefore, an en passant target square field may have
2215a square name even if there is no pawn of the opposing side that may
2216immediately execute the en passant capture.
2217
2218
221916.2.4: Operations
2220
2221An EPD operation is composed of an opcode followed by zero or more operands and
2222is concluded by a semicolon.
2223
2224Multiple operations are separated by a single space character.  If there is at
2225least one operation present in an EPD line, it is separated from the last
2226(fourth) data field by a single space character.
2227
2228
222916.2.4.1: General format
2230
2231An opcode is an identifier that starts with a letter character and may be
2232followed by up to fourteen more characters.  Each additional character may be a
2233letter or a digit or the underscore character.
2234
2235An operand is either a set of contiguous non-white space printing characters or
2236a string.  A string is a set of contiguous printing characters delimited by a
2237quote character at each end.  A string value must have less than 256 bytes of
2238data.
2239
2240If at least one operand is present in an operation, there is a single space
2241between the opcode and the first operand.  If more than one operand is present
2242in an operation, there is a single blank character between every two adjacent
2243operands.  If there are no operands, a semicolon character is appended to the
2244opcode to mark the end of the operation.  If any operands appear, the last
2245operand has an appended semicolon that marks the end of the operation.
2246
2247Any given opcode appears at most once per EPD record.  Multiple operations in a
2248single EPD record should appear in ASCII order of their opcode names
2249(mnemonics).  However, a program reading EPD records may allow for operations
2250not in ASCII order by opcode mnemonics; the semantics are the same in either
2251case.
2252
2253Some opcodes that allow for more than one operand may have special ordering
2254requirements for the operands.  For example, the "pv" (predicted variation)
2255opcode requires its operands (moves) to appear in the order in which they would
2256be played.  All other opcodes that allow for more than one operand should have
2257operands appearing in ASCII order.  An example of the latter set is the "bm"
2258(best move[s]) opcode; its operands are moves that are all immediately playable
2259from the current position.
2260
2261Some opcodes require one or more operands that are chess moves.  These moves
2262should be represented using SAN.  If a different representation is used, there
2263is no guarantee that the EPD will be read correctly during subsequent
2264processing.
2265
2266Some opcodes require one or more operands that are integers.  Some opcodes may
2267require that an integer operand must be within a given range; the details are
2268described in the opcode list given below.  A negative integer is formed with a
2269hyphen (minus sign) preceding the integer digit sequence.  An optional plus
2270sign may be used for indicating a non-negative value, but such use is not
2271required and is indeed discouraged.
2272
2273Some opcodes require one or more operands that are floating point numbers.
2274Some opcodes may require that a floating point operand must be within a given
2275range; the details are described in the opcode list given below.  A floating
2276point operand is constructed from an optional sign character ("+" or "-"), a
2277digit sequence (with at least one digit), a radix point (always "."), and a
2278final digit sequence (with at least one digit).
2279
2280
228116.2.4.2: Opcode mnemonics
2282
2283An opcode mnemonic used for archival storage and for interprogram communication
2284starts with a lower case letter and is composed of only lower case letters,
2285digits, and the underscore character (i.e., no upper case letters).  These
2286mnemonics will also all be at least two characters in length.
2287
2288Opcode mnemonics used only by a single program or an experimental suite of
2289programs should start with an upper case letter.  This is so they may be easily
2290distinguished should they be inadvertently be encountered by other programs.
2291When a such a "private" opcode be demonstrated to be widely useful, it should
2292be brought into the official list (appearing below) in a lower case form.
2293
2294If a given program does not recognize a particular opcode, that operation is
2295simply ignored; it is not signaled as an error.
2296
2297
229816.2.5: Opcode list
2299
2300The opcodes are listed here in ASCII order of their mnemonics.  Suggestions for
2301new opcodes should be sent to the PGN standard coordinator listed near the
2302start of this document.
2303
2304
230516.2.5.1: Opcode "acn": analysis count: nodes
2306
2307The opcode "acn" takes a single non-negative integer operand.  It is used to
2308represent the number of nodes examined in an analysis.  Note that the value may
2309be quite large for some extended searches and so use of (at least) a long (four
2310byte) representation is suggested.
2311
2312
231316.2.5.2: Opcode "acs": analysis count: seconds
2314
2315The opcode "acs" takes a single non-negative integer operand.  It is used to
2316represent the number of seconds used for an analysis.  Note that the value may
2317be quite large for some extended searches and so use of (at least) a long (four
2318byte) representation is suggested.
2319
2320
232116.2.5.3: Opcode "am": avoid move(s)
2322
2323The opcode "am" indicates a set of zero or more moves, all immediately playable
2324from the current position, that are to be avoided in the opinion of the EPD
2325writer.  Each operand is a SAN move; they appear in ASCII order.
2326
2327
232816.2.5.4: Opcode "bm": best move(s)
2329
2330The opcode "bm" indicates a set of zero or more moves, all immediately playable
2331from the current position, that are judged to the best available by the EPD
2332writer.  Each operand is a SAN move; they appear in ASCII order.
2333
2334
233516.2.5.5: Opcode "c0": comment (primary, also "c1" though "c9")
2336
2337The opcode "c0" (lower case letter "c", digit character zero) indicates a top
2338level comment that applies to the given position.  It is the first of ten
2339ranked comments, each of which has a mnemonic formed from the lower case letter
2340"c" followed by a single decimal digit.  Each of these opcodes takes either a
2341single string operand or no operand at all.
2342
2343This ten member comment family of opcodes is intended for use as descriptive
2344commentary for a complete game or game fragment.  The usual processing of these
2345opcodes are as follows:
2346
23471) At the beginning of a game (or game fragment), a move sequence scanning
2348program initializes each element of its set of ten comment string registers to
2349be null.
2350
23512) As the EPD record for each position in the game is processed, the comment
2352operations are interpreted from left to right.  (Actually, all operations in n
2353EPD record are interpreted from left to right.)  Because operations appear in
2354ASCII order according to their opcode mnemonics, opcode "c0" (if present) will
2355be handled prior to all other opcodes, then opcode "c1" (if present), and so
2356forth until opcode "c9" (if present).
2357
23583) The processing of opcode "cN" (0 <= N <= 9) involves two steps.  First, all
2359comment string registers with an index equal to or greater than N are set to
2360null.  (This is the set "cN" though "c9".)  Second, and only if a string
2361operand is present, the value of the corresponding comment string register is
2362set equal to the string operand.
2363
2364
236516.2.5.6: Opcode "ce": centipawn evaluation
2366
2367The opcode "ce" indicates the evaluation of the indicated position in centipawn
2368units.  It takes a single operand, an optionally signed integer that gives an
2369evaluation of the position from the viewpoint of the active player; i.e., the
2370player with the move.  Positive values indicate a position favorable to the
2371moving player while negative values indicate a position favorable to the
2372passive player; i.e., the player without the move.  A centipawn evaluation
2373value close to zero indicates a neutral positional evaluation.
2374
2375Values are restricted to integers that are equal to or greater than -32767 and
2376are less than or equal to 32766.
2377
2378A value greater than 32000 indicates the availability of a forced mate to the
2379active player.  The number of plies until mate is given by subtracting the
2380evaluation from the value 32767.  Thus, a winning mate in N fullmoves is a mate
2381in ((2 * N) - 1) halfmoves (or ply) and has a corresponding centipawn
2382evaluation of (32767 - ((2 * N) - 1)).  For example, a mate on the move (mate
2383in one) has a centipawn evaluation of 32766 while a mate in five has a
2384centipawn evaluation of 32758.
2385
2386A value less than -32000 indicates the availability of a forced mate to the
2387passive player.  The number of plies until mate is given by subtracting the
2388evaluation from the value -32767 and then negating the result.  Thus, a losing
2389mate in N fullmoves is a mate in (2 * N) halfmoves (or ply) and has a
2390corresponding centipawn evaluation of (-32767 + (2 * N)).  For example, a mate
2391after the move (losing mate in one) has a centipawn evaluation of -32765 while
2392a losing mate in five has a centipawn evaluation of -32757.
2393
2394A value of -32767 indicates an illegal position.  A stalemate position has a
2395centipawn evaluation of zero as does a position drawn due to insufficient
2396mating material.  Any other position known to be a certain forced draw also has
2397a centipawn evaluation of zero.
2398
2399
240016.2.5.7: Opcode "dm": direct mate fullmove count
2401
2402The "dm" opcode is used to indicate the number of fullmoves until checkmate is
2403to be delivered by the active color for the indicated position.  It always
2404takes a single operand which is a positive integer giving the fullmove count.
2405For example, a position known to be a "mate in three" would have an operation
2406of "dm 3;" to indicate this.
2407
2408This opcode is intended for use with problem sets composed of positions
2409requiring direct mate answers as solutions.
2410
2411
241216.2.5.8: Opcode "draw_accept": accept a draw offer
2413
2414The opcode "draw_accept" is used to indicate that a draw offer made after the
2415move that lead to the indicated position is accepted by the active player.
2416This opcode takes no operands.
2417
2418
241916.2.5.9: Opcode "draw_claim": claim a draw
2420
2421The opcode "draw_claim" is used to indicate claim by the active player that a
2422draw exists.  The draw is claimed because of a third time repetition or because
2423of the fifty move rule or because of insufficient mating material.  A supplied
2424move (see the opcode "sm") is also required to appear as part of the same EPD
2425record.  The draw_claim opcode takes no operands.
2426
2427
242816.2.5.10: Opcode "draw_offer": offer a draw
2429
2430The opcode "draw_offer" is used to indicate that a draw is offered by the
2431active player.  A supplied move (see the opcode "sm") is also required to
2432appear as part of the same EPD record; this move is considered played from the
2433indicated position.  The draw_offer opcode takes no operands.
2434
2435
243616.2.5.11: Opcode "draw_reject": reject a draw offer
2437
2438The opcode "draw_reject" is used to indicate that a draw offer made after the
2439move that lead to the indicated position is rejected by the active player.
2440This opcode takes no operands.
2441
2442
244316.2.5.12: Opcode "eco": _Encyclopedia of Chess Openings_ opening code
2444
2445The opcode "eco" is used to associate an opening designation from the
2446_Encyclopedia of Chess Openings_ taxonomy with the indicated position.  The
2447opcode takes either a single string operand (the ECO opening name) or no
2448operand at all.  If an operand is present, its value is associated with an
2449"ECO" string register of the scanning program.  If there is no operand, the ECO
2450string register of the scanning program is set to null.
2451
2452The usage is similar to that of the "ECO" tag pair of the PGN standard.
2453
2454
245516.2.5.13: Opcode "fmvn": fullmove number
2456
2457The opcode "fmvn" represents the fullmove n umber associated with the position.
2458It always takes a single operand that is the positive integer value of the move
2459number.
2460
2461This opcode is used to explicitly represent the fullmove number in EPD that is
2462present by default in FEN as the sixth field.  Fullmove number information is
2463usually omitted from EPD because it does not affect move generation (commonly
2464needed for EPD-using tasks) but it does affect game notation (commonly needed
2465for FEN-using tasks).  Because of the desire for space optimization for large
2466EPD files, fullmove numbers were dropped from EPD's parent FEN.  The halfmove
2467clock information was similarly dropped.
2468
2469
247016.2.5.14: Opcode "hmvc": halfmove clock
2471
2472The opcode "hmvc" represents the halfmove clock associated with the position.
2473The halfmove clock of a position is equal to the number of plies since the last
2474pawn move or capture.  This information is used to implement the fifty move
2475draw rule.  It always takes a single operand that is the non-negative integer
2476value of the halfmove clock.
2477
2478This opcode is used to explicitly represent the halfmove clock in EPD that is
2479present by default in FEN as the fifth field.  Halfmove clock information is
2480usually omitted from EPD because it does not affect move generation (commonly
2481needed for EPD-using tasks) but it does affect game termination issues
2482(commonly needed for FEN-using tasks).  Because of the desire for space
2483optimization for large EPD files, halfmove clock values were dropped from EPD's
2484parent FEN.  The fullmove number information was similarly dropped.
2485
2486
248716.2.5.15: Opcode "id": position identification
2488
2489The opcode "id" is used to provide a simple identifying label for the indicated
2490position.  It takes a single string operand.
2491
2492This opcode is intended for use with test suites used for measuring
2493chessplaying program strength.  An example "id" operand for the seven hundred
2494fifty seventh position of the one thousand one problems in Reinfeld's _1001
2495Winning Chess Sacrifices and Combinations_ would be "WCSAC.0757" while the
2496fifteenth position in the twenty four problem Bratko-Kopec test suite would
2497have an "id" operand of "BK.15".
2498
2499
250016.2.5.16: Opcode "nic": _New In Chess_ opening code
2501
2502The opcode "nic" is used to associate an opening designation from the _New In
2503Chess_ taxonomy with the indicated position.  The opcode takes either a single
2504string operand (the NIC opening name) or no operand at all.  If an operand is
2505present, its value is associated with an "NIC" string register of the scanning
2506program.  If there is no operand, the NIC string register of the scanning
2507program is set to null.
2508
2509The usage is similar to that of the "NIC" tag pair of the PGN standard.
2510
2511
251216.2.5.17: Opcode "noop": no operation
2513
2514The "noop" opcode is used to indicate no operation.  It takes zero or more
2515operands, each of which may be of any type.  The operation involves no
2516processing.  It is intended for use by developers for program testing purposes.
2517
2518
251916.2.5.18: Opcode "pm": predicted move
2520
2521The "pm" opcode is used to provide a single predicted move for the indicated
2522position.  It has exactly one operand, a move playable from the position.  This
2523move is judged by the EPD writer to represent the best move available to the
2524active player.
2525
2526If a non-empty "pv" (predicted variation) line of play is also present in the
2527same EPD record, the first move of the predicted variation is the same as the
2528predicted move.
2529
2530The "pm" opcode is intended for use as a general "display hint" mechanism.
2531
2532
253316.2.5.19: Opcode "pv": predicted variation
2534
2535The "pv" opcode is used to provide a predicted variation for the indicated
2536position.  It has zero or more operands which represent a sequence of moves
2537playable from the position.  This sequence is judged by the EPD writer to
2538represent the best play available.
2539
2540If a "pm" (predicted move) operation is also present in the same EPD record,
2541the predicted move is the same as the first move of the predicted variation.
2542
2543
254416.2.5.20: Opcode "rc": repetition count
2545
2546The "rc" opcode is used to indicate the number of occurrences of the indicated
2547position.  It takes a single, positive integer operand.  Any position,
2548including the initial starting position, is considered to have an "rc" value of
2549at least one.  A value of three indicates a candidate for a draw claim by the
2550position repetition rule.
2551
2552
255316.2.5.21: Opcode "resign": game resignation
2554
2555The opcode "resign" is used to indicate that the active player has resigned the
2556game.  This opcode takes no operands.
2557
2558
255916.2.5.22: Opcode "sm": supplied move
2560
2561The "sm" opcode is used to provide a single supplied move for the indicated
2562position.  It has exactly one operand, a move playable from the position.  This
2563move is the move to be played from the position.
2564
2565The "sm" opcode is intended for use to communicate the most recent played move
2566in an active game.  It is used to communicate moves between programs in
2567automatic play via a network.  This includes correspondence play using e-mail
2568and also programs acting as network front ends to human players.
2569
2570
257116.2.5.23: Opcode "tcgs": telecommunication: game selector
2572
2573The "tcgs" opcode is one of the telecommunication family of opcodes used for
2574games conducted via e-mail and similar means.  This opcode takes a single
2575operand that is a positive integer.  It is used to select among various games
2576in progress between the same sender and receiver.
2577
2578
257916.2.5.24: Opcode "tcri": telecommunication: receiver identification
2580
2581The "tcri" opcode is one of the telecommunication family of opcodes used for
2582games conducted via e-mail and similar means.  This opcode takes two order
2583dependent string operands.  The first operand is the e-mail address of the
2584receiver of the EPD record.  The second operand is the name of the player
2585(program or human) at the address who is the actual receiver of the EPD record.
2586
2587
258816.2.5.25: Opcode "tcsi": telecommunication: sender identification
2589
2590The "tcsi" opcode is one of the telecommunication family of opcodes used for
2591games conducted via e-mail and similar means.  This opcode takes two order
2592dependent string operands.  The first operand is the e-mail address of the
2593sender of the EPD record.  The second operand is the name of the player
2594(program or human) at the address who is the actual sender of the EPD record.
2595
2596
259716.2.5.26: Opcode "v0": variation name (primary, also "v1" though "v9")
2598
2599The opcode "v0" (lower case letter "v", digit character zero) indicates a top
2600level variation name that applies to the given position.  It is the first of
2601ten ranked variation names, each of which has a mnemonic formed from the lower
2602case letter "v" followed by a single decimal digit.  Each of these opcodes
2603takes either a single string operand or no operand at all.
2604
2605This ten member variation name family of opcodes is intended for use as
2606traditional variation names for a complete game or game fragment.  The usual
2607processing of these opcodes are as follows:
2608
26091) At the beginning of a game (or game fragment), a move sequence scanning
2610program initializes each element of its set of ten variation name string
2611registers to be null.
2612
26132) As the EPD record for each position in the game is processed, the variation
2614name operations are interpreted from left to right.  (Actually, all operations
2615in n EPD record are interpreted from left to right.)  Because operations appear
2616in ASCII order according to their opcode mnemonics, opcode "v0" (if present)
2617will be handled prior to all other opcodes, then opcode "v1" (if present), and
2618so forth until opcode "v9" (if present).
2619
26203) The processing of opcode "vN" (0 <= N <= 9) involves two steps.  First, all
2621variation name string registers with an index equal to or greater than N are
2622set to null.  (This is the set "vN" though "v9".)  Second, and only if a string
2623operand is present, the value of the corresponding variation name string
2624register is set equal to the string operand.
2625
2626
262717: Alternative chesspiece identifier letters
2628
2629English language piece names are used to define the letter set for identifying
2630chesspieces in PGN movetext.  However, authors of programs which are used only
2631for local presentation or scanning of chess move data may find it convenient to
2632use piece letter codes common in their locales.  This is not a problem as long
2633as PGN data that resides in archival storage or that is exchanged among
2634programs still uses the SAN (English) piece letter codes: "PNBRQK".
2635
2636For the above authors only, a list of alternative piece letter codes are
2637provided:
2638
2639Language     Piece letters (pawn knight bishop rook queen king)
2640----------   --------------------------------------------------
2641Czech        P J S V D K
2642Danish       B S L T D K
2643Dutch        O P L T D K
2644English      P N B R Q K
2645Estonian     P R O V L K
2646Finnish      P R L T D K
2647French       P C F T D R
2648German       B S L T D K
2649Hungarian    G H F B V K
2650Icelandic    P R B H D K
2651Italian      P C A T D R
2652Norwegian    B S L T D K
2653Polish       P S G W H K
2654Portuguese   P C B T D R
2655Romanian     P C N T D R
2656Spanish      P C A T D R
2657Swedish      B S L T D K
2658
2659
266018: Formal syntax
2661
2662<PGN-database> ::= <PGN-game> <PGN-database>
2663                   <empty>
2664
2665<PGN-game> ::= <tag-section> <movetext-section>
2666
2667<tag-section> ::= <tag-pair> <tag-section>
2668                  <empty>
2669
2670<tag-pair> ::= [ <tag-name> <tag-value> ]
2671
2672<tag-name> ::= <identifier>
2673
2674<tag-value> ::= <string>
2675
2676<movetext-section> ::= <element-sequence> <game-termination>
2677
2678<element-sequence> ::= <element> <element-sequence>
2679                       <recursive-variation> <element-sequence>
2680                       <empty>
2681
2682<element> ::= <move-number-indication>
2683              <SAN-move>
2684              <numeric-annotation-glyph>
2685
2686<recursive-variation> ::= ( <element-sequence> )
2687
2688<game-termination> ::= 1-0
2689                       0-1
2690                       1/2-1/2
2691                       *
2692<empty> ::=
2693
2694
269519: Canonical chess position hash coding
2696
2697*** This section is under development.
2698
2699
270020: Binary representation (PGC)
2701
2702*** This section is under development.
2703
2704The binary coded version of PGN is PGC (PGN Game Coding).  PGC is a binary
2705representation standard of PGN data designed for the dual goals of storage
2706efficiency and program I/O.  A file containing PGC data should have a name with
2707a suffix of ".pgc".
2708
2709Unlike PGN text files that may have locale dependent representations for
2710newlines, PGC files have data that does not vary due to local processing
2711environment.  This means that PGC files may be transferred among systems using
2712general binary file methods.
2713
2714PGC files should be used only when the use of PGN is impractical due to time
2715and space resource constraints.  As the general level of processing
2716capabilities increases, the need for PGC over PGN will decrease.  Therefore,
2717implementors are encouraged not to use PGC as the default representation
2718because it is much more difficult (than PGN) to understand without proper
2719software.
2720
2721PGC data is composed of a sequence of PGC records.  Each record is composed of
2722a sequence of one or more bytes.  The first byte is the PGN record marker and
2723it specifies the interpretation of the remaining portion of the record.  This
2724remaining portion is composed of zero or more PGN record items.  Item types
2725include move sequences, move sets, and character strings.
2726
2727
272820.1: Bytes, words, and doublewords
2729
2730At the lowest level, PGC binary data is organized as bytes, words (two
2731contiguous bytes), and doublewords (four contiguous bytes).  All eight bits of
2732a byte are used.  Longwords (eight contiguous bytes) are not used.  Integer
2733values are stored using two's complement representation.  Integers may be
2734signed or unsigned depending on context.  Multibyte integers are stored in
2735low-endian format with the least significant byte appearing first.
2736
2737A one byte integer item is called "int-1".  A two byte integer item is called
2738"int-2".  A four byte integer item is called "int-4".
2739
2740Characters are stored as bytes using the ISO 8859/1 Latin-1 (ECMA-94) code set.
2741There is no provision for other characters sets or representations.
2742
2743
274420.2: Move ordinals
2745
2746A chess move is represented using a move ordinal.  This is a single unsigned
2747byte quantity with values from zero to 255.  A move ordinal is interpreted as
2748an index into the list of legal moves from the current position.  This list is
2749constructed by generating the legal moves from the current position, assigning
2750SAN ASCII strings to each move, and then sorting these strings in ascending
2751order.  Note that a seven bit ordinal, as used by some inferior representation
2752systems, is insufficient as there are some positions that have more than 128
2753moves available.
2754
2755Examples:  From the initial position, there are twenty moves.  Move ordinal 0
2756corresponds to the SAN move string "Na3"; move ordinal 1 corresponds to "Nc3",
2757move ordinal 4 corresponds to "a3", and move ordinal 19 corresponds to "h4".
2758
2759Moves can be organized into sequences and sets.  A move sequence is an ordered
2760list of moves that are played, one after another from first to last.  A move
2761set is a list of moves that are all playable from the current position.
2762
2763Move sequence data is represented using a length header followed by move
2764ordinal data.  The length header is an unsigned integer that may be a byte or a
2765word.  The integer gives the number, possibly zero, of following move ordinal
2766bytes.  Most move sequences can be represented using just a byte header; these
2767are called "mvseq-1" items.  Move sequence data using a word header are called
2768"mvseq-2" items.
2769
2770Move set data is represented using a length header followed by move ordinal
2771data.  The length header is an unsigned integer that is a byte.  The integer
2772gives the number, possibly zero, of following move ordinal bytes.  All move
2773sets are be represented using just a byte header; these are called "mvset-1"
2774items.  (Note the implied restriction that a move set can only have a maximum
2775of 255 of the possible 256 ordinals present at one time.)
2776
2777
277820.3: String data
2779
2780PGC string data is represented using a length header followed by bytes of
2781character data.  The length header is an unsigned integer that may be a byte, a
2782word, or a doubleword.  The integer gives the number, possibly zero, of
2783following character bytes.  Most strings can be represented using just a byte
2784header; these are called "string-1" items.  String data using a word header are
2785called "string-2" items and string data using a doubleword header are called
2786"string-4" items.  No special ASCII NUL termination byte is required for PGC
2787storage of a string as the length is explicitly given in the item header.
2788
2789
279020.4: Marker codes
2791
2792PGC marker codes are given in hexadecimal format.  PGC marker code zero (marker
27930x00) is the "noop" marker and carries no meaning.  Each additional marker code
2794defined appears in its own subsection below.
2795
2796
279720.4.1: Marker 0x01: reduced export format single game
2798
2799Marker 0x01 is used to indicate a single complete game in reduced export
2800format.  This refers to a game that has only the Seven Tag Roster data, played
2801moves, and no annotations or comments.  This record type is used as an
2802alternative to the general game data begin/end record pairs described below.
2803The general marker pair (0x05/0x06) is used to help represent game data that
2804can't be adequately represented in reduced export format.  There are eight
2805items that follow marker 0x01 to form the "reduced export format single game"
2806record.  In order, these are:
2807
28081) string-1 (Event tag value)
2809
28102) string-1 (Site tag value)
2811
28123) string-1 (Date tag value)
2813
28144) string-1 (Round tag value)
2815
28165) string-1 (White tag value)
2817
28186) string-1 (Black tag value)
2819
28207) string-1 (Result tag value)
2821
28228) mvseq-2 (played moves)
2823
2824
282520.4.2: Marker 0x02: tag pair
2826
2827Marker 0x02 is used to indicate a single tag pair.  There are two items that
2828follow marker 0x02 to form the "tag pair" record; in order these are:
2829
28301) string-1 (tag pair name)
2831
28322) string-1 (tag pair value)
2833
2834
283520.4.3: Marker 0x03: short move sequence
2836
2837Marker 0x03 is used to indicate a short move sequence.  There is one item that
2838follows marker 0x03 to form the "short move sequence" record; this is:
2839
28401) mvseq-1 (played moves)
2841
2842
284320.4.4: Marker 0x04: long move sequence
2844
2845Marker 0x04 is used to indicate a long move sequence.  There is one item that
2846follows marker 0x04 to form the "long move sequence" record; this is:
2847
28481) mvseq-2 (played moves)
2849
2850
285120.4.5: Marker 0x05: general game data begin
2852
2853Marker 0x05 is used to indicate the beginning of data for a game.  It has no
2854associated items; it is a complete record by itself.  Instead, it marks the
2855beginning of PGC records used to describe a game.  All records up to the
2856corresponding "general game data end" record are considered to be part of the
2857same game.  (PGC record type 0x01, "reduced export format single game", is not
2858permitted to appear within a general game begin/end record pair.  The general
2859game construct is to be used as an alternative to record type 0x01 in those
2860cases where the latter is too restrictive to contain the data for a game.)
2861
2862
286320.4.6: Marker 0x06: general game data end
2864
2865Marker 0x06 is used to indicate the end of data for a game.  It has no
2866associated items; it is a complete record by itself.  Instead, it marks the end
2867of PGC records used to describe a game.  All records after the corresponding
2868(and earlier appearing) "general game data begin" record are considered to be
2869part of the same game.
2870
2871
287220.4.7: Marker 0x07: simple-nag
2873
2874Marker 0x07 is used to indicate the presence of a simple NAG (Numeric
2875Annotation Glyph).  This is an annotation marker that has only a short type
2876identification and no operands.  There is one item that follows marker 0x07 to
2877form the "simple-nag" record; this is:
2878
28791) int-1 (unsigned NAG value, from 0 to 255)
2880
2881
288220.4.8: Marker 0x08: rav-begin
2883
2884Marker 0x08 is used to indicate the beginning of an RAV (Recursive Annotation
2885Variation).  It has no associated items; it is a complete record by itself.
2886Instead, it marks the beginning of PGC records used to describe a recursive
2887annotation.  It is considered an opening bracket for a later rav-end record;
2888the recursive annotation is completely described between the bracket pair.  The
2889rav-begin/data/rav-end structures can be nested.
2890
2891
289220.4.9: Marker 0x09: rav-end
2893
2894Marker 0x09 is used to indicate the end of an RAV (Recursive Annotation
2895Variation).  It has no associated items; it is a complete record by itself.
2896Instead, it marks the end of PGC records used to describe a recursive
2897annotation.  It is considered a closing bracket for an earlier rav-begin
2898record; the recursive annotation is completely described between the bracket
2899pair.  The rav-begin/data/rav-end structures can be nested.
2900
2901
290220.4.10: Marker 0x0a: escape-string
2903
2904Marker 0x0a is used to indicate the presence of an escape string.  This is a
2905string represented by the use of the percent sign ("%") escape mechanism in
2906PGN.  The data that is escaped is the sequence of characters immediately
2907follwoing the percent sign up to but not including the terminating newline.  As
2908is the case with the PGN percent sign escape, the use of a PGC escape-string
2909record is limited to use for non-archival data.  There is one item that follows
2910marker 0x0a to form the "escape-string" record; this is the string data being
2911escaped:
2912
29131) string-2 (escaped string data)
2914
2915
291621: E-mail correspondence usage
2917
2918*** This section is under development.
2919
2920
2921Standard: EOF
2922