| Alias(es) | ISO-IR-111 | 
|---|---|
| Languages | Russian, Belarusian, Macedonian, Serbian, Ukrainian (partial) | 
| Standard | ECMA-113:1986 | 
| Classification | Extended ASCII, KOI | 
| Extends | KOI8-B | 
| Succeeded by | ECMA-113:1988 (ISO-8859-5) | 
| Other related encoding | KOI8-F | 
ISO-IR-111[1] or KOI8-E[2] is an 8-bit character set. It is a multinational extension of KOI-8 for Belarusian, Macedonian, Serbian, and Ukrainian (except Ґґ, which is added in KOI8-F). The name "ISO-IR-111" refers to its registration number in the ISO-IR registry, and denotes it as a set usable with ISO/IEC 2022.
It was defined by the first (1986) edition of ECMA-113,[3] which is the Ecma International standard corresponding to ISO/IEC 8859-5, and as such also corresponds to a 1987 draft version of ISO-8859-5.[4] The published editions of ISO/IEC 8859-5 instead correspond to subsequent editions of ECMA-113, which define a different encoding.[5]
Naming confusion
ISO-IR-111, the 1985 edition of ECMA-113 (also called "ECMA-Cyrillic" or "KOI8-E"), was based on the 1974 edition of GOST 19768 (i.e. KOI-8). In 1987 ECMA-113 was redesigned.[5] These newer editions of ECMA-113 are equivalent to ISO-8859-5,[5][6] and do not follow the KOI layout. This confusion has led to a common misconception that ISO-8859-5 was defined in or based on GOST 19768-74.[6]
Possibly as another consequence of this, RFC 1345 erroneously lists a different code page under the names "ISO-IR-111" and "ECMA-Cyrillic", resembling ISO-8859-5 with re-ordered rows, and partially compatible with Windows-1251.[7][6] Due to concerns that existing implementations might use the RFC 1345 definition for those two labels, it was proposed that the IANA additionally recognise KOI8-E as a label for ECMA-113:1985 content,[7] and the IANA presently lists that label as an alias.[2]
Character set
The following table shows the ISO-IR-111 encoding. Each character is shown with its equivalent Unicode code point.
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0x | | | | | | | | | | | | | | | | |
| 1x | | | | | | | | | | | | | | | | |
| 2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
| 3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
| 4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
| 5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
| 6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
| 7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | |
| 8x | | | | | | | | | | | | | | | | |
| 9x | | | | | | | | | | | | | | | | |
| Ax | NBSP | ђ 0452 | ѓ 0453 | ё 0451 | є 0454 | ѕ 0455 | і 0456 | ї 0457 | ј 0458 | љ 0459 | њ 045A | ћ 045B | ќ 045C | SHY | ў 045E | џ 045F |
| Bx | № 2116 | Ђ 0402 | Ѓ 0403 | Ё 0401 | Є 0404 | Ѕ 0405 | І 0406 | Ї 0407 | Ј 0408 | Љ 0409 | Њ 040A | Ћ 040B | Ќ 040C | ¤ 00A4 | Ў 040E | Џ 040F |
| Cx | ю 044E | а 0430 | б 0431 | ц 0446 | д 0434 | е 0435 | ф 0444 | г 0433 | х 0445 | и 0438 | й 0439 | к 043A | л 043B | м 043C | н 043D | о 043E |
| Dx | п 043F | я 044F | р 0440 | с 0441 | т 0442 | у 0443 | ж 0436 | в 0432 | ь 044C | ы 044B | з 0437 | ш 0448 | э 044D | щ 0449 | ч 0447 | ъ 044A |
| Ex | Ю 042E | А 0410 | Б 0411 | Ц 0426 | Д 0414 | Е 0415 | Ф 0424 | Г 0413 | Х 0425 | И 0418 | Й 0419 | К 041A | Л 041B | М 041C | Н 041D | О 041E |
| Fx | П 041F | Я 042F | Р 0420 | С 0421 | Т 0422 | У 0423 | Ж 0416 | В 0412 | Ь 042C | Ы 042B | З 0417 | Ш 0428 | Э 042D | Щ 0429 | Ч 0427 | Ъ 042A |
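As an illustration, the table above can be transcribed into a small lookup-based codec. The following is a minimal sketch in Python, not a reference implementation: the names `ROWS`, `DECODE_TABLE`, `ENCODE_TABLE`, `decode_iso_ir_111` and `encode_iso_ir_111` are hypothetical, NBSP and SHY are taken to be U+00A0 and U+00AD, and bytes 0x80 to 0x9F (which the table leaves empty) are assumed to decode to the replacement character U+FFFD.

```python
# Rows Ax-Fx of the ISO-IR-111 table, as Unicode code points in hex.
ROWS = {
    0xA0: "00A0 0452 0453 0451 0454 0455 0456 0457 0458 0459 045A 045B 045C 00AD 045E 045F",
    0xB0: "2116 0402 0403 0401 0404 0405 0406 0407 0408 0409 040A 040B 040C 00A4 040E 040F",
    0xC0: "044E 0430 0431 0446 0434 0435 0444 0433 0445 0438 0439 043A 043B 043C 043D 043E",
    0xD0: "043F 044F 0440 0441 0442 0443 0436 0432 044C 044B 0437 0448 044D 0449 0447 044A",
    0xE0: "042E 0410 0411 0426 0414 0415 0424 0413 0425 0418 0419 041A 041B 041C 041D 041E",
    0xF0: "041F 042F 0420 0421 0422 0423 0416 0412 042C 042B 0417 0428 042D 0429 0427 042A",
}

# Byte value -> character for the high half (0xA0-0xFF).
DECODE_TABLE = {
    base + i: chr(int(cp, 16))
    for base, row in ROWS.items()
    for i, cp in enumerate(row.split())
}

# Character -> byte value (inverse of DECODE_TABLE).
ENCODE_TABLE = {ch: byte for byte, ch in DECODE_TABLE.items()}


def decode_iso_ir_111(data: bytes) -> str:
    """Decode bytes as ISO-IR-111; unassigned bytes become U+FFFD."""
    out = []
    for byte in data:
        if byte < 0x80:
            out.append(chr(byte))  # the lower half is plain ASCII
        else:
            # Assumption: 0x80-0x9F are empty in the table, so replace them.
            out.append(DECODE_TABLE.get(byte, "\ufffd"))
    return "".join(out)


def encode_iso_ir_111(text: str) -> bytes:
    """Encode text as ISO-IR-111; raise ValueError for unmappable characters."""
    out = bytearray()
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ord(ch))
        elif ch in ENCODE_TABLE:
            out.append(ENCODE_TABLE[ch])
        else:
            raise ValueError(f"cannot encode {ch!r} in ISO-IR-111")
    return bytes(out)
```

With this sketch, `decode_iso_ir_111(b"\xf0\xd2\xc9\xd7\xc5\xd4")` yields "Привет", while `encode_iso_ir_111("Ґ")` raises `ValueError`, consistent with Ґ being absent from the set.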
Extended and modified versions
A modified version named KOI8 Unified or KOI8-F was used in software produced by Fingertip Software, adding the Ґ in its KOI8-U location (replacing the soft hyphen and displacing the universal currency sign), and adding some graphical characters in the C1 control codes area, mainly from KOI8-R and Windows-1251.[4][6][8][9]
Incorrect RFC 1345 code page
| Languages | Russian, Belarusian, Macedonian, Serbian |
|---|---|
| Standard | RFC 1345 | 
| Classification | Extended ASCII | 
| Transforms / Encodes | ISO-IR-111 | 
| Other related encodings | ISO-8859-5, Windows-1251 | 
RFC 1345 erroneously lists a different code page under the name ISO-IR-111, encoding the same Cyrillic characters but with a different layout. It resembles a mixture of Windows-1251 and ISO-8859-5.[7] Specifically, line A_ corresponds to ISO-8859-5, lines C_ through F_ correspond to Windows-1251[6] (equivalent to lines B_ through E_ of ISO-8859-5), and line B_ nearly corresponds to line F_ of ISO-8859-5, with the exception of the § being replaced with a ¤.
Certain codes resemble ISO-IR-111 with flipped letter case, which may have contributed to the confusion. The majority differ and are shown below.
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ax | NBSP | Ё | Ђ | Ѓ | Є | Ѕ | І | Ї | Ј | Љ | Њ | Ћ | Ќ | SHY | Ў | Џ | 
| Bx | № | ё | ђ | ѓ | є | ѕ | і | ї | ј | љ | њ | ћ | ќ | ¤ | ў | џ | 
| Cx | А | Б | В | Г | Д | Е | Ж | З | И | Й | К | Л | М | Н | О | П | 
| Dx | Р | С | Т | У | Ф | Х | Ц | Ч | Ш | Щ | Ъ | Ы | Ь | Э | Ю | Я | 
| Ex | а | б | в | г | д | е | ж | з | и | й | к | л | м | н | о | п | 
| Fx | р | с | т | у | ф | х | ц | ч | ш | щ | ъ | ы | ь | э | ю | я | 
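The row-by-row description of the RFC 1345 layout, and the remark about flipped letter case, can be checked with a short script. The sketch below assumes Python's built-in `iso8859_5` and `cp1251` codecs as stand-ins for ISO-8859-5 and Windows-1251; the function names are hypothetical, and the ISO-IR-111 rows are rebuilt from the code points listed in the character set table.

```python
def rfc1345_high_half() -> str:
    """Rows Ax-Fx of the RFC 1345 layout, assembled as the text describes."""
    # Row Ax corresponds to row Ax of ISO-8859-5.
    row_a = bytes(range(0xA0, 0xB0)).decode("iso8859_5")
    # Row Bx is row Fx of ISO-8859-5 with the section sign replaced by
    # the currency sign.
    row_b = bytes(range(0xF0, 0x100)).decode("iso8859_5")
    row_b = row_b.replace("\u00a7", "\u00a4")
    # Rows Cx-Fx correspond to the same rows of Windows-1251.
    rows_c_to_f = bytes(range(0xC0, 0x100)).decode("cp1251")
    return row_a + row_b + rows_c_to_f


def iso_ir_111_high_half() -> str:
    """Rows Ax-Fx of ISO-IR-111, from the code points in its table."""
    row_a = ("\u00a0\u0452\u0453\u0451"
             + "".join(map(chr, range(0x0454, 0x045D)))
             + "\u00ad\u045e\u045f")
    row_b = ("\u2116\u0402\u0403\u0401"
             + "".join(map(chr, range(0x0404, 0x040D)))
             + "\u00a4\u040e\u040f")
    lower_hex = (
        "044E 0430 0431 0446 0434 0435 0444 0433 "
        "0445 0438 0439 043A 043B 043C 043D 043E "
        "043F 044F 0440 0441 0442 0443 0436 0432 "
        "044C 044B 0437 0448 044D 0449 0447 044A"
    )
    lower = "".join(chr(int(cp, 16)) for cp in lower_hex.split())
    # Rows Ex-Fx hold the uppercase forms of rows Cx-Dx.
    return row_a + row_b + lower + lower.upper()


def compare_layouts():
    """Classify each byte 0xA0-0xFF as identical, case-flipped, or different."""
    same, flipped, other = [], [], []
    pairs = zip(iso_ir_111_high_half(), rfc1345_high_half())
    for offset, (koi, rfc) in enumerate(pairs):
        code = 0xA0 + offset
        if koi == rfc:
            same.append(code)
        elif koi.swapcase() == rfc:
            flipped.append(code)
        else:
            other.append(code)
    return same, flipped, other


same, flipped, other = compare_layouts()
print(len(same), len(flipped), len(other))
```

Under these assumptions the comparison finds 4 identical positions (NBSP, SHY, № and ¤), 26 positions that differ only in letter case, and 66 positions that differ outright.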
References
1. ECMA (1 August 1985). Right-hand Part of the Cyrillic Alphabet (PDF). ITSCJ/IPSJ. ISO-IR-111.
2. "Character Sets". IANA.
3. ECMA-113. 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet (1st ed., June 1986).
4. Czyborra, Roman (1998-11-30) [1998-05-25]. "The Cyrillic Charset Soup". Archived from the original on 2016-12-03. Retrieved 2016-12-03.
5. ECMA-113. 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet (2nd ed., June 1988).
6. Nechayev, Valentin (2013) [2001]. "Review of 8-bit Cyrillic encodings universe". Archived from the original on 2016-12-05. Retrieved 2016-12-05.
7. Sokolov, Michael (2003-04-05). "ECMA-cyrillic alias iso-ir-111 sore". IETF Charsets Mailing List.
8. "KOI8 Unified". Fingertip Software. Archived from the original on 1998-01-09. Retrieved 2020-02-11.
9. Leisher, Mark (2008) [1998-03-05]. "KOI8 Unified Cyrillic to Unicode 2.1 mapping table". Department of Mathematical Sciences, New Mexico State University. Archived from the original on 2020-07-12. Retrieved 2020-05-02.