# 5 Lexical conventions [[lex]](./#lex)

## 5.13 Literals [[lex.literal]](lex.literal#lex.ccon)

### 5.13.3 Character literals [lex.ccon]

*character-literal:*
- *encoding-prefix*<sub>opt</sub> `'` *c-char-sequence* `'`

*encoding-prefix:* one of
- `u8` `u` `U` `L`

*c-char-sequence:*
- *c-char* *c-char-sequence*<sub>opt</sub>

*c-char:*
- *basic-c-char*
- *escape-sequence*
- *universal-character-name*

*basic-c-char:*
- any member of the translation character set except the U+0027 apostrophe, U+005c reverse solidus, or new-line character

*escape-sequence:*
- *simple-escape-sequence*
- *numeric-escape-sequence*
- *conditional-escape-sequence*

*simple-escape-sequence:*
- `\` *simple-escape-sequence-char*

*simple-escape-sequence-char:* one of
- `'` `"` `?` `\` `a` `b` `f` `n` `r` `t` `v`

*numeric-escape-sequence:*
- *octal-escape-sequence*
- *hexadecimal-escape-sequence*

*simple-octal-digit-sequence:*
- *octal-digit* *simple-octal-digit-sequence*<sub>opt</sub>

*octal-escape-sequence:*
- `\` *octal-digit*
- `\` *octal-digit* *octal-digit*
- `\` *octal-digit* *octal-digit* *octal-digit*
- `\o{` *simple-octal-digit-sequence* `}`

*hexadecimal-escape-sequence:*
- `\x` *simple-hexadecimal-digit-sequence*
- `\x{` *simple-hexadecimal-digit-sequence* `}`

*conditional-escape-sequence:*
- `\` *conditional-escape-sequence-char*

*conditional-escape-sequence-char:*
- any member of the basic character set that is not an *octal-digit*, a *simple-escape-sequence-char*, or the characters N, o, u, U, or x

1 A *multicharacter literal* is a *character-literal* whose *c-char-sequence* consists of more than one *c-char*. A multicharacter literal shall not have an *encoding-prefix*. If a multicharacter literal contains a *c-char* that is not encodable as a single code unit in the ordinary literal encoding, the program is ill-formed. Multicharacter literals are conditionally-supported.

2 The kind of a *character-literal*, its type, and its associated character encoding ([[lex.charset]](lex.charset "5.3.1 Character sets")) are determined by its *encoding-prefix* and its *c-char-sequence* as defined by Table 9.

Table 9: Character literals [tab:lex.ccon.literal]

| Encoding prefix | Kind | Type | Associated character encoding | Example |
| --- | --- | --- | --- | --- |
| none | *ordinary character literal* | `char` | ordinary literal encoding | `'v'` |
| none | multicharacter literal | `int` | ordinary literal encoding | `'abcd'` |
| `L` | *wide character literal* | `wchar_t` | wide literal encoding | `L'w'` |
| `u8` | *UTF-8 character literal* | `char8_t` | UTF-8 | `u8'x'` |
| `u` | *UTF-16 character literal* | `char16_t` | UTF-16 | `u'y'` |
| `U` | *UTF-32 character literal* | `char32_t` | UTF-32 | `U'z'` |

3 In translation phase 4, the value of a *character-literal* is determined using the range of representable values of the *character-literal*'s type in translation phase 7. A multicharacter literal has an implementation-defined value. The value of any other kind of *character-literal* is determined as follows:

- (3.1) A *character-literal* with a *c-char-sequence* consisting of a single *basic-c-char*, *simple-escape-sequence*, or *universal-character-name* is the code unit value of the specified character as encoded in the literal's associated character encoding. If the specified character lacks representation in the literal's associated character encoding or if it cannot be encoded as a single code unit, then the program is ill-formed.
- (3.2) A *character-literal* with a *c-char-sequence* consisting of a single *numeric-escape-sequence* has a value as follows:
  * (3.2.1) Let v be the integer value represented by the octal number comprising the sequence of *octal-digit*s in an *octal-escape-sequence* or by the hexadecimal number comprising the sequence of *hexadecimal-digit*s in a *hexadecimal-escape-sequence*.
  * (3.2.2) If v does not exceed the range of representable values of the *character-literal*'s type, then the value is v.
  * (3.2.3) Otherwise, if the *character-literal*'s *encoding-prefix* is absent or `L`, and v does not exceed the range of representable values of the corresponding unsigned type for the underlying type of the *character-literal*'s type, then the value is the unique value of the *character-literal*'s type T that is congruent to v modulo $2^N$, where N is the width of T.
  * (3.2.4) Otherwise, the program is ill-formed.
- (3.3) A *character-literal* with a *c-char-sequence* consisting of a single *conditional-escape-sequence* is conditionally-supported and has an implementation-defined value.

4 The character specified by a *simple-escape-sequence* is specified in Table 10.

[*Note 1*: Using an escape sequence for a question mark is supported for compatibility with C++ 2014 and C. — *end note*]

Table 10: Simple escape sequences [tab:lex.ccon.esc]

| Code point | Character | *simple-escape-sequence* |
| --- | --- | --- |
| U+000a | line feed | `\n` |
| U+0009 | character tabulation | `\t` |
| U+000b | line tabulation | `\v` |
| U+0008 | backspace | `\b` |
| U+000d | carriage return | `\r` |
| U+000c | form feed | `\f` |
| U+0007 | alert | `\a` |
| U+005c | reverse solidus | `\\` |
| U+003f | question mark | `\?` |
| U+0027 | apostrophe | `\'` |
| U+0022 | quotation mark | `\"` |
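
The rules in this subclause can be checked at compile time. The following is a minimal, non-normative sketch rather than part of the specification. The helper `code_unit` is a hypothetical name introduced only for this illustration. The value checks use UTF-32 literals because their code units are fixed by Unicode, whereas the ordinary literal encoding is implementation-defined. The `u8` check is guarded because `char8_t` is assumed to be available only where the implementation defines `__cpp_char8_t`.

```cpp
#include <type_traits>

// Hypothetical helper (not from the standard): the numeric value of a
// UTF-32 code unit, widened so that comparisons are unambiguous.
constexpr unsigned long code_unit(char32_t c) {
    return static_cast<unsigned long>(c);
}

// Table 9: the encoding-prefix selects the type of the literal.
static_assert(std::is_same<decltype('v'), char>::value, "no prefix: char");
static_assert(std::is_same<decltype(L'w'), wchar_t>::value, "L: wchar_t");
static_assert(std::is_same<decltype(u'y'), char16_t>::value, "u: char16_t");
static_assert(std::is_same<decltype(U'z'), char32_t>::value, "U: char32_t");
#ifdef __cpp_char8_t  // only checked where char8_t exists
static_assert(std::is_same<decltype(u8'x'), char8_t>::value, "u8: char8_t");
#endif

// Table 10: each simple-escape-sequence names one specific character.
static_assert(code_unit(U'\n') == 0x000A, "line feed");
static_assert(code_unit(U'\t') == 0x0009, "character tabulation");
static_assert(code_unit(U'\\') == 0x005C, "reverse solidus");
static_assert(code_unit(U'\'') == 0x0027, "apostrophe");
static_assert(code_unit(U'\?') == code_unit(U'?'), "escaped question mark");

// (3.2.1) and (3.2.2): octal and hexadecimal escapes give the value directly.
static_assert(code_unit(U'\101') == 0x41, "octal 101 is 65");
static_assert(code_unit(U'\x41') == 0x41, "hexadecimal 41 is 65");

// (3.2.3): with no prefix, a value that fits the corresponding unsigned type
// is reduced modulo 2^N, so converting back to unsigned char recovers it
// whether plain char is signed or unsigned.
static_assert(static_cast<unsigned char>('\xff') == 0xFF, "wraps modulo 2^N");
```

Every check is a `static_assert`, so a translation unit containing this block compiles only if the implementation agrees with the tables. Multicharacter literals and conditional escape sequences are left out on purpose, since both are conditionally-supported and have implementation-defined values.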