[lex.ccon]

5 Lexical conventions [lex]

5.13 Literals [lex.literal]

5.13.3 Character literals [lex.ccon]

character-literal:
encoding-prefix_opt ' c-char-sequence '

encoding-prefix: one of
u8 u U L

c-char-sequence:
c-char c-char-sequence_opt

c-char:
basic-c-char
escape-sequence
universal-character-name

basic-c-char:
any member of the translation character set except the U+0027 apostrophe,
U+005c reverse solidus, or new-line character

escape-sequence:
simple-escape-sequence
numeric-escape-sequence
conditional-escape-sequence

simple-escape-sequence:
\ simple-escape-sequence-char

simple-escape-sequence-char: one of
' " ? \ a b f n r t v

numeric-escape-sequence:
octal-escape-sequence
hexadecimal-escape-sequence

simple-octal-digit-sequence:
octal-digit simple-octal-digit-sequence_opt

octal-escape-sequence:
\ octal-digit
\ octal-digit octal-digit
\ octal-digit octal-digit octal-digit
\o{ simple-octal-digit-sequence }

hexadecimal-escape-sequence:
\x simple-hexadecimal-digit-sequence
\x{ simple-hexadecimal-digit-sequence }

conditional-escape-sequence:
\ conditional-escape-sequence-char

conditional-escape-sequence-char:
any member of the basic character set that is not an octal-digit, a simple-escape-sequence-char, or the characters N, o, u, U, or x
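
The escape-sequence productions above can be read as a dispatch on the character that follows the backslash. The following is a minimal sketch of that dispatch, not part of the specification: the names `EscapeKind`, `is_octal_digit`, `is_simple_escape_char`, and `classify_after_backslash` are hypothetical, the sketch assumes its input is already a member of the basic character set, and it does not check that `o` is followed by `{`.

```cpp
// Hypothetical classifier for the character that follows a backslash.
// Assumption: the input is a member of the basic character set.
enum class EscapeKind {
    Simple,          // \' \" \? \\ \a \b \f \n \r \t \v
    Octal,           // \ followed by an octal digit, or the \o{...} form
    Hexadecimal,     // \x... or \x{...}
    ExcludedLetter,  // N, u, U: excluded from conditional-escape-sequence-char
    Conditional      // any remaining character
};

constexpr bool is_octal_digit(char c) { return c >= '0' && c <= '7'; }

constexpr bool is_simple_escape_char(char c) {
    return c == '\'' || c == '"' || c == '?' || c == '\\' ||
           c == 'a' || c == 'b' || c == 'f' || c == 'n' ||
           c == 'r' || c == 't' || c == 'v';
}

constexpr EscapeKind classify_after_backslash(char c) {
    return is_simple_escape_char(c)                ? EscapeKind::Simple
         : (is_octal_digit(c) || c == 'o')         ? EscapeKind::Octal
         : c == 'x'                                ? EscapeKind::Hexadecimal
         : (c == 'N' || c == 'u' || c == 'U')      ? EscapeKind::ExcludedLetter
                                                   : EscapeKind::Conditional;
}

// '8' is a digit but not an octal-digit, so \8 falls through to the
// conditional-escape-sequence production.
static_assert(classify_after_backslash('n') == EscapeKind::Simple, "simple");
static_assert(classify_after_backslash('7') == EscapeKind::Octal, "octal");
static_assert(classify_after_backslash('8') == EscapeKind::Conditional, "conditional");
static_assert(classify_after_backslash('x') == EscapeKind::Hexadecimal, "hexadecimal");
```

The order of the tests mirrors the exclusions listed for conditional-escape-sequence-char: a character reaches the last branch only if every earlier form has been ruled out.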

1

A multicharacter literal is a character-literal whose c-char-sequence consists of more than one c-char.

A multicharacter literal shall not have an encoding-prefix.

If a multicharacter literal contains a c-char that is not encodable as a single code unit in the ordinary literal encoding, the program is ill-formed.

Multicharacter literals are conditionally-supported.
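
As an illustration of the difference between one c-char and several, the sketch below checks the types of both forms at compile time. It assumes an implementation that supports multicharacter literals; because they are conditionally-supported, a conforming implementation may reject the block, and many compilers emit a warning for it. The names `kTwoChars` and `kMulticharIsInt` are hypothetical.

```cpp
#include <type_traits>

// One c-char: an ordinary character literal.
static_assert(std::is_same<decltype('a'), char>::value,
              "a single c-char without a prefix yields char");

// Two c-chars: a multicharacter literal (assumes the implementation
// supports them; expect a compiler warning).
static_assert(std::is_same<decltype('ab'), int>::value,
              "more than one c-char without a prefix yields int");

// The value is implementation-defined, so it is stored but never
// compared against a specific number here.
constexpr int kTwoChars = 'ab';
constexpr bool kMulticharIsInt = std::is_same<decltype('ab'), int>::value;
```

Writing `u'ab'` or `L'ab'` instead would violate the rule that a multicharacter literal shall not have an encoding-prefix.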

2

The kind of a character-literal, its type, and its associated character encoding ([lex.charset]) are determined by its encoding-prefix and its c-char-sequence as defined by Table 9.

Table 9 — Character literals [tab:lex.ccon.literal]

| Encoding prefix | Kind | Type | Associated character encoding | Example |
|---|---|---|---|---|
| none | ordinary character literal | char | ordinary literal encoding | `'v'` |
| none | multicharacter literal | int | ordinary literal encoding | `'abcd'` |
| L | wide character literal | wchar_t | wide literal encoding | `L'w'` |
| u8 | UTF-8 character literal | char8_t | UTF-8 | `u8'x'` |
| u | UTF-16 character literal | char16_t | UTF-16 | `u'y'` |
| U | UTF-32 character literal | char32_t | UTF-32 | `U'z'` |
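
The prefix-to-type mapping in Table 9 can be checked at compile time. The following is a small sketch rather than normative text: it covers the single-character rows only, guards the `char8_t` row with the `__cpp_char8_t` feature-test macro so that it also compiles before C++20, and uses the hypothetical names `kY16` and `kZ32`.

```cpp
#include <type_traits>

static_assert(std::is_same<decltype('v'), char>::value,
              "no prefix: ordinary character literal, type char");
static_assert(std::is_same<decltype(L'w'), wchar_t>::value,
              "L: wide character literal, type wchar_t");
static_assert(std::is_same<decltype(u'y'), char16_t>::value,
              "u: UTF-16 character literal, type char16_t");
static_assert(std::is_same<decltype(U'z'), char32_t>::value,
              "U: UTF-32 character literal, type char32_t");

#if defined(__cpp_char8_t)
// Only checked where char8_t exists as a distinct type (C++20 onward).
static_assert(std::is_same<decltype(u8'x'), char8_t>::value,
              "u8: UTF-8 character literal, type char8_t");
#endif

// For the UTF-16 and UTF-32 rows the value is the code unit in that
// encoding, so these do not depend on the ordinary literal encoding.
constexpr char16_t kY16 = u'y';
constexpr char32_t kZ32 = U'z';
static_assert(kY16 == 0x0079, "UTF-16 code unit for y");
static_assert(kZ32 == 0x007A, "UTF-32 code unit for z");
```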

3

In translation phase 4, the value of a character-literal is determined using the range of representable values of the character-literal's type in translation phase 7.

A multicharacter literal has an implementation-defined value.

The value of any other kind of character-literal is determined as follows:

4

The character specified by a simple-escape-sequence is specified in Table 10.

[Note 1:

Using an escape sequence for a question mark is supported for compatibility with C++ 2014 and C.

— end note]

Table 10 — Simple escape sequences [tab:lex.ccon.esc]

| Code point | Character | simple-escape-sequence |
|---|---|---|
| U+000a | line feed | `\n` |
| U+0009 | character tabulation | `\t` |
| U+000b | line tabulation | `\v` |
| U+0008 | backspace | `\b` |
| U+000d | carriage return | `\r` |
| U+000c | form feed | `\f` |
| U+0007 | alert | `\a` |
| U+005c | reverse solidus | `\\` |
| U+003f | question mark | `\?` |
| U+0027 | apostrophe | `\'` |
| U+0022 | quotation mark | `\"` |
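
The rows of Table 10 can be cross-checked with UTF-32 character literals, whose values are code points and therefore do not depend on the ordinary literal encoding. This is a sketch under that assumption; the lookup function `simple_escape_code_point` and the sentinel `kNotSimpleEscape` are hypothetical names, and the `constexpr` switch requires C++14 or later.

```cpp
// Sentinel returned for characters that are not a simple-escape-sequence-char.
constexpr char32_t kNotSimpleEscape = 0xFFFFFFFF;

// Maps the character after the backslash to the code point in Table 10.
constexpr char32_t simple_escape_code_point(char c) {
    switch (c) {
        case 'n':  return 0x000A;  // line feed
        case 't':  return 0x0009;  // character tabulation
        case 'v':  return 0x000B;  // line tabulation
        case 'b':  return 0x0008;  // backspace
        case 'r':  return 0x000D;  // carriage return
        case 'f':  return 0x000C;  // form feed
        case 'a':  return 0x0007;  // alert
        case '\\': return 0x005C;  // reverse solidus
        case '?':  return 0x003F;  // question mark
        case '\'': return 0x0027;  // apostrophe
        case '"':  return 0x0022;  // quotation mark
        default:   return kNotSimpleEscape;
    }
}

// Each table row agrees with what the compiler produces for the literal.
static_assert(simple_escape_code_point('n') == U'\n', "line feed");
static_assert(simple_escape_code_point('t') == U'\t', "character tabulation");
static_assert(simple_escape_code_point('v') == U'\v', "line tabulation");
static_assert(simple_escape_code_point('b') == U'\b', "backspace");
static_assert(simple_escape_code_point('r') == U'\r', "carriage return");
static_assert(simple_escape_code_point('f') == U'\f', "form feed");
static_assert(simple_escape_code_point('a') == U'\a', "alert");
static_assert(simple_escape_code_point('\\') == U'\\', "reverse solidus");
static_assert(simple_escape_code_point('?') == U'\?', "question mark");
static_assert(simple_escape_code_point('\'') == U'\'', "apostrophe");
static_assert(simple_escape_code_point('"') == U'\"', "quotation mark");
```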