5.3.2 Universal character names [lex.universal.char]
n-char:
    any member of the translation character set except the U+007d right curly bracket or new-line character
n-char-sequence:
    n-char n-char-sequence opt
named-universal-character:
    \N{ n-char-sequence }
hex-quad:
    hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
simple-hexadecimal-digit-sequence:
    hexadecimal-digit simple-hexadecimal-digit-sequence opt
universal-character-name:
    \u hex-quad
    \U hex-quad hex-quad
    \u{ simple-hexadecimal-digit-sequence }
    named-universal-character
The universal-character-name construct provides a way to name any element in the translation character set using just the basic character set.
If a universal-character-name outside the c-char-sequence, s-char-sequence, or r-char-sequence of a character-literal or string-literal (in either case, including within a user-defined-literal) corresponds to a control character or to a character in the basic character set, the program is ill-formed.
[Note 1:
A sequence of characters resembling a universal-character-name in an r-char-sequence ([lex.string]) does not form a universal-character-name.
-- end note]
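The restriction above and Note 1 can be illustrated with a small sketch. The identifier and the array name are made up for illustration, and the ill-formed case is left in a comment so that the fragment still compiles.

```cpp
// Inside a character-literal or string-literal, a universal-character-name
// may designate a character of the basic character set.
static_assert('\u0041' == 'A', "U+0041 is LATIN CAPITAL LETTER A");

// Outside literals, a universal-character-name may appear in an identifier
// as long as it names neither a control character nor a basic character.
// Hypothetical identifier: "caf" followed by U+00E9.
constexpr int caf\u00E9 = 42;

// Ill-formed: U+0061 ('a') is in the basic character set and this use is
// outside any literal.
//   int \u0061 = 0;

// In a raw string literal the same six characters are kept as written.
constexpr char raw[] = R"(\u0041)";
static_assert(sizeof(raw) == 7, "six characters plus the terminating null");
static_assert(raw[0] == '\\' && raw[1] == 'u', "no universal-character-name is formed");
```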
A universal-character-name of the form \u hex-quad, \U hex-quad hex-quad, or \u{simple-hexadecimal-digit-sequence} designates the character in the translation character set whose Unicode scalar value is the hexadecimal number represented by the sequence of hexadecimal-digits in the universal-character-name.
The program is ill-formed if that number is not a Unicode scalar value.
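As a rough illustration of the two preceding sentences, the sketch below checks that two spellings with the same hexadecimal number designate the same character, and writes the scalar-value constraint as a predicate. The helper name `is_unicode_scalar_value` is hypothetical and not part of the standard. Its range test assumes the Unicode definition of a scalar value: any code point from 0 to 0x10FFFF outside the surrogate range 0xD800 to 0xDFFF.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the standard): true if `value` is a
// Unicode scalar value, i.e. a code point outside the surrogate range.
constexpr bool is_unicode_scalar_value(std::uint32_t value) {
    return value <= 0x10FFFF && !(value >= 0xD800 && value <= 0xDFFF);
}

// The four-digit and eight-digit forms designate the same character when
// the hexadecimal numbers are equal.
static_assert(U'\u00E9' == U'\U000000E9', "same scalar value, same character");
static_assert(U'\u00E9' == 0xE9, "the value is the hexadecimal number given");

static_assert(is_unicode_scalar_value(0x00E9), "ordinary scalar value");
static_assert(is_unicode_scalar_value(0x10FFFF), "largest scalar value");
static_assert(!is_unicode_scalar_value(0xD800), "surrogate code point");
static_assert(!is_unicode_scalar_value(0x110000), "beyond the Unicode code space");

// Ill-formed, because D800 is not a Unicode scalar value:
//   char32_t bad = U'\uD800';
```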
A universal-character-name that is a named-universal-character designates the corresponding character in the Unicode Standard (chapter 4.8 Name) if the n-char-sequence is equal to its character name or to one of its character name aliases of type "control", "correction", or "alternate"; otherwise, the program is ill-formed.
[Note 2:
These aliases are listed in the Unicode Character Database's NameAliases.txt.
None of these names or aliases have leading or trailing spaces.
-- end note]
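A minimal sketch of the named and delimited forms follows. Both are recent additions to the language, so the sketch guards them with the feature-test macro `__cpp_named_character_escapes`. Treating that macro as a proxy for delimited-escape support is an assumption of this example, not something the text above states. The variable names are hypothetical.

```cpp
#if defined(__cpp_named_character_escapes) && __cpp_named_character_escapes >= 202207L
// Named form: the n-char-sequence must equal the character name exactly.
constexpr char32_t by_name = U'\N{LATIN SMALL LETTER E WITH ACUTE}';
// Delimited form: any number of hexadecimal digits between the braces.
constexpr char32_t by_brace = U'\u{E9}';
// A character name alias of type "control" is accepted as well.
constexpr char32_t line_feed = U'\N{LINE FEED}';
#else
// Fallback so that the checks below still compile on implementations
// that do not advertise named escapes.
constexpr char32_t by_name = U'\u00E9';
constexpr char32_t by_brace = U'\u00E9';
constexpr char32_t line_feed = U'\n';
#endif

static_assert(by_name == U'\u00E9', "the name designates U+00E9");
static_assert(by_brace == by_name, "both spellings designate the same character");
static_assert(line_feed == U'\n', "LINE FEED is a control alias for U+000A");

// Ill-formed: the spelling is not equal to the character name.
//   char32_t bad = U'\N{latin small letter e with acute}';
```

The comparison against `U'\n'` assumes that new-line has the value 0x0A in the UTF-32 literal encoding.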