2025-10-25 03:02:53 +03:00

[lex.universal.char]
# 5 Lexical conventions [[lex]](./#lex)
## 5.3 Characters [[lex.char]](lex.char#lex.universal.char)
### 5.3.2 Universal character names [lex.universal.char]
[n-char:](#nt:n-char "5.3.2Universal character names[lex.universal.char]")
any member of the translation character set except the U+007d right curly bracket or new-line character
[n-char-sequence:](#nt:n-char-sequence "5.3.2Universal character names[lex.universal.char]")
[*n-char*](#nt:n-char "5.3.2Universal character names[lex.universal.char]") [*n-char-sequence*](#nt:n-char-sequence "5.3.2Universal character names[lex.universal.char]")<sub>opt</sub>
[named-universal-character:](#nt:named-universal-character "5.3.2Universal character names[lex.universal.char]")
\N{ [*n-char-sequence*](#nt:n-char-sequence "5.3.2Universal character names[lex.universal.char]") }
[hex-quad:](#nt:hex-quad "5.3.2Universal character names[lex.universal.char]")
[*hexadecimal-digit*](lex.icon#nt:hexadecimal-digit "5.13.2Integer literals[lex.icon]") [*hexadecimal-digit*](lex.icon#nt:hexadecimal-digit "5.13.2Integer literals[lex.icon]") [*hexadecimal-digit*](lex.icon#nt:hexadecimal-digit "5.13.2Integer literals[lex.icon]") [*hexadecimal-digit*](lex.icon#nt:hexadecimal-digit "5.13.2Integer literals[lex.icon]")
[simple-hexadecimal-digit-sequence:](#nt:simple-hexadecimal-digit-sequence "5.3.2Universal character names[lex.universal.char]")
[*hexadecimal-digit*](lex.icon#nt:hexadecimal-digit "5.13.2Integer literals[lex.icon]") [*simple-hexadecimal-digit-sequence*](#nt:simple-hexadecimal-digit-sequence "5.3.2Universal character names[lex.universal.char]")<sub>opt</sub>
[universal-character-name:](#nt:universal-character-name "5.3.2Universal character names[lex.universal.char]")
\u [*hex-quad*](#nt:hex-quad "5.3.2Universal character names[lex.universal.char]")
\U [*hex-quad*](#nt:hex-quad "5.3.2Universal character names[lex.universal.char]") [*hex-quad*](#nt:hex-quad "5.3.2Universal character names[lex.universal.char]")
\u{ [*simple-hexadecimal-digit-sequence*](#nt:simple-hexadecimal-digit-sequence "5.3.2Universal character names[lex.universal.char]") }
[*named-universal-character*](#nt:named-universal-character "5.3.2Universal character names[lex.universal.char]")
[1](#1)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/lex.tex#L467)
The [*universal-character-name*](#nt:universal-character-name "5.3.2Universal character names[lex.universal.char]") construct provides a way to name any
element in the translation character set using just the basic character set[.](#1.sentence-1)
If a [*universal-character-name*](#nt:universal-character-name "5.3.2Universal character names[lex.universal.char]") outside
the [*c-char-sequence*](lex.ccon#nt:c-char-sequence "5.13.3Character literals[lex.ccon]"), [*s-char-sequence*](lex.string#nt:s-char-sequence "5.13.5String literals[lex.string]"), or [*r-char-sequence*](lex.string#nt:r-char-sequence "5.13.5String literals[lex.string]") of a [*character-literal*](lex.ccon#nt:character-literal "5.13.3Character literals[lex.ccon]") or [*string-literal*](lex.string#nt:string-literal "5.13.5String literals[lex.string]") (in either case, including within a [*user-defined-literal*](lex.ext#nt:user-defined-literal "5.13.9User-defined literals[lex.ext]"))
corresponds to a control character or to a character in the basic character set,
the program is ill-formed[.](#1.sentence-2)
[*Note [1](#note-1)*:
A sequence of characters resembling a [*universal-character-name*](#nt:universal-character-name "5.3.2Universal character names[lex.universal.char]") in an [*r-char-sequence*](lex.string#nt:r-char-sequence "5.13.5String literals[lex.string]") ([[lex.string]](lex.string "5.13.5String literals")) does not form a [*universal-character-name*](#nt:universal-character-name "5.3.2Universal character names[lex.universal.char]")[.](#1.sentence-3)
— *end note*]
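
As an informal illustration of paragraph 1 and Note 1 (not part of the normative text), the sketch below contrasts an ordinary literal with a raw literal. The variable names are hypothetical and chosen only for this example.

```cpp
// In an ordinary (non-raw) literal, \u00E9 is a universal-character-name:
// the array holds one UTF-16 code unit plus the terminating null.
constexpr char16_t cooked[] = u"\u00E9";

// In a raw literal the same six characters are taken verbatim
// (backslash, 'u', '0', '0', 'E', '9'), plus the terminating null.
constexpr char16_t raw[] = uR"(\u00E9)";

constexpr auto cooked_units = sizeof(cooked) / sizeof(char16_t);
constexpr auto raw_units = sizeof(raw) / sizeof(char16_t);

// Inside a literal, a universal-character-name may designate a member of
// the basic character set, so this comparison is well-formed.
constexpr bool same_as_plain_A = (U'\u0041' == U'A');

// Outside a literal the same spelling would make the program ill-formed,
// so the declaration below is left commented out:
// int \u0041 = 0;
```

Here `cooked_units` is 2 and `raw_units` is 7, which shows that the raw literal did not interpret the backslash sequence.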
[2](#2)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/lex.tex#L483)
A [*universal-character-name*](#nt:universal-character-name "5.3.2Universal character names[lex.universal.char]") of the form \u [*hex-quad*](#nt:hex-quad "5.3.2Universal character names[lex.universal.char]"), \U [*hex-quad*](#nt:hex-quad "5.3.2Universal character names[lex.universal.char]") [*hex-quad*](#nt:hex-quad "5.3.2Universal character names[lex.universal.char]"), or \u{[*simple-hexadecimal-digit-sequence*](#nt:simple-hexadecimal-digit-sequence "5.3.2Universal character names[lex.universal.char]")} designates the character in the translation character set
whose Unicode scalar value is the hexadecimal number represented by
the sequence of [*hexadecimal-digit*](lex.icon#nt:hexadecimal-digit "5.13.2Integer literals[lex.icon]")*s* in the [*universal-character-name*](#nt:universal-character-name "5.3.2Universal character names[lex.universal.char]")[.](#2.sentence-1)
The program is ill-formed if that number is not a Unicode scalar value[.](#2.sentence-2)
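
The rule in paragraph 2 can be sketched as a small checker that works on the spelling of a universal-character-name at run time or compile time. This is only an illustration, not how any particular compiler implements it: `is_scalar_value`, `hex_digit_value`, and `parse_hex_ucn` are hypothetical names, and the sketch assumes that a Unicode scalar value is any code point up to U+10FFFF outside the surrogate range U+D800 to U+DFFF. It does not apply the paragraph 1 restriction on control and basic characters.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Assumption: scalar values are 0..0x10FFFF minus surrogates 0xD800..0xDFFF.
constexpr bool is_scalar_value(std::uint32_t v) {
    return v <= 0x10FFFF && !(v >= 0xD800 && v <= 0xDFFF);
}

// Returns 0..15 for a hexadecimal-digit, or -1 for any other character.
constexpr int hex_digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Parses the whole of `text` as \uXXXX, \UXXXXXXXX, or \u{X...}.
// Returns the designated scalar value, or nullopt when the spelling is
// malformed or the number is not a Unicode scalar value.
constexpr std::optional<char32_t> parse_hex_ucn(std::string_view text) {
    if (text.size() < 3 || text[0] != '\\') return std::nullopt;

    std::string_view digits;
    if (text[1] == 'u' && text[2] == '{') {
        // Delimited form: at least one digit between the braces.
        if (text.back() != '}') return std::nullopt;
        digits = text.substr(3, text.size() - 4);
        if (digits.empty()) return std::nullopt;
    } else if (text[1] == 'u') {
        digits = text.substr(2);  // exactly one hex-quad
        if (digits.size() != 4) return std::nullopt;
    } else if (text[1] == 'U') {
        digits = text.substr(2);  // exactly two hex-quads
        if (digits.size() != 8) return std::nullopt;
    } else {
        return std::nullopt;
    }

    std::uint32_t value = 0;
    for (char c : digits) {
        const int d = hex_digit_value(c);
        if (d < 0) return std::nullopt;
        value = value * 16 + static_cast<std::uint32_t>(d);
        // Early exit keeps `value` small, so the multiply cannot overflow.
        if (value > 0x10FFFF) return std::nullopt;
    }
    if (!is_scalar_value(value)) return std::nullopt;
    return static_cast<char32_t>(value);
}
```

Under these assumptions, the spellings `\U0001F600` and `\u{1F600}` yield the same value, while `\uD800` (a surrogate) and `\U00110000` (beyond U+10FFFF) are rejected.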
[3](#3)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/lex.tex#L494)
A [*universal-character-name*](#nt:universal-character-name "5.3.2Universal character names[lex.universal.char]") that is a [*named-universal-character*](#nt:named-universal-character "5.3.2Universal character names[lex.universal.char]") designates the corresponding character
in the Unicode Standard (chapter 4.8 Name)
if the [*n-char-sequence*](#nt:n-char-sequence "5.3.2Universal character names[lex.universal.char]") is equal
to its character name or
to one of its character name aliases of
type “control”, “correction”, or “alternate”;
otherwise, the program is ill-formed[.](#3.sentence-1)
[*Note [2](#note-2)*:
These aliases are listed in
the Unicode Character Database's NameAliases.txt[.](#3.sentence-2)
None of these names or aliases have leading or trailing spaces[.](#3.sentence-3)
— *end note*]
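
The matching rule in paragraph 3 can be sketched with a tiny lookup table. This is an illustration only: `NameKind`, `NameEntry`, `kSampleNames`, and `lookup_named` are hypothetical names, and the table is a hand-picked sample, whereas a real implementation would presumably be generated from the Unicode Character Database. The sketch assumes exact, case-sensitive comparison with no whitespace trimming, and it accepts only character names and aliases of type control, correction, or alternate.

```cpp
#include <optional>
#include <string_view>

enum class NameKind { Name, Control, Correction, Alternate, Abbreviation };

struct NameEntry {
    std::string_view text;
    char32_t code_point;
    NameKind kind;
};

// Hand-picked sample entries; not a complete table.
constexpr NameEntry kSampleNames[] = {
    {"LATIN SMALL LETTER E WITH ACUTE", 0x00E9, NameKind::Name},
    {"LATIN CAPITAL LETTER OI",         0x01A2, NameKind::Name},
    {"LATIN CAPITAL LETTER GHA",        0x01A2, NameKind::Correction},
    {"LINE FEED",                       0x000A, NameKind::Control},
    {"LF",                              0x000A, NameKind::Abbreviation},
    {"ZERO WIDTH NO-BREAK SPACE",       0xFEFF, NameKind::Name},
    {"BYTE ORDER MARK",                 0xFEFF, NameKind::Alternate},
};

// Only names and the three alias types listed in paragraph 3 qualify.
constexpr bool usable_in_named_ucn(NameKind kind) {
    return kind == NameKind::Name || kind == NameKind::Control ||
           kind == NameKind::Correction || kind == NameKind::Alternate;
}

// Returns the code point for an n-char-sequence that exactly equals a
// qualifying name or alias in the sample table, or nullopt otherwise.
constexpr std::optional<char32_t> lookup_named(std::string_view n_chars) {
    for (const NameEntry& entry : kSampleNames) {
        if (usable_in_named_ucn(entry.kind) && entry.text == n_chars) {
            return entry.code_point;
        }
    }
    return std::nullopt;
}
```

With this table, `lookup_named("LATIN CAPITAL LETTER GHA")` finds U+01A2 through its correction alias, while `"LF"` (an abbreviation alias), a lowercase spelling, or a name with a leading space finds nothing, which corresponds to an ill-formed program.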