2025-10-25 03:02:53 +03:00
commit 043225d523
3416 changed files with 681196 additions and 0 deletions

cppdraft/character/seq.md Normal file

@@ -0,0 +1,150 @@
[character.seq]
# 16 Library introduction [[library]](./#library)
## 16.3 Method of description [[description]](description#character.seq)
### 16.3.3 Other conventions [[conventions]](conventions#character.seq)
#### 16.3.3.3 Type descriptions [[type.descriptions]](type.descriptions#character.seq)
#### 16.3.3.3.4 Character sequences [character.seq]
#### [16.3.3.3.4.1](#general) General [[character.seq.general]](character.seq.general)
[1](#general-1)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/lib-intro.tex#L720)
The C standard library makes widespread use of characters and character sequences that follow a few uniform conventions:
- [(1.1)](#general-1.1)
Properties specified as [*locale-specific*](#def:locale-specific "16.3.3.3.4.1General[character.seq.general]") may change during program execution
by a call to setlocale(int, const char*) ([[clocale.syn]](clocale.syn "28.3.5.1Header <clocale> synopsis")), or
by a change to a locale object,
as described in [[locales]](locales "28.3.3Locales") and [[input.output]](input.output "31Input/output library")[.](#general-1.1.sentence-1)
- [(1.2)](#general-1.2)
The [*execution character set*](#def:character_set,execution "16.3.3.3.4.1General[character.seq.general]") and
the [*execution wide-character set*](#def:wide-character_set,execution "16.3.3.3.4.1General[character.seq.general]") are supersets of the basic literal character set ([[lex.charset]](lex.charset "5.3.1Character sets"))[.](#general-1.2.sentence-1)
The encodings of the execution character sets and
the sets of additional elements (if any) are locale-specific[.](#general-1.2.sentence-2)
Each element of the execution wide-character set is encoded as
a single code unit representable by a value of type wchar_t[.](#general-1.2.sentence-3)
[*Note [1](#general-note-1)*:
The encodings of the execution character sets can be unrelated
to any literal encoding[.](#general-1.2.sentence-4)
— *end note*]
- [(1.3)](#general-1.3)
A [*letter*](#def:letter "16.3.3.3.4.1General[character.seq.general]") is any of the 26 lowercase or 26 uppercase letters in the basic character set[.](#general-1.3.sentence-1)
- [(1.4)](#general-1.4)
The [*decimal-point character*](#def:character,decimal-point "16.3.3.3.4.1General[character.seq.general]") is the locale-specific
(single-byte) character used by functions that convert between a (single-byte)
character sequence and a value of one of the floating-point types[.](#general-1.4.sentence-1)
It is used
in the character sequence to denote the beginning of a fractional part[.](#general-1.4.sentence-2)
It is
represented in [[support]](support "17Language support library") through [[exec]](exec "33Execution control library") and [[depr]](depr "Annex D(normative)Compatibility features") by a period, '.',
which is
also its value in the "C" locale[.](#general-1.4.sentence-3)
- [(1.5)](#general-1.5)
A [*character sequence*](#def:character_sequence "16.3.3.3.4.1General[character.seq.general]") is an [array object](dcl.array "9.3.4.5Arrays[dcl.array]") *A* that
can be declared as *T A*[*N*],
where *T* is any of the types char, unsigned char,
or signed char ([[basic.fundamental]](basic.fundamental "6.9.2Fundamental types")), optionally qualified by any combination of const or volatile[.](#general-1.5.sentence-1)
The initial elements of the
array have defined contents up to and including an element determined by some
predicate[.](#general-1.5.sentence-2)
A character sequence can be designated by a pointer value *S* that points to its first element[.](#general-1.5.sentence-3)
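
As a rough illustration of the character sequence convention, the sketch below scans an array through a pointer to its first element and reports how many initial elements are covered by a predicate. The names `defined_extent` and `example_extent` are hypothetical helpers invented for this example and are not part of the standard library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a standard function). S points to the first
// element of a character sequence whose array bound is N. Returns the number
// of initial elements up to and including the first one for which pred is
// true, or 0 if no element among the first N satisfies pred.
template <class Pred>
std::size_t defined_extent(const char* S, std::size_t N, Pred pred) {
    assert(S != nullptr);
    for (std::size_t i = 0; i < N; ++i) {
        if (pred(S[i])) return i + 1;
    }
    return 0;
}

// Illustration only: an array object declared as T A[N] with T = const char,
// designated by the pointer S. The predicate chosen here is "is a semicolon".
std::size_t example_extent() {
    const char A[8] = {'k', 'e', 'y', ';'};
    const char* S = A;
    return defined_extent(S, sizeof A, [](char c) { return c == ';'; });
}
```

With the predicate "has the value zero", the same helper would describe the terminating element of a null-terminated byte string.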
#### [16.3.3.3.4.2](#byte.strings) Byte strings [[byte.strings]](byte.strings)
[1](#byte.strings-1)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/lib-intro.tex#L788)
A [*null-terminated byte string*](#def:ntbs "16.3.3.3.4.2Byte strings[byte.strings]"),
or ntbs,
is a character sequence whose highest-addressed element
with defined content has the value zero
(the [*terminating null character*](#def:character,terminating_null "16.3.3.3.4.2Byte strings[byte.strings]"));
no other element in the sequence has the value zero[.](#byte.strings-1.sentence-1)[139](#footnote-139 "Many of the objects manipulated by function signatures declared in <cstring> are character sequences or ntbss. The size of some of these character sequences is limited by a length value, maintained separately from the character sequence.")
[2](#byte.strings-2)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/lib-intro.tex#L803)
The [*length of an ntbs*](#def:ntbs,length "16.3.3.3.4.2Byte strings[byte.strings]") is the number of elements that
precede the terminating null character[.](#byte.strings-2.sentence-1)
An [*empty ntbs*](#def:ntbs,empty "16.3.3.3.4.2Byte strings[byte.strings]") has a length of zero[.](#byte.strings-2.sentence-2)
[3](#byte.strings-3)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/lib-intro.tex#L810)
The [*value of an ntbs*](#def:ntbs,value "16.3.3.3.4.2Byte strings[byte.strings]") is the sequence of values of the
elements up to and including the terminating null character[.](#byte.strings-3.sentence-1)
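
The difference between the length and the value of an ntbs can be sketched as follows. The function names `ntbs_length`, `ntbs_value`, and `is_empty_ntbs` are assumptions made for this example; only `std::strlen` comes from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Length of an ntbs: the number of elements that precede the terminating
// null character. std::strlen computes exactly this.
std::size_t ntbs_length(const char* s) {
    assert(s != nullptr);
    return std::strlen(s);
}

// Value of an ntbs: the element values up to and including the terminating
// null character, so the result holds one more element than the length.
std::vector<char> ntbs_value(const char* s) {
    assert(s != nullptr);
    return std::vector<char>(s, s + std::strlen(s) + 1);
}

// An empty ntbs has a length of zero: its first element is the terminator.
bool is_empty_ntbs(const char* s) {
    assert(s != nullptr);
    return *s == '\0';
}
```

For the ntbs designated by `"abc"`, `ntbs_length` gives 3 while `ntbs_value` yields four elements, the last of which is zero.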
[4](#byte.strings-4)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/lib-intro.tex#L815)
A [*static ntbs*](#def:ntbs,static "16.3.3.3.4.2Byte strings[byte.strings]") is an ntbs with
static storage duration[.](#byte.strings-4.sentence-1)[140](#footnote-140 "A string-literal, such as &quot;abc&quot;, is a static ntbs.")
[139)](#footnote-139)[139)](#footnoteref-139)
Many of the objects manipulated by
function signatures declared in [<cstring>](cstring.syn#header:%3ccstring%3e "27.5.1Header <cstring> synopsis[cstring.syn]") are character sequences or ntbss[.](#footnote-139.sentence-1)
The size of some of these character sequences is limited by
a length value, maintained separately from the character sequence[.](#footnote-139.sentence-2)
[140)](#footnote-140)[140)](#footnoteref-140)
A [*string-literal*](lex.string#nt:string-literal "5.13.5String literals[lex.string]"), such as "abc",
is a static ntbs[.](#footnote-140.sentence-1)
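
One practical consequence of a string-literal being a static ntbs is that a pointer to it stays valid after the function that produced it returns. The sketch below illustrates this; the `Color` enumeration and the functions `color_name` and `same_value` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical enumeration used only for illustration.
enum class Color { red, green };

// Each return statement yields a pointer to a string-literal. Because a
// string-literal is a static ntbs, the pointer remains usable by the caller.
const char* color_name(Color c) {
    switch (c) {
        case Color::red:   return "red";
        case Color::green: return "green";
    }
    return "unknown";
}

// True if two ntbss have the same value. Both pointers must be non-null.
bool same_value(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);
    return std::strcmp(a, b) == 0;
}
```

Returning a pointer to a local `char` array instead would not be safe, since that array has automatic storage duration and is not a static ntbs.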
#### [16.3.3.3.4.3](#multibyte.strings) Multibyte strings [[multibyte.strings]](multibyte.strings)
[1](#multibyte.strings-1)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/lib-intro.tex#L827)
A [*multibyte character*](#def:character,multibyte "16.3.3.3.4.3Multibyte strings[multibyte.strings]") is
a sequence of one or more bytes representing the
code unit sequence for an encoded character of the
execution character set[.](#multibyte.strings-1.sentence-1)
[2](#multibyte.strings-2)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/lib-intro.tex#L834)
A [*null-terminated multibyte string*](#def:ntmbs "16.3.3.3.4.3Multibyte strings[multibyte.strings]"),
or ntmbs,
is an ntbs that constitutes a
sequence of valid multibyte characters, beginning and ending in the initial
shift state[.](#multibyte.strings-2.sentence-1)[141](#footnote-141 "An ntbs that contains characters only from the basic literal character set is also an ntmbs. Each multibyte character then consists of a single byte.")
[3](#multibyte.strings-3)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/lib-intro.tex#L847)
A [*static ntmbs*](#def:ntmbs,static "16.3.3.3.4.3Multibyte strings[multibyte.strings]") is an ntmbs with static storage duration[.](#multibyte.strings-3.sentence-1)
[141)](#footnote-141)[141)](#footnoteref-141)
An ntbs that contains characters only from the
basic literal character set is also an ntmbs[.](#footnote-141.sentence-1)
Each multibyte character then
consists of a single byte[.](#footnote-141.sentence-2)
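
The footnote's observation can be checked with a small sketch that walks an ntbs one multibyte character at a time, starting from the initial shift state. The helper name `count_multibyte_chars` is an assumption made for this example; `std::mbrlen` and `std::mbsinit` are the standard functions from `<cwchar>`. The result depends on the current `LC_CTYPE` locale, which is the "C" locale at program startup.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical helper: counts the multibyte characters in the ntbs s under
// the current LC_CTYPE locale. Returns (size_t)-1 if s is not a valid ntmbs,
// that is, if it contains an invalid or incomplete multibyte character or
// does not end in the initial shift state.
std::size_t count_multibyte_chars(const char* s) {
    assert(s != nullptr);
    const std::size_t failure = static_cast<std::size_t>(-1);
    std::mbstate_t state = std::mbstate_t();  // begins in the initial shift state
    std::size_t remaining = std::strlen(s);
    std::size_t count = 0;
    while (remaining > 0) {
        const std::size_t len = std::mbrlen(s, remaining, &state);
        if (len == failure || len == static_cast<std::size_t>(-2)) {
            return failure;  // encoding error or incomplete character
        }
        if (len == 0) break;  // null character; not expected before the terminator
        s += len;
        remaining -= len;
        ++count;
    }
    return std::mbsinit(&state) ? count : failure;
}
```

For an ntbs made up only of basic literal character set members, such as `"abc"`, the count equals the ntbs length because each multibyte character occupies a single byte.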