2025-10-25 03:02:53 +03:00

[text.encoding.members]

28 Text processing library [text]

28.4 Text encodings identification [text.encoding]

28.4.2 Class text_encoding [text.encoding.class]

28.4.2.3 Members [text.encoding.members]

constexpr explicit text_encoding(string_view enc) noexcept;

1

Preconditions:

  • (1.1)

    enc represents a string in the ordinary literal encoding consisting only of elements of the basic character set ([lex.charset]).

  • (1.2)

    enc.size() <= max_name_length is true.

  • (1.3)

    enc.contains('\0') is false.

2

Postconditions:

  • (2.1)

    If there exists a primary name or alias a of a known registered character encoding such that comp-name(a, enc) is true, mib_ has the value of the enumerator of id associated with that registered character encoding. Otherwise, mib_ == id::other is true.

  • (2.2)

    enc.compare(name_) == 0 is true.
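
The mechanically checkable preconditions above can be tested before constructing a text_encoding from untrusted input. The following is a minimal sketch, not part of the specification: the namespace `sketch` and the helper `satisfies_name_preconditions` are hypothetical names, and the value 63 is assumed to mirror `text_encoding::max_name_length` from the class synopsis, which is not reproduced in this section. Precondition (1.1), membership in the basic character set, is not checked here.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

namespace sketch {

// Assumption: mirrors text_encoding::max_name_length from the class synopsis.
inline constexpr std::size_t max_name_length = 63;

// Hypothetical helper: checks preconditions (1.2) and (1.3) only.
// Precondition (1.1) (basic character set) is deliberately left unchecked.
constexpr bool satisfies_name_preconditions(std::string_view enc) {
    return enc.size() <= max_name_length               // (1.2)
        && enc.find('\0') == std::string_view::npos;   // (1.3)
}

static_assert(satisfies_name_preconditions("UTF-8"));
static_assert(!satisfies_name_preconditions(std::string_view("a\0b", 3)));

} // namespace sketch
```

A caller could run this check first and fall back to some other handling when it fails, since violating a precondition of the constructor is undefined behavior.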

constexpr text_encoding(id i) noexcept;

3

Preconditions: i has the value of one of the enumerators of id.

4

Postconditions:

  • (4.1)

    mib_ == i is true.

  • (4.2)

    If (mib_ == id::unknown || mib_ == id::other) is true, strlen(name_) == 0 is true. Otherwise, ranges::contains(aliases(), string_view(name_)) is true.

constexpr id mib() const noexcept;

5

Returns: mib_.

constexpr const char* name() const noexcept;

6

Returns: name_.

7

Remarks: name() is an ntbs and accessing elements of name_ outside of the range name()+[0, strlen(name()) + 1) is undefined behavior.

constexpr aliases_view aliases() const noexcept;

Let r denote an instance of aliases_view.

If *this represents a known registered character encoding, then:

  • r.front() is the primary name of the registered character encoding,
  • r contains the aliases of the registered character encoding, and
  • r does not contain duplicate values when compared with strcmp.

Otherwise, r is an empty range.

8

Each element in r is a non-null, non-empty ntbs encoded in the literal character encoding and comprising only characters from the basic character set.

9

Returns: r.

10

[Note 1:

The order of aliases in r is unspecified.

— end note]

static consteval text_encoding literal() noexcept;

11

Mandates: CHAR_BIT == 8 is true.

12

Returns: A text_encoding object representing the ordinary character literal encoding ([lex.charset]).

static text_encoding environment();

13

Mandates: CHAR_BIT == 8 is true.

14

Returns: A text_encoding object representing the implementation-defined character encoding scheme of the environment.

On a POSIX implementation, this is the encoding scheme associated with the POSIX locale denoted by the empty string "".

15

[Note 2:

This function is not affected by calls to setlocale.

— end note]

16

Recommended practice: Implementations should return a value that is not affected by calls to the POSIX function setenv and other functions which can modify the environment ([support.runtime]).

template<id i> static bool environment_is();

17

Mandates: CHAR_BIT == 8 is true.

18

Returns: environment() == i.
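
As an illustration of how environment() and environment_is might be used, the sketch below wraps both in small helpers. The namespace `sketch` and the helper names are hypothetical. The code assumes the library advertises support through the `__cpp_lib_text_encoding` feature-test macro; when that macro is absent, the helpers fall back to an empty name and `false` so the example still compiles.

```cpp
#include <cassert>
#include <string>
#if __has_include(<version>)
#  include <version>
#endif
#ifdef __cpp_lib_text_encoding
#  include <text_encoding>
#endif

namespace sketch {

// Hypothetical helper: name reported for the environment encoding,
// or an empty string when <text_encoding> is unavailable.
inline std::string environment_encoding_name() {
#ifdef __cpp_lib_text_encoding
    return std::text_encoding::environment().name();
#else
    return {};
#endif
}

// Hypothetical helper: true if the environment encoding is UTF-8,
// false otherwise or when <text_encoding> is unavailable.
inline bool environment_is_utf8() {
#ifdef __cpp_lib_text_encoding
    return std::text_encoding::environment_is<std::text_encoding::id::UTF8>();
#else
    return false;
#endif
}

} // namespace sketch
```

Because the result is not affected by calls to setlocale, a program can query it once at startup and cache the answer.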

static constexpr bool comp-name(string_view a, string_view b);

19

Returns: true if the two strings a and b encoded in the ordinary literal encoding are equal, ignoring, from left-to-right,

  • all elements that are not digits or letters ([character.seq.general]),
  • character case, and
  • any sequence of one or more 0 characters not immediately preceded by a numeric prefix, where a numeric prefix is a sequence consisting of a digit in the range [1, 9] optionally followed by one or more elements which are not digits or letters,

and false otherwise.

[Note 3:

This comparison is identical to the “Charset Alias Matching” algorithm described in the Unicode Technical Standard 22 [bib].

— end note]

[Example 1:

static_assert(comp-name("UTF-8", "utf8") == true);
static_assert(comp-name("u.t.f-008", "utf8") == true);
static_assert(comp-name("ut8", "utf8") == false);
static_assert(comp-name("utf-80", "utf8") == false);

— end example]
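
One possible way to realize the comparison described for comp-name is sketched below. It is not the wording of any particular implementation. The names `sketch`, `cursor`, and `comp_name` are hypothetical (comp-name itself is exposition-only), and the character classification assumes an ASCII-compatible ordinary literal encoding in which letters are contiguous. The sketch follows the UTS 22 reading of the rule: a 0 is dropped unless the previous significant character was a digit.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

namespace sketch {

// Assumption: ASCII-compatible encoding with contiguous letters and digits.
constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }
constexpr bool is_upper(char c) { return c >= 'A' && c <= 'Z'; }
constexpr bool is_lower(char c) { return c >= 'a' && c <= 'z'; }

// Walks a name and yields only its significant characters.
struct cursor {
    std::string_view s;
    std::size_t pos = 0;
    bool prev_digit = false;  // was the last significant character a digit?

    // Returns the next significant character, lower-cased, or '\0' at the end.
    constexpr char next() {
        while (pos < s.size()) {
            char c = s[pos++];
            if (is_upper(c)) c = static_cast<char>(c - 'A' + 'a');
            if (is_lower(c)) { prev_digit = false; return c; }
            if (is_digit(c)) {
                if (c == '0' && !prev_digit) continue;  // ignorable zero
                prev_digit = true;
                return c;
            }
            // Any other element is ignored and leaves prev_digit unchanged.
        }
        return '\0';
    }
};

// Hypothetical stand-in for the exposition-only comp-name.
constexpr bool comp_name(std::string_view a, std::string_view b) {
    cursor ca{a}, cb{b};
    while (true) {
        const char x = ca.next();
        const char y = cb.next();
        if (x != y) return false;     // mismatch, or one name ended early
        if (x == '\0') return true;   // both names ended together
    }
}

// The four cases from Example 1.
static_assert(comp_name("UTF-8", "utf8"));
static_assert(comp_name("u.t.f-008", "utf8"));
static_assert(!comp_name("ut8", "utf8"));
static_assert(!comp_name("utf-80", "utf8"));

} // namespace sketch
```

Streaming both names through a cursor avoids allocating normalized copies, which keeps the function usable in constant expressions under C++17.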