2025-10-25 03:02:53 +03:00
commit 043225d523
3416 changed files with 681196 additions and 0 deletions

cppdraft/re/tokiter.md Normal file
[re.tokiter]
# 28 Text processing library [[text]](./#text)
## 28.6 Regular expressions library [[re]](re#tokiter)
### 28.6.11 Regular expression iterators [[re.iter]](re.iter#re.tokiter)
#### 28.6.11.2 Class template regex_token_iterator [re.tokiter]
#### [28.6.11.2.1](#general) General [[re.tokiter.general]](re.tokiter.general)
[1](#general-1)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12466)
The class template regex_token_iterator is an iterator adaptor; that
is to say it represents a new view of an existing iterator sequence,
by enumerating all the occurrences of a regular expression within that
sequence, and presenting one or more sub-expressions for each match
found[.](#general-1.sentence-1)
Each position enumerated by the iterator is a sub_match class
template instance that represents what matched a particular sub-expression
within the regular expression[.](#general-1.sentence-2)
[2](#general-2)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12476)
When class regex_token_iterator is used to enumerate a
single sub-expression with index −1 the iterator performs field
splitting: that is to say it enumerates one sub-expression for each section of
the character container sequence that does not match the regular
expression specified[.](#general-2.sentence-1)
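The field-splitting behavior described above can be sketched as follows. This is a minimal illustration, not a definitive implementation; the helper name `split_fields` is hypothetical, and the example assumes `std::string` input (so the iterator type is `std::sregex_token_iterator`).

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects the sections of `text` that do NOT match `delim`.
std::vector<std::string> split_fields(const std::string& text, const std::regex& delim) {
    std::vector<std::string> fields;
    // Sub-expression index -1 selects the text between matches (field splitting).
    std::sregex_token_iterator it(text.begin(), text.end(), delim, -1);
    const std::sregex_token_iterator end;  // default-constructed: end-of-sequence
    for (; it != end; ++it) {
        fields.push_back(it->str());
    }
    return fields;
}
```

With a delimiter regex of `,`, calling `split_fields("a,b,c", delim)` would be expected to yield the three fields `a`, `b`, and `c`. The regex is taken by reference to an lvalue because the constructors taking a temporary regex are deleted.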
[3](#general-3)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12483)
After it is constructed, the iterator finds and stores a value regex_iterator<BidirectionalIterator> position and sets the internal count N to zero[.](#general-3.sentence-1)
It also maintains a sequence subs which contains a list of the sub-expressions which will be
enumerated[.](#general-3.sentence-2)
Every time operator++ is used
the count N is incremented; if N exceeds or equals subs.size(),
then the iterator increments member position and sets count N to zero[.](#general-3.sentence-3)
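The interplay between the count N and the sequence subs can be observed when more than one sub-expression is requested. The sketch below assumes a hypothetical helper `key_value_tokens` and a hypothetical `key=value` input format; with subs holding {1, 2}, each match contributes capture group 1 and then capture group 2 before the underlying position advances.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: flattens "key=value" pairs into {key, value, key, value, ...}.
std::vector<std::string> key_value_tokens(const std::string& text) {
    static const std::regex pair_re(R"((\w+)=(\w+))");
    // Named vector so that the const vector<int>& constructor is selected.
    const std::vector<int> subs{1, 2};
    std::vector<std::string> out;
    std::sregex_token_iterator it(text.begin(), text.end(), pair_re, subs);
    const std::sregex_token_iterator end;
    for (; it != end; ++it) {
        // For each match: first N == 0 (group 1), then N == 1 (group 2).
        out.push_back(it->str());
    }
    return out;
}
```

For the input `a=1;b=22` this would be expected to produce `a`, `1`, `b`, `22`, in that order.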
[4](#general-4)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12494)
If the end of sequence is reached (position is equal to the end of
sequence iterator), the iterator becomes equal to the end-of-sequence
iterator value, unless the sub-expression being enumerated has index −1,
in which case the iterator enumerates one last sub-expression that contains
all the characters from the end of the last regular expression match to the
end of the input sequence being enumerated, provided that this would not be an
empty sub-expression[.](#general-4.sentence-1)
[5](#general-5)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12503)
The default constructor constructs
an end-of-sequence iterator object, which is the only legitimate
iterator to be used for the end condition[.](#general-5.sentence-1)
The result of operator* on
an end-of-sequence iterator is not defined[.](#general-5.sentence-2)
For any other iterator value a const sub_match<BidirectionalIterator>& is returned[.](#general-5.sentence-3)
The result of operator-> on an end-of-sequence iterator
is not defined[.](#general-5.sentence-4)
For any other iterator value a const sub_match<BidirectionalIterator>* is returned[.](#general-5.sentence-5)
[6](#general-6)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12514)
It is impossible to store things
into regex_token_iterators[.](#general-6.sentence-1)
Two end-of-sequence iterators are always
equal[.](#general-6.sentence-2)
An end-of-sequence iterator is not equal to a
non-end-of-sequence iterator[.](#general-6.sentence-3)
Two non-end-of-sequence iterators are
equal when they are constructed from the same arguments[.](#general-6.sentence-4)
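The role of the default-constructed iterator as the only valid end marker can be illustrated with a short sketch. The helper names `at_end` and `count_tokens` are hypothetical, and `std::string` input is assumed.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: true when `it` compares equal to the end-of-sequence iterator.
bool at_end(const std::sregex_token_iterator& it) {
    return it == std::sregex_token_iterator();
}

// Hypothetical helper: counts the full matches (sub-expression 0) of `re` in `text`.
std::ptrdiff_t count_tokens(const std::string& text, const std::regex& re) {
    std::sregex_token_iterator it(text.begin(), text.end(), re);
    std::ptrdiff_t n = 0;
    while (!at_end(it)) {  // never dereference once at_end(it) holds
        ++n;
        ++it;
    }
    return n;
}
```

Comparing against a freshly default-constructed iterator is always valid because two end-of-sequence iterators are always equal.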
`namespace std {
  template<class BidirectionalIterator,
           class charT = typename iterator_traits<BidirectionalIterator>::value_type,
           class traits = regex_traits<charT>>
    class regex_token_iterator {
    public:
      using regex_type = basic_regex<charT, traits>;
      using iterator_category = forward_iterator_tag;
      using iterator_concept = input_iterator_tag;
      using value_type = sub_match<BidirectionalIterator>;
      using difference_type = ptrdiff_t;
      using pointer = const value_type*;
      using reference = const value_type&;
      regex_token_iterator();
      regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                           const regex_type& re,
                           int submatch = 0,
                           regex_constants::match_flag_type m = regex_constants::match_default);
      regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                           const regex_type& re,
                           const vector<int>& submatches,
                           regex_constants::match_flag_type m = regex_constants::match_default);
      regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                           const regex_type& re,
                           initializer_list<int> submatches,
                           regex_constants::match_flag_type m = regex_constants::match_default);
      template<size_t N>
        regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                             const regex_type& re,
                             const int (&submatches)[N],
                             regex_constants::match_flag_type m = regex_constants::match_default);
      regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                           const regex_type&& re,
                           int submatch = 0,
                           regex_constants::match_flag_type m = regex_constants::match_default) = delete;
      regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                           const regex_type&& re,
                           const vector<int>& submatches,
                           regex_constants::match_flag_type m = regex_constants::match_default) = delete;
      regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                           const regex_type&& re,
                           initializer_list<int> submatches,
                           regex_constants::match_flag_type m = regex_constants::match_default) = delete;
      template<size_t N>
        regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
                             const regex_type&& re,
                             const int (&submatches)[N],
                             regex_constants::match_flag_type m = regex_constants::match_default) = delete;
      regex_token_iterator(const regex_token_iterator&);
      regex_token_iterator& operator=(const regex_token_iterator&);
      bool operator==(const regex_token_iterator&) const;
      bool operator==(default_sentinel_t) const { return *this == regex_token_iterator(); }
      const value_type& operator*() const;
      const value_type* operator->() const;
      regex_token_iterator& operator++();
      regex_token_iterator operator++(int);
    private:
      using position_iterator =
        regex_iterator<BidirectionalIterator, charT, traits>;   // exposition only
      position_iterator position;                               // exposition only
      const value_type* result;                                 // exposition only
      value_type suffix;                                        // exposition only
      size_t N;                                                 // exposition only
      vector<int> subs;                                         // exposition only
    };
}
`
[7](#general-7)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12601)
A *suffix iterator* is a regex_token_iterator object
that points to a final sequence of characters at
the end of the target sequence[.](#general-7.sentence-1)
In a suffix iterator the
member result holds a pointer to the data
member suffix, the value of the member suffix.matched is true, suffix.first points to the beginning of the
final sequence, and suffix.second points to the end of the
final sequence[.](#general-7.sentence-2)
[8](#general-8)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12611)
[*Note [1](#general-note-1)*:
For a suffix iterator, data
member suffix.first is the same as the end of the last match
found, and suffix.second is the same as the end of the target
sequence[.](#general-8.sentence-1)
— *end note*]
[9](#general-9)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12619)
The *current match* is (*position).prefix() if subs[N] == -1, or (*position)[subs[N]] for any other value of subs[N][.](#general-9.sentence-1)
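The definition of the current match can be mirrored directly on top of `std::sregex_iterator`. The function below is a sketch only; the name `current_match` and its parameters are hypothetical stand-ins for the exposition-only members position and subs[N].

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical model of the "current match" for a given position and subs[N] value.
// `position` must not be an end-of-sequence iterator.
std::ssub_match current_match(const std::sregex_iterator& position, int sub) {
    assert(sub >= -1);  // same precondition as the submatch constructor arguments
    if (sub == -1) {
        return position->prefix();  // text before the match: the "field"
    }
    return (*position)[static_cast<std::size_t>(sub)];  // a capture group, or 0 for the whole match
}
```

For the text `ab12cd` and the regex `(\d+)`, the first position would give `ab` for index −1 and `12` for index 0.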
#### [28.6.11.2.2](#cnstr) Constructors [[re.tokiter.cnstr]](re.tokiter.cnstr)
[🔗](#lib:regex_token_iterator,constructor)
`regex_token_iterator();
`
[1](#cnstr-1)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12631)
*Effects*: Constructs the end-of-sequence iterator[.](#cnstr-1.sentence-1)
[🔗](#lib:regex_token_iterator,constructor_)
`regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
const regex_type& re,
int submatch = 0,
regex_constants::match_flag_type m = regex_constants::match_default);
regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
const regex_type& re,
const vector<int>& submatches,
regex_constants::match_flag_type m = regex_constants::match_default);
regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
const regex_type& re,
initializer_list<int> submatches,
regex_constants::match_flag_type m = regex_constants::match_default);
template<size_t N>
regex_token_iterator(BidirectionalIterator a, BidirectionalIterator b,
const regex_type& re,
const int (&submatches)[N],
regex_constants::match_flag_type m = regex_constants::match_default);
`
[2](#cnstr-2)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12661)
*Preconditions*: Each of the initialization values of submatches is >= -1[.](#cnstr-2.sentence-1)
[3](#cnstr-3)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12665)
*Effects*: The first constructor initializes the member subs to hold the single
value submatch[.](#cnstr-3.sentence-1)
The second, third, and fourth constructors
initialize the member subs to hold a copy of the sequence of integer values
pointed to by the iterator range
[begin(submatches), end(submatches))[.](#cnstr-3.sentence-2)
[4](#cnstr-4)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12674)
Each constructor then sets N to 0, and position to position_iterator(a, b, re, m)[.](#cnstr-4.sentence-1)
If position is not an
end-of-sequence iterator the constructor sets result to the
address of the current match[.](#cnstr-4.sentence-2)
Otherwise if any of the values stored
in subs is equal to −1 the constructor sets *this to a suffix
iterator that points to the range [a, b), otherwise the constructor
sets *this to an end-of-sequence iterator[.](#cnstr-4.sentence-3)
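The no-match case of the constructor can be checked with a small sketch. The helper name `tokens` is hypothetical; it forwards a single submatch index and collects whatever the iterator enumerates.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: enumerates sub-expression `submatch` for every match of `re`.
std::vector<std::string> tokens(const std::string& text, const std::regex& re, int submatch) {
    assert(submatch >= -1);  // precondition stated for the constructors
    std::vector<std::string> out;
    std::sregex_token_iterator it(text.begin(), text.end(), re, submatch);
    const std::sregex_token_iterator end;
    for (; it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

When the regex never matches a non-empty input, index −1 yields the whole input as a single token (the suffix iterator over [a, b)), whereas index 0 yields nothing because the iterator is constructed as an end-of-sequence iterator.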
#### [28.6.11.2.3](#comp) Comparisons [[re.tokiter.comp]](re.tokiter.comp)
[🔗](#lib:regex_token_iterator,operator==_)
`bool operator==(const regex_token_iterator& right) const;
`
[1](#comp-1)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12692)
*Returns*: true if *this and right are both end-of-sequence iterators,
or if *this and right are both suffix iterators and suffix == right.suffix;
otherwise returns false if *this or right is an end-of-sequence
iterator or a suffix iterator[.](#comp-1.sentence-1)
Otherwise returns true if position == right.position, N == right.N, and subs == right.subs[.](#comp-1.sentence-2)
Otherwise returns false[.](#comp-1.sentence-3)
#### [28.6.11.2.4](#deref) Indirection [[re.tokiter.deref]](re.tokiter.deref)
[🔗](#lib:regex_token_iterator,operator*)
`const value_type& operator*() const;
`
[1](#deref-1)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12709)
*Returns*: *result[.](#deref-1.sentence-1)
[🔗](#lib:operator-%3e,regex_token_iterator)
`const value_type* operator->() const;
`
[2](#deref-2)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12720)
*Returns*: result[.](#deref-2.sentence-1)
#### [28.6.11.2.5](#incr) Increment [[re.tokiter.incr]](re.tokiter.incr)
[🔗](#lib:regex_token_iterator,operator++)
`regex_token_iterator& operator++();
`
[1](#incr-1)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12734)
*Effects*: Constructs a local variable prev of
type position_iterator, initialized with the value
of position[.](#incr-1.sentence-1)
[2](#incr-2)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12740)
If *this is a suffix iterator, sets *this to an
end-of-sequence iterator[.](#incr-2.sentence-1)
[3](#incr-3)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12744)
Otherwise, if N + 1 < subs.size(), increments N and
sets result to the address of the current match[.](#incr-3.sentence-1)
[4](#incr-4)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12748)
Otherwise, sets N to 0 and
increments position[.](#incr-4.sentence-1)
If position is not an
end-of-sequence iterator the operator sets result to the
address of the current match[.](#incr-4.sentence-2)
[5](#incr-5)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12754)
Otherwise, if any of the values stored in subs is equal to −1 and prev->suffix().length() is not 0 the operator sets *this to a
suffix iterator that points to the range [prev->suffix().first, prev->suffix().second)[.](#incr-5.sentence-1)
[6](#incr-6)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12759)
Otherwise, sets *this to an end-of-sequence iterator[.](#incr-6.sentence-1)
[7](#incr-7)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12762)
*Returns*: *this[.](#incr-7.sentence-1)
[🔗](#lib:regex_token_iterator,operator++_)
`regex_token_iterator operator++(int);
`
[8](#incr-8)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12773)
*Effects*: Constructs a copy tmp of *this, then calls ++(*this)[.](#incr-8.sentence-1)
[9](#incr-9)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/text.tex#L12777)
*Returns*: tmp[.](#incr-9.sentence-1)
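The increment rules, including the condition that a suffix is only produced when it is non-empty, can be exercised with the postfix form. This is a sketch under the assumption of `std::string` input; the helper name `split_with_postfix` is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: field splitting written with the postfix increment.
std::vector<std::string> split_with_postfix(const std::string& text, const std::regex& delim) {
    std::vector<std::string> out;
    std::sregex_token_iterator it(text.begin(), text.end(), delim, -1);
    const std::sregex_token_iterator end;
    while (it != end) {
        // it++ returns a copy of the old iterator; the copy is dereferenced
        // and converted to a string before the temporary is destroyed.
        out.push_back((*it++).str());
    }
    return out;
}
```

With `,` as the delimiter, the input `a,b,` would be expected to give only `a` and `b`, since the text after the last match is empty and no suffix iterator is formed. The input `,a` would be expected to give an empty first field followed by `a`, because the empty prefix of the first match is still enumerated.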