[algorithms.parallel.exec]
# 26 Algorithms library [[algorithms]](./#algorithms)
## 26.3 Parallel algorithms [[algorithms.parallel]](algorithms.parallel#exec)
### 26.3.3 Effect of execution policies on algorithm execution [algorithms.parallel.exec]
[1](#1)
An execution policy template parameter describes the manner in which the execution of a parallel algorithm may be parallelized and the manner in which it applies the element access functions.
[2](#2)
If an object is modified by an element access function, the algorithm will perform no other unsynchronized accesses to that object. The modifying element access functions are those which are specified as modifying the object.
[*Note [1](#note-1)*:
For example, swap, ++, --, @=, and assignments modify the object. For the assignment and @= operators, only the left argument is modified.
— *end note*]
[3](#3)
Unless otherwise stated, implementations may make arbitrary copies of elements (with type T) from sequences where is_trivially_copy_constructible_v&lt;T&gt; and is_trivially_destructible_v&lt;T&gt; are true.
[*Note [2](#note-2)*:
This implies that user-supplied function objects cannot rely on object identity of arguments for such input sequences. If object identity of the arguments to these function objects is important, a wrapping iterator that returns a non-copied implementation object such as reference_wrapper&lt;T&gt; ([[refwrap]](refwrap "22.10.6 Class template reference_wrapper")), or some equivalent solution, can be used.
— *end note*]
[4](#4)
The invocations of element access functions in parallel algorithms invoked with an execution policy object of type execution::sequenced_policy all occur in the calling thread of execution.
[*Note [3](#note-3)*:
The invocations are not interleaved; see [[intro.execution]](intro.execution "6.10.1 Sequential execution").
— *end note*]
[5](#5)
The invocations of element access functions in parallel algorithms invoked with an execution policy object of type execution::unsequenced_policy are permitted to execute in an unordered fashion in the calling thread of execution, unsequenced with respect to one another in the calling thread of execution.
[*Note [4](#note-4)*:
This means that multiple function object invocations can be interleaved on a single thread of execution, which overrides the usual guarantee from [[intro.execution]](intro.execution "6.10.1 Sequential execution") that function executions do not overlap with one another.
— *end note*]
The behavior of a program is undefined if it invokes a vectorization-unsafe standard library function from user code called from an execution::unsequenced_policy algorithm.
[*Note [5](#note-5)*:
Because execution::unsequenced_policy allows the execution of element access functions to be interleaved on a single thread of execution, blocking synchronization, including the use of mutexes, risks deadlock.
— *end note*]
[6](#6)
The invocations of element access functions in parallel algorithms invoked with an execution policy object of type execution::parallel_policy are permitted to execute either in the invoking thread of execution or in a thread of execution implicitly created by the library to support parallel algorithm execution. If the threads of execution created by thread ([[thread.thread.class]](thread.thread.class "32.4.3 Class thread")) or jthread ([[thread.jthread.class]](thread.jthread.class "32.4.4 Class jthread")) provide concurrent forward progress guarantees ([[intro.progress]](intro.progress "6.10.2.3 Forward progress")), then a thread of execution implicitly created by the library will provide parallel forward progress guarantees; otherwise, the provided forward progress guarantee is implementation-defined. Any such invocations executing in the same thread of execution are indeterminately sequenced with respect to each other.
[*Note [6](#note-6)*:
It is the caller's responsibility to ensure that the invocation does not introduce data races or deadlocks.
— *end note*]
[*Example [1](#example-1)*:
```cpp
int a[] = {0,1};
std::vector<int> v;
std::for_each(std::execution::par, std::begin(a), std::end(a),
  [&](int i) {
    v.push_back(i*2+1);       // incorrect: data race
  });
```
The program above has a data race because of the unsynchronized access to the container v.
— *end example*]
[*Example [2](#example-2)*:
```cpp
std::atomic<int> x{0};
int a[] = {1,2};
std::for_each(std::execution::par, std::begin(a), std::end(a),
  [&](int) {
    x.fetch_add(1, std::memory_order::relaxed);
    // spin wait for another iteration to change the value of x
    while (x.load(std::memory_order::relaxed) == 1) { }   // incorrect: assumes execution order
  });
```
The above example depends on the order of execution of the iterations, and will not terminate if both iterations are executed sequentially on the same thread of execution.
— *end example*]
[*Example [3](#example-3)*:
```cpp
int x = 0;
std::mutex m;
int a[] = {1,2};
std::for_each(std::execution::par, std::begin(a), std::end(a),
  [&](int) {
    std::lock_guard<std::mutex> guard(m);
    ++x;
  });
```
The above example synchronizes access to object x ensuring that it is incremented correctly.
— *end example*]
[7](#7)
The invocations of element access functions in parallel algorithms invoked with an execution policy object of type execution::parallel_unsequenced_policy are permitted to execute in an unordered fashion in unspecified threads of execution, and unsequenced with respect to one another within each thread of execution. These threads of execution are either the invoking thread of execution or threads of execution implicitly created by the library; the latter will provide weakly parallel forward progress guarantees.
[*Note [7](#note-7)*:
This means that multiple function object invocations can be interleaved on a single thread of execution, which overrides the usual guarantee from [[intro.execution]](intro.execution "6.10.1 Sequential execution") that function executions do not overlap with one another.
— *end note*]
The behavior of a program is undefined if it invokes a vectorization-unsafe standard library function from user code called from an execution::parallel_unsequenced_policy algorithm.
[*Note [8](#note-8)*:
Because execution::parallel_unsequenced_policy allows the execution of element access functions to be interleaved on a single thread of execution, blocking synchronization, including the use of mutexes, risks deadlock.
— *end note*]
[8](#8)
[*Note [9](#note-9)*:
The semantics of invocation with execution::unsequenced_policy, execution::parallel_policy, or execution::parallel_unsequenced_policy allow the implementation to fall back to sequential execution if the system cannot parallelize an algorithm invocation, e.g., due to lack of resources.
— *end note*]
[9](#9)
If an invocation of a parallel algorithm uses threads of execution implicitly created by the library, then the invoking thread of execution will either
- [(9.1)](#9.1) temporarily block with forward progress guarantee delegation ([[intro.progress]](intro.progress "6.10.2.3 Forward progress")) on the completion of these library-managed threads of execution, or
- [(9.2)](#9.2) eventually execute an element access function; the thread of execution will continue to do so until the algorithm is finished.
[*Note [10](#note-10)*:
In blocking with forward progress guarantee delegation in this context, a thread of execution created by the library is considered to have finished execution as soon as it has finished the execution of the particular element access function that the invoking thread of execution logically depends on.
— *end note*]
[10](#10)
The semantics of parallel algorithms invoked with an execution policy object of implementation-defined type are implementation-defined.