# 26 Algorithms library [[algorithms]](./#algorithms)
## 26.3 Parallel algorithms [[algorithms.parallel]](algorithms.parallel#exec)
### 26.3.3 Effect of execution policies on algorithm execution [algorithms.parallel.exec]
[1](#1) An execution policy template parameter describes the manner in which the execution of a parallel algorithm may be parallelized and the manner in which it applies the element access functions[.](#1.sentence-1)

[2](#2) If an object is modified by an element access function, the algorithm will perform no other unsynchronized accesses to that object[.](#2.sentence-1) The modifying element access functions are those which are specified as modifying the object[.](#2.sentence-2)

[*Note [1](#note-1)*: For example, swap, ++, --, @=, and assignments modify the object[.](#2.sentence-3) For the assignment and @= operators, only the left argument is modified[.](#2.sentence-4) *end note*]

[3](#3) Unless otherwise stated, implementations may make arbitrary copies of elements (with type T) from sequences where is_trivially_copy_constructible_v<T> and is_trivially_destructible_v<T> are true[.](#3.sentence-1)

[*Note [2](#note-2)*: This implies that user-supplied function objects cannot rely on object identity of arguments for such input sequences[.](#3.sentence-2) If object identity of the arguments to these function objects is important, a wrapping iterator that returns a non-copied implementation object such as reference_wrapper<T> ([[refwrap]](refwrap "22.10.6 Class template reference_wrapper")), or some equivalent solution, can be used[.](#3.sentence-3) *end note*]

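As a rough illustration of the note on object identity, the sketch below stores reference_wrapper handles in a separate sequence, which is one possible "equivalent solution" rather than a wrapping iterator. The helper name count_fixed_points is hypothetical. The predicate recovers each element's index from its address, so it would be unreliable if it received a copied int directly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <execution>
#include <functional>
#include <vector>

// Hypothetical helper (not part of the standard): counts the elements whose
// value equals their own index. The index is recovered from the element's
// address, so the predicate depends on object identity.
std::ptrdiff_t count_fixed_points(const std::vector<int>& data) {
    // Each handle refers back to the original element; copying a handle
    // does not copy the int it refers to.
    std::vector<std::reference_wrapper<const int>> handles(data.begin(),
                                                           data.end());
    const int* base = data.data();
    return std::count_if(std::execution::par, handles.begin(), handles.end(),
                         [base](std::reference_wrapper<const int> h) {
                             const int& elem = h.get();    // original object
                             return elem == &elem - base;  // address-based index
                         });
}
```

The implementation remains free to copy the handles, but every copy still designates the same underlying int, so the address arithmetic stays meaningful.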
[4](#4) The invocations of element access functions in parallel algorithms invoked with an execution policy object of type execution::sequenced_policy all occur in the calling thread of execution[.](#4.sentence-1)

[*Note [3](#note-3)*: The invocations are not interleaved; see [[intro.execution]](intro.execution "6.10.1 Sequential execution")[.](#4.sentence-2) *end note*]

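A minimal sketch of what this guarantee permits: because every invocation under execution::seq runs in the calling thread and invocations do not interleave, the function object may update an ordinary, non-atomic variable. The helper name all_on_calling_thread is hypothetical and exists only to make the property observable.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <thread>
#include <vector>

// Hypothetical helper: reports whether every invocation of the function
// object ran on the thread that called the algorithm.
bool all_on_calling_thread(const std::vector<int>& in) {
    const std::thread::id caller = std::this_thread::get_id();
    bool same = true;  // plain bool: no synchronization is needed with seq
    std::for_each(std::execution::seq, in.begin(), in.end(), [&](int) {
        same = same && (std::this_thread::get_id() == caller);
    });
    return same;
}
```

The same lambda would be a data race under execution::par, since `same` is written without synchronization.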
[5](#5) The invocations of element access functions in parallel algorithms invoked with an execution policy object of type execution::unsequenced_policy are permitted to execute in an unordered fashion in the calling thread of execution, unsequenced with respect to one another in the calling thread of execution[.](#5.sentence-1)

[*Note [4](#note-4)*: This means that multiple function object invocations can be interleaved on a single thread of execution, which overrides the usual guarantee from [[intro.execution]](intro.execution "6.10.1 Sequential execution") that function executions do not overlap with one another[.](#5.sentence-2) *end note*]

The behavior of a program is undefined if it invokes a vectorization-unsafe standard library function from user code called from an execution::unsequenced_policy algorithm[.](#5.sentence-3)

[*Note [5](#note-5)*: Because execution::unsequenced_policy allows the execution of element access functions to be interleaved on a single thread of execution, blocking synchronization, including the use of mutexes, risks deadlock[.](#5.sentence-4) *end note*]

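The following sketch shows the kind of function object that is generally suitable for execution::unseq: it touches only its own element, takes no locks, and calls no standard library function. It assumes a toolchain that provides std::execution::unseq (standardized in C++20); the helper name squares is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <vector>

// Hypothetical helper: squares each element into a preallocated output.
// The lambda is a pure per-element computation, so interleaving several
// invocations on one thread cannot deadlock or race.
std::vector<int> squares(const std::vector<int>& in) {
    std::vector<int> out(in.size());  // allocate before the algorithm runs
    std::transform(std::execution::unseq, in.begin(), in.end(), out.begin(),
                   [](int x) { return x * x; });
    return out;
}
```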
[6](#6) The invocations of element access functions in parallel algorithms invoked with an execution policy object of type execution::parallel_policy are permitted to execute either in the invoking thread of execution or in a thread of execution implicitly created by the library to support parallel algorithm execution[.](#6.sentence-1) If the threads of execution created by thread ([[thread.thread.class]](thread.thread.class "32.4.3 Class thread")) or jthread ([[thread.jthread.class]](thread.jthread.class "32.4.4 Class jthread")) provide concurrent forward progress guarantees ([[intro.progress]](intro.progress "6.10.2.3 Forward progress")), then a thread of execution implicitly created by the library will provide parallel forward progress guarantees; otherwise, the provided forward progress guarantee is implementation-defined[.](#6.sentence-2) Any such invocations executing in the same thread of execution are indeterminately sequenced with respect to each other[.](#6.sentence-3)

[*Note [6](#note-6)*: It is the caller's responsibility to ensure that the invocation does not introduce data races or deadlocks[.](#6.sentence-4) *end note*]

[*Example [1](#example-1)*:

    int a[] = {0,1};
    std::vector<int> v;
    std::for_each(std::execution::par, std::begin(a), std::end(a), [&](int i) {
      v.push_back(i*2+1);     // incorrect: data race
    });

The program above has a data race because of the unsynchronized access to the container v[.](#6.sentence-5) *end example*]

[*Example [2](#example-2)*:

    std::atomic<int> x{0};
    int a[] = {1,2};
    std::for_each(std::execution::par, std::begin(a), std::end(a), [&](int) {
      x.fetch_add(1, std::memory_order::relaxed);
      // spin wait for another iteration to change the value of x
      while (x.load(std::memory_order::relaxed) == 1) { }     // incorrect: assumes execution order
    });

The above example depends on the order of execution of the iterations, and will not terminate if both iterations are executed sequentially on the same thread of execution[.](#6.sentence-6) *end example*]

[*Example [3](#example-3)*:

    int x = 0;
    std::mutex m;
    int a[] = {1,2};
    std::for_each(std::execution::par, std::begin(a), std::end(a), [&](int) {
      std::lock_guard<std::mutex> guard(m);
      ++x;
    });

The above example synchronizes access to object x, ensuring that it is incremented correctly[.](#6.sentence-7) *end example*]

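One lock-free way to avoid the race in Example 1, sketched under the assumption that the number of results is known up front: size the output before the algorithm runs and let each invocation write only its own slot. The helper name odd_images is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <vector>

// Hypothetical helper: computes i*2+1 for every input element.
// No container is resized during the algorithm, and each invocation
// writes a distinct element of v, so no two invocations conflict.
std::vector<int> odd_images(const std::vector<int>& a) {
    std::vector<int> v(a.size());  // sized once, before parallel execution
    std::transform(std::execution::par, a.begin(), a.end(), v.begin(),
                   [](int i) { return i * 2 + 1; });
    return v;
}
```

Unlike the mutex in Example 3, this approach needs no blocking synchronization, so it is not affected by the order in which iterations happen to run.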
[7](#7) The invocations of element access functions in parallel algorithms invoked with an execution policy object of type execution::parallel_unsequenced_policy are permitted to execute in an unordered fashion in unspecified threads of execution, and unsequenced with respect to one another within each thread of execution[.](#7.sentence-1) These threads of execution are either the invoking thread of execution or threads of execution implicitly created by the library; the latter will provide weakly parallel forward progress guarantees[.](#7.sentence-2)

[*Note [7](#note-7)*: This means that multiple function object invocations can be interleaved on a single thread of execution, which overrides the usual guarantee from [[intro.execution]](intro.execution "6.10.1 Sequential execution") that function executions do not overlap with one another[.](#7.sentence-3) *end note*]

The behavior of a program is undefined if it invokes a vectorization-unsafe standard library function from user code called from an execution::parallel_unsequenced_policy algorithm[.](#7.sentence-4)

[*Note [8](#note-8)*: Because execution::parallel_unsequenced_policy allows the execution of element access functions to be interleaved on a single thread of execution, blocking synchronization, including the use of mutexes, risks deadlock[.](#7.sentence-5) *end note*]

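Since a mutex-protected accumulator is risky under execution::par_unseq, a running total can instead be expressed as a reduction, leaving the combining step to the library. The sketch below is one such formulation; the helper name sum_of_squares is hypothetical.

```cpp
#include <cassert>
#include <execution>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: sums the squares of the input without any
// user-level locking. Both callables are pure, so they can be
// interleaved or run on any thread the library chooses.
long long sum_of_squares(const std::vector<int>& in) {
    return std::transform_reduce(
        std::execution::par_unseq, in.begin(), in.end(),
        0LL,                                                   // initial value
        std::plus<>{},                                         // combine partial sums
        [](int x) { return static_cast<long long>(x) * x; });  // per-element map
}
```

Integer addition is associative and commutative, which is what std::transform_reduce requires of the combining operation for the result to be deterministic.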
[8](#8) [*Note [9](#note-9)*: The semantics of invocation with execution::unsequenced_policy, execution::parallel_policy, or execution::parallel_unsequenced_policy allow the implementation to fall back to sequential execution if the system cannot parallelize an algorithm invocation, e.g., due to lack of resources[.](#8.sentence-1) *end note*]

[9](#9) If an invocation of a parallel algorithm uses threads of execution implicitly created by the library, then the invoking thread of execution will either

- [(9.1)](#9.1) temporarily block with forward progress guarantee delegation ([[intro.progress]](intro.progress "6.10.2.3 Forward progress")) on the completion of these library-managed threads of execution, or
- [(9.2)](#9.2) eventually execute an element access function;

the thread of execution will continue to do so until the algorithm is finished[.](#9.sentence-1)

[*Note [10](#note-10)*: In blocking with forward progress guarantee delegation in this context, a thread of execution created by the library is considered to have finished execution as soon as it has finished the execution of the particular element access function that the invoking thread of execution logically depends on[.](#9.sentence-2) *end note*]

[10](#10) The semantics of parallel algorithms invoked with an execution policy object of implementation-defined type are implementation-defined[.](#10.sentence-1)