Init
This commit is contained in:
241
cppdraft/algorithms/parallel/exec.md
Normal file
@@ -0,0 +1,241 @@
[algorithms.parallel.exec]

# 26 Algorithms library [[algorithms]](./#algorithms)

## 26.3 Parallel algorithms [[algorithms.parallel]](algorithms.parallel#exec)

### 26.3.3 Effect of execution policies on algorithm execution [algorithms.parallel.exec]

[1](#1)

[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/algorithms.tex#L391)

An execution policy template parameter describes
the manner in which the execution of a parallel algorithm may be
parallelized and the manner in which it applies the element access functions[.](#1.sentence-1)

[2](#2)

[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/algorithms.tex#L396)

If an object is modified by an element access function,
the algorithm will perform no other unsynchronized accesses to that object[.](#2.sentence-1)

The modifying element access functions are those
which are specified as modifying the object[.](#2.sentence-2)

[*Note [1](#note-1)*:

For example, swap, ++, --, @=, and
assignments
modify the object[.](#2.sentence-3)

For the assignment and @= operators, only the left argument is modified[.](#2.sentence-4)

*end note*]

[3](#3)

[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/algorithms.tex#L412)

Unless otherwise stated, implementations may make arbitrary copies of elements
(with type T) from sequences
where is_trivially_copy_constructible_v<T> and is_trivially_destructible_v<T> are true[.](#3.sentence-1)

[*Note [2](#note-2)*:

This implies that user-supplied function objects cannot rely on
object identity of arguments for such input sequences[.](#3.sentence-2)

If object identity of the arguments to these function objects
is important, a wrapping iterator
that returns a non-copied implementation object
such as reference_wrapper<T> ([[refwrap]](refwrap "22.10.6 Class template reference_wrapper")),
or some equivalent solution, can be used[.](#3.sentence-3)

*end note*]
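
As a rough illustration of the note above, the following sketch stores reference_wrapper objects in the sequence itself, which is one possible "equivalent solution" rather than the wrapping iterator the note mentions. The names Sample and count_refs_to are hypothetical and chosen only for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <execution>
#include <functional>
#include <type_traits>
#include <vector>

// Hypothetical element type. It is trivially copy constructible and
// trivially destructible, so an implementation may copy it freely.
struct Sample {
    int value;
};
static_assert(std::is_trivially_copy_constructible_v<Sample>);
static_assert(std::is_trivially_destructible_v<Sample>);

// Counts how many entries of refs refer to the object target.
// Even if the algorithm copies a reference_wrapper, get() still names the
// original Sample, so comparing addresses remains meaningful.
inline std::ptrdiff_t count_refs_to(
    const std::vector<std::reference_wrapper<Sample>>& refs,
    const Sample& target) {
    return std::count_if(std::execution::par, refs.begin(), refs.end(),
                         [&target](std::reference_wrapper<Sample> r) {
                             return &r.get() == &target;
                         });
}
```

Had the predicate taken a Sample directly from a sequence of Sample objects, comparing its address against target could give a different answer whenever the implementation chose to pass a copy.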

[4](#4)

[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/algorithms.tex#L427)

The invocations of element access functions in parallel algorithms invoked with
an execution policy object of type execution::sequenced_policy all occur
in the calling thread of execution[.](#4.sentence-1)

[*Note [3](#note-3)*:

The invocations are not interleaved; see [[intro.execution]](intro.execution "6.10.1 Sequential execution")[.](#4.sentence-2)

*end note*]
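
The guarantee for execution::sequenced_policy can be observed with a small sketch such as the one below. The function name seq_runs_on_calling_thread is hypothetical, and the code checks the property rather than defining it.

```cpp
#include <algorithm>
#include <execution>
#include <thread>
#include <vector>

// Returns true if every element access function invocation ran on the
// thread that called the algorithm, as specified for sequenced_policy.
inline bool seq_runs_on_calling_thread(const std::vector<int>& input) {
    const std::thread::id caller = std::this_thread::get_id();
    bool all_on_caller = true;
    // The flag needs no synchronization: with execution::seq the
    // invocations occur on this thread and are not interleaved.
    std::for_each(std::execution::seq, input.begin(), input.end(),
                  [&](int) {
                      if (std::this_thread::get_id() != caller) {
                          all_on_caller = false;
                      }
                  });
    return all_on_caller;
}
```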

[5](#5)

[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/algorithms.tex#L435)

The invocations of element access functions in parallel algorithms invoked with
an execution policy object of type execution::unsequenced_policy are permitted to execute in an unordered fashion
in the calling thread of execution,
unsequenced with respect to one another in the calling thread of execution[.](#5.sentence-1)

[*Note [4](#note-4)*:

This means that multiple function object invocations
can be interleaved on a single thread of execution,
which overrides the usual guarantee from [[intro.execution]](intro.execution "6.10.1 Sequential execution") that function executions do not overlap with one another[.](#5.sentence-2)

*end note*]

The behavior of a program is undefined if
it invokes a vectorization-unsafe standard library function
from user code
called from an execution::unsequenced_policy algorithm[.](#5.sentence-3)

[*Note [5](#note-5)*:

Because execution::unsequenced_policy allows
the execution of element access functions
to be interleaved on a single thread of execution,
blocking synchronization, including the use of mutexes, risks deadlock[.](#5.sentence-4)

*end note*]

[6](#6)

[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/algorithms.tex#L458)

The invocations of element access functions in parallel algorithms invoked with
an execution policy object of type execution::parallel_policy are permitted to execute either
in the invoking thread of execution or
in a thread of execution implicitly created by the library
to support parallel algorithm execution[.](#6.sentence-1)

If the threads of execution created by thread ([[thread.thread.class]](thread.thread.class "32.4.3 Class thread"))
or jthread ([[thread.jthread.class]](thread.jthread.class "32.4.4 Class jthread"))
provide concurrent forward progress guarantees ([[intro.progress]](intro.progress "6.10.2.3 Forward progress")),
then a thread of execution implicitly created by the library will provide
parallel forward progress guarantees;
otherwise, the provided forward progress guarantee is implementation-defined[.](#6.sentence-2)

Any such invocations executing in the same thread of execution
are indeterminately sequenced with respect to each other[.](#6.sentence-3)

[*Note [6](#note-6)*:

It is the caller's responsibility to ensure
that the invocation does not introduce data races or deadlocks[.](#6.sentence-4)

*end note*]

[*Example [1](#example-1)*:

    int a[] = {0,1};
    std::vector<int> v;
    std::for_each(std::execution::par, std::begin(a), std::end(a), [&](int i) {
      v.push_back(i*2+1);                     // incorrect: data race
    });

The program above has a data race because of the unsynchronized access to the
container v[.](#6.sentence-5)

*end example*]

[*Example [2](#example-2)*:

    std::atomic<int> x{0};
    int a[] = {1,2};
    std::for_each(std::execution::par, std::begin(a), std::end(a), [&](int) {
      x.fetch_add(1, std::memory_order::relaxed);
      // spin wait for another iteration to change the value of x
      while (x.load(std::memory_order::relaxed) == 1) { }   // incorrect: assumes execution order
    });

The above example depends on the order of execution of the iterations, and
will not terminate if both iterations are executed sequentially
on the same thread of execution[.](#6.sentence-6)

*end example*]

[*Example [3](#example-3)*:

    int x = 0;
    std::mutex m;
    int a[] = {1,2};
    std::for_each(std::execution::par, std::begin(a), std::end(a), [&](int) {
      std::lock_guard<mutex> guard(m);
      ++x;
    });

The above example synchronizes access to object x ensuring that it is incremented correctly[.](#6.sentence-7)

*end example*]

[7](#7)

[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/algorithms.tex#L518)

The invocations of element access functions in parallel algorithms invoked with
an execution policy object of type execution::parallel_unsequenced_policy are
permitted to execute
in an unordered fashion in unspecified threads of execution, and
unsequenced with respect to one another within each thread of execution[.](#7.sentence-1)

These threads of execution are
either the invoking thread of execution
or threads of execution implicitly created by the library;
the latter will provide weakly parallel forward progress guarantees[.](#7.sentence-2)

[*Note [7](#note-7)*:

This means that multiple function object invocations can be interleaved
on a single thread of execution,
which overrides the usual guarantee from [[intro.execution]](intro.execution "6.10.1 Sequential execution") that function executions do not overlap with one another[.](#7.sentence-3)

*end note*]

The behavior of a program is undefined if
it invokes a vectorization-unsafe standard library function
from user code
called from an execution::parallel_unsequenced_policy algorithm[.](#7.sentence-4)

[*Note [8](#note-8)*:

Because execution::parallel_unsequenced_policy allows
the execution of element access functions
to be interleaved on a single thread of execution,
blocking synchronization, including the use of mutexes, risks deadlock[.](#7.sentence-5)

*end note*]
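
Since mutexes are unsuitable under execution::parallel_unsequenced_policy, an accumulation is usually better expressed as a reduction with no shared state. The sketch below shows one such formulation. The function name sum_of_squares is hypothetical.

```cpp
#include <execution>
#include <functional>
#include <numeric>
#include <vector>

// Sums the squares of the elements. The element access functions are pure:
// they take no locks and touch no shared objects, so interleaving them on
// one thread or spreading them across threads cannot deadlock or race.
inline long long sum_of_squares(const std::vector<int>& values) {
    return std::transform_reduce(
        std::execution::par_unseq, values.begin(), values.end(),
        0LL,            // initial value; also fixes the result type
        std::plus<>{},  // combines partial results
        [](int v) { return static_cast<long long>(v) * v; });
}
```

The algorithm, not the user code, is then responsible for combining partial results safely.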

[8](#8)

[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/algorithms.tex#L545)

[*Note [9](#note-9)*:

The semantics of invocation with execution::unsequenced_policy, execution::parallel_policy, or execution::parallel_unsequenced_policy allow the implementation to fall back to sequential execution
if the system cannot parallelize an algorithm invocation,
e.g., due to lack of resources[.](#8.sentence-1)

*end note*]
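
One practical consequence of this fallback is that a program cannot assume more than one thread takes part in an invocation. The sketch below merely reports how many distinct threads ran element access functions. The function name threads_used is hypothetical, and the only portable expectation for a non-empty input is a result of at least one.

```cpp
#include <algorithm>
#include <cstddef>
#include <execution>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Reports how many distinct threads of execution ran the lambda.
inline std::size_t threads_used(const std::vector<int>& input) {
    std::set<std::thread::id> ids;
    std::mutex m;  // blocking synchronization is acceptable with execution::par
    std::for_each(std::execution::par, input.begin(), input.end(),
                  [&](int) {
                      std::lock_guard<std::mutex> guard(m);
                      ids.insert(std::this_thread::get_id());
                  });
    return ids.size();
}
```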

[9](#9)

[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/algorithms.tex#L556)

If an invocation of a parallel algorithm uses threads of execution
implicitly created by the library,
then the invoking thread of execution will either

- [(9.1)](#9.1)
  temporarily block
  with forward progress guarantee delegation ([[intro.progress]](intro.progress "6.10.2.3 Forward progress"))
  on the completion of these library-managed threads of execution, or

- [(9.2)](#9.2)
  eventually execute an element access function;

the thread of execution will continue to do so until the algorithm is finished[.](#9.sentence-1)

[*Note [10](#note-10)*:

In blocking with forward progress guarantee delegation in this context,
a thread of execution created by the library
is considered to have finished execution
as soon as it has finished the execution
of the particular element access function
that the invoking thread of execution logically depends on[.](#9.sentence-2)

*end note*]

[10](#10)

[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/algorithms.tex#L578)

The semantics of parallel algorithms invoked with an execution policy object of implementation-defined type
are implementation-defined[.](#10.sentence-1)
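
Generic code that wants to rely only on the semantics specified in this subclause could check whether a policy is one of the standard policy types, along the lines of the sketch below. The variable template has_standard_semantics_v is a hypothetical helper, not a library facility. The check for unsequenced_policy is guarded by the feature-test macro so that the sketch also compiles as C++17.

```cpp
#include <execution>
#include <type_traits>

// True when Policy (ignoring cv-qualification and references) is one of the
// standard execution policy types, whose semantics this subclause specifies.
template <class Policy>
inline constexpr bool has_standard_semantics_v = [] {
    using P = std::remove_cv_t<std::remove_reference_t<Policy>>;
    bool known =
        std::is_same_v<P, std::execution::sequenced_policy> ||
        std::is_same_v<P, std::execution::parallel_policy> ||
        std::is_same_v<P, std::execution::parallel_unsequenced_policy>;
#if defined(__cpp_lib_execution) && __cpp_lib_execution >= 201902L
    // unsequenced_policy was added in C++20.
    known = known || std::is_same_v<P, std::execution::unsequenced_policy>;
#endif
    return known;
}();

// is_execution_policy_v may also be true for implementation-defined policy
// types, so it answers a different question than the helper above.
static_assert(std::is_execution_policy_v<std::execution::parallel_policy>);
static_assert(!std::is_execution_policy_v<int>);
```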