2025-10-25 03:02:53 +03:00
commit 043225d523
3416 changed files with 681196 additions and 0 deletions

cppdraft/simd/loadstore.md Normal file

@@ -0,0 +1,354 @@
[simd.loadstore]
# 29 Numerics library [[numerics]](./#numerics)
## 29.10 Data-parallel types [[simd]](simd#loadstore)
### 29.10.8 basic_vec non-member operations [[simd.nonmembers]](simd.nonmembers#simd.loadstore)
#### 29.10.8.6 basic_vec load and store functions [simd.loadstore]
[🔗](#lib:unchecked_load,simd)
`template<class V = see below, ranges::contiguous_range R, class... Flags>
  requires ranges::sized_range<R>
  constexpr V unchecked_load(R&& r, flags<Flags...> f = {});
template<class V = see below, ranges::contiguous_range R, class... Flags>
  requires ranges::sized_range<R>
  constexpr V unchecked_load(R&& r, const typename V::mask_type& mask, flags<Flags...> f = {});
template<class V = see below, contiguous_iterator I, class... Flags>
  constexpr V unchecked_load(I first, iter_difference_t<I> n, flags<Flags...> f = {});
template<class V = see below, contiguous_iterator I, class... Flags>
  constexpr V unchecked_load(I first, iter_difference_t<I> n, const typename V::mask_type& mask,
    flags<Flags...> f = {});
template<class V = see below, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
  constexpr V unchecked_load(I first, S last, flags<Flags...> f = {});
template<class V = see below, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
  constexpr V unchecked_load(I first, S last, const typename V::mask_type& mask,
    flags<Flags...> f = {});
`
[1](#1)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18486)
Let
- [(1.1)](#1.1)
mask be V::mask_type(true) for the overloads with no mask parameter;
- [(1.2)](#1.2)
R be span<const iter_value_t<I>> for the overloads with no
template parameter R;
- [(1.3)](#1.3)
r be R(first, n) for the overloads with an n parameter and R(first, last) for the overloads with a last parameter[.](#1.sentence-1)
[2](#2)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18501)
*Mandates*: If ranges::size(r) is a constant expression then ranges::size(r) ≥ V::size()[.](#2.sentence-1)
[3](#3)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18506)
*Preconditions*:
- [(3.1)](#3.1)
[first, first + n) is a valid range for the overloads with an n parameter[.](#3.1.sentence-1)
- [(3.2)](#3.2)
[first, last) is a valid range for the overloads with a last parameter[.](#3.2.sentence-1)
- [(3.3)](#3.3)
ranges::size(r) ≥ V::size()
[4](#4)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18516)
*Effects*: Equivalent to: return partial_load<V>(r, mask, f);
[5](#5)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18520)
*Remarks*: The default argument for template parameter V is basic_vec<ranges::range_value_t<R>>[.](#5.sentence-1)
[🔗](#lib:partial_load,simd)
`template<class V = see below, ranges::contiguous_range R, class... Flags>
  requires ranges::sized_range<R>
  constexpr V partial_load(R&& r, flags<Flags...> f = {});
template<class V = see below, ranges::contiguous_range R, class... Flags>
  requires ranges::sized_range<R>
  constexpr V partial_load(R&& r, const typename V::mask_type& mask, flags<Flags...> f = {});
template<class V = see below, contiguous_iterator I, class... Flags>
  constexpr V partial_load(I first, iter_difference_t<I> n, flags<Flags...> f = {});
template<class V = see below, contiguous_iterator I, class... Flags>
  constexpr V partial_load(I first, iter_difference_t<I> n, const typename V::mask_type& mask,
    flags<Flags...> f = {});
template<class V = see below, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
  constexpr V partial_load(I first, S last, flags<Flags...> f = {});
template<class V = see below, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
  constexpr V partial_load(I first, S last, const typename V::mask_type& mask,
    flags<Flags...> f = {});
`
[6](#6)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18547)
Let
- [(6.1)](#6.1)
mask be V::mask_type(true) for the overloads with no mask parameter;
- [(6.2)](#6.2)
R be span<const iter_value_t<I>> for the overloads with no
template parameter R;
- [(6.3)](#6.3)
r be R(first, n) for the overloads with an n parameter and R(first, last) for the overloads with a last parameter[.](#6.sentence-1)
[7](#7)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18562)
*Mandates*:
- [(7.1)](#7.1)
ranges::range_value_t<R> is a vectorizable type,
- [(7.2)](#7.2)
same_as<remove_cvref_t<V>, V> is true,
- [(7.3)](#7.3)
V is an enabled specialization of basic_vec, and
- [(7.4)](#7.4)
if the template parameter pack Flags does not contain *convert-flag*, then the conversion from ranges::range_value_t<R> to V::value_type is
value-preserving[.](#7.sentence-1)
[8](#8)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18578)
*Preconditions*:
- [(8.1)](#8.1)
[first, first + n) is a valid range for the overloads with an n parameter[.](#8.1.sentence-1)
- [(8.2)](#8.2)
[first, last) is a valid range for the overloads with a last parameter[.](#8.2.sentence-1)
- [(8.3)](#8.3)
If the template parameter pack Flags contains *aligned-flag*, ranges::data(r) points to storage
aligned by alignment_v<V, ranges::range_value_t<R>>[.](#8.3.sentence-1)
- [(8.4)](#8.4)
If the template parameter pack Flags contains *overaligned-flag*<N>, ranges::data(r) points to
storage aligned by N[.](#8.4.sentence-1)
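The alignment preconditions above are the caller's responsibility. A caller who wants to check them in a debug build could use something like the following sketch. The names `is_aligned_by` and `assume_overaligned` are hypothetical and not part of the library, and the check assumes that converting a pointer to `std::uintptr_t` yields its address, which holds on common platforms but is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when p points to storage aligned by N bytes.
// Assumes the pointer-to-integer conversion yields the address.
template <std::size_t N>
bool is_aligned_by(const void* p) {
    static_assert(N != 0 && (N & (N - 1)) == 0, "alignment must be a power of two");
    return reinterpret_cast<std::uintptr_t>(p) % N == 0;
}

// Hypothetical debug check a caller might run on ranges::data(r)
// before requesting the overaligned behavior for alignment N.
template <std::size_t N, class U>
const U* assume_overaligned(const U* data) {
    assert(is_aligned_by<N>(data));
    return data;
}
```

For example, a buffer declared with `alignas(32)` passes `is_aligned_by<32>` at its first element, while the address of its second `float` element does not.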
[9](#9)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18597)
*Effects*: Initializes the ith element with
mask[i] && i < ranges::size(r) ? static_cast<T>(ranges::data(r)[i]) : T() for all i in the range of
[0, V::size())[.](#9.sentence-2)
[10](#10)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18604)
*Remarks*: The default argument for template parameter V is basic_vec<ranges::range_value_t<R>>[.](#10.sentence-1)
[🔗](#lib:unchecked_store,simd)
`template<class T, class Abi, ranges::contiguous_range R, class... Flags>
  requires ranges::sized_range<R> && indirectly_writable<ranges::iterator_t<R>, T>
  constexpr void unchecked_store(const basic_vec<T, Abi>& v, R&& r, flags<Flags...> f = {});
template<class T, class Abi, ranges::contiguous_range R, class... Flags>
  requires ranges::sized_range<R> && indirectly_writable<ranges::iterator_t<R>, T>
  constexpr void unchecked_store(const basic_vec<T, Abi>& v, R&& r,
    const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
template<class T, class Abi, contiguous_iterator I, class... Flags>
  requires indirectly_writable<I, T>
  constexpr void unchecked_store(const basic_vec<T, Abi>& v, I first, iter_difference_t<I> n,
    flags<Flags...> f = {});
template<class T, class Abi, contiguous_iterator I, class... Flags>
  requires indirectly_writable<I, T>
  constexpr void unchecked_store(const basic_vec<T, Abi>& v, I first, iter_difference_t<I> n,
    const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
template<class T, class Abi, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
  requires indirectly_writable<I, T>
  constexpr void unchecked_store(const basic_vec<T, Abi>& v, I first, S last,
    flags<Flags...> f = {});
template<class T, class Abi, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
  requires indirectly_writable<I, T>
  constexpr void unchecked_store(const basic_vec<T, Abi>& v, I first, S last,
    const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
`
[11](#11)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18638)
Let
- [(11.1)](#11.1)
mask be basic_vec<T, Abi>::mask_type(true) for the
overloads with no mask parameter;
- [(11.2)](#11.2)
R be span<iter_value_t<I>> for the overloads with no
template parameter R;
- [(11.3)](#11.3)
r be R(first, n) for the overloads with an n parameter and R(first, last) for the overloads with a last parameter[.](#11.sentence-1)
[12](#12)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18653)
*Mandates*: If ranges::size(r) is a constant expression then ranges::size(r) ≥ *simd-size-v*<T, Abi>[.](#12.sentence-1)
[13](#13)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18658)
*Preconditions*:
- [(13.1)](#13.1)
[first, first + n) is a valid range for the overloads with an n parameter[.](#13.1.sentence-1)
- [(13.2)](#13.2)
[first, last) is a valid range for the overloads with a last parameter[.](#13.2.sentence-1)
- [(13.3)](#13.3)
ranges::size(r) ≥ *simd-size-v*<T, Abi>
[14](#14)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18671)
*Effects*: Equivalent to: partial_store(v, r, mask, f)[.](#14.sentence-1)
[🔗](#lib:partial_store,simd)
`template<class T, class Abi, ranges::contiguous_range R, class... Flags>
  requires ranges::sized_range<R> && indirectly_writable<ranges::iterator_t<R>, T>
  constexpr void partial_store(const basic_vec<T, Abi>& v, R&& r, flags<Flags...> f = {});
template<class T, class Abi, ranges::contiguous_range R, class... Flags>
  requires ranges::sized_range<R> && indirectly_writable<ranges::iterator_t<R>, T>
  constexpr void partial_store(const basic_vec<T, Abi>& v, R&& r,
    const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
template<class T, class Abi, contiguous_iterator I, class... Flags>
  requires indirectly_writable<I, T>
  constexpr void partial_store(const basic_vec<T, Abi>& v, I first, iter_difference_t<I> n,
    flags<Flags...> f = {});
template<class T, class Abi, contiguous_iterator I, class... Flags>
  requires indirectly_writable<I, T>
  constexpr void partial_store(const basic_vec<T, Abi>& v, I first, iter_difference_t<I> n,
    const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
template<class T, class Abi, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
  requires indirectly_writable<I, T>
  constexpr void partial_store(const basic_vec<T, Abi>& v, I first, S last,
    flags<Flags...> f = {});
template<class T, class Abi, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
  requires indirectly_writable<I, T>
  constexpr void partial_store(const basic_vec<T, Abi>& v, I first, S last,
    const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
`
[15](#15)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18704)
Let
- [(15.1)](#15.1)
mask be basic_vec<T, Abi>::mask_type(true) for the
overloads with no mask parameter;
- [(15.2)](#15.2)
R be span<iter_value_t<I>> for the overloads with no
template parameter R;
- [(15.3)](#15.3)
r be R(first, n) for the overloads with an n parameter and R(first, last) for the overloads with a last parameter[.](#15.sentence-1)
[16](#16)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18719)
*Mandates*:
- [(16.1)](#16.1)
ranges::range_value_t<R> is a vectorizable type, and
- [(16.2)](#16.2)
if the template parameter pack Flags does not contain *convert-flag*, then the conversion from T to ranges::range_value_t<R> is value-preserving[.](#16.sentence-1)
[17](#17)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18730)
*Preconditions*:
- [(17.1)](#17.1)
[first, first + n) is a valid range for the overloads with an n parameter[.](#17.1.sentence-1)
- [(17.2)](#17.2)
[first, last) is a valid range for the overloads with a last parameter[.](#17.2.sentence-1)
- [(17.3)](#17.3)
If the template parameter pack Flags contains *aligned-flag*, ranges::data(r) points to storage
aligned by alignment_v<basic_vec<T, Abi>,
ranges::range_value_t<R>>[.](#17.3.sentence-1)
- [(17.4)](#17.4)
If the template parameter pack Flags contains *overaligned-flag*<N>, ranges::data(r) points to
storage aligned by N[.](#17.4.sentence-1)
[18](#18)
[#](http://github.com/Eelis/draft/tree/9adde4bc1c62ec234483e63ea3b70a59724c745a/source/numerics.tex#L18750)
*Effects*: For all i in the range of [0, basic_vec<T, Abi>::size()), if mask[i] && i < ranges::size(r) is true, evaluates ranges::data(r)[i] = v[i][.](#18.sentence-1)