* Instructions wr[u]ss[d|q] no longer accept a register as the first
operand (accepting this form was a bug)
* Moved APX version of legacy instructions closer so they are next
to each other
* Removed AVX512_ER, AVX512_PF, AVX512_4FMAPS, and AVX512_4VNNIW
extensions and corresponding instructions (these were never
advertised by any x86 CPU and were only used by Xeon Phi
accelerators, which AsmJit never supported)
* Removed CPU extensions HLE, MPX, and TSX
* Kept the RTM extension only for backward compatibility, so that its
instructions are still recognized; it's no longer checked by CpuInfo
as it has been deprecated together with HLE and MPX
* The xtest instruction now reports it requires RTM
* Reorganized x86 extensions - they are now grouped by category, in
preparation for a future where new extension IDs are always added
after existing records for ABI compatibility
* Instruction vcvtneps2bf16 no longer accepts a form without an
explicit memory operand size
* Removed aliased instructions in CMOVcc, Jcc, and SETcc categories;
there is now only a single instruction id for all aliased instructions
* Added a new feature to always show instruction aliases in Logger, which
includes formatting instruction nodes (Builder, Compiler)
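The idea of a single instruction id shared by all aliases can be sketched as follows. This is only an illustration and not AsmJit's actual tables: JumpId, AliasEntry, kJumpAliases, and jumpIdFromName() are hypothetical names, and only a few x86 jump aliases are listed.

```cpp
#include <cstring>

// Hypothetical ids: one id per distinct condition, not one per mnemonic.
enum class JumpId { kUnknown, kJb, kJe, kJne };

struct AliasEntry {
  const char* name;
  JumpId id;
};

// Several mnemonics map to the same id (for example "je" and "jz").
static const AliasEntry kJumpAliases[] = {
  {"jb", JumpId::kJb}, {"jc", JumpId::kJb}, {"jnae", JumpId::kJb},
  {"je", JumpId::kJe}, {"jz", JumpId::kJe},
  {"jne", JumpId::kJne}, {"jnz", JumpId::kJne}
};

// Resolves a mnemonic to its shared id, or kUnknown if it's not listed.
inline JumpId jumpIdFromName(const char* name) {
  for (const AliasEntry& entry : kJumpAliases) {
    if (std::strcmp(entry.name, name) == 0)
      return entry.id;
  }
  return JumpId::kUnknown;
}
```

With such a table, the alias only matters when parsing or formatting text; everything else works with the shared id.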
Instruction DB-only updates (not applied to C++ yet):
* AsmJit DB now uses the same license as AsmJit (Zlib) and
no longer applies dual licensing (Zlib and Public Domain)
* Added support for aggregated instruction definitions in
x86 instruction database, which should simplify the maintenance
and reduce bugs (also the syntax is comparable to descriptions
used by Intel APX instruction manuals)
* Added support for APX instructions and new features
* Added support for AVX10.1 and AVX10.2 instructions (both new
instructions and new encodings of existing instructions)
* Added support for MOVRS instructions
* Added support for KL instructions (loadiwkey)
* Added support for AESKLE instructions
* Added support for AESKLEWIDE_KL instructions
* Added support for AMX_[AVX512|MOVRS|FP8|TF32|TRANSPOSE]
* NOTE: None of these instruction additions are currently used by
AsmJit; this is a pure database update that needs more work to
make all the instructions available in a future AsmJit version
* The implementation tries to detect whether a virtual register
only lives in a single basic block and then uses a move approach
instead of spill/alloc when reallocating
* Additionally, the implementation now improves the use of scratch
registers during function arguments allocation - scratch is only
reserved when it's actually needed
Changes the reassignment decision used by the local register allocator.
When a virtual register is killed by the instruction (or has a
separate output slot) it is now reassigned instead of spilled.
This should be a minor improvement for certain corner cases.
* Fixes potential undefined behavior in number to string conversion
caused by the -int64_t(a) expression. The fix replaces the
expression with the Support::neg() function, which was designed
for exactly this.
* Minor documentation fixes
* Added virtual destructor to emit helpers to silence warnings of
some compilers (the change has no practical effect, as emit
helpers are mostly allocated on the stack)
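For context on the negation fix above: negating the minimum int64_t value overflows, which is undefined behavior for signed integers. A minimal sketch of an overflow-free conversion follows. It does not show how Support::neg() is implemented; magnitudeOf() and toDecimal() are hypothetical helpers that only illustrate the technique of negating in unsigned arithmetic.

```cpp
#include <cstdint>
#include <string>

// Magnitude of a signed value, computed in unsigned arithmetic so that
// INT64_MIN does not overflow (unsigned wrap-around is well defined).
inline uint64_t magnitudeOf(int64_t value) {
  uint64_t bits = static_cast<uint64_t>(value);
  return value < 0 ? uint64_t(0) - bits : bits;
}

// Converts a signed 64-bit integer to its decimal representation.
inline std::string toDecimal(int64_t value) {
  uint64_t rest = magnitudeOf(value);
  std::string digits;
  do {
    digits.insert(digits.begin(), char('0' + int(rest % 10)));
    rest /= 10;
  } while (rest != 0);
  if (value < 0)
    digits.insert(digits.begin(), '-');
  return digits;
}
```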
* The first operand (destination) is read/write and not overwrite
In addition, added the following new AArch64 instructions to DB:
* CPA extensions (DB-only)
* FAMINMAX extensions (DB-only)
* FP8 ASIMD extensions (DB-only)
* LUT extensions (DB-only)
The extend option in ADD, ADDS, SUB, SUBS, CMP, and CMN instructions
doesn't always use the same type for the second source register. For
example, when extending from a BYTE the second source register must
be W and not X.
This change makes sure that the assembler accepts the correct
combination and refuses the incorrect one.
IMPORTANT: Although this is not an ABI change, the new behavior
can break AArch64 code that used the incorrect signatures.
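The rule can be sketched as a small validation helper. This is not AsmJit code: ExtendOp, RegWidth, requiredSecondSource(), and isValidExtendedOperand() are hypothetical names. The sketch assumes the 64-bit form of these instructions and the A64 rule that only the UXTX and SXTX extends take an X register as the second source.

```cpp
// Hypothetical model of the A64 extend operations.
enum class ExtendOp { kUXTB, kUXTH, kUXTW, kUXTX, kSXTB, kSXTH, kSXTW, kSXTX };

// Width of the second source register.
enum class RegWidth { kW, kX };

// Assumption: in the 64-bit form only UXTX and SXTX read a 64-bit (X)
// second source; byte, half-word, and word extends read a W register.
inline RegWidth requiredSecondSource(ExtendOp op) {
  return (op == ExtendOp::kUXTX || op == ExtendOp::kSXTX) ? RegWidth::kX
                                                          : RegWidth::kW;
}

// Returns true if the extend operation and register width are compatible.
inline bool isValidExtendedOperand(ExtendOp op, RegWidth secondSource) {
  return requiredSecondSource(op) == secondSource;
}
```

Under these assumptions a form like "add x0, x1, w2, uxtb" passes the check, while "add x0, x1, x2, uxtb" is refused.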
* The latest cmake versions started showing warnings about the
minimum supported version, because of possible breaking changes
that do not affect us
* Reworked some bits in CMakeLists.txt to take advantage of the
raised version
* Removed the use of policies that are now enabled by cmake by
default
* Removed deprecated build options
The C-style cast was discarding const and casting to `(uint8_t *)` at
the same time, causing a warning. Add const and non-const versions of
the method.
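A minimal sketch of the const and non-const accessor pair, assuming a hypothetical RawBlock class with a bytes() method (the real class and method are not named here):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper around a raw memory block.
class RawBlock {
public:
  RawBlock(void* base, size_t size) noexcept : _base(base), _size(size) {}

  // Non-const access returns a mutable pointer.
  uint8_t* bytes() noexcept { return static_cast<uint8_t*>(_base); }

  // Const access keeps constness; no C-style cast silently drops it.
  const uint8_t* bytes() const noexcept {
    return static_cast<const uint8_t*>(_base);
  }

  size_t size() const noexcept { return _size; }

private:
  void* _base;
  size_t _size;
};
```

Using static_cast in both overloads means a cast that would discard const fails to compile instead of producing a warning.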
BaseMem::setSize() should not be used anymore as the only memory
operand that understands size is x86::Mem, which makes it x86
specific.
The reason is that other architectures require more bits. For
example, arm::Mem uses the storage that x86 uses for size to store
other information such as offset mode, and more information will
possibly be needed in the future to support AArch64 SVE or SME.
At the moment BaseMem::setSize() is only deprecated, so code using
it still compiles, but with a warning. It will be removed in the
future.
For some reason the growing strategy of asmjit::String was too
aggressive, basically reaching the maximum doubling capacity too
fast (after the first reallocation). This code adapts the current
vector growing strategy to be used also by asmjit::String, which
doubles the capacity until a threshold is reached and then grows
linearly.
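The described strategy can be sketched as follows. The constants and the nextCapacity() helper are assumptions made for illustration; the values actually used by asmjit::String may differ.

```cpp
#include <cstddef>

// Hypothetical constants, not taken from AsmJit.
constexpr size_t kInitialCapacity = 32;
constexpr size_t kGrowThreshold = size_t(1) << 20;

// Returns a capacity that can hold `required` elements: doubles while
// below the threshold, then grows linearly in threshold-sized steps.
// Overflow handling is omitted to keep the sketch short.
inline size_t nextCapacity(size_t current, size_t required) {
  size_t capacity = current != 0 ? current : kInitialCapacity;
  while (capacity < required) {
    if (capacity < kGrowThreshold)
      capacity *= 2;
    else
      capacity += kGrowThreshold;
  }
  return capacity;
}
```

Doubling keeps small strings cheap to append to, while the linear phase avoids over-allocating once the string is already large.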
During bin-packing, a single function nonOverlappingUnionOf() is
called many times to calculate a union of one live range with
another. Before this change it used ZoneVector::reserve() to make
sure that there is enough space for the union. However, this is not
ideal when the union grows every time the function is called,
because the vector is then reallocated many times, which affects
performance.
Instead of calling reserve(), a new function growingReserve() was
added to tell the vector to grow when it needs to reallocate.
In addition, this change fixes some documentation regarding the
use of JitAllocator (Explicit Code Relocation section in core.h).
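The difference between the two reserve calls can be illustrated with a small model that only counts reallocations. It assumes that reserve() allocates exactly the requested capacity and that growingReserve() at least doubles it. CapacityModel and reallocationsFor() are hypothetical, and the actual ZoneVector growth policy may differ.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical model that tracks capacity only, not the stored data.
struct CapacityModel {
  size_t capacity = 0;
  size_t reallocations = 0;

  // Exact reservation: each larger request reallocates.
  void reserve(size_t n) {
    if (n > capacity) {
      capacity = n;
      ++reallocations;
    }
  }

  // Growing reservation: at least doubles when it has to reallocate.
  void growingReserve(size_t n) {
    if (n > capacity) {
      capacity = std::max(n, capacity * 2);
      ++reallocations;
    }
  }
};

// Counts reallocations when the required size grows by one each step.
inline size_t reallocationsFor(size_t steps, bool growing) {
  CapacityModel model;
  for (size_t size = 1; size <= steps; ++size) {
    if (growing)
      model.growingReserve(size);
    else
      model.reserve(size);
  }
  return model.reallocations;
}
```

In this model a union that grows by one element per call reallocates on every call with reserve(), but only a logarithmic number of times with growingReserve().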
This feature has been disabled for a long time so
it could be tested properly, and production use
didn't reveal any issues.
When try mode is enabled the RA will try to allocate
the reassignment first to avoid possibly having to
emit code in a separate block (try mode basically
"tries" to emit code before a branch and not as a
consequence of it).
C++20 deprecates mixing enums of different types (comparisons, etc.).
However, we use enums instead of "static constexpr" in classes to define
constants, because otherwise we would have to give such constants
storage - this is required up to C++14, and since we still support
C++11 we have to keep using enums.
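A minimal sketch of the pattern, with hypothetical names (RegLimits, MaskLayout, limitsMatch()). One common way to avoid the C++20 deprecation warning is to cast both sides to the underlying integer type before comparing. Whether this change does exactly that is an assumption.

```cpp
#include <cstdint>

// Constants defined as unnamed enums need no storage, even in C++11.
struct RegLimits {
  enum : uint32_t { kMaxCount = 32 };
};

struct MaskLayout {
  enum : uint32_t { kBitCount = 32 };
};

// Comparing RegLimits::kMaxCount == MaskLayout::kBitCount directly mixes
// two different enum types, which C++20 deprecates. Casting both sides
// to the underlying type keeps the comparison warning-free.
constexpr bool limitsMatch() noexcept {
  return uint32_t(RegLimits::kMaxCount) == uint32_t(MaskLayout::kBitCount);
}

static_assert(limitsMatch(), "register count must match mask bit count");
```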
The problem is that the rewriter must also rewrite the instruction
ID when it's a [K|V]MOV[B|W|D|Q] instruction that moves from either
a K or SIMD register to a GP register. When such an instruction is
rewritten in a way that it ends up as "xMOVx GP, [MEM]" it would
be invalid unless it's changed to a general purpose MOV.
The problem can only happen when the compiler spills a virtual
register, which is then moved to a scalar register.
In addition, checks were added to MOVD|MOVQ to ensure that when an
invalid instruction is emitted it's not ignored as it used to be.
* clang-18 is now enabled on CI and used for static analysis
* Return an error when X86Internal_setupSaveRestoreInfo() is called
with an invalid register group. Should never happen though.
The register allocator now tries to allocate preserved registers last,
improving prolog/epilog sequences especially on AArch64, which has
calling conventions that require preserving both GP and vector
registers.
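The preference can be sketched with register bit masks. This is a simplified illustration and not the allocator's actual code: lowestSetBit() and pickRegister() are hypothetical helpers, and the real allocator weighs more factors than this.

```cpp
#include <cstdint>

// Index of the lowest set bit; `mask` must be non-zero.
inline int lowestSetBit(uint32_t mask) {
  int index = 0;
  while ((mask & 1u) == 0u) {
    mask >>= 1;
    ++index;
  }
  return index;
}

// Picks a free register, preferring registers that the calling
// convention does not require the function to preserve. Returns -1
// when no register is free.
inline int pickRegister(uint32_t freeMask, uint32_t preservedMask) {
  uint32_t scratch = freeMask & ~preservedMask;
  if (scratch != 0u)
    return lowestSetBit(scratch);   // No save/restore needed.
  if (freeMask != 0u)
    return lowestSetBit(freeMask);  // Falls back to a preserved register.
  return -1;
}
```

Every preserved register that stays unused is one that the prolog and epilog don't have to save and restore.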
The code was fine, however, some compilers may optimize it in a way
that in some corner cases the features returned would be all zero.
This prevents such behavior.
* Each architecture now provides r32() and r64() functions for
register casting
* Each architecture now provides v128() function for register
casting, returning just Vec to make writing cross platform
code easier
* Added some basic condition code abstractions so they can be used
interchangeably across architectures
* Added back unlicense to asmjit database (now it's dual licensed)
mach_vm_remap() makes it possible to create a dual mapping without
having to use a file descriptor, which requires opening a file or
shared memory. The problem is that a recent macOS version started
displaying a popup message when such a file is opened, which annoyed
a lot of users. Thus, the initial code path is no longer used, and
mach_vm_remap() is used instead.
This change only applies to x86 Macs. Apple silicon doesn't allow
dual mapping and instead uses MAP_JIT.
MSVC incorrectly auto-vectorizes a loop that is used in liveness
analysis. Due to this bug the result is wrong, which then affects
how registers are allocated. This change works around the C++
compiler bug.
A new HardenedRuntimeFlags::kDualMapping flag has been introduced to
detect whether dual mapping is provided by the target platform. This
flag can be set even when hardened runtime is not enforced, in cases
in which dual mapping is not available.
This fixes running unit tests on Apple hardware where dual mapping
is not available, but MAP_JIT is (AArch64 hardware).
Additionally, this changeset fixes using the -msse2 flag on non-x86
targets, where compilers accept the "-msse2" flag, but warn about
it. This makes the build 100% clean.