* Denested src folder to root, renamed testing to asmjit-testing
* Refactored how headers are included into <asmjit/...> form. This
is necessary as compilers would never simplify a path once a ..
appears in include directory - then paths such as ../core/../core
appeared in asserts, which was ugly
* Moved support utilities into asmjit/support/... (still included
by asmjit/core.h for convenience and compatibility)
* Added CMakePresets.json for making it easy to develop AsmJit
* Reworked CMakeLists to be shorter and use CMake option(),
etc... This simplifies it and makes it using more standard
features
* ASMJIT_EMBED now creates asmjit_embed INTERFACE library,
which is accessible via asmjit::asmjit target - this simplifies
embedding and makes it the same as library targets from a CMake
perspective
* Removed ASMJIT_DEPS - this is now provided by cmake target
aliases - 'asmjit::asmjit' so users should not need this variable
* Changed meaning of ASMJIT_LIBS - this now contains only AsmJit
dependencies without asmjit::asmjit target alias. Don't rely on
ASMJIT_LIBS anymore as it's only used internally
* Removed ASMJIT_NO_DEPRECATED option - AsmJit is not going
to provide controllable deprecations in the future
* Removed ASMJIT_NO_VALIDATION in favor of ASMJIT_NO_INTROSPECTION,
which now controls query, features, and validation API presence
* Removed ASMJIT_DIR option - it was never really needed
* Removed AMX_TRANSPOSE feature from instruction database (X86).
Intel has removed it as well, so it's a feature that won't
be siliconized
* Renamed round to round_even
* Added round_half_up intrinsic
* Added floating-point mod
* Added a scalar version of floating-point abs and neg
* Added a behavior enum to specify how float to int conversion
handles out-of-range and NaN cases
* Updated some APX stuff in instruction database
* Refactored the whole codebase to use snake_case convention to
name functions and variables, including member variables.
Class naming is unchanged and each starts with upper-case
character. The intention of this change is to make the source
code more readable and consistent across multiple projects
where AsmJit is currently used.
* Refactored support.h to make it more shareable across projects.
* x86::Vec now inherits from UniVec
* minor changes in JitAllocator and WriteScope in order to make
the size of WriteScope smaller
* added ZoneStatistics and Zone::statistics() getter
* improved x86::EmitHelper to use tables instead of choose() and
other mechanisms to pick between SSE and AVX instructions
* Refactored the whole codebase to use snake_case convention for
for functions names, function parameter names, struct members,
and variables
* Added a non-owning asmjit::Span<T> type and use into public API
to hide the usage of ZoneVector in CodeHolder, Builder, and
Compiler. Users now only get Span (with data and size), which
doesn't require users to know about ZoneVector
* Removed RAWorkId from RATiedReg in favor of RAWorkReg*
* Removed GEN from LiveInfo as it's not needed by CFG construction
to save memory (GEN was merged with LIVE-IN bits). The remaining
LIVE-IN, LIVE-OUT, and KILL bits are enough, however KILL bits may
be removed in the future as KILL bits are not needed after LIVE-IN
and LIVE-OUT converged
* Optimized the representation of LIVE-IN, LIVE-OUT, and KILL bits
per block. Now only registers that live across multiple basic
blocks are included here, which means that virtual registers that
only live in a single block are not included and won't be overhead
during liveness analysis. This optimization alone can make liveness
analysis 90% faster depending on the code generated (more virtual
registers that only live in a single basic block -> more gains)
* Optimized building liveness information bits per block. The new
code uses an optimized algorithm to prevent too many traversals
and uses a more optimized code for a case in which not too many
registers are used (it avoids array operations if the number of
all virtual registers within the function fits a single BitWord)
* Optimized code that computes which virtual register is only used
in a single basic block - this aims to optimize register allocator
in the future by using a designed code path for allocating regs
only used in a single basic block
* Reduced the information required for each live-span, which is used
by bin-packing. Now the struct is 8 bytes, which is good for a lot
of optimizations C++ compiler can do
* Added UniCompiler (ujit) which can be used to share code paths
between X86, X86_64, and AArch64 code generation (experimental).
* Reworked register operands - all vector registers are now
platform::Vec deriving from UniVec (universal vector operand),
additionally, there is no platform::Reg, instead asmjit::Reg
provides all necessary features to make it a base register for
each target architecture
* Reworked casting between registers - now architecture agnostic
names are preferred - use Gp32 instead of Gpd or GpW, Gp64
instead of Gpq and GpX, etc...
* Reworked vector registers and their names - architecture
agnostic naming is now preferred Vec32, Vec64, Vec128, etc...
* Reworked naming conventions used across AsmJit - for clarity
Identifiers are now prefixed with the type, like sectionId(),
labelId(), etc...
* Reworked how Zone and ZoneAllocator are used across AsmJit,
prefering Zone in most cases and ZoneAllocator only for
containers - this change alone achieves around 5% better
performance of Builder and Compiler
* Reworked LabelEntry - decreased the size of the base entry
to 16 bytes for anonymous and unnamed labels. Avoided an
indirection when using labelEntries() - LabelEntry is now
a value and not a pointer
* Renamed LabelLink to Fixup
* Added a new header <asmjit/host.h> which would include
<asmjit/core.h> + target tools for the host architecture,
if enabled and supported
* Added new AArch64 instructions (BTI, CSSC, CHKFEAT)
* Added a mvn_ alternative of mvn instruction (fix for Windows
ARM64 SDK)
* Added more AArch64 CPU features to CpuInfo
* Added better support for Apple CPU detection (Apple M3, M4)
* Added a new benchmarking tool asmjit_bench_overhead, which
benchmarks the overhead of CodeHolder::init()/reset() and
creating/attaching emitters to it. Thanks to the benchmark the
most common code-paths were optimized
* Added a new benchmarking tool asmjit_bench_regalloc, which
aims to benchmark the cost and complexity of register allocation.
* Renamed asmjit_test_perf to asmjit_bench_codegen to make it
clear what is a test and what is a benchmark
* Instructions wr[u]ss[d|q] no longer accept register as the first
operand (that was a bug to accept this form)
* Moved APX version of legacy instructions closer so they are next
to each other
* Removed AVX512_ER, AVX512_PF, AVX512_4FMAPS, and AVX512_4VNNIW
extensions and corresponding instructions (these were never
advertised by any x86 CPU and were only used by Xeon Phi acc.,
which AsmJit never supported)
* Removed CPU extensions HLE, MPX, and TSX
* Kept extension RTM, which is only for backward compatibility to
recognize instructions, but it's no longer checked by CpuInfo as
it's been deprecated together with HLE and MPX
* The xtest instruction now reports it requires RTM
* Reorganized x86 extensions a bit - they are now reordered to group
them by category, preparing for the future where extension IDs will
be always added after existing records for ABI compatibility
* Instruction vcvtneps2bf16 no longer accepts form without an explicit
memory operand size
* Removed aliased instructions in CMOVcc, Jcc, And SETcc categories,
now there is only a single instruction id for all aliased instructions.
* Added a new feature to always show instruction aliases in Logger, which
includes formatting instructio nodes (Builder, Compiler)
Instruction DB-only updates (not applied to C++ yet):
* AsmJit DB from now uses the same license as AsmJit (Zlib) and
no longer applies dual licensing (Zlib and Public Domain)
* Added support for aggregated instruction definitions in
x86 instruction database, which should simplify the maintenance
and reduce bugs (also the syntax is comparable to descriptions
used by Intel APX instruction manuals)
* Added support for APX instructions and new features
* Added support for AVX10.1 and AVX10.2 instructions (both new
instructions and new encodings of existing instructions)
* Added support for MOVRS instructions
* Added support for KL instructions (loadiwkey)
* Added support for AESKLE instructions
* Added support for AESKLEWIDE_KL instructions
* Added support for AMX_[AVX512|MOVRS|FP8|TF32|TRANSPOSE]
* NOTE: None of the instruction additions is currently used by
Asmjit, it's a pure database update that needs more work to
make all the instructions available in future AsmJit
* The first operand (destination) is read/write and not overwrite
In addition, added the following new AArch64 instructions to DB:
* CPA extensions (DB-only)
* FAMINMAX extensions (DB-only)
* FP8 ASIMD extensions (DB-only)
* LUT extensions (DB-only)
* Each architecture now provides r32() and r64() functions for
register casting
* Each architecture now provides v128() function for register
casting, returning just Vec to make writing cross platform
code easier
* Added some basic condition code abstractions so it can be used
interchangeably across architectures
* Added back unlicense to asmjit database (now it's dual licensed)
* Renamed all eq() methods to equals() (consistency) (ABI)
* Reorganized some X86 instructions in X86 database
* Properly detect RISC-V CPU at compile time (Environment)
* Removed CallConvId::kNone in favor of defaulting to kCDecl (ABI)
* CallConvId::kHost is now alias to CallConvId::kCDecl (ABI)
* Added FloatABI to Environment to disginguish between softfloat
and hardfloat
* Added more AArch64 CPU features and their detection (ABI)
* Because of CallConvId changes it's now possible to run
compiler tests on 32-bit ARM (fixes a bug in test cases)
* Added QEMU to CI build matrix to test different architectures
This changeset contains an updated instruction database that brings
ARM32 instructions for the first time. It also updates instruction
database tooling especially for ARM64, which will also be used by
ARM32 generator.
Additionally, new operan has been added, which represents a register
list as used by ARM32 instruction set.
Other minor changes are related to ARM - some stuff had to be moved
to a64 namespace from arm namespace as it's incompatible between
32-bit and 64-bit ISA.
* Instruction database is now part of asmjit to keep it in sync
* X86/X64 ISA data has been reworked, now in a proper JSON format
* ARM32 ISA data has been added (currently only DB, support later)
* ARM64 ISA data has been added
* ARM features detection has been updated