asmjit

mirror of https://github.com/asmjit/asmjit.git synced 2025-12-16 20:17:05 +03:00

Author	SHA1	Message	Date
kobalicek	12f9ca3b32	[doc] Documentation update (chat links)	2025-11-29 09:13:10 +01:00
kobalicek	28295814dd	[doc] Documentation and funding update * Removed github funding links * Fixed a few documentation links	2025-11-15 10:54:23 +01:00
kobalicek	b56f4176cb	Codebase update and improvements, instruction DB update * Denested src folder to root, renamed testing to asmjit-testing * Refactored how headers are included into <asmjit/...> form. This is necessary as compilers would never simplify a path once a .. appears in include directory - then paths such as ../core/../core appeared in asserts, which was ugly * Moved support utilities into asmjit/support/... (still included by asmjit/core.h for convenience and compatibility) * Added CMakePresets.json for making it easy to develop AsmJit * Reworked CMakeLists to be shorter and use CMake option(), etc... This simplifies it and makes it use more standard features * ASMJIT_EMBED now creates asmjit_embed INTERFACE library, which is accessible via asmjit::asmjit target - this simplifies embedding and makes it the same as library targets from a CMake perspective * Removed ASMJIT_DEPS - this is now provided by cmake target aliases - 'asmjit::asmjit' so users should not need this variable * Changed meaning of ASMJIT_LIBS - this now contains only AsmJit dependencies without asmjit::asmjit target alias. Don't rely on ASMJIT_LIBS anymore as it's only used internally * Removed ASMJIT_NO_DEPRECATED option - AsmJit is not going to provide controllable deprecations in the future * Removed ASMJIT_NO_VALIDATION in favor of ASMJIT_NO_INTROSPECTION, which now controls query, features, and validation API presence * Removed ASMJIT_DIR option - it was never really needed * Removed AMX_TRANSPOSE feature from instruction database (X86). Intel has removed it as well, so it's a feature that won't be siliconized	2025-11-02 22:31:46 +01:00
kobalicek	cdc4eacbb1	[abi] Added more functionality to ujit * Renamed round to round_even * Added round_half_up intrinsic * Added floating-point mod * Added a scalar version of floating-point abs and neg * Added a behavior enum to specify how float to int conversion handles out-of-range and NaN cases * Updated some APX stuff in instruction database	2025-10-05 17:31:24 +02:00
kobalicek	7596c6d035	[abi] AsmJit v1.18 - performance and memory footprint improvements * Refactored the whole codebase to use snake_case convention to name functions and variables, including member variables. Class naming is unchanged and each starts with upper-case character. The intention of this change is to make the source code more readable and consistent across multiple projects where AsmJit is currently used. * Refactored support.h to make it more shareable across projects. * x86::Vec now inherits from UniVec * minor changes in JitAllocator and WriteScope in order to make the size of WriteScope smaller * added ZoneStatistics and Zone::statistics() getter * improved x86::EmitHelper to use tables instead of choose() and other mechanisms to pick between SSE and AVX instructions * Refactored the whole codebase to use snake_case convention for function names, function parameter names, struct members, and variables * Added a non-owning asmjit::Span<T> type and use into public API to hide the usage of ZoneVector in CodeHolder, Builder, and Compiler. Users now only get Span (with data and size), which doesn't require users to know about ZoneVector * Removed RAWorkId from RATiedReg in favor of RAWorkReg* * Removed GEN from LiveInfo as it's not needed by CFG construction to save memory (GEN was merged with LIVE-IN bits). The remaining LIVE-IN, LIVE-OUT, and KILL bits are enough, however KILL bits may be removed in the future as KILL bits are not needed after LIVE-IN and LIVE-OUT converged * Optimized the representation of LIVE-IN, LIVE-OUT, and KILL bits per block. Now only registers that live across multiple basic blocks are included here, which means that virtual registers that only live in a single block are not included and won't be overhead during liveness analysis. This optimization alone can make liveness analysis 90% faster depending on the code generated (more virtual registers that only live in a single basic block -> more gains) * Optimized building liveness information bits per block. The new code uses an optimized algorithm to prevent too many traversals and uses a more optimized code for a case in which not too many registers are used (it avoids array operations if the number of all virtual registers within the function fits a single BitWord) * Optimized code that computes which virtual register is only used in a single basic block - this aims to optimize register allocator in the future by using a designed code path for allocating regs only used in a single basic block * Reduced the information required for each live-span, which is used by bin-packing. Now the struct is 8 bytes, which is good for a lot of optimizations C++ compiler can do * Added UniCompiler (ujit) which can be used to share code paths between X86, X86_64, and AArch64 code generation (experimental).	2025-09-06 13:44:34 +02:00
kobalicek	964e7c20b5	[abi] API cleanup and documentation fixes * Added first node to Zone so the reset is simpler * Added x86::Xmm/Ymm/Zmm deprecated aliases of x86::Vec to make user code not break when using these deprecated types * Documentation fixes and clarifications	2025-06-16 10:13:04 +02:00
kobalicek	2ff454d415	[abi] AsmJit v1.17 - cumulative & breaking changes * Reworked register operands - all vector registers are now platform::Vec deriving from UniVec (universal vector operand), additionally, there is no platform::Reg, instead asmjit::Reg provides all necessary features to make it a base register for each target architecture * Reworked casting between registers - now architecture agnostic names are preferred - use Gp32 instead of Gpd or GpW, Gp64 instead of Gpq and GpX, etc... * Reworked vector registers and their names - architecture agnostic naming is now preferred Vec32, Vec64, Vec128, etc... * Reworked naming conventions used across AsmJit - for clarity, identifiers are now prefixed with the type, like sectionId(), labelId(), etc... * Reworked how Zone and ZoneAllocator are used across AsmJit, preferring Zone in most cases and ZoneAllocator only for containers - this change alone achieves around 5% better performance of Builder and Compiler * Reworked LabelEntry - decreased the size of the base entry to 16 bytes for anonymous and unnamed labels. Avoided an indirection when using labelEntries() - LabelEntry is now a value and not a pointer * Renamed LabelLink to Fixup * Added a new header <asmjit/host.h> which would include <asmjit/core.h> + target tools for the host architecture, if enabled and supported * Added new AArch64 instructions (BTI, CSSC, CHKFEAT) * Added a mvn_ alternative of mvn instruction (fix for Windows ARM64 SDK) * Added more AArch64 CPU features to CpuInfo * Added better support for Apple CPU detection (Apple M3, M4) * Added a new benchmarking tool asmjit_bench_overhead, which benchmarks the overhead of CodeHolder::init()/reset() and creating/attaching emitters to it. Thanks to the benchmark the most common code-paths were optimized * Added a new benchmarking tool asmjit_bench_regalloc, which aims to benchmark the cost and complexity of register allocation. * Renamed asmjit_test_perf to asmjit_bench_codegen to make it clear what is a test and what is a benchmark	2025-06-15 16:45:37 +02:00
kobalicek	408476b0b3	[ci] Updated CI to not use a deprecated windows image	2025-05-30 18:50:31 +02:00
kobalicek	c993fd9bfc	Reworked asmjit_environment_info	2025-05-30 15:07:32 +02:00
kobalicek	356dddbc55	[abi] Switched to C++17	2025-05-24 19:21:17 +02:00
kobalicek	6c9a6b2454	[abi] Reorganized instruction DB, removed deprecated instructions * Removed AVX512_ER, AVX512_PF, AVX512_4FMAPS, and AVX512_4VNNIW extensions and corresponding instructions (these were never advertised by any x86 CPU and were only used by Xeon Phi acc., which AsmJit never supported) * Removed CPU extensions HLE, MPX, and TSX * Kept extension RTM, which is only for backward compatibility to recognize instructions, but it's no longer checked by CpuInfo as it's been deprecated together with HLE and MPX * The xtest instruction now reports it requires RTM * Reorganized x86 extensions a bit - they are now reordered to group them by category, preparing for the future where extension IDs will be always added after existing records for ABI compatibility * Instruction vcvtneps2bf16 no longer accepts form without an explicit memory operand size * Removed aliased instructions in CMOVcc, Jcc, and SETcc categories, now there is only a single instruction id for all aliased instructions. * Added a new feature to always show instruction aliases in Logger, which includes formatting instruction nodes (Builder, Compiler) Instruction DB-only updates (not applied to C++ yet): * AsmJit DB from now uses the same license as AsmJit (Zlib) and no longer applies dual licensing (Zlib and Public Domain) * Added support for aggregated instruction definitions in x86 instruction database, which should simplify the maintenance and reduce bugs (also the syntax is comparable to descriptions used by Intel APX instruction manuals) * Added support for APX instructions and new features * Added support for AVX10.1 and AVX10.2 instructions (both new instructions and new encodings of existing instructions) * Added support for MOVRS instructions * Added support for KL instructions (loadiwkey) * Added support for AESKLE instructions * Added support for AESKLEWIDE_KL instructions * Added support for AMX_[AVX512|MOVRS|FP8|TF32|TRANSPOSE] * NOTE: None of the instruction additions is currently used by AsmJit, it's a pure database update that needs more work to make all the instructions available in future AsmJit	2025-05-10 15:04:11 +02:00
kobalicek	4cd9198a6c	Improved register allocation of consecutive registers in some cases * The implementation tries to detect whether a virtual register only lives in a single basic block and then uses a move approach instead of spill/alloc when reallocating * Additionally, the implementation now improves the use of scratch registers during function arguments allocation - scratch is only reserved when it's actually needed	2025-04-20 18:44:52 +02:00
kobalicek	029075b84b	Minor things * Added missing noexcept to some helper functions * Removed Linux images from workflows that will be out of support soon (ubuntu 20.04)	2025-02-12 16:17:09 +01:00
kobalicek	9b28f627a5	[ci] Updated macos configuration (GCC bumped to 14)	2024-08-26 07:24:01 +02:00
kobalicek	330aa64386	Avoid unused function warnings when building for Windows/ARM64	2024-07-08 11:53:33 +02:00
kobalicek	ffac9f36fb	[bug] Deprecate BaseMem::setSize() BaseMem::setSize() should not be used anymore as the only memory operand that understands size is x86::Mem, which makes it x86 specific. The reason is that other architectures require more bits, so for example arm::Mem uses the storage used by x86 size for storing other information such as offset mode, and possibly more information will be needed in the future to support AArch64 SVE or SME, etc... At the moment BaseMem::setSize() has been deprecated, so code using it would still compile, but with a warning. It will be removed in the future though.	2024-06-28 22:07:13 +02:00
kobalicek	062e69ca81	[Bug] Fixed a string buffer growing strategy For some reason the growing strategy of asmjit::String was too aggressive, basically reaching the maximum doubling capacity too fast (after the first reallocation). This code adapts the current vector growing strategy to be used also by asmjit::String, which doubles the capacity until a threshold is reached and then grows linearly.	2024-06-22 10:39:33 +02:00
kobalicek	63e7d060ac	Support C++20 without warnings C++20 deprecates mixing enums of different types (comparisons, etc...), however, we use enums instead of "static constexpr" in classes to define constants, because otherwise we would have to give such constants storage - this is required for up to C++14 and since we still support C++11 we have to keep using enums...	2024-06-05 00:33:15 +02:00
kobalicek	b9c8b5399f	[Bug] Fixed MOV reg->mem instruction rewriting (Compiler) The problem is that the rewriter must also rewrite an instruction ID in case that it's a [K|V]MOV[B|W|D|Q] instruction that moves from either K or SIMD register to GP register. When such an instruction is rewritten in a way that it ends up as "xMOVx GP, [MEM]" it would be invalid if it's not changed to a general purpose MOV. The problem can only happen in case that the compiler spills a virtual register, which is then moved to a scalar register. In addition, checks were added to MOVD|MOVQ to ensure that when an invalid instruction is emitted it's not ignored as it used to be.	2024-05-19 17:51:16 +02:00
kobalicek	2110882ef2	[CI] Updated workflow to run on AArch64 runners	2024-05-16 22:11:18 +02:00
kobalicek	268bce7952	Minor change making static analysis happy * clang-18 is now enabled on CI and used for static analysis * Return an error when X86Internal_setupSaveRestoreInfo() is called with an invalid register group. Should never happen though.	2024-03-09 11:53:14 +01:00
kobalicek	7ff9c2a545	[CI] Minor changes in CMakeLists.txt, disable arm/v7 because of CI	2024-03-09 08:26:46 +01:00
kobalicek	a63d41e80b	Added support for mach_vm_remap() for dual mapping mach_vm_remap() allows to create a dual mapping without having to use a file descriptor, which has to open a file or shm memory. The problem is that the recent macos version started displaying a popup message when such file is opened and that annoyed a lot of users. Thus, the initial code-path is no longer used, and mach_vm_remap() is used instead. This change only applies for x86 macs. Apple silicon doesn't allow dual mapping and instead uses MAP_JIT.	2024-01-25 22:23:13 +01:00
kobalicek	03b784c9fe	[Doc] Added CONTRIBUTING.md and issue templates; updated docs	2024-01-22 00:25:23 +01:00
kobalicek	3772c447ca	[ABI] Accumulated API/ABI changes * Renamed all eq() methods to equals() (consistency) (ABI) * Reorganized some X86 instructions in X86 database * Properly detect RISC-V CPU at compile time (Environment) * Removed CallConvId::kNone in favor of defaulting to kCDecl (ABI) * CallConvId::kHost is now alias to CallConvId::kCDecl (ABI) * Added FloatABI to Environment to distinguish between softfloat and hardfloat * Added more AArch64 CPU features and their detection (ABI) * Because of CallConvId changes it's now possible to run compiler tests on 32-bit ARM (fixes a bug in test cases) * Added QEMU to CI build matrix to test different architectures	2024-01-01 20:15:00 +01:00
kobalicek	073f6e85e4	[ABI] Improvements to avoid UB and warnings, clean build with MSAN * Added more clang compilers on CI (CI) * Added memory sanitizer to build matrix (CI) * Use problem matcher in all builds (CI) * Fixed the use of some constructs in tests * Fixed warnings about unused functions in tests * Fixed warnings about unused variables caused by some build options * Fixed tests to be clean with MSAN (zeroing memory filled by JIT code) * Removed -Wclass-memaccess (gcc) from ignored warnings * Removed -Wconstant-logical-operand (clang) from ignored warnings * Removed -Wunnamed-type-template-args (clang) from ignored warnings * Reworked InstData and InstExData to not cause UB (ABI break) Unfortunately the existing InstData and InstExData was not good for static analysis and in general compilers emitted warnings regarding accessing InstNode::_opArray. The reason was that InstExNode added one or two more operands which extended InstData::_opArray, but there was no way to tell the C++ compiler about this layout. It has been changed to InstNode having no operands and InstNodeWithOperands being templatized for the right number of operands. Nodes that need to inherit InstNode would just inherit InstNodeWithOperands<N>. It works the same way as before, just the class hierarchy changed a little.	2023-12-26 19:00:00 +01:00
kobalicek	7e64eabca4	[CI] Updated BSD versions on CI	2023-10-07 11:37:07 +02:00
kobalicek	4413d78c98	[CI] Removed NetBSD builds because the VM doesn't work	2023-10-06 12:06:47 +02:00
kobalicek	51b10b19b6	[Bug] Fixed build having ASMJIT_NO_TEXT enabled (AArch64)	2023-03-25 00:09:15 +01:00
kobalicek	c1019f1642	Improved testing * Refactored workflows to use a single workflow for both VM and non-VM builds * Compiler tests are now able to test compilation of different architectures	2023-03-11 00:31:03 +01:00
kobalicek	965d19506f	[CI] Updated build matrix, updated docs regarding CI	2023-02-25 00:46:33 +01:00
kobalicek	8552e286c2	[CI] Updated to use build-actions new prepare-environment	2023-02-23 16:18:07 +01:00
kobalicek	9d33c892f7	[Bug] Use mremap() to allocate a dual mapped region on NetBSD In addition, always enable DualMapping when RWX pages are not possible to allocate in JitAllocator, because otherwise the allocator would not be able to allocate memory for JIT code execution. New CI runners to test FreeBSD, NetBSD, and OpenBSD are also provided.	2023-02-23 00:40:20 +01:00
kobalicek	3ee3846283	[ABI] Raised ABI version due to recent changes	2023-01-16 14:55:03 +01:00
kobalicek	1ed8b77f5b	[ABI] Added CpuFeatures to Target and CodeHolder, improved test_perf	2023-01-16 00:10:56 +01:00
kobalicek	8a33b814d6	[Bug] Assign inline comments to Invoke/Func nodes, annotate without Logger	2023-01-08 14:34:36 +01:00
kobalicek	5b5b0b3877	[CI] Updated CI configurations	2022-12-10 15:07:55 +01:00
kobalicek	2ae2d897f4	[CI] Use newer macos images as the older ones were deprecated	2022-07-21 21:52:33 +02:00
kobalicek	06d0badec5	Suppress -Wbitwise-instead-of-logical warning that was introduced by clang 14	2022-06-22 00:30:34 +02:00
kobalicek	2cfa686ebd	[API] Fixed most static analysis issues reported by clang [Bug] Worked around GCC 11 issue affecting unaligned loads/stores (most likely a compiler bug)	2021-12-14 01:19:28 +01:00
kobalicek	996deae327	[ABI] Refactored AsmJit to use strong-typed enums, this breaks both API and ABI [ABI] Added ABI version as an inline namespace, which forms asmjit::_abi_MAJOR_MINOR [ABI] Added support for AVX512_FP16, 16-bit broadcast, and AVX512_FP16 tests [ABI] Added initial support for consecutive registers into instruction database and register allocator [ABI] Added a possibility to use temporary memory in CodeHolder's zone [ABI] Compiler::setArg() is now deprecated, use FuncNode::setArg() [Bug] Fixed correct RW information of instructions that only support implicit zeroing with {k} [Bug] Fixed broadcast to be able to broadcast bcst16 operands	2021-12-13 19:34:56 +01:00
kobalicek	e822fba53e	[ABI] Added the possibility to use AVX512 in Compiler and FuncFrame	2021-03-17 18:05:48 +01:00
kobalicek	7836449c30	Added asmjit_test_perf, which replaces asmjit_bench and provides much better performance overview Removed asmjit_test_opcode (not needed anymore as we have asmjit_test_assembler and asmjit_test_perf)	2021-03-13 23:05:48 +01:00
kobalicek	2ab380e0bd	[Bug] [Critical] [ABI] Update that fixes all problems discovered by comparison with LLVM-MC Fixed POP Sreg instruction, which was incorrectly implemented to emit nothing Fixed CVTSD2SI, CVTSS2SI, CVTTSD2SI, and CVTTSS2SI instructions to not consider the size of the memory operand when calculating REX.W prefix Fixed VCMPPD, VCMPPS, VCMPSD, VCMPSS, VPCMPEQ, VPCMPGT instructions to always force EVEX prefix when the first operand is K register Fixed ENDBR32 and ENDBR64 instructions (wrong opcode) Fixed CLRSSBSY and RSTORSSP instructions (wrong logic in Assembler) Fixed SLDT, SMSW, and STR instructions to not consider memory size when determining prefixes Fixed UD0 and UD1 instructions to consider both operands Fixed VCVTNE2PS2BF16, VCVTNEPS2BF16, and VDPBF16PS instructions (incorrect calculation of LL field) (AVX512) Fixed VCVTPD2DQ, VCVTPD2PS, VCVTPD2UDQ, VCVTQQ2PS, VCVTTPD2DQ, VCVTTPD2UDQ, VCVTUQQ2PS in AVX512 case (incorrect calculation of LL field) Fixed VGATHERPF* and VSCATTERPF* instructions (some instructions were encoded incorrectly by not considering the memory index register type in LL field) Fixed VPBLENDVB (incorrect calculation of LL field) Fixed VPEXTRW to use a shorter encoding when possible (vpextrw r32, xmm, imm) Fixed VPSLLD, VPSLLQ, VPSLLW, VPSRAD, VPSRAQ, VPSRAW, VPSRLD, VPSRLQ, VPSRLW instructions to always force EVEX prefix when the instruction is RMI (AVX512) Fixed the accepted memory operand size of MMX PUNPCKL??? instructions from m64 to m32 (only affects validation) Added explicit forms to XSAVE* and XRSTOR* instructions Added HRESET and UINTR instructions Changed MOV and all ARITH instructions to output the same binary as LLVM in 'reg, reg' case (it used an alternative encoding initially) Renamed LRET to RETF Renamed VBLENDM* instructions to VPBLENDM* (the name was incorrect) Renamed VPBROADCASTMB2D to VPBROADCASTMW2D (the name was incorrect) Renamed SYSEXIT64 to SYSEXITQ and SYSRET64 to SYSRETQ Removed non-standard IRETW (use IRET, IRETD, or IRETQ to select the form)	2021-02-03 09:36:11 +01:00
kobalicek	58b6c025f2	[ABI] Added more AVX_VNNI instructions, added MOVABS for explicit Imm64 encodings, added more assembler tests	2021-01-26 01:00:29 +01:00
kobalicek	4b13f71314	Improved GitHub workflows	2020-11-09 00:47:32 +01:00
kobalicek	2199c7d4e7	Improved CI problem matching by always doing out-of-source build (Ninja issue)	2020-11-07 19:31:03 +01:00
kobalicek	fe89388e52	Added problem matchers to CI workflow	2020-11-07 16:05:48 +01:00
kobalicek	88129d7389	[Bug] Don't unlink immediately when creating anonymous memory file, switch to GH actions (Fixes #312)	2020-11-07 00:02:16 +01:00
kobalicek	b49d685cd9	Added github funding file	2020-09-22 20:16:28 +02:00

50 Commits
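
Commit 062e69ca81 (2024-06-22) describes a growing strategy that doubles the capacity until a threshold is reached and then grows linearly. The following is a minimal sketch of that general idea only, not AsmJit's actual code: the function name `grow_capacity` and the constants `kMinCapacity` and `kGrowThreshold` are hypothetical and were chosen for illustration, and overflow handling is omitted.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constants; the real values used by asmjit::String are not
// stated in the commit message.
constexpr std::size_t kMinCapacity = 8;
constexpr std::size_t kGrowThreshold = 1024;

// Returns a capacity that is at least `required`.
// Below the threshold the capacity doubles (geometric growth); at or above
// the threshold it grows by a fixed step (linear growth).
// Overflow checks are intentionally left out of this sketch.
inline std::size_t grow_capacity(std::size_t current, std::size_t required) {
  assert(required > current);  // Growing only makes sense when more is needed.

  std::size_t capacity = current < kMinCapacity ? kMinCapacity : current;
  while (capacity < required) {
    if (capacity < kGrowThreshold)
      capacity *= 2;               // Geometric phase.
    else
      capacity += kGrowThreshold;  // Linear phase.
  }
  return capacity;
}
```

With these assumed constants, a buffer of 512 bytes that needs 513 grows to 1024, while a buffer of 2048 bytes that needs 2049 grows to 3072 rather than 4096, which keeps large buffers from over-allocating.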