asmjit

mirror of https://github.com/asmjit/asmjit.git synced 2025-12-18 13:04:36 +03:00

Commit Graph

Author

SHA1

Message

Date

kobalicek

7596c6d035

[abi] AsmJit v1.18 - performance and memory footprint improvements

  * Refactored the whole codebase to use snake_case convention to
    name functions and variables, including member variables.
    Class naming is unchanged and each starts with upper-case
    character. The intention of this change is to make the source
    code more readable and consistent across multiple projects
    where AsmJit is currently used.

  * Refactored support.h to make it more shareable across projects.

  * x86::Vec now inherits from UniVec

  * Minor changes in JitAllocator and WriteScope in order to make
    the size of WriteScope smaller

  * Added ZoneStatistics and Zone::statistics() getter

  * Improved x86::EmitHelper to use tables instead of choose() and
    other mechanisms to pick between SSE and AVX instructions

  * Refactored the whole codebase to use snake_case convention for
    function names, function parameter names, struct members,
    and variables

  * Added a non-owning asmjit::Span<T> type and used it in the public
    API to hide the usage of ZoneVector in CodeHolder, Builder, and
    Compiler. Users now only get a Span (with data and size), which
    doesn't require them to know about ZoneVector

  * Removed RAWorkId from RATiedReg in favor of RAWorkReg*

  * Removed GEN from LiveInfo to save memory, as it's not needed by
    CFG construction (GEN was merged with LIVE-IN bits). The remaining
    LIVE-IN, LIVE-OUT, and KILL bits are enough; however, KILL bits may
    be removed in the future as they are not needed after LIVE-IN
    and LIVE-OUT have converged

  * Optimized the representation of LIVE-IN, LIVE-OUT, and KILL bits
    per block. Now only registers that live across multiple basic
    blocks are included here, which means that virtual registers that
    only live in a single block are not included and won't be overhead
    during liveness analysis. This optimization alone can make liveness
    analysis 90% faster depending on the code generated (more virtual
    registers that only live in a single basic block -> more gains)

  * Optimized building liveness information bits per block. The new
    code uses an optimized algorithm to prevent too many traversals
    and uses a faster code path for the case in which not too many
    registers are used (it avoids array operations if the number of
    all virtual registers within the function fits a single BitWord)

  * Optimized code that computes which virtual registers are only used
    in a single basic block - this aims to optimize the register
    allocator in the future by using a dedicated code path for
    allocating regs only used in a single basic block

  * Reduced the information required for each live-span, which is used
    by bin-packing. Now the struct is 8 bytes, which is good for a lot
    of optimizations the C++ compiler can do

  * Added UniCompiler (ujit) which can be used to share code paths
    between X86, X86_64, and AArch64 code generation (experimental).
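
The liveness items above can be illustrated with a small standalone sketch. It is not AsmJit's internal code: the names `Block`, `compute_liveness`, and `cross_block_mask` are hypothetical, and it assumes every virtual register of the function fits into one 64-bit word (the "single BitWord" case). GEN is not stored separately; the upward-exposed uses only seed LIVE-IN, which is one plausible reading of "GEN was merged with LIVE-IN bits".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a machine-word bit set; one bit per virtual register.
using BitWord = std::uint64_t;

struct Block {
  BitWord use = 0;       // registers read before any write in this block
  BitWord kill = 0;      // registers written in this block
  BitWord live_in = 0;   // registers live on entry
  BitWord live_out = 0;  // registers live on exit
  std::vector<std::size_t> successors;  // indices of successor blocks
};

// Returns the registers referenced by more than one block. Registers that
// are referenced by a single block only never need a bit in the per-block
// LIVE-IN / LIVE-OUT / KILL sets.
inline BitWord cross_block_mask(const std::vector<BitWord>& refs_per_block) {
  BitWord seen = 0;
  BitWord multi = 0;
  for (BitWord refs : refs_per_block) {
    multi |= seen & refs;  // already seen in an earlier block
    seen |= refs;
  }
  return multi;
}

// Classic backward data-flow iteration to a fixed point:
//   live_out(B) = union of live_in(S) over all successors S of B
//   live_in(B)  = use(B) | (live_out(B) & ~kill(B))
inline void compute_liveness(std::vector<Block>& blocks) {
  // Seed LIVE-IN with the upward-exposed uses, so no separate GEN set is kept.
  for (Block& b : blocks) {
    b.live_in = b.use;
  }

  bool changed = true;
  while (changed) {
    changed = false;
    // Visiting blocks in reverse order tends to converge faster.
    for (std::size_t i = blocks.size(); i-- > 0;) {
      Block& b = blocks[i];
      BitWord out = 0;
      for (std::size_t s : b.successors) {
        assert(s < blocks.size());
        out |= blocks[s].live_in;
      }
      BitWord in = b.live_in | (out & ~b.kill);
      if (in != b.live_in || out != b.live_out) {
        b.live_in = in;
        b.live_out = out;
        changed = true;
      }
    }
  }
}
```

Because each set is a single word, every union and difference is one bitwise instruction, which is the reason array operations can be skipped in this case. A function with more virtual registers than bits in a word would need an array of such words per set.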

2025-09-06 13:44:34 +02:00

kobalicek

2ff454d415

[abi] AsmJit v1.17 - cumulative & breaking changes

  * Reworked register operands - all vector registers are now
    platform::Vec deriving from UniVec (universal vector operand),
    additionally, there is no platform::Reg, instead asmjit::Reg
    provides all necessary features to make it a base register for
    each target architecture
  * Reworked casting between registers - now architecture agnostic
    names are preferred - use Gp32 instead of Gpd or GpW, Gp64
    instead of Gpq and GpX, etc...
  * Reworked vector registers and their names - architecture
    agnostic naming is now preferred Vec32, Vec64, Vec128, etc...
  * Reworked naming conventions used across AsmJit - for clarity
    Identifiers are now prefixed with the type, like sectionId(),
    labelId(), etc...
  * Reworked how Zone and ZoneAllocator are used across AsmJit,
    preferring Zone in most cases and ZoneAllocator only for
    containers - this change alone achieves around 5% better
    performance of Builder and Compiler
  * Reworked LabelEntry - decreased the size of the base entry
    to 16 bytes for anonymous and unnamed labels. Avoided an
    indirection when using labelEntries() - LabelEntry is now
    a value and not a pointer
  * Renamed LabelLink to Fixup
  * Added a new header <asmjit/host.h> which would include
    <asmjit/core.h> + target tools for the host architecture,
    if enabled and supported
  * Added new AArch64 instructions (BTI, CSSC, CHKFEAT)
  * Added a mvn_ alternative of mvn instruction (fix for Windows
    ARM64 SDK)
  * Added more AArch64 CPU features to CpuInfo
  * Added better support for Apple CPU detection (Apple M3, M4)
  * Added a new benchmarking tool asmjit_bench_overhead, which
    benchmarks the overhead of CodeHolder::init()/reset() and
    creating/attaching emitters to it. Thanks to the benchmark the
    most common code-paths were optimized
  * Added a new benchmarking tool asmjit_bench_regalloc, which
    aims to benchmark the cost and complexity of register allocation.
  * Renamed asmjit_test_perf to asmjit_bench_codegen to make it
    clear what is a test and what is a benchmark
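
As background for the Zone item above, here is a minimal sketch of the general technique a zone (arena) allocator relies on: allocation is a pointer bump inside a block, and everything is released at once. This is an assumption-laden illustration, not AsmJit's `Zone` implementation; the class `Arena` and its members are hypothetical names, and the message itself does not say where the 5% gain comes from.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical bump-pointer arena. Individual allocations are never freed;
// all memory is released when the arena is destroyed.
class Arena {
public:
  explicit Arena(std::size_t block_size) : block_size_(block_size) {}

  // Returns `size` bytes aligned to `alignment` (a power of two that does
  // not exceed the alignment of std::max_align_t).
  void* alloc(std::size_t size, std::size_t alignment = alignof(std::max_align_t)) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    assert(alignment <= alignof(std::max_align_t));

    // Round the current offset up to the requested alignment.
    std::size_t aligned = (offset_ + alignment - 1) & ~(alignment - 1);

    // Start a new block if there is none yet or the request does not fit.
    if (blocks_.empty() || aligned + size > capacity_) {
      capacity_ = size > block_size_ ? size : block_size_;
      blocks_.push_back(std::unique_ptr<unsigned char[]>(new unsigned char[capacity_]));
      aligned = 0;
    }

    offset_ = aligned + size;
    return blocks_.back().get() + aligned;
  }

  // Number of blocks requested from the system allocator so far.
  std::size_t block_count() const { return blocks_.size(); }

private:
  std::size_t block_size_;    // default size of each block in bytes
  std::size_t capacity_ = 0;  // size of the current (last) block
  std::size_t offset_ = 0;    // first unused byte in the current block
  std::vector<std::unique_ptr<unsigned char[]>> blocks_;
};
```

The fast path is an add, a mask, and a compare, with no per-allocation bookkeeping. A general-purpose allocator that supports freeing and reusing individual allocations has to track more state, which is a plausible reason to reserve it for containers that grow and shrink.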

2025-06-15 16:45:37 +02:00

2 Commits