Class JitCompiler
JitPcodeEmulator
.
This is the translation engine from "any" machine language into JVM bytecode. The same caveats that apply to interpretation-based p-code emulation apply to JIT-accelerated emulation: Ghidra must have a Sleigh specification for the emulation target language, there must be userop libraries (built-in or user-provided) defining any userops encountered during the course of execution, all dependent code must be loaded or stubbed out, etc.
A passage is decoded at a desired entry point using the JitPassageDecoder. This compiler then
translates the passage into bytecode. It will produce a classfile which is then loaded and
returned to the emulator (or other client). The provided class will have three principal methods,
not counting getters: 1) The class initializer, which initializes static fields; 2) The
constructor, which takes a thread and initializes instance fields; and 3) The run method, which
comprises the actual translation. A static field ENTRIES describes each entry point generated by
the compiler. To execute the passage starting at a given entry point, the emulation thread must
retrieve the index of the appropriate entry (i.e., address and contextreg value), instantiate the
class, and then invoke the run method, passing it the entry index. The translated passage will
read variables from the thread's state as needed, perform the equivalent operations as expressed
in the source p-code, and then write the resulting variables back into the state. Memory
variables are treated similarly, but without scope-based optimizations. In this manner, execution
of the translated passage produces exactly the same effect on the emulation state as
interpretation of the same p-code passage. The run method returns the next entry point to
execute, or null when the emulator must look up the next entry point.
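The calling convention described above can be sketched as a small dispatch loop. Everything in this sketch is a stand-in: EntryPoint, CompiledPassage, DispatchSketch, the cache map, and the pc field are hypothetical names for illustration, not Ghidra classes. The real emulator instantiates the generated class for a thread and resolves entries by address and contextreg value.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an entry point: a passage plus an index into its ENTRIES. */
record EntryPoint(CompiledPassage passage, int index) {}

/** Hypothetical stand-in for the run method of a generated class. */
interface CompiledPassage {
    /** Returns the next entry point to execute, or null if the caller must look one up. */
    EntryPoint run(int entryIndex);
}

class DispatchSketch {
    /** Assumed cache of already-translated entry points, keyed by address only. */
    final Map<Long, EntryPoint> cache = new HashMap<>();
    /** Stand-in for the program counter the passage writes back into the state. */
    long pc;

    /** Runs passages starting at the given address; returns how many were executed. */
    int runFrom(long start) {
        pc = start;
        int executed = 0;
        EntryPoint next = cache.get(pc);
        while (next != null) {
            EntryPoint chained = next.passage().run(next.index());
            executed++;
            // null means "look it up", using the counter the passage left behind
            next = (chained != null) ? chained : cache.get(pc);
        }
        return executed;
    }
}
```

In this toy model the loop simply stops when no entry is cached for the current counter; a real emulator would presumably decode and translate a new passage at that point.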
Translation of a passage takes place in distinct phases. See each respective class for details of its design and implementation:
- Control Flow Analysis: JitControlFlowModel
- Data Flow Analysis: JitDataFlowModel
- Variable Scope Analysis: JitVarScopeModel
- Type Assignment: JitTypeModel
- Variable Allocation: JitAllocationModel
- Operation Elimination: JitOpUseModel
- Code Generation: JitCodeGenerator
Control Flow Analysis
Some rudimentary control flow analysis is performed during decode, but the output of decode is a
passage, i.e., a collection of strides, not basic blocks. The control flow analysis breaks
each stride down into basic blocks at the p-code level. Note that a single instruction's p-code
(as well as any user instrumentation on that instruction's address) may have complex control
flow. Additionally, branches that leave an instruction preclude execution of its remaining
p-code. Thus, p-code basic blocks do not coincide precisely with instruction-level basic blocks.
See JitControlFlowModel.
Data Flow Analysis
Nearly every subsequent step consumes the control flow analysis. Data flow analysis interprets
each basic block independently using an abstraction that produces a use-def graph. A varnode that
is read before it is written produces a "missing" variable. Those missing variables are converted
to phi nodes and later resolved during inter-block analysis. The graph is also able to
consider aliasing, partial accesses, overlapping accesses, etc., by synthesizing operations to
model those effects. See JitDataFlowModel.
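To illustrate what "read before written" means within a single block, here is a minimal sketch. The Op record and the missing method are hypothetical and model varnodes as plain strings; this is not the JitDataFlowModel API, and it ignores aliasing and partial accesses entirely.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class UseDefSketch {
    /** A toy p-code op: one output varnode computed from some input varnodes. */
    record Op(String out, List<String> ins) {}

    /** Returns the varnodes that one block reads before it writes them. */
    static Set<String> missing(List<Op> block) {
        Set<String> defined = new HashSet<>();
        Set<String> missing = new LinkedHashSet<>();
        for (Op op : block) {
            for (String in : op.ins()) {
                if (!defined.contains(in)) {
                    missing.add(in); // no definition yet in this block
                }
            }
            defined.add(op.out());
        }
        return missing;
    }
}
```

Each name in the returned set corresponds to a value that must come from a predecessor block or from the emulator state, which is the role the phi nodes play in the text above.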
Variable Scope Analysis
Because accessing PcodeExecutorState is expensive (relative to accessing a JVM local variable),
the translation seeks to minimize such accesses. Keeping values in locals is generally not
advisable for memory accesses, as there is no telling in multi-threaded applications whether a
given memory variable is shared/volatile or not. However, for registers and uniques, we can
allocate the variables as JVM locals. Then we only "birth" them (read them in) when they come
into scope and "retire" them (write them out) when they leave scope. This analyzer determines
which variables are in scope (alive) in which basic blocks. See JitVarScopeModel.
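As a rough illustration of deciding which variables are alive in which blocks, the sketch below runs a textbook backward liveness analysis over a toy control flow graph. This is a generic technique chosen for illustration and may differ from what JitVarScopeModel actually computes; the Block record and liveIn method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class LivenessSketch {
    /** A toy block: varnodes read before written (use), varnodes written (def), successor indices. */
    record Block(Set<String> use, Set<String> def, List<Integer> succs) {}

    /** Iterates liveIn[b] = use[b] + (liveOut[b] - def[b]) until nothing changes. */
    static List<Set<String>> liveIn(List<Block> blocks) {
        List<Set<String>> in = new ArrayList<>();
        for (int i = 0; i < blocks.size(); i++) {
            in.add(new HashSet<>());
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int i = blocks.size() - 1; i >= 0; i--) {
                Block b = blocks.get(i);
                Set<String> live = new HashSet<>();
                for (int s : b.succs()) {
                    live.addAll(in.get(s)); // liveOut is the union over successors
                }
                live.removeAll(b.def());
                live.addAll(b.use());
                if (!live.equals(in.get(i))) {
                    in.set(i, live);
                    changed = true;
                }
            }
        }
        return in;
    }
}
```

A variable that is alive on entry to a block but not alive in a predecessor would need to be "birthed" on that edge; one that stops being alive would be "retired".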
Type Assignment
For those variables we allocate as JVM locals, we have to choose a type, because the JVM requires
it. We have essentially four to choose from. (Though we could also choose a reference type,
depending on the strategy we eventually choose for multi-precision arithmetic.) Those four are
the JVM primitives: int, float, long, and double. For those more familiar with Java but not the
JVM, the smaller integral primitives are all represented by JVM ints. The JVM does not permit
type confusion, e.g., the application of float addition FADD to int variables. However,
the emulation target may permit type confusion. (Those familiar with the constant 0x5f3759df may
appreciate intentional type confusion.) When this happens, we must explicitly convert by calling,
e.g., Float.floatToRawIntBits(float), which is essentially just a bit cast. Nevertheless,
we seek to reduce the number of such calls we encode into the translation. See JitTypeModel.
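For readers who have not met that constant, the following sketch shows the kind of intentional type confusion a target program might perform, written with the explicit bit casts Java requires. The class and method names are made up for illustration; Float.floatToRawIntBits and Float.intBitsToFloat are standard JDK methods.

```java
class BitCastSketch {
    /** Approximates 1/sqrt(x) by doing integer arithmetic on the bits of a float. */
    static float fastInvSqrt(float x) {
        int bits = Float.floatToRawIntBits(x);  // float -> int, bits unchanged
        bits = 0x5f3759df - (bits >> 1);        // integer ops on float bits
        float y = Float.intBitsToFloat(bits);   // int -> float, bits unchanged
        return y * (1.5f - 0.5f * x * y * y);   // one Newton-Raphson refinement
    }
}
```

On a target CPU the same register would simply be used by both integer and floating-point instructions. In translated bytecode, each crossing between an int local and a float local costs one of these conversion calls, which is why the type assignment tries to keep them rare.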
Variable Allocation
Once we've decided the type of each use-def variable node, we allocate JVM locals and assign
their types accordingly. To keep things simple and fast, we just allocate variables by varnode.
Partial/overlapping accesses are coalesced to the containing varnode and cause the type to be a
JVM int (to facilitate shifting and masking). Otherwise, types are assigned according to the most
common use of the varnode, i.e., by taking a vote among the use-def variable nodes sharing that
varnode. See JitAllocationModel.
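The vote among use-def variable nodes might look like the sketch below. The JvmType enum, the tie-breaking rule (earliest enum constant wins), and the default of INT when there are no uses are all assumptions made for this example, not documented behavior of JitAllocationModel.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

class TypeVoteSketch {
    /** Hypothetical stand-in for the four JVM primitive types available to locals. */
    enum JvmType { INT, LONG, FLOAT, DOUBLE }

    /** Picks the most common type among the uses of one varnode. */
    static JvmType vote(List<JvmType> uses) {
        Map<JvmType, Integer> counts = new EnumMap<>(JvmType.class);
        for (JvmType t : uses) {
            counts.merge(t, 1, Integer::sum);
        }
        JvmType best = JvmType.INT; // assumed default when nothing votes
        int bestCount = 0;
        // EnumMap iterates in declaration order, so ties go to the earlier constant
        for (Map.Entry<JvmType, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }
}
```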
Operation Elimination
Each instruction typically produces several p-code ops, the outputs of which may not actually be
used by any subsequent op. This analysis seeks to identify such p-code ops and remove them. Since
many ISAs employ "flags," which are set by nearly every arithmetic instruction, such ops are
incredibly common. Worse yet, their computation is very expensive, because the JVM does not have
comparable flag registers, nor does it provide opcodes for producing comparable values. We have
to emit the bit banging operations ourselves. Thus, performing this elimination stands to improve
execution speed significantly. However, eliminating these operations may lead to confusing
results if execution is interrupted and the state inspected by a user. The effects of the
eliminated operations will be missing. Even though they do not (or should not) matter, the user
may expect to see them. Thus, this step can be toggled by
JitConfiguration.removeUnusedOperations(). See JitOpUseModel.
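A minimal sketch of the idea, restricted to one straight-line block: walk the ops backward and keep only those whose output is still needed. The Op record, removeUnused, and the liveOut parameter are hypothetical; the toy model also assumes every op is free of side effects, which does not hold for stores or userops in real p-code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class OpUseSketch {
    /** A toy p-code op: one output varnode computed from some input varnodes. */
    record Op(String out, List<String> ins) {}

    /** Keeps only ops whose output is read by a later kept op or is in liveOut. */
    static List<Op> removeUnused(List<Op> block, Set<String> liveOut) {
        Set<String> needed = new HashSet<>(liveOut);
        Deque<Op> kept = new ArrayDeque<>();
        for (int i = block.size() - 1; i >= 0; i--) {
            Op op = block.get(i);
            if (needed.remove(op.out())) { // this definition satisfies the need
                needed.addAll(op.ins());   // so its inputs are now needed
                kept.addFirst(op);
            }
        }
        return new ArrayList<>(kept);
    }
}
```

With two consecutive additions that each set a carry flag, the first flag computation is dropped because the second overwrites it before anything reads it. That is also exactly the value a user would find missing if execution were interrupted between the two instructions.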
Code Generation
For simplicity, we seek to generate JVM bytecode in the same order as the source p-code ops.
Several details must be handled because of the optimizations informed by the preceding analyses,
for example, the transfer of control to the requested entry point and the placement of variable
birth and retirement on control flow edges (including fall-through). We take an object-oriented
approach to the translation of each p-code op, the handling of each variable's allocation and
access, the conversion of types, etc. This phase outputs the final classfile bytes, which are
then loaded as a hidden class. See JitCodeGenerator.
Nested Class Summary
- JitCompiler.Diag
Field Summary
- static final EnumSet<JitCompiler.Diag> ENABLE_DIAGNOSTICS: The set of enabled diagnostic toggles.
- static final long EXCLUDE_MAXS: Exclude a given address offset from ASM's ClassWriter.COMPUTE_MAXS and ClassWriter.COMPUTE_FRAMES.
Constructor Summary
- JitCompiler(JitConfiguration config): Construct a p-code to bytecode translator.
Method Summary
- compilePassage(MethodHandles.Lookup lookup, JitPassage passage): Translate a passage using the given lookup.
- getConfiguration(): Get this compiler's configuration.
Field Details
ENABLE_DIAGNOSTICS
The set of enabled diagnostic toggles.
In production, this should be empty.
EXCLUDE_MAXS
public static final long EXCLUDE_MAXS
Exclude a given address offset from ASM's ClassWriter.COMPUTE_MAXS and ClassWriter.COMPUTE_FRAMES.
Unfortunately, when automatic computation of frames and maxes fails, the ASM library offers little in terms of diagnostics. It usually crashes with an NPE or an AIOOBE. Worse, when this happens, it fails to output any of the classfile trace. To help with this, a developer may identify the address of the passage seed that causes such a failure and set this variable to its offset. This will prevent ASM from attempting this computation so that it at least prints the trace and dumps out the classfile to disk (if those JitCompiler.Diag diagnostics are enabled).
Once the trace/classfile is obtained, set this back to -1 and then apply debug prints in the crashing method. Since it's probably in the ASM library, you'll need to use your IDE / debugger to inject those prints. The way to do this in Eclipse is to set a "conditional breakpoint" then have the condition print the value and return false, so that execution continues. Sadly, this will still slow execution down considerably, so you'll want to set some other conditional breakpoint to catch when the troublesome passage is being translated. Probably the most helpful thing to print is the bytecode offset of each basic block ASM is processing as it computes the frames. Once it crashes, look at the last couple of bytecode offsets in the dumped classfile.
Constructor Details
JitCompiler
Construct a p-code to bytecode translator.
In general, this should only be used by the JIT emulator and its test suite.
Parameters:
config - the configuration
Method Details
compilePassage
Translate a passage using the given lookup.
Parameters:
lookup - a lookup that can access everything the passage may need, e.g., userop libraries. Likely, this should come from the emulator, which may be in a script. If you are unsure what to use here, use MethodHandles.lookup(). If you see errors about accessing stuff during the compilation, ensure everything the emulator needs is accessible from the method calling MethodHandles.lookup().
passage - the decoded passage to compile
Returns:
the compiled class, not instantiated for any particular thread
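To see why the origin of the lookup matters, the sketch below shows that a MethodHandles.Lookup carries the access rights of the code that created it. LookupSketch and secretUserop are hypothetical names standing in for a script and a userop it defines; only the java.lang.invoke calls are real JDK API.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class LookupSketch {
    /** Stands in for something the translated passage must call, e.g., a userop. */
    private static int secretUserop(int x) {
        return x + 1;
    }

    /** A lookup created here has the same access as LookupSketch itself. */
    static MethodHandles.Lookup ownLookup() {
        return MethodHandles.lookup();
    }

    /** Reports whether the given lookup is allowed to resolve the private method. */
    static boolean canAccess(MethodHandles.Lookup lookup) {
        try {
            lookup.findStatic(LookupSketch.class, "secretUserop",
                MethodType.methodType(int.class, int.class));
            return true;
        } catch (NoSuchMethodException | IllegalAccessException e) {
            return false;
        }
    }
}
```

Here canAccess(LookupSketch.ownLookup()) succeeds, while canAccess(MethodHandles.publicLookup()) does not. Passing the lookup obtained in the emulator's (or script's) own code is the analogous way to give the compiled passage access to that code's members.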
getConfiguration
Get this compiler's configuration.
Returns:
the configuration