Class JitPcodeEmulator
- All Implemented Interfaces:
PcodeMachine<byte[]>
An extension of PcodeEmulator that applies Just-in-Time (JIT) translation to accelerate execution.
This is meant as a near drop-in replacement for the class it extends. Aside from some additional configuration, and some annotations you might add to a PcodeUseropLibrary, if applicable, you can simply replace new PcodeEmulator() with new JitPcodeEmulator(...).
A JIT-Accelerated P-code Emulator for the Java Virtual Machine
Achieving JIT-accelerated p-code emulation involves two major tasks: 1) the translation of p-code to a suitable target's machine language, and 2) the selection, decoding, and cache management of passages of machine code translations. For our purposes, the target language is JVM bytecode, which imposes restrictions that make the translation process substantially different from targeting native machine language.
Terminology
Because of the potential for confusion of terms with similar meanings from similar disciplines, and to distinguish our particular use of the terms, we establish some definitions up front:
- Basic block: A block of p-code ops for which there are no branches into or from, except at its top and bottom. Note that this definition pertains only to p-code ops in the same passage. Branches into a block from ops generated elsewhere in the translation source need not be considered. Note also that p-code basic blocks might not coincide with machine-code basic blocks.
- Bytecode: Shorthand for "JVM bytecode." Others sometimes use this to mean any machine code, but for us "bytecode" only refers to the JVM's machine code.
- Decode context: The input contextreg value for decoding an instruction. This is often paired with an address to seed passages, identify an instruction's "location," and identify an entry point.
- Emulation host: The machine or environment on which the emulation target is being hosted. This is usually also the translation target. For our purposes, this is the JVM, often the same JVM executing Ghidra.
- Emulation target: The machine being emulated, as opposed to the translation target or emulation host. While this can include many aspects of a target platform, we often just mean the Instruction Set Architecture (ISA, or language) of the machine.
- Entry point: An address (and contextreg value) by which execution may enter a passage. In addition to the decode seed, the translator may expose many entries into a given passage, usually at branch targets or the start of each basic block coinciding with an instruction.
- Instruction: A single machine-code instruction.
- Machine code: The sequence of bytes and/or decoded instructions executed by a machine.
- Passage: A collection of strides connected by branches. Often each stride begins at the target of some branch in another stride.
- P-code: An intermediate representation used by Ghidra in much of its analysis and execution modeling. For our purposes, we mean "low p-code," which is the common language into which the source machine code is translated before final translation to bytecode.
- P-code op: A single p-code operation. A single instruction usually generates several p-code ops.
- Stride: A contiguous sequence of instructions (and their emitted p-code) connected by fall-through. Note that conditional branches may appear in the middle of the stride. So long as fall-through is possible, the stride may continue.
- Translation source: The machine code of the emulation target that is being translated and subsequently executed by the emulation host.
- Translation target: The target of the JIT translation, usually the emulation host. For our purposes, this is always JVM bytecode.
- Varnode: The triple (space,offset,size) giving the address and size of a variable in the emulation target's machine state. This is distinct from a variable node (see JitVal) in the use-def graph. The name "Varnode" is an unfortunate inheritance from the Ghidra API, where varnodes can represent genuine variable nodes in the "high p-code" returned by the decompiler. However, the emulator consumes the "low p-code" where varnodes are mere triples, which is how we use the term.
Just-in-Time Translation
For details of the translation process, see JitCompiler.
Translation Cache
This class, aside from overriding and replacing the state and thread objects with respective extensions, manages a part of the translation cache. For reasons discussed in the translation section, there are two levels of caching. Once a passage is translated into a classfile, it must be loaded as a class and then instantiated for the thread executing it. Thus, at the machine (or emulator) level, each translated passage's class is cached. Then, each thread caches its instance of that class. When a thread encounters an address (and contextreg value) that it has not yet translated, it requests that the emulator perform that translation. The details of this check are described in getEntryPrototype(AddrCtx, JitPassageDecoder) and JitPcodeThread.getEntry(AddrCtx).
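The two cache levels can be illustrated with a minimal, self-contained sketch. It does not use the Ghidra API: `AddrCtx`, `PassageClass`, `PassageInstance`, `Machine`, and `EmuThread` are hypothetical stand-ins, and the "translation" is simulated. It only shows the shape of the idea: one shared translation per entry at the machine level, and one instance per thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class TwoLevelCacheSketch {
    /** Hypothetical key: an address paired with a contextreg value. */
    record AddrCtx(long address, long context) {}

    /** Hypothetical stand-in for a translated and loaded passage class. */
    record PassageClass(AddrCtx seed) {}

    /** Hypothetical stand-in for a thread's instance of a passage class. */
    record PassageInstance(PassageClass cls, String threadName) {}

    /** Machine level: one translation per entry, shared by all threads. */
    static class Machine {
        final Map<AddrCtx, CompletableFuture<PassageClass>> codeCache =
            new ConcurrentHashMap<>();
        final AtomicInteger translations = new AtomicInteger();

        PassageClass getPassageClass(AddrCtx pcCtx) {
            // computeIfAbsent lets only one caller start the (simulated) translation
            return codeCache.computeIfAbsent(pcCtx,
                k -> CompletableFuture.supplyAsync(() -> {
                    translations.incrementAndGet();
                    return new PassageClass(k);
                })).join();
        }
    }

    /** Thread level: each thread caches its own instance of the class. */
    static class EmuThread {
        final String name;
        final Machine machine;
        final Map<AddrCtx, PassageInstance> instances = new HashMap<>();

        EmuThread(String name, Machine machine) {
            this.name = name;
            this.machine = machine;
        }

        PassageInstance getEntry(AddrCtx pcCtx) {
            return instances.computeIfAbsent(pcCtx,
                k -> new PassageInstance(machine.getPassageClass(k), name));
        }
    }
}
```

In this sketch, two threads that request the same entry share one `PassageClass` but each hold a distinct `PassageInstance`.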
-
Nested Class Summary
Nested classes/interfaces inherited from interface ghidra.pcode.emu.PcodeMachine
PcodeMachine.AccessKind, PcodeMachine.SwiMode
-
Field Summary
Fields (Modifier and Type, Field, Description):
- protected final Map<JitPassage.AddrCtx, CompletableFuture<JitCompiledPassage.EntryPointPrototype>> codeCache: This emulator's cache of passage translations, incl. all entry points.
- protected final JitCompiler compiler: The compiler which translates passages into JVM classes
Fields inherited from class ghidra.pcode.emu.AbstractPcodeMachine
accessBreakpoints, arithmetic, injects, language, library, stubLibrary, suspended, swiMode, threads, threadsView
-
Constructor Summary
Constructors (Constructor, Description):
- JitPcodeEmulator(Language language, JitConfiguration config, MethodHandles.Lookup lookup): Create a JIT-accelerated p-code emulator
-
Method Summary
Methods (Modifier and Type, Method, Description):
- void addAccessBreakpoint(AddressRange range, PcodeMachine.AccessKind kind): Add an access breakpoint over the given range
- protected JitCompiledPassageClass compileWithMaxOpsBackoff(JitPassage.AddrCtx pcCtx, JitPassageDecoder decoder): Translate a new passage starting at the given seed.
- protected PcodeExecutorState<byte[]> createLocalState(PcodeThread<byte[]> thread): A factory method to create the (register) state local to the given thread
- protected PcodeExecutorState<byte[]> createSharedState(): A factory method to create the (memory) state shared by all threads in this machine
- protected JitPcodeThread createThread(String name): A factory method to create a new thread in this machine
- protected PcodeUseropLibrary<byte[]> createUseropLibrary(): A factory method to create the userop library shared by all threads in this machine
- JitConfiguration getConfiguration(): Get the configuration for this emulator.
- JitCompiledPassage.EntryPointPrototype getEntryPrototype(JitPassage.AddrCtx pcCtx, JitPassageDecoder decoder): Get the entry prototype for a given address and contextreg value.
- boolean hasEntryPrototype(JitPassage.AddrCtx pcCtx): Check if the emulator already has translated a given entry point.
- JitPcodeThread newThread(): Create a new thread with a default name in this machine
- JitPcodeThread newThread(String name): Create a new thread with the given name in this machine
Methods inherited from class ghidra.pcode.emu.PcodeEmulator
createArithmetic
Methods inherited from class ghidra.pcode.emu.AbstractPcodeMachine
addBreakpoint, assertSleigh, checkLoad, checkStore, clearAccessBreakpoints, clearAllInjects, clearInject, compileSleigh, createThreadStubLibrary, doPluggableInitialization, getAllThreads, getArithmetic, getInject, getLanguage, getPluggableInitializer, getSharedState, getSoftwareInterruptMode, getStubUseropLibrary, getThread, getUseropLibrary, inject, isSuspended, setSoftwareInterruptMode, setSuspended, stepped, swi
-
Field Details
-
compiler
protected final JitCompiler compiler
The compiler which translates passages into JVM classes
-
codeCache
protected final Map<JitPassage.AddrCtx,CompletableFuture<JitCompiledPassage.EntryPointPrototype>> codeCache
This emulator's cache of passage translations, incl. all entry points.
TODO: Invalidation of entries. One possible complication is that any thread may still have an instance of one, and could possibly be executing it. Perhaps this could be a weak hash map, and the classes would stay alive by virtue of the instances pointing to them? Still, we might like to impose a total size max, which would have to be implemented among the threads. Other reasons we may need to invalidate include:
- Self-modifying code (we'll probably want to provide a configuration toggle given how expensive that may become).
- Changes to the memory map. At the moment, however, the p-code emulator does not provide a memory management unit (MMU).
- Addition of a new inject by the user or script. This one's actually pretty likely. For now, we might just document that injects should not be changed once execution starts.
-
-
Constructor Details
-
JitPcodeEmulator
Create a JIT-accelerated p-code emulator
- Parameters:
language - the emulation target language
config - configuration options for this emulator
lookup - a lookup in case the emulator (or its target) needs access to non-public elements, e.g., to access a nested PcodeUseropLibrary.
-
-
Method Details
-
createLocalState
Description copied from class: AbstractPcodeMachine
A factory method to create the (register) state local to the given thread
- Overrides:
createLocalState in class PcodeEmulator
- Parameters:
thread - the thread
- Returns:
the thread-local state
-
createThread
Description copied from class: AbstractPcodeMachine
A factory method to create a new thread in this machine
- Overrides:
createThread in class PcodeEmulator
- Parameters:
name - the name of the new thread
- Returns:
the new thread
-
newThread
Description copied from interface: PcodeMachine
Create a new thread with a default name in this machine
- Specified by:
newThread in interface PcodeMachine<byte[]>
- Overrides:
newThread in class AbstractPcodeMachine<byte[]>
- Returns:
the new thread
-
newThread
Description copied from interface: PcodeMachine
Create a new thread with the given name in this machine
- Specified by:
newThread in interface PcodeMachine<byte[]>
- Overrides:
newThread in class AbstractPcodeMachine<byte[]>
- Parameters:
name - the name
- Returns:
the new thread
-
createUseropLibrary
A factory method to create the userop library shared by all threads in this machine
Userops can be optimized by the JIT translator under certain circumstances. To read more, see JitDataFlowUseropLibrary. DO NOT extend that library. The internals use it to wrap the library you provide here, but its documentation describes when and how the JIT translator optimizes invocations to your userops.
WARNING: Userops that accept floating point types via direct invocation should be careful that the sizes match exactly. That is, if you pass a float argument to a double parameter, you may have problems. This does not imply a conversion of floating point type. Instead, it will simply zero-fill the upper bits (as if zero-extending an integer) and reinterpret the resulting bits as a double. This is almost certainly not what you want. Until/unless we resolve this, the userop implementor must accept the proper types. It's possible multiple versions of the userop must be provided (overloading is not supported) to accept types of various sizes.
- Overrides:
createUseropLibrary in class PcodeEmulator
- Returns:
the library
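The effect described in the warning can be reproduced in plain Java, independent of the emulator. The sketch below is only an illustration of the bit-level behavior; the class and method names are hypothetical. It contrasts a genuine float-to-double conversion with zero-filling the upper 32 bits and reinterpreting the result as a double.

```java
class FloatWideningSketch {
    /** A genuine floating-point conversion: the value is preserved. */
    static double converted(float f) {
        return (double) f;
    }

    /**
     * Zero-fill the upper 32 bits, as if zero-extending an integer, then
     * reinterpret the 64 bits as a double. The value is NOT preserved.
     */
    static double zeroFilled(float f) {
        long bits = Float.floatToRawIntBits(f) & 0xFFFFFFFFL;
        return Double.longBitsToDouble(bits);
    }
}
```

For `1.5f`, `converted` yields 1.5, while `zeroFilled` yields a tiny subnormal value, because the float's bits land entirely in the low half of the double's mantissa and leave the exponent field zero.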
-
hasEntryPrototype
Check if the emulator already has translated a given entry point.
This is used by the decoder to detect if it should end a stride before reaching its natural end (i.e., a non-fall-through instruction). This was a design decision to reduce re-translation of the same machine code. Terminating the stride will cause execution to exit the translated passage, but it will then immediately enter the existing translated passage.
- Parameters:
pcCtx - the program counter and contextreg value to check
- Returns:
true if the emulator has a translation which can be entered at the given pcCtx.
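A simplified model of that early termination is sketched below. It is not the actual decoder: `Insn`, `decodeStride`, and the `existingEntries` set are hypothetical, contextreg values are ignored, and the sketch assumes the seed instruction is always decoded, since the caller would not be decoding if the seed already had an entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class EarlyStrideEndSketch {
    /** Hypothetical stand-in for a decoded instruction. */
    record Insn(long address, boolean hasFallThrough) {}

    /**
     * Decode one stride from an address-ordered list, starting at index start.
     * The stride ends at its natural end (an instruction without fall-through)
     * or early, just before an address that already has a translated entry.
     */
    static List<Insn> decodeStride(List<Insn> memory, int start,
            Set<Long> existingEntries) {
        List<Insn> stride = new ArrayList<>();
        for (int i = start; i < memory.size(); i++) {
            Insn insn = memory.get(i);
            // Assumption: never stop on the seed itself
            if (i != start && existingEntries.contains(insn.address())) {
                break; // early end: execution continues in the existing passage
            }
            stride.add(insn);
            if (!insn.hasFallThrough()) {
                break; // natural end of the stride
            }
        }
        return stride;
    }
}
```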
-
compileWithMaxOpsBackoff
protected JitCompiledPassageClass compileWithMaxOpsBackoff(JitPassage.AddrCtx pcCtx, JitPassageDecoder decoder)
Translate a new passage starting at the given seed.
Note the compiler must provide an entry to the resulting passage at the requested seed. It and any additional entry points are placed into the code cache. Each thread executing the passage must still create (and ought to cache) an instance of the translation.
- Parameters:
pcCtx - the seed address and contextreg value for decoding and selecting a passage
decoder - the passage decoder, provided by the thread
- Returns:
the class that is the translation of the passage, and information about its entry points.
-
getEntryPrototype
public JitCompiledPassage.EntryPointPrototype getEntryPrototype(JitPassage.AddrCtx pcCtx, JitPassageDecoder decoder)
Get the entry prototype for a given address and contextreg value.
An entry prototype is a class representing a translated passage and an index identifying the point at which to enter the passage. The compiler numbers each entry point it generates and provides those indices via a static field in the output class. Those entry point indices are entered into the code cache for each translated passage. If no entry point exists for the requested address and contextreg value, the emulator will decode and translate a new passage at the requested seed.
It's a bit odd to take the thread's decoder for a machine-level thing; however, all thread decoders ought to have the same behavior. The particular thread's decoder will have better cached instruction block state for decoding in the vicinity of its past execution, though.
- Parameters:
pcCtx - the counter and decoder context
decoder - the thread's decoder needing this entry point prototype
- Returns:
the entry point prototype
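The relationship between a translated passage, its numbered entry points, and the cache can be sketched as follows. This is a simplified model, not the real implementation: all names are hypothetical, the cache holds prototypes directly rather than futures, and `translate` merely pretends that every passage exposes a second entry eight bytes past its seed.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class EntryPrototypeSketch {
    /** Hypothetical key: an address paired with a contextreg value. */
    record AddrCtx(long address, long context) {}

    /** Hypothetical translated passage, listing its entries in index order. */
    record PassageClass(List<AddrCtx> entries) {}

    /** A passage class plus the index at which to enter it. */
    record EntryPrototype(PassageClass cls, int index) {}

    final Map<AddrCtx, EntryPrototype> codeCache = new ConcurrentHashMap<>();
    int translations = 0;

    /** Simulated translation: entry 0 is the seed, entry 1 is seed + 8. */
    PassageClass translate(AddrCtx seed) {
        translations++;
        AddrCtx second = new AddrCtx(seed.address() + 8, seed.context());
        return new PassageClass(List.of(seed, second));
    }

    synchronized EntryPrototype getEntryPrototype(AddrCtx pcCtx) {
        EntryPrototype existing = codeCache.get(pcCtx);
        if (existing != null) {
            return existing;
        }
        // No entry yet: translate a passage seeded here, then record every entry
        PassageClass cls = translate(pcCtx);
        for (int i = 0; i < cls.entries().size(); i++) {
            codeCache.putIfAbsent(cls.entries().get(i), new EntryPrototype(cls, i));
        }
        return codeCache.get(pcCtx);
    }
}
```

A later request for the second entry is served from the cache with index 1 and does not trigger another translation.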
-
getConfiguration
Get the configuration for this emulator.
- Returns:
the configuration
-
addAccessBreakpoint
Add an access breakpoint over the given range
Access breakpoints are implemented out of band, without modification to the emulated image. The breakpoints are only effective for p-code PcodeOp.LOAD and PcodeOp.STORE operations with concrete offsets. Thus, an operation that refers directly to a memory address, e.g., a memory-mapped register, will not be trapped. Similarly, access breakpoints on registers or unique variables will not work. Access to an abstract offset that cannot be made concrete, i.e., via PcodeArithmetic.toConcrete(Object, Purpose), cannot be trapped. To interrupt on direct and/or abstract accesses, consider wrapping the relevant state and/or overriding PcodeExecutorStatePiece.getVar(Varnode, Reason) and related. For accesses to abstract offsets, consider overriding AbstractPcodeMachine.checkLoad(AddressSpace, Object, int) and/or AbstractPcodeMachine.checkStore(AddressSpace, Object, int) instead.
A breakpoint's range cannot cross more than one page boundary. Pages are 4096 bytes each. This allows implementations to optimize checking for breakpoints. If a breakpoint does not follow this rule, the behavior is undefined. Breakpoints may overlap, but currently no indication is given as to which breakpoint interrupted emulation.
No synchronization is provided on the internal breakpoint storage. Clients should ensure the machine is not executing when adding breakpoints. Additionally, the client must ensure only one thread is adding breakpoints to the machine at a time.
TODO: The JIT-accelerated emulator does not currently implement access breakpoints. Furthermore, because JIT generated code is granted direct access to the emulator's state internals, it is not sufficient to override getVar and related.
- Specified by:
addAccessBreakpoint in interface PcodeMachine<byte[]>
- Overrides:
addAccessBreakpoint in class AbstractPcodeMachine<byte[]>
- Parameters:
range - the address range to trap
kind - the kind of access to trap
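Because violating the page rule leads to undefined behavior, a client may wish to validate a range before adding the breakpoint. The helper below is a hypothetical sketch and not part of the API. It operates on raw inclusive offsets rather than an AddressRange, and it reads "cannot cross more than one page boundary" as meaning the first and last bytes lie in the same page or in adjacent pages.

```java
class BreakpointRangeSketch {
    /** 4096-byte pages, per the documentation above. */
    static final int PAGE_BITS = 12;

    /**
     * Check that the inclusive range [min, max] crosses at most one page
     * boundary. Offsets are treated as unsigned.
     */
    static boolean crossesAtMostOnePageBoundary(long min, long max) {
        if (Long.compareUnsigned(min, max) > 0) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        long firstPage = min >>> PAGE_BITS;
        long lastPage = max >>> PAGE_BITS;
        return lastPage - firstPage <= 1;
    }
}
```

Under this reading, a range such as 0x1ff0 to 0x2010 is acceptable, whereas 0x1ff0 to 0x3000 is not and would have to be split into several breakpoints.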