Class JitCodeGenerator

java.lang.Object
ghidra.pcode.emu.jit.gen.JitCodeGenerator

public class JitCodeGenerator extends Object
The bytecode generator for JIT-accelerated emulation.

This implements the Code Generation phase of the JitCompiler. With all the prior analysis, code generation is just a careful process of visiting all of the ops, variables, and analytic results to ensure everything is incorporated and accounted for.

The Target Classfile

The target is a classfile that implements JitCompiledPassage. As such, it must implement all of the specified methods in that interface as well as a constructor having a specific signature. That signature takes a JitPcodeThread and, being a constructor, returns void. We will also need to generate a static initializer to populate some metadata and pre-fetch any static things, e.g., the SleighLanguage for the emulation target. The fields are:

Static Initializer

In the Java language, statements in a class's static block, as well as the initial values of static fields, are implemented by the classfile's <clinit> method. We use it to pre-construct contextreg values and varnode refs for use in birthing and retirement. They are kept in static fields. We also initialize the static ENTRIES field, which is public (via reflection) and describes each entry point generated. It has the type List<JitPassage.AddrCtx>. A call to JitCompiledPassage.run(int) should pass in the position of the desired entry point in the ENTRIES list.

Constructor

In the Java language, statements in a class's constructor, as well as the initial values of instance fields, are implemented by the classfile's <init> methods. We provide a single constructor that accepts a JitPcodeThread. Upon construction, the generated JitCompiledPassage is "bound" to the given thread. The constructor pre-fetches parts of the thread's state and userop definitions, and it allocates JitCompiledPassage.ExitSlots. Each of these is kept in an instance field.

thread() Method

This method implements JitCompiledPassage.thread(), a simple getter for the thread field.

run() Method

This method implements JitCompiledPassage.run(int), the actual semantics of the translated machine instructions selected for the passage. It accepts a single parameter, which is the position in the ENTRIES list of the desired entry point blockId. The structure is as follows:

  1. Parameter declarations - this and blockId
  2. Allocated local declarations - declares all locals allocated by JitAllocationModel
  3. Entry point dispatch - a large switch statement on the entry blockId
  4. P-code translation - the block-by-block op-by-op translation of the p-code to bytecode
  5. Exception handlers - exception handlers as requested by various elements of the p-code translation

Entry Point Dispatch

This part of the run method dispatches execution to the correct entry point within the translated passage. It consists of these sub-parts:

  1. Switch table - a tableswitch to jump to the code for the scope transition into the entry block given by blockId
  2. Scope transitions - for each block, birth its live varnodes then jump to the block's translation
  3. Default case - throws an IllegalArgumentException for an invalid blockId

This first ensures that a valid entry point was given in blockId. If not, we jump to the default case, which throws an exception. Otherwise, we jump to the appropriate entry transition. Every block flow edge is subject to a scope transition wherein varnodes that leave scope must be retired and varnodes that enter scope must be birthed. We generate an entry transition for each possible entry block. That transition births all the varnodes that are in scope for that entry block, then jumps to the entry block's p-code translation.
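The dispatch described above can be pictured in plain Java. This is only a sketch of the control flow, not the emitted bytecode: the class EntryDispatchSketch, its two blocks, and the trace list are hypothetical, and the "birth" steps are stand-ins for the real scope transitions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a generated passage with two entry blocks.
class EntryDispatchSketch {
    // Records what the sketch did, purely for illustration.
    final List<String> trace = new ArrayList<>();

    void run(int blockId) {
        // Switch table: one case per position in the ENTRIES list.
        switch (blockId) {
            case 0:
                // Entry transition: birth what block 0 expects to be live.
                trace.add("birth varnodes live at block 0");
                block0();
                return;
            case 1:
                trace.add("birth varnodes live at block 1");
                block1();
                return;
            default:
                // Default case: no entry exists at this position.
                throw new IllegalArgumentException("Invalid entry blockId: " + blockId);
        }
    }

    private void block0() {
        trace.add("translated ops of block 0");
    }

    private void block1() {
        trace.add("translated ops of block 1");
    }
}
```

Calling run(1) on this sketch records the birth step for block 1 followed by its ops, while run(7) throws IllegalArgumentException.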

P-code Translation

Here, most of the generation is performed via delegation to an object model, based on the use-def graph. We first iterate over the blocks, in the same order as they appear in the decoded passage. This will ensure that fall-through control transitions in the p-code map to fall-through transitions in the emitted bytecode. If the block is the target of a bytecode jump, i.e., it's an entry block or the target of a p-code branch, then we emit a label at the start of the block. We then iterate over each p-code op in the block delegating each to the appropriate generator. We emit "line number" information for each op to help debug crashes. A generator may register an exception handler to be emitted later in the "exception handlers" part of the run method. If the block has fall through, we emit the appropriate scope transition before proceeding to the next block. Note that scope transitions for branch ops are emitted by the generators for those ops.

For details about individual p-code op translations, see OpGen. For details about individual SSA value (constant and variable) translations, see VarGen. For details about emitting scope transitions, see VarGen.BlockTransition.

  • Constructor Details

    • JitCodeGenerator

      Construct a code generator for the given passage's target classfile

      This constructor chooses the name for the target classfile based on the passage's entry seed. It has the form: Passage$at_address_context. The address is as rendered by Address.toString() but with characters replaced to make it a valid JVM classfile name. The decode context is rendered in hexadecimal. This constructor also declares the fields and methods, and emits the definition for JitCompiledPassage.thread().

      Parameters:
      lookup - a means of accessing user-defined components, namely userops
      context - the analysis context for the passage
      cfm - the control flow model
      dfm - the data flow model
      vsm - the variable scope model
      tm - the type model
      am - the allocation model
      oum - the op use model
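      The naming scheme described for this constructor can be illustrated with a small sketch. The helper passageClassName is hypothetical: it takes the address as an already-rendered string and the context as a long, and it assumes that every character other than a letter or digit is replaced by an underscore. The actual replacement rules and context rendering may differ.

```java
// Hypothetical helper; not part of JitCodeGenerator.
class PassageNameSketch {
    static String passageClassName(String address, long context) {
        StringBuilder sb = new StringBuilder("Passage$at_");
        for (char c : address.toCharArray()) {
            // Assumption: anything that is not a letter or digit becomes '_'.
            sb.append(Character.isLetterOrDigit(c) ? c : '_');
        }
        // The decode context is rendered in hexadecimal.
        return sb.append('_').append(Long.toHexString(context)).toString();
    }

    public static void main(String[] args) {
        System.out.println(passageClassName("00400000", 0L));
        System.out.println(passageClassName("ram:00400000", 0x80000000L));
    }
}
```

      Under these assumptions, the address string "00400000" with context 0 yields Passage$at_00400000_0, matching the name used in the startConstructor() example.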
  • Method Details

    • getAnalysisContext

      public JitAnalysisContext getAnalysisContext()
      Get the analysis context
      Returns:
      the context
    • getVariableScopeModel

      public JitVarScopeModel getVariableScopeModel()
      Get the variable scope model
      Returns:
      the model
    • getTypeModel

      public JitTypeModel getTypeModel()
      Get the type model
      Returns:
      the model
    • getAllocationModel

      public JitAllocationModel getAllocationModel()
      Get the allocation model
      Returns:
      the model
    • startStaticInitializer

      protected void startStaticInitializer()
      Emit the first bytecodes for the static initializer

      This generates code equivalent to:

       static {
              LANGUAGE = getLanguage(LANGUAGE_ID);
              ADDRESS_FACTORY = LANGUAGE.getAddressFactory();
       }
       

      Note that LANGUAGE_ID is initialized to a constant String in its declaration. Additional static fields may be requested as the p-code translation is emitted.

    • startConstructor

      protected void startConstructor()
      Emit the first bytecodes for the class constructor

      This generates code equivalent to:

       public Passage$at_00400000_0(JitPcodeThread thread) {
              super(); // Implicit in Java, but we must emit it
              this.thread = thread;
              this.state = thread.getState();
       }
       

      Additional instance fields may be requested as the p-code translation is emitted.

    • generateLoadJitStateSpace

      protected void generateLoadJitStateSpace(AddressSpace space, org.objectweb.asm.MethodVisitor iv)
      Emit bytecode to load the given JitBytesPcodeExecutorStatePiece.JitBytesPcodeExecutorStateSpace onto the JVM stack

      This is equivalent to the Java expression state.getForSpace(AddressFactory.getAddressSpace(spaceId)). The id of the given space is encoded as an immediate or in the constant pool and is represented as spaceId.

      Parameters:
      space - the space to load at run time
      iv - the visitor for the class constructor
    • requestFieldForSpaceIndirect

      public FieldForSpaceIndirect requestFieldForSpaceIndirect(AddressSpace space)
      Request a field for a JitBytesPcodeExecutorStatePiece.JitBytesPcodeExecutorStateSpace for the given address space
      Parameters:
      space - the address space
      Returns:
      the field request
    • requestFieldForArrDirect

      public FieldForArrDirect requestFieldForArrDirect(Address address)
      Request a field for the bytes backing the page at the given address
      Parameters:
      address - the address contained by the desired page
      Returns:
      the field request
    • requestStaticFieldForContext

      protected ghidra.pcode.emu.jit.gen.FieldForContext requestStaticFieldForContext(RegisterValue ctx)
      Request a field for the given contextreg value
      Parameters:
      ctx - the contextreg value
      Returns:
      the field request
    • requestStaticFieldForVarnode

      public FieldForVarnode requestStaticFieldForVarnode(Varnode vn)
      Request a field for the given varnode
      Parameters:
      vn - the varnode
      Returns:
      the field request
    • requestFieldForUserop

      public FieldForUserop requestFieldForUserop(PcodeUseropLibrary.PcodeUseropDefinition<?> userop)
      Request a field for the given userop
      Parameters:
      userop - the userop
      Returns:
      the field request
    • requestFieldForExitSlot

      public FieldForExitSlot requestFieldForExitSlot(JitPassage.AddrCtx target)
      Request a field for the JitCompiledPassage.ExitSlot for the given target
      Parameters:
      target - the target address and decode context
      Returns:
      the field request
    • labelForBlock

      public org.objectweb.asm.Label labelForBlock(JitControlFlowModel.JitBlock block)
      Get the label at the start of a block's translation
      Parameters:
      block - the block
      Returns:
      the label
    • requestExceptionHandler

      public ExceptionHandler requestExceptionHandler(JitPassage.DecodedPcodeOp op, JitControlFlowModel.JitBlock block)
      Request an exception handler that can retire state for a given op
      Parameters:
      op - the op that might throw an exception
      block - the block containing the op
      Returns:
      the exception handler request
    • generateValInitCode

      protected void generateValInitCode(JitVal v)
      Emit into the constructor any bytecode necessary to support the given value.
      Parameters:
      v - the value from the use-def graph
    • generateValReadCode

      public JitType generateValReadCode(JitVal v, JitTypeBehavior typeReq)
      Emit into the run method the bytecode to read the given value onto the JVM stack.

      Although the value may be assigned a type by the JitTypeModel, the type needed by a given op might be different. This method accepts the JitTypeBehavior for the operand and will ensure the value pushed onto the JVM stack is compatible with that type.

      Parameters:
      v - the value to read
      typeReq - the required type of the value
      Returns:
      the actual type of the value on the stack
    • generateVarWriteCode

      public void generateVarWriteCode(JitVar v, JitType type)
      Emit into the run method the bytecode to write the value on the JVM stack into the given variable.

      Although the destination variable may be assigned a type by the JitTypeModel, the type of the value on the stack may not match. This method needs to know that type so that, if necessary, it can convert the value to the appropriate JVM type for the local variable that holds it.

      Parameters:
      v - the variable to write
      type - the actual type of the value on the stack
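      As a loose analogy for the conversions performed when reading and writing, consider a variable stored as raw int bits that an op needs as a float. The sketch below is hypothetical: the class TypeConvSketch and its methods are not part of the generator, and it assumes the conversion is a bit-for-bit reinterpretation rather than a numeric cast.

```java
// Hypothetical sketch: a variable stored as raw int bits, used as a float.
class TypeConvSketch {
    // Stands in for an int-typed JVM local holding the variable.
    private int local;

    // "Read" with a float requirement: reinterpret the stored bits.
    float readAsFloat() {
        return Float.intBitsToFloat(local);
    }

    // "Write" of a float on the stack: convert back to the storage type.
    void writeFloat(float value) {
        local = Float.floatToRawIntBits(value);
    }

    // "Read" with an int requirement: no conversion is needed.
    int readAsInt() {
        return local;
    }
}
```

      The point of the sketch is only that the reader and writer must each know the type actually on the stack in order to pick the right conversion, or none at all.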
    • generateInitCode

      protected void generateInitCode()
      Emit all the bytecode for the constructor

      Note that some elements of the p-code translation may request additional bytecodes to be emitted, even after this method is finished. That code will be emitted at the time requested.

      To ensure a reasonable order, for debugging's sake, we request fields (and their initializations) for all the variables and values before iterating over the ops. This ensures, e.g., locals are declared in order of address for the varnodes they hold. Similarly, the pre-fetched byte arrays, whether for uniques, registers, or memory are initialized in order of address. Were these requests not made, they'd still get requested by the op generators, but the order would be less helpful.

    • generateCodeForOp

      protected void generateCodeForOp(PcodeOp op, JitControlFlowModel.JitBlock block, int opIdx)
      Emit the bytecode translation for a given p-code op

      This first finds the use-def node for the op and then verifies that it has not been eliminated. If the op remains, it finds the appropriate generator, emits line number information, and then emits the actual translation.

      Line number information in the JVM is a map of strictly-positive line numbers to bytecode offsets. The ASM library allows this to be populated by placing labels and then emitting a line-number-to-label entry (via MethodVisitor.visitLineNumber(int, Label)). It seems the JVM presumes the entire class is defined in a single source file, so we are unable to (ab)use a filename field to encode debug information. We can encode the op index into the (integer) line number, although we have to add 1 to make it strictly positive.

      Parameters:
      op - the op
      block - the block containing the op
      opIdx - the index of the op within the whole passage
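      The off-by-one encoding of the op index into the line number can be sketched as follows. The class and method names are hypothetical and exist only for illustration.

```java
// Hypothetical helpers illustrating the op-index/line-number mapping.
class LineNumberSketch {
    // Encode: op indices start at 0, but line numbers must be at least 1.
    static int lineForOp(int opIdx) {
        if (opIdx < 0) {
            throw new IllegalArgumentException("Negative op index: " + opIdx);
        }
        return opIdx + 1;
    }

    // Decode: recover the op index from a reported line number.
    static int opForLine(int line) {
        if (line < 1) {
            throw new IllegalArgumentException("Not a valid line number: " + line);
        }
        return line - 1;
    }
}
```

      When debugging a crash, the value reported by StackTraceElement.getLineNumber() for a frame in the run method could be passed through a decoder like opForLine to identify the offending op.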
    • generateCodeForBlockOps

      protected int generateCodeForBlockOps(JitControlFlowModel.JitBlock block, int opIdx)
      Emit the bytecode translation for the ops in the given p-code block

      This simply invokes generateCodeForOp(PcodeOp, JitBlock, int) on each op in the block and counts up the indices. Other per-block instrumentation is not included.

      Parameters:
      block - the block
      opIdx - the index, within the whole passage, of the first op in the block
      Returns:
      the index, within the whole passage, of the op immediately after the block
    • generateCodeForBlock

      protected int generateCodeForBlock(JitControlFlowModel.JitBlock block, int opIdx)
      Emit the bytecode translation for the given p-code block

      This checks if the block needs a label, i.e., it is an entry or the target of a branch, and then optionally emits an invocation of JitCompiledPassage.count(int, int). Finally, it emits the actual ops' translations via generateCodeForBlockOps(JitBlock, int).

      Parameters:
      block - the block
      opIdx - the index, within the whole passage, of the first op in the block
      Returns:
      the index, within the whole passage, of the op immediately after the block
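      The way the op index is threaded through successive blocks can be sketched in plain Java. Here a block is reduced to a list of op mnemonics, and "emitting" an op just records a string; both simplifications, and all names in the sketch, are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of passing the op index from block to block.
class OpIndexSketch {
    // Emit each op of one block; return the index just past the block.
    static int generateBlockOps(List<String> blockOps, int opIdx, List<String> out) {
        for (String op : blockOps) {
            out.add(opIdx + ": " + op);
            opIdx++;
        }
        return opIdx;
    }

    // Visit the blocks in passage order, feeding each result to the next call.
    static List<String> generateAll(List<List<String>> blocks) {
        List<String> out = new ArrayList<>();
        int opIdx = 0;
        for (List<String> block : blocks) {
            opIdx = generateBlockOps(block, opIdx, out);
        }
        return out;
    }
}
```

      Because each call returns the index of the op immediately after its block, the caller never needs to know how many ops a block contains.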
    • generateAddress

      protected void generateAddress(Address address, org.objectweb.asm.MethodVisitor mv)
      Emit code to load an Address onto the JVM stack

      Note this does not load the identical address, but reconstructs it at run time.

      Parameters:
      address - the address to load
      mv - the visitor for the method being generated
    • generateStaticEntry

      protected void generateStaticEntry(JitPassage.AddrCtx entry)
      Emit bytecode into the class initializer that adds the given entry point into ENTRIES.

      Consider the entry (ram:00400000,ctx=80000000). The code would be equivalent to:

       static {
              ENTRIES.add(new AddrCtx(
                      ADDRESS_FACTORY.getAddressSpace(ramId).getAddress(0x400000), CTX_80000000));
       }
       

      Note this method will request the appropriate CTX_... field.

      Parameters:
      entry - the entry point to add
    • generateStaticEntries

      protected void generateStaticEntries()
      Emit code into the static initializer to initialize the ENTRIES field.

      This first constructs a new ArrayList and assigns it to the field. Then, for each block representing a possible entry, it adds an element giving the address and contextreg value for the first op of that block.
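      The relationship between the ENTRIES list and the index passed to run(int) can be sketched as below. The nested AddrCtx class is a hypothetical simplification of JitPassage.AddrCtx that reduces both the address and the contextreg value to a long, and entryIndex is an illustrative helper, not an existing method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of looking up an entry's position in ENTRIES.
class EntriesSketch {
    // Simplified stand-in for JitPassage.AddrCtx.
    static final class AddrCtx {
        final long address;
        final long ctx;

        AddrCtx(long address, long ctx) {
            this.address = address;
            this.ctx = ctx;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof AddrCtx
                && ((AddrCtx) o).address == address
                && ((AddrCtx) o).ctx == ctx;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(address) * 31 + Long.hashCode(ctx);
        }
    }

    static final List<AddrCtx> ENTRIES = new ArrayList<>();

    static {
        // One element per entry block, in the same order used for dispatch.
        ENTRIES.add(new AddrCtx(0x400000L, 0x80000000L));
        ENTRIES.add(new AddrCtx(0x400010L, 0x80000000L));
    }

    // The position found here is what a caller would pass to run(int).
    static int entryIndex(AddrCtx target) {
        return ENTRIES.indexOf(target);
    }
}
```

      A result of -1 from this helper means the passage has no entry for the given address and context, so the caller would have to look elsewhere.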

    • generateRunCode

      protected void generateRunCode()
      Emit all the bytecode for the run method.

      The structure of this method is described by this class's documentation. It first declares all the locals allocated by the JitAllocationModel. It then collects the list of entry points and assigns a label to each. These are used when emitting the entry dispatch code. Several of those labels may also be re-used when translating branch ops. We must iterate over the blocks in the same order as generateStaticEntries(), so that our indices match its indices. Thus, we emit a tableswitch where each value maps to the label of the block identified in the same position of the ENTRIES field. We also provide a default case that just throws an IllegalArgumentException. We do not jump directly to the block's translation. Instead, we emit a prologue for each block, wherein we birth the variables that block expects to be live, and then jump to the translation. Then, we emit the translation for each block using generateCodeForBlock(JitBlock, int), placing transitions between those connected by fall through using VarGen.computeBlockTransition(JitCodeGenerator, JitBlock, JitBlock). Finally, we emit each requested exception handler using ExceptionHandler.generateRunCode(JitCodeGenerator, MethodVisitor).

    • load

      public JitCompiledPassageClass load()
      Generate the classfile for this passage and load it into this JVM.
      Returns:
      the translation, wrapped in utilities that know how to process and instantiate it
    • dumpBytecode

      protected byte[] dumpBytecode(byte[] bytes)
      For diagnostics: Dump the generated classfile to an actual file on disk
      Parameters:
      bytes - the classfile bytes
      Returns:
      the same classfile bytes
    • generate

      protected byte[] generate()
      Generate the classfile and get the raw bytes

      This emits all the bytecode for all the required methods, static initializer, and constructor. Once complete, this closes out the methods by letting the ASM library compute the JVM stack frames as well as the maximum stack size and local variable count. Finally, it closes out the class and retrieves the resulting bytes.

      Returns:
      the classfile bytes
    • getOpEntry

      public JitPassage.AddrCtx getOpEntry(PcodeOp op)
      Check if the given p-code op is the first of an instruction.
      Parameters:
      op - the op to check
      Returns:
      the address-context pair
    • getExitContext

      public RegisterValue getExitContext(PcodeOp op)
      Get the context of the instruction that generated the given p-code op.

      This is necessary when exiting the passage, whether due to an exception or "normal" exit. The emulator's context must be updated so that it can resume execution appropriately.

      Parameters:
      op - the p-code op causing the exit
      Returns:
      the contextreg value
    • generateRetirePcCtx

      public void generateRetirePcCtx(Runnable pcGen, RegisterValue ctx, org.objectweb.asm.MethodVisitor rv)
      Emit bytecode to set the emulator's counter and contextreg.

      Within a translated passage, there's no need to keep constant track of the program counter (nor decode context), since all the decoding has already been done. However, whenever we exit the passage and return control back to the emulator (whether by return or throw) we must "retire" the program counter and decode context, as if the emulator had interpreted all the instructions just executed. This ensures that the emulator has the correct seed when seeking its next entry point, which may require decoding a new passage.

      Parameters:
      pcGen - a means to emit bytecode to load the counter (as a long) onto the JVM stack. For errors, this is the address of the op causing the error. For branches, this is the branch target, which may be loaded from a varnode for an indirect branch.
      ctx - the contextreg value. For errors, this is the decode context of the op causing the error. For branches, this is the decode context at the target.
      rv - the visitor for the run method
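      The idea of retiring the counter and context only at exit can be sketched in plain Java. Everything here is hypothetical: the fields stand in for the emulator's state, the contextreg value is reduced to a long, and a LongSupplier plays the role that pcGen plays at code generation time, namely supplying the counter value when the exit is reached.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch: the counter and context are written only on exit.
class RetireSketch {
    long counter = 0x400000L; // stand-in for the emulator's program counter
    long context = 0L;        // stand-in for the emulator's decode context
    int executed;             // counts "translated instructions" run

    // Retire: publish the counter and context back to the emulator state.
    void retirePcCtx(LongSupplier pcGen, long ctx) {
        counter = pcGen.getAsLong();
        context = ctx;
    }

    // Three instructions run without touching the counter, then we exit
    // through an indirect branch whose target is known only at run time.
    void runPassage(long indirectTarget) {
        executed++;
        executed++;
        executed++;
        retirePcCtx(() -> indirectTarget, 1L);
    }
}
```

      After runPassage returns, the counter holds the branch target rather than the address of any instruction inside the passage, which is what the emulator needs to seek its next entry point.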
    • generatePassageExit

      public void generatePassageExit(JitControlFlowModel.JitBlock block, Runnable pcGen, RegisterValue ctx, org.objectweb.asm.MethodVisitor rv)
      Emit code to exit the passage

      This retires all the variables of the current block as well as the program counter and decode context. It does not generate the actual areturn or athrow, but everything required up to that point.

      Parameters:
      block - the block containing the op at which we are exiting
      pcGen - as in generateRetirePcCtx(Runnable, RegisterValue, MethodVisitor)
      ctx - as in generateRetirePcCtx(Runnable, RegisterValue, MethodVisitor)
      rv - the visitor for the run method
    • getErrorMessage

      public String getErrorMessage(PcodeOp op)
      Get the error message for a given p-code op.
      Parameters:
      op - the p-code op generating the error
      Returns:
      the message
    • getAddressForOp

      public Address getAddressForOp(PcodeOp op)
      Get the address that generated the given p-code op.

      NOTE: The decoder rewrites ops to ensure they have the decode address, even if they were injected or from an inlined userop.

      Parameters:
      op - the op
      Returns:
      the address, i.e., the program counter at the time the op is executed