Class JitVarScopeModel

java.lang.Object
ghidra.pcode.emu.jit.analysis.JitVarScopeModel

public class JitVarScopeModel extends Object
The variable scope analysis of JIT-accelerated emulation.

This implements the Variable Scope Analysis phase of the JitCompiler. The result provides the set of in-scope (alive) varnodes for each basic block. The design of this analysis, and the shortcuts we take, are informed by the design of downstream phases. In particular, we do not intend to allocate each SSA variable. There are often many, many such variables, and attempting to pack them into as few target resources, e.g., JVM locals, as possible would probably require a complicated and expensive algorithm. I don't think we'd gain much from it, either. Instead, we'll just allocate by varnode. To do that, though, we still have to consider that some varnodes overlap and otherwise alias others. If we are able to handle all that aliasing in place, then we need not generate code for the synthetic ops. One might ask: why, then, do any of the Data Flow Analysis in the first place?

  1. We still need data flow to inform the selection of JVM local types. We have not measured the run-time cost of the bitwise casts, but we do know the bytecode for each cast occupies space, counted against the 65,535-byte method limit.
  2. We also need data flow to inform operation elimination, which removes many wasted flag computations.

To handle the aliasing, we coalesce overlapping varnodes. For example, EAX will get coalesced with RAX, but BH will not get coalesced with BL, assuming no other part of RBX is accessed. The JitDataFlowModel records all varnodes accessed in the course of its intra-block analysis. Only those actually accessed are considered. We then compute scope in terms of these coalesced varnodes. For example, if both RAX and EAX are used by a passage, then an access of EAX causes RAX to remain in scope.
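The coalescing rule above can be sketched as a range merge: two accessed varnodes coalesce only when their address ranges actually overlap, so adjacent-but-disjoint registers like BH and BL stay separate. This is a minimal standalone illustration, not the real implementation; the Range record stands in for a varnode's (space, offset, size), and the example offsets for RAX/EAX/BH are assumptions for illustration only.

```java
import java.util.*;

public class CoalesceSketch {
    /** Hypothetical stand-in for a varnode's address range; end is exclusive. */
    record Range(long offset, int size) {
        long end() { return offset + size; }
    }

    /** Merge overlapping ranges; each surviving range covers every part it absorbed. */
    static List<Range> coalesce(List<Range> accessed) {
        List<Range> sorted = new ArrayList<>(accessed);
        sorted.sort(Comparator.comparingLong(Range::offset));
        List<Range> out = new ArrayList<>();
        for (Range r : sorted) {
            if (!out.isEmpty() && out.get(out.size() - 1).end() > r.offset()) {
                // Overlap with the previous range: replace it with the union.
                Range last = out.remove(out.size() - 1);
                long end = Math.max(last.end(), r.end());
                out.add(new Range(last.offset(), (int) (end - last.offset())));
            } else {
                out.add(r); // Disjoint (even if adjacent): keep separate.
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Range> merged = coalesce(List.of(
            new Range(0, 8),    // stand-in for RAX
            new Range(0, 4),    // stand-in for EAX: overlaps RAX, so it coalesces
            new Range(0x19, 1)  // stand-in for BH: disjoint from both, stays separate
        ));
        System.out.println(merged);
    }
}
```

Note that strict overlap, not mere adjacency, triggers the merge: that is what keeps BH and BL apart when no wider part of RBX is accessed.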

The decision to compute scope on a block-by-block basis instead of op-by-op is for simplicity. We intend to birth and retire variables along block transitions by considering what variables are coming into or leaving scope on the flow edge. Birthing is just reading a variable's value from the run-time state into its allocated JVM local. Conversely, retiring is writing the value back out to the state. There's little to be gained by retiring a variable midway through a block as opposed to the end of the block. Perhaps if one giant block handles a series of variables in sequence, we could have used a single JVM local to allocate each, but we're already committed to allocating a JVM local per (coalesced) varnode. So, while that may ensure only one variable is alive at a time, the number of JVM locals required remains the same. Furthermore, the amount of bytecode emitted remains the same, but at different locations in the block. The case where this might be worth considering is a userop invocation, because all live variables must be forcefully retired.
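The birth/retire terminology above can be made concrete with a toy model: birthing copies a varnode's value from the run-time state into its allocated JVM local, and retiring writes it back. This sketch is illustrative only; the maps, the birth/retire method names, and the state layout are assumptions, whereas the real compiler emits JVM bytecode that loads from and stores to the emulator's state.

```java
import java.util.*;

public class BirthRetireSketch {
    static Map<String, Long> state = new HashMap<>();  // toy run-time machine state
    static Map<String, Long> locals = new HashMap<>(); // stands in for JVM locals

    /** Birth: read the varnode's value from the run-time state into its JVM local. */
    static void birth(String vn) {
        locals.put(vn, state.getOrDefault(vn, 0L));
    }

    /** Retire: write the local's value back out to the run-time state. */
    static void retire(String vn) {
        state.put(vn, locals.remove(vn));
    }

    public static void main(String[] args) {
        state.put("RSP", 0x7fff0000L);
        birth("RSP");                                 // RSP comes into scope on a flow edge
        locals.put("RSP", locals.get("RSP") - 0x20);  // block body: RSP = INT_SUB RSP, 0x20:8
        retire("RSP");                                // RSP leaves scope on the next edge
        System.out.println(Long.toHexString(state.get("RSP"))); // prints 7ffeffe0
    }
}
```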

We then consider which common cases we want to ensure are optimized, given that we're limited to a block-by-block analysis. One that comes to mind is a function with an early bail. Consider the following C source:

 int func(my_struct* ptr) {
   if (ptr == NULL) {
     return ERR;
   }
   // Do some serious work
   return ptr->v;
 }
 

Often, the C compiler will group all the returns into one final basic block, so we might get the following p-code:

  1   RSP    = INT_SUB RSP, 0x20:8
  2   $U00:1 = INT_EQUAL RDI, 0:8     # RDI is ptr
  3            CBRANCH <err>, $U00:1

  4            # Do some serious work
  5   $U10:8 = INT_ADD RDI, 0xc:8     # Offset to field v
  6   EAX    = LOAD [ram] $U10:8
  7            BRANCH <exit>
 <err>
  8   EAX    = COPY 0xffffffff:4
 <exit>
  9   RSP    = INT_ADD RSP, 0x20:8
  10  RIP    = LOAD [ram] RSP
  11  RSP    = INT_ADD RSP, 8:8
  12           RETURN RIP
 

Note that I've elided the actual x86 machine code and all of the noise generated by C compilation and p-code lifting, and I've presumed the decoded passage contains exactly the example function. The result is your typical if-else diamond. We'll place the error case on the left:

     +---------+
     |   1--3  |
     | CBRANCH |
     +-T-----F-+
      /       \
     /         \
 +--------+ +--------+
 |   8    | |  4--7  |
 | (fall) | | BRANCH |
 +--------+ +--------+
     \         /
      \       /
     +---------+
     |  9--12  |
     | RETURN  |
     +---------+
 

Suppose the "serious work" on line 4 accesses several varnodes: RBX, RCX, RDX, and RSI. If execution follows the error path, we'd rather not birth any of those variables. Thus, we might like the result of the scope analysis to be:

 Block   Live Vars
 1–3     RDI, RSP, $U00:1
 4–7     EAX, RBX, RCX, RDI, RDX, RSI, RSP, $U10:8
 8       EAX, RSP
 9–12    RIP, RSP

This can be achieved rather simply: Define two sets for each block, the upward view and the downward view. The upward view corresponds to all varnodes that could be accessed while in this block or after leaving it; the downward view corresponds to all varnodes that could be accessed before entering this block or while in it. Each view is computed by initializing each block's set to the varnodes accessed by that block. For the upward view, we then "push" each set upward by adding its elements into the set of each block that flows into this one, until the sets converge. The downward sets are computed similarly, independently of the upward sets, by pushing downward along the flow edges. The result is the intersection of these two sets, per block.

The algorithm is somewhat intuitive in that we accrue live variables as we move toward the "body" of the control flow graph, and they begin to drop off as we approach an exit. The accrual is captured by the downward set, and the drop off is captured by intersection with the upward set. This also prevents retirement and rebirth of variables: if we are between two accesses of a varnode, then that varnode is alive. Consider RSP in the example above. The algorithm considers it alive in blocks 4–7 and 8, despite the fact that neither actually accesses it. Nevertheless, we'd rather generate one birth upon entering block 1–3, keep the variable alive in the body, and then generate one retirement upon leaving block 9–12.
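The two-view fixed point can be sketched as follows. This is a standalone illustration, not the real implementation: the string block ids and the access/successor maps are stand-ins for JitBlocks and Varnodes, and they reproduce the diamond example above.

```java
import java.util.*;

public class ScopeSketch {
    static Map<String, Set<String>> solve(Map<String, Set<String>> access,
            Map<String, List<String>> succs) {
        Map<String, Set<String>> up = copy(access);   // accessed in this block or after it
        Map<String, Set<String>> down = copy(access); // accessed in this block or before it
        boolean changed = true;
        while (changed) { // iterate until both views converge
            changed = false;
            for (var e : succs.entrySet()) {
                for (String s : e.getValue()) {
                    changed |= up.get(e.getKey()).addAll(up.get(s));     // push upward
                    changed |= down.get(s).addAll(down.get(e.getKey())); // push downward
                }
            }
        }
        Map<String, Set<String>> live = new HashMap<>();
        for (String b : access.keySet()) {
            Set<String> l = new TreeSet<>(up.get(b));
            l.retainAll(down.get(b)); // live = intersection of the two views
            live.put(b, l);
        }
        return live;
    }

    static Map<String, Set<String>> copy(Map<String, Set<String>> m) {
        Map<String, Set<String>> out = new HashMap<>();
        m.forEach((k, v) -> out.put(k, new HashSet<>(v)));
        return out;
    }

    public static void main(String[] args) {
        // Varnodes accessed per block, per the diamond example.
        Map<String, Set<String>> access = Map.of(
            "1-3", Set.of("RDI", "RSP", "$U00:1"),
            "4-7", Set.of("RDI", "$U10:8", "EAX", "RBX", "RCX", "RDX", "RSI"),
            "8", Set.of("EAX"),
            "9-12", Set.of("RSP", "RIP"));
        Map<String, List<String>> succs = Map.of(
            "1-3", List.of("8", "4-7"),
            "8", List.of("9-12"),
            "4-7", List.of("9-12"),
            "9-12", List.of());
        solve(access, succs).forEach((b, vars) -> System.out.println(b + " " + vars));
    }
}
```

Running the sketch reproduces the table above, e.g., block 8 comes out as {EAX, RSP} even though it never accesses RSP itself.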

One notable effect of this algorithm is that all blocks in a loop will have the same variables in scope. I think this is okay. We'll birth the relevant variables upon entering the loop, keep them all alive during loop execution, and then retire them (unless they're accessed downstream) upon leaving.

  • Constructor Details

    • JitVarScopeModel

      public JitVarScopeModel(JitControlFlowModel cfm, JitDataFlowModel dfm)
      Construct the model
      Parameters:
      cfm - the control flow model
      dfm - the data flow model
  • Method Details

    • getCoalesced

      public Varnode getCoalesced(Varnode part)
      Get the varnode into which the given varnode was coalesced

      In many cases, the result is the same varnode.

      Parameters:
      part - the varnode
      Returns:
      the coalesced varnode
    • coalescedVarnodes

      public Iterable<Varnode> coalescedVarnodes()
      Get the collection of all coalesced varnodes
      Returns:
      the varnodes
    • getLiveVars

      public Set<Varnode> getLiveVars(JitControlFlowModel.JitBlock block)
      Get the set of live varnodes for the given block
      Parameters:
      block - the block
      Returns:
      the live varnodes
    • dumpResult

      public void dumpResult()
      For diagnostics: Dump the analysis result to stderr
      See Also: