Class StructuredSleigh

java.lang.Object
ghidra.pcode.struct.StructuredSleigh
Direct Known Subclasses:
AnnotatedEmuSyscallUseropLibrary.StructuredPart

public class StructuredSleigh extends Object
The primary class for using the "structured sleigh" DSL

This provides some conveniences for generating Sleigh source code, which is otherwise completely typeless and lacks basic control structure. In general, the types are not used so much for type checking as they are for easing access to fields of C structures, array indexing, etc. Furthermore, it becomes possible to re-use code when data types differ among platforms, so long as those variations are limited to field offsets and type sizes.

Start by declaring an extension of StructuredSleigh. Then put any necessary "forward declarations" as fields of the class. Then declare methods annotated with StructuredSleigh.StructuredUserop. Inside those methods, all the protected methods of this class are accessible, providing a DSL (as far as Java allows) for writing Sleigh code. For example:

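 class MyStructuredPart extends StructuredSleigh {
        // Forward declaration: lang(String, DataType) needs a DataType, so resolve the path
        Var r0 = lang("r0", type("/long"));
 
        protected MyStructuredPart(Program program) {
                super(program);
        }
 
        @StructuredUserop
        public void my_userop() {
                // Long literal, so the value is not sign-extended
                r0.set(0xdeadbeefL);
        }
 }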
 

This will simply generate the source "r0 = 0xdeadbeef:4", but it also provides all the scaffolding to compile and invoke the userop as in a PcodeUseropLibrary. Unannotated internal methods, which essentially behave like macros, may also be used, so annotate only the methods that should be exported as userops. For a more complete and practical example of using structured sleigh in a userop library, see AbstractEmuUnixSyscallUseropLibrary.
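
The export rule described above (annotated methods become userops, other methods are only helpers) can be illustrated with plain reflection. This is a minimal sketch, not Ghidra's implementation: the `Exported` annotation and the `ExportScan` and `Part` classes are hypothetical stand-ins for StructuredSleigh.StructuredUserop and a StructuredSleigh extension.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

class ExportScan {
    // Hypothetical stand-in for @StructuredUserop; not part of Ghidra.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Exported {
    }

    // Example "library": one exported routine and one internal helper.
    static class Part {
        @Exported
        public void my_userop() {
            helper(); // a plain Java call, so its effect is inlined like a macro
        }

        void helper() {
            // not annotated, so it is never exported by name
        }
    }

    // Collect the names of the methods that carry the marker annotation.
    static Set<String> exportedNames(Class<?> cls) {
        Set<String> names = new TreeSet<>();
        for (Method m : cls.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Exported.class)) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(exportedNames(Part.class)); // prints [my_userop]
    }
}
```

Only `my_userop` is discovered; `helper` stays invisible to the scan, which is the property that lets internal methods act as macros.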

Structured sleigh is also usable in a more standalone manner:

 StructuredSleigh ss = new StructuredSleigh(compilerSpec) {
        // An anonymous class is not accessible to StructuredSleigh, so provide the lookup
        @Override
        protected MethodHandles.Lookup getMethodLookup() {
                return MethodHandles.lookup();
        }

        @StructuredUserop
        public void my_userop() {
                // Something interesting, I'm sure
        }
 };
 
 SleighPcodeUseropDefinition<Object> myUserop = ss.generate().get("my_userop");
 // To print source
 myUserop.getLines().forEach(System.out::print);
 
 // To compile for given parameters (none in this case) and print the p-code
 Register r0 = compilerSpec.getLanguage().getRegister("r0");
 System.out.println(myUserop.programFor(new Varnode(r0.getAddress(), r0.getNumBytes()), List.of(),
        PcodeUseropLibrary.NIL));
 

Known limitations:

  • Recursion is not really possible. Currently, local variables of a userop do not get their own unique storage per invocation record. Furthermore, local variables in different userop definitions may be assigned the same storage location, meaning they could be unintentionally aliased if one userop invokes the other. Take care when invoking one Sleigh-based userop from another, or avoid it altogether until this limitation is addressed. Such invocations are generally safe at the tail.
  • Parameters are passed by reference. Essentially, each formal parameter becomes an alias for the argument passed to it. This is more a feature than a limitation, but it can be surprising if C semantics are expected.
  • Calling one Structured Sleigh userop from another still requires an "external declaration" of the callee, despite both being defined in the same "compilation unit."
  • Field Details

  • Constructor Details

    • StructuredSleigh

      protected StructuredSleigh(Program program)
      Bind this Structured Sleigh context to the given program's language, compiler spec, and data type manager.
      Parameters:
      program - the program
    • StructuredSleigh

      protected StructuredSleigh(CompilerSpec cs)
      Bind this Structured Sleigh context to the given compiler spec using only built-in types
      Parameters:
      cs - the compiler spec
  • Method Details

    • findComponentByName

      protected static DataTypeComponent findComponentByName(Composite composite, String name)
      Utility: Get the named component (field) from the given composite data type
      Parameters:
      composite - the type
      name - the name of the component
      Returns:
      the found component, or null
    • addDataTypeSource

      protected void addDataTypeSource(DataTypeManager source)
      Add another data type manager as a possible source of data types
      Parameters:
      source - the additional data type manager
    • addDataTypeSources

      protected void addDataTypeSources(Collection<DataTypeManager> sources)
      Add several data type managers as source of data types
      Parameters:
      sources - the additional managers
    • lang

      protected StructuredSleigh.Var lang(String name, DataType type)
      Import a variable defined by the processor language
      Parameters:
      name - the name of the variable. The name must already be defined by the processor
      type - the type of the variable
      Returns:
      a handle to the variable
    • reg

      protected StructuredSleigh.Var reg(Register register, DataType type)
      Import a register variable
      Parameters:
      register - the register
      type - the type of the variable
      Returns:
      a handle to the variable
    • local

      protected StructuredSleigh.Var local(String name, DataType type)
      Declare a local variable with the given name and type

      If the variable has no definitive type, but has a known size, use e.g., Undefined8DataType or type(String) with "/undefined8". If the variable's size depends on the ABI, use the most appropriate integer or pointer type, e.g., "/void*".

      Parameters:
      name - the name of the variable. The name cannot already be defined by the processor
      type - the type of the variable
      Returns:
      a handle to the variable
    • local

      protected StructuredSleigh.Var local(String name, StructuredSleigh.RVal init)
      Declare a local variable with the given name and initial value

      The type is taken from that of the initial value.

      Parameters:
      name - the name of the variable. The name cannot already be defined by the processor
      init - the initial value (and type)
      Returns:
      a handle to the variable
    • temp

      protected StructuredSleigh.Var temp(DataType type)
      Allocate a temporary local variable of the given type
      Parameters:
      type - the type
      Returns:
      a handle to the variable
    • type

      protected DataType type(String path)
      Get a type from a bound data type manager by path
      Parameters:
      path - the full path to the data type, including leading "/"
      Returns:
      the data type
      Throws:
      StructuredSleigh.StructuredSleighError - if the type cannot be found
    • types

      protected List<DataType> types(String... paths)
      Get several types
      Parameters:
      paths - the data types paths
      Returns:
      the data types in the same order
      Throws:
      StructuredSleigh.StructuredSleighError - if any type cannot be found
    • userop

      protected StructuredSleigh.UseropDecl userop(DataType returnType, String name, List<DataType> parameterTypes)
      Declare an external userop
      Parameters:
      returnType - the userop's "return type"
      name - the name of the userop as it would appear in Sleigh code
      parameterTypes - the types of its parameters, in order
      Returns:
      the declaration, suitable for generating invocations
    • lit

      protected StructuredSleigh.RVal lit(long val, int size)
      Generate a literal (or immediate or constant) value

      WARNING: Passing an int literal that turns out to be negative (easy to do in hex notation) can be perilous. For example, 0xdeadbeef is a negative int, so Java sign-extends it to 0xffffffffdeadbeef when widening it to the long parameter of this method. Use the long literal 0xdeadbeefL instead.
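
The widening behavior behind this warning is plain Java and can be checked without Ghidra. In the sketch below, `LiteralWidening` and `asHex` are hypothetical names; `asHex` only mimics the shape of a method taking a long, as lit(long, int) does.

```java
class LiteralWidening {
    // Takes a long, so an int argument is widened (sign-extended) on the way in.
    static String asHex(long val) {
        return Long.toHexString(val);
    }

    public static void main(String[] args) {
        // int literal: negative as an int, so widening sign-extends it
        System.out.println(asHex(0xdeadbeef));  // prints ffffffffdeadbeef
        // long literal: keeps the intended 32-bit pattern
        System.out.println(asHex(0xdeadbeefL)); // prints deadbeef
        // Masking recovers the intended bits when the value arrives as an int
        int fromElsewhere = 0xdeadbeef;
        System.out.println(asHex(fromElsewhere & 0xffffffffL)); // prints deadbeef
    }
}
```

When the value comes from an int variable rather than a literal, masking with 0xffffffffL (or Integer.toUnsignedLong) gives the same zero-extended result.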

      Parameters:
      val - the value
      size - the size of the value in bytes
      Returns:
      a handle to the value
    • litf

      protected StructuredSleigh.RVal litf(float val)
      Generate a literal (or immediate or constant) single-precision floating-point value
      Parameters:
      val - the value
      Returns:
      a handle to the value
    • litd

      protected StructuredSleigh.RVal litd(double val)
      Generate a literal (or immediate or constant) double-precision floating-point value
      Parameters:
      val - the value
      Returns:
      a handle to the value
    • litf

      protected StructuredSleigh.RVal litf(double val, DataType type)
      Generate a literal (or immediate or constant) floating-point value
      Parameters:
      val - the value
      type - the type of the value
      Returns:
      a handle to the value
    • s

      public StructuredSleigh.Stmt s(String rawStmt)
      Generate Sleigh code

      This is similar in concept to inline assembly. It allows the embedding of Sleigh code into Structured Sleigh that is otherwise impossible or inconvenient to state. No effort is made to ensure the correctness of the given Sleigh code nor its impact in context.

      Parameters:
      rawStmt - the Sleigh code
      Returns:
      a handle to the statement
    • e

      public Expr e(String rawExpr)
      Generate a Sleigh expression

      This is similar in concept to inline assembly, except it also has a value. It allows the embedding of Sleigh code into Structured Sleigh that is otherwise impossible or inconvenient to express. No effort is made to ensure the correctness of the given Sleigh expression nor its impact in context. The result is assigned a type of "void".

      Parameters:
      rawExpr - the Sleigh expression
      Returns:
      a handle to the value
    • _if

      Generate an "if" statement

      The body is usually a lambda containing additional statements, predicated on this statement's condition, so that it resembles Java / C syntax:

       _if(r0.eq(4), () -> {
              r1.set(1);
       });
       

      The returned "wrapper" provides for additional follow-on syntax, e.g.:

       _if(r0.eq(4), () -> {
              r1.set(1);
       })._elif(r0.eq(5), () -> {
              r1.set(3);
       })._else(() -> {
              r1.set(r0.muli(2));
       });
       
      Parameters:
      cond - the condition
      body - the body of the statement
      Returns:
      a wrapper to the generated "if" statement
    • _while

      protected void _while(StructuredSleigh.RVal cond, Runnable body)
      Generate a "while" statement

      The body is usually a lambda containing the controlled statements, so that it resembles Java / C syntax:

       Var temp = local("temp", type("/int"));
       _while(temp.ltiu(10), () -> {
              temp.inc();
       });
       
      Parameters:
      cond - the condition
      body - the body of the loop
    • _for

      protected void _for(StructuredSleigh.Stmt init, StructuredSleigh.RVal cond, StructuredSleigh.Stmt step, Runnable body)
      Generate a "for" statement

      The body is usually a lambda containing the controlled statements, so that it resembles Java / C syntax:

       Var temp = local("temp", type("/int"));
       Var total = local("total", type("/int"));
       total.set(0);
       _for(temp.set(0), temp.ltiu(10), temp.inc(1), () -> {
              total.addiTo(temp);
       });
       

      TIP: If the number of repetitions is known at generation time, consider using a standard Java for loop, as a sort of Structured Sleigh macro. For example, to broadcast element 0 to an in-memory 16-long vector pointed to by r0:

       Var arr = lang("r0", type("/int *"));
       for (int i = 1; i < 16; i++) {
              arr.index(i).deref().set(arr.index(0).deref());
       }
       

      Instead of generating a loop, this will generate 15 Sleigh statements.
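
The generation-time unrolling in this tip can be imitated with a toy generator that collects strings. This is only a sketch: `UnrollSketch` and `broadcast` are hypothetical, and the statement text is illustrative rather than the source that Structured Sleigh actually emits. It assumes 4-byte elements.

```java
import java.util.ArrayList;
import java.util.List;

class UnrollSketch {
    // Emit one assignment per element after the first; the Java loop runs at generation time.
    static List<String> broadcast(String base, int elemSize, int count) {
        List<String> stmts = new ArrayList<>();
        for (int i = 1; i < count; i++) {
            int offset = i * elemSize; // byte offset of element i
            stmts.add(String.format("*:%d (%s + %d) = *:%d %s;",
                elemSize, base, offset, elemSize, base));
        }
        return stmts;
    }

    public static void main(String[] args) {
        List<String> stmts = broadcast("r0", 4, 16);
        System.out.println(stmts.size()); // prints 15
        System.out.println(stmts.get(0)); // prints *:4 (r0 + 4) = *:4 r0;
    }
}
```

The loop counter never appears in the output; only its effect (15 separate statements with constant offsets) does, which is why this works as a macro.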

      Parameters:
      init - the loop initializer
      cond - the loop condition
      step - the loop stepper
      body - the body of the loop
    • _break

      protected void _break()
      Generate a "break" statement

      This must appear in the body of a loop statement. It binds to the innermost loop statement in which it appears, generating code to leave that loop.

    • _continue

      protected void _continue()
      Generate a "continue" statement

      This must appear in the body of a loop statement. It binds to the innermost loop statement in which it appears, generating code to immediately repeat the loop, skipping the remainder of its body.

    • _result

      protected void _result(StructuredSleigh.RVal result)
      Generate a "result" statement

      This is semantically similar to a C "return" statement, but is named differently to avoid confusion with Sleigh's return statement. When this is code implementing a p-code userop, this immediately exits the userop, returning control to the caller where the invocation takes the value given in this statement.

      Contrast with _return(RVal)

      Parameters:
      result - the resulting value of the userop
    • _return

      protected void _return(StructuredSleigh.RVal target)
      Generate a "return" statement

      This models (in part) a C-style return from the current target function to its caller. It simply generates the "return" Sleigh statement, which is an indirect branch to the given target. The target is typically popped from the stack or read from a link register.

      Contrast with _result(RVal)

      Parameters:
      target - the offset of the target
    • _goto

      protected void _goto(StructuredSleigh.RVal target)
      Generate a "goto" statement to another address in the processor's code space
      Parameters:
      target - the offset of the target address
    • getMethodLookup

      protected MethodHandles.Lookup getMethodLookup()
      Get the method lookup for this context

      If the annotated methods cannot be accessed by StructuredSleigh, this method must be overridden. It should simply return MethodHandles.lookup(). This is necessary when the author chooses access modifiers other than public, which is good practice, or when the class is an anonymous inner class, as is often the case with stand-alone use.
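
Why the lookup matters can be seen with standard java.lang.invoke calls alone. The sketch below is an analogy, not Ghidra code: `LookupSketch`, `Author`, and `tryInvoke` are hypothetical, and the assumption is that the framework binds annotated methods through whatever lookup it is given.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Method;

class LookupSketch {
    // Plays the role of the author's class with a non-public method.
    static class Author {
        private String my_userop() {
            return "generated";
        }

        // Analogous to overriding getMethodLookup(): a lookup with this class's access.
        MethodHandles.Lookup getMethodLookup() {
            return MethodHandles.lookup();
        }
    }

    // Try to bind and invoke the method using the given lookup.
    static String tryInvoke(MethodHandles.Lookup lookup, Author target) {
        try {
            Method m = Author.class.getDeclaredMethod("my_userop");
            MethodHandle handle = lookup.unreflect(m); // access is checked here
            return (String) handle.invoke(target);
        }
        catch (IllegalAccessException e) {
            return "inaccessible";
        }
        catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Author author = new Author();
        // A lookup without access to Author cannot bind the private method
        System.out.println(tryInvoke(MethodHandles.publicLookup(), author)); // prints inaccessible
        // The lookup created inside Author can
        System.out.println(tryInvoke(author.getMethodLookup(), author));     // prints generated
    }
}
```

MethodHandles.lookup() captures the access rights of the class that calls it, so the override has to live in the author's own class to be useful.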

      Returns:
      the lookup
    • generate

      public <T> void generate(Map<String,? super SleighPcodeUseropDefinition<T>> into)
      Generate all the exported userops and place them into the given map
      Type Parameters:
      T - the type of values used by the userops. For sleigh, this can be anything.
      Parameters:
      into - the destination map, usually belonging to a PcodeUseropLibrary.
    • generate

      public <T> SleighPcodeUseropDefinition<T> generate(Method m)
      Generate the userop for a given Java method
      Type Parameters:
      T - the type of values used by the userop. For sleigh, this can be anything.
      Parameters:
      m - the method exported as a userop
      Returns:
      the userop
    • doGenerate

      protected <T> SleighPcodeUseropDefinition<T> doGenerate(MethodHandles.Lookup lookup, Method m)
    • generate

      public <T> Map<String,SleighPcodeUseropDefinition<T>> generate()
      Generate all the exported userops and return them in a map

      This is typically only used when not part of a larger PcodeUseropLibrary, for example to aid in developing a Sleigh module or for generating injects.

      Type Parameters:
      T - the type of values used by the userops. For sleigh, this can be anything.
      Returns:
      the userops, keyed by name
    • computeFloatSize

      protected int computeFloatSize(DataType type)
      Validate and compute the size (in bytes) of a floating-point data type
      Parameters:
      type - the type
      Returns:
      the size of the type
    • encodeFloat

      protected long encodeFloat(double val, int size)
      Encode a floating-point value
      Parameters:
      val - the value
      size - the size (in bytes)
      Returns:
      the encoded bits
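
One plausible way to produce such bits is shown below. This is a sketch under a loudly stated assumption: it supports only IEEE 754 binary32 (4 bytes) and binary64 (8 bytes), whereas the actual method may consult the language's floating-point formats. `FloatEncodingSketch` and `encode` are hypothetical names.

```java
class FloatEncodingSketch {
    // Assumed behavior: IEEE 754 binary32 for size 4, binary64 for size 8.
    static long encode(double val, int size) {
        switch (size) {
            case 4:
                // Narrow to float, then zero-extend the 32 raw bits into a long
                return Float.floatToRawIntBits((float) val) & 0xffffffffL;
            case 8:
                return Double.doubleToRawLongBits(val);
            default:
                throw new IllegalArgumentException("Unsupported float size: " + size);
        }
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(encode(1.0, 4)));  // prints 3f800000
        System.out.println(Long.toHexString(encode(1.0, 8)));  // prints 3ff0000000000000
        System.out.println(Long.toHexString(encode(-2.0, 4))); // prints c0000000
    }
}
```

The mask in the 4-byte case matters for negative values: without it, the int bits would be sign-extended into the upper half of the long.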
    • isAssignable

      protected boolean isAssignable(DataType varType, DataType valType)
      Extension point: Specify whether values of a given type can be assigned to variables of another type

      The default is to check if the types are equivalent: DataType.isEquivalent(DataType).

      Parameters:
      varType - the variable's data type (assign to)
      valType - the value's data type (assign from)
      Returns:
      true if allowed, false otherwise
    • emitAssignmentTypeMismatch

      protected void emitAssignmentTypeMismatch(StructuredSleigh.LVal lhs, StructuredSleigh.RVal rhs)
      Extension point: Specify how to handle a type mismatch in an assignment

      The default is to log a warning and continue.

      Parameters:
      lhs - the variable being assigned
      rhs - the value being assigned to the variable
    • emitParameterCountMismatch

      protected void emitParameterCountMismatch(StructuredSleigh.UseropDecl userop, List<StructuredSleigh.RVal> arguments)
      Extension point: Specify how to handle a parameter to argument count mismatch

      The default is to throw an unrecoverable error. If allowed to continue, the matched parameters are type checked and the invocation generated as specified. Most likely, the emulator will crash while executing the invoked userop.

      Parameters:
      userop - the userop being called
      arguments - the arguments being passed
    • emitParameterTypeMismatch

      protected void emitParameterTypeMismatch(StructuredSleigh.UseropDecl userop, int position, StructuredSleigh.RVal value)
      Extension point: Specify how to handle a parameter type mismatch

      The default is to log a warning and continue.

      Parameters:
      userop - the userop being called
      position - the position of the parameter
      value - the value being assigned
    • emitResultTypeMismatch

      protected void emitResultTypeMismatch(ghidra.pcode.struct.RoutineStmt routine, StructuredSleigh.RVal result)
      Extension point: Specify how to handle a result type mismatch

      The default is to log a warning and continue.

      Parameters:
      routine - the routine (userop) containing the result statement
      result - the result value specified in the statement
    • computeDerefType

      protected DataType computeDerefType(StructuredSleigh.RVal addr)
      Compute the type of a dereferenced address
      Parameters:
      addr - the value of the pointer
      Returns:
      the resulting type
    • emitDerefNonPointer

      protected void emitDerefNonPointer(StructuredSleigh.RVal addr)
      Extension point: Specify how to handle dereference of a non-pointer value

      The default is to log a warning and continue. If permitted to continue, the resulting type will be void, likely resulting in more issues. See computeDerefType(RVal).

      Parameters:
      addr - the value being dereferenced
    • computeElementLength

      protected int computeElementLength(StructuredSleigh.RVal addr)
      Compute the length (in bytes) of an element of a pointer to an array
      Parameters:
      addr - the value of the pointer
      Returns:
      the length of one element
    • findComponent

      protected DataTypeComponent findComponent(StructuredSleigh.RVal addr, String name)
      Find the type component (field) of a pointer to a composite type

      In terms of type manipulation, this is equivalent to the C expression addr->field. StructuredSleigh.LVal.field(String) uses the component to derive the offset and the resulting pointer type.

      Parameters:
      addr - the value of the pointer
      name - the field being accessed
      Returns:
      the found component
      Throws:
      StructuredSleigh.StructuredSleighError - if the field cannot be found