Attributes and Children
<context_set> | (0 or more) Set a context variable across a region of memory
<tracked_set> | (0 or more) Set default value of register
A <context_data> tag consists of zero or more <context_set> and <tracked_set> subtags, which allow certain values to be assumed by analysis.
Attributes and Children
space | Name of address space
first | (Optional) Starting offset of range
last | (Optional) Ending offset of range
<set> | Specify the context variable and the new value
  name | Name of the context variable
  val | Integer value being set
  description | (Optional) Description of what is set
A <context_set> tag sets a SLEIGH context variable over a specified address range. This potentially affects how instructions are disassembled within that range. This is more commonly used in the processor specification file but can also be used here for specific compilers.
The attributes space, first, and last describe the range. Omitting first and/or last causes the range to start at the beginning and/or run to the end of the address space respectively. The <set> subtag describes the variable and its setting.
Example 3.
<context_data>
  <context_set space="ram">
    <set name="mode16" val="1" description="Set 16-bit mode across all of ram"/>
  </context_set>
</context_data>
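The range attributes can be sketched as follows. This is a hypothetical fragment: the offsets are made-up values chosen for illustration, and mode16 is borrowed from Example 3 rather than taken from any particular processor.

```xml
<context_data>
  <!-- Hypothetical range: mode16 applies only to offsets 0x1000 through 0x1fff of ram -->
  <context_set space="ram" first="0x1000" last="0x1fff">
    <set name="mode16" val="1" description="Set 16-bit mode for one region"/>
  </context_set>
  <!-- last is omitted, so this range runs from 0x8000 to the end of ram -->
  <context_set space="ram" first="0x8000">
    <set name="mode16" val="0"/>
  </context_set>
</context_data>
```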
Attributes and Children
space | Name of address space
first | (Optional) Starting offset of range
last | (Optional) Ending offset of range
<set> | Specify the register and the new value
  name | Name of the register
  val | Integer value being set
  description | (Optional) Description of what is set
A <tracked_set> tag informs the decompiler that a register takes a specific value for any function whose entry point is in the indicated range. Compilers sometimes know or assume that registers have specific values coming into the functions they produce. This tag allows the decompiler to make the same assumption and possibly use constant propagation to make further simplifications.
Example 4.
<context_data>
  <tracked_set space="ram">
    <set name="spsr" val="0"/>
  </tracked_set>
</context_data>
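Because a <context_data> tag may hold any number of both subtags, the two kinds of setting can sit side by side. The fragment below is a sketch only: it reuses mode16 and spsr from Examples 3 and 4, and the offsets are hypothetical.

```xml
<context_data>
  <!-- Context variable: affects disassembly over the whole ram space -->
  <context_set space="ram">
    <set name="mode16" val="1"/>
  </context_set>
  <!-- Tracked register: assumed value for functions whose entry point lies in the
       hypothetical range 0x400000 through 0x40ffff -->
  <tracked_set space="ram" first="0x400000" last="0x40ffff">
    <set name="spsr" val="0" description="spsr assumed zero on entry"/>
  </tracked_set>
</context_data>
```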
Attributes and Children
name | The identifier for this callfixup
<target> | (0 or more) Map this callfixup to a specific symbol
  name | The specific symbol name
<pcode> | Description of p-code to inject.
Attributes and Children
paramshift | (Optional) Integer for shifting parameters at the callpoint.
<body> | P-code to inject.
  text |
Compilers frequently make use of special bookkeeping functions that are really internal to the compiler and not a direct reflection of functions in the original source code. During analysis it can be helpful to replace a call to such a function with a snippet of p-code that inlines the behavior, or a portion of the behavior, so that the decompiler can use it during its simplification rather than displaying it as an opaque call. A typical use is to inline prologue functions that help set up a stack frame.
The name attribute can be used to identify the callfixup within the Ghidra CodeBrowser and manually force certain functions to be replaced. The name attribute of the <callfixup> tag and any optional <target> subtags identify function names which will automatically be replaced. The text of the <body> subtag is fed directly to the SLEIGH semantic expression parser to create the p-code snippet. Identifiers are interpreted as formal registers, if the register exists, but are otherwise interpreted as temporary registers in the unique space of the processor. It's usually best to surround the text with the XML <![CDATA[ construct.
Example 5.
<callfixup name="get_pc_thunk_bx">
  <target name="__i686.get_pc_thunk.bx"/>
  <pcode>
    <body><![CDATA[
      EBX = * ESP;
      ESP = ESP + 4;
    ]]></body>
  </pcode>
</callfixup>
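Since <target> may appear more than once, a single fixup can cover several symbol names. The sketch below extends Example 5 with a second target; the symbol __x86.get_pc_thunk.bx is an assumption added for illustration and does not come from the text above.

```xml
<callfixup name="get_pc_thunk_bx">
  <!-- Each target names one symbol that is replaced automatically -->
  <target name="__i686.get_pc_thunk.bx"/>
  <!-- Assumed alternate spelling of the same thunk, for illustration only -->
  <target name="__x86.get_pc_thunk.bx"/>
  <pcode>
    <body><![CDATA[
      EBX = * ESP;
      ESP = ESP + 4;
    ]]></body>
  </pcode>
</callfixup>
```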
Attributes and Children
targetop | Name of the CALLOTHER operator to inject.
<pcode> | Description of p-code to inject.
Attributes and Children
<input> | (0 or more) Description of formal input parameter.
  name | Name of the specific input symbol.
  size | Expected size of the parameter in bytes.
<output> | (0 or more) Description of formal output parameter.
  name | Name of the specific output symbol.
  size | Expected size of output in bytes.
<body> | P-code to inject.
  text |
The <callotherfixup> tag is similar to a <callfixup> tag but is used to describe injections that replace user-defined p-code operations, rather than CALL operations. User-defined p-code operations, referred to generically as CALLOTHER operations, are black-box operations that a SLEIGH specification can define to encapsulate complicated (or esoteric) actions performed by the processor. The specification must define a unique name for each such operation. The targetop attribute links the p-code described here to the specific operation via this name.
As with any p-code operation, the CALLOTHER takes formal varnodes as inputs and/or outputs. These varnodes can be referred to in the injection <body> by predefining them using <input> or <output> tags. The sequence of <input> tags corresponds in order to the input parameters of the CALLOTHER, and an <output> tag corresponds to the output varnode, if present. The tags listed here must match the number of input and output parameters in the actual p-code operation, or an exception will be thrown during p-code generation. The optional size attribute in each tag will, if present, impose a size restriction on the parameter as well.
As with a <callfixup>, the <body> tag is fed straight to the SLEIGH semantic parser. It can refer to registers via their symbolic names defined in SLEIGH, it can refer to the operator parameters via their <input> or <output> names, and it can also refer to inst_start, inst_next, and inst_next2 as addresses describing the instruction containing the CALLOTHER.
Example 6.
<callotherfixup targetop="saturate">
  <pcode>
    <input name="in1" size="4"/>
    <input name="in2" size="4"/>
    <body><![CDATA[
      in1 = in1 + in2;
      if (in1 < 0x10000) goto <end>;
      in1 = 0xffff;
      <end>
    ]]></body>
  </pcode>
</callotherfixup>
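Example 6 declares only inputs. A variant with an <output> tag might look like the sketch below. The operator name addsat and the parameter names a, b, and res are hypothetical, and the fragment assumes the SLEIGH specification defines addsat with exactly two inputs and one output, since the declared tags must match the actual operation.

```xml
<callotherfixup targetop="addsat">
  <pcode>
    <!-- Inputs are matched to the CALLOTHER parameters in order -->
    <input name="a" size="4"/>
    <input name="b" size="4"/>
    <!-- The output tag names the varnode that receives the result -->
    <output name="res" size="4"/>
    <body><![CDATA[
      res = a + b;
      if (res < 0x10000) goto <end>;
      res = 0xffff;
      <end>
    ]]></body>
  </pcode>
</callotherfixup>
```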
Attributes and Children
style | Strategy for splitting: inhalf
<register> or <varnode> | (1 or more) varnode tags
This tag is designed to mark specific registers as packed, containing multiple logical values that need to be split. The decompiler attempts to split up any operator that reads or writes the register into multiple p-code operations that operate on each logical value individually.
The tag lists one or more varnode tags describing the registers or other storage locations that need to be split. The style attribute indicates how the storage locations should be split. Currently the only accepted style value is "inhalf", which means that each varnode should be split into two equal pieces.
Splitting a varnode is only possible if all the p-code operations it is involved in keep their action separate across the logical pieces. If they do not, the p-code will not be altered for that particular varnode.
Example 7.
<prefersplit style="inhalf">
  <register name="xr0"/>
  <register name="xr1"/>
  <register name="xr2"/>
</prefersplit>
This tag tells the decompiler that p-code extension operations are likely to be a side-effect of the processor and are obscuring what is just the manipulation of the smaller logical value. The decompiler normally trims extensions and other operations where it can prove that the most significant bytes of the result are unused. This tag lets the decompiler be more aggressive when use of the extended bytes is more indeterminate. It can assume that extensions into sub-function parameters and into the return value are extraneous.
The signext attribute turns the behavior on specifically for the sign-extension operation. Currently there is no toggle for zero-extensions.
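A minimal sketch of turning the behavior on. Only the signext attribute comes from this section; the element name aggressivetrim and the value "true" are assumptions based on Ghidra's compiler specification format and should be checked against the schema in use.

```xml
<!-- Element name assumed; this section names only the signext attribute -->
<aggressivetrim signext="true"/>
```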