Taint-tracking is the ability to capture the flow of data through a program by following transfers from one variable to another. Often, this involves specifying where the data originates (a source) and which endpoints are of interest (sinks). In Ghidra, taint-tracking leverages external engines that rely on the SSA nature of Ghidra's PCode to describe varnode-to-varnode flows.
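As a rough illustration of the idea, taint-tracking can be modeled as reachability over a graph of variable-to-variable flows. The sketch below is not how CTADL or any Ghidra engine is implemented. It assumes a hand-written edge list and hypothetical variable names (`argv_1`, `buf`, `memcpy_src`, and so on); a real engine derives these flows from PCode.

```python
from collections import deque

def tainted_sinks(flows, sources, sinks):
    """Return the sinks reachable from any source by following flow edges."""
    # Build an adjacency map: variable -> variables it flows into.
    successors = {}
    for src, dst in flows:
        successors.setdefault(src, []).append(dst)

    # Breadth-first search starting from every source at once.
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen & set(sinks))

# Hypothetical flows: each pair (a, b) means "data moves from a to b".
flows = [
    ("argv_1", "buf"),
    ("buf", "len"),
    ("buf", "memcpy_src"),
    ("const_16", "memcpy_len"),
]
result = tainted_sinks(flows, sources=["argv_1"], sinks=["memcpy_src", "memcpy_len"])
print(result)  # ['memcpy_src']
```

Only `memcpy_src` is reported, because `memcpy_len` is fed by a constant that no source reaches.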
Taint-tracking is a four-step process. First, the PCode underlying the target program is exported to a directory for subsequent indexing. Second, the exported PCode is indexed to create an index database. These two steps are one-time actions and need only be repeated when you change programs or modify the target program's PCode in a substantive way. Given a database, any number of queries may be processed against the index. Third, a query is made, which generally involves marking sources and sinks and then executing the query. Last, the results of the query may be selectively applied as markup on the decompilation/disassembly.
The first two steps are activated in the menus under Tools → Source-Sink. Their options are accessible via Edit → Tool Options under Options/Decompiler/Taint. The third step is controlled by the pop-up menus (or associated keyboard actions) and the toolbar items within the Decompiler window. Pop-up menus on the results table control how the results are applied.
Deletes any pre-existing data (facts and database) from the directories specified under Taint Options.
Runs an engine-specific Ghidra script to export the PCode for the current program as a set of ASCII fact files to be consumed by the engine. (For CTADL, our default engine, this script is ExportPCodeForCTADL.java.)
Converts the directory of PCode facts into an indexed database for future queries.
Updates the existing fact set for the current function, which may be useful if the decompilation has been improved during the course of analysis. (Does require re-indexing the database.)
These options govern the locations for various elements of the taint engine.
The following actions appear in the Taint sub-menu of the Decompiler's context menu, accessed by right-clicking a token. The pop-up menu is context sensitive and the type of token in particular determines what actions are available. The token clicked provides a local context for the action and may be used to pinpoint the exact variable or operation affected.
These actions apply after the sources and sinks have been chosen.
Uses the defined sources, sinks, and gates to compose and execute a query. Input may include parameters, stack variables, variables associated with registers, or "dynamic" variables. Queries require an index database generated from PCode.
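To make the roles of sources, sinks, and gates concrete, here is a minimal sketch of such a query over a toy flow graph. It assumes a gate is a point that taint does not pass through. The function `find_taint_paths` and all variable names are hypothetical, and the engine's actual query language and semantics may differ.

```python
from collections import deque

def find_taint_paths(flows, sources, sinks, gates=()):
    """Return {sink: path} for each sink reachable from a source without crossing a gate."""
    successors = {}
    for src, dst in flows:
        successors.setdefault(src, []).append(dst)

    blocked = set(gates)
    # parent maps each visited node to its predecessor (None for a source).
    parent = {s: None for s in sources if s not in blocked}
    queue = deque(parent)
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if nxt in blocked or nxt in parent:
                continue  # taint stops at gates; visited nodes are skipped
            parent[nxt] = node
            queue.append(nxt)

    # Walk the parent links backwards to rebuild one path per tainted sink.
    paths = {}
    for sink in sinks:
        if sink in parent:
            path, node = [], sink
            while node is not None:
                path.append(node)
                node = parent[node]
            paths[sink] = path[::-1]
    return paths

# Hypothetical flows for a single function.
flows = [
    ("param_1", "local_a"),
    ("local_a", "checked"),
    ("checked", "call_arg_0"),
    ("local_a", "call_arg_1"),
]
sinks = ["call_arg_0", "call_arg_1"]
no_gate = find_taint_paths(flows, ["param_1"], sinks)
with_gate = find_taint_paths(flows, ["param_1"], sinks, gates=["checked"])
print(no_gate)
print(with_gate)
```

Without a gate both sinks are tainted. Marking `checked` as a gate removes the path to `call_arg_0` while leaving the direct path to `call_arg_1` intact.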
Uses predefined sources and sinks to execute the engine's default query. (Ignores the sources and sinks specified by the user and tries to apply whatever the engine considers the de facto set of sources and sinks, which may be undefined for a given target.)
Executes the query referenced in the tool options without rebuilding it from the current sources, sinks, etc. Unedited, this re-executes the last query, but the query file can be modified by hand to reflect any query you are interested in.
These actions appear in the context menu of the Query Results table and transfer the selected results to the decompiler/disassembly.
Applies SARIF results to the current program generically, based on the current set of handlers.
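For readers unfamiliar with SARIF, the sketch below shows the general shape of a SARIF 2.1.0 log and how its results can be flattened into rows. The sample document, the rule id `taint-path`, the tool name, and the address are all made up for illustration. Which fields a given taint engine actually emits, and how Ghidra's handlers interpret them, is not specified here.

```python
import json

# A hand-written, minimal SARIF 2.1.0 log (illustrative values only).
SARIF_TEXT = """
{
  "version": "2.1.0",
  "runs": [
    {
      "tool": {"driver": {"name": "example-engine"}},
      "results": [
        {
          "ruleId": "taint-path",
          "message": {"text": "source reaches sink"},
          "locations": [
            {"physicalLocation": {"address": {"absoluteAddress": 4198400}}}
          ]
        }
      ]
    }
  ]
}
"""

def summarize_results(sarif_text):
    """Flatten a SARIF log into (rule id, message, absolute address) rows."""
    log = json.loads(sarif_text)
    rows = []
    for run in log.get("runs", []):
        for result in run.get("results", []):
            message = result.get("message", {}).get("text", "")
            for location in result.get("locations", []):
                address = (
                    location.get("physicalLocation", {})
                    .get("address", {})
                    .get("absoluteAddress")
                )
                rows.append((result.get("ruleId"), message, address))
    return rows

rows = summarize_results(SARIF_TEXT)
for rule_id, message, address in rows:
    print(rule_id, message, hex(address))
```

Each row corresponds to one location of one result, which is roughly the granularity at which markup would be applied to a listing.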