Class ExecutableComparison
java.lang.Object
ghidra.features.bsim.query.client.ExecutableComparison
Compare an entire set of executables to each other by combining
significance scores between functions. If an individual function
is similar to multiple other functions, its score contributions are not
overcounted, and the final scores are symmetric. Scoring is efficient
because it iterates over the precomputed clusters of similar functions
in a BSim database. The algorithm uses divide and conquer based on
these clusters, which greatly improves efficiency over a
full quadratic comparison of all functions. The work can be further bounded
by putting a threshold on how close functions have to be to be considered
in the same cluster and on how many functions can be in a cluster before
their score contributions are ignored.
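The cluster-driven scoring described above can be illustrated with a small self-contained sketch. This is a toy model and not the BSim implementation: the `ClusterScoringSketch` and `Cluster` classes, the single significance value per cluster, and the rule that each pair of executables receives a cluster's significance at most once are all assumptions made for illustration. It also assumes that oversized clusters still count toward the maximum hit count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Toy model of cluster-driven executable scoring (hypothetical, not BSim code). */
class ClusterScoringSketch {

    /** A cluster of similar functions: one significance, one owner per function. */
    static final class Cluster {
        final double significance;
        final List<String> owners; // executable that owns each function in the cluster

        Cluster(double significance, List<String> owners) {
            this.significance = significance;
            this.owners = owners;
        }
    }

    private final Map<String, Double> scores = new HashMap<>();
    private int maxHitCount = 0; // largest cluster size seen
    private int exceedCount = 0; // clusters skipped for exceeding the threshold

    void scoreClusters(List<Cluster> clusters, int hitCountThreshold) {
        for (Cluster c : clusters) {
            int hits = c.owners.size();
            maxHitCount = Math.max(maxHitCount, hits);
            if (hits > hitCountThreshold) {
                exceedCount++; // too large: ignore its score contributions
                continue;
            }
            // Distinct, sorted executables: each pair is credited once per cluster,
            // no matter how many of its functions fall in the cluster.
            List<String> exes = new ArrayList<>(new TreeSet<>(c.owners));
            for (int i = 0; i < exes.size(); i++) {
                for (int j = i + 1; j < exes.size(); j++) {
                    scores.merge(exes.get(i) + "|" + exes.get(j), c.significance, Double::sum);
                }
            }
        }
    }

    /** Symmetric lookup: score(a, b) equals score(b, a). */
    double score(String a, String b) {
        String key = a.compareTo(b) <= 0 ? a + "|" + b : b + "|" + a;
        return scores.getOrDefault(key, 0.0);
    }

    int getMaxHitCount() {
        return maxHitCount;
    }

    int getExceedCount() {
        return exceedCount;
    }
}
```

The work done per cluster is quadratic only in the number of distinct executables in that cluster, which is the point of iterating over clusters rather than over all function pairs.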
-
Nested Class Summary
Nested Classes
Modifier and Type: static class
Description: Mutable integer class for histogram
-
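A mutable integer holder is a common way to build a histogram in a map without replacing a boxed `Integer` on every increment. The sketch below shows the general pattern only. The `Counter` class name and the choice of histogrammed values are hypothetical, since this page does not show the nested class's name or what it counts.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of a histogram built on a mutable counter (hypothetical names). */
class HistogramSketch {

    /** Hypothetical mutable integer holder, incremented in place. */
    static final class Counter {
        int value;
    }

    /** Count how many times each sample value occurs. */
    static Map<Integer, Counter> histogram(int[] samples) {
        Map<Integer, Counter> bins = new HashMap<>();
        for (int sample : samples) {
            // Create the bin on first sight, then bump the counter in place.
            bins.computeIfAbsent(sample, k -> new Counter()).value++;
        }
        return bins;
    }
}
```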
Constructor Summary
Constructors
ExecutableComparison(FunctionDatabase database, int hitCountThreshold, TaskMonitor monitor)
- Initialize a comparison object with an active connection and thresholds, using the matrix scorer
ExecutableComparison(FunctionDatabase database, int hitCountThreshold, String md5, ScoreCaching cache, TaskMonitor monitor)
- Initialize a comparison object with an active connection and thresholds, using the row scorer
-
Method Summary
void addAllExecutables(int limit)
- Add all executables currently in the database to this object for comparison.
void addExecutable(String md5)
- Register an executable to be scored
void fillinSelfScores()
- Generate any missing self-scores within the list of registered executables.
int getExceedCount()
int getMaxHitCount()
ExecutableScorer getScorer()
boolean isConfigured()
void performScoring()
- Perform scoring between all registered executables.
void resetThresholds(double simThreshold, double sigThreshold)
- Remove any old scores and set new thresholds for the scorer
-
Constructor Details
-
ExecutableComparison
public ExecutableComparison(FunctionDatabase database, int hitCountThreshold, TaskMonitor monitor) throws LSHException
Initialize a comparison object with an active connection and thresholds, using the matrix scorer
Parameters:
database
- is the active connection to a BSim database
hitCountThreshold
- is the maximum number of functions to consider in one cluster
monitor
- is a monitor to provide progress and cancellation checks
Throws:
LSHException
- if the database connection is not established
-
ExecutableComparison
public ExecutableComparison(FunctionDatabase database, int hitCountThreshold, String md5, ScoreCaching cache, TaskMonitor monitor) throws LSHException
Initialize a comparison object with an active connection and thresholds, using the row scorer
Parameters:
database
- is the active connection to a BSim database
hitCountThreshold
- is the maximum number of functions to consider in one cluster
md5
- is the 32-character MD5 string of the executable to single out for comparison
cache
- holds the self-scores or is null if normalized scores aren't needed
monitor
- is a monitor to provide progress and cancellation checks
Throws:
LSHException
- if the database connection is not established
-
-
Method Details
-
getMaxHitCount
public int getMaxHitCount()
Returns:
- maximum hit count seen for a cluster
-
getExceedCount
public int getExceedCount()
Returns:
- number of clusters that exceeded hitCountThreshold
-
isConfigured
public boolean isConfigured()
Returns:
- true if similarity and significance thresholds have been set
-
getScorer
public ExecutableScorer getScorer()
Returns:
- the ExecutableScorer to allow examination of scores
-
addExecutable
public void addExecutable(String md5) throws LSHException
Register an executable to be scored
Parameters:
md5
- is the MD5 string of the executable
Throws:
LSHException
- if the executable is not in the database
-
addAllExecutables
public void addAllExecutables(int limit) throws LSHException
Add all executables currently in the database to this object for comparison.
Parameters:
limit
- is the max number of executables to compare against (if greater than zero)
Throws:
LSHException
- for problems retrieving ExecutableRecords from the database
-
performScoring
public void performScoring() throws LSHException, CancelledException
Perform scoring between all registered executables.
Throws:
LSHException
- for any connection issues during the process
CancelledException
- if the monitor reports cancellation
-
resetThresholds
public void resetThresholds(double simThreshold, double sigThreshold) throws LSHException
Remove any old scores and set new thresholds for the scorer
Parameters:
simThreshold
- is the similarity threshold for new scores
sigThreshold
- is the significance threshold for new scores
Throws:
LSHException
- if there are problems saving new thresholds
-
fillinSelfScores
public void fillinSelfScores() throws LSHException, CancelledException
Generate any missing self-scores within the list of registered executables.
Throws:
LSHException
- for problems retrieving vectors
CancelledException
- if the user clicks "cancel"
-