Class ExecutableComparison

java.lang.Object
ghidra.features.bsim.query.client.ExecutableComparison

public class ExecutableComparison extends Object
Compare an entire set of executables to each other by combining significance scores between functions. If an individual function is similar to several others, its score contributions are not overcounted, and the final scores are symmetric. Scoring is efficient because it iterates over the precomputed clusters of similar functions in a BSim database. This divide-and-conquer approach is far cheaper than a full quadratic comparison of all functions. The work can be bounded further with two thresholds: one on how close functions must be to be considered part of the same cluster, and one on how many functions a cluster may contain before its score contributions are ignored.
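The idea of scoring by clusters can be illustrated with a small self-contained sketch. This is a toy model, not the BSim implementation: the `ClusterScoringSketch` and `Hit` classes, the rule that each executable contributes only its best significance per cluster, and the use of the smaller of two significances as the pair contribution are all assumptions made for illustration. It also assumes that oversized clusters still count toward the maximum hit count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy cluster-based scorer; every name and scoring rule here is hypothetical. */
final class ClusterScoringSketch {

    /** One function in a cluster: the executable it belongs to and its significance. */
    static final class Hit {
        final String exeMd5;
        final double significance;

        Hit(String exeMd5, double significance) {
            this.exeMd5 = exeMd5;
            this.significance = significance;
        }
    }

    private final int hitCountThreshold;
    private final Map<String, Double> scores = new HashMap<>();
    private int maxHitCount = 0;
    private int exceedCount = 0;

    ClusterScoringSketch(int hitCountThreshold) {
        this.hitCountThreshold = hitCountThreshold;
    }

    /** Order-independent key, so score(a, b) and score(b, a) share one entry. */
    private static String key(String a, String b) {
        return a.compareTo(b) <= 0 ? a + "|" + b : b + "|" + a;
    }

    /** Add one cluster's contribution, or skip it if it is too large. */
    void scoreCluster(List<Hit> cluster) {
        maxHitCount = Math.max(maxHitCount, cluster.size());
        if (cluster.size() > hitCountThreshold) {
            exceedCount++; // oversized cluster: ignore its contributions
            return;
        }
        // Keep only the best significance per executable (assumed rule),
        // so several matching functions in one executable count once.
        Map<String, Double> best = new HashMap<>();
        for (Hit h : cluster) {
            best.merge(h.exeMd5, h.significance, Math::max);
        }
        List<String> exes = new ArrayList<>(best.keySet());
        for (int i = 0; i < exes.size(); i++) {
            for (int j = i + 1; j < exes.size(); j++) {
                String a = exes.get(i);
                String b = exes.get(j);
                double contribution = Math.min(best.get(a), best.get(b));
                scores.merge(key(a, b), contribution, Double::sum);
            }
        }
    }

    double score(String a, String b) {
        return scores.getOrDefault(key(a, b), 0.0);
    }

    int getMaxHitCount() {
        return maxHitCount;
    }

    int getExceedCount() {
        return exceedCount;
    }

    public static void main(String[] args) {
        ClusterScoringSketch sketch = new ClusterScoringSketch(3);
        sketch.scoreCluster(Arrays.asList(
            new Hit("A", 2.0), new Hit("A", 5.0), new Hit("B", 4.0)));
        System.out.println("score(A,B) = " + sketch.score("A", "B"));
        System.out.println("score(B,A) = " + sketch.score("B", "A"));
    }
}
```

The quadratic loop runs only inside each cluster, so its cost depends on cluster size rather than on the total number of functions. That is why a cap on cluster size bounds the work.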
  • Constructor Details

    • ExecutableComparison

      public ExecutableComparison(FunctionDatabase database, int hitCountThreshold, TaskMonitor monitor) throws LSHException
      Initialize a comparison object with an active connection and a hit-count threshold, using the matrix scorer.
      Parameters:
      database - is the active connection to a BSim database
      hitCountThreshold - is the maximum number of functions to consider in one cluster
      monitor - is a monitor to provide progress and cancellation checks
      Throws:
      LSHException - if the database connection is not established
    • ExecutableComparison

      public ExecutableComparison(FunctionDatabase database, int hitCountThreshold, String md5, ScoreCaching cache, TaskMonitor monitor) throws LSHException
      Initialize a comparison object with an active connection and a hit-count threshold, using the row scorer.
      Parameters:
      database - is the active connection to a BSim database
      hitCountThreshold - is the maximum number of functions to consider in one cluster
      md5 - is the 32-character md5 string of the executable to single out for comparison
      cache - holds the self-scores or is null if normalized scores aren't needed
      monitor - is a monitor to provide progress and cancellation checks
      Throws:
      LSHException - if the database connection is not established
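Because the `md5` parameter is a 32-character string, it can help to check or compute that string before passing it in. The following is a minimal sketch using only the standard library. The `Md5HexSketch` class and its method names are hypothetical helpers, not part of Ghidra, and the lowercase hexadecimal output is an assumption about the expected format.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for producing and checking 32-character MD5 strings. */
final class Md5HexSketch {

    /** Return the MD5 digest of the given bytes as 32 lowercase hex characters. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(data); // always 16 bytes for MD5
            StringBuilder sb = new StringBuilder(32);
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Cheap shape check: exactly 32 hexadecimal characters. */
    static boolean looksLikeMd5(String s) {
        return s != null && s.matches("[0-9a-fA-F]{32}");
    }
}
```

A shape check like `looksLikeMd5` only rejects malformed strings. It does not tell you whether the executable is present in the database.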
  • Method Details

    • getMaxHitCount

      public int getMaxHitCount()
      Returns:
      maximum hit count seen for a cluster
    • getExceedCount

      public int getExceedCount()
      Returns:
      number of clusters that exceeded hitCountThreshold
    • isConfigured

      public boolean isConfigured()
      Returns:
      true if similarity and significance thresholds have been set
    • getScorer

      public ExecutableScorer getScorer()
      Returns:
      the ExecutableScorer to allow examination of scores
    • addExecutable

      public void addExecutable(String md5) throws LSHException
      Register an executable to be scored
      Parameters:
      md5 - is the MD5 string of the executable
      Throws:
      LSHException - if the executable is not in the database
    • addAllExecutables

      public void addAllExecutables(int limit) throws LSHException
      Add all executables currently in the database to this object for comparison.
      Parameters:
      limit - is the maximum number of executables to compare against; the limit applies only if greater than zero
      Throws:
      LSHException - for problems retrieving ExecutableRecords from the database
    • performScoring

      public void performScoring() throws LSHException, CancelledException
      Perform scoring between all registered executables.
      Throws:
      LSHException - for any connection issues during the process
      CancelledException - if the monitor reports cancellation
    • resetThresholds

      public void resetThresholds(double simThreshold, double sigThreshold) throws LSHException
      Remove any old scores and set new thresholds for the scorer
      Parameters:
      simThreshold - is the similarity threshold for new scores
      sigThreshold - is the significance threshold for new scores
      Throws:
      LSHException - if there are problems saving new thresholds
    • fillinSelfScores

      public void fillinSelfScores() throws LSHException, CancelledException
      Generate any missing self-scores within the list of registered executables.
      Throws:
      LSHException - for problems retrieving vectors
      CancelledException - if the monitor reports cancellation
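The methods above suggest a typical call order: set thresholds, register executables, then score. The sketch below shows that order against a stand-in, so it runs without a BSim database. The `ComparisonSteps` interface, `RecordingComparison`, and `WorkflowSketch` are hypothetical. They mirror a few documented method names but are not the Ghidra classes, and they drop the checked exceptions. The call order and the rule that scoring fails when thresholds are unset are inferred from the descriptions above, not confirmed behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in mirroring a few documented method names; NOT the Ghidra class. */
interface ComparisonSteps {
    boolean isConfigured();
    void resetThresholds(double simThreshold, double sigThreshold);
    void addExecutable(String md5);
    void performScoring();
}

/** Recording fake so the call order can be run and inspected offline. */
final class RecordingComparison implements ComparisonSteps {
    final List<String> calls = new ArrayList<>();
    private boolean configured = false;

    @Override
    public boolean isConfigured() {
        return configured;
    }

    @Override
    public void resetThresholds(double simThreshold, double sigThreshold) {
        configured = true;
        calls.add("resetThresholds");
    }

    @Override
    public void addExecutable(String md5) {
        calls.add("addExecutable:" + md5);
    }

    @Override
    public void performScoring() {
        // Assumed behavior: scoring without thresholds is treated as an error.
        if (!configured) {
            throw new IllegalStateException("thresholds have not been set");
        }
        calls.add("performScoring");
    }
}

final class WorkflowSketch {
    /** Configure if needed, register each executable, then score. */
    static void compare(ComparisonSteps cmp, List<String> md5s,
            double simThreshold, double sigThreshold) {
        if (!cmp.isConfigured()) {
            cmp.resetThresholds(simThreshold, sigThreshold);
        }
        for (String md5 : md5s) {
            cmp.addExecutable(md5);
        }
        cmp.performScoring();
    }
}
```

With the real class, each of these calls can throw LSHException, and performScoring can also throw CancelledException, so a caller would need to handle both.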