Importer

Introduction

Ghidra can import a variety of different types of files into a Ghidra project as Ghidra "programs." There are separate actions for importing single files, importing multiple files, and importing a file into an existing program. The actions for importing single or multiple files into a new program are available in both the Front-end Project Window and the CodeBrowser tool. The action for adding to an existing program is only available from the CodeBrowser tool, and only if there is a currently open program in the tool.

Supported Formats

File Import Actions

These actions can be used to import one or more files into a Ghidra project. They can be accessed via the File menu in either the Front-end Project Window or the CodeBrowser Tool unless otherwise specified.

Import File

This action is used to import a single file into Ghidra. If the file is an archive consisting of multiple programs, this action will bring up the Batch Importer Dialog. Otherwise, it will use the standard single file Importer Dialog to complete the import.

Steps:

Alternative Steps (drag-and-drop):

Batch Import

This action is used to import multiple files by selecting a root directory and letting Ghidra recursively find programs to import.

Steps:

Open File System

This action is used to open the File System Browser, which can be used to view the contents of container files (tar, zip, etc.) and import files from within those containers.

Steps:

Add to Program

This action is used to import data from a file into an existing program. The program must be open in the tool to perform this action.

Steps:

Other Import Actions

Import Selection

This action is used to import a selection from an open program in the CodeBrowser tool.

Steps:

Importer Dialog

When the user initiates a single file import, the Importer Dialog is used to configure the import for that file.



Dialog Fields

If this dialog appears as a result of the Add To Program action, then the Language, Destination Folder, and Filename fields will be disabled since these values are already determined by the existing program.



Options

The import options differ depending on the selected format.

Common Options

These options appear in many of the standard executable program formats such as ELF, PE, etc.

Apply Processor Defined Labels

If this option is on, the importer will create processor labels at specific addresses as defined by the processor specification. This is usually used to label things like the reset vector or interrupt vector.

Anchor Processor Defined Labels

If this option is on, labels created from the processor specification are anchored. This means that if the image base is changed or a memory block is moved, those symbols will remain at the address where they were originally placed. If the option is off, the symbols will move with the image base or the memory block.
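The difference between anchored and non-anchored labels can be illustrated with a small model. This is only a sketch of the behavior described above, not Ghidra code: the Label class, the rebase function, and the label names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Label:
    name: str
    address: int
    anchored: bool


def rebase(labels, old_base, new_base):
    """Return the labels as they would sit after the image base changes."""
    delta = new_base - old_base
    return [
        # Anchored labels keep their address; the others shift with the base.
        Label(l.name, l.address if l.anchored else l.address + delta, l.anchored)
        for l in labels
    ]


# Hypothetical labels: an anchored reset vector and an ordinary symbol.
labels = [Label("RESET", 0x0000, True), Label("main", 0x1040, False)]
moved = rebase(labels, old_base=0x1000, new_base=0x2000)
```

In this model, RESET stays at 0x0000 while main moves from 0x1040 to 0x2040.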

Link Existing Project Libraries

Searches the project for existing library programs and creates external references to them.

Project Library Search Folder

The project folder that will get searched for existing library programs. If left empty, the folder that the main program is being imported to will be searched.

Load Local Libraries From Disk

Searches the executable's directory to recursively resolve the external libraries used by the executable. The entire library dependency tree will be traversed in a depth-first manner and a program will be created for each found library (if it doesn't exist already). The external references in these programs will be resolved.

Load System Libraries From Disk

Searches a user-defined path list to recursively resolve the external libraries used by the executable. The entire library dependency tree will be traversed in a depth-first manner and a program will be created for each found library (if it doesn't exist already). The external references in these programs will be resolved.
The "Edit Paths" button will bring up the Library Paths Dialog.

Recursive Library Load Depth

Specifies how many levels deep the depth-first library dependency tree will be traversed when loading local or system libraries.
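A depth-limited, depth-first traversal of this kind can be sketched as follows. This is an illustrative model rather than Ghidra's loader code: the load_libraries function and the dependency map are hypothetical, and the sketch assumes that a depth of 1 means "direct dependencies only."

```python
def load_libraries(program, deps, max_depth):
    """Depth-first walk of a dependency map, stopping after max_depth levels."""
    loaded = []

    def visit(name, depth):
        if depth > max_depth:
            return
        for lib in deps.get(name, []):
            if lib in loaded:
                continue  # a program already exists for this library
            loaded.append(lib)
            visit(lib, depth + 1)  # descend before moving to the next sibling

    visit(program, 1)
    return loaded


# Hypothetical dependency tree: app -> libA -> libC -> libD, and app -> libB.
deps = {"app": ["libA", "libB"], "libA": ["libC"], "libC": ["libD"]}
shallow = load_libraries("app", deps, 1)
deeper = load_libraries("app", deps, 2)
```

Under these assumptions, a depth of 1 loads only libA and libB, while a depth of 2 also picks up libC before returning to libB.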

Library Destination Folder

The project folder where newly loaded library programs will get created. If left empty, they will get created in the same folder as the main program being imported.

COFF Options

COFF format has all the Common Options, plus:

Attempt to link sections located at 0x0

If selected, sections located at 0x0 will be relocated sequentially in memory. This avoids conflicts and keeps sections from being ignored.

ELF Options

ELF format has all the Common Options, plus:

Perform Symbol Relocations

If selected, Ghidra will attempt to apply the relocations specified in the ELF header.

Image Base

Specifies the image base to use for importing the memory sections.

Import Non-loaded Data

If selected, Ghidra will import ELF sections that don't get loaded into memory when the program is run. These sections will be stored in a special address space called "other".

Max Zero-Segment Discard Size

When both section headers and program headers are present, this option controls the maximum size, in bytes, of a zero-filled, non-section-based memory block that will be discarded. This is intended to allow section-alignment load sequences to be ignored and discarded. A value of "0" will disable all such discards. The default value is 255 bytes.
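The rule can be written as a simple predicate. This is a sketch of the behavior described above, not the ELF loader's actual logic: the function and parameter names are hypothetical, and treating the limit as inclusive is an assumption.

```python
def should_discard(block_size, is_zero_filled, is_section_based, max_discard=255):
    """Decide whether a memory block would be dropped under the discard rule."""
    if max_discard == 0:
        return False  # a value of 0 disables all such discards
    # Assumption: a block exactly at the limit is still discarded.
    return is_zero_filled and not is_section_based and block_size <= max_discard


padding = should_discard(16, is_zero_filled=True, is_section_based=False)
too_large = should_discard(256, is_zero_filled=True, is_section_based=False)
```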

Intel Hex Options

Base Address

This field is used to specify the start address in memory for where to load the bytes.

Overlay

If selected, the bytes will be loaded as an initialized overlay block. A new overlay space will be created with the same name as the Block Name.

Block Name

This field is used to specify the name of the memory block that will contain the newly imported bytes.

Mach-O Options

The Mac OS X Mach-O format has only the Common Options.

Motorola Hex Options

Base Address

This field is used to specify the start address in memory for where to load the bytes.

Overlay

If selected, the bytes will be loaded as an overlay. A new overlay space will be created with the same name as the Block Name.

Block Name

This field is used to specify the name of the memory block that will contain the newly imported bytes.

MZ Options

The MZ format has only the Common Options.

NE Options

The NE format has all the Common Options, plus:

Perform Library Ordinal Lookup

Looks up and applies pre-generated exported symbol ordinal name mappings and stack purge information. This information is stored in symbol files located in <GHIDRA_INSTALL_DIR>/Ghidra/Features/Base/data/symbols/<OS>.

If there is no pre-generated information for a given library but the ordinal name mappings and/or stack purge information is extracted during the library load/analysis process, the information will be cached locally to the user's .ghidra/ directory to speed up future imports.

PE Options

The PE format has all the Common Options, plus:

Perform Library Ordinal Lookup

Looks up and applies pre-generated exported symbol ordinal name mappings and stack purge information. This information is stored in symbol files located in <GHIDRA_INSTALL_DIR>/Ghidra/Features/Base/data/symbols/<OS>.

If there is no pre-generated information for a given library but the ordinal name mappings and/or stack purge information is extracted during the library load/analysis process, the information will be cached locally to the user's .ghidra/ directory to speed up future imports.

When running Ghidra with symbol files created from an older operating system, you may receive the following warning message:

Unable to locate [symbol_name] in [<filepath>.exports]. Please verify the version is correct.

This warning message indicates which symbols do not exist in the corresponding .exports file. The only information lost by not including these symbols is function purge and comments. If you require this information, manually delete the .exports file and Ghidra will regenerate it.

Parse CLI headers (if present)

If selected, any CLI headers present will be processed.

Raw Binary Options

Block Name

The name of the memory block that will contain the raw bytes from the file. By default, it will be the name of the default address space (usually "ram").

Base Address

This field is the address offset for the block of bytes to be imported. By default, this will be 0.

File Offset

This field is the byte offset into the imported file from which to start importing raw bytes. By default, this will be 0.

Length

This field is the number of bytes to import. By default, this will be set to the total number of bytes in the imported file.

Apply Processor Defined Labels

If this option is on, the importer will create processor labels at specific addresses as defined by the processor specification. This is usually used to label things like the reset vector or interrupt vector.

Anchor Processor Defined Labels

If this option is on, labels created from the processor specification are anchored. This means that if the image base is changed or a memory block is moved, those symbols will remain at the address where they were originally placed. If the option is off, the symbols will move with the image base or the memory block.
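How the Block Name, Base Address, File Offset, and Length options relate to one another can be shown with a short model. This is a sketch only, not the raw binary loader: the load_raw function and the sample values are hypothetical, and it assumes that an unspecified length covers the rest of the file from the chosen offset.

```python
def load_raw(data, block_name="ram", base_address=0, file_offset=0, length=None):
    """Model a raw binary import as a (name, start, end, bytes) tuple."""
    if length is None:
        # Assumption: the default takes everything from file_offset onward.
        length = len(data) - file_offset
    chunk = data[file_offset:file_offset + length]
    if not chunk:
        raise ValueError("nothing to import at this offset and length")
    # The block spans base_address through base_address + len(chunk) - 1.
    return block_name, base_address, base_address + len(chunk) - 1, chunk


firmware = bytes(range(16))  # stand-in for the contents of an imported file
name, start, end, chunk = load_raw(firmware, "flash", 0x8000, file_offset=4, length=8)
```

Here bytes 4 through 11 of the file end up in a block named "flash" that occupies addresses 0x8000 through 0x8007.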

XML Options

The XML format is used to load from a Ghidra XML formatted file. The options are simply switches for which types of program information to import.

Memory Blocks

Imports memory block definitions (name, start address, length, etc.). See Memory Map.

Memory Contents

Imports bytes for the memory blocks.

Instructions

Imports disassembled instructions. See Disassembly.

Data

Imports data types and defined data. See Data Type Manager and Data.

Symbols

Imports user-defined symbols. See Symbol Table.

Equates

Imports equate definitions and references. See Equate Table.

Comments

Imports comments (pre, post, eol, plate, repeatable). See Comments.

Properties

Imports user-defined properties.

Bookmarks

Imports Bookmarks.

Trees

Imports program organizations (program trees, modules, fragments). See Program Tree.

References

Imports user-defined memory, stack, and external references. See References.

Functions

Imports functions, stack frames and variables. See Functions.

Registers

Imports program context and registers. See Register Values.

Relocation Table

See Relocation Table.

Entry Points

Imports program entry points.

External Libraries

See External Program Names.

SARIF Options

The SARIF format is used to load from a SARIF formatted file. The options are simply switches for which types of program information to import and are identical to the options specified above for XML.

Library Search Path

The Library Search Path dialog is used to specify the directories that Ghidra should use to resolve external libraries (e.g., *.dll, *.so) while importing.

 

Change the Library Path Search Order

To change the search order of the paths within the list:

  1. Select a path from the list
  2. Select the button to move the path up in the list
  3. Select the button to move the path down in the list

The search order is important when you have different versions of a library in different directories. The first directory in the search path that contains a required library is the one that Ghidra will use.
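The first-match rule can be sketched in a few lines. This is an illustration of the ordering behavior only, not Ghidra's resolver: resolve_library and the library names are hypothetical, and the demo builds its own temporary directories.

```python
import tempfile
from pathlib import Path


def resolve_library(name, search_paths):
    """Return the first match for name, honoring the order of search_paths."""
    for directory in search_paths:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


# Demo: two directories hold different versions of the same made-up library.
with tempfile.TemporaryDirectory() as tmp:
    old_dir, new_dir = Path(tmp, "old"), Path(tmp, "new")
    for directory, version in ((old_dir, "v1"), (new_dir, "v2")):
        directory.mkdir()
        (directory / "libdemo.so").write_text(version)
    first = resolve_library("libdemo.so", [new_dir, old_dir]).read_text()
    swapped = resolve_library("libdemo.so", [old_dir, new_dir]).read_text()
    missing = resolve_library("libnone.so", [old_dir, new_dir])
```

Swapping the two directories in the list changes which version is picked up, which is why moving a path up or down matters.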

Add Library Search Path

  1. Click the button
  2. Select a directory from the file chooser
  3. Click the "Select Directory" button

The newly added path will be placed at the top of the list.

Remove Library Search Path

  1. Select one or more paths from the list
  2. Click the button

Reset Library Search Paths

To reset the paths to the default list:

  1. Click the button
  2. Click "Yes" on the pop-up dialog to confirm path reset

This option will remove any paths added manually.

Language and Compiler Specification Dialog

This dialog is used to specify the Ghidra language (Processor/Compiler Spec) of the program being imported. Certain formats, like "PE", "ELF", or "XML", will usually choose the appropriate language/compiler spec. If not, this dialog can be used to select one or override the default selection.



Each row in the table represents a unique processor language/compiler spec pair. To select one, simply click on the row and press the OK button.

Table Columns

Filter

The filter can be used to reduce the number of entries in the table. Only the entries that contain the text in this field will be displayed.

Description

This field shows the currently selected language/compiler spec.

Show Only Recommended Language/Compiler Specs

If selected, only the languages suggested by the selected importer format will be shown. Otherwise, all known languages will be shown. Not all importer formats can determine an appropriate language, in which case all the languages will be displayed.

Batch Import Dialog

The Batch Import Dialog is used to import multiple files at the same time. The files may be individual files in a directory tree, and/or files from an archive file of some sort such as a zip or tar file.

Import Sources

This section manages a list of folder trees or container files (e.g., zips) to scan for files to import. Initially, this contains the folder or file that was selected from the file chooser.

Adding an additional import source folder

Pressing the Add button will bring up a file chooser for picking an additional folder or file (import source) to search for import files.

Removing an import source folder

Select a folder in the import sources window and press the Remove button.

Depth limit

This field specifies how many levels of nested containers to search within each of the specified import sources. Note that this is not the number of subfolder levels to search, but rather the nesting level of archive-type files (i.e., zips in zips).
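The distinction between subfolder depth and container nesting can be illustrated with in-memory zip files. This is a rough model and not the batch importer's scanning code: scan and make_zip are hypothetical helpers, containers are recognized by a ".zip" suffix only, and the sketch assumes that opening the top-level container counts as the first level.

```python
import io
import zipfile


def make_zip(entries):
    """Build a zip archive in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for entry_name, payload in entries.items():
            zf.writestr(entry_name, payload)
    return buf.getvalue()


def scan(data, name, depth_limit, depth=0):
    """Yield file paths, opening nested zips only while depth < depth_limit."""
    if name.lower().endswith(".zip") and depth < depth_limit:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for inner in zf.namelist():
                # Folders inside an archive do not add to the nesting depth.
                yield from scan(zf.read(inner), f"{name}/{inner}", depth_limit, depth + 1)
    else:
        yield name  # a plain file, or a container beyond the limit left unopened


inner = make_zip({"b.bin": b"\x01\x02"})
outer = make_zip({"dir/a.bin": b"\x00", "inner.zip": inner})
one_level = list(scan(outer, "outer.zip", 1))
two_levels = list(scan(outer, "outer.zip", 2))
```

With a limit of 1, the file in the subfolder is found but inner.zip is not opened. With a limit of 2, the file inside inner.zip is found as well.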

Rescan

This button will rescan the import sources to the current depth for files to import.

Files to Import

This section displays a table showing the files that were found. Each row represents a set of similar files that can be imported. The table columns are as follows:

Import Options

Strip leading path

If selected, the newly imported files will not use the relative path of the file when storing the result in the project. Otherwise, the file will be in a corresponding relative path in the project.

Strip container paths

If selected, the newly imported files will not use the interior archive path when storing the result in the project. Otherwise, the file will be stored in a relative path that corresponds to its path inside the archive.

Project Destination

This shows the destination folder in the project that will be the root folder for storing the imported files. Each imported file will be stored in a relative path to that root folder. The relative path is usually the relative path of the file to its import source folder, but can be adjusted with some of the path options described earlier.
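The way the two path options shape the final project location can be sketched as below. This is a simplified model based only on the descriptions above, not the batch importer's actual path logic: destination_path, its parameters, and the sample paths are hypothetical, and details such as whether the container's own name appears in the path are not modeled.

```python
from pathlib import PurePosixPath


def destination_path(root, filename, relative_dir="", archive_dir="",
                     strip_leading=False, strip_container=False):
    """Compose a project path from the destination root and the path options."""
    path = PurePosixPath(root)
    if relative_dir and not strip_leading:
        path /= relative_dir  # the file's folder relative to its import source
    if archive_dir and not strip_container:
        path /= archive_dir  # the file's folder inside its archive
    return str(path / filename)


# Hypothetical file "ls" found under "usr/local" in an archive in subfolder "bin".
full = destination_path("/imports", "ls", "bin", "usr/local")
no_leading = destination_path("/imports", "ls", "bin", "usr/local", strip_leading=True)
flat = destination_path("/imports", "ls", "bin", "usr/local",
                        strip_leading=True, strip_container=True)
```

In this model, selecting both options places every imported file directly in the project destination folder.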

Provided By: Importer Plugin