Class AssemblyParser

java.lang.Object
ghidra.app.plugin.assembler.sleigh.parse.AssemblyParser

public class AssemblyParser extends Object
A class to encapsulate LALR(1) parsing for a given grammar

This class constructs the Action/Goto table (and all the other trappings) of a LALR(1) parser and provides a parse(String) method to parse actual sentences.

This implementation is somewhat unconventional in that it permits ambiguous grammars. Instead of complaining, it produces the set of all possible parse trees. Of course, this comes at the cost of some efficiency.
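
To make "the set of all possible parse trees" concrete, here is a minimal sketch that does not use Ghidra's classes and is not an LALR(1) parser. It is a brute-force enumerator for the hypothetical ambiguous grammar E -> E '-' E | NUM, and every name in it (AmbiguityDemo, parseAll) is invented for illustration. It only shows why one sentence can yield several trees.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy, unrelated to AssemblyParser's internals: enumerates every
// parse tree of the ambiguous grammar  E -> E '-' E | NUM  by brute force.
class AmbiguityDemo {
    // Returns one bracketed string per parse tree of tokens[from, to).
    static List<String> parseAll(List<String> tokens, int from, int to) {
        List<String> trees = new ArrayList<>();
        // Base case: a single numeric token is a complete E.
        if (to - from == 1 && tokens.get(from).matches("\\d+")) {
            trees.add(tokens.get(from));
            return trees;
        }
        // Try every '-' as the root operator; each choice may give new trees.
        for (int op = from + 1; op < to - 1; op++) {
            if (!tokens.get(op).equals("-")) {
                continue;
            }
            for (String left : parseAll(tokens, from, op)) {
                for (String right : parseAll(tokens, op + 1, to)) {
                    trees.add("(" + left + " - " + right + ")");
                }
            }
        }
        return trees;
    }

    // Convenience overload: splits the sentence on whitespace first.
    static List<String> parseAll(String sentence) {
        List<String> tokens = List.of(sentence.trim().split("\\s+"));
        return parseAll(tokens, 0, tokens.size());
    }
}
```

With this toy, the sentence "1 - 2 - 3" has two trees, one grouping to the right and one to the left. A conventional parser generator would report a conflict for such a grammar, whereas a parser that tolerates ambiguity returns both.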

See Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman, Compilers: Principles, Techniques, & Tools. Boston, MA: Pearson, 2007.

See Jackson, Stephen. LALR(1) Parsing. Halifax, Nova Scotia, Canada: Dalhousie University. <http://web.cs.dal.ca/~sjackson/lalr1.html>

  • Field Details

  • Constructor Details

    • AssemblyParser

      public AssemblyParser(AssemblyGrammar grammar)
      Construct a LALR(1) parser from the given grammar
      Parameters:
      grammar - the grammar
  • Method Details

    • buildLR0Machine

      protected void buildLR0Machine()
    • addLR0State

      protected int addLR0State(AssemblyParseState state)
      Add a newly-constructed LR0 state, and return its assigned number

      If the state already exists, this just returns its previously assigned number

      Parameters:
      state - the newly-constructed state
      Returns:
      the assigned number
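
The add-or-look-up behavior described above can be sketched as follows. This is an illustration under assumptions, not the actual implementation: a state is modeled as a plain set of item strings rather than an AssemblyParseState, and the class and method names (StateTable, addState) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: a state is modeled as a set of item strings.
class StateTable {
    private final List<Set<String>> states = new ArrayList<>();
    private final Map<Set<String>, Integer> numbers = new HashMap<>();

    // Returns the number of an equal, previously added state if there is one;
    // otherwise assigns the next free number and records the state.
    int addState(Set<String> state) {
        Integer existing = numbers.get(state);
        if (existing != null) {
            return existing;
        }
        int number = states.size();
        states.add(state);
        numbers.put(state, number);
        return number;
    }

    int size() {
        return states.size();
    }
}
```

Numbering by equality rather than identity is what lets a machine builder construct a candidate state first and only afterwards find out whether it is new.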
    • buildExtendedGrammar

      protected void buildExtendedGrammar()
    • extend

      protected AssemblyExtendedProduction extend(AssemblyProduction prod, int start)
      Extend a production, using the given LR0 start state
      Parameters:
      prod - the production to extend
      start - the starting LR0 state
      Returns:
      the extended production, if the start state is valid for it
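
One way to picture extending a production is to walk the LR0 transition table from the start state, annotating each right-hand-side symbol with the states before and after it. The sketch below assumes that reading. The transition table shape, the "from:symbol:to" labels, and the use of null for an invalid start state are assumptions made for illustration, not documented behavior of this method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; productions and states are plain strings and ints.
class ExtendDemo {
    // Returns each right-hand-side symbol annotated as "from:symbol:to", or
    // null when a needed transition is missing (an assumed convention here).
    static List<String> extend(Map<Integer, Map<String, Integer>> transitions,
            List<String> rhs, int start) {
        List<String> extended = new ArrayList<>();
        int current = start;
        for (String symbol : rhs) {
            Map<String, Integer> row = transitions.getOrDefault(current, Map.of());
            Integer next = row.get(symbol);
            if (next == null) {
                return null; // the start state is not valid for this production
            }
            extended.add(current + ":" + symbol + ":" + next);
            current = next;
        }
        return extended;
    }
}
```

For example, with transitions 0 -E-> 1, 1 -'-'-> 3, and 3 -E-> 4, extending E - E from state 0 yields 0:E:1, 1:-:3, 3:E:4, while starting from state 1 fails because state 1 has no transition on E.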
    • buildActionGotoTable

      protected void buildActionGotoTable()
    • parse

      public Iterable<AssemblyParseResult> parse(String input)
      Parse the given sentence
      Parameters:
      input - the sentence to parse
      Returns:
      all possible parse trees (and possible errors)
    • parse

      Parse the given sentence with the given defined symbols

      The tokenizer for numeric terminals also accepts any key in symbols. In such cases, the resulting token is assigned the value of that symbol.

      Parameters:
      input - the sentence to parse
      symbols - the defined symbols, whose names may appear where a numeric terminal is expected
      Returns:
      all possible parse results (trees and errors)
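
The symbol substitution described above can be sketched like this. It is a simplified illustration, not the tokenizer itself: the symbols are assumed to be a Map from name to long value, and NumericTokenDemo and numericValue are hypothetical names.

```java
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical sketch of a numeric-terminal tokenizer that also accepts
// defined symbol names. The Map<String, Long> shape is an assumption.
class NumericTokenDemo {
    // Returns the token's numeric value, or empty if the text is neither a
    // defined symbol nor a parseable number.
    static OptionalLong numericValue(String text, Map<String, Long> symbols) {
        Long defined = symbols.get(text);
        if (defined != null) {
            return OptionalLong.of(defined); // the token takes the symbol's value
        }
        try {
            // Long.decode accepts decimal, 0x-prefixed hex, and octal forms.
            return OptionalLong.of(Long.decode(text));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }
}
```

Under these assumptions, a sentence containing a label such as loop_start would tokenize as if the label's value had been written literally, so the rest of the parse proceeds unchanged.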
    • printGrammar

      public void printGrammar(PrintStream out)
      For debugging
    • printLR0States

      public void printLR0States(PrintStream out)
      For debugging
    • printLR0TransitionTable

      public void printLR0TransitionTable(PrintStream out)
      For debugging
    • printExtendedGrammar

      public void printExtendedGrammar(PrintStream out)
      For debugging
    • printGeneralFF

      public void printGeneralFF(PrintStream out)
      For debugging
    • printExtendedFF

      public void printExtendedFF(PrintStream out)
      For debugging
    • printMergers

      public void printMergers(PrintStream out)
      For debugging
    • printParseTable

      public void printParseTable(PrintStream out)
      For debugging
    • printStuff

      public void printStuff(PrintStream out)
      For debugging
    • getGrammar

      public AssemblyGrammar getGrammar()
      Get the grammar used to construct this parser
      Returns:
      the grammar