<html>
<head>
<title>PLY Internals</title>
</head>
<body bgcolor="#ffffff">

<h1>PLY Internals</h1>

<b>
David M. Beazley <br>
dave@dabeaz.com<br>
</b>

<p>
<b>PLY Version: 3.0</b>
<p>

<!-- INDEX -->
<div class="sectiontoc">
<ul>
<li><a href="#internal_nn1">Introduction</a>
<li><a href="#internal_nn2">Grammar Class</a>
<li><a href="#internal_nn3">Productions</a>
<li><a href="#internal_nn4">LRItems</a>
<li><a href="#internal_nn5">LRTable</a>
<li><a href="#internal_nn6">LRGeneratedTable</a>
<li><a href="#internal_nn7">LRParser</a>
<li><a href="#internal_nn8">ParserReflect</a>
<li><a href="#internal_nn9">High-level operation</a>
</ul>
</div>
<!-- INDEX -->


<H2><a name="internal_nn1"></a>1. Introduction</H2>


This document describes classes and functions that make up the internal
operation of PLY.  Using this programming interface, it is possible to
manually build a parser using a different interface specification
than what PLY normally uses.  For example, you could build a grammar
from information parsed in a completely different input format.  Some of
these objects may be useful for building more advanced parsing engines
such as GLR.

<p>
It should be stressed that using PLY at this level is not for the
faint of heart.  Generally, it's assumed that you know a bit of
the underlying compiler theory and how an LR parser is put together.

<H2><a name="internal_nn2"></a>2. Grammar Class</H2>


The file <tt>ply.yacc</tt> defines a class <tt>Grammar</tt> that
is used to hold and manipulate information about a grammar
specification.   It encapsulates the same basic information
about a grammar that is put into a YACC file including
the list of tokens, precedence rules, and grammar rules.
Various operations are provided to perform different validations
on the grammar.  In addition, there are operations to compute
the first and follow sets that are needed by the various table
generation algorithms.

<p>
<tt><b>Grammar(terminals)</b></tt>

<blockquote>
Creates a new grammar object.  <tt>terminals</tt> is a list of strings
specifying the terminals for the grammar.  An instance <tt>g</tt> of
<tt>Grammar</tt> has the following methods:
</blockquote>

<p>
<b><tt>g.set_precedence(term,assoc,level)</tt></b>
<blockquote>
Sets the precedence level and associativity for a given terminal <tt>term</tt>.
<tt>assoc</tt> is one of <tt>'right'</tt>,
<tt>'left'</tt>, or <tt>'nonassoc'</tt> and <tt>level</tt> is a positive integer.  The higher
the value of <tt>level</tt>, the higher the precedence.  Here is an example of typical
precedence settings:

<pre>
g.set_precedence('PLUS',  'left',1)
g.set_precedence('MINUS', 'left',1)
g.set_precedence('TIMES', 'left',2)
g.set_precedence('DIVIDE','left',2)
g.set_precedence('UMINUS','left',3)
</pre>

This method must be called prior to adding any productions to the
grammar with <tt>g.add_production()</tt>.  The precedence of individual grammar
rules is determined by the precedence of the right-most terminal.

</blockquote>
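<p>
The right-most-terminal rule can be illustrated with a small standalone sketch.
This is not PLY's code: the helper <tt>rule_precedence()</tt>, the plain dictionary
of precedence settings, and the fallback value <tt>('right', 0)</tt> for rules
without any precedence are all assumptions made for illustration.

```python
# Standalone sketch; rule_precedence() is a hypothetical helper, not a PLY function.
DEFAULT_PREC = ('right', 0)   # assumed fallback when no precedence applies

terminals = {'PLUS', 'MINUS', 'TIMES', 'DIVIDE', 'NUMBER'}
precedence = {
    'PLUS':   ('left', 1),
    'MINUS':  ('left', 1),
    'TIMES':  ('left', 2),
    'DIVIDE': ('left', 2),
    'UMINUS': ('left', 3),
}

def rule_precedence(syms, terminals, precedence):
    """Return (assoc, level) for the right-hand side symbols of one rule."""
    # An explicit %prec marker overrides the right-most terminal.
    if '%prec' in syms:
        return precedence[syms[syms.index('%prec') + 1]]
    # Otherwise scan from the right for the first terminal symbol.
    for sym in reversed(syms):
        if sym in terminals:
            return precedence.get(sym, DEFAULT_PREC)
    return DEFAULT_PREC

print(rule_precedence(['expr', 'PLUS', 'term'], terminals, precedence))
print(rule_precedence(['MINUS', 'expr', '%prec', 'UMINUS'], terminals, precedence))
```

The second call shows why a fictitious terminal such as <tt>UMINUS</tt> is useful:
it lets one rule borrow a precedence level that differs from that of its own terminals.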
<p>
<b><tt>g.add_production(name,syms,func=None,file='',line=0)</tt></b>
<blockquote>
Adds a new grammar rule.  <tt>name</tt> is the name of the rule,
<tt>syms</tt> is a list of symbols making up the right hand
side of the rule, <tt>func</tt> is the function to call when
reducing the rule.   <tt>file</tt> and <tt>line</tt> specify
the filename and line number of the rule and are used for
generating error messages.

<p>
The list of symbols in <tt>syms</tt> may include character
literals and <tt>%prec</tt> specifiers.  Here are some
examples:

<pre>
g.add_production('expr',['expr','PLUS','term'],func,file,line)
g.add_production('expr',['expr','"+"','term'],func,file,line)
g.add_production('expr',['MINUS','expr','%prec','UMINUS'],func,file,line)
</pre>

<p>
If any kind of error is detected, a <tt>GrammarError</tt> exception
is raised with a message indicating the reason for the failure.
</blockquote>

<p>
<b><tt>g.set_start(start=None)</tt></b>
<blockquote>
Sets the starting rule for the grammar.  <tt>start</tt> is a string
specifying the name of the start rule.   If <tt>start</tt> is omitted,
the first grammar rule added with <tt>add_production()</tt> is taken to be
the starting rule.  This method must always be called after all
productions have been added.
</blockquote>

<p>
<b><tt>g.find_unreachable()</tt></b>
<blockquote>
Diagnostic function.  Returns a list of all unreachable non-terminals
defined in the grammar.  This is used to identify inactive parts of
the grammar specification.
</blockquote>
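<p>
As a rough illustration of what "unreachable" means, the following standalone
sketch walks a grammar from its start symbol and reports every non-terminal it
never visits.  The dictionary-based grammar and the function
<tt>unreachable_nonterminals()</tt> are hypothetical; PLY's own implementation
may differ.

```python
# Standalone sketch; the grammar is a plain dict {name: [rhs, rhs, ...]}.
rules = {
    'statement': [['ID', 'EQUALS', 'expr']],
    'expr':      [['expr', 'PLUS', 'expr'], ['NUMBER']],
    'orphan':    [['NUMBER']],            # never referenced from 'statement'
}

def unreachable_nonterminals(rules, start):
    """Return the non-terminals that cannot be reached from start."""
    reachable = set()
    pending = [start]
    while pending:
        sym = pending.pop()
        # Skip terminals (not keys of rules) and symbols already visited.
        if sym in reachable or sym not in rules:
            continue
        reachable.add(sym)
        for rhs in rules[sym]:
            pending.extend(rhs)
    return sorted(name for name in rules if name not in reachable)

print(unreachable_nonterminals(rules, 'statement'))
```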

<p>
<b><tt>g.infinite_cycle()</tt></b>
<blockquote>
Diagnostic function.  Returns a list of all non-terminals in the
grammar that result in an infinite cycle.  This condition occurs if
there is no way for a grammar rule to expand to a string containing
only terminal symbols.
</blockquote>
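<p>
The condition can be checked with a simple fixed-point computation.  The sketch
below is an independent illustration under the assumption that the grammar is a
plain dictionary; <tt>nonterminating()</tt> is a made-up name and not part of PLY.

```python
# Standalone sketch; any symbol that is not a key of rules is treated as a terminal.
rules = {
    'expr': [['expr', 'PLUS', 'expr'], ['NUMBER']],
    'loop': [['LPAREN', 'loop', 'RPAREN']],   # no alternative ever bottoms out
}

def nonterminating(rules):
    """Return non-terminals that can never expand to terminals only."""
    terminates = set()
    changed = True
    while changed:
        changed = False
        for name, alternatives in rules.items():
            if name in terminates:
                continue
            for rhs in alternatives:
                # One alternative made only of terminals and terminating
                # non-terminals is enough for the rule to terminate.
                if all(s not in rules or s in terminates for s in rhs):
                    terminates.add(name)
                    changed = True
                    break
    return sorted(name for name in rules if name not in terminates)

print(nonterminating(rules))
```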

<p>
<b><tt>g.undefined_symbols()</tt></b>
<blockquote>
Diagnostic function.  Returns a list of tuples <tt>(name, prod)</tt>
corresponding to undefined symbols in the grammar. <tt>name</tt> is the
name of the undefined symbol and <tt>prod</tt> is an instance of
<tt>Production</tt> which has information about the production rule
where the undefined symbol was used.
</blockquote>

<p>
<b><tt>g.unused_terminals()</tt></b>
<blockquote>
Diagnostic function.  Returns a list of terminals that were defined,
but never used in the grammar.
</blockquote>
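<p>
Both of these checks amount to comparing the symbols that appear on right-hand
sides against the symbols that were declared.  The following sketch is a
simplified illustration: it works on a plain dictionary and returns only symbol
names, whereas the real <tt>undefined_symbols()</tt> returns
<tt>(name, prod)</tt> tuples.  The function name is hypothetical.

```python
# Standalone sketch; undefined_and_unused() is not a PLY function.
terminals = ['ID', 'NUMBER', 'PLUS', 'COMMA']
rules = {
    'expr': [['expr', 'PLUS', 'term'], ['NUMBER']],   # 'term' is never defined
}

def undefined_and_unused(terminals, rules):
    """Return (undefined symbol names, unused terminal names)."""
    # Every symbol that appears on some right-hand side.
    used = {sym for alts in rules.values() for rhs in alts for sym in rhs}
    undefined = sorted(s for s in used if s not in rules and s not in terminals)
    unused = sorted(t for t in terminals if t not in used)
    return undefined, unused

print(undefined_and_unused(terminals, rules))
```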

<p>
<b><tt>g.unused_rules()</tt></b>
<blockquote>
Diagnostic function.  Returns a list of <tt>Production</tt> instances
corresponding to production rules that were defined in the grammar,
but never used anywhere.  This is slightly different
from <tt>find_unreachable()</tt>.
</blockquote>

<p>
<b><tt>g.unused_precedence()</tt></b>
<blockquote>
Diagnostic function.  Returns a list of tuples <tt>(term, assoc)</tt>
corresponding to precedence rules that were set, but never used in the
grammar.  <tt>term</tt> is the terminal name and <tt>assoc</tt> is the
precedence associativity (e.g., <tt>'left'</tt>, <tt>'right'</tt>,
or <tt>'nonassoc'</tt>).
</blockquote>

<p>
<b><tt>g.compute_first()</tt></b>
<blockquote>
Compute all of the first sets for all symbols in the grammar.  Returns a dictionary
mapping symbol names to a list of all first symbols.
</blockquote>
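<p>
For readers who want a reminder of how first sets are computed, here is a
textbook-style fixed-point sketch.  It is independent of PLY: the dictionary
grammar, the <tt>EMPTY</tt> marker, and both function names are assumptions of
this example.

```python
# Standalone sketch of the classic FIRST-set algorithm.
EMPTY = '<empty>'   # marker for the empty string; the name is chosen for this sketch

def first_of_sequence(symbols, first):
    """FIRST of a sequence of symbols, given the FIRST sets found so far."""
    result = []
    for sym in symbols:
        can_be_empty = False
        for f in first[sym]:
            if f == EMPTY:
                can_be_empty = True
            elif f not in result:
                result.append(f)
        if not can_be_empty:
            return result
    # Every symbol in the sequence (or an empty sequence) can vanish.
    result.append(EMPTY)
    return result

def first_sets(terminals, rules):
    """Map every symbol to a list of the terminals that can begin it."""
    first = {t: [t] for t in terminals}
    for name in rules:
        first[name] = []
    changed = True
    while changed:                      # repeat until nothing new is added
        changed = False
        for name, alternatives in rules.items():
            for rhs in alternatives:
                for f in first_of_sequence(rhs, first):
                    if f not in first[name]:
                        first[name].append(f)
                        changed = True
    return first

terminals = ['NUMBER', 'PLUS', 'LPAREN', 'RPAREN']
rules = {
    'expr': [['term', 'tail']],
    'tail': [['PLUS', 'term', 'tail'], []],          # [] is the empty production
    'term': [['NUMBER'], ['LPAREN', 'expr', 'RPAREN']],
}
first = first_sets(terminals, rules)
print(first['expr'], first['tail'])
```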

<p>
<b><tt>g.compute_follow()</tt></b>
<blockquote>
Compute all of the follow sets for all non-terminals in the grammar.
The follow set is the set of all possible symbols that might follow a
given non-terminal.  Returns a dictionary mapping non-terminal names
to a list of symbols.
</blockquote>
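<p>
The idea behind follow sets can be sketched in a few lines.  To stay short, this
example assumes a grammar with no empty productions, uses sets rather than
lists, and invents the names <tt>follow_sets()</tt> and <tt>END</tt>; it is an
illustration, not PLY's algorithm.

```python
# Standalone sketch; assumes every right-hand side is non-empty.
END = '$end'   # end-of-input marker; the name is an assumption of this sketch

def follow_sets(rules, start):
    """Map each non-terminal to the set of terminals that may follow it."""
    def first(sym):
        # Terminals that can begin sym (valid because no rule is empty).
        out, pending, seen = set(), [sym], set()
        while pending:
            s = pending.pop()
            if s in seen:
                continue
            seen.add(s)
            if s in rules:
                pending.extend(rhs[0] for rhs in rules[s])
            else:
                out.add(s)
        return out

    follow = {name: set() for name in rules}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for name, alternatives in rules.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in rules:
                        continue                    # terminals have no follow set
                    if i + 1 < len(rhs):
                        new = first(rhs[i + 1])     # whatever starts the next symbol
                    else:
                        new = follow[name]          # at the end: inherit from the rule
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

rules = {
    'statement': [['ID', 'EQUALS', 'expr']],
    'expr':      [['expr', 'PLUS', 'term'], ['term']],
    'term':      [['NUMBER'], ['LPAREN', 'expr', 'RPAREN']],
}
follow = follow_sets(rules, 'statement')
print(sorted(follow['expr']))
```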

<p>
<b><tt>g.build_lritems()</tt></b>
<blockquote>
Calculates all of the LR items for all productions in the grammar.  This
step is required before using the grammar for any kind of table generation.
See the section on LR items below.
</blockquote>

<p>
The following attributes are set by the above methods and may be useful
in code that works with the grammar.  All of these attributes should be
assumed to be read-only.  Changing their values directly will likely
break the grammar.

<p>
<b><tt>g.Productions</tt></b>
<blockquote>
A list of all productions added.  The first entry is reserved for
a production representing the starting rule.  The objects in this list
are instances of the <tt>Production</tt> class, described shortly.
</blockquote>

<p>
<b><tt>g.Prodnames</tt></b>
<blockquote>
A dictionary mapping the names of nonterminals to a list of all
productions of that nonterminal.
</blockquote>

<p>
<b><tt>g.Terminals</tt></b>
<blockquote>
A dictionary mapping the names of terminals to a list of the
production numbers where they are used.
</blockquote>

<p>
<b><tt>g.Nonterminals</tt></b>
<blockquote>
A dictionary mapping the names of nonterminals to a list of the
production numbers where they are used.
</blockquote>

<p>
<b><tt>g.First</tt></b>
<blockquote>
A dictionary representing the first sets for all grammar symbols.  This is
computed and returned by the <tt>compute_first()</tt> method.
</blockquote>

<p>
<b><tt>g.Follow</tt></b>
<blockquote>
A dictionary representing the follow sets for all grammar rules.  This is
computed and returned by the <tt>compute_follow()</tt> method.
</blockquote>

<p>
<b><tt>g.Start</tt></b>
<blockquote>
Starting symbol for the grammar.  Set by the <tt>set_start()</tt> method.
</blockquote>

For the purposes of debugging, a <tt>Grammar</tt> object supports the <tt>__len__()</tt> and
<tt>__getitem__()</tt> special methods.  Accessing <tt>g[n]</tt> returns the nth production
from the grammar.


<H2><a name="internal_nn3"></a>3. Productions</H2>


<tt>Grammar</tt> objects store grammar rules as instances of a <tt>Production</tt> class.  This
class has no public constructor--you should only create productions by calling <tt>Grammar.add_production()</tt>.
The following attributes are available on a <tt>Production</tt> instance <tt>p</tt>.

<p>
<b><tt>p.name</tt></b>
<blockquote>
The name of the production. For a grammar rule such as <tt>A : B C D</tt>, this is <tt>'A'</tt>.
</blockquote>

<p>
<b><tt>p.prod</tt></b>
<blockquote>
A tuple of symbols making up the right-hand side of the production.  For a grammar rule such as <tt>A : B C D</tt>, this is <tt>('B','C','D')</tt>.
</blockquote>

<p>
<b><tt>p.number</tt></b>
<blockquote>
Production number.  An integer containing the index of the production in the grammar's <tt>Productions</tt> list.
</blockquote>

<p>
<b><tt>p.func</tt></b>
<blockquote>
The name of the reduction function associated with the production.
This is the function that will execute when reducing the entire
grammar rule during parsing.
</blockquote>

<p>
<b><tt>p.callable</tt></b>
<blockquote>
The callable object associated with the name in <tt>p.func</tt>.  This is <tt>None</tt>
unless the production has been bound using <tt>bind()</tt>.
</blockquote>

<p>
<b><tt>p.file</tt></b>
<blockquote>
Filename associated with the production.  Typically this is the file where the production was defined.  Used for error messages.
</blockquote>

<p>
<b><tt>p.lineno</tt></b>
<blockquote>
Line number associated with the production.  Typically this is the line number in <tt>p.file</tt> where the production was defined.  Used for error messages.
</blockquote>

<p>
<b><tt>p.prec</tt></b>
<blockquote>
Precedence and associativity associated with the production.  This is a tuple <tt>(assoc,level)</tt> where
<tt>assoc</tt> is one of <tt>'left'</tt>,<tt>'right'</tt>, or <tt>'nonassoc'</tt> and <tt>level</tt> is
an integer.   This value is determined by the precedence of the right-most terminal symbol in the production
or by use of the <tt>%prec</tt> specifier when adding the production.
</blockquote>

<p>
<b><tt>p.usyms</tt></b>
<blockquote>
A list of all unique symbols found in the production.
</blockquote>

<p>
<b><tt>p.lr_items</tt></b>
<blockquote>
A list of all LR items for this production.  This attribute only has a meaningful value if the
<tt>Grammar.build_lritems()</tt> method has been called.  The items in this list are
instances of <tt>LRItem</tt> described below.
</blockquote>

<p>
<b><tt>p.lr_next</tt></b>
<blockquote>
The head of a linked-list representation of the LR items in <tt>p.lr_items</tt>.
This attribute only has a meaningful value if the <tt>Grammar.build_lritems()</tt>
method has been called.  Each <tt>LRItem</tt> instance has a <tt>lr_next</tt> attribute
to move to the next item.  The list is terminated by <tt>None</tt>.
</blockquote>

<p>
<b><tt>p.bind(dict)</tt></b>
<blockquote>
Binds the production function name in <tt>p.func</tt> to a callable object in
<tt>dict</tt>.   This operation is typically carried out in the last step
prior to running the parsing engine and is needed since parsing tables are typically
read from files which only include the function names, not the functions themselves.
</blockquote>
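<p>
The name-to-callable binding step can be pictured with the following standalone
sketch.  The class <tt>MiniRule</tt> is a hypothetical stand-in that mimics only
the <tt>func</tt>, <tt>callable</tt>, and <tt>bind()</tt> members described
above; it is not PLY's <tt>Production</tt> class.

```python
# Standalone sketch; MiniRule is a made-up stand-in, not a PLY class.
class MiniRule:
    def __init__(self, name, func):
        self.name = name
        self.func = func        # the *name* of the reduction function (a string)
        self.callable = None    # stays None until bind() is called

    def bind(self, pdict):
        # Look the stored name up in the supplied dictionary.
        if self.func:
            self.callable = pdict[self.func]

def p_expr_plus(p):
    p[0] = p[1] + p[3]

rule = MiniRule('expr', 'p_expr_plus')
rule.bind({'p_expr_plus': p_expr_plus})

# After binding, the rule can run its reduction function.
slots = [None, 1, '+', 2]
rule.callable(slots)
print(slots[0])
```

In a typical setup the dictionary would simply be the namespace of the module
that defines the <tt>p_</tt> functions.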

<P>
<tt>Production</tt> objects support
the <tt>__len__()</tt>, <tt>__getitem__()</tt>, and <tt>__str__()</tt>
special methods.
<tt>len(p)</tt> returns the number of symbols in <tt>p.prod</tt>
and <tt>p[n]</tt> is the same as <tt>p.prod[n]</tt>.

<H2><a name="internal_nn4"></a>4. LRItems</H2>


The construction of parsing tables in an LR-based parser generator is primarily
done over a set of "LR Items".   An LR item represents a stage of parsing one
of the grammar rules.   To compute the LR items, it is first necessary to
call <tt>Grammar.build_lritems()</tt>.  Once this step is complete, all of the productions
in the grammar will have their LR items attached to them.

<p>
Here is an interactive session that shows what LR items look like.
In this example, <tt>g</tt> is a <tt>Grammar</tt>
object.

<blockquote>
<pre>
>>> <b>g.build_lritems()</b>
>>> <b>p = g[1]</b>
>>> <b>p</b>
Production(statement -> ID = expr)
>>>
</pre>
</blockquote>

In the above code, <tt>p</tt> represents the first grammar rule, in
this case the rule <tt>'statement -> ID = expr'</tt>.

<p>
Now, let's look at the LR items for <tt>p</tt>.

<blockquote>
<pre>
>>> <b>p.lr_items</b>
[LRItem(statement -> . ID = expr),
 LRItem(statement -> ID . = expr),
 LRItem(statement -> ID = . expr),
 LRItem(statement -> ID = expr .)]
>>>
</pre>
</blockquote>

In each LR item, the dot (.) marks a specific stage of parsing.  From one LR item to the next, the dot
is advanced by one symbol.  It is only when the dot reaches the very end that a production
is successfully parsed.
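<p>
The way the dot sweeps across a production can be reproduced with a few lines of
plain Python.  This sketch is only an illustration: the function
<tt>lr_items()</tt> and the tuple layout it returns are assumptions, not the
<tt>LRItem</tt> class itself.

```python
# Standalone sketch; items are plain tuples (name, symbols_with_dot, dot_position).
def lr_items(name, prod):
    """Return one item per possible dot position in the production."""
    items = []
    for dot in range(len(prod) + 1):
        symbols = tuple(prod[:dot]) + ('.',) + tuple(prod[dot:])
        # The dot position is stored separately so that it never has to be
        # searched for among the symbols.
        items.append((name, symbols, dot))
    return items

items = lr_items('statement', ('ID', '=', 'expr'))
for name, symbols, dot in items:
    print('%s -> %s' % (name, ' '.join(symbols)))
```

A production with <i>n</i> symbols on its right-hand side therefore yields
<i>n</i>+1 items, the last of which has the dot at the very end.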

<p>
An instance <tt>lr</tt> of <tt>LRItem</tt> has the following
attributes that hold information related to that specific stage of
parsing.

<p>
<b><tt>lr.name</tt></b>
<blockquote>
The name of the grammar rule. For example, <tt>'statement'</tt> in the above example.
</blockquote>

<p>
<b><tt>lr.prod</tt></b>
<blockquote>
A tuple of symbols representing the right-hand side of the production, including the
special <tt>'.'</tt> character.  For example, <tt>('ID','.','=','expr')</tt>.
</blockquote>

<p>
<b><tt>lr.number</tt></b>
<blockquote>
An integer representing the production number in the grammar.
</blockquote>

<p>
<b><tt>lr.usyms</tt></b>
<blockquote>
A set of unique symbols in the production.  Inherited from the original <tt>Production</tt> instance.
</blockquote>

<p>
<b><tt>lr.lr_index</tt></b>
<blockquote>
An integer representing the position of the dot (.).  You should never use <tt>lr.prod.index()</tt>
to search for it--the result will be wrong if the grammar happens to also use (.) as a character
literal.
</blockquote>

<p>
<b><tt>lr.lr_after</tt></b>
<blockquote>
A list of all productions that can legally appear immediately to the right of the
dot (.).  This list contains <tt>Production</tt> instances.   This attribute
represents all of the possible branches a parse can take from the current position.
For example, suppose that <tt>lr</tt> represents a stage immediately before
an expression like this:

<pre>
>>> <b>lr</b>
LRItem(statement -> ID = . expr)
>>>
</pre>

Then, the value of <tt>lr.lr_after</tt> might look like this, showing all productions that
can legally appear next:

<pre>
>>> <b>lr.lr_after</b>
[Production(expr -> expr PLUS expr),
 Production(expr -> expr MINUS expr),
 Production(expr -> expr TIMES expr),
 Production(expr -> expr DIVIDE expr),
 Production(expr -> MINUS expr),
 Production(expr -> LPAREN expr RPAREN),
 Production(expr -> NUMBER),
 Production(expr -> ID)]
>>>
</pre>

</blockquote>

<p>
<b><tt>lr.lr_before</tt></b>
<blockquote>
The grammar symbol that appears immediately before the dot (.) or <tt>None</tt> if
at the beginning of the parse.
</blockquote>

<p>
<b><tt>lr.lr_next</tt></b>
<blockquote>
A link to the next LR item, representing the next stage of the parse.  <tt>None</tt> if <tt>lr</tt>
is the last LR item.
</blockquote>

<tt>LRItem</tt> instances also support the <tt>__len__()</tt> and <tt>__getitem__()</tt> special methods.
<tt>len(lr)</tt> returns the number of items in <tt>lr.prod</tt> including the dot (.). <tt>lr[n]</tt>
returns <tt>lr.prod[n]</tt>.

<p>
It goes without saying that all of the attributes associated with LR
items should be assumed to be read-only.  Modifications will very
likely create a small black-hole that will consume you and your code.

<H2><a name="internal_nn5"></a>5. LRTable</H2>


The <tt>LRTable</tt> class is used to represent LR parsing table data. This
minimally includes the production list, action table, and goto table.

<p>
<b><tt>LRTable()</tt></b>
<blockquote>
Create an empty LRTable object.  This object contains only the information needed to
run an LR parser.
</blockquote>

An instance <tt>lrtab</tt> of <tt>LRTable</tt> has the following methods:

<p>
<b><tt>lrtab.read_table(module)</tt></b>
<blockquote>
Populates the LR table with information from the module specified in <tt>module</tt>.
<tt>module</tt> is either a module object already loaded with <tt>import</tt> or
the name of a Python module.   If it's a string containing a module name, it is
loaded and parsing data is extracted.   Returns the signature value that was used
when initially writing the tables.  Raises a <tt>VersionError</tt> exception if
the module was created using an incompatible version of PLY.
</blockquote>

<p>
<b><tt>lrtab.bind_callables(dict)</tt></b>
<blockquote>
This binds all of the function names used in productions to callable objects
found in the dictionary <tt>dict</tt>.  During table generation and when reading
LR tables from files, PLY only uses the names of action functions such as <tt>'p_expr'</tt>,
<tt>'p_statement'</tt>, etc.  In order to actually run the parser, these names
have to be bound to callable objects.   This method is always called prior to
running a parser.
</blockquote>

After <tt>lrtab</tt> has been populated, the following attributes are defined.

<p>
<b><tt>lrtab.lr_method</tt></b>
<blockquote>
The LR parsing method used (e.g., <tt>'LALR'</tt>).
</blockquote>


<p>
<b><tt>lrtab.lr_productions</tt></b>
<blockquote>
The production list.  If the parsing tables have been newly
constructed, this will be a list of <tt>Production</tt> instances.  If
the parsing tables have been read from a file, it's a list
of <tt>MiniProduction</tt> instances.  This list, together
with <tt>lr_action</tt> and <tt>lr_goto</tt>, contains all of the
information needed by the LR parsing engine.
</blockquote>

<p>
<b><tt>lrtab.lr_action</tt></b>
<blockquote>
The LR action dictionary that implements the underlying state machine.
The keys of this dictionary are the LR states.
</blockquote>

<p>
<b><tt>lrtab.lr_goto</tt></b>
<blockquote>
The LR goto table that contains information about grammar rule reductions.
</blockquote>


<H2><a name="internal_nn6"></a>6. LRGeneratedTable</H2>


The <tt>LRGeneratedTable</tt> class represents LR parsing tables constructed from a
grammar.  It is a subclass of <tt>LRTable</tt>.

<p>
<b><tt>LRGeneratedTable(grammar, method='LALR',log=None)</tt></b>
<blockquote>
Create the LR parsing tables for a grammar.  <tt>grammar</tt> is an instance of <tt>Grammar</tt>,
<tt>method</tt> is a string with the parsing method (<tt>'SLR'</tt> or <tt>'LALR'</tt>), and
<tt>log</tt> is a logger object used to write debugging information.  The debugging information
written to <tt>log</tt> is the same as what appears in the <tt>parser.out</tt> file created
by yacc.  By supplying a custom logger with a different message format, it is possible to get
more information (e.g., the line number in <tt>yacc.py</tt> used for issuing each line of
output in the log).   The result is an instance of <tt>LRGeneratedTable</tt>.
</blockquote>
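<p>
A custom logger of the kind described above could be built with the standard
<tt>logging</tt> module.  This is a minimal sketch: the helper name
<tt>make_table_logger()</tt> and the logger name are hypothetical, and the single
<tt>log.info()</tt> call merely stands in for the messages that table construction would emit.

```python
import io
import logging


def make_table_logger(stream):
    """Build a logger whose output shows where each message was issued."""
    log = logging.getLogger('lrtable-debug')    # hypothetical logger name
    log.setLevel(logging.DEBUG)
    log.propagate = False
    log.handlers.clear()
    handler = logging.StreamHandler(stream)
    # %(filename)s and %(lineno)d identify the call site of each record.
    handler.setFormatter(logging.Formatter('%(filename)s:%(lineno)d: %(message)s'))
    log.addHandler(handler)
    return log


buffer = io.StringIO()
log = make_table_logger(buffer)
log.info('state %d', 3)     # stand-in for a message written during table generation
# The logger would then be supplied as LRGeneratedTable(grammar, 'LALR', log).
```

<p>
Because the format string includes the call site, each line of debugging output can be
traced back to the statement that produced it.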

<p>
An instance <tt>lr</tt> of <tt>LRGeneratedTable</tt> has the following attributes.

<p>
<b><tt>lr.grammar</tt></b>
<blockquote>
A link to the <tt>Grammar</tt> object used to construct the parsing tables.
</blockquote>

<p>
<b><tt>lr.lr_method</tt></b>
<blockquote>
The LR parsing method used (e.g., <tt>'LALR'</tt>).
</blockquote>


<p>
<b><tt>lr.lr_productions</tt></b>
<blockquote>
A reference to <tt>grammar.Productions</tt>.  This list, together with <tt>lr_action</tt> and <tt>lr_goto</tt>,
contains all of the information needed by the LR parsing engine.
</blockquote>

<p>
<b><tt>lr.lr_action</tt></b>
<blockquote>
The LR action dictionary that implements the underlying state machine.  The keys of this dictionary are
the LR states.
</blockquote>

<p>
<b><tt>lr.lr_goto</tt></b>
<blockquote>
The LR goto table that contains information about grammar rule reductions.
</blockquote>

<p>
<b><tt>lr.sr_conflicts</tt></b>
<blockquote>
A list of tuples <tt>(state,token,resolution)</tt> identifying all shift/reduce conflicts. <tt>state</tt> is the LR state
number where the conflict occurred, <tt>token</tt> is the token causing the conflict, and <tt>resolution</tt> is
a string describing the resolution taken.  <tt>resolution</tt> is either <tt>'shift'</tt> or <tt>'reduce'</tt>.
</blockquote>

<p>
<b><tt>lr.rr_conflicts</tt></b>
<blockquote>
A list of tuples <tt>(state,rule,rejected)</tt> identifying all reduce/reduce conflicts.  <tt>state</tt> is the
LR state number where the conflict occurred, <tt>rule</tt> is the production rule that was selected,
and <tt>rejected</tt> is the production rule that was rejected.   Both <tt>rule</tt> and <tt>rejected</tt> are
instances of <tt>Production</tt>.  They can be inspected to provide the user with more information.
</blockquote>
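<p>
The two conflict lists are easy to turn into a readable report.  The helper below,
<tt>describe_conflicts()</tt>, is a hypothetical name, and the sample data is invented; plain
strings stand in for the <tt>Production</tt> instances that a real table would hold.

```python
def describe_conflicts(sr_conflicts, rr_conflicts):
    """Return one human-readable line per recorded conflict detail."""
    lines = []
    for state, token, resolution in sr_conflicts:
        lines.append('shift/reduce conflict for %s in state %d resolved as %s'
                     % (token, state, resolution))
    for state, rule, rejected in rr_conflicts:
        lines.append('reduce/reduce conflict in state %d resolved using rule (%s)'
                     % (state, rule))
        lines.append('rejected rule (%s) in state %d' % (rejected, state))
    return lines


# Invented sample data; strings stand in for Production instances.
sample_sr = [(12, 'PLUS', 'shift'), (12, 'TIMES', 'reduce')]
sample_rr = [(7, 'expr -> NUMBER', 'term -> NUMBER')]
report = describe_conflicts(sample_sr, sample_rr)
```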

<p>
In addition to the methods inherited from <tt>LRTable</tt>, <tt>LRGeneratedTable</tt> has the following public method.

<p>
<b><tt>lr.write_table(modulename,outputdir="",signature="")</tt></b>
<blockquote>
Writes the LR parsing table information to a Python module.  <tt>modulename</tt> is a string
specifying the name of a module such as <tt>"parsetab"</tt>.  <tt>outputdir</tt> is the name of a
directory where the module should be created.  <tt>signature</tt> is a string representing a
grammar signature that's written into the output file. This can be used to detect when
the data stored in a module file is out-of-sync with the grammar specification (and that
the tables need to be regenerated).  If <tt>modulename</tt> is the string <tt>"parsetab"</tt>,
this function creates a file called <tt>parsetab.py</tt>.  If the module name represents a
package such as <tt>"foo.bar.parsetab"</tt>, then only the last component, <tt>"parsetab"</tt>, is
used.
</blockquote>
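<p>
The file-naming rule described above can be expressed in a few lines.  The function
<tt>table_filename()</tt> is a hypothetical helper written for this illustration; it only
computes the path and does not write anything.

```python
import os


def table_filename(modulename, outputdir=''):
    """Return the path of the module file implied by modulename and outputdir."""
    basename = modulename.split('.')[-1]    # 'foo.bar.parsetab' -> 'parsetab'
    return os.path.join(outputdir, basename + '.py')


plain = table_filename('parsetab')
packaged = table_filename('foo.bar.parsetab', 'build')
```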


<H2><a name="internal_nn7"></a>7. LRParser</H2>


The <tt>LRParser</tt> class implements the low-level LR parsing engine.


<p>
<b><tt>LRParser(lrtab, error_func)</tt></b>
<blockquote>
Create an LRParser.  <tt>lrtab</tt> is an instance of <tt>LRTable</tt>
containing the LR production and state tables.  <tt>error_func</tt> is the
error function to invoke in the event of a parsing error.
</blockquote>

An instance <tt>p</tt> of <tt>LRParser</tt> has the following methods:

<p>
<b><tt>p.parse(input=None,lexer=None,debug=0,tracking=0,tokenfunc=None)</tt></b>
<blockquote>
Run the parser.  <tt>input</tt> is a string, which if supplied is fed into the
lexer using its <tt>input()</tt> method.  <tt>lexer</tt> is an instance of the
<tt>Lexer</tt> class to use for tokenizing.  If not supplied, the last lexer
created with the <tt>lex</tt> module is used.   <tt>debug</tt> is a boolean flag
that enables debugging.   <tt>tracking</tt> is a boolean flag that tells the
parser to perform additional line number tracking.  <tt>tokenfunc</tt> is a callable
function that returns the next token.  If supplied, the parser will use it to get
all tokens.
</blockquote>
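<p>
A <tt>tokenfunc</tt> can be any zero-argument callable.  The sketch below builds one from a
list of <tt>(type, value)</tt> pairs.  <tt>SimpleToken</tt> and <tt>make_tokenfunc()</tt> are
hypothetical names; the attribute set (<tt>type</tt>, <tt>value</tt>, <tt>lineno</tt>,
<tt>lexpos</tt>) and the use of <tt>None</tt> to signal end of input are assumptions modeled on
how PLY lexers behave.

```python
class SimpleToken:
    """Hypothetical token object with the attributes a PLY-style parser reads."""
    def __init__(self, type, value, lineno=1, lexpos=0):
        self.type = type
        self.value = value
        self.lineno = lineno
        self.lexpos = lexpos


def make_tokenfunc(pairs):
    """Return a callable that hands out one token per call, then None."""
    iterator = iter(pairs)

    def tokenfunc():
        pair = next(iterator, None)
        if pair is None:
            return None             # assumed end-of-input signal
        return SimpleToken(*pair)

    return tokenfunc


get_token = make_tokenfunc([('NUMBER', 3), ('PLUS', '+'), ('NUMBER', 4)])
# The callable would be supplied as p.parse(tokenfunc=get_token).
```

<p>
This makes it possible to feed the parser from a source other than a <tt>lex</tt> lexer,
such as a pre-tokenized list.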

<p>
<b><tt>p.restart()</tt></b>
<blockquote>
Resets the parser state for a parse already in progress.
</blockquote>

<H2><a name="internal_nn8"></a>8. ParserReflect</H2>


<p>
The <tt>ParserReflect</tt> class is used to collect parser specification data
from a Python module or object.   This class is what collects all of the
<tt>p_rule()</tt> functions in a PLY file, performs basic error checking,
and collects all of the needed information to build a grammar.    Most of the
high-level PLY interface as used by the <tt>yacc()</tt> function is actually
implemented by this class.

<p>
<b><tt>ParserReflect(pdict, log=None)</tt></b>
<blockquote>
Creates a <tt>ParserReflect</tt> instance. <tt>pdict</tt> is a dictionary
containing parser specification data.  This dictionary typically corresponds
to the module or class dictionary of code that implements a PLY parser.
<tt>log</tt> is a logger instance that will be used to report error
messages.
</blockquote>

An instance <tt>p</tt> of <tt>ParserReflect</tt> has the following methods:

<p>
<b><tt>p.get_all()</tt></b>
<blockquote>
Collect and store all required parsing information.
</blockquote>

<p>
<b><tt>p.validate_all()</tt></b>
<blockquote>
Validate all of the collected parsing information.  This is a separate step
from <tt>p.get_all()</tt> as a performance optimization.  To
reduce parser start-up time, a parser can elect to validate the
parsing data only when regenerating the parsing tables.   The validation
step tries to collect as much information as possible rather than
raising an exception at the first sign of trouble.  The attribute
<tt>p.error</tt> is set if there are any validation errors.  The
value of this attribute is also returned.
</blockquote>

<p>
<b><tt>p.signature()</tt></b>
<blockquote>
Compute a signature representing the contents of the collected parsing
data.  The signature value should change if anything in the parser
specification has changed in a way that would justify parser table
regeneration.   This method can be called after <tt>p.get_all()</tt>,
but before <tt>p.validate_all()</tt>.
</blockquote>
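<p>
The idea behind the signature can be sketched as a digest over the parts of the
specification that affect the tables.  The functions <tt>spec_signature()</tt> and
<tt>needs_regeneration()</tt> are hypothetical, and the choice of inputs and of SHA-256 is an
assumption for illustration; it is not claimed to match what <tt>p.signature()</tt> computes.

```python
import hashlib


def spec_signature(start, precedence, tokens, docstrings):
    """Digest the pieces of a parser specification that influence the tables."""
    digest = hashlib.sha256()
    for part in (repr(start), repr(precedence), ' '.join(tokens)):
        digest.update(part.encode('utf-8'))
    for doc in docstrings:
        digest.update((doc or '').encode('utf-8'))
    return digest.hexdigest()


def needs_regeneration(stored_signature, current_signature):
    """Tables are stale whenever the stored signature no longer matches."""
    return stored_signature != current_signature


tokens = ['NUMBER', 'PLUS']
prec = (('left', 'PLUS'),)
old = spec_signature('expr', prec, tokens, ['expr : expr PLUS expr', 'expr : NUMBER'])
same = spec_signature('expr', prec, tokens, ['expr : expr PLUS expr', 'expr : NUMBER'])
new = spec_signature('expr', prec, tokens, ['expr : expr PLUS term', 'expr : NUMBER'])
```

<p>
Computing the signature before validation lets a caller skip the more expensive
validation step whenever the stored tables are still current.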

The following attributes are set in the process of collecting data:

<p>
<b><tt>p.start</tt></b>
<blockquote>
The grammar start symbol, if any. Taken from <tt>pdict['start']</tt>.
</blockquote>

<p>
<b><tt>p.error_func</tt></b>
<blockquote>
The error handling function or <tt>None</tt>. Taken from <tt>pdict['p_error']</tt>.
</blockquote>

<p>
<b><tt>p.tokens</tt></b>
<blockquote>
The token list. Taken from <tt>pdict['tokens']</tt>.
</blockquote>

<p>
<b><tt>p.prec</tt></b>
<blockquote>
The precedence specifier.  Taken from <tt>pdict['precedence']</tt>.
</blockquote>

<p>
<b><tt>p.preclist</tt></b>
<blockquote>
A parsed version of the precedence specifier.  A list of tuples of the form
<tt>(token,assoc,level)</tt> where <tt>token</tt> is the terminal symbol,
<tt>assoc</tt> is the associativity (e.g., <tt>'left'</tt>) and <tt>level</tt>
is a numeric precedence level.
</blockquote>
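<p>
The relationship between <tt>p.prec</tt> and <tt>p.preclist</tt> can be sketched as a simple
flattening.  <tt>build_preclist()</tt> is a hypothetical helper, and numbering the levels
from 1 (lowest precedence first) is an assumption made for this example.

```python
def build_preclist(precedence):
    """Flatten a precedence table into (token, assoc, level) tuples."""
    preclist = []
    for level, entry in enumerate(precedence, start=1):
        assoc = entry[0]                # e.g. 'left', 'right', 'nonassoc'
        for token in entry[1:]:
            preclist.append((token, assoc, level))
    return preclist


precedence = (
    ('left', 'PLUS', 'MINUS'),
    ('left', 'TIMES', 'DIVIDE'),
    ('right', 'UMINUS'),
)
preclist = build_preclist(precedence)
```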

<p>
<b><tt>p.grammar</tt></b>
<blockquote>
A list of tuples <tt>(name, rules)</tt> representing the grammar rules. <tt>name</tt> is the
name of a Python function or method in <tt>pdict</tt> that starts with <tt>"p_"</tt>.
<tt>rules</tt> is a list of tuples <tt>(filename,line,prodname,syms)</tt> representing
the grammar rules found in the documentation string of that function. <tt>filename</tt> and <tt>line</tt> contain location
information that can be used for debugging. <tt>prodname</tt> is the name of the
production. <tt>syms</tt> is the right-hand side of the production.  If you have a
function like this

<pre>
def p_expr(p):
    '''expr : expr PLUS expr
            | expr MINUS expr
            | expr TIMES expr
            | expr DIVIDE expr'''
</pre>

then the corresponding entry in <tt>p.grammar</tt> might look like this:

<pre>
('p_expr', [ ('calc.py',10,'expr', ['expr','PLUS','expr']),
             ('calc.py',11,'expr', ['expr','MINUS','expr']),
             ('calc.py',12,'expr', ['expr','TIMES','expr']),
             ('calc.py',13,'expr', ['expr','DIVIDE','expr'])
           ])
</pre>
</blockquote>
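<p>
The step from a documentation string to the <tt>rules</tt> list can be approximated as
follows.  <tt>parse_rules()</tt> is a hypothetical, simplified helper: it assumes one rule
per line, whitespace-separated symbols, and that <tt>first_line</tt> is the line number of
the first line of the documentation string.

```python
def parse_rules(doc, filename, first_line):
    """Split a grammar docstring into (filename, line, prodname, syms) tuples."""
    rules = []
    prodname = None
    for offset, text in enumerate(doc.splitlines()):
        parts = text.split()
        if not parts:
            continue                        # skip blank lines
        if parts[0] == '|':
            # Continuation line: reuse the most recent left-hand side.
            if prodname is None:
                raise SyntaxError("misplaced '|' in grammar rule")
            syms = parts[1:]
        else:
            if parts[1:2] not in ([':'], ['::=']):
                raise SyntaxError('syntax error in rule %r' % text.strip())
            prodname = parts[0]
            syms = parts[2:]
        rules.append((filename, first_line + offset, prodname, syms))
    return rules


doc = '''expr : expr PLUS expr
            | expr MINUS expr
            | expr TIMES expr
            | expr DIVIDE expr'''
rules = parse_rules(doc, 'calc.py', 10)
```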

<p>
<b><tt>p.pfuncs</tt></b>
<blockquote>
A sorted list of tuples <tt>(line, file, name, doc)</tt> representing all of
the <tt>p_</tt> functions found. <tt>line</tt> and <tt>file</tt> give location
information.  <tt>name</tt> is the name of the function. <tt>doc</tt> is the
documentation string.   This list is sorted in ascending order by line number.
</blockquote>
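<p>
A list of this shape can be gathered from a dictionary by inspecting code objects.  The
sketch below uses a hypothetical helper, <tt>collect_pfuncs()</tt>; skipping
<tt>p_error</tt> is an assumption based on it being reported separately as
<tt>p.error_func</tt>.

```python
def collect_pfuncs(pdict):
    """Return (line, file, name, doc) tuples for p_ functions, sorted by line."""
    found = []
    for name, item in pdict.items():
        if not name.startswith('p_') or name == 'p_error':
            continue
        code = getattr(item, '__code__', None)
        if code is None:
            continue                        # not a plain Python function
        found.append((code.co_firstlineno, code.co_filename, name, item.__doc__))
    # Sort on location and name only, so a missing docstring is never compared.
    found.sort(key=lambda entry: entry[:3])
    return found


def p_term(p):
    'term : NUMBER'


def p_expr(p):
    'expr : term'


def p_error(p):
    pass


# Deliberately listed out of order to show that sorting follows line numbers.
pfuncs = collect_pfuncs({'p_error': p_error, 'p_expr': p_expr, 'p_term': p_term})
```

<p>
Sorting by line number keeps the rules in the order in which they appear in the source
file.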

<p>
<b><tt>p.files</tt></b>
<blockquote>
A dictionary holding all of the source filenames that were encountered
while collecting parser information.  Only the keys of this dictionary have
any meaning.
</blockquote>

<p>
<b><tt>p.error</tt></b>
<blockquote>
An attribute that indicates whether or not any critical errors
occurred in validation.  If this is set, it means that some kind
of problem was detected and that no further processing should be
performed.
</blockquote>


<H2><a name="internal_nn9"></a>9. High-level operation</H2>


Using all of the above classes requires some attention to detail.  The <tt>yacc()</tt>
function carries out a very specific sequence of operations to create a grammar.
This same sequence should be emulated if you build an alternative PLY interface.

<ol>
<li>A <tt>ParserReflect</tt> object is created and raw grammar specification data is
collected.
<li>A <tt>Grammar</tt> object is created and populated with information
from the specification data.
<li>An <tt>LRGeneratedTable</tt> object is created to run the LALR algorithm over
the <tt>Grammar</tt> object.
<li>Productions in the <tt>LRGeneratedTable</tt> are bound to callables using the <tt>bind_callables()</tt>
method.
<li>An <tt>LRParser</tt> object is created from the information in the
<tt>LRGeneratedTable</tt> object.
</ol>
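<p>
The five steps above might be strung together roughly as follows.  This is a hedged
sketch rather than the body of <tt>yacc()</tt>: <tt>build_parser()</tt> is a hypothetical
name, error handling is reduced to a single exception, table caching is omitted, and the
<tt>Grammar</tt> calls assume the <tt>set_precedence()</tt>, <tt>add_production()</tt>, and
<tt>set_start()</tt> methods of the <tt>Grammar</tt> class.  The import is placed
inside the function so that the definition loads even where PLY is not installed.

```python
def build_parser(pdict, method='LALR', log=None):
    """Sketch of the yacc() sequence; calling it requires PLY to be installed."""
    from ply.yacc import ParserReflect, Grammar, LRGeneratedTable, LRParser

    # 1. Collect and validate the raw specification data.
    pinfo = ParserReflect(pdict)
    pinfo.get_all()
    if pinfo.error or pinfo.validate_all():
        raise ValueError('unable to build parser from this specification')

    # 2. Populate a Grammar object from the collected data.
    grammar = Grammar(pinfo.tokens)
    for token, assoc, level in pinfo.preclist:
        grammar.set_precedence(token, assoc, level)
    for funcname, rules in pinfo.grammar:
        for filename, line, prodname, syms in rules:
            grammar.add_production(prodname, syms, funcname, filename, line)
    grammar.set_start(pinfo.start)

    # 3. Construct the LR tables from the grammar.
    lr = LRGeneratedTable(grammar, method, log)

    # 4. Bind the action names stored in the productions to callables.
    lr.bind_callables(pdict)

    # 5. Create the parsing engine from the finished tables.
    return LRParser(lr, pinfo.error_func)
```

<p>
A caller would typically pass a module dictionary, for example
<tt>build_parser(vars(calc))</tt> for a hypothetical module <tt>calc</tt> that defines
<tt>tokens</tt> and its <tt>p_</tt> functions.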

</body>
</html>