<html>
<head>
<title>PLY Internals</title>
</head>
<body bgcolor="#ffffff">

<h1>PLY Internals</h1>

<b>
David M. Beazley <br>
dave@dabeaz.com<br>
</b>

<p>
<b>PLY Version: 3.0</b>
<p>

<!-- INDEX -->
<div class="sectiontoc">
<ul>
<li><a href="#internal_nn1">Introduction</a>
<li><a href="#internal_nn2">Grammar Class</a>
<li><a href="#internal_nn3">Productions</a>
<li><a href="#internal_nn4">LRItems</a>
<li><a href="#internal_nn5">LRTable</a>
<li><a href="#internal_nn6">LRGeneratedTable</a>
<li><a href="#internal_nn7">LRParser</a>
<li><a href="#internal_nn8">ParserReflect</a>
<li><a href="#internal_nn9">High-level operation</a>
</ul>
</div>
<!-- INDEX -->


<H2><a name="internal_nn1"></a>1. Introduction</H2>


This document describes classes and functions that make up the internal
operation of PLY. Using this programming interface, it is possible to
manually build a parser using a different interface specification
than what PLY normally uses.
For example, you could build a grammar
from information parsed in a completely different input format. Some of
these objects may be useful for building more advanced parsing engines
such as GLR.

<p>
It should be stressed that using PLY at this level is not for the
faint of heart. Generally, it's assumed that you know a bit of
the underlying compiler theory and how an LR parser is put together.

<H2><a name="internal_nn2"></a>2. Grammar Class</H2>


The file <tt>ply.yacc</tt> defines a class <tt>Grammar</tt> that
is used to hold and manipulate information about a grammar
specification. It encapsulates the same basic information
about a grammar that is put into a YACC file including
the list of tokens, precedence rules, and grammar rules.
Various operations are provided to perform different validations
on the grammar. In addition, there are operations to compute
the first and follow sets that are needed by the various table
generation algorithms.

<p>
<tt><b>Grammar(terminals)</b></tt>

<blockquote>
Creates a new grammar object. <tt>terminals</tt> is a list of strings
specifying the terminals for the grammar.
An instance <tt>g</tt> of
<tt>Grammar</tt> has the following methods:
</blockquote>

<p>
<b><tt>g.set_precedence(term,assoc,level)</tt></b>
<blockquote>
Sets the precedence level and associativity for a given terminal <tt>term</tt>.
<tt>assoc</tt> is one of <tt>'right'</tt>,
<tt>'left'</tt>, or <tt>'nonassoc'</tt> and <tt>level</tt> is a positive integer. The higher
the value of <tt>level</tt>, the higher the precedence. Here is an example of typical
precedence settings:

<pre>
g.set_precedence('PLUS',  'left',1)
g.set_precedence('MINUS', 'left',1)
g.set_precedence('TIMES', 'left',2)
g.set_precedence('DIVIDE','left',2)
g.set_precedence('UMINUS','left',3)
</pre>

This method must be called prior to adding any productions to the
grammar with <tt>g.add_production()</tt>. The precedence of individual grammar
rules is determined by the precedence of the right-most terminal.

</blockquote>
<p>
<b><tt>g.add_production(name,syms,func=None,file='',line=0)</tt></b>
<blockquote>
Adds a new grammar rule. <tt>name</tt> is the name of the rule,
<tt>syms</tt> is a list of symbols making up the right-hand
side of the rule, and <tt>func</tt> is the function to call when
reducing the rule.
<tt>file</tt> and <tt>line</tt> specify
the filename and line number of the rule and are used for
generating error messages.

<p>
The list of symbols in <tt>syms</tt> may include character
literals and <tt>%prec</tt> specifiers. Here are some
examples:

<pre>
g.add_production('expr',['expr','PLUS','term'],func,file,line)
g.add_production('expr',['expr','"+"','term'],func,file,line)
g.add_production('expr',['MINUS','expr','%prec','UMINUS'],func,file,line)
</pre>

<p>
If any kind of error is detected, a <tt>GrammarError</tt> exception
is raised with a message indicating the reason for the failure.
</blockquote>

<p>
<b><tt>g.set_start(start=None)</tt></b>
<blockquote>
Sets the starting rule for the grammar. <tt>start</tt> is a string
specifying the name of the start rule. If <tt>start</tt> is omitted,
the first grammar rule added with <tt>add_production()</tt> is taken to be
the starting rule. This method must always be called after all
productions have been added.
</blockquote>

<p>
<b><tt>g.find_unreachable()</tt></b>
<blockquote>
Diagnostic function. Returns a list of all unreachable non-terminals
defined in the grammar.
This is used to identify inactive parts of
the grammar specification.
</blockquote>

<p>
<b><tt>g.infinite_cycle()</tt></b>
<blockquote>
Diagnostic function. Returns a list of all non-terminals in the
grammar that result in an infinite cycle. This condition occurs if
there is no way for a grammar rule to expand to a string containing
only terminal symbols.
</blockquote>

<p>
<b><tt>g.undefined_symbols()</tt></b>
<blockquote>
Diagnostic function. Returns a list of tuples <tt>(name, prod)</tt>
corresponding to undefined symbols in the grammar. <tt>name</tt> is the
name of the undefined symbol and <tt>prod</tt> is an instance of
<tt>Production</tt> which has information about the production rule
where the undefined symbol was used.
</blockquote>

<p>
<b><tt>g.unused_terminals()</tt></b>
<blockquote>
Diagnostic function. Returns a list of terminals that were defined,
but never used in the grammar.
</blockquote>

<p>
<b><tt>g.unused_rules()</tt></b>
<blockquote>
Diagnostic function. Returns a list of <tt>Production</tt> instances
corresponding to production rules that were defined in the grammar,
but never used anywhere.
This is slightly different
from <tt>find_unreachable()</tt>.
</blockquote>

<p>
<b><tt>g.unused_precedence()</tt></b>
<blockquote>
Diagnostic function. Returns a list of tuples <tt>(term, assoc)</tt>
corresponding to precedence rules that were set, but never used in the
grammar. <tt>term</tt> is the terminal name and <tt>assoc</tt> is the
precedence associativity (e.g., <tt>'left'</tt>, <tt>'right'</tt>,
or <tt>'nonassoc'</tt>).
</blockquote>

<p>
<b><tt>g.compute_first()</tt></b>
<blockquote>
Compute all of the first sets for all symbols in the grammar. Returns a dictionary
mapping symbol names to a list of all first symbols.
</blockquote>

<p>
<b><tt>g.compute_follow()</tt></b>
<blockquote>
Compute all of the follow sets for all non-terminals in the grammar.
The follow set is the set of all possible symbols that might follow a
given non-terminal. Returns a dictionary mapping non-terminal names
to a list of symbols.
</blockquote>

<p>
<b><tt>g.build_lritems()</tt></b>
<blockquote>
Calculates all of the LR items for all productions in the grammar. This
step is required before using the grammar for any kind of table generation.
See the section on LR items below.
</blockquote>

<p>
The following attributes are set by the above methods and may be useful
in code that works with the grammar. All of these attributes should be
assumed to be read-only. Changing their values directly will likely
break the grammar.

<p>
<b><tt>g.Productions</tt></b>
<blockquote>
A list of all productions added. The first entry is reserved for
a production representing the starting rule. The objects in this list
are instances of the <tt>Production</tt> class, described shortly.
</blockquote>

<p>
<b><tt>g.Prodnames</tt></b>
<blockquote>
A dictionary mapping the names of nonterminals to a list of all
productions of that nonterminal.
</blockquote>

<p>
<b><tt>g.Terminals</tt></b>
<blockquote>
A dictionary mapping the names of terminals to a list of the
production numbers where they are used.
</blockquote>

<p>
<b><tt>g.Nonterminals</tt></b>
<blockquote>
A dictionary mapping the names of nonterminals to a list of the
production numbers where they are used.
</blockquote>

<p>
<b><tt>g.First</tt></b>
<blockquote>
A dictionary representing the first sets for all grammar symbols. This is
computed and returned by the <tt>compute_first()</tt> method.
</blockquote>

<p>
<b><tt>g.Follow</tt></b>
<blockquote>
A dictionary representing the follow sets for all grammar rules. This is
computed and returned by the <tt>compute_follow()</tt> method.
</blockquote>

<p>
<b><tt>g.Start</tt></b>
<blockquote>
Starting symbol for the grammar. Set by the <tt>set_start()</tt> method.
</blockquote>

For the purposes of debugging, a <tt>Grammar</tt> object supports the <tt>__len__()</tt> and
<tt>__getitem__()</tt> special methods. Accessing <tt>g[n]</tt> returns the nth production
from the grammar.


<H2><a name="internal_nn3"></a>3. Productions</H2>


<tt>Grammar</tt> objects store grammar rules as instances of a <tt>Production</tt> class. This
class has no public constructor--you should only create productions by calling <tt>Grammar.add_production()</tt>.
The following attributes are available on a <tt>Production</tt> instance <tt>p</tt>.

<p>
<b><tt>p.name</tt></b>
<blockquote>
The name of the production. For a grammar rule such as <tt>A : B C D</tt>, this is <tt>'A'</tt>.
</blockquote>

<p>
<b><tt>p.prod</tt></b>
<blockquote>
A tuple of symbols making up the right-hand side of the production. For a grammar rule such as <tt>A : B C D</tt>, this is <tt>('B','C','D')</tt>.
</blockquote>

<p>
<b><tt>p.number</tt></b>
<blockquote>
Production number. An integer containing the index of the production in the grammar's <tt>Productions</tt> list.
</blockquote>

<p>
<b><tt>p.func</tt></b>
<blockquote>
The name of the reduction function associated with the production.
This is the function that will execute when reducing the entire
grammar rule during parsing.
</blockquote>

<p>
<b><tt>p.callable</tt></b>
<blockquote>
The callable object associated with the name in <tt>p.func</tt>. This is <tt>None</tt>
unless the production has been bound using <tt>bind()</tt>.
</blockquote>

<p>
<b><tt>p.file</tt></b>
<blockquote>
Filename associated with the production. Typically this is the file where the production was defined.
Used for error messages.
</blockquote>

<p>
<b><tt>p.lineno</tt></b>
<blockquote>
Line number associated with the production. Typically this is the line number in <tt>p.file</tt> where the production was defined. Used for error messages.
</blockquote>

<p>
<b><tt>p.prec</tt></b>
<blockquote>
Precedence and associativity associated with the production. This is a tuple <tt>(assoc,level)</tt> where
<tt>assoc</tt> is one of <tt>'left'</tt>, <tt>'right'</tt>, or <tt>'nonassoc'</tt> and <tt>level</tt> is
an integer. This value is determined by the precedence of the right-most terminal symbol in the production
or by use of the <tt>%prec</tt> specifier when adding the production.
</blockquote>

<p>
<b><tt>p.usyms</tt></b>
<blockquote>
A list of all unique symbols found in the production.
</blockquote>

<p>
<b><tt>p.lr_items</tt></b>
<blockquote>
A list of all LR items for this production. This attribute only has a meaningful value if the
<tt>Grammar.build_lritems()</tt> method has been called. The items in this list are
instances of <tt>LRItem</tt> described below.
</blockquote>

<p>
<b><tt>p.lr_next</tt></b>
<blockquote>
The head of a linked-list representation of the LR items in <tt>p.lr_items</tt>.
This attribute only has a meaningful value if the <tt>Grammar.build_lritems()</tt>
method has been called. Each <tt>LRItem</tt> instance has a <tt>lr_next</tt> attribute
to move to the next item. The list is terminated by <tt>None</tt>.
</blockquote>

<p>
<b><tt>p.bind(dict)</tt></b>
<blockquote>
Binds the production function name in <tt>p.func</tt> to a callable object in
<tt>dict</tt>. This operation is typically carried out in the last step
prior to running the parsing engine and is needed since parsing tables are typically
read from files which only include the function names, not the functions themselves.
</blockquote>

<P>
<tt>Production</tt> objects support
the <tt>__len__()</tt>, <tt>__getitem__()</tt>, and <tt>__str__()</tt>
special methods.
<tt>len(p)</tt> returns the number of symbols in <tt>p.prod</tt>
and <tt>p[n]</tt> is the same as <tt>p.prod[n]</tt>.

<H2><a name="internal_nn4"></a>4. LRItems</H2>


The construction of parsing tables in an LR-based parser generator is primarily
done over a set of "LR Items".
An LR item represents a stage of parsing one
of the grammar rules. To compute the LR items, it is first necessary to
call <tt>Grammar.build_lritems()</tt>. Once this step is complete, all of the productions
in the grammar will have their LR items attached to them.

<p>
Here is an interactive session that shows what LR items look like.
In this example, <tt>g</tt> is a <tt>Grammar</tt>
object.

<blockquote>
<pre>
>>> <b>g.build_lritems()</b>
>>> <b>p = g[1]</b>
>>> <b>p</b>
Production(statement -> ID = expr)
>>>
</pre>
</blockquote>

In the above code, <tt>p</tt> represents the first grammar rule. In
this case, it is the rule <tt>'statement -> ID = expr'</tt>.

<p>
Now, let's look at the LR items for <tt>p</tt>.

<blockquote>
<pre>
>>> <b>p.lr_items</b>
[LRItem(statement -> . ID = expr),
 LRItem(statement -> ID . = expr),
 LRItem(statement -> ID = . expr),
 LRItem(statement -> ID = expr .)]
>>>
</pre>
</blockquote>

In each LR item, the dot (.) represents a specific stage of parsing. In each successive LR item, the dot
is advanced by one symbol.
It is only when the dot reaches the very end that a production
is successfully parsed.

<p>
An instance <tt>lr</tt> of <tt>LRItem</tt> has the following
attributes that hold information related to that specific stage of
parsing.

<p>
<b><tt>lr.name</tt></b>
<blockquote>
The name of the grammar rule. For example, <tt>'statement'</tt> in the above example.
</blockquote>

<p>
<b><tt>lr.prod</tt></b>
<blockquote>
A tuple of symbols representing the right-hand side of the production, including the
special <tt>'.'</tt> character. For example, <tt>('ID','.','=','expr')</tt>.
</blockquote>

<p>
<b><tt>lr.number</tt></b>
<blockquote>
An integer representing the production number in the grammar.
</blockquote>

<p>
<b><tt>lr.usyms</tt></b>
<blockquote>
A set of unique symbols in the production. Inherited from the original <tt>Production</tt> instance.
</blockquote>

<p>
<b><tt>lr.lr_index</tt></b>
<blockquote>
An integer representing the position of the dot (.). You should never use <tt>lr.prod.index()</tt>
to search for it--the result will be wrong if the grammar happens to also use (.)
as a character
literal.
</blockquote>

<p>
<b><tt>lr.lr_after</tt></b>
<blockquote>
A list of all productions that can legally appear immediately to the right of the
dot (.). This list contains <tt>Production</tt> instances. This attribute
represents all of the possible branches a parse can take from the current position.
For example, suppose that <tt>lr</tt> represents a stage immediately before
an expression like this:

<pre>
>>> <b>lr</b>
LRItem(statement -> ID = . expr)
>>>
</pre>

Then, the value of <tt>lr.lr_after</tt> might look like this, showing all productions that
can legally appear next:

<pre>
>>> <b>lr.lr_after</b>
[Production(expr -> expr PLUS expr),
 Production(expr -> expr MINUS expr),
 Production(expr -> expr TIMES expr),
 Production(expr -> expr DIVIDE expr),
 Production(expr -> MINUS expr),
 Production(expr -> LPAREN expr RPAREN),
 Production(expr -> NUMBER),
 Production(expr -> ID)]
>>>
</pre>

</blockquote>

<p>
<b><tt>lr.lr_before</tt></b>
<blockquote>
The grammar symbol that appears immediately
before the dot (.) or <tt>None</tt> if
at the beginning of the parse.
</blockquote>

<p>
<b><tt>lr.lr_next</tt></b>
<blockquote>
A link to the next LR item, representing the next stage of the parse. <tt>None</tt> if <tt>lr</tt>
is the last LR item.
</blockquote>

<tt>LRItem</tt> instances also support the <tt>__len__()</tt> and <tt>__getitem__()</tt> special methods.
<tt>len(lr)</tt> returns the number of items in <tt>lr.prod</tt> including the dot (.). <tt>lr[n]</tt>
returns <tt>lr.prod[n]</tt>.

<p>
It goes without saying that all of the attributes associated with LR
items should be assumed to be read-only. Modifications will very
likely create a small black-hole that will consume you and your code.

<H2><a name="internal_nn5"></a>5. LRTable</H2>


The <tt>LRTable</tt> class is used to represent LR parsing table data. This
minimally includes the production list, action table, and goto table.

<p>
<b><tt>LRTable()</tt></b>
<blockquote>
Create an empty LRTable object. This object contains only the information needed to
run an LR parser.
</blockquote>

An instance <tt>lrtab</tt> of <tt>LRTable</tt> has the following methods:

<p>
<b><tt>lrtab.read_table(module)</tt></b>
<blockquote>
Populates the LR table with information from the module specified in <tt>module</tt>.
<tt>module</tt> is either a module object already loaded with <tt>import</tt> or
the name of a Python module. If it's a string containing a module name, it is
loaded and parsing data is extracted. Returns the signature value that was used
when initially writing the tables. Raises a <tt>VersionError</tt> exception if
the module was created using an incompatible version of PLY.
</blockquote>

<p>
<b><tt>lrtab.bind_callables(dict)</tt></b>
<blockquote>
This binds all of the function names used in productions to callable objects
found in the dictionary <tt>dict</tt>. During table generation and when reading
LR tables from files, PLY only uses the names of action functions such as <tt>'p_expr'</tt>,
<tt>'p_statement'</tt>, etc. In order to actually run the parser, these names
have to be bound to callable objects. This method is always called prior to
running a parser.
</blockquote>

After <tt>lrtab</tt> has been populated, the following attributes are defined.

<p>
<b><tt>lrtab.lr_method</tt></b>
<blockquote>
The LR parsing method used (e.g., <tt>'LALR'</tt>)
</blockquote>


<p>
<b><tt>lrtab.lr_productions</tt></b>
<blockquote>
The production list. If the parsing tables have been newly
constructed, this will be a list of <tt>Production</tt> instances. If
the parsing tables have been read from a file, it's a list
of <tt>MiniProduction</tt> instances. This, together
with <tt>lr_action</tt> and <tt>lr_goto</tt>, contains all of the
information needed by the LR parsing engine.
</blockquote>

<p>
<b><tt>lrtab.lr_action</tt></b>
<blockquote>
The LR action dictionary that implements the underlying state machine.
The keys of this dictionary are the LR states.
</blockquote>

<p>
<b><tt>lrtab.lr_goto</tt></b>
<blockquote>
The LR goto table that contains information about grammar rule reductions.
</blockquote>


<H2><a name="internal_nn6"></a>6. LRGeneratedTable</H2>


The <tt>LRGeneratedTable</tt> class represents constructed LR parsing tables on a
grammar. It is a subclass of <tt>LRTable</tt>.
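<p>
To make the roles of an action table and a goto table more concrete, here is a
minimal, self-contained sketch of a table-driven LR loop. It does not use PLY.
The names <tt>PRODUCTIONS</tt>, <tt>ACTION</tt>, <tt>GOTO</tt>, and <tt>parse()</tt>
are hypothetical, the tables were written by hand for a toy grammar, and the integer
encoding (positive to shift, negative to reduce, zero to accept) is an assumption
made for this illustration rather than a documented guarantee about
<tt>lr_action</tt>.

```python
# Toy grammar (hypothetical, hand-numbered):
#   0: S'   -> list
#   1: list -> list ITEM
#   2: list -> ITEM
# Each entry is (name, number of right-hand-side symbols).
PRODUCTIONS = [("S'", 1), ("list", 2), ("list", 1)]

# ACTION[state][token]: positive = shift and go to that state,
# negative = reduce by production number -n, zero = accept.
# This encoding is an assumption chosen for the sketch.
ACTION = {
    0: {"ITEM": 2},
    1: {"ITEM": 3, "$end": 0},
    2: {"ITEM": -2, "$end": -2},
    3: {"ITEM": -1, "$end": -1},
}

# GOTO[state][nonterminal]: state to enter after a reduction.
GOTO = {0: {"list": 1}}


def parse(tokens):
    """Run the LR loop and return the production numbers reduced, in order."""
    stream = list(tokens) + ["$end"]
    stack = [0]            # stack of LR states
    pos = 0                # index of the lookahead token
    reductions = []
    while True:
        state = stack[-1]
        act = ACTION[state].get(stream[pos])
        if act is None:
            raise SyntaxError("unexpected token %r in state %d" % (stream[pos], state))
        if act > 0:
            # Shift: push the new state and consume the token.
            stack.append(act)
            pos += 1
        elif act == 0:
            # Accept: the whole input has been recognized.
            return reductions
        else:
            # Reduce: pop one state per right-hand-side symbol,
            # then consult the goto table for the rule name.
            name, length = PRODUCTIONS[-act]
            del stack[len(stack) - length:]
            stack.append(GOTO[stack[-1]][name])
            reductions.append(-act)


print(parse(["ITEM", "ITEM", "ITEM"]))
```

<p>
In this sketch the action table decides what to do with the next token, while the
goto table is consulted only after a reduction, which matches the description of the
goto table as holding information about grammar rule reductions.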

<p>
<b><tt>LRGeneratedTable(grammar, method='LALR',log=None)</tt></b>
<blockquote>
Create the LR parsing tables for a grammar. <tt>grammar</tt> is an instance of <tt>Grammar</tt>,
<tt>method</tt> is a string with the parsing method (<tt>'SLR'</tt> or <tt>'LALR'</tt>), and
<tt>log</tt> is a logger object used to write debugging information. The debugging information
written to <tt>log</tt> is the same as what appears in the <tt>parser.out</tt> file created
by yacc. By supplying a custom logger with a different message format, it is possible to get
more information (e.g., the line number in <tt>yacc.py</tt> used for issuing each line of
output in the log). The result is an instance of <tt>LRGeneratedTable</tt>.
</blockquote>

<p>
An instance <tt>lr</tt> of <tt>LRGeneratedTable</tt> has the following attributes.

<p>
<b><tt>lr.grammar</tt></b>
<blockquote>
A link to the Grammar object used to construct the parsing tables.
</blockquote>

<p>
<b><tt>lr.lr_method</tt></b>
<blockquote>
The LR parsing method used (e.g., <tt>'LALR'</tt>).
</blockquote>


<p>
<b><tt>lr.lr_productions</tt></b>
<blockquote>
A reference to <tt>grammar.Productions</tt>.
This list, together with <tt>lr_action</tt> and <tt>lr_goto</tt>,
contains all of the information needed by the LR parsing engine.
</blockquote>

<p>
<b><tt>lr.lr_action</tt></b>
<blockquote>
The LR action dictionary that implements the underlying state machine. The keys of this dictionary are
the LR states.
</blockquote>

<p>
<b><tt>lr.lr_goto</tt></b>
<blockquote>
The LR goto table that contains information about grammar rule reductions.
</blockquote>

<p>
<b><tt>lr.sr_conflicts</tt></b>
<blockquote>
A list of tuples <tt>(state,token,resolution)</tt> identifying all shift/reduce conflicts. <tt>state</tt> is the LR state
number where the conflict occurred, <tt>token</tt> is the token causing the conflict, and <tt>resolution</tt> is
a string describing the resolution taken. <tt>resolution</tt> is either <tt>'shift'</tt> or <tt>'reduce'</tt>.
</blockquote>

<p>
<b><tt>lr.rr_conflicts</tt></b>
<blockquote>
A list of tuples <tt>(state,rule,rejected)</tt> identifying all reduce/reduce conflicts. <tt>state</tt> is the
LR state number where the conflict occurred, <tt>rule</tt> is the production rule that was selected,
and <tt>rejected</tt> is the production rule that was rejected.
Both <tt>rule</tt> and <tt>rejected</tt> are
instances of <tt>Production</tt>. They can be inspected to provide the user with more information.
</blockquote>

<p>
<tt>LRGeneratedTable</tt> has two public methods: <tt>bind_callables()</tt>, which is inherited
from <tt>LRTable</tt> and described above, and the following.

<p>
<b><tt>lr.write_table(modulename,outputdir="",signature="")</tt></b>
<blockquote>
Writes the LR parsing table information to a Python module. <tt>modulename</tt> is a string
specifying the name of a module such as <tt>"parsetab"</tt>. <tt>outputdir</tt> is the name of a
directory where the module should be created. <tt>signature</tt> is a string representing a
grammar signature that's written into the output file. This can be used to detect when
the data stored in a module file is out-of-sync with the grammar specification (and that
the tables need to be regenerated). If <tt>modulename</tt> is a string <tt>"parsetab"</tt>,
this function creates a file called <tt>parsetab.py</tt>. If the module name represents a
package such as <tt>"foo.bar.parsetab"</tt>, then only the last component, <tt>"parsetab"</tt>, is
used.
</blockquote>


<H2><a name="internal_nn7"></a>7. LRParser</H2>


The <tt>LRParser</tt> class implements the low-level LR parsing engine.
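
<p>
To make the roles of the action and goto tables concrete, here is a minimal, self-contained
sketch of a table-driven LR recognizer. It is not PLY's code and does not use <tt>LRParser</tt>:
the toy grammar, the hand-built tables, and the names <tt>productions</tt>, <tt>action</tt>,
<tt>goto</tt>, and <tt>recognize</tt> are all hypothetical and exist only for illustration.
The encoding of actions (positive number = shift to that state, negative number = reduce by
that production, zero = accept) is an assumption chosen for this example.

```python
# Toy grammar (hypothetical, for illustration only):
#   0: S' -> E
#   1: E  -> E PLUS NUM
#   2: E  -> NUM
# Each entry is (left-hand side, length of right-hand side).
productions = [("S'", 1), ("E", 3), ("E", 1)]

# action[state][token]: positive = shift to that state,
# negative = reduce by production number -value, 0 = accept.
action = {
    0: {"NUM": 2},
    1: {"PLUS": 3, "$end": 0},
    2: {"PLUS": -2, "$end": -2},
    3: {"NUM": 4},
    4: {"PLUS": -1, "$end": -1},
}

# goto[state][nonterminal]: state to enter after reducing to that nonterminal.
goto = {0: {"E": 1}}


def recognize(tokens):
    """Return True if the sequence of token types is a sentence of the toy grammar."""
    stack = [0]                           # stack of LR states; 0 is the start state
    stream = list(tokens) + ["$end"]      # append the end-of-input marker
    pos = 0
    while True:
        act = action[stack[-1]].get(stream[pos])
        if act is None:                   # no table entry: syntax error
            return False
        if act == 0:                      # accept
            return True
        if act > 0:                       # shift: push the new state, consume the token
            stack.append(act)
            pos += 1
        else:                             # reduce: pop one state per right-hand-side symbol
            lhs, length = productions[-act]
            if length:
                del stack[-length:]
            stack.append(goto[stack[-1]][lhs])


print(recognize(["NUM", "PLUS", "NUM"]))  # True
```

<p>
The sketch only recognizes input; a real engine would also keep a stack of symbol values and
call the action function bound to each production at the point where the reduction happens.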


<p>
<b><tt>LRParser(lrtab, error_func)</tt></b>
<blockquote>
Create an LRParser. <tt>lrtab</tt> is an instance of <tt>LRTable</tt>
containing the LR production and state tables. <tt>error_func</tt> is the
error function to invoke in the event of a parsing error.
</blockquote>

An instance <tt>p</tt> of <tt>LRParser</tt> has the following methods:

<p>
<b><tt>p.parse(input=None,lexer=None,debug=0,tracking=0,tokenfunc=None)</tt></b>
<blockquote>
Run the parser. <tt>input</tt> is a string, which if supplied is fed into the
lexer using its <tt>input()</tt> method. <tt>lexer</tt> is an instance of the
<tt>Lexer</tt> class to use for tokenizing. If not supplied, the last lexer
created with the <tt>lex</tt> module is used. <tt>debug</tt> is a boolean flag
that enables debugging. <tt>tracking</tt> is a boolean flag that tells the
parser to perform additional line number tracking. <tt>tokenfunc</tt> is a
callable that returns the next token. If supplied, the parser will use it to get
all tokens.
</blockquote>

<p>
<b><tt>p.restart()</tt></b>
<blockquote>
Resets the parser state for a parse already in progress.
</blockquote>

<H2><a name="internal_nn8"></a>8.
ParserReflect</H2>


<p>
The <tt>ParserReflect</tt> class is used to collect parser specification data
from a Python module or object. This class is what collects all of the
<tt>p_rule()</tt> functions in a PLY file, performs basic error checking,
and collects all of the needed information to build a grammar. Most of the
high-level PLY interface as used by the <tt>yacc()</tt> function is actually
implemented by this class.

<p>
<b><tt>ParserReflect(pdict, log=None)</tt></b>
<blockquote>
Creates a <tt>ParserReflect</tt> instance. <tt>pdict</tt> is a dictionary
containing parser specification data. This dictionary typically corresponds
to the module or class dictionary of code that implements a PLY parser.
<tt>log</tt> is a logger instance that will be used to report error
messages.
</blockquote>

An instance <tt>p</tt> of <tt>ParserReflect</tt> has the following methods:

<p>
<b><tt>p.get_all()</tt></b>
<blockquote>
Collect and store all required parsing information.
</blockquote>

<p>
<b><tt>p.validate_all()</tt></b>
<blockquote>
Validate all of the collected parsing information. This is a separate step
from <tt>p.get_all()</tt> as a performance optimization.
To reduce parser start-up time, a parser can elect to only validate the
parsing data when regenerating the parsing tables. The validation
step tries to collect as much information as possible rather than
raising an exception at the first sign of trouble. The attribute
<tt>p.error</tt> is set if there are any validation errors. The
value of this attribute is also returned.
</blockquote>

<p>
<b><tt>p.signature()</tt></b>
<blockquote>
Compute a signature representing the contents of the collected parsing
data. The signature value should change if anything in the parser
specification has changed in a way that would justify parser table
regeneration. This method can be called after <tt>p.get_all()</tt>,
but before <tt>p.validate_all()</tt>.
</blockquote>

The following attributes are set in the process of collecting data:

<p>
<b><tt>p.start</tt></b>
<blockquote>
The grammar start symbol, if any. Taken from <tt>pdict['start']</tt>.
</blockquote>

<p>
<b><tt>p.error_func</tt></b>
<blockquote>
The error handling function or <tt>None</tt>. Taken from <tt>pdict['p_error']</tt>.
</blockquote>

<p>
<b><tt>p.tokens</tt></b>
<blockquote>
The token list. Taken from <tt>pdict['tokens']</tt>.
</blockquote>

<p>
<b><tt>p.prec</tt></b>
<blockquote>
The precedence specifier. Taken from <tt>pdict['precedence']</tt>.
</blockquote>

<p>
<b><tt>p.preclist</tt></b>
<blockquote>
A parsed version of the precedence specifier. A list of tuples of the form
<tt>(token,assoc,level)</tt> where <tt>token</tt> is the terminal symbol,
<tt>assoc</tt> is the associativity (e.g., <tt>'left'</tt>) and <tt>level</tt>
is a numeric precedence level.
</blockquote>

<p>
<b><tt>p.grammar</tt></b>
<blockquote>
A list of tuples <tt>(name, rules)</tt> representing the grammar rules. <tt>name</tt> is the
name of a Python function or method in <tt>pdict</tt> that starts with <tt>"p_"</tt>.
<tt>rules</tt> is a list of tuples <tt>(filename,line,prodname,syms)</tt> representing
the grammar rules found in the documentation string of that function. <tt>filename</tt> and <tt>line</tt> contain location
information that can be used for debugging. <tt>prodname</tt> is the name of the
production. <tt>syms</tt> is the right-hand side of the production.
If you have a
function like this

<pre>
def p_expr(p):
    '''expr : expr PLUS expr
            | expr MINUS expr
            | expr TIMES expr
            | expr DIVIDE expr'''
</pre>

then the corresponding entry in <tt>p.grammar</tt> might look like this:

<pre>
('p_expr', [ ('calc.py',10,'expr', ['expr','PLUS','expr']),
             ('calc.py',11,'expr', ['expr','MINUS','expr']),
             ('calc.py',12,'expr', ['expr','TIMES','expr']),
             ('calc.py',13,'expr', ['expr','DIVIDE','expr'])
           ])
</pre>
</blockquote>

<p>
<b><tt>p.pfuncs</tt></b>
<blockquote>
A sorted list of tuples <tt>(line, file, name, doc)</tt> representing all of
the <tt>p_</tt> functions found. <tt>line</tt> and <tt>file</tt> give location
information. <tt>name</tt> is the name of the function. <tt>doc</tt> is the
documentation string. This list is sorted in ascending order by line number.
</blockquote>

<p>
<b><tt>p.files</tt></b>
<blockquote>
A dictionary holding all of the source filenames that were encountered
while collecting parser information. Only the keys of this dictionary have
any meaning.
</blockquote>

<p>
<b><tt>p.error</tt></b>
<blockquote>
An attribute that indicates whether or not any critical errors
occurred in validation. If this is set, it means that some kind
of problem was detected and that no further processing should be
performed.
</blockquote>


<H2><a name="internal_nn9"></a>9. High-level operation</H2>


Using all of the above classes requires some attention to detail. The <tt>yacc()</tt>
function carries out a very specific sequence of operations to create a parser.
This same sequence should be emulated if you build an alternative PLY interface.

<ol>
<li>A <tt>ParserReflect</tt> object is created and raw grammar specification data is
collected.
<li>A <tt>Grammar</tt> object is created and populated with information
from the specification data.
<li>An <tt>LRGeneratedTable</tt> object is created to run the LALR algorithm over
the <tt>Grammar</tt> object.
<li>Productions in the <tt>LRGeneratedTable</tt> object are bound to callables using the <tt>bind_callables()</tt>
method.
<li>An <tt>LRParser</tt> object is created from the information in the
<tt>LRGeneratedTable</tt> object.
</ol>

</body>
</html>