16498Snate@binkert.orgPLY (Python Lex-Yacc) Version 3.2 22632Sstever@eecs.umich.edu 36498Snate@binkert.orgCopyright (C) 2001-2009, 46498Snate@binkert.orgDavid M. Beazley (Dabeaz LLC) 56498Snate@binkert.orgAll rights reserved. 62632Sstever@eecs.umich.edu 76498Snate@binkert.orgRedistribution and use in source and binary forms, with or without 86498Snate@binkert.orgmodification, are permitted provided that the following conditions are 96498Snate@binkert.orgmet: 102632Sstever@eecs.umich.edu 116498Snate@binkert.org* Redistributions of source code must retain the above copyright notice, 126498Snate@binkert.org this list of conditions and the following disclaimer. 136498Snate@binkert.org* Redistributions in binary form must reproduce the above copyright notice, 146498Snate@binkert.org this list of conditions and the following disclaimer in the documentation 156498Snate@binkert.org and/or other materials provided with the distribution. 166498Snate@binkert.org* Neither the name of the David Beazley or Dabeaz LLC may be used to 176498Snate@binkert.org endorse or promote products derived from this software without 186498Snate@binkert.org specific prior written permission. 192632Sstever@eecs.umich.edu 206498Snate@binkert.orgTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 216498Snate@binkert.org"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 226498Snate@binkert.orgLIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 236498Snate@binkert.orgA PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 246498Snate@binkert.orgOWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 256498Snate@binkert.orgSPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 266498Snate@binkert.orgLIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 276498Snate@binkert.orgDATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 286498Snate@binkert.orgTHEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 296498Snate@binkert.org(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 306498Snate@binkert.orgOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 312632Sstever@eecs.umich.edu 322632Sstever@eecs.umich.eduIntroduction 332632Sstever@eecs.umich.edu============ 342632Sstever@eecs.umich.edu 352632Sstever@eecs.umich.eduPLY is a 100% Python implementation of the common parsing tools lex 366498Snate@binkert.organd yacc. Here are a few highlights: 372632Sstever@eecs.umich.edu 386498Snate@binkert.org - PLY is very closely modeled after traditional lex/yacc. 392632Sstever@eecs.umich.edu If you know how to use these tools in C, you will find PLY 402632Sstever@eecs.umich.edu to be similar. 412632Sstever@eecs.umich.edu 422632Sstever@eecs.umich.edu - PLY provides *very* extensive error reporting and diagnostic 432632Sstever@eecs.umich.edu information to assist in parser construction. The original 442632Sstever@eecs.umich.edu implementation was developed for instructional purposes. As 452632Sstever@eecs.umich.edu a result, the system tries to identify the most common types 462632Sstever@eecs.umich.edu of errors made by novice users. 472632Sstever@eecs.umich.edu 482632Sstever@eecs.umich.edu - PLY provides full support for empty productions, error recovery, 492632Sstever@eecs.umich.edu precedence specifiers, and moderately ambiguous grammars. 502632Sstever@eecs.umich.edu 512632Sstever@eecs.umich.edu - Parsing is based on LR-parsing which is fast, memory efficient, 522632Sstever@eecs.umich.edu better suited to large grammars, and which has a number of nice 532632Sstever@eecs.umich.edu properties when dealing with syntax errors and other parsing problems. 546498Snate@binkert.org Currently, PLY builds its parsing tables using the LALR(1) 556498Snate@binkert.org algorithm used in yacc. 562632Sstever@eecs.umich.edu 574479Sbinkertn@umich.edu - PLY uses Python introspection features to build lexers and parsers. 584479Sbinkertn@umich.edu This greatly simplifies the task of parser construction since it reduces 594479Sbinkertn@umich.edu the number of files and eliminates the need to run a separate lex/yacc 604479Sbinkertn@umich.edu tool before running your program. 612632Sstever@eecs.umich.edu 622632Sstever@eecs.umich.edu - PLY can be used to build parsers for "real" programming languages. 632632Sstever@eecs.umich.edu Although it is not ultra-fast due to its Python implementation, 642632Sstever@eecs.umich.edu PLY can be used to parse grammars consisting of several hundred 652632Sstever@eecs.umich.edu rules (as might be found for a language like C). The lexer and LR 662632Sstever@eecs.umich.edu parser are also reasonably efficient when parsing typically 676498Snate@binkert.org sized programs. People have used PLY to build parsers for 686498Snate@binkert.org C, C++, ADA, and other real programming languages. 692632Sstever@eecs.umich.edu 702632Sstever@eecs.umich.eduHow to Use 712632Sstever@eecs.umich.edu========== 722632Sstever@eecs.umich.edu 734479Sbinkertn@umich.eduPLY consists of two files : lex.py and yacc.py. These are contained 744479Sbinkertn@umich.eduwithin the 'ply' directory which may also be used as a Python package. 754479Sbinkertn@umich.eduTo use PLY, simply copy the 'ply' directory to your project and import 764479Sbinkertn@umich.edulex and yacc from the associated 'ply' package. For example: 774479Sbinkertn@umich.edu 784479Sbinkertn@umich.edu import ply.lex as lex 794479Sbinkertn@umich.edu import ply.yacc as yacc 804479Sbinkertn@umich.edu 814479Sbinkertn@umich.eduAlternatively, you can copy just the files lex.py and yacc.py 824479Sbinkertn@umich.eduindividually and use them as modules. For example: 834479Sbinkertn@umich.edu 844479Sbinkertn@umich.edu import lex 854479Sbinkertn@umich.edu import yacc 864479Sbinkertn@umich.edu 874479Sbinkertn@umich.eduThe file setup.py can be used to install ply using distutils. 882632Sstever@eecs.umich.edu 892632Sstever@eecs.umich.eduThe file doc/ply.html contains complete documentation on how to use 902632Sstever@eecs.umich.eduthe system. 912632Sstever@eecs.umich.edu 922632Sstever@eecs.umich.eduThe example directory contains several different examples including a 934479Sbinkertn@umich.eduPLY specification for ANSI C as given in K&R 2nd Ed. 942632Sstever@eecs.umich.edu 952632Sstever@eecs.umich.eduA simple example is found at the end of this document 962632Sstever@eecs.umich.edu 972632Sstever@eecs.umich.eduRequirements 982632Sstever@eecs.umich.edu============ 996498Snate@binkert.orgPLY requires the use of Python 2.2 or greater. However, you should 1004479Sbinkertn@umich.eduuse the latest Python release if possible. It should work on just 1014479Sbinkertn@umich.eduabout any platform. PLY has been tested with both CPython and Jython. 1026498Snate@binkert.orgIt also seems to work with IronPython. 1032632Sstever@eecs.umich.edu 1042632Sstever@eecs.umich.eduResources 1052632Sstever@eecs.umich.edu========= 1062632Sstever@eecs.umich.eduMore information about PLY can be obtained on the PLY webpage at: 1072632Sstever@eecs.umich.edu 1084479Sbinkertn@umich.edu http://www.dabeaz.com/ply 1092632Sstever@eecs.umich.edu 1102632Sstever@eecs.umich.eduFor a detailed overview of parsing theory, consult the excellent 1112632Sstever@eecs.umich.edubook "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and 1122632Sstever@eecs.umich.eduUllman. The topics found in "Lex & Yacc" by Levine, Mason, and Brown 1132632Sstever@eecs.umich.edumay also be useful. 1142632Sstever@eecs.umich.edu 1154479Sbinkertn@umich.eduA Google group for PLY can be found at 1164479Sbinkertn@umich.edu 1174479Sbinkertn@umich.edu http://groups.google.com/group/ply-hack 1182632Sstever@eecs.umich.edu 1192632Sstever@eecs.umich.eduAcknowledgments 1202632Sstever@eecs.umich.edu=============== 1212632Sstever@eecs.umich.eduA special thanks is in order for all of the students in CS326 who 1222632Sstever@eecs.umich.edusuffered through about 25 different versions of these tools :-). 1232632Sstever@eecs.umich.edu 1244479Sbinkertn@umich.eduThe CHANGES file acknowledges those who have contributed patches. 1254479Sbinkertn@umich.edu 1264479Sbinkertn@umich.eduElias Ioup did the first implementation of LALR(1) parsing in PLY-1.x. 1274479Sbinkertn@umich.eduAndrew Waters and Markus Schoepflin were instrumental in reporting bugs 1284479Sbinkertn@umich.eduand testing a revised LALR(1) implementation for PLY-2.0. 1294479Sbinkertn@umich.edu 1306498Snate@binkert.orgSpecial Note for PLY-3.0 1314479Sbinkertn@umich.edu======================== 1326498Snate@binkert.orgPLY-3.0 the first PLY release to support Python 3. However, backwards 1336498Snate@binkert.orgcompatibility with Python 2.2 is still preserved. PLY provides dual 1346498Snate@binkert.orgPython 2/3 compatibility by restricting its implementation to a common 1356498Snate@binkert.orgsubset of basic language features. You should not convert PLY using 1366498Snate@binkert.org2to3--it is not necessary and may in fact break the implementation. 1374479Sbinkertn@umich.edu 1382632Sstever@eecs.umich.eduExample 1392632Sstever@eecs.umich.edu======= 1402632Sstever@eecs.umich.edu 1414479Sbinkertn@umich.eduHere is a simple example showing a PLY implementation of a calculator 1424479Sbinkertn@umich.eduwith variables. 1432632Sstever@eecs.umich.edu 1442632Sstever@eecs.umich.edu# ----------------------------------------------------------------------------- 1452632Sstever@eecs.umich.edu# calc.py 1462632Sstever@eecs.umich.edu# 1472632Sstever@eecs.umich.edu# A simple calculator with variables. 1482632Sstever@eecs.umich.edu# ----------------------------------------------------------------------------- 1492632Sstever@eecs.umich.edu 1502632Sstever@eecs.umich.edutokens = ( 1512632Sstever@eecs.umich.edu 'NAME','NUMBER', 1522632Sstever@eecs.umich.edu 'PLUS','MINUS','TIMES','DIVIDE','EQUALS', 1532632Sstever@eecs.umich.edu 'LPAREN','RPAREN', 1542632Sstever@eecs.umich.edu ) 1552632Sstever@eecs.umich.edu 1562632Sstever@eecs.umich.edu# Tokens 1572632Sstever@eecs.umich.edu 1582632Sstever@eecs.umich.edut_PLUS = r'\+' 1592632Sstever@eecs.umich.edut_MINUS = r'-' 1602632Sstever@eecs.umich.edut_TIMES = r'\*' 1612632Sstever@eecs.umich.edut_DIVIDE = r'/' 1622632Sstever@eecs.umich.edut_EQUALS = r'=' 1632632Sstever@eecs.umich.edut_LPAREN = r'\(' 1642632Sstever@eecs.umich.edut_RPAREN = r'\)' 1652632Sstever@eecs.umich.edut_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*' 1662632Sstever@eecs.umich.edu 1672632Sstever@eecs.umich.edudef t_NUMBER(t): 1682632Sstever@eecs.umich.edu r'\d+' 1696498Snate@binkert.org t.value = int(t.value) 1702632Sstever@eecs.umich.edu return t 1712632Sstever@eecs.umich.edu 1722632Sstever@eecs.umich.edu# Ignored characters 1732632Sstever@eecs.umich.edut_ignore = " \t" 1742632Sstever@eecs.umich.edu 1752632Sstever@eecs.umich.edudef t_newline(t): 1762632Sstever@eecs.umich.edu r'\n+' 1774479Sbinkertn@umich.edu t.lexer.lineno += t.value.count("\n") 1782632Sstever@eecs.umich.edu 1792632Sstever@eecs.umich.edudef t_error(t): 1802632Sstever@eecs.umich.edu print "Illegal character '%s'" % t.value[0] 1814479Sbinkertn@umich.edu t.lexer.skip(1) 1822632Sstever@eecs.umich.edu 1832632Sstever@eecs.umich.edu# Build the lexer 1844479Sbinkertn@umich.eduimport ply.lex as lex 1852632Sstever@eecs.umich.edulex.lex() 1862632Sstever@eecs.umich.edu 1872632Sstever@eecs.umich.edu# Precedence rules for the arithmetic operators 1882632Sstever@eecs.umich.eduprecedence = ( 1892632Sstever@eecs.umich.edu ('left','PLUS','MINUS'), 1902632Sstever@eecs.umich.edu ('left','TIMES','DIVIDE'), 1912632Sstever@eecs.umich.edu ('right','UMINUS'), 1922632Sstever@eecs.umich.edu ) 1932632Sstever@eecs.umich.edu 1942632Sstever@eecs.umich.edu# dictionary of names (for storing variables) 1952632Sstever@eecs.umich.edunames = { } 1962632Sstever@eecs.umich.edu 1974479Sbinkertn@umich.edudef p_statement_assign(p): 1982632Sstever@eecs.umich.edu 'statement : NAME EQUALS expression' 1994479Sbinkertn@umich.edu names[p[1]] = p[3] 2002632Sstever@eecs.umich.edu 2014479Sbinkertn@umich.edudef p_statement_expr(p): 2022632Sstever@eecs.umich.edu 'statement : expression' 2034479Sbinkertn@umich.edu print p[1] 2042632Sstever@eecs.umich.edu 2054479Sbinkertn@umich.edudef p_expression_binop(p): 2062632Sstever@eecs.umich.edu '''expression : expression PLUS expression 2072632Sstever@eecs.umich.edu | expression MINUS expression 2082632Sstever@eecs.umich.edu | expression TIMES expression 2092632Sstever@eecs.umich.edu | expression DIVIDE expression''' 2104479Sbinkertn@umich.edu if p[2] == '+' : p[0] = p[1] + p[3] 2114479Sbinkertn@umich.edu elif p[2] == '-': p[0] = p[1] - p[3] 2124479Sbinkertn@umich.edu elif p[2] == '*': p[0] = p[1] * p[3] 2134479Sbinkertn@umich.edu elif p[2] == '/': p[0] = p[1] / p[3] 2142632Sstever@eecs.umich.edu 2154479Sbinkertn@umich.edudef p_expression_uminus(p): 2162632Sstever@eecs.umich.edu 'expression : MINUS expression %prec UMINUS' 2174479Sbinkertn@umich.edu p[0] = -p[2] 2182632Sstever@eecs.umich.edu 2194479Sbinkertn@umich.edudef p_expression_group(p): 2202632Sstever@eecs.umich.edu 'expression : LPAREN expression RPAREN' 2214479Sbinkertn@umich.edu p[0] = p[2] 2222632Sstever@eecs.umich.edu 2234479Sbinkertn@umich.edudef p_expression_number(p): 2242632Sstever@eecs.umich.edu 'expression : NUMBER' 2254479Sbinkertn@umich.edu p[0] = p[1] 2262632Sstever@eecs.umich.edu 2274479Sbinkertn@umich.edudef p_expression_name(p): 2282632Sstever@eecs.umich.edu 'expression : NAME' 2292632Sstever@eecs.umich.edu try: 2304479Sbinkertn@umich.edu p[0] = names[p[1]] 2312632Sstever@eecs.umich.edu except LookupError: 2324479Sbinkertn@umich.edu print "Undefined name '%s'" % p[1] 2334479Sbinkertn@umich.edu p[0] = 0 2342632Sstever@eecs.umich.edu 2354479Sbinkertn@umich.edudef p_error(p): 2364479Sbinkertn@umich.edu print "Syntax error at '%s'" % p.value 2372632Sstever@eecs.umich.edu 2384479Sbinkertn@umich.eduimport ply.yacc as yacc 2392632Sstever@eecs.umich.eduyacc.yacc() 2402632Sstever@eecs.umich.edu 2412632Sstever@eecs.umich.eduwhile 1: 2422632Sstever@eecs.umich.edu try: 2432632Sstever@eecs.umich.edu s = raw_input('calc > ') 2442632Sstever@eecs.umich.edu except EOFError: 2452632Sstever@eecs.umich.edu break 2462632Sstever@eecs.umich.edu yacc.parse(s) 2472632Sstever@eecs.umich.edu 2482632Sstever@eecs.umich.edu 2494479Sbinkertn@umich.eduBug Reports and Patches 2504479Sbinkertn@umich.edu======================= 2516498Snate@binkert.orgMy goal with PLY is to simply have a decent lex/yacc implementation 2526498Snate@binkert.orgfor Python. As a general rule, I don't spend huge amounts of time 2536498Snate@binkert.orgworking on it unless I receive very specific bug reports and/or 2546498Snate@binkert.orgpatches to fix problems. I also try to incorporate submitted feature 2556498Snate@binkert.orgrequests and enhancements into each new version. To contact me about 2566498Snate@binkert.orgbugs and/or new features, please send email to dave@dabeaz.com. 2572632Sstever@eecs.umich.edu 2584479Sbinkertn@umich.eduIn addition there is a Google group for discussing PLY related issues at 2592632Sstever@eecs.umich.edu 2604479Sbinkertn@umich.edu http://groups.google.com/group/ply-hack 2614479Sbinkertn@umich.edu 2624479Sbinkertn@umich.edu-- Dave 2632632Sstever@eecs.umich.edu 2642632Sstever@eecs.umich.edu 2652632Sstever@eecs.umich.edu 2662632Sstever@eecs.umich.edu 2672632Sstever@eecs.umich.edu 2682632Sstever@eecs.umich.edu 2692632Sstever@eecs.umich.edu 2702632Sstever@eecs.umich.edu 2712632Sstever@eecs.umich.edu 272