README revision 4479
14479Sbinkertn@umich.eduPLY (Python Lex-Yacc) Version 2.3 (February 18, 2007) 22632Sstever@eecs.umich.edu 34479Sbinkertn@umich.eduDavid M. Beazley (dave@dabeaz.com) 42632Sstever@eecs.umich.edu 54479Sbinkertn@umich.eduCopyright (C) 2001-2007 David M. Beazley 62632Sstever@eecs.umich.edu 72632Sstever@eecs.umich.eduThis library is free software; you can redistribute it and/or 82632Sstever@eecs.umich.edumodify it under the terms of the GNU Lesser General Public 92632Sstever@eecs.umich.eduLicense as published by the Free Software Foundation; either 102632Sstever@eecs.umich.eduversion 2.1 of the License, or (at your option) any later version. 112632Sstever@eecs.umich.edu 122632Sstever@eecs.umich.eduThis library is distributed in the hope that it will be useful, 132632Sstever@eecs.umich.edubut WITHOUT ANY WARRANTY; without even the implied warranty of 142632Sstever@eecs.umich.eduMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU 152632Sstever@eecs.umich.eduLesser General Public License for more details. 162632Sstever@eecs.umich.edu 172632Sstever@eecs.umich.eduYou should have received a copy of the GNU Lesser General Public 182632Sstever@eecs.umich.eduLicense along with this library; if not, write to the Free Software 192632Sstever@eecs.umich.eduFoundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA 202632Sstever@eecs.umich.edu 212632Sstever@eecs.umich.eduSee the file COPYING for a complete copy of the LGPL. 222632Sstever@eecs.umich.edu 232632Sstever@eecs.umich.eduIntroduction 242632Sstever@eecs.umich.edu============ 252632Sstever@eecs.umich.edu 262632Sstever@eecs.umich.eduPLY is a 100% Python implementation of the common parsing tools lex 272632Sstever@eecs.umich.eduand yacc. Although several other parsing tools are available for 282632Sstever@eecs.umich.eduPython, there are several reasons why you might want to consider PLY: 292632Sstever@eecs.umich.edu 302632Sstever@eecs.umich.edu - The tools are very closely modeled after traditional lex/yacc. 312632Sstever@eecs.umich.edu If you know how to use these tools in C, you will find PLY 322632Sstever@eecs.umich.edu to be similar. 332632Sstever@eecs.umich.edu 342632Sstever@eecs.umich.edu - PLY provides *very* extensive error reporting and diagnostic 352632Sstever@eecs.umich.edu information to assist in parser construction. The original 362632Sstever@eecs.umich.edu implementation was developed for instructional purposes. As 372632Sstever@eecs.umich.edu a result, the system tries to identify the most common types 382632Sstever@eecs.umich.edu of errors made by novice users. 392632Sstever@eecs.umich.edu 402632Sstever@eecs.umich.edu - PLY provides full support for empty productions, error recovery, 412632Sstever@eecs.umich.edu precedence specifiers, and moderately ambiguous grammars. 422632Sstever@eecs.umich.edu 432632Sstever@eecs.umich.edu - Parsing is based on LR-parsing which is fast, memory efficient, 442632Sstever@eecs.umich.edu better suited to large grammars, and which has a number of nice 452632Sstever@eecs.umich.edu properties when dealing with syntax errors and other parsing problems. 462632Sstever@eecs.umich.edu Currently, PLY builds its parsing tables using the SLR algorithm which 472632Sstever@eecs.umich.edu is slightly weaker than LALR(1) used in traditional yacc. 482632Sstever@eecs.umich.edu 494479Sbinkertn@umich.edu - PLY uses Python introspection features to build lexers and parsers. 504479Sbinkertn@umich.edu This greatly simplifies the task of parser construction since it reduces 514479Sbinkertn@umich.edu the number of files and eliminates the need to run a separate lex/yacc 524479Sbinkertn@umich.edu tool before running your program. 532632Sstever@eecs.umich.edu 542632Sstever@eecs.umich.edu - PLY can be used to build parsers for "real" programming languages. 552632Sstever@eecs.umich.edu Although it is not ultra-fast due to its Python implementation, 562632Sstever@eecs.umich.edu PLY can be used to parse grammars consisting of several hundred 572632Sstever@eecs.umich.edu rules (as might be found for a language like C). The lexer and LR 582632Sstever@eecs.umich.edu parser are also reasonably efficient when parsing typically 592632Sstever@eecs.umich.edu sized programs. 602632Sstever@eecs.umich.edu 612632Sstever@eecs.umich.eduThe original version of PLY was developed for an Introduction to 622632Sstever@eecs.umich.eduCompilers course where students used it to build a compiler for a 632632Sstever@eecs.umich.edusimple Pascal-like language. Their compiler had to include lexical 642632Sstever@eecs.umich.eduanalysis, parsing, type checking, type inference, and generation of 652632Sstever@eecs.umich.eduassembly code for the SPARC processor. Because of this, the current 662632Sstever@eecs.umich.eduimplementation has been extensively tested and debugged. In addition, 672632Sstever@eecs.umich.edumost of the API and error checking steps have been adapted to address 682632Sstever@eecs.umich.educommon usability problems. 692632Sstever@eecs.umich.edu 702632Sstever@eecs.umich.eduHow to Use 712632Sstever@eecs.umich.edu========== 722632Sstever@eecs.umich.edu 734479Sbinkertn@umich.eduPLY consists of two files : lex.py and yacc.py. These are contained 744479Sbinkertn@umich.eduwithin the 'ply' directory which may also be used as a Python package. 754479Sbinkertn@umich.eduTo use PLY, simply copy the 'ply' directory to your project and import 764479Sbinkertn@umich.edulex and yacc from the associated 'ply' package. For example: 774479Sbinkertn@umich.edu 784479Sbinkertn@umich.edu import ply.lex as lex 794479Sbinkertn@umich.edu import ply.yacc as yacc 804479Sbinkertn@umich.edu 814479Sbinkertn@umich.eduAlternatively, you can copy just the files lex.py and yacc.py 824479Sbinkertn@umich.eduindividually and use them as modules. For example: 834479Sbinkertn@umich.edu 844479Sbinkertn@umich.edu import lex 854479Sbinkertn@umich.edu import yacc 864479Sbinkertn@umich.edu 874479Sbinkertn@umich.eduThe file setup.py can be used to install ply using distutils. 882632Sstever@eecs.umich.edu 892632Sstever@eecs.umich.eduThe file doc/ply.html contains complete documentation on how to use 902632Sstever@eecs.umich.eduthe system. 912632Sstever@eecs.umich.edu 922632Sstever@eecs.umich.eduThe example directory contains several different examples including a 934479Sbinkertn@umich.eduPLY specification for ANSI C as given in K&R 2nd Ed. 942632Sstever@eecs.umich.edu 952632Sstever@eecs.umich.eduA simple example is found at the end of this document 962632Sstever@eecs.umich.edu 972632Sstever@eecs.umich.eduRequirements 982632Sstever@eecs.umich.edu============ 994479Sbinkertn@umich.eduPLY requires the use of Python 2.1 or greater. However, you should 1004479Sbinkertn@umich.eduuse the latest Python release if possible. It should work on just 1014479Sbinkertn@umich.eduabout any platform. PLY has been tested with both CPython and Jython. 1024479Sbinkertn@umich.eduHowever, it does not seem to work with IronPython. 1032632Sstever@eecs.umich.edu 1042632Sstever@eecs.umich.eduResources 1052632Sstever@eecs.umich.edu========= 1062632Sstever@eecs.umich.eduMore information about PLY can be obtained on the PLY webpage at: 1072632Sstever@eecs.umich.edu 1084479Sbinkertn@umich.edu http://www.dabeaz.com/ply 1092632Sstever@eecs.umich.edu 1102632Sstever@eecs.umich.eduFor a detailed overview of parsing theory, consult the excellent 1112632Sstever@eecs.umich.edubook "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and 1122632Sstever@eecs.umich.eduUllman. The topics found in "Lex & Yacc" by Levine, Mason, and Brown 1132632Sstever@eecs.umich.edumay also be useful. 1142632Sstever@eecs.umich.edu 1154479Sbinkertn@umich.eduA Google group for PLY can be found at 1164479Sbinkertn@umich.edu 1174479Sbinkertn@umich.edu http://groups.google.com/group/ply-hack 1182632Sstever@eecs.umich.edu 1192632Sstever@eecs.umich.eduAcknowledgments 1202632Sstever@eecs.umich.edu=============== 1212632Sstever@eecs.umich.eduA special thanks is in order for all of the students in CS326 who 1222632Sstever@eecs.umich.edusuffered through about 25 different versions of these tools :-). 1232632Sstever@eecs.umich.edu 1244479Sbinkertn@umich.eduThe CHANGES file acknowledges those who have contributed patches. 1254479Sbinkertn@umich.edu 1264479Sbinkertn@umich.eduElias Ioup did the first implementation of LALR(1) parsing in PLY-1.x. 1274479Sbinkertn@umich.eduAndrew Waters and Markus Schoepflin were instrumental in reporting bugs 1284479Sbinkertn@umich.eduand testing a revised LALR(1) implementation for PLY-2.0. 1294479Sbinkertn@umich.edu 1304479Sbinkertn@umich.eduSpecial Note for PLY-2.x 1314479Sbinkertn@umich.edu======================== 1324479Sbinkertn@umich.eduPLY-2.0 is the first in a series of PLY releases that will be adding a 1334479Sbinkertn@umich.eduvariety of significant new features. The first release in this series 1344479Sbinkertn@umich.edu(Ply-2.0) should be 100% compatible with all previous Ply-1.x releases 1354479Sbinkertn@umich.eduexcept for the fact that Ply-2.0 features a correct implementation of 1364479Sbinkertn@umich.eduLALR(1) table generation. 1374479Sbinkertn@umich.edu 1384479Sbinkertn@umich.eduIf you have suggestions for improving PLY in future 2.x releases, please 1394479Sbinkertn@umich.educontact me. - Dave 1404479Sbinkertn@umich.edu 1412632Sstever@eecs.umich.eduExample 1422632Sstever@eecs.umich.edu======= 1432632Sstever@eecs.umich.edu 1444479Sbinkertn@umich.eduHere is a simple example showing a PLY implementation of a calculator 1454479Sbinkertn@umich.eduwith variables. 1462632Sstever@eecs.umich.edu 1472632Sstever@eecs.umich.edu# ----------------------------------------------------------------------------- 1482632Sstever@eecs.umich.edu# calc.py 1492632Sstever@eecs.umich.edu# 1502632Sstever@eecs.umich.edu# A simple calculator with variables. 1512632Sstever@eecs.umich.edu# ----------------------------------------------------------------------------- 1522632Sstever@eecs.umich.edu 1532632Sstever@eecs.umich.edutokens = ( 1542632Sstever@eecs.umich.edu 'NAME','NUMBER', 1552632Sstever@eecs.umich.edu 'PLUS','MINUS','TIMES','DIVIDE','EQUALS', 1562632Sstever@eecs.umich.edu 'LPAREN','RPAREN', 1572632Sstever@eecs.umich.edu ) 1582632Sstever@eecs.umich.edu 1592632Sstever@eecs.umich.edu# Tokens 1602632Sstever@eecs.umich.edu 1612632Sstever@eecs.umich.edut_PLUS = r'\+' 1622632Sstever@eecs.umich.edut_MINUS = r'-' 1632632Sstever@eecs.umich.edut_TIMES = r'\*' 1642632Sstever@eecs.umich.edut_DIVIDE = r'/' 1652632Sstever@eecs.umich.edut_EQUALS = r'=' 1662632Sstever@eecs.umich.edut_LPAREN = r'\(' 1672632Sstever@eecs.umich.edut_RPAREN = r'\)' 1682632Sstever@eecs.umich.edut_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*' 1692632Sstever@eecs.umich.edu 1702632Sstever@eecs.umich.edudef t_NUMBER(t): 1712632Sstever@eecs.umich.edu r'\d+' 1722632Sstever@eecs.umich.edu try: 1732632Sstever@eecs.umich.edu t.value = int(t.value) 1742632Sstever@eecs.umich.edu except ValueError: 1752632Sstever@eecs.umich.edu print "Integer value too large", t.value 1762632Sstever@eecs.umich.edu t.value = 0 1772632Sstever@eecs.umich.edu return t 1782632Sstever@eecs.umich.edu 1792632Sstever@eecs.umich.edu# Ignored characters 1802632Sstever@eecs.umich.edut_ignore = " \t" 1812632Sstever@eecs.umich.edu 1822632Sstever@eecs.umich.edudef t_newline(t): 1832632Sstever@eecs.umich.edu r'\n+' 1844479Sbinkertn@umich.edu t.lexer.lineno += t.value.count("\n") 1852632Sstever@eecs.umich.edu 1862632Sstever@eecs.umich.edudef t_error(t): 1872632Sstever@eecs.umich.edu print "Illegal character '%s'" % t.value[0] 1884479Sbinkertn@umich.edu t.lexer.skip(1) 1892632Sstever@eecs.umich.edu 1902632Sstever@eecs.umich.edu# Build the lexer 1914479Sbinkertn@umich.eduimport ply.lex as lex 1922632Sstever@eecs.umich.edulex.lex() 1932632Sstever@eecs.umich.edu 1942632Sstever@eecs.umich.edu# Precedence rules for the arithmetic operators 1952632Sstever@eecs.umich.eduprecedence = ( 1962632Sstever@eecs.umich.edu ('left','PLUS','MINUS'), 1972632Sstever@eecs.umich.edu ('left','TIMES','DIVIDE'), 1982632Sstever@eecs.umich.edu ('right','UMINUS'), 1992632Sstever@eecs.umich.edu ) 2002632Sstever@eecs.umich.edu 2012632Sstever@eecs.umich.edu# dictionary of names (for storing variables) 2022632Sstever@eecs.umich.edunames = { } 2032632Sstever@eecs.umich.edu 2044479Sbinkertn@umich.edudef p_statement_assign(p): 2052632Sstever@eecs.umich.edu 'statement : NAME EQUALS expression' 2064479Sbinkertn@umich.edu names[p[1]] = p[3] 2072632Sstever@eecs.umich.edu 2084479Sbinkertn@umich.edudef p_statement_expr(p): 2092632Sstever@eecs.umich.edu 'statement : expression' 2104479Sbinkertn@umich.edu print p[1] 2112632Sstever@eecs.umich.edu 2124479Sbinkertn@umich.edudef p_expression_binop(p): 2132632Sstever@eecs.umich.edu '''expression : expression PLUS expression 2142632Sstever@eecs.umich.edu | expression MINUS expression 2152632Sstever@eecs.umich.edu | expression TIMES expression 2162632Sstever@eecs.umich.edu | expression DIVIDE expression''' 2174479Sbinkertn@umich.edu if p[2] == '+' : p[0] = p[1] + p[3] 2184479Sbinkertn@umich.edu elif p[2] == '-': p[0] = p[1] - p[3] 2194479Sbinkertn@umich.edu elif p[2] == '*': p[0] = p[1] * p[3] 2204479Sbinkertn@umich.edu elif p[2] == '/': p[0] = p[1] / p[3] 2212632Sstever@eecs.umich.edu 2224479Sbinkertn@umich.edudef p_expression_uminus(p): 2232632Sstever@eecs.umich.edu 'expression : MINUS expression %prec UMINUS' 2244479Sbinkertn@umich.edu p[0] = -p[2] 2252632Sstever@eecs.umich.edu 2264479Sbinkertn@umich.edudef p_expression_group(p): 2272632Sstever@eecs.umich.edu 'expression : LPAREN expression RPAREN' 2284479Sbinkertn@umich.edu p[0] = p[2] 2292632Sstever@eecs.umich.edu 2304479Sbinkertn@umich.edudef p_expression_number(p): 2312632Sstever@eecs.umich.edu 'expression : NUMBER' 2324479Sbinkertn@umich.edu p[0] = p[1] 2332632Sstever@eecs.umich.edu 2344479Sbinkertn@umich.edudef p_expression_name(p): 2352632Sstever@eecs.umich.edu 'expression : NAME' 2362632Sstever@eecs.umich.edu try: 2374479Sbinkertn@umich.edu p[0] = names[p[1]] 2382632Sstever@eecs.umich.edu except LookupError: 2394479Sbinkertn@umich.edu print "Undefined name '%s'" % p[1] 2404479Sbinkertn@umich.edu p[0] = 0 2412632Sstever@eecs.umich.edu 2424479Sbinkertn@umich.edudef p_error(p): 2434479Sbinkertn@umich.edu print "Syntax error at '%s'" % p.value 2442632Sstever@eecs.umich.edu 2454479Sbinkertn@umich.eduimport ply.yacc as yacc 2462632Sstever@eecs.umich.eduyacc.yacc() 2472632Sstever@eecs.umich.edu 2482632Sstever@eecs.umich.eduwhile 1: 2492632Sstever@eecs.umich.edu try: 2502632Sstever@eecs.umich.edu s = raw_input('calc > ') 2512632Sstever@eecs.umich.edu except EOFError: 2522632Sstever@eecs.umich.edu break 2532632Sstever@eecs.umich.edu yacc.parse(s) 2542632Sstever@eecs.umich.edu 2552632Sstever@eecs.umich.edu 2564479Sbinkertn@umich.eduBug Reports and Patches 2574479Sbinkertn@umich.edu======================= 2584479Sbinkertn@umich.eduBecause of the extremely specialized and advanced nature of PLY, I 2594479Sbinkertn@umich.edurarely spend much time working on it unless I receive very specific 2604479Sbinkertn@umich.edubug-reports and/or patches to fix problems. I also try to incorporate 2614479Sbinkertn@umich.edusubmitted feature requests and enhancements into each new version. To 2624479Sbinkertn@umich.educontact me about bugs and/or new features, please send email to 2634479Sbinkertn@umich.edudave@dabeaz.com. 2642632Sstever@eecs.umich.edu 2654479Sbinkertn@umich.eduIn addition there is a Google group for discussing PLY related issues at 2662632Sstever@eecs.umich.edu 2674479Sbinkertn@umich.edu http://groups.google.com/group/ply-hack 2684479Sbinkertn@umich.edu 2694479Sbinkertn@umich.edu-- Dave 2702632Sstever@eecs.umich.edu 2712632Sstever@eecs.umich.edu 2722632Sstever@eecs.umich.edu 2732632Sstever@eecs.umich.edu 2742632Sstever@eecs.umich.edu 2752632Sstever@eecs.umich.edu 2762632Sstever@eecs.umich.edu 2772632Sstever@eecs.umich.edu 2782632Sstever@eecs.umich.edu 279