README revision 2632
12632Sstever@eecs.umich.eduPLY (Python Lex-Yacc) Version 1.2 (November 27, 2002) 22632Sstever@eecs.umich.edu 32632Sstever@eecs.umich.eduDavid M. Beazley 42632Sstever@eecs.umich.eduDepartment of Computer Science 52632Sstever@eecs.umich.eduUniversity of Chicago 62632Sstever@eecs.umich.eduChicago, IL 60637 72632Sstever@eecs.umich.edubeazley@cs.uchicago.edu 82632Sstever@eecs.umich.edu 92632Sstever@eecs.umich.eduCopyright (C) 2001 David M. Beazley 102632Sstever@eecs.umich.edu 112632Sstever@eecs.umich.edu$Header: /home/stever/bk/newmem2/ext/ply/README 1.1 03/06/06 14:53:34-00:00 stever@ $ 122632Sstever@eecs.umich.edu 132632Sstever@eecs.umich.eduThis library is free software; you can redistribute it and/or 142632Sstever@eecs.umich.edumodify it under the terms of the GNU Lesser General Public 152632Sstever@eecs.umich.eduLicense as published by the Free Software Foundation; either 162632Sstever@eecs.umich.eduversion 2.1 of the License, or (at your option) any later version. 172632Sstever@eecs.umich.edu 182632Sstever@eecs.umich.eduThis library is distributed in the hope that it will be useful, 192632Sstever@eecs.umich.edubut WITHOUT ANY WARRANTY; without even the implied warranty of 202632Sstever@eecs.umich.eduMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU 212632Sstever@eecs.umich.eduLesser General Public License for more details. 222632Sstever@eecs.umich.edu 232632Sstever@eecs.umich.eduYou should have received a copy of the GNU Lesser General Public 242632Sstever@eecs.umich.eduLicense along with this library; if not, write to the Free Software 252632Sstever@eecs.umich.eduFoundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA 262632Sstever@eecs.umich.edu 272632Sstever@eecs.umich.eduSee the file COPYING for a complete copy of the LGPL. 282632Sstever@eecs.umich.edu 292632Sstever@eecs.umich.eduIntroduction 302632Sstever@eecs.umich.edu============ 312632Sstever@eecs.umich.edu 322632Sstever@eecs.umich.eduPLY is a 100% Python implementation of the common parsing tools lex 332632Sstever@eecs.umich.eduand yacc. Although several other parsing tools are available for 342632Sstever@eecs.umich.eduPython, there are several reasons why you might want to consider PLY: 352632Sstever@eecs.umich.edu 362632Sstever@eecs.umich.edu - The tools are very closely modeled after traditional lex/yacc. 372632Sstever@eecs.umich.edu If you know how to use these tools in C, you will find PLY 382632Sstever@eecs.umich.edu to be similar. 392632Sstever@eecs.umich.edu 402632Sstever@eecs.umich.edu - PLY provides *very* extensive error reporting and diagnostic 412632Sstever@eecs.umich.edu information to assist in parser construction. The original 422632Sstever@eecs.umich.edu implementation was developed for instructional purposes. As 432632Sstever@eecs.umich.edu a result, the system tries to identify the most common types 442632Sstever@eecs.umich.edu of errors made by novice users. 452632Sstever@eecs.umich.edu 462632Sstever@eecs.umich.edu - PLY provides full support for empty productions, error recovery, 472632Sstever@eecs.umich.edu precedence specifiers, and moderately ambiguous grammars. 482632Sstever@eecs.umich.edu 492632Sstever@eecs.umich.edu - Parsing is based on LR-parsing which is fast, memory efficient, 502632Sstever@eecs.umich.edu better suited to large grammars, and which has a number of nice 512632Sstever@eecs.umich.edu properties when dealing with syntax errors and other parsing problems. 522632Sstever@eecs.umich.edu Currently, PLY builds its parsing tables using the SLR algorithm which 532632Sstever@eecs.umich.edu is slightly weaker than LALR(1) used in traditional yacc. 542632Sstever@eecs.umich.edu 552632Sstever@eecs.umich.edu - Like John Aycock's excellent SPARK toolkit, PLY uses Python 562632Sstever@eecs.umich.edu reflection to build lexers and parsers. This greatly simplifies 572632Sstever@eecs.umich.edu the task of parser construction since it reduces the number of files 582632Sstever@eecs.umich.edu and eliminates the need to run a separate lex/yacc tool before 592632Sstever@eecs.umich.edu running your program. 602632Sstever@eecs.umich.edu 612632Sstever@eecs.umich.edu - PLY can be used to build parsers for "real" programming languages. 622632Sstever@eecs.umich.edu Although it is not ultra-fast due to its Python implementation, 632632Sstever@eecs.umich.edu PLY can be used to parse grammars consisting of several hundred 642632Sstever@eecs.umich.edu rules (as might be found for a language like C). The lexer and LR 652632Sstever@eecs.umich.edu parser are also reasonably efficient when parsing typically 662632Sstever@eecs.umich.edu sized programs. 672632Sstever@eecs.umich.edu 682632Sstever@eecs.umich.eduThe original version of PLY was developed for an Introduction to 692632Sstever@eecs.umich.eduCompilers course where students used it to build a compiler for a 702632Sstever@eecs.umich.edusimple Pascal-like language. Their compiler had to include lexical 712632Sstever@eecs.umich.eduanalysis, parsing, type checking, type inference, and generation of 722632Sstever@eecs.umich.eduassembly code for the SPARC processor. Because of this, the current 732632Sstever@eecs.umich.eduimplementation has been extensively tested and debugged. In addition, 742632Sstever@eecs.umich.edumost of the API and error checking steps have been adapted to address 752632Sstever@eecs.umich.educommon usability problems. 762632Sstever@eecs.umich.edu 772632Sstever@eecs.umich.eduHow to Use 782632Sstever@eecs.umich.edu========== 792632Sstever@eecs.umich.edu 802632Sstever@eecs.umich.eduPLY consists of two files : lex.py and yacc.py. To use the system, 812632Sstever@eecs.umich.edusimply copy these files to your project and import them like standard 822632Sstever@eecs.umich.eduPython modules. 832632Sstever@eecs.umich.edu 842632Sstever@eecs.umich.eduThe file doc/ply.html contains complete documentation on how to use 852632Sstever@eecs.umich.eduthe system. 862632Sstever@eecs.umich.edu 872632Sstever@eecs.umich.eduThe example directory contains several different examples including a 882632Sstever@eecs.umich.eduPLY specification for ANSI C as given in K&R 2nd Ed. Note: To use 892632Sstever@eecs.umich.eduthe examples, you will need to copy the lex.py and yacc.py files to 902632Sstever@eecs.umich.eduthe example directory. 912632Sstever@eecs.umich.edu 922632Sstever@eecs.umich.eduA simple example is found at the end of this document 932632Sstever@eecs.umich.edu 942632Sstever@eecs.umich.eduRequirements 952632Sstever@eecs.umich.edu============ 962632Sstever@eecs.umich.eduPLY requires the use of Python 2.0 or greater. It should work on 972632Sstever@eecs.umich.edujust about any platform. 982632Sstever@eecs.umich.edu 992632Sstever@eecs.umich.eduResources 1002632Sstever@eecs.umich.edu========= 1012632Sstever@eecs.umich.edu 1022632Sstever@eecs.umich.eduMore information about PLY can be obtained on the PLY webpage at: 1032632Sstever@eecs.umich.edu 1042632Sstever@eecs.umich.edu http://systems.cs.uchicago.edu/ply 1052632Sstever@eecs.umich.edu 1062632Sstever@eecs.umich.eduFor a detailed overview of parsing theory, consult the excellent 1072632Sstever@eecs.umich.edubook "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and 1082632Sstever@eecs.umich.eduUllman. The topics found in "Lex & Yacc" by Levine, Mason, and Brown 1092632Sstever@eecs.umich.edumay also be useful. 1102632Sstever@eecs.umich.edu 1112632Sstever@eecs.umich.eduGiven that this is the first release, I welcome your comments on how 1122632Sstever@eecs.umich.eduto improve the current implementation. See the TODO file for things that 1132632Sstever@eecs.umich.edustill need to be done. 1142632Sstever@eecs.umich.edu 1152632Sstever@eecs.umich.eduAcknowledgments 1162632Sstever@eecs.umich.edu=============== 1172632Sstever@eecs.umich.edu 1182632Sstever@eecs.umich.eduA special thanks is in order for all of the students in CS326 who 1192632Sstever@eecs.umich.edusuffered through about 25 different versions of these tools :-). 1202632Sstever@eecs.umich.edu 1212632Sstever@eecs.umich.eduExample 1222632Sstever@eecs.umich.edu======= 1232632Sstever@eecs.umich.edu 1242632Sstever@eecs.umich.eduHere is a simple example showing a PLY implementation of a calculator with variables. 1252632Sstever@eecs.umich.edu 1262632Sstever@eecs.umich.edu# ----------------------------------------------------------------------------- 1272632Sstever@eecs.umich.edu# calc.py 1282632Sstever@eecs.umich.edu# 1292632Sstever@eecs.umich.edu# A simple calculator with variables. 1302632Sstever@eecs.umich.edu# ----------------------------------------------------------------------------- 1312632Sstever@eecs.umich.edu 1322632Sstever@eecs.umich.edutokens = ( 1332632Sstever@eecs.umich.edu 'NAME','NUMBER', 1342632Sstever@eecs.umich.edu 'PLUS','MINUS','TIMES','DIVIDE','EQUALS', 1352632Sstever@eecs.umich.edu 'LPAREN','RPAREN', 1362632Sstever@eecs.umich.edu ) 1372632Sstever@eecs.umich.edu 1382632Sstever@eecs.umich.edu# Tokens 1392632Sstever@eecs.umich.edu 1402632Sstever@eecs.umich.edut_PLUS = r'\+' 1412632Sstever@eecs.umich.edut_MINUS = r'-' 1422632Sstever@eecs.umich.edut_TIMES = r'\*' 1432632Sstever@eecs.umich.edut_DIVIDE = r'/' 1442632Sstever@eecs.umich.edut_EQUALS = r'=' 1452632Sstever@eecs.umich.edut_LPAREN = r'\(' 1462632Sstever@eecs.umich.edut_RPAREN = r'\)' 1472632Sstever@eecs.umich.edut_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*' 1482632Sstever@eecs.umich.edu 1492632Sstever@eecs.umich.edudef t_NUMBER(t): 1502632Sstever@eecs.umich.edu r'\d+' 1512632Sstever@eecs.umich.edu try: 1522632Sstever@eecs.umich.edu t.value = int(t.value) 1532632Sstever@eecs.umich.edu except ValueError: 1542632Sstever@eecs.umich.edu print "Integer value too large", t.value 1552632Sstever@eecs.umich.edu t.value = 0 1562632Sstever@eecs.umich.edu return t 1572632Sstever@eecs.umich.edu 1582632Sstever@eecs.umich.edu# Ignored characters 1592632Sstever@eecs.umich.edut_ignore = " \t" 1602632Sstever@eecs.umich.edu 1612632Sstever@eecs.umich.edudef t_newline(t): 1622632Sstever@eecs.umich.edu r'\n+' 1632632Sstever@eecs.umich.edu t.lineno += t.value.count("\n") 1642632Sstever@eecs.umich.edu 1652632Sstever@eecs.umich.edudef t_error(t): 1662632Sstever@eecs.umich.edu print "Illegal character '%s'" % t.value[0] 1672632Sstever@eecs.umich.edu t.skip(1) 1682632Sstever@eecs.umich.edu 1692632Sstever@eecs.umich.edu# Build the lexer 1702632Sstever@eecs.umich.eduimport lex 1712632Sstever@eecs.umich.edulex.lex() 1722632Sstever@eecs.umich.edu 1732632Sstever@eecs.umich.edu# Precedence rules for the arithmetic operators 1742632Sstever@eecs.umich.eduprecedence = ( 1752632Sstever@eecs.umich.edu ('left','PLUS','MINUS'), 1762632Sstever@eecs.umich.edu ('left','TIMES','DIVIDE'), 1772632Sstever@eecs.umich.edu ('right','UMINUS'), 1782632Sstever@eecs.umich.edu ) 1792632Sstever@eecs.umich.edu 1802632Sstever@eecs.umich.edu# dictionary of names (for storing variables) 1812632Sstever@eecs.umich.edunames = { } 1822632Sstever@eecs.umich.edu 1832632Sstever@eecs.umich.edudef p_statement_assign(t): 1842632Sstever@eecs.umich.edu 'statement : NAME EQUALS expression' 1852632Sstever@eecs.umich.edu names[t[1]] = t[3] 1862632Sstever@eecs.umich.edu 1872632Sstever@eecs.umich.edudef p_statement_expr(t): 1882632Sstever@eecs.umich.edu 'statement : expression' 1892632Sstever@eecs.umich.edu print t[1] 1902632Sstever@eecs.umich.edu 1912632Sstever@eecs.umich.edudef p_expression_binop(t): 1922632Sstever@eecs.umich.edu '''expression : expression PLUS expression 1932632Sstever@eecs.umich.edu | expression MINUS expression 1942632Sstever@eecs.umich.edu | expression TIMES expression 1952632Sstever@eecs.umich.edu | expression DIVIDE expression''' 1962632Sstever@eecs.umich.edu if t[2] == '+' : t[0] = t[1] + t[3] 1972632Sstever@eecs.umich.edu elif t[2] == '-': t[0] = t[1] - t[3] 1982632Sstever@eecs.umich.edu elif t[2] == '*': t[0] = t[1] * t[3] 1992632Sstever@eecs.umich.edu elif t[2] == '/': t[0] = t[1] / t[3] 2002632Sstever@eecs.umich.edu 2012632Sstever@eecs.umich.edudef p_expression_uminus(t): 2022632Sstever@eecs.umich.edu 'expression : MINUS expression %prec UMINUS' 2032632Sstever@eecs.umich.edu t[0] = -t[2] 2042632Sstever@eecs.umich.edu 2052632Sstever@eecs.umich.edudef p_expression_group(t): 2062632Sstever@eecs.umich.edu 'expression : LPAREN expression RPAREN' 2072632Sstever@eecs.umich.edu t[0] = t[2] 2082632Sstever@eecs.umich.edu 2092632Sstever@eecs.umich.edudef p_expression_number(t): 2102632Sstever@eecs.umich.edu 'expression : NUMBER' 2112632Sstever@eecs.umich.edu t[0] = t[1] 2122632Sstever@eecs.umich.edu 2132632Sstever@eecs.umich.edudef p_expression_name(t): 2142632Sstever@eecs.umich.edu 'expression : NAME' 2152632Sstever@eecs.umich.edu try: 2162632Sstever@eecs.umich.edu t[0] = names[t[1]] 2172632Sstever@eecs.umich.edu except LookupError: 2182632Sstever@eecs.umich.edu print "Undefined name '%s'" % t[1] 2192632Sstever@eecs.umich.edu t[0] = 0 2202632Sstever@eecs.umich.edu 2212632Sstever@eecs.umich.edudef p_error(t): 2222632Sstever@eecs.umich.edu print "Syntax error at '%s'" % t.value 2232632Sstever@eecs.umich.edu 2242632Sstever@eecs.umich.eduimport yacc 2252632Sstever@eecs.umich.eduyacc.yacc() 2262632Sstever@eecs.umich.edu 2272632Sstever@eecs.umich.eduwhile 1: 2282632Sstever@eecs.umich.edu try: 2292632Sstever@eecs.umich.edu s = raw_input('calc > ') 2302632Sstever@eecs.umich.edu except EOFError: 2312632Sstever@eecs.umich.edu break 2322632Sstever@eecs.umich.edu yacc.parse(s) 2332632Sstever@eecs.umich.edu 2342632Sstever@eecs.umich.edu 2352632Sstever@eecs.umich.edu 2362632Sstever@eecs.umich.edu 2372632Sstever@eecs.umich.edu 2382632Sstever@eecs.umich.edu 2392632Sstever@eecs.umich.edu 2402632Sstever@eecs.umich.edu 2412632Sstever@eecs.umich.edu 2422632Sstever@eecs.umich.edu 2432632Sstever@eecs.umich.edu 2442632Sstever@eecs.umich.edu 2452632Sstever@eecs.umich.edu 2462632Sstever@eecs.umich.edu 2472632Sstever@eecs.umich.edu 2482632Sstever@eecs.umich.edu 2492632Sstever@eecs.umich.edu 250