README revision 6498
1955SN/APLY (Python Lex-Yacc) Version 3.2 2955SN/A 31762SN/ACopyright (C) 2001-2009, 4955SN/ADavid M. Beazley (Dabeaz LLC) 5955SN/AAll rights reserved. 6955SN/A 7955SN/ARedistribution and use in source and binary forms, with or without 8955SN/Amodification, are permitted provided that the following conditions are 9955SN/Amet: 10955SN/A 11955SN/A* Redistributions of source code must retain the above copyright notice, 12955SN/A this list of conditions and the following disclaimer. 13955SN/A* Redistributions in binary form must reproduce the above copyright notice, 14955SN/A this list of conditions and the following disclaimer in the documentation 15955SN/A and/or other materials provided with the distribution. 16955SN/A* Neither the name of the David Beazley or Dabeaz LLC may be used to 17955SN/A endorse or promote products derived from this software without 18955SN/A specific prior written permission. 19955SN/A 20955SN/ATHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 21955SN/A"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 22955SN/ALIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 23955SN/AA PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 24955SN/AOWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 25955SN/ASPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 26955SN/ALIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 27955SN/ADATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 282665Ssaidi@eecs.umich.eduTHEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 294762Snate@binkert.org(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 30955SN/AOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 315522Snate@binkert.org 326143Snate@binkert.orgIntroduction 334762Snate@binkert.org============ 345522Snate@binkert.org 35955SN/APLY is a 100% Python implementation of the common parsing tools lex 365522Snate@binkert.organd yacc. Here are a few highlights: 37955SN/A 385522Snate@binkert.org - PLY is very closely modeled after traditional lex/yacc. 394202Sbinkertn@umich.edu If you know how to use these tools in C, you will find PLY 405742Snate@binkert.org to be similar. 41955SN/A 424381Sbinkertn@umich.edu - PLY provides *very* extensive error reporting and diagnostic 434381Sbinkertn@umich.edu information to assist in parser construction. The original 44955SN/A implementation was developed for instructional purposes. As 45955SN/A a result, the system tries to identify the most common types 46955SN/A of errors made by novice users. 474202Sbinkertn@umich.edu 48955SN/A - PLY provides full support for empty productions, error recovery, 494382Sbinkertn@umich.edu precedence specifiers, and moderately ambiguous grammars. 504382Sbinkertn@umich.edu 514382Sbinkertn@umich.edu - Parsing is based on LR-parsing which is fast, memory efficient, 526654Snate@binkert.org better suited to large grammars, and which has a number of nice 535517Snate@binkert.org properties when dealing with syntax errors and other parsing problems. 546143Snate@binkert.org Currently, PLY builds its parsing tables using the LALR(1) 556143Snate@binkert.org algorithm used in yacc. 566143Snate@binkert.org 576143Snate@binkert.org - PLY uses Python introspection features to build lexers and parsers. 586143Snate@binkert.org This greatly simplifies the task of parser construction since it reduces 596143Snate@binkert.org the number of files and eliminates the need to run a separate lex/yacc 606143Snate@binkert.org tool before running your program. 616143Snate@binkert.org 626143Snate@binkert.org - PLY can be used to build parsers for "real" programming languages. 636143Snate@binkert.org Although it is not ultra-fast due to its Python implementation, 646143Snate@binkert.org PLY can be used to parse grammars consisting of several hundred 656143Snate@binkert.org rules (as might be found for a language like C). The lexer and LR 666143Snate@binkert.org parser are also reasonably efficient when parsing typically 676143Snate@binkert.org sized programs. People have used PLY to build parsers for 686143Snate@binkert.org C, C++, ADA, and other real programming languages. 694762Snate@binkert.org 706143Snate@binkert.orgHow to Use 716143Snate@binkert.org========== 726143Snate@binkert.org 736143Snate@binkert.orgPLY consists of two files : lex.py and yacc.py. These are contained 746143Snate@binkert.orgwithin the 'ply' directory which may also be used as a Python package. 756143Snate@binkert.orgTo use PLY, simply copy the 'ply' directory to your project and import 766143Snate@binkert.orglex and yacc from the associated 'ply' package. For example: 776143Snate@binkert.org 786143Snate@binkert.org import ply.lex as lex 796143Snate@binkert.org import ply.yacc as yacc 806143Snate@binkert.org 816143Snate@binkert.orgAlternatively, you can copy just the files lex.py and yacc.py 826143Snate@binkert.orgindividually and use them as modules. For example: 836143Snate@binkert.org 846143Snate@binkert.org import lex 856143Snate@binkert.org import yacc 866143Snate@binkert.org 876143Snate@binkert.orgThe file setup.py can be used to install ply using distutils. 886143Snate@binkert.org 896143Snate@binkert.orgThe file doc/ply.html contains complete documentation on how to use 906143Snate@binkert.orgthe system. 917065Snate@binkert.org 926143Snate@binkert.orgThe example directory contains several different examples including a 936143Snate@binkert.orgPLY specification for ANSI C as given in K&R 2nd Ed. 946143Snate@binkert.org 956143Snate@binkert.orgA simple example is found at the end of this document 966143Snate@binkert.org 976143Snate@binkert.orgRequirements 986143Snate@binkert.org============ 996143Snate@binkert.orgPLY requires the use of Python 2.2 or greater. However, you should 1006143Snate@binkert.orguse the latest Python release if possible. It should work on just 1016143Snate@binkert.orgabout any platform. PLY has been tested with both CPython and Jython. 1026143Snate@binkert.orgIt also seems to work with IronPython. 1036143Snate@binkert.org 1046143Snate@binkert.orgResources 1056143Snate@binkert.org========= 1066143Snate@binkert.orgMore information about PLY can be obtained on the PLY webpage at: 1076143Snate@binkert.org 1086143Snate@binkert.org http://www.dabeaz.com/ply 1096143Snate@binkert.org 1106143Snate@binkert.orgFor a detailed overview of parsing theory, consult the excellent 1116143Snate@binkert.orgbook "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and 1126143Snate@binkert.orgUllman. The topics found in "Lex & Yacc" by Levine, Mason, and Brown 1135522Snate@binkert.orgmay also be useful. 1146143Snate@binkert.org 1156143Snate@binkert.orgA Google group for PLY can be found at 1166143Snate@binkert.org 1176143Snate@binkert.org http://groups.google.com/group/ply-hack 1186143Snate@binkert.org 1196143Snate@binkert.orgAcknowledgments 1206143Snate@binkert.org=============== 1216143Snate@binkert.orgA special thanks is in order for all of the students in CS326 who 1226143Snate@binkert.orgsuffered through about 25 different versions of these tools :-). 1236143Snate@binkert.org 1245522Snate@binkert.orgThe CHANGES file acknowledges those who have contributed patches. 1255522Snate@binkert.org 1265522Snate@binkert.orgElias Ioup did the first implementation of LALR(1) parsing in PLY-1.x. 1275522Snate@binkert.orgAndrew Waters and Markus Schoepflin were instrumental in reporting bugs 1285604Snate@binkert.organd testing a revised LALR(1) implementation for PLY-2.0. 1295604Snate@binkert.org 1306143Snate@binkert.orgSpecial Note for PLY-3.0 1316143Snate@binkert.org======================== 1324762Snate@binkert.orgPLY-3.0 the first PLY release to support Python 3. However, backwards 1334762Snate@binkert.orgcompatibility with Python 2.2 is still preserved. PLY provides dual 1346143Snate@binkert.orgPython 2/3 compatibility by restricting its implementation to a common 1356727Ssteve.reinhardt@amd.comsubset of basic language features. You should not convert PLY using 1366727Ssteve.reinhardt@amd.com2to3--it is not necessary and may in fact break the implementation. 1376727Ssteve.reinhardt@amd.com 1384762Snate@binkert.orgExample 1396143Snate@binkert.org======= 1406143Snate@binkert.org 1416143Snate@binkert.orgHere is a simple example showing a PLY implementation of a calculator 1426143Snate@binkert.orgwith variables. 1436727Ssteve.reinhardt@amd.com 1446143Snate@binkert.org# ----------------------------------------------------------------------------- 1456143Snate@binkert.org# calc.py 1466143Snate@binkert.org# 1475604Snate@binkert.org# A simple calculator with variables. 1486143Snate@binkert.org# ----------------------------------------------------------------------------- 1496143Snate@binkert.org 1506143Snate@binkert.orgtokens = ( 1514762Snate@binkert.org 'NAME','NUMBER', 1526143Snate@binkert.org 'PLUS','MINUS','TIMES','DIVIDE','EQUALS', 1534762Snate@binkert.org 'LPAREN','RPAREN', 1544762Snate@binkert.org ) 1554762Snate@binkert.org 1566143Snate@binkert.org# Tokens 1576143Snate@binkert.org 1584762Snate@binkert.orgt_PLUS = r'\+' 1596143Snate@binkert.orgt_MINUS = r'-' 1606143Snate@binkert.orgt_TIMES = r'\*' 1616143Snate@binkert.orgt_DIVIDE = r'/' 1626143Snate@binkert.orgt_EQUALS = r'=' 1634762Snate@binkert.orgt_LPAREN = r'\(' 1646143Snate@binkert.orgt_RPAREN = r'\)' 1654762Snate@binkert.orgt_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*' 1666143Snate@binkert.org 1674762Snate@binkert.orgdef t_NUMBER(t): 1686143Snate@binkert.org r'\d+' 1696143Snate@binkert.org t.value = int(t.value) 1706143Snate@binkert.org return t 1716143Snate@binkert.org 1726143Snate@binkert.org# Ignored characters 1736143Snate@binkert.orgt_ignore = " \t" 1746143Snate@binkert.org 1756143Snate@binkert.orgdef t_newline(t): 1766143Snate@binkert.org r'\n+' 1776143Snate@binkert.org t.lexer.lineno += t.value.count("\n") 1786143Snate@binkert.org 1796143Snate@binkert.orgdef t_error(t): 1806143Snate@binkert.org print "Illegal character '%s'" % t.value[0] 181955SN/A t.lexer.skip(1) 1825584Snate@binkert.org 1835584Snate@binkert.org# Build the lexer 1845584Snate@binkert.orgimport ply.lex as lex 1855584Snate@binkert.orglex.lex() 1866143Snate@binkert.org 1876143Snate@binkert.org# Precedence rules for the arithmetic operators 1886143Snate@binkert.orgprecedence = ( 1895584Snate@binkert.org ('left','PLUS','MINUS'), 1904382Sbinkertn@umich.edu ('left','TIMES','DIVIDE'), 1914202Sbinkertn@umich.edu ('right','UMINUS'), 1924382Sbinkertn@umich.edu ) 1934382Sbinkertn@umich.edu 1944382Sbinkertn@umich.edu# dictionary of names (for storing variables) 1955584Snate@binkert.orgnames = { } 1964382Sbinkertn@umich.edu 1974382Sbinkertn@umich.edudef p_statement_assign(p): 1984382Sbinkertn@umich.edu 'statement : NAME EQUALS expression' 1995192Ssaidi@eecs.umich.edu names[p[1]] = p[3] 2005192Ssaidi@eecs.umich.edu 2015799Snate@binkert.orgdef p_statement_expr(p): 2025799Snate@binkert.org 'statement : expression' 2035799Snate@binkert.org print p[1] 2045192Ssaidi@eecs.umich.edu 2055799Snate@binkert.orgdef p_expression_binop(p): 2065192Ssaidi@eecs.umich.edu '''expression : expression PLUS expression 2075799Snate@binkert.org | expression MINUS expression 2085799Snate@binkert.org | expression TIMES expression 2095192Ssaidi@eecs.umich.edu | expression DIVIDE expression''' 2105192Ssaidi@eecs.umich.edu if p[2] == '+' : p[0] = p[1] + p[3] 2115192Ssaidi@eecs.umich.edu elif p[2] == '-': p[0] = p[1] - p[3] 2125799Snate@binkert.org elif p[2] == '*': p[0] = p[1] * p[3] 2135192Ssaidi@eecs.umich.edu elif p[2] == '/': p[0] = p[1] / p[3] 2145192Ssaidi@eecs.umich.edu 2155192Ssaidi@eecs.umich.edudef p_expression_uminus(p): 2165192Ssaidi@eecs.umich.edu 'expression : MINUS expression %prec UMINUS' 2175192Ssaidi@eecs.umich.edu p[0] = -p[2] 2185192Ssaidi@eecs.umich.edu 2194382Sbinkertn@umich.edudef p_expression_group(p): 2204382Sbinkertn@umich.edu 'expression : LPAREN expression RPAREN' 2214382Sbinkertn@umich.edu p[0] = p[2] 2222667Sstever@eecs.umich.edu 2232667Sstever@eecs.umich.edudef p_expression_number(p): 2242667Sstever@eecs.umich.edu 'expression : NUMBER' 2252667Sstever@eecs.umich.edu p[0] = p[1] 2262667Sstever@eecs.umich.edu 2272667Sstever@eecs.umich.edudef p_expression_name(p): 2285742Snate@binkert.org 'expression : NAME' 2295742Snate@binkert.org try: 2305742Snate@binkert.org p[0] = names[p[1]] 2315793Snate@binkert.org except LookupError: 2325793Snate@binkert.org print "Undefined name '%s'" % p[1] 2335793Snate@binkert.org p[0] = 0 2345793Snate@binkert.org 2355793Snate@binkert.orgdef p_error(p): 2364382Sbinkertn@umich.edu print "Syntax error at '%s'" % p.value 2374762Snate@binkert.org 2385344Sstever@gmail.comimport ply.yacc as yacc 2394382Sbinkertn@umich.eduyacc.yacc() 2405341Sstever@gmail.com 2415742Snate@binkert.orgwhile 1: 2425742Snate@binkert.org try: 2435742Snate@binkert.org s = raw_input('calc > ') 2445742Snate@binkert.org except EOFError: 2455742Snate@binkert.org break 2464762Snate@binkert.org yacc.parse(s) 2475742Snate@binkert.org 2485742Snate@binkert.org 2495742Snate@binkert.orgBug Reports and Patches 2505742Snate@binkert.org======================= 2515742Snate@binkert.orgMy goal with PLY is to simply have a decent lex/yacc implementation 2525742Snate@binkert.orgfor Python. As a general rule, I don't spend huge amounts of time 2535742Snate@binkert.orgworking on it unless I receive very specific bug reports and/or 2545341Sstever@gmail.compatches to fix problems. I also try to incorporate submitted feature 2555742Snate@binkert.orgrequests and enhancements into each new version. To contact me about 2565341Sstever@gmail.combugs and/or new features, please send email to dave@dabeaz.com. 2574773Snate@binkert.org 2586108Snate@binkert.orgIn addition there is a Google group for discussing PLY related issues at 2591858SN/A 2601085SN/A http://groups.google.com/group/ply-hack 2616658Snate@binkert.org 2626658Snate@binkert.org-- Dave 2636658Snate@binkert.org 2646658Snate@binkert.org 2656658Snate@binkert.org 2666658Snate@binkert.org 2676658Snate@binkert.org 2686658Snate@binkert.org 2696658Snate@binkert.org 2706658Snate@binkert.org 2716658Snate@binkert.org 2726658Snate@binkert.org