README revision 6498
16498Snate@binkert.orgPLY (Python Lex-Yacc) Version 3.2 26498Snate@binkert.org 36498Snate@binkert.orgCopyright (C) 2001-2009, 46498Snate@binkert.orgDavid M. Beazley (Dabeaz LLC) 56498Snate@binkert.orgAll rights reserved. 66498Snate@binkert.org 76498Snate@binkert.orgRedistribution and use in source and binary forms, with or without 86498Snate@binkert.orgmodification, are permitted provided that the following conditions are 96498Snate@binkert.orgmet: 106498Snate@binkert.org 116498Snate@binkert.org* Redistributions of source code must retain the above copyright notice, 126498Snate@binkert.org this list of conditions and the following disclaimer. 136498Snate@binkert.org* Redistributions in binary form must reproduce the above copyright notice, 146498Snate@binkert.org this list of conditions and the following disclaimer in the documentation 156498Snate@binkert.org and/or other materials provided with the distribution. 166498Snate@binkert.org* Neither the name of the David Beazley or Dabeaz LLC may be used to 176498Snate@binkert.org endorse or promote products derived from this software without 186498Snate@binkert.org specific prior written permission. 196498Snate@binkert.org 206498Snate@binkert.orgTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 216498Snate@binkert.org"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 226498Snate@binkert.orgLIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 236498Snate@binkert.orgA PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 246498Snate@binkert.orgOWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 256498Snate@binkert.orgSPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 266498Snate@binkert.orgLIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 276498Snate@binkert.orgDATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 286498Snate@binkert.orgTHEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 296498Snate@binkert.org(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 306498Snate@binkert.orgOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 316498Snate@binkert.org 326498Snate@binkert.orgIntroduction 336498Snate@binkert.org============ 346498Snate@binkert.org 356498Snate@binkert.orgPLY is a 100% Python implementation of the common parsing tools lex 366498Snate@binkert.organd yacc. Here are a few highlights: 376498Snate@binkert.org 386498Snate@binkert.org - PLY is very closely modeled after traditional lex/yacc. 396498Snate@binkert.org If you know how to use these tools in C, you will find PLY 406498Snate@binkert.org to be similar. 416498Snate@binkert.org 426498Snate@binkert.org - PLY provides *very* extensive error reporting and diagnostic 436498Snate@binkert.org information to assist in parser construction. The original 446498Snate@binkert.org implementation was developed for instructional purposes. As 456498Snate@binkert.org a result, the system tries to identify the most common types 466498Snate@binkert.org of errors made by novice users. 476498Snate@binkert.org 486498Snate@binkert.org - PLY provides full support for empty productions, error recovery, 496498Snate@binkert.org precedence specifiers, and moderately ambiguous grammars. 506498Snate@binkert.org 516498Snate@binkert.org - Parsing is based on LR-parsing which is fast, memory efficient, 526498Snate@binkert.org better suited to large grammars, and which has a number of nice 536498Snate@binkert.org properties when dealing with syntax errors and other parsing problems. 546498Snate@binkert.org Currently, PLY builds its parsing tables using the LALR(1) 556498Snate@binkert.org algorithm used in yacc. 566498Snate@binkert.org 576498Snate@binkert.org - PLY uses Python introspection features to build lexers and parsers. 586498Snate@binkert.org This greatly simplifies the task of parser construction since it reduces 596498Snate@binkert.org the number of files and eliminates the need to run a separate lex/yacc 606498Snate@binkert.org tool before running your program. 616498Snate@binkert.org 626498Snate@binkert.org - PLY can be used to build parsers for "real" programming languages. 636498Snate@binkert.org Although it is not ultra-fast due to its Python implementation, 646498Snate@binkert.org PLY can be used to parse grammars consisting of several hundred 656498Snate@binkert.org rules (as might be found for a language like C). The lexer and LR 666498Snate@binkert.org parser are also reasonably efficient when parsing typically 676498Snate@binkert.org sized programs. People have used PLY to build parsers for 686498Snate@binkert.org C, C++, ADA, and other real programming languages. 696498Snate@binkert.org 706498Snate@binkert.orgHow to Use 716498Snate@binkert.org========== 726498Snate@binkert.org 736498Snate@binkert.orgPLY consists of two files : lex.py and yacc.py. These are contained 746498Snate@binkert.orgwithin the 'ply' directory which may also be used as a Python package. 756498Snate@binkert.orgTo use PLY, simply copy the 'ply' directory to your project and import 766498Snate@binkert.orglex and yacc from the associated 'ply' package. For example: 776498Snate@binkert.org 786498Snate@binkert.org import ply.lex as lex 796498Snate@binkert.org import ply.yacc as yacc 806498Snate@binkert.org 816498Snate@binkert.orgAlternatively, you can copy just the files lex.py and yacc.py 826498Snate@binkert.orgindividually and use them as modules. For example: 836498Snate@binkert.org 846498Snate@binkert.org import lex 856498Snate@binkert.org import yacc 866498Snate@binkert.org 876498Snate@binkert.orgThe file setup.py can be used to install ply using distutils. 886498Snate@binkert.org 896498Snate@binkert.orgThe file doc/ply.html contains complete documentation on how to use 906498Snate@binkert.orgthe system. 916498Snate@binkert.org 926498Snate@binkert.orgThe example directory contains several different examples including a 936498Snate@binkert.orgPLY specification for ANSI C as given in K&R 2nd Ed. 946498Snate@binkert.org 956498Snate@binkert.orgA simple example is found at the end of this document 966498Snate@binkert.org 976498Snate@binkert.orgRequirements 986498Snate@binkert.org============ 996498Snate@binkert.orgPLY requires the use of Python 2.2 or greater. However, you should 1006498Snate@binkert.orguse the latest Python release if possible. It should work on just 1016498Snate@binkert.orgabout any platform. PLY has been tested with both CPython and Jython. 1026498Snate@binkert.orgIt also seems to work with IronPython. 1036498Snate@binkert.org 1046498Snate@binkert.orgResources 1056498Snate@binkert.org========= 1066498Snate@binkert.orgMore information about PLY can be obtained on the PLY webpage at: 1076498Snate@binkert.org 1086498Snate@binkert.org http://www.dabeaz.com/ply 1096498Snate@binkert.org 1106498Snate@binkert.orgFor a detailed overview of parsing theory, consult the excellent 1116498Snate@binkert.orgbook "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and 1126498Snate@binkert.orgUllman. The topics found in "Lex & Yacc" by Levine, Mason, and Brown 1136498Snate@binkert.orgmay also be useful. 1146498Snate@binkert.org 1156498Snate@binkert.orgA Google group for PLY can be found at 1166498Snate@binkert.org 1176498Snate@binkert.org http://groups.google.com/group/ply-hack 1186498Snate@binkert.org 1196498Snate@binkert.orgAcknowledgments 1206498Snate@binkert.org=============== 1216498Snate@binkert.orgA special thanks is in order for all of the students in CS326 who 1226498Snate@binkert.orgsuffered through about 25 different versions of these tools :-). 1236498Snate@binkert.org 1246498Snate@binkert.orgThe CHANGES file acknowledges those who have contributed patches. 1256498Snate@binkert.org 1266498Snate@binkert.orgElias Ioup did the first implementation of LALR(1) parsing in PLY-1.x. 1276498Snate@binkert.orgAndrew Waters and Markus Schoepflin were instrumental in reporting bugs 1286498Snate@binkert.organd testing a revised LALR(1) implementation for PLY-2.0. 1296498Snate@binkert.org 1306498Snate@binkert.orgSpecial Note for PLY-3.0 1316498Snate@binkert.org======================== 1326498Snate@binkert.orgPLY-3.0 the first PLY release to support Python 3. However, backwards 1336498Snate@binkert.orgcompatibility with Python 2.2 is still preserved. PLY provides dual 1346498Snate@binkert.orgPython 2/3 compatibility by restricting its implementation to a common 1356498Snate@binkert.orgsubset of basic language features. You should not convert PLY using 1366498Snate@binkert.org2to3--it is not necessary and may in fact break the implementation. 1376498Snate@binkert.org 1386498Snate@binkert.orgExample 1396498Snate@binkert.org======= 1406498Snate@binkert.org 1416498Snate@binkert.orgHere is a simple example showing a PLY implementation of a calculator 1426498Snate@binkert.orgwith variables. 1436498Snate@binkert.org 1446498Snate@binkert.org# ----------------------------------------------------------------------------- 1456498Snate@binkert.org# calc.py 1466498Snate@binkert.org# 1476498Snate@binkert.org# A simple calculator with variables. 1486498Snate@binkert.org# ----------------------------------------------------------------------------- 1496498Snate@binkert.org 1506498Snate@binkert.orgtokens = ( 1516498Snate@binkert.org 'NAME','NUMBER', 1526498Snate@binkert.org 'PLUS','MINUS','TIMES','DIVIDE','EQUALS', 1536498Snate@binkert.org 'LPAREN','RPAREN', 1546498Snate@binkert.org ) 1556498Snate@binkert.org 1566498Snate@binkert.org# Tokens 1576498Snate@binkert.org 1586498Snate@binkert.orgt_PLUS = r'\+' 1596498Snate@binkert.orgt_MINUS = r'-' 1606498Snate@binkert.orgt_TIMES = r'\*' 1616498Snate@binkert.orgt_DIVIDE = r'/' 1626498Snate@binkert.orgt_EQUALS = r'=' 1636498Snate@binkert.orgt_LPAREN = r'\(' 1646498Snate@binkert.orgt_RPAREN = r'\)' 1656498Snate@binkert.orgt_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*' 1666498Snate@binkert.org 1676498Snate@binkert.orgdef t_NUMBER(t): 1686498Snate@binkert.org r'\d+' 1696498Snate@binkert.org t.value = int(t.value) 1706498Snate@binkert.org return t 1716498Snate@binkert.org 1726498Snate@binkert.org# Ignored characters 1736498Snate@binkert.orgt_ignore = " \t" 1746498Snate@binkert.org 1756498Snate@binkert.orgdef t_newline(t): 1766498Snate@binkert.org r'\n+' 1776498Snate@binkert.org t.lexer.lineno += t.value.count("\n") 1786498Snate@binkert.org 1796498Snate@binkert.orgdef t_error(t): 1806498Snate@binkert.org print "Illegal character '%s'" % t.value[0] 1816498Snate@binkert.org t.lexer.skip(1) 1826498Snate@binkert.org 1836498Snate@binkert.org# Build the lexer 1846498Snate@binkert.orgimport ply.lex as lex 1856498Snate@binkert.orglex.lex() 1866498Snate@binkert.org 1876498Snate@binkert.org# Precedence rules for the arithmetic operators 1886498Snate@binkert.orgprecedence = ( 1896498Snate@binkert.org ('left','PLUS','MINUS'), 1906498Snate@binkert.org ('left','TIMES','DIVIDE'), 1916498Snate@binkert.org ('right','UMINUS'), 1926498Snate@binkert.org ) 1936498Snate@binkert.org 1946498Snate@binkert.org# dictionary of names (for storing variables) 1956498Snate@binkert.orgnames = { } 1966498Snate@binkert.org 1976498Snate@binkert.orgdef p_statement_assign(p): 1986498Snate@binkert.org 'statement : NAME EQUALS expression' 1996498Snate@binkert.org names[p[1]] = p[3] 2006498Snate@binkert.org 2016498Snate@binkert.orgdef p_statement_expr(p): 2026498Snate@binkert.org 'statement : expression' 2036498Snate@binkert.org print p[1] 2046498Snate@binkert.org 2056498Snate@binkert.orgdef p_expression_binop(p): 2066498Snate@binkert.org '''expression : expression PLUS expression 2076498Snate@binkert.org | expression MINUS expression 2086498Snate@binkert.org | expression TIMES expression 2096498Snate@binkert.org | expression DIVIDE expression''' 2106498Snate@binkert.org if p[2] == '+' : p[0] = p[1] + p[3] 2116498Snate@binkert.org elif p[2] == '-': p[0] = p[1] - p[3] 2126498Snate@binkert.org elif p[2] == '*': p[0] = p[1] * p[3] 2136498Snate@binkert.org elif p[2] == '/': p[0] = p[1] / p[3] 2146498Snate@binkert.org 2156498Snate@binkert.orgdef p_expression_uminus(p): 2166498Snate@binkert.org 'expression : MINUS expression %prec UMINUS' 2176498Snate@binkert.org p[0] = -p[2] 2186498Snate@binkert.org 2196498Snate@binkert.orgdef p_expression_group(p): 2206498Snate@binkert.org 'expression : LPAREN expression RPAREN' 2216498Snate@binkert.org p[0] = p[2] 2226498Snate@binkert.org 2236498Snate@binkert.orgdef p_expression_number(p): 2246498Snate@binkert.org 'expression : NUMBER' 2256498Snate@binkert.org p[0] = p[1] 2266498Snate@binkert.org 2276498Snate@binkert.orgdef p_expression_name(p): 2286498Snate@binkert.org 'expression : NAME' 2296498Snate@binkert.org try: 2306498Snate@binkert.org p[0] = names[p[1]] 2316498Snate@binkert.org except LookupError: 2326498Snate@binkert.org print "Undefined name '%s'" % p[1] 2336498Snate@binkert.org p[0] = 0 2346498Snate@binkert.org 2356498Snate@binkert.orgdef p_error(p): 2366498Snate@binkert.org print "Syntax error at '%s'" % p.value 2376498Snate@binkert.org 2386498Snate@binkert.orgimport ply.yacc as yacc 2396498Snate@binkert.orgyacc.yacc() 2406498Snate@binkert.org 2416498Snate@binkert.orgwhile 1: 2426498Snate@binkert.org try: 2436498Snate@binkert.org s = raw_input('calc > ') 2446498Snate@binkert.org except EOFError: 2456498Snate@binkert.org break 2466498Snate@binkert.org yacc.parse(s) 2476498Snate@binkert.org 2486498Snate@binkert.org 2496498Snate@binkert.orgBug Reports and Patches 2506498Snate@binkert.org======================= 2516498Snate@binkert.orgMy goal with PLY is to simply have a decent lex/yacc implementation 2526498Snate@binkert.orgfor Python. As a general rule, I don't spend huge amounts of time 2536498Snate@binkert.orgworking on it unless I receive very specific bug reports and/or 2546498Snate@binkert.orgpatches to fix problems. I also try to incorporate submitted feature 2556498Snate@binkert.orgrequests and enhancements into each new version. To contact me about 2566498Snate@binkert.orgbugs and/or new features, please send email to dave@dabeaz.com. 2576498Snate@binkert.org 2586498Snate@binkert.orgIn addition there is a Google group for discussing PLY related issues at 2596498Snate@binkert.org 2606498Snate@binkert.org http://groups.google.com/group/ply-hack 2616498Snate@binkert.org 2626498Snate@binkert.org-- Dave 2636498Snate@binkert.org 2646498Snate@binkert.org 2656498Snate@binkert.org 2666498Snate@binkert.org 2676498Snate@binkert.org 2686498Snate@binkert.org 2696498Snate@binkert.org 2706498Snate@binkert.org 2716498Snate@binkert.org 2726498Snate@binkert.org