16498Snate@binkert.orgPLY (Python Lex-Yacc)                   Version 3.2
22632Sstever@eecs.umich.edu
36498Snate@binkert.orgCopyright (C) 2001-2009,
46498Snate@binkert.orgDavid M. Beazley (Dabeaz LLC)
56498Snate@binkert.orgAll rights reserved.
62632Sstever@eecs.umich.edu
76498Snate@binkert.orgRedistribution and use in source and binary forms, with or without
86498Snate@binkert.orgmodification, are permitted provided that the following conditions are
96498Snate@binkert.orgmet:
102632Sstever@eecs.umich.edu
116498Snate@binkert.org* Redistributions of source code must retain the above copyright notice,
126498Snate@binkert.org  this list of conditions and the following disclaimer.  
136498Snate@binkert.org* Redistributions in binary form must reproduce the above copyright notice, 
146498Snate@binkert.org  this list of conditions and the following disclaimer in the documentation
156498Snate@binkert.org  and/or other materials provided with the distribution.  
166498Snate@binkert.org* Neither the name of the David Beazley or Dabeaz LLC may be used to
176498Snate@binkert.org  endorse or promote products derived from this software without
186498Snate@binkert.org  specific prior written permission. 
192632Sstever@eecs.umich.edu
206498Snate@binkert.orgTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
216498Snate@binkert.org"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
226498Snate@binkert.orgLIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
236498Snate@binkert.orgA PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
246498Snate@binkert.orgOWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
256498Snate@binkert.orgSPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
266498Snate@binkert.orgLIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
276498Snate@binkert.orgDATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
286498Snate@binkert.orgTHEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
296498Snate@binkert.org(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
306498Snate@binkert.orgOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
312632Sstever@eecs.umich.edu
322632Sstever@eecs.umich.eduIntroduction
332632Sstever@eecs.umich.edu============
342632Sstever@eecs.umich.edu
352632Sstever@eecs.umich.eduPLY is a 100% Python implementation of the common parsing tools lex
366498Snate@binkert.organd yacc. Here are a few highlights:
372632Sstever@eecs.umich.edu
386498Snate@binkert.org -  PLY is very closely modeled after traditional lex/yacc.
392632Sstever@eecs.umich.edu    If you know how to use these tools in C, you will find PLY
402632Sstever@eecs.umich.edu    to be similar.
412632Sstever@eecs.umich.edu
422632Sstever@eecs.umich.edu -  PLY provides *very* extensive error reporting and diagnostic 
432632Sstever@eecs.umich.edu    information to assist in parser construction.  The original
442632Sstever@eecs.umich.edu    implementation was developed for instructional purposes.  As
452632Sstever@eecs.umich.edu    a result, the system tries to identify the most common types
462632Sstever@eecs.umich.edu    of errors made by novice users.  
472632Sstever@eecs.umich.edu
482632Sstever@eecs.umich.edu -  PLY provides full support for empty productions, error recovery,
492632Sstever@eecs.umich.edu    precedence specifiers, and moderately ambiguous grammars.
502632Sstever@eecs.umich.edu
512632Sstever@eecs.umich.edu -  Parsing is based on LR-parsing which is fast, memory efficient, 
522632Sstever@eecs.umich.edu    better suited to large grammars, and which has a number of nice
532632Sstever@eecs.umich.edu    properties when dealing with syntax errors and other parsing problems.
546498Snate@binkert.org    Currently, PLY builds its parsing tables using the LALR(1)
556498Snate@binkert.org    algorithm used in yacc.
562632Sstever@eecs.umich.edu
574479Sbinkertn@umich.edu -  PLY uses Python introspection features to build lexers and parsers.  
584479Sbinkertn@umich.edu    This greatly simplifies the task of parser construction since it reduces 
594479Sbinkertn@umich.edu    the number of files and eliminates the need to run a separate lex/yacc 
604479Sbinkertn@umich.edu    tool before running your program.
612632Sstever@eecs.umich.edu
622632Sstever@eecs.umich.edu -  PLY can be used to build parsers for "real" programming languages.
632632Sstever@eecs.umich.edu    Although it is not ultra-fast due to its Python implementation,
642632Sstever@eecs.umich.edu    PLY can be used to parse grammars consisting of several hundred
652632Sstever@eecs.umich.edu    rules (as might be found for a language like C).  The lexer and LR 
662632Sstever@eecs.umich.edu    parser are also reasonably efficient when parsing typically
676498Snate@binkert.org    sized programs.  People have used PLY to build parsers for
686498Snate@binkert.org    C, C++, ADA, and other real programming languages.
692632Sstever@eecs.umich.edu
702632Sstever@eecs.umich.eduHow to Use
712632Sstever@eecs.umich.edu==========
722632Sstever@eecs.umich.edu
734479Sbinkertn@umich.eduPLY consists of two files : lex.py and yacc.py.  These are contained
744479Sbinkertn@umich.eduwithin the 'ply' directory which may also be used as a Python package.
754479Sbinkertn@umich.eduTo use PLY, simply copy the 'ply' directory to your project and import
764479Sbinkertn@umich.edulex and yacc from the associated 'ply' package.  For example:
774479Sbinkertn@umich.edu
784479Sbinkertn@umich.edu     import ply.lex as lex
794479Sbinkertn@umich.edu     import ply.yacc as yacc
804479Sbinkertn@umich.edu
814479Sbinkertn@umich.eduAlternatively, you can copy just the files lex.py and yacc.py
824479Sbinkertn@umich.eduindividually and use them as modules.  For example:
834479Sbinkertn@umich.edu
844479Sbinkertn@umich.edu     import lex
854479Sbinkertn@umich.edu     import yacc
864479Sbinkertn@umich.edu
874479Sbinkertn@umich.eduThe file setup.py can be used to install ply using distutils.
882632Sstever@eecs.umich.edu
892632Sstever@eecs.umich.eduThe file doc/ply.html contains complete documentation on how to use
902632Sstever@eecs.umich.eduthe system.
912632Sstever@eecs.umich.edu
922632Sstever@eecs.umich.eduThe example directory contains several different examples including a
934479Sbinkertn@umich.eduPLY specification for ANSI C as given in K&R 2nd Ed.   
942632Sstever@eecs.umich.edu
952632Sstever@eecs.umich.eduA simple example is found at the end of this document
962632Sstever@eecs.umich.edu
972632Sstever@eecs.umich.eduRequirements
982632Sstever@eecs.umich.edu============
996498Snate@binkert.orgPLY requires the use of Python 2.2 or greater.  However, you should
1004479Sbinkertn@umich.eduuse the latest Python release if possible.  It should work on just
1014479Sbinkertn@umich.eduabout any platform.  PLY has been tested with both CPython and Jython.
1026498Snate@binkert.orgIt also seems to work with IronPython.
1032632Sstever@eecs.umich.edu
1042632Sstever@eecs.umich.eduResources
1052632Sstever@eecs.umich.edu=========
1062632Sstever@eecs.umich.eduMore information about PLY can be obtained on the PLY webpage at:
1072632Sstever@eecs.umich.edu
1084479Sbinkertn@umich.edu     http://www.dabeaz.com/ply
1092632Sstever@eecs.umich.edu
1102632Sstever@eecs.umich.eduFor a detailed overview of parsing theory, consult the excellent
1112632Sstever@eecs.umich.edubook "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and
1122632Sstever@eecs.umich.eduUllman.  The topics found in "Lex & Yacc" by Levine, Mason, and Brown
1132632Sstever@eecs.umich.edumay also be useful.
1142632Sstever@eecs.umich.edu
1154479Sbinkertn@umich.eduA Google group for PLY can be found at
1164479Sbinkertn@umich.edu
1174479Sbinkertn@umich.edu     http://groups.google.com/group/ply-hack
1182632Sstever@eecs.umich.edu
1192632Sstever@eecs.umich.eduAcknowledgments
1202632Sstever@eecs.umich.edu===============
1212632Sstever@eecs.umich.eduA special thanks is in order for all of the students in CS326 who
1222632Sstever@eecs.umich.edusuffered through about 25 different versions of these tools :-).
1232632Sstever@eecs.umich.edu
1244479Sbinkertn@umich.eduThe CHANGES file acknowledges those who have contributed patches.
1254479Sbinkertn@umich.edu
1264479Sbinkertn@umich.eduElias Ioup did the first implementation of LALR(1) parsing in PLY-1.x. 
1274479Sbinkertn@umich.eduAndrew Waters and Markus Schoepflin were instrumental in reporting bugs
1284479Sbinkertn@umich.eduand testing a revised LALR(1) implementation for PLY-2.0.
1294479Sbinkertn@umich.edu
1306498Snate@binkert.orgSpecial Note for PLY-3.0
1314479Sbinkertn@umich.edu========================
1326498Snate@binkert.orgPLY-3.0 the first PLY release to support Python 3. However, backwards
1336498Snate@binkert.orgcompatibility with Python 2.2 is still preserved. PLY provides dual
1346498Snate@binkert.orgPython 2/3 compatibility by restricting its implementation to a common
1356498Snate@binkert.orgsubset of basic language features. You should not convert PLY using
1366498Snate@binkert.org2to3--it is not necessary and may in fact break the implementation.
1374479Sbinkertn@umich.edu
1382632Sstever@eecs.umich.eduExample
1392632Sstever@eecs.umich.edu=======
1402632Sstever@eecs.umich.edu
1414479Sbinkertn@umich.eduHere is a simple example showing a PLY implementation of a calculator
1424479Sbinkertn@umich.eduwith variables.
1432632Sstever@eecs.umich.edu
1442632Sstever@eecs.umich.edu# -----------------------------------------------------------------------------
1452632Sstever@eecs.umich.edu# calc.py
1462632Sstever@eecs.umich.edu#
1472632Sstever@eecs.umich.edu# A simple calculator with variables.
1482632Sstever@eecs.umich.edu# -----------------------------------------------------------------------------
1492632Sstever@eecs.umich.edu
1502632Sstever@eecs.umich.edutokens = (
1512632Sstever@eecs.umich.edu    'NAME','NUMBER',
1522632Sstever@eecs.umich.edu    'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
1532632Sstever@eecs.umich.edu    'LPAREN','RPAREN',
1542632Sstever@eecs.umich.edu    )
1552632Sstever@eecs.umich.edu
1562632Sstever@eecs.umich.edu# Tokens
1572632Sstever@eecs.umich.edu
1582632Sstever@eecs.umich.edut_PLUS    = r'\+'
1592632Sstever@eecs.umich.edut_MINUS   = r'-'
1602632Sstever@eecs.umich.edut_TIMES   = r'\*'
1612632Sstever@eecs.umich.edut_DIVIDE  = r'/'
1622632Sstever@eecs.umich.edut_EQUALS  = r'='
1632632Sstever@eecs.umich.edut_LPAREN  = r'\('
1642632Sstever@eecs.umich.edut_RPAREN  = r'\)'
1652632Sstever@eecs.umich.edut_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'
1662632Sstever@eecs.umich.edu
1672632Sstever@eecs.umich.edudef t_NUMBER(t):
1682632Sstever@eecs.umich.edu    r'\d+'
1696498Snate@binkert.org    t.value = int(t.value)
1702632Sstever@eecs.umich.edu    return t
1712632Sstever@eecs.umich.edu
1722632Sstever@eecs.umich.edu# Ignored characters
1732632Sstever@eecs.umich.edut_ignore = " \t"
1742632Sstever@eecs.umich.edu
1752632Sstever@eecs.umich.edudef t_newline(t):
1762632Sstever@eecs.umich.edu    r'\n+'
1774479Sbinkertn@umich.edu    t.lexer.lineno += t.value.count("\n")
1782632Sstever@eecs.umich.edu    
1792632Sstever@eecs.umich.edudef t_error(t):
1802632Sstever@eecs.umich.edu    print "Illegal character '%s'" % t.value[0]
1814479Sbinkertn@umich.edu    t.lexer.skip(1)
1822632Sstever@eecs.umich.edu    
1832632Sstever@eecs.umich.edu# Build the lexer
1844479Sbinkertn@umich.eduimport ply.lex as lex
1852632Sstever@eecs.umich.edulex.lex()
1862632Sstever@eecs.umich.edu
1872632Sstever@eecs.umich.edu# Precedence rules for the arithmetic operators
1882632Sstever@eecs.umich.eduprecedence = (
1892632Sstever@eecs.umich.edu    ('left','PLUS','MINUS'),
1902632Sstever@eecs.umich.edu    ('left','TIMES','DIVIDE'),
1912632Sstever@eecs.umich.edu    ('right','UMINUS'),
1922632Sstever@eecs.umich.edu    )
1932632Sstever@eecs.umich.edu
1942632Sstever@eecs.umich.edu# dictionary of names (for storing variables)
1952632Sstever@eecs.umich.edunames = { }
1962632Sstever@eecs.umich.edu
1974479Sbinkertn@umich.edudef p_statement_assign(p):
1982632Sstever@eecs.umich.edu    'statement : NAME EQUALS expression'
1994479Sbinkertn@umich.edu    names[p[1]] = p[3]
2002632Sstever@eecs.umich.edu
2014479Sbinkertn@umich.edudef p_statement_expr(p):
2022632Sstever@eecs.umich.edu    'statement : expression'
2034479Sbinkertn@umich.edu    print p[1]
2042632Sstever@eecs.umich.edu
2054479Sbinkertn@umich.edudef p_expression_binop(p):
2062632Sstever@eecs.umich.edu    '''expression : expression PLUS expression
2072632Sstever@eecs.umich.edu                  | expression MINUS expression
2082632Sstever@eecs.umich.edu                  | expression TIMES expression
2092632Sstever@eecs.umich.edu                  | expression DIVIDE expression'''
2104479Sbinkertn@umich.edu    if p[2] == '+'  : p[0] = p[1] + p[3]
2114479Sbinkertn@umich.edu    elif p[2] == '-': p[0] = p[1] - p[3]
2124479Sbinkertn@umich.edu    elif p[2] == '*': p[0] = p[1] * p[3]
2134479Sbinkertn@umich.edu    elif p[2] == '/': p[0] = p[1] / p[3]
2142632Sstever@eecs.umich.edu
2154479Sbinkertn@umich.edudef p_expression_uminus(p):
2162632Sstever@eecs.umich.edu    'expression : MINUS expression %prec UMINUS'
2174479Sbinkertn@umich.edu    p[0] = -p[2]
2182632Sstever@eecs.umich.edu
2194479Sbinkertn@umich.edudef p_expression_group(p):
2202632Sstever@eecs.umich.edu    'expression : LPAREN expression RPAREN'
2214479Sbinkertn@umich.edu    p[0] = p[2]
2222632Sstever@eecs.umich.edu
2234479Sbinkertn@umich.edudef p_expression_number(p):
2242632Sstever@eecs.umich.edu    'expression : NUMBER'
2254479Sbinkertn@umich.edu    p[0] = p[1]
2262632Sstever@eecs.umich.edu
2274479Sbinkertn@umich.edudef p_expression_name(p):
2282632Sstever@eecs.umich.edu    'expression : NAME'
2292632Sstever@eecs.umich.edu    try:
2304479Sbinkertn@umich.edu        p[0] = names[p[1]]
2312632Sstever@eecs.umich.edu    except LookupError:
2324479Sbinkertn@umich.edu        print "Undefined name '%s'" % p[1]
2334479Sbinkertn@umich.edu        p[0] = 0
2342632Sstever@eecs.umich.edu
2354479Sbinkertn@umich.edudef p_error(p):
2364479Sbinkertn@umich.edu    print "Syntax error at '%s'" % p.value
2372632Sstever@eecs.umich.edu
2384479Sbinkertn@umich.eduimport ply.yacc as yacc
2392632Sstever@eecs.umich.eduyacc.yacc()
2402632Sstever@eecs.umich.edu
2412632Sstever@eecs.umich.eduwhile 1:
2422632Sstever@eecs.umich.edu    try:
2432632Sstever@eecs.umich.edu        s = raw_input('calc > ')
2442632Sstever@eecs.umich.edu    except EOFError:
2452632Sstever@eecs.umich.edu        break
2462632Sstever@eecs.umich.edu    yacc.parse(s)
2472632Sstever@eecs.umich.edu
2482632Sstever@eecs.umich.edu
2494479Sbinkertn@umich.eduBug Reports and Patches
2504479Sbinkertn@umich.edu=======================
2516498Snate@binkert.orgMy goal with PLY is to simply have a decent lex/yacc implementation
2526498Snate@binkert.orgfor Python.  As a general rule, I don't spend huge amounts of time
2536498Snate@binkert.orgworking on it unless I receive very specific bug reports and/or
2546498Snate@binkert.orgpatches to fix problems. I also try to incorporate submitted feature
2556498Snate@binkert.orgrequests and enhancements into each new version.  To contact me about
2566498Snate@binkert.orgbugs and/or new features, please send email to dave@dabeaz.com.
2572632Sstever@eecs.umich.edu
2584479Sbinkertn@umich.eduIn addition there is a Google group for discussing PLY related issues at
2592632Sstever@eecs.umich.edu
2604479Sbinkertn@umich.edu    http://groups.google.com/group/ply-hack
2614479Sbinkertn@umich.edu 
2624479Sbinkertn@umich.edu-- Dave
2632632Sstever@eecs.umich.edu
2642632Sstever@eecs.umich.edu
2652632Sstever@eecs.umich.edu
2662632Sstever@eecs.umich.edu
2672632Sstever@eecs.umich.edu
2682632Sstever@eecs.umich.edu
2692632Sstever@eecs.umich.edu
2702632Sstever@eecs.umich.edu
2712632Sstever@eecs.umich.edu
272