README revision 6498
1955SN/APLY (Python Lex-Yacc)                   Version 3.2
2955SN/A
31762SN/ACopyright (C) 2001-2009,
4955SN/ADavid M. Beazley (Dabeaz LLC)
5955SN/AAll rights reserved.
6955SN/A
7955SN/ARedistribution and use in source and binary forms, with or without
8955SN/Amodification, are permitted provided that the following conditions are
9955SN/Amet:
10955SN/A
11955SN/A* Redistributions of source code must retain the above copyright notice,
12955SN/A  this list of conditions and the following disclaimer.  
13955SN/A* Redistributions in binary form must reproduce the above copyright notice, 
14955SN/A  this list of conditions and the following disclaimer in the documentation
15955SN/A  and/or other materials provided with the distribution.  
16955SN/A* Neither the name of the David Beazley or Dabeaz LLC may be used to
17955SN/A  endorse or promote products derived from this software without
18955SN/A  specific prior written permission. 
19955SN/A
20955SN/ATHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
21955SN/A"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
22955SN/ALIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
23955SN/AA PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
24955SN/AOWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
25955SN/ASPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
26955SN/ALIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
27955SN/ADATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
282665Ssaidi@eecs.umich.eduTHEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
294762Snate@binkert.org(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
30955SN/AOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
315522Snate@binkert.org
326143Snate@binkert.orgIntroduction
334762Snate@binkert.org============
345522Snate@binkert.org
35955SN/APLY is a 100% Python implementation of the common parsing tools lex
365522Snate@binkert.organd yacc. Here are a few highlights:
37955SN/A
385522Snate@binkert.org -  PLY is very closely modeled after traditional lex/yacc.
394202Sbinkertn@umich.edu    If you know how to use these tools in C, you will find PLY
405742Snate@binkert.org    to be similar.
41955SN/A
424381Sbinkertn@umich.edu -  PLY provides *very* extensive error reporting and diagnostic 
434381Sbinkertn@umich.edu    information to assist in parser construction.  The original
44955SN/A    implementation was developed for instructional purposes.  As
45955SN/A    a result, the system tries to identify the most common types
46955SN/A    of errors made by novice users.  
474202Sbinkertn@umich.edu
48955SN/A -  PLY provides full support for empty productions, error recovery,
494382Sbinkertn@umich.edu    precedence specifiers, and moderately ambiguous grammars.
504382Sbinkertn@umich.edu
514382Sbinkertn@umich.edu -  Parsing is based on LR-parsing which is fast, memory efficient, 
526654Snate@binkert.org    better suited to large grammars, and which has a number of nice
535517Snate@binkert.org    properties when dealing with syntax errors and other parsing problems.
546143Snate@binkert.org    Currently, PLY builds its parsing tables using the LALR(1)
556143Snate@binkert.org    algorithm used in yacc.
566143Snate@binkert.org
576143Snate@binkert.org -  PLY uses Python introspection features to build lexers and parsers.  
586143Snate@binkert.org    This greatly simplifies the task of parser construction since it reduces 
596143Snate@binkert.org    the number of files and eliminates the need to run a separate lex/yacc 
606143Snate@binkert.org    tool before running your program.
616143Snate@binkert.org
626143Snate@binkert.org -  PLY can be used to build parsers for "real" programming languages.
636143Snate@binkert.org    Although it is not ultra-fast due to its Python implementation,
646143Snate@binkert.org    PLY can be used to parse grammars consisting of several hundred
656143Snate@binkert.org    rules (as might be found for a language like C).  The lexer and LR 
666143Snate@binkert.org    parser are also reasonably efficient when parsing typically
676143Snate@binkert.org    sized programs.  People have used PLY to build parsers for
686143Snate@binkert.org    C, C++, ADA, and other real programming languages.
694762Snate@binkert.org
706143Snate@binkert.orgHow to Use
716143Snate@binkert.org==========
726143Snate@binkert.org
736143Snate@binkert.orgPLY consists of two files : lex.py and yacc.py.  These are contained
746143Snate@binkert.orgwithin the 'ply' directory which may also be used as a Python package.
756143Snate@binkert.orgTo use PLY, simply copy the 'ply' directory to your project and import
766143Snate@binkert.orglex and yacc from the associated 'ply' package.  For example:
776143Snate@binkert.org
786143Snate@binkert.org     import ply.lex as lex
796143Snate@binkert.org     import ply.yacc as yacc
806143Snate@binkert.org
816143Snate@binkert.orgAlternatively, you can copy just the files lex.py and yacc.py
826143Snate@binkert.orgindividually and use them as modules.  For example:
836143Snate@binkert.org
846143Snate@binkert.org     import lex
856143Snate@binkert.org     import yacc
866143Snate@binkert.org
876143Snate@binkert.orgThe file setup.py can be used to install ply using distutils.
886143Snate@binkert.org
896143Snate@binkert.orgThe file doc/ply.html contains complete documentation on how to use
906143Snate@binkert.orgthe system.
917065Snate@binkert.org
926143Snate@binkert.orgThe example directory contains several different examples including a
936143Snate@binkert.orgPLY specification for ANSI C as given in K&R 2nd Ed.   
946143Snate@binkert.org
956143Snate@binkert.orgA simple example is found at the end of this document
966143Snate@binkert.org
976143Snate@binkert.orgRequirements
986143Snate@binkert.org============
996143Snate@binkert.orgPLY requires the use of Python 2.2 or greater.  However, you should
1006143Snate@binkert.orguse the latest Python release if possible.  It should work on just
1016143Snate@binkert.orgabout any platform.  PLY has been tested with both CPython and Jython.
1026143Snate@binkert.orgIt also seems to work with IronPython.
1036143Snate@binkert.org
1046143Snate@binkert.orgResources
1056143Snate@binkert.org=========
1066143Snate@binkert.orgMore information about PLY can be obtained on the PLY webpage at:
1076143Snate@binkert.org
1086143Snate@binkert.org     http://www.dabeaz.com/ply
1096143Snate@binkert.org
1106143Snate@binkert.orgFor a detailed overview of parsing theory, consult the excellent
1116143Snate@binkert.orgbook "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and
1126143Snate@binkert.orgUllman.  The topics found in "Lex & Yacc" by Levine, Mason, and Brown
1135522Snate@binkert.orgmay also be useful.
1146143Snate@binkert.org
1156143Snate@binkert.orgA Google group for PLY can be found at
1166143Snate@binkert.org
1176143Snate@binkert.org     http://groups.google.com/group/ply-hack
1186143Snate@binkert.org
1196143Snate@binkert.orgAcknowledgments
1206143Snate@binkert.org===============
1216143Snate@binkert.orgA special thanks is in order for all of the students in CS326 who
1226143Snate@binkert.orgsuffered through about 25 different versions of these tools :-).
1236143Snate@binkert.org
1245522Snate@binkert.orgThe CHANGES file acknowledges those who have contributed patches.
1255522Snate@binkert.org
1265522Snate@binkert.orgElias Ioup did the first implementation of LALR(1) parsing in PLY-1.x. 
1275522Snate@binkert.orgAndrew Waters and Markus Schoepflin were instrumental in reporting bugs
1285604Snate@binkert.organd testing a revised LALR(1) implementation for PLY-2.0.
1295604Snate@binkert.org
1306143Snate@binkert.orgSpecial Note for PLY-3.0
1316143Snate@binkert.org========================
1324762Snate@binkert.orgPLY-3.0 the first PLY release to support Python 3. However, backwards
1334762Snate@binkert.orgcompatibility with Python 2.2 is still preserved. PLY provides dual
1346143Snate@binkert.orgPython 2/3 compatibility by restricting its implementation to a common
1356727Ssteve.reinhardt@amd.comsubset of basic language features. You should not convert PLY using
1366727Ssteve.reinhardt@amd.com2to3--it is not necessary and may in fact break the implementation.
1376727Ssteve.reinhardt@amd.com
1384762Snate@binkert.orgExample
1396143Snate@binkert.org=======
1406143Snate@binkert.org
1416143Snate@binkert.orgHere is a simple example showing a PLY implementation of a calculator
1426143Snate@binkert.orgwith variables.
1436727Ssteve.reinhardt@amd.com
1446143Snate@binkert.org# -----------------------------------------------------------------------------
1456143Snate@binkert.org# calc.py
1466143Snate@binkert.org#
1475604Snate@binkert.org# A simple calculator with variables.
1486143Snate@binkert.org# -----------------------------------------------------------------------------
1496143Snate@binkert.org
1506143Snate@binkert.orgtokens = (
1514762Snate@binkert.org    'NAME','NUMBER',
1526143Snate@binkert.org    'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
1534762Snate@binkert.org    'LPAREN','RPAREN',
1544762Snate@binkert.org    )
1554762Snate@binkert.org
1566143Snate@binkert.org# Tokens
1576143Snate@binkert.org
1584762Snate@binkert.orgt_PLUS    = r'\+'
1596143Snate@binkert.orgt_MINUS   = r'-'
1606143Snate@binkert.orgt_TIMES   = r'\*'
1616143Snate@binkert.orgt_DIVIDE  = r'/'
1626143Snate@binkert.orgt_EQUALS  = r'='
1634762Snate@binkert.orgt_LPAREN  = r'\('
1646143Snate@binkert.orgt_RPAREN  = r'\)'
1654762Snate@binkert.orgt_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'
1666143Snate@binkert.org
1674762Snate@binkert.orgdef t_NUMBER(t):
1686143Snate@binkert.org    r'\d+'
1696143Snate@binkert.org    t.value = int(t.value)
1706143Snate@binkert.org    return t
1716143Snate@binkert.org
1726143Snate@binkert.org# Ignored characters
1736143Snate@binkert.orgt_ignore = " \t"
1746143Snate@binkert.org
1756143Snate@binkert.orgdef t_newline(t):
1766143Snate@binkert.org    r'\n+'
1776143Snate@binkert.org    t.lexer.lineno += t.value.count("\n")
1786143Snate@binkert.org    
1796143Snate@binkert.orgdef t_error(t):
1806143Snate@binkert.org    print "Illegal character '%s'" % t.value[0]
181955SN/A    t.lexer.skip(1)
1825584Snate@binkert.org    
1835584Snate@binkert.org# Build the lexer
1845584Snate@binkert.orgimport ply.lex as lex
1855584Snate@binkert.orglex.lex()
1866143Snate@binkert.org
1876143Snate@binkert.org# Precedence rules for the arithmetic operators
1886143Snate@binkert.orgprecedence = (
1895584Snate@binkert.org    ('left','PLUS','MINUS'),
1904382Sbinkertn@umich.edu    ('left','TIMES','DIVIDE'),
1914202Sbinkertn@umich.edu    ('right','UMINUS'),
1924382Sbinkertn@umich.edu    )
1934382Sbinkertn@umich.edu
1944382Sbinkertn@umich.edu# dictionary of names (for storing variables)
1955584Snate@binkert.orgnames = { }
1964382Sbinkertn@umich.edu
1974382Sbinkertn@umich.edudef p_statement_assign(p):
1984382Sbinkertn@umich.edu    'statement : NAME EQUALS expression'
1995192Ssaidi@eecs.umich.edu    names[p[1]] = p[3]
2005192Ssaidi@eecs.umich.edu
2015799Snate@binkert.orgdef p_statement_expr(p):
2025799Snate@binkert.org    'statement : expression'
2035799Snate@binkert.org    print p[1]
2045192Ssaidi@eecs.umich.edu
2055799Snate@binkert.orgdef p_expression_binop(p):
2065192Ssaidi@eecs.umich.edu    '''expression : expression PLUS expression
2075799Snate@binkert.org                  | expression MINUS expression
2085799Snate@binkert.org                  | expression TIMES expression
2095192Ssaidi@eecs.umich.edu                  | expression DIVIDE expression'''
2105192Ssaidi@eecs.umich.edu    if p[2] == '+'  : p[0] = p[1] + p[3]
2115192Ssaidi@eecs.umich.edu    elif p[2] == '-': p[0] = p[1] - p[3]
2125799Snate@binkert.org    elif p[2] == '*': p[0] = p[1] * p[3]
2135192Ssaidi@eecs.umich.edu    elif p[2] == '/': p[0] = p[1] / p[3]
2145192Ssaidi@eecs.umich.edu
2155192Ssaidi@eecs.umich.edudef p_expression_uminus(p):
2165192Ssaidi@eecs.umich.edu    'expression : MINUS expression %prec UMINUS'
2175192Ssaidi@eecs.umich.edu    p[0] = -p[2]
2185192Ssaidi@eecs.umich.edu
2194382Sbinkertn@umich.edudef p_expression_group(p):
2204382Sbinkertn@umich.edu    'expression : LPAREN expression RPAREN'
2214382Sbinkertn@umich.edu    p[0] = p[2]
2222667Sstever@eecs.umich.edu
2232667Sstever@eecs.umich.edudef p_expression_number(p):
2242667Sstever@eecs.umich.edu    'expression : NUMBER'
2252667Sstever@eecs.umich.edu    p[0] = p[1]
2262667Sstever@eecs.umich.edu
2272667Sstever@eecs.umich.edudef p_expression_name(p):
2285742Snate@binkert.org    'expression : NAME'
2295742Snate@binkert.org    try:
2305742Snate@binkert.org        p[0] = names[p[1]]
2315793Snate@binkert.org    except LookupError:
2325793Snate@binkert.org        print "Undefined name '%s'" % p[1]
2335793Snate@binkert.org        p[0] = 0
2345793Snate@binkert.org
2355793Snate@binkert.orgdef p_error(p):
2364382Sbinkertn@umich.edu    print "Syntax error at '%s'" % p.value
2374762Snate@binkert.org
2385344Sstever@gmail.comimport ply.yacc as yacc
2394382Sbinkertn@umich.eduyacc.yacc()
2405341Sstever@gmail.com
2415742Snate@binkert.orgwhile 1:
2425742Snate@binkert.org    try:
2435742Snate@binkert.org        s = raw_input('calc > ')
2445742Snate@binkert.org    except EOFError:
2455742Snate@binkert.org        break
2464762Snate@binkert.org    yacc.parse(s)
2475742Snate@binkert.org
2485742Snate@binkert.org
2495742Snate@binkert.orgBug Reports and Patches
2505742Snate@binkert.org=======================
2515742Snate@binkert.orgMy goal with PLY is to simply have a decent lex/yacc implementation
2525742Snate@binkert.orgfor Python.  As a general rule, I don't spend huge amounts of time
2535742Snate@binkert.orgworking on it unless I receive very specific bug reports and/or
2545341Sstever@gmail.compatches to fix problems. I also try to incorporate submitted feature
2555742Snate@binkert.orgrequests and enhancements into each new version.  To contact me about
2565341Sstever@gmail.combugs and/or new features, please send email to dave@dabeaz.com.
2574773Snate@binkert.org
2586108Snate@binkert.orgIn addition there is a Google group for discussing PLY related issues at
2591858SN/A
2601085SN/A    http://groups.google.com/group/ply-hack
2616658Snate@binkert.org 
2626658Snate@binkert.org-- Dave
2636658Snate@binkert.org
2646658Snate@binkert.org
2656658Snate@binkert.org
2666658Snate@binkert.org
2676658Snate@binkert.org
2686658Snate@binkert.org
2696658Snate@binkert.org
2706658Snate@binkert.org
2716658Snate@binkert.org
2726658Snate@binkert.org