README revision 6498
16498Snate@binkert.orgPLY (Python Lex-Yacc)                   Version 3.2
26498Snate@binkert.org
36498Snate@binkert.orgCopyright (C) 2001-2009,
46498Snate@binkert.orgDavid M. Beazley (Dabeaz LLC)
56498Snate@binkert.orgAll rights reserved.
66498Snate@binkert.org
76498Snate@binkert.orgRedistribution and use in source and binary forms, with or without
86498Snate@binkert.orgmodification, are permitted provided that the following conditions are
96498Snate@binkert.orgmet:
106498Snate@binkert.org
116498Snate@binkert.org* Redistributions of source code must retain the above copyright notice,
126498Snate@binkert.org  this list of conditions and the following disclaimer.  
136498Snate@binkert.org* Redistributions in binary form must reproduce the above copyright notice, 
146498Snate@binkert.org  this list of conditions and the following disclaimer in the documentation
156498Snate@binkert.org  and/or other materials provided with the distribution.  
166498Snate@binkert.org* Neither the name of the David Beazley or Dabeaz LLC may be used to
176498Snate@binkert.org  endorse or promote products derived from this software without
186498Snate@binkert.org  specific prior written permission. 
196498Snate@binkert.org
206498Snate@binkert.orgTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
216498Snate@binkert.org"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
226498Snate@binkert.orgLIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
236498Snate@binkert.orgA PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
246498Snate@binkert.orgOWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
256498Snate@binkert.orgSPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
266498Snate@binkert.orgLIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
276498Snate@binkert.orgDATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
286498Snate@binkert.orgTHEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
296498Snate@binkert.org(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
306498Snate@binkert.orgOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
316498Snate@binkert.org
326498Snate@binkert.orgIntroduction
336498Snate@binkert.org============
346498Snate@binkert.org
356498Snate@binkert.orgPLY is a 100% Python implementation of the common parsing tools lex
366498Snate@binkert.organd yacc. Here are a few highlights:
376498Snate@binkert.org
386498Snate@binkert.org -  PLY is very closely modeled after traditional lex/yacc.
396498Snate@binkert.org    If you know how to use these tools in C, you will find PLY
406498Snate@binkert.org    to be similar.
416498Snate@binkert.org
426498Snate@binkert.org -  PLY provides *very* extensive error reporting and diagnostic 
436498Snate@binkert.org    information to assist in parser construction.  The original
446498Snate@binkert.org    implementation was developed for instructional purposes.  As
456498Snate@binkert.org    a result, the system tries to identify the most common types
466498Snate@binkert.org    of errors made by novice users.  
476498Snate@binkert.org
486498Snate@binkert.org -  PLY provides full support for empty productions, error recovery,
496498Snate@binkert.org    precedence specifiers, and moderately ambiguous grammars.
506498Snate@binkert.org
516498Snate@binkert.org -  Parsing is based on LR-parsing which is fast, memory efficient, 
526498Snate@binkert.org    better suited to large grammars, and which has a number of nice
536498Snate@binkert.org    properties when dealing with syntax errors and other parsing problems.
546498Snate@binkert.org    Currently, PLY builds its parsing tables using the LALR(1)
556498Snate@binkert.org    algorithm used in yacc.
566498Snate@binkert.org
576498Snate@binkert.org -  PLY uses Python introspection features to build lexers and parsers.  
586498Snate@binkert.org    This greatly simplifies the task of parser construction since it reduces 
596498Snate@binkert.org    the number of files and eliminates the need to run a separate lex/yacc 
606498Snate@binkert.org    tool before running your program.
616498Snate@binkert.org
626498Snate@binkert.org -  PLY can be used to build parsers for "real" programming languages.
636498Snate@binkert.org    Although it is not ultra-fast due to its Python implementation,
646498Snate@binkert.org    PLY can be used to parse grammars consisting of several hundred
656498Snate@binkert.org    rules (as might be found for a language like C).  The lexer and LR 
666498Snate@binkert.org    parser are also reasonably efficient when parsing typically
676498Snate@binkert.org    sized programs.  People have used PLY to build parsers for
686498Snate@binkert.org    C, C++, ADA, and other real programming languages.
696498Snate@binkert.org
706498Snate@binkert.orgHow to Use
716498Snate@binkert.org==========
726498Snate@binkert.org
736498Snate@binkert.orgPLY consists of two files : lex.py and yacc.py.  These are contained
746498Snate@binkert.orgwithin the 'ply' directory which may also be used as a Python package.
756498Snate@binkert.orgTo use PLY, simply copy the 'ply' directory to your project and import
766498Snate@binkert.orglex and yacc from the associated 'ply' package.  For example:
776498Snate@binkert.org
786498Snate@binkert.org     import ply.lex as lex
796498Snate@binkert.org     import ply.yacc as yacc
806498Snate@binkert.org
816498Snate@binkert.orgAlternatively, you can copy just the files lex.py and yacc.py
826498Snate@binkert.orgindividually and use them as modules.  For example:
836498Snate@binkert.org
846498Snate@binkert.org     import lex
856498Snate@binkert.org     import yacc
866498Snate@binkert.org
876498Snate@binkert.orgThe file setup.py can be used to install ply using distutils.
886498Snate@binkert.org
896498Snate@binkert.orgThe file doc/ply.html contains complete documentation on how to use
906498Snate@binkert.orgthe system.
916498Snate@binkert.org
926498Snate@binkert.orgThe example directory contains several different examples including a
936498Snate@binkert.orgPLY specification for ANSI C as given in K&R 2nd Ed.   
946498Snate@binkert.org
956498Snate@binkert.orgA simple example is found at the end of this document
966498Snate@binkert.org
976498Snate@binkert.orgRequirements
986498Snate@binkert.org============
996498Snate@binkert.orgPLY requires the use of Python 2.2 or greater.  However, you should
1006498Snate@binkert.orguse the latest Python release if possible.  It should work on just
1016498Snate@binkert.orgabout any platform.  PLY has been tested with both CPython and Jython.
1026498Snate@binkert.orgIt also seems to work with IronPython.
1036498Snate@binkert.org
1046498Snate@binkert.orgResources
1056498Snate@binkert.org=========
1066498Snate@binkert.orgMore information about PLY can be obtained on the PLY webpage at:
1076498Snate@binkert.org
1086498Snate@binkert.org     http://www.dabeaz.com/ply
1096498Snate@binkert.org
1106498Snate@binkert.orgFor a detailed overview of parsing theory, consult the excellent
1116498Snate@binkert.orgbook "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and
1126498Snate@binkert.orgUllman.  The topics found in "Lex & Yacc" by Levine, Mason, and Brown
1136498Snate@binkert.orgmay also be useful.
1146498Snate@binkert.org
1156498Snate@binkert.orgA Google group for PLY can be found at
1166498Snate@binkert.org
1176498Snate@binkert.org     http://groups.google.com/group/ply-hack
1186498Snate@binkert.org
1196498Snate@binkert.orgAcknowledgments
1206498Snate@binkert.org===============
1216498Snate@binkert.orgA special thanks is in order for all of the students in CS326 who
1226498Snate@binkert.orgsuffered through about 25 different versions of these tools :-).
1236498Snate@binkert.org
1246498Snate@binkert.orgThe CHANGES file acknowledges those who have contributed patches.
1256498Snate@binkert.org
1266498Snate@binkert.orgElias Ioup did the first implementation of LALR(1) parsing in PLY-1.x. 
1276498Snate@binkert.orgAndrew Waters and Markus Schoepflin were instrumental in reporting bugs
1286498Snate@binkert.organd testing a revised LALR(1) implementation for PLY-2.0.
1296498Snate@binkert.org
1306498Snate@binkert.orgSpecial Note for PLY-3.0
1316498Snate@binkert.org========================
1326498Snate@binkert.orgPLY-3.0 the first PLY release to support Python 3. However, backwards
1336498Snate@binkert.orgcompatibility with Python 2.2 is still preserved. PLY provides dual
1346498Snate@binkert.orgPython 2/3 compatibility by restricting its implementation to a common
1356498Snate@binkert.orgsubset of basic language features. You should not convert PLY using
1366498Snate@binkert.org2to3--it is not necessary and may in fact break the implementation.
1376498Snate@binkert.org
1386498Snate@binkert.orgExample
1396498Snate@binkert.org=======
1406498Snate@binkert.org
1416498Snate@binkert.orgHere is a simple example showing a PLY implementation of a calculator
1426498Snate@binkert.orgwith variables.
1436498Snate@binkert.org
1446498Snate@binkert.org# -----------------------------------------------------------------------------
1456498Snate@binkert.org# calc.py
1466498Snate@binkert.org#
1476498Snate@binkert.org# A simple calculator with variables.
1486498Snate@binkert.org# -----------------------------------------------------------------------------
1496498Snate@binkert.org
1506498Snate@binkert.orgtokens = (
1516498Snate@binkert.org    'NAME','NUMBER',
1526498Snate@binkert.org    'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
1536498Snate@binkert.org    'LPAREN','RPAREN',
1546498Snate@binkert.org    )
1556498Snate@binkert.org
1566498Snate@binkert.org# Tokens
1576498Snate@binkert.org
1586498Snate@binkert.orgt_PLUS    = r'\+'
1596498Snate@binkert.orgt_MINUS   = r'-'
1606498Snate@binkert.orgt_TIMES   = r'\*'
1616498Snate@binkert.orgt_DIVIDE  = r'/'
1626498Snate@binkert.orgt_EQUALS  = r'='
1636498Snate@binkert.orgt_LPAREN  = r'\('
1646498Snate@binkert.orgt_RPAREN  = r'\)'
1656498Snate@binkert.orgt_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'
1666498Snate@binkert.org
1676498Snate@binkert.orgdef t_NUMBER(t):
1686498Snate@binkert.org    r'\d+'
1696498Snate@binkert.org    t.value = int(t.value)
1706498Snate@binkert.org    return t
1716498Snate@binkert.org
1726498Snate@binkert.org# Ignored characters
1736498Snate@binkert.orgt_ignore = " \t"
1746498Snate@binkert.org
1756498Snate@binkert.orgdef t_newline(t):
1766498Snate@binkert.org    r'\n+'
1776498Snate@binkert.org    t.lexer.lineno += t.value.count("\n")
1786498Snate@binkert.org    
1796498Snate@binkert.orgdef t_error(t):
1806498Snate@binkert.org    print "Illegal character '%s'" % t.value[0]
1816498Snate@binkert.org    t.lexer.skip(1)
1826498Snate@binkert.org    
1836498Snate@binkert.org# Build the lexer
1846498Snate@binkert.orgimport ply.lex as lex
1856498Snate@binkert.orglex.lex()
1866498Snate@binkert.org
1876498Snate@binkert.org# Precedence rules for the arithmetic operators
1886498Snate@binkert.orgprecedence = (
1896498Snate@binkert.org    ('left','PLUS','MINUS'),
1906498Snate@binkert.org    ('left','TIMES','DIVIDE'),
1916498Snate@binkert.org    ('right','UMINUS'),
1926498Snate@binkert.org    )
1936498Snate@binkert.org
1946498Snate@binkert.org# dictionary of names (for storing variables)
1956498Snate@binkert.orgnames = { }
1966498Snate@binkert.org
1976498Snate@binkert.orgdef p_statement_assign(p):
1986498Snate@binkert.org    'statement : NAME EQUALS expression'
1996498Snate@binkert.org    names[p[1]] = p[3]
2006498Snate@binkert.org
2016498Snate@binkert.orgdef p_statement_expr(p):
2026498Snate@binkert.org    'statement : expression'
2036498Snate@binkert.org    print p[1]
2046498Snate@binkert.org
2056498Snate@binkert.orgdef p_expression_binop(p):
2066498Snate@binkert.org    '''expression : expression PLUS expression
2076498Snate@binkert.org                  | expression MINUS expression
2086498Snate@binkert.org                  | expression TIMES expression
2096498Snate@binkert.org                  | expression DIVIDE expression'''
2106498Snate@binkert.org    if p[2] == '+'  : p[0] = p[1] + p[3]
2116498Snate@binkert.org    elif p[2] == '-': p[0] = p[1] - p[3]
2126498Snate@binkert.org    elif p[2] == '*': p[0] = p[1] * p[3]
2136498Snate@binkert.org    elif p[2] == '/': p[0] = p[1] / p[3]
2146498Snate@binkert.org
2156498Snate@binkert.orgdef p_expression_uminus(p):
2166498Snate@binkert.org    'expression : MINUS expression %prec UMINUS'
2176498Snate@binkert.org    p[0] = -p[2]
2186498Snate@binkert.org
2196498Snate@binkert.orgdef p_expression_group(p):
2206498Snate@binkert.org    'expression : LPAREN expression RPAREN'
2216498Snate@binkert.org    p[0] = p[2]
2226498Snate@binkert.org
2236498Snate@binkert.orgdef p_expression_number(p):
2246498Snate@binkert.org    'expression : NUMBER'
2256498Snate@binkert.org    p[0] = p[1]
2266498Snate@binkert.org
2276498Snate@binkert.orgdef p_expression_name(p):
2286498Snate@binkert.org    'expression : NAME'
2296498Snate@binkert.org    try:
2306498Snate@binkert.org        p[0] = names[p[1]]
2316498Snate@binkert.org    except LookupError:
2326498Snate@binkert.org        print "Undefined name '%s'" % p[1]
2336498Snate@binkert.org        p[0] = 0
2346498Snate@binkert.org
2356498Snate@binkert.orgdef p_error(p):
2366498Snate@binkert.org    print "Syntax error at '%s'" % p.value
2376498Snate@binkert.org
2386498Snate@binkert.orgimport ply.yacc as yacc
2396498Snate@binkert.orgyacc.yacc()
2406498Snate@binkert.org
2416498Snate@binkert.orgwhile 1:
2426498Snate@binkert.org    try:
2436498Snate@binkert.org        s = raw_input('calc > ')
2446498Snate@binkert.org    except EOFError:
2456498Snate@binkert.org        break
2466498Snate@binkert.org    yacc.parse(s)
2476498Snate@binkert.org
2486498Snate@binkert.org
2496498Snate@binkert.orgBug Reports and Patches
2506498Snate@binkert.org=======================
2516498Snate@binkert.orgMy goal with PLY is to simply have a decent lex/yacc implementation
2526498Snate@binkert.orgfor Python.  As a general rule, I don't spend huge amounts of time
2536498Snate@binkert.orgworking on it unless I receive very specific bug reports and/or
2546498Snate@binkert.orgpatches to fix problems. I also try to incorporate submitted feature
2556498Snate@binkert.orgrequests and enhancements into each new version.  To contact me about
2566498Snate@binkert.orgbugs and/or new features, please send email to dave@dabeaz.com.
2576498Snate@binkert.org
2586498Snate@binkert.orgIn addition there is a Google group for discussing PLY related issues at
2596498Snate@binkert.org
2606498Snate@binkert.org    http://groups.google.com/group/ply-hack
2616498Snate@binkert.org 
2626498Snate@binkert.org-- Dave
2636498Snate@binkert.org
2646498Snate@binkert.org
2656498Snate@binkert.org
2666498Snate@binkert.org
2676498Snate@binkert.org
2686498Snate@binkert.org
2696498Snate@binkert.org
2706498Snate@binkert.org
2716498Snate@binkert.org
2726498Snate@binkert.org