README revision 2632
12632Sstever@eecs.umich.eduPLY (Python Lex-Yacc)                   Version 1.2  (November 27, 2002)
22632Sstever@eecs.umich.edu
32632Sstever@eecs.umich.eduDavid M. Beazley
42632Sstever@eecs.umich.eduDepartment of Computer Science
52632Sstever@eecs.umich.eduUniversity of Chicago
62632Sstever@eecs.umich.eduChicago, IL  60637
72632Sstever@eecs.umich.edubeazley@cs.uchicago.edu
82632Sstever@eecs.umich.edu
92632Sstever@eecs.umich.eduCopyright (C) 2001   David M. Beazley
102632Sstever@eecs.umich.edu
112632Sstever@eecs.umich.edu$Header: /home/stever/bk/newmem2/ext/ply/README 1.1 03/06/06 14:53:34-00:00 stever@ $
122632Sstever@eecs.umich.edu
132632Sstever@eecs.umich.eduThis library is free software; you can redistribute it and/or
142632Sstever@eecs.umich.edumodify it under the terms of the GNU Lesser General Public
152632Sstever@eecs.umich.eduLicense as published by the Free Software Foundation; either
162632Sstever@eecs.umich.eduversion 2.1 of the License, or (at your option) any later version.
172632Sstever@eecs.umich.edu
182632Sstever@eecs.umich.eduThis library is distributed in the hope that it will be useful,
192632Sstever@eecs.umich.edubut WITHOUT ANY WARRANTY; without even the implied warranty of
202632Sstever@eecs.umich.eduMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
212632Sstever@eecs.umich.eduLesser General Public License for more details.
222632Sstever@eecs.umich.edu
232632Sstever@eecs.umich.eduYou should have received a copy of the GNU Lesser General Public
242632Sstever@eecs.umich.eduLicense along with this library; if not, write to the Free Software
252632Sstever@eecs.umich.eduFoundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
262632Sstever@eecs.umich.edu
272632Sstever@eecs.umich.eduSee the file COPYING for a complete copy of the LGPL.
282632Sstever@eecs.umich.edu
292632Sstever@eecs.umich.eduIntroduction
302632Sstever@eecs.umich.edu============
312632Sstever@eecs.umich.edu
322632Sstever@eecs.umich.eduPLY is a 100% Python implementation of the common parsing tools lex
332632Sstever@eecs.umich.eduand yacc.   Although several other parsing tools are available for
342632Sstever@eecs.umich.eduPython, there are several reasons why you might want to consider PLY:
352632Sstever@eecs.umich.edu
362632Sstever@eecs.umich.edu -  The tools are very closely modeled after traditional lex/yacc.
372632Sstever@eecs.umich.edu    If you know how to use these tools in C, you will find PLY
382632Sstever@eecs.umich.edu    to be similar.
392632Sstever@eecs.umich.edu
402632Sstever@eecs.umich.edu -  PLY provides *very* extensive error reporting and diagnostic 
412632Sstever@eecs.umich.edu    information to assist in parser construction.  The original
422632Sstever@eecs.umich.edu    implementation was developed for instructional purposes.  As
432632Sstever@eecs.umich.edu    a result, the system tries to identify the most common types
442632Sstever@eecs.umich.edu    of errors made by novice users.  
452632Sstever@eecs.umich.edu
462632Sstever@eecs.umich.edu -  PLY provides full support for empty productions, error recovery,
472632Sstever@eecs.umich.edu    precedence specifiers, and moderately ambiguous grammars.
482632Sstever@eecs.umich.edu
492632Sstever@eecs.umich.edu -  Parsing is based on LR-parsing which is fast, memory efficient, 
502632Sstever@eecs.umich.edu    better suited to large grammars, and which has a number of nice
512632Sstever@eecs.umich.edu    properties when dealing with syntax errors and other parsing problems.
522632Sstever@eecs.umich.edu    Currently, PLY builds its parsing tables using the SLR algorithm which 
532632Sstever@eecs.umich.edu    is slightly weaker than LALR(1) used in traditional yacc. 
542632Sstever@eecs.umich.edu
552632Sstever@eecs.umich.edu -  Like John Aycock's excellent SPARK toolkit, PLY uses Python 
562632Sstever@eecs.umich.edu    reflection to build lexers and parsers.  This greatly simplifies 
572632Sstever@eecs.umich.edu    the task of parser construction since it reduces the number of files
582632Sstever@eecs.umich.edu    and eliminates the need to run a separate lex/yacc tool before
592632Sstever@eecs.umich.edu    running your program.
602632Sstever@eecs.umich.edu
612632Sstever@eecs.umich.edu -  PLY can be used to build parsers for "real" programming languages.
622632Sstever@eecs.umich.edu    Although it is not ultra-fast due to its Python implementation,
632632Sstever@eecs.umich.edu    PLY can be used to parse grammars consisting of several hundred
642632Sstever@eecs.umich.edu    rules (as might be found for a language like C).  The lexer and LR 
652632Sstever@eecs.umich.edu    parser are also reasonably efficient when parsing typically
662632Sstever@eecs.umich.edu    sized programs.
672632Sstever@eecs.umich.edu
682632Sstever@eecs.umich.eduThe original version of PLY was developed for an Introduction to
692632Sstever@eecs.umich.eduCompilers course where students used it to build a compiler for a
702632Sstever@eecs.umich.edusimple Pascal-like language.  Their compiler had to include lexical
712632Sstever@eecs.umich.eduanalysis, parsing, type checking, type inference, and generation of
722632Sstever@eecs.umich.eduassembly code for the SPARC processor.  Because of this, the current
732632Sstever@eecs.umich.eduimplementation has been extensively tested and debugged.  In addition,
742632Sstever@eecs.umich.edumost of the API and error checking steps have been adapted to address
752632Sstever@eecs.umich.educommon usability problems.
762632Sstever@eecs.umich.edu
772632Sstever@eecs.umich.eduHow to Use
782632Sstever@eecs.umich.edu==========
792632Sstever@eecs.umich.edu
802632Sstever@eecs.umich.eduPLY consists of two files : lex.py and yacc.py.  To use the system,
812632Sstever@eecs.umich.edusimply copy these files to your project and import them like standard
822632Sstever@eecs.umich.eduPython modules.
832632Sstever@eecs.umich.edu
842632Sstever@eecs.umich.eduThe file doc/ply.html contains complete documentation on how to use
852632Sstever@eecs.umich.eduthe system.
862632Sstever@eecs.umich.edu
872632Sstever@eecs.umich.eduThe example directory contains several different examples including a
882632Sstever@eecs.umich.eduPLY specification for ANSI C as given in K&R 2nd Ed.   Note: To use
892632Sstever@eecs.umich.eduthe examples, you will need to copy the lex.py and yacc.py files to
902632Sstever@eecs.umich.eduthe example directory.
912632Sstever@eecs.umich.edu
922632Sstever@eecs.umich.eduA simple example is found at the end of this document
932632Sstever@eecs.umich.edu
942632Sstever@eecs.umich.eduRequirements
952632Sstever@eecs.umich.edu============
962632Sstever@eecs.umich.eduPLY requires the use of Python 2.0 or greater.  It should work on
972632Sstever@eecs.umich.edujust about any platform.
982632Sstever@eecs.umich.edu
992632Sstever@eecs.umich.eduResources
1002632Sstever@eecs.umich.edu=========
1012632Sstever@eecs.umich.edu
1022632Sstever@eecs.umich.eduMore information about PLY can be obtained on the PLY webpage at:
1032632Sstever@eecs.umich.edu
1042632Sstever@eecs.umich.edu     http://systems.cs.uchicago.edu/ply
1052632Sstever@eecs.umich.edu
1062632Sstever@eecs.umich.eduFor a detailed overview of parsing theory, consult the excellent
1072632Sstever@eecs.umich.edubook "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and
1082632Sstever@eecs.umich.eduUllman.  The topics found in "Lex & Yacc" by Levine, Mason, and Brown
1092632Sstever@eecs.umich.edumay also be useful.
1102632Sstever@eecs.umich.edu
1112632Sstever@eecs.umich.eduGiven that this is the first release, I welcome your comments on how
1122632Sstever@eecs.umich.eduto improve the current implementation.  See the TODO file for things that
1132632Sstever@eecs.umich.edustill need to be done.
1142632Sstever@eecs.umich.edu
1152632Sstever@eecs.umich.eduAcknowledgments
1162632Sstever@eecs.umich.edu===============
1172632Sstever@eecs.umich.edu
1182632Sstever@eecs.umich.eduA special thanks is in order for all of the students in CS326 who
1192632Sstever@eecs.umich.edusuffered through about 25 different versions of these tools :-).
1202632Sstever@eecs.umich.edu
1212632Sstever@eecs.umich.eduExample
1222632Sstever@eecs.umich.edu=======
1232632Sstever@eecs.umich.edu
1242632Sstever@eecs.umich.eduHere is a simple example showing a PLY implementation of a calculator with variables.
1252632Sstever@eecs.umich.edu
1262632Sstever@eecs.umich.edu# -----------------------------------------------------------------------------
1272632Sstever@eecs.umich.edu# calc.py
1282632Sstever@eecs.umich.edu#
1292632Sstever@eecs.umich.edu# A simple calculator with variables.
1302632Sstever@eecs.umich.edu# -----------------------------------------------------------------------------
1312632Sstever@eecs.umich.edu
1322632Sstever@eecs.umich.edutokens = (
1332632Sstever@eecs.umich.edu    'NAME','NUMBER',
1342632Sstever@eecs.umich.edu    'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
1352632Sstever@eecs.umich.edu    'LPAREN','RPAREN',
1362632Sstever@eecs.umich.edu    )
1372632Sstever@eecs.umich.edu
1382632Sstever@eecs.umich.edu# Tokens
1392632Sstever@eecs.umich.edu
1402632Sstever@eecs.umich.edut_PLUS    = r'\+'
1412632Sstever@eecs.umich.edut_MINUS   = r'-'
1422632Sstever@eecs.umich.edut_TIMES   = r'\*'
1432632Sstever@eecs.umich.edut_DIVIDE  = r'/'
1442632Sstever@eecs.umich.edut_EQUALS  = r'='
1452632Sstever@eecs.umich.edut_LPAREN  = r'\('
1462632Sstever@eecs.umich.edut_RPAREN  = r'\)'
1472632Sstever@eecs.umich.edut_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'
1482632Sstever@eecs.umich.edu
1492632Sstever@eecs.umich.edudef t_NUMBER(t):
1502632Sstever@eecs.umich.edu    r'\d+'
1512632Sstever@eecs.umich.edu    try:
1522632Sstever@eecs.umich.edu        t.value = int(t.value)
1532632Sstever@eecs.umich.edu    except ValueError:
1542632Sstever@eecs.umich.edu        print "Integer value too large", t.value
1552632Sstever@eecs.umich.edu        t.value = 0
1562632Sstever@eecs.umich.edu    return t
1572632Sstever@eecs.umich.edu
1582632Sstever@eecs.umich.edu# Ignored characters
1592632Sstever@eecs.umich.edut_ignore = " \t"
1602632Sstever@eecs.umich.edu
1612632Sstever@eecs.umich.edudef t_newline(t):
1622632Sstever@eecs.umich.edu    r'\n+'
1632632Sstever@eecs.umich.edu    t.lineno += t.value.count("\n")
1642632Sstever@eecs.umich.edu    
1652632Sstever@eecs.umich.edudef t_error(t):
1662632Sstever@eecs.umich.edu    print "Illegal character '%s'" % t.value[0]
1672632Sstever@eecs.umich.edu    t.skip(1)
1682632Sstever@eecs.umich.edu    
1692632Sstever@eecs.umich.edu# Build the lexer
1702632Sstever@eecs.umich.eduimport lex
1712632Sstever@eecs.umich.edulex.lex()
1722632Sstever@eecs.umich.edu
1732632Sstever@eecs.umich.edu# Precedence rules for the arithmetic operators
1742632Sstever@eecs.umich.eduprecedence = (
1752632Sstever@eecs.umich.edu    ('left','PLUS','MINUS'),
1762632Sstever@eecs.umich.edu    ('left','TIMES','DIVIDE'),
1772632Sstever@eecs.umich.edu    ('right','UMINUS'),
1782632Sstever@eecs.umich.edu    )
1792632Sstever@eecs.umich.edu
1802632Sstever@eecs.umich.edu# dictionary of names (for storing variables)
1812632Sstever@eecs.umich.edunames = { }
1822632Sstever@eecs.umich.edu
1832632Sstever@eecs.umich.edudef p_statement_assign(t):
1842632Sstever@eecs.umich.edu    'statement : NAME EQUALS expression'
1852632Sstever@eecs.umich.edu    names[t[1]] = t[3]
1862632Sstever@eecs.umich.edu
1872632Sstever@eecs.umich.edudef p_statement_expr(t):
1882632Sstever@eecs.umich.edu    'statement : expression'
1892632Sstever@eecs.umich.edu    print t[1]
1902632Sstever@eecs.umich.edu
1912632Sstever@eecs.umich.edudef p_expression_binop(t):
1922632Sstever@eecs.umich.edu    '''expression : expression PLUS expression
1932632Sstever@eecs.umich.edu                  | expression MINUS expression
1942632Sstever@eecs.umich.edu                  | expression TIMES expression
1952632Sstever@eecs.umich.edu                  | expression DIVIDE expression'''
1962632Sstever@eecs.umich.edu    if t[2] == '+'  : t[0] = t[1] + t[3]
1972632Sstever@eecs.umich.edu    elif t[2] == '-': t[0] = t[1] - t[3]
1982632Sstever@eecs.umich.edu    elif t[2] == '*': t[0] = t[1] * t[3]
1992632Sstever@eecs.umich.edu    elif t[2] == '/': t[0] = t[1] / t[3]
2002632Sstever@eecs.umich.edu
2012632Sstever@eecs.umich.edudef p_expression_uminus(t):
2022632Sstever@eecs.umich.edu    'expression : MINUS expression %prec UMINUS'
2032632Sstever@eecs.umich.edu    t[0] = -t[2]
2042632Sstever@eecs.umich.edu
2052632Sstever@eecs.umich.edudef p_expression_group(t):
2062632Sstever@eecs.umich.edu    'expression : LPAREN expression RPAREN'
2072632Sstever@eecs.umich.edu    t[0] = t[2]
2082632Sstever@eecs.umich.edu
2092632Sstever@eecs.umich.edudef p_expression_number(t):
2102632Sstever@eecs.umich.edu    'expression : NUMBER'
2112632Sstever@eecs.umich.edu    t[0] = t[1]
2122632Sstever@eecs.umich.edu
2132632Sstever@eecs.umich.edudef p_expression_name(t):
2142632Sstever@eecs.umich.edu    'expression : NAME'
2152632Sstever@eecs.umich.edu    try:
2162632Sstever@eecs.umich.edu        t[0] = names[t[1]]
2172632Sstever@eecs.umich.edu    except LookupError:
2182632Sstever@eecs.umich.edu        print "Undefined name '%s'" % t[1]
2192632Sstever@eecs.umich.edu        t[0] = 0
2202632Sstever@eecs.umich.edu
2212632Sstever@eecs.umich.edudef p_error(t):
2222632Sstever@eecs.umich.edu    print "Syntax error at '%s'" % t.value
2232632Sstever@eecs.umich.edu
2242632Sstever@eecs.umich.eduimport yacc
2252632Sstever@eecs.umich.eduyacc.yacc()
2262632Sstever@eecs.umich.edu
2272632Sstever@eecs.umich.eduwhile 1:
2282632Sstever@eecs.umich.edu    try:
2292632Sstever@eecs.umich.edu        s = raw_input('calc > ')
2302632Sstever@eecs.umich.edu    except EOFError:
2312632Sstever@eecs.umich.edu        break
2322632Sstever@eecs.umich.edu    yacc.parse(s)
2332632Sstever@eecs.umich.edu
2342632Sstever@eecs.umich.edu
2352632Sstever@eecs.umich.edu
2362632Sstever@eecs.umich.edu
2372632Sstever@eecs.umich.edu
2382632Sstever@eecs.umich.edu
2392632Sstever@eecs.umich.edu
2402632Sstever@eecs.umich.edu
2412632Sstever@eecs.umich.edu    
2422632Sstever@eecs.umich.edu
2432632Sstever@eecs.umich.edu
2442632Sstever@eecs.umich.edu
2452632Sstever@eecs.umich.edu
2462632Sstever@eecs.umich.edu
2472632Sstever@eecs.umich.edu
2482632Sstever@eecs.umich.edu
2492632Sstever@eecs.umich.edu
250