CHANGES revision 4479
Version 2.3
------------------------------
02/20/07: beazley
          Fixed a bug with character literals if the literal '.' appeared as the
          last symbol of a grammar rule.  Reported by Ales Smrcka.

02/19/07: beazley
          Warning messages are now redirected to stderr instead of being printed
          to standard output.

02/19/07: beazley
          Added a warning message to lex.py if it detects a literal backslash
          character inside the t_ignore declaration.  This is to help avoid
          problems that might occur if someone accidentally defines t_ignore
          as a Python raw string.  For example:

              t_ignore = r' \t'

          The idea for this is from an email I received from David Cimimi who
          reported bizarre behavior in lexing as a result of defining t_ignore
          as a raw string by accident.

02/18/07: beazley
          Performance improvements.  Made some changes to the internal
          table organization and LR parser to improve parsing performance.

02/18/07: beazley
          Automatic tracking of line number and position information must now be
          enabled by a special flag to parse().  For example:

              yacc.parse(data,tracking=True)

          In many applications, it's just not that important to have the
          parser automatically track all line numbers.  By making this an
          optional feature, the parser can run significantly faster
          (more than a 20% speed increase in many cases).  Note: positional
          information is always available for raw tokens---this change only
          applies to positional information associated with nonterminal
          grammar symbols.
          *** POTENTIAL INCOMPATIBILITY ***

02/18/07: beazley
          Yacc no longer supports extended slices of grammar productions.
          However, it does support regular slices.  For example:

          def p_foo(p):
              'foo : a b c d e'
              p[0] = p[1:3]

          This change is a performance improvement to the parser--it streamlines
          normal access to the grammar values since slices are now handled in
          a __getslice__() method as opposed to __getitem__().

02/12/07: beazley
          Fixed a bug in the handling of token names when combined with
          start conditions.  Bug reported by Todd O'Bryan.

Version 2.2
------------------------------
11/01/06: beazley
          Added lexpos() and lexspan() methods to grammar symbols.  These
          mirror the same functionality of lineno() and linespan().  For
          example:

          def p_expr(p):
              'expr : expr PLUS expr'
              p.lexpos(1)     # Lexing position of left-hand-expression
              p.lexpos(2)     # Lexing position of PLUS
              start,end = p.lexspan(3)  # Lexing range of right hand expression

11/01/06: beazley
          Minor change to error handling.  The recommended way to skip characters
          in the input is to use t.lexer.skip() as shown here:

             def t_error(t):
                 print "Illegal character '%s'" % t.value[0]
                 t.lexer.skip(1)

          The old approach of just using t.skip(1) will still work, but won't
          be documented.

10/31/06: beazley
          Discarded tokens can now be specified as simple strings instead of
          functions.  To do this, simply include the text "ignore_" in the
          token declaration.  For example:

              t_ignore_cppcomment = r'//.*'

          Previously, this had to be done with a function.  For example:

              def t_ignore_cppcomment(t):
                  r'//.*'
                  pass

          If start conditions/states are being used, state names should appear
          before the "ignore_" text.

10/19/06: beazley
          The Lex module now provides support for flex-style start conditions
          as described at http://www.gnu.org/software/flex/manual/html_chapter/flex_11.html.
          Please refer to this document to understand this change note.  Refer to
          the PLY documentation for a PLY-specific explanation of how this works.

          To use start conditions, you first need to declare a set of states in
          your lexer file:

          states = (
                    ('foo','exclusive'),
                    ('bar','inclusive')
          )

          This serves the same role as the %s and %x specifiers in flex.

          Once a state has been declared, tokens for that state can be
          declared by defining rules of the form t_state_TOK.  For example:

            t_PLUS = r'\+'          # Rule defined in INITIAL state
            t_foo_NUM = r'\d+'      # Rule defined in foo state
            t_bar_NUM = r'\d+'      # Rule defined in bar state

            t_foo_bar_NUM = r'\d+'  # Rule defined in both foo and bar
            t_ANY_NUM = r'\d+'      # Rule defined in all states

          In addition to defining tokens for each state, the t_ignore and t_error
          specifications can be customized for specific states.  For example:

            t_foo_ignore = " "     # Ignored characters for foo state
            def t_bar_error(t):
                # Handle errors in bar state
                pass

          Within token rules, the following methods can be used to change states:

            def t_TOKNAME(t):
                t.lexer.begin('foo')        # Begin state 'foo'
                t.lexer.push_state('foo')   # Begin state 'foo', push old state
                                            # onto a stack
                t.lexer.pop_state()         # Restore previous state
                t.lexer.current_state()     # Returns name of current state

          These methods mirror the BEGIN(), yy_push_state(), yy_pop_state(), and
          yy_top_state() functions in flex.

          Start states can be used as one way to write sub-lexers.
          For example, the lexer or parser might instruct the lexer to start
          generating a different set of tokens depending on the context.

          example/yply/ylex.py shows the use of start states to grab C/C++
          code fragments out of traditional yacc specification files.

          *** NEW FEATURE *** Suggested by Daniel Larraz with whom I also
          discussed various aspects of the design.

10/19/06: beazley
          Minor change to the way in which yacc.py was reporting shift/reduce
          conflicts.  Although the underlying LALR(1) algorithm was correct,
          PLY was under-reporting the number of conflicts compared to yacc/bison
          when precedence rules were in effect.  This change should make PLY
          report the same number of conflicts as yacc.

10/19/06: beazley
          Modified yacc so that grammar rules could also include the '-'
          character.  For example:

            def p_expr_list(p):
                'expression-list : expression-list expression'

          Suggested by Oldrich Jedlicka.

10/18/06: beazley
          Attribute lexer.lexmatch added so that token rules can access the re
          match object that was generated.  For example:

          def t_FOO(t):
              r'some regex'
              m = t.lexer.lexmatch
              # Do something with m

          This may be useful if you want to access named groups specified within
          the regex for a specific token. Suggested by Oldrich Jedlicka.
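          Since lexmatch is a standard re match object, named groups can be read
          with the usual group()/groupdict() calls.  A small sketch using plain
          re (the pattern and group names here are made up for illustration; in
          a real token rule the match object would come from t.lexer.lexmatch):

```python
import re

# Hypothetical token regex with named groups.
pattern = re.compile(r'(?P<name>[A-Za-z_]\w*)\s*=\s*(?P<value>\d+)')

m = pattern.match('answer = 42')
assert m.group('name') == 'answer'
assert m.groupdict()['value'] == '42'
```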

10/16/06: beazley
          Changed the error message that results if an illegal character
          is encountered and no default error function is defined in lex.
          The exception is now more informative about the actual cause of
          the error.

Version 2.1
------------------------------
10/02/06: beazley
          The last Lexer object built by lex() can be found in lex.lexer.
          The last Parser object built by yacc() can be found in yacc.parser.

10/02/06: beazley
          New example added:  examples/yply

          This example uses PLY to convert Unix-yacc specification files to
          PLY programs with the same grammar.   This may be useful if you
          want to convert a grammar from bison/yacc to use with PLY.

10/02/06: beazley
          Added support for a start symbol to be specified in the yacc
          input file itself.  Just do this:

               start = 'name'

          where 'name' matches some grammar rule.  For example:

               def p_name(p):
                   'name : A B C'
                   ...

          This mirrors the functionality of the yacc %start specifier.

09/30/06: beazley
          Some new examples added:

          examples/GardenSnake : A simple indentation based language similar
                                 to Python.  Shows how you might handle
                                 whitespace.  Contributed by Andrew Dalke.

          examples/BASIC       : An implementation of 1964 Dartmouth BASIC.
                                 Contributed by Dave against his better
                                 judgement.

09/28/06: beazley
          Minor patch to allow named groups to be used in lex regular
          expression rules.  For example:

              t_QSTRING = r'''(?P<quote>['"]).*?(?P=quote)'''

          Patch submitted by Adam Ring.

09/28/06: beazley
          LALR(1) is now the default parsing method.   To use SLR, use
          yacc.yacc(method="SLR").  Note: there is no performance impact
          on parsing when using LALR(1) instead of SLR. However, constructing
          the parsing tables will take a little longer.

09/26/06: beazley
          Change to line number tracking.  To modify line numbers, modify
          the line number of the lexer itself.  For example:

          def t_NEWLINE(t):
              r'\n'
              t.lexer.lineno += 1

          This modification is both a cleanup and a performance optimization.
          In past versions, lex was monitoring every token for changes in
          the line number.  This extra processing is unnecessary for the vast
          majority of tokens. Thus, this new approach cleans it up a bit.

          *** POTENTIAL INCOMPATIBILITY ***
          You will need to change code in your lexer that updates the line
          number. For example, "t.lineno += 1" becomes "t.lexer.lineno += 1"
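          For a rule that can match several newlines at once (e.g. r'\n+'),
          the usual idiom is to count them in the matched text.  A sketch of
          just the counting (in a real rule, value would be t.value and the
          counter would be t.lexer.lineno):

```python
# Hypothetical matched text for a rule like r'\n+'; the update itself
# is ordinary Python string counting.
lineno = 1
value = '\n\n\n'        # three newlines matched at once
lineno += value.count('\n')
assert lineno == 4
```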

09/26/06: beazley
          Added the lexing position to tokens as an attribute lexpos. This
          is the raw index into the input text at which a token appears.
          This information can be used to compute column numbers and other
          details (e.g., scan backwards from lexpos to the first newline
          to get a column position).
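          The backwards scan described above can be sketched as a small helper.
          find_column is a made-up name, not part of the PLY API; it takes the
          original input string and a token's lexpos:

```python
def find_column(input_text, lexpos):
    # Scan backwards from lexpos to the most recent newline; str.rfind()
    # returns -1 when there is none, which makes the first line work too.
    last_newline = input_text.rfind('\n', 0, lexpos)
    return lexpos - last_newline    # 1-based column number

data = "first line\nsecond line"
assert find_column(data, data.index('second')) == 1
assert find_column(data, data.index('line')) == 7
```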

09/25/06: beazley
          Changed the name of the __copy__() method on the Lexer class
          to clone().  This is used to clone a Lexer object (e.g., if
          you're running different lexers at the same time).

09/21/06: beazley
          Limitations related to the use of the re module have been eliminated.
          Several users reported problems with regular expressions exceeding
          more than 100 named groups. To solve this, lex.py is now capable
          of automatically splitting its master regular expression into
          smaller expressions as needed.   This should, in theory, make it
          possible to specify an arbitrarily large number of tokens.

09/21/06: beazley
          Improved error checking in lex.py.  Rules that match the empty string
          are now rejected (otherwise they cause the lexer to enter an infinite
          loop).  An extra check for rules containing '#' has also been added.
          Since lex compiles regular expressions in verbose mode and '#' is
          interpreted as a regex comment, it is critical to use '\#' instead.
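          The hazard can be seen with plain re: in verbose mode an unescaped
          '#' silently turns the rest of the pattern into a comment.  The
          patterns below are made up for illustration:

```python
import re

# Unescaped '#': everything after it is a comment, so the pattern is
# effectively empty and matches the empty string at position 0.
bad = re.compile(r'#include', re.VERBOSE)
assert bad.match('#include').group() == ''

# Escaped '\#': the pattern matches the literal text as intended.
good = re.compile(r'\#include', re.VERBOSE)
assert good.match('#include').group() == '#include'
```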

09/18/06: beazley
          Added a @TOKEN decorator function to lex.py that can be used to
          define token rules where the documentation string might be computed
          in some way.

          digit            = r'([0-9])'
          nondigit         = r'([_A-Za-z])'
          identifier       = r'(' + nondigit + r'(' + digit + r'|' + nondigit + r')*)'

          from ply.lex import TOKEN

          @TOKEN(identifier)
          def t_ID(t):
               # Do whatever
               pass

          The @TOKEN decorator merely sets the documentation string of the
          associated token function as needed for lex to work.

          Note: An alternative solution is the following:

          def t_ID(t):
              # Do whatever
              pass

          t_ID.__doc__ = identifier

          Note: Decorators require the use of Python 2.4 or later.  If compatibility
          with older versions is needed, use the latter solution.

          The need for this feature was suggested by Cem Karan.
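          What the decorator does can be sketched in a few lines of plain
          Python.  This is an illustrative re-implementation, not the actual
          ply.lex source:

```python
def TOKEN(pattern):
    # Attach the regex to the function's docstring, which is where
    # lex.py looks for a token rule's pattern.
    def decorate(func):
        func.__doc__ = pattern
        return func
    return decorate

digit      = r'([0-9])'
nondigit   = r'([_A-Za-z])'
identifier = r'(' + nondigit + r'(' + digit + r'|' + nondigit + r')*)'

@TOKEN(identifier)
def t_ID(t):
    return t

assert t_ID.__doc__ == identifier
```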

09/14/06: beazley
          Support for single-character literal tokens has been added to yacc.
          These literals must be enclosed in quotes.  For example:

          def p_expr(p):
               "expr : expr '+' expr"
               ...

          def p_expr(p):
               'expr : expr "-" expr'
               ...

          In addition to this, it is necessary to tell the lexer module about
          literal characters.   This is done by defining the variable 'literals'
          as a list of characters.  This should be defined in the module that
          invokes the lex.lex() function.  For example:

             literals = ['+','-','*','/','(',')','=']

          or simply

             literals = '+-*/()='

          It is important to note that literals can only be a single character.
          When the lexer fails to match a token using its normal regular expression
          rules, it will check the current character against the literal list.
          If found, it will be returned with a token type set to match the literal
          character.  Otherwise, an illegal character will be signalled.

09/14/06: beazley
          Modified PLY to install itself as a proper Python package called 'ply'.
          This will make it a little more friendly to other modules.  This
          changes the usage of PLY only slightly.  Just do this to import the
          modules:

                import ply.lex as lex
                import ply.yacc as yacc

          Alternatively, you can do this:

                from ply import *

          which imports both the lex and yacc modules.
          Change suggested by Lee June.

09/13/06: beazley
          Changed the handling of negative indices when used in production rules.
          A negative production index now accesses already parsed symbols on the
          parsing stack.  For example,

              def p_foo(p):
                   "foo : A B C D"
                   print p[1]       # Value of 'A' symbol
                   print p[2]       # Value of 'B' symbol
                   print p[-1]      # Value of whatever symbol appears before A
                                    # on the parsing stack.

                   p[0] = some_val  # Sets the value of the 'foo' grammar symbol

          This behavior makes it easier to work with embedded actions within the
          parsing rules. For example, in C-yacc, it is possible to write code like
          this:

               bar:   A { printf("seen an A = %d\n", $1); } B { do_stuff; }

          In this example, the printf() code executes immediately after A has been
          parsed.  Within the embedded action code, $1 refers to the A symbol on
          the stack.

          To perform the equivalent action in PLY, you need to write a pair
          of rules like this:

               def p_bar(p):
                     "bar : A seen_A B"
                     do_stuff

               def p_seen_A(p):
                     "seen_A :"
                     print "seen an A =", p[-1]

          The second rule "seen_A" is merely an empty production which should be
          reduced as soon as A is parsed in the "bar" rule above.  The negative
          index p[-1] accesses whatever symbol appeared before the seen_A symbol.

          This feature also makes it possible to support inherited attributes.
          For example:

               def p_decl(p):
                     "decl : scope name"

               def p_scope(p):
                     """scope : GLOBAL
                              | LOCAL"""
                     p[0] = p[1]

               def p_name(p):
                     "name : ID"
                     if p[-1] == "GLOBAL":
                          # ...
                     elif p[-1] == "LOCAL":
                          # ...

          In this case, the name rule is inheriting an attribute from the
          scope declaration that precedes it.

          *** POTENTIAL INCOMPATIBILITY ***
          If you are currently using negative indices within existing grammar rules,
          your code will break.  This should be extremely rare, if not non-existent,
          in most cases.  The argument to various grammar rules is not usually
          processed in the same way as a list of items.

Version 2.0
------------------------------
09/07/06: beazley
          Major cleanup and refactoring of the LR table generation code.  Both SLR
          and LALR(1) table generation is now performed by the same code base with
          only minor extensions for extra LALR(1) processing.

09/07/06: beazley
          Completely reimplemented the entire LALR(1) parsing engine to use the
          DeRemer and Pennello algorithm for calculating lookahead sets.  This
          significantly improves the performance of generating LALR(1) tables
          and has the added feature of actually working correctly!  If you
          experienced weird behavior with LALR(1) in prior releases, this should
          hopefully resolve all of those problems.  Many thanks to
          Andrew Waters and Markus Schoepflin for submitting bug reports
          and helping me test out the revised LALR(1) support.

Version 1.8
------------------------------
08/02/06: beazley
          Fixed a problem related to the handling of default actions in LALR(1)
          parsing.  If you experienced subtle and/or bizarre behavior when trying
          to use the LALR(1) engine, this may correct those problems.  Patch
          contributed by Russ Cox.  Note: This patch has been superseded by
          revisions for LALR(1) parsing in Ply-2.0.

08/02/06: beazley
          Added support for slicing of productions in yacc.
          Patch contributed by Patrick Mezard.
4564479Sbinkertn@umich.edu
4574479Sbinkertn@umich.eduVersion 1.7
4584479Sbinkertn@umich.edu------------------------------
4594479Sbinkertn@umich.edu03/02/06: beazley
4604479Sbinkertn@umich.edu          Fixed infinite recursion problem ReduceToTerminals() function that
4614479Sbinkertn@umich.edu          would sometimes come up in LALR(1) table generation.  Reported by 
4624479Sbinkertn@umich.edu          Markus Schoepflin.
4634479Sbinkertn@umich.edu
4644479Sbinkertn@umich.edu03/01/06: beazley
4654479Sbinkertn@umich.edu          Added "reflags" argument to lex().  For example:
4664479Sbinkertn@umich.edu
4674479Sbinkertn@umich.edu               lex.lex(reflags=re.UNICODE)
4684479Sbinkertn@umich.edu
4694479Sbinkertn@umich.edu          This can be used to specify optional flags to the re.compile() function
4704479Sbinkertn@umich.edu          used inside the lexer.   This may be necessary for special situations such
4714479Sbinkertn@umich.edu          as processing Unicode (e.g., if you want escapes like \w and \b to consult
4724479Sbinkertn@umich.edu          the Unicode character property database).   The need for this suggested by
4734479Sbinkertn@umich.edu          Andreas Jung.

03/01/06: beazley
          Fixed a bug with an uninitialized variable on repeated instantiations of parser
          objects when the write_tables=0 argument was used.  Reported by Michael Brown.

03/01/06: beazley
          Modified lex.py to accept Unicode strings both as the regular expressions for
          tokens and as input.  Hopefully this is the only change needed for Unicode support.
          Patch contributed by Johan Dahl.

03/01/06: beazley
          Modified the class-based interface to work with new-style or old-style classes.
          Patch contributed by Michael Brown (although I tweaked it slightly so it would work
          with older versions of Python).

Version 1.6
------------------------------
05/27/05: beazley
          Incorporated a patch contributed by Christopher Stawarz to fix an extremely
          devious bug in LALR(1) parser generation.  This patch should fix problems
          numerous people reported with LALR parsing.

05/27/05: beazley
          Fixed a problem with the lex.py copy constructor.  Reported by Dave Aitel,
          Aaron Lav, and Thad Austin.

05/27/05: beazley
          Added an outputdir option to yacc() to control the output directory.
          Contributed by Christopher Stawarz.

05/27/05: beazley
          Added the rununit.py test script to run tests using the Python unittest module.
          Contributed by Miki Tebeka.

Version 1.5
------------------------------
05/26/04: beazley
          Major enhancement.  LALR(1) parsing support is now working.
          This feature was implemented by Elias Ioup (ezioup@alumni.uchicago.edu)
          and optimized by David Beazley.  To use LALR(1) parsing, do
          the following:

               yacc.yacc(method="LALR")

          Computing LALR(1) parsing tables takes about twice as long as
          the default SLR method.  However, LALR(1) allows you to handle
          more complex grammars.  For example, the ANSI C grammar
          (in example/ansic) has 13 shift-reduce conflicts with SLR, but
          only 1 shift-reduce conflict with LALR(1).

05/20/04: beazley
          Added a __len__ method to parser production lists.  It can
          be used in parser rules like this (note that len(p) counts
          p[0] as well as the grammar symbols):

             def p_somerule(p):
                 """a : B C D
                      | E F"""
                 if len(p) == 4:
                     pass    # Must have been the first rule (p[0..3])
                 elif len(p) == 3:
                     pass    # Must be the second rule (p[0..2])

          Suggested by Joshua Gerth and others.

Version 1.4
------------------------------
04/23/04: beazley
          Incorporated a variety of patches contributed by Eric Raymond.
          These include:

           0. Cleans up some comments so they don't wrap on an 80-column display.
           1. Directs compiler errors to stderr where they belong.
           2. Implements and documents automatic line counting when \n is ignored.
           3. Changes the way progress messages are dumped when debugging is on.
              The new format is both less verbose and conveys more information than
              the old, including shift and reduce actions.

04/23/04: beazley
          Added a Python setup.py file to simplify installation.  Contributed
          by Adam Kerrison.

04/23/04: beazley
          Added patches contributed by Adam Kerrison.

          -   Some output is now only shown when debugging is enabled.  This
              means that PLY will be completely silent when not in debugging mode.

          -   An optional parameter "write_tables" can be passed to yacc() to
              control whether or not parsing tables are written.  By default,
              it is true, but it can be turned off if you don't want the yacc
              table file.  Note: disabling this will cause yacc() to regenerate
              the parsing tables each time.

04/23/04: beazley
          Added patches contributed by David McNab.  This patch adds two
          features:

          -   The parser can be supplied as a class instead of a module.
              For an example of this, see the example/classcalc directory.

          -   Debugging output can be directed to a filename of the user's
              choice.  Use

                 yacc.yacc(debugfile="somefile.out")

Version 1.3
------------------------------
12/10/02: jmdyck
          Various minor adjustments to the code that Dave checked in today.
          Updated test/yacc_{inf,unused}.exp to reflect today's changes.

12/10/02: beazley
          Incorporated a variety of minor bug fixes to empty production
          handling and infinite recursion checking.  Contributed by
          Michael Dyck.

12/10/02: beazley
          Removed a bogus recover() method call in yacc.restart().

Version 1.2
------------------------------
11/27/02: beazley
          Lexer and parser objects are now available as an attribute
          of tokens and slices respectively.  For example:

             def t_NUMBER(t):
                 r'\d+'
                 print t.lexer

             def p_expr_plus(t):
                 'expr : expr PLUS expr'
                 print t.lexer
                 print t.parser

          This can be used for state management (if needed).

10/31/02: beazley
          Modified yacc.py to work with Python optimized mode.  To make
          this work, you need to use

              yacc.yacc(optimize=1)

          Furthermore, you need to first run Python in normal mode
          to generate the necessary parsetab.py files.  After that,
          you can use python -O or python -OO.

          Note: optimized mode turns off a lot of error checking.
          Only use it when you are sure that your grammar is working.
          Make sure parsetab.py is up to date!

10/30/02: beazley
          Added cloning of Lexer objects.  For example:

              import copy
              l = lex.lex()
              lc = copy.copy(l)

              l.input("Some text")
              lc.input("Some other text")
              ...

          This might be useful if the same "lexer" is meant to
          be used in different contexts---or if multiple lexers
          are running concurrently.

10/30/02: beazley
          Fixed a subtle bug with first set computation and empty productions.
          Patch submitted by Michael Dyck.

10/30/02: beazley
          Fixed error messages to use "filename:line: message" instead
          of "filename:line. message".  This makes error reporting more
          friendly to emacs.  Patch submitted by François Pinard.

10/30/02: beazley
          Improvements to the parser.out file.  Terminals and nonterminals
          are sorted instead of being printed in random order.
          Patch submitted by François Pinard.

10/30/02: beazley
          Improvements to parser.out file output.  Rules are now printed
          in a way that's easier to understand.  Contributed by Russ Cox.

10/30/02: beazley
          Added 'nonassoc' associativity support.  This can be used
          to disable the chaining of operators like a < b < c.
          To use it, simply specify 'nonassoc' in the precedence table:

          precedence = (
            ('nonassoc', 'LESSTHAN', 'GREATERTHAN'),  # Nonassociative operators
            ('left', 'PLUS', 'MINUS'),
            ('left', 'TIMES', 'DIVIDE'),
            ('right', 'UMINUS'),                      # Unary minus operator
          )

          Patch contributed by Russ Cox.

10/30/02: beazley
          Modified the lexer to provide optional support for the Python -O and -OO
          modes.  To make this work, Python *first* needs to be run in
          unoptimized mode.  This reads the lexing information and creates a
          file "lextab.py".  Then, run lex like this:

                   # module foo.py
                   ...
                   ...
                   lex.lex(optimize=1)

          Once the lextab file has been created, subsequent calls to
          lex.lex() will read data from the lextab file instead of using
          introspection.  In optimized mode (-O, -OO) everything should
          work normally despite the loss of doc strings.

          To change the name of the file 'lextab.py', use the following:

                  lex.lex(lextab="footab")

          (this creates a file footab.py)


Version 1.1   October 25, 2001
------------------------------

10/25/01: beazley
          Modified the table generator to produce much more compact data.
          This should greatly reduce the size of the parsetab.py[c] file.
          Caveat: the tables still need to be constructed, so a little more
          work is done in parsetab on import.

10/25/01: beazley
          There may be a possible bug in the cycle detector that reports errors
          about infinite recursion.  I'm having a little trouble tracking it
          down, but if you get this problem, you can disable the cycle
          detector as follows:

                 yacc.yacc(check_recursion = 0)

10/25/01: beazley
          Fixed a bug in lex.py that sometimes caused illegal characters to be
          reported incorrectly.  Reported by Sverre Jørgensen.

07/08/01: beazley
          Added a reference to the underlying lexer object when tokens are handled by
          functions.  The lexer is available as the 'lexer' attribute.  This
          was added to provide better lexing support for languages such as Fortran
          where certain types of tokens can't be conveniently expressed as regular
          expressions (and where the tokenizing function may want to perform a
          little backtracking).  Suggested by Pearu Peterson.
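
          The kind of backtracking such a token function might perform can be
          sketched with a plain function over a string and a position; the
          helper below is hypothetical and not part of PLY's API:

```python
# Hypothetical sketch of a tokenizing function that backtracks.
# It tentatively reads an exponent suffix on a number and rewinds
# the position if the suffix turns out to be incomplete (a bare 'e').
def scan_number(text, pos):
    start = pos
    while pos < len(text) and text[pos].isdigit():
        pos += 1
    mark = pos                       # remember where to rewind to
    if pos < len(text) and text[pos] in 'eE':
        pos += 1
        if pos < len(text) and text[pos].isdigit():
            while pos < len(text) and text[pos].isdigit():
                pos += 1
        else:
            pos = mark               # backtrack: bare 'e' is not a suffix
    return text[start:pos], pos

print(scan_number('12e3+x', 0))   # ('12e3', 4)
print(scan_number('12e+x', 0))    # ('12', 2) -- backtracked over 'e'
```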

06/20/01: beazley
          Modified the yacc() function so that an optional starting symbol can be
          specified.  For example:

                 yacc.yacc(start="statement")

          Normally yacc always treats the first production rule as the starting symbol.
          However, if you are debugging your grammar, it may be useful to specify
          an alternative starting symbol.  Idea suggested by Rich Salz.

Version 1.0  June 18, 2001
--------------------------
Initial public offering