pmacs3/IDEAS

2007/07/15:
Rename "lexing" to "parsing", since we have moved well beyond a simple
lexing/tokenization strategy.
2007/07/14:
The rules are currently confusingly implemented, and have poor performance when
used in deeply nested grammars.
We need to refactor lex2 so that rules have two methods:
1. match():
This method should report whether the rule can match the lexer's current input.
If the result is true, it will be passed (along with the lexer, etc.) to the
rule's lex() method. Otherwise, the next rule will be tried.
2. lex():
This method is a generator, which is expected to yield one or more tokens. In
addition to the arguments given to match(), it will be passed the result of the
call to match() (which is guaranteed to be true, and will most often be a
re.Match object). Like all generators, this method will raise StopIteration when
there are no more tokens to yield, and will raise LexError if there are other
problems.
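
A minimal sketch of how the two-method protocol above might look. This is not
the lex2 code: PatternRule, Token, Lexer, and Lexer.tokens() are hypothetical
names made up for illustration, and the only argument passed to match() here is
the lexer itself. It also assumes that no rule's pattern can match the empty
string, since that would stall the loop.

```python
import re


class LexError(Exception):
    """Raised when no rule can handle the current input."""


class Token:
    """Hypothetical token: a rule name, the matched text, and its offset."""

    def __init__(self, name, text, start):
        self.name = name
        self.text = text
        self.start = start


class PatternRule:
    """Hypothetical rule that matches one regex at the lexer's position."""

    def __init__(self, name, pattern):
        self.name = name
        self.regex = re.compile(pattern)

    def match(self, lexer):
        # Returns a match object (always truthy) or None (falsy).
        return self.regex.match(lexer.text, lexer.pos)

    def lex(self, lexer, m):
        # Generator: receives the truthy result of match() and yields tokens.
        # Advance the lexer before yielding so its state is always consistent.
        lexer.pos = m.end()
        yield Token(self.name, m.group(0), m.start())


class Lexer:
    """Hypothetical driver: tries each rule in order at the current offset."""

    def __init__(self, text, rules):
        self.text = text
        self.rules = rules
        self.pos = 0

    def tokens(self):
        while self.pos < len(self.text):
            for rule in self.rules:
                result = rule.match(self)
                if result:
                    # Hand the match result to the rule's generator.
                    yield from rule.lex(self, result)
                    break
            else:
                # No rule matched: report the problem instead of looping.
                raise LexError("no rule matches at offset %d" % self.pos)
```

Usage, under the same assumptions:

```python
rules = [PatternRule("word", r"[A-Za-z]+"), PatternRule("space", r"\s+")]
for token in Lexer("foo bar", rules).tokens():
    print(token.name, repr(token.text), token.start)
```

Splitting the work this way means the cheap test (match) runs for every rule,
while the token-producing generator (lex) only runs for the rule that won.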