MAGELANG

ANTLR 2.6.0
Release Notes

May 31, 1999

The ANTLR 2.6.0 release is a big feature release, but also fixes a number of bugs.  See http://www.antlr.org/bug for the current list of bugs/suggestions and the bugs fixed for this version.

Binary Incompatibility

YOU MUST REGENERATE PARSERS/LEXERS FROM 2.5.0 GRAMMAR FILES TO BE COMPATIBLE WITH 2.6.0 CLASSES--A CLASS NAME HAS CHANGED.

Source Incompatibility

If you use tokdef or tokenVocabulary options, you must change the source to use importVocab and exportVocab, respectively, and rerun ANTLR on your grammars.  See the documentation for precise information.

A grammar file may contain at most one grammar of each type; i.e., at most one lexer, one parser, and one tree parser.  This prevents a conflict between two grammars both sharing and inheriting vocabularies at the same time.

The literal option for lexer grammars is no longer valid.  Use the more powerful "tokens {}" structure instead.

Enhancements

ANTLR 2.6.0 has the following enhancements:

  • Token streams.  Many thanks to the maniacs who helped me chew through this to get the correct semantics.  See the 2.6.0 documentation main page for the credits.  See the streams documentation.  Interface Tokenizer has been renamed TokenStream!
  • Multiple lexers, "lexical states" (implemented as a token stream filter).  See examples/multiLexer directory in 2.6.0 distribution and the documentation.
  • Added "tokens {...}" syntax for defining tokens for all grammar types.  E.g.,
    tokens {
        DECL;

        FOR_INIT="forinit";
    }
  • Cleaned up vocabulary definition options and changed their names.  See the documentation on options.
  • Changed ASTFrame to use the javax.swing package (Swing 1.1).
  • Shawn Vincent enhanced exception objects so that you can get line and column information as well as the error message without line numbers.  Added getLine(), getColumn(), getErrorMessage(), and modified toString() to use getLine() and getErrorMessage().  Thanks, Shawn!
  • Multiple lexers can now easily share the same input by using the LexerSharedInputState object obtained from one of the lexers via getInputState().  This is also true of parsers that share ParserSharedInputState objects. Very useful for having two lexers pull characters from the same input file (i.e., multiple lexer "states" and so on).  See Multiple Lexers/Parsers With Shared Input State.
  • Both parsers and lexers now have a file name string attribute (setFilename()/getFilename()) that is used when reporting errors.
  • Added a debug jar file, antlr.debug.jar; this is the stuff needed by the parseview debugger.
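
As a rough illustration of the shared-input idea in the list above, here is a minimal, self-contained Java sketch.  It does not use the ANTLR classes: SharedInput, WordLexer, NumberLexer, and SharedInputDemo are hypothetical stand-ins, and only the getInputState() name is taken from the note above.  The point is simply that both lexers hold the same state object, so whichever one runs next continues where the other stopped.

```java
/** Hypothetical stand-in for a shared input state: the text plus one shared read position. */
final class SharedInput {
    private final String text;
    private int pos = 0;

    SharedInput(String text) { this.text = text; }

    /** Returns the next unread character, or -1 at end of input. */
    int peek() { return pos < text.length() ? text.charAt(pos) : -1; }

    /** Advances the shared position by one character. */
    void consume() { if (pos < text.length()) pos++; }
}

/** Hypothetical lexer that matches a run of letters. */
final class WordLexer {
    private final SharedInput state;

    WordLexer(SharedInput state) { this.state = state; }

    /** Hands out the state so another lexer can read from the same input. */
    SharedInput getInputState() { return state; }

    String nextWord() {
        StringBuilder sb = new StringBuilder();
        while (state.peek() != -1 && Character.isLetter(state.peek())) {
            sb.append((char) state.peek());
            state.consume();
        }
        return sb.toString();
    }
}

/** Hypothetical lexer that matches a run of digits. */
final class NumberLexer {
    private final SharedInput state;

    NumberLexer(SharedInput state) { this.state = state; }

    String nextNumber() {
        StringBuilder sb = new StringBuilder();
        while (state.peek() != -1 && Character.isDigit(state.peek())) {
            sb.append((char) state.peek());
            state.consume();
        }
        return sb.toString();
    }
}

final class SharedInputDemo {
    public static void main(String[] args) {
        WordLexer words = new WordLexer(new SharedInput("ab12cd"));
        // The second lexer is built from the first lexer's state, not from a new input.
        NumberLexer numbers = new NumberLexer(words.getInputState());
        System.out.println(words.nextWord());     // letters up to the first digit
        System.out.println(numbers.nextNumber()); // continues where the word lexer stopped
        System.out.println(words.nextWord());     // and back to the word lexer
    }
}
```

With the real ANTLR classes the state object would be a LexerSharedInputState obtained from an existing lexer; see the documentation for the exact constructors.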

C++ Output

According to Pete Wells, the C++ output has been updated to include the 2.6.0 functionality and fix a few bugs in the C++ code generator.  A big thanks to Pete for his hard work on the C++ code generator.

New IDL Grammar

Also, I have added an IDL grammar written by Jim Coker to the distribution.

Miscellaneous updates (not including the bug fixes described at http://www.antlr.org/bug):

  • Modified tree grammars so that #TOKEN works in tree walker actions even if not building ASTs. Beware that #TOKEN will refer to the OUTPUT tree node associated with TOKEN if you turn on buildAST (whereas without buildAST, #TOKEN refers to the input tree node). To be precise in all cases, use #TOKEN_in to get the input node. References to #rulename in tree grammars didn't work either unless building trees.
  • Method equalsListPartial(AST) considered only the token type; it now calls equals() on the tree, which checks type and text.
  • The java tree grammar was missing the (stat)? clause for the else statement.
  • InputBuffer and CharQueue were made more reusable.  I made the fields protected and made CharQueue itself public.
  • Lexers could not use semantic predicates unless they were hoisted into a decision.  I have added SemanticException to the list of exceptions thrown by lexer rules.  This is a temporary fix until I can redo the exception hierarchy.
  • antlr.collections.impl.BitSet.member() had a bug (the old "off by one" error).  Tested > instead of >=.  This did not affect ANTLR's analysis because the code never asked for anything outside of the defined token type space.
  • Fixed the handling of comments in the HTML grammar.  It didn't count \n's, didn't allow some valid comments, and didn't allow comments everywhere.
  • Updated the Java grammar, which didn't properly construct trees for interfaces.
  • The '.' wildcard can now be the root of a #(...) tree construct.
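
To make the BitSet.member() item above more concrete, here is a minimal Java sketch of one plausible shape of such an off-by-one bounds check.  TinyBitSet is a hypothetical class written for this note, not the code of antlr.collections.impl.BitSet, and the exact comparison in the original bug may have looked different.

```java
/** Hypothetical minimal bit set backed by 64-bit words; not ANTLR's BitSet. */
final class TinyBitSet {
    private final long[] words;

    /** Allocates enough words to hold elements 0 .. nbits-1. */
    TinyBitSet(int nbits) {
        words = new long[(nbits + 63) / 64];
    }

    /** Adds an element; the caller must stay within the allocated range. */
    void add(int el) {
        words[el >> 6] |= 1L << (el & 63);
    }

    /** Returns true if el is in the set; out-of-range values are simply not members. */
    boolean member(int el) {
        if (el < 0) {
            return false;
        }
        int n = el >> 6; // index of the word that would hold el
        // Valid indices are 0 .. words.length-1, so the guard must be ">=".
        // Writing "n > words.length" would let n == words.length through
        // and the array access below would fail.
        if (n >= words.length) {
            return false;
        }
        return (words[n] & (1L << (el & 63))) != 0;
    }
}
```

With a 64-element set, asking for member(64) is the boundary case: the word index is 1, which equals words.length, so only the >= test rejects it cleanly.  This matches the remark above that the bug only shows up when asking about values outside the defined range.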