ANTLR

jGuru

ANTLR 2.7.0
Release Notes

"The Millennium release" (A2K?)

January 19, 2000

The ANTLR 2.7.0 release is a big feature release, but also fixes a number of bugs.  See http://www.antlr.org/bug for the current list of bugs/suggestions and the bugs fixed for this version.

Brought to you by those hip cats at jGuru.com.

Binary Incompatibility

YOU MUST REGENERATE PARSERS/LEXERS FROM 2.6.0 GRAMMAR FILES TO BE COMPATIBLE WITH 2.7.0 CLASSES--SOME CLASS NAMES HAVE CHANGED.

Source Incompatibility

The exception names have changed.  If you trapped exceptions such as ParserException emanating from the parser in your main() or wherever, you will have to change the invocation code; e.g., use RecognitionException.

Enhancements

ANTLR 2.7.0 has the following enhancements:

New Exception Hierarchy

I have radically changed the exception hierarchy from 2.6.x.  The ANTLR source is compatible, but if you trap exceptions in your main() or wherever, you will have to change your code to use the new exception names.  See ANTLR Exception Hierarchy and note the following:

  • TokenStream.nextToken() now throws TokenStreamException instead of IOException.
  • CharScanner.consume() now throws CharStreamException.
  • Added ANTLRError, though it is not currently used.
  • Lexer rules now throw RecognitionException, CharStreamException, and TokenStreamException instead of ScannerException and IOException.
  • Parser rules now throw RecognitionException and TokenStreamException instead of IOException and ParserException.
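
For invocation code, the change might look roughly like the sketch below.  It is meant to compile on its own, so the two exception classes are minimal stand-ins named after the 2.7.0 exceptions (not the real antlr classes), and parse() is a hypothetical placeholder for a call into a generated parser's start rule.

```java
// Sketch only: RecognitionException and TokenStreamException below are
// minimal stand-ins for the ANTLR 2.7.0 classes of the same name, so that
// this file compiles without antlr on the CLASSPATH.
class InvocationSketch {
    static class RecognitionException extends Exception {
        RecognitionException(String msg) { super(msg); }
    }
    static class TokenStreamException extends Exception {
        TokenStreamException(String msg) { super(msg); }
    }

    // Hypothetical placeholder for a generated parser's start rule.
    // Per the notes above, parser rules in 2.7.0 throw these two
    // exceptions rather than ParserException and IOException.
    static void parse(String input)
        throws RecognitionException, TokenStreamException
    {
        if (input == null) {
            throw new TokenStreamException("no input available");
        }
        if (input.length() == 0) {
            throw new RecognitionException("unexpected end of input");
        }
    }

    // What invocation code that used to catch ParserException and
    // IOException might look like after the rename.
    static String run(String input) {
        try {
            parse(input);
            return "ok";
        }
        catch (RecognitionException e) {
            return "syntax error: " + e.getMessage();
        }
        catch (TokenStreamException e) {
            return "token stream error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run("a+b"));
        System.out.println(run(""));
        System.out.println(run(null));
    }
}
```

With the real classes on your CLASSPATH you would drop the stand-ins, import the antlr exceptions, and call your own start rule where parse() appears.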

C++ Output

According to Pete Wells, the C++ output has been updated to include the 2.7.0 functionality and fix a few bugs in the C++ code generator.  A big thanks to Pete for his hard work on the C++ code generator.  Pete says:

  • Updates to namespace support - all namespace references are through macros, so that older compilers can still work; the library code goes into a namespace antlr; there is a new file-level option, namespace, which allows the generated code to go into a namespace.
  • Changes to the AST model - reverting to more like the Java model (rather than separate tree and leaf classes). NOTE: this uses dynamic_cast, so MSVC users need to turn on RTTI.
  • Named header actions - there are now 4 named blocks (pre_include_cpp, pre_include_hpp, post_include_cpp, and post_include_hpp) which place the code in different parts of the generated code. Syntax is header "name" {...code...}, and multiple headers are allowed. NOTE: no checks are yet done for invalid names.

New! Sather Code Generator

Mika Illouz has generously developed and released a Sather code-generator for ANTLR. The Sather code generation option sports:

  • a port of most of the Java/C++ grammar examples
  • a new Sather grammar adapted from the ICSI Sather 1.1 specification
  • look ma! No typecasts!

New Grammars

New Pascal Grammar

Hakki Dogusan has written and donated a cool Pascal grammar.  See examples/java/pascal.

New TinyBasic Interpreter

Sinan Karasu has written and donated a TinyBasic interpreter; see examples/java/tinybasic.  It's pretty cool!  He says:

This is a simple Basic Parser/Interpreter I put together in 3 days.  Has NOT been tested, only for instructional value.  Do whatever you do with it, I don't care....

ANTLR Emacs mode

Christoph Wedler wrote a nice emacs mode for ANTLR.  Check it out!

Perforce Revision Control

Terence is now using Perforce's groovy source code control system.  They have graciously granted the ANTLR project a 100-user license, which means that up to 100 active ANTLR folks will have access via the net.  Some users will have write access to update bugs they find, letting all the other users get immediate fixes.  Also, you'll note the file revision now in source and documentation files.

Perforce is an absolutely great tool for distributed development.  We use it at jGuru.com commercially to collaborate, to develop software, and to keep all of our gurus and operations folks in sync.  The advent of this system for the ANTLR project will dramatically improve collaboration between all of us.

I will put together a description of how to gain access (there is a perforce client for just about everything...ok not the timex sinclair zx-80) when I get the chance.

Miscellaneous updates

(not including the bugs fixes described at http://www.antlr.org/bug):

  • To better support C++, you can name header blocks:

      header "name" {...code...}

    See the section above from Pete Wells.
  • I have added the method:
        public int testLiteralsTable(String text, int ttype) {}
    for use when you only want to test a portion of a token's text.
  • A new method is available for reacting to the end of file condition; e.g., you might want to pop the lexer state at the end of an include file.   This method, CharScanner.uponEOF(), is called from nextToken() right before the scanner returns an EOF_TYPE token object to the parser.   See The End Of File Condition.
  • Added method public int getColumn() to CharScanner.   Read Monty Zukowski's article on tracking column info.
  • Updated action translation to be less sensitive to spacing and to have fewer warnings.  The grammar file was physically moved from antlr/actions/action.g to antlr/actions/java/action.g.
  • The examples dir is now a generic examples dir and has java, cpp, and sather subdirectories.  Directory examples/* is now examples/java/*.
  • Fixed preprocessor/preproc.g to use the latest goodies to hush warnings, etc.
  • Fixed bugs in the Java grammar; see java.g and the bug list for details.
  • Enhanced TokenStreamSelector: added a retry mechanism.   Throw TokenStreamRetryException from any selector input stream and the nextToken() method will try to reset and grab another token.  This is great when switching input streams because the selector will try to grab a token from a different stream.
  • Added antlr.Tool.getGrammarReader() so you can override Tool to specify where to get input (nice for using in IDEs).
  • All parsers and tree parsers now import AST, ASTPair, and ASTArray in case you use #(...) without buildAST on.
  • Added setASTNodeClass to TreeParser to be consistent with Parser.
  • The file/line error message prefix is now emacs-friendly.   Since it wasn't friendly to any environment, I felt ok to change it arbitrarily.   I have set it up so you can modify ANTLR fairly easily to change prefix handling.   I have moved all the error handling into DefaultToolErrorHandler and defined a file/line formatter interface so you can change how error messages are printed out in one place.

    If you want to change the file/line error prefix, set Tool.fileLineFormatter to the object of your choice.  The default FileLineFormatter for emacs-style errors is defined as:
public static
FileLineFormatter fileLineFormatter =
  new FileLineFormatter() {
    public String getFormatString(
        String fileName, int line)
    {
        if ( fileName != null ) {
            return fileName+":"+line+": ";
        }
        else {
            return "line "+line+": ";
        }
    }
  };
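
If you would rather have a different prefix, a replacement formatter might look like the sketch below.  To keep it self-contained, FileLineFormatter here is a local stand-in that only mirrors the getFormatString(String, int) method shown above; the "file(line): " style and the name PAREN_STYLE are made up for illustration.

```java
// Sketch only: this FileLineFormatter is a local stand-in that mirrors the
// getFormatString(String, int) method shown above; it is not the antlr type.
class FormatterSketch {
    interface FileLineFormatter {
        String getFormatString(String fileName, int line);
    }

    // Hypothetical alternative prefix in a "file(line): " style.
    static final FileLineFormatter PAREN_STYLE =
        new FileLineFormatter() {
            public String getFormatString(String fileName, int line) {
                if ( fileName != null ) {
                    return fileName+"("+line+"): ";
                }
                else {
                    return "line("+line+"): ";
                }
            }
        };

    public static void main(String[] args) {
        // With the real classes, the formatter would be installed by
        // assigning it to Tool.fileLineFormatter, as described above.
        System.out.println(
            PAREN_STYLE.getFormatString("calc.g", 12) + "unexpected token");
    }
}
```

The only part that matters to ANTLR is the returned prefix string, so any object implementing the method can be swapped in.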

ANTLR Installation

ANTLR comes as a single zip or compressed tar file. Unzipping the file you receive will produce a directory called antlr-2.7.0 with subdirectories antlr, doc, examples, cpp, and examples.cpp. You need to place the antlr-2.7.0 directory in your CLASSPATH environment variable. For example, if you placed antlr-2.7.0 in directory /tools, you need to append

/tools/antlr-2.7.0

to your CLASSPATH, or

\tools\antlr-2.7.0

if you work on an NT or Win95 box.

References to antlr.* will map to /tools/antlr-2.7.0/antlr/*.class.

You must have at least JDK 1.1 installed properly on your machine.  The ASTFrame AST viewer uses Swing 1.1.

JAR FILE

Try using the runtime library file, antlr.jar. Place it in your CLASSPATH instead of the antlr-2.7.0 directory. The jar includes all parse-time files needed (if it is missing a file, email parrt@jguru.com).  You cannot run the antlr tool itself with the jar, but your parsers should run with just this jar file.   It's pretty small, around 50k uncompressed.

RUNNING ANTLR

ANTLR is a command line tool (although many development environments let you run ANTLR on grammar files from within the environment). The main method within antlr.Tool is the ANTLR entry point.

java antlr.Tool file.g

One command-line option is -diagnostic, which generates a text file for each output parser class that describes the lookahead sets. Note that there are a number of options that you can specify at the grammar class and rule level.

Options -trace, -traceParser, -traceTreeParser may be used to track the lexer, parser, and tree parser invocations.

Try the new -html option to generate HTML output of your grammar(s); this is only partially done.

If you have trouble running ANTLR, ensure that you have Java installed correctly and then ensure that you have the appropriate CLASSPATH set.

Version: $Id: //depot/code/org.antlr/release/antlr-2.7.0/doc/antlr270release.html#3 $