Error Handling and Recovery

All syntactic and semantic errors cause parser exceptions to be thrown. In particular, the methods used to match tokens in the parser base class (match et al.) throw MismatchedTokenException. If the lookahead predicts no alternative of a production in either the parser or lexer, then a NoViableAltException is thrown. The methods in the lexer base class used to match characters (match et al.) throw ScannerException.

When the parser (or lexer) is implemented in an object-oriented language, the parser exceptions correspond to the exceptions of the language. Otherwise, parser exceptions must be simulated by the code generator in a more laborious fashion.

ANTLR will generate default error-handling code, or you may specify your own exception handlers. In either case the result (where supported by the language) is a try/catch block. Such try{} blocks surround the generated code for the grammar element of interest (rule, alternative, token reference, or rule reference). If no exception handlers (default or otherwise) are specified, then the exception will propagate all the way out of the parser to the calling program.

ANTLR's default exception handling is a good way to get something working, but you will have more control over error reporting and resynchronization if you write your own exception handlers.

Note that the '@' exception specification of PCCTS 1.33 does not apply to ANTLR 2.0.

Modifying Default Error Messages With Paraphrases

The name or definition of a token in your lexer is rarely meaningful to the user of your recognizer or translator. For example, instead of seeing

Error: line(1), expecting ID, found ';'

you can have the parser generate:

Error: line(1), expecting an identifier, found ';'

ANTLR provides an easy way to specify a string to use in place of the token name. In the definition for ID, use the paraphrase option:

ID
options {
  paraphrase = "an identifier";
}
  : ('a'..'z'|'A'..'Z'|'_')
    ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
  ;

Note that this paraphrase goes into the token types text file (ANTLR's persistence file). In other words, a grammar that uses this vocabulary will also use the paraphrase.
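
The effect of a paraphrase on an error message can be pictured with a small Java sketch. This is not ANTLR's generated code: the class ParaphraseDemo, its map, and both methods are hypothetical names invented for illustration. It assumes only that the paraphrase, when present, is substituted for the token name in the message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not code produced by ANTLR.
class ParaphraseDemo {
    // Token name -> paraphrase, standing in for what the token types file records.
    static final Map<String, String> PARAPHRASES = new HashMap<>();
    static {
        PARAPHRASES.put("ID", "an identifier");
    }

    // Use the paraphrase when one exists; otherwise fall back to the token name.
    static String describe(String tokenName) {
        return PARAPHRASES.getOrDefault(tokenName, tokenName);
    }

    // Build a message in the same shape as the examples in this section.
    static String mismatchMessage(int line, String expectedToken, String foundText) {
        return "Error: line(" + line + "), expecting " + describe(expectedToken)
                + ", found '" + foundText + "'";
    }

    public static void main(String[] args) {
        System.out.println(mismatchMessage(1, "ID", ";"));   // ID has a paraphrase
        System.out.println(mismatchMessage(1, "SEMI", "x")); // SEMI does not
    }
}
```

A token without a paraphrase (SEMI in this sketch) still appears under its grammar name, which is why paraphrasing the tokens a user is most likely to see in messages pays off first.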

Parser Exception Handling

ANTLR 2.0 generates recursive-descent recognizers. Since recursive-descent recognizers operate by recursively calling the rule-matching methods, this results in a call stack that is populated by the contexts of the recursive-descent methods. Parser exception handling for grammar rules is a lot like exception handling in a language like C++ or Java. Namely, when an exception is thrown, the normal thread of execution is stopped, and functions on the call stack are exited sequentially until one is encountered that wants to catch the exception. When an exception is caught, execution resumes at that point.
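
The unwinding described above can be sketched in plain Java. The rule methods stat, expr, and atom and the SyntaxError class are hypothetical stand-ins written by hand, not ANTLR output. Only stat installs a handler, so the exception raised in atom skips the rest of expr before it is caught.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written rule methods; names are assumptions for illustration.
class PropagationDemo {
    static class SyntaxError extends RuntimeException {
        SyntaxError(String msg) { super(msg); }
    }

    final List<String> trace = new ArrayList<>();

    // stat has a handler, so unwinding stops here and execution resumes.
    void stat() {
        trace.add("enter stat");
        try {
            expr();
            trace.add("exit stat normally");
        } catch (SyntaxError ex) {
            trace.add("stat caught: " + ex.getMessage());
        }
    }

    // expr has no handler; it is exited as soon as atom throws.
    void expr() {
        trace.add("enter expr");
        atom();
        trace.add("exit expr normally"); // never reached in this sketch
    }

    // atom simulates a syntax error detected deep in the call stack.
    void atom() {
        trace.add("enter atom");
        throw new SyntaxError("unexpected token");
    }

    static List<String> run() {
        PropagationDemo demo = new PropagationDemo();
        demo.stat();
        return demo.trace;
    }

    public static void main(String[] args) {
        for (String step : run()) {
            System.out.println(step);
        }
    }
}
```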

In ANTLR 2.0, parser exceptions are thrown when (a) there is a syntax error, (b) there is a failed validating semantic predicate, or (c) you throw a parser exception from an action.

In all cases, the recursive-descent functions on the call stack are exited until an exception handler is encountered for that exception type or one of its base classes (in non-object-oriented languages, the hierarchy of exception types is not implemented by a class hierarchy). Exception handlers arise in one of two ways. First, if you do nothing, ANTLR will generate a default exception handler for every parser rule. The default exception handler will report an error, sync to the follow set of the rule, and return from that rule. Second, you may specify your own exception handlers in a variety of ways, as described later.
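
The three steps of the default handler (report, sync to the follow set, return) can be sketched as follows. This is a hand-written approximation, not the code ANTLR emits: the class DefaultHandlerDemo, the string-based tokens, the la and match helpers, and the toy grammar in the comments are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical hand-written parser; every name here is an illustrative assumption.
class DefaultHandlerDemo {
    static class SyntaxError extends RuntimeException {
        SyntaxError(String msg) { super(msg); }
    }

    final List<String> tokens;
    final List<String> errors = new ArrayList<>();
    int pos = 0;

    DefaultHandlerDemo(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    // One token of lookahead; "EOF" once the input is exhausted.
    String la() {
        return pos < tokens.size() ? tokens.get(pos) : "EOF";
    }

    void match(String expected) {
        if (!la().equals(expected)) {
            throw new SyntaxError("expecting " + expected + ", found '" + la() + "'");
        }
        pos++;
    }

    // stat : ID ASSIGN INT ;   follow set assumed to be { SEMI, EOF }
    void stat() {
        Set<String> follow = new HashSet<>(Arrays.asList("SEMI", "EOF"));
        try {
            match("ID");
            match("ASSIGN");
            match("INT");
        } catch (SyntaxError ex) {
            errors.add(ex.getMessage());   // 1. report the error
            while (!follow.contains(la())) {
                pos++;                     // 2. skip tokens until one is in the follow set
            }
            // 3. fall out of the handler and return from the rule
        }
    }

    // prog : ( stat SEMI )* ;   no handler here, so an error would reach the caller
    void prog() {
        while (!la().equals("EOF")) {
            stat();
            match("SEMI");
        }
    }

    public static void main(String[] args) {
        DefaultHandlerDemo p = new DefaultHandlerDemo(
                "ID", "ASSIGN", "ASSIGN", "INT", "SEMI",
                "ID", "ASSIGN", "INT", "SEMI");
        p.prog();
        System.out.println(p.errors);
        System.out.println("tokens consumed: " + p.pos);
    }
}
```

Because stat resynchronizes at SEMI, the second statement in the sample input is still parsed after the first one fails; one error is reported rather than a cascade.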

If you specify an exception handler for a rule, then the default exception handler is not generated for that rule. In addition, you may control the generation of default exception handlers with a per-grammar or per-rule option.

Specifying Parser Exception-Handlers

You may attach exception handlers to a rule, an alternative, or a labeled element. The general form for specifying an exception handler is:

exception [label]
catch [exceptionType exceptionVariable] { action }
catch ...
catch ...

where the label is only used for attaching exceptions to labeled elements. The exceptionType is the exception (or class of exceptions) to catch, and the exceptionVariable is the variable name of the caught exception, so that the action can process the exception if desired. Here is an example that catches an exception for the rule, for an alternative, and for a labeled element:

rule:   a:A B C
    |   D E
        exception // for alternate
        catch [ParserException ex] {
            reportError(ex.toString());
        }
    ;
    exception // for rule
    catch [ParserException ex] {
       reportError(ex.toString());
    }
    exception[a] // for a:A
    catch [ParserException ex] {
       reportError(ex.toString());
    }

Note that exceptions attached to alternatives and labeled elements do not cause the rule to exit. Matching and control flow continue as if the error had not occurred. Because of this, you must be careful not to use any variables that would have been set by a successful match when an exception is caught.
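
The hazard can be illustrated with a hand-written Java sketch of a rule of the form a:A B C whose handler is attached to the labeled element only. The class ElementHandlerDemo and its helpers are hypothetical and are not what ANTLR generates. The try block here encloses just the match of A, so after the handler runs the method carries on with B and C, and the label variable a is left unset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical hand-written rule method; names are assumptions for illustration.
class ElementHandlerDemo {
    static class SyntaxError extends RuntimeException {
        SyntaxError(String msg) { super(msg); }
    }

    final List<String> tokens;
    final List<String> errors = new ArrayList<>();
    int pos = 0;

    ElementHandlerDemo(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    String la() {
        return pos < tokens.size() ? tokens.get(pos) : "EOF";
    }

    // Consume and return the token if it matches; otherwise throw.
    String match(String expected) {
        if (!la().equals(expected)) {
            throw new SyntaxError("expecting " + expected + ", found '" + la() + "'");
        }
        return tokens.get(pos++);
    }

    // rule : a:A B C ;   with a handler attached to the labeled element a:A
    String rule() {
        String a = null;                 // set only by a successful match of A
        try {
            a = match("A");
        } catch (SyntaxError ex) {
            errors.add(ex.getMessage()); // handler does not exit the rule
        }
        match("B");                      // matching continues after the handler
        match("C");
        return a;                        // null if A failed to match
    }

    public static void main(String[] args) {
        ElementHandlerDemo ok = new ElementHandlerDemo("A", "B", "C");
        System.out.println("label after good input: " + ok.rule());

        ElementHandlerDemo bad = new ElementHandlerDemo("B", "C");
        System.out.println("label after bad input: " + bad.rule());
        System.out.println("errors: " + bad.errors);
    }
}
```

Any action that uses a after the handler has run needs to check that it was actually set.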