Arith - Evaluator for simple Java Expressions

We present a simple evaluator for a tiny subset of Java integer and string expressions. Its only purpose is to demonstrate how to implement an interpreter.

Syntax

The recognized JAVA expressions are given by the following term grammar:
    i ::= 0 | 1 | -1 | 2 | -2 ...
    s ::= "bla" | "hallo" | ...
 
    t ::= i                       Integer constant
        | s                       String constant
        | t1 + t2                 Addition
        | t1 - t2                 Subtraction
        | t.length()              String length
        | Integer.toString(t)     Decimal notation
    
Parentheses are used to group subexpressions in the usual fashion. Arith parses expressions, terminated by semicolons ";", and evaluates them step-by-step until a value is reached.

Sample session

The following session protocol shows how to use Arith interactively.
    andreas@pepper:~/Teaching/OO/arith/sml> arith
    ARITH Version 0.1
    Arith> -5 - 4 - 3 - 2 - 1; // a long difference
    ((((-5 - 4) - 3) - 2) - 1)
    --> (((-9 - 3) - 2) - 1)
    --> ((-12 - 2) - 1)
    --> (-14 - 1)
    --> -15
    Arith> "hallo".length();
    "hallo".length()
    --> 5
    Arith> Integer.toString(5 - 4).length();
    Integer.toString((5 - 4)).length()
    --> Integer.toString(1).length()
    --> "1".length()
    --> 1
    Arith> Integer.toString (5 - 3 - "hallo".length() + -1).length();
    Integer.toString((((5 - 3) - "hallo".length()) + -1)).length()
    --> Integer.toString(((2 - "hallo".length()) + -1)).length()
    --> Integer.toString(((2 - 5) + -1)).length()
    --> Integer.toString((-3 + -1)).length()
    --> Integer.toString(-4).length()
    --> "-4".length()
    --> 2
    Arith> 5 - 3 + "hallo".length(Integer.toString(5));
    stdin:1.17-1.44 Error: 
    Too many arguments to method length()
    Arith> ^D andreas@pepper:~/Teaching/OO/arith/sml> 
    
Alternatively, Arith can be invoked with some filenames as arguments. In this case, it processes the declarations in these files in order, aborting if some error occurs.
    andreas@pepper:~/Teaching/OO/arith/sml> arith test.arith
    ARITH Version 0.1
    [Opening file test.arith]
    ((((-5 - 4) - 3) - 2) - 1)
    --> (((-9 - 3) - 2) - 1)
    --> ((-12 - 2) - 1)
    --> (-14 - 1)
    --> -15
    (5 + 3)
    --> 8
    "hallo".length()
    --> 5
    Integer.toString((5 - 4)).length()
    --> Integer.toString(1).length()
    --> "1".length()
    --> 1
    -3
    [Closing file test.arith]
    andreas@pepper:~/Teaching/OO/arith/sml> 
    

Implementation

The bulk of code is devoted to lexing and parsing. From the input a stream of declarations is generated. Each declaration consists of a term tree denoting the abstract representation of the parsed Java expression and a region tree to denote the position of each subterm in the input. These positions are used to generate informative error messages.
Most of the remaining code deals with evaluating and printing parsed expressions.
Arith is implemented in SML/NJ Version 110 with Compilation Manager (CM). It can be build by
    andreas@pepper:~/Teaching/OO/arith/sml> sml 
    The file arith.sml only invokes the compilation of all
      files of the project arith.cm via the
    compilation manager.  The binary code then is written to
    .heap/arith.yourArchitecture.  It is
    executed by running arith.

    

Lexer

The lexer lexer.sml transforms the input (character stream) into a stream of tokens. Each token comes with the region it occupies in the input.
Recognized tokens are non-negative decimal integers, strings enclosed in double quotes, identifiers, end of file and
    ( ) . ; , + - length Integer toString
    
The tokens with a print function are defined in token.sml.

Parser

We have implemented a simple top-down parser. The orginal grammar for accepted JAVA expressions (see above) is not suitable for top-down parsing. It has to be refined to resolve ambiguities, and parentheses have to be treated explicitly.
Refined Grammar:
    Int     = NUMBER 
            | -NUMBER

    Head    = STRING 
            | Int 
            | ID 
            | ( Exp )

    Args    =  
            | Exp, Args

    Method  = ID ( Args )

    Summand = Head 
            | Summand.Method   

    Exp     = Summand 
            | Exp + Summand  
            | Exp - Summand
    

Summary

Here is a summary of the Arith source files.

exp.smlInternal represenation of Arith expressions
eval.smlEvaluation of Arith expressions
stream.smlStreams (for lexing, parsing...)
region.smlInput regions (=intervals)
paths.smlRegion trees and pathes in these trees
token.smlTokens (for lexer)
parsing.smlError messages for lexer and parser
lexer.smlThe lexer
extsyn.sml"External syntax", combining term and region trees
decl.smlInvokes evaluation of parsed declarations
parser.smlTop-down parser producing external syntax
top.smlTop-level: command-line parsing etc.
arith.cmCM project file
arith.smlInvokes compilation
arithShell script to run Arith

Download the source: arith.tgz.
Andreas Abel
Last modified: Mon Jul 15 16:52:52 CEST 2002