Why use integers for tokens?

12 years ago

devoured elysium

Is there any good reason for using numbers for identifying tokens, nowadays? I am following Crafting a Compiler.

The code the author presents is here:

public class Token {
    public final static int ID = 0, FLTDCL = 1, INTDCL = 2, PRINT = 3,
            ASSIGN = 4, PLUS = 5, MINUS = 6, EOF = 7, INUM = 8, FNUM = 9;

    public final static String[] token2str = new String[] { "id", "fltdcl",
            "intdcl", "print", "assign", "plus", "minus", "$", "inum", "fnum" };

    public final int type;
    public final String val;

    public Token(int type) {
        this(type, "");
    }

    public Token(int type, String val) {
        this.type = type;
        this.val = val;
    }

    public String toString() {
        return "Token type\t" + token2str[type] + "\tval\t" + val;
    }
}

Instead of using the ugly arrays, wouldn’t it be smarter to modify the constructors to accept strings for the type variable instead of integers? Then we could get rid of

    public final static int ID = 0, FLTDCL = 1, INTDCL = 2, PRINT = 3,
            ASSIGN = 4, PLUS = 5, MINUS = 6, EOF = 7, INUM = 8, FNUM = 9;

Or are the integer constants needed later on, so that using strings instead would be worse?

larsmans

There are several benefits:
- It’s faster, since comparing two integers takes (in your average compiled language) only a few instructions, while comparing two strings takes O(n) time in the worst case, where n is the length of the shorter string. Compilers need this extra bit of speed.
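- In C and C++, you can switch on an int but not on a string. Java has allowed switching on strings only since Java 7, and even then the compiler lowers it to a hash-code lookup followed by equals() calls.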
- Mistyping a token name will be a compile-time error instead of a hard-to-debug runtime error.
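
To make the last two points concrete, here is a minimal sketch that reuses the integer constants from the question. The class name TokenSwitchDemo and the helper describe() are made up for illustration and are not from Crafting a Compiler.

```java
public class TokenSwitchDemo {
    // Same integer constants as the Token class in the question.
    public static final int ID = 0, FLTDCL = 1, INTDCL = 2, PRINT = 3,
            ASSIGN = 4, PLUS = 5, MINUS = 6, EOF = 7, INUM = 8, FNUM = 9;

    // Hypothetical helper: classify a token type with a switch on int.
    // Each case label is a compile-time constant, so the compiler can
    // emit a jump table instead of a chain of comparisons.
    public static String describe(int type) {
        switch (type) {
            case FLTDCL:
            case INTDCL:
                return "declaration";
            case PLUS:
            case MINUS:
                return "operator";
            case INUM:
            case FNUM:
                return "number";
            case EOF:
                return "end of input";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(PLUS)); // prints "operator"
        System.out.println(describe(ID));   // prints "other"

        // A misspelled constant is rejected by the compiler:
        // describe(PLSU);   // error: cannot find symbol
        //
        // A misspelled string such as "pluss" would compile fine and
        // only misbehave at run time by falling through to the default.
    }
}
```

If you want to get rid of the parallel token2str array while keeping these properties, a Java enum is probably the usual way to do that today, but that goes beyond what the book's code shows.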