Why use integers for tokens?
devoured elysium
Is there any good reason for using numbers to identify tokens nowadays? I am following Crafting a Compiler. The code the author presents is here:
    public class Token {
        public final static int ID = 0, FLTDCL = 1, INTDCL = 2, PRINT = 3,
                ASSIGN = 4, PLUS = 5, MINUS = 6, EOF = 7, INUM = 8, FNUM = 9;

        public final static String[] token2str = new String[] {
            "id", "fltdcl", "intdcl", "print", "assign",
            "plus", "minus", "$", "inum", "fnum"
        };

        public final int type;
        public final String val;

        public Token(int type) {
            this(type, "");
        }

        public Token(int type, String val) {
            this.type = type;
            this.val = val;
        }

        public String toString() {
            return "Token type\t" + token2str[type] + "\tval\t" + val;
        }
    }
Instead of using the ugly arrays, wouldn’t it be smarter to modify the constructors to accept strings for the type variable instead of integers? Then we could get rid of

    public final static int ID = 0, FLTDCL = 1, INTDCL = 2, PRINT = 3, ASSIGN = 4, PLUS = 5, MINUS = 6, EOF = 7, INUM = 8, FNUM = 9;

Or is it needed later, so that using a string instead would be worse?
larsmans
There are several benefits:
- It’s faster, since comparing two integers takes (in your average compiled language) only a few instructions, while comparing strings takes O(n) time in the worst case, where n is the length of the shorter string. Compilers need this extra bit of speed.
- In C, C++ and Java (before Java 7), you can switch on an int but not on a string.
- Mistyping a token name will be a compile-time error instead of a hard-to-debug runtime error.
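To make the last two points concrete, here is a minimal sketch, assuming a trimmed-down copy of the Token class from the question. TokenKinds and describe() are hypothetical names made up for illustration; they are not from the book.

```java
// Hypothetical helper, not from the book: classifies tokens by integer type.
final class TokenKinds {
    static String describe(Token t) {
        // Switching on an int works in every Java version.
        switch (t.type) {
            case Token.PLUS:
            case Token.MINUS:
                return "operator";
            case Token.INUM:
            case Token.FNUM:
                return "number";
            case Token.ID:
                return "identifier";
            // Writing `case Token.PLSU:` here would be rejected by the compiler,
            // whereas a misspelled string such as "plsu" would compile and
            // simply never match at runtime.
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new Token(Token.PLUS)));       // operator
        System.out.println(describe(new Token(Token.INUM, "42"))); // number
        System.out.println(describe(new Token(Token.EOF)));        // other
    }
}

// Trimmed copy of the question's Token class (token2str and toString omitted).
final class Token {
    static final int ID = 0, FLTDCL = 1, INTDCL = 2, PRINT = 3, ASSIGN = 4,
                     PLUS = 5, MINUS = 6, EOF = 7, INUM = 8, FNUM = 9;

    final int type;
    final String val;

    Token(int type) {
        this(type, "");
    }

    Token(int type, String val) {
        this.type = type;
        this.val = val;
    }
}
```

The case labels compile because the constants are static final ints initialized with literals, which Java treats as compile-time constants.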