Thursday, February 14, 2008

Fast and safe jump tables

Factor's case combinator provides efficient dispatch based on object equality:
{
{ "hello" [ ... ] }
{ 3 [ ... ] }
{ { 1 7 } [ ... ] }
...
} case

Now case is a macro, which means the compiler converts it to efficient code. If there are fewer than five objects, it expands into a series of conditionals, if there are more it becomes an open-coded hash lookup. I added a new heuristic today: if all the keys are integers and they form a contiguous sequence, a jump table is emitted. It handles the case where the keys are out of order, and where they do not begin at zero, however they do have to be contiguous otherwise it falls back to the slightly slower open-coded hash dispatch.

This allows a handful of modules such as the 8080 emulator to avoid using the unsafe dispatch combinator, and use case which is type safe (and it expands into dispatch in the case outlined above).

Here is an example from icfp2006, which implements a simple VM:
: run-op ( -- bool )
advance
{
{ 0 [ op0 ] }
{ 1 [ op1 ] }
{ 2 [ op2 ] }
{ 3 [ op3 ] }
{ 4 [ op4 ] }
{ 5 [ op5 ] }
{ 6 [ op6 ] }
{ 7 [ drop t ] }
{ 8 [ op8 ] }
{ 9 [ op9 ] }
{ 10 [ op10 ] }
{ 11 [ op11 ] }
{ 12 [ op12 ] }
{ 13 [ op13 ] }
} case ;

Here is what it expands to:
: run-op ( -- bool )
advance
0 13 3dup pick >= [ >= ] [ 2drop f ] if [
drop - >fixnum {
[ op0 ]
[ op1 ]
[ op2 ]
[ op3 ]
[ op4 ]
[ op5 ]
[ op6 ]
[ drop t ]
[ op8 ]
[ op9 ]
[ op10 ]
[ op11 ]
[ op12 ]
[ op13 ]
} dispatch
] [ 2drop no-case ] if ;

Of course, if you wanted to get rid of the extra verbosity with the key numbers, you could write a "safe-dispatch" macro, which adds them for you. But I would rather incorporate this functionality into case.

This is another example where macros (compile-time transforms) permit you to define a high-level abstraction which is converted into efficient code.

1 comment:

Anonymous said...

I am not that familiar with factor but...

This kind of case statement seems to lend itself nicely to enumerations. Say instead of numbers you had some identifiers which got bound to numbers. Then these identifiers could be used instead of numbers.