Introduction

Vmgen is a tool for writing efficient interpreters. It takes a simple virtual machine description and generates efficient C code for dealing with the virtual machine code in various ways (in particular, executing it). The run-time efficiency of the resulting interpreters is usually within a factor of 10 of machine code produced by an optimizing compiler.

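The interpreter design strategy supported by Vmgen is to divide the interpreter into two parts: a front end, which translates the source program into virtual machine code, and a virtual machine interpreter, which executes that code.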

Interpreters usually use such a division, for modularity as well as for efficiency. The virtual machine code is typically passed between the front end and the virtual machine interpreter in memory, as in a load-and-go compiler; this avoids the complexity and time cost of writing the code to a file and reading it back.
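As a rough illustration of this division, the following C sketch pairs a toy front end, which translates a postfix expression into VM code in a memory buffer, with a VM interpreter that executes that buffer right away. The instruction set (OP_LIT, OP_ADD, OP_MUL, OP_END) and the functions compile_postfix, run and eval_postfix are invented for this example; this is hand-written code, not output of Vmgen.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical instruction set, invented for this sketch. */
enum { OP_LIT, OP_ADD, OP_MUL, OP_END };

#define MAX_CODE 64

/* Front end: translate a postfix expression such as "3 4 + 5 *" into
   VM code in a caller-supplied buffer.  Returns the number of cells
   written, or 0 on a syntax error or if the buffer is too small. */
size_t compile_postfix(const char *src, long code[], size_t max)
{
    size_t n = 0;
    while (*src != '\0') {
        if (isspace((unsigned char)*src)) { src++; continue; }
        if (n + 3 > max) return 0;      /* keep room for LIT, operand, END */
        if (isdigit((unsigned char)*src)) {
            long v = 0;
            while (isdigit((unsigned char)*src))
                v = v * 10 + (*src++ - '0');
            code[n++] = OP_LIT;         /* opcode followed by inline operand */
            code[n++] = v;
        } else if (*src == '+') { code[n++] = OP_ADD; src++; }
        else if (*src == '*')   { code[n++] = OP_MUL; src++; }
        else return 0;
    }
    if (n + 1 > max) return 0;
    code[n++] = OP_END;
    return n;
}

/* VM interpreter: execute the VM code directly from memory. */
long run(const long *ip)
{
    long stack[MAX_CODE];
    long *sp = stack;                   /* points to the next free slot */
    for (;;) {
        switch (*ip++) {
        case OP_LIT:
            assert(sp < stack + MAX_CODE);
            *sp++ = *ip++;
            break;
        case OP_ADD:
            assert(sp - stack >= 2);
            sp--; sp[-1] += sp[0];
            break;
        case OP_MUL:
            assert(sp - stack >= 2);
            sp--; sp[-1] *= sp[0];
            break;
        case OP_END:
            assert(sp - stack >= 1);
            return sp[-1];
        default:
            assert(0 && "unknown opcode");
            return 0;
        }
    }
}

/* Load-and-go: compile into memory, then run immediately. */
long eval_postfix(const char *src)
{
    long code[MAX_CODE];
    size_t n = compile_postfix(src, code, MAX_CODE);
    assert(n > 0);
    return run(code);
}
```

Under these assumptions, eval_postfix("3 4 + 5 *") evaluates to 35, and the VM code never leaves memory between the two parts.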

A virtual machine (VM) represents the program as a sequence of VM instructions stored one after another in memory, similar to real machine code. Control flow occurs through VM branch instructions, as in a real machine.
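The following C sketch shows one possible shape of such VM code: instructions and their inline operands sit in consecutive array cells, and a loop is expressed with a conditional and an unconditional VM branch. The opcodes, the two local slots and the use of absolute array indices as branch targets are all assumptions made for this example, not Vmgen conventions.

```c
#include <assert.h>

/* Hypothetical instruction set, invented for this sketch. */
enum { OP_LIT, OP_LOAD, OP_STORE, OP_ADD, OP_BZ, OP_BRANCH, OP_END };

/* VM code for:  while (n != 0) { acc += n; n -= 1; }  return acc;
   n lives in local slot 0, acc in local slot 1.  Branch operands are
   absolute indices into this array (shown in the leading comments). */
static const long sum_program[] = {
    /*  0 */ OP_LOAD, 0,     /* push n                      */
    /*  2 */ OP_BZ, 20,      /* if n == 0, leave the loop   */
    /*  4 */ OP_LOAD, 1,     /* push acc                    */
    /*  6 */ OP_LOAD, 0,     /* push n                      */
    /*  8 */ OP_ADD,         /* acc + n                     */
    /*  9 */ OP_STORE, 1,    /* acc = acc + n               */
    /* 11 */ OP_LOAD, 0,     /* push n                      */
    /* 13 */ OP_LIT, -1,     /* push -1                     */
    /* 15 */ OP_ADD,         /* n - 1                       */
    /* 16 */ OP_STORE, 0,    /* n = n - 1                   */
    /* 18 */ OP_BRANCH, 0,   /* jump back to the loop test  */
    /* 20 */ OP_LOAD, 1,     /* push acc                    */
    /* 22 */ OP_END          /* return the top of the stack */
};

/* Execute VM code with local slot 0 preset to n and slot 1 to 0. */
long run(const long *code, long n)
{
    long locals[2] = { n, 0 };
    long stack[8];
    long *sp = stack;                   /* points to the next free slot */
    const long *ip = code;              /* VM instruction pointer */
    for (;;) {
        switch (*ip++) {
        case OP_LIT:    *sp++ = *ip++;          break;
        case OP_LOAD:   *sp++ = locals[*ip++];  break;
        case OP_STORE:  locals[*ip++] = *--sp;  break;
        case OP_ADD:    sp--; sp[-1] += sp[0];  break;
        case OP_BZ: {                   /* branch if popped value is zero */
            long target = *ip++;
            if (*--sp == 0)
                ip = code + target;
            break;
        }
        case OP_BRANCH: ip = code + *ip;        break;
        case OP_END:    return *--sp;
        default:        assert(0 && "unknown opcode"); return 0;
        }
    }
}

long sum_to(long n)
{
    assert(n >= 0);                     /* a negative n would not terminate */
    return run(sum_program, n);
}
```

With this program, sum_to(4) returns 10. The only thing that distinguishes a branch from any other instruction is that it assigns to the VM instruction pointer.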

In this setup, Vmgen can generate most of the code dealing with virtual machine instructions from a simple description of the virtual machine instructions (see Input File Format), in particular:

VM instruction execution

VM code generation
     Useful in the front end.

VM code decompiler
     Useful for debugging the front end.

VM code tracing
     Useful for debugging the front end and the VM interpreter. You will typically provide other means for debugging the user's programs at the source level.

VM code profiling
     Useful for optimizing the VM interpreter with superinstructions (see VM profiler).
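To give a feel for why one instruction description can serve several of these purposes, here is a hand-written C sketch in which a single table of instruction names and operand counts drives both a decompiler and a simple pair counter that hints at superinstruction candidates. The table vm_desc and the functions disassemble and count_pairs are hypothetical; Vmgen's actual input format and generated code look different (see Input File Format). The counter only looks at the static code, whereas a real profiler would presumably weight sequences by how often they execute.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical instruction set, invented for this sketch. */
enum { OP_LIT, OP_ADD, OP_MUL, OP_END, OP_COUNT };

/* One table describes every instruction: its name and the number of
   inline operand cells that follow the opcode. */
static const struct { const char *name; int operands; } vm_desc[OP_COUNT] = {
    [OP_LIT] = { "lit", 1 },
    [OP_ADD] = { "add", 0 },
    [OP_MUL] = { "mul", 0 },
    [OP_END] = { "end", 0 },
};

/* Append text to out at offset used; the buffer must be large enough. */
static size_t emit(char *out, size_t size, size_t used, const char *text)
{
    size_t len = strlen(text);
    assert(used + len < size);
    memcpy(out + used, text, len + 1);
    return used + len;
}

/* Decompiler: render n cells of VM code as text, one instruction per
   line.  Returns the length of the resulting string. */
size_t disassemble(const long *code, size_t n, char *out, size_t size)
{
    size_t used = 0, i = 0;
    assert(size > 0);
    out[0] = '\0';
    while (i < n) {
        long op = code[i++];
        assert(op >= 0 && op < OP_COUNT);
        used = emit(out, size, used, vm_desc[op].name);
        for (int k = 0; k < vm_desc[op].operands; k++) {
            char num[32];
            assert(i < n);
            snprintf(num, sizeof num, " %ld", code[i++]);
            used = emit(out, size, used, num);
        }
        used = emit(out, size, used, "\n");
    }
    return used;
}

/* Profiling sketch: count how often each instruction is directly
   followed by each other instruction in the code.  Frequent pairs are
   candidates for combining into one superinstruction. */
void count_pairs(const long *code, size_t n,
                 unsigned counts[OP_COUNT][OP_COUNT])
{
    long prev = -1;
    size_t i = 0;
    while (i < n) {
        long op = code[i];
        assert(op >= 0 && op < OP_COUNT);
        if (prev >= 0)
            counts[prev][op]++;
        prev = op;
        i += 1 + (size_t)vm_desc[op].operands;  /* skip inline operands */
    }
}
```

Both functions walk the code the same way, using the operand count from the table to find the next opcode; adding an instruction means adding one table entry rather than editing each tool separately.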

To create parts of the interpretive system that do not deal with VM instructions, you have to use other tools (e.g., bison) and/or hand-code them.

Vmgen supports efficient interpreters through various optimizations, among them superinstructions (see VM profiler).

As a result, Vmgen-based interpreters are only about an order of magnitude slower than native code from an optimizing C compiler on small benchmarks. On large benchmarks, which spend more time in the run-time system, the slowdown is often smaller. For example, the slowdown of a Vmgen-generated JVM interpreter relative to the best JVM JIT compiler we measured is only a factor of 2-3 for large benchmarks; some other JITs and all other interpreters we looked at were slower than our interpreter.

VMs are usually designed as stack machines (passing data between VM instructions on a stack), and Vmgen supports such designs especially well; however, you can also use Vmgen for implementing a register VM (see Register Machines) and still benefit from most of the advantages offered by Vmgen.
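The difference between the two designs can be sketched in C as follows. Both toy interpreters compute (a + b) * c: the stack VM passes intermediate values implicitly on a stack, while the register VM names a destination and two source registers in every instruction. The opcodes and the functions run_stack, run_reg, stack_example and reg_example are made up for this comparison and do not reflect how Vmgen itself encodes instructions.

```c
#include <assert.h>

/* Hypothetical stack VM: operands are passed implicitly on a stack. */
enum { S_LIT, S_ADD, S_MUL, S_END };

long run_stack(const long *ip)
{
    long stack[8];
    long *sp = stack;                   /* points to the next free slot */
    for (;;) {
        switch (*ip++) {
        case S_LIT: *sp++ = *ip++;         break;
        case S_ADD: sp--; sp[-1] += sp[0]; break;
        case S_MUL: sp--; sp[-1] *= sp[0]; break;
        case S_END: return sp[-1];
        default:    assert(0 && "unknown opcode"); return 0;
        }
    }
}

/* Hypothetical register VM: every arithmetic instruction carries three
   inline operands (destination, source 1, source 2); R_END carries the
   number of the register that holds the result. */
enum { R_ADD, R_MUL, R_END };

long run_reg(const long *ip, long regs[])
{
    for (;;) {
        switch (*ip++) {
        case R_ADD: regs[ip[0]] = regs[ip[1]] + regs[ip[2]]; ip += 3; break;
        case R_MUL: regs[ip[0]] = regs[ip[1]] * regs[ip[2]]; ip += 3; break;
        case R_END: return regs[ip[0]];
        default:    assert(0 && "unknown opcode"); return 0;
        }
    }
}

/* (a + b) * c on the stack VM: five instructions before S_END. */
long stack_example(long a, long b, long c)
{
    const long code[] = { S_LIT, a, S_LIT, b, S_ADD, S_LIT, c, S_MUL, S_END };
    return run_stack(code);
}

/* (a + b) * c on the register VM: two instructions before R_END,
   with a, b, c preloaded into registers 0 to 2. */
long reg_example(long a, long b, long c)
{
    static const long code[] = { R_ADD, 3, 0, 1, R_MUL, 3, 3, 2, R_END, 3 };
    long regs[4] = { a, b, c, 0 };
    return run_reg(code, regs);
}
```

In this sketch the register version executes fewer VM instructions, but each one decodes more operands; the stack version needs no operand fields for its arithmetic instructions.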

The instruction descriptions have many potential uses that are not implemented at the moment, so the feature list above is not exhaustive. We are open to feature requests and will consider new features if someone asks for them.