Vmgen (Gforth 0.6.1)


VM engine

The VM engine is the VM interpreter that executes the VM code. It is essential for an interpretive system.

Vmgen supports two methods of VM instruction dispatch: threaded code (fast, but gcc-specific), and switch dispatch (slow, but portable across C compilers); you can use conditional compilation (defined(__GNUC__)) to choose between these methods, and our example does so.

For both methods, the VM engine is contained in a C-level function. Vmgen generates most of the contents of the function for you (name-vm.i), but you have to define this function, and macros and variables used in the engine, and initialize the variables. In our example the engine function also includes name-labels.i (see VM instruction table).
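
A minimal sketch of such an engine function, choosing the dispatch method with defined(__GNUC__). The instruction bodies are written by hand here and only stand in for the generated name-vm.i; the opcode names, the labels table and the function signature are assumptions of this sketch, not the code of our example. For brevity the threaded variant looks up the label in a table on each dispatch, whereas real threaded code stores the label addresses in the VM code itself.

```c
#include <stddef.h>

typedef long Cell;

/* Hypothetical opcode numbering for this sketch. */
enum { OP_lit, OP_add, OP_halt };

#if defined(__GNUC__)
/* Threaded code, using GNU C labels-as-values. */
#define LABEL(name) I_##name
#define NEXT_P2     goto *labels[*ip++]
#else
/* Switch dispatch, portable across C compilers. */
#define LABEL(name) case OP_##name
#define NEXT_P2     goto next_inst
#endif

/* Runs code until it reaches halt and returns the top of stack. */
Cell engine(const Cell *code)
{
    Cell stack[16];
    Cell *sp = stack + 16;      /* the stack grows downwards */
    const Cell *ip = code;

#if defined(__GNUC__)
    static void *labels[] = { &&I_lit, &&I_add, &&I_halt };
    NEXT_P2;                    /* dispatch the first instruction */
#else
next_inst:
    switch (*ip++) {
#endif

    /* The ':' after LABEL() is written by hand in this sketch. */
    LABEL(lit):                 /* lit ( #i -- i ) */
        *--sp = *ip++;
        NEXT_P2;
    LABEL(add):                 /* add ( i1 i2 -- i ) */
        sp[1] += sp[0];
        sp++;
        NEXT_P2;
    LABEL(halt):                /* halt ( i -- ) */
        return sp[0];

#if !defined(__GNUC__)
    }
    return 0;                   /* not reached for well-formed code */
#endif
}
```

Running the code { OP_lit, 2, OP_lit, 3, OP_add, OP_halt } through engine() should yield 5 with either dispatch method.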

In addition to executing the code, the VM engine can optionally print a trace of the executed instructions, their arguments and results. For superinstructions it prints the trace as if only the component instructions were executed; this lets you introduce new superinstructions while keeping the traces comparable to old ones (important for regression tests).

It costs significant performance to check in each instruction whether to print tracing code, so we recommend producing two copies of the engine: one for fast execution, and one for tracing. See the rules for engine.o and engine-debug.o in vmgen-ex/Makefile for an example.

The following macros and variables are used in name-vm.i:

LABEL(inst_name)

This is used just before each VM instruction to provide a jump or switch label (the : is provided by Vmgen). For switch dispatch this should expand to case label:; for threaded-code dispatch this should just expand to label:. In either case label is usually the inst_name with some prefix or suffix to avoid naming conflicts.

LABEL2(inst_name)

This will be used for dynamic superinstructions; at the moment, this should expand to nothing.

NAME(inst_name_string)

Called on entering a VM instruction with a string containing the name of the VM instruction as parameter. In normal execution this should expand to nothing, but for tracing this usually prints the name, and possibly other information (several VM registers in our example).
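
A minimal sketch of a NAME definition for both cases. VM_DEBUG, vm_debug and vm_out are the names described later in this section; the output format and the function run_two_instructions are assumptions made for illustration. The macro is written as a complete if statement so that it works whether or not the calling code puts a semicolon after it.

```c
#include <stdio.h>

#define VM_DEBUG 1     /* for this sketch only; normally set by the build */

int   vm_debug = 0;    /* run-time switch for trace output */
FILE *vm_out;          /* trace destination; must be set before tracing */

#ifdef VM_DEBUG
/* Tracing engine: print the instruction name if tracing is switched on. */
#define NAME(string) if (vm_debug) { fprintf(vm_out, "%s ", string); }
#else
/* Fast engine: no tracing code at all. */
#define NAME(string)
#endif

/* Hand-written stand-in for the entry points of two VM instructions. */
void run_two_instructions(void)
{
    NAME("lit")
    /* ... body of lit would go here ... */
    NAME("add")
    /* ... body of add would go here ... */
}
```

With vm_debug set to 0 the function prints nothing; with vm_debug set to 1 it writes the two instruction names to vm_out.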

DEF_CA

Usually empty. Called just inside a new scope at the start of a VM instruction. Can be used to define variables that should be visible during every VM instruction. If you define this macro as non-empty, you have to provide the finishing ; in the macro.

NEXT_P0 NEXT_P1 NEXT_P2

The three parts of instruction dispatch. They can be defined in different ways for best performance on various processors (see engine.c in the example or engine/threaded.h in Gforth). NEXT_P0 is invoked right at the start of the VM instruction (but after DEF_CA), NEXT_P1 right after the user-supplied C code, and NEXT_P2 at the end. The actual jump has to be performed by NEXT_P2 (if you did it earlier, important parts of the VM instruction would not be executed).

The simplest variant is for NEXT_P2 to do everything and the other macros to do nothing. In that case the related macros IP, SET_IP, INC_IP and IPTOS are also very straightforward to define. For switch dispatch this code consists just of a jump to the dispatch code (goto next_inst; in our example); for direct threaded code it consists of something like ({cfa=*ip++; goto *cfa;}).

Pulling code (usually the cfa=*ip++;) up into NEXT_P1 usually does not cause problems, but pulling things up into NEXT_P0 usually requires changing the other macros (and, at least for Gforth on Alpha, it does not buy much, because the compiler often manages to schedule the relevant stuff up by itself). An even more extreme variant is to pull code up even further, into, e.g., NEXT_P1 of the previous VM instruction (prefetching, useful on PowerPCs).
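
The two simpler variants can be sketched as follows for direct threaded code (GNU C only). The macro SIMPLE_NEXT, the accumulator-based instructions and the convention of passing NULL to obtain the label table are assumptions of this sketch; the instruction bodies are written by hand in the order in which the three macros are invoked.

```c
#include <stddef.h>

typedef void *Inst;     /* one VM code cell: the address of a label */

/* Hypothetical indices into the label table. */
enum { INC, DBL, HALT };

#ifdef SIMPLE_NEXT
/* Simplest variant: NEXT_P2 does everything. */
#define NEXT_P0
#define NEXT_P1
#define NEXT_P2 ({ cfa = *ip++; goto *cfa; })
#else
/* The fetch is pulled up into NEXT_P1; NEXT_P2 only jumps. */
#define NEXT_P0
#define NEXT_P1 (cfa = *ip++)
#define NEXT_P2 goto *cfa
#endif

/* With code == NULL, stores the label table in *labels_out and returns 0.
   Otherwise runs the code and returns the accumulator. */
long engine(Inst *code, Inst **labels_out)
{
    static Inst labels[] = { &&I_inc, &&I_dbl, &&I_halt };
    Inst *ip = code;
    Inst cfa;
    long acc = 0;

    if (code == NULL) {
        *labels_out = labels;
        return 0;
    }
    NEXT_P1; NEXT_P2;           /* dispatch the first instruction */

I_inc:
    { NEXT_P0; acc += 1; NEXT_P1; NEXT_P2; }
I_dbl:
    { NEXT_P0; acc *= 2; NEXT_P1; NEXT_P2; }
I_halt:
    return acc;
}
```

Both settings execute the same VM code with the same result; they differ only in where the fetch of the next instruction is placed relative to the instruction body.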

INC_IP(n)

This increments IP by n.

SET_IP(target)

This sets IP to target.
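
For the simplest dispatch variant, these macros might look as follows. This is a sketch under assumptions: the opcodes, the relative branch offset and the function count_down are invented for illustration, and switch dispatch is written out by hand to keep the example short.

```c
typedef long Cell;

/* Hypothetical opcodes for this sketch. */
enum { OP_dec, OP_jnz, OP_halt };

#define IP         (ip)
#define IPTOS      (*ip)
#define INC_IP(n)  (ip += (n))
#define SET_IP(p)  (ip = (p))

/* Runs code with a counter register n; returns how often dec was executed. */
long count_down(const Cell *code, long n)
{
    const Cell *ip = code;
    long steps = 0;

    for (;;) {
        switch (*ip++) {
        case OP_dec:            /* decrement the counter */
            n--;
            steps++;
            break;
        case OP_jnz: {          /* jnz ( #offset -- ): branch if n is not 0 */
            Cell offset = IPTOS;    /* immediate argument */
            INC_IP(1);              /* skip over the argument */
            if (n != 0)
                SET_IP(IP + offset);
            break;
        }
        case OP_halt:
            return steps;
        }
    }
}
```

For the code { OP_dec, OP_jnz, -3, OP_halt } and n = 3, the branch is taken twice and dec is executed three times.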

vm_A2B(a,b)

Type casting macro that assigns a (of type A) to b (of type B). This is mainly used for getting stack items into variables and back. So you need to define macros for every combination of stack basic type (Cell in our example) and type-prefix types used with that stack (in both directions). For the type-prefix type, you use the type-prefix (not the C type string) as type name (e.g., vm_Cell2i, not vm_Cell2Cell). In addition, you have to define a vm_X2X macro for the stack's basic type X (used in superinstructions).

The stack basic type for the predefined inst-stream is Cell. If you want a stack with the same item size, making its basic type Cell usually reduces the number of macros you have to define.

Here our examples differ a lot: vmgen-ex uses casts in these macros, whereas vmgen-ex2 uses union-field selection (or assignment to union fields). Note that casting floats into integers and vice versa changes the bit pattern (and you do not want that). In this case your options are to use a (temporary) union, or to take the address of the value, cast the pointer, and dereference that (not always possible, and sometimes expensive).
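
A minimal sketch of the union-field variant, together with a function showing why a cast is not suitable for floats. The type prefix f and the two round-trip functions are assumptions of this sketch and do not appear in our examples.

```c
/* Stack basic type as a union, so that every type keeps its bit pattern. */
typedef union {
    long  i;    /* type prefix i */
    float f;    /* hypothetical type prefix f */
} Cell;

/* vm_A2B(a,b) assigns a (of type A) to b (of type B). */
#define vm_Cell2i(_cell, _x)  ((_x) = (_cell).i)
#define vm_i2Cell(_x, _cell)  ((_cell).i = (_x))
#define vm_Cell2f(_cell, _x)  ((_x) = (_cell).f)
#define vm_f2Cell(_x, _cell)  ((_cell).f = (_x))
#define vm_Cell2Cell(_x, _y)  ((_y) = (_x))   /* vm_X2X for the basic type */

/* Stores a float in a stack item and reads it back through the union. */
float union_roundtrip_f(float x)
{
    Cell cell;
    float y;
    vm_f2Cell(x, cell);
    vm_Cell2f(cell, y);
    return y;
}

/* The same with casts: the value is converted to an integer and truncated. */
float cast_roundtrip_f(float x)
{
    long cell = (long)x;
    return (float)cell;
}
```

union_roundtrip_f(1.5f) returns 1.5f, whereas cast_roundtrip_f(1.5f) returns 1.0f, because the cast converts the value instead of preserving the bits.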

vm_twoA2B(a1,a2,b)

vm_B2twoA(b,a1,a2)

Type casting between two stack items (a1, a2) and a variable b of a type that takes two stack items. This does not occur in our small examples, but you can look at Gforth for examples (see vm_twoCell2d in engine/forth.h).
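
As an illustration only (this is not Gforth's definition), such a pair of macros might look as follows for a 32-bit Cell and a 64-bit double-cell type with the type prefix d. The type names, the choice of which stack item holds the high half, and the reliance on wrap-around when converting to signed types (implementation-defined in C, the usual behaviour of gcc) are assumptions of this sketch.

```c
#include <stdint.h>

typedef int32_t Cell;    /* stack basic type in this sketch */
typedef int64_t DCell;   /* hypothetical two-cell type, type prefix d */

/* Combine two stack items (high half first) into one double-cell value. */
#define vm_twoCell2d(_hi, _lo, _d) \
    ((_d) = (DCell)(((uint64_t)(uint32_t)(_hi) << 32) | (uint32_t)(_lo)))

/* Split a double-cell value into two stack items (high half first). */
#define vm_d2twoCell(_d, _hi, _lo) \
    ((_hi) = (Cell)(uint32_t)((uint64_t)(_d) >> 32), \
     (_lo) = (Cell)(uint32_t)(uint64_t)(_d))

/* Splits d into two cells and joins them again. */
DCell split_and_join(DCell d)
{
    Cell hi, lo;
    DCell result;
    vm_d2twoCell(d, hi, lo);
    vm_twoCell2d(hi, lo, result);
    return result;
}
```

Splitting and joining should return the original value, for negative values as well.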

stackpointer

For each stack used, the stackpointer name given in the stack declaration is used. For a regular stack this must be an lvalue; typically it is a variable declared as a pointer to the stack's basic type. For inst-stream, the name is IP, and it can be a plain rvalue; typically it is a macro that abstracts away the differences between the various implementations of NEXT_P*.

IMM_ARG(access,value)

Define this to expand to "(access)". This is just a placeholder for future extensions.

stackpointerTOS

The top-of-stack for the stack pointed to by stackpointer. If you are using top-of-stack caching for that stack, this should be defined as a variable; if you are not using top-of-stack caching for that stack, this should be a macro expanding to stackpointer[0]. The stack pointer for the predefined inst-stream is called IP, so the top-of-stack is called IPTOS.

IF_stackpointerTOS(expr)

Macro for executing expr, if top-of-stack caching is used for the stackpointer stack. I.e., this should do expr if there is top-of-stack caching for stackpointer; otherwise it should do nothing.
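
A minimal sketch of how stackpointerTOS and IF_stackpointerTOS fit together for a stack pointer called sp. The switch USE_TOS_CACHING and the function add_two are assumptions of this sketch, and the instruction bodies are written by hand rather than generated. The same bodies work with and without top-of-stack caching.

```c
typedef long Cell;

#define USE_TOS_CACHING 1   /* hypothetical switch; set to 0 to compare */

#if USE_TOS_CACHING
/* spTOS is a local variable of the engine (declared below). */
#define IF_spTOS(expr) expr
#else
#define spTOS          (sp[0])
#define IF_spTOS(expr)
#endif

/* Pushes a and b, adds them, and returns the result. */
Cell add_two(Cell a, Cell b)
{
    Cell stack[8] = { 0 };
    Cell *sp = stack + 7;       /* grows downwards; sp[0] is the top slot */
#if USE_TOS_CACHING
    Cell spTOS;                 /* cached top-of-stack item */
#endif
    IF_spTOS(spTOS = sp[0]);    /* load the cache on entry */

    /* push a */
    IF_spTOS(sp[0] = spTOS);    /* write the old top back to memory */
    sp -= 1;
    spTOS = a;

    /* push b */
    IF_spTOS(sp[0] = spTOS);
    sp -= 1;
    spTOS = b;

    /* add ( i1 i2 -- i ) */
    {
        Cell i1 = sp[1];
        Cell i2 = spTOS;
        sp += 1;
        spTOS = i1 + i2;
    }

    IF_spTOS(sp[0] = spTOS);    /* write the cache back on exit */
    return sp[0];
}
```

With caching, the add instruction reads only one item from memory and writes none; without it, every access to spTOS goes through sp[0].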

SUPER_END

This is used by the VM profiler (see VM profiler); it should not do anything in normal operation, and call vm_count_block(IP) for profiling.

SUPER_CONTINUE

This is just a hint to Vmgen and does nothing at the C level.

VM_DEBUG

If this is defined, the tracing code will be compiled in (slower interpretation, but better debugging). Our example compiles two versions of the engine, a fast-running one that cannot trace, and one with potential tracing and profiling.

vm_debug

Needed only if VM_DEBUG is defined. If this variable contains true, the VM instructions produce trace output. It can be turned on or off at any time.

vm_out

Needed only if VM_DEBUG is defined. Specifies the file on which to print the trace output (type FILE *).

printarg_type(value)

Needed only if VM_DEBUG is defined. Macro or function for printing value in a way appropriate for the type. This is used for printing the values of stack items during tracing. Type is normally the type prefix specified in a type-prefix definition (e.g., printarg_i); in superinstructions it is currently the basic type of the stack.
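
A minimal sketch of such printing macros, assuming a type prefix i for integers and Cell as the stack's basic type. The output formats and the function trace_add are assumptions made for illustration; trace_add only imitates by hand what a traced instruction prints.

```c
#include <stdio.h>

typedef long Cell;

FILE *vm_out;   /* trace destination; must be set before use */

/* For the type prefix i: print as a signed decimal number. */
#define printarg_i(i)     fprintf(vm_out, "%ld ", (long)(i))
/* For the stack's basic type (used in superinstructions): print in hex. */
#define printarg_Cell(c)  fprintf(vm_out, "0x%lx ", (unsigned long)(c))

/* Imitates the trace of: add ( i1 i2 -- i ) */
void trace_add(Cell i1, Cell i2, Cell i)
{
    fputs("add ", vm_out);
    printarg_i(i1);
    printarg_i(i2);
    fputs("-- ", vm_out);
    printarg_i(i);
    fputc('\n', vm_out);
}
```

After setting vm_out (for example to stderr), trace_add(2, 3, 5) writes one trace line showing the two arguments and the result.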