Vmgen (Gforth 0.6.1)


VM profiler

The VM profiler is designed for getting execution and occurrence counts for VM instruction sequences; these counts can then be used for selecting sequences as superinstructions. The VM profiler is probably not useful as a profiling tool for the interpretive system; i.e., it is useful for the developers, but not the users, of the interpretive system.

For each basic block that was executed at least once, the profiler outputs the dynamic execution count of that basic block and of all its subsequences; e.g.,

       9227465  lit storelocal
       9227465  storelocal branch
       9227465  lit storelocal branch

I.e., a basic block consisting of lit storelocal branch is executed 9227465 times.
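The listing above can be reproduced with a short sketch. It assumes that "subsequences" means contiguous runs of at least two instructions, which is what the three lines of the example suggest; the function names and the output order are illustrative and not taken from the profiler's source.

```python
def subsequences(block, min_len=2):
    """Yield every contiguous run of at least min_len instructions in block."""
    n = len(block)
    for length in range(min_len, n + 1):
        for start in range(0, n - length + 1):
            yield block[start:start + length]


def profile_lines(block, count):
    """Format one output line per subsequence, all with the block's count."""
    return [f"{count:>14}  {' '.join(seq)}" for seq in subsequences(block)]


# The basic block from the example above.
lines = profile_lines(["lit", "storelocal", "branch"], 9227465)
print("\n".join(lines))
```

Every subsequence inherits the count of its basic block, because it is executed exactly as often as the block itself.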

This output can be combined in various ways. E.g., vmgen-ex/stat.awk adds up the occurrences of a given sequence with respect to dynamic execution, static occurrence, and per-program occurrence. E.g.,

      2      16        36910041 loadlocal lit

indicates that the sequence loadlocal lit occurs in 2 programs, in 16 places, and has been executed 36910041 times. Now you can select superinstructions in any way you like (note that compile time and space typically limit the number of superinstructions to 100-1000). After you have done that, vmgen/seq2rule.awk turns lines of the form above into rules for inclusion in a Vmgen input file. Note that this script does not ensure that all prefixes are defined, so you have to do that in other ways. So, an overall script for turning profiles into superinstructions can look like this:

awk -f stat.awk fib.prof test.prof|
awk '$3>=10000'|                #select sequences
fgrep -v -f peephole-blacklist| #eliminate wrong instructions
awk -f seq2rule.awk|            #turn into superinstructions
sort -k 3 >mini-super.vmg       #sort sequences

Here the dynamic count is used for selecting sequences (preliminary results indicate that the static count gives better results, though); the third line eliminates sequences containing instructions that must not occur in a superinstruction, because they access a stack directly. The dynamic count selection ensures that all subsequences (including prefixes) of longer sequences occur (because subsequences have at least the same count as the longer sequences); the sort in the last line ensures that longer superinstructions occur after their prefixes.
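The aggregation, selection, and prefix requirements described above can be modelled in a few lines. This is a sketch under stated assumptions, not a port of stat.awk or seq2rule.awk: it assumes every profile line counts as one static place, and that only prefixes of length two or more need to be defined as superinstructions. The profile data and the function names are hypothetical.

```python
from collections import defaultdict


def aggregate(profiles):
    """profiles maps a program name to a list of (count, sequence) pairs,
    one pair per profile line. Returns {sequence: [programs, places, executions]}."""
    stats = defaultdict(lambda: [0, 0, 0])
    for lines in profiles.values():
        seen = set()
        for count, seq in lines:
            entry = stats[seq]
            if seq not in seen:   # first time this program shows the sequence
                seen.add(seq)
                entry[0] += 1
            entry[1] += 1         # one more static place
            entry[2] += count     # dynamic executions
    return dict(stats)


def select(stats, threshold):
    """Keep sequences whose dynamic count reaches threshold. Sorting tuples
    places every prefix before the longer sequences that extend it."""
    return sorted(seq for seq, (_, _, dyn) in stats.items() if dyn >= threshold)


def missing_prefixes(selected):
    """Return prefixes (length >= 2) of selected sequences that are not selected."""
    chosen = set(selected)
    missing = set()
    for seq in chosen:
        for k in range(2, len(seq)):
            if seq[:k] not in chosen:
                missing.add(seq[:k])
    return sorted(missing)


# Hypothetical profile data, for illustration only.
profiles = {
    "fib.prof": [
        (9000, ("loadlocal", "lit")),
        (9000, ("lit", "sub")),
        (9000, ("loadlocal", "lit", "sub")),
        (500, ("loadlocal", "lit")),
    ],
    "test.prof": [(2000, ("loadlocal", "lit")), (40, ("lit", "add"))],
}
stats = aggregate(profiles)
selected = select(stats, 9000)
print(selected)
print(missing_prefixes(selected))
```

With a dynamic-count threshold the prefix check comes back empty, as the text argues. A selection made in some other way (e.g., by static count, or after blacklisting) may need the check before its rules go into a Vmgen input file.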

But before using this, you have to have the profiler. Vmgen supports its creation by generating file-profile.i; you also need the wrapper file vmgen-ex/profile.c that you can use almost verbatim.

The profiler works by recording the targets of all VM control flow changes (through SUPER_END during execution, and through BB_BOUNDARY in the front end), and counting (through SUPER_END) how often they were targeted. After the program run, the numbers are corrected such that each VM basic block has the correct count (entering a block without executing a branch does not increase the count, and the correction fixes that), then the subsequences of all basic blocks are printed. To get all this, you just have to define SUPER_END (and BB_BOUNDARY) appropriately, and call vm_print_profile(FILE *file) when you want to output the profile on file.
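The correction step can be illustrated with a simplified model. It is one plausible reading of the description above, not the code in vmgen-ex/profile.c: it assumes the basic blocks are processed in program order, and that a block which ends without a control flow change passes its corrected count on to the block that follows it. The block names and the `falls_through` flag are hypothetical.

```python
def corrected_counts(blocks):
    """blocks: list in program order of (name, targeted, falls_through).

    targeted      -- how often the block start was recorded as a target
    falls_through -- True if the block ends only because the next
                     instruction starts a new block, so execution
                     continues into that block without a branch
    """
    result = {}
    carry = 0  # executions that enter the current block by falling through
    for name, targeted, falls_through in blocks:
        total = targeted + carry
        result[name] = total
        carry = total if falls_through else 0
    return result


# Hypothetical program: the entry block falls into a loop body, and the
# loop body is also the target of a backward branch taken 99 times.
blocks = [("entry", 1, True), ("loop", 99, False), ("exit", 1, False)]
print(corrected_counts(blocks))
```

In this model the loop body is targeted 99 times by the backward branch but executed 100 times, because the first entry arrives by falling through from the preceding block.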

The file-profile.i is similar to the disassembler file, and it uses variables and functions defined in vmgen-ex/profile.c, plus VM_IS_INST already defined for the VM disassembler (see VM disassembler).