This section deals with the ANS Forth defining words CONSTANT, VARIABLE and VALUE.
CONSTANT has exactly the same semantics as specified by ANS Forth. Whenever the constant defined by CONSTANT is executed, it pushes a constant value onto the data stack. Is that all? Sure. But again, there's a small but significant difference between strongForth and ANS Forth. In strongForth, each constant has a data type, which is identical to the data type of the value that is provided to CONSTANT. To figure out how CONSTANT gets access to this data type, let's have a look at a small example:
42 CONSTANT ANSWER OK
LATEST . ANSWER ( -- UNSIGNED ) OK
Immediately after interpreting 42, the topmost data type on the interpreter data type heap is UNSIGNED. Next, CONSTANT is about to be interpreted. The interpreter finds CONSTANT in the dictionary, updates the interpreter data type heap according to the stack diagram of CONSTANT and then calls the inner interpreter to execute CONSTANT. The stack diagram of CONSTANT indicates that this word expects an item of data type SINGLE on the stack. It does not have any output parameters:
: CONSTANT ( SINGLE -- ) <VALUE (CONSTANT) ROT CONST, ['CODE] FALSE VALUE> ;
Data type SINGLE, or UNSIGNED in our example, is removed from the interpreter data type heap before CONSTANT is executed. Removing an item from a data type heap means that the heap pointer, which always points to the next free cell on the heap, is decremented. This means that the interpreter data type heap pointer now points to UNSIGNED. And this is where CONSTANT gets the data type from.
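To make the pointer trick concrete, here is a minimal Python sketch of a data type heap. It is only a toy model, not strongForth source: the class DataTypeHeap and its method names are made up for illustration. It shows that popping an entry merely moves the pointer, so the popped data type can still be read at the pointer position.

```python
# Toy model (not strongForth source): a data type heap whose pointer marks
# the next free cell. Popping only moves the pointer; the entry stays put.
class DataTypeHeap:
    def __init__(self):
        self.cells = []      # storage for data type entries
        self.pointer = 0     # index of the next free cell

    def push(self, data_type):
        # Overwrite or extend the cell at the pointer, then advance.
        if self.pointer < len(self.cells):
            self.cells[self.pointer] = data_type
        else:
            self.cells.append(data_type)
        self.pointer += 1

    def pop(self):
        # Removing an item just decrements the pointer.
        self.pointer -= 1

    def at_pointer(self):
        # What a word like CONSTANT can still read after its input was popped.
        return self.cells[self.pointer]


heap = DataTypeHeap()
heap.push("UNSIGNED")   # effect of interpreting 42
heap.pop()              # the interpreter consumes CONSTANT's SINGLE parameter
constant_type = heap.at_pointer()
print(constant_type)
```

The entry is only overwritten by the next push, which is why the data type is still available while CONSTANT runs.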
The definition of CONSTANT contains three new words: <VALUE, (CONSTANT) and VALUE>. <VALUE creates a new definition and prepares for adding the stack diagram of a constant or a variable. VALUE> is the counterpart of <VALUE, finalizing the new definition and restoring the memory space to what it was before the execution of <VALUE.
: <VALUE ( -- MEMORY-SPACE STACK-DIAGRAM ) CREATE SPACE@ NAME-SPACE NULL STACK-DIAGRAM ;
: VALUE> ( MEMORY-SPACE STACK-DIAGRAM CODE -- ) END-DEFINITION SPACE! ;
(CONSTANT) compiles the data type of a constant as the output parameter into the stack diagram of the new constant. A loop is used to cover the general case of compiling a compound data type consisting of more than one basic data type:
: (CONSTANT) ( STACK-DIAGRAM -- 1ST ) DTP@ BEGIN TUCK @ DT-OUTPUT OR PARAM, SWAP DUP @ DT-PREFIX ATTRIBUTE? WHILE 1+ REPEAT DROP ;
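The loop can be pictured with a small Python sketch. This is a simplified model under stated assumptions, not the actual implementation: BasicType and copy_compound_as_output are hypothetical names, and the two boolean fields merely stand in for the DT-PREFIX and DT-OUTPUT attributes. The function copies basic data types until it reaches one without the prefix attribute, marking each copy as an output parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasicType:
    name: str
    prefix: bool = False   # models the DT-PREFIX attribute
    output: bool = False   # models the DT-OUTPUT attribute


def copy_compound_as_output(heap, start):
    """Copy one compound data type starting at heap[start], marking each
    basic data type as an output parameter."""
    diagram = []
    pos = start
    while True:
        entry = heap[pos]
        diagram.append(BasicType(entry.name, entry.prefix, output=True))
        if not entry.prefix:   # last basic data type of the compound type
            break
        pos += 1               # prefix set: another basic data type follows
    return diagram


# CDATA -> CHARACTER is a compound data type; FLAG is an unrelated entry.
heap = [BasicType("CDATA", prefix=True), BasicType("CHARACTER"), BasicType("FLAG")]
diagram = copy_compound_as_output(heap, 0)
print([t.name for t in diagram])
```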
CONSTANT compiles the actual value of the constant into the data field of the new definition. The machine code, whose address is compiled into the code field, is identical to the one of the predefined constant FALSE.
CONSTANT as in the above definition creates single-cell constants only. For double-cell constants, strongForth provides an overloaded version of CONSTANT, which looks almost identical to the single-cell version:
: CONSTANT ( DOUBLE -- ) <VALUE (CONSTANT) ROT CONST, ['CODE] DT-INPUT VALUE> ;
The only visible difference besides the stack diagrams is the machine code address, which is compiled into the code field of the new definition. But there are also invisible differences. The single-cell and double-cell versions use different overloaded versions of ROT and CONST,, because these are the words dealing with the input parameter (SINGLE or DOUBLE).
ANS Forth specifies a separate word 2CONSTANT for defining double-cell constants or pairs of single-cell constants. Like 2DUP, 2DROP, 2SWAP and 2ROT, 2CONSTANT does not exist in strongForth. However, be aware that the overloaded double-cell version of CONSTANT covers only the double-cell semantics of 2CONSTANT, not the pair-of-single-cells semantics.
The ANS Forth word VARIABLE does not expect any parameters on the stack. It just allocates one cell of memory and creates a new definition that returns the address of this memory cell when executed. So, how does the strongForth version of VARIABLE know about the data type of the variable to be created? Since you already know some of strongForth's system variables and how they are defined, you probably know the answer.
BASE .S DROP DATA -> UNSIGNED OK
STATE .S DROP DATA -> FLAG OK
StrongForth's version of VARIABLE has slightly different semantics than the ANS Forth version. It expects an input parameter, whose data type becomes the data type of the variable and whose value initializes the variable:
1973 VARIABLE YEAR OK
YEAR .S DATA -> UNSIGNED OK
@ .S . UNSIGNED 1973 OK
The definition of VARIABLE itself is similar to the definition of CONSTANT.
: VARIABLE ( SINGLE -- ) <VALUE (VARIABLE) DATA-SPACE HERE CONST, ROT , ['CODE] BASE VALUE> ;
Let's just discuss the differences. First, VARIABLE uses (VARIABLE) instead of (CONSTANT) to create the stack diagram of the new definition. (VARIABLE) compiles the data type of the initializer, which is preceded by the prefix DATA -> in order to add one level of indirection:
: (VARIABLE) ( STACK-DIAGRAM -- 1ST ) [ DT DATA DT-OUTPUT DT-PREFIX OR OR ] LITERAL PARAM, (CONSTANT) ;
The second difference between VARIABLE and CONSTANT is that VARIABLE stores the value of the initializer in the data space and the variable's address in the data field. The reason why the variable's value is not directly stored in the data field should be obvious. The data field is located in the constant data space. In an embedded system, the constant data space is typically mapped to some kind of read-only memory, which cannot be written to at runtime. As a consequence, the variable's value would be frozen. In contrast to variables, constants and addresses of variables can safely be stored in the constant data space, because they are not changed at runtime.
The third and final thing that catches the eye is the code field. VARIABLE compiles the code field of the variable BASE instead of the code field of the constant FALSE. What's the difference? There is none! BASE actually has the same code field as FALSE. At runtime, both constants and variables push the contents of their data field onto the data stack. For constants, it is the value of the constant, while for variables, it is the address of the variable.
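The following Python sketch illustrates why one piece of runtime code is enough for both kinds of words. It is a toy model, not strongForth's machine code: the dictionary entries, the helper names and the start address 100 are all made up for illustration.

```python
# Toy model: constants and variables share one runtime behaviour, which is
# to push the content of the data field. Only the meaning of that content
# differs (a value for constants, an address for variables).
data_space = {}            # writable memory: address -> value
next_address = [100]       # arbitrary start address for this sketch


def shared_runtime(word):
    # The one runtime behaviour: push the data field's content.
    return word["data_field"]


def make_constant(value):
    return {"data_field": value}


def make_variable(initial):
    address = next_address[0]
    next_address[0] += 1
    data_space[address] = initial          # the value lives in the data space
    return {"data_field": address}         # the data field holds the address


answer = make_constant(42)
year = make_variable(1973)
print(shared_runtime(answer))              # the constant's value
print(data_space[shared_runtime(year)])    # fetch through the variable's address
```

Changing the variable only touches data_space, never the data field, which matches the requirement that the constant data space may be read-only.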
Of course, strongForth provides an overloaded version of VARIABLE for double-cell variables. Just as with CONSTANT, there's almost no difference between the single-cell and the double-cell versions. The double-cell version of VARIABLE even compiles the same code field as the single-cell version, because addresses of single-cell and double-cell variables do not differ at all. The only visible difference between the definitions of the two versions of VARIABLE is in the stack diagrams:
: VARIABLE ( DOUBLE -- ) <VALUE (VARIABLE) DATA-SPACE HERE CONST, ROT , ['CODE] BASE VALUE> ;
A value is a variable that is used like a constant. To see the differences between values, constants and variables, let's compare the definition of VALUE with the definitions of CONSTANT and VARIABLE:
: VALUE ( SINGLE -- ) <VALUE (CONSTANT) DATA-SPACE HERE CONST, ROT , ['CODE] SOURCE-ID VALUE> ;
Because a value has the same stack diagram as a constant, VALUE compiles the stack diagram of the new definition with (CONSTANT). On the other hand, to ensure that a value can be changed at runtime, it has to be stored in the data space, just like a variable. The data field of the definition contains the address of the value. The code field is the same as the one of SOURCE-ID, which is actually a predefined value.
The overloaded version of VALUE for double-cell values is almost identical. The only visible difference to the version for single-cell values is in the compilation of the code field. LATEST is a typical double-cell value that is predefined by strongForth.
: VALUE ( DOUBLE -- ) <VALUE (CONSTANT) DATA-SPACE HERE CONST, ROT , ['CODE] LATEST VALUE> ;
StrongForth keeps locals in the dedicated local dictionary, which resides in the local name space. The local name space is located in the DATA memory area, because it needs to be re-written for each new definition that uses locals. Besides the local dictionary, the local name space contains other temporary data structures, which will be presented later. As explained in chapter 10, both : and :NONAME execute LOCALS! to initialize the local name space at the beginning of a new colon definition.
Locals have a different memory image than ordinary words. The local dictionary only occupies memory in the local name space, whereas the (global) dictionary is spread over the name space, the constant data space and the code space. The memory image of a local within the local name space looks like this:
name field |
link field |
index field |
output parameter field |
The name field and the link field are the same as for a word in the (global) dictionary. But where are the attribute and token fields? These fields are not required, because locals all have the same attributes and the same tokens. Instead, an index field specifies where to find the value of a local at execution time. Finally, the output parameter field contains the compound data type of the local. Input parameters do not exist for locals. Since each local has exactly one compound data type as output parameter, it is not necessary to specify the length of the stack diagram in the attribute field.
At execution time, the values of locals are kept on the return stack. They are accessed relative to the return stack pointer,
RP@ offset +
where the positive offset is calculated at compilation time as
#LOCALS @ index - CELLS
#LOCALS is a system variable that contains the number of cells that will be occupied by locals on the return stack at runtime. The content of the index field is subtracted from this value in order to calculate the return stack offset. As an example, let's consider a definition with two locals:
: EXAMPLE ( DOUBLE SINGLE -- ... ) LOCALS| S D | ... S ... D ... ;
At runtime, S and D are located on the return stack, with offsets 2 CELLS and 0 CELLS from the return stack pointer, respectively. At compilation time, #LOCALS is 3, because the two locals occupy three cells on the return stack. The index field of the dictionary entry for S contains 1, while the index field of the dictionary entry for D contains 3.
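The arithmetic can be checked with a few lines of Python. This is just a restatement of the formula #LOCALS @ index - CELLS; the function name and the cell_size parameter are assumptions made for the sketch, and the index is taken as a magnitude.

```python
def return_stack_offset(total_cells, index, cell_size=1):
    """Offset of a local from the return stack pointer, following
    #LOCALS @ index - CELLS with the index taken as a magnitude.
    With cell_size=1 the result is in cells."""
    return (total_cells - abs(index)) * cell_size


# EXAMPLE: S is single-cell (defined first), D is double-cell.
total = 3
offset_s = return_stack_offset(total, 1)
offset_d = return_stack_offset(total, 3)
print(offset_s, offset_d)
```

Passing cell_size=2 would give the offset in address units on a machine with two address units per cell.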
Because the memory images of the global and local dictionaries are quite different, it is not possible to reuse CREATE or (CREATE) for creating locals in the local dictionary. Instead, strongForth provides the dedicated word CREATE-LOCAL:
: CREATE-LOCAL ( CDATA -> CHARACTER UNSIGNED -- ) ?COMPILE SPACE@ LOCAL-SPACE ROT ROT HERE ROT ROT ", ALIGN LINK, #LOCALS @ , DTP@ BEGIN DUP @ DUP , DT-PREFIX ATTRIBUTE? WHILE 1+ REPEAT DROP SPACE! ;
Unlike CREATE, which obtains the name of the new definition by parsing the input source, CREATE-LOCAL expects the name of the local as a character string on the data stack. The name field and the link field are created in the same way as for ordinary words, using ", ALIGN LINK,. As long as locals are being defined, #LOCALS counts the number of cells occupied by the locals defined so far. This is exactly the value that needs to be compiled into the index field. In the above example, S is the first local, occupying one cell. #LOCALS is 1. D occupies two cells, which are added to the one cell occupied by S. #LOCALS is now 3.
Finally, CREATE-LOCAL compiles the compound data type of the new local into the output parameter field. It is assumed that the data type has been popped from the compiler data type heap immediately before CREATE-LOCAL is executed. Therefore, the data type heap pointer now points to the data type of the local.
Note that CREATE-LOCAL creates a complete local. Nothing else is required. This is different from CREATE, which leaves some fields of the new word's memory image to be filled later.
After all locals have been created, they can be found in the local dictionary. FIND-LOCAL is similar to FIND, but it can only be applied to the local dictionary:
FIND-LOCAL ( CDATA -> CHARACTER UNSIGNED -- DATA -> DATA-TYPE SIGNED )
It expects the name of the local as a character string on the data stack. No other input parameters are required. If a local with the given name exists in the local dictionary, FIND-LOCAL returns a pointer to its output parameter field, and the content of its index field as a signed number. If a local with the given name cannot be found in the local dictionary, FIND-LOCAL returns a null pointer and 0 as the index.
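As a rough picture of how the two words cooperate, here is a Python sketch of a local dictionary. It is a simplified model, not the real memory layout: the class and method names are hypothetical, None stands in for the null pointer, and searching from the newest entry backwards is an assumption based on the link field working as in the global dictionary.

```python
class LocalDictionary:
    def __init__(self):
        self.entries = []        # most recent entry last

    def create_local(self, name, index, data_type):
        # One call produces a complete entry: name, index, output parameter.
        self.entries.append({"name": name, "index": index, "type": data_type})

    def find_local(self, name):
        # Assumed search order: newest entry first, like following links.
        for entry in reversed(self.entries):
            if entry["name"] == name:
                return entry["type"], entry["index"]
        return None, 0           # null pointer and index 0 when not found


local_dict = LocalDictionary()
local_dict.create_local("S", 1, ["SINGLE"])
local_dict.create_local("D", 3, ["DOUBLE"])
print(local_dict.find_local("D"))
print(local_dict.find_local("MISSING"))
```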
StrongForth provides a means to remove locals from the local dictionary before the compilation is done. Is this really necessary? Locals defined by LOCALS| exist until the end of the definition, so there's no need to explicitly remove them. But there are other kinds of locals, whose scope is even smaller than that of ordinary locals. Those are the ANS Forth words I, J and R@. In strongForth, these words do not exist in the dictionary. Instead, they are defined as locals at compilation time. I, and possibly J, are locals that are created by DO and ?DO, and removed by LOOP and +LOOP. They exist only during the compilation of a DO loop. Similarly, R@ is created by >R, and removed by R>. The scope of R@ is the compiled code between >R and R>. Details on the compilation of loops and about using the return stack as temporary data storage will be given later.
FORGET-LOCAL removes the most recently defined local from the local dictionary:
: FORGET-LOCAL ( SIGNED -- ) ?COMPILE NEGATE #LOCALS +! SPACE@ LOCAL-SPACE LINK @ HERE - ALLOT HERE DUP CAST CDATA -> UNSIGNED @ 1+ + ALIGNED CAST DATA -> ADDRESS @ LINK ! SPACE! ;
The size in cells of the local is expected on the data stack. This parameter is used to update #LOCALS. Next, FORGET-LOCAL deallocates the space occupied by the local. And finally, the pointer to the latest local has to be updated as well. This is a little bit tricky, because FORGET-LOCAL has to calculate the address of the link field of the removed local by traversing its name field. Remember that the system variable LINK contains a pointer to the name field of the latest dictionary entry in the name space or in the local name space.
In order to define locals during the compilation of a new word, ANS Forth specifies a well-defined process. The ANS Forth word (LOCAL) shall send a message to the system for each local to be defined, and an additional last local message after the last local. Implementing this process obviously requires that the system knows whether the last local message has already been sent. StrongForth keeps this status information in the system variable #LOCALS:
#LOCALS | Semantics |
---|---|
zero | No locals defined yet. |
negative | First local has been defined, but last local message has not yet been sent. The absolute value of #LOCALS is the total number of cells that have been allocated for locals so far. |
positive | All locals have been defined and the last local message has been sent. The value of #LOCALS is the total number of cells that have been allocated for locals. |
The definition of (LOCAL) is based on the status information contained in #LOCALS:
: (LOCAL) ( CDATA -> CHARACTER UNSIGNED -- )
  ?COMPILE DUP
  IF #LOCALS @ NEGATE 0<
     IF DROP DROP -263 THROW
     ELSE POSTPONE (>R) -1 DTP@ @ DOUBLE? IF 1- THEN #LOCALS +! CREATE-LOCAL
     THEN
  ELSE DROP DROP #LOCALS @ NEGATE #LOCALS !
  THEN ;
First, let's see what happens if (LOCAL) is executed with a valid string as input parameter. If the last local message has already been sent, (LOCAL) throws an exception. Otherwise, it defines a new local by compiling code to push the value of the local onto the return stack, decreasing the (negative) value of #LOCALS by the number of cells to be allocated on the return stack, and executing CREATE-LOCAL in order to add the local to the local dictionary.
(>R) is the runtime code of (LOCAL). It takes the item on top of the data stack and pushes it onto the return stack. As usual, strongForth provides two overloaded versions for single-cell and double-cell items:
(>R) ( SINGLE -- )
(>R) ( DOUBLE -- )
The second case for (LOCAL) is when it is executed with a null string as the input parameter. This is the last local message. (LOCAL) just changes the status information by negating the value of #LOCALS. From now on until the end of the definition, the value of #LOCALS is positive.
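The way #LOCALS doubles as a cell counter and a status flag can be followed in this Python sketch. It models only the bookkeeping of (LOCAL) and leaves out code generation. The class LocalsState and its attribute names are hypothetical, and a Python RuntimeError carrying -263 stands in for the THROW.

```python
class LocalsState:
    def __init__(self):
        self.n_locals = 0        # models the system variable #LOCALS
        self.indices = {}        # local name -> content of its index field

    def local_message(self, name, cells=1):
        if name:                                  # ordinary local message
            if self.n_locals > 0:                 # last local message already sent
                raise RuntimeError(-263)
            self.n_locals -= cells                # count cells as a negative number
            self.indices[name] = self.n_locals    # index field gets current #LOCALS
        else:                                     # null string: last local message
            self.n_locals = -self.n_locals


state = LocalsState()
state.local_message("S", cells=1)     # single-cell local
state.local_message("D", cells=2)     # double-cell local
during = state.n_locals               # still negative while locals are defined
state.local_message("")               # last local message
print(during, state.n_locals, state.indices)
```

In this model, the index fields of locals defined through (LOCAL) end up negative, and only their magnitudes enter the offset calculation.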
(LOCAL) is just the system's low-level interface to defining locals. The high-level interface specified by ANS Forth is the immediate word LOCALS|. LOCALS| is typically executed somewhere at the beginning of a new definition, using the syntax
: name ... LOCALS| name-1 name-2 ... name-n | ... ;
The semantics of LOCALS| is to parse the input source and to execute (LOCAL) for each word up to but excluding |. Finally, it sends the last local message by executing (LOCAL) with a null string as input parameter. Here's the definition of LOCALS|:
: LOCALS| ( -- ) BEGIN PARSE-WORD OVER OVER " |" COMPARE WHILE (LOCAL) REPEAT 1- (LOCAL) ; IMMEDIATE
In the previous section, you've seen how locals are defined. Now, what needs to be done if a local is actually used during compilation? Let's have a look at a simple definition that uses locals. MSWAP swaps the contents of two memory cells:
: MSWAP ( DATA -> SINGLE 1ST -- ) LOCALS| ADDR1 ADDR2 | ADDR1 @ ADDR2 @ ADDR1 ! ADDR2 ! ;
You might be tempted to say that it's probably much easier to implement the semantics of MSWAP by using simple stack movements like DUP, ROT and SWAP, but you'll be surprised how complicated things get if you try to avoid locals in this case. The solution with locals is more straightforward and much easier to read.
Next, we'll use SEE to see the virtual machine code that has been compiled by the above definition of MSWAP:
SEE MSWAP
: MSWAP ( DATA -> SINGLE 1ST -- ) (>R) (>R) (R@) 2 @ (R@) 0 @ (R@) 2 ! (R@) 0 ! ; OK
The first two tokens are compiled by LOCALS|. They push the two input parameters onto the return stack at runtime. From the rest of the virtual machine code, it's easy to see that ADDR1 is compiled into (R@) 2 and ADDR2 is compiled into (R@) 0. (R@) actually fetches one cell from an address relative to the return stack pointer, and pushes it onto the data stack. The offset in address units is a parameter to (R@), which is stored in the virtual machine code. It is calculated during compilation time as follows, with index being the absolute value of the local's index field:
#LOCALS @ index - CELLS
In the example of MSWAP, #LOCALS is 2, while the index fields of ADDR1 and ADDR2 contain -1 and -2, respectively. The return stack pointer directly points to ADDR2, i.e., with zero offset. Since ADDR1 was pushed onto the return stack before ADDR2, its offset is positive. On a 16-bit machine, an offset of one cell is the same as two address units.
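To see that the compiled token sequence really swaps two cells, the following Python sketch interprets it step by step. It is a toy interpreter written for this one sequence, not strongForth's virtual machine: the helper names, the dictionary used as memory and the sample addresses are made up, and the cleanup of the return stack at the end of the definition is left out.

```python
CELL = 2                          # address units per cell on a 16-bit machine


def run_mswap(memory, a, b):
    data, rstack = [a, b], []     # both stacks grow at the end of the list

    def to_r():                   # (>R): move top of data stack to return stack
        rstack.append(data.pop())

    def r_fetch(offset):          # (R@) offset: read a cell relative to the top
        data.append(rstack[-1 - offset // CELL])

    def fetch():                  # @
        data.append(memory[data.pop()])

    def store():                  # !
        address = data.pop()
        memory[address] = data.pop()

    to_r(); to_r()                # (>R) (>R)
    r_fetch(2); fetch()           # (R@) 2 @   i.e. ADDR1 @
    r_fetch(0); fetch()           # (R@) 0 @   i.e. ADDR2 @
    r_fetch(2); store()           # (R@) 2 !   i.e. ADDR1 !
    r_fetch(0); store()           # (R@) 0 !   i.e. ADDR2 !
    return memory


memory = {0x10: 111, 0x20: 222}
run_mswap(memory, 0x10, 0x20)
print(memory)
```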
To compile locals, strongForth provides the word LOCAL,. LOCAL, expects the output of FIND-LOCAL ABS on the data stack, which is a pointer to the compound data type of the local, and the absolute value of the local's index field:
: LOCAL, ( DATA -> DATA-TYPE SIGNED -- ) ?COMPILE #LOCALS @ SWAP - DUP 0< IF DROP DROP -263 THROW ELSE OVER @ DOUBLE? IF ['TOKEN] (DR@) ELSE ['TOKEN] (R@) THEN CONST, CELLS CONST, @>DT THEN ;
An exception is thrown if the calculated return stack offset is a negative value for some reason. For example, this might happen if LOCAL, is executed before the last local message has been sent. If the local is a single-cell value, the token of (R@) is compiled as the virtual machine code. For a double-cell local, the token of (DR@) needs to be compiled. The semantics of (DR@) is similar to the one of (R@). Instead of fetching only one cell from the return stack, (DR@) fetches two cells from the return stack and pushes them onto the data stack. (R@) and (DR@) are low-level words that should normally not be used directly in a definition:
(R@) ( -- SINGLE )
(DR@) ( -- DOUBLE )
After compiling the virtual machine code, consisting of the token of either (R@) or (DR@) plus the calculated return stack offset, LOCAL, adds the local's compound data type to the compiler data type heap. That's all.
ANS Forth specifies the word TO to change locals and values after they have been initialized. Let's first investigate how locals are being changed by looking at the virtual machine code of an example word:
: EXAMPLE ( UNSIGNED -- 1ST ) 3 + LOCALS| X | X 8 TO X X + ; OK
2 EXAMPLE . 13 OK
SEE EXAMPLE
: EXAMPLE ( UNSIGNED -- 1ST ) 3 + (>R) (R@) 0 8 (R!) 0 (R@) 0 + ; OK
You can see that TO actually compiles (R!) 0 as virtual machine code. Just like with (R@), the constant 0 following (R!) is a parameter indicating the return stack offset of the local to be changed. Again, (R!) is a low-level word that shouldn't be used directly within a definition. There are two overloaded versions of (R!):
(R!) ( SINGLE 1ST -- )
(R!) ( DOUBLE 1ST -- )
Why does (R!) have two input parameters? The first one specifies the data type of the value that is to be stored in the local, while the second one is the data type of the local itself. Similar to !, strongForth prevents an item from being stored in a local that does not have exactly the same data type. But in the case of (R!), the second input parameter is just a dummy parameter that helps the compiler do its job. Neither at runtime nor at compilation time can this parameter be found on the data stack. This is not necessary, because the address information is the constant return stack offset, which is implicitly contained in the virtual machine code.
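The role of the dummy parameter can be sketched in Python as a pure compile-time check. This is an illustration under assumptions, not the compiler's actual matching algorithm: the function name is hypothetical, data types are modelled as tuples of names, and a Python TypeError merely stands in for whatever error strongForth reports.

```python
def compile_store_to_local(value_type, local_type, offset):
    """Model the compile-time check behind (R!) ( SINGLE 1ST -- ):
    the value's data type must equal the local's data type exactly."""
    if value_type != local_type:
        raise TypeError(
            f"cannot store {value_type} into a local of type {local_type}"
        )
    # The local's data type is a dummy parameter: only the value is consumed
    # at runtime, and the offset is embedded in the compiled code.
    return ["(R!)", offset]


code = compile_store_to_local(("UNSIGNED",), ("UNSIGNED",), 0)
print(code)
```

The returned list mirrors the compiled sequence (R!) 0 from the example: a token followed by the return stack offset, with no trace of the second parameter.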
Now it's time to have a closer look at the definition of TO:
: TO ( -- )
  PARSE-WORD OVER OVER FIND-LOCAL DUP
  IF ?COMPILE SWAP @>DT POSTPONE (R!)
     #LOCALS @ SWAP ABS - CELLS CONST, DROP DROP
  ELSE DROP DROP OVER OVER ['CODE] SOURCE-ID 1 FIND
     IF NIP NIP
     ELSE DROP ['CODE] LATEST 1 FIND 0=
        IF DROP -32 THROW EXIT THEN
     THEN
     [ DT DATA DT-PREFIX OR ] LITERAL >DT
     DUP [ NULL DATA-TYPE 1 OFFSET+ ] LITERAL PARAMS>DT
     >BODY -> DATA @
     STATE @ IF LITERAL, ELSE ( DATA -- )CAST THEN
     POSTPONE !
  THEN ; IMMEDIATE
TO starts parsing the input source for the name of the local or the value, and then tries to find this name in the local dictionary. If a local with this name exists, TO asserts that the system is in compilation state. It adds the local's data type to the compiler data type heap in order to provide the second input parameter for (R!). Based on the two input parameters, POSTPONE compiles the correct overloaded version of (R!). Finally, the return stack offset is calculated and compiled as virtual machine code.
The ELSE branch of the definition of TO deals with values. If the name parsed by TO does not belong to a local, TO searches the (global) dictionary first for a single-cell and then for a double-cell value with the given name. In order to find only values, TO selects a matching criterion for FIND that matches only words with the same code field as the predefined single-cell value SOURCE-ID or the predefined double-cell value LATEST.
If such a value exists, TO simply executes or compiles its address as a literal, plus a suitable version of !, as shown in the following example:
0 VALUE COUNTER OK
: INCREMENT ( -- ) COUNTER 1+ TO COUNTER ; OK
INCREMENT INCREMENT COUNTER . 2 OK
SEE INCREMENT
: INCREMENT ( -- ) COUNTER 1+ 2668 ! ; OK
2668 is the address of the memory cell where the content of COUNTER is stored.
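Here is a small Python model of what happens in the INCREMENT example. It is a sketch, not the compiled code itself: the dictionaries and function names are invented for illustration, and the closure returned by compile_to plays the role of the compiled sequence of an address literal followed by !.

```python
data_space = {2668: 0}                 # cell holding COUNTER's content
counter = {"data_field": 2668}         # the value's data field holds the address


def value_runtime(word):
    # Executing a value pushes its content, fetched through the data field.
    return data_space[word["data_field"]]


def compile_to(word):
    # TO resolves the address once, at compilation time, and produces code
    # equivalent to the compiled sequence "address-literal !".
    address = word["data_field"]

    def store(x):
        data_space[address] = x

    return store


to_counter = compile_to(counter)


def increment():                       # COUNTER 1+ TO COUNTER
    to_counter(value_runtime(counter) + 1)


increment()
increment()
print(value_runtime(counter))
```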
Compiling ! is simple. But using POSTPONE only works if the address literal has the correct (compound) data type, which is
DATA -> type
where type is the value's data type. TO adds the head and the tail of the compound data type to the data type heap. Remember that PARAMS>DT adds a complete compound data type to the data type heap, starting at the given position of a definition's stack diagram. In this case, it's the definition of the value, and the data type is the data type of the value's one and only parameter, starting at the first position of the stack diagram. The actual address is in the data field of the value's definition. In compilation state, the address is compiled into the virtual machine code, while in interpretation state it is simply left on the data stack. The phrase
STATE @ IF LITERAL, ELSE ( DATA -- )CAST THEN
is similar to the one in the definition of INTERPRET. However, TO does not need to distinguish between single-cell and double-cell literals, because addresses of data type DATA always occupy only one cell.
TO is obviously a rather complex word. This is due to the fact that it has to handle several different cases: locals and values, single-cell and double-cell items, and, for values, both interpretation and compilation state.
If you're in favour of a more classical Forth programming style, please feel encouraged to do some factoring in order to shrink the size of the definition of TO.
ANS Forth specifies the words >R, R@ and R> to transfer single-cell items between the data stack and the return stack. The stack diagrams contain everything you need to know about the semantics:
>R ( x -- ) ( R: -- x )
R> ( -- x ) ( R: x -- )
R@ ( -- x ) ( R: x -- x )
Now, let's see what strongForth has to offer:
WORDS >R
>R ( -- R-SIZE ) OK
WORDS R>
R> ( R-SIZE -- ) OK
WORDS R@ OK
Oops. What's that? >R and R> seem to have a totally different stack diagram with a strange data type called R-SIZE. And R@ doesn't even seem to exist! This must be a mistake.
Well, by now you should be used to the fact that strongForth offers some surprises. What is really happening here? Let's assume strongForth had R@ in its dictionary. What would be the stack diagram? That's difficult to say, because it depends on what data type >R has actually pushed onto the return stack. In the definition
: X1 ... 8753 >R ... R@ ... R> ... ;
both R@ and R> are expected to have the stack diagram ( -- UNSIGNED ), while in
: X2 ... PAD >R ... R@ ... R> ... ;
the stack diagrams of R@ and R> should be ( -- CDATA -> CHARACTER ). Therefore, it makes no sense for strongForth to provide a pre-defined version of R@ in the dictionary. The easiest way to solve this problem is to make R@ a local. Each local has a stack diagram that is specified at the definition of the local. In this case, >R defines R@ as a local, and R> removes the most recent local from the local dictionary. And that's exactly how it works. Both >R and R> are immediate words:
DT SINGLE PROCREATES R-SIZE
: >R ( -- R-SIZE )
  ?COMPILE POSTPONE (>R) 1 DTP@ @ DOUBLE? IF 1+ THEN DUP #LOCALS +!
  " R@" TRANSIENT CREATE-LOCAL CAST R-SIZE ; IMMEDIATE
: R> ( R-SIZE -- )
  ?COMPILE POSTPONE R@ CAST SIGNED DUP FORGET-LOCAL
  1- IF POSTPONE (DRDROP) ELSE POSTPONE (RDROP) THEN ; IMMEDIATE
>R starts with compiling (>R) in order to push a single-cell or double-cell item onto the return stack. You already know the two overloaded versions of (>R) as the runtime code of (LOCAL). Next, #LOCALS is incremented by 1 or 2, depending on whether the item that has just been pushed onto the return stack occupies one or two cells. CREATE-LOCAL creates a local with the name R@ and with the same compound data type as the item that has been taken from the data stack. Note that after compiling (>R), the compiler data type heap pointer still points to the data type of (>R)'s input parameter.
An interesting detail is the fact that the index of the local R@ is positive, whereas locals defined by (LOCAL) always have negative index values. This allows distinguishing R@ and other special locals from ordinary locals. On the other hand, you need to ensure that every calculation of the return stack offset, like the one in TO, uses the absolute value of the index.
>R returns the size of the local in cells as an item of data type R-SIZE. R-SIZE is a special data type to be used only by >R and R>. Using a dedicated data type ensures that >R and R> are always used in pairs and with proper nesting. Syntax violations like
... R> ... >R ... ;
or
... IF ... >R ... THEN ... R> ... ;
are simply rejected by the compiler. On the other hand, the requirement for pairwise use of >R and R> prevents usages like
... IF ... >R ... ELSE ... >R ... THEN ... R> ... ;
or
... >R ... IF ... R> ... ELSE ... R> ... THEN ... ;
which would be allowed in ANS Forth. Note that this strict syntax does not apply to R@. R@ is a local whose scope is between >R and R>, independently of the control structure in between. This means, for example, that you can use R@ within a DO loop even if >R and R> are outside of the loop:
... >R ... DO ... R@ ... LOOP ... R> ...
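One plausible way to picture this nesting rule is a compile-time stack of typed items, where every opening word leaves an item and every closing word demands a specific type on top. The Python sketch below follows that reading. It is an interpretation, not strongForth's actual mechanism: the tag names ORIGIN and DO-SYS and the function check_nesting are assumptions made for illustration, and only R-SIZE comes from the text.

```python
def check_nesting(words):
    """Return True if the control words nest properly. Each opening word
    leaves a typed item on a compile-time stack; each closing word demands
    a specific type on top of that stack."""
    opens = {">R": "R-SIZE", "IF": "ORIGIN", "DO": "DO-SYS"}
    closes = {"R>": "R-SIZE", "THEN": "ORIGIN", "LOOP": "DO-SYS"}
    stack = []
    for word in words:
        if word in opens:
            stack.append(opens[word])
        elif word == "ELSE":
            # ELSE resolves the IF and opens a new forward branch, so it
            # needs an ORIGIN on top and leaves an ORIGIN behind.
            if not stack or stack[-1] != "ORIGIN":
                return False
        elif word in closes:
            if not stack or stack.pop() != closes[word]:
                return False
        # Anything else, including the local R@, does not touch the stack.
    return not stack


accepted = check_nesting([">R", "DO", "R@", "LOOP", "R>"])
rejected = check_nesting(["IF", ">R", "ELSE", ">R", "THEN", "R>"])
print(accepted, rejected)
```

Under this model, R@ is accepted anywhere between >R and R>, while the four rejected patterns from the text fail because a closing word finds an item of the wrong type on top.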
Another interesting consequence of R@ being a local is the fact that TO can be applied to it the same way as with all other locals:
: TEST ( UNSIGNED -- ) >R R@ . R@ 2* TO R@ R> . ; OK
7 TEST 7 14 OK
However, you should be careful using R@ in such a way, because this is not ANS Forth compliant. Always keep in mind that >R and R> are immediate words that build a control structure, just like IF ... ELSE ... THEN, BEGIN ... UNTIL and DO ... LOOP, and that R@ is a local that is being defined by >R, and that is being removed by R>.
Finally, let's see how R> works. R> compiles R@, removes this local from the local dictionary, and then compiles virtual code to clean up the return stack. (RDROP) and (DRDROP) drop a single and a double cell from the return stack, respectively:
(RDROP) ( -- )
(DRDROP) ( -- )
But why is the virtual code split into R@ and (RDROP) or (DRDROP)? Why doesn't strongForth provide a single low-level word called (R>) for this purpose? The reason is that only R@ contains the information on the data type and the return stack offset that is needed for R>. (RDROP) and (DRDROP) just provide the additional semantics of R> with respect to R@. Splitting the semantics into two low-level words is the easiest way to go. You can try to implement a more efficient version of R>, for example by defining something like (R>) and (DR>) as machine code words and patching the compiled token of R@ with their tokens. It would definitely be an interesting exercise.
Dr. Stephan Becher - November 4th, 2005