From anton Mon Dec 31 18:11:13 2018 X-newsreader: xrn 10.00-beta-3 From: anton@mips.complang.tuwien.ac.at (Anton Ertl) Path: mips.complang.tuwien.ac.at!anton Newsgroups: comp.lang.forth Subject: FORWARD in Gforth Organization: Institut fuer Computersprachen, Technische Universitaet Wien Date: Mon, 31 Dec 2018 15:17:43 GMT Message-ID: <2018Dec31.161743@mips.complang.tuwien.ac.at> This posting discusses the implementation of FORWARD in Gforth, which may be an unnecessary feature, but the implementation is still interesting, as is shown by the fact that we came up with four different implementations in the span of a few days. Why FORWARD may be unnecessary: We have DEFERred words, which are the good-enough solution for implementing indirect recursion (which is the point of having FORWARD). DEFER causes a little overhead, which IMO is not enough to justify FORWARD, but others may disagree. In particular, the instruction sequence executed in gforth-fast on AMD64 is with DEFER with FORWARD: lit-perform: mov rax,[rbx] call: mov rax,rbx add rbx,$08 mov rbx,[rbx] mov rcx,[rax] mov rdx,r13 mov rdx,[rcx] add rax,$08 mov rax,rdx lea r13,-$08[r13] jmp eax mov -$08[rdx],rax docol: mov rax,r13 add rbx,$08 mov rdx,rbx mov rdx,-$08[rbx] lea r13,-$08[r13] mov rax,rdx lea rbx,$18[rcx] jmp eax mov -$08[rax],rdx mov rdx,-$08[rbx] mov rax,rdx jmp eax The other issue is the syntax: defer foo : bar ... foo ... ; :noname ... bar ... bar ... ; is foo There is at least one complaint about this syntax: What if we add FORWARD anyway? The FORWARD syntax we use is: FORWARD foo : bar ... foo ... ; : foo ... bar ... bar ... ; I.e., the definition of foo automatically resolves the earlier forward definition (in the same wordlist). Let's look at an implementation that implements just this syntax, but otherwise is the same as the DEFER solution: s" unresolved forward definition" exception constant unresolved-forward s" forward must be resolved with :" exception constant forward-needs-: : unresolved-forward-error ( -- ) unresolved-forward throw ; : forward ( "name" -- ) \g Defines a forward reference to a colon definition. Defining a \g colon definition with the same name in the same wordlist \g resolves the forward references. Use @code{.unresolved} to \g check whether any forwards are unresolved. defer ['] unresolved-forward-error lastxt defer! ; : is-forward? ( xt -- f ) \ f is true if xt is an unresolved forward definition dup >code-address dodefer: = if defer@ ['] unresolved-forward-error = exit then drop false ; : auto-resolve ( addr u wid -- ) \G auto-resolve the forward reference in check-shadow dup 2over rot find-name-in dup if dup is-forward? if latestxt >code-address docol: <> forward-needs-: and throw latestxt swap defer! 2drop drop exit then then drop defers check-shadow ; ' auto-resolve is check-shadow EXCEPTION ( c-addr u -- n ) produces a new exception number, that, when thrown, reports the string as error. FIND-NAME-IN ( c-addr u wid -- nt|0 ) searches for the name c-addr u in the wordlist wid. This produces an nt, but it is used as xt here (this is possible in Gforth, as long as no ALIASes or somesuch are involved). >CODE-ADDRESS ( xt -- addr ) produces essentially a token that identifies how the xt was defined. E.g., if it was defined with :, it produces the value produced by DOCOL: ( -- addr ). While it is not necessary to resolve the forward word with a colon definition in this implementation, this check prepares the implementation for the stricter requirements of the following versions: LATESTXT produces the xt of the latest definition (in this case, the definition that resolves the forward word). CHECK-SHADOW ( addr u wid -- ) is a deferred word for checking whether a new definition shadows an old one. It is called whenever a word is REVEALed. DEFERS CHECK-SHADOW produces a call to the previous behaviour of CHECK-SHADOW. AUTO-RESOLVE is then plugged into CHECK-SHADOW, such that forward words are resolved (by changing the deferred word) if they would be shadowed, but for non-forward words, the normal CHECK-SHADOW behaviour is performed. Let's add the code for eliminating the defer overhead in most cases: : unfixed-forward ( ... -- ... ) \ the compiled code for a forward child branches to this word, \ which patches the call to the forward word; if the word has not \ been resolved, this produces an error after patching the call to \ report an error. r@ cell- {: target :} target @ cell- @ dup >body target ! execute-exit ; : forward ( "name" -- ) \g Defines a forward reference to a colon definition. Defining a \g colon definition with the same name in the same wordlist \g resolves the forward references. Use @code{.unresolved} to \g check whether any forwards are unresolved. defer ['] unresolved-forward-error lastxt defer! ['] branch peephole-compile, ['] unfixed-forward >body , [: ['] call peephole-compile, >body cell+ , ;] set-optimizer ; This makes use of Gforth's threaded-code basis. A forward word FOO now looks as follows: FOO header xt -> dodefer unused body -> xt (of unresolved-forward-error or of the resolution) tc -> branch (threaded code fragment) unfixed-forward (body) The middle line of FORWARD produces the branch and its target. PEEPHOLE-COMPILE, ( xt -- ) is the code that COMPILE, calls for primitives. The last line of FORWARD means that, when FOO is COMPILE,d, this compilation produces the following code: call foo-tc SET-OPTIMIZER ( xt1 -- ) changes the last defined word (our forward word in this case) to call xt1 ( xt2 -- ) when this word is COMPILE,d, with xt2 being the xt passed to COMPILE,. In this case we compute foo-tc from it with >BODY CELL+. When, at a later time (and usually after the forward is resolved), the call is run, it calls UNFIXED-FORWARD, which uses the return address to find the call, and the forward word FOO, and the xt stored there. It patches the call to directly call the code of the xt. And it also runs the xt with EXECUTE-EXIT, which does what it says, except that it deals with the return address before performing the EXECUTE part. In order to find unresolved words, we also have: : .unresolved ( -- ) \G print all unresolved forward references [: [: replace-sourceview >r dup name>view @ to replace-sourceview dup is-forward? [: dup .name ." is unresolved" cr ;] ?warning r> to replace-sourceview drop true ;] swap traverse-wordlist ;] map-vocs ; I won't go into the details of this word, but note the use of quotations and TRAVERSE-WORDLIST. These are useful features. Why did we need three earlier versions? The first version defined forward words as CREATEd words with a special SET-OPTIMIZER behaviour. This is not a good idea, because what COMPILE, compiled was different from the behaviour of the word when EXECUTEd. And consequently, this did not work for gforth-itc with "," as COMPILE, (which ignores what SET-OPTIMIZER says). Another difference in the first version was that the calls were eagerly resolved, not lazily as above: The calls were organized as a linked list, and once the target became known, they were all patched. The second version defines FOO as an immediate create...does> child, again with eager resolution. This worked with the intelligent compile, as well as with ",", but did not work when the user did something like "['] foo compile," (because that sequence ignores the immediacy). The third version defines FOO as a colon definition followed by a cell that contains 0 or the resolution. The colon definition calls (a differently-defined) UNFIXED-FORWARD, which performs lazy patching. The downside of this approach was that every non-compiled use (e.g., through EXECUTE or a DEFERred word) would check every time whether it can patch the call, and find that it cannot; but next time it would incur that overhead again. So we decided to implement the fourth version described above, i.e., with DEFER, SET-OPTIMIZER and lazy patching. You can find it in . Note that this is a proper use of SET-OPTIMIZER (unlike in the first version): You can leave the SET-OPTIMIZER part away (e.g., by using "," as COMPILE,), and it will still work as intended (and the overhead is not too bad, as discussed in the beginning). - anton -- M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html New standard: http://www.forth200x.org/forth200x.html EuroForth 2018: http://www.euroforth.org/ef18/