Forth Tutorial
**************

   The difference of this chapter from the Introduction (*note
Introduction::) is that this tutorial is more fast-paced, should be
used while sitting in front of a computer, and covers much more
material, but does not explain how the Forth system works.

   This tutorial can be used with any ANS-compliant Forth; any
Gforth-specific features are marked as such and you can skip them if you
work with another Forth.  This tutorial does not explain all features of
Forth, just enough to get you started and give you some ideas about the
facilities available in Forth.  Read the rest of the manual and the
standard when you are through this.

   The intended way to use this tutorial is that you work through it
while sitting in front of the console, take a look at the examples and
predict what they will do, then try them out; if the outcome is not as
expected, find out why (e.g., by trying out variations of the example),
so you understand what's going on.  There are also some assignments
that you should solve.

   This tutorial assumes that you have programmed before and know what,
e.g., a loop is.

Starting Gforth
===============

   You can start Gforth by typing its name:

     gforth

   That puts you into interactive mode; you can leave Gforth by typing
`bye'.  While in Gforth, you can edit the command line and access the
command line history with cursor keys, similar to bash.

Syntax
======

   A "word" is a sequence of arbitrary characters (expcept white
space).  Words are separated by white space.  E.g., each of the
following lines contains exactly one word:

     word
     !@#$%^&*()
     1234567890
     5!a

   A frequent beginner's error is to leave away necessary white space,
resulting in an error like `Undefined word'; so if you see such an
error, check if you have put spaces wherever necessary.

     ." hello, world" \ correct
     ."hello, world"  \ gives an "Undefined word" error

   Gforth and most other Forth systems ignore differences in case (they
are case-insensitive), i.e., `word' is the same as `Word'.  If your
system is case-sensitive, you may have to type all the examples given
here in upper case.

Crash Course
============

   Type

     0 0 !
     here execute
     ' catch >body 20 erase abort
     ' (quit) >body 20 erase

   The last two examples are guaranteed to destroy parts of Gforth (and
most other systems), so you better leave Gforth afterwards (if it has
not finished by itself).  On some systems you may have to kill gforth
from outside (e.g., in Unix with `kill').

   Now that you know how to produce crashes (and that there's not much
to them), let's learn how to produce meaningful programs.

Stack
=====

   The most obvious feature of Forth is the stack.  When you type in a
number, it is pushed on the stack.  You can display the content of the
stack with `.s'.

     1 2 .s
     3 .s

   `.s' displays the top-of-stack to the right, i.e., the numbers
appear in `.s' output as they appeared in the input.

   You can print the top of stack element with `.'.

     1 2 3 . . .

   In general, words consume their stack arguments (`.s' is an
exception).

Assignment:
     What does the stack contain after `5 6 7 .'?

Arithmetics
===========

   The words `+', `-', `*', `/', and `mod' always operate on the top
two stack items:

     2 2 .s
     + .s
     .
     2 1 - .
     7 3 mod .

   The operands of `-', `/', and `mod' are in the same order as in the
corresponding infix expression (this is generally the case in Forth).

   Parentheses are superfluous (and not available), because the order of
the words unambiguously determines the order of evaluation and the
operands:

     3 4 + 5 * .
     3 4 5 * + .

Assignment:
     What are the infix expressions corresponding to the Forth code
     above?  Write `6-7*8+9' in Forth notation(1).

   To change the sign, use `negate':

     2 negate .

Assignment:
     Convert -(-3)*4-5 to Forth.

   `/mod' performs both `/' and `mod'.

     7 3 /mod . .

   Reference: *Note Arithmetic::.

   ---------- Footnotes ----------

   (1) This notation is also known as Postfix or RPN (Reverse Polish
Notation).

Stack Manipulation
==================

   Stack manipulation words rearrange the data on the stack.

     1 .s drop .s
     1 .s dup .s drop drop .s
     1 2 .s over .s drop drop drop
     1 2 .s swap .s drop drop
     1 2 3 .s rot .s drop drop drop

   These are the most important stack manipulation words.  There are
also variants that manipulate twice as many stack items:

     1 2 3 4 .s 2swap .s 2drop 2drop

   Two more stack manipulation words are:

     1 2 .s nip .s drop
     1 2 .s tuck .s 2drop drop

Assignment:
     Replace `nip' and `tuck' with combinations of other stack
     manipulation words.

          Given:          How do you get:
          1 2 3           3 2 1
          1 2 3           1 2 3 2
          1 2 3           1 2 3 3
          1 2 3           1 3 3
          1 2 3           2 1 3
          1 2 3 4         4 3 2 1
          1 2 3           1 2 3 1 2 3
          1 2 3 4         1 2 3 4 1 2
          1 2 3
          1 2 3           1 2 3 4
          1 2 3           1 3

     5 dup * .

Assignment:
     Write 17^3 and 17^4 in Forth, without writing `17' more than once.
     Write a piece of Forth code that expects two numbers on the stack
     (A and B, with B on top) and computes `(a-b)(a+1)'.

   Reference: *Note Stack Manipulation::.

Using files for Forth code
==========================

   While working at the Forth command line is convenient for one-line
examples and short one-off code, you probably want to store your source
code in files for convenient editing and persistence.  You can use your
favourite editor (Gforth includes Emacs support, *note Emacs and
Gforth::) to create FILE and use

     s" FILE" included

   to load it into your Forth system.  The file name extension I use for
Forth files is `.fs'.

   You can easily start Gforth with some files loaded like this:

     gforth FILE1 FILE2

   If an error occurs during loading these files, Gforth terminates,
whereas an error during `INCLUDED' within Gforth usually gives you a
Gforth command line.  Starting the Forth system every time gives you a
clean start every time, without interference from the results of earlier
tries.

   I often put all the tests in a file, then load the code and run the
tests with

     gforth CODE TESTS -e bye

   (often by performing this command with `C-x C-e' in Emacs).  The `-e
bye' ensures that Gforth terminates afterwards so that I can restart
this command without ado.

   The advantage of this approach is that the tests can be repeated
easily every time the program ist changed, making it easy to catch bugs
introduced by the change.

   Reference: *Note Forth source files::.

Comments
========

     \ That's a comment; it ends at the end of the line
     ( Another comment; it ends here: )  .s

   `\' and `(' are ordinary Forth words and therefore have to be
separated with white space from the following text.

     \This gives an "Undefined word" error

   The first `)' ends a comment started with `(', so you cannot nest
`('-comments; and you cannot comment out text containing a `)' with `(
... )'(1).

   I use `\'-comments for descriptive text and for commenting out code
of one or more line; I use `('-comments for describing the stack
effect, the stack contents, or for commenting out sub-line pieces of
code.

   The Emacs mode `gforth.el' (*note Emacs and Gforth::) supports these
uses by commenting out a region with `C-x \', uncommenting a region
with `C-u C-x \', and filling a `\'-commented region with `M-q'.

   Reference: *Note Comments::.

   ---------- Footnotes ----------

   (1) therefore it's a good idea to avoid `)' in word names.

Colon Definitions
=================

   are similar to procedures and functions in other programming
languages.

     : squared ( n -- n^2 )
        dup * ;
     5 squared .
     7 squared .

   `:' starts the colon definition; its name is `squared'.  The
following comment describes its stack effect.  The words `dup *' are
not executed, but compiled into the definition.  `;' ends the colon
definition.

   The newly-defined word can be used like any other word, including
using it in other definitions:

     : cubed ( n -- n^3 )
        dup squared * ;
     -5 cubed .
     : fourth-power ( n -- n^4 )
        squared squared ;
     3 fourth-power .

Assignment:
     Write colon definitions for `nip', `tuck', `negate', and `/mod' in
     terms of other Forth words, and check if they work (hint: test
     your tests on the originals first).  Don't let the
     `redefined'-Messages spook you, they are just warnings.

   Reference: *Note Colon Definitions::.

Decompilation
=============

   You can decompile colon definitions with `see':

     see squared
     see cubed

   In Gforth `see' shows you a reconstruction of the source code from
the executable code.  Informations that were present in the source, but
not in the executable code, are lost (e.g., comments).

   You can also decompile the predefined words:

     see .
     see +

Stack-Effect Comments
=====================

   By convention the comment after the name of a definition describes
the stack effect: The part in from of the `--' describes the state of
the stack before the execution of the definition, i.e., the parameters
that are passed into the colon definition; the part behind the `--' is
the state of the stack after the execution of the definition, i.e., the
results of the definition.  The stack comment only shows the top stack
items that the definition accesses and/or changes.

   You should put a correct stack effect on every definition, even if
it is just `( -- )'.  You should also add some descriptive comment to
more complicated words (I usually do this in the lines following `:').
If you don't do this, your code becomes unreadable (because you have to
work through every definition before you can undertsand any).

Assignment:
     The stack effect of `swap' can be written like this: `x1 x2 -- x2
     x1'.  Describe the stack effect of `-', `drop', `dup', `over',
     `rot', `nip', and `tuck'.  Hint: When you are done, you can
     compare your stack effects to those in this manual (*note Word
     Index::).

   Sometimes programmers put comments at various places in colon
definitions that describe the contents of the stack at that place (stack
comments); i.e., they are like the first part of a stack-effect
comment. E.g.,

     : cubed ( n -- n^3 )
        dup squared  ( n n^2 ) * ;

   In this case the stack comment is pretty superfluous, because the
word is simple enough.  If you think it would be a good idea to add
such a comment to increase readability, you should also consider
factoring the word into several simpler words (*note Factoring:
Factoring Tutorial.), which typically eliminates the need for the stack
comment; however, if you decide not to refactor it, then having such a
comment is better than not having it.

   The names of the stack items in stack-effect and stack comments in
the standard, in this manual, and in many programs specify the type
through a type prefix, similar to Fortran and Hungarian notation.  The
most frequent prefixes are:

`n'
     signed integer

`u'
     unsigned integer

`c'
     character

`f'
     Boolean flags, i.e. `false' or `true'.

`a-addr,a-'
     Cell-aligned address

`c-addr,c-'
     Char-aligned address (note that a Char may have two bytes in
     Windows NT)

`xt'
     Execution token, same size as Cell

`w,x'
     Cell, can contain an integer or an address.  It usually takes 32,
     64 or 16 bits (depending on your platform and Forth system). A
     cell is more commonly known as machine word, but the term _word_
     already means something different in Forth.

`d'
     signed double-cell integer

`ud'
     unsigned double-cell integer

`r'
     Float (on the FP stack)

   You can find a more complete list in *Note Notation::.

Assignment:
     Write stack-effect comments for all definitions you have written
     up to now.

Types
=====

   In Forth the names of the operations are not overloaded; so similar
operations on different types need different names; e.g., `+' adds
integers, and you have to use `f+' to add floating-point numbers.  The
following prefixes are often used for related operations on different
types:

`(none)'
     signed integer

`u'
     unsigned integer

`c'
     character

`d'
     signed double-cell integer

`ud, du'
     unsigned double-cell integer

`2'
     two cells (not-necessarily double-cell numbers)

`m, um'
     mixed single-cell and double-cell operations

`f'
     floating-point (note that in stack comments `f' represents flags,
     and `r' represents FP numbers).

   If there are no differences between the signed and the unsigned
variant (e.g., for `+'), there is only the prefix-less variant.

   Forth does not perform type checking, neither at compile time, nor at
run time.  If you use the wrong oeration, the data are interpreted
incorrectly:

     -1 u.

   If you have only experience with type-checked languages until now,
and have heard how important type-checking is, don't panic!  In my
experience (and that of other Forthers), type errors in Forth code are
usually easy to find (once you get used to it), the increased vigilance
of the programmer tends to catch some harder errors in addition to most
type errors, and you never have to work around the type system, so in
most situations the lack of type-checking seems to be a win (projects to
add type checking to Forth have not caught on).

Factoring
=========

   If you try to write longer definitions, you will soon find it hard to
keep track of the stack contents.  Therefore, good Forth programmers
tend to write only short definitions (e.g., three lines).  The art of
finding meaningful short definitions is known as factoring (as in
factoring polynomials).

   Well-factored programs offer additional advantages: smaller, more
general words, are easier to test and debug and can be reused more and
better than larger, specialized words.

   So, if you run into difficulties with stack management, when writing
code, try to define meaningful factors for the word, and define the word
in terms of those.  Even if a factor contains only two words, it is
often helpful.

   Good factoring is not easy, and it takes some practice to get the
knack for it; but even experienced Forth programmers often don't find
the right solution right away, but only when rewriting the program.
So, if you don't come up with a good solution immediately, keep trying,
don't despair.

Designing the stack effect
==========================

   In other languages you can use an arbitrary order of parameters for a
function; and since there is only one result, you don't have to deal
with the order of results, either.

   In Forth (and other stack-based languages, e.g., Postscript) the
parameter and result order of a definition is important and should be
designed well.  The general guideline is to design the stack effect such
that the word is simple to use in most cases, even if that complicates
the implementation of the word.  Some concrete rules are:

   * Words consume all of their parameters (e.g., `.').

   * If there is a convention on the order of parameters (e.g., from
     mathematics or another programming language), stick with it (e.g.,
     `-').

   * If one parameter usually requires only a short computation (e.g.,
     it is a constant), pass it on the top of the stack.  Conversely,
     parameters that usually require a long sequence of code to compute
     should be passed as the bottom (i.e., first) parameter.  This
     makes the code easier to read, because reader does not need to
     keep track of the bottom item through a long sequence of code (or,
     alternatively, through stack manipulations). E.g., `!' (store,
     *note Memory::) expects the address on top of the stack because it
     is usually simpler to compute than the stored value (often the
     address is just a variable).

   * Similarly, results that are usually consumed quickly should be
     returned on the top of stack, whereas a result that is often used
     in long computations should be passed as bottom result.  E.g., the
     file words like `open-file' return the error code on the top of
     stack, because it is usually consumed quickly by `throw';
     moreover, the error code has to be checked before doing anything
     with the other results.


   These rules are just general guidelines, don't lose sight of the
overall goal to make the words easy to use.  E.g., if the convention
rule conflicts with the computation-length rule, you might decide in
favour of the convention if the word will be used rarely, and in favour
of the computation-length rule if the word will be used frequently
(because with frequent use the cost of breaking the computation-length
rule would be quite high, and frequent use makes it easier to remember
an unconventional order).

Local Variables
===============

   You can define local variables (_locals_) in a colon definition:

     : swap { a b -- b a }
       b a ;
     1 2 swap .s 2drop

   (If your Forth system does not support this syntax, include
`compat/anslocals.fs' first).

   In this example `{ a b -- b a }' is the locals definition; it takes
two cells from the stack, puts the top of stack in `b' and the next
stack element in `a'.  `--' starts a comment ending with `}'.  After
the locals definition, using the name of the local will push its value
on the stack.  You can leave the comment part (`-- b a') away:

     : swap ( x1 x2 -- x2 x1 )
       { a b } b a ;

   In Gforth you can have several locals definitions, anywhere in a
colon definition; in contrast, in a standard program you can have only
one locals definition per colon definition, and that locals definition
must be outside any controll structure.

   With locals you can write slightly longer definitions without running
into stack trouble.  However, I recommend trying to write colon
definitions without locals for exercise purposes to help you gain the
essential factoring skills.

Assignment:
     Rewrite your definitions until now with locals

   Reference: *Note Locals::.

Conditional execution
=====================

   In Forth you can use control structures only inside colon
definitions.  An `if'-structure looks like this:

     : abs ( n1 -- +n2 )
         dup 0 < if
             negate
         endif ;
     5 abs .
     -5 abs .

   `if' takes a flag from the stack.  If the flag is non-zero (true),
the following code is performed, otherwise execution continues after the
`endif' (or `else').  `<' compares the top two stack elements and
prioduces a flag:

     1 2 < .
     2 1 < .
     1 1 < .

   Actually the standard name for `endif' is `then'.  This tutorial
presents the examples using `endif', because this is often less
confusing for people familiar with other programming languages where
`then' has a different meaning.  If your system does not have `endif',
define it with

     : endif postpone then ; immediate

   You can optionally use an `else'-part:

     : min ( n1 n2 -- n )
       2dup < if
         drop
       else
         nip
       endif ;
     2 3 min .
     3 2 min .

Assignment:
     Write `min' without `else'-part (hint: what's the definition of
     `nip'?).

   Reference: *Note Selection::.

Flags and Comparisons
=====================

   In a false-flag all bits are clear (0 when interpreted as integer).
In a canonical true-flag all bits are set (-1 as a twos-complement
signed integer); in many contexts (e.g., `if') any non-zero value is
treated as true flag.

     false .
     true .
     true hex u. decimal

   Comparison words produce canonical flags:

     1 1 = .
     1 0= .
     0 1 < .
     0 0 < .
     -1 1 u< . \ type error, u< interprets -1 as large unsigned number
     -1 1 < .

   Gforth supports all combinations of the prefixes `0 u d d0 du f f0'
(or none) and the comparisons `= <> < > <= >='.  Only a part of these
combinations are standard (for details see the standard, *Note Numeric
comparison::, *Note Floating Point:: or *Note Word Index::).

   You can use `and or xor invert' can be used as operations on
canonical flags.  Actually they are bitwise operations:

     1 2 and .
     1 2 or .
     1 3 xor .
     1 invert .

   You can convert a zero/non-zero flag into a canonical flag with
`0<>' (and complement it on the way with `0=').

     1 0= .
     1 0<> .

   You can use the all-bits-set feature of canonical flags and the
bitwise operation of the Boolean operations to avoid `if's:

     : foo ( n1 -- n2 )
       0= if
         14
       else
         0
       endif ;
     0 foo .
     1 foo .
     
     : foo ( n1 -- n2 )
       0= 14 and ;
     0 foo .
     1 foo .

Assignment:
     Write `min' without `if'.

   For reference, see *Note Boolean Flags::, *Note Numeric
comparison::, and *Note Bitwise operations::.

General Loops
=============

   The endless loop is the most simple one:

     : endless ( -- )
       0 begin
         dup . 1+
       again ;
     endless

   Terminate this loop by pressing `Ctrl-C' (in Gforth).  `begin' does
nothing at run-time, `again' jumps back to `begin'.

   A loop with one exit at any place looks like this:

     : log2 ( +n1 -- n2 )
     \ logarithmus dualis of n1>0, rounded down to the next integer
       assert( dup 0> )
       2/ 0 begin
         over 0> while
           1+ swap 2/ swap
       repeat
       nip ;
     7 log2 .
     8 log2 .

   At run-time `while' consumes a flag; if it is 0, execution continues
behind the `repeat'; if the flag is non-zero, execution continues
behind the `while'.  `Repeat' jumps back to `begin', just like `again'.

   In Forth there are many combinations/abbreviations, like `1+'.
However, `2/' is not one of them; it shifts it's argument right by one
bit (arithmetic shift right):

     -5 2 / .
     -5 2/ .

   `assert(' is no standard word, but you can get it on systems other
then Gforth by including `compat/assert.fs'.  You can see what it does
by trying

     0 log2 .

   Here's a loop with an exit at the end:

     : log2 ( +n1 -- n2 )
     \ logarithmus dualis of n1>0, rounded down to the next integer
       assert( dup 0 > )
       -1 begin
         1+ swap 2/ swap
         over 0 <=
       until
       nip ;

   `Until' consumes a flag; if it is non-zero, execution continues at
the `begin', otherwise after the `until'.

Assignment:
     Write a definition for computing the greatest common divisor.

   Reference: *Note Simple Loops::.

Counted loops
=============

     : ^ ( n1 u -- n )
     \ n = the uth power of u1
       1 swap 0 u+do
         over *
       loop
       nip ;
     3 2 ^ .
     4 3 ^ .

   `U+do' (from `compat/loops.fs', if your Forth system doesn't have
it) takes two numbers of the stack `( u3 u4 -- )', and then performs
the code between `u+do' and `loop' for `u3-u4' times (or not at all, if
`u3-u4<0').

   You can see the stack effect design rules at work in the stack
effect of the loop start words: Since the start value of the loop is
more frequently constant than the end value, the start value is passed
on the top-of-stack.

   You can access the counter of a counted loop with `i':

     : fac ( u -- u! )
       1 swap 1+ 1 u+do
         i *
       loop ;
     5 fac .
     7 fac .

   There is also `+do', which expects signed numbers (important for
deciding whether to enter the loop).

Assignment:
     Write a definition for computing the nth Fibonacci number.

   You can also use increments other than 1:

     : up2 ( n1 n2 -- )
       +do
         i .
       2 +loop ;
     10 0 up2
     
     : down2 ( n1 n2 -- )
       -do
         i .
       2 -loop ;
     0 10 down2

   Reference: *Note Counted Loops::.

Recursion
=========

   Usually the name of a definition is not visible in the definition;
but earlier definitions are usually visible:

     1 0 / . \ "Floating-point unidentified fault" in Gforth on most platforms
     : / ( n1 n2 -- n )
       dup 0= if
         -10 throw \ report division by zero
       endif
       /           \ old version
     ;
     1 0 /

   For recursive definitions you can use `recursive' (non-standard) or
`recurse':

     : fac1 ( n -- n! ) recursive
      dup 0> if
        dup 1- fac1 *
      else
        drop 1
      endif ;
     7 fac1 .
     
     : fac2 ( n -- n! )
      dup 0> if
        dup 1- recurse *
      else
        drop 1
      endif ;
     8 fac2 .

Assignment:
     Write a recursive definition for computing the nth Fibonacci
     number.

   Reference (including indirect recursion): *Note Calls and returns::.

Leaving definitions or loops
============================

   `EXIT' exits the current definition right away.  For every counted
loop that is left in this way, an `UNLOOP' has to be performed before
the `EXIT':

     : ...
      ... u+do
        ... if
          ... unloop exit
        endif
        ...
      loop
      ... ;

   `LEAVE' leaves the innermost counted loop right away:

     : ...
      ... u+do
        ... if
          ... leave
        endif
        ...
      loop
      ... ;

   Reference: *Note Calls and returns::, *Note Counted Loops::.

Return Stack
============

   In addition to the data stack Forth also has a second stack, the
return stack; most Forth systems store the return addresses of
procedure calls there (thus its name).  Programmers can also use this
stack:

     : foo ( n1 n2 -- )
      .s
      >r .s
      r@ .
      >r .s
      r@ .
      r> .
      r@ .
      r> . ;
     1 2 foo

   `>r' takes an element from the data stack and pushes it onto the
return stack; conversely, `r>' moves an elementm from the return to the
data stack; `r@' pushes a copy of the top of the return stack on the
return stack.

   Forth programmers usually use the return stack for storing data
temporarily, if using the data stack alone would be too complex, and
factoring and locals are not an option:

     : 2swap ( x1 x2 x3 x4 -- x3 x4 x1 x2 )
      rot >r rot r> ;

   The return address of the definition and the loop control parameters
of counted loops usually reside on the return stack, so you have to take
all items, that you have pushed on the return stack in a colon
definition or counted loop, from the return stack before the definition
or loop ends.  You cannot access items that you pushed on the return
stack outside some definition or loop within the definition of loop.

   If you miscount the return stack items, this usually ends in a crash:

     : crash ( n -- )
       >r ;
     5 crash

   You cannot mix using locals and using the return stack (according to
the standard; Gforth has no problem).  However, they solve the same
problems, so this shouldn't be an issue.

Assignment:
     Can you rewrite any of the definitions you wrote until now in a
     better way using the return stack?

   Reference: *Note Return stack::.

Memory
======

   You can create a global variable `v' with

     variable v ( -- addr )

   `v' pushes the address of a cell in memory on the stack.  This cell
was reserved by `variable'.  You can use `!' (store) to store values
into this cell and `@' (fetch) to load the value from the stack into
memory:

     v .
     5 v ! .s
     v @ .

   You can see a raw dump of memory with `dump':

     v 1 cells .s dump

   `Cells ( n1 -- n2 )' gives you the number of bytes (or, more
generally, address units (aus)) that `n1 cells' occupy.  You can also
reserve more memory:

     create v2 20 cells allot
     v2 20 cells dump

   creates a word `v2' and reserves 20 uninitialized cells; the address
pushed by `v2' points to the start of these 20 cells.  You can use
address arithmetic to access these cells:

     3 v2 5 cells + !
     v2 20 cells dump

   You can reserve and initialize memory with `,':

     create v3
       5 , 4 , 3 , 2 , 1 ,
     v3 @ .
     v3 cell+ @ .
     v3 2 cells + @ .
     v3 5 cells dump

Assignment:
     Write a definition `vsum ( addr u -- n )' that computes the sum of
     `u' cells, with the first of these cells at `addr', the next one
     at `addr cell+' etc.

   You can also reserve memory without creating a new word:

     here 10 cells allot .
     here .

   `Here' pushes the start address of the memory area.  You should
store it somewhere, or you will have a hard time finding the memory area
again.

   `Allot' manages dictionary memory.  The dictionary memory contains
the system's data structures for words etc. on Gforth and most other
Forth systems.  It is managed like a stack: You can free the memory that
you have just `allot'ed with

     -10 cells allot
     here .

   Note that you cannot do this if you have created a new word in the
meantime (because then your `allot'ed memory is no longer on the top of
the dictionary "stack").

   Alternatively, you can use `allocate' and `free' which allow freeing
memory in any order:

     10 cells allocate throw .s
     20 cells allocate throw .s
     swap
     free throw
     free throw

   The `throw's deal with errors (e.g., out of memory).

   And there is also a garbage collector
(http://www.complang.tuwien.ac.at/forth/garbage-collection.zip), which
eliminates the need to `free' memory explicitly.

   Reference: *Note Memory::.

Characters and Strings
======================

   On the stack characters take up a cell, like numbers.  In memory they
have their own size (one 8-bit byte on most systems), and therefore
require their own words for memory access:

     create v4
       104 c, 97 c, 108 c, 108 c, 111 c,
     v4 4 chars + c@ .
     v4 5 chars dump

   The preferred representation of strings on the stack is `addr
u-count', where `addr' is the address of the first character and
`u-count' is the number of characters in the string.

     v4 5 type

   You get a string constant with

     s" hello, world" .s
     type

   Make sure you have a space between `s"' and the string; `s"' is a
normal Forth word and must be delimited with white space (try what
happens when you remove the space).

   However, this interpretive use of `s"' is quite restricted: the
string exists only until the next call of `s"' (some Forth systems keep
more than one of these strings, but usually they still have a limited
lifetime).

     s" hello," s" world" .s
     type
     type

   You can also use `s"' in a definition, and the resulting strings
then live forever (well, for as long as the definition):

     : foo s" hello," s" world" ;
     foo .s
     type
     type

Assignment:
     `Emit ( c -- )' types `c' as character (not a number).  Implement
     `type ( addr u -- )'.

   Reference: *Note Memory Blocks::.

Alignment
=========

   On many processors cells have to be aligned in memory, if you want to
access them with `@' and `!' (and even if the processor does not
require alignment, access to aligned cells is faster).

   `Create' aligns `here' (i.e., the place where the next allocation
will occur, and that the `create'd word points to).  Likewise, the
memory produced by `allocate' starts at an aligned address.  Adding a
number of `cells' to an aligned address produces another aligned
address.

   However, address arithmetic involving `char+' and `chars' can create
an address that is not cell-aligned.  `Aligned ( addr -- a-addr )'
produces the next aligned address:

     v3 char+ aligned .s @ .
     v3 char+ .s @ .

   Similarly, `align' advances `here' to the next aligned address:

     create v5 97 c,
     here .
     align here .
     1000 ,

   Note that you should use aligned addresses even if your processor
does not require them, if you want your program to be portable.

   Reference: *Note Address arithmetic::.

Files
=====

   This section gives a short introduction into how to use files inside
Forth. It's broken up into five easy steps:

  1. Opened an ASCII text file for input

  2. Opened a file for output

  3. Read input file until string matched (or some other condition
     matched)

  4. Wrote some lines from input ( modified or not) to output

  5. Closed the files.

Open file for input
-------------------

     s" foo.in"  r/o open-file throw Value fd-in

Create file for output
----------------------

     s" foo.out" w/o create-file throw Value fd-out

   The available file modes are r/o for read-only access, r/w for
read-write access, and w/o for write-only access. You could open both
files with r/w, too, if you like. All file words return error codes; for
most applications, it's best to pass there error codes with `throw' to
the outer error handler.

   If you want words for opening and assigning, define them as follows:

     0 Value fd-in
     0 Value fd-out
     : open-input ( addr u -- )  r/o open-file throw to fd-in ;
     : open-output ( addr u -- )  w/o create-file throw to fd-out ;

   Usage example:

     s" foo.in" open-input
     s" foo.out" open-output

Scan file for a particular line
-------------------------------

     256 Constant max-line
     Create line-buffer  max-line 2 + allot
     
     : scan-file ( addr u -- )
       begin
           line-buffer max-line fd-in read-line throw
       while
              >r 2dup line-buffer r> compare 0=
          until
       else
          drop
       then
       2drop ;

   `read-line ( addr u1 fd -- u2 flag ior )' reads up to u1 bytes into
the buffer at addr, and returns the number of bytes read, a flag that's
true when the end of file is reached, and an error code.

   `compare ( addr1 u1 addr2 u2 -- n )' compares two strings and
returns zero if both strings are equal. It returns a positive number if
the first string is lexically greater, a negative if the second string
is lexically greater.

   We haven't seen this loop here; it has two exits. Since the `while'
exits with the number of bytes read on the stack, we have to clean up
that separately; that's after the `else'.

   Usage example:

     s" The text I search is here" scan-file

Copy input to output
--------------------

     : copy-file ( -- )
       begin
           line-buffer max-line fd-in read-line throw
       while
           line-buffer swap fd-out write-file throw
       repeat ;

Close files
-----------

     fd-in close-file throw
     fd-out close-file throw

   Likewise, you can put that into definitions, too:

     : close-input ( -- )  fd-in close-file throw ;
     : close-output ( -- )  fd-out close-file throw ;

Assignment:
     How could you modify `copy-file' so that it copies until a second
     line is matched? Can you write a program that extracts a section
     of a text file, given the line that starts and the line that
     terminates that section?

Interpretation and Compilation Semantics and Immediacy
======================================================

   When a word is compiled, it behaves differently from being
interpreted.  E.g., consider `+':

     1 2 + .
     : foo + ;

   These two behaviours are known as compilation and interpretation
semantics.  For normal words (e.g., `+'), the compilation semantics is
to append the interpretation semantics to the currently defined word
(`foo' in the example above).  I.e., when `foo' is executed later, the
interpretation semantics of `+' (i.e., adding two numbers) will be
performed.

   However, there are words with non-default compilation semantics,
e.g., the control-flow words like `if'.  You can use `immediate' to
change the compilation semantics of the last defined word to be equal to
the interpretation semantics:

     : [FOO] ( -- )
      5 . ; immediate
     
     [FOO]
     : bar ( -- )
       [FOO] ;
     bar
     see bar

   Two conventions to mark words with non-default compilation semnatics
are names with brackets (more frequently used) and to write them all in
upper case (less frequently used).

   In Gforth (and many other systems) you can also remove the
interpretation semantics with `compile-only' (the compilation semantics
is derived from the original interpretation semantics):

     : flip ( -- )
      6 . ; compile-only \ but not immediate
     flip
     
     : flop ( -- )
      flip ;
     flop

   In this example the interpretation semantics of `flop' is equal to
the original interpretation semantics of `flip'.

   The text interpreter has two states: in interpret state, it performs
the interpretation semantics of words it encounters; in compile state,
it performs the compilation semantics of these words.

   Among other things, `:' switches into compile state, and `;'
switches back to interpret state.  They contain the factors `]' (switch
to compile state) and `[' (switch to interpret state), that do nothing
but switch the state.

     : xxx ( -- )
       [ 5 . ]
     ;
     
     xxx
     see xxx

   These brackets are also the source of the naming convention mentioned
above.

   Reference: *Note Interpretation and Compilation Semantics::.

Execution Tokens
================

   `' word' gives you the execution token (XT) of a word.  The XT is a
cell representing the interpretation semantics of a word.  You can
execute this semantics with `execute':

     ' + .s
     1 2 rot execute .

   The XT is similar to a function pointer in C.  However, parameter
passing through the stack makes it a little more flexible:

     : map-array ( ... addr u xt -- ... )
     \ executes xt ( ... x -- ... ) for every element of the array starting
     \ at addr and containing u elements
       { xt }
       cells over + swap ?do
         i @ xt execute
       1 cells +loop ;
     
     create a 3 , 4 , 2 , -1 , 4 ,
     a 5 ' . map-array .s
     0 a 5 ' + map-array .
     s" max-n" environment? drop .s
     a 5 ' min map-array .

   You can use map-array with the XTs of words that consume one element
more than they produce.  In theory you can also use it with other XTs,
but the stack effect then depends on the size of the array, which is
hard to understand.

   Since XTs are cell-sized, you can store them in memory and manipulate
them on the stack like other cells.  You can also compile the XT into a
word with `compile,':

     : foo1 ( n1 n2 -- n )
        [ ' + compile, ] ;
     see foo

   This is non-standard, because `compile,' has no compilation
semantics in the standard, but it works in good Forth systems.  For the
broken ones, use

     : [compile,] compile, ; immediate
     
     : foo1 ( n1 n2 -- n )
        [ ' + ] [compile,] ;
     see foo

   `'' is a word with default compilation semantics; it parses the next
word when its interpretation semantics are executed, not during
compilation:

     : foo ( -- xt )
       ' ;
     see foo
     : bar ( ... "word" -- ... )
       ' execute ;
     see bar
     1 2 bar + .

   You often want to parse a word during compilation and compile its XT
so it will be pushed on the stack at run-time.  `[']' does this:

     : xt-+ ( -- xt )
       ['] + ;
     see xt-+
     1 2 xt-+ execute .

   Many programmers tend to see `'' and the word it parses as one unit,
and expect it to behave like `[']' when compiled, and are confused by
the actual behaviour.  If you are, just remember that the Forth system
just takes `'' as one unit and has no idea that it is a parsing word
(attempts to convenience programmers in this issue have usually
resulted in even worse pitfalls, see `State'-smartness--Why it is evil
and How to Exorcise it
(http://www.complang.tuwien.ac.at/papers/ertl98.ps.gz)).

   Note that the state of the interpreter does not come into play when
creating and executing XTs.  I.e., even when you execute `'' in compile
state, it still gives you the interpretation semantics.  And whatever
that state is, `execute' performs the semantics represented by the XT
(i.e., for XTs produced with `'' the interpretation semantics).

   Reference: *Note Tokens for Words::.

Exceptions
==========

   `throw ( n -- )' causes an exception unless n is zero.

     100 throw .s
     0 throw .s

   `catch ( ... xt -- ... n )' behaves similar to `execute', but it
catches exceptions and pushes the number of the exception on the stack
(or 0, if the xt executed without exception).  If there was an
exception, the stacks have the same depth as when entering `catch':

     .s
     3 0 ' / catch .s
     3 2 ' / catch .s

Assignment:
     Try the same with `execute' instead of `catch'.

   `Throw' always jumps to the dynamically next enclosing `catch', even
if it has to leave several call levels to achieve this:

     : foo 100 throw ;
     : foo1 foo ." after foo" ;
     : bar ['] foo1 catch ;
     bar .

   It is often important to restore a value upon leaving a definition,
even if the definition is left through an exception.  You can ensure
this like this:

     : ...
        save-x
        ['] word-changing-x catch ( ... n )
        restore-x
        ( ... n ) throw ;

   Gforth provides an alternative syntax in addition to `catch': `try
... recover ... endtry'.  If the code between `try' and `recover' has
an exception, the stack depths are restored, the exception number is
pushed on the stack, and the code between `recover' and `endtry' is
performed.  E.g., the definition for `catch' is

     : catch ( x1 .. xn xt -- y1 .. ym 0 / z1 .. zn error ) \ exception
       try
         execute 0
       recover
         nip
       endtry ;

   The equivalent to the restoration code above is

     : ...
       save-x
       try
         word-changing-x
       end-try
       restore-x
       throw ;

   As you can see, the `recover' part is optional.

   Reference: *Note Exception Handling::.

Defining Words
==============

   `:', `create', and `variable' are definition words: They define
other words.  `Constant' is another definition word:

     5 constant foo
     foo .

   You can also use the prefixes `2' (double-cell) and `f' (floating
point) with `variable' and `constant'.

   You can also define your own defining words.  E.g.:

     : variable ( "name" -- )
       create 0 , ;

   You can also define defining words that create words that do
something other than just producing their address:

     : constant ( n "name" -- )
       create ,
     does> ( -- n )
       ( addr ) @ ;
     
     5 constant foo
     foo .

   The definition of `constant' above ends at the `does>'; i.e.,
`does>' replaces `;', but it also does something else: It changes the
last defined word such that it pushes the address of the body of the
word and then performs the code after the `does>' whenever it is called.

   In the example above, `constant' uses `,' to store 5 into the body
of `foo'.  When `foo' executes, it pushes the address of the body onto
the stack, then (in the code after the `does>') fetches the 5 from
there.

   The stack comment near the `does>' reflects the stack effect of the
defined word, not the stack effect of the code after the `does>' (the
difference is that the code expects the address of the body that the
stack comment does not show).

   You can use these definition words to do factoring in cases that
involve (other) definition words.  E.g., a field offset is always added
to an address.  Instead of defining

     2 cells constant offset-field1

   and using this like

     ( addr ) offset-field1 +

   you can define a definition word

     : simple-field ( n "name" -- )
       create ,
     does> ( n1 -- n1+n )
       ( addr ) @ + ;

   Definition and use of field offsets now look like this:

     2 cells simple-field field1
     create mystruct 4 cells allot
     mystruct .s field1 .s drop

   If you want to do something with the word without performing the code
after the `does>', you can access the body of a `create'd word with
`>body ( xt -- addr )':

     : value ( n "name" -- )
       create ,
     does> ( -- n1 )
       @ ;
     : to ( n "name" -- )
       ' >body ! ;
     
     5 value foo
     foo .
     7 to foo
     foo .

Assignment:
     Define `defer ( "name" -- )', which creates a word that stores an
     XT (at the start the XT of `abort'), and upon execution `execute's
     the XT.  Define `is ( xt "name" -- )' that stores `xt' into
     `name', a word defined with `defer'.  Indirect recursion is one
     application of `defer'.

   Reference: *Note User-defined Defining Words::.

Arrays and Records
==================

   Forth has no standard words for defining data structures such as
arrays and records (structs in C terminology), but you can build them
yourself based on address arithmetic.  You can also define words for
defining arrays and records (*note Defining Words: Defining Words
Tutorial.).

   One of the first projects a Forth newcomer sets out upon when
learning about defining words is an array defining word (possibly for
n-dimensional arrays).  Go ahead and do it, I did it, too; you will
learn something from it.  However, don't be disappointed when you later
learn that you have little use for these words (inappropriate use would
be even worse).  I have not yet found a set of useful array words yet;
the needs are just too diverse, and named, global arrays (the result of
naive use of defining words) are often not flexible enough (e.g.,
consider how to pass them as parameters).  Another such project is a set
of words to help dealing with strings.

   On the other hand, there is a useful set of record words, and it has
been defined in `compat/struct.fs'; these words are predefined in
Gforth.  They are explained in depth elsewhere in this manual (see
*note Structures::).  The `simple-field' example above is simplified
variant of fields in this package.

`POSTPONE'
==========

   You can compile the compilation semantics (instead of compiling the
interpretation semantics) of a word with `POSTPONE':

     : MY-+ ( Compilation: -- ; Run-time of compiled code: n1 n2 -- n )
      POSTPONE + ; immediate
     : foo ( n1 n2 -- n )
      MY-+ ;
     1 2 foo .
     see foo

   During the definition of `foo' the text interpreter performs the
compilation semantics of `MY-+', which performs the compilation
semantics of `+', i.e., it compiles `+' into `foo'.

   This example also displays separate stack comments for the
compilation semantics and for the stack effect of the compiled code.
For words with default compilation semantics these stack effects are
usually not displayed; the stack effect of the compilation semantics is
always `( -- )' for these words, the stack effect for the compiled code
is the stack effect of the interpretation semantics.

   Note that the state of the interpreter does not come into play when
performing the compilation semantics in this way.  You can also perform
it interpretively, e.g.:

     : foo2 ( n1 n2 -- n )
      [ MY-+ ] ;
     1 2 foo .
     see foo

   However, there are some broken Forth systems where this does not
always work, and therefore this practice was been declared non-standard
in 1999.

   Here is another example for using `POSTPONE':

     : MY-- ( Compilation: -- ; Run-time of compiled code: n1 n2 -- n )
      POSTPONE negate POSTPONE + ; immediate compile-only
     : bar ( n1 n2 -- n )
       MY-- ;
     2 1 bar .
     see bar

   You can define `ENDIF' in this way:

     : ENDIF ( Compilation: orig -- )
       POSTPONE then ; immediate

Assignment:
     Write `MY-2DUP' that has compilation semantics equivalent to
     `2dup', but compiles `over over'.

`Literal'
=========

   You cannot `POSTPONE' numbers:

     : [FOO] POSTPONE 500 ; immediate

   Instead, you can use `LITERAL (compilation: n --; run-time: -- n )':

     : [FOO] ( compilation: --; run-time: -- n )
       500 POSTPONE literal ; immediate
     
     : flip [FOO] ;
     flip .
     see flip

   `LITERAL' consumes a number at compile-time (when it's compilation
semantics are executed) and pushes it at run-time (when the code it
compiled is executed).  A frequent use of `LITERAL' is to compile a
number computed at compile time into the current word:

     : bar ( -- n )
       [ 2 2 + ] literal ;
     see bar

Assignment:
     Write `]L' which allows writing the example above as `: bar ( -- n
     ) [ 2 2 + ]L ;'

Advanced macros
===============

   Reconsider `map-array' from *Note Execution Tokens: Execution Tokens
Tutorial.  It frequently performs `execute', a relatively expensive
operation in some Forth implementations.  You can use `compile,' and
`POSTPONE' to eliminate these `execute's and produce a word that
contains the word to be performed directly:

     : compile-map-array ( compilation: xt -- ; run-time: ... addr u -- ... )
     \ at run-time, execute xt ( ... x -- ... ) for each element of the
     \ array beginning at addr and containing u elements
       { xt }
       POSTPONE cells POSTPONE over POSTPONE + POSTPONE swap POSTPONE ?do
         POSTPONE i POSTPONE @ xt compile,
       1 cells POSTPONE literal POSTPONE +loop ;
     
     : sum-array ( addr u -- n )
      0 rot rot [ ' + compile-map-array ] ;
     see sum-array
     a 5 sum-array .

   You can use the full power of Forth for generating the code; here's
an example where the code is generated in a loop:

     : compile-vmul-step ( compilation: n --; run-time: n1 addr1 -- n2 addr2 )
     \ n2=n1+(addr1)*n, addr2=addr1+cell
       POSTPONE tuck POSTPONE @
       POSTPONE literal POSTPONE * POSTPONE +
       POSTPONE swap POSTPONE cell+ ;
     
     : compile-vmul ( compilation: addr1 u -- ; run-time: addr2 -- n )
     \ n=v1*v2 (inner product), where the v_i are represented as addr_i u
       0 postpone literal postpone swap
       [ ' compile-vmul-step compile-map-array ]
       postpone drop ;
     see compile-vmul
     
     : a-vmul ( addr -- n )
     \ n=a*v, where v is a vector that's as long as a and starts at addr
      [ a 5 compile-vmul ] ;
     see a-vmul
     a a-vmul .

   This example uses `compile-map-array' to show off, but you could
also use `map-array' instead (try it now!).

   You can use this technique for efficient multiplication of large
matrices.  In matrix multiplication, you multiply every line of one
matrix with every column of the other matrix.  You can generate the code
for one line once, and use it for every column.  The only downside of
this technique is that it is cumbersome to recover the memory consumed
by the generated code when you are done (and in more complicated cases
it is not possible portably).

Compilation Tokens
==================

   This section is Gforth-specific.  You can skip it.

   `' word compile,' compiles the interpretation semantics.  For words
with default compilation semantics this is the same as performing the
compilation semantics.  To represent the compilation semantics of other
words (e.g., words like `if' that have no interpretation semantics),
Gforth has the concept of a compilation token (CT, consisting of two
cells), and words `comp'' and `[comp']'.  You can perform the
compilation semantics represented by a CT with `execute':

     : foo2 ( n1 n2 -- n )
        [ comp' + execute ] ;
     see foo

   You can compile the compilation semantics represented by a CT with
`postpone,':

     : foo3 ( -- )
       [ comp' + postpone, ] ;
     see foo3

   `[ comp' word postpone, ]' is equivalent to `POSTPONE word'.
`comp'' is particularly useful for words that have no interpretation
semantics:

     ' if
     comp' if .s 2drop

   Reference: *Note Tokens for Words::.

Wordlists and Search Order
==========================

   The dictionary is not just a memory area that allows you to allocate
memory with `allot', it also contains the Forth words, arranged in
several wordlists.  When searching for a word in a wordlist,
conceptually you start searching at the youngest and proceed towards
older words (in reality most systems nowadays use hash-tables); i.e., if
you define a word with the same name as an older word, the new word
shadows the older word.

   Which wordlists are searched in which order is determined by the
search order.  You can display the search order with `order'.  It
displays first the search order, starting with the wordlist searched
first, then it displays the wordlist that will contain newly defined
words.

   You can create a new, empty wordlist with `wordlist ( -- wid )':

     wordlist constant mywords

   `Set-current ( wid -- )' sets the wordlist that will contain newly
defined words (the _current_ wordlist):

     mywords set-current
     order

   Gforth does not display a name for the wordlist in `mywords' because
this wordlist was created anonymously with `wordlist'.

   You can get the current wordlist with `get-current ( -- wid)'.  If
you want to put something into a specific wordlist without overall
effect on the current wordlist, this typically looks like this:

     get-current mywords set-current ( wid )
     create someword
     ( wid ) set-current

   You can write the search order with `set-order ( wid1 .. widn n --
)' and read it with `get-order ( -- wid1 .. widn n )'.  The first
searched wordlist is topmost.

     get-order mywords swap 1+ set-order
     order

   Yes, the order of wordlists in the output of `order' is reversed
from stack comments and the output of `.s' and thus unintuitive.

Assignment:
     Define `>order ( wid -- )' with adds `wid' as first searched
     wordlist to the search order.  Define `previous ( -- )', which
     removes the first searched wordlist from the search order.
     Experiment with boundary conditions (you will see some crashes or
     situations that are hard or impossible to leave).

   The search order is a powerful foundation for providing features
similar to Modula-2 modules and C++ namespaces.  However, trying to
modularize programs in this way has disadvantages for debugging and
reuse/factoring that overcome the advantages in my experience (I don't
do huge projects, though).  These disadvantages are not so clear in
other languages/programming environments, because these languages are
not so strong in debugging and reuse.

   Reference: *Note Word Lists::.
