

5.18 Blocks

When you run Gforth on a modern desktop computer, it runs under the control of an operating system that provides certain services. One of these is file services, which allow Forth source code and data to be stored in files and read into Gforth (see Files).

Traditionally, Forth has been an important programming language on systems where it has interfaced directly to the underlying hardware with no intervening operating system. Forth provides a mechanism, called blocks, for accessing mass storage on such systems.

A block is a 1024-byte data area, which can be used to hold data or Forth source code. No structure is imposed on the contents of the block. A block is identified by its number; blocks are numbered contiguously from 1 to an implementation-defined maximum.

A typical system that used blocks but no operating system might use a single floppy-disk drive for mass storage, with the disks formatted to provide 256-byte sectors. Blocks would be implemented by assigning the first four sectors of the disk to block 1, the second four sectors to block 2 and so on, up to the limit of the capacity of the disk. The disk would not contain any file system information, just the set of blocks.
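The sector arithmetic for such a disk can be sketched in a few lines. The names sectors/block and block>sector are hypothetical helpers, not Gforth words, and the sketch assumes that sectors are counted from 0 while blocks are counted from 1.

```forth
\ Hypothetical helper, not part of Gforth: map a block number (counted
\ from 1) to the first of its four 256-byte sectors (counted from 0).
4 constant sectors/block

: block>sector ( u -- sector )
    1- sectors/block * ;

\ Block 1 starts at sector 0, block 2 at sector 4, block 3 at sector 8.
```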

On systems that do provide file services, blocks are typically implemented by storing a sequence of blocks within a single blocks file. The size of the blocks file will be an exact multiple of 1024 bytes, corresponding to the number of blocks it contains. This is the mechanism that Gforth uses.

Only one blocks file can be open at a time. If you use block words without having specified a blocks file, Gforth defaults to the blocks file blocks.fb. Gforth uses the Forth search path when attempting to locate a blocks file (see Source Search Paths).

When you read and write blocks under program control, Gforth uses a number of block buffers as intermediate storage. These buffers are not used when you use load to interpret the contents of a block.

The behaviour of the block buffers is analogous to that of a cache. Each block buffer is in one of three states: unassigned, assigned-clean or assigned-dirty.

Initially, all block buffers are unassigned. In order to access a block, the block (specified by its block number) must be assigned to a block buffer.

The assignment of a block to a block buffer is performed by block or buffer. Use block when you wish to modify the existing contents of a block. Use buffer when you don't care about the existing contents of the block [1].

Once a block has been assigned to a block buffer using block or buffer, that block buffer becomes the current block buffer. Data may only be manipulated (read or written) within the current block buffer.

When the contents of the current block buffer have been modified it is necessary, before calling block or buffer again, to either abandon the changes (by doing nothing) or mark the block as changed (assigned-dirty), using update. Using update does not change the blocks file; it simply changes a block buffer's state to assigned-dirty. The block will be written implicitly when its buffer is needed for another block, or explicitly by flush or save-buffers.
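The modify, update and write-back sequence might look like the following sketch. The file name scratch.fb and the word stamp are assumptions made for illustration; neither is part of Gforth.

```forth
\ Assumptions: scratch.fb is a hypothetical blocks file name, and
\ stamp is an illustrative word, not part of Gforth.
use scratch.fb

: stamp ( u -- )
    block                 \ assign block u to a buffer ( a-addr )
    s" hello"             \ ( a-addr c-addr 5 )
    rot swap              \ ( c-addr a-addr 5 )
    move                  \ copy the text to the start of the buffer
    update ;              \ mark the buffer assigned-dirty

1 stamp   \ the change exists only in the block buffer so far
flush     \ write the assigned-dirty buffer back to the blocks file
```

Leaving out update would abandon the change: the buffer would stay assigned-clean and flush would not write it.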

The word flush writes all assigned-dirty blocks back to the blocks file on disk. Leaving Gforth with bye also performs a flush.

In Gforth, block and buffer use a direct-mapped algorithm to assign a block buffer to a block. That means that any particular block can only be assigned to one specific block buffer, called (for the particular operation) the victim buffer. If the victim buffer is unassigned or assigned-clean it is allocated to the new block immediately. If it is assigned-dirty its current contents are written back to the blocks file on disk before it is allocated to the new block.
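One common way to realise a direct-mapped assignment is to take the block number modulo the number of buffers. The sketch below assumes that mapping and a buffer count of 8; both are illustrative choices, and the names #buffers and victim are hypothetical.

```forth
\ Assumption: 8 buffers and a modulo mapping; both are illustrative
\ choices, not values taken from the Gforth sources.
8 constant #buffers

: victim ( u -- index )
    #buffers mod ;

\ Blocks 3 and 11 compete for the same buffer: 3 victim and 11 victim
\ both give 3.
```

Under this assumption, alternating between two blocks that share a victim buffer forces a write-back (if dirty) and a re-read on every switch.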

Although no structure is imposed on the contents of a block, it is traditional to display the contents as 16 lines each of 64 characters. A block provides a single, continuous stream of input (for example, it acts as a single parse area) – there are no end-of-line characters within a block, and no end-of-file character at the end of a block. There are two consequences of this:

In Gforth, when you use block with a non-existent block number, the current blocks file will be extended to the appropriate size and the block buffer will be initialised with spaces.

Gforth includes a simple block editor (type use blocked.fb 0 list for details) but doesn't encourage the use of blocks; the mechanism is only provided for backward compatibility – ANS Forth requires blocks to be available when files are.

Common techniques that are used when working with blocks include:

See Frank Sergeant's Pygmy Forth to see just how well blocks can be integrated into a Forth programming environment.

open-blocks       c-addr u –         gforth       “open-blocks”

Use the file, whose name is given by c-addr u, as the blocks file.

use       "file" –         gforth       “use”

Use file as the blocks file.

block-offset       – addr         gforth       “block-offset”

User variable containing the number of the first block (default since 0.5.0: 0). Block files created with Gforth versions before 0.5.0 have the offset 1. If you use these files you can: 1 block-offset !; or add 1 to every block number used; or prepend 1024 characters to the file.

get-block-fid       – wfileid         gforth       “get-block-fid”

Return the file-id of the current blocks file. If no blocks file has been opened, use blocks.fb as the default blocks file.

block-position       u –         block       “block-position”

Position the block file to the start of block u.
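The position presumably follows from the block number, the number of the first block and the block size. The following sketch shows only that arithmetic; bytes/block and block>pos are hypothetical names, and the first-block number is passed as a parameter rather than read from block-offset so that the example stands alone.

```forth
\ Hypothetical helper: first is the number of the first block in the
\ file (what block-offset holds); the result is a double-cell byte offset.
1024 constant bytes/block

: block>pos ( u first -- ud )
    - bytes/block um* ;

\ With first = 1 (pre-0.5.0 files), block 5 starts at byte 4096.
\ With first = 0, block 5 starts at byte 5120.
```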

list       u –         block-ext       “list”

Display block u. In Gforth, the block is displayed as 16 numbered lines, each of 64 characters.
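The 16-by-64 layout makes it easy to convert a character offset within a block into a line and column. The sketch below uses hypothetical names (line-width, offset>pos) and assumes that lines and columns are both counted from 0.

```forth
\ Hypothetical helper, not part of Gforth: convert a character offset
\ within a block (0..1023) into a column and a line, both counted from 0.
64 constant line-width

: offset>pos ( u -- col line )
    line-width /mod ;

\ Offset 130 is column 2 of line 2; offset 1023 is column 63 of line 15.
```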

scr       – a-addr         block-ext       “s-c-r”

User variable – a-addr is the address of a cell containing the block number of the block most recently processed by list.

block       u – a-addr         block       “block”

If a block buffer is assigned for block u, return its start address, a-addr. Otherwise, assign a block buffer for block u (if the assigned block buffer has been updated, transfer the contents to mass storage), read the block into the block buffer and return its start address, a-addr.

buffer       u – a-addr         block       “buffer”

If a block buffer is assigned for block u, return its start address, a-addr. Otherwise, assign a block buffer for block u (if the assigned block buffer has been updated, transfer the contents to mass storage) and return its start address, a-addr. The subtle difference between buffer and block means that you should only use buffer if you don't care about the previous contents of block u. In Gforth, this simply calls block.
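A typical use of buffer is to overwrite a block completely, so that its previous contents never matter. The following sketch assumes a hypothetical blocks file scratch.fb and an illustrative word wipe-block; neither is part of Gforth.

```forth
\ Assumptions: scratch.fb is a hypothetical blocks file name, and
\ wipe-block is an illustrative word, not part of Gforth.
use scratch.fb

: wipe-block ( u -- )
    buffer        \ get a buffer for block u; old contents are irrelevant
    1024 blank    \ fill the whole block with spaces
    update ;      \ mark the buffer assigned-dirty

3 wipe-block  flush
```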

empty-buffers              block-ext       “empty-buffers”

Mark all block buffers as unassigned; if any had been marked as assigned-dirty (by update), the changes to those blocks will be lost.

empty-buffer       buffer –         gforth       “empty-buffer”

update              block       “update”

Mark the state of the current block buffer as assigned-dirty.

updated?       n – f         gforth       “updated?”

Return true if update has been used to mark block n as assigned-dirty.

save-buffers              block       “save-buffers”

Transfer the contents of each updated block buffer to mass storage, then mark all block buffers as assigned-clean.

save-buffer       buffer –         gforth       “save-buffer”

flush              block       “flush”

Perform the functions of save-buffers then empty-buffers.

load       i*x n – j*x         block       “load”

Save the current input source specification. Store n in BLK, set >IN to 0 and interpret. When the parse area is exhausted, restore the input source specification.

thru       i*x n1 n2 – j*x         block-ext       “thru”

Load the blocks n1 through n2 in sequence.
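The behaviour of thru can be pictured as a counted loop around load. The definition below is an illustrative re-implementation under the hypothetical name my-thru; it is a sketch of the idea, not a claim about how Gforth defines thru.

```forth
\ Illustrative re-implementation under a hypothetical name; Gforth
\ already provides thru.
: my-thru ( i*x n1 n2 -- j*x )
    1+ swap ?do      \ loop index runs from n1 to n2 inclusive
        i load       \ interpret block i
    loop ;
```

For example, 2 5 my-thru would load blocks 2, 3, 4 and 5 in that order, and does nothing if n2 is n1 minus 1.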

+load       i*x n – j*x         gforth       “+load”

Used within a block to load the block specified as the current block + n.

+thru       i*x n1 n2 – j*x         gforth       “+thru”

Used within a block to load the range of blocks specified as the current block + n1 thru the current block + n2.

-->              gforth       “chain”

If this symbol is encountered whilst loading block n, discard the remainder of the block and load block n+1. Used for chaining multiple blocks together as a single loadable unit. Not recommended, because it destroys the independence of loading. Use thru (which is standard) or +thru instead.

block-included       a-addr u –         gforth       “block-included”

Use within a block that is to be processed by load. Save the current blocks file specification, open the blocks file specified by a-addr u and load block 1 from that file (which may in turn chain or load other blocks). Finally, close the blocks file and restore the original blocks file.


Footnotes

[1] The ANS Forth definition of buffer is intended not to cause disk I/O; if the data associated with the particular block is already stored in a block buffer due to an earlier block command, buffer will return that block buffer and the existing contents of the block will be available. Otherwise, buffer will simply assign a new, empty block buffer for the block.