Notes on the ISO-Prolog Standard

Author: Joachim Schimpf
Last modified: 2011-04-11

As constructive discussions on the prolog-standard mailing list are quite difficult, I have decided to quietly collect my comments and contributions here, so they can be considered by those who want to do so. I will try to address any feedback (no matter how I become aware of it) directly here on the web site. If you want to interactively discuss a concrete point, please email me personally.

Comments on 2nd Technical Corrigendum

Corrections for Arithmetic

Please include the following corrections to the core standard.

Section 9.1.4:
The definition of divF is incomplete, the zero-divisor case is missing (this can be seen by comparing signature and definition). Please correct as follows:
divF(x,y) = resultF(x/y,rndF)  if y\=0
          = undefined          if x=0,y=0
          = zero_divisor       if x\=0,y=0
This mistake is apparently inherited from the first LIA standard.
Section 9.3.1.3 (and future ^/2 function):
Change the error condition (d) for zero raised to a negative exponent from undefined to zero_divisor.
Section 9.3.6.3:
Change (c) from "VX is zero or negative" to "VX is negative". Add case (d) saying "VX is zero" leads to evaluation_error(zero_divisor).
The idea here is that zero_divisor corresponds to an infinite result, while undefined is truly undefined. This is not my idea, I hasten to add, and every system that creates infinities must make this distinction anyway.
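
The three-way case split for divF can be sketched as an ordinary predicate. This is only an illustration under stated assumptions: div_f/3 and the value/1 and error/1 wrappers are hypothetical names, the rounding and overflow handling of resultF is ignored, and both arguments are assumed to be floats.

```prolog
% div_f(+X, +Y, -Outcome): hypothetical model of the corrected divF.
% Outcome is value(Q) for a normal quotient, or error(E) where E names
% the evaluation error that would be raised.
% Rounding and overflow (resultF, rndF) are deliberately not modelled.
div_f(X, Y, Outcome) :-
    (   Y =\= 0.0 ->
        Q is X / Y,
        Outcome = value(Q)
    ;   X =:= 0.0 ->
        Outcome = error(undefined)       % 0/0: truly undefined
    ;   Outcome = error(zero_divisor)    % x/0, x\=0: infinite result
    ).
```

With this model, div_f(1.0, 0.0, O) yields error(zero_divisor), while div_f(0.0, 0.0, O) yields error(undefined).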

I have been told that these changes cannot be put into the corrigendum because they would make existing conforming implementations non-conforming. It is, however, the whole point of a "corrigendum" to correct mistakes, and inevitable that implementations may have to change to implement the corrections.

min and max

In the new section on min/2 and max/2, please do not refer to float_overflow when talking about the case where an integer is not exactly representable as a float! As defined in 9.1.4.2, float_overflow occurs when a result is greater than fmax, which is not the case we want to capture here.

See the 2006 mailing list discussion for an earlier debate on this topic.
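
To make the distinction concrete, here is a small sketch of a representability test. It assumes IEEE 754 double-precision floats (53-bit significand) and integers of at least 64 bits; exactly_representable/1 is a hypothetical name, not something from the standard or the draft.

```prolog
% exactly_representable(+I): the integer I survives a round trip
% through the float type unchanged.
% The comparison is done on integers (via truncate/1) so that it does
% not depend on how a system compares mixed integer/float operands.
exactly_representable(I) :-
    integer(I),
    F is float(I),
    I =:= truncate(F).
```

Under the stated assumptions, 9007199254740993 (that is, 2^53+1) is not exactly representable, although it is many orders of magnitude below fmax. This is why float_overflow is the wrong name for the condition.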

Uninstantiation error

Whether such an error is needed is debatable. In the suggested case of the open/3 predicate being called with an instantiated output argument, the need for a new error goes away when we imagine that the check is performed after the new stream has been created, and just before it is unified with the result argument. Without the check, this unification would fail, and it is this failure that we want to supplant with an appropriate error. Elsewhere, such a situation would be signaled via a type_error (if types are different) or a domain_error (if types match, but value differs from what's expected). So why not:

% newly opened stream would be unified with 99 -> type error
?- open(f1, read, 99).
type_error(stream, 99)

% newly opened stream would be unified with other stream -> domain error
?- open(f1, read, S), open(f2, read, S).
domain_error(stream, $stream(f1))
It has been argued that in some implementations open(t,write,S), close(S), open(t,write,S2), S==S2 may succeed, and thus the error case does not replace a failure. I hope it is undisputed that the intended semantics in that case is indeed a failure (even though the standard leaves it unspecified), and not to implement this failure is merely an oversight.
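
The check proposed at the start of this section, done after the stream exists and just before it is unified with the result argument, might look as follows. This is a sketch, not how any particular system implements open/3: bind_new_stream/2 is a hypothetical helper, and is_stream/1 is assumed to be available as a stream type test (it is in SWI-Prolog, but it is not an ISO built-in).

```prolog
% bind_new_stream(+NewStream, ?Result): hypothetical final step of open/3.
% NewStream is the freshly created stream, Result the caller's argument.
bind_new_stream(NewStream, Result) :-
    (   var(Result) ->
        Result = NewStream
    ;   is_stream(Result) ->             % assumed non-ISO type test
        % right type, but cannot be the new stream: wrong value
        throw(error(domain_error(stream, Result), open/3))
    ;   % not a stream at all: wrong type
        throw(error(type_error(stream, Result), open/3))
    ).
```

No new error term is needed: the two non-variable cases map onto the existing type_error and domain_error classes.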

A case for the "uninstantiation" error can be made, however, in a completely different situation, where a variable is used as a quasi-identifier in a quantifier-like construct, e.g.

?- Y=foo, setof(X, Y^p(X,Y), L).
One could argue that this is likely to be a mistake and should be flagged by an "uninstantiation" error for Y. In this context, however, the name "uninstantiation" is unhelpful. A type_error(variable) would be quite appropriate, since these variables are not meant to ever become instantiated, so being a "variable" is their final destiny or, arguably, their "type".

Another (non-ISO) argument that has been made for why we need this error is a predicate that attaches attributes to variables, but is called with a non-variable, e.g. put_attr(foo,attrname,attrval). There is, however, no reason why this should not be equivalent to put_attr(X,attrname,attrval),X=foo and behave accordingly, i.e. succeed or fail according to the semantics of the attribute.

pi/0

The pi/0 arithmetic function is problematic because of its return type. In a basic ISO system, the return type is simply float. A system that provides more than one representation for real numbers (e.g. different floating-point representations, or both floats and intervals) must settle for one of these types as the return type for pi (because it is argument-less, it cannot be polymorphic). It would have to be the one with the highest precision because it then can be converted to a lower-precision type as needed (whereas missing precision cannot be recovered later).

What this means, however, is that a multi-representation system cannot really commit to pi/0 returning the float-type (which may not have enough precision). Or, if it did, it would have to provide additional variants to return the higher precision representations (pi_binary_128/0, pi_decimal_64/0 etc).

A similar problem exists with regard to the type of literal constants such as 1.0. In a system with multiple representations, possible solutions are

Often, one wants to write code that does a computation in a type-generic way, i.e. uses and propagates the precision of its input arguments. Unfortunately, none of the above alternatives is much help with that.

For the case of pi, the easiest way to solve the problem is to provide a function pi/1, where pi(X) is equivalent to pi*X computed with the precision of X (but at least the smallest floating point precision, to avoid biblical surprises like 3=:=pi(1)).
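
Since evaluable functions cannot be added portably, the idea can only be sketched as a predicate. pi_times/2 is a hypothetical stand-in for the proposed pi(X), written for a basic system with a single float type; a multi-representation system would instead select the constant according to the type of X.

```prolog
% pi_times(+X, -R): hypothetical stand-in for the proposed pi(X).
% In a single-float system the result is always a float, even for an
% integer X, because pi/0 itself is a float. A system with several
% real-number types would dispatch on the type of X here.
pi_times(X, R) :-
    R is pi * X.
```

With this definition, pi_times(1, R) gives a float R, so R never compares equal to the integer 3.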

Signed Numbers (Apr 2011)

Draft technical corrigendum 2 (Mar 2011) includes the introduction of a predefined prefix operator declaration and evaluable function for +/1. It seems that no corresponding modification for the signed number term syntax has been proposed. Unless I am misinterpreting something, this leads to an asymmetry between the minus and the plus sign:

Input                | ISO 13211-1 | DTC2 Mar 2011 | ECLiPSe 6.0 native | My suggestion for ISO
---------------------+-------------+---------------+--------------------+----------------------
-1                   | -1          | -1            | -1                 | -1
+1                   | error       | +(1)          | 1                  | 1
- 1                  | -1          | -1            | -(1)               | -1
+ 1                  | error       | +(1)          | +(1)               | 1
'-'1                 | -1          | -1            | -(1) (version 6.1) | -(1)
'+'1                 | error       | +(1)          | +(1) (version 6.1) | +(1)
'-' 1                | -1          | -1            | -(1)               | -(1)
'+' 1                | error       | +(1)          | +(1)               | +(1)
'-'/**/1             | -1          | -1            | -(1)               | -(1)
'+'/**/1             | error       | +(1)          | +(1)               | +(1)
- /**/1              | -1          | -1            | -(1)               | -1
+ /**/1              | error       | +(1)          | +(1)               | 1
+1.0e+2              | error       | +(1.0e2)      | 1.0e2              | 1.0e2
1.0e'-'2             | error       | error         | error              | error
1.0e- 2              | error       | error         | error              | error
number_codes(N,"-1") | N = -1      | N = -1        | N = -1             | N = -1
number_codes(N,"+1") | error       | error         | N = 1              | N = 1

I suggest adopting the last column as the correction for DTC2:

On the other hand, it has been pointed out that allowing + as a sign is less "necessary" than allowing - (negative number syntax is necessary to allow writeq+read give the correct result with negative numbers), and therefore the +/- asymmetry may be considered ok.

NOTE (not for ISO - I think it's too late to change this aspect): I do think that the ECLiPSe (and SWI) choice (of not allowing space of any kind between sign and number) is the most consistent because:


Comments on Minutes of WG17 2010 Edinburgh Meeting

term_variables/3

I assume this is meant to be a difference-list version of term_variables/2. Difference-list-variants of list-constructing predicates can be a good idea (findall/3 comes to mind, but also atom_codes/2 etc), but as the standard does not systematically provide these, it seems there should be a special reason for providing it in this case - what is it?

Apparently, the use case is the implementation of bagof/setof. The arguments made below still hold.

As opposed to the other examples quoted above, a difference-list version of term_variables/2 is a particularly bad idea because

   term_variables(T1, Vs, Vs1), term_variables(T2, Vs1, []).
is not the same as
   term_variables(T1-T2, Vs).
because it is supposed to return a duplicate-free variable list. The predicate invites bugs by suggesting a usage that is unlikely to give the expected results.

Note that the correct way to augment a term-variable-list is

   term_variables(T1, Vs1), term_variables(Vs1-T2, Vs).
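
The difference can be reproduced with a small experiment. Since I am only assuming what term_variables/3 is meant to do, the sketch below uses a hypothetical vars_dl/3 built from term_variables/2, with a local app/3 so that nothing beyond ISO built-ins is needed.

```prolog
% vars_dl(+Term, -Vs, ?Tail): hypothetical difference-list variant,
% modelling the assumed behaviour of the proposed term_variables/3.
vars_dl(Term, Vs, Tail) :-
    term_variables(Term, L),
    app(L, Tail, Vs).

% app/3: plain list concatenation, defined locally.
app([], Ys, Ys).
app([X|Xs], Ys, [X|Zs]) :-
    app(Xs, Ys, Zs).

% chained/3: the usage that the difference-list form invites.
chained(T1, T2, Vs) :-
    vars_dl(T1, Vs, Vs1),
    vars_dl(T2, Vs1, []).

% augmented/3: the correct way, feeding the first list back in.
augmented(T1, T2, Vs) :-
    term_variables(T1, Vs1),
    term_variables(Vs1-T2, Vs).
```

For T1 = f(X,Y) and T2 = g(Y,Z), chained/3 returns [X,Y,Y,Z] with Y duplicated, whereas augmented/3 returns [X,Y,Z].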


Miscellaneous

number_chars/2 and number_codes/2 (Apr 2011)

Exceptions

The predicate as originally specified in the standard is unnecessarily limited in usefulness by requiring an error for the case that the right-hand side string cannot be parsed as a number. Failure would be much more useful, as the predicate could be readily used to "convert to a number if possible", allowing the following common pattern:

    ( number_chars(Num, Chars) ->
        <deal with a number Num>
    ;
        <deal with something else in Chars>
    )
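
Under the current specification the same effect needs an explicit wrapper that turns the syntax error into failure. A minimal sketch, where try_number_chars/2 and classify/2 are hypothetical names:

```prolog
% try_number_chars(?Num, +Chars): like number_chars/2, but fails
% instead of raising a syntax error when Chars is not a number.
% Other errors (e.g. instantiation errors) are passed through.
try_number_chars(Num, Chars) :-
    catch(number_chars(Num, Chars),
          error(syntax_error(_), _),
          fail).

% classify(+Chars, -Kind): the "convert to a number if possible" pattern.
classify(Chars, Kind) :-
    (   try_number_chars(Num, Chars) ->
        Kind = number(Num)
    ;   atom_chars(Atom, Chars),
        Kind = text(Atom)
    ).
```

The wrapper works, but every program that wants this behaviour has to repeat it, and the catch/3 hides the intent that a plain conditional would express directly.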

Accepted Language

Neumerkel's comparison drew my attention to other surprises in the ISO spec:

  1. space and even comments can occur in the string before the actual number, i.e. number_codes(3," /*comment*/ 3") is supposed to succeed. Why comments? We are not parsing a program here!
  2. On the other hand, space after the number is forbidden, i.e. number_codes(N, "3 ") is required to raise an exception. Why the asymmetry?
  3. number_codes(3, "03") succeeds, but number_codes(3, "+3") is required to raise an exception. Do we want to allow redundant characters or not?
  4. number_codes(N, "- 3.1e-2") is allowed, but number_codes(N, "-3.1e- 2") is not. Do we want to allow redundant space or not?
This all makes very little sense for a predicate that is supposed to convert back and forth between numbers and their various string representations. While it is clear that the mapping is not one-to-one, the flexibility allowed in the string-to-number backward direction looks completely arbitrary.

The reason for the strangeness is, of course, that the specification refers to the Prolog term syntax (probably to avoid having to deal with signs), rather than more directly to number token syntax. A more sensible, but still compact spec can be had by specifying the language accepted by number_chars separately in terms of token syntax, e.g.

    [sign] (integer token|float number token)
If it is felt that redundant spaces must be accepted, these can now be re-introduced explicitly (but preferably on both sides of the number, and without allowing comments). Nevertheless, for a built-in 'primitive', it would be cleaner to accept the pure number only.
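
As an illustration of the "pure number only" variant, the following sketch recognises an optional sign followed by a decimal integer or float token and nothing else. It is deliberately simplified: strict_number_codes/2 and the grammar nonterminals are hypothetical names, DCG notation is assumed to be available, and the other ISO integer forms (character codes, binary, octal, hexadecimal) are left out.

```prolog
% strict_number_codes(-N, +Codes): Codes must be exactly
%   [sign] digits [ "." digits [ exponent ] ]
% with no layout or comments anywhere.
strict_number_codes(N, Codes) :-
    phrase(signed_number, Codes),
    strip_plus(Codes, Codes1),       % number_codes/2 may reject "+"
    number_codes(N, Codes1).

strip_plus([0'+|Cs], Cs) :- !.
strip_plus(Cs, Cs).

signed_number --> opt_sign, digits, opt_fraction.

opt_sign --> [C], { C == 0'+ ; C == 0'- }, !.
opt_sign --> [].

digits --> digit, more_digits.

more_digits --> digit, !, more_digits.
more_digits --> [].

digit --> [C], { C >= 0'0, C =< 0'9 }.

% As in the ISO float token, an exponent requires a fraction part.
opt_fraction --> [0'.], digits, opt_exponent.
opt_fraction --> [].

opt_exponent --> [E], { E == 0'e ; E == 0'E }, opt_sign, digits.
opt_exponent --> [].
```

With this definition "+3" and "-3.1e-2" are accepted, while " 3", "3 " and "- 3" all fail, so the treatment of redundant characters is at least symmetric.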

It can be doubted whether point 1 above (allowing comments) and point 4 (space after sign) are really required by 8.16.7, because it mentions "a character sequence which could be output". No output primitive will output extra comments, and no output primitive according to 7.10.5 will output space between sign and number.


include/1

Suppose we have the following 3 files:
% top.pl
:- include('somedir/include_p').

% somedir/include_p.pl
:- ensure_loaded(p).

% somedir/p.pl
p.
and we compile top.pl. What current directory should we expect when encountering the ensure_loaded(p) directive in somedir/include_p.pl? With pure include-semantics, the above code does not work: the ensure_loaded behaves as if it occurred within top.pl directly. SWI Prolog, YAP and ECLiPSe behave that way. SICStus (and similar directives in C) behave the other way, i.e. certain pathnames in included files are relative to the includee's not the includer's location. However, the current directory is still the includer's, so the situation is a bit confusing. A discussion from the ECLiPSe point of view is in bug 678.

Thanks

Thanks for comments to: Richard O'Keefe, Paulo Moura, Ulrich Neumerkel.