What This Module Does.

An overview of the calculus module's role and abilities.

@h Prerequisites.
The calculus module is a part of the Inform compiler toolset. It is
presented as a literate program or "web". Before diving in:
(a) It helps to have some experience of reading webs: see //inweb// for more.
(b) The module is written in C, in fact ANSI C99, but this is disguised by the
fact that it uses some extension syntaxes provided by the //inweb// literate
programming tool, making it a dialect of C called InC. See //inweb// for
full details, but essentially: it's C without predeclarations or header files,
and where functions have names like |Tags::add_by_name| rather than |add_by_name|.
(c) This module uses other modules drawn from the compiler (see //structure//), and also
uses a module of utility functions called //foundation//.
For more, see //foundation: A Brief Guide to Foundation//.

@h What predicate calculus is.
The word "calculus" is often used to mean differentiating and integrating
functions, but properly speaking that is "infinitesimal calculus", and there
are many others.[1] In particular, any set of rules for making deductions
tends to be called a "calculus", and we will use a form of one of the most
popular, "predicate calculus".[2]

Most attempts to codify the meaning of sentences in any systematic way involve
predicate calculus, and most people seem to agree that linguistic
concepts (like verbs, adjectives, and determiners) correspond uncannily well
with logical ones (like binary predicates, unary predicates, and quantifiers).[3]
All the same, it is striking how good the fit is, considering that human language
is so haphazard at first sight.

At any rate Inform goes along with this consensus, and converts the difficult
passages in its source text into logical "propositions" -- lines written in
logical notation. This is useful partly as a tidy way to store complicated
meanings inside the program, but also because these propositions can then be
simplified by logical rules, without changing their meaning. Without such
simplifications, Inform would generate much less efficient code.

[1] At time of writing, //nearly 40 can be found here -> https://en.wikipedia.org/wiki/Calculus_(disambiguation)//,
though admittedly that includes a genus of spider and a Tintin character.

[2] Specifically, first order predicate calculus with equality, but with
generalised quantifiers added, and disjunction removed.

[3] This is not altogether a coincidence since the pioneers of mathematical
logic, and in particular Frege and Wittgenstein, began by thinking about
natural language.

@h Notation.
This module deals with propositions in predicate calculus, that is, with
logical statements which are normally written in mathematical notation. To
the end user of Inform, these are invisible: they exist only inside the
compiler and are never typed in or printed out. But for the debugging log,
for unit testing, and for the literate source, we need to do both of these.

A glimpse of the propositions generated by Inform can be had by running this
test, whose output uses our notation:
= (text as Inform 7)
Laboratory is a room. The box is a container.
Test sentence (internal) with a man can see the box in the Laboratory.
Test description (internal) with animals which are in lighted rooms.
=
But a much easier way to test the functions in this module is to use the
//calculus-test// tool. As with //kinds-test//, this is a REPL: that is,
a read-evaluate-print-loop tool, which reads in calculations, performs them,
and prints the result.

= (text from Figures/notation.txt as REPL)

@h Formal description.
1. A "term" is any of the following:
(*) A constant, corresponding to anything which can be evaluated by Inform --
a number, a text, etc. -- and which has a definite kind.
(*) One of 26 variables, which we print to the debugging log as |x|, |y|,
|z|, |a|, |b|, |c|, ..., |w|.
(*) A function $f$ applied to another term.[1]

Note that if we have given values to the necessary variables, then any term
can be evaluated to a value, and its kind determined. For example, if |x| is 7,
then the terms |17|, |x| and |f(x)| evaluate to 17, 7 and $f(7)$ respectively.
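
The evaluation just described can be sketched in plain C. This is only an
illustration, not the module's own data structure: the type |term|, the
functions |evaluate_term| and |successor|, and the restriction of all values to
|int| are assumptions made for the example, with |successor| standing in for $f$.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical term representation, for illustration only. */
typedef enum { CONSTANT_TERM, VARIABLE_TERM, FUNCTION_TERM } term_type;

typedef struct term {
    term_type type;
    int constant;            /* the value, if CONSTANT_TERM */
    int variable;            /* 0 to 25, if VARIABLE_TERM; 0 stands for x */
    int (*fn)(int);          /* the function applied, if FUNCTION_TERM */
    const struct term *arg;  /* the term it is applied to, if FUNCTION_TERM */
} term;

/* Evaluate a term, given the values assigned to the 26 variables. */
int evaluate_term(const term *t, const int values[26]) {
    assert(t != NULL);
    switch (t->type) {
        case CONSTANT_TERM:
            return t->constant;
        case VARIABLE_TERM:
            assert(t->variable >= 0 && t->variable < 26);
            return values[t->variable];
        case FUNCTION_TERM:
            assert(t->fn != NULL && t->arg != NULL);
            return t->fn(evaluate_term(t->arg, values));
    }
    return 0; /* not reached for a well-formed term */
}

/* An arbitrary stand-in for the function f. */
int successor(int n) { return n + 1; }
```

With |x| set to 7, the three terms |17|, |x| and |f(x)| then evaluate to 17, 7
and the successor of 7.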

2. An "atomic proposition" is any of the following:
(*) A "unary predicate" $U(t)$, where $t$ is a term, which is either true or
false depending on the evaluation of $t$.
(*) A "binary predicate" $B(t_1, t_2)$ depending on two terms.[2]
(*) A "quantifier" $Q(v, n)$ applying to a variable $v$, optionally with a
parameter $n$. See //linguistics: Determiners and Quantifiers// for the range
of quantifiers available.

3. A "proposition" is a sequence of 0 or more of the following:
(*) An atomic proposition, as defined in 2.
(*) A conjunction $P_1\land P_2$, where $P_1$ and $P_2$ are propositions.
(*) A negation $\lnot P$, where $P$ is a proposition.
(*) A quantification $Q v\in D: P$, where $Q$ is a quantifier, optionally
also with a numerical parameter, $v$ is a variable, $D$ is a set
specifying the domain of $v$, and $P$ is a proposition.[3]
(*) An existential quantification $\exists v: P$ without a domain.

[1] In this module we use words such as "constant", "variable" and "function" in
their predicate-calculus senses, not their Inform ones. For example, if we are
to decide whether it is true that "a container in the location of Nicole contains
the prize", we have to forget about the passage of time and think only about a
single moment. In the resultant proposition, "the location of Nicole" and
"the prize" lead to constant terms, "a container" leads to a variable term (since
we do not know its identity) and there are no functions.

[2] We do not support higher arities of predicates as such, but they can be
simulated. The universal relation in Inform is in effect a ternary predicate,
but is achieved by combining two of its terms into an ordered pair.

[3] Some quantifiers also carry a numerical parameter, to express, e.g.,
"at least 7" -- the parameter for that being 7.

@ The implementation uses the term "atom" a little more loosely, to include
four punctuation marks: |NOT<|, |NOT>|, |IN<|, |IN>|, which act like
opening and closing parentheses. These are considered atoms purely for
convenience when building more complicated constructions -- they make no sense
standing alone. Thus:

(*) $\lnot P$ is implemented as |NOT< P NOT>|.
(*) $Q v\in D: P$ is implemented as |Q IN< D IN> P|.

Note that the domain $D$ of a quantifier is itself expressed as a proposition.
Thus "for all numbers $n$" is implemented as |ForAll n IN< kind=number(n) IN>|.

In all other cases, adjacent atoms in a sequence are considered to be conjoined:
i.e., |X Y| means $X\land Y$, the proposition which is true if $X$ and $Y$ are
both true. To emphasise this, the textual notation uses the |^| sign. For
example, |odd(n) ^ prime(n)| is the notation for two consecutive atoms |odd(n)|
and |prime(n)|.
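
As a rough sketch of this layout, a proposition can be modelled in plain C as a
linked list of atoms in which the punctuation marks are atoms like any other.
Everything named here (|atom|, |new_atom|, |join|, |wrap_negation|,
|wrap_quantifier|, |write_atoms|) is hypothetical and chosen for the example;
the module's real types and functions differ. The writer prints the raw atom
sequence, so it does not insert the |^| sign.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical atom kinds: the four punctuation marks count as atoms too. */
typedef enum {
    PREDICATE_ATOM, QUANTIFIER_ATOM, NOT_OPEN, NOT_CLOSE, IN_OPEN, IN_CLOSE
} atom_kind;

typedef struct atom {
    atom_kind kind;
    const char *text;   /* how the atom is written out */
    struct atom *next;  /* the next atom in the sequence, or NULL */
} atom;

atom *new_atom(atom_kind kind, const char *text) {
    atom *a = malloc(sizeof(atom));
    assert(a != NULL);
    a->kind = kind; a->text = text; a->next = NULL;
    return a;
}

/* Adjacency means conjunction, so joining two sequences conjoins them.
   Either sequence may be NULL, the empty proposition. */
atom *join(atom *p, atom *q) {
    if (p == NULL) return q;
    atom *last = p;
    while (last->next != NULL) last = last->next;
    last->next = q;
    return p;
}

/* Builds NOT< P NOT>. */
atom *wrap_negation(atom *p) {
    return join(new_atom(NOT_OPEN, "NOT<"), join(p, new_atom(NOT_CLOSE, "NOT>")));
}

/* Builds Q IN< D IN> followed by P. */
atom *wrap_quantifier(const char *q, atom *domain, atom *p) {
    atom *head = new_atom(QUANTIFIER_ATOM, q);
    join(head, new_atom(IN_OPEN, "IN<"));
    join(head, domain);
    join(head, new_atom(IN_CLOSE, "IN>"));
    join(head, p);
    return head;
}

/* Writes the atoms in order, separated by single spaces, truncating
   if the buffer is too small. */
void write_atoms(const atom *p, char *buf, size_t size) {
    assert(size > 0);
    buf[0] = '\0';
    for (; p != NULL; p = p->next) {
        if (buf[0] != '\0') strncat(buf, " ", size - strlen(buf) - 1);
        strncat(buf, p->text, size - strlen(buf) - 1);
    }
}
```

In this sketch, quantifying the two-atom sequence |odd(n)|, |prime(n)| over the
domain |kind=number(n)| gives a six-atom list, with the domain bracketed by
|IN<| and |IN>| and the quantified proposition following it.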

@h Unary predicates.
The //calculus// module aims to be agnostic about what unary predicates will
exist. They are grouped into "families" -- see //Unary Predicate Families//
for details -- which loosely group them by implementation. So, for example,
Inform has a family of unary predicates in the form |calling='whatever'(x)|
which assert that |x| represents something of a given name. But //calculus//
is not concerned with the details. Only one family is built in:
(*) For each kind $K$, there is a predicate |kind=K(t)|, which is true if $t$
is of the kind $K$.

New UPs can be constructed with //UnaryPredicates::new//.

@h Binary predicates.
Similarly, //calculus// allows the user to create as many families of binary
predicates as are wanted. See //Binary Predicate Families//. For example,
the "same property value as" relations all belong to a single family. This
module builds in only one family:
(*) The equality predicate $=$, whose special meaning is used when simplifying
propositions. See //The Equality Relation//. It is written with the special
notation |(x == y)|, though this is just syntactic sugar.

Binary predicates are of central importance to us because they allow complex
sentences to be written which talk about more than one thing at a time,
with some connection between them. In excerpts of Inform source like "an animal
inside something" or "a man who wears the top hat", the meanings of the two
connecting pieces of text -- "inside" and "who wears" -- are binary predicates:
the containment relation and the wearing relation. To avoid scaring the horses,
binary predicates are called "relations" in all of the Inform documentation.

New BPs can be constructed with //BinaryPredicates::make_pair//. The term "pair"
is used because every $B$ has a "reversal" $B^r$, such that $B^r(s, t)$ is true
if and only if $B(t, s)$. $B$ and $B^r$ are created in pairs.[1]

[1] Except for equality, which is its own reversal. See //BinaryPredicates::make_equality//.
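
The idea of creating $B$ and $B^r$ together can be sketched in plain C as
follows. This is an illustration under stated assumptions, not the signature of
//BinaryPredicates::make_pair//: the type |bp|, its fields, and the functions
|make_bp_pair|, |test_bp| and |less_than| are invented for the example, and
terms are simplified to |int| values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical binary predicate: each one points to its reversal. */
typedef struct bp {
    const char *name;
    bool (*holds)(int, int);  /* the test for the right-way-round predicate */
    bool right_way_round;     /* false for the reversed member of the pair */
    struct bp *reversal;
} bp;

/* Create B and its reversal together, each linked to the other.
   Returns B; the reversal is reached through B->reversal. */
bp *make_bp_pair(const char *name, const char *reversed_name,
                 bool (*holds)(int, int)) {
    bp *b = malloc(sizeof(bp));
    bp *r = malloc(sizeof(bp));
    assert(b != NULL && r != NULL);
    b->name = name; b->holds = holds;
    b->right_way_round = true; b->reversal = r;
    r->name = reversed_name; r->holds = holds;
    r->right_way_round = false; r->reversal = b;
    return b;
}

/* The reversed predicate holds for (s, t) exactly when the original
   holds for (t, s), so only one test function is ever needed. */
bool test_bp(const bp *b, int s, int t) {
    assert(b != NULL);
    return b->right_way_round ? b->holds(s, t) : b->holds(t, s);
}

/* An example relation to pair up. */
bool less_than(int s, int t) { return s < t; }
```

Storing one test function and a direction flag, rather than two separate tests,
keeps the two members of a pair from drifting out of agreement.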

@h Making propositions.
Propositions are built incrementally, like Lego, with a sequence of function
calls.

1. Terms are made using the functions //Terms::new_constant//,
//Terms::new_variable// and //Terms::new_function//.

2. Unary predicate atoms are made using //Atoms::unary_PREDICATE_new//.
Binary predicate atoms are made using //Atoms::binary_PREDICATE_new//.

3. Propositions are then built up from atoms or other propositions[1] by calling:
(*) //Propositions::conjoin//.
(*) //Propositions::negate//.
(*) //Propositions::quantify//.

[1] But beware that propositions are passed by reference not value. Use
//Propositions::copy// before changing one, if you need to use it
again.
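
The hazard in this footnote can be illustrated in plain C with a toy linked
list. The names |atom|, |new_atom|, |join|, |copy| and |length| are hypothetical
stand-ins, and this |join| is assumed to be destructive: it is not a model of
how //Propositions::conjoin// or //Propositions::copy// are actually written.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical proposition: a linked list of atoms. */
typedef struct atom {
    const char *text;
    struct atom *next;
} atom;

atom *new_atom(const char *text, atom *next) {
    atom *a = malloc(sizeof(atom));
    assert(a != NULL);
    a->text = text; a->next = next;
    return a;
}

/* Destructive join: the last atom of p is altered to point at q,
   so the caller's p changes as well. */
atom *join(atom *p, atom *q) {
    if (p == NULL) return q;
    atom *last = p;
    while (last->next != NULL) last = last->next;
    last->next = q;
    return p;
}

/* Duplicate every atom, so that the copy can be altered freely. */
atom *copy(const atom *p) {
    atom *head = NULL, **tail = &head;
    for (; p != NULL; p = p->next) {
        *tail = new_atom(p->text, NULL);
        tail = &(*tail)->next;
    }
    return head;
}

/* Count the atoms in a proposition. */
int length(const atom *p) {
    int n = 0;
    for (; p != NULL; p = p->next) n++;
    return n;
}
```

Here |join(copy(p), q)| leaves |p| one atom long, whereas |join(p, q)| returns
|p| itself, now two atoms long.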

@ There are two senses in which it's possible to make an impossible proposition:

(1) You could use the punctuation markers improperly, or fail to
give a domain set for a quantifier.
(2) You could concatenate two propositions in which the same variable is
used with a different meaning in each.

The functions //Propositions::is_syntactically_valid// and
//Binding::is_well_formed// test that (1) and (2) have not happened.
It's because of (2) that it's important to use //Propositions::conjoin//
and not the simpler //Propositions::concatenate//.
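
To give a flavour of the first kind of check, here is a minimal sketch in plain
C which tests only that the punctuation marks are properly nested. It is an
assumption-laden illustration and not the algorithm used by
//Propositions::is_syntactically_valid//: the names |atom_kind|, |MAX_DEPTH|
and |is_balanced| are invented, and the check that each quantifier has a domain
is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical atom kinds; only the four punctuation marks matter here. */
typedef enum {
    OTHER_ATOM, QUANTIFIER_ATOM, NOT_OPEN, NOT_CLOSE, IN_OPEN, IN_CLOSE
} atom_kind;

#define MAX_DEPTH 64 /* arbitrary limit on nesting for this sketch */

/* True if every NOT< and IN< is closed by a matching NOT> or IN>,
   in properly nested order, with nothing left open at the end. */
bool is_balanced(const atom_kind *atoms, size_t n) {
    assert(atoms != NULL || n == 0);
    atom_kind stack[MAX_DEPTH];
    size_t depth = 0;
    for (size_t i = 0; i < n; i++) {
        switch (atoms[i]) {
            case NOT_OPEN:
            case IN_OPEN:
                if (depth == MAX_DEPTH) return false; /* nested too deeply */
                stack[depth++] = atoms[i];
                break;
            case NOT_CLOSE:
                if (depth == 0 || stack[--depth] != NOT_OPEN) return false;
                break;
            case IN_CLOSE:
                if (depth == 0 || stack[--depth] != IN_OPEN) return false;
                break;
            default:
                break; /* other atoms do not affect nesting */
        }
    }
    return depth == 0;
}
```

The empty sequence passes, which fits the definition of a proposition as a
sequence of 0 or more items; a sequence such as |NOT< IN< NOT> IN>| fails
because its marks cross.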

@ Making propositions is a largely syntactic game, but in the end they will
have meanings. Terms will have kinds, and relations will only apply to certain
kinds. This means that some propositions which make syntactic sense will
nevertheless be "bad": for example, ${\it contains}(7, x)$ is not just untrue
but at some level meaningless -- a number cannot contain things.

Throughout Inform, then, all propositions have to be type-checked before use,
and this is done by //TypecheckPropositions::type_check//.

@h Sentences.
Our whole interest in propositions is to use them to provide a meaning for
a sentence of natural language: we evaluate text like "Peter is in the car"
into a proposition.[1] This task is central to how Inform works, and occupies
the whole of //Sentence Conversions//.

On the face of it, this is a simple matter of conjoining propositions for
the subject (Peter) and the object (the car) together with a binary predicate
(is-in). But most cases are not so simple, and if we did only that then we
would often end up with a much longer proposition than necessary, which it
would be inefficient to test or assert at run-time. So the sentence converter
makes use of logical deductions to "simplify" its output, and these tactics
are implemented by roughly 20 functions in the //Simplifications// section.

[1] In fact, a proposition with no free variables. It's not a coincidence that
logicians call such propositions "sentences".