mirror of https://github.com/ganelson/inform.git synced 2024-07-02 23:14:57 +03:00

Continuing work on manual

This commit is contained in:
Graham Nelson 2019-03-20 12:51:06 +00:00
parent 7d93f1473c
commit 86c10609f4
7 changed files with 822 additions and 28 deletions


@ -1,7 +1,7 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>P/ui</title>
<title>P/ti</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="en-gb">
<link href="inweb.css" rel="stylesheet" rev="stylesheet" type="text/css">
@ -13,21 +13,13 @@
<ul class="toc"><li><a href="#SP1">&#167;1. Stages and descriptions</a></li><li><a href="#SP3">&#167;3. The code-generation stages</a></li><li><a href="#SP13">&#167;13. Diagnostic or non-working stages</a></li></ul><hr class="tocbar">
<p class="inwebparagraph"><a id="SP1"></a><b>&#167;1. Stages and descriptions. </b>Inter code has three representations: as a binary file, as a textual file,
and in memory &mdash; a sort of cross-referenced form of binary. For speed, the
Inform compiler generates memory inter directly, and code-generates from
that, so that the inter is normally never written out to disc. When Inter
performs a conversion, it loads (say) textual inter into memory inter, then
writes that out as binary inter.
<p class="inwebparagraph"><a id="SP1"></a><b>&#167;1. Stages and descriptions. </b>A processing stage is a step in code generation which acts on a repository
of inter in memory. Some stages change, add to or edit down that code, while
others leave it untouched but output a file based on it.
</p>
<p class="inwebparagraph">A processing stage is a step in code generation which acts on memory inter.
Some stages change, add to or edit down that code, while others leave it
untouched but output a file based on it.
</p>
<p class="inwebparagraph">Each stage can see an entire "repository" of inter code at a time, and is
not restricted to working through in sequence. Those which read in or write
<p class="inwebparagraph">Each stage can see an entire repository of inter code at a time, and is
not restricted to working through it in sequence. Those which read in or write
out a file also have a filename supplied to them as a parameter, but there
are otherwise no configuration options. It's not possible to tell a stage
to work on one specific function alone, for example.
@ -274,7 +266,7 @@ ice while a better and more systematic solution was found.
</p>
<hr class="tocbar">
<ul class="toc"><li><a href="P-ui.html">Back to 'Using Inter'</a></li><li><i>(This section ends Preliminaries.)</i></li></ul><hr class="tocbar">
<ul class="toc"><li><a href="P-ti.html">Back to 'Textual Inter'</a></li><li><i>(This section ends Preliminaries.)</i></li></ul><hr class="tocbar">
<!--End of weave-->
</body>
</html>

docs/inter/P-ti.html Normal file

@ -0,0 +1,488 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>P/ui</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="en-gb">
<link href="inweb.css" rel="stylesheet" rev="stylesheet" type="text/css">
</head>
<body>
<!--Weave of 'P/ti' generated by 7-->
<ul class="crumbs"><li><a href="../webs.html">&#9733;</a></li><li><a href="index.html">inter 1</a></li><li><a href="index.html#P">Preliminaries</a></li><li><b>Textual Inter</b></li></ul><p class="purpose">A specification of the inter language, as written out in text file form.</p>
<ul class="toc"><li><a href="#SP1">&#167;1. Textual, Binary, Memory</a></li><li><a href="#SP2">&#167;2. Global statements</a></li><li><a href="#SP7">&#167;7. Package declarations</a></li><li><a href="#SP12">&#167;12. Kinds and values</a></li><li><a href="#SP14">&#167;14. Code statements</a></li></ul><hr class="tocbar">
<p class="inwebparagraph"><a id="SP1"></a><b>&#167;1. Textual, Binary, Memory. </b>Inter code has three representations: as a binary file, as a textual file,
and in memory &mdash; a sort of cross-referenced form of binary. For speed, the
Inform compiler generates memory inter directly, and code-generates from
that, so that the inter is normally never written out to disc. When Inter
performs a conversion, it loads (say) textual inter into memory inter, then
writes that out as binary inter.
</p>
<p class="inwebparagraph">The following specification covers the inter language in its textual form:
a UTF-8 encoded text file which conventionally takes the file extension
".intert".
</p>
<p class="inwebparagraph">It should be stressed that inter is designed for inspection &mdash; that is, for
people to be able to read. It's not intended as a programming language for
humans to write: the code is verbose and low-level. The idea is that inter
code will be written by programs (such as Inform), but that this code will
be possible for humans to check.
</p>
<p class="inwebparagraph">Like assembly language, inter code is line-based: each line is a "statement".
Lines can be of arbitrary length. A line beginning with a <code class="display"><span class="extract">#</span></code> (in column 1) is
a comment, and blank lines are ignored.
</p>
<p class="inwebparagraph">The term "name" below means a string of one or more English upper or lower
case letters, underscores, or digits, except that it must not begin with
a digit.
</p>
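<p class="inwebparagraph">As a minimal sketch, the definition of a "name" can be checked with a regular
expression. The helper <code class="display"><span class="extract">is_name</span></code> is hypothetical and not part of Inter itself; it
simply encodes the rule stated above.
</p>

```python
import re

# Hypothetical helper: encodes the rule that a name is one or more English
# letters, underscores or digits, and must not begin with a digit.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_name(text):
    """Return True if text is a valid name in the sense defined above."""
    return NAME_PATTERN.fullmatch(text) is not None
```

<p class="inwebparagraph">Under this reading, <code class="display"><span class="extract">m1_RBLK1</span></code> and <code class="display"><span class="extract">_adjective</span></code> are names, but <code class="display"><span class="extract">1abc</span></code> is not.
</p>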
<p class="inwebparagraph">As in Python, indentation from the left margin is highly significant, and
should be in the form of tab characters.
</p>
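<p class="inwebparagraph">The line-based rules so far (comments, blank lines, tab indentation) might be
sketched as follows. The function <code class="display"><span class="extract">significant_lines</span></code> is an illustrative
assumption, not how Inter itself reads files; it treats only a <code class="display"><span class="extract">#</span></code> in column 1 as
a comment marker and counts leading tab characters as the indentation depth.
</p>

```python
def significant_lines(source):
    """Yield (depth, statement) pairs for each statement in textual inter.

    depth is the number of leading tab characters. Blank lines and lines
    with '#' in column 1 are skipped. This is a sketch, not Inter's reader.
    """
    for line in source.split("\n"):
        if line.strip() == "" or line.startswith("#"):
            continue
        statement = line.lstrip("\t")
        depth = len(line) - len(statement)
        yield depth, statement.rstrip()
```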
<p class="inwebparagraph">Inform follows certain conventions in the inter that it writes, but these
conventions are not part of the specification, and may change. Any paragraph
below which begins with "Convention" records the current practice.
</p>
<p class="inwebparagraph">There are three forms of statement: global statements, data statements, and
code statements. We will take these in turn.
</p>
<p class="inwebparagraph"><a id="SP2"></a><b>&#167;2. Global statements. </b>These statements must appear first in the file, and must be unindented.
There are only four of these:
</p>
<p class="inwebparagraph"><a id="SP3"></a><b>&#167;3. </b><code class="display"><span class="extract">version NUMBER</span></code> indicates that the file was written in that version of
the inter language. At present there has only ever been one version, but
that may not always be true. A <code class="display"><span class="extract">version</span></code> statement must come before
anything else, even other global statements; in particular, there cannot be
two such statements in the same file.
</p>
<p class="inwebparagraph">Convention. Inform always opens with the statement: <code class="display"><span class="extract">version 1</span></code>
</p>
<p class="inwebparagraph"><a id="SP4"></a><b>&#167;4. </b><code class="display"><span class="extract">packagetype NAME</span></code> declares that <code class="display"><span class="extract">NAME</span></code> is the name of a type of package.
Packages are the main hierarchical organisation for inter files, as we
will see below. Each package has a type as well as a name, and the type
must be one of those declared like this.
</p>
<p class="inwebparagraph">For example, <code class="display"><span class="extract">packagetype _adjective</span></code> creates <code class="display"><span class="extract">_adjective</span></code> as a possible type
for packages in this file.
</p>
<p class="inwebparagraph">The first two package types must be <code class="display"><span class="extract">_plain</span></code> and <code class="display"><span class="extract">_code</span></code>, in that order.
</p>
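<p class="inwebparagraph">The ordering rules for <code class="display"><span class="extract">version</span></code> and <code class="display"><span class="extract">packagetype</span></code> could be checked along these
lines. This is a sketch under the assumption that the global statements have
already been collected into a list of strings; the function name
<code class="display"><span class="extract">check_global_area</span></code> and its messages are invented for illustration.
</p>

```python
def check_global_area(statements):
    """Return a list of problems found in a list of global statements."""
    problems = []
    keywords = [s.split()[0] for s in statements if s.split()]
    # The version statement must come before anything else...
    if not keywords or keywords[0] != "version":
        problems.append("version must come before anything else")
    # ...and there cannot be two of them in the same file.
    if keywords.count("version") > 1:
        problems.append("only one version statement is allowed")
    # The first two package types must be _plain and _code, in that order.
    types = [s.split()[1] for s in statements
             if s.startswith("packagetype ") and len(s.split()) == 2]
    if types[:2] != ["_plain", "_code"]:
        problems.append("first two package types must be _plain and _code")
    return problems
```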
<p class="inwebparagraph">Convention. All of Inform's package type names begin similarly with an
underscore, to prevent name clashes. Inform uses package types semantically,
to show what kind of thing is being defined in the content of a particular
package. This makes it easier to search a large inter repository for all of
the adjective definitions, for example: we just need to look for packages of
type <code class="display"><span class="extract">_adjective</span></code>.
</p>
<p class="inwebparagraph"><a id="SP5"></a><b>&#167;5. </b><code class="display"><span class="extract">pragma TARGET "WHATEVER"</span></code> does not change the meaning of the inter file;
it simply provides pragmatic advice to the eventual compiler of code
generated from this file. <code class="display"><span class="extract">TARGET</span></code> indicates the context for which this
is intended; at present, the only possible choice is <code class="display"><span class="extract">target_I6</span></code>, meaning,
"if you are compiling me to Inform 6".
</p>
<p class="inwebparagraph">Convention. Inform uses this to pass on ICL (Inform Command Language)
commands to Inform 6, such as memory settings or command-line switches.
For example,
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">pragma target_I6 "$MAX_LABELS=200000"</span>
</pre>
<p class="inwebparagraph">(This would be meaningless if we were compiling to some other format.)
</p>
<p class="inwebparagraph"><a id="SP6"></a><b>&#167;6. </b><code class="display"><span class="extract">primitive PRIMITIVE IN -&gt; OUT</span></code> defines a new code statement &mdash; if inter
were an assembly language, these would be the opcodes. For example,
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">primitive !move val val -&gt; void</span>
</pre>
<p class="inwebparagraph">defines the primitive <code class="display"><span class="extract">!move</span></code> as something which consumes two values and
produces none. <code class="display"><span class="extract">IN</span></code> can either be <code class="display"><span class="extract">void</span></code> or can be a list of one or more
terms which are all either <code class="display"><span class="extract">ref</span></code>, <code class="display"><span class="extract">val</span></code> or <code class="display"><span class="extract">code</span></code>. <code class="display"><span class="extract">OUT</span></code> can be either
<code class="display"><span class="extract">void</span></code> or else a single term which is either <code class="display"><span class="extract">ref</span></code> or <code class="display"><span class="extract">val</span></code>. For
example,
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">primitive !plus val val -&gt; val</span>
</pre>
<p class="inwebparagraph">says that <code class="display"><span class="extract">!plus</span></code> consumes two values and produces a new one, while
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">primitive !ifelse val code code -&gt; void</span>
</pre>
<p class="inwebparagraph">says that <code class="display"><span class="extract">!ifelse</span></code> consumes a value and two blocks of code, and produces
nothing. Of course, <code class="display"><span class="extract">!plus</span></code> adds the values, whereas <code class="display"><span class="extract">!ifelse</span></code> evaluates
the value and then executes one of the two code blocks depending on
the result. But at this stage, we don't see the meaning of these
primitives, only their prototypes.
</p>
<p class="inwebparagraph">The third term type, <code class="display"><span class="extract">ref</span></code>, means "a reference to a value", and is in
effect an lvalue rather than an rvalue: for example,
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">primitive !pull ref -&gt; void</span>
</pre>
<p class="inwebparagraph">is the prototype of a primitive which pulls a value from the stack and
stores it in whatever is referred to by the <code class="display"><span class="extract">ref</span></code> (typically, a variable).
</p>
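<p class="inwebparagraph">To make the prototype grammar concrete, here is a small sketch of a parser for
<code class="display"><span class="extract">primitive</span></code> statements. It follows only the rules stated above for <code class="display"><span class="extract">IN</span></code> and
<code class="display"><span class="extract">OUT</span></code>; the function name <code class="display"><span class="extract">parse_primitive</span></code> and the tuple it returns are
assumptions made for this example.
</p>

```python
def parse_primitive(statement):
    """Parse 'primitive NAME IN -> OUT' into (name, inputs, output).

    inputs is a list of 'ref', 'val' or 'code' terms (empty for void);
    output is 'ref', 'val', or None for void.
    """
    tokens = statement.split()
    if tokens[:1] != ["primitive"] or "->" not in tokens:
        raise ValueError("not a primitive statement")
    arrow = tokens.index("->")
    name, ins, outs = tokens[1:2], tokens[2:arrow], tokens[arrow + 1:]
    if len(name) != 1 or not ins or len(outs) != 1:
        raise ValueError("malformed prototype")
    # IN is either void or a list of one or more ref/val/code terms.
    if ins == ["void"]:
        inputs = []
    elif all(term in ("ref", "val", "code") for term in ins):
        inputs = ins
    else:
        raise ValueError("bad input term")
    # OUT is either void or a single ref or val.
    if outs[0] == "void":
        output = None
    elif outs[0] in ("ref", "val"):
        output = outs[0]
    else:
        raise ValueError("bad output term")
    return name[0], inputs, output
```

<p class="inwebparagraph">Note that <code class="display"><span class="extract">code</span></code> is accepted as an input term but rejected as an output, in
line with the prototypes described above.
</p>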
<p class="inwebparagraph">Convention. Inform defines a standard set of around 90 primitives. Although
their names and prototypes are not part of the inter specification as such,
you will only be able to use Inter's "compile to I6" feature if those are
the primitives you use, so in effect this is the standard set. Details of
these primitives and what they do will appear below.
</p>
<p class="inwebparagraph"><a id="SP7"></a><b>&#167;7. Package declarations. </b>After the global area, an inter file should declare a package called <code class="display"><span class="extract">main</span></code>,
which must have the package type <code class="display"><span class="extract">_plain</span></code>.
</p>
<p class="inwebparagraph">The statement <code class="display"><span class="extract">package NAME TYPE</span></code> declares a new package, and the <code class="display"><span class="extract">TYPE</span></code>
must be one of those declared by <code class="display"><span class="extract">packagetype</span></code> statements in the global area.
</p>
<p class="inwebparagraph">The declaration line for a package begins at the level of indentation of
the package's owner. For <code class="display"><span class="extract">main</span></code>, it should be unindented, and this is the
only package allowed to appear at the top level: all other packages should
be inside <code class="display"><span class="extract">main</span></code> in some way.
</p>
<p class="inwebparagraph">The contents of the package are then one tab stop in from the declaration. Thus:
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">package main _plain</span>
<span class="plain"> ...</span>
<span class="plain"> package m1_RBLK1 _code</span>
<span class="plain"> ...</span>
<span class="plain"> package m1_RBLK2 _code</span>
<span class="plain"> ...</span>
</pre>
<p class="inwebparagraph">Here, <code class="display"><span class="extract">main</span></code> contains two sub-packages, <code class="display"><span class="extract">m1_RBLK1</span></code> and <code class="display"><span class="extract">m1_RBLK2</span></code>, and
indentation is used to show which package a statement belongs to.
</p>
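<p class="inwebparagraph">The way indentation determines ownership can be sketched by tracking a stack
of open packages, one per tab depth. The function <code class="display"><span class="extract">package_paths</span></code> is
hypothetical; it writes each package's location as a slash-separated path,
the notation this specification uses for external symbols.
</p>

```python
def package_paths(source):
    """Return the slash-separated path of every package declared in source."""
    paths = []
    stack = []  # stack[d] is the name of the open package at tab depth d
    for line in source.split("\n"):
        body = line.lstrip("\t")
        depth = len(line) - len(body)
        words = body.split()
        if len(words) == 3 and words[0] == "package":
            # A declaration at depth d closes every package at depth d or deeper.
            del stack[depth:]
            stack.append(words[1])
            paths.append("/" + "/".join(stack))
    return paths
```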
<p class="inwebparagraph"><a id="SP8"></a><b>&#167;8. </b>After the declaration line, a package definition continues with a set
of symbol definitions. In effect, this is the symbols table for the
package written out explicitly. Each definition is a <code class="display"><span class="extract">symbol</span></code> line, in
one of these three forms:
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">symbol private TYPE NAME</span>
<span class="plain">symbol public TYPE NAME</span>
<span class="plain">symbol external TYPE NAME == SYMBOL</span>
</pre>
<p class="inwebparagraph">For example,
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">symbol public misc MEMORY_HEAP_SIZE</span>
<span class="plain">symbol external misc AllowInShowme == /main/resources/template/AllowInShowme</span>
</pre>
<p class="inwebparagraph"><code class="display"><span class="extract">private</span></code> means that the meaning and existence of <code class="display"><span class="extract">NAME</span></code> are invisible
from outside the current package; <code class="display"><span class="extract">public</span></code> means that other packages are
allowed to refer to <code class="display"><span class="extract">NAME</span></code>; and <code class="display"><span class="extract">external</span></code> means that this package is
making just such a reference, and that <code class="display"><span class="extract">NAME</span></code> in this package is equivalent
to <code class="display"><span class="extract">SYMBOL</span></code>, defined elsewhere. It is possible that <code class="display"><span class="extract">SYMBOL</span></code> points only to
another symbol which is also <code class="display"><span class="extract">external</span></code>, so that we then have to follow
another link to find the original non-external definition. However, it is
a requirement that this process must eventually end. It would be illegal
to write
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">package main _plain</span>
<span class="plain"> package A _plain</span>
<span class="plain"> symbol external misc S == /main/B/T</span>
<span class="plain"> package B _plain</span>
<span class="plain"> symbol external misc T == /main/A/S</span>
</pre>
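<p class="inwebparagraph">The requirement that a chain of <code class="display"><span class="extract">external</span></code> symbols must eventually end can be
checked with a simple walk that remembers where it has been. This is only a
sketch: the function <code class="display"><span class="extract">resolve</span></code> and the dictionary <code class="display"><span class="extract">links</span></code>, which maps the path
of each external symbol to the path it is equated with, are assumptions made
for illustration.
</p>

```python
def resolve(symbol, links):
    """Follow external links from symbol to its original definition.

    links maps the path of each external symbol to the path it is equated
    with; any path absent from links is taken to be a non-external
    definition. Raises ValueError if the chain never ends.
    """
    seen = set()
    while symbol in links:
        if symbol in seen:
            raise ValueError("circular external symbols at " + symbol)
        seen.add(symbol)
        symbol = links[symbol]
    return symbol
```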
<p class="inwebparagraph">The symbol <code class="display"><span class="extract">TYPE</span></code> must be one of four possibilities:
</p>
<ul class="items"><li>(a) <code class="display"><span class="extract">label</span></code>, used to mark execution positions in code packages;
</li><li>(b) <code class="display"><span class="extract">package</span></code>, meaning that this is the name of a package;
</li><li>(c) <code class="display"><span class="extract">packagetype</span></code>, meaning that this is a package type;
</li><li>(d) <code class="display"><span class="extract">misc</span></code>, meaning "anything else" &mdash; most symbols have this type.
</li></ul>
<p class="inwebparagraph">The run of <code class="display"><span class="extract">symbol</span></code> declarations at the top of a module can become quite
long, since it has to give a complete description of all symbols used inside
the module, whether they're defined internally or externally. As a
convenience for people writing test cases by hand, it's in fact optional
to predeclare a symbol in textual inter provided that this symbol is
declared earlier in the file than its first use. However, when Inter
writes out a textual inter file, it always writes the symbols table out
in full, and never exercises this option.
</p>
<p class="inwebparagraph"><a id="SP9"></a><b>&#167;9. </b>Where a local symbol is being equated with an external one, the <code class="display"><span class="extract">SYMBOL</span></code>
given is a sort of URL showing the package to look inside. Thus
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">/main/resources/template/AllowInShowme</span>
</pre>
<p class="inwebparagraph">means "the symbol <code class="display"><span class="extract">AllowInShowme</span></code> in package <code class="display"><span class="extract">template</span></code> inside package
<code class="display"><span class="extract">resources</span></code> inside package <code class="display"><span class="extract">main</span></code>".
</p>
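<p class="inwebparagraph">Read this way, a symbol path splits into a list of enclosing packages followed
by the symbol name. A minimal sketch, in which the helper name
<code class="display"><span class="extract">split_symbol_path</span></code> is invented for the example:
</p>

```python
def split_symbol_path(path):
    """Split '/main/a/b/NAME' into (['main', 'a', 'b'], 'NAME')."""
    if not path.startswith("/"):
        raise ValueError("expected a path beginning with /")
    parts = path[1:].split("/")
    if "" in parts:
        raise ValueError("empty component in path")
    # Everything but the last component names a package, outermost first.
    return parts[:-1], parts[-1]
```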
<p class="inwebparagraph"><a id="SP10"></a><b>&#167;10. </b>Optionally, a <code class="display"><span class="extract">private</span></code> or <code class="display"><span class="extract">public</span></code> symbol can also specify a name it
wishes to be given when the Inter is translated into some other language
(i.e., Inform 6 or similar). This is written like so:
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">symbol private TYPE NAME -&gt; TRANSLATION</span>
</pre>
<p class="inwebparagraph">So, for example,
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">symbol public misc launcher -&gt; launcher_U32</span>
</pre>
<p class="inwebparagraph">Symbols tabulated as <code class="display"><span class="extract">external</span></code> cannot be marked in this way, but of course
the original definition (to which the external link eventually leads) can be.
For example,
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">package main _plain</span>
<span class="plain"> package A _plain</span>
<span class="plain"> symbol external misc S == /main/B/T</span>
<span class="plain"> package B _plain</span>
<span class="plain"> symbol public misc T -&gt; FancyName </span>
</pre>
<p class="inwebparagraph">would result in the names <code class="display"><span class="extract">S</span></code> and <code class="display"><span class="extract">T</span></code> both being compiled to the name
<code class="display"><span class="extract">FancyName</span></code> in the final code.
</p>
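<p class="inwebparagraph">Putting the forms of the <code class="display"><span class="extract">symbol</span></code> line together, a parser might look like the
following sketch. The function <code class="display"><span class="extract">parse_symbol</span></code> and the dictionary keys it
returns are assumptions; the accepted forms are only those described in this
specification.
</p>

```python
def parse_symbol(statement):
    """Parse a symbol line into a dictionary of its parts."""
    words = statement.split()
    if len(words) not in (4, 6) or words[0] != "symbol":
        raise ValueError("not a symbol statement")
    scope, symbol_type, name = words[1:4]
    if scope not in ("private", "public", "external"):
        raise ValueError("bad scope")
    if symbol_type not in ("label", "package", "packagetype", "misc"):
        raise ValueError("bad symbol type")
    result = {"scope": scope, "type": symbol_type, "name": name,
              "equated_to": None, "translation": None}
    if scope == "external":
        # External symbols must be equated with another symbol and
        # cannot carry a translation.
        if len(words) != 6 or words[4] != "==":
            raise ValueError("external symbol needs == SYMBOL")
        result["equated_to"] = words[5]
    elif len(words) == 6:
        if words[4] != "->":
            raise ValueError("expected -> TRANSLATION")
        result["translation"] = words[5]
    return result
```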
<p class="inwebparagraph">Convention. Inform mostly makes use of this feature of inter late in code
generation, essentially to avoid namespace clashes in the final output code,
but it also needs to use it to implement low-level features of the Inform
language such as:
</p>
<blockquote>
<p>The marked for listing property translates into I6 as "workflag".</p>
</blockquote>
<p class="inwebparagraph"><a id="SP11"></a><b>&#167;11. </b>With the package and its symbol table declared, we can then get on with
the definitions of what is inside the package.
</p>
<p class="inwebparagraph">A package with the special type <code class="display"><span class="extract">_code</span></code> must contain only code statements;
all other packages must contain only data statements. Note that <code class="display"><span class="extract">package</span></code>
is itself a data statement, and it follows that <code class="display"><span class="extract">_code</span></code> packages cannot
contain sub-packages, but that all others can.
</p>
<p class="inwebparagraph">"Data" is a slightly loose phrase for what data statements convey: it
includes metadata, and indeed almost anything other than actual executable
code.
</p>
<p class="inwebparagraph"><a id="SP12"></a><b>&#167;12. Kinds and values. </b>Inter is a very loosely typed language, in the sense that it is possible,
though not obligatory, to require that values conform to particular data types. As in Inform, data
types are called "kinds" in this context (which usefully distinguishes them
from "types" of packages, a completely different concept).
</p>
<p class="inwebparagraph">No kinds are built in: all must be declared before use. However, these
declarations are able to say something about them, so they aren't entirely
abstract. The syntax is:
</p>
<p class="inwebparagraph"></p>
<pre class="display">
<span class="plain">kind NAME CONTENT</span>
</pre>
<p class="inwebparagraph">The <code class="display"><span class="extract">NAME</span></code>, like all names, goes into the owning package's symbol table;
other packages wanting to use this kind will have to have an <code class="display"><span class="extract">external</span></code>
symbol pointing to this definition.
</p>
<p class="inwebparagraph"><code class="display"><span class="extract">CONTENT</span></code> must be one of the following:
</p>
<p class="inwebparagraph"></p>
<ul class="items"><li>(a) <code class="display"><span class="extract">unchecked</span></code>, meaning that absolutely any data can be referred to by this kind;
</li><li>(b) <code class="display"><span class="extract">int32</span></code>, <code class="display"><span class="extract">int16</span></code>, <code class="display"><span class="extract">int8</span></code>, <code class="display"><span class="extract">int2</span></code>, for numerical data stored in these numbers
of bits (which the program may choose to treat as character values, as flags,
as signed or unsigned integers, and so on, as it pleases);
</li><li>(c) <code class="display"><span class="extract">text</span></code>, meaning text;
</li><li>(d) <code class="display"><span class="extract">enum</span></code>, meaning that data of this kind must be equal to one (and only one)
of the enumerated constants with this kind;
</li><li>(e) <code class="display"><span class="extract">table</span></code>, a special sort of data referring to tables made up of columns each
of which has a different kind;
</li><li>(f) <code class="display"><span class="extract">list of K</span></code>, meaning that data must be a list, each of whose terms is
data of kind <code class="display"><span class="extract">K</span></code> &mdash; which must be a kind name known to the symbols table
of the package in which this definition occurs;
</li><li>(g) <code class="display"><span class="extract">column of K</span></code>, similarly, but for a table column;
</li><li>(h) <code class="display"><span class="extract">relation of K1 to K2</span></code>, meaning that data must be such a relation, in the
same sort of sense as in Inform;
</li><li>(i) <code class="display"><span class="extract">description of K</span></code>, meaning that data must be a description which either
matches or does not match values of kind <code class="display"><span class="extract">K</span></code>;
</li><li>(j) <code class="display"><span class="extract">struct</span></code>, which is similar to <code class="display"><span class="extract">list of K</span></code>, but which has entries which do
not all have to have the same kind;
</li><li>(k) and <code class="display"><span class="extract">routine</span></code>, meaning that data must be references to functions.
</li></ul>
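<p class="inwebparagraph">The possible forms of <code class="display"><span class="extract">CONTENT</span></code> listed above can be recognised with a short
sketch like this one. The function <code class="display"><span class="extract">parse_kind</span></code> and the kind names used when
calling it (such as <code class="display"><span class="extract">K_number</span></code>) are hypothetical, and the sketch takes the list
above at face value: it does not check that operand kinds are known to the
symbols table.
</p>

```python
# Contents which stand alone, with no operand kinds.
SIMPLE_CONTENTS = {"unchecked", "int32", "int16", "int8", "int2",
                   "text", "enum", "table", "struct", "routine"}


def parse_kind(statement):
    """Parse 'kind NAME CONTENT' into (name, constructor, operand kinds)."""
    words = statement.split()
    if words[:1] != ["kind"] or len(words) in (1, 2):
        raise ValueError("not a kind statement")
    name, content = words[1], words[2:]
    if len(content) == 1 and content[0] in SIMPLE_CONTENTS:
        return name, content[0], []
    # list of K, column of K, description of K
    if (len(content) == 3 and content[1] == "of"
            and content[0] in ("list", "column", "description")):
        return name, content[0], [content[2]]
    # relation of K1 to K2
    if (len(content) == 5 and content[:2] == ["relation", "of"]
            and content[3] == "to"):
        return name, "relation", [content[2], content[4]]
    raise ValueError("unrecognised content")
```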
<p class="inwebparagraph"><a id="SP13"></a><b>&#167;13. </b>In the remainder of this specification, <code class="display"><span class="extract">VALUE</span></code> means either the name of
a defined <code class="display"><span class="extract">constant</span></code> (see below), or else a literal. The following notation
is used for literals:
</p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"><a id="SP14"></a><b>&#167;14. Code statements. </b></p>
<p class="inwebparagraph">...
</p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<p class="inwebparagraph"></p>
<hr class="tocbar">
<ul class="toc"><li><a href="P-ui.html">Back to 'Using Inter'</a></li><li><a href="P-cas.html">Continue with 'Chains and Stages'</a></li></ul><hr class="tocbar">
<!--End of weave-->
</body>
</html>


@ -124,7 +124,7 @@ the file system location <code class="display"><span class="extract">T</span></c
</p>
<hr class="tocbar">
<ul class="toc"><li><i>(This section begins Preliminaries.)</i></li><li><a href="P-cas.html">Continue with 'Chains and Stages'</a></li></ul><hr class="tocbar">
<ul class="toc"><li><i>(This section begins Preliminaries.)</i></li><li><a href="P-ti.html">Continue with 'Textual Inter'</a></li></ul><hr class="tocbar">
<!--End of weave-->
</body>
</html>


@ -19,6 +19,10 @@
<p><a href="P-ui.html"><span class="sectiontitle">Using Inter</span></a> -
<span class="purpose">Using Inter at the command line.</span></p>
</li>
<li>
<p><a href="P-ti.html"><span class="sectiontitle">Textual Inter</span></a> -
<span class="purpose">A specification of the inter language, as written out in text file form.</span></p>
</li>
<li>
<p><a href="P-cas.html"><span class="sectiontitle">Chains and Stages</span></a> -
<span class="purpose">Sequences of named code-generation stages are called chains.</span></p>


@ -12,6 +12,7 @@ Import: codegen
Preliminaries
Using Inter
Textual Inter
Chains and Stages
Chapter 1: Everything


@ -3,19 +3,12 @@ Chains and Stages.
Sequences of named code-generation stages are called chains.
@h Stages and descriptions.
Inter code has three representations: as a binary file, as a textual file,
and in memory -- a sort of cross-referenced form of binary. For speed, the
Inform compiler generates memory inter directly, and code-generates from
that, so that the inter is normally never written out to disc. When Inter
performs a conversion, it loads (say) textual inter into memory inter, then
writes that out as binary inter.
A processing stage is a step in code generation which acts on a repository
of inter in memory. Some stages change, add to or edit down that code, while
others leave it untouched but output a file based on it.
A processing stage is a step in code generation which acts on memory inter.
Some stages change, add to or edit down that code, while others leave it
untouched but output a file based on it.
Each stage can see an entire "repository" of inter code at a time, and is
not restricted to working through in sequence. Those which read in or write
Each stage can see an entire repository of inter code at a time, and is
not restricted to working through it in sequence. Those which read in or write
out a file also have a filename supplied to them as a parameter, but there
are otherwise no configuration options. It's not possible to tell a stage
to work on one specific function alone, for example.


@ -0,0 +1,316 @@
Textual Inter.
A specification of the inter language, as written out in text file form.
@h Textual, Binary, Memory.
Inter code has three representations: as a binary file, as a textual file,
and in memory -- a sort of cross-referenced form of binary. For speed, the
Inform compiler generates memory inter directly, and code-generates from
that, so that the inter is normally never written out to disc. When Inter
performs a conversion, it loads (say) textual inter into memory inter, then
writes that out as binary inter.
The following specification covers the inter language in its textual form:
a UTF-8 encoded text file which conventionally takes the file extension
".intert".
It should be stressed that inter is designed for inspection -- that is, for
people to be able to read. It's not intended as a programming language for
humans to write: the code is verbose and low-level. The idea is that inter
code will be written by programs (such as Inform), but that this code will
be possible for humans to check.
Like assembly language, inter code is line-based: each line is a "statement".
Lines can be of arbitrary length. A line beginning with a |#| (in column 1) is
a comment, and blank lines are ignored.
The term "name" below means a string of one or more English upper or lower
case letters, underscores, or digits, except that it must not begin with
a digit.
As in Python, indentation from the left margin is highly significant, and
should be in the form of tab characters.
Inform follows certain conventions in the inter that it writes, but these
conventions are not part of the specification, and may change. Any paragraph
below which begins with "Convention" records the current practice.
There are three forms of statement: global statements, data statements, and
code statements. We will take these in turn.
@h Global statements.
These statements must appear first in the file, and must be unindented.
There are only four of these:
@ |version NUMBER| indicates that the file was written in that version of
the inter language. At present there has only ever been one version, but
that may not always be true. A |version| statement must come before
anything else, even other global statements; in particular, there cannot be
two such statements in the same file.
Convention. Inform always opens with the statement: |version 1|
@ |packagetype NAME| declares that |NAME| is the name of a type of package.
Packages are the main hierarchical organisation for inter files, as we
will see below. Each package has a type as well as a name, and the type
must be one of those declared like this.
For example, |packagetype _adjective| creates |_adjective| as a possible type
for packages in this file.
The first two package types must be |_plain| and |_code|, in that order.
Convention. All of Inform's package type names begin similarly with an
underscore, to prevent name clashes. Inform uses package types semantically,
to show what kind of thing is being defined in the content of a particular
package. This makes it easier to search a large inter repository for all of
the adjective definitions, for example: we just need to look for packages of
type |_adjective|.
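The ordering rules for |version| and |packagetype| can be checked mechanically. The sketch below is one possible reading of those rules, not Inter's own validation: the function name |check_global_area| is hypothetical, and it assumes that a file with no |version| statement at all is acceptable, which the text above does not settle.

```python
def check_global_area(statements):
    """Return a list of problems found in a list of global statement strings."""
    problems = []
    # Positions of every 'version' statement in the global area.
    version_positions = [i for i, s in enumerate(statements)
                         if s.split()[:1] == ["version"]]
    if len(version_positions) > 1:
        problems.append("more than one version statement")
    if version_positions and version_positions[0] != 0:
        problems.append("version statement is not first")
    # Package type names, in order of declaration.
    types = [s.split()[1] for s in statements
             if s.split()[:1] == ["packagetype"] and len(s.split()) == 2]
    if types[:2] != ["_plain", "_code"]:
        problems.append("first two package types must be _plain then _code")
    return problems
```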
@ |pragma TARGET "WHATEVER"| does not change the meaning of the inter file;
it simply provides pragmatic advice to the eventual compiler of code
generated from this file. |TARGET| indicates the context for which this
is intended; at present, the only possible choice is |target_I6|, meaning,
"if you are compiling me to Inform 6".
Convention. Inform uses this to pass on ICL (Inform Command Language)
commands to Inform 6, such as memory settings or command-line switches.
For example,
|pragma target_I6 "$MAX_LABELS=200000"|
(This would be meaningless if we were compiling to some other format.)
@ |primitive PRIMITIVE IN -> OUT| defines a new code statement -- if inter
were an assembly language, these would be the opcodes. For example,
|primitive !move val val -> void|
defines the primitive |!move| as something which consumes two values and
produces none. |IN| can either be |void| or can be a list of one or more
terms which are all either |ref|, |val| or |code|. |OUT| can be either
|void| or else a single term which is either |ref| or |val|. For
example,
|primitive !plus val val -> val|
says that |!plus| consumes two values and produces a new one, while
|primitive !ifelse val code code -> void|
says that |!ifelse| consumes a value and two blocks of code, and produces
nothing. Of course, |!plus| adds the values, whereas |!ifelse| evaluates
the value and then executes one of the two code blocks depending on
the result. But at this stage, we don't see the meaning of these
primitives, only their prototypes.
The third term type, |ref|, means "a reference to a value", and is in
effect an lvalue rather than an rvalue: for example,
|primitive !pull ref -> void|
is the prototype of a primitive which pulls a value from the stack and
stores it in whatever is referred to by the |ref| (typically, a variable).
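The prototype rules can be summarised in a small parser. This is a sketch only, with a hypothetical function name |parse_primitive|; it checks the |IN| and |OUT| terms as described above, but does not insist that the primitive's name begins with |!|, since that is not stated as a rule.

```python
IN_TERMS = {"ref", "val", "code"}   # terms allowed on the IN side
OUT_TERMS = {"ref", "val"}          # terms allowed on the OUT side

def parse_primitive(statement: str):
    """Parse 'primitive NAME IN -> OUT' into (name, inputs, output).

    inputs is a list of terms ([] for void); output is a term, or None
    for void. Raises ValueError if the prototype breaks the rules above.
    """
    words = statement.split()
    if len(words) < 5 or words[0] != "primitive" or words[-2] != "->":
        raise ValueError("not a primitive statement: " + statement)
    name, ins, out = words[1], words[2:-2], words[-1]
    if ins == ["void"]:
        inputs = []
    elif all(term in IN_TERMS for term in ins):
        inputs = ins
    else:
        raise ValueError("bad IN terms: " + " ".join(ins))
    if out == "void":
        output = None
    elif out in OUT_TERMS:
        output = out
    else:
        raise ValueError("bad OUT term: " + out)
    return name, inputs, output
```

For instance, |parse_primitive("primitive !ifelse val code code -> void")| yields the name |!ifelse|, three input terms, and no output.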
Convention. Inform defines a standard set of around 90 primitives. Although
their names and prototypes are not part of the inter specification as such,
you will only be able to use Inter's "compile to I6" feature if those are
the primitives you use, so in effect this is the standard set. Details of
these primitives and what they do will appear below.
@h Package declarations.
After the global area, an inter file should declare a package called |main|,
which must have the package type |_plain|.
The statement |package NAME TYPE| declares a new package, and the |TYPE|
must be one of those declared by |packagetype| statements in the global area.
The declaration line for a package begins at the level of indentation of
the package's owner. For |main|, it should be unindented, and this is the
only package allowed to appear at the top level: all other packages should
be inside |main| in some way.
The contents of the package are then one tab stop in from the declaration. Thus:
|package main _plain|
|	...|
|	package m1_RBLK1 _code|
|		...|
|	package m1_RBLK2 _code|
|		...|
Here, |main| contains two sub-packages, |m1_RBLK1| and |m1_RBLK2|, and
indentation is used to show which package a statement belongs to.
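To make the role of indentation concrete, here is a sketch which recovers the package hierarchy from indentation depths. It takes (depth, text) pairs, where depth is the number of leading tabs on a line. The function name |package_paths| is hypothetical, and writing the results in the |/main/...| style is a choice made for this illustration.

```python
def package_paths(statements):
    """Given (depth, text) pairs, return the path of every package declared."""
    stack = []   # (depth of declaration line, name) for each open package
    paths = []
    for depth, text in statements:
        # A line at or to the left of an open package's declaration closes it.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        words = text.split()
        if len(words) == 3 and words[0] == "package":
            stack.append((depth, words[1]))
            paths.append("/" + "/".join(name for _, name in stack))
    return paths
```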
@ After the declaration line, a package definition continues with a set
of symbol definitions. In effect, this is the symbols table for the
package written out explicitly. Each definition is a |symbol| line, in
one of these three forms:
|symbol private TYPE NAME|
|symbol public TYPE NAME|
|symbol external TYPE NAME == SYMBOL|
For example,
|symbol public misc MEMORY_HEAP_SIZE|
|symbol external misc AllowInShowme == /main/resources/template/AllowInShowme|
|private| means that the meaning and existence of |NAME| are invisible
from outside the current package; |public| means that other packages are
allowed to refer to |NAME|; and |external| means that this package is
making just such a reference, and that |NAME| in this package is equivalent
to |SYMBOL|, defined elsewhere. It is possible that |SYMBOL| points only to
another symbol which is also |external|, so that we then have to follow
another link to find the original non-external definition. However, it is
a requirement that this process must eventually end. It would be illegal
to write
|package main _plain|
|	package A _plain|
|		symbol external misc S == /main/B/T|
|	package B _plain|
|		symbol external misc T == /main/A/S|
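The requirement that a chain of |external| links must eventually end can be checked by following the chain and remembering where we have been. This is a minimal sketch under assumed data structures: |symbols| is a hypothetical dictionary mapping each full symbol path either to |None| (a non-external definition) or to the path it is equated with.

```python
def resolve(symbols, path):
    """Follow external links from path until a non-external symbol is reached.

    Raises ValueError if the links loop back on themselves, and KeyError
    if a link points to a symbol which is not in the table at all.
    """
    seen = []
    while symbols[path] is not None:
        if path in seen:
            raise ValueError("external symbols form a loop: "
                             + " == ".join(seen + [path]))
        seen.append(path)
        path = symbols[path]
    return path
```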
The symbol |TYPE| must be one of four possibilities:
(a) |label|, used to mark execution positions in code packages;
(b) |package|, meaning that this is the name of a package;
(c) |packagetype|, meaning that this is a package type;
(d) |misc|, meaning "anything else" -- most symbols have this type.
The run of |symbol| declarations at the top of a package can become quite
long, since it has to give a complete description of all symbols used inside
the package, whether they're defined internally or externally. As a
convenience for people writing test cases by hand, it's in fact optional
to predeclare a symbol in textual inter provided that this symbol is
declared earlier in the file than its first use. However, when Inter
writes out a textual inter file, it always writes the symbols table out
in full, and never exercises this option.
@ Where a local symbol is being equated with an external one, the |SYMBOL|
given is a sort of URL showing the package to look inside. Thus
|/main/resources/template/AllowInShowme|
means "the symbol |AllowInShowme| in package |template| inside package
|resources| inside package |main|".
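Splitting such a path into its package chain and final symbol name is straightforward. In this sketch the function name |split_symbol_path| is hypothetical, and it assumes that every path names at least one package before the symbol.

```python
def split_symbol_path(url: str):
    """Split '/main/a/b/Name' into (['main', 'a', 'b'], 'Name')."""
    if not url.startswith("/"):
        raise ValueError("symbol path must start with '/': " + url)
    parts = url[1:].split("/")
    if len(parts) < 2 or "" in parts:
        raise ValueError("malformed symbol path: " + url)
    return parts[:-1], parts[-1]
```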
@ Optionally, a |private| or |public| symbol can also specify a name it
wishes to be given when the Inter is translated into some other language
(i.e., Inform 6 or similar). This is written like so:
|symbol private TYPE NAME -> TRANSLATION|
So, for example,
|symbol public misc launcher -> launcher_U32|
Symbols tabulated as |external| cannot be marked in this way, but of course
the original definition (to which the external link eventually leads) can be.
For example,
|package main _plain|
|	package A _plain|
|		symbol external misc S == /main/B/T|
|	package B _plain|
|		symbol public misc T -> FancyName|
would result in the names |S| and |T| both being compiled to the name
|FancyName| in the final code.
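Putting the forms of |symbol| line together, a parser might look like the following. This is a sketch rather than Inter's own parser: the function name |parse_symbol| and the dictionary keys are hypothetical. It accepts the three forms given earlier, allows a translation only on |private| and |public| symbols, and restricts the symbol type to the four possibilities listed above.

```python
SYMBOL_TYPES = {"label", "package", "packagetype", "misc"}

def parse_symbol(statement: str):
    """Parse a symbol line into a dict; raise ValueError if it is malformed."""
    words = statement.split()
    if len(words) < 4 or words[0] != "symbol":
        raise ValueError("not a symbol statement: " + statement)
    scope, sym_type, name, rest = words[1], words[2], words[3], words[4:]
    if sym_type not in SYMBOL_TYPES:
        raise ValueError("unknown symbol type: " + sym_type)
    entry = {"scope": scope, "type": sym_type, "name": name,
             "equates": None, "translation": None}
    if scope == "external" and len(rest) == 2 and rest[0] == "==":
        entry["equates"] = rest[1]          # symbol external TYPE NAME == SYMBOL
    elif scope in ("private", "public") and rest == []:
        pass                                # no translation requested
    elif scope in ("private", "public") and len(rest) == 2 and rest[0] == "->":
        entry["translation"] = rest[1]      # ... NAME -> TRANSLATION
    else:
        raise ValueError("malformed symbol statement: " + statement)
    return entry
```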
Convention. Inform mostly makes use of this feature of inter late in code
generation, essentially to avoid namespace clashes in the final output code,
but it also needs to use it to implement low-level features of the Inform
language such as:
>> The marked for listing property translates into I6 as "workflag".
@ With the package and its symbol table declared, we can then get on with
the definitions of what is inside the package.
A package with the special type |_code| must contain only code statements;
all other packages must contain only data statements. Note that |package|
is itself a data statement, and it follows that |_code| packages cannot
contain sub-packages, but that all others can.
"Data" is a slightly loose phrase for what data statements convey: it
includes metadata, and indeed almost anything other than actual executable
code.
@h Kinds and values.
Inter is a very loosely typed language: it is possible, but not compulsory,
to require that values conform to particular data types. As in Inform, data
types are called "kinds" in this context (which usefully distinguishes them
from "types" of packages, a completely different concept).
No kinds are built in: all must be declared before use. However, these
declarations are able to say something about them, so they aren't entirely
abstract. The syntax is:
|kind NAME CONTENT|
The |NAME|, like all names, goes into the owning package's symbol table;
other packages wanting to use this kind will have to have an |external|
symbol pointing to this definition.
|CONTENT| must be one of the following:
(a) |unchecked|, meaning that absolutely any data can be referred to by this type;
(b) |int32|, |int16|, |int8|, |int2|, for numerical data stored in these numbers
of bits (which the program may choose to treat as character values, as flags,
as signed or unsigned integers, and so on, as it pleases);
(c) |text|, meaning text;
(d) |enum|, meaning that data of this kind must be equal to one (and only one)
of the enumerated constants with this kind;
(e) |table|, a special sort of data referring to tables made up of columns each
of which has a different kind;
(f) |list of K|, meaning that data must be a list, each of whose terms is
data of kind |K| -- which must be a kind name known to the symbols table
of the package in which this definition occurs;
(g) |column of K|, similarly, but for a table column;
(h) |relation of K1 to K2|, meaning that data must be such a relation, in the
same sort of sense as in Inform;
(i) |description of K|, meaning that data must be a description which either
matches or does not match values of kind |K|;
(j) |struct|, which is similar to |list of K|, but which has entries which do
not all have to have the same kind;
(k) and |routine|, meaning that data must be references to functions.
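The possible forms of |CONTENT| can be recognised with a short parser. This is only a sketch: the function name |parse_kind| is hypothetical, and so are kind names such as |K_number| in the usage note. It does not check that each |K| is known to the package's symbols table, as the specification requires.

```python
import re

NAME = r"[A-Za-z_][A-Za-z0-9_]*"
SIMPLE = {"unchecked", "int32", "int16", "int8", "int2",
          "text", "enum", "table", "struct", "routine"}
CONSTRUCTED = re.compile(r"(list|column|description) of (" + NAME + r")\Z")
RELATION = re.compile(r"relation of (" + NAME + r") to (" + NAME + r")\Z")

def parse_kind(statement: str):
    """Parse 'kind NAME CONTENT' into (name, constructor, operand kind names)."""
    words = statement.split()
    if len(words) < 3 or words[0] != "kind":
        raise ValueError("not a kind statement: " + statement)
    name, content = words[1], " ".join(words[2:])
    if content in SIMPLE:
        return name, content, []
    match = CONSTRUCTED.match(content)
    if match:
        return name, match.group(1), [match.group(2)]
    match = RELATION.match(content)
    if match:
        return name, "relation", [match.group(1), match.group(2)]
    raise ValueError("unrecognised kind content: " + content)
```

For example, |parse_kind("kind K_list list of K_number")| reports the constructor |list| with the single operand |K_number|.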
@ In the remainder of this specification, |VALUE| means either the name of
a defined |constant| (see below), or else a literal. The following notation
is used for literals:
@h Code statements.
...