mirror of https://github.com/ganelson/inform.git synced 2024-07-16 22:14:23 +03:00
inform7/docs/words-module/4-ap.html
2020-05-12 23:33:17 +01:00

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>About Preform</title>
<link href="../docs-assets/Breadcrumbs.css" rel="stylesheet" rev="stylesheet" type="text/css">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="en-gb">
<link href="../docs-assets/Contents.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Progress.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Navigation.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Fonts.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Base.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Colours.css" rel="stylesheet" rev="stylesheet" type="text/css">
</head>
<body class="commentary-font">
<nav role="navigation">
<h1><a href="../index.html">
<img src="../docs-assets/Inform.png" height="72">
</a></h1>
<ul><li><a href="../compiler.html">compiler tools</a></li>
<li><a href="../other.html">other tools</a></li>
<li><a href="../extensions.html">extensions and kits</a></li>
<li><a href="../units.html">unit test tools</a></li>
</ul><h2>Compiler Webs</h2><ul>
<li><a href="../inbuild/index.html">inbuild</a></li>
<li><a href="../inform7/index.html">inform7</a></li>
<li><a href="../inter/index.html">inter</a></li>
</ul><h2>Inbuild Modules</h2><ul>
<li><a href="../supervisor-module/index.html">supervisor</a></li>
</ul><h2>Inform7 Modules</h2><ul>
<li><a href="../core-module/index.html">core</a></li>
<li><a href="../inflections-module/index.html">inflections</a></li>
<li><a href="../linguistics-module/index.html">linguistics</a></li>
<li><a href="../kinds-module/index.html">kinds</a></li>
<li><a href="../if-module/index.html">if</a></li>
<li><a href="../multimedia-module/index.html">multimedia</a></li>
<li><a href="../problems-module/index.html">problems</a></li>
<li><a href="../index-module/index.html">index</a></li>
</ul><h2>Inter Modules</h2><ul>
<li><a href="../bytecode-module/index.html">bytecode</a></li>
<li><a href="../building-module/index.html">building</a></li>
<li><a href="../codegen-module/index.html">codegen</a></li>
</ul><h2>Shared Modules</h2><ul>
<li><a href="../arch-module/index.html">arch</a></li>
<li><a href="../syntax-module/index.html">syntax</a></li>
<li><a href="index.html"><span class="selectedlink">words</span></a></li>
<li><a href="../html-module/index.html">html</a></li>
<li><a href="../../../inweb/docs/foundation-module/index.html">foundation</a></li>
</ul>
</nav>
<main role="main">
<!--Weave of 'About Preform' generated by Inweb-->
<div class="breadcrumbs">
<ul class="crumbs"><li><a href="../index.html">Home</a></li><li><a href="../compiler.html">Shared Modules</a></li><li><a href="index.html">words</a></li><li><a href="index.html#4">Chapter 4: Parsing</a></li><li><b>About Preform</b></li></ul></div>
<p class="purpose">A brief guide to Preform and how to use it.</p>
<p class="commentary firstcommentary"><a id="SP1"></a><b>&#167;1. </b>That's what it would look like in the Preform file, but here is how it's
typed in the Inform source code. Definitions like this one are scattered all
across the Inform web, in order to keep them close to the code which relates to
them. The <span class="extract"><span class="extract-syntax">inweb</span></span> tangler compiles them in two halves: the instructions right
of the <span class="extract"><span class="extract-syntax">==&gt;</span></span> arrows are extracted and compiled into a C routine called the
"compositor" for the nonterminal (see below), while the actual grammar is
extracted and placed into Inform's "Preform.txt" file.
</p>
<p class="commentary">In the document of Preform grammar extracted from Inform's source code to
lay the language out for translators, the <span class="extract"><span class="extract-syntax">==&gt;</span></span> arrows and formulae to the
right of them are omitted &mdash; those represent semantics, not syntax.
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> &lt;competitor&gt; ::=</span>
<span class="plain-syntax"> &lt;ordinal-number&gt; runner | ==&gt; TRUE</span>
<span class="plain-syntax"> runner no &lt;cardinal-number&gt; ==&gt; FALSE</span>
</pre>
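<p class="commentary">As a rough illustration of that two-way split, here is a minimal Python sketch,
not the way <span class="extract"><span class="extract-syntax">inweb</span></span> is actually implemented. The function names
<span class="extract"><span class="extract-syntax">split_production</span></span> and <span class="extract"><span class="extract-syntax">split_definition</span></span> are hypothetical. Each line is cut
at the arrow: the left half is what would go into the grammar file, and the
right half is what would be handed to the compositor.
</p>
<pre class="displayed-code all-displayed-code code-font">
```python
ARROW = "==>"


def split_production(line):
    """Split one source line into (grammar, expression).

    The expression is None when the line carries no arrow.
    """
    grammar, arrow, expression = line.partition(ARROW)
    if not arrow:
        return line.strip(), None
    return grammar.strip(), expression.strip()


def split_definition(lines):
    """Return (grammar_lines, expressions) for a whole definition."""
    pairs = [split_production(line) for line in lines]
    grammar_lines = [grammar for grammar, _ in pairs]
    expressions = [expression for _, expression in pairs]
    return grammar_lines, expressions
```
</pre>
<p class="commentary">Fed the three lines of the definition above, this sketch would return the
grammar with the arrows stripped away, alongside the list of expressions, with
<span class="extract"><span class="extract-syntax">None</span></span> standing in for the header line, which has no arrow.
</p>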
<p class="commentary firstcommentary"><a id="SP2"></a><b>&#167;2. </b>Each nonterminal, when successfully matched, can provide either or both of
two results, though usually just one: an integer, to be stored in <span class="extract"><span class="extract-syntax">*X</span></span>, and a void pointer,
to be stored in <span class="extract"><span class="extract-syntax">*XP</span></span>. For example, &lt;k-kind&gt; matches if and only if the
text declares a legal kind, such as "number"; its pointer result is to the
kind found, such as <span class="extract"><span class="extract-syntax">K_number</span></span>. But &lt;competitor&gt; only results in an integer.
The <span class="extract"><span class="extract-syntax">==&gt;</span></span> arrow is optional, but if present, it says what the result is if
the given production is matched; the <span class="extract"><span class="extract-syntax">inweb</span></span> tangler, if it sees an expression
on the right of the arrow, assigns that value to the integer result. So,
for example, "runner bean" or "beetroot" would not match &lt;competitor&gt;;
"4th runner" would match with integer result <span class="extract"><span class="extract-syntax">TRUE</span></span>; "runner no 17" would
match with integer result <span class="extract"><span class="extract-syntax">FALSE</span></span>.
</p>
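<p class="commentary">The two kinds of result can be modelled in a few lines of Python. This is only
a sketch under stated assumptions: the real nonterminals are compiled C, the
regular expressions here are crude stand-ins for &lt;ordinal-number&gt; and
&lt;cardinal-number&gt;, and <span class="extract"><span class="extract-syntax">K_NUMBER</span></span> is a hypothetical placeholder for
<span class="extract"><span class="extract-syntax">K_number</span></span>. Each function returns a triple of whether it matched, the integer
result, and the pointer result.
</p>
<pre class="displayed-code all-displayed-code code-font">
```python
import re

TRUE, FALSE = 1, 0

K_NUMBER = object()           # hypothetical stand-in for the kind K_number
KINDS = {"number": K_NUMBER}  # a one-entry table of legal kind names

# Simplified stand-ins for the two number nonterminals (digits only).
ORDINAL = re.compile(r"\d+(st|nd|rd|th)")
CARDINAL = re.compile(r"\d+")


def match_k_kind(text):
    """Return (matched, X, XP); only the pointer result XP is meaningful."""
    kind = KINDS.get(text)
    if kind is None:
        return False, 0, None
    return True, 0, kind


def match_competitor(text):
    """Return (matched, X, XP); only the integer result X is meaningful."""
    words = text.split()
    # First production: an ordinal followed by the word "runner".
    if len(words) == 2 and ORDINAL.fullmatch(words[0]) and words[1] == "runner":
        return True, TRUE, None
    # Second production: "runner no" followed by a cardinal.
    if len(words) == 3 and words[:2] == ["runner", "no"] and CARDINAL.fullmatch(words[2]):
        return True, FALSE, None
    return False, 0, None
```
</pre>
<p class="commentary">In this model "4th runner" and "runner no 17" both match, with integer results
<span class="extract"><span class="extract-syntax">TRUE</span></span> and <span class="extract"><span class="extract-syntax">FALSE</span></span> respectively, while "runner bean" and "beetroot" do not
match at all.
</p>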
<p class="commentary">Usually, though, the result(s) of a nonterminal depend on the result(s) of
other nonterminals used to make the match. In the compositing expression,
so called because it composes together the various intermediate results into
one final result, <span class="extract"><span class="extract-syntax">R[1]</span></span> is the integer result of the first nonterminal in
the production, <span class="extract"><span class="extract-syntax">R[2]</span></span> the second, and so on; <span class="extract"><span class="extract-syntax">RP[1]</span></span> and so on hold the
pointer results. Here, on both productions, there's just one nonterminal
in the line, &lt;ordinal-number&gt; in the first case, &lt;cardinal-number&gt; in
the second. So the following refinement of &lt;competitor&gt; means that "4th
runner" matches with integer result 4, because &lt;ordinal-number&gt; matches
"4th" with integer result 4, and that goes into <span class="extract"><span class="extract-syntax">R[1]</span></span>. Similarly,
"runner no 17" ends up with integer result 17. "The pacemaker" matches
with integer result 1; here there are no intermediate results to make use
of, so <span class="extract"><span class="extract-syntax">R[...]</span></span> can't be used.
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> &lt;competitor&gt; ::=</span>
<span class="plain-syntax"> the pacemaker | ==&gt; 1</span>
<span class="plain-syntax"> &lt;ordinal-number&gt; runner | ==&gt; R[1]</span>
<span class="plain-syntax"> runner no &lt;cardinal-number&gt; ==&gt; R[1]</span>
</pre>
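<p class="commentary">The compositing step for this refined grammar can be sketched in Python as
follows. This is an illustrative model only, not Inform's code: the table
<span class="extract"><span class="extract-syntax">PRODUCTIONS</span></span>, the helper functions and the digits-only number matching are
all assumptions made for the example. Intermediate integer results are collected
into a list <span class="extract"><span class="extract-syntax">R</span></span>, indexed from 1 as in the text, and each production's compositor
turns that list into the final result.
</p>
<pre class="displayed-code all-displayed-code code-font">
```python
import re


def ordinal_number(word):
    """Return (matched, value); for example "4th" gives (True, 4)."""
    m = re.fullmatch(r"(\d+)(st|nd|rd|th)", word)
    return (True, int(m.group(1))) if m else (False, 0)


def cardinal_number(word):
    """Return (matched, value); for example "17" gives (True, 17)."""
    m = re.fullmatch(r"\d+", word)
    return (True, int(word)) if m else (False, 0)


# Each production pairs a pattern with a compositor. Pattern items are either
# literal words or functions standing in for nonterminals.
PRODUCTIONS = [
    (["the", "pacemaker"], lambda R: 1),
    ([ordinal_number, "runner"], lambda R: R[1]),
    (["runner", "no", cardinal_number], lambda R: R[1]),
]


def match_competitor(text):
    """Return (matched, integer_result) for the refined competitor grammar."""
    words = text.lower().split()
    for pattern, compositor in PRODUCTIONS:
        if len(words) != len(pattern):
            continue
        R = [None]  # slot 0 is unused, so R[1] is the first nonterminal
        ok = True
        for word, item in zip(words, pattern):
            if callable(item):
                matched, value = item(word)
                if not matched:
                    ok = False
                    break
                R.append(value)
            elif word != item:
                ok = False
                break
        if ok:
            return True, compositor(R)
    return False, 0
```
</pre>
<p class="commentary">The first production has no nonterminals, so its compositor ignores <span class="extract"><span class="extract-syntax">R</span></span> and
returns a constant; the other two simply pass <span class="extract"><span class="extract-syntax">R[1]</span></span> through.
</p>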
<p class="commentary firstcommentary"><a id="SP3"></a><b>&#167;3. </b>The arrows and expressions are optional, and if they are omitted, then the
result integer is set to the production number, counting up from 0. For
example, given the following, "polkadot" matches with result 1, and "green"
with result 2.
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> &lt;race-jersey&gt; ::=</span>
<span class="plain-syntax"> yellow | polkadot | green | white</span>
</pre>
<p class="commentary">Since I have found that well-known computer programmers look at me strangely
when I tell them that Inform doesn't use <span class="extract"><span class="extract-syntax">yacc</span></span>, or <span class="extract"><span class="extract-syntax">antlr</span></span>, or for that
matter any of the elegant theory of LALR parsers, perhaps an explanation
is called for.
</p>
<p class="commentary">One reason is that I am sceptical that formal grammars specify natural language
terribly well, which is ironic, considering that the relevant computer
science, dating from the 1950s and 1960s, was strongly influenced by Noam
Chomsky's generative linguistics. Such formal descriptions tend to be too rigid
to be applied universally. The classical use case for <span class="extract"><span class="extract-syntax">yacc</span></span> is to manage
hierarchies of associative operators on different levels: well, natural language
doesn't have those.
</p>
<p class="commentary">Another reason is that <span class="extract"><span class="extract-syntax">yacc</span></span>-style grammars tend to react badly to uncompliant
input: that is, they correctly reject it, but are bad at diagnosing the
problem, and at recovering their wits afterwards. For Inform purposes, this
would be too sloppy: the user more often miscompiles than compiles, and quality
lies in how good our problem messages are in reply.
</p>
<p class="commentary">Lastly, there are two pragmatic reasons. In order to make Preform grammar
extensible, we couldn't use a parser-compiler like <span class="extract"><span class="extract-syntax">yacc</span></span> anyway: we have to
interpret our grammar, not compile code to parse it. And we also want speed;
folk wisdom has it that <span class="extract"><span class="extract-syntax">yacc</span></span> parsers are about half as fast as a shrewdly
hand-coded equivalent. (<span class="extract"><span class="extract-syntax">gcc</span></span> abandoned the use of <span class="extract"><span class="extract-syntax">bison</span></span> for exactly this
reason some years ago.) Until Preform's arrival in February 2011, Inform had a
hard-coded syntax analyser scattered throughout its code, which often made what
were provably the minimum possible number of comparisons. Even Preform's
parser is intentionally lean.
</p>
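<p class="commentary">To make the point about interpreting grammar concrete, here is a toy Python
sketch in which the grammar is plain data that a small matcher walks over, so
that productions can be added while the program runs. It also follows the
default rule that the integer result is the production number, counting from 0.
Everything here (the <span class="extract"><span class="extract-syntax">GRAMMAR</span></span> table, <span class="extract"><span class="extract-syntax">match</span></span>, <span class="extract"><span class="extract-syntax">add_production</span></span>, and the
choice of -1 for a failed match) is an assumption made for illustration; it is
not how Preform itself stores or parses grammar.
</p>
<pre class="displayed-code all-displayed-code code-font">
```python
# The grammar is held as data: each nonterminal name maps to a list of
# productions, and each production is a list of literal words.
GRAMMAR = {"race-jersey": [["yellow"], ["polkadot"], ["green"], ["white"]]}


def match(nonterminal, text, grammar=GRAMMAR):
    """Return (matched, production_number), numbering productions from 0."""
    words = text.lower().split()
    for number, production in enumerate(grammar[nonterminal]):
        if words == production:
            return True, number
    return False, -1


def add_production(nonterminal, words, grammar=GRAMMAR):
    """Extend the grammar at run time; no parser code is regenerated."""
    grammar.setdefault(nonterminal, []).append(list(words))
```
</pre>
<p class="commentary">With this table "polkadot" matches with result 1 and "green" with result 2.
After a call such as <span class="extract"><span class="extract-syntax">add_production("race-jersey", ["red"])</span></span>, "red" matches
too, as production 4, without any change to the matcher.
</p>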
<nav role="progress"><div class="progresscontainer">
<ul class="progressbar"><li class="progressprev"><a href="3-idn.html">&#10094;</a></li><li class="progresschapter"><a href="P-wtmd.html">P</a></li><li class="progresschapter"><a href="1-wm.html">1</a></li><li class="progresschapter"><a href="2-vcb.html">2</a></li><li class="progresschapter"><a href="3-lxr.html">3</a></li><li class="progresscurrentchapter">4</li><li class="progresscurrent">ap</li><li class="progresssection"><a href="4-lp.html">lp</a></li><li class="progresssection"><a href="4-to.html">to</a></li><li class="progresssection"><a href="4-prf.html">prf</a></li><li class="progresssection"><a href="4-bn.html">bn</a></li><li class="progressnext"><a href="4-lp.html">&#10095;</a></li></ul></div>
</nav><!--End of weave-->
</main>
</body>
</html>