mirror of
https://github.com/ganelson/inform.git
synced 2024-07-16 22:14:23 +03:00
186 lines
14 KiB
HTML
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>What This Module Does</title>
<link href="../docs-assets/Breadcrumbs.css" rel="stylesheet" rev="stylesheet" type="text/css">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="en-gb">

<link href="../docs-assets/Contents.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Progress.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Navigation.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Fonts.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Base.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Colours.css" rel="stylesheet" rev="stylesheet" type="text/css">

</head>
<body class="commentary-font">
<nav role="navigation">
<h1><a href="../index.html">
<img src="../docs-assets/Inform.png" height="72">
</a></h1>
<ul><li><a href="../index.html">home</a></li>
</ul><h2>Compiler</h2><ul>
<li><a href="../structure.html">structure</a></li>
<li><a href="../inbuildn.html">inbuild</a></li>
<li><a href="../inform7n.html">inform7</a></li>
<li><a href="../intern.html">inter</a></li>
<li><a href="../services.html">services</a></li>
<li><a href="../secrets.html">secrets</a></li>
</ul><h2>Other Tools</h2><ul>
<li><a href="../inblorbn.html">inblorb</a></li>
<li><a href="../indocn.html">indoc</a></li>
<li><a href="../inform6.html">inform6</a></li>
<li><a href="../inpolicyn.html">inpolicy</a></li>
<li><a href="../inrtpsn.html">inrtps</a></li>
</ul><h2>Resources</h2><ul>
<li><a href="../extensions.html">extensions</a></li>
<li><a href="../kits.html">kits</a></li>
</ul><h2>Repository</h2><ul>
<li><a href="https://github.com/ganelson/inform"><img src="../docs-assets/github.png" height=18> github</a></li>
</ul><h2>Related Projects</h2><ul>
<li><a href="../../../inweb/index.html">inweb</a></li>
<li><a href="../../../intest/index.html">intest</a></li>

</ul>
</nav>
<main role="main">
<!--Weave of 'What This Module Does' generated by Inweb-->
<div class="breadcrumbs">
<ul class="crumbs"><li><a href="../index.html">Home</a></li><li><a href="../services.html">Services</a></li><li><a href="index.html">syntax</a></li><li><a href="index.html#P">Preliminaries</a></li><li><b>What This Module Does</b></li></ul></div>
<p class="purpose">An overview of the syntax module's role and abilities.</p>

<ul class="toc"><li><a href="P-wtmd.html#SP1">§1. Prerequisites</a></li><li><a href="P-wtmd.html#SP2">§2. Syntax trees</a></li><li><a href="P-wtmd.html#SP6">§6. Nodes</a></li><li><a href="P-wtmd.html#SP7">§7. Fussy, defensive, pedantry</a></li></ul><hr class="tocbar">

<p class="commentary firstcommentary"><a id="SP1" class="paragraph-anchor"></a><b>§1. Prerequisites. </b>The syntax module is a part of the Inform compiler toolset. It is
presented as a literate program or "web". Before diving in:
</p>

<ul class="items"><li>(a) It helps to have some experience of reading webs: see <a href="../../../inweb/index.html" class="internal">inweb</a> for more.
</li><li>(b) The module is written in C, in fact ANSI C99, but this is disguised by the
fact that it uses some extension syntaxes provided by the <a href="../../../inweb/index.html" class="internal">inweb</a> literate
programming tool, making it a dialect of C called InC. See <a href="../../../inweb/index.html" class="internal">inweb</a> for
full details, but essentially: it's C without predeclarations or header files,
and where functions have names like <span class="extract"><span class="extract-syntax">Tags::add_by_name</span></span> rather than <span class="extract"><span class="extract-syntax">add_by_name</span></span>.
</li><li>(c) This module uses other modules drawn from the compiler (see <a href="../structure.html" class="internal">structure</a>), and also
uses a module of utility functions called <a href="../../../inweb/foundation-module/index.html" class="internal">foundation</a>.
For more, see <a href="../../../inweb/foundation-module/P-abgtf.html" class="internal">A Brief Guide to Foundation (in foundation)</a>.
</li></ul>
<p class="commentary firstcommentary"><a id="SP2" class="paragraph-anchor"></a><b>§2. Syntax trees. </b>Most algorithms for parsing natural language involve the construction of
trees, in which the original words appear as leaves at the top of the tree,
while the grammatical functions they serve appear as the branches and trunk:
thus the word "orange", as an adjective, might be growing from a branch
which represents a noun clause ("the orange envelope"), growing in turn from
a trunk which might represent an assertion sentence:
</p>

<blockquote>
<p>The card is in the orange envelope.</p>
</blockquote>

<p class="commentary">The Inform tools represent syntax trees by <a href="2-st.html#SP2" class="internal">parse_node_tree</a> structures
(see <a href="2-st.html#SP2" class="internal">SyntaxTree::new</a>), but there are very few of these: the entire
source text compiled by <a href="../inform7/index.html" class="internal">inform7</a> is just one syntax tree. When <a href="../supervisor-module/index.html" class="internal">supervisor</a>
manages extensions, it may generate one <a href="2-st.html#SP2" class="internal">parse_node_tree</a> object for each
extension whose text it reads. Still, there are few trees.
</p>

<p class="commentary firstcommentary"><a id="SP3" class="paragraph-anchor"></a><b>§3. </b>The trunk of the tree can be grown in any sequence: call <a href="2-st.html#SP3" class="internal">SyntaxTree::push_bud</a>
to begin "budding" from a particular branch, and <a href="2-st.html#SP3" class="internal">SyntaxTree::pop_bud</a> to go back
to where you were. These are also used automatically to ensure that sentences
arriving at <a href="2-st.html#SP4" class="internal">SyntaxTree::graft_sentence</a> are grafted under the headings to
which they belong. Thus, the sentences
</p>

<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> </span><span class="identifier-syntax">Chapter</span><span class="plain-syntax"> </span><span class="constant-syntax">20</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">Section</span><span class="plain-syntax"> </span><span class="constant-syntax">1</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">The</span><span class="plain-syntax"> </span><span class="identifier-syntax">cat</span><span class="plain-syntax"> </span><span class="identifier-syntax">is</span><span class="plain-syntax"> </span><span class="identifier-syntax">in</span><span class="plain-syntax"> </span><span class="identifier-syntax">the</span><span class="plain-syntax"> </span><span class="identifier-syntax">cardboard</span><span class="plain-syntax"> </span><span class="identifier-syntax">box</span><span class="plain-syntax">.</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">Section</span><span class="plain-syntax"> </span><span class="constant-syntax">2</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">The</span><span class="plain-syntax"> </span><span class="identifier-syntax">ball</span><span class="plain-syntax"> </span><span class="identifier-syntax">of</span><span class="plain-syntax"> </span><span class="identifier-syntax">yarn</span><span class="plain-syntax"> </span><span class="identifier-syntax">is</span><span class="plain-syntax"> </span><span class="identifier-syntax">here</span><span class="plain-syntax">.</span>
</pre>
<p class="commentary">would actually be grafted like so:
</p>

<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax">    RESULT                                      BUD STACK BEFORE THIS</span>
<span class="plain-syntax">    Chapter 20                                  (empty)</span>
<span class="plain-syntax">        Section 1                               Chapter 20</span>
<span class="plain-syntax">            The cat is in the cardboard box.    Chapter 20 &gt; Section 1</span>
<span class="plain-syntax">        Section 2                               Chapter 20 &gt; Section 1</span>
<span class="plain-syntax">            The ball of yarn is here.           Chapter 20 &gt; Section 2</span>
</pre>
<p class="commentary">But it is also possible to graft smaller (not-whole-sentence) cuttings onto
each other using <a href="2-st.html#SP6" class="internal">SyntaxTree::graft</a>, which doesn't involve the bud stack
at all.
</p>
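<p class="commentary">The bud-stack behaviour tabulated above can be sketched in miniature. What
follows is not the module's code: the <span class="extract"><span class="extract-syntax">mini_node</span></span> and <span class="extract"><span class="extract-syntax">mini_tree</span></span> types, the
<span class="extract"><span class="extract-syntax">mini_</span></span> function names, the numeric heading levels and the fixed stack depth are all
assumptions made for illustration, and the rule that a heading first closes any open
heading of equal or deeper level is simply inferred from the table.
</p>

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BUD_DEPTH 16   /* arbitrary limit chosen for this sketch */

/* Hypothetical miniature node, not Inform's real parse_node. */
typedef struct mini_node {
    const char *text;
    int heading_level;       /* 0 = ordinary sentence, 1 = chapter, 2 = section */
    struct mini_node *down;  /* first child */
    struct mini_node *next;  /* next sibling */
} mini_node;

/* Hypothetical miniature tree: a root plus a stack of open headings. */
typedef struct mini_tree {
    mini_node root;
    mini_node *bud_stack[MAX_BUD_DEPTH];
    int bud_sp;              /* number of entries on the bud stack */
} mini_tree;

void mini_tree_init(mini_tree *T) {
    T->root.text = "(root)";
    T->root.heading_level = 0;
    T->root.down = NULL;
    T->root.next = NULL;
    T->bud_sp = 0;
}

void mini_push_bud(mini_tree *T, mini_node *n) {
    assert(T->bud_sp < MAX_BUD_DEPTH);
    T->bud_stack[T->bud_sp++] = n;
}

void mini_pop_bud(mini_tree *T) {
    assert(T->bud_sp > 0);
    T->bud_sp--;
}

/* The node new material grows from: the top of the stack, else the root. */
mini_node *mini_current_bud(mini_tree *T) {
    return (T->bud_sp > 0) ? T->bud_stack[T->bud_sp - 1] : &T->root;
}

/* Attach child as the last child of parent. */
static void mini_append_child(mini_node *parent, mini_node *child) {
    child->next = NULL;
    if (parent->down == NULL) { parent->down = child; return; }
    mini_node *last = parent->down;
    while (last->next != NULL) last = last->next;
    last->next = child;
}

/* Graft a sentence. A heading first closes any open heading of equal or
   deeper level, then becomes the new bud; other sentences just attach. */
void mini_graft_sentence(mini_tree *T, mini_node *s) {
    if (s->heading_level > 0) {
        while (T->bud_sp > 0 &&
               mini_current_bud(T)->heading_level >= s->heading_level)
            mini_pop_bud(T);
        mini_append_child(mini_current_bud(T), s);
        mini_push_bud(T, s);
    } else {
        mini_append_child(mini_current_bud(T), s);
    }
}
```

<p class="commentary">Grafting the five sentences of the example in order through <span class="extract"><span class="extract-syntax">mini_graft_sentence</span></span>
leaves both sections as children of the chapter, each with its own sentence beneath it.
</p>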

<p class="commentary firstcommentary"><a id="SP4" class="paragraph-anchor"></a><b>§4. </b>Meaning is an ambiguous thing, and so the tree needs to be capable of
representing multiple interpretations of the same wording. So nodes have not
only <span class="extract"><span class="extract-syntax">next</span></span> and <span class="extract"><span class="extract-syntax">down</span></span> links to other nodes, but also <span class="extract"><span class="extract-syntax">next_alternative</span></span> links,
which, if used, fork the syntax tree into different possible readings.
</p>

<p class="commentary">These are not added to the tree by grafting: that's only done for definite
meanings. Instead, multiple ambiguous readings mostly lie beneath <span class="extract"><span class="extract-syntax">AMBIGUITY_NT</span></span>
nodes: see <a href="2-st.html#SP21" class="internal">SyntaxTree::add_reading</a>. For example, we might have:
</p>

<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax">    sun is orange</span>
<span class="plain-syntax">        sun</span>
<span class="plain-syntax">        AMBIGUITY</span>
<span class="plain-syntax">            orange (read as being a fruit)</span>
<span class="plain-syntax">            orange (read as being a colour)</span>
</pre>
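<p class="commentary">A minimal sketch of how rival readings might be chained with <span class="extract"><span class="extract-syntax">next_alternative</span></span>
beneath an ambiguity node. The <span class="extract"><span class="extract-syntax">alt_node</span></span> type, the <span class="extract"><span class="extract-syntax">is_ambiguity</span></span> flag and both
function names are hypothetical, and the caller supplies storage for the ambiguity node
only to keep the example free of memory allocation. The real routine is
<a href="2-st.html#SP21" class="internal">SyntaxTree::add_reading</a>, which may well differ in detail.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature node, not Inform's real parse_node. */
typedef struct alt_node {
    const char *text;
    int is_ambiguity;                   /* nonzero for an AMBIGUITY-style node */
    struct alt_node *next;              /* next sibling */
    struct alt_node *down;              /* first child */
    struct alt_node *next_alternative;  /* next rival reading of the same text */
} alt_node;

/* Add reading as a further interpretation alongside existing.
   amb is caller-supplied storage, used only if an ambiguity node must be made.
   Returns the node which now stands for all of the readings. */
alt_node *alt_add_reading(alt_node *existing, alt_node *reading, alt_node *amb) {
    assert(reading != NULL);
    if (existing == NULL) return reading;   /* first reading: nothing to fork */
    if (!existing->is_ambiguity) {
        assert(amb != NULL);
        amb->text = "AMBIGUITY";
        amb->is_ambiguity = 1;
        amb->next = NULL;
        amb->next_alternative = NULL;
        amb->down = existing;               /* the first reading hangs below */
        existing = amb;
    }
    alt_node *last = existing->down;
    assert(last != NULL);
    while (last->next_alternative != NULL) last = last->next_alternative;
    last->next_alternative = reading;       /* append the new rival reading */
    return existing;
}

/* How many readings does this node stand for? */
int alt_count_readings(const alt_node *n) {
    if (n == NULL) return 0;
    if (!n->is_ambiguity) return 1;
    int count = 0;
    for (const alt_node *r = n->down; r != NULL; r = r->next_alternative) count++;
    return count;
}
```

<p class="commentary">In this sketch a single reading is returned unchanged, so an ambiguity node
appears only once a second reading of the same text arrives.
</p>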

<p class="commentary firstcommentary"><a id="SP5" class="paragraph-anchor"></a><b>§5. </b>An extensive suite of functions is provided to make it easy to traverse
a syntax tree, calling a visitor function on each node: see <a href="2-st.html#SP10" class="internal">SyntaxTree::traverse</a>.
</p>
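<p class="commentary">The visitor idea can be sketched as a depth-first walk over <span class="extract"><span class="extract-syntax">down</span></span> and
<span class="extract"><span class="extract-syntax">next</span></span> links. The names <span class="extract"><span class="extract-syntax">walk_node</span></span>, <span class="extract"><span class="extract-syntax">walk_traverse</span></span>, <span class="extract"><span class="extract-syntax">walk_log</span></span> and
<span class="extract"><span class="extract-syntax">walk_record</span></span> are invented here, as is the <span class="extract"><span class="extract-syntax">void *</span></span> state argument. The module's
own traversal functions come in several variants and are not reproduced.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature node, not Inform's real parse_node. */
typedef struct walk_node {
    const char *text;
    struct walk_node *next;  /* next sibling */
    struct walk_node *down;  /* first child */
} walk_node;

/* A visitor receives each node plus caller-owned state. */
typedef void (*walk_visitor)(walk_node *n, void *state);

/* Depth-first: visit a node, then its children, then its later siblings. */
void walk_traverse(walk_node *n, walk_visitor visit, void *state) {
    assert(visit != NULL);
    for (; n != NULL; n = n->next) {
        visit(n, state);
        walk_traverse(n->down, visit, state);
    }
}

/* Example visitor state: the texts of the nodes, in the order reached. */
#define WALK_LOG_MAX 32
typedef struct walk_log {
    const char *seen[WALK_LOG_MAX];
    int count;
} walk_log;

/* Example visitor: append this node's text to the log. */
void walk_record(walk_node *n, void *state) {
    walk_log *log = (walk_log *) state;
    assert(log->count < WALK_LOG_MAX);
    log->seen[log->count++] = n->text;
}
```

<p class="commentary">Passing state through an untyped pointer keeps the walker independent of what
any particular visitor wants to accumulate.
</p>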

<p class="commentary firstcommentary"><a id="SP6" class="paragraph-anchor"></a><b>§6. Nodes. </b>Syntax trees are made up of <a href="2-pn.html#SP1" class="internal">parse_node</a> structures. While these are in
principle individual nodes, they effectively represent subtrees, because they
carry with them links to the nodes below. A <a href="2-pn.html#SP1" class="internal">parse_node</a> object can
therefore equally represent "orange", "the orange envelope", or "now the card
is in the orange envelope".
</p>

<p class="commentary">Each node carries three essential pieces of information with it:
</p>

<ul class="items"><li>(1) The text giving rise to it (say, "Section Five - Fruit").
</li><li>(2) A node type ID, which in broad terms says what kind of reference is being
made (say, <span class="extract"><span class="extract-syntax">HEADING_NT</span></span>). The possible node types are stored in the C type
<span class="extract"><span class="extract-syntax">node_type_t</span></span>, which corresponds to some metadata in a <a href="2-nt.html#SP3" class="internal">node_type_metadata</a>
object: see <a href="2-pn.html#SP5" class="internal">Node::get_type</a> and <a href="2-nt.html#SP7" class="internal">NodeType::get_metadata</a>.
</li><li>(3) A list of optional annotations, which are either integer or object-valued,
and which give specifics about the meaning (say, the level number in the
hierarchy of headings). See <a href="2-na.html" class="internal">Node Annotations</a>.
</li></ul>
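<p class="commentary">As a rough picture of those three pieces of information, here is a sketch of a
node holding its text, a type, and a linked list of annotations whose values are either
integers or objects. Every name in it (<span class="extract"><span class="extract-syntax">info_node</span></span>, <span class="extract"><span class="extract-syntax">info_annotation</span></span>,
<span class="extract"><span class="extract-syntax">INFO_HEADING_LEVEL_ANNOT</span></span> and so on) is hypothetical, and the layout of the real
<a href="2-pn.html#SP1" class="internal">parse_node</a> is not assumed to match.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node types and annotation ID, for illustration only. */
typedef enum { INFO_HEADING_NT, INFO_SENTENCE_NT } info_node_type;
#define INFO_HEADING_LEVEL_ANNOT 1

typedef struct info_annotation {
    int id;
    int is_object;                      /* 0: integer value, 1: object value */
    union { int number; void *object; } value;
    struct info_annotation *next;
} info_annotation;

typedef struct info_node {
    const char *text;                   /* (1) the text giving rise to the node */
    info_node_type type;                /* (2) the node type */
    info_annotation *annotations;       /* (3) optional annotations, or NULL */
} info_node;

/* Find the annotation with this ID, or NULL if the node does not have it. */
static info_annotation *info_find(const info_node *n, int id) {
    for (info_annotation *a = n->annotations; a != NULL; a = a->next)
        if (a->id == id) return a;
    return NULL;
}

/* Set integer annotation id; storage is used only if it is not yet present. */
void info_set_int(info_node *n, info_annotation *storage, int id, int number) {
    info_annotation *a = info_find(n, id);
    if (a == NULL) {
        assert(storage != NULL);
        a = storage;
        a->id = id;
        a->next = n->annotations;       /* link in at the head of the list */
        n->annotations = a;
    }
    a->is_object = 0;
    a->value.number = number;
}

/* Read integer annotation id, or fallback if the node does not carry it. */
int info_get_int(const info_node *n, int id, int fallback) {
    const info_annotation *a = info_find(n, id);
    if (a == NULL) return fallback;
    assert(!a->is_object);              /* defensive: wrong kind of value */
    return a->value.number;
}
```

<p class="commentary">Because annotations are optional, a reader has to say what it wants back when
one is absent, which is the purpose of the <span class="extract"><span class="extract-syntax">fallback</span></span> argument in this sketch.
</p>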

<p class="commentary firstcommentary"><a id="SP7" class="paragraph-anchor"></a><b>§7. Fussy, defensive, pedantry. </b>It is safe to say that Inform contains bugs, so the more defensive coding we can do,
the better. That means not only extensive logging (see <a href="2-pn.html#SP16" class="internal">Node::log_tree</a>)
but also strict verification tests on every tree made (see <a href="2-tv.html" class="internal">Tree Verification</a>).
</p>

<ul class="items"><li>(a) The only nodes allowed to exist are those for node types declared
by <a href="2-nt.html#SP9" class="internal">NodeType::new</a>: more generally, see <a href="2-nt.html" class="internal">Node Types</a> on metadata associated
with these.
</li><li>(b) A node of type <span class="extract"><span class="extract-syntax">A</span></span> can only be a child of a node of type <span class="extract"><span class="extract-syntax">B</span></span> if
<a href="2-nt.html#SP13" class="internal">NodeType::parentage_allowed</a> says so, and this is (mostly) a matter
of calling <a href="2-nt.html#SP5" class="internal">NodeType::allow_parentage_for_categories</a>, since parentage depends
not on the type per se, but on the category of the type, which groups types
together.
</li><li>(c) A node of type <span class="extract"><span class="extract-syntax">A</span></span> can only have an annotation with ID <span class="extract"><span class="extract-syntax">I</span></span> if
<a href="2-na.html#SP15" class="internal">Annotations::is_allowed</a> says so. To declare an annotation legal,
call <span class="extract"><span class="extract-syntax">Annotations::allow(A, I)</span></span>, or <span class="extract"><span class="extract-syntax">Annotations::allow_for_category(C, I)</span></span>
for the category <span class="extract"><span class="extract-syntax">C</span></span> of <span class="extract"><span class="extract-syntax">A</span></span>.
</li></ul>
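<p class="commentary">Rule (b) amounts to a permission table indexed by category rather than by
type. The sketch below shows that shape only: the categories, node types and
<span class="extract"><span class="extract-syntax">perm_</span></span> names are made up for the example, and the argument order (parent first,
then child) is an assumption rather than the signature of
<a href="2-nt.html#SP13" class="internal">NodeType::parentage_allowed</a>.
</p>

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical categories and node types, for illustration only. */
typedef enum { PERM_STRUCTURAL_CAT, PERM_PHRASE_CAT, PERM_NUM_CATS } perm_category;
typedef enum { PERM_HEADING_NT, PERM_SENTENCE_NT, PERM_NOUN_NT, PERM_NUM_TYPES } perm_node_type;

/* Each node type belongs to exactly one category. */
static const perm_category perm_category_of[PERM_NUM_TYPES] = {
    [PERM_HEADING_NT]  = PERM_STRUCTURAL_CAT,
    [PERM_SENTENCE_NT] = PERM_STRUCTURAL_CAT,
    [PERM_NOUN_NT]     = PERM_PHRASE_CAT
};

/* perm_table[parent category][child category]; static storage is zeroed,
   so every parentage starts out forbidden. */
static bool perm_table[PERM_NUM_CATS][PERM_NUM_CATS];

/* Permit nodes in the child category to sit beneath the parent category. */
void perm_allow_parentage_for_categories(perm_category parent, perm_category child) {
    assert(parent < PERM_NUM_CATS && child < PERM_NUM_CATS);
    perm_table[parent][child] = true;
}

/* May a node of type child be placed beneath a node of type parent? */
bool perm_parentage_allowed(perm_node_type parent, perm_node_type child) {
    assert(parent < PERM_NUM_TYPES && child < PERM_NUM_TYPES);
    return perm_table[perm_category_of[parent]][perm_category_of[child]];
}
```

<p class="commentary">Starting from "nothing is allowed" means that a forgotten declaration shows up
as a verification failure rather than as a silently malformed tree.
</p>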

<nav role="progress"><div class="progresscontainer">
<ul class="progressbar"><li class="progressprevoff">❮</li><li class="progresscurrentchapter">P</li><li class="progresscurrent">wtmd</li><li class="progresssection"><a href="P-htitm.html">htitm</a></li><li class="progresschapter"><a href="1-sm.html">1</a></li><li class="progresschapter"><a href="2-st.html">2</a></li><li class="progresschapter"><a href="3-snt.html">3</a></li><li class="progressnext"><a href="P-htitm.html">❯</a></li></ul></div>
</nav><!--End of weave-->

</main>
</body>
</html>