mirror of
https://github.com/ganelson/inform.git
synced 2024-07-16 22:14:23 +03:00
583 lines
104 KiB
HTML
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
|
|
<html>
|
|
<head>
|
|
<title>Vocabulary</title>
|
|
<link href="../docs-assets/Breadcrumbs.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
<meta name="viewport" content="width=device-width, initial-scale=1">
|
|
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
|
<meta http-equiv="Content-Language" content="en-gb">
|
|
|
|
<link href="../docs-assets/Contents.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
<link href="../docs-assets/Progress.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
<link href="../docs-assets/Navigation.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
<link href="../docs-assets/Fonts.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
<link href="../docs-assets/Base.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
<script>
|
|
function togglePopup(material_id) {
|
|
var popup = document.getElementById(material_id);
|
|
popup.classList.toggle("show");
|
|
}
|
|
</script>
|
|
|
|
<link href="../docs-assets/Popups.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
<link href="../docs-assets/Colours.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
|
|
</head>
|
|
<body class="commentary-font">
|
|
<nav role="navigation">
|
|
<h1><a href="../index.html">
|
|
<img src="../docs-assets/Inform.png" height="72">
|
|
</a></h1>
|
|
<ul><li><a href="../compiler.html">compiler tools</a></li>
|
|
<li><a href="../other.html">other tools</a></li>
|
|
<li><a href="../extensions.html">extensions and kits</a></li>
|
|
<li><a href="../units.html">unit test tools</a></li>
|
|
</ul><h2>Compiler Webs</h2><ul>
|
|
<li><a href="../inbuild/index.html">inbuild</a></li>
|
|
<li><a href="../inform7/index.html">inform7</a></li>
|
|
<li><a href="../inter/index.html">inter</a></li>
|
|
</ul><h2>Inbuild Modules</h2><ul>
|
|
<li><a href="../supervisor-module/index.html">supervisor</a></li>
|
|
</ul><h2>Inform7 Modules</h2><ul>
|
|
<li><a href="../core-module/index.html">core</a></li>
|
|
<li><a href="../inflections-module/index.html">inflections</a></li>
|
|
<li><a href="../linguistics-module/index.html">linguistics</a></li>
|
|
<li><a href="../kinds-module/index.html">kinds</a></li>
|
|
<li><a href="../if-module/index.html">if</a></li>
|
|
<li><a href="../multimedia-module/index.html">multimedia</a></li>
|
|
<li><a href="../problems-module/index.html">problems</a></li>
|
|
<li><a href="../index-module/index.html">index</a></li>
|
|
</ul><h2>Inter Modules</h2><ul>
|
|
<li><a href="../bytecode-module/index.html">bytecode</a></li>
|
|
<li><a href="../building-module/index.html">building</a></li>
|
|
<li><a href="../codegen-module/index.html">codegen</a></li>
|
|
</ul><h2>Shared Modules</h2><ul>
|
|
<li><a href="../arch-module/index.html">arch</a></li>
|
|
<li><a href="../syntax-module/index.html">syntax</a></li>
|
|
<li><a href="index.html"><span class="selectedlink">words</span></a></li>
|
|
<li><a href="../html-module/index.html">html</a></li>
|
|
<li><a href="../../../inweb/docs/foundation-module/index.html">foundation</a></li>
|
|
|
|
</ul>
|
|
</nav>
|
|
<main role="main">
|
|
<!--Weave of 'Vocabulary' generated by Inweb-->
|
|
<div class="breadcrumbs">
|
|
<ul class="crumbs"><li><a href="../index.html">Home</a></li><li><a href="../compiler.html">Shared Modules</a></li><li><a href="index.html">words</a></li><li><a href="index.html#2">Chapter 2: Words in Isolation</a></li><li><b>Vocabulary</b></li></ul></div>
|
|
<p class="purpose">To classify the words in the lexical stream, where two words are considered equivalent if they are unquoted and have the same text, compared case-insensitively.</p>
|
|
|
|
<ul class="toc"><li><a href="2-vcb.html#SP1">§1. Vocabulary Entries</a></li><li><a href="2-vcb.html#SP13">§13. Hash coding of words</a></li><li><a href="2-vcb.html#SP14">§14. The hash table of vocabulary</a></li><li><a href="2-vcb.html#SP16">§16. Partial words</a></li><li><a href="2-vcb.html#SP17">§17. Ordinals</a></li></ul><hr class="tocbar">
|
|
|
|
<p class="commentary firstcommentary"><a id="SP1"></a><b>§1. Vocabulary Entries. </b>A <a href="2-vcb.html#SP1" class="internal">vocabulary_entry</a> object is created for each different word found in the
source. (Recall that these are not necessarily words in the usual English
sense: for instance, <span class="extract"><span class="extract-syntax">17</span></span> is a word here.)
</p>
|
|
|
|
<p class="commentary">The vocabulary entry structure exists to make textual comparisons faster,
which is essential if Inform is to run tolerably quickly: Inform's speed on
typical source texts increased by a factor of 5 to 10 when this structure was
introduced. Firstly, the vocabulary is hashed, so that comparing a newly-read
word against the known vocabulary is cheap; secondly, each word stores linked
lists of the meanings which it begins, occurs in the middle of, ends, or is
optionally part of (in the sense that "brown" is optionally part of the name
"small brown shoe", which could also be written "small shoe"); and thirdly,
each word also carries a bitmap of flags indicating the possible contexts in
which it might be used. Finally, to avoid parsing the same text over and over
for its possible meaning as a literal integer, we cache the result: for
instance, 17 for the text <span class="extract"><span class="extract-syntax">17</span></span>.
</p>
|
|
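<p class="commentary">The caching of literal integers described above can be illustrated with a
minimal sketch. It is not the module's own code: the names <span class="extract"><span class="extract-syntax">demo_word</span></span> and
<span class="extract"><span class="extract-syntax">demo_word_new</span></span> are hypothetical, only unsigned decimal digits are recognised,
and an <span class="extract"><span class="extract-syntax">int</span></span> of at least 32 bits is assumed.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical, cut-down stand-in for a vocabulary entry. */
typedef struct demo_word {
    const wchar_t *text;       /* the text of the word */
    int is_number;             /* 1 if the text is a run of decimal digits */
    int literal_number_value;  /* cached value, meaningful only if is_number */
} demo_word;

/* Parse the text once, when the entry is made, and cache the outcome, so
   that later queries never need to look at the characters again. */
demo_word demo_word_new(const wchar_t *text) {
    assert(text != NULL);
    demo_word w = { text, 0, 0 };
    if (text[0] == 0) return w;  /* the empty text is not a number */
    int value = 0;
    for (size_t i = 0; text[i] != 0; i++) {
        if ((text[i] < L'0') || (text[i] > L'9')) return w;  /* not all digits */
        if (i >= 9) return w;  /* ten or more digits: refuse, to avoid overflow */
        value = 10*value + (int) (text[i] - L'0');
    }
    w.is_number = 1;
    w.literal_number_value = value;
    return w;
}
```

<p class="commentary">A caller would then read <span class="extract"><span class="extract-syntax">literal_number_value</span></span> directly from the entry
each time the word recurs, rather than re-parsing its text.
</p>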
|
|
<p class="commentary">The meaning codes alluded to below are also used for excerpts of text
(i.e., are not just for single words) and are defined in Excerpt Meanings.
</p>
|
|
|
|
<pre class="definitions code-font"><span class="definition-keyword">define</span> <span class="constant-syntax">ING_MC</span><span class="plain-syntax"> </span><span class="constant-syntax">0x04000000</span><span class="plain-syntax"> </span><span class="comment-syntax"> a word ending in -ing</span>
|
|
<span class="definition-keyword">define</span> <span class="constant-syntax">NUMBER_MC</span><span class="plain-syntax"> </span><span class="constant-syntax">0x08000000</span><span class="plain-syntax"> </span><span class="comment-syntax"> one, two, ..., twelve, 1, 2, ...</span>
|
|
<span class="definition-keyword">define</span> <span class="constant-syntax">I6_MC</span><span class="plain-syntax"> </span><span class="constant-syntax">0x10000000</span><span class="plain-syntax"> </span><span class="comment-syntax"> piece of verbatim I6 code</span>
|
|
<span class="definition-keyword">define</span> <span class="constant-syntax">TEXTWITHSUBS_MC</span><span class="plain-syntax"> </span><span class="constant-syntax">0x20000000</span><span class="plain-syntax"> </span><span class="comment-syntax"> double-quoted text literal with substitutions</span>
|
|
<span class="definition-keyword">define</span> <span class="constant-syntax">TEXT_MC</span><span class="plain-syntax"> </span><span class="constant-syntax">0x40000000</span><span class="plain-syntax"> </span><span class="comment-syntax"> double-quoted text literal without substitutions</span>
|
|
<span class="definition-keyword">define</span> <span class="constant-syntax">ORDINAL_MC</span><span class="plain-syntax"> </span><span class="constant-syntax">0x80000000</span><span class="plain-syntax"> </span><span class="comment-syntax"> first, second, third, ..., twelfth</span>
|
|
</pre>
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="reserved-syntax">typedef</span><span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">unsigned</span><span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">flags</span><span class="plain-syntax">; </span><span class="comment-syntax"> bitmap of "meaning codes" indicating possible usages</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">literal_number_value</span><span class="plain-syntax">; </span><span class="comment-syntax"> evaluation as a literal number, if any</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">exemplar</span><span class="plain-syntax">; </span><span class="comment-syntax"> text of one instance of this word</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">raw_exemplar</span><span class="plain-syntax">; </span><span class="comment-syntax"> text of one instance in its raw untreated form</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">hash</span><span class="plain-syntax">; </span><span class="comment-syntax"> hash code derived from text of word</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">next_in_vocab_hash</span><span class="plain-syntax">; </span><span class="comment-syntax"> next in list with this hash</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">lower_case_form</span><span class="plain-syntax">; </span><span class="comment-syntax"> or null if none exists</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">upper_case_form</span><span class="plain-syntax">; </span><span class="comment-syntax"> or null if none exists</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">nt_incidence</span><span class="plain-syntax">; </span><span class="comment-syntax"> bitmap hashing which Preform nonterminals it occurs in</span>
|
|
<span class="plain-syntax"> #</span><span class="identifier-syntax">ifdef</span><span class="plain-syntax"> </span><span class="identifier-syntax">VOCABULARY_MEANING_INITIALISER</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="identifier-syntax">vocabulary_meaning</span><span class="plain-syntax"> </span><span class="identifier-syntax">means</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> #</span><span class="identifier-syntax">endif</span>
|
|
<span class="plain-syntax">} </span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax">;</span>
|
|
</pre>
|
|
<ul class="endnotetexts"><li>The structure vocabulary_entry is accessed in 4/to and here.</li></ul>
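<p class="commentary">As an illustration of how a bitmap of meaning codes can be consulted, here
is a minimal sketch. The six constant values are those defined above, but
<span class="extract"><span class="extract-syntax">demo_entry</span></span>, <span class="extract"><span class="extract-syntax">demo_has_any</span></span> and <span class="extract"><span class="extract-syntax">demo_is_text_literal</span></span> are hypothetical names
invented for this example, not functions of the words module.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* These six values are copied from the definitions above. */
#define ING_MC          0x04000000u
#define NUMBER_MC       0x08000000u
#define I6_MC           0x10000000u
#define TEXTWITHSUBS_MC 0x20000000u
#define TEXT_MC         0x40000000u
#define ORDINAL_MC      0x80000000u

/* Hypothetical, cut-down stand-in for a vocabulary entry. */
typedef struct demo_entry {
    unsigned int flags;        /* bitmap of meaning codes */
    int literal_number_value;  /* evaluation as a literal number, if any */
} demo_entry;

/* Does the entry carry at least one of the codes in the mask? */
int demo_has_any(const demo_entry *ve, unsigned int mask) {
    assert(ve != NULL);
    return (ve->flags & mask) != 0;
}

/* Because each code occupies its own bit, several codes can be tested
   in a single operation by OR-ing them together. */
int demo_is_text_literal(const demo_entry *ve) {
    return demo_has_any(ve, TEXT_MC | TEXTWITHSUBS_MC);
}
```

<p class="commentary">Each code occupies a single bit, so one word can carry several codes at once,
and testing for any of a set of them costs one bitwise AND.
</p>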
|
|
<p class="commentary firstcommentary"><a id="SP2"></a><b>§2. </b>Some standard punctuation marks:
|
|
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">CLOSEBRACE_V</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">CLOSEBRACKET_V</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">COLON_V</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">COMMA_V</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">DOUBLEDASH_V</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">FORWARDSLASH_V</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">FULLSTOP_V</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">OPENBRACE_V</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">OPENBRACKET_V</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">OPENI6_V</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">PARBREAK_V</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">PLUS_V</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">SEMICOLON_V</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">STROKE_V</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
|
|
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::create_punctuation</span><button class="popup" onclick="togglePopup('usagePopup1')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup1">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::create_punctuation</span></span>:<br/>Words Module - <a href="1-wm.html#SP3">§3</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">void</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">CLOSEBRACE_V</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">L</span><span class="string-syntax">"}"</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">CLOSEBRACKET_V</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">L</span><span class="string-syntax">")"</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">COLON_V</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">L</span><span class="string-syntax">":"</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">COMMA_V</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">L</span><span class="string-syntax">","</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">DOUBLEDASH_V</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">L</span><span class="string-syntax">"--"</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">FORWARDSLASH_V</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">L</span><span class="string-syntax">"/"</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">FULLSTOP_V</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">L</span><span class="string-syntax">"."</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">OPENBRACE_V</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">L</span><span class="string-syntax">"{"</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">OPENBRACKET_V</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">L</span><span class="string-syntax">"("</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">OPENI6_V</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">L</span><span class="string-syntax">"(-"</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">PARBREAK_V</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><span class="constant-syntax">PARAGRAPH_BREAK</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">PLUS_V</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">L</span><span class="string-syntax">"+"</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">SEMICOLON_V</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">L</span><span class="string-syntax">";"</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">STROKE_V</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">L</span><span class="string-syntax">"|"</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax">}</span>
|
|
</pre>
|
|
<p class="commentary firstcommentary"><a id="SP3"></a><b>§3. </b>Each distinct word is to have a unique <span class="extract"><span class="extract-syntax">vocabulary_entry</span></span> structure, and the
"identity" at word number <span class="extract"><span class="extract-syntax">wn</span></span> is to point to the structure for the text
at that word. Two words are distinct if their lower-case forms are different,
except that two quoted literal texts are always distinct, even if they have
the same content. So for instance,
</p>
|
|
|
|
<blockquote>
|
|
<p>Daleks conquer and destroy! "Ba-dum." Exterminate, exterminate! "Ba-dum."</p>
|
|
</blockquote>
|
|
|
|
<p class="commentary">would be identified (leaving aside the comma, which is a word in its own right) as
</p>
|
|
|
|
<blockquote>
|
|
<p><span class="extract"><span class="extract-syntax">ve0 ve1 ve2 ve3 ve4 ve5 ve6 ve6 ve4 ve7</span></span></p>
|
|
</blockquote>
|
|
|
|
<p class="commentary">where <span class="extract"><span class="extract-syntax">ve4</span></span> is the common identity of both exclamation marks, and <span class="extract"><span class="extract-syntax">ve6</span></span>
that of the two "exterminate"s, even though they have different casings;
while the quoted text <span class="extract"><span class="extract-syntax">"Ba-dum."</span></span> came out with two different identities
<span class="extract"><span class="extract-syntax">ve5</span></span> and <span class="extract"><span class="extract-syntax">ve7</span></span>.
</p>
|
|
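<p class="commentary">The identity rule just described can be sketched as follows. This is a
deliberately simplified illustration: it searches a small fixed array linearly
rather than using the hash table that the module actually relies on, and the
names <span class="extract"><span class="extract-syntax">demo_identify</span></span>, <span class="extract"><span class="extract-syntax">demo_texts</span></span> and <span class="extract"><span class="extract-syntax">DEMO_MAX_ENTRIES</span></span> are hypothetical.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

#define DEMO_MAX_ENTRIES 64

/* The texts seen so far; the index into this array is the "identity". */
static const wchar_t *demo_texts[DEMO_MAX_ENTRIES];
static int demo_count = 0;

/* Compare two texts, treating upper and lower case letters as equal. */
static int demo_same_ignoring_case(const wchar_t *a, const wchar_t *b) {
    while ((*a != 0) && (*b != 0)) {
        if (towlower((wint_t) *a) != towlower((wint_t) *b)) return 0;
        a++; b++;
    }
    return (*a == 0) && (*b == 0);
}

/* Return the identity of a word. Unquoted words share an identity when
   their texts agree case-insensitively; a text beginning with a double
   quote always receives a fresh identity. */
int demo_identify(const wchar_t *text) {
    assert(text != NULL);
    if (text[0] != L'"') {
        for (int i = 0; i < demo_count; i++)
            if ((demo_texts[i][0] != L'"') &&
                (demo_same_ignoring_case(demo_texts[i], text)))
                return i;
    }
    assert(demo_count < DEMO_MAX_ENTRIES);
    demo_texts[demo_count] = text;
    return demo_count++;
}
```

<p class="commentary">Under these assumptions, calling <span class="extract"><span class="extract-syntax">demo_identify</span></span> on "Exterminate" and then on
"exterminate" yields the same identity, whereas two calls on the quoted text
yield two different ones.
</p>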
|
|
<p class="commentary">When we want to set the identity for a given word, we call these front-door
routines, either on a single word or on a range.
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::identify_word</span><button class="popup" onclick="togglePopup('usagePopup2')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup2">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::identify_word</span></span>:<br/><a href="2-vcb.html#SP4">§4</a><br/>Numbered Words - <a href="2-nw.html#SP7">§7</a><br/>Lexer - <a href="3-lxr.html#SP28_5_2">§28.5.2</a>, <a href="3-lxr.html#SP28_6">§28.6</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">wn</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ve</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP15" class="function-link"><span class="function-syntax">Vocabulary::entry_for_text</span></a><span class="plain-syntax">(</span><a href="3-lxr.html#SP20" class="function-link"><span class="function-syntax">Lexer::word_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">wn</span><span class="plain-syntax">));</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">raw_exemplar</span><span class="plain-syntax"> = </span><a href="3-lxr.html#SP20" class="function-link"><span class="function-syntax">Lexer::word_raw_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">wn</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><a href="3-lxr.html#SP20" class="function-link"><span class="function-syntax">Lexer::set_word</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">wn</span><span class="plain-syntax">, </span><span class="identifier-syntax">ve</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax">}</span>
|
|
|
|
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::identify_word_range</span><button class="popup" onclick="togglePopup('usagePopup3')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup3">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::identify_word_range</span></span>:<br/>Feeds - <a href="3-fds.html#SP4_2">§4.2</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">wording</span><span class="plain-syntax"> </span><span class="identifier-syntax">W</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">LOOP_THROUGH_WORDING</span><span class="plain-syntax">(</span><span class="identifier-syntax">i</span><span class="plain-syntax">, </span><span class="identifier-syntax">W</span><span class="plain-syntax">)</span>
|
|
<span class="plain-syntax"> </span><a href="2-vcb.html#SP3" class="function-link"><span class="function-syntax">Vocabulary::identify_word</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">i</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax">}</span>
|
|
</pre>
|
|
<p class="commentary firstcommentary"><a id="SP4"></a><b>§4. </b>Should we ever change the text of a word, it's essential to re-identify it,
as otherwise its <span class="extract"><span class="extract-syntax">lw_identity</span></span> points to the wrong vocabulary entry.
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::change_text_of_word</span><span class="plain-syntax">(</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">wn</span><span class="plain-syntax">, </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">new</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><a href="3-lxr.html#SP20" class="function-link"><span class="function-syntax">Lexer::set_word_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">wn</span><span class="plain-syntax">, </span><span class="identifier-syntax">new</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><a href="3-lxr.html#SP20" class="function-link"><span class="function-syntax">Lexer::set_word_raw_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">wn</span><span class="plain-syntax">, </span><span class="identifier-syntax">new</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><a href="2-vcb.html#SP3" class="function-link"><span class="function-syntax">Vocabulary::identify_word</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">wn</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax">}</span>
|
|
</pre>
|
|
<p class="commentary firstcommentary"><a id="SP5"></a><b>§5. </b>We now need some utilities for dealing with vocabulary entries. Here is a
creator, and a debugging logger:
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="function-syntax">Vocabulary::vocab_entry_new</span><button class="popup" onclick="togglePopup('usagePopup4')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup4">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::vocab_entry_new</span></span>:<br/><a href="2-vcb.html#SP9">§9</a>, <a href="2-vcb.html#SP15_1">§15.1</a>, <a href="2-vcb.html#SP15_2">§15.2</a></span></button><span class="plain-syntax">(</span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">hash_code</span><span class="plain-syntax">,</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">unsigned</span><span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">flags</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">val</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ve</span><span class="plain-syntax"> = </span><span class="identifier-syntax">CREATE</span><span class="plain-syntax">(</span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">exemplar</span><span class="plain-syntax"> = </span><span class="identifier-syntax">text</span><span class="plain-syntax">; </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">raw_exemplar</span><span class="plain-syntax"> = </span><span class="identifier-syntax">text</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">next_in_vocab_hash</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">lower_case_form</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">; </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">upper_case_form</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">hash</span><span class="plain-syntax"> = </span><span class="identifier-syntax">hash_code</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">nt_incidence</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">flags</span><span class="plain-syntax"> = </span><span class="identifier-syntax">flags</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">l</span><span class="plain-syntax"> = </span><span class="identifier-syntax">Wide::len</span><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">l</span><span class="plain-syntax">>3) && (</span><span class="identifier-syntax">text</span><span class="plain-syntax">[</span><span class="identifier-syntax">l</span><span class="plain-syntax">-3] == </span><span class="character-syntax">'i'</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">text</span><span class="plain-syntax">[</span><span class="identifier-syntax">l</span><span class="plain-syntax">-2] == </span><span class="character-syntax">'n'</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">text</span><span class="plain-syntax">[</span><span class="identifier-syntax">l</span><span class="plain-syntax">-1] == </span><span class="character-syntax">'g'</span><span class="plain-syntax">))</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">flags</span><span class="plain-syntax"> |= </span><span class="constant-syntax">ING_MC</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">literal_number_value</span><span class="plain-syntax"> = </span><span class="identifier-syntax">val</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> #</span><span class="identifier-syntax">ifdef</span><span class="plain-syntax"> </span><span class="identifier-syntax">VOCABULARY_MEANING_INITIALISER</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">means</span><span class="plain-syntax"> = </span><span class="identifier-syntax">VOCABULARY_MEANING_INITIALISER</span><span class="plain-syntax">(</span><span class="identifier-syntax">ve</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> #</span><span class="identifier-syntax">endif</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
|
|
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::log</span><button class="popup" onclick="togglePopup('usagePopup5')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup5">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::log</span></span>:<br/>Words Module - <a href="1-wm.html#SP3">§3</a></span></button><span class="plain-syntax">(</span><span class="identifier-syntax">OUTPUT_STREAM</span><span class="plain-syntax">, </span><span class="reserved-syntax">void</span><span class="plain-syntax"> *</span><span class="identifier-syntax">vve</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ve</span><span class="plain-syntax"> = (</span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *) </span><span class="identifier-syntax">vve</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">ve</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) { </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"NULL"</span><span class="plain-syntax">); </span><span class="reserved-syntax">return</span><span class="plain-syntax">; }</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">exemplar</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) { </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"NULL-EXEMPLAR"</span><span class="plain-syntax">); </span><span class="reserved-syntax">return</span><span class="plain-syntax">; }</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"%08x-%w-%08x"</span><span class="plain-syntax">, </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">hash</span><span class="plain-syntax">, </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">raw_exemplar</span><span class="plain-syntax">, </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">flags</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax">}</span>
|
|
</pre>
|
|
<p class="commentary firstcommentary"><a id="SP6"></a><b>§6. </b>It's perhaps unexpected that a vocabulary entry stores not only (a pointer
to) a copy of the text, the "exemplar" (since it is text which is an
example of this vocabulary being used), but also a separate raw copy of
the text: raw in the sense of retaining the original form in the source
files which the word came from. This looks strange because we normally
identify words by their case-lowered text, not by their raw text. Consider
the source material:
</p>
|
|
|
|
<blockquote>
|
|
<p>Former Marillion vocalist Fish derived his nickname not from a fish, but from habitual bathing.</p>
|
|
</blockquote>
|
|
|
|
<p class="commentary">words 4, "Fish", and 11, "fish", share the same vocabulary entry
as their identity, even though their raw texts differ. Clearly the ordinary
exemplar of this entry must be "fish". But what should the raw exemplar
be, "Fish" or "fish"? The answer is the latter: in general, the raw
exemplar is the same as the exemplar unless we have amended
it by hand, using the following routine.
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::set_raw_exemplar_to_text</span><button class="popup" onclick="togglePopup('usagePopup6')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup6">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::set_raw_exemplar_to_text</span></span>:<br/>Numbered Words - <a href="2-nw.html#SP7">§7</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">wn</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><a href="3-lxr.html#SP20" class="function-link"><span class="function-syntax">Lexer::word</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">wn</span><span class="plain-syntax">)-></span><span class="element-syntax">raw_exemplar</span><span class="plain-syntax"> = </span><a href="3-lxr.html#SP20" class="function-link"><span class="function-syntax">Lexer::word_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">wn</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax">}</span>
|
|
</pre>
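<p class="commentary">The relationship between the two texts can be sketched in isolation. This is
only an illustration: the type <span class="extract"><span class="extract-syntax">toy_entry</span></span> and the functions named
<span class="extract"><span class="extract-syntax">toy_...</span></span> are hypothetical stand-ins, not part of this module, and they
keep only the two fields under discussion.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical, cut-down stand-in for a vocabulary entry: just the two texts. */
typedef struct toy_entry {
    const wchar_t *exemplar;      /* case-lowered text, used for identity */
    const wchar_t *raw_exemplar;  /* text in the form it had in the source */
} toy_entry;

/* On creation, both fields point to the same (lowered) text. */
void toy_entry_init(toy_entry *e, const wchar_t *lowered) {
    assert(e != NULL);
    e->exemplar = lowered;
    e->raw_exemplar = lowered;
}

/* Amend the raw exemplar by hand; the identity text is left alone. */
void toy_set_raw_exemplar(toy_entry *e, const wchar_t *raw) {
    assert(e != NULL);
    e->raw_exemplar = raw;
}

/* Return whichever of the two texts the caller asks for. */
const wchar_t *toy_get_exemplar(const toy_entry *e, int raw) {
    assert(e != NULL);
    return raw ? e->raw_exemplar : e->exemplar;
}
```

<p class="commentary">In this sketch an entry initialised with "fish" returns the same pointer for
both texts until <span class="extract"><span class="extract-syntax">toy_set_raw_exemplar</span></span> is called with "Fish", after
which only the raw text differs.
</p>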
|
|
<p class="commentary firstcommentary"><a id="SP7"></a><b>§7. </b>Here are some access routines for the data stored in this
|
|
structure:
|
|
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="function-syntax">Vocabulary::get_exemplar</span><button class="popup" onclick="togglePopup('usagePopup7')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup7">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::get_exemplar</span></span>:<br/>Loading Preform - <a href="4-lp.html#SP16">§16</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ve</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">raw</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">raw</span><span class="plain-syntax">) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">raw_exemplar</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">else</span><span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">exemplar</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
|
|
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::writer</span><button class="popup" onclick="togglePopup('usagePopup8')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup8">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::writer</span></span>:<br/>Words Module - <a href="1-wm.html#SP3">§3</a></span></button><span class="plain-syntax">(</span><span class="identifier-syntax">OUTPUT_STREAM</span><span class="plain-syntax">, </span><span class="reserved-syntax">char</span><span class="plain-syntax"> *</span><span class="identifier-syntax">format_string</span><span class="plain-syntax">, </span><span class="reserved-syntax">void</span><span class="plain-syntax"> *</span><span class="identifier-syntax">vV</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ve</span><span class="plain-syntax"> = (</span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *) </span><span class="identifier-syntax">vV</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">ve</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) </span><span class="identifier-syntax">internal_error</span><span class="plain-syntax">(</span><span class="string-syntax">"tried to write null vocabulary"</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">switch</span><span class="plain-syntax"> (</span><span class="identifier-syntax">format_string</span><span class="plain-syntax">[0]) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'+'</span><span class="plain-syntax">: </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"%w"</span><span class="plain-syntax">, </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">raw_exemplar</span><span class="plain-syntax">); </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'V'</span><span class="plain-syntax">: </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"%w"</span><span class="plain-syntax">, </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">exemplar</span><span class="plain-syntax">); </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">default:</span><span class="plain-syntax"> </span><span class="identifier-syntax">internal_error</span><span class="plain-syntax">(</span><span class="string-syntax">"bad %V extension"</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax">}</span>
|
|
</pre>
|
|
<p class="commentary firstcommentary"><a id="SP8"></a><b>§8. </b>An integer is stored at each vocabulary entry, recording its value
if it ever turns out to parse as a literal number:
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::get_literal_number_value</span><button class="popup" onclick="togglePopup('usagePopup9')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup9">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::get_literal_number_value</span></span>:<br/>Loading Preform - <a href="4-lp.html#SP14_1_1_1">§14.1.1.1</a>, <a href="4-lp.html#SP14_1_3">§14.1.3</a><br/>Basic Nonterminals - <a href="4-bn.html#SP6">§6</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ve</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">literal_number_value</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::set_literal_number_value</span><span class="plain-syntax">(</span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ve</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">val</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">literal_number_value</span><span class="plain-syntax"> = </span><span class="identifier-syntax">val</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
</pre>
|
|
<p class="commentary firstcommentary"><a id="SP9"></a><b>§9. </b>Almost all text is used case insensitively in Inform source, but we do
|
|
occasionally need to distinguish "The" from "the" and the like, when
|
|
parsing the names of text substitutions. When a new text substitution is
|
|
declared whose first word, in the definition, begins with a capital letter,
|
|
<span class="extract"><span class="extract-syntax">Vocabulary::make_case_sensitive</span></span> is called on the first word, and its identity
|
|
is changed to the upper case variant form.
|
|
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::used_case_sensitively</span><span class="plain-syntax">(</span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ve</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">upper_case_form</span><span class="plain-syntax">) || (</span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">lower_case_form</span><span class="plain-syntax">)) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">TRUE</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">FALSE</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="function-syntax">Vocabulary::get_lower_case_form</span><span class="plain-syntax">(</span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ve</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">lower_case_form</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="function-syntax">Vocabulary::make_case_sensitive</span><span class="plain-syntax">(</span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ve</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">upper_case_form</span><span class="plain-syntax">) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">upper_case_form</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">upper_case_form</span><span class="plain-syntax"> =</span>
|
|
<span class="plain-syntax"> </span><a href="2-vcb.html#SP5" class="function-link"><span class="function-syntax">Vocabulary::vocab_entry_new</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">exemplar</span><span class="plain-syntax">, </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">hash</span><span class="plain-syntax">, </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">flags</span><span class="plain-syntax">, </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">literal_number_value</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">upper_case_form</span><span class="plain-syntax">-></span><span class="element-syntax">lower_case_form</span><span class="plain-syntax"> = </span><span class="identifier-syntax">ve</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">upper_case_form</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
</pre>
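<p class="commentary">The pairing of an ordinary entry with its case-sensitive twin can be sketched
on its own. This is a minimal illustration under assumptions: <span class="extract"><span class="extract-syntax">toy_vocab</span></span>
and the <span class="extract"><span class="extract-syntax">toy_...</span></span> functions are hypothetical, and plain <span class="extract"><span class="extract-syntax">calloc</span></span>
stands in for whatever allocation the real module uses.
</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a vocabulary entry, keeping only the case links. */
typedef struct toy_vocab {
    const char *text;
    struct toy_vocab *upper_case_form; /* set on the ordinary entry */
    struct toy_vocab *lower_case_form; /* set on the case-sensitive twin */
} toy_vocab;

/* An entry counts as case-sensitively used if it has a link in either direction. */
int toy_used_case_sensitively(const toy_vocab *v) {
    assert(v != NULL);
    return (v->upper_case_form != NULL) || (v->lower_case_form != NULL);
}

/* Return the twin of v, creating it at most once; NULL if allocation fails. */
toy_vocab *toy_make_case_sensitive(toy_vocab *v) {
    assert(v != NULL);
    if (v->upper_case_form) return v->upper_case_form;
    toy_vocab *twin = calloc(1, sizeof *twin);
    if (twin == NULL) return NULL;
    twin->text = v->text;          /* same text, but a distinct identity */
    twin->lower_case_form = v;     /* link back to the ordinary entry */
    v->upper_case_form = twin;
    return twin;
}
```

<p class="commentary">Calling <span class="extract"><span class="extract-syntax">toy_make_case_sensitive</span></span> twice on the same entry returns the same
twin, so "The" and "the" each keep a single, stable identity.
</p>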
|
|
<p class="commentary firstcommentary"><a id="SP10"></a><b>§10. </b>Finally, each vocabulary entry comes with a bitmap of flags, and here
|
|
we get to set and test them:
|
|
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::set_flags</span><span class="plain-syntax">(</span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ve</span><span class="plain-syntax">, </span><span class="reserved-syntax">unsigned</span><span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">t</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">flags</span><span class="plain-syntax"> |= </span><span class="identifier-syntax">t</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
<span class="reserved-syntax">unsigned</span><span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::test_vflags</span><span class="plain-syntax">(</span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ve</span><span class="plain-syntax">, </span><span class="reserved-syntax">unsigned</span><span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">t</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> (</span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">flags</span><span class="plain-syntax">) & </span><span class="identifier-syntax">t</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
<span class="reserved-syntax">unsigned</span><span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::test_flags</span><button class="popup" onclick="togglePopup('usagePopup10')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup10">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::test_flags</span></span>:<br/>Wordings - <a href="3-wrd.html#SP16">§16</a><br/>Loading Preform - <a href="4-lp.html#SP14_1_1_1">§14.1.1.1</a>, <a href="4-lp.html#SP14_1_3">§14.1.3</a><br/>Basic Nonterminals - <a href="4-bn.html#SP6">§6</a>, <a href="4-bn.html#SP7">§7</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">wn</span><span class="plain-syntax">, </span><span class="reserved-syntax">unsigned</span><span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">t</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> (</span><a href="3-lxr.html#SP20" class="function-link"><span class="function-syntax">Lexer::word</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">wn</span><span class="plain-syntax">)-></span><span class="element-syntax">flags</span><span class="plain-syntax">) & </span><span class="identifier-syntax">t</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
</pre>
|
|
<p class="commentary firstcommentary"><a id="SP11"></a><b>§11. </b>It can be useful to find the disjunction of the flags for all the words
|
|
in a range, as that gives us a single bitmap which tells us quickly whether
|
|
any of the words in that range is a number, or is a word ending in "-ing",
|
|
and so on:
|
|
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="reserved-syntax">unsigned</span><span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::disjunction_of_flags</span><span class="plain-syntax">(</span><span class="reserved-syntax">wording</span><span class="plain-syntax"> </span><span class="identifier-syntax">W</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">unsigned</span><span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">d</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">LOOP_THROUGH_WORDING</span><span class="plain-syntax">(</span><span class="identifier-syntax">i</span><span class="plain-syntax">, </span><span class="identifier-syntax">W</span><span class="plain-syntax">)</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">d</span><span class="plain-syntax"> |= (</span><a href="3-lxr.html#SP20" class="function-link"><span class="function-syntax">Lexer::word</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">i</span><span class="plain-syntax">)-></span><span class="element-syntax">flags</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">d</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
</pre>
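<p class="commentary">The same idea can be tried out away from the lexer. In this sketch the flag
values and the names <span class="extract"><span class="extract-syntax">TOY_NUMBER_FLAG</span></span>, <span class="extract"><span class="extract-syntax">TOY_ING_FLAG</span></span>, <span class="extract"><span class="extract-syntax">TOY_TEXT_FLAG</span></span>
and <span class="extract"><span class="extract-syntax">toy_disjunction_of_flags</span></span> are all hypothetical, and a plain array of
bitmaps stands in for a wording.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits, chosen only for this illustration. */
#define TOY_NUMBER_FLAG 0x1u  /* the word is a literal number */
#define TOY_ING_FLAG    0x2u  /* the word ends in "-ing" */
#define TOY_TEXT_FLAG   0x4u  /* the word is a quoted text */

/* Bitwise OR of the flags of n words: a bit is set in the result
   if and only if it is set for at least one word in the range. */
unsigned int toy_disjunction_of_flags(const unsigned int *flags, size_t n) {
    assert(flags != NULL || n == 0);
    unsigned int d = 0;
    for (size_t i = 0; i < n; i++) d |= flags[i];
    return d;
}
```

<p class="commentary">A single test such as <span class="extract"><span class="extract-syntax">d & TOY_NUMBER_FLAG</span></span> on the result then answers
"does any word in this range look like a number?" without a second pass.
</p>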
|
|
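<p class="commentary firstcommentary"><a id="SP12"></a><b>§12. </b>Also, the <span class="extract"><span class="extract-syntax">nt_incidence</span></span> value stored in each entry, which is
used by the optimiser, can be set and read:
</p>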
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::set_ntb</span><button class="popup" onclick="togglePopup('usagePopup11')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup11">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::set_ntb</span></span>:<br/>The Optimiser - <a href="4-to.html#SP3">§3</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ve</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">R</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">nt_incidence</span><span class="plain-syntax"> = </span><span class="identifier-syntax">R</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::get_ntb</span><button class="popup" onclick="togglePopup('usagePopup12')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup12">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::get_ntb</span></span>:<br/>The Optimiser - <a href="4-to.html#SP3">§3</a>, <a href="4-to.html#SP4">§4</a>, <a href="4-to.html#SP5">§5</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ve</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">ve</span><span class="plain-syntax">-></span><span class="element-syntax">nt_incidence</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
</pre>
|
|
<p class="commentary firstcommentary"><a id="SP13"></a><b>§13. Hash coding of words. </b>To find all the different words used in the source text, we need in principle
to make an enormous number of comparisons of their texts. It is slow to make
a correct identification of two texts as being equal: we have to compare
every one of their characters against each other. Fortunately, it can be much
faster to tell if they are different. We do this by rapidly deriving a
number from their texts, and then comparing the numbers: if the numbers differ,
the texts must differ too.
</p>
|
|
|
|
<p class="commentary">The most obvious number would be the length of the text, but this produces
|
|
too little variation, and too many false positives: "blue" and "cyan",
|
|
for instance, would each produce the number 4.
|
|
</p>
|
|
|
|
<p class="commentary">Instead we use a standard method to derive a number traditionally called
|
|
a "hash code". This is the algorithm called "X 30011" in Aho, Sethi and
|
|
Ullman's standard "Compilers: Principles, Techniques and Tools" (1986).
|
|
Because it is derived from constantly overflowing integer arithmetic,
|
|
it will produce different codes on different architectures (say, where
|
|
<span class="extract"><span class="extract-syntax">int</span></span> is 64 bits long rather than 32, or where <span class="extract"><span class="extract-syntax">char</span></span> is unsigned).
|
|
All that matters is that it provides a good spread of hash codes for
|
|
typical texts fed into it on any given occasion.
|
|
</p>
|
|
|
|
<p class="commentary">Good results depend on the number of possible codes being not too tiny
|
|
compared to the number of different texts fed in, and also on the key value
|
|
30011 being coprime to this number (but 30011 is prime, so that's easily
|
|
arranged). A typical source text of 50,000 words has an unquoted vocabulary
|
|
of only about 2000 different words. The variation in vocabulary size
|
|
between the smallest text source and the largest is only about a factor of
|
|
three or four, so there is no need to make a dynamic estimate of the size
|
|
of the source. We will always choose 997 as the number of possible hash
|
|
codes produced by X 30011: we reserve a further three special codes to be
|
|
the hashes of literals rather than ordinary words, and this brings us up to
|
|
a round 1000.
|
|
</p>
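<p class="commentary">The quick-rejection idea and the range of codes described above can be
sketched as follows. This is a simplified stand-in rather than this module's
own routine: the names <span class="extract"><span class="extract-syntax">toy_hash</span></span>, <span class="extract"><span class="extract-syntax">toy_same_word</span></span>, <span class="extract"><span class="extract-syntax">TOY_HASH_TAB_SIZE</span></span>
and <span class="extract"><span class="extract-syntax">TOY_RESERVED</span></span> are hypothetical, and the sketch assumes unsigned
arithmetic so that overflow is well defined.
</p>

```c
#include <assert.h>
#include <wchar.h>

#define TOY_HASH_TAB_SIZE 1000 /* codes run from 0 to 999 */
#define TOY_RESERVED 3         /* codes 0, 1, 2 are kept for literals */

/* Multiply-and-add hash with key 30011, reduced to one of 997 codes
   and shifted up so that it never lands on a reserved code. */
int toy_hash(const wchar_t *text) {
    assert(text != NULL);
    unsigned int h = 0;
    for (const wchar_t *p = text; *p != 0; p++)
        h = h * 30011u + (unsigned int) *p;
    return (int) (TOY_RESERVED + h % (TOY_HASH_TAB_SIZE - TOY_RESERVED));
}

/* Different codes prove the texts differ; equal codes prove nothing,
   so only then do we pay for a character-by-character comparison. */
int toy_same_word(const wchar_t *a, int hash_a, const wchar_t *b, int hash_b) {
    if (hash_a != hash_b) return 0;
    return wcscmp(a, b) == 0;
}
```

<p class="commentary">Whatever codes "blue" and "cyan" happen to receive, <span class="extract"><span class="extract-syntax">toy_same_word</span></span> still
answers correctly, since a collision only costs one full comparison.
</p>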
|
|
|
|
<p class="commentary">Inside the lexer, decimal integers such as <span class="extract"><span class="extract-syntax">-506</span></span> were treated as ordinary
words, as there were no lexical difficulties in parsing them. Here they
begin to semantically diverge from the way other ordinary words are handled:
they're treated more like literal texts and I6 inclusions.
</p>
<pre class="definitions code-font"><span class="definition-keyword">define</span> <span class="constant-syntax">HASH_TAB_SIZE</span><span class="plain-syntax"> </span><span class="constant-syntax">1000</span><span class="plain-syntax"> </span><span class="comment-syntax"> the possible hash codes are 0 up to this minus 1</span>
<span class="definition-keyword">define</span> <span class="constant-syntax">NUMBER_HASH</span><span class="plain-syntax"> </span><span class="constant-syntax">0</span><span class="plain-syntax"> </span><span class="comment-syntax"> literal decimal integers, and no other words, have this hash code</span>
<span class="definition-keyword">define</span> <span class="constant-syntax">TEXT_HASH</span><span class="plain-syntax"> </span><span class="constant-syntax">1</span><span class="plain-syntax"> </span><span class="comment-syntax"> double quoted texts, and no other words, have this hash code</span>
<span class="definition-keyword">define</span> <span class="constant-syntax">I6_HASH</span><span class="plain-syntax"> </span><span class="constant-syntax">2</span><span class="plain-syntax"> </span><span class="comment-syntax"> the </span><span class="extract"><span class="extract-syntax">(-</span></span><span class="comment-syntax"> word introducing an I6 inclusion uniquely has this hash code</span>
</pre>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::hash_code_from_word</span><button class="popup" onclick="togglePopup('usagePopup13')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup13">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::hash_code_from_word</span></span>:<br/><a href="2-vcb.html#SP15">§15</a></span></button><span class="plain-syntax">(</span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">text</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">unsigned</span><span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">hash_code</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">p</span><span class="plain-syntax"> = </span><span class="identifier-syntax">text</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">switch</span><span class="plain-syntax">(*</span><span class="identifier-syntax">p</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'-'</span><span class="plain-syntax">: </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">p</span><span class="plain-syntax">[1] == </span><span class="constant-syntax">0</span><span class="plain-syntax">) </span><span class="reserved-syntax">break</span><span class="plain-syntax">; </span><span class="comment-syntax"> an isolated minus sign is an ordinary word</span>
<span class="plain-syntax"> </span><span class="comment-syntax"> and otherwise fall into...</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'0'</span><span class="plain-syntax">: </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'1'</span><span class="plain-syntax">: </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'2'</span><span class="plain-syntax">: </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'3'</span><span class="plain-syntax">: </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'4'</span><span class="plain-syntax">:</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'5'</span><span class="plain-syntax">: </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'6'</span><span class="plain-syntax">: </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'7'</span><span class="plain-syntax">: </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'8'</span><span class="plain-syntax">: </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'9'</span><span class="plain-syntax">:</span>
<span class="plain-syntax"> </span><span class="comment-syntax"> the first character may prove to be the start of a number: is this true?</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="identifier-syntax">p</span><span class="plain-syntax">++; *</span><span class="identifier-syntax">p</span><span class="plain-syntax">; </span><span class="identifier-syntax">p</span><span class="plain-syntax">++) </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">Characters::isdigit</span><span class="plain-syntax">(*</span><span class="identifier-syntax">p</span><span class="plain-syntax">) == </span><span class="identifier-syntax">FALSE</span><span class="plain-syntax">) </span><span class="reserved-syntax">goto</span><span class="plain-syntax"> </span><span class="identifier-syntax">Try_Text</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">NUMBER_HASH</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">' '</span><span class="plain-syntax">: </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">I6_HASH</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'('</span><span class="plain-syntax">: </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">p</span><span class="plain-syntax">[1] == </span><span class="character-syntax">'-'</span><span class="plain-syntax">) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">I6_HASH</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'"'</span><span class="plain-syntax">: </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">TEXT_HASH</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">Try_Text:</span>
<span class="plain-syntax"> #</span><span class="identifier-syntax">pragma</span><span class="plain-syntax"> </span><span class="identifier-syntax">clang</span><span class="plain-syntax"> </span><span class="identifier-syntax">diagnostic</span><span class="plain-syntax"> </span><span class="identifier-syntax">push</span>
<span class="plain-syntax"> #</span><span class="identifier-syntax">pragma</span><span class="plain-syntax"> </span><span class="identifier-syntax">clang</span><span class="plain-syntax"> </span><span class="identifier-syntax">diagnostic</span><span class="plain-syntax"> </span><span class="identifier-syntax">ignored</span><span class="plain-syntax"> </span><span class="string-syntax">"-Wsign-conversion"</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="identifier-syntax">p</span><span class="plain-syntax">=</span><span class="identifier-syntax">text</span><span class="plain-syntax">; *</span><span class="identifier-syntax">p</span><span class="plain-syntax">; </span><span class="identifier-syntax">p</span><span class="plain-syntax">++) </span><span class="identifier-syntax">hash_code</span><span class="plain-syntax"> = </span><span class="identifier-syntax">hash_code</span><span class="plain-syntax">*30011 + (*</span><span class="identifier-syntax">p</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> #</span><span class="identifier-syntax">pragma</span><span class="plain-syntax"> </span><span class="identifier-syntax">clang</span><span class="plain-syntax"> </span><span class="identifier-syntax">diagnostic</span><span class="plain-syntax"> </span><span class="identifier-syntax">pop</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax">) (3+(</span><span class="identifier-syntax">hash_code</span><span class="plain-syntax"> % (</span><span class="constant-syntax">HASH_TAB_SIZE</span><span class="plain-syntax">-3))); </span><span class="comment-syntax"> result of X 30011, plus 3</span>
<span class="plain-syntax">}</span>
</pre>
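<p class="commentary">For readers who want to experiment outside the module, here is a minimal
standalone sketch of the same classification and hashing in standard C. It is
not the module's own code: the names <span class="extract"><span class="extract-syntax">looks_like_integer</span></span> and
<span class="extract"><span class="extract-syntax">sketch_hash_code</span></span> are hypothetical, and <span class="extract"><span class="extract-syntax">iswdigit</span></span> is assumed to be an
acceptable stand-in for <span class="extract"><span class="extract-syntax">Characters::isdigit</span></span>.
</p>

```c
#include <assert.h>   /* only needed for the checks suggested below */
#include <wchar.h>
#include <wctype.h>

#define HASH_TAB_SIZE 1000
#define NUMBER_HASH 0
#define TEXT_HASH 1
#define I6_HASH 2

/* Returns 1 if text is an optional minus sign followed by one or more digits. */
static int looks_like_integer(const wchar_t *text) {
    const wchar_t *p = text;
    if (*p == L'-') p++;
    if (*p == 0) return 0;   /* empty text, or an isolated minus sign */
    for (; *p; p++)
        if (!iswdigit((wint_t) *p)) return 0;
    return 1;
}

/* Hypothetical re-expression of the hash: reserved codes first, then X 30011. */
int sketch_hash_code(const wchar_t *text) {
    unsigned int h = 0;
    if (looks_like_integer(text)) return NUMBER_HASH;
    if (text[0] == L' ') return I6_HASH;   /* as in the listing above */
    if (text[0] == L'(' && text[1] == L'-') return I6_HASH;
    if (text[0] == L'"') return TEXT_HASH;
    /* unsigned arithmetic wraps around on overflow, which is well defined in C */
    for (const wchar_t *p = text; *p; p++)
        h = h * 30011u + (unsigned int) *p;
    return (int) (3 + (h % (HASH_TAB_SIZE - 3)));
}
```

<p class="commentary">In this sketch <span class="extract"><span class="extract-syntax">-506</span></span> receives <span class="extract"><span class="extract-syntax">NUMBER_HASH</span></span>, while an isolated minus sign is
hashed as an ordinary word and so receives a code of 3 or more.
</p>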
<p class="commentary firstcommentary"><a id="SP14"></a><b>§14. The hash table of vocabulary. </b>Armed with these hash codes, we now store the pointers to the vocabulary
entry structures in linked lists, one for each possible hash code.
These begin empty.
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">list_of_vocab_with_hash</span><span class="plain-syntax">[</span><span class="constant-syntax">HASH_TAB_SIZE</span><span class="plain-syntax">];</span>
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::start_hash_table</span><button class="popup" onclick="togglePopup('usagePopup14')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup14">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::start_hash_table</span></span>:<br/>Lexer - <a href="3-lxr.html#SP13">§13</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">void</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax">=0; </span><span class="identifier-syntax">i</span><span class="plain-syntax"><</span><span class="constant-syntax">HASH_TAB_SIZE</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax">++) </span><span class="identifier-syntax">list_of_vocab_with_hash</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">] = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::write_hash_table</span><span class="plain-syntax">(</span><span class="identifier-syntax">OUTPUT_STREAM</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax">=0; </span><span class="identifier-syntax">i</span><span class="plain-syntax"><</span><span class="constant-syntax">HASH_TAB_SIZE</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax">++) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">c</span><span class="plain-syntax">=0;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">entry</span><span class="plain-syntax"> = </span><span class="identifier-syntax">list_of_vocab_with_hash</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">];</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">entry</span><span class="plain-syntax">; </span><span class="identifier-syntax">entry</span><span class="plain-syntax"> = </span><span class="identifier-syntax">entry</span><span class="plain-syntax">-></span><span class="element-syntax">next_in_vocab_hash</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">c</span><span class="plain-syntax">++ == </span><span class="constant-syntax">0</span><span class="plain-syntax">) </span><span class="identifier-syntax">PRINT</span><span class="plain-syntax">(</span><span class="string-syntax">"%d:"</span><span class="plain-syntax">, </span><span class="identifier-syntax">i</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">PRINT</span><span class="plain-syntax">(</span><span class="string-syntax">" %w"</span><span class="plain-syntax">, </span><span class="identifier-syntax">entry</span><span class="plain-syntax">-></span><span class="element-syntax">exemplar</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">c</span><span class="plain-syntax">>0) </span><span class="identifier-syntax">PRINT</span><span class="plain-syntax">(</span><span class="string-syntax">"\n"</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax">}</span>
</pre>
<p class="commentary firstcommentary"><a id="SP15"></a><b>§15. </b>And that leaves only one routine: for finding the unique vocabulary
entry pointer associated with the material in <span class="extract"><span class="extract-syntax">text</span></span>. We search the
hash table to see if we have the word already, and if not, we add it.
Either way, we return a valid pointer. (Compare Isaiah 55:11, "So shall
my word be that goeth forth out of my mouth: it shall not return unto
me void.")
</p>
<p class="commentary">It is in order to set the initial values of the flags for the new
word (if it does turn out to be new) that we mandated special hash
codes for any number, any text, or any I6 inclusion.
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">no_vocabulary_entries</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="function-syntax">Vocabulary::entry_for_text</span><button class="popup" onclick="togglePopup('usagePopup15')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup15">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::entry_for_text</span></span>:<br/><a href="2-vcb.html#SP2">§2</a>, <a href="2-vcb.html#SP3">§3</a><br/>Nonterminals - <a href="4-nnt.html#SP2">§2</a>, <a href="4-nnt.html#SP4">§4</a><br/>Loading Preform - <a href="4-lp.html#SP6">§6</a></span></button><span class="plain-syntax">(</span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">text</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">new_entry</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">hash_code</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP13" class="function-link"><span class="function-syntax">Vocabulary::hash_code_from_word</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">), </span><span class="identifier-syntax">val</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">unsigned</span><span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">f</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">switch</span><span class="plain-syntax">(</span><span class="identifier-syntax">hash_code</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">NUMBER_HASH:</span><span class="plain-syntax"> </span><span class="identifier-syntax">f</span><span class="plain-syntax"> = </span><span class="constant-syntax">NUMBER_MC</span><span class="plain-syntax">; </span><span class="identifier-syntax">val</span><span class="plain-syntax"> = </span><span class="identifier-syntax">Wide::atoi</span><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">); </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">TEXT_HASH:</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">switch</span><span class="plain-syntax"> (</span><a href="2-nw.html#SP3" class="function-link"><span class="function-syntax">Word::perhaps_ill_formed_text_routine</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">)) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">TRUE:</span><span class="plain-syntax"> </span><span class="identifier-syntax">f</span><span class="plain-syntax"> = </span><span class="constant-syntax">TEXTWITHSUBS_MC</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">FALSE:</span><span class="plain-syntax"> </span><span class="identifier-syntax">f</span><span class="plain-syntax"> = </span><span class="constant-syntax">TEXT_MC</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">NOT_APPLICABLE:</span><span class="plain-syntax"> </span><span class="identifier-syntax">f</span><span class="plain-syntax"> = </span><span class="constant-syntax">TEXT_MC</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">I6_HASH:</span><span class="plain-syntax"> </span><span class="identifier-syntax">f</span><span class="plain-syntax"> = </span><span class="constant-syntax">I6_MC</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">default:</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">val</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP17" class="function-link"><span class="function-syntax">Vocabulary::an_ordinal_number</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">val</span><span class="plain-syntax"> >= </span><span class="constant-syntax">0</span><span class="plain-syntax">) </span><span class="identifier-syntax">f</span><span class="plain-syntax"> = </span><span class="constant-syntax">NUMBER_MC</span><span class="plain-syntax"> + </span><span class="constant-syntax">ORDINAL_MC</span><span class="plain-syntax">; </span><span class="comment-syntax"> so that "4th", say, picks up both</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">list_of_vocab_with_hash</span><span class="plain-syntax">[</span><span class="identifier-syntax">hash_code</span><span class="plain-syntax">] == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="2-vcb.html#SP15_1" class="named-paragraph-link"><span class="named-paragraph">Pi-ty? That word is not in my vocabulary banks</span><span class="named-paragraph-number">15.1</span></a></span><span class="plain-syntax">;</span>
<span class="plain-syntax"> } </span><span class="reserved-syntax">else</span><span class="plain-syntax"> {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="identifier-syntax">old_entry</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">n</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="comment-syntax"> search the non-empty list of words with this hash code</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="identifier-syntax">n</span><span class="plain-syntax">=0, </span><span class="identifier-syntax">new_entry</span><span class="plain-syntax"> = </span><span class="identifier-syntax">list_of_vocab_with_hash</span><span class="plain-syntax">[</span><span class="identifier-syntax">hash_code</span><span class="plain-syntax">];</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">new_entry</span><span class="plain-syntax"> != </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">n</span><span class="plain-syntax">++, </span><span class="identifier-syntax">old_entry</span><span class="plain-syntax"> = </span><span class="identifier-syntax">new_entry</span><span class="plain-syntax">, </span><span class="identifier-syntax">new_entry</span><span class="plain-syntax"> = </span><span class="identifier-syntax">new_entry</span><span class="plain-syntax">-></span><span class="element-syntax">next_in_vocab_hash</span><span class="plain-syntax">)</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">Wide::cmp</span><span class="plain-syntax">(</span><span class="identifier-syntax">new_entry</span><span class="plain-syntax">-></span><span class="element-syntax">exemplar</span><span class="plain-syntax">, </span><span class="identifier-syntax">text</span><span class="plain-syntax">) == </span><span class="constant-syntax">0</span><span class="plain-syntax">)</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">new_entry</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="comment-syntax"> and if we do not find </span><span class="extract"><span class="extract-syntax">text</span></span><span class="comment-syntax"> in there, then...</span>
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="2-vcb.html#SP15_2" class="named-paragraph-link"><span class="named-paragraph">My vision is impaired! I cannot see!</span><span class="named-paragraph-number">15.2</span></a></span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax">}</span>
</pre>
<p class="commentary firstcommentary"><a id="SP15_1"></a><b>§15.1. </b>Here the list for this word's hash code was empty, either meaning that this
is a hash code never seen for any word before (in which case we start the
list for that hash code with the new word), or that the word is a text
literal: for efficiency's sake, we deliberately keep the
hash list for all text literals empty.
</p>
<p class="commentary"><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">Pi-ty? That word is not in my vocabulary banks</span><span class="named-paragraph-number">15.1</span></span><span class="comment-syntax"> =</span>
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> </span><span class="identifier-syntax">new_entry</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP5" class="function-link"><span class="function-syntax">Vocabulary::vocab_entry_new</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">hash_code</span><span class="plain-syntax">, </span><span class="identifier-syntax">f</span><span class="plain-syntax">, </span><span class="identifier-syntax">val</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">hash_code</span><span class="plain-syntax"> != </span><span class="constant-syntax">TEXT_HASH</span><span class="plain-syntax">) </span><span class="identifier-syntax">list_of_vocab_with_hash</span><span class="plain-syntax">[</span><span class="identifier-syntax">hash_code</span><span class="plain-syntax">] = </span><span class="identifier-syntax">new_entry</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">LOGIF</span><span class="plain-syntax">(</span><span class="identifier-syntax">VOCABULARY</span><span class="plain-syntax">, </span><span class="string-syntax">"Word %d <%w> is first vocabulary with hash %d\n"</span><span class="plain-syntax">,</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">no_vocabulary_entries</span><span class="plain-syntax">++, </span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">hash_code</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">new_entry</span><span class="plain-syntax">;</span>
</pre>
<ul class="endnotetexts"><li>This code is used in <a href="2-vcb.html#SP15">§15</a>.</li></ul>
<p class="commentary firstcommentary"><a id="SP15_2"></a><b>§15.2. </b>And here, we exhausted the list at entry <span class="extract"><span class="extract-syntax">n-1</span></span>, with the last entry being
pointed to by <span class="extract"><span class="extract-syntax">old_entry</span></span>. We add the new word at the end.
</p>
<p class="commentary"><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">My vision is impaired! I cannot see!</span><span class="named-paragraph-number">15.2</span></span><span class="comment-syntax"> =</span>
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> </span><span class="identifier-syntax">new_entry</span><span class="plain-syntax"> = </span><a href="2-vcb.html#SP5" class="function-link"><span class="function-syntax">Vocabulary::vocab_entry_new</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">hash_code</span><span class="plain-syntax">, </span><span class="identifier-syntax">f</span><span class="plain-syntax">, </span><span class="identifier-syntax">val</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">old_entry</span><span class="plain-syntax">-></span><span class="element-syntax">next_in_vocab_hash</span><span class="plain-syntax"> = </span><span class="identifier-syntax">new_entry</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">LOGIF</span><span class="plain-syntax">(</span><span class="identifier-syntax">VOCABULARY</span><span class="plain-syntax">, </span><span class="string-syntax">"Word %d <%w> is vocabulary entry no. %d with hash %d\n"</span><span class="plain-syntax">,</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">no_vocabulary_entries</span><span class="plain-syntax">++, </span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">n</span><span class="plain-syntax">, </span><span class="identifier-syntax">hash_code</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">new_entry</span><span class="plain-syntax">;</span>
</pre>
<ul class="endnotetexts"><li>This code is used in <a href="2-vcb.html#SP15">§15</a>.</li></ul>
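<p class="commentary firstcommentary"><a id="SP16"></a><b>§16. Partial words. </b>Much the same, except that we copy a fragment of a word, the characters of
<span class="code-font">str</span> from index <span class="code-font">from</span> to index <span class="code-font">to</span> inclusive, into lexical memory
and then find its identity as if it were a whole word. If the fragment lexes to
nothing at all, the result is <span class="code-font">NULL</span>.
</p>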
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">vocabulary_entry</span><span class="plain-syntax"> *</span><span class="function-syntax">Vocabulary::entry_for_partial_text</span><span class="plain-syntax">(</span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">str</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">from</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">to</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">TEMPORARY_TEXT</span><span class="plain-syntax">(</span><span class="identifier-syntax">TEMP</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax">=</span><span class="identifier-syntax">from</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax">&lt;=</span><span class="identifier-syntax">to</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax">++) </span><span class="identifier-syntax">PUT_TO</span><span class="plain-syntax">(</span><span class="identifier-syntax">TEMP</span><span class="plain-syntax">, </span><span class="identifier-syntax">str</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">]);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">PUT_TO</span><span class="plain-syntax">(</span><span class="identifier-syntax">TEMP</span><span class="plain-syntax">, </span><span class="constant-syntax">0</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">wording</span><span class="plain-syntax"> </span><span class="identifier-syntax">W</span><span class="plain-syntax"> = </span><a href="3-fds.html#SP3" class="function-link"><span class="function-syntax">Feeds::feed_text</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">TEMP</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">DISCARD_TEXT</span><span class="plain-syntax">(</span><span class="identifier-syntax">TEMP</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><a href="3-wrd.html#SP12" class="function-link"><span class="function-syntax">Wordings::empty</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">W</span><span class="plain-syntax">)) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><a href="3-lxr.html#SP20" class="function-link"><span class="function-syntax">Lexer::word</span></a><span class="plain-syntax">(</span><a href="3-wrd.html#SP8" class="function-link"><span class="function-syntax">Wordings::first_wn</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">W</span><span class="plain-syntax">));</span>
<span class="plain-syntax">}</span>
</pre>
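<p class="commentary">The idea of looking up a fragment as if it were a whole word can be sketched in
standard C without the lexer. This is only an illustration under stated assumptions: the names
<span class="code-font">toy_vocabulary</span>, <span class="code-font">toy_entry_for_text</span>,
<span class="code-font">toy_entry_for_partial_text</span> and <span class="code-font">MAX_WORDS</span> are hypothetical,
a linear search stands in for the hash lists, and an array index stands in for a
<span class="code-font">vocabulary_entry</span> pointer. It does not model the lexer, which in the real
function may break the fed text into several words, of which only the first is returned.
</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

#define MAX_WORDS 64   /* hypothetical capacity of the toy vocabulary */

/* Toy vocabulary: each distinct word is stored once, and its index serves
   as its identity. A stand-in for the real vocabulary, not a copy of it. */
static wchar_t *toy_vocabulary[MAX_WORDS];
static int toy_vocabulary_size = 0;

/* Return the identity of a whole word, adding it if it is new.
   Returns -1 if the table is full or memory runs out. */
int toy_entry_for_text(const wchar_t *word) {
    assert(word != NULL);
    for (int i = 0; i < toy_vocabulary_size; i++)
        if (wcscmp(toy_vocabulary[i], word) == 0) return i;
    if (toy_vocabulary_size == MAX_WORDS) return -1;
    size_t len = wcslen(word);
    wchar_t *copy = malloc((len + 1) * sizeof(wchar_t));
    if (copy == NULL) return -1;
    wmemcpy(copy, word, len + 1);   /* copies the terminator too */
    toy_vocabulary[toy_vocabulary_size] = copy;
    return toy_vocabulary_size++;
}

/* Identity of the fragment str[from..to], both bounds inclusive.
   The caller must ensure that 'to' lies within the string. */
int toy_entry_for_partial_text(const wchar_t *str, int from, int to) {
    assert(str != NULL);
    if (from < 0 || to < from) return -1;   /* empty fragment */
    size_t len = (size_t) (to - from + 1);
    wchar_t *temp = malloc((len + 1) * sizeof(wchar_t));
    if (temp == NULL) return -1;
    wmemcpy(temp, str + from, len);
    temp[len] = L'\0';
    int id = toy_entry_for_text(temp);      /* look it up as a whole word */
    free(temp);                              /* the temporary copy is discarded */
    return id;
}
```

<p class="commentary">In this sketch, looking up characters 4 to 6 of "jumping" yields the same identity
as looking up the whole word "ing", which is the property the function above relies on.
</p>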
<p class="commentary firstcommentary"><a id="SP17"></a><b>§17. Ordinals. </b>The following tests whether the string is a non-negative integer
written as an English ordinal: 0th, 1st, 2nd, 3rd, 4th, 5th, ... It returns the
value of that integer if so, and -1 if not. Note
that we don't bother to police the finicky rules on which suffix should
accompany which value (22nd not 22th, and so on), so 22th is accepted too.
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Vocabulary::an_ordinal_number</span><button class="popup" onclick="togglePopup('usagePopup16')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup16">Usage of <span class="code-font"><span class="function-syntax">Vocabulary::an_ordinal_number</span></span>:<br/><a href="2-vcb.html#SP15">§15</a></span></button><span class="plain-syntax">(</span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">fw</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax">=0; </span><span class="identifier-syntax">fw</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">] != </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax">++)</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (!(</span><span class="identifier-syntax">Characters::isdigit</span><span class="plain-syntax">(</span><span class="identifier-syntax">fw</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">]))) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">i</span><span class="plain-syntax">>0) &&</span>
<span class="plain-syntax"> (((</span><span class="identifier-syntax">fw</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">] == </span><span class="character-syntax">'s'</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">fw</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">+1] == </span><span class="character-syntax">'t'</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">fw</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">+2] == </span><span class="constant-syntax">0</span><span class="plain-syntax">)) ||</span>
<span class="plain-syntax"> ((</span><span class="identifier-syntax">fw</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">] == </span><span class="character-syntax">'n'</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">fw</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">+1] == </span><span class="character-syntax">'d'</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">fw</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">+2] == </span><span class="constant-syntax">0</span><span class="plain-syntax">)) ||</span>
<span class="plain-syntax"> ((</span><span class="identifier-syntax">fw</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">] == </span><span class="character-syntax">'r'</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">fw</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">+1] == </span><span class="character-syntax">'d'</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">fw</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">+2] == </span><span class="constant-syntax">0</span><span class="plain-syntax">)) ||</span>
<span class="plain-syntax"> ((</span><span class="identifier-syntax">fw</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">] == </span><span class="character-syntax">'t'</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">fw</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">+1] == </span><span class="character-syntax">'h'</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">fw</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">+2] == </span><span class="constant-syntax">0</span><span class="plain-syntax">))))</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">Wide::atoi</span><span class="plain-syntax">(</span><span class="identifier-syntax">fw</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> -1;</span>
<span class="plain-syntax">}</span>
</pre>
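<p class="commentary">The same test can be sketched with the standard library alone, which makes its edge
cases easy to try out. This is an illustration, not the module's code: the name
<span class="code-font">ordinal_value</span> is hypothetical, and it assumes that
<span class="code-font">iswdigit</span> and <span class="code-font">wcstol</span> behave like
<span class="code-font">Characters::isdigit</span> and <span class="code-font">Wide::atoi</span> for the
ASCII digits considered here.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical stand-in for the ordinal test above: returns the value of an
   ordinal such as "3rd", or -1 if the string is not of that form. */
int ordinal_value(const wchar_t *fw) {
    assert(fw != NULL);
    size_t i = 0;
    while (iswdigit((wint_t) fw[i])) i++;   /* skip the leading digits */
    if (i == 0) return -1;                   /* a suffix alone is not an ordinal */

    /* Whatever follows the digits must be exactly one of these suffixes.
       As in the original, the suffix is not checked against the value. */
    static const wchar_t *suffixes[] = { L"st", L"nd", L"rd", L"th" };
    for (size_t k = 0; k < sizeof suffixes / sizeof suffixes[0]; k++)
        if (wcscmp(fw + i, suffixes[k]) == 0)
            return (int) wcstol(fw, NULL, 10);   /* parses the digit prefix */
    return -1;
}
```

<p class="commentary">Under these assumptions "1st" and "22th" both count as ordinals, while "12" (digits
with no suffix), "st" (a suffix with no digits) and "3rdx" (trailing characters) do not.
</p>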
<nav role="progress"><div class="progresscontainer">
<ul class="progressbar"><li class="progressprev"><a href="1-wm.html">❮</a></li><li class="progresschapter"><a href="P-wtmd.html">P</a></li><li class="progresschapter"><a href="1-wm.html">1</a></li><li class="progresscurrentchapter">2</li><li class="progresscurrent">vcb</li><li class="progresssection"><a href="2-wa.html">wa</a></li><li class="progresssection"><a href="2-nw.html">nw</a></li><li class="progresschapter"><a href="3-lxr.html">3</a></li><li class="progresschapter"><a href="4-ap.html">4</a></li><li class="progressnext"><a href="2-wa.html">❯</a></li></ul></div>
</nav><!--End of weave-->
</main>
</body>
</html>