mirror of
https://github.com/ganelson/inform.git
synced 2024-07-16 22:14:23 +03:00
662 lines
78 KiB
HTML
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
|
|
<html>
|
|
<head>
|
|
<title>2/vcb</title>
|
|
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
|
<meta http-equiv="Content-Language" content="en-gb">
|
|
<link href="inweb.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
</head>
|
|
<body>
|
|
|
|
<!--Weave of '2/vcb' generated by 7-->
|
|
<ul class="crumbs"><li><a href="../webs.html">★</a></li><li><a href="index.html">words</a></li><li><a href="index.html#2">Chapter 2: Words in Isolation</a></li><li><b>Vocabulary</b></li></ul><p class="purpose">To classify the words in the lexical stream, where two different words are considered equivalent if they are unquoted and have the same text, taken case insensitively.</p>
|
|
|
|
<ul class="toc"><li><a href="#SP1">§1. Definitions</a></li><li><a href="#SP14">§14. Hash coding of words</a></li><li><a href="#SP15">§15. The hash table of vocabulary</a></li><li><a href="#SP17">§17. Partial words</a></li><li><a href="#SP18">§18. Ordinals</a></li></ul><hr class="tocbar">
|
|
|
|
<p class="inwebparagraph"><a id="SP1"></a><b>§1. Definitions. </b></p>
|
|
|
|
<p class="inwebparagraph"><a id="SP2"></a><b>§2. </b>The following structure is created for each different word found in the
|
|
source. (Recall that these are not necessarily words in the usual English
|
|
sense: for instance, <code class="display"><span class="extract">17</span></code> is a word here.)
|
|
</p>
|
|
|
|
<p class="inwebparagraph">The vocabulary entry structure exists to make textual comparisons faster,
|
|
which is essential to make NI run tolerably quickly: NI's speed on typical
|
|
source texts increased by a factor of 5-10 when this structure was
|
|
introduced. Firstly, the vocabulary is hashed so that it is not too
|
|
painful to compare a newly-read word against the known vocabulary;
|
|
secondly, each word stores linked lists of meanings which it begins,
|
|
occurs in the middle of, ends, or is optionally part of (in the sense
|
|
that "brown" is optionally part of the name "small brown shoe", which
|
|
could also be written "small shoe"); and thirdly, each word also carries
|
|
a bitmap of flags indicating the possible contexts in which it might
|
|
be used. Finally, to avoid parsing the same text over and over for its
|
|
possible meaning as a literal integer, we cache the result: for instance,
|
|
17 for the text <code class="display"><span class="extract">17</span></code>.
|
|
</p>
|
|
|
|
<p class="inwebparagraph">The meaning codes alluded to below are also used for excerpts of text
|
|
(i.e., are not just for single words) and are defined in Excerpt Meanings.
|
|
</p>
|
|
|
|
|
|
<pre class="definitions">
|
|
<span class="definitionkeyword">define</span> <span class="constant">ING_MC</span><span class="plain"> 0</span><span class="identifier">x04000000</span><span class="plain"> </span> <span class="comment">a word ending in -ing</span>
|
|
<span class="definitionkeyword">define</span> <span class="constant">NUMBER_MC</span><span class="plain"> 0</span><span class="identifier">x08000000</span><span class="plain"> </span> <span class="comment">one, two, ..., twelve, 1, 2, ...</span>
|
|
<span class="definitionkeyword">define</span> <span class="constant">I6_MC</span><span class="plain"> 0</span><span class="identifier">x10000000</span><span class="plain"> </span> <span class="comment">piece of verbatim I6 code</span>
|
|
<span class="definitionkeyword">define</span> <span class="constant">TEXTWITHSUBS_MC</span><span class="plain"> 0</span><span class="identifier">x20000000</span><span class="plain"> </span> <span class="comment">double-quoted text literal with substitutions</span>
|
|
<span class="definitionkeyword">define</span> <span class="constant">TEXT_MC</span><span class="plain"> 0</span><span class="identifier">x40000000</span><span class="plain"> </span> <span class="comment">double-quoted text literal without substitutions</span>
|
|
<span class="definitionkeyword">define</span> <span class="constant">ORDINAL_MC</span><span class="plain"> 0</span><span class="identifier">x80000000</span><span class="plain"> </span> <span class="comment">first, second, third, ..., twelfth</span>
|
|
</pre>
|
|
|
|
<pre class="display">
|
|
<span class="reserved">typedef</span><span class="plain"> </span><span class="reserved">struct</span><span class="plain"> </span><span class="reserved">vocabulary_entry</span><span class="plain"> {</span>
|
|
<span class="reserved">unsigned</span><span class="plain"> </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">flags</span><span class="plain">; </span> <span class="comment">bitmap of "meaning codes" indicating possible usages</span>
|
|
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">literal_number_value</span><span class="plain">; </span> <span class="comment">evaluation as a literal number, if any</span>
|
|
<span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">exemplar</span><span class="plain">; </span> <span class="comment">text of one instance of this word</span>
|
|
<span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">raw_exemplar</span><span class="plain">; </span> <span class="comment">text of one instance in its raw untreated form</span>
|
|
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">hash</span><span class="plain">; </span> <span class="comment">hash code derived from text of word</span>
|
|
<span class="reserved">struct</span><span class="plain"> </span><span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">next_in_vocab_hash</span><span class="plain">; </span> <span class="comment">next in list with this hash</span>
|
|
<span class="reserved">struct</span><span class="plain"> </span><span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">lower_case_form</span><span class="plain">; </span> <span class="comment">or null if none exists</span>
|
|
<span class="reserved">struct</span><span class="plain"> </span><span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">upper_case_form</span><span class="plain">; </span> <span class="comment">or null if none exists</span>
|
|
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">nt_incidence</span><span class="plain">; </span> <span class="comment">bitmap hashing which Preform nonterminals it occurs in</span>
|
|
<span class="reserved">struct</span><span class="plain"> </span><span class="identifier">vocabulary_meaning</span><span class="plain"> </span><span class="identifier">means</span><span class="plain">;</span>
|
|
<span class="plain">} </span><span class="reserved">vocabulary_entry</span><span class="plain">;</span>
|
|
</pre>
|
|
|
|
<p class="inwebparagraph"></p>
|
|
|
|
<p class="endnote">The structure vocabulary_entry is accessed in 4/prf and here.</p>
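<p class="inwebparagraph">As a rough illustration of how such a bitmap of meaning codes can be consulted, here is a minimal sketch. The <code class="display"><span class="extract">sketch_entry</span></code> type and the <code class="display"><span class="extract">sketch_</span></code> functions are hypothetical, not part of this module, and the idea that the cached number is only meaningful when <code class="display"><span class="extract">NUMBER_MC</span></code> is set is an assumption made for the example.</p>

```c
/* Meaning codes, with the values given in the definitions above. */
#define ING_MC          0x04000000u
#define NUMBER_MC       0x08000000u
#define I6_MC           0x10000000u
#define TEXTWITHSUBS_MC 0x20000000u
#define TEXT_MC         0x40000000u
#define ORDINAL_MC      0x80000000u

/* Hypothetical cut-down entry: only the flags and the cached number. */
typedef struct sketch_entry {
    unsigned int flags;        /* bitmap of meaning codes */
    int literal_number_value;  /* cached evaluation, if any */
} sketch_entry;

/* Nonzero if the entry carries at least one of the bits in mask. */
int sketch_has_any(const sketch_entry *ve, unsigned int mask) {
    return (ve->flags & mask) != 0;
}

/* Nonzero if the entry carries every bit in mask. */
int sketch_has_all(const sketch_entry *ve, unsigned int mask) {
    return (ve->flags & mask) == mask;
}

/* Assumption: the cached value is trusted only when NUMBER_MC is set;
   otherwise the caller's fallback is returned. */
int sketch_number_or(const sketch_entry *ve, int fallback) {
    return sketch_has_any(ve, NUMBER_MC) ? ve->literal_number_value : fallback;
}
```

<p class="inwebparagraph">Because each code occupies its own bit, a single word can carry several at once, and a test against a mask costs one bitwise operation rather than a text comparison.</p>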
|
|
|
|
<p class="inwebparagraph"><a id="SP3"></a><b>§3. </b>Some standard punctuation marks:
|
|
</p>
|
|
|
|
|
|
<pre class="display">
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">CLOSEBRACE_V</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">CLOSEBRACKET_V</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">COLON_V</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">COMMA_V</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">DOUBLEDASH_V</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">FORWARDSLASH_V</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">FULLSTOP_V</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">OPENBRACE_V</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">OPENBRACKET_V</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">OPENI6_V</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">PARBREAK_V</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">PLUS_V</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">SEMICOLON_V</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">STROKE_V</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
|
|
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Vocabulary::create_punctuation</span><span class="plain">(</span><span class="reserved">void</span><span class="plain">) {</span>
|
|
<span class="identifier">CLOSEBRACE_V</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="identifier">L</span><span class="string">"}"</span><span class="plain">);</span>
|
|
<span class="identifier">CLOSEBRACKET_V</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="identifier">L</span><span class="string">")"</span><span class="plain">);</span>
|
|
<span class="identifier">COLON_V</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="identifier">L</span><span class="string">":"</span><span class="plain">);</span>
|
|
<span class="identifier">COMMA_V</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="identifier">L</span><span class="string">","</span><span class="plain">);</span>
|
|
<span class="identifier">DOUBLEDASH_V</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="identifier">L</span><span class="string">"--"</span><span class="plain">);</span>
|
|
<span class="identifier">FORWARDSLASH_V</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="identifier">L</span><span class="string">"/"</span><span class="plain">);</span>
|
|
<span class="identifier">FULLSTOP_V</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="identifier">L</span><span class="string">"."</span><span class="plain">);</span>
|
|
<span class="identifier">OPENBRACE_V</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="identifier">L</span><span class="string">"{"</span><span class="plain">);</span>
|
|
<span class="identifier">OPENBRACKET_V</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="identifier">L</span><span class="string">"("</span><span class="plain">);</span>
|
|
<span class="identifier">OPENI6_V</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="identifier">L</span><span class="string">"(-"</span><span class="plain">);</span>
|
|
<span class="identifier">PARBREAK_V</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="constant">PARAGRAPH_BREAK</span><span class="plain">);</span>
|
|
<span class="identifier">PLUS_V</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="identifier">L</span><span class="string">"+"</span><span class="plain">);</span>
|
|
<span class="identifier">SEMICOLON_V</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="identifier">L</span><span class="string">";"</span><span class="plain">);</span>
|
|
<span class="identifier">STROKE_V</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="identifier">L</span><span class="string">"|"</span><span class="plain">);</span>
|
|
<span class="plain">}</span>
|
|
</pre>
|
|
|
|
<p class="inwebparagraph"></p>
|
|
|
|
<p class="endnote">The function Vocabulary::create_punctuation is used in 1/wm (<a href="1-wm.html#SP3">§3</a>).</p>
|
|
|
|
<p class="inwebparagraph"><a id="SP4"></a><b>§4. </b>Each distinct word is to have a unique <code class="display"><span class="extract">vocabulary_entry</span></code> structure, and the
|
|
"identity" at word number <code class="display"><span class="extract">wn</span></code> is to point to the structure for the text
|
|
at that word. Two words are distinct if their lower-case forms are different,
|
|
except that two quoted literal texts are always distinct, even if they have
|
|
the same content. So for instance,
|
|
</p>
|
|
|
|
<blockquote>
|
|
<p>Daleks conquer and destroy! "Ba-dum." Exterminate, exterminate! "Ba-dum."</p>
|
|
|
|
</blockquote>
|
|
|
|
<p class="inwebparagraph">would be identified as
|
|
</p>
|
|
|
|
<blockquote>
|
|
<p>|ve0| |ve1| |ve2| |ve3| |ve4| |ve5| |ve6| |ve7| |ve6| |ve4| |ve8|</p>
</blockquote>
<p class="inwebparagraph">where <code class="display"><span class="extract">ve4</span></code> is the common identity of both exclamation marks, <code class="display"><span class="extract">ve7</span></code> that of the comma, and <code class="display"><span class="extract">ve6</span></code>
that of the two "exterminate"s, even though they have different casings;
while the quoted text <code class="display"><span class="extract">"Ba-dum."</span></code> came out with two different identities
<code class="display"><span class="extract">ve5</span></code> and <code class="display"><span class="extract">ve8</span></code>.
|
|
</p>
|
|
|
|
<p class="inwebparagraph">When we want to set the identity for a given word, we call these front-door
|
|
routines, either on a single word or on a range.
|
|
</p>
|
|
|
|
|
|
<pre class="display">
|
|
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Vocabulary::identify_word</span><span class="plain">(</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">wn</span><span class="plain">) {</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">ve</span><span class="plain"> = </span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="functiontext">Lexer::word_text</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">));</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>raw_exemplar</span><span class="plain"> = </span><span class="functiontext">Lexer::word_raw_text</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">);</span>
|
|
<span class="functiontext">Lexer::set_word</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">, </span><span class="identifier">ve</span><span class="plain">);</span>
|
|
<span class="plain">}</span>
|
|
|
|
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Vocabulary::identify_word_range</span><span class="plain">(</span><span class="reserved">wording</span><span class="plain"> </span><span class="identifier">W</span><span class="plain">) {</span>
|
|
<span class="identifier">LOOP_THROUGH_WORDING</span><span class="plain">(</span><span class="identifier">i</span><span class="plain">, </span><span class="identifier">W</span><span class="plain">)</span>
|
|
<span class="functiontext">Vocabulary::identify_word</span><span class="plain">(</span><span class="identifier">i</span><span class="plain">);</span>
|
|
<span class="plain">}</span>
|
|
</pre>
|
|
|
|
<p class="inwebparagraph"></p>
|
|
|
|
<p class="endnote">The function Vocabulary::identify_word is used in <a href="#SP5">§5</a>, 3/lxr (<a href="3-lxr.html#SP26_5_2">§26.5.2</a>, <a href="3-lxr.html#SP26_6">§26.6</a>), 4/nw (<a href="4-nw.html#SP8">§8</a>).</p>
|
|
|
|
<p class="endnote">The function Vocabulary::identify_word_range is used in 3/fds (<a href="3-fds.html#SP5">§5</a>).</p>
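<p class="inwebparagraph">The identity rule described in this paragraph can be sketched as follows. This is only an illustration under simplifying assumptions: <code class="display"><span class="extract">sketch_vocab</span></code> and <code class="display"><span class="extract">sketch_identify</span></code> are hypothetical names, identities are plain array indices rather than pointers, and lookup is a linear search rather than the hash table which the real code uses.</p>

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

#define SKETCH_MAX_ENTRIES 256
#define SKETCH_MAX_LEN 64

/* Hypothetical vocabulary: a flat table of stored texts.
   Must start zeroed (count == 0). */
typedef struct sketch_vocab {
    wchar_t texts[SKETCH_MAX_ENTRIES][SKETCH_MAX_LEN];
    int count;
} sketch_vocab;

/* Returns the identity (table index) for a word, creating an entry if
   needed; returns -1 if the word is too long or the table is full.
   Unquoted words are compared on their lower-case form; a word that
   begins with a double quote always receives a fresh entry. */
int sketch_identify(sketch_vocab *v, const wchar_t *word) {
    size_t len = wcslen(word);
    if (len >= SKETCH_MAX_LEN) return -1;

    wchar_t lowered[SKETCH_MAX_LEN];
    for (size_t i = 0; i <= len; i++)  /* copies the terminator too */
        lowered[i] = (wchar_t) towlower((wint_t) word[i]);

    int quoted = (word[0] == L'"');
    if (!quoted)
        for (int i = 0; i < v->count; i++)
            if (wcscmp(v->texts[i], lowered) == 0) return i;

    if (v->count >= SKETCH_MAX_ENTRIES) return -1;
    wcscpy(v->texts[v->count], quoted ? word : lowered);
    return v->count++;
}
```

<p class="inwebparagraph">Fed the words of the Daleks sentence one at a time, with each punctuation mark counted as a word, this gives both exclamation marks one identity and both "exterminate"s another, while each occurrence of the quoted text receives an identity of its own.</p>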
|
|
|
|
<p class="inwebparagraph"><a id="SP5"></a><b>§5. </b>Should we ever change the text of a word, it's essential to re-identify it,
|
|
as otherwise its <code class="display"><span class="extract">lw_identity</span></code> points to the wrong vocabulary entry.
|
|
</p>
|
|
|
|
|
|
<pre class="display">
|
|
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Vocabulary::change_text_of_word</span><span class="plain">(</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">wn</span><span class="plain">, </span><span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">new</span><span class="plain">) {</span>
|
|
<span class="functiontext">Lexer::set_word_text</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">, </span><span class="identifier">new</span><span class="plain">);</span>
|
|
<span class="functiontext">Lexer::set_word_raw_text</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">, </span><span class="identifier">new</span><span class="plain">);</span>
|
|
<span class="functiontext">Vocabulary::identify_word</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">);</span>
|
|
<span class="plain">}</span>
|
|
</pre>
|
|
|
|
<p class="inwebparagraph"></p>
|
|
|
|
<p class="endnote">The function Vocabulary::change_text_of_word appears nowhere else.</p>
|
|
|
|
<p class="inwebparagraph"><a id="SP6"></a><b>§6. </b>We now need some utilities for dealing with vocabulary entries. Here is a
|
|
creator, and a debugging logger:
|
|
</p>
|
|
|
|
|
|
<pre class="display">
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="functiontext">Vocabulary::vocab_entry_new</span><span class="plain">(</span><span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">text</span><span class="plain">, </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">hash_code</span><span class="plain">, </span><span class="reserved">unsigned</span><span class="plain"> </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">flags</span><span class="plain">, </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">val</span><span class="plain">) {</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">ve</span><span class="plain"> = </span><span class="identifier">CREATE</span><span class="plain">(</span><span class="reserved">vocabulary_entry</span><span class="plain">);</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>exemplar</span><span class="plain"> = </span><span class="identifier">text</span><span class="plain">; </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>raw_exemplar</span><span class="plain"> = </span><span class="identifier">text</span><span class="plain">;</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>next_in_vocab_hash</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>lower_case_form</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">; </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>upper_case_form</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>hash</span><span class="plain"> = </span><span class="identifier">hash_code</span><span class="plain">;</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>nt_incidence</span><span class="plain"> = 0;</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>flags</span><span class="plain"> = </span><span class="identifier">flags</span><span class="plain">;</span>
|
|
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">l</span><span class="plain"> = </span><span class="identifier">Wide::len</span><span class="plain">(</span><span class="identifier">text</span><span class="plain">);</span>
|
|
<span class="reserved">if</span><span class="plain"> ((</span><span class="identifier">l</span><span class="plain">>3) && (</span><span class="identifier">text</span><span class="plain">[</span><span class="identifier">l</span><span class="plain">-3] == </span><span class="character">'i'</span><span class="plain">) && (</span><span class="identifier">text</span><span class="plain">[</span><span class="identifier">l</span><span class="plain">-2] == </span><span class="character">'n'</span><span class="plain">) && (</span><span class="identifier">text</span><span class="plain">[</span><span class="identifier">l</span><span class="plain">-1] == </span><span class="character">'g'</span><span class="plain">))</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>flags</span><span class="plain"> |= </span><span class="constant">ING_MC</span><span class="plain">;</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>literal_number_value</span><span class="plain"> = </span><span class="identifier">val</span><span class="plain">;</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>means</span><span class="plain"> = </span><span class="identifier">VOCABULARY_MEANING_INITIALISER</span><span class="plain">(</span><span class="identifier">ve</span><span class="plain">);</span>
|
|
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">ve</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
|
|
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Vocabulary::log</span><span class="plain">(</span><span class="identifier">OUTPUT_STREAM</span><span class="plain">, </span><span class="reserved">void</span><span class="plain"> *</span><span class="identifier">vve</span><span class="plain">) {</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">ve</span><span class="plain"> = (</span><span class="reserved">vocabulary_entry</span><span class="plain"> *) </span><span class="identifier">vve</span><span class="plain">;</span>
|
|
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">ve</span><span class="plain"> == </span><span class="identifier">NULL</span><span class="plain">) { </span><span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"NULL"</span><span class="plain">); </span><span class="reserved">return</span><span class="plain">; }</span>
|
|
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>exemplar</span><span class="plain"> == </span><span class="identifier">NULL</span><span class="plain">) { </span><span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"NULL-EXEMPLAR"</span><span class="plain">); </span><span class="reserved">return</span><span class="plain">; }</span>
|
|
<span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"%08x-%w-%08x"</span><span class="plain">, </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>hash</span><span class="plain">, </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>raw_exemplar</span><span class="plain">, </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>flags</span><span class="plain">);</span>
|
|
<span class="plain">}</span>
|
|
</pre>
|
|
|
|
<p class="inwebparagraph"></p>
|
|
|
|
<p class="endnote">The function Vocabulary::vocab_entry_new is used in <a href="#SP10">§10</a>, <a href="#SP16_1">§16.1</a>, <a href="#SP16_2">§16.2</a>.</p>
|
|
|
|
<p class="endnote">The function Vocabulary::log is used in 1/wm (<a href="1-wm.html#SP3_4">§3.4</a>).</p>
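<p class="inwebparagraph">The test for <code class="display"><span class="extract">ING_MC</span></code> made by the creator above is purely textual. Here it is isolated as a standalone sketch; <code class="display"><span class="extract">sketch_ing_flag</span></code> is a hypothetical name, and the standard <code class="display"><span class="extract">wcslen</span></code> is assumed to be an adequate stand-in for <code class="display"><span class="extract">Wide::len</span></code>.</p>

```c
#include <stddef.h>
#include <wchar.h>

#define ING_MC 0x04000000u  /* value from the definitions above */

/* Returns ING_MC if text is longer than three characters and ends in
   the letters "ing"; returns 0 otherwise. The comparison is against
   lower-case letters only, as in the creator above. */
unsigned int sketch_ing_flag(const wchar_t *text) {
    size_t l = wcslen(text);
    if ((l > 3) && (text[l-3] == L'i') && (text[l-2] == L'n') && (text[l-1] == L'g'))
        return ING_MC;
    return 0;
}
```

<p class="inwebparagraph">Note that the flag records only the spelling: "bathing" and "king" both qualify, whereas the bare word "ing" does not, since its length is not greater than 3.</p>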
|
|
|
|
<p class="inwebparagraph"><a id="SP7"></a><b>§7. </b>It's perhaps unexpected that a vocabulary entry stores not only (a pointer
to) a copy of the text, the "exemplar" (since it is text which is an
|
|
example of this vocabulary being used), but also a separate raw copy of
|
|
the text: raw in the sense of retaining the original form in the source
|
|
files which the word came from. This looks strange because we normally
|
|
identify words on their case-lowered text, not on their raw text. In
|
|
the source material:
|
|
</p>
|
|
|
|
<blockquote>
|
|
<p>Former Marillion vocalist Fish derived his nickname not from a fish, but from habitual bathing.</p>
|
|
|
|
</blockquote>
|
|
|
|
<p class="inwebparagraph">words 4, "Fish", and 11, "fish", each have the same vocabulary entry
|
|
as identity, even though their raw texts differ. Clearly the ordinary
|
|
exemplar of this entry must be "fish". But what should the raw exemplar
|
|
be, "Fish" or "fish"? The answer is the latter: in general, the raw
exemplar will always be the same as the exemplar, unless we have amended
it by hand, using the following routine.
|
|
</p>
|
|
|
|
|
|
<pre class="display">
|
|
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Vocabulary::set_raw_exemplar_to_text</span><span class="plain">(</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">wn</span><span class="plain">) {</span>
|
|
<span class="functiontext">Lexer::word</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">)-</span><span class="element">>raw_exemplar</span><span class="plain"> = </span><span class="functiontext">Lexer::word_text</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">);</span>
|
|
<span class="plain">}</span>
|
|
</pre>
|
|
|
|
<p class="inwebparagraph"></p>
|
|
|
|
<p class="endnote">The function Vocabulary::set_raw_exemplar_to_text is used in 4/nw (<a href="4-nw.html#SP8">§8</a>).</p>
|
|
|
|
<p class="inwebparagraph"><a id="SP8"></a><b>§8. </b>Here are some access routines for the data stored in this
|
|
structure:
|
|
</p>
|
|
|
|
|
|
<pre class="display">
|
|
<span class="identifier">wchar_t</span><span class="plain"> *</span><span class="functiontext">Vocabulary::get_exemplar</span><span class="plain">(</span><span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">ve</span><span class="plain">, </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">raw</span><span class="plain">) {</span>
|
|
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">raw</span><span class="plain">) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>raw_exemplar</span><span class="plain">;</span>
|
|
<span class="reserved">else</span><span class="plain"> </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>exemplar</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
|
|
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Vocabulary::writer</span><span class="plain">(</span><span class="identifier">OUTPUT_STREAM</span><span class="plain">, </span><span class="reserved">char</span><span class="plain"> *</span><span class="identifier">format_string</span><span class="plain">, </span><span class="reserved">void</span><span class="plain"> *</span><span class="identifier">vV</span><span class="plain">) {</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">ve</span><span class="plain"> = (</span><span class="reserved">vocabulary_entry</span><span class="plain"> *) </span><span class="identifier">vV</span><span class="plain">;</span>
|
|
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">ve</span><span class="plain"> == </span><span class="identifier">NULL</span><span class="plain">) </span><span class="identifier">internal_error</span><span class="plain">(</span><span class="string">"tried to write null vocabulary"</span><span class="plain">);</span>
|
|
<span class="reserved">switch</span><span class="plain"> (</span><span class="identifier">format_string</span><span class="plain">[0]) {</span>
|
|
<span class="reserved">case</span><span class="plain"> </span><span class="character">'+'</span><span class="plain">: </span><span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"%w"</span><span class="plain">, </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>raw_exemplar</span><span class="plain">); </span><span class="reserved">break</span><span class="plain">;</span>
|
|
<span class="reserved">case</span><span class="plain"> </span><span class="character">'V'</span><span class="plain">: </span><span class="identifier">WRITE</span><span class="plain">(</span><span class="string">"%w"</span><span class="plain">, </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>exemplar</span><span class="plain">); </span><span class="reserved">break</span><span class="plain">;</span>
|
|
<span class="reserved">default</span><span class="plain">: </span><span class="identifier">internal_error</span><span class="plain">(</span><span class="string">"bad %V extension"</span><span class="plain">);</span>
|
|
<span class="plain">}</span>
|
|
<span class="plain">}</span>
|
|
</pre>
|
|
|
|
<p class="inwebparagraph"></p>
|
|
|
|
<p class="endnote">The function Vocabulary::get_exemplar is used in 4/prf (<a href="4-prf.html#SP31">§31</a>).</p>
|
|
|
|
<p class="endnote">The function Vocabulary::writer is used in 1/wm (<a href="1-wm.html#SP3_1">§3.1</a>).</p>
|
|
|
|
<p class="inwebparagraph"><a id="SP9"></a><b>§9. </b>An integer is stored at each vocabulary entry, recording its value
if it ever turns out to parse as a literal number:
</p>
|
|
|
|
|
|
<pre class="display">
|
|
<span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Vocabulary::get_literal_number_value</span><span class="plain">(</span><span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">ve</span><span class="plain">) {</span>
|
|
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>literal_number_value</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Vocabulary::set_literal_number_value</span><span class="plain">(</span><span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">ve</span><span class="plain">, </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">val</span><span class="plain">) {</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>literal_number_value</span><span class="plain"> = </span><span class="identifier">val</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
</pre>
|
|
|
|
<p class="inwebparagraph"></p>
|
|
|
|
<p class="endnote">The function Vocabulary::get_literal_number_value is used in 4/prf (<a href="4-prf.html#SP29_1_1">§29.1.1</a>, <a href="4-prf.html#SP29_1_3">§29.1.3</a>), 4/bn (<a href="4-bn.html#SP5">§5</a>).</p>
|
|
|
|
<p class="endnote">The function Vocabulary::set_literal_number_value appears nowhere else.</p>
|
|
|
|
<p class="inwebparagraph"><a id="SP10"></a><b>§10. </b>Almost all text is used case insensitively in Inform source, but we do
|
|
occasionally need to distinguish "The" from "the" and the like, when
|
|
parsing the names of text substitutions. When a new text substitution is
|
|
declared whose first word, in the definition, begins with a capital letter,
|
|
<code class="display"><span class="extract">Vocabulary::make_case_sensitive</span></code> is called on the first word, and its identity
|
|
is changed to the upper case variant form.
|
|
</p>
|
|
|
|
|
|
<pre class="display">
|
|
<span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Vocabulary::used_case_sensitively</span><span class="plain">(</span><span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">ve</span><span class="plain">) {</span>
|
|
<span class="reserved">if</span><span class="plain"> ((</span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>upper_case_form</span><span class="plain">) || (</span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>lower_case_form</span><span class="plain">)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">;</span>
|
|
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="functiontext">Vocabulary::get_lower_case_form</span><span class="plain">(</span><span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">ve</span><span class="plain">) {</span>
|
|
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>lower_case_form</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="functiontext">Vocabulary::make_case_sensitive</span><span class="plain">(</span><span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">ve</span><span class="plain">) {</span>
|
|
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>upper_case_form</span><span class="plain">) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>upper_case_form</span><span class="plain">;</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>upper_case_form</span><span class="plain"> =</span>
|
|
<span class="functiontext">Vocabulary::vocab_entry_new</span><span class="plain">(</span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>exemplar</span><span class="plain">, </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>hash</span><span class="plain">, </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>flags</span><span class="plain">, </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>literal_number_value</span><span class="plain">);</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>upper_case_form</span><span class="plain">-</span><span class="element">>lower_case_form</span><span class="plain"> = </span><span class="identifier">ve</span><span class="plain">;</span>
|
|
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>upper_case_form</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
</pre>
|
|
|
|
<p class="inwebparagraph"></p>
|
|
|
|
<p class="endnote">The function Vocabulary::used_case_sensitively appears nowhere else.</p>
|
|
|
|
<p class="endnote">The function Vocabulary::get_lower_case_form appears nowhere else.</p>
|
|
|
|
<p class="endnote">The function Vocabulary::make_case_sensitive appears nowhere else.</p>
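<p class="inwebparagraph">As an illustration only, the two-way link between a word and its upper case variant can be sketched in standalone C. The names <code class="display"><span class="extract">sketch_entry</span></code>, <code class="display"><span class="extract">sketch_make_case_sensitive</span></code> and <code class="display"><span class="extract">sketch_used_case_sensitively</span></code> are hypothetical, and the sketch assumes a plain <code class="display"><span class="extract">calloc</span></code> where the real code calls <code class="display"><span class="extract">Vocabulary::vocab_entry_new</span></code>.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cut-down entry: only the fields needed to show the linkage. */
typedef struct sketch_entry {
    const char *exemplar;
    struct sketch_entry *upper_case_form; /* set on the ordinary (lower case) entry */
    struct sketch_entry *lower_case_form; /* set on the upper case variant */
} sketch_entry;

/* Return the upper case variant of ve, creating it on first request. */
sketch_entry *sketch_make_case_sensitive(sketch_entry *ve) {
    assert(ve != NULL);
    if (ve->upper_case_form) return ve->upper_case_form;
    sketch_entry *uc = calloc(1, sizeof(sketch_entry));
    if (uc == NULL) return NULL; /* allocation failed */
    uc->exemplar = ve->exemplar;
    uc->lower_case_form = ve;    /* the variant points back to the original */
    ve->upper_case_form = uc;    /* and the original points to the variant */
    return uc;
}

/* An entry is used case sensitively if it takes part in such a pair. */
int sketch_used_case_sensitively(const sketch_entry *ve) {
    assert(ve != NULL);
    return (ve->upper_case_form != NULL) || (ve->lower_case_form != NULL);
}
```

<p class="inwebparagraph">Calling the hypothetical <code class="display"><span class="extract">sketch_make_case_sensitive</span></code> twice on the same entry returns the same variant, so the pairing is created at most once.</p>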
|
|
|
|
<p class="inwebparagraph"><a id="SP11"></a><b>§11. </b>Finally, each vocabulary entry comes with a bitmap of flags, and here
|
|
we get to set and test them:
|
|
</p>
|
|
|
|
|
|
<pre class="display">
|
|
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Vocabulary::set_flags</span><span class="plain">(</span><span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">ve</span><span class="plain">, </span><span class="reserved">unsigned</span><span class="plain"> </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">t</span><span class="plain">) {</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>flags</span><span class="plain"> |= </span><span class="identifier">t</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
<span class="reserved">unsigned</span><span class="plain"> </span><span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Vocabulary::test_vflags</span><span class="plain">(</span><span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">ve</span><span class="plain">, </span><span class="reserved">unsigned</span><span class="plain"> </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">t</span><span class="plain">) {</span>
|
|
<span class="reserved">return</span><span class="plain"> (</span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>flags</span><span class="plain">) & </span><span class="identifier">t</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
<span class="reserved">unsigned</span><span class="plain"> </span><span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Vocabulary::test_flags</span><span class="plain">(</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">wn</span><span class="plain">, </span><span class="reserved">unsigned</span><span class="plain"> </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">t</span><span class="plain">) {</span>
|
|
<span class="reserved">return</span><span class="plain"> (</span><span class="functiontext">Lexer::word</span><span class="plain">(</span><span class="identifier">wn</span><span class="plain">)-</span><span class="element">>flags</span><span class="plain">) & </span><span class="identifier">t</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
</pre>
|
|
|
|
<p class="inwebparagraph"></p>
|
|
|
|
<p class="endnote">The function Vocabulary::set_flags appears nowhere else.</p>
|
|
|
|
<p class="endnote">The function Vocabulary::test_vflags appears nowhere else.</p>
|
|
|
|
<p class="endnote">The function Vocabulary::test_flags is used in 3/wrd (<a href="3-wrd.html#SP16">§16</a>), 4/prf (<a href="4-prf.html#SP29_1_1">§29.1.1</a>, <a href="4-prf.html#SP29_1_3">§29.1.3</a>), 4/bn (<a href="4-bn.html#SP5">§5</a>, <a href="4-bn.html#SP6">§6</a>).</p>
|
|
|
|
<p class="inwebparagraph"><a id="SP12"></a><b>§12. </b>It can be useful to find the disjunction of the flags for all the words
|
|
in a range, as that gives us a single bitmap which tells us quickly whether
|
|
any of the words in that range is a number, or is a word ending in "-ing",
|
|
and so on:
|
|
</p>
|
|
|
|
|
|
<pre class="display">
|
|
<span class="reserved">unsigned</span><span class="plain"> </span><span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Vocabulary::disjunction_of_flags</span><span class="plain">(</span><span class="reserved">wording</span><span class="plain"> </span><span class="identifier">W</span><span class="plain">) {</span>
|
|
<span class="reserved">unsigned</span><span class="plain"> </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">d</span><span class="plain"> = 0;</span>
|
|
<span class="identifier">LOOP_THROUGH_WORDING</span><span class="plain">(</span><span class="identifier">i</span><span class="plain">, </span><span class="identifier">W</span><span class="plain">)</span>
|
|
<span class="identifier">d</span><span class="plain"> |= (</span><span class="functiontext">Lexer::word</span><span class="plain">(</span><span class="identifier">i</span><span class="plain">)-</span><span class="element">>flags</span><span class="plain">);</span>
|
|
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">d</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
</pre>
|
|
|
|
<p class="inwebparagraph"></p>
|
|
|
|
<p class="endnote">The function Vocabulary::disjunction_of_flags appears nowhere else.</p>
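<p class="inwebparagraph">The flag operations above amount to ordinary bitmap arithmetic, which the following standalone sketch tries to make concrete. The flag values and the names beginning <code class="display"><span class="extract">sketch_</span></code> are assumptions made for the example: the real <code class="display"><span class="extract">*_MC</span></code> constants are defined elsewhere, and the real code loops over a <code class="display"><span class="extract">wording</span></code> rather than an array.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits, standing in for the real *_MC constants. */
#define SKETCH_NUMBER_FLAG  0x1u
#define SKETCH_ORDINAL_FLAG 0x2u
#define SKETCH_ING_FLAG     0x4u

typedef struct sketch_word {
    unsigned int flags;
} sketch_word;

/* Setting flags only ever adds bits: existing ones are preserved. */
void sketch_set_flags(sketch_word *w, unsigned int t) {
    assert(w != NULL);
    w->flags |= t;
}

/* Nonzero if the word has any of the bits in t. */
unsigned int sketch_test_flags(const sketch_word *w, unsigned int t) {
    assert(w != NULL);
    return w->flags & t;
}

/* Bitwise OR of the flags of words[0] to words[n-1]. */
unsigned int sketch_disjunction(const sketch_word *words, size_t n) {
    assert(words != NULL || n == 0);
    unsigned int d = 0;
    for (size_t i = 0; i < n; i++) d |= words[i].flags;
    return d;
}
```

<p class="inwebparagraph">A single test of the disjunction against, say, <code class="display"><span class="extract">SKETCH_NUMBER_FLAG</span></code> then answers whether any word in the range is a number, without a second pass over the words.</p>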
|
|
|
|
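<p class="inwebparagraph"><a id="SP13"></a><b>§13. </b>Also, the <code class="display"><span class="extract">nt_incidence</span></code> field of an entry can be set and read:
</p>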
|
|
|
|
|
|
<pre class="display">
|
|
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Vocabulary::set_ntb</span><span class="plain">(</span><span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">ve</span><span class="plain">, </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">R</span><span class="plain">) {</span>
|
|
<span class="identifier">ve</span><span class="plain">-</span><span class="element">>nt_incidence</span><span class="plain"> = </span><span class="identifier">R</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
<span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Vocabulary::get_ntb</span><span class="plain">(</span><span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">ve</span><span class="plain">) {</span>
|
|
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">ve</span><span class="plain">-</span><span class="element">>nt_incidence</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
</pre>
|
|
|
|
<p class="inwebparagraph"></p>
|
|
|
|
<p class="endnote">The function Vocabulary::set_ntb is used in 4/prf (<a href="4-prf.html#SP34">§34</a>).</p>
|
|
|
|
<p class="endnote">The function Vocabulary::get_ntb is used in 4/prf (<a href="4-prf.html#SP34">§34</a>, <a href="4-prf.html#SP35">§35</a>, <a href="4-prf.html#SP36">§36</a>).</p>
|
|
|
|
<p class="inwebparagraph"><a id="SP14"></a><b>§14. Hash coding of words. </b>To find all the different words used in the source text, we need in principle
to make an enormous number of comparisons of their texts. It is slow to make
a correct identification of two texts as being equal: we have to compare
them character by character. Fortunately, it can be much
faster to tell if they are different. We do this by rapidly deriving a
number from their texts, and then comparing the numbers: if the numbers
differ, the texts must differ.
</p>
|
|
|
|
<p class="inwebparagraph">The most obvious number would be the length of the text, but this produces
|
|
too little variation, and too many false positives: "blue" and "cyan",
|
|
for instance, would each produce the number 4.
|
|
</p>
|
|
|
|
<p class="inwebparagraph">Instead we use a standard method to derive a number traditionally called
|
|
a "hash code". This is the algorithm called "X 30011" in Aho, Sethi and
|
|
Ullman's standard reference "Compilers: Principles, Techniques and Tools" (1986).
|
|
Because it is derived from constantly overflowing integer arithmetic,
|
|
it will produce different codes on different architectures (say, where
|
|
<code class="display"><span class="extract">int</span></code> is 64 bits long rather than 32, or where <code class="display"><span class="extract">char</span></code> is unsigned).
|
|
All that matters is that it provides a good spread of hash codes for
|
|
typical texts fed into it on any given occasion.
|
|
</p>
|
|
|
|
<p class="inwebparagraph">Good results depend on the number of possible codes being not too tiny
|
|
compared to the number of different texts fed in, and also on the key value
|
|
30011 being coprime to this number (but 30011 is prime, so that's easily
|
|
arranged). A typical source text of 50,000 words has an unquoted vocabulary
|
|
of only about 2000 different words. The variation in vocabulary size
|
|
between the smallest source text and the largest is only about a factor of
|
|
three or four, so there is no need to make a dynamic estimate of the size
|
|
of the source. We will always choose 997 as the number of possible hash
|
|
codes produced by X 30011: we reserve a further three special codes to be
|
|
the hashes of literals rather than ordinary words, and this brings us up to
|
|
a round 1000.
|
|
</p>
|
|
|
|
<p class="inwebparagraph">Inside the lexer, decimal integers such as <code class="display"><span class="extract">-506</span></code> were treated as ordinary
|
|
words, as there were no lexical difficulties in parsing them. Here they
|
|
begin to semantically diverge from the way other ordinary words are handled:
|
|
they're treated more like literal texts and I6 inclusions.
|
|
</p>
|
|
|
|
|
|
<pre class="definitions">
|
|
<span class="definitionkeyword">define</span> <span class="constant">HASH_TAB_SIZE</span><span class="plain"> 1000 </span> <span class="comment">the possible hash codes are 0 up to this minus 1</span>
|
|
<span class="definitionkeyword">define</span> <span class="constant">NUMBER_HASH</span><span class="plain"> 0 </span> <span class="comment">literal decimal integers, and no other words, have this hash code</span>
|
|
<span class="definitionkeyword">define</span> <span class="constant">TEXT_HASH</span><span class="plain"> 1 </span> <span class="comment">double quoted texts, and no other words, have this hash code</span>
|
|
<span class="definitionkeyword">define</span> <span class="constant">I6_HASH</span><span class="plain"> 2 </span> <span class="comment">the <code class="display"><span class="extract">(-</span></code> word introducing an I6 inclusion uniquely has this hash code</span>
|
|
</pre>
|
|
|
|
<pre class="display">
|
|
<span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Vocabulary::hash_code_from_word</span><span class="plain">(</span><span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">text</span><span class="plain">) {</span>
|
|
<span class="reserved">unsigned</span><span class="plain"> </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">hash_code</span><span class="plain"> = 0;</span>
|
|
<span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">p</span><span class="plain"> = </span><span class="identifier">text</span><span class="plain">;</span>
|
|
<span class="reserved">switch</span><span class="plain">(*</span><span class="identifier">p</span><span class="plain">) {</span>
|
|
<span class="reserved">case</span><span class="plain"> </span><span class="character">'-'</span><span class="plain">: </span><span class="reserved">if</span><span class="plain"> (</span><span class="identifier">p</span><span class="plain">[1] == 0) </span><span class="reserved">break</span><span class="plain">; </span> <span class="comment">an isolated minus sign is an ordinary word</span>
|
|
<span class="comment">and otherwise fall into...</span>
|
|
<span class="reserved">case</span><span class="plain"> </span><span class="character">'0'</span><span class="plain">: </span><span class="reserved">case</span><span class="plain"> </span><span class="character">'1'</span><span class="plain">: </span><span class="reserved">case</span><span class="plain"> </span><span class="character">'2'</span><span class="plain">: </span><span class="reserved">case</span><span class="plain"> </span><span class="character">'3'</span><span class="plain">: </span><span class="reserved">case</span><span class="plain"> </span><span class="character">'4'</span><span class="plain">:</span>
|
|
<span class="reserved">case</span><span class="plain"> </span><span class="character">'5'</span><span class="plain">: </span><span class="reserved">case</span><span class="plain"> </span><span class="character">'6'</span><span class="plain">: </span><span class="reserved">case</span><span class="plain"> </span><span class="character">'7'</span><span class="plain">: </span><span class="reserved">case</span><span class="plain"> </span><span class="character">'8'</span><span class="plain">: </span><span class="reserved">case</span><span class="plain"> </span><span class="character">'9'</span><span class="plain">:</span>
|
|
<span class="comment">the first character may prove to be the start of a number: is this true?</span>
|
|
<span class="reserved">for</span><span class="plain"> (</span><span class="identifier">p</span><span class="plain">++; *</span><span class="identifier">p</span><span class="plain">; </span><span class="identifier">p</span><span class="plain">++) </span><span class="reserved">if</span><span class="plain"> (</span><span class="identifier">Characters::isdigit</span><span class="plain">(*</span><span class="identifier">p</span><span class="plain">) == </span><span class="identifier">FALSE</span><span class="plain">) </span><span class="reserved">goto</span><span class="plain"> </span><span class="identifier">Try_Text</span><span class="plain">;</span>
|
|
<span class="reserved">return</span><span class="plain"> </span><span class="constant">NUMBER_HASH</span><span class="plain">;</span>
|
|
<span class="reserved">case</span><span class="plain"> </span><span class="character">' '</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="constant">I6_HASH</span><span class="plain">;</span>
|
|
<span class="reserved">case</span><span class="plain"> </span><span class="character">'('</span><span class="plain">: </span><span class="reserved">if</span><span class="plain"> (</span><span class="identifier">p</span><span class="plain">[1] == </span><span class="character">'-'</span><span class="plain">) </span><span class="reserved">return</span><span class="plain"> </span><span class="constant">I6_HASH</span><span class="plain">;</span>
|
|
<span class="reserved">break</span><span class="plain">;</span>
|
|
<span class="reserved">case</span><span class="plain"> </span><span class="character">'"'</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="constant">TEXT_HASH</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
<span class="identifier">Try_Text</span><span class="plain">:</span>
|
|
<span class="plain">#</span><span class="identifier">pragma</span><span class="plain"> </span><span class="identifier">clang</span><span class="plain"> </span><span class="identifier">diagnostic</span><span class="plain"> </span><span class="identifier">push</span>
|
|
<span class="plain">#</span><span class="identifier">pragma</span><span class="plain"> </span><span class="identifier">clang</span><span class="plain"> </span><span class="identifier">diagnostic</span><span class="plain"> </span><span class="identifier">ignored</span><span class="plain"> </span><span class="string">"-Wsign-conversion"</span>
|
|
<span class="reserved">for</span><span class="plain"> (</span><span class="identifier">p</span><span class="plain">=</span><span class="identifier">text</span><span class="plain">; *</span><span class="identifier">p</span><span class="plain">; </span><span class="identifier">p</span><span class="plain">++) </span><span class="identifier">hash_code</span><span class="plain"> = </span><span class="identifier">hash_code</span><span class="plain">*30011 + (*</span><span class="identifier">p</span><span class="plain">);</span>
|
|
<span class="plain">#</span><span class="identifier">pragma</span><span class="plain"> </span><span class="identifier">clang</span><span class="plain"> </span><span class="identifier">diagnostic</span><span class="plain"> </span><span class="identifier">pop</span>
|
|
<span class="reserved">return</span><span class="plain"> (</span><span class="reserved">int</span><span class="plain">) (3+(</span><span class="identifier">hash_code</span><span class="plain"> % (</span><span class="constant">HASH_TAB_SIZE</span><span class="plain">-3))); </span> <span class="comment">result of X 30011, plus 3</span>
|
|
<span class="plain">}</span>
|
|
</pre>
|
|
|
|
<p class="inwebparagraph"></p>
|
|
|
|
<p class="endnote">The function Vocabulary::hash_code_from_word is used in <a href="#SP16">§16</a>.</p>
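<p class="inwebparagraph">For readers who want to try the scheme outside the compiler, here is a simplified standalone sketch of the same idea. It is not the function above: the names beginning <code class="display"><span class="extract">sketch_</span></code> are hypothetical, the digit test is written by hand rather than with <code class="display"><span class="extract">Characters::isdigit</span></code>, and the case for a leading space is left out.</p>

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_HASH_TAB_SIZE 1000 /* 997 ordinary codes plus 3 reserved ones */
#define SKETCH_NUMBER_HASH 0
#define SKETCH_TEXT_HASH 1
#define SKETCH_I6_HASH 2

/* Returns 1 if text is an optional minus sign followed by one or more digits. */
static int sketch_is_decimal(const wchar_t *text) {
    const wchar_t *p = text;
    if (*p == L'-') p++;
    if (*p == 0) return 0; /* empty, or an isolated minus sign */
    for (; *p; p++)
        if (*p < L'0' || *p > L'9') return 0;
    return 1;
}

int sketch_hash_code(const wchar_t *text) {
    assert(text != NULL);
    if (sketch_is_decimal(text)) return SKETCH_NUMBER_HASH;
    if (text[0] == L'"') return SKETCH_TEXT_HASH;
    if (text[0] == L'(' && text[1] == L'-') return SKETCH_I6_HASH;
    unsigned int h = 0;
    for (const wchar_t *p = text; *p; p++)
        h = h * 30011u + (unsigned int) *p; /* unsigned overflow wraps around */
    /* Ordinary words land in 3 to 999, clear of the three reserved codes. */
    return (int) (3 + h % (SKETCH_HASH_TAB_SIZE - 3));
}
```

<p class="inwebparagraph">As a worked check, an isolated minus sign has character code 45, so the sketch gives it the ordinary code 3 + 45 = 48, whereas <code class="display"><span class="extract">-506</span></code> receives the reserved code 0.</p>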
|
|
|
|
<p class="inwebparagraph"><a id="SP15"></a><b>§15. The hash table of vocabulary. </b>Armed with these hash codes, we now store the pointers to the vocabulary
|
|
entry structures in linked lists, one for each possible hash code.
|
|
These begin empty.
|
|
</p>
|
|
|
|
|
|
<pre class="display">
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">list_of_vocab_with_hash</span><span class="plain">[</span><span class="constant">HASH_TAB_SIZE</span><span class="plain">];</span>
|
|
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Vocabulary::start_hash_table</span><span class="plain">(</span><span class="reserved">void</span><span class="plain">) {</span>
|
|
<span class="reserved">for</span><span class="plain"> (</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">i</span><span class="plain">=0; </span><span class="identifier">i</span><span class="plain"><</span><span class="constant">HASH_TAB_SIZE</span><span class="plain">; </span><span class="identifier">i</span><span class="plain">++) </span><span class="identifier">list_of_vocab_with_hash</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">] = </span><span class="identifier">NULL</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
|
|
<span class="reserved">void</span><span class="plain"> </span><span class="functiontext">Vocabulary::write_hash_table</span><span class="plain">(</span><span class="identifier">OUTPUT_STREAM</span><span class="plain">) {</span>
|
|
<span class="reserved">for</span><span class="plain"> (</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">i</span><span class="plain">=0; </span><span class="identifier">i</span><span class="plain"><</span><span class="constant">HASH_TAB_SIZE</span><span class="plain">; </span><span class="identifier">i</span><span class="plain">++) {</span>
|
|
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">c</span><span class="plain">=0;</span>
|
|
<span class="reserved">for</span><span class="plain"> (</span><span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">entry</span><span class="plain"> = </span><span class="identifier">list_of_vocab_with_hash</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">];</span>
|
|
<span class="identifier">entry</span><span class="plain">; </span><span class="identifier">entry</span><span class="plain"> = </span><span class="identifier">entry</span><span class="plain">-</span><span class="element">>next_in_vocab_hash</span><span class="plain">) {</span>
|
|
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">c</span><span class="plain">++ == 0) </span><span class="identifier">PRINT</span><span class="plain">(</span><span class="string">"%d:"</span><span class="plain">, </span><span class="identifier">i</span><span class="plain">);</span>
|
|
<span class="identifier">PRINT</span><span class="plain">(</span><span class="string">" %w"</span><span class="plain">, </span><span class="identifier">entry</span><span class="plain">-</span><span class="element">>exemplar</span><span class="plain">);</span>
|
|
<span class="plain">}</span>
|
|
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">c</span><span class="plain">>0) </span><span class="identifier">PRINT</span><span class="plain">(</span><span class="string">"\n"</span><span class="plain">);</span>
|
|
<span class="plain">}</span>
|
|
<span class="plain">}</span>
|
|
</pre>
|
|
|
|
<p class="inwebparagraph"></p>
|
|
|
|
<p class="endnote">The function Vocabulary::start_hash_table is used in 3/lxr (<a href="3-lxr.html#SP11">§11</a>).</p>
|
|
|
|
<p class="endnote">The function Vocabulary::write_hash_table appears nowhere else.</p>
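<p class="inwebparagraph">A table of linked lists like this one is typically used in a find-or-add fashion: look along the list for the hash code, and append a new entry only if the text is not already there. The sketch below shows that pattern in standalone C under stated assumptions. It uses narrow <code class="display"><span class="extract">char</span></code> strings, an exact <code class="display"><span class="extract">strcmp</span></code> comparison and a stand-in hash, and every name beginning <code class="display"><span class="extract">sketch_</span></code> is hypothetical.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_TAB_SIZE 1000

typedef struct sketch_vocab {
    char *text;                        /* private copy of the word's text */
    struct sketch_vocab *next_in_hash; /* next entry with the same hash code */
} sketch_vocab;

/* Static storage is zero-initialised, so every list begins empty. */
static sketch_vocab *sketch_table[SKETCH_TAB_SIZE];

/* Stand-in hash: the multiply-and-add scheme without the reserved codes. */
static unsigned int sketch_hash(const char *text) {
    unsigned int h = 0;
    for (const char *p = text; *p; p++) h = h * 30011u + (unsigned char) *p;
    return h % SKETCH_TAB_SIZE;
}

/* Return the unique entry for text, creating it if it is new. */
sketch_vocab *sketch_entry_for_text(const char *text) {
    assert(text != NULL);
    sketch_vocab **link = &sketch_table[sketch_hash(text)];
    for (; *link; link = &(*link)->next_in_hash)
        if (strcmp((*link)->text, text) == 0) return *link; /* already known */
    sketch_vocab *ve = malloc(sizeof(sketch_vocab));
    if (ve == NULL) return NULL;
    ve->text = malloc(strlen(text) + 1);
    if (ve->text == NULL) { free(ve); return NULL; }
    strcpy(ve->text, text);
    ve->next_in_hash = NULL;
    *link = ve; /* append at the end of this hash code's list */
    return ve;
}
```

<p class="inwebparagraph">Because equal texts always yield the same pointer, later comparisons between words reduce to comparing pointers, which is the point of keeping a vocabulary at all.</p>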
|
|
|
|
<p class="inwebparagraph"><a id="SP16"></a><b>§16. </b>And that leaves only one routine: for finding the unique vocabulary
|
|
entry pointer associated with the material in <code class="display"><span class="extract">text</span></code>. We search the
|
|
hash table to see if we have the word already, and if not, we add it.
|
|
</p>
|
|
|
|
<p class="inwebparagraph">It is in order to set the initial values of the flags for the new
|
|
word (if it does turn out to be new) that we mandated special hash
|
|
codes for any number, any text, or any I6 inclusion.
|
|
</p>
|
|
|
|
|
|
<pre class="display">
|
|
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">no_vocabulary_entries</span><span class="plain"> = 0;</span>
|
|
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="functiontext">Vocabulary::entry_for_text</span><span class="plain">(</span><span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">text</span><span class="plain">) {</span>
|
|
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">new_entry</span><span class="plain">;</span>
|
|
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">hash_code</span><span class="plain"> = </span><span class="functiontext">Vocabulary::hash_code_from_word</span><span class="plain">(</span><span class="identifier">text</span><span class="plain">), </span><span class="identifier">val</span><span class="plain"> = 0;</span>
|
|
<span class="reserved">unsigned</span><span class="plain"> </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">f</span><span class="plain"> = 0;</span>
|
|
<span class="reserved">switch</span><span class="plain">(</span><span class="identifier">hash_code</span><span class="plain">) {</span>
|
|
<span class="reserved">case</span><span class="plain"> </span><span class="constant">NUMBER_HASH</span><span class="plain">: </span><span class="identifier">f</span><span class="plain"> = </span><span class="constant">NUMBER_MC</span><span class="plain">; </span><span class="identifier">val</span><span class="plain"> = </span><span class="identifier">Wide::atoi</span><span class="plain">(</span><span class="identifier">text</span><span class="plain">); </span><span class="reserved">break</span><span class="plain">;</span>
|
|
<span class="reserved">case</span><span class="plain"> </span><span class="constant">TEXT_HASH</span><span class="plain">:</span>
|
|
<span class="reserved">switch</span><span class="plain"> (</span><span class="functiontext">Word::perhaps_ill_formed_text_routine</span><span class="plain">(</span><span class="identifier">text</span><span class="plain">)) {</span>
|
|
<span class="reserved">case</span><span class="plain"> </span><span class="identifier">TRUE</span><span class="plain">: </span><span class="identifier">f</span><span class="plain"> = </span><span class="constant">TEXTWITHSUBS_MC</span><span class="plain">; </span><span class="reserved">break</span><span class="plain">;</span>
|
|
<span class="reserved">case</span><span class="plain"> </span><span class="identifier">FALSE</span><span class="plain">: </span><span class="identifier">f</span><span class="plain"> = </span><span class="constant">TEXT_MC</span><span class="plain">; </span><span class="reserved">break</span><span class="plain">;</span>
|
|
<span class="reserved">case</span><span class="plain"> </span><span class="identifier">NOT_APPLICABLE</span><span class="plain">: </span><span class="identifier">f</span><span class="plain"> = </span><span class="constant">TEXT_MC</span><span class="plain">; </span><span class="reserved">break</span><span class="plain">;</span>
|
|
<span class="plain">}</span>
|
|
<span class="reserved">break</span><span class="plain">;</span>
|
|
<span class="reserved">case</span><span class="plain"> </span><span class="constant">I6_HASH</span><span class="plain">: </span><span class="identifier">f</span><span class="plain"> = </span><span class="constant">I6_MC</span><span class="plain">; </span><span class="reserved">break</span><span class="plain">;</span>
|
|
<span class="reserved">default</span><span class="plain">:</span>
|
|
<span class="identifier">val</span><span class="plain"> = </span><span class="functiontext">Vocabulary::an_ordinal_number</span><span class="plain">(</span><span class="identifier">text</span><span class="plain">);</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">val</span><span class="plain"> >= 0) </span><span class="identifier">f</span><span class="plain"> = </span><span class="constant">NUMBER_MC</span><span class="plain"> + </span><span class="constant">ORDINAL_MC</span><span class="plain">; </span> <span class="comment">so that "4th", say, picks up both</span>
<span class="reserved">break</span><span class="plain">;</span>
<span class="plain">}</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">list_of_vocab_with_hash</span><span class="plain">[</span><span class="identifier">hash_code</span><span class="plain">] == </span><span class="identifier">NULL</span><span class="plain">) {</span>
<<span class="cwebmacro">Pi-ty? That word is not in my vocabulary banks</span> <span class="cwebmacronumber">16.1</span>><span class="plain">;</span>
<span class="plain">} </span><span class="reserved">else</span><span class="plain"> {</span>
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="identifier">old_entry</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">n</span><span class="plain">;</span>
<span class="comment">search the non-empty list of words with this hash code</span>
<span class="reserved">for</span><span class="plain"> (</span><span class="identifier">n</span><span class="plain">=0, </span><span class="identifier">new_entry</span><span class="plain"> = </span><span class="identifier">list_of_vocab_with_hash</span><span class="plain">[</span><span class="identifier">hash_code</span><span class="plain">];</span>
<span class="identifier">new_entry</span><span class="plain"> != </span><span class="identifier">NULL</span><span class="plain">;</span>
<span class="identifier">n</span><span class="plain">++, </span><span class="identifier">old_entry</span><span class="plain"> = </span><span class="identifier">new_entry</span><span class="plain">, </span><span class="identifier">new_entry</span><span class="plain"> = </span><span class="identifier">new_entry</span><span class="plain">-</span><span class="element">>next_in_vocab_hash</span><span class="plain">)</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">Wide::cmp</span><span class="plain">(</span><span class="identifier">new_entry</span><span class="plain">-</span><span class="element">>exemplar</span><span class="plain">, </span><span class="identifier">text</span><span class="plain">) == 0)</span>
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">new_entry</span><span class="plain">;</span>
<span class="comment">and if we do not find <code class="display"><span class="extract">text</span></code> in there, then...</span>
<<span class="cwebmacro">My vision is impaired! I cannot see!</span> <span class="cwebmacronumber">16.2</span>><span class="plain">;</span>
<span class="plain">}</span>
<span class="plain">}</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">The function Vocabulary::entry_for_text is used in <a href="#SP3">§3</a>, <a href="#SP4">§4</a>, 4/prf (<a href="4-prf.html#SP25">§25</a>, <a href="4-prf.html#SP26">§26</a>).</p>
<p class="inwebparagraph"><a id="SP16_1"></a><b>§16.1. </b>Here the list for this word's hash code was empty, which means one of two
things. Either this hash code has never been seen for any word before, in
which case we start the list for that hash code with the new word; or the
word is a text literal, since for efficiency's sake we deliberately keep
the hash list for all text literals empty. As a result, every text literal
receives a fresh entry, even if the same text has occurred before.
</p>
<p class="macrodefinition"><code class="display">
<<span class="cwebmacrodefn">Pi-ty? That word is not in my vocabulary banks</span> <span class="cwebmacronumber">16.1</span>> =
</code></p>
<pre class="displaydefn">
<span class="identifier">new_entry</span><span class="plain"> = </span><span class="functiontext">Vocabulary::vocab_entry_new</span><span class="plain">(</span><span class="identifier">text</span><span class="plain">, </span><span class="identifier">hash_code</span><span class="plain">, </span><span class="identifier">f</span><span class="plain">, </span><span class="identifier">val</span><span class="plain">);</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">hash_code</span><span class="plain"> != </span><span class="constant">TEXT_HASH</span><span class="plain">) </span><span class="identifier">list_of_vocab_with_hash</span><span class="plain">[</span><span class="identifier">hash_code</span><span class="plain">] = </span><span class="identifier">new_entry</span><span class="plain">;</span>
<span class="identifier">LOGIF</span><span class="plain">(</span><span class="identifier">VOCABULARY</span><span class="plain">, </span><span class="string">"Word %d <%w> is first vocabulary with hash %d\</span><span class="plain">n</span><span class="string">"</span><span class="plain">,</span>
<span class="identifier">no_vocabulary_entries</span><span class="plain">++, </span><span class="identifier">text</span><span class="plain">, </span><span class="identifier">hash_code</span><span class="plain">);</span>
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">new_entry</span><span class="plain">;</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">This code is used in <a href="#SP16">§16</a>.</p>
<p class="inwebparagraph"><a id="SP16_2"></a><b>§16.2. </b>And here, we searched a non-empty list without finding a match. The search
ran out after entry <code class="display"><span class="extract">n-1</span></code>, the last in the list, which is
pointed to by <code class="display"><span class="extract">old_entry</span></code>. We add the new word at the end.
</p>
<p class="macrodefinition"><code class="display">
<<span class="cwebmacrodefn">My vision is impaired! I cannot see!</span> <span class="cwebmacronumber">16.2</span>> =
</code></p>
<pre class="displaydefn">
<span class="identifier">new_entry</span><span class="plain"> = </span><span class="functiontext">Vocabulary::vocab_entry_new</span><span class="plain">(</span><span class="identifier">text</span><span class="plain">, </span><span class="identifier">hash_code</span><span class="plain">, </span><span class="identifier">f</span><span class="plain">, </span><span class="identifier">val</span><span class="plain">);</span>
<span class="identifier">old_entry</span><span class="plain">-</span><span class="element">>next_in_vocab_hash</span><span class="plain"> = </span><span class="identifier">new_entry</span><span class="plain">;</span>
<span class="identifier">LOGIF</span><span class="plain">(</span><span class="identifier">VOCABULARY</span><span class="plain">, </span><span class="string">"Word %d <%w> is vocabulary entry no. %d with hash %d\</span><span class="plain">n</span><span class="string">"</span><span class="plain">,</span>
<span class="identifier">no_vocabulary_entries</span><span class="plain">++, </span><span class="identifier">text</span><span class="plain">, </span><span class="identifier">n</span><span class="plain">, </span><span class="identifier">hash_code</span><span class="plain">);</span>
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">new_entry</span><span class="plain">;</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">This code is used in <a href="#SP16">§16</a>.</p>
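<p class="inwebparagraph">The search-then-append logic of §16, §16.1 and §16.2 can be sketched in
isolation as follows. This is a minimal illustration under stated assumptions, not the
code of the words module: the names <code class="display"><span class="extract">sketch_entry</span></code>, <code class="display"><span class="extract">sketch_table</span></code>,
<code class="display"><span class="extract">sketch_entry_new</span></code> and <code class="display"><span class="extract">sketch_entry_for_text</span></code> are hypothetical, the table size
is arbitrary, narrow strings and <code class="display"><span class="extract">strcmp</span></code> stand in for wide strings and
<code class="display"><span class="extract">Wide::cmp</span></code>, and the special case for text literals, the flags and the logging
are all omitted.
</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_HASH_SIZE 16   /* hypothetical table size, chosen for illustration */

typedef struct sketch_entry {
    char *exemplar;                     /* the text of the word */
    struct sketch_entry *next_in_hash;  /* next word sharing this hash code */
} sketch_entry;

/* One linked list per hash code; all lists start out empty. */
static sketch_entry *sketch_table[SKETCH_HASH_SIZE];

/* Allocate an entry holding a private copy of the text. */
static sketch_entry *sketch_entry_new(const char *text) {
    sketch_entry *e = malloc(sizeof *e);
    if (e == NULL) abort();
    e->exemplar = malloc(strlen(text) + 1);
    if (e->exemplar == NULL) abort();
    strcpy(e->exemplar, text);
    e->next_in_hash = NULL;
    return e;
}

/* Return the existing entry for text, or create one and append it. */
sketch_entry *sketch_entry_for_text(const char *text, unsigned hash_code) {
    assert(hash_code < SKETCH_HASH_SIZE);
    if (sketch_table[hash_code] == NULL) {
        /* as in 16.1: the list is empty, so the new word starts it */
        sketch_table[hash_code] = sketch_entry_new(text);
        return sketch_table[hash_code];
    }
    sketch_entry *old_entry = NULL, *e;
    for (e = sketch_table[hash_code]; e != NULL;
         old_entry = e, e = e->next_in_hash)
        if (strcmp(e->exemplar, text) == 0)
            return e;                   /* the word is already known */
    /* as in 16.2: old_entry is the last entry, so append after it */
    e = sketch_entry_new(text);
    old_entry->next_in_hash = e;
    return e;
}
```

<p class="inwebparagraph">Because the list is known to be non-empty when the loop begins, <code class="display"><span class="extract">old_entry</span></code>
cannot be null when the append is reached. Appending at the end keeps
earlier words nearer the head of each list.
</p>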
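<p class="inwebparagraph"><a id="SP17"></a><b>§17. Partial words. </b>Much the same, except that we enter a fragment of a word into lexical memory
and then find its identity as if it were a whole word. The fragment consists of
the characters of <code class="display"><span class="extract">str</span></code> from position <code class="display"><span class="extract">from</span></code> to position <code class="display"><span class="extract">to</span></code>, inclusive.
</p>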
<pre class="display">
<span class="reserved">vocabulary_entry</span><span class="plain"> *</span><span class="functiontext">Vocabulary::entry_for_partial_text</span><span class="plain">(</span><span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">str</span><span class="plain">, </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">from</span><span class="plain">, </span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">to</span><span class="plain">) {</span>
<span class="identifier">TEMPORARY_TEXT</span><span class="plain">(</span><span class="identifier">TEMP</span><span class="plain">);</span>
<span class="reserved">for</span><span class="plain"> (</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">i</span><span class="plain">=</span><span class="identifier">from</span><span class="plain">; </span><span class="identifier">i</span><span class="plain"><=</span><span class="identifier">to</span><span class="plain">; </span><span class="identifier">i</span><span class="plain">++) </span><span class="identifier">PUT_TO</span><span class="plain">(</span><span class="identifier">TEMP</span><span class="plain">, </span><span class="identifier">str</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">]);</span>
<span class="identifier">PUT_TO</span><span class="plain">(</span><span class="identifier">TEMP</span><span class="plain">, 0);</span>
<span class="reserved">wording</span><span class="plain"> </span><span class="identifier">W</span><span class="plain"> = </span><span class="functiontext">Feeds::feed_stream</span><span class="plain">(</span><span class="identifier">TEMP</span><span class="plain">);</span>
<span class="identifier">DISCARD_TEXT</span><span class="plain">(</span><span class="identifier">TEMP</span><span class="plain">);</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="functiontext">Wordings::empty</span><span class="plain">(</span><span class="identifier">W</span><span class="plain">)) </span><span class="reserved">return</span><span class="plain"> </span><span class="identifier">NULL</span><span class="plain">;</span>
<span class="reserved">return</span><span class="plain"> </span><span class="functiontext">Lexer::word</span><span class="plain">(</span><span class="functiontext">Wordings::first_wn</span><span class="plain">(</span><span class="identifier">W</span><span class="plain">));</span>
<span class="plain">}</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">The function Vocabulary::entry_for_partial_text appears nowhere else.</p>
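<p class="inwebparagraph">The loop in §17 copies an inclusive range of characters, so the fragment
has <code class="display"><span class="extract">to - from + 1</span></code> characters. The sketch below shows only that copying step
using the standard C library. The helper <code class="display"><span class="extract">copy_fragment</span></code> is hypothetical,
and it does not feed the result to the lexer as the function above does.
</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: copy str[from..to], both ends inclusive, into a
   newly allocated null-terminated wide string which the caller must free.
   Returns NULL if the range is empty or if allocation fails. */
wchar_t *copy_fragment(const wchar_t *str, int from, int to) {
    assert(str != NULL && from >= 0);
    if (to < from) return NULL;               /* empty range */
    size_t len = (size_t) (to - from) + 1;    /* inclusive at both ends */
    wchar_t *out = malloc((len + 1) * sizeof *out);
    if (out == NULL) return NULL;
    wmemcpy(out, str + from, len);
    out[len] = 0;                             /* terminate the copy */
    return out;
}
```

<p class="inwebparagraph">The caller is assumed to guarantee that <code class="display"><span class="extract">to</span></code> lies within the string, as
the function in §17 also assumes.
</p>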
<p class="inwebparagraph"><a id="SP18"></a><b>§18. Ordinals. </b>The following parses the string to see if it is a non-negative integer
written as an English ordinal: 0th, 1st, 2nd, 3rd, 4th, 5th, ... If so, it
returns the value of that integer, and if not, it returns -1. Note
that we don't bother to police the finicky rules on which suffix should
accompany which value (22nd not 22th, and so on), so that 22th is accepted as 22.
</p>
<pre class="display">
<span class="reserved">int</span><span class="plain"> </span><span class="functiontext">Vocabulary::an_ordinal_number</span><span class="plain">(</span><span class="identifier">wchar_t</span><span class="plain"> *</span><span class="identifier">fw</span><span class="plain">) {</span>
<span class="reserved">for</span><span class="plain"> (</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">i</span><span class="plain">=0; </span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">] != 0; </span><span class="identifier">i</span><span class="plain">++)</span>
<span class="reserved">if</span><span class="plain"> (!(</span><span class="identifier">Characters::isdigit</span><span class="plain">(</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">]))) {</span>
<span class="reserved">if</span><span class="plain"> ((</span><span class="identifier">i</span><span class="plain">>0) &&</span>
<span class="plain">(((</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">] == </span><span class="character">'s'</span><span class="plain">) && (</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">+1] == </span><span class="character">'t'</span><span class="plain">) && (</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">+2] == 0)) ||</span>
<span class="plain">((</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">] == </span><span class="character">'n'</span><span class="plain">) && (</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">+1] == </span><span class="character">'d'</span><span class="plain">) && (</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">+2] == 0)) ||</span>
<span class="plain">((</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">] == </span><span class="character">'r'</span><span class="plain">) && (</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">+1] == </span><span class="character">'d'</span><span class="plain">) && (</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">+2] == 0)) ||</span>
<span class="plain">((</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">] == </span><span class="character">'t'</span><span class="plain">) && (</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">+1] == </span><span class="character">'h'</span><span class="plain">) && (</span><span class="identifier">fw</span><span class="plain">[</span><span class="identifier">i</span><span class="plain">+2] == 0))))</span>
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">Wide::atoi</span><span class="plain">(</span><span class="identifier">fw</span><span class="plain">);</span>
<span class="reserved">break</span><span class="plain">;</span>
<span class="plain">}</span>
<span class="reserved">return</span><span class="plain"> -1;</span>
<span class="plain">}</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">The function Vocabulary::an_ordinal_number is used in <a href="#SP16">§16</a>.</p>
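<p class="inwebparagraph">For readers who want to experiment with the rule in §18 outside the
words module, here is a stand-alone sketch using only the standard C library.
The name <code class="display"><span class="extract">ordinal_value</span></code> is hypothetical, and <code class="display"><span class="extract">iswdigit</span></code> and <code class="display"><span class="extract">wcstol</span></code> are
assumed to be adequate stand-ins for <code class="display"><span class="extract">Characters::isdigit</span></code> and <code class="display"><span class="extract">Wide::atoi</span></code>,
whose exact behaviour is not shown here.
</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical stand-alone analogue of the ordinal test: returns the value
   of an ordinal such as L"4th", or -1 if fw is not written as one. */
int ordinal_value(const wchar_t *fw) {
    assert(fw != NULL);
    size_t i = 0;
    while (iswdigit((wint_t) fw[i])) i++;   /* skip the leading run of digits */
    if (i == 0) return -1;                  /* a bare suffix is not an ordinal */
    static const wchar_t *suffixes[] = { L"st", L"nd", L"rd", L"th" };
    for (size_t s = 0; s < 4; s++)
        if (wcscmp(fw + i, suffixes[s]) == 0)     /* suffix must end the word */
            return (int) wcstol(fw, NULL, 10);    /* conversion stops at the suffix */
    return -1;   /* plain numbers and other words are rejected */
}
```

<p class="inwebparagraph">As in §18, no check is made that the suffix suits the value, and a plain
number with no suffix is not an ordinal. Very long digit strings may exceed
the range of <code class="display"><span class="extract">int</span></code>, which this sketch does not guard against.
</p>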
<hr class="tocbar">
<ul class="toc"><li><i>(This section begins Chapter 2: Words in Isolation.)</i></li><li><a href="2-wa.html">Continue with 'Word Assemblages'</a></li></ul><hr class="tocbar">
<!--End of weave-->
</body>
</html>